AI Boosts Accuracy and Interobserver Agreement of Knee Osteoarthritis Grading

The Kellgren-Lawrence grading scale is important for treatment eligibility, clinical trial inclusion



AI assistance significantly improved the accuracy and consistency of knee osteoarthritis grading among radiologists and orthopedists, according to a new multicenter study.

The findings potentially pave the way for more reliable diagnoses, more equitable and consistent insurance approvals and uniform inclusion in clinical trials worldwide.

Affecting over 350 million people worldwide, knee osteoarthritis (OA) is a significant public health concern. Knee OA is commonly diagnosed via radiography, with orthopedists relying on these radiographic images when identifying candidates for surgery.

Knee radiographs are also used to diagnose and assess the severity of knee OA via the Kellgren-Lawrence (KL) grading scale. A score of zero to four is given, with zero meaning no osteoarthritis and four meaning severe osteoarthritis. Existing literature points to inconsistent interobserver agreement in grading knee OA.

“Accurate KL grading is crucial because it guides clinical decisions and, in the U.S., often determines insurance coverage for knee arthroplasty, impacting patient access to necessary treatment and also represents the FDA-recommended radiographic inclusion criteria for clinical trials of knee OA,” said John A. Carrino, MD, MPH, vice chairman of radiology and imaging at the Hospital for Special Surgery (HSS), and professor of radiology at Weill Cornell Medicine in New York City.

Dr. Carrino wrote a commentary on a Radiology study that evaluated how readers’ KL grading performance and interobserver agreement differed with versus without AI assistance, regardless of reader experience.

Authors of the study emphasize that several U.S. health insurance providers require a KL grade of three or four to approve a patient for knee replacement.

AI Tools Can Reduce Variability, Subjectivity

The retrospective study involved 11 readers, including junior and senior radiologists and orthopedists, who evaluated 225 standing knee radiographs for knee OA.

The radiographs were obtained from three teaching hospitals in the European Union and included adults aged 20 or older with standing frontal knee X-rays and a lateral view of the painful knee. Patients with acute pain, prior knee replacement, repeat exams or positioning devices were excluded.

The readers conducted their examinations with and without assistance from the commercially available, CE-certified AI tool RBknee.

The tool analyzed all weight-bearing frontal radiographs and corresponding lateral projection radiographs of the knee or knees, outputting the KL grade on the frontal view as well as the presence or absence of patellar osteophytes on the lateral view.

“AI tools like RBknee can mitigate the subjectivity and inconsistency inherent in KL grading across different human readers, clinical sites and imaging preferences,” Dr. Carrino said.

Reader Performance Improved Across Experience Levels

Compared with a reference standard based on the majority vote of three musculoskeletal radiology consultants, AI assistance increased the KL grading performance of junior readers, the researchers found. Three of the six junior readers who participated in the study showed higher KL grading performance with AI assistance than without. Of those three, two were radiologists and one was an orthopedist.
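For readers who want to see how a majority-vote reference standard can be scored, here is a minimal Python sketch. It is illustrative only and is not the study's analysis code: the function names, the rule of skipping knees where the three consultants all disagree, and the example grades are assumptions made up for this illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_grade(votes: Sequence[int]) -> Optional[int]:
    """Return the KL grade picked by a strict majority of votes, else None."""
    grade, count = Counter(votes).most_common(1)[0]
    return grade if count > len(votes) / 2 else None


def reader_accuracy(
    reader_grades: Sequence[int],
    consultant_votes: Sequence[Sequence[int]],
) -> float:
    """Fraction of knees where the reader matches the majority reference.

    Assumption for this sketch: knees with no majority are skipped.
    """
    matched = 0
    scored = 0
    for grade, votes in zip(reader_grades, consultant_votes):
        reference = majority_grade(votes)
        if reference is None:
            continue  # no agreed reference grade for this knee
        scored += 1
        matched += int(grade == reference)
    if scored == 0:
        raise ValueError("no knees with a majority reference grade")
    return matched / scored


# Made-up grades for four knees (not data from the study).
consultant_votes = [(2, 2, 3), (0, 0, 0), (4, 3, 4), (1, 2, 3)]
without_ai = [3, 0, 3, 2]
with_ai = [2, 0, 4, 2]

print(reader_accuracy(without_ai, consultant_votes))
print(reader_accuracy(with_ai, consultant_votes))
```

In this toy example the fourth knee has no majority grade, so it is left out of both accuracy figures.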

Interobserver agreement for KL grading across all readers was also higher with AI assistance, resulting in an overall agreement that improved from moderate to strong. Senior readers achieved an almost perfect agreement when assisted with AI.
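One common way to quantify agreement between two readers on an ordinal scale such as KL grades is a quadratic-weighted Cohen's kappa, which penalizes large disagreements more than near misses. The article does not say which agreement statistic the study used, so the Python sketch below is an assumption chosen for illustration, with a hypothetical function name and made-up grades.

```python
from typing import Sequence


def weighted_kappa(a: Sequence[int], b: Sequence[int], n_grades: int = 5) -> float:
    """Quadratic-weighted Cohen's kappa for two readers' grades in 0..n_grades-1."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both readers must grade the same non-empty set of knees")
    n = len(a)

    # Observed joint proportions of (reader A grade, reader B grade).
    observed = [[0.0] * n_grades for _ in range(n_grades)]
    for x, y in zip(a, b):
        observed[x][y] += 1 / n

    # Marginal proportions for each reader.
    row = [sum(observed[i]) for i in range(n_grades)]
    col = [sum(observed[i][j] for i in range(n_grades)) for j in range(n_grades)]

    max_dist = (n_grades - 1) ** 2
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            weight = (i - j) ** 2 / max_dist  # 0 on the diagonal, 1 at the extremes
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both readers use a single grade")
    return 1 - observed_disagreement / expected_disagreement


# Made-up grades for six knees (not data from the study).
reader_a = [0, 1, 2, 3, 4, 2]
near_miss = [0, 1, 2, 3, 4, 3]  # one disagreement, off by one grade
far_miss = [0, 1, 2, 3, 4, 0]   # one disagreement, off by two grades

print(weighted_kappa(reader_a, near_miss))
print(weighted_kappa(reader_a, far_miss))
```

A value of 1 means the two readers agree on every knee, and values near 0 mean agreement no better than chance. In the toy data, the off-by-one disagreement yields a higher kappa than the off-by-two disagreement.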

“The study showed that the tool could grade KL across several clinical sites in different countries in the EU that performed the acquisition in three distinct ways: standing posteroanterior, standing anteroposterior (AP) and standing stitched long leg AP images,” Dr. Carrino said.

“The study’s demonstration of external validation across different sites, imaging acquisition methods and imaging equipment was surprising and highlights the potential scalability of AI for knee OA grading for both clinical practice and clinical trials,” he emphasized.

The findings demonstrate that AI tools may improve the consistency of patient inclusion into clinical trials and candidacy for knee replacement surgery.

Strategies to Combat Automation Bias

The study authors noted that the potential for automation bias was high due to the design of the study. The easiest option for readers was to simply accept the AI tool’s output via a web-based platform where grading fields were pre-populated.

“Automation bias was seen in about one-third of inaccurate KL grading due to an incorrect AI suggestion. Some peers express concerns about situations where users may over-rely on AI suggestions,” Dr. Carrino said.

To combat automation bias, Dr. Carrino said governance and educational strategies are essential. They can be used to monitor AI model drift and educate current and future users on how to appropriately implement AI tools in their practice.

He added that this study could pave the way for further exploration of AI assistance in musculoskeletal radiology. “This work emphasizes AI’s potential to enhance the accuracy and consistency of knee OA grading, ultimately ensuring more uniform diagnosis and treatment of this widespread condition as well as uniform inclusion in clinical trials across imaging sites and countries worldwide,” Dr. Carrino said.

For More Information

Access the Radiology study, “Interobserver Agreement and Performance of Concurrent AI Assistance for Radiographic Evaluation of Knee Osteoarthritis,” and the related commentary, “AI and the Potential for Uniform and Scalable Grading of Knee Osteoarthritis.”
