Leveraging AI Models to Ease Screening Mammogram Workloads
Emerging research highlights the use of AI for enhancing early cancer detection and streamlining radiology practices


Deep learning models show great potential for improving cancer screening efficiency and accuracy while reducing radiologists’ workload. Implementation of AI in screening can also help detect cancers earlier, lead to better prognoses and improve the quality and accessibility of care.
“Use of AI in screen-reading is a ‘hot topic’, so there’s an urgency to effectively test and implement it,” said Solveig Hofvind, PhD, a professor at the Cancer Registry, Norwegian Institute of Public Health. “However, to use AI effectively we need to optimize its outcomes both for radiologists and for the women screened. This requires extensive studies.”
Dr. Hofvind is head of BreastScreen Norway, the nation’s population-based screening program. In a study published in Radiology: Artificial Intelligence, she and colleagues sought to determine the ability of two different AI models, referred to as Model A and Model B, to accurately diagnose and locate breast cancer on screening mammograms.
Model A was the commercially available model Lunit INSIGHT MMG, trained on diverse data from multiple countries and mammographic scanners. Model B was developed by the Norwegian Computing Center (NCC) and trained on 600,000 mammography exams taken as part of the BreastScreen Norway program.
Both models were tested on retrospective data from nearly 130,000 screening exams performed between 2008 and 2018. The testing data did not include data used in training Model B.

The models independently assigned a risk score to each mammogram indicating how likely it was to show cancer. Dr. Hofvind and her team set an 11.1% threshold for positive AI detection. The threshold mirrored the read consensus rate for the actual examinations. Both models also marked cancerous regions on the mammograms.
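One way to picture this step is sketched below. It assumes the cutoff was chosen so that the share of exams flagged by AI matches a target rate such as 11.1%; the article does not describe the exact procedure, and the function names here are hypothetical, not taken from the study.

```python
def threshold_for_positive_rate(scores, target_rate):
    """Return a score cutoff so that about target_rate of exams score at or above it.

    Illustrative only: assumes higher scores mean higher suspicion of cancer.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if not 0 < target_rate <= 1:
        raise ValueError("target_rate must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    # Number of exams that should be flagged positive (at least one).
    n_positive = max(1, round(target_rate * len(ranked)))
    return ranked[n_positive - 1]


def flag_exams(scores, cutoff):
    """Mark an exam AI-positive when its score is at or above the cutoff."""
    return [score >= cutoff for score in scores]
```

If several exams share the cutoff score, slightly more than the target share will be flagged, so a real analysis would need a rule for ties.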
Mammograms of the breast cancer cases identified by each model were reviewed by a panel of three breast radiologists to assess the accuracy of the AI markings. These cases included screen-detected cancers, which are found through routine screening, and interval cancers, which arise between scheduled screenings after a normal result.
Interval cancer is a cancer that is diagnosed after a normal screening result and before the next scheduled screening. Screen-detected cancers are cancers that are identified through screening tests designed to find diseases in individuals who are asymptomatic.
“This study's novelty lies in its detailed, real-world assessment of both screen-detected and interval cancers,” said Alexandre Cadrin-Chênevert, MD, BEng, diagnostic and interventional radiologist at CISSS Lanaudière and Clinical Professor at Laval University in Quebec City.
“Testing the models at an operating threshold that matches the actual decision-making framework used by clinicians makes the results far more relevant to clinical practice than studies using arbitrary thresholds,” he added.
Pairing AI May Boost Cancer Finds
The breast cancer cases identified by each model were grouped by whether they were screen-detected or interval cancers. Model A identified 92.4% of screen-detected cancers and Model B identified 93.7%. Intriguingly, the two models had some non-overlapping results, with each detecting some cancers missed by the other.
“This is an important finding because it implies that an ensemble approach by combining multiple AI models could be more powerful than any single model, potentially increasing the overall cancer detection rate,” Dr. Cadrin-Chênevert said.
Each model also identified a substantial proportion of interval cancers: 45.6% for Model A and 44.7% for Model B. Here too, the results were partly non-overlapping.
“In screening, a percentage of interval cancers, about 20-30% in informed reviews, are ‘missed’ or are false negative,” Dr. Hofvind said. “If AI can detect these cancers at screening and thereby reduce the rate of interval cancers, it would improve the sensitivity of mammographic screening.”
“This is one of the most compelling areas of AI research,” Dr. Cadrin-Chênevert said. “A deep learning model trained on hundreds of thousands of mammograms can learn to identify incredibly subtle imaging patterns that correlate with malignancy; patterns that may be below the threshold of perception for the human eye or easily dismissed as normal tissue.”
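The ensemble idea raised in this section can be illustrated with a simple "either model flags it" rule. This is a minimal sketch with made-up flags, not the method or data from the study; the function names are hypothetical.

```python
def sensitivity(flags, is_cancer):
    """Fraction of cancer cases that were flagged positive."""
    total = sum(is_cancer)
    if total == 0:
        raise ValueError("no cancer cases to evaluate")
    detected = sum(1 for flag, cancer in zip(flags, is_cancer) if flag and cancer)
    return detected / total


def union_ensemble(flags_a, flags_b):
    """Flag an exam when at least one of the two models flags it."""
    return [a or b for a, b in zip(flags_a, flags_b)]


# Synthetic toy data: five cancer cases followed by three cancer-free exams.
is_cancer = [True, True, True, True, True, False, False, False]
flags_a = [True, True, True, False, False, False, True, False]
flags_b = [True, True, False, True, False, False, False, False]

combined = union_ensemble(flags_a, flags_b)
print(sensitivity(flags_a, is_cancer))
print(sensitivity(flags_b, is_cancer))
print(sensitivity(combined, is_cancer))
```

An "either model" rule can never lower sensitivity, but it can also flag more cancer-free exams, so any real ensemble would need to weigh both effects.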
AI Models Accurately Locate Tumors
Together, the two models accurately located all screen-detected cancers on mammograms. More interesting, however, according to Dr. Hofvind, was that in the 21.6% of interval cancer cases classified after radiologist review as false-negative or having minimal signs of malignancy, at least one of the models correctly identified suspicious sites.
“It was interesting to see how the two AI models corresponded in the location of some interval cancers,” Dr. Hofvind said. “Such cases could be chosen for consensus automatically, without needing a radiologist’s review.”
Precise, automated localization would also significantly streamline further examinations after mammography, by helping direct radiologists’ attention to sites of greatest concern.
“This finding suggests that these AI tools can act as a powerful safety net, highlighting regions that might otherwise be overlooked and thus potentially leading to earlier detection,” Dr. Cadrin-Chênevert said.

Testing AI’s Promise in Practice
Dr. Hofvind said that there are various ways in which AI models can be incorporated into screening programs. For example, AI could be used for triage, screening all exams and prioritizing those at highest risk, allowing radiologists to focus on more urgent cases first.
Alternatively, it could serve as a “second reader” in screening programs, providing a second opinion in addition to that of a radiologist. A third option is to use AI for decision support, using its localization ability to focus radiologists’ attention on areas of concern.
However, Dr. Cadrin-Chênevert noted that using AI effectively in any of these modes requires significant further research. One potential limitation of an AI model, for example, is a lack of diversity in its training data.
“True robustness comes from training on a diverse dataset that includes patients from various ethnic backgrounds, different age groups and—critically—images from different scanner manufacturers,” Dr. Cadrin-Chênevert said. “Publicly accessible datasets like Annotated Digital Mammograms and Associated Non-Image Datasets (ADMANI) are vital because they provide this diversity.”
Though the results of this study suggest a promising future for AI use in screening, the study was performed on retrospective data. Dr. Hofvind emphasized that the performance of these models remains to be tested through prospective studies.
“We plan to continue our work on retrospective data and to use it to establish validation and quality assurance guidelines,” Dr. Hofvind said. “We have also started a randomized control trial to test cancer detection, recall and consensus when using AI as a triage tool. We have included about 30,000 women so far and aim to include 140,000 in total.”
For More Information
Access the Radiology: Artificial Intelligence article, “Performance of Two Deep Learning-Based AI Models for Breast Cancer Detection and Localization on Screening Mammograms from BreastScreen Norway,” and the related editorial, “Beyond Double Reading: Multiple Deep Learning Models Enhancing Radiologist-led Breast Screening.”
Sharpen your skills with a curated playlist of breast imaging education offerings on EdCentral.