Int J Gastrointest Interv 2024; 13(3): 65-73
Published online July 31, 2024 https://doi.org/10.18528/ijgii240013
Copyright © International Journal of Gastrointestinal Intervention.
Anson Mwango1,2 , Tayyab Saeed Akhtar2,3 , Sameen Abbas4 , Dua Sadaf Abbasi4 , and Amjad Khan4,5,*
1Department of Clinical Medicine and Therapeutics, University of Nairobi, Nairobi, Kenya
2Faculty of Life Science and Education, University of South Wales, Cardiff, United Kingdom
3Center for Liver and Digestive Diseases, Holy Family Hospital, Rawalpindi, Pakistan
4Department of Pharmacy, Quaid-i-Azam University, Islamabad, Pakistan
5Department of Pharmacy Administration and Clinical Pharmacy, School of Pharmacy, Health Science Center, Xi’an Jiaotong University, Xi’an, China
Correspondence to:*Department of Pharmacy, Quaid-i-Azam University, Islamabad 45320, Pakistan.
E-mail address: amjadkhan@qau.edu.pk (A. Khan).
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Colorectal cancer has substantial morbidity and mortality. Approximately one-quarter of cases are overlooked during screening colonoscopy, leading to interval colorectal cancer. The use of artificial intelligence (AI) through deep learning systems has demonstrated promising results in the detection of polyps and adenomas. Consequently, our objective was to evaluate the impact of AI on adenoma detection. To identify relevant studies, we searched the PubMed, MEDLINE, and Cochrane Library databases without restrictions on publication date. Ultimately, we analyzed 16 randomized controlled trials involving 13,685 participants. The primary outcome assessed was the effect of AI-assisted colonoscopy (AIAC) on the adenoma detection rate (ADR). Secondary outcomes included the polyp detection rate (PDR) and adenomas per colonoscopy (APC). A random-effects model was used to calculate pooled effect sizes, and statistical heterogeneity was evaluated using the Higgins I² statistic, with I² cutoff points of 25%, 50%, and 75% indicating low, moderate, and high heterogeneity, respectively. Publication bias was investigated using a funnel plot, and the quality of evidence was appraised using the Grading of Recommendations, Assessment, Development, and Evaluation framework. The findings indicated a 26% greater ADR with AIAC than with standard colonoscopy (40.4% vs. 31.9%). Additionally, AIAC was associated with a 30% greater PDR (52.9% vs. 40.1%) and a 44% higher APC. The findings demonstrate that the integration of AI in colonoscopy improves ADR, PDR, and APC, potentially reducing the incidence of interval colorectal cancer.
Keywords: Adenoma, Artificial intelligence, Colonoscopy, Colorectal neoplasms
Colorectal cancer (CRC) is a significant contributor to global morbidity and mortality. In 2019, it accounted for 2.17 million cases and 1.09 million deaths, while its incidence has more than doubled over the last 10 years.1 Risk factors for CRC include both modifiable and non-modifiable characteristics. Non-modifiable factors include age, family history of CRC, genetic mutations, race, and a history of inflammatory bowel disease. Modifiable factors include diet, smoking, alcohol intake, physical inactivity, and high body mass index. Most CRC lesions arise from adenomatous polyps or sessile serrated lesions. The adenoma-carcinoma pathway accounts for 60%–70% of all CRCs, while the serrated pathway produces about 15%–30% of CRC lesions. These premalignant lesions exhibit identifiable features on colonoscopy.2
CRC mortality can be reduced by addressing modifiable risk factors and employing various screening methods. These include stool-based tests, semi-invasive radiographic methods, and direct visualization of the distal or entire colon through sigmoidoscopy or colonoscopy.2 While colonoscopy is highly sensitive in detecting precancerous and cancerous lesions, its effectiveness can vary, with overlooked cases potentially developing into post-colonoscopy CRC (PCCRC) or interval CRC.3 A recent meta-analysis reported a 26% rate of missed adenomas during colonoscopy, which can be attributed to limitations in mucosal exposure and endoscopist recognition.4 Quality measures have been developed to improve lesion detection, such as the adenoma detection rate (ADR) and the cecal intubation rate. A minimum ADR of 30% for men and 20% for women is recommended, with each 1% increase in ADR reportedly resulting in a 3% decline in PCCRC incidence.5,6
Artificial intelligence (AI) using deep learning shows promise in advancing colonoscopy practice by autonomously analyzing image data to identify patterns. Computer-aided detection (CADe) and computer-aided diagnosis (CADx) are key applications of AI in this field.7 AI also aids in polyp detection, classification, screening, and surveillance, potentially reducing healthcare costs by eliminating the need to remove low-risk polyps. AI may also be beneficial in medical education.8–10 Physician sentiment toward AI is generally positive, and recent clinical trials have broadened the evidence base for the impact of AI on adenoma detection. Given the importance of screening for CRC, the role of AI in colonoscopy is becoming increasingly pivotal. This technology can address the known stages of CRC development, the time required for CRC to develop, and the available procedures for detecting and removing polyps. Consequently, it presents a considerable opportunity to advance CRC prevention and early detection through screening. CADe systems improve adenoma detection, instilling confidence in physicians. This systematic review aims to evaluate the impact of AI-assisted colonoscopy (AIAC) on ADR. The study also explores the effects of AIAC on withdrawal time, polyp detection rate (PDR), and adenomas per colonoscopy (APC).
The application of AI in gastrointestinal endoscopy, particularly colonoscopy, has garnered considerable interest in recent years. Accordingly, this review was conducted, adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.11
A comprehensive literature search was conducted across the PubMed, MEDLINE, and Cochrane Library databases using specific search criteria. A filter for “trials” was applied without publication date limitations. The search terms included “Colonoscopy,” “Adenoma,” “AI,” “polyp,” “ADR,” “CADe,” “Colon,” “Colorectal,” “Deep learning,” and other relevant terms. Additional articles were obtained from the references of the identified studies.
The inclusion criteria for this review were participants aged 18 years or older who had undergone colonoscopy. Studies eligible for inclusion were required to be published in English or translated into English, utilize a randomized controlled trial (RCT) design, and include an intervention involving AIAC. Articles published solely as abstracts without detailed data were excluded from the analysis. These criteria were established to ensure a thorough and rigorous examination of relevant research on the effects of AIAC in randomized controlled environments, emphasizing studies with adequate detail for a substantive analysis.
The selection process involved multiple stages to ensure the inclusion of relevant studies. Following the removal of duplicates, titles and abstracts were screened to identify potentially eligible articles. Subsequently, the full texts of these articles were retrieved for a more detailed evaluation based on the inclusion and exclusion criteria. A primary investigator was responsible for data extraction. To capture relevant study details, a data abstraction form was employed, which included information such as the study title, first author, publication year, study design, AI methodology, patient demographics (number, mean age, sex, and Boston Bowel Preparation Scale), characteristics of the endoscopists, ADR, PDR, and withdrawal time.
A risk of bias assessment was conducted to ensure the validity and quality of the included studies. The Cochrane risk of bias tool was utilized for RCTs to identify potential sources of bias that could affect the reliability and generalizability of the results. Studies with a high risk of bias were excluded.
Before initiating this study, we obtained ethical approval from the Subgroup and Faculty Research Ethics Committee at the University of South Wales.
Statistical analysis was performed using SPSS version 21 (IBM Corp.), with a
The initial search strategy identified 633 studies. Of these, 51 were excluded due to being duplicates. The remaining 582 studies underwent screening for inclusion in the analysis, resulting in the exclusion of 565. The full details for one study could not be retrieved. Ultimately, 16 studies were evaluated for eligibility. Further scrutiny of the references for these studies did not yield any additional studies for inclusion. Therefore, only these 16 studies were included in the analysis (Fig. 1).
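The selection counts above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers stated in this paragraph; the variable names are illustrative and not part of the original analysis.

```python
# Counts as stated in the study-selection paragraph.
identified = 633             # records returned by the initial search
duplicates = 51              # removed as duplicates
excluded_on_screening = 565  # excluded during screening
not_retrieved = 1            # full details could not be retrieved

# Records left after de-duplication, then after screening and retrieval.
screened = identified - duplicates
assessed = screened - excluded_on_screening - not_retrieved

print(f"screened: {screened}, assessed for eligibility: {assessed}")
```

Both derived values agree with the figures reported in the text (582 screened, 16 evaluated for eligibility).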
The 16 studies that met the inclusion criteria involved a total of 13,685 participants, with 6,820 undergoing AIAC and 6,865 undergoing standard colonoscopy. These studies were conducted between 2017 and 2022. The smallest study had a sample size of 223 participants, while the largest included 3,059 participants. Seven studies were conducted in China,14–20 three in Italy,21–23 and one each in England,24 the United States,25 Israel,26 Thailand,27 Latvia,28 and Spain.29 All but six of the studies were single-center in design, and five were not blinded. No significant differences were found between groups undergoing AIAC or standard colonoscopy. Each study employed a convolutional neural network machine learning approach within its CADe system. Tables 1 and 2 provide a more detailed summary of the study and sociodemographic characteristics.14–29 The overall risk of bias among the included studies was low, as indicated in Figs. 2 and 3.14–29
Table 1. Characteristics of the Included Studies.
Author | Publication year | Country | AI use | No. of endoscopists | Endoscopist experience | Screening colonoscopy (%) |
---|---|---|---|---|---|---|
Liu et al14 | 2020 | China | Withdrawal | - | - | 66 |
Repici et al22 | 2022 | Italy | Withdrawal and insertion | 10 | Non-expert | 29 |
Liu et al15 | 2020 | China | Withdrawal | 11 | Both | 23 |
Yao et al16 | 2022 | China | Withdrawal | 4 | Expert | 89 |
Wang et al17 | 2020 | China | Withdrawal | 4 | Expert | 16 |
Wang et al18 | 2019 | China | Withdrawal | 8 | Both | 8 |
Wang et al19 | 2023 | China | Withdrawal | 8 | Expert | 17 |
Xu et al20 | 2023 | China | Withdrawal and insertion | 12 | Both | 100 |
Lachter et al26 | 2023 | Israel | Withdrawal and insertion | 7 | Expert | 32 |
Rondonotti et al21 | 2022 | Italy | Withdrawal and insertion | 21 | Expert | 0 |
Repici et al23 | 2020 | Italy | Withdrawal and insertion | 6 | Expert | 22 |
Gimeno-García et al29 | 2023 | Spain | Withdrawal | 8 | Expert | 40 |
Vilkoite et al28 | 2023 | Latvia | Withdrawal | 2 | Expert | - |
Aniwan et al27 | 2023 | Thailand | Withdrawal and insertion | 17 | Both | 89 |
Ahmad et al24 | 2023 | England | Withdrawal and insertion | 8 | Expert | 0 |
Glissen Brown et al25 | 2022 | USA | Withdrawal | - | Expert | 60 |
AI, artificial intelligence.
Table 2. Sociodemographic Characteristics of the Populations Analyzed in the Included Studies.
Author | Patients, total (AI:C) | Age (yr), AI, mean ± SD or mean (range) | Age (yr), C, mean ± SD or mean (range) | Male, AI, n (%) | Male, C, n (%) |
---|---|---|---|---|---|
Liu et al14 | 1,026 (508:518) | 51.0 ± 12.3 | 50.1 ± 12.7 | 264 (52) | 287 (55) |
Repici et al22 | 660 (330:330) | 61.9 ± 9.8 | 62.6 ± 10.2 | 174 (53) | 156 (47) |
Liu et al15 | 790 (393:397) | 49.8 ± 13.1 | 48.8 ± 13.0 | 180 (46) | 194 (49) |
Yao et al16 | 539 (268:271) | 50.7 ± 13.2 | 50.9 ± 13.6 | 121 (45) | 114 (42) |
Wang et al17 | 962 (484:478) | 49 (39.0–60.0) | 49 (40.3–56.0) | 241 (50) | 254 (53) |
Wang et al18 | 1,058 (522:536) | 51.1 ± 13.2 | 49.9 ± 13.8 | 263 (50) | 249 (46) |
Wang et al19 | 1,261 (636:625) | 46 (36.75–54.00) | 47 (37.00–55.00) | 364 (57) | 326 (52) |
Xu et al20 | 3,059 (1,519:1,540) | 57.49 ± 7.55 | 57.03 ± 7.43 | 707 (47) | 728 (47) |
Lachter et al26 | 674 (330:344) | 61.0 ± 9.95 | 60.8 ± 9.79 | - | - |
Rondonotti et al21 | 800 (405:395) | 62 (56–68) | 61 (55–67) | 213 (53) | 196 (50) |
Repici et al23 | 685 (341:344) | 61.5 ± 9.7 | 61.1 ± 10.6 | 169 (50) | 179 (52) |
Gimeno-García et al29 | 312 (155:157) | 62.99 ± 10.26 | 64.71 ± 11.79 | 82 (53) | 83 (53) |
Vilkoite et al28 | 400 (196:204) | 50.1 ± 15.4 | 51.2 ± 14.5 | 91 (47) | 102 (50) |
Aniwan et al27 | 622 (312:310) | 62.8 ± 6.82 | 62.0 ± 6.82 | 133 (43) | 133 (43) |
Ahmad et al24 | 614 (308:306) | 66.2 ± 5.4 | 66.4 ± 5.4 | 110 (36) | 98 (32) |
Glissen Brown et al25 | 223 (113:110) | 61.18 ± 9.83 | 60.51 ± 8.45 | 54 (48) | 68 (62) |
AI, artificial intelligence; C, control; SD, standard deviation.
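As a consistency check on Table 2, the per-study arm sizes can be summed and compared with the totals quoted in the text (6,820 AIAC, 6,865 standard colonoscopy, 13,685 overall). The sketch below transcribes the patient counts from the table; the helper name `mismatched_rows` is hypothetical and used only for illustration.

```python
# (author, reported total, AI arm, control arm), transcribed from Table 2.
TABLE_2 = [
    ("Liu et al (ref. 14)", 1026, 508, 518),
    ("Repici et al (ref. 22)", 660, 330, 330),
    ("Liu et al (ref. 15)", 790, 393, 397),
    ("Yao et al (ref. 16)", 539, 268, 271),
    ("Wang et al (ref. 17)", 962, 484, 478),
    ("Wang et al (ref. 18)", 1058, 522, 536),
    ("Wang et al (ref. 19)", 1261, 636, 625),
    ("Xu et al (ref. 20)", 3059, 1519, 1540),
    ("Lachter et al (ref. 26)", 674, 330, 344),
    ("Rondonotti et al (ref. 21)", 800, 405, 395),
    ("Repici et al (ref. 23)", 685, 341, 344),
    ("Gimeno-García et al (ref. 29)", 312, 155, 157),
    ("Vilkoite et al (ref. 28)", 400, 196, 204),
    ("Aniwan et al (ref. 27)", 622, 312, 310),
    ("Ahmad et al (ref. 24)", 614, 308, 306),
    ("Glissen Brown et al (ref. 25)", 223, 113, 110),
]


def mismatched_rows(rows):
    """Return the rows whose two arms do not add up to the reported total."""
    return [row for row in rows if row[2] + row[3] != row[1]]


ai_total = sum(row[2] for row in TABLE_2)
control_total = sum(row[3] for row in TABLE_2)

print("rows with inconsistent totals:", mismatched_rows(TABLE_2))
print("AI arm:", ai_total, "control arm:", control_total,
      "overall:", ai_total + control_total)
```

With the counts as transcribed, every row is internally consistent and the arm totals match those reported in the text.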
Based on data from the 16 included studies, the overall ADR was significantly higher in the AIAC group compared to the control group (2,753/6,820 [40.4%] vs. 2,188/6,865 [31.9%]; relative risk [RR] = 1.26; 95% confidence interval [CI], 1.19–1.33;
Fifteen studies compared the APC between standard colonoscopy and AIAC. The APC was significantly higher with AIAC compared to standard white-light colonoscopy (odds ratio = 1.44; 95% CI, 1.35–1.54;
The quality of evidence was evaluated using the GRADE methodology. The evidence level for the RCTs was downgraded due to the moderate quality of the trials, variability among endoscopists, differences in indications for colonoscopy, and diversity of primary outcomes assessed across the studies.
The objective of this study was to assess the impact of AIAC on adenoma detection. A comprehensive analysis of 16 RCTs that met the inclusion criteria for the meta-analysis revealed significantly superior ADR, PDR, and APC with the use of AI. The findings indicated a 26% relative increase in ADR, reflecting a substantial improvement. AIAC was also associated with a 30% higher PDR. These outcomes support the integration of AIAC into clinical practice. Importantly, ADR serves as a crucial predictor of interval CRC following screening colonoscopy. All studies included in this review reported an ADR exceeding 15%, which aligns with the recommendations of the British Society of Gastroenterology (BSG). The BSG suggests a minimum ADR of 15% and an aspirational ADR of 20%, which is associated with a reduced incidence of interval CRC.30 Notably, only one study failed to achieve a 15% ADR in the standard colonoscopy group.
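The size of the ADR effect can be sanity-checked from the aggregate counts reported in the results (2,753/6,820 with AIAC vs. 2,188/6,865 with standard colonoscopy). The sketch below computes a crude, unweighted risk ratio with a log-scale 95% confidence interval, plus the absolute risk difference. It is not the random-effects model used in the meta-analysis, so it need not reproduce the pooled estimate exactly; the function name `crude_risk_ratio` is an assumption made for illustration.

```python
import math


def crude_risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Crude risk ratio of group A vs. group B with a log-scale Wald interval.

    Returns (risk_a, risk_b, rr, lower, upper). This pools raw counts and
    ignores between-study weighting, unlike a random-effects meta-analysis.
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rr = risk_a / risk_b
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return risk_a, risk_b, rr, lower, upper


# Aggregate adenoma-detection counts as reported in the results.
adr_ai, adr_control, rr, ci_low, ci_high = crude_risk_ratio(2753, 6820, 2188, 6865)
risk_difference = adr_ai - adr_control

print(f"ADR with AIAC: {adr_ai:.1%}, ADR with standard colonoscopy: {adr_control:.1%}")
print(f"crude RR: {rr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
print(f"absolute difference: {risk_difference:.1%}")
```

The crude ratio lies close to the pooled RR of 1.26, which suggests that the headline estimate is not driven by the study weighting alone.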
Notably, the indication for colonoscopy in these studies was not consistent. Most of the patients did not undergo colonoscopy for screening purposes. Additionally, some of the included studies were conducted in tandem settings, making it difficult to attribute the detection rates solely to failures in recognizing lesions. The higher ADR associated with AIAC in this review may be due to the increased capacity to detect previously missed or unrecognized lesions through AI. This underscores the potential of AI as a valuable adjunct for colonoscopy, particularly for less experienced endoscopists.10 The observed results for ADR and PDR suggest that AIAC improves the identification of precancerous lesions and polyps, facilitating earlier diagnosis and intervention. This represents a major development in CRC prevention, as it could reduce the incidence of advanced-stage CRC and ultimately save lives. The improved diagnostic accuracy and efficiency provided by AI technology enable endoscopists to make more informed decisions during the procedure, improving patient outcomes and reducing the burden on healthcare systems. Furthermore, the impact of AI on ADR extends beyond merely increasing the number of adenomas detected during colonoscopy. The capacity of AI algorithms to assist endoscopists in targeting specific areas of interest may decrease the likelihood of overlooking adenomas. This leads to more thorough examinations and an increase in the overall yield of APC.
The pooled estimates from this study indicated a 44% increase in APC, which may be due to the superior detection of small or diminutive lesions by AIAC. Compared to ADR, APC offers a slightly better assessment of the quality of examination of the entire colon and provides a degree of differentiation among endoscopists. However, a potential downside is the increased cost, particularly if endoscopists are required to remove all polyps during the procedure.6 Still, the potential benefits of APC have been previously documented and are being closely considered in colonoscopy quality improvement programs, where a higher APC could signify superior quality of endoscopic examination.31
The impact of AI on APC is closely linked to its effects on ADR and PDR. By improving the detection of adenomas and polyps, AI technology can increase the overall adenoma yield during colonoscopy, as reflected by APC. This improved APC metric is essential in CRC prevention, as it reflects the effectiveness of colonoscopy in identifying and removing precancerous lesions. A higher APC indicates a more thorough and successful colon examination, reducing the risk of missed adenomas and improving patient outcomes.
This study contributes to the growing body of literature suggesting that the quality metrics of colonoscopy can be enhanced by incorporating CADe devices. Despite limitations, previous research has demonstrated the impact of AIAC. These studies have consistently reported a benefit to ADR associated with AIAC, although the extent of this difference has varied. Some of these meta-analyses incorporated a relatively small sample size, employed different analytical methods, and included retrospective studies, which generally provide a lower level of evidence.32–34
Interest in AI within medicine has grown, and the technology presents substantial potential for application. In endoscopic practice, its utility could be considerable, with the capacity to influence clinically relevant outcomes. This systematic review and meta-analysis offers evidence that the additional use of AI in standard colonoscopy may improve lesion detection. Such advancements could decrease the incidence of CRC and refine clinical practice. Future studies should evaluate the impact of the overdiagnosis of smaller lesions, including evaluations across diverse populations, and examine the differential effects on both high and low detectors. Moreover, the more frequent implementation of tandem colonoscopy could provide a more accurate determination of the rate of missed lesions. Overall, the existing evidence is promising and underscores the considerable impact of AI on the practice of colonoscopy.
The present study has several strengths. It exclusively incorporated RCTs, thereby minimizing the risk of bias. A total of 16 trials were selected for analysis, representing a valuable addition to the existing research and providing a larger and more diverse aggregate sample size for evaluation and comparison. This study is among the first to use solely randomized data to demonstrate the impact of AI on the practice of colonoscopy. Our review adhered to the Cochrane and GRADE guidelines and included an extensive systematic search of multiple databases.
Despite the potential benefits of AIAC, challenges and limitations exist in its widespread implementation. Technical issues, such as false positives and false negatives, require ongoing refinement to improve diagnostic accuracy. Moreover, integrating AI into clinical practice necessitates appropriate training for endoscopists to effectively interpret AI-generated data during colonoscopy procedures. Furthermore, the present meta-analysis has several limitations. Notably, seven of the 16 included trials were conducted in China, which limits the generalizability of the findings due to varying epidemiological patterns. The per-polyp analysis does not account for additional patient and lesion characteristics, and some trials lacked essential data, potentially introducing bias. The absence of information regarding outcome areas, screening proportions, and baseline parameters further complicates the analysis. Furthermore, factors that could influence ADR, such as patient-related or image-related variables, were not considered, potentially impacting this measurement.
The consistent improvements in ADR, PDR, and APC observed with AIAC have meaningful clinical implications for gastroenterology and CRC screening. First, AI algorithms are valuable tools for endoscopists, providing real-time CADe and CADx, alerting them to suspicious regions, and increasing confidence in identifying adenomas and polyps during colonoscopy. Second, the superior ADR and APC suggest that AI assistance can contribute to the early detection and intervention of precancerous lesions, potentially reducing the incidence and mortality associated with CRC. The role of AI in improving CRC screening highlights its importance in public health initiatives aimed at combating this common and preventable form of cancer. Furthermore, the increased PDR achieved with AI support may lead to more frequent and effective surveillance of patients at high risk, facilitating targeted and personalized strategies in the detection of various types of polyps.8,35
We are grateful to the staff at the Faculty of Life Science and Education, University of South Wales, for their support.
None.
All data generated or analyzed during this study are included in this article. The datasets used and/or analyzed in this study are available from the corresponding author upon reasonable request.
No potential conflict of interest relevant to this article was reported.