Gastroenterologist against the machine - opportunities and limitations of machine learning models for prediction of advanced adenoma

Semmler, Georg; Wernly, Sarah; Wernly, Bernhard; Mamandipoor, Behrooz; Bachmayer, Sebastian; Semmler, Lorenz; Aigner, Elmar; Datz, Christian; Osmani, Venet

doi:10.1055/s-0041-1734270

Background & Aims: Screening for colorectal cancer (CRC) relies on colonoscopy and/or fecal occult blood testing, while other (non-invasive) risk-stratification systems have not been implemented in European guidelines. Here, we evaluated the potential of machine learning (ML) methods to optimize prediction of advanced adenoma (AA).

Patients & Methods: 5862 individuals participating in a screening program for colorectal cancer were included after excluding patients with a history of CRC, symptomatic patients and those with insufficient colonoscopy. Adenomas were diagnosed histologically, with AA defined as size ≥1 cm or the presence of high-grade dysplasia or villous features. Clinical, laboratory and lifestyle parameters were assessed at the time of colonoscopy. Logistic regression (LR) and extreme gradient boosting (XGBoost) were evaluated for AA prediction based on readily available laboratory, clinical and lifestyle parameters. The dataset was divided into a derivation cohort (for model development and internal cross-validation) and an external validation cohort.

Results: The mean age was 58.7±9.7 years, with 2811 males (48.0 %). 1404 (24.0 %) suffered from obesity (BMI ≥30 kg/m²), 871 (14.9 %) from diabetes, and 2095 (39.1 %) from the metabolic syndrome. Any adenoma was detected in 1884 (32.1 %) and any AA in 437 (7.5 %). 659 individuals (11.2 %) had a first-degree relative with a history of CRC. Modelling 36 laboratory parameters, 8 clinical parameters and data on 8 food types/dietary patterns, XGBoost (AUC of 0.66-0.68) and LR (AUC of 0.65-0.66) achieved moderate accuracy in predicting AA. Limiting variables to established risk factors for AA did not significantly improve performance. Subgroup analyses in subjects without genetic predisposition and gender-specific analyses showed similar results.

Conclusion: ML, based on point-prevalence laboratory and clinical information, does not accurately predict AA. Non-invasive risk prediction seems insufficient to replace current CRC screening programs. However, the potential for sequential application before colonoscopy to increase pre-test probability warrants further investigation.
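
As a reading aid for the AUC values and the pre-test probability argument in the abstract, the following is a minimal Python sketch and not the authors' analysis code. It computes an AUC as the probability that a randomly chosen case with AA is scored higher than a randomly chosen case without AA, and it applies Bayes' theorem to the AA prevalence reported in the abstract (437 of 5862). The function names, the toy labels and scores, and the sensitivity and specificity values are assumptions chosen purely for illustration; none of them come from the study.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    Quadratic in the number of cases, which is fine for a small example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def post_test_probability(prevalence: float,
                          sensitivity: float,
                          specificity: float) -> float:
    """Probability of disease after a positive result (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Toy data, invented for illustration only (not taken from the study).
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.45]
toy_auc = auc(toy_labels, toy_scores)

# AA prevalence as reported in the abstract: 437 of 5862 participants.
aa_prevalence = 437 / 5862

# Hypothetical operating point; the abstract reports AUCs only and gives
# no sensitivity or specificity, so these two values are assumptions.
ppv = post_test_probability(aa_prevalence, sensitivity=0.60, specificity=0.65)

print(f"toy AUC: {toy_auc:.4f}")
print(f"pre-test probability: {aa_prevalence:.3f}, post-test: {ppv:.3f}")
```

Under the assumed operating point, a positive prediction raises the probability of AA from about 7.5 % to roughly 12 %, which illustrates why a model with moderate discrimination might be considered for enriching the population referred to colonoscopy rather than for replacing it.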
