Machine Learning Models Cannot Replace Screening Colonoscopy for the Prediction of Advanced Colorectal Adenoma
Mamandipoor, Behrooz; Osmani, Venet
2021-01-01
Abstract
Screening for colorectal cancer (CRC) continues to rely on colonoscopy and/or fecal occult blood testing, since other (non-invasive) risk-stratification systems have not yet been implemented in European guidelines. In this study, we evaluate the potential of machine learning (ML) methods to predict advanced adenomas (AAs) in 5862 individuals participating in a CRC screening program. Adenomas were diagnosed histologically; an AA was defined as an adenoma ≥ 1 cm in size or with high-grade dysplasia or villous features. Logistic regression (LR) and extreme gradient boosting (XGBoost) algorithms were evaluated for AA prediction. The mean age was 58.7 ± 9.7 years, and 2811 participants (48.0%) were male. Obesity (BMI ≥ 30 kg/m²) was present in 1404 (24.0%), diabetes in 871 (14.9%), and metabolic syndrome in 2095 (39.1%). An adenoma was detected in 1884 participants (32.1%) and an AA in 437 (7.5%). Models using 36 laboratory parameters, eight clinical parameters, and data on eight food types/dietary patterns achieved only moderate accuracy in predicting AAs with XGBoost and LR (AUC-ROC of 0.65–0.68). Limiting the variables to established risk factors for AAs did not significantly improve performance. Moreover, subgroup analyses in subjects without genetic predispositions, in individuals aged 45–80 years, and in gender-specific analyses showed similar results. In conclusion, ML based on point-prevalence laboratory and clinical information does not accurately predict AAs.

File | Size | Format
---|---|---
Machine_Learning_Models_Cannot_Replace_Screening_Colonoscopy_for_the_Prediction_of_Advanced_Colorectal_Adenoma.pdf (open access; type: post-print document; license: Creative Commons) | 2.43 MB | Adobe PDF
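To make the AUC-ROC of 0.65–0.68 reported in the abstract more concrete, the following is a minimal sketch of how that metric can be computed and what a value in that range looks like. It is not the authors' pipeline: the data are synthetic, the function names `auc_roc` and `simulate_cohort` are hypothetical, and the score separation of 0.6 standard deviations is an assumption chosen only for illustration. The cohort size (5862) and number of AAs (437) are taken from the abstract.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC via the rank-sum (Mann-Whitney U) formulation.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the block of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j

    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def simulate_cohort(n_total=5862, n_pos=437, separation=0.6, seed=0):
    """Synthetic risk scores: positives are shifted by `separation` SDs.

    The separation value is an illustrative assumption, not an estimate
    from the study.
    """
    rng = random.Random(seed)
    labels = [1] * n_pos + [0] * (n_total - n_pos)
    scores = [rng.gauss(separation if lbl else 0.0, 1.0) for lbl in labels]
    return labels, scores


labels, scores = simulate_cohort()
print(f"AUC-ROC on synthetic cohort: {auc_roc(labels, scores):.3f}")
```

With score distributions that overlap this heavily, the printed value falls well short of 1.0, which is the sense in which the abstract describes performance as only moderate: a model at this level ranks a case with an AA above a case without one only about two times out of three.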
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.