BACKGROUND & AIMS: Refining hepatocellular carcinoma (HCC) surveillance programs requires improved individual risk prediction. Thus, we aimed to develop algorithms based on machine learning approaches to predict the risk of HCC more accurately in patients with HCV-related cirrhosis, according to their virological status.
METHODS: Patients with compensated biopsy-proven HCV-related cirrhosis from the French ANRS CO12 CirVir cohort were included in a semi-annual HCC surveillance program. Three prognostic models for HCC occurrence were built, using (i) Fine-Gray regression as a benchmark, (ii) a single decision tree (DT), and (iii) a random survival forest for competing risks (RSF). Model performance was evaluated using C-indexes validated externally in the ANRS CO22 Hepather cohort (n = 668 enrolled between 08/2012 and 01/2014).
RESULTS: Out of 836 patients analyzed, 156 (19%) developed HCC and 434 (52%) achieved sustained virological response (SVR) (median follow-up 63 months). Fine-Gray regression models identified 6 independent predictors of HCC occurrence in patients before SVR (past excessive alcohol intake, genotype 1, elevated AFP and GGT, low platelet count and albuminemia) and 3 in patients after SVR (elevated AST, low platelet count and shorter prothrombin time). DT analysis confirmed these associations but revealed more complex interactions, yielding 8 patient groups with varying cancer risks and predictors depending on SVR achievement. On RSF analysis, the most important predictors of HCC varied by SVR status (non-SVR: platelet count, GGT, AFP and albuminemia; SVR: prothrombin time, ALT, age and platelet count). Externally validated C-indexes before/after SVR were 0.64/0.64 [Fine-Gray], 0.60/0.62 [DT] and 0.71/0.70 [RSF].
CONCLUSIONS: Risk factors for hepatocarcinogenesis differ according to SVR status. Machine learning algorithms can refine HCC risk assessment by revealing complex interactions between cancer predictors. Such approaches could be used to develop more cost-effective tailored surveillance programs.
LAY SUMMARY: Patients with HCV-related cirrhosis must be included in liver cancer surveillance programs, which rely on ultrasound examination every 6 months. Hepatocellular carcinoma (HCC) screening is hampered by sensitivity issues, leading to late cancer diagnoses in a substantial number of patients. Refining surveillance periodicity and modality using more sophisticated imaging techniques such as MRI may only be cost-effective in patients with the highest HCC incidence. Herein, we demonstrate how machine learning algorithms (i.e., data-driven mathematical models that make predictions or decisions) can refine individualized risk prediction.
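
ILLUSTRATIVE NOTE: The C-index used above to compare the three models measures how often a model ranks two patients in the correct order of risk. The following is a minimal sketch of Harrell's C-index for right-censored data, not the study's actual implementation. It deliberately ignores competing risks, which the study's models account for, and the function name, variable names and toy data are hypothetical.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (no competing risks).

    times       : observed follow-up times (e.g. months)
    events      : 1 if the event (e.g. HCC) was observed, 0 if censored
    risk_scores : model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored: ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients, two of them censored.
toy_times = [12, 30, 45, 60, 63]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.60 to 0.71 should be read.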