This retrospective study assembled an international cohort of 6206 leukemia patients from 20 centers to test and refine an artificial intelligence (AI) tool designed to support leukemia diagnosis using standard laboratory results. The goal was to address health disparities by potentially improving access to diagnosis. The pretrained algorithm was executed on this diverse cohort, yielding varying accuracy metrics. When a confidence cutoff was applied to predictions, the 2000-fold bootstrapped area under the receiver operating characteristic curve (AUROC) was 0.94 for acute myeloid leukemia (AML), 0.98 for the promyelocytic subtype, and 0.84 for acute lymphoblastic leukemia (ALL). However, this confidence cutoff excluded a substantial proportion of patients from receiving predictions, ranging from 70.8% to 92.5%. To improve the tool's utility, the researchers enhanced its accuracy and robustness while maintaining generalizability. They implemented an ensemble method combining Isolation Forest and Local Outlier Factor. This refinement increased the AUROC for AML from 0.72 to 0.84 on a hold-out test set restricted to patients who fell below the initial confidence threshold. Importantly, the improved model excluded only 12.1% of patients from predictions, a substantial reduction from the earlier exclusion rates. In addition, the algorithm was retrained specifically for pediatric patients. The study demonstrates a process of international testing and iterative refinement of an AI diagnostic support tool, showing that modifications can substantially reduce the rate of excluded patients while improving performance for a subset of cases.
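To make the confidence-cutoff evaluation concrete, the sketch below shows one way to compute a bootstrapped AUROC only for patients on whom the model is confident, while tracking how many patients are excluded. The cutoff value, function name, and resampling details are illustrative assumptions, not the study's exact protocol.

```python
# Minimal sketch of confidence-gated, bootstrapped AUROC evaluation.
# The 0.9 cutoff and the 2000 resamples are assumptions for illustration.
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrapped_auroc(y_true, y_prob, confidence_cutoff=0.9, n_boot=2000, seed=0):
    """AUROC over confident predictions only, with bootstrap resampling.

    y_true : binary labels (1 = target diagnosis, e.g. AML)
    y_prob : predicted probability of the target class for each patient
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    rng = np.random.default_rng(seed)

    # Keep only patients on whom the model is confident either way.
    confident = np.maximum(y_prob, 1.0 - y_prob) >= confidence_cutoff
    excluded_fraction = 1.0 - confident.mean()
    y_t, y_p = y_true[confident], y_prob[confident]

    # Resample the retained patients with replacement and average the AUROC.
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_t), size=len(y_t))
        if len(np.unique(y_t[idx])) < 2:
            continue  # a resample containing a single class has no defined AUROC
        scores.append(roc_auc_score(y_t[idx], y_p[idx]))

    return float(np.mean(scores)), float(excluded_fraction)
```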
Imagine having symptoms that could be leukemia, but you can't get to a specialist for the complex tests needed to confirm it. This is a reality for many people around the world. A team looked at whether an artificial intelligence (AI) tool could help by using just the results from standard blood tests to predict the type of leukemia a person might have. They tested it on a large, diverse group of over 6,200 patients from 20 different centers worldwide. The tool was very good at spotting certain types, like acute myeloid leukemia and a specific subtype called promyelocytic leukemia. But there was a big catch: to get that high accuracy, the tool had to refuse to make a prediction for the vast majority of patients, between 71% and 93% of the time. That's not very helpful for doctors. So, they refined the tool using a different method. This new version was less likely to refuse a prediction, excluding only about 12% of patients, and its accuracy for spotting acute myeloid leukemia in those uncertain cases improved. They also retrained the tool specifically for children. The work shows that AI could one day be a useful support tool, helping more people get a faster initial indication of their condition using tests they can already get.
What this means for you: An AI tool can predict leukemia types from basic lab work, and after refinement it can offer a prediction to far more patients instead of declining the uncertain cases.
Original Abstract:
Despite advances for patients with acute leukemia, health disparities limit access to diagnosis and treatment. Artificial intelligence (AI) approaches may address some disparities. We retrospectively assemble a diverse, international cohort of 6206 leukemia patients from 20 centers to test an AI tool designed to support leukemia diagnosis using standard laboratory results. Executing the pretrained algorithm results in varying accuracy metrics. With confidence-cutoff predictions, 2000-fold bootstrapped area under the receiver operating characteristic curve (AUROC) metrics are 0.94 for acute myeloid leukemia (AML), 0.98 for the promyelocytic subtype, and 0.84 for acute lymphoblastic leukemia. However, this cutoff excludes 70.8-92.5% of patients from predictions. We improve accuracy and robustness, while maintaining generalizability, via an ensemble of Isolation Forest and Local Outlier Factor, increasing AUROC for AML from 0.72 to 0.84 (hold-out test set, patients below confidence threshold) while excluding only 12.1% of patients. Furthermore, we retrain the algorithm for pediatric patients.
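For readers curious how an ensemble of Isolation Forest and Local Outlier Factor might gate predictions on atypical lab profiles, here is a minimal scikit-learn sketch. The class name, the contamination setting, and the rule of excluding a patient only when both detectors flag them are assumptions for illustration, not the authors' published implementation.

```python
# Minimal sketch of an Isolation Forest + Local Outlier Factor "gate".
# Names, parameters, and the combination rule are illustrative assumptions.
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

class OutlierGate:
    """Withholds predictions for patients whose lab profiles look atypical."""

    def __init__(self, contamination=0.05, random_state=0):
        self.iforest = IsolationForest(contamination=contamination,
                                       random_state=random_state)
        # novelty=True lets LocalOutlierFactor score previously unseen patients.
        self.lof = LocalOutlierFactor(contamination=contamination, novelty=True)

    def fit(self, X_train):
        # Both detectors learn what typical lab-value profiles look like.
        self.iforest.fit(X_train)
        self.lof.fit(X_train)
        return self

    def keep_mask(self, X):
        # predict() returns +1 for inliers and -1 for outliers.
        inlier_if = self.iforest.predict(X) == 1
        inlier_lof = self.lof.predict(X) == 1
        # Exclude a patient only when both detectors call the profile an outlier.
        return inlier_if | inlier_lof

# Hypothetical usage: the downstream leukemia classifier would only issue
# predictions for patients retained by the gate.
# gate = OutlierGate().fit(X_train)   # X_train: lab features of the training cohort
# mask = gate.keep_mask(X_new)        # X_new: lab features of new patients
# confident_predictions = classifier.predict(X_new[mask])
```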