In a recent article published in Nature Medicine, researchers applied artificial intelligence (AI) methods to real-world longitudinal clinical data to design surveillance programs for the early detection of patients at elevated risk of one of the most aggressive diseases, pancreatic cancer.

Study: A deep learning algorithm to predict risk of pancreatic cancer from disease trajectories. Image Credit: Chinnapong/Shutterstock.com

Background

The incidence of pancreatic cancer is increasing, making it a leading cause of cancer-related deaths worldwide. It is challenging to diagnose pancreatic cancer due to a lack of understanding of its risk factors.

Late detection at advanced or distant metastatic stages hampers treatment and makes long-term survival rare: only 2% to 9% of such patients survive for five years.

While age is a recognized risk factor for pancreatic cancer, age-based population-wide screening is impractical due to the high cost of clinical testing, which also yields false-positive results.

In addition, family history data or genetic risk factors for the general population are often unavailable. Thus, there is an urgent need to develop affordable surveillance programs for the early detection of pancreatic cancer in the general population.

About the study

In the present study, researchers used real-world longitudinal clinical records from large numbers of patients to identify those at high risk of pancreatic cancer.

They exploited recently developed machine learning (ML) methods using patient records from the Danish National Patient Registry (DNPR) and, subsequently, the United States Veterans Affairs (US-VA) Corporate Data Warehouse (CDW).

The former comprised data for 8.6 million patients captured between 1977 and 2018, including 24,000 pancreatic cancer cases, whereas the latter held clinical data for three million patients, including 3,900 pancreatic cancer cases.

The team trained and tested a diverse array of ML models on the sequence of disease codes in the DNPR and US-VA clinical records to predict cancer occurrence within incremental time intervals, an approach termed CancerRiskNet.

In constructing predictive models, the team used the three-character category International Classification of Diseases (ICD) diagnostic codes and defined ‘pancreatic cancer patients’ as patients with at least one code under C25, indicating malignant neoplasm of the pancreas.
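The case definition above can be illustrated with a minimal Python sketch. It assumes ICD-10-style codes written with an optional dot (for example "C25.1"). The helper names `icd_category` and `is_pancreatic_cancer_patient` are hypothetical, and the sample history is invented for illustration. The registries' actual code formats and preprocessing are not described in this article.

```python
def icd_category(code: str) -> str:
    """Reduce an ICD-10-style code to its three-character category.

    Assumption: codes look like 'C25.1' or 'C251'; registry-specific
    prefixes or older ICD revisions are not handled here.
    """
    return code.strip().upper().replace(".", "")[:3]


def is_pancreatic_cancer_patient(codes) -> bool:
    """True if at least one code falls under category C25."""
    return any(icd_category(c) == "C25" for c in codes)


# Invented example history, not study data.
history = ["E11.9", "K86.1", "C25.0"]
print([icd_category(c) for c in history])      # ['E11', 'K86', 'C25']
print(is_pancreatic_cancer_patient(history))   # True
```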

The accuracy of cancer diagnosis disease codes was ~98%. Finally, the researchers flagged which diagnoses in a patient’s history of diagnosis codes were most informative of cancer risk to propose an ideal surveillance program.

Further, the researchers evaluated the prediction performance of the different models trained on the DNPR data using the area under the receiver operating characteristic curve (AUROC) and relative risk (RR). In addition, they reported ML-derived RR scores for patients with cancer in the high-risk group.
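For readers unfamiliar with the metric, the AUROC can be read as the probability that a randomly chosen patient who develops cancer receives a higher risk score than a randomly chosen patient who does not. The sketch below computes it by direct pairwise comparison. The function name `auroc` and the scores and labels are illustrative assumptions, not study data or the authors' code.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative scores.

    Ties count as half a win. Quadratic in the number of patients,
    so suitable only for small illustrative inputs.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented risk scores; label 1 = developed cancer, 0 = did not.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(auroc(scores, labels))  # 0.75 (3 of 4 positive/negative pairs ranked correctly)
```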

Results

Previous studies using real-world clinical records to predict pancreatic cancer risk yielded encouraging results but did not use the time sequence of disease histories to extract longitudinal features. In this study, the researchers also evaluated non-time-sequential models on the DNPR dataset for comparison.

Overall, the time-sequential Transformer model had the best performance for predicting cancer incidence within 36 months of the assessment date, with an AUROC of 0.879, followed closely by the gated recurrent unit (GRU) model with an AUROC of 0.852.

The RR for the Transformer model at an operational point defined by the n = 1,000 highest-risk patients out of one million patients was 104.7.
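One common way to compute a relative risk at such an operational point is to divide the cancer rate among the n highest-scoring patients by the cancer rate in the whole population. The sketch below follows that definition as an assumption. The study's exact formula may differ, and the function name `relative_risk_at_top_n` and the toy data are hypothetical.

```python
def relative_risk_at_top_n(scores, labels, n):
    """Cancer rate among the n highest-risk patients divided by the
    cancer rate in the full population (assumed RR definition)."""
    if n <= 0 or n > len(scores):
        raise ValueError("n must be between 1 and the number of patients")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    rate_top = sum(y for _, y in ranked[:n]) / n
    rate_all = sum(labels) / len(labels)
    if rate_all == 0:
        raise ValueError("no cancer cases in the population")
    return rate_top / rate_all


# Toy population of 10 patients with 2 cancer cases (invented data).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
# Top 2 patients contain 1 case: rate 0.5 versus 0.2 overall.
print(relative_risk_at_top_n(scores, labels, n=2))
```

In the toy example the two highest-risk patients are 2.5 times as likely to have cancer as a patient drawn at random, which is the same kind of ratio the reported figure of 104.7 expresses at a much larger scale.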

For predicting cancer occurrence within 36 months, the bag-of-words and multilayer perceptron (MLP) models achieved AUROCs of 0.807 and 0.845, respectively. However, their RRs were much lower than that of the Transformer (2.1 and 26.6 vs. 104.7).
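A bag-of-words representation counts how often each disease code appears and discards the order of diagnoses, which is the information a time-sequential model can exploit. The following sketch illustrates the difference under simple assumptions. The vocabulary, the two histories, and the helper `bag_of_words` are invented for illustration and do not reflect the study's actual feature encoding.

```python
from collections import Counter


def bag_of_words(codes, vocabulary):
    """Count occurrences of each vocabulary code, ignoring order."""
    counts = Counter(codes)
    return [counts[code] for code in vocabulary]


vocabulary = ["E11", "K85", "K86"]

# Two invented histories with the same diagnoses in opposite order.
history_a = ["E11", "K85", "K86"]
history_b = ["K86", "K85", "E11"]

# The bag-of-words vectors are identical ...
print(bag_of_words(history_a, vocabulary))  # [1, 1, 1]
print(bag_of_words(history_b, vocabulary))  # [1, 1, 1]

# ... while the ordered sequences, as seen by a sequential model, differ.
print(history_a == history_b)  # False
```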

Data exclusion, i.e., removing input disease diagnoses from the last three, six, or 12 months before the pancreatic cancer diagnosis, lowered the AUROC of the best model from 0.879 to 0.843, 0.829, and 0.827, respectively.
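The data-exclusion step can be sketched as a simple date filter. The example assumes each diagnosis is a (date, code) pair and approximates a month as 30 days. Both choices, along with the function name `exclude_recent` and the sample history, are illustrative assumptions rather than the authors' implementation.

```python
from datetime import date, timedelta


def exclude_recent(history, cancer_date, months):
    """Drop diagnoses recorded within `months` before the cancer diagnosis.

    Assumption: one month is approximated as 30 days.
    """
    cutoff = cancer_date - timedelta(days=30 * months)
    return [(d, code) for d, code in history if d < cutoff]


# Invented history of (diagnosis date, ICD category) pairs.
cancer_date = date(2015, 6, 1)
history = [
    (date(2010, 1, 15), "E11"),
    (date(2014, 9, 1), "K86"),
    (date(2015, 4, 20), "R10"),
]

for months in (3, 6, 12):
    kept = exclude_recent(history, cancer_date, months)
    print(months, [code for _, code in kept])
```

With longer exclusion windows, fewer of the diagnoses closest to the cancer date remain available to the model, which is consistent with the gradual drop in AUROC reported above.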

Applied to the 1,000 highest-risk patients per million, an ML model trained on data from both sources had a positive predictive value (PPV) of 0.32 for the 12-month prediction interval, meaning that about 320 of those patients would eventually develop pancreatic cancer.

While physicians might have identified some cases based on recognized pancreatic cancer risk factors, e.g., chronic pancreatitis, a fraction of these, nearly 70, would still be newly identified per a conservative approximation.
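The figures in the two paragraphs above follow from simple arithmetic, shown here as a short Python sketch. It assumes the PPV of 0.32 applies to a high-risk group of 1,000 patients, as in the operational point mentioned earlier. The variable names are illustrative.

```python
# Assumption: the PPV applies to the 1,000 highest-risk patients per million.
high_risk_group = 1000
ppv = 0.32
newly_identified = 70  # conservative approximation quoted in the article

expected_cases = round(high_risk_group * ppv)
share_new = newly_identified / expected_cases

print(expected_cases)       # 320 patients expected to develop pancreatic cancer
print(f"{share_new:.1%}")   # 21.9% of those would be newly identified
```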

Despite the use of common ICD disease codes and similar cancer survival, applying models trained on the DNPR data to the US-VA data lowered their performance, indicating the need for independent model training across geographical regions to attain regionally optimal model performance.

However, an ideal scenario for a multi-institutional collaboration to attain a globally relevant set of prediction rules would require federated learning across different healthcare systems.

Conclusions

The prediction accuracy of ML-based models described in this study could improve with the accessibility of data beyond disease codes, e.g., observations written in clinical notes, laboratory results, and genetic profiles of more people or health-related information from their wearable devices.

Clinical implementation of early diagnosis of pancreatic cancer would then require the identification of high-risk patients.

Since those at the highest risk are a small subset of a large population screened computationally, costly and refined clinical screening and intervention programs can be limited to a few patients.

Nonetheless, AI on clinical records from the real world could potentially shift the focus from late-stage treatment to early-stage cancer treatment, which, in turn, would substantially improve all patients’ quality of life while increasing the benefit-to-cost ratio of cancer care.