Researchers have developed TxGNN, an AI model that outperforms existing methods at predicting treatments for diseases lacking approved therapies and uses multi-hop explanations to provide greater transparency and trust.

Research: A foundation model for clinician-centered drug repurposing. Image Credit: unoL / Shutterstock

A recent study published in the journal Nature Medicine developed TxGNN, a graph-based foundation model for zero-shot drug repurposing. Only 5% to 7% of rare diseases have approved drugs. Expanding the use of existing drugs for new indications can help mitigate the global disease burden. Drug repurposing leverages existing safety and efficacy data, allowing faster clinical translation and reduced development costs.

Predicting drug efficacy against all diseases may allow for selecting drugs with fewer side effects, designing more effective treatments for several targets in a disease pathway, and repurposing available drugs for new therapeutic uses.

Drug effects can be matched to new indications by analyzing medical knowledge graphs (KGs). While computational methods have identified repurposing candidates, there are two significant challenges. First, these approaches assume that therapeutic predictions are needed for diseases that already have drugs.

Second, most models identify drugs based on their similarity to existing treatments, which fails to address diseases with no available treatments. For clinical use, machine learning models must make zero-shot predictions, i.e., predict drugs for diseases with limited molecular understanding and no approved drugs. However, existing models perform markedly worse in this setting.

TxGNN addresses this gap by implementing a zero-shot drug repurposing approach, using a GNN and a specialized disease-similarity-based metric learning module to transfer knowledge from treatable diseases to those without treatments.
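
To make the idea of transferring knowledge from treatable diseases concrete, here is a minimal sketch, not the TxGNN implementation. It assumes each disease has a hypothetical binary "signature" vector and a learned embedding, and it blends the embedding of a poorly characterized disease with a similarity-weighted average of its nearest neighbours. The function names, the cosine similarity, and the `k` and `mix` parameters are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def augment_embedding(query, signatures, embeddings, k=2, mix=0.5):
    """Blend a disease embedding with a similarity-weighted average of its k most similar diseases.

    `signatures` and `embeddings` are hypothetical dicts keyed by disease name.
    """
    sims = sorted(
        ((cosine(signatures[query], signatures[d]), d) for d in signatures if d != query),
        reverse=True,
    )[:k]
    total = sum(s for s, _ in sims)
    if total <= 0:
        # No similar disease found: fall back to the original embedding.
        return list(embeddings[query])
    dim = len(embeddings[query])
    neighbor_avg = [sum(s * embeddings[d][i] for s, d in sims) / total for i in range(dim)]
    return [(1 - mix) * q + mix * n for q, n in zip(embeddings[query], neighbor_avg)]


# Toy data (made up for illustration): the rare disease shares its signature with treated_a.
signatures = {
    "rare_disease": [1, 1, 0, 0],
    "treated_a": [1, 1, 0, 0],
    "treated_b": [0, 0, 1, 1],
}
embeddings = {
    "rare_disease": [0.0, 0.0],
    "treated_a": [1.0, 0.0],
    "treated_b": [0.0, 1.0],
}
augmented = augment_embedding("rare_disease", signatures, embeddings, k=1, mix=0.5)
print(augmented)
```

In this toy case the rare disease's embedding moves halfway toward that of its closest treated neighbour, which is the intuition behind borrowing signal from similar diseases.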

The study and findings

In the present study, researchers developed TxGNN, a graph foundation model for zero-shot drug repurposing that predicts repurposing candidates, including for diseases that currently lack treatments. TxGNN was composed of 1) a graph neural network (GNN)-based encoder, 2) a disease similarity-based metric learning decoder, 3) an all-relationship stochastic pretraining followed by fine-tuning, and 4) a multi-hop graph explanatory module.

TxGNN was trained on a medical KG, collating decades of research across 17,080 diseases. Further, a multi-hop TxGNN Explainer was developed to facilitate the interpretation of drug candidates by linking drug-disease pairs through interpretable medical knowledge paths. This explainer provides human experts with transparent, multi-hop explanations that foster trust in AI-generated predictions.
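
As an illustration of what a multi-hop knowledge path looks like, the sketch below runs a plain breadth-first search over a toy knowledge graph. It is not the TxGNN Explainer's method; the node names, relation labels, and the `max_hops` limit are hypothetical.

```python
from collections import deque


def shortest_path(edges, source, target, max_hops=3):
    """Return the shortest list of (head, relation, tail) hops from source to target, or None."""
    graph = {}
    for head, relation, tail in edges:
        graph.setdefault(head, []).append((relation, tail))
        graph.setdefault(tail, []).append((relation, head))  # treat edges as undirected
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) == max_hops:
            continue  # do not extend paths beyond the hop limit
        for relation, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, relation, nxt)]))
    return None


# Toy knowledge graph with made-up entities.
toy_edges = [
    ("drug_x", "targets", "gene_g"),
    ("gene_g", "associated_with", "disease_y"),
    ("drug_x", "treats", "disease_z"),
]
for hop in shortest_path(toy_edges, "drug_x", "disease_y"):
    print(hop)
```

A path such as drug, then gene, then disease is the kind of chain a human expert can inspect and check against medical knowledge.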

Model performance was evaluated across various holdout datasets. Each holdout dataset was generated by sampling diseases from the KG and omitting them during training so that they could be used later as test cases. The held-out diseases were either sampled at random or chosen specifically to evaluate zero-shot prediction.
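
A disease-level holdout of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's code: drug-disease links are represented as hypothetical `(drug, disease)` tuples, and `holdout_fraction` and `seed` are illustrative parameters.

```python
import random


def zero_shot_split(drug_disease_edges, holdout_fraction=0.2, seed=0):
    """Hold out whole diseases so that no drug link for them is seen during training."""
    diseases = sorted({disease for _, disease in drug_disease_edges})
    rng = random.Random(seed)
    n_holdout = max(1, round(len(diseases) * holdout_fraction))
    held_out = set(rng.sample(diseases, n_holdout))
    train = [edge for edge in drug_disease_edges if edge[1] not in held_out]
    test = [edge for edge in drug_disease_edges if edge[1] in held_out]
    return train, test, held_out


# Toy drug-disease links (made up for illustration).
toy_links = [
    ("drug_a", "disease_1"),
    ("drug_b", "disease_1"),
    ("drug_a", "disease_2"),
    ("drug_c", "disease_3"),
    ("drug_d", "disease_4"),
    ("drug_e", "disease_5"),
]
train_links, test_links, held_out_diseases = zero_shot_split(toy_links)
print(sorted(held_out_diseases), len(train_links), len(test_links))
```

The key design choice is that the split is made per disease rather than per link: every link for a held-out disease goes to the test set, so the model has no known treatment for that disease at training time.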

TxGNN was compared with eight state-of-the-art methods, including a natural-language processing model, BioBERT, GNN methods like HGT and HAN, and network medicine statistical techniques. Under the standard benchmarking strategy, where diseases in the test set already had some indications or contraindications during training, TxGNN outperformed the strongest method, HAN, by a margin of 4.3% in AUPRC (Area Under Precision-Recall Curve) for indications.
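
For readers unfamiliar with AUPRC, the sketch below computes average precision, one common estimator of the area under the precision-recall curve. Whether the study used exactly this estimator is an assumption; the labels and scores are toy values.

```python
def average_precision(labels, scores):
    """Average precision: mean of the precision values at the rank of each true positive."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("average precision is undefined without positive labels")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / positives


# Toy example: 1 marks a true drug-disease indication, 0 a non-indication.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(average_precision(toy_labels, toy_scores))
```

Because the metric rewards placing true indications near the top of the ranked list, it suits drug repurposing, where true drug-disease pairs are rare relative to all possible pairs.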

Next, the team evaluated models under zero-shot repurposing, wherein models were required to predict therapeutic candidates for diseases lacking treatments. In this case, TxGNN showed a 49.2% increase in AUPRC for drug indications and 35.1% for contraindications compared to the next-best model.

These gains are particularly significant because conventional models struggle in zero-shot settings, where no prior drug-disease relationships are available for training. TxGNN was also evaluated in stringent settings across nine disease areas, achieving AUPRC gains ranging from 0.5% to 59.3% for drug indications and 11.8% to 35.6% for contraindications.

Further, a pilot study was conducted with scientists and clinicians. Participants included two pharmacists, five clinicians, and five clinical researchers. They were asked to assess 16 TxGNN predictions, 12 of which were accurate.

Participants’ exploration time, assessment accuracy, and confidence scores were recorded for each prediction. Both accuracy and confidence improved significantly when predictions were accompanied by explanations. Moreover, in interviews and questionnaires administered post-task, participants reported greater satisfaction with the TxGNN Explainer, with 91.6% of participants agreeing that TxGNN predictions and explanations were valuable.

In contrast, 75% disagreed with relying on TxGNN predictions without explanations. Next, the team evaluated whether predicted drugs and their explanations align with medical reasoning for the following rare diseases: Kleefstra’s syndrome, Ehlers-Danlos syndrome, and nephrogenic syndrome of inappropriate antidiuresis (NSIAD).

This evaluation protocol included three stages. First, a human expert queried TxGNN to identify potential repurposable drugs. Next, TxGNN Explainer was queried to illustrate why the drug was considered. In the third stage, independent medical evidence was analyzed to verify TxGNN predictions and explanations.

The model identified zolpidem, tretinoin, and amyl nitrite for Kleefstra’s syndrome, Ehlers-Danlos syndrome, and NSIAD, respectively. In all cases, TxGNN explanations were consistent with medical evidence.

Real-world validation through EMRs

The researchers used electronic medical records (EMRs) from a health system to curate a cohort of over 1.2 million adults, each with at least one drug prescription and at least one recorded disease, and measured the enrichment of drug-disease co-occurrence. This validation compares the predictions of TxGNN with real-world clinical use.

Enrichment was estimated as the ratio of odds of using a drug for a disease to those of using it for other diseases. Overall, 619,200 log(odds ratio) [log(OR)] values were derived. TxGNN generated a ranked list of therapeutic candidates for each EMR-phenotyped disease.
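
The enrichment measure can be illustrated with a 2x2 contingency table. The sketch below is a minimal example under stated assumptions: the cell names are hypothetical, and the optional `pseudocount` (a common continuity correction for empty cells) is not taken from the study.

```python
import math


def log_odds_ratio(with_disease_on_drug, with_disease_no_drug,
                   other_on_drug, other_no_drug, pseudocount=0.5):
    """log(OR) of drug use among patients with the disease versus patients with other diseases.

    `pseudocount` is added to every cell to avoid division by zero (an assumption, not from the study).
    """
    a = with_disease_on_drug + pseudocount
    b = with_disease_no_drug + pseudocount
    c = other_on_drug + pseudocount
    d = other_no_drug + pseudocount
    return math.log((a / b) / (c / d))


# Toy counts: 30 of 100 patients with the disease use the drug, versus 10 of 100 other patients.
print(log_odds_ratio(30, 70, 10, 90, pseudocount=0))
```

A positive log(OR) means the drug is prescribed more often to patients with the disease than to other patients; a value near zero means no enrichment.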

Drugs already linked to the disease were omitted, and the remaining candidate drugs were classified as top-ranked, top five, top 5%, and bottom 50%. The top-ranked predicted drugs had about 107% higher log(OR) values on average than the mean log(OR) of the bottom 50% predictions, indicating that TxGNN’s predictions align well with off-label prescriptions made by clinicians.
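
The comparison between ranking buckets can be sketched for a single disease as follows. This is an illustrative simplification, not the study's analysis: it assumes a list of log(OR) values already ordered by the model's ranking, and the values are made up.

```python
from statistics import mean


def bucket_means(ranked_log_ors):
    """Mean log(OR) for the top-ranked, top five, top 5%, and bottom 50% of a ranked list."""
    n = len(ranked_log_ors)
    top_5pct = max(1, n * 5 // 100)  # at least one drug in the top 5% bucket
    return {
        "top_1": ranked_log_ors[0],
        "top_5": mean(ranked_log_ors[:5]),
        "top_5pct": mean(ranked_log_ors[:top_5pct]),
        "bottom_50pct": mean(ranked_log_ors[n // 2:]),
    }


def percent_higher(value, baseline):
    """Relative difference of value over baseline, in percent."""
    return 100.0 * (value - baseline) / abs(baseline)


# Toy log(OR) values, ordered from the model's best-ranked drug to its worst.
toy_ranked = [3.0, 2.5, 2.0, 2.0, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5]
buckets = bucket_means(toy_ranked)
print(buckets)
print(percent_higher(buckets["top_1"], buckets["bottom_50pct"]))
```

If the model's ranking carries real clinical signal, the higher buckets should show larger mean log(OR) values than the bottom half.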

Conclusions

In summary, the study developed TxGNN for zero-shot drug repurposing, specifically targeting diseases with limited data and few therapeutic options. TxGNN consistently outperforms existing methods, and its predictions match human experts’ medical consensus and align with off-label prescription rates in EMRs.

TxGNN’s multi-hop interpretable explanations provide a new level of transparency, fostering trust and enhancing the model’s integration into clinical workflows.