PolyDDI

Higher-order drug-drug interaction prediction using Graph Neural Networks. Predicting emergent adverse effects from 3+ drug combinations for safer polypharmacy.


What is PolyDDI?

Pairwise drug-drug interactions are well-studied, but patients on polypharmacy often take three or more drugs simultaneously. Emergent interactions — adverse effects that arise only when 3+ drugs are combined — remain largely unexplored computationally.

PolyDDI tackles this problem with a GNN-based pipeline built in PyTorch:

GraphSAGE Drug Encoder

Learns drug embeddings from 191K pairwise DDIs (DrugBank via TDC) over a graph of 1,706 drugs. Fine-tuning the encoder on the triplet signal raises hard-negative test AUROC from 0.585 (frozen encoder) to 0.763, a gain of +0.178.
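As a rough illustration of what a GraphSAGE-style encoder computes, the sketch below applies one mean-aggregator layer to a toy drug graph. It is a dependency-free stand-in rather than the project's PyTorch code: the function names, the weight matrices, the ReLU, and the drug identifiers `A`, `B`, `C` are all assumptions made for the example.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def mean_vector(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; returns zeros when a node has no neighbours."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def matvec(weight: Matrix, x: Vector) -> Vector:
    """Multiply a row-major matrix by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def sage_layer(
    features: Dict[str, Vector],
    neighbours: Dict[str, List[str]],
    w_self: Matrix,
    w_neigh: Matrix,
) -> Dict[str, Vector]:
    """One mean-aggregator layer: ReLU(W_self h_v + W_neigh mean(h_u))."""
    dim = len(next(iter(features.values())))
    out: Dict[str, Vector] = {}
    for node, h in features.items():
        agg = mean_vector([features[n] for n in neighbours.get(node, [])], dim)
        z = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, agg))]
        out[node] = [max(0.0, v) for v in z]
    return out


# Hypothetical 3-drug graph; edges stand for known pairwise DDIs.
toy_features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
toy_neighbours = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
identity = [[1.0, 0.0], [0.0, 1.0]]
toy_embeddings = sage_layer(toy_features, toy_neighbours, identity, identity)
```

In a trained encoder the two weight matrices would be learned, and several such layers would typically be stacked so that each drug's embedding reflects a wider neighbourhood of the DDI graph.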

FAERS Signal Mining

Mines emergent 3-way adverse event signals from 8 quarters of FDA FAERS reports (2023Q1–2024Q4) using Proportional Reporting Ratios with FDR correction.
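The sketch below shows how a Proportional Reporting Ratio and an FDR filter might fit together for one drug-combination and adverse-event pair. It is written under stated assumptions: the page does not say which significance test or FDR procedure is used, so a 1-degree-of-freedom chi-square test and the Benjamini-Hochberg procedure are swapped in here, and the report counts are hypothetical.

```python
import math
from typing import List


def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR from a 2x2 table of report counts.

    a: reports with the drug combination and the event
    b: reports with the combination, without the event
    c: reports without the combination, with the event
    d: reports without the combination, without the event
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for these counts")
    return (a / (a + b)) / (c / (c + d))


def chi_square_p_value(a: int, b: int, c: int, d: int) -> float:
    """P-value of the 2x2 chi-square test (1 degree of freedom, no correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("chi-square is undefined when a margin is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return a reject/keep flag per hypothesis at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank  # largest rank that passes its threshold
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        rejected[idx] = rank <= max_rank
    return rejected


# Hypothetical counts for one triplet-event pair.
example_prr = proportional_reporting_ratio(20, 80, 100, 9800)
example_p = chi_square_p_value(20, 80, 100, 9800)
```

In practice one p-value would be computed per candidate triplet-event pair and the whole list passed to `benjamini_hochberg`, so the filter accounts for the large number of combinations tested.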

Hard Negative Evaluation

Evaluation against hard negatives, each built by replacing one drug in a positive triplet, and validated with a permutation test. Test AUROC is 0.763 with hard negatives, versus an inflated 0.941 with random negatives.
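Hard-negative construction as described above can be sketched in a few lines. The version below assumes triplets are unordered sets of drug identifiers and that a candidate is rejected if it collides with any known positive. Both are assumptions, as are the function name, the retry limit, and the toy vocabulary.

```python
import random
from typing import FrozenSet, Sequence, Set

Triplet = FrozenSet[str]


def sample_hard_negative(
    positive: Triplet,
    vocabulary: Sequence[str],
    known_positives: Set[Triplet],
    rng: random.Random,
    max_tries: int = 100,
) -> Triplet:
    """Swap exactly one drug of a positive triplet for a different drug."""
    drugs = sorted(positive)
    for _ in range(max_tries):
        replacement = rng.choice(vocabulary)
        if replacement in positive:
            continue  # would shrink the set instead of replacing a drug
        candidate = list(drugs)
        candidate[rng.randrange(len(candidate))] = replacement
        negative = frozenset(candidate)
        if negative not in known_positives:
            return negative
    raise RuntimeError("could not find a hard negative; vocabulary too small?")


# Hypothetical vocabulary and positive triplet.
toy_vocabulary = [f"d{i}" for i in range(10)]
toy_positive = frozenset({"d0", "d1", "d2"})
toy_negative = sample_hard_negative(
    toy_positive, toy_vocabulary, {toy_positive}, random.Random(42)
)
```

Because the negative shares two of its three drugs with a real positive, a model cannot separate the classes simply by recognising which drugs appear often in positives, which is what makes these negatives harder than random triplets.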

Key Results

| Setting | Test AUROC | Interpretation |
|---|---|---|
| Finetuned SAGE + Hard Neg | 0.763 | Realistic performance |
| Finetuned SAGE + Random Neg | 0.941 | Upper bound (trivial negatives) |
| Frozen SAGE + Hard Neg | 0.585 | Pairwise embedding alone |
| Random Split + Random Neg | 0.988 | Memorization ceiling |
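For readers who want to sanity-check numbers like these, the sketch below computes AUROC directly from labels and scores and pairs it with a label-permutation test. The page does not specify how its permutation test is run, so shuffling labels against fixed scores is an assumption here, and the labels and scores are toy values, not project results.

```python
import random
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of examples,
    which is fine for a sketch but slow for large test sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_p_value(
    labels: Sequence[int],
    scores: Sequence[float],
    n_permutations: int = 1000,
    seed: int = 0,
) -> float:
    """Share of label shufflings whose AUROC reaches the observed AUROC."""
    rng = random.Random(seed)
    observed = auroc(labels, scores)
    shuffled: List[int] = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if auroc(shuffled, scores) >= observed:
            at_least_as_good += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_good + 1) / (n_permutations + 1)


# Toy labels and scores for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
toy_p = permutation_p_value(toy_labels, toy_scores)
```

A small p-value indicates the scores carry real signal about the labels. It does not by itself say whether the negatives were hard enough, which is why the table reports hard-negative and random-negative settings separately.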

Key Findings

Pipeline

DrugBank (TDC)            FAERS (FDA)
      |                        |
01_prepare_data           03_mine_faers (×8 quarters)
      |                        |
00_build_vocabulary       04_map_drugs (×8 quarters)
      |                        |
      |                   05_merge_mapped_triplets
      |                        |
      |                   06_prepare_triplets
      |                        |
02_train_pairwise              |
      |                        |
      +-------+-------+--------+
              |
      07_train_triplet (×6 configs)
              |
08_permutation_test   09_analyze_results   10_summary_table   11_case_study

Quick Start

# Clone and set up
git clone https://github.com/brianyu43/polyddi-v2.git
cd polyddi-v2
conda create -n polyddi python=3.11 && conda activate polyddi
pip install -r requirements.txt

# Reproduce all results
python scripts/reproduce.py --seed 42 --faers-raw-dir data/faers/raw

Known Limitations

Transductive

Cannot predict for drugs absent from the training graph. Inductive molecular encoders (SMILES/atom-level GNN) are the planned next step.

Noisy Ground Truth

FAERS data contains observation bias, reporting bias, and confounding. Signals are statistical associations, not confirmed causal interactions.

Trivial Negative Problem

Random negative sampling inflates evaluation metrics. Hard negatives provide a more honest performance estimate.

Citation

@misc{polyddi2026,
  title = {PolyDDI: Higher-Order Drug-Drug Interaction Prediction with Graph Neural Networks},
  year  = {2026},
  url   = {https://github.com/brianyu43/polyddi-v2}
}

Data Sources

| Source | Access | Usage |
|---|---|---|
| DrugBank via TDC | Open | 1,706 drugs, 191K pairwise DDIs |
| FDA FAERS | Open | 8 quarters of adverse event reports (2023Q1–2024Q4) |