This repository implements streaming GNN inference experiments for Ripple: Scalable Incremental GNN Inferencing on Large Streaming Graphs (Pranjal Naman and Yogesh Simmhan), published at the 45th IEEE International Conference on Distributed Computing Systems (ICDCS 2025), DOI 10.1109/ICDCS63083.2025.00088. An open-access preprint is available as arXiv:2505.12112. An extended full-length version, RIPPLE++: An Incremental Framework for Efficient GNN Inference on Evolving Graphs, by Pranjal Naman, Parv Agarwal, Hrishikesh Haritas, and Yogesh Simmhan, is available as arXiv:2601.12347. The code compares full recomputation against incremental maintenance across several GNN variants under src/workloads/.
Create and activate a Conda environment named ripple with Python 3.10.12:
conda create -n ripple python=3.10.12 -y
conda activate ripple

From the repository root, install dependencies for CUDA 12.1 (PyTorch and DGL wheels):
pip install -r requirements.txt \
--extra-index-url https://download.pytorch.org/whl/cu121 \
    -f https://data.dgl.ai/wheels/cu121/repo.html

Check that PyTorch, DGL, and CUDA are available:
python - <<EOF
import torch, dgl
print("Torch:", torch.__version__)
print("DGL:", dgl.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
EOF

Workload w1_gcn_sum uses sum aggregation (see src/models/custom_models.py). Treat rc.py as the full-recomputation baseline, which re-runs inference after each batch of stream events, and in.py as the RIPPLE incremental path.
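The actual model lives in src/models/custom_models.py and is built on DGL; purely as an illustration of what "sum aggregation" means here, the following hypothetical pure-Python sketch (not the repository's implementation) computes each node's new feature as the elementwise sum of its neighbors' features:

```python
# Hypothetical sketch of sum aggregation, NOT the repo's DGL code.
# Each node's aggregated feature is the elementwise sum of its
# neighbors' feature vectors; a learned transform would follow.

def sum_aggregate(adj, feats):
    """adj: {node: [neighbors]}, feats: {node: [float, ...]}."""
    dim = len(next(iter(feats.values())))
    out = {}
    for node, nbrs in adj.items():
        agg = [0.0] * dim
        for nbr in nbrs:
            agg = [a + f for a, f in zip(agg, feats[nbr])]
        out[node] = agg
    return out

# Tiny example: node 0 has neighbors 1 and 2.
adj = {0: [1, 2], 1: [0], 2: [0]}
feats = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.5, 1.5]}
print(sum_aggregate(adj, feats)[0])  # [1.0, 2.0]
```

Mean or max aggregation would differ only in how the neighbor vectors are combined; sum is the variant exercised by w1_gcn_sum.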
From the repository root, run:
# 1) Build snapshot + event trace (small stream for a quick test)
cd src/data
python3 get_trace.py --name ogbn-arxiv --initial_snapshot_pct 90 --num_events 50000 --seed 0
# 2) Train GCN-sum and write embeddings (short run)
cd ../workloads/w1_gcn_sum/single
python3 train.py --name arxiv --n_layers 2 --n_hidden 128 --batch_size 1024 --n_epochs 1
# 3a) Baseline: full recomputation on the stream
python3 rc.py --name arxiv --n_layers 2 --batch_size 1000 --n_updates 50000 --n_hidden 128
# 3b) RIPPLE: incremental GNN inference on the same stream
python3 in.py --name arxiv --n_layers 2 --batch_size 1000 --n_updates 50000 --n_hidden 128

Logs and artifacts go under data/ and logs/ as configured in the scripts. Use larger --num_events, --n_updates, and --n_epochs values for experiments that match the paper's settings.
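As a mental model for the snapshot/event split in step 1 (a hypothetical Python sketch, not get_trace.py's actual logic): the first --initial_snapshot_pct percent of edges form the static starting snapshot, and up to --num_events of the remaining edges become the replayable stream, with --seed fixing the randomization.

```python
import random

def split_trace(edges, initial_snapshot_pct, num_events, seed):
    """Hypothetical sketch of a snapshot/event split: shuffle the
    edge list with a fixed seed, keep the first initial_snapshot_pct
    percent as the starting snapshot, and emit up to num_events of
    the remainder as stream insertion events."""
    rng = random.Random(seed)
    edges = list(edges)
    rng.shuffle(edges)
    cut = int(len(edges) * initial_snapshot_pct / 100)
    snapshot = edges[:cut]
    events = edges[cut:cut + num_events]
    return snapshot, events

# 100 edges, 90% snapshot, 5 stream events.
snapshot, events = split_trace([(i, i + 1) for i in range(100)], 90, 5, 0)
print(len(snapshot), len(events))  # 90 5
```

rc.py and in.py then consume the same event trace, so their timings are directly comparable.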
src/data/get_trace.py — stream generation
src/workloads/<workload>/single/train.py — offline training
src/workloads/<workload>/single/rc.py — recomputation baseline
src/workloads/<workload>/single/in.py — incremental (RIPPLE-style) processor
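The intuition behind the incremental path is that an L-layer GNN's output for a vertex depends only on its L-hop neighborhood, so a graph update can only invalidate embeddings within L hops of the touched vertices. A minimal sketch of collecting that affected set (a hypothetical helper, not the repository's implementation, which must also handle feature updates and batching):

```python
from collections import deque

def affected_nodes(adj, updated_edges, n_layers):
    """Hypothetical sketch: BFS from the endpoints of updated edges
    to collect the n_layers-hop neighborhood, i.e. the vertices whose
    embeddings an n_layers-deep GNN may need to refresh."""
    seen = {v for e in updated_edges for v in e}
    frontier = deque((v, 0) for v in seen)
    while frontier:
        node, depth = frontier.popleft()
        if depth == n_layers:
            continue  # beyond this hop, embeddings are unaffected
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen

# Path graph 0-1-2-3-4: inserting edge (0,1) with a 1-layer model
# touches {0, 1, 2}; a 2-layer model also reaches node 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(affected_nodes(adj, [(0, 1)], 1))  # {0, 1, 2}
```

Full recomputation (rc.py) instead refreshes every vertex after each batch, which is what the incremental path is measured against.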
1. Ripple:
@inproceedings{naman2025ripple,
author={Naman, Pranjal and Simmhan, Yogesh},
booktitle={2025 IEEE 45th International Conference on Distributed Computing Systems (ICDCS)},
title={Ripple: Scalable Incremental GNN Inferencing on Large Streaming Graphs},
year={2025},
pages={857-867},
doi={10.1109/ICDCS63083.2025.00088}
}

2. RIPPLE++ (extended version):
@misc{naman2026ripplepp,
title={RIPPLE++: An Incremental Framework for Efficient GNN Inference on Evolving Graphs},
author={Naman, Pranjal and Agarwal, Parv and Haritas, Hrishikesh and Simmhan, Yogesh},
year={2026},
eprint={2601.12347},
archivePrefix={arXiv},
primaryClass={cs.DC},
url={https://arxiv.org/abs/2601.12347}
}

This code is released under the Apache License, Version 2.0: https://www.apache.org/licenses/LICENSE-2.0.txt
Copyright (c) 2025 DREAM:Lab, Indian Institute of Science. All rights reserved.