Thank you for your interest in contributing. SemBlend is an active research project and welcomes contributions across the core engine, engine integrations, benchmarks, and documentation.
- Bug reports — open a GitHub issue with a minimal reproducer
- Engine integrations — TRT-LLM, MLC-LLM, or other inference backends
- Embedder backends — alternative embedding models or ANN index backends
- Benchmarks — new datasets, workloads, or latency measurement methodologies
- Documentation — clarifications, examples, tutorials
```
git clone https://github.com/worldflowai/semblend
cd semblend
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,embedder]"
```

```
pytest                  # unit tests (CPU-only, no GPU required)
pytest -m integration   # requires vLLM + LMCache installed
```

All tests must pass before submitting a PR. GPU-dependent tests (Triton kernels, RoPE correction) are excluded from the default run and marked `@pytest.mark.gpu`.
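For reference, a GPU-gated test might look like the sketch below. The test name and body are hypothetical; only the `@pytest.mark.gpu` marker reflects the convention described above.

```python
import pytest

# Illustrative skeleton of a GPU-gated test. The name and body are
# hypothetical; the @pytest.mark.gpu marker is what keeps it out of
# the default CPU-only run.

@pytest.mark.gpu
def test_rope_correction_kernel():
    torch = pytest.importorskip("torch")  # skip cleanly if torch is missing
    if not torch.cuda.is_available():
        pytest.skip("CUDA device required")
    # ...kernel correctness assertions would go here...
```

Marked tests can then be selected with `pytest -m gpu` or excluded with `pytest -m "not gpu"`.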
```
ruff check .    # lint
ruff format .   # format
```

Line length is 100. Type annotations are expected on public functions and class methods. We follow standard Python conventions (PEP 8, PEP 484).
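As a concrete illustration of the expected style (the helper itself is hypothetical, not part of the SemBlend API): PEP 484 annotations on every parameter and on the return type, plus a short docstring.

```python
# Hypothetical helper showing the expected annotation style for public
# functions: typed parameters, typed return value, one-line docstring.

def overlap_ratio(query_ids: list[int], cached_ids: list[int]) -> float:
    """Return the fraction of query token ids present in cached_ids."""
    if not query_ids:
        return 0.0
    cached = set(cached_ids)
    return sum(1 for t in query_ids if t in cached) / len(query_ids)
```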
- Fork the repo and create a branch from `main`: `git checkout -b feat/my-change`
- Write tests for any new behavior — aim for coverage on the changed code paths
- Ensure `pytest` and `ruff check .` both pass locally
- Open a PR with a clear description of what the change does and why
- Link any related GitHub issues or upstream PRs (LMCache, vLLM, SGLang)
PRs that touch `semblend_core/` (the backend-agnostic engine) must not introduce dependencies on any specific inference engine. Engine-specific code belongs in `semblend/integration/`.
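One way to check this constraint before opening a PR is a small import scan. This is a minimal sketch, not the project's actual CI check, and the set of forbidden engine packages is an assumption; adjust it as needed.

```python
import ast

# Assumed list of engine packages that semblend_core/ must not import.
ENGINE_PACKAGES = {"vllm", "sglang", "lmcache", "tensorrt_llm"}

def forbidden_imports(source: str) -> set[str]:
    """Return engine packages imported (directly or via from-imports)."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found & ENGINE_PACKAGES
```

Run over each file in `semblend_core/`, any non-empty result indicates an engine dependency that should move to `semblend/integration/`.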
```
semblend_core/              Backend-agnostic pipeline (embed → search → align → plan)
  pipeline.py               5-stage orchestrator
  embedder.py               MiniLM, Jaccard, ONNX-GPU embedders
  alignment.py              Exact and fuzzy chunk alignment
  rope_correction.py        RoPE delta correction + NoPE two-step
  bathtub.py                Per-layer deviation scoring (CacheBlend)
semblend/integration/
  vllm/                     vLLM + LMCache connector (KVConnectorBase_V1)
  sglang/                   SGLang RadixCache patcher + SemanticPrefixProvider
  synapse_kv_connector/     Legacy vLLM connector (thin re-export for backward compat)
tests/                      Unit tests
```
SemBlend is driving standardized semantic caching interfaces upstream:
- LMCache — `SemanticLookupProvider` and `PostLoadHook`
- SGLang — `SemanticPrefixProvider`
- vLLM — `register_model` for CacheBlend
If you're working on an engine integration, coordinate with these upstream PRs to avoid duplicate work.
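To make the shape of such an interface concrete, here is a purely structural sketch of a lookup provider as a `typing.Protocol`, together with a toy implementation. The method name, signature, and the toy class are assumptions for illustration; the real upstream interfaces are defined in the linked PRs and may differ.

```python
from typing import Optional, Protocol, Sequence

class SemanticLookupProvider(Protocol):
    """Structural sketch only; the real upstream signature may differ."""

    def lookup(self, token_ids: Sequence[int]) -> Optional[str]:
        """Return a cache key for a semantically similar prefix, or None."""
        ...

class ExactMatchProvider:
    """Toy provider that only matches identical token sequences."""

    def __init__(self) -> None:
        self._index: dict[tuple[int, ...], str] = {}

    def put(self, token_ids: Sequence[int], key: str) -> None:
        self._index[tuple(token_ids)] = key

    def lookup(self, token_ids: Sequence[int]) -> Optional[str]:
        return self._index.get(tuple(token_ids))
```

An engine integration would accept any object satisfying the protocol, so semantic and exact-match strategies stay interchangeable.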
By contributing, you agree that your contributions will be licensed under the Apache License 2.0.