Contributing to SemBlend

Thank you for your interest in contributing. SemBlend is an active research project and welcomes contributions across the core engine, engine integrations, benchmarks, and documentation.

Ways to Contribute

Bug reports — open a GitHub issue with a minimal reproducer
Engine integrations — TRT-LLM, MLC-LLM, or other inference backends
Embedder backends — alternative embedding models or ANN index backends
Benchmarks — new datasets, workloads, or latency measurement methodologies
Documentation — clarifications, examples, tutorials
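
A bug report is easier to act on when the reproducer comes with the versions it ran against. The sketch below collects that information using only the standard library. The function name and the package list are illustrative assumptions, not part of SemBlend, so adjust the names to match what `pip list` shows in your environment.

```python
import platform
import sys
from importlib import metadata

# Assumption: these are the distribution names as installed via pip.
PACKAGES = ("semblend", "vllm", "lmcache", "sglang", "torch")


def environment_report(packages: tuple[str, ...] = PACKAGES) -> str:
    """Return a plain-text summary of the Python, OS, and package versions."""
    lines = [
        f"python: {sys.version.split()[0]}",
        f"platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Missing packages are reported rather than raising.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Pasting the output at the top of the issue, followed by the shortest script that triggers the problem, usually gives maintainers enough to start.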

Development Setup

git clone https://github.com/worldflowai/semblend
cd semblend

python -m venv .venv
source .venv/bin/activate

pip install -e ".[dev,embedder]"

Running Tests

pytest                      # unit tests (CPU-only, no GPU required)
pytest -m integration       # requires vLLM + LMCache installed

All tests must pass before you submit a PR. GPU-dependent tests (Triton kernels, RoPE correction) are marked @pytest.mark.gpu and excluded from the default run.

Code Style

ruff check .        # lint
ruff format .       # format

Line length is 100 characters. Type annotations are expected on public functions and methods. We follow standard Python conventions (PEP 8 for style, PEP 484 for type hints).
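
As an illustration of the expected style, here is a small, fully annotated function. It is a hypothetical example written for this guide, not code taken from SemBlend, and the name `jaccard_similarity` is an assumption chosen only because Jaccard similarity is a familiar measure.

```python
from collections.abc import Iterable


def jaccard_similarity(left: Iterable[str], right: Iterable[str]) -> float:
    """Return the Jaccard index: shared unique tokens divided by total unique tokens.

    Two empty inputs are treated as identical and score 1.0.
    """
    left_set, right_set = set(left), set(right)
    union = left_set | right_set
    if not union:
        return 1.0
    return len(left_set & right_set) / len(union)
```

Parameters and the return value carry annotations, the docstring states the edge case, and every line stays within the 100-character limit.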

Pull Request Process

1. Fork the repo and create a branch from main: git checkout -b feat/my-change
2. Write tests for any new behavior, covering the changed code paths
3. Ensure pytest and ruff check . both pass locally
4. Open a PR with a clear description of what the change does and why
5. Link any related GitHub issues or upstream PRs (LMCache, vLLM, SGLang)

PRs that touch semblend_core/ (the backend-agnostic engine) must not introduce dependencies on any specific inference engine. Engine-specific code belongs in semblend/integration/.
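
One way to check this boundary locally before opening a PR is to scan the core package for engine imports. The following is a minimal sketch, not a tool that ships with the repository: the helper names and the set of forbidden module names are assumptions, and the set should be extended to match whichever engines the project integrates with.

```python
from __future__ import annotations

import ast
from pathlib import Path

# Assumption: top-level module names of inference engines the core must not import.
ENGINE_MODULES = frozenset({"vllm", "sglang", "tensorrt_llm", "mlc_llm"})


def find_engine_imports(
    source: str, forbidden: frozenset[str] = ENGINE_MODULES
) -> list[tuple[int, str]]:
    """Return (line number, module name) for each absolute import of a forbidden module."""
    hits: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Relative imports (level > 0) stay inside the package and are ignored.
            names = [node.module]
        else:
            continue
        for name in names:
            if name.split(".")[0] in forbidden:
                hits.append((node.lineno, name))
    return sorted(hits)


def check_tree(root: Path) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py file under root and map each offending file to its hits."""
    report: dict[str, list[tuple[int, str]]] = {}
    for path in sorted(root.rglob("*.py")):
        hits = find_engine_imports(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report
```

Calling `check_tree(Path("semblend_core"))` from the repository root should return an empty dict when the core is clean. Note that the scan also flags imports guarded by `try`/`except` or `TYPE_CHECKING`, which is a deliberately strict reading of the rule.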

Architecture Overview

semblend_core/          Backend-agnostic pipeline (embed → search → align → plan)
  pipeline.py           5-stage orchestrator
  embedder.py           MiniLM, Jaccard, ONNX-GPU embedders
  alignment.py          Exact and fuzzy chunk alignment
  rope_correction.py    RoPE delta correction + NoPE two-step
  bathtub.py            Per-layer deviation scoring (CacheBlend)

semblend/integration/
  vllm/                 vLLM + LMCache connector (KVConnectorBase_V1)
  sglang/               SGLang RadixCache patcher + SemanticPrefixProvider

synapse_kv_connector/   Legacy vLLM connector (thin re-export for backward compat)
tests/                  Unit tests

Upstream Interface Work

SemBlend is driving standardized semantic caching interfaces upstream:

LMCache — SemanticLookupProvider and PostLoadHook
SGLang — SemanticPrefixProvider
vLLM — register_model for CacheBlend

If you're working on an engine integration, coordinate with these upstream PRs to avoid duplicate work.

License

By contributing, you agree that your contributions will be licensed under the Apache License 2.0.
