Edgee is a lightweight LLM gateway that sits between your application and AI providers. It gives you a single control point for routing, observability, and cost optimization, without changing your existing code.
Think of it as an open-source alternative to LiteLLM or OpenRouter, written in Rust for speed and low resource usage, with a built-in token compression engine that reduces your AI costs automatically.
- One gateway, any provider — Unified API for Anthropic, OpenAI, and other LLM providers. Switch models without touching your app code.
- Token compression — Edgee analyzes request context and strips redundancy before it reaches the model. Same output, fewer tokens, lower bill.
- Real-time observability — See exactly how many tokens you're sending, how many you're saving, and what it costs.
- Rust-native — Fast startup, minimal memory footprint, no runtime dependencies. Runs anywhere Docker runs.
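Because Edgee exposes the standard OpenAI-compatible chat completions shape, pointing an existing client at it is just a base-URL change. A minimal sketch using only the Python standard library (the model name is an example; the request is built but not sent, since it assumes a gateway running on the default local port):

```python
import json
import urllib.request

# Edgee listens locally and exposes an OpenAI-compatible /v1 API, so the
# only change from talking to a provider directly is the base URL.
EDGEE_BASE_URL = "http://localhost:1207/v1"

payload = {
    "model": "claude-sonnet-4-5",  # example model name; swap without app changes
    "messages": [{"role": "user", "content": "Summarize this diff."}],
}

request = urllib.request.Request(
    url=f"{EDGEE_BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.full_url)  # http://localhost:1207/v1/chat/completions
```

Sending the request with `urllib.request.urlopen(request)` (or any OpenAI SDK configured with the same base URL) then goes through the gateway instead of the provider.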
macOS / Linux (curl)

```sh
curl -fsSL https://edgee.ai/install.sh | bash
```

Homebrew (macOS)

```sh
brew install edgee-ai/tap/edgee
```

Windows (PowerShell)

```powershell
irm https://edgee.ai/install.ps1 | iex
```

Installs to `%LOCALAPPDATA%\Programs\edgee\`. You can override the directory with `$env:INSTALL_DIR` before running.
Edgee can wrap your coding assistant and compress traffic automatically:
```sh
# Claude Code
edgee launch claude

# Codex
edgee launch codex

# Opencode
edgee launch opencode
```

Point any OpenAI-compatible client at Edgee:

```sh
# Start the gateway
edgee serve

# Your app talks to Edgee instead of the provider directly
export OPENAI_BASE_URL=http://localhost:1207/v1
```

Edgee's compression engine analyzes tool outputs (file listings, git logs, build output, test results) and removes noise before they enter the LLM context. The compression is lossless from the model's perspective — responses are identical, but prompts are leaner.
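Edgee's real compression pipeline isn't reproduced here, but the core idea — dropping redundant lines from tool output before it enters the context — can be sketched in a few lines of Python (a deliberately naive illustration, not the actual algorithm):

```python
def strip_redundancy(tool_output: str) -> str:
    """Toy compressor: collapse consecutive duplicate lines and drop
    build-progress noise, keeping the lines a model actually needs."""
    kept, previous = [], None
    for line in tool_output.splitlines():
        stripped = line.strip()
        if stripped == previous:           # consecutive duplicate line
            continue
        if stripped.startswith("Compiling") and "..." in stripped:
            continue                       # progress chatter, no signal
        kept.append(line)
        previous = stripped
    return "\n".join(kept)

noisy = (
    "Compiling foo v0.1 ...\n"
    "warning: unused variable `x`\n"
    "warning: unused variable `x`\n"
    "test result: ok. 12 passed"
)
print(strip_redundancy(noisy))
# prints:
# warning: unused variable `x`
# test result: ok. 12 passed
```

The real engine is far more careful about what counts as noise; the point is only that the transformation happens before the prompt is assembled, so the model never sees the redundant tokens.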
Route requests across Anthropic, OpenAI, and other providers through a single endpoint. Switch models, load-balance, or failover without code changes.
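At its simplest, failover means trying providers in priority order and returning the first success. A sketch of that pattern (the provider names and the `call` interface are illustrative, not Edgee's internals):

```python
from typing import Callable

def route_with_failover(
    prompt: str, providers: list[tuple[str, Callable[[str], str]]]
) -> str:
    """Try each provider in order; fall through to the next on error."""
    last_error = None
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:
            last_error = exc  # remember why this provider failed, try next
    raise RuntimeError(f"all providers failed: {last_error}")

# Usage with stand-in callables: the first provider is down, the second answers.
def flaky(prompt): raise ConnectionError("anthropic unreachable")
def healthy(prompt): return f"echo: {prompt}"

print(route_with_failover("hi", [("anthropic", flaky), ("openai", healthy)]))
# prints "echo: hi"
```

A gateway can apply the same loop to load balancing by reordering the provider list per request instead of keeping a fixed priority.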
Real-time visibility into token consumption, compression savings, and cost per request.
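The per-request numbers reduce to simple arithmetic over tokens sent, tokens saved, and price. A sketch (the per-token price is a made-up figure, not any provider's real pricing):

```python
def request_stats(raw_tokens: int, compressed_tokens: int, price_per_1k: float) -> dict:
    """Compute the savings metrics a gateway would report per request."""
    saved = raw_tokens - compressed_tokens
    return {
        "tokens_sent": compressed_tokens,
        "tokens_saved": saved,
        "savings_pct": round(100 * saved / raw_tokens, 1),
        "cost_usd": round(compressed_tokens / 1000 * price_per_1k, 4),
    }

# 12,000 raw prompt tokens compressed to 9,000 at a hypothetical $0.003 / 1K tokens
print(request_stats(12_000, 9_000, 0.003))
# {'tokens_sent': 9000, 'tokens_saved': 3000, 'savings_pct': 25.0, 'cost_usd': 0.027}
```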
| Tool | Setup command | Status |
|---|---|---|
| Claude Code | `edgee launch claude` | ✅ Supported |
| Codex | `edgee launch codex` | ✅ Supported |
| Opencode | `edgee launch opencode` | ✅ Supported |
| Cursor | `edgee launch cursor` | 🔜 Coming soon |
| Any OpenAI-compatible client | `edgee serve` | ✅ Supported |
The token compression engine in Edgee is derived from RTK, created by Patrick Szymkowiak and contributors at rtk-ai Labs. RTK pioneered local tool-output compression for AI coding assistants, and we built on their work to bring the same optimizations to a gateway architecture.
RTK is licensed under the Apache License 2.0. All derived files retain the original copyright notice and are individually marked with a modification history. See LICENSE-APACHE and NOTICE for full details.
If you're looking for a local-first compression tool, check out RTK directly; it's excellent for individual developer workflows.
Edgee is Apache 2.0 licensed and we genuinely want your contributions.
```sh
git clone https://github.com/edgee-ai/edgee
cd edgee
cargo build
```

See CONTRIBUTING.md for the full guide. For bigger changes, open an issue first so we can align before you build.
- Discord — fastest way to get help
- GitHub Issues — bugs and feature requests
- Twitter / X — updates and releases