A production-ready Retrieval Augmented Generation (RAG) system for e-commerce, powered by Endee as the high-performance vector database.
Ask natural-language questions like "What's a good laptop for students?" and get accurate, context-grounded answers backed by a live product catalog — all in milliseconds.
E-commerce platforms often struggle with product discovery. Traditional keyword search fails when customers use natural language or describe features rather than exact product names. This project solves that by:
- Embedding product descriptions as semantic vectors stored in Endee
- Retrieving the most relevant products using cosine similarity vector search
- Generating a natural, helpful answer using a language model (OpenAI) or a rule-based fallback
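The retrieve-then-generate flow above can be sketched end to end. This is a minimal, self-contained illustration, not the project's actual code: the toy 3-dimensional embeddings and the in-memory catalog stand in for the real 384-dim SentenceTransformer vectors and the Endee `index.query()` call.

```python
import math

# Toy catalog with pre-computed embeddings. The real system embeds
# product text with all-MiniLM-L6-v2, yielding 384-dim vectors.
catalog = [
    {"name": "Sony WH-1000XM5", "vector": [0.9, 0.1, 0.0]},
    {"name": "MacBook Air M3", "vector": [0.1, 0.9, 0.2]},
    {"name": "Nespresso Vertuo Pop", "vector": [0.0, 0.2, 0.9]},
]

def cosine(a, b):
    # Cosine similarity: dot product divided by the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vector, top_k=2):
    # In production this step is index.query() against Endee; here we
    # rank the toy catalog by cosine similarity in plain Python.
    ranked = sorted(catalog, key=lambda p: cosine(query_vector, p["vector"]), reverse=True)
    return ranked[:top_k]

def answer(query_vector):
    # Build a context string from the retrieved products and hand it to
    # the LLM (or to a rule-based template when no API key is set).
    hits = retrieve(query_vector)
    context = ", ".join(p["name"] for p in hits)
    return f"Based on the catalog, you might like: {context}"

print(answer([0.85, 0.15, 0.05]))  # a query embedding close to "headphones"
```

The key design point is that generation never sees the raw catalog, only the top-K retrieved context, which is what keeps answers grounded.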
Customer Question
│
▼
┌──────────────────┐
│ FastAPI Server │ ← REST API (Python)
└──────────────────┘
│
▼ embed with SentenceTransformer (all-MiniLM-L6-v2)
┌──────────────────┐
│ Endee (nDD) │ ← Vector DB on port 8080
│ cosine / INT8 │ returns top-K products
└──────────────────┘
│
▼ build context string
┌──────────────────┐
│ LLM / Fallback │ ← OpenAI GPT-3.5 (optional)
└──────────────────┘
│
▼
Final Answer + Retrieved Products
| Component | Role |
|---|---|
| Endee | Vector database — stores & retrieves product embeddings |
| SentenceTransformers | Converts product text & queries into 384-dim vectors |
| FastAPI | REST API layer exposing /ingest, /query, /products/search |
| OpenAI GPT-3.5 | Optional LLM for generating fluent answers from retrieved context |
Endee is the core retrieval engine of this project.
- **Index Creation** — A `cosine` index with `INT8` precision and 384 dimensions is created in Endee at startup.
- **Upserting Vectors** — Each product is converted to text, embedded, and stored as a vector with metadata (name, price, brand, etc.) via `index.upsert()`.
- **Vector Search** — Customer queries are embedded and searched against all product vectors using `index.query()`, returning the top-K most semantically similar products by cosine similarity.
```python
from endee import Endee, Precision

client = Endee()
client.set_base_url("http://localhost:8080/api/v1")

# Create index
client.create_index(name="ecommerce_products", dimension=384,
                    space_type="cosine", precision=Precision.INT8)
index = client.get_index("ecommerce_products")

# Upsert a product vector
index.upsert([{
    "id": "1",
    "vector": [...],  # 384-dim embedding
    "meta": {"name": "Sony WH-1000XM5", "price": 349.99, ...}
}])

# Semantic search
results = index.query(vector=[...], top_k=5)
```

```
ecommerce-rag-endee/
├── app/
│   ├── main.py            # FastAPI app & endpoints
│   └── rag_pipeline.py    # Embedding, Endee indexing, retrieval, generation
├── data/
│   └── products.json      # 15 sample e-commerce products
├── scripts/
│   └── demo.py            # Demo script to test all endpoints
├── docker-compose.yml     # Runs Endee + FastAPI together
├── Dockerfile             # Container for the FastAPI app
├── requirements.txt
├── .env.example
└── README.md
```
- Python 3.9+
- Docker & Docker Compose
- Git
Per the submission requirements:
- Star the Endee repo: github.com/endee-io/endee
- Fork it to your GitHub account
- Clone this project repo and set up the environment:

```
git clone https://github.com/<your-username>/ecommerce-rag-endee
cd ecommerce-rag-endee
cp .env.example .env
# Edit .env if needed (OpenAI key is optional)
```

Starts both Endee and the FastAPI app with one command:

```
docker compose up --build
```

- Endee dashboard: http://localhost:8080
- RAG API docs: http://localhost:8000/docs
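The compose file wires the two services together roughly as follows. This is an illustrative sketch, not the repo's actual `docker-compose.yml`; the service names are assumptions, while the image, ports, and volume match the manual Docker command given below.

```yaml
# Hypothetical docker-compose.yml matching the setup described in this README.
services:
  endee:
    image: endeeio/endee-server:latest
    ports:
      - "8080:8080"          # Endee dashboard & API
    volumes:
      - endee-data:/data     # persist vectors across restarts
  api:
    build: .                 # FastAPI app from the repo's Dockerfile
    ports:
      - "8000:8000"          # RAG API & interactive docs
    env_file: .env           # optional OPENAI_API_KEY, etc.
    depends_on:
      - endee
volumes:
  endee-data:
```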
1. Start Endee via Docker:

```
docker run -p 8080:8080 -v endee-data:/data endeeio/endee-server:latest
```

2. Install Python dependencies:

```
pip install -r requirements.txt
```

3. Start the FastAPI server:

```
uvicorn app.main:app --reload
```

```
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"force_reingest": true}'
```

```
# RAG question answering
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What headphones are good for travel?", "top_k": 3}'

# Pure semantic search
curl "http://localhost:8000/products/search?q=running+shoes&top_k=3"
```

```
python scripts/demo.py
```

| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | API info |
| GET | `/health` | Health check |
| POST | `/ingest` | Ingest products into Endee |
| POST | `/query` | RAG: answer a product question |
| GET | `/products/search?q=...` | Semantic product search |
Interactive API docs available at http://localhost:8000/docs
| Question | What Endee Retrieves |
|---|---|
| "I need noise-cancelling headphones" | Sony WH-1000XM5, AirPods Pro |
| "Best laptop for students under $1200" | MacBook Air M3 |
| "Good shoes for running" | Adidas Ultraboost 23, Nike Air Max 270 |
| "Something to make coffee quickly" | Nespresso Vertuo Pop |
| "Warm jacket for winter hiking" | The North Face Thermoball Jacket |
- Grounded answers — responses are built strictly from retrieved product context, sharply reducing hallucination risk
- INT8 quantization on Endee for memory-efficient high-speed search
- Graceful fallback — works without OpenAI key using rule-based answer generation
- Async FastAPI — non-blocking I/O for high throughput
- Docker-first — a single `docker compose up` runs everything
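The rule-based fallback mentioned above can be sketched as a simple template over the retrieved products. The function name `fallback_answer` and its signature are illustrative; the project's real implementation lives in `app/rag_pipeline.py`.

```python
def fallback_answer(question: str, products: list[dict]) -> str:
    # Hypothetical template-based fallback for when no OpenAI key is
    # configured: format the retrieved products into a readable answer
    # instead of calling an LLM.
    if not products:
        return "Sorry, I couldn't find any matching products."
    lines = [f"- {p['name']} (${p['price']:.2f})" for p in products]
    return "Here are some products that match your question:\n" + "\n".join(lines)

print(fallback_answer(
    "What headphones are good for travel?",
    [{"name": "Sony WH-1000XM5", "price": 349.99}],
))
```

Because the fallback only reformats retrieved context, the system stays useful (and still grounded) when the optional LLM is unavailable.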
Apache 2.0 — see LICENSE