Natural language to SQL interface using Self-Driven Exploration with SQL Probes. Query PostgreSQL databases using plain English with real-time exploration visualization.
This application implements the SDE-SQL methodology from the paper "Enhancing Text-to-SQL Generation in Large Language Models via Self-Driven Exploration with SQL Probes" (arXiv:2506.07245v2). It features:
- Natural language to SQL conversion using GPT-4o
- Real-time exploration traces showing how SQL is constructed
- DuckDB WASM for client-side SQL execution
- Interactive chat interface with query results visualization
Before you start, ensure you have:
- Python 3.9+ installed
- Node.js 18+ and npm installed
- PostgreSQL running locally (localhost:5432)
- DVD Rental Database loaded in PostgreSQL
- OpenAI API Key (for GPT-4o)
cd sde-sql-prototypecd backend
copy .env.example .envEdit .env with your credentials:
OPENAI_API_KEY=sk-your-actual-api-key-here
DATABASE_HOST=localhost
DATABASE_PORT=5432
DATABASE_NAME=dvdrental
DATABASE_USER=postgres
DATABASE_PASSWORD=your_postgres_passwordstart.batThe script will:
- Create Python virtual environment
- Install dependencies from
requirements.txt - Run infrastructure tests
- Start FastAPI server on http://localhost:8000
./start.shOpen a new terminal:
cd frontendstart.batThe script will:
- Install npm dependencies if needed
- Start Vite dev server on http://localhost:5173
npm install
npm run devOpen your browser and navigate to:
- Frontend: http://localhost:5173
- Backend API Docs: http://localhost:8000/docs
cd backend
# Create virtual environment
python -m venv venv
# Activate virtual environment
# Windows:
venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Configure environment variables
copy .env.example .env
# Edit .env with your settings
# Test infrastructure
python test_infrastructure.py
# Start server
python main.pycd frontend
# Install dependencies
npm install
# Start development server
npm run dev
# Build for production (optional)
npm run buildsde-sql-prototype/
├── backend/
│ ├── main.py # FastAPI entry point
│ ├── start.bat # Windows startup script
│ ├── start.sh # Linux/Mac startup script
│ ├── requirements.txt # Python dependencies
│ ├── .env.example # Environment template
│ ├── config/ # Configuration & prompts
│ ├── core/ # Pipeline orchestration
│ ├── modules/ # SDE-SQL modules
│ ├── database/ # Database connectors
│ ├── utils/ # Utilities (LSH, embeddings)
│ └── api/ # REST & WebSocket endpoints
│
├── frontend/
│ ├── package.json # Node dependencies
│ ├── start.bat # Windows startup script
│ ├── vite.config.ts # Vite configuration
│ └── src/
│ ├── App.tsx # Main application
│ ├── components/ # React components
│ ├── services/ # API & WebSocket clients
│ └── types/ # TypeScript types
│
└── README.md # This file
- Download the database:
wget https://www.postgresqltutorial.com/wp-content/uploads/2019/05/dvdrental.zip
unzip dvdrental.zip- Create and restore:
# Connect to PostgreSQL
psql -U postgres
# Create database
CREATE DATABASE dvdrental;
\q
# Restore backup
pg_restore -U postgres -d dvdrental dvdrental.tar- Verify:
psql -U postgres -d dvdrental -c "\dt"You should see tables: customer, film, actor, rental, etc.
cd backend
python test_infrastructure.pyExpected output: All 15 infrastructure tests should pass.
cd frontend
npm run lint- Start both backend and frontend using the startup scripts
- Open http://localhost:5173 in your browser
- Type a natural language question, e.g., "Find all customers from Argentina"
- Watch the real-time exploration traces on the right panel
- View the generated SQL and query results in the chat interface
Try these with the DVD Rental database:
- "Find all customers from Argentina"
- "What are the top 5 most rented films?"
- "List all actors who appeared in films rented by customer ID 100"
- "Show me the total revenue by store"
- "Which films have never been rented?"
POST /api/query- Submit natural language queryGET /api/databases- List available databasesGET /api/session/{session_id}- Get session results
ws://localhost:8000/ws/{session_id}- Real-time trace updates
See full API documentation at http://localhost:8000/docs
| Variable | Description | Default |
|---|---|---|
OPENAI_API_KEY |
OpenAI API key | Required |
OPENAI_MODEL |
Model to use | gpt-4o |
DATABASE_HOST |
PostgreSQL host | localhost |
DATABASE_PORT |
PostgreSQL port | 5432 |
DATABASE_NAME |
Database name | dvdrental |
DATABASE_USER |
Database user | postgres |
DATABASE_PASSWORD |
Database password | Required |
API_HOST |
API server host | 0.0.0.0 |
API_PORT |
API server port | 8000 |
| Variable | Description | Default |
|---|---|---|
VITE_API_URL |
Backend API URL | http://localhost:8000 |
VITE_WS_URL |
WebSocket URL | ws://localhost:8000 |
Database Connection Failed
- Check PostgreSQL is running
- Verify credentials in
.env - Test connection:
psql -U postgres -d dvdrental
OpenAI API Error
- Verify API key at https://platform.openai.com/api-keys
- Check for extra spaces in
.env - Ensure you have API credits
Module Not Found
- Activate virtual environment:
venv\Scripts\activate - Reinstall:
pip install -r requirements.txt
Port 5173 Already in Use
- Stop other Vite instances
- Or edit
vite.config.tsto use different port
Cannot Connect to Backend
- Ensure backend is running on port 8000
- Check CORS settings in backend
.env
- FastAPI - Web framework
- LangChain - LLM orchestration
- GPT-4o - Language model
- PostgreSQL - Database
- SQLGlot - SQL parsing
- asyncpg - Async database driver
- React 19 - UI framework
- TypeScript - Type safety
- Vite - Build tool
- Axios - HTTP client
- Prism.js - Syntax highlighting
This is a research prototype. For issues or improvements:
- Check existing documentation
- Review the paper (arXiv:2506.07245v2)
- Test with the DVD Rental database first
Research prototype based on the SDE-SQL paper.
- Paper: "SDE-SQL: Enhancing Text-to-SQL Generation" (arXiv:2506.07245v2)
- DVD Rental Database: PostgreSQL Sample Database
- LangChain: https://python.langchain.com/
- FastAPI: https://fastapi.tiangolo.com/
For detailed setup instructions, see:
SETUP_INSTRUCTIONS.md- Infrastructure setupbackend/README.md- Backend detailsfrontend/README.md- Frontend details
Last Updated: 2025-10-12