LiteLLM + AWS Aurora Postgres: writer vs reader endpoints (up to 15 replicas) #20907

VladaJerkovicZartis · 2026-02-11T01:06:18Z

We’re planning to run LiteLLM Proxy with AWS Aurora PostgreSQL as the backing DB (virtual keys / spend logs / admin UI state). Aurora gives us a writer endpoint (read + write) and up to 15 Aurora Replicas (read-only) behind a reader endpoint.

LiteLLM’s config/docs currently expose a single database_url (plus pool + timeout knobs). I don’t see a “writer_db_url / reader_db_url” split in the supported settings.

What we’re trying to decide

Do we:

Point LiteLLM DATABASE_URL to the writer endpoint and keep it simple (all queries go to writer), or
Try to leverage reader replicas for SELECTs (read scaling), without breaking consistency, and without LiteLLM explicitly supporting two DB URLs?

KeepALifeUS · 2026-02-13T01:25:51Z

Using Aurora read replicas with LiteLLM requires careful routing. Here is an application-level pattern:

Connection Routing Architecture

from dataclasses import dataclass
from typing import List, Optional
import random

@dataclass
class AuroraConfig:
    writer_endpoint: str
    reader_endpoints: List[str]  # Up to 15 replicas
    
    # Connection settings
    pool_size_per_endpoint: int = 10
    read_timeout_ms: int = 5000
    
    # Routing strategy
    reader_strategy: str = "round_robin"  # round_robin, random, least_connections

class AuroraConnectionRouter:
    def __init__(self, config: AuroraConfig):
        self.config = config
        self.reader_index = 0
        self.connection_counts = {ep: 0 for ep in config.reader_endpoints}
    
    def get_writer(self) -> str:
        """Always route writes to writer endpoint"""
        return self.config.writer_endpoint
    
    def get_reader(self) -> str:
        """Route reads across replicas"""
        if not self.config.reader_endpoints:
            return self.config.writer_endpoint
        
        if self.config.reader_strategy == "round_robin":
            endpoint = self.config.reader_endpoints[self.reader_index]
            self.reader_index = (self.reader_index + 1) % len(self.config.reader_endpoints)
            return endpoint
        
        elif self.config.reader_strategy == "random":
            return random.choice(self.config.reader_endpoints)
        
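        elif self.config.reader_strategy == "least_connections":
            # Only meaningful if callers update connection_counts on acquire/release;
            # otherwise every count stays 0 and the first reader is always chosen
            return min(self.connection_counts, key=self.connection_counts.get)
        
        # Unknown strategy: fall back to the first reader
        return self.config.reader_endpoints[0]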
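
The round-robin branch above can also be written without manual index bookkeeping. This is a minimal sketch, assuming the same "fall back to the writer when there are no readers" behaviour; `reader_cycle` and the endpoint names are hypothetical and only for illustration.

```python
import itertools
from typing import Iterator, List


def reader_cycle(writer: str, readers: List[str]) -> Iterator[str]:
    """Return an endless round-robin iterator over readers (or the writer if none)."""
    # itertools.cycle repeats the sequence forever, so no index is needed.
    return itertools.cycle(readers or [writer])


# Hypothetical endpoint names, for illustration only.
endpoints = reader_cycle("writer.example", ["reader-1.example", "reader-2.example"])
first_three = [next(endpoints) for _ in range(3)]
```

Note that the iterator captures the reader list at creation time, so it would need to be rebuilt if replicas are added or removed.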

LiteLLM Integration

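import asyncpg

class LiteLLMWithAurora:
    """Application-side helper; LiteLLM Proxy itself still takes a single database_url."""

    def __init__(self, router: AuroraConnectionRouter):
        self.router = router
        self.write_pool = None
        self.read_pools = {}

    async def get_write_pool(self):
        # Create the writer pool lazily on first use
        if self.write_pool is None:
            self.write_pool = await asyncpg.create_pool(self.router.get_writer())
        return self.write_pool

    async def get_read_pool(self):
        # One pool per reader endpoint, created lazily
        endpoint = self.router.get_reader()
        if endpoint not in self.read_pools:
            self.read_pools[endpoint] = await asyncpg.create_pool(endpoint)
        return self.read_pools[endpoint]

    async def log_request(self, data: dict):
        """Write operations go to writer"""
        pool = await self.get_write_pool()
        # "async with" returns the connection to the pool when the block exits
        async with pool.acquire() as conn:
            # asyncpg binds positional parameters ($1, $2, ...), not a dict
            await conn.execute("INSERT INTO requests ...", *data.values())

    async def get_usage_stats(self, user_id: str):
        """Read operations go to replicas"""
        pool = await self.get_read_pool()
        async with pool.acquire() as conn:
            return await conn.fetch("SELECT * FROM usage WHERE user_id = $1", user_id)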

Key Considerations

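Replication lag: Reads from replicas can be slightly stale (Aurora replica lag is usually under 100 ms)
Failover: Aurora promotes a reader automatically if the writer fails, and the writer endpoint then points to the new writer
Load distribution: Monitor per-replica load
Connection pooling: Use one pool per endpoint rather than a shared pool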
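
One common way to live with replication lag is to send a key's reads to the writer for a short window after that key was written. The following is a minimal sketch of that idea, not something LiteLLM provides; `ReadAfterWritePinner`, its one-second default window, and the injectable `clock` are assumptions made for illustration.

```python
import time
from typing import Callable, Dict


class ReadAfterWritePinner:
    """Send a key's reads to the writer for a short window after that key was written."""

    def __init__(self, pin_seconds: float = 1.0, clock: Callable[[], float] = time.monotonic):
        self.pin_seconds = pin_seconds  # should comfortably exceed the observed replica lag
        self.clock = clock              # injectable so the window can be tested
        self._last_write: Dict[str, float] = {}

    def record_write(self, key: str) -> None:
        # Remember when this key was last written.
        self._last_write[key] = self.clock()

    def use_writer_for_read(self, key: str) -> bool:
        # True while the key is still inside its pin window.
        last = self._last_write.get(key)
        return last is not None and (self.clock() - last) < self.pin_seconds


pinner = ReadAfterWritePinner(pin_seconds=1.0)
pinner.record_write("user-123")
read_from_writer = pinner.use_writer_for_read("user-123")
```

This state is per process, so with several proxy instances the window only protects reads that hit the instance that performed the write.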

For the LiteLLM Proxy itself, this would require read/write routing support in its config, which it does not expose today.

More patterns: https://github.com/KeepALifeUS/autonomous-agents

1 reply

VladaJerkovicZartis Feb 13, 2026
Author

Thanks, but we don’t need to handle this ourselves. When you create an Aurora PostgreSQL cluster, AWS already provides two endpoints: a writer endpoint and a reader endpoint. In the application, you typically configure two DB connections:

Writer for INSERT, UPDATE, DELETE (and sometimes SELECT, depending on consistency needs)
Reader for SELECT

Aurora load-balances connections across the reader instances behind the reader endpoint, so you just keep pointing to that single endpoint.

Also, worth noting: LiteLLM’s DB configuration appears to support only a single database connection string, and I don’t think it includes built-in logic to route queries between writer/reader endpoints. It would be a useful feature to have, but it doesn’t seem to be implemented today.
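
The two-connection setup described above can be sketched as a small routing helper in application code. This is a rough illustration under stated assumptions, not a LiteLLM feature: `pick_endpoint`, the keyword set, and the table name in the usage lines are hypothetical, and a leading-keyword check will misclassify cases such as a SELECT that calls a function with side effects.

```python
import re

# Statements starting with these keywords are treated as reads (an assumption).
READ_KEYWORDS = {"SELECT", "SHOW"}
# Locking clauses need the writer even though the statement starts with SELECT.
LOCKING_CLAUSE = re.compile(
    r"\bFOR\s+(NO\s+KEY\s+UPDATE|KEY\s+SHARE|UPDATE|SHARE)\b", re.IGNORECASE
)


def pick_endpoint(sql: str, writer_url: str, reader_url: str, in_transaction: bool = False) -> str:
    """Return reader_url for plain reads outside a transaction, writer_url otherwise."""
    words = sql.split(None, 1)
    first_word = words[0].upper() if words else ""
    is_read = first_word in READ_KEYWORDS and not LOCKING_CLAUSE.search(sql)
    if is_read and not in_transaction:
        return reader_url
    # Writes, locking reads, anything inside a transaction, and unknown statements.
    return writer_url


read_target = pick_endpoint("SELECT * FROM spend_logs", "writer-url", "reader-url")
write_target = pick_endpoint("INSERT INTO spend_logs VALUES (1)", "writer-url", "reader-url")
```

Defaulting to the writer for anything unrecognised keeps mistakes on the safe side: a misrouted read costs some writer capacity, while a misrouted write fails on a read-only replica.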

xXMrNidaXx · 2026-02-23T13:31:05Z

Great question about Aurora writer vs reader endpoints!

How we'd architect this:

Writer endpoint for:

Creating/updating API keys
Logging spend data
User management
Any INSERT/UPDATE/DELETE operations

Reader endpoints for:

Usage dashboard queries (heavy reads)
Analytics aggregations
Model catalog lookups
Token validation (read-heavy)

Connection pooling strategy:

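import os
from sqlalchemy import create_engine

# WRITER_DATABASE_URL / READER_DATABASE_URL are hypothetical variable names
writer_url = os.environ["WRITER_DATABASE_URL"]
reader_url = os.environ["READER_DATABASE_URL"]

# QueuePool is SQLAlchemy's default pool, so no poolclass argument is needed
writer_engine = create_engine(
    writer_url,
    pool_size=5,
    max_overflow=10
)

reader_engine = create_engine(
    reader_url,
    pool_size=20,  # Higher for reads
    max_overflow=30
)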

Replica lag considerations:

Aurora typically has <100ms replica lag
For usage dashboards, this is fine
For real-time spend limits, use writer or add caching
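
The "use writer or add caching" advice for spend limits could look like a small time-bounded cache in front of a writer query. This is a minimal sketch, assuming a per-process cache is acceptable; `SpendCache`, the `load_from_writer` callback, and the TTL value are hypothetical names and numbers, not part of LiteLLM.

```python
import time
from typing import Callable, Dict, Tuple


class SpendCache:
    """Cache spend totals for a short time so not every check hits the writer."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, float]] = {}  # key -> (expires_at, spend)

    def get(self, key: str, load_from_writer: Callable[[str], float]) -> float:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, skip the database
        spend = load_from_writer(key)  # stale or missing: read from the writer
        self._entries[key] = (now + self.ttl_seconds, spend)
        return spend


cache = SpendCache(ttl_seconds=5.0)
spend = cache.get("key-abc", lambda key: 12.5)  # the lambda stands in for a writer query
```

The TTL bounds how far over a limit a key can drift, so it trades enforcement precision against writer load.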

Pro tip: Use a read-replica-aware ORM or route queries explicitly. Don't rely on random load balancing for read-after-write scenarios, where a read must see a write that was just made.

We've scaled LLM proxies at RevolutionAI with similar patterns. Happy to share more specifics!

xXMrNidaXx · 2026-02-23T14:54:09Z

Good question! Aurora read replicas can take read-heavy load off the writer, but only if something routes reads to them.

Option 1: Writer only (simple, recommended to start)

DATABASE_URL=postgresql://user:pass@writer.cluster.us-east-1.rds.amazonaws.com:5432/litellm

Pros: Simple, no consistency issues
Cons: Writer handles all load
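
If the writer URL is assembled in code rather than typed by hand, percent-encoding the credentials avoids broken URLs when a password contains characters such as `@` or `/`. This is a small sketch using only the standard library; `build_database_url` and the host name are hypothetical.

```python
from urllib.parse import quote


def build_database_url(user: str, password: str, host: str, database: str, port: int = 5432) -> str:
    """Build a postgresql:// URL with percent-encoded credentials."""
    # safe="" makes quote() encode "/" and "@" as well.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical values, for illustration only.
url = build_database_url("user", "p@ss/word", "writer.example", "litellm")
```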

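Option 2: Proxy-level read/write split

Since LiteLLM does not support dual URLs natively, put a pooler in front of the database:

# pgbouncer-style config
[databases]
litellm_write = host=writer.cluster... pool_mode=transaction
litellm_read = host=reader.cluster... pool_mode=transaction

Then route based on query type. BUT: LiteLLM uses a single connection string, so it can only connect to one of these entries. PgBouncer pools connections and does not route by statement type, so splitting reads behind one connection string would need a query-aware pooler such as PgCat or Pgpool-II.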

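Option 3: RDS Proxy

Put RDS Proxy in front of the cluster:

DATABASE_URL=postgresql://user:pass@proxy.us-east-1.rds.amazonaws.com:5432/litellm

RDS Proxy can:

Pool and reuse connections to the database
Handle failover with less disruption for clients
Expose a separate read-only endpoint for the Aurora readers

RDS Proxy does not split reads from writes on a single endpoint. With LiteLLM's single DATABASE_URL, all traffic goes to whichever proxy endpoint you configure.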

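Option 4: Cluster endpoint

# The cluster endpoint always resolves to the current writer
DATABASE_URL=postgresql://...@cluster.cluster-xxx.us-east-1.rds.amazonaws.com/litellm

This does not scale reads: Aurora does not redirect queries to replicas based on session settings. It is effectively Option 1, with the benefit that the endpoint follows the writer across failovers.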

Our recommendation:
Start with writer-only. If the DB becomes the bottleneck (check CPU and connection counts), scale the writer instance or add RDS Proxy for connection pooling first. Read replicas only help once reads can actually be routed to the reader endpoint.

We run LiteLLM on Aurora at Revolution AI.
