Production-minded backend architecture for an X-clone, built as a modular monolith with a strong focus on idempotency, atomicity, and low-latency read paths.
This project is designed to solve the non-trivial backend problems behind social products:
- high-churn user interactions (like/unlike, follow/unfollow)
- mixed-content timelines (tweets + retweets) that still paginate correctly
- notification consistency under concurrency
- scalable read performance as graph and content density grow
The codebase is split into clear business domains while remaining deployable as one service:
- Accounts: authentication, profile lifecycle, password/reset/deactivate/reactivate flows
- Tweets: tweet CRUD, comments, likes, retweets, bookmarks, feed assembly
- Relationships: follow graph and follower/following queries
- Interactions / Notifications: mentions, notifications, read-state management
This structure keeps local development and deployment simple (single process boundary), while preserving clean domain boundaries and minimizing cross-domain coupling.
The feed endpoint intentionally merges two independently optimized query paths:
- Tweet QuerySet with `select_related` and `annotate(Count(...), Exists(...))`
- Retweet QuerySet with equivalent metrics projected onto the original tweet
Both streams are merged into a single in-memory collection and globally sorted by created_at to produce one chronological feed.
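The merge step above can be sketched without the ORM. Here the two lists stand in for the Tweet and Retweet QuerySets; the item shape and field names are illustrative assumptions, not the project's actual models.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import chain

@dataclass
class FeedItem:
    kind: str          # "tweet" or "retweet" (illustrative)
    content: str
    created_at: datetime

def merge_feed(tweets, retweets):
    """Combine both streams in memory and sort newest-first by created_at."""
    return sorted(chain(tweets, retweets),
                  key=lambda item: item.created_at, reverse=True)

tweets = [FeedItem("tweet", "hello", datetime(2024, 1, 2))]
retweets = [FeedItem("retweet", "classic", datetime(2024, 1, 3)),
            FeedItem("retweet", "old", datetime(2024, 1, 1))]

feed = merge_feed(tweets, retweets)
# newest first: "classic", "hello", "old"
```

Sorting after the merge is what guarantees global chronology across the two object types, regardless of how each QuerySet was ordered internally.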
In real traffic, users can trigger rapid interaction cycles (like -> unlike -> like, repeated taps, concurrent requests). A naive notification write path is vulnerable to:
- duplicated notifications for the same actor/receiver/action/target
- race-window collisions around uniqueness checks
- stale read-state when an existing notification should be re-surfaced
- Database-level idempotency contract: `Notification` enforces uniqueness across `(sender, receiver, verb, content_type, content_id)`.
- Atomic write path: notification creation runs inside `transaction.atomic()`.
- Concurrency-safe create: uses `get_or_create(...)` with an `IntegrityError` fallback to recover from same-key concurrent inserts.
- Read-state resurrection: if a notification already exists and was marked as read, a new equivalent event flips `is_read=False` instead of duplicating data.
- Commit-aware background trigger: registration-related notification dispatch is attached via `transaction.on_commit(...)` to prevent out-of-transaction side effects.
The system is idempotent by design and resilient to race conditions, preserving data integrity without sacrificing UX feedback speed.
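The write-path contract can be modeled without a database. Below, a dict keyed by the uniqueness tuple stands in for the unique constraint; in the real code this is `get_or_create()` inside `transaction.atomic()` with an `IntegrityError` fallback. All names here are illustrative.

```python
# Simulated store: (sender, receiver, verb, content_type, content_id) -> row.
# The dict key plays the role of the database unique constraint.
notifications = {}

def notify(sender, receiver, verb, content_type, content_id):
    """Idempotent upsert: at most one row per uniqueness tuple."""
    key = (sender, receiver, verb, content_type, content_id)
    row = notifications.get(key)
    if row is None:
        # "get_or_create" path: the first equivalent event creates the row
        row = {"is_read": False}
        notifications[key] = row
        return row, True
    # Read-state resurrection: an equivalent event re-surfaces a read row
    # instead of inserting a duplicate
    if row["is_read"]:
        row["is_read"] = False
    return row, False

# like -> (receiver reads it) -> like again: still exactly one row
notify("alice", "bob", "liked", "tweet", 1)
notifications[("alice", "bob", "liked", "tweet", 1)]["is_read"] = True
row, created = notify("alice", "bob", "liked", "tweet", 1)
```

The second call returns `created=False` and flips `is_read` back, which is exactly the duplicate-free, re-surfacing behavior the list above describes.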
A hybrid feed (Tweet + Retweet) cannot be represented as one simple model QuerySet. Standard DRF search/filter flows assume a QuerySet-backed pipeline and can break when the feed becomes a Python list.
Hidden trap: if search is delegated to default QuerySet-based filtering while the feed is list-backed, pagination and filtering can diverge or fail, especially under mixed object types.
- Build two optimized QuerySets first (tweets and retweets).
- Merge and sort in memory to guarantee global chronology across types.
- Apply explicit in-memory search over normalized fields (`content`, `quote`, `username`) to keep behavior deterministic.
- Paginate after merge/search to avoid cross-type ordering distortion.
- Cache feed responses using keys scoped by user + page + search + version, preventing cache collisions and stale cross-query reads.
The feed remains semantically correct, paginates consistently, and avoids the common search/pagination regressions seen in mixed-content timelines.
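The search-then-paginate order above can be sketched as plain functions over the merged list. The searched field names (`content`, `quote`, `username`) come from the text; the dict-shaped items and the page size are assumptions.

```python
def search_feed(items, query):
    """Case-insensitive substring match over normalized feed fields."""
    q = query.strip().lower()
    if not q:
        return items
    return [
        item for item in items
        if any(q in (item.get(field) or "").lower()
               for field in ("content", "quote", "username"))
    ]

def paginate(items, page, page_size=20):
    """Slice a page AFTER merge and search, never before."""
    start = (page - 1) * page_size
    return items[start:start + page_size]

feed = [
    {"content": "Hello Django", "quote": None, "username": "alice"},
    {"content": "Redis tips", "quote": "cache it", "username": "bob"},
]
page = paginate(search_feed(feed, "django"), page=1)
```

Because the slice happens last, page boundaries always reflect the globally sorted, filtered feed — the regression the "hidden trap" above warns about cannot occur.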
- SQLite -> PostgreSQL migration: prototype-friendly local beginnings evolved into PostgreSQL-backed relational workloads for stronger concurrency semantics and production readiness.
- N+1 mitigation in nested comments: `select_related` + targeted `Prefetch` pipelines reduce query fan-out for comment trees and replies.
- Multi-level caching with Redis:
- feed-level caching
- user-posts caching
- tweet-detail caching
- versioned cache invalidation keys for low-cost stale-busting
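Versioned invalidation can be sketched with a plain dict standing in for Redis: instead of deleting every cached page, a per-user version counter embedded in each key is bumped, so stale entries simply stop being addressed. Key layout and names are assumptions.

```python
cache = {}  # stand-in for Redis

def user_version(user_id):
    """Current cache version for this user (defaults to 1)."""
    return cache.setdefault(f"feed-version:{user_id}", 1)

def bump_version(user_id):
    """Invalidate all of this user's cached pages in O(1)."""
    cache[f"feed-version:{user_id}"] = user_version(user_id) + 1

def feed_key(user_id, page):
    return f"feed:u{user_id}:v{user_version(user_id)}:p{page}"

cache[feed_key(1, 1)] = ["cached page"]
bump_version(1)            # e.g. the user just posted a tweet
stale = cache.get(feed_key(1, 1))  # miss: the key now carries v2
```

The old `v1` entry is left to expire via TTL rather than being deleted, which keeps invalidation a constant-cost operation regardless of how many pages were cached.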
- JWT authentication using `djangorestframework-simplejwt`
- refresh token blacklisting on logout
- scoped throttling for abuse control:
- auth endpoints (brute-force mitigation)
- content creation (anti-spam)
- interaction endpoints (bot-like rapid actions)
- sensitive account operations
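A scoped-throttle setup like the one above is typically wired through DRF's `ScopedRateThrottle`. The scope names and rates below are illustrative assumptions, not the project's actual values.

```python
# settings.py fragment (illustrative): each view opts in with
# throttle_scope = "<scope>", and DRF rate-limits per scope.
REST_FRAMEWORK = {
    "DEFAULT_THROTTLE_CLASSES": [
        "rest_framework.throttling.ScopedRateThrottle",
    ],
    "DEFAULT_THROTTLE_RATES": {
        "auth": "10/min",               # brute-force mitigation on login/refresh
        "tweet-create": "30/min",       # anti-spam on content creation
        "interaction": "60/min",        # bot-like rapid like/follow bursts
        "account-sensitive": "5/hour",  # password reset, deactivate, etc.
    },
}
```

A view then declares, e.g., `throttle_scope = "auth"` to pick up its rate, keeping limits centralized in settings rather than scattered across views.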
Celery workers are integrated to move high-latency and side-effect work off the request cycle:
- asynchronous email dispatch (password/reset/account lifecycle)
- asynchronous mention parsing for tweets
- the same task pipeline pattern extends to other text-parsing workflows, such as hashtag extraction/indexing
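The parsing step such a worker performs can be sketched as a pure function; the regex, dedup rule, and return shape are assumptions about the project's implementation.

```python
import re

# Matches @username tokens; word characters only, as a simplifying assumption.
MENTION_RE = re.compile(r"@([A-Za-z0-9_]+)")

def extract_mentions(text):
    """Return the unique @usernames in a tweet, first-seen order,
    deduplicated case-insensitively."""
    seen, result = set(), []
    for username in MENTION_RE.findall(text):
        if username.lower() not in seen:
            seen.add(username.lower())
            result.append(username)
    return result

mentions = extract_mentions("ping @alice and @bob, thanks @Alice!")
# -> ["alice", "bob"]
```

In the project this logic runs inside a Celery task after the tweet row commits, so slow parsing never blocks the request cycle; swapping the regex for `#(\w+)` yields the hashtag-extraction variant mentioned above.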
- Django 5 + Django REST Framework
- PostgreSQL
- Redis (cache + Celery broker/result backend)
- Celery
- drf-spectacular (OpenAPI 3.0)
OpenAPI schema and interactive docs are available out of the box:
- Swagger UI: `/api/docs/`
- ReDoc: `/api/redoc/`
- Raw schema: `/api/schema/`
The social graph is implemented as an explicit Follow model containing two foreign keys to the same user table:
- `follower` -> the user who initiates the relationship
- `following` -> the user being followed
Why this design:
- preserves directionality (`A follows B` != `B follows A`)
- allows relational constraints (`unique_together`) to prevent duplicate edges
- supports metadata (`created_at`) on edges
- keeps follower/following queries index-friendly and expressive
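The directionality and duplicate-edge guarantees can be shown with a framework-free directed-graph sketch: each edge is an ordered `(follower, following)` pair, and a set plays the role of the `unique_together` constraint. Names are illustrative.

```python
follows = set()  # each element is an ordered (follower, following) edge

def follow(follower, following):
    """Add a directed edge; reject duplicates like unique_together would."""
    edge = (follower, following)
    if edge in follows:
        return False  # duplicate edge -> IntegrityError in the real model
    follows.add(edge)
    return True

follow("A", "B")   # A follows B
follow("B", "A")   # a distinct edge: B follows A
follow("A", "B")   # rejected duplicate
```

Because edges are ordered pairs, `A follows B` and `B follows A` coexist as separate rows, while the uniqueness rule keeps the graph free of duplicate edges.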
- real-time fan-out and notifications using WebSockets with Django Channels
- media storage and delivery via S3 + CDN
- PostgreSQL trigram-based search optimization for faster fuzzy search at scale
1. Install dependencies: `pip install -r requirements.txt`
2. Configure environment variables for PostgreSQL, email, and secrets.
3. Apply migrations: `python manage.py migrate`
4. Run the API server: `python manage.py runserver`
5. Start a Celery worker (in a separate terminal): `celery -A config worker -l info`