Technical Architecture

System Overview

False Identity Generator is built on a modular architecture designed for extensibility, maintainability, and data-driven configuration.

Core Components

1. Generator Module (`src/generator.py`)

Purpose: Core identity generation engine

Key Classes:

IdentityGenerator: Main orchestrator for identity creation

Key Methods:

generate(): Main entry point for identity generation
_generate_family(): Family structure generation with fertility constraints
_generate_work_experience(): Employment history with coherent timelines
_generate_physical_characteristics(): Age-appropriate physical traits
_get_job_for_age(): Age-appropriate job assignment
_generate_death_info_for_parent(): Realistic parent death scenarios

Lines of Code: ~2,400 (after modularization)

2. Data Loader Module (`src/data_loader.py`)

Purpose: File I/O and data caching layer

Key Classes:

DataLoader: Handles all data loading with caching

Key Features:

Caching system for performance
Age-based data selection (buckets: very_young, young_adults, adults, middle_aged, older_adults, seniors, elderly)
Gender-aware data loading
Country-specific and global data support

Methods (20+ public methods):

load_names(), load_surnames(): Name data by age/gender
load_jobs(): Jobs filtered by gender and social class
load_regional_characteristics(): Age-filtered historical/cultural traits
load_rules(): Country configuration
load_death_causes(): Age-appropriate death causes

Lines of Code: ~600

3. Models Module (`src/models.py`)

Purpose: Data structures and display logic

Key Classes:

Identity: Main data structure (dataclass)
CountryRules: Configuration wrapper

Key Features:

Display formatting with colored output
Twin/triplet detection in families
Time calculation utilities
Box drawing for CLI interface

Lines of Code: ~800

4. UI Modules (`src/ui/`)

Components:

display.py: Terminal UI formatting
menus.py: Interactive menu systems

5. Storage Module (`src/storage.py`)

Purpose: Identity persistence and retrieval

Features:

JSON serialization
File-based storage
Identity search and management

Data Flow

User Input
    ↓
main.py → IdentityGenerator
    ↓
generate() method
    ↓
    ├→ DataLoader.load_rules() → Country configuration
    ├→ DataLoader.load_names() → Age/gender-appropriate names
    ├→ _generate_family() → Family structure
    │   ├→ Parent generation (age constraints)
    │   ├→ Sibling generation (fertility validation)
    │   └→ Children generation (fertility validation)
    ├→ _get_job_for_age() → Age-appropriate employment
    ├→ _generate_work_experience() → Employment history
    ├→ _generate_physical_characteristics() → Physical traits
    └→ load_regional_characteristics() → Cultural context
    ↓
Identity object
    ↓
Storage.save() → JSON file

Key Algorithms

Age-Based Data Selection

AGE_BUCKETS = [
    "very_young",      # min_age to min_age+5
    "young_adults",    # min_age+5 to min_age+16
    "adults",          # min_age+16 to 50
    "middle_aged",     # 50 to 65
    "older_adults",    # 65 to life_expectancy
    "seniors",         # Beyond life_expectancy
    "elderly"          # Oldest segment
]

Data files are organized by bucket for realistic name, hobby, and characteristic selection.

Parent Age Validation

Constraints:

parent_age = identity_age + random(parent_min_gap, parent_max_gap)
parent_age <= life_expectancy + random(0,5) (realistic cap)
Death timing based on identity age:
- Identity 70+: parent died 10-30 years ago
- Identity 50-70: parent died 5-20 years ago
- Identity 35-50: parent died 0-15 years ago

Sibling Age Calculation with Fertility Constraints

# Mother's current age (or age at death)
mother_age = mother['current_age'] or mother['age_at_death']

# Valid sibling age range
max_sibling_age = mother_age - fertility_min  # Mother was AT LEAST fertility_min
min_sibling_age = max(1, mother_age - fertility_max)  # Mother was AT MOST fertility_max

# Also respect ±15 years from identity age for realism
max_sibling_age = min(max_sibling_age, identity_age + 15)
min_sibling_age = max(min_sibling_age, max(1, identity_age - 15))

Result: No impossible scenarios (e.g., mother giving birth at age 8)

Twin/Triplet Detection

# Allow same age with 15% probability
if random.random() < 0.15 and sibling_count_at_age < 3:
    # Assign same age as existing sibling
    used_ages[age] += 1

Display logic detects multiples by counting siblings with identical ages.

Job Age Filtering

def _filter_jobs_by_age(jobs, age, age_requirements):
    """Filter jobs to only those person is old enough for."""
    filtered = []
    for job in jobs:
        min_age = age_requirements.get(job, 18)  # Default: 18
        if age >= min_age:
            filtered.append(job)
    return filtered

Safeguard: Persons under 18 are automatically assigned "Student" status.

Configuration System

Country Rules (`data/countries/{country}/rules.txt`)

Key-value pairs controlling generation logic:

# Demographics
min_age=18
max_age=83
life_expectancy=83

# Fertility
female_fertility_min_age=15
female_fertility_max_age=45
parent_min_age_gap=18
parent_max_age_gap=45

# Probabilities
death_prob_parents_elderly=95
unemployment_rate=13.5

# Formats
date_format=DD/MM/YYYY
phone_format=### ## ## ##

Data File Formats

Names (data/countries/{country}/names/{gender}/{age_bucket}.txt):

María
José
Carmen

Regional Characteristics (data/countries/{country}/regional_characteristics.txt):

characteristic_text|min_age|max_age|gender|probability
Remembers the 2004 Madrid train bombings|27|999|both|0.7

min_age: Person must be THIS age NOW to have this characteristic
Calculated so person was 5-7+ years old when event occurred (realistic memory)

Jobs (data/global/jobs/{gender}/{social_class}.txt):

Software Engineer
Nurse
Teacher

Age Requirements (data/global/jobs/age_requirements.txt):

Nurse:22
Lawyer:24
Pharmacy Technician:21

Performance Optimizations

Caching: DataLoader caches all loaded files in _cache dict
Lazy Loading: Data only loaded when needed
Random Sampling: Efficient selection from weighted distributions
Minimal File I/O: Read files once, use repeatedly

Error Handling

Data Validation:

Age range checks before all operations
Fallback to defaults if data files missing
Safe random selection with bounds checking

Edge Case Handling:

Empty job lists → fallback to "Student"
Invalid date ranges → swap min/max or adjust
Missing data files → use hardcoded defaults

Extensibility

Adding a New Country

Create directory: data/countries/{country}/
Add configuration: rules.txt
Add name files: names/{gender}/{age_bucket}.txt
Add surnames: surnames/{gender}.txt (if gendered) or surnames.txt
Add cities: cities.txt
Add regional characteristics: regional_characteristics.txt
Mark as complete: Set finished=true in rules.txt

Adding New Data Fields

Extend Identity dataclass in models.py
Add generation method in generator.py
Add data files in appropriate locations
Update DataLoader with loading method
Update display logic in models.py

Testing

Test Suite (`test_features.py`)

Validates:

Age ranges (18-83 for Spain)
Phone formats (mobile only: +34 6XX or 7XX)
Hobbies presence
Language coherence (university graduates have B1+ English)
Parent death validation (if >life_expectancy, must be deceased)
Fertility constraints (mother age at birth: 15-45)
Regional characteristics age-appropriateness

Success Rate: 98-100% (50 identity test)

Security Considerations

Data Privacy:

All generated data is synthetic
No real person data included
Random seeds ensure unpredictability

Abuse Prevention:

Legal disclaimers
No built-in document generation
Educational purpose markers

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Technical Architecture

System Overview

Core Components

1. Generator Module (`src/generator.py`)

2. Data Loader Module (`src/data_loader.py`)

3. Models Module (`src/models.py`)

4. UI Modules (`src/ui/`)

5. Storage Module (`src/storage.py`)

Data Flow

Key Algorithms

Age-Based Data Selection

Parent Age Validation

Sibling Age Calculation with Fertility Constraints

Twin/Triplet Detection

Job Age Filtering

Configuration System

Country Rules (`data/countries/{country}/rules.txt`)

Data File Formats

Performance Optimizations

Error Handling

Extensibility

Adding a New Country

Adding New Data Fields

Testing

Test Suite (`test_features.py`)

Security Considerations

FilesExpand file tree

ARCHITECTURE.md

Latest commit

History

ARCHITECTURE.md

File metadata and controls

Technical Architecture

System Overview

Core Components

1. Generator Module (src/generator.py)

2. Data Loader Module (src/data_loader.py)

3. Models Module (src/models.py)

4. UI Modules (src/ui/)

5. Storage Module (src/storage.py)

Data Flow

Key Algorithms

Age-Based Data Selection

Parent Age Validation

Sibling Age Calculation with Fertility Constraints

Twin/Triplet Detection

Job Age Filtering

Configuration System

Country Rules (data/countries/{country}/rules.txt)

Data File Formats

Performance Optimizations

Error Handling

Extensibility

Adding a New Country

Adding New Data Fields

Testing

Test Suite (test_features.py)

Security Considerations

1. Generator Module (`src/generator.py`)

2. Data Loader Module (`src/data_loader.py`)

3. Models Module (`src/models.py`)

4. UI Modules (`src/ui/`)

5. Storage Module (`src/storage.py`)

Country Rules (`data/countries/{country}/rules.txt`)

Test Suite (`test_features.py`)