A powerful Python tool that extracts audio from MP4 videos, transcribes speech using OpenAI Whisper, translates text to multiple languages, and generates properly formatted SRT subtitle files.
- Audio Extraction: Extracts high-quality audio from MP4 video files using FFmpeg
- Speech Recognition: Uses OpenAI Whisper for accurate speech-to-text transcription
- Multi-language Translation: Translates transcribed text using deep-translator library
- SRT Generation: Creates properly formatted subtitle files with precise timestamps
- CLI Interface: Easy-to-use command-line interface with progress tracking
- Programmatic API: Full Python API for integration into other applications
- Robust Error Handling: Comprehensive validation and error recovery mechanisms
- Performance Optimized: Multi-threaded translation and efficient audio processing
# Install the package
pip install -e .
# Translate a video file
subtitles-translator translate video.mp4 --target-lang es
# Transcribe only (no translation)
subtitles-translator transcribe video.mp4
# List supported languages
subtitles-translator list-languages
- Python: 3.8 or higher
- FFmpeg: Must be installed separately (see installation guide below)
- Operating System: Windows, macOS, or Linux
- Download FFmpeg from https://ffmpeg.org/download.html
- Extract and add to your PATH environment variable
- Verify installation:
ffmpeg -version
# Using Homebrew
brew install ffmpeg
# Debian/Ubuntu
sudo apt update
sudo apt install ffmpeg
# CentOS/RHEL
sudo yum install epel-release
sudo yum install ffmpeg
# Clone the repository
git clone https://github.com/yourusername/SubtitlesTranslator.git
cd SubtitlesTranslator

# Create virtual environment
python -m venv venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate

# Install dependencies
pip install -e .

# Verify installation
subtitles-translator --help
pip install subtitles-translator
# Translate Spanish video to English subtitles
subtitles-translator translate video.mp4 --source-lang es --target-lang en
# Auto-detect source language
subtitles-translator translate video.mp4 --target-lang fr
# Specify output file
subtitles-translator translate video.mp4 --output subtitles.srt --target-lang de
# Use larger Whisper model for better accuracy
subtitles-translator translate video.mp4 --model-size large --target-lang es
# Multi-threaded translation for faster processing
subtitles-translator translate video.mp4 --threads 10 --target-lang ja
# Transcribe only (no translation)
subtitles-translator transcribe video.mp4 --model-size medium
# Enable verbose logging
subtitles-translator --verbose translate video.mp4 --target-lang pt
# Log to file
subtitles-translator --log-file process.log translate video.mp4 --target-lang ru

| Command | Description |
|---|---|
| `translate` | Extract, transcribe, translate, and generate SRT subtitles |
| `transcribe` | Extract and transcribe audio without translation |
| `list-languages` | Show all supported language codes |
| Option | Description | Default |
|---|---|---|
| `--source-lang`, `-s` | Source language code | auto |
| `--target-lang`, `-t` | Target language code | en |
| `--model-size`, `-m` | Whisper model size | base |
| `--threads` | Translation thread count | 5 |
| `--output`, `-o` | Output file path | Auto-generated |
| `--no-translation` | Skip translation step | False |
| `--verbose`, `-v` | Enable verbose logging | False |
| `--quiet`, `-q` | Suppress console output | False |
| `--log-file` | Log to file | None |
from pathlib import Path
from subtitles_translator.core.audio_extractor import AudioExtractor
from subtitles_translator.core.speech_recognizer import SpeechRecognizer
from subtitles_translator.core.translator import Translator
from subtitles_translator.core.srt_generator import SRTGenerator
# Process video file
video_path = Path("video.mp4")
# Extract audio
audio_extractor = AudioExtractor()
audio_path = audio_extractor.extract_audio(video_path)
# Transcribe speech
speech_recognizer = SpeechRecognizer(model_size="base")
segments = speech_recognizer.transcribe_audio(audio_path)
# Translate text
translator = Translator(source_lang="es", target_lang="en")
translated_segments = translator.translate_segments(segments)
# Generate SRT file
SRTGenerator.save_srt_file(translated_segments, Path("output.srt"))

from pathlib import Path
from subtitles_translator.core.audio_extractor import AudioExtractor
from subtitles_translator.core.speech_recognizer import SpeechRecognizer
from subtitles_translator.utils.progress import StageProgressTracker
from subtitles_translator.utils.logger import setup_logger
# Setup logging
logger = setup_logger(level="DEBUG", log_file=Path("debug.log"))
# Configure audio extraction
audio_extractor = AudioExtractor(temp_dir=Path("temp"))
# Configure speech recognition with large model
speech_recognizer = SpeechRecognizer(model_size="large")
# Track progress through stages
stages = {
"Audio Extraction": 1,
"Speech Recognition": 3,
"Translation": 2,
"SRT Generation": 1
}
with StageProgressTracker(stages) as progress:
    # Your processing pipeline here
    pass

The application supports all languages available in the deep-translator library. Common language codes include:
| Code | Language | Code | Language |
|---|---|---|---|
| `auto` | Auto-detect | `en` | English |
| `es` | Spanish | `fr` | French |
| `de` | German | `it` | Italian |
| `pt` | Portuguese | `ru` | Russian |
| `ja` | Japanese | `ko` | Korean |
| `zh` | Chinese | `ar` | Arabic |
| `hi` | Hindi | `th` | Thai |
View all supported codes with: `subtitles-translator list-languages`
- tiny: Fastest, lowest accuracy (~39 MB)
- base: Good balance of speed/accuracy (~74 MB) - Default
- small: Better accuracy (~244 MB)
- medium: High accuracy (~769 MB)
- large: Highest accuracy (~1550 MB)
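Choosing between these sizes is usually a memory/accuracy trade-off. The sketch below is a hypothetical helper (not part of the package); the RAM figures are rough assumptions based on the approximate model sizes listed above, not official requirements.

```python
# Rough, assumed RAM budgets per Whisper model size (GB) -- illustrative only.
MODEL_RAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}

def pick_model_size(available_ram_gb: float, preferred: str = "large") -> str:
    """Fall back to smaller models until one fits the available RAM."""
    order = ["large", "medium", "small", "base", "tiny"]
    for size in order[order.index(preferred):]:
        if MODEL_RAM_GB[size] <= available_ram_gb:
            return size
    return "tiny"  # nothing fits comfortably; use the smallest model

print(pick_model_size(4))   # a 4 GB budget lands on "small"
print(pick_model_size(16))  # plenty of RAM keeps "large"
```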
- Use the `--threads` parameter to increase the number of parallel translation workers
- Default: 5 threads; increase for faster processing on multi-core systems
- Monitor memory usage with larger thread counts
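The multi-threaded translation described above can be sketched with a standard thread pool. Here `translate_segment` is a stub standing in for a real per-segment translation call; it is not the package's actual API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

def translate_segment(text: str) -> str:
    # Stub: a real implementation would call the translation service here.
    return f"[translated] {text}"

def translate_all(segments: List[str], threads: int = 5) -> List[str]:
    # executor.map preserves input order, so subtitle timing stays aligned
    # with the original segments even when workers finish out of order.
    with ThreadPoolExecutor(max_workers=threads) as executor:
        return list(executor.map(translate_segment, segments))

print(translate_all(["Hello", "World"], threads=2))
```

Because each segment is an independent network call, threads (rather than processes) are typically enough to hide the I/O latency.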
- GPU Support: Whisper can use CUDA for faster transcription
- Memory: Larger models require more RAM
- Storage: Temporary audio files require disk space
Error: FFmpeg not found in system PATH
Solution: Install FFmpeg and add it to your system PATH
Error: CUDA out of memory
Solutions:
- Use a smaller Whisper model (`--model-size small`)
- Close other GPU applications
- Use CPU processing by setting an environment variable
Error: Could not extract audio from video
Solutions:
- Verify video file is not corrupted
- Check video format is supported (MP4 recommended)
- Ensure sufficient disk space for temporary files
Error: Translation service unavailable
Solutions:
- Check internet connection
- Try reducing thread count
- Use `--no-translation` for transcription only
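When the translation service drops out intermittently, a retry with exponential backoff often recovers without restarting the whole job. The helper below is a hedged sketch, not part of the package's actual API; `retry` and `flaky` are hypothetical names used for illustration.

```python
import time

def retry(func, attempts: int = 3, base_delay: float = 0.1):
    """Call func, retrying with exponential backoff on ConnectionError."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # exhausted all attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Simulated flaky translation call: fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("translation service unavailable")
    return "ok"

print(retry(flaky))  # succeeds on the third attempt
```

The same idea applies per subtitle segment, so one transient failure does not discard an otherwise complete translation pass.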
Enable verbose logging for detailed troubleshooting:
subtitles-translator --verbose --log-file debug.log translate video.mp4 --target-lang es

- Check the troubleshooting section in the documentation
- Enable debug logging to identify specific issues
- Verify all prerequisites are correctly installed
- Test with a small sample video file first
- Video: MP4 (recommended), AVI, MOV, MKV, WMV
- Audio: MP3, WAV, FLAC, AAC, OGG
- Subtitles: SRT format with proper timing
- Audio: Temporary WAV files (automatically cleaned up)
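For reference, "SRT format with proper timing" means each cue carries an index, an `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, and the text. A minimal sketch of that formatting (hypothetical helper names, not the package's `SRTGenerator` API):

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT HH:MM:SS,mmm timestamp."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def srt_block(index: int, start: float, end: float, text: str) -> str:
    """One numbered SRT cue: index line, time range, text, trailing newline."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

print(srt_block(1, 3.5, 6.25, "Hello, world"))
```

Note the comma (not a period) before the milliseconds; many players reject SRT files that get this wrong.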
| Variable | Description | Default |
|---|---|---|
| `CUDA_VISIBLE_DEVICES` | GPU selection for Whisper | All available |
| `WHISPER_CACHE_DIR` | Model cache directory | System default |
| `TEMP_AUDIO_DIR` | Temporary audio storage | System temp |
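A variable like `TEMP_AUDIO_DIR` would typically be resolved with a fall-back to the system default. This is a sketch of that pattern, assuming the variable name from the table; the application's actual resolution logic may differ.

```python
import os
import tempfile
from pathlib import Path

def resolve_temp_audio_dir() -> Path:
    """Use TEMP_AUDIO_DIR when set, otherwise the system temp directory."""
    return Path(os.environ.get("TEMP_AUDIO_DIR", tempfile.gettempdir()))

os.environ.pop("TEMP_AUDIO_DIR", None)
print(resolve_temp_audio_dir())  # system temp directory

os.environ["TEMP_AUDIO_DIR"] = "/var/cache/subtitles-audio"
print(resolve_temp_audio_dir())  # the overridden path
```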
The application uses configuration through command-line arguments. Future versions may include configuration file support.
# Install development dependencies
pip install -e .[dev]
# Run all tests
pytest
# Run with coverage
pytest --cov=subtitles_translator
# Run specific test categories
python run_tests.py unit
python run_tests.py integration
python run_tests.py performance

# Format code
black subtitles_translator tests
# Lint code
flake8 subtitles_translator tests
# Type checking
mypy subtitles_translator

We welcome contributions! Please read our contributing guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes with tests
- Run the test suite (`pytest`)
- Format your code (`black .`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
# Clone your fork
git clone https://github.com/yourusername/SubtitlesTranslator.git
cd SubtitlesTranslator
# Create development environment
python -m venv venv
source venv/bin/activate # Linux/macOS
# or
venv\Scripts\activate # Windows
# Install in development mode
pip install -e .[dev]
# Install pre-commit hooks
pre-commit install

This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper for speech recognition
- Deep Translator for translation services
- FFmpeg for audio/video processing
- Click for CLI interface
- Version: 1.0.0
- Status: Beta
- Python Support: 3.8, 3.9, 3.10, 3.11, 3.12
- Maintenance: Actively maintained
- GUI interface
- Batch processing for multiple files
- Configuration file support
- Additional subtitle formats (VTT, ASS)
- Cloud translation service integration
- Real-time processing for live streams