A powerful Python tool that extracts audio from MP4 videos, transcribes speech using OpenAI Whisper, translates text to multiple languages, and generates properly formatted SRT subtitle files.
- Audio Extraction: Extracts high-quality audio from MP4 video files using FFmpeg
- Speech Recognition: Uses OpenAI Whisper for accurate speech-to-text transcription
- Multi-language Translation: Translates transcribed text using deep-translator library
- SRT Generation: Creates properly formatted subtitle files with precise timestamps
- CLI Interface: Easy-to-use command-line interface with progress tracking
- Programmatic API: Full Python API for integration into other applications
- Robust Error Handling: Comprehensive validation and error recovery mechanisms
- Performance Optimized: Multi-threaded translation and efficient audio processing
# Install the package
pip install -e .
# Translate a video file
subtitles-translator translate video.mp4 --target-lang es
# Transcribe only (no translation)
subtitles-translator transcribe video.mp4
# List supported languages
subtitles-translator list-languages
- Python: 3.8 or higher
- FFmpeg: Must be installed separately (see installation guide below)
- Operating System: Windows, macOS, or Linux
- Download FFmpeg from https://ffmpeg.org/download.html
- Extract and add to your PATH environment variable
- Verify installation:
ffmpeg -version
# Using Homebrew
brew install ffmpeg
# Debian/Ubuntu
sudo apt update
sudo apt install ffmpeg
# CentOS/RHEL
sudo yum install epel-release
sudo yum install ffmpeg
# Clone the repository
git clone https://github.com/yourusername/SubtitlesTranslator.git
cd SubtitlesTranslator

# Create virtual environment
python -m venv venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate

# Install dependencies
pip install -e .

# Verify installation
subtitles-translator --help
pip install subtitles-translator
# Translate Spanish video to English subtitles
subtitles-translator translate video.mp4 --source-lang es --target-lang en
# Auto-detect source language
subtitles-translator translate video.mp4 --target-lang fr
# Specify output file
subtitles-translator translate video.mp4 --output subtitles.srt --target-lang de
# Use larger Whisper model for better accuracy
subtitles-translator translate video.mp4 --model-size large --target-lang es
# Multi-threaded translation for faster processing
subtitles-translator translate video.mp4 --threads 10 --target-lang ja
# Transcribe only (no translation)
subtitles-translator transcribe video.mp4 --model-size medium
# Enable verbose logging
subtitles-translator --verbose translate video.mp4 --target-lang pt
# Log to file
subtitles-translator --log-file process.log translate video.mp4 --target-lang ru

| Command | Description |
|---|---|
| `translate` | Extract, transcribe, translate, and generate SRT subtitles |
| `transcribe` | Extract and transcribe audio without translation |
| `list-languages` | Show all supported language codes |
| Option | Description | Default |
|---|---|---|
| `--source-lang`, `-s` | Source language code | auto |
| `--target-lang`, `-t` | Target language code | en |
| `--model-size`, `-m` | Whisper model size | base |
| `--threads` | Translation thread count | 5 |
| `--output`, `-o` | Output file path | Auto-generated |
| `--no-translation` | Skip translation step | False |
| `--verbose`, `-v` | Enable verbose logging | False |
| `--quiet`, `-q` | Suppress console output | False |
| `--log-file` | Log to file | None |
from pathlib import Path
from subtitles_translator.core.audio_extractor import AudioExtractor
from subtitles_translator.core.speech_recognizer import SpeechRecognizer
from subtitles_translator.core.translator import Translator
from subtitles_translator.core.srt_generator import SRTGenerator
# Process video file
video_path = Path("video.mp4")
# Extract audio
audio_extractor = AudioExtractor()
audio_path = audio_extractor.extract_audio(video_path)
# Transcribe speech
speech_recognizer = SpeechRecognizer(model_size="base")
segments = speech_recognizer.transcribe_audio(audio_path)
# Translate text
translator = Translator(source_lang="es", target_lang="en")
translated_segments = translator.translate_segments(segments)
# Generate SRT file
SRTGenerator.save_srt_file(translated_segments, Path("output.srt"))

from pathlib import Path
from subtitles_translator.core.audio_extractor import AudioExtractor
from subtitles_translator.core.speech_recognizer import SpeechRecognizer
from subtitles_translator.utils.progress import StageProgressTracker
from subtitles_translator.utils.logger import setup_logger
# Setup logging
logger = setup_logger(level="DEBUG", log_file=Path("debug.log"))
# Configure audio extraction
audio_extractor = AudioExtractor(temp_dir=Path("temp"))
# Configure speech recognition with large model
speech_recognizer = SpeechRecognizer(model_size="large")
# Track progress through stages
stages = {
"Audio Extraction": 1,
"Speech Recognition": 3,
"Translation": 2,
"SRT Generation": 1
}
with StageProgressTracker(stages) as progress:
    # Your processing pipeline here
    pass

The application supports all languages available in the deep-translator library. Common language codes include:
| Code | Language | Code | Language |
|---|---|---|---|
| `auto` | Auto-detect | `en` | English |
| `es` | Spanish | `fr` | French |
| `de` | German | `it` | Italian |
| `pt` | Portuguese | `ru` | Russian |
| `ja` | Japanese | `ko` | Korean |
| `zh` | Chinese | `ar` | Arabic |
| `hi` | Hindi | `th` | Thai |
View all supported codes with: `subtitles-translator list-languages`
- tiny: Fastest, lowest accuracy (~39 MB)
- base: Good balance of speed/accuracy (~74 MB) - Default
- small: Better accuracy (~244 MB)
- medium: High accuracy (~769 MB)
- large: Highest accuracy (~1550 MB)
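Choosing between these sizes is usually a memory/accuracy trade-off. The sketch below is a hypothetical helper (not part of the package); the RAM figures are rough assumptions based on the approximate model sizes listed above, not official requirements.

```python
# Rough, assumed RAM budgets per Whisper model size (GB) -- illustrative only.
MODEL_RAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}

def pick_model_size(available_ram_gb: float, preferred: str = "large") -> str:
    """Fall back to smaller models until one fits the available RAM."""
    order = ["large", "medium", "small", "base", "tiny"]
    for size in order[order.index(preferred):]:
        if MODEL_RAM_GB[size] <= available_ram_gb:
            return size
    return "tiny"  # nothing fits comfortably; use the smallest model

print(pick_model_size(4))   # a 4 GB budget lands on "small"
print(pick_model_size(16))  # plenty of RAM keeps "large"
```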
- Use the `--threads` parameter to increase the number of parallel translation workers
- Default: 5 threads; increase for faster processing on multi-core systems
- Monitor memory usage with larger thread counts
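The multi-threaded translation described above can be sketched with a standard thread pool. Here `translate_segment` is a stub standing in for a real per-segment translation call; it is not the package's actual API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

def translate_segment(text: str) -> str:
    # Stub: a real implementation would call the translation service here.
    return f"[translated] {text}"

def translate_all(segments: List[str], threads: int = 5) -> List[str]:
    # executor.map preserves input order, so subtitle timing stays aligned
    # with the original segments even when workers finish out of order.
    with ThreadPoolExecutor(max_workers=threads) as executor:
        return list(executor.map(translate_segment, segments))

print(translate_all(["Hello", "World"], threads=2))
```

Because each segment is an independent network call, threads (rather than processes) are typically enough to hide the I/O latency.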
- GPU Support: Whisper can use CUDA for faster transcription
- Memory: Larger models require more RAM
- Storage: Temporary audio files require disk space
Error: FFmpeg not found in system PATH
Solution: Install FFmpeg and add it to your system PATH
Error: CUDA out of memory
Solutions:
- Use a smaller Whisper model (`--model-size small`)
- Close other GPU applications
- Use CPU processing by setting an environment variable
Error: Could not extract audio from video
Solutions:
- Verify video file is not corrupted
- Check video format is supported (MP4 recommended)
- Ensure sufficient disk space for temporary files
Error: Translation service unavailable
Solutions:
- Check internet connection
- Try reducing thread count
- Use `--no-translation` for transcription only
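When the translation service drops out intermittently, a retry with exponential backoff often recovers without restarting the whole job. The helper below is a hedged sketch, not part of the package's actual API; `retry` and `flaky` are hypothetical names used for illustration.

```python
import time

def retry(func, attempts: int = 3, base_delay: float = 0.1):
    """Call func, retrying with exponential backoff on ConnectionError."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # exhausted all attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Simulated flaky translation call: fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("translation service unavailable")
    return "ok"

print(retry(flaky))  # succeeds on the third attempt
```

The same idea applies per subtitle segment, so one transient failure does not discard an otherwise complete translation pass.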
Enable verbose logging for detailed troubleshooting:
subtitles-translator --verbose --log-file debug.log translate video.mp4 --target-lang es

- Check the troubleshooting section in the documentation
- Enable debug logging to identify specific issues
- Verify all prerequisites are correctly installed
- Test with a small sample video file first
- Video: MP4 (recommended), AVI, MOV, MKV, WMV
- Audio: MP3, WAV, FLAC, AAC, OGG
- Subtitles: SRT format with proper timing
- Audio: Temporary WAV files (automatically cleaned up)
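For reference, "SRT format with proper timing" means each cue carries an index, an `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, and the text. A minimal sketch of that formatting (hypothetical helper names, not the package's `SRTGenerator` API):

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT HH:MM:SS,mmm timestamp."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def srt_block(index: int, start: float, end: float, text: str) -> str:
    """One numbered SRT cue: index line, time range, text, trailing newline."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

print(srt_block(1, 3.5, 6.25, "Hello, world"))
```

Note the comma (not a period) before the milliseconds; many players reject SRT files that get this wrong.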
| Variable | Description | Default |
|---|---|---|
| `CUDA_VISIBLE_DEVICES` | GPU selection for Whisper | All available |
| `WHISPER_CACHE_DIR` | Model cache directory | System default |
| `TEMP_AUDIO_DIR` | Temporary audio storage | System temp |
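A variable like `TEMP_AUDIO_DIR` would typically be resolved with a fall-back to the system default. This is a sketch of that pattern, assuming the variable name from the table; the application's actual resolution logic may differ.

```python
import os
import tempfile
from pathlib import Path

def resolve_temp_audio_dir() -> Path:
    """Use TEMP_AUDIO_DIR when set, otherwise the system temp directory."""
    return Path(os.environ.get("TEMP_AUDIO_DIR", tempfile.gettempdir()))

os.environ.pop("TEMP_AUDIO_DIR", None)
print(resolve_temp_audio_dir())  # system temp directory

os.environ["TEMP_AUDIO_DIR"] = "/var/cache/subtitles-audio"
print(resolve_temp_audio_dir())  # the overridden path
```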
The application uses configuration through command-line arguments. Future versions may include configuration file support.
# Install development dependencies
pip install -e .[dev]
# Run all tests
pytest
# Run with coverage
pytest --cov=subtitles_translator
# Run specific test categories
python run_tests.py unit
python run_tests.py integration
python run_tests.py performance

# Format code
black subtitles_translator tests
# Lint code
flake8 subtitles_translator tests
# Type checking
mypy subtitles_translator

We welcome contributions! Please read our contributing guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes with tests
- Run the test suite (`pytest`)
- Format your code (`black .`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
# Clone your fork
git clone https://github.com/yourusername/SubtitlesTranslator.git
cd SubtitlesTranslator
# Create development environment
python -m venv venv
source venv/bin/activate # Linux/macOS
# or
venv\Scripts\activate # Windows
# Install in development mode
pip install -e .[dev]
# Install pre-commit hooks
pre-commit install

This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper for speech recognition
- Deep Translator for translation services
- FFmpeg for audio/video processing
- Click for CLI interface
- Version: 1.0.0
- Status: Beta
- Python Support: 3.8, 3.9, 3.10, 3.11, 3.12
- Maintenance: Actively maintained
- GUI interface
- Batch processing for multiple files
- Configuration file support
- Additional subtitle formats (VTT, ASS)
- Cloud translation service integration
- Real-time processing for live streams