SayLess is a Telegram bot that condenses long voice messages into short "soundbites" and automatically reacts to each one with an emoji that matches the speaker's emotional tone.
- Audio Transcription: Converts speech to text using Google's speech recognition service, with the language set to Singapore English (en-SG).
- Emotion Detection: Analyzes the transcript locally to determine the speaker's mood (Joy, Anger, Sadness, etc.).
- Soundbite Generation: Automatically trims and processes the original audio into a highlight reel based on extracted keywords.
- Native Reactions: Uses Telegram's native reaction system to mark each generated soundbite with an emoji for the detected emotion.
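As a rough illustration of how a detected emotion could be turned into a reaction, here is a minimal sketch. The label names, the emoji choices, and the `pick_reaction` helper are all assumptions made for this example; they are not taken from the SayLess source.

```python
# Hypothetical mapping from emotion labels to reaction emoji.
# Both the labels and the emoji are illustrative assumptions.
EMOTION_TO_EMOJI = {
    "joy": "😁",
    "anger": "😡",
    "sadness": "😢",
    "fear": "😱",
    "surprise": "🤯",
}

# Neutral fallback used when the label is not in the table (assumed).
DEFAULT_EMOJI = "👍"


def pick_reaction(emotion: str) -> str:
    """Return the emoji for an emotion label, ignoring case and padding."""
    return EMOTION_TO_EMOJI.get(emotion.strip().lower(), DEFAULT_EMOJI)
```

Telegram only accepts a fixed set of emoji as message reactions, so any mapping like this should be checked against the list in the Bot API documentation before use.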
For every voice message, the bot runs the following audio pipeline in real time:
- Ingestion: Downloads Telegram .ogg (Opus) voice files.
- Transcoding: Uses FFmpeg and PyDub to convert compressed .ogg to .wav for transcription.
- Analysis: Feeds the .wav file into the SpeechRecognition engine.
- Synthesis: Uses soundbite_generator to extract key segments and assemble them into a new, shorter .ogg voice message.
- Cleanup: Deletes all temporary files (.ogg, .wav) after processing to protect user privacy and free disk space.
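The ingestion, transcoding, and cleanup steps above can be sketched roughly as follows. This is not the bot's actual code: it calls the ffmpeg command line directly instead of going through PyDub so that the example needs only the standard library, and the function names, the 16 kHz mono target, and the injected `transcribe` callable are assumptions for illustration.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, List


def build_ffmpeg_command(src: Path, dst: Path) -> List[str]:
    """Build an ffmpeg call that converts src into a mono 16 kHz file at dst."""
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", str(src),  # input file (the .ogg/Opus voice note)
        "-ar", "16000",  # resample to 16 kHz (assumed target rate)
        "-ac", "1",      # downmix to a single channel
        str(dst),
    ]


def run_command(cmd: List[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, check=True)


def process_voice_message(
    ogg_bytes: bytes,
    transcribe: Callable[[Path], str],
    run: Callable[[List[str]], None] = run_command,
) -> str:
    """Save the voice note, transcode it to .wav, and return the transcript."""
    # TemporaryDirectory deletes the folder and both audio files when the
    # block exits, including when transcoding or transcription raises.
    with tempfile.TemporaryDirectory() as tmp:
        ogg_path = Path(tmp) / "voice.ogg"
        wav_path = Path(tmp) / "voice.wav"
        ogg_path.write_bytes(ogg_bytes)
        run(build_ffmpeg_command(ogg_path, wav_path))
        return transcribe(wav_path)
```

Tying cleanup to a context manager rather than to explicit delete calls means a failed conversion cannot leave audio behind on disk. Passing `transcribe` and `run` in as parameters keeps the sketch independent of any particular speech recognition client.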
FFmpeg must be installed and available on your PATH for audio conversion to work:
- Windows (Chocolatey): choco install ffmpeg
- macOS (Homebrew): brew install ffmpeg
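To confirm the installation before starting the bot, a small check along these lines may help. It is a minimal sketch using only the standard library; `find_ffmpeg` is a hypothetical helper name, not part of SayLess.

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path to the ffmpeg executable, or None if not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    print(f"FFmpeg found at {path}" if path else "FFmpeg not found on PATH")
```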
Create a .env file in the project root with the following keys:
BOT_TOKEN=your_telegram_bot_token
FREESOUND_API_KEY=your_freesound_api_key
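For illustration, the values in this file could be read at startup roughly as shown below. This sketch parses the file with the standard library only; a project like this might instead rely on a package such as python-dotenv, and the helper names `parse_env_file` and `require` are assumptions rather than SayLess functions.

```python
import os
from pathlib import Path
from typing import Dict


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require(values: Dict[str, str], key: str) -> str:
    """Return a setting from the parsed file or the environment, or fail."""
    value = values.get(key) or os.environ.get(key)
    if not value:
        raise RuntimeError(f"Missing required setting: {key}")
    return value
```

A typical call would be `require(parse_env_file(Path(".env")), "BOT_TOKEN")`, which fails early with a clear message if the token is missing instead of failing later on the first Telegram request.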