TalkBot reads Discord text channel messages aloud in voice channels. Type a message and the bot speaks it. Supports multiple voice providers, per-user voice customization, and works on any server.
This is v2, a major rewrite of TalkBot. The codebase has been migrated to TypeScript, all secrets moved to `.env`, and Docker startup has changed. A new local TTS provider (Kokoro) has been added: no cloud API keys needed, runs on your machine. If you were running v1 and want to upgrade, see Migrating to v2. If you need the old version, switch to the `master` branch on GitHub (click Branch: v2 at the top of the repo and select `master`).
- What You Need
- Download
- Quick Start (Docker)
- Migrating to v2
- Quick Start (No Docker)
- Discord Setup
- TTS Providers
- Commands
- Environment Variables Reference
- Setup on Coolify
- Local Development
- Project Structure
- Troubleshooting
- Acknowledgements
- A Discord bot token — free from the Discord Developer Portal (see Discord Setup)
- API credentials for a TTS (text-to-speech) service — at least one. Amazon Polly is the easiest to start with (see TTS Providers). Or skip the cloud APIs entirely and run a local TTS model (Kokoro) if you have spare CPU/GPU resources
- A place to run the bot — your own computer, a VPS, or a hosting service like Coolify
Option A — Download as ZIP (no git required):
- Click the green Code button at the top of this page
- Click Download ZIP
- Extract the ZIP to a folder on your computer
Option B — Clone with git:
```bash
git clone https://github.com/nullabork/talkbot.git
cd talkbot
```

Docker packages the bot and everything it needs into a container so you don't have to install Node.js or manage dependencies. If you don't have Docker, see Quick Start (No Docker) instead.
1. Install Docker if you don't have it: docker.com/get-started

2. Open a terminal in the talkbot folder.

3. Create your settings file:
   - Copy the file `.env.example` and rename the copy to `.env`
   - Open `.env` in a text editor (Notepad, VS Code, etc.)
   - Paste your Discord bot token after `DISCORD_TOKEN=`
   - Paste your TTS provider credentials (see TTS Providers), or skip cloud APIs and use the local Kokoro model instead

4. Start the bot:

   ```bash
   # Mac / Linux:
   ./docker-start.sh

   # Windows:
   docker-start.cmd
   ```

   The start script reads your `.env` and launches the right containers automatically (including Kokoro TTS if enabled). If Kokoro hasn't been downloaded yet, it will prompt you first.

5. Check it's working:

   ```bash
   ./docker-start.sh logs    # Mac / Linux
   docker-start.cmd logs     # Windows
   ```

   You should see `Loaded the <Provider> TTS API credentials OK.` and then the bot logging into Discord. Press Ctrl+C to stop watching logs (the bot keeps running).
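The settings file is a plain list of `KEY=VALUE` lines. As a rough illustration of what a correctly filled-in file looks like to a program, the sketch below parses that format and flags the two omissions most likely to stop a first start. `parseDotEnv` and `checkSettings` are hypothetical helpers written for this example; they are not part of TalkBot, and the real loader may treat quoting and comments differently.

```typescript
// Minimal .env parser: KEY=VALUE lines, '#' comments, optional surrounding quotes.
// Illustrative stand-in only, not the parser TalkBot or the start scripts use.
function parseDotEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line[0] === "#") continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=VALUE line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value[0];
    if (value.length >= 2 && (first === '"' || first === "'") && value[value.length - 1] === first) {
      value = value.slice(1, -1);
    }
    out[key] = value;
  }
  return out;
}

// Report the settings a first-time setup most often forgets.
function checkSettings(env: Record<string, string>): string[] {
  const problems: string[] = [];
  if (!env["DISCORD_TOKEN"]) problems.push("DISCORD_TOKEN is empty");
  const anyProvider = Object.keys(env).some(
    (k) => /^TTS_[A-Z]+_ENABLED$/.test(k) && (env[k] ?? "").toLowerCase() === "true"
  );
  if (!anyProvider) problems.push("no TTS_<PROVIDER>_ENABLED=true entry");
  return problems;
}
```

An empty list from `checkSettings` only means the two basics are present; it says nothing about whether the token or credentials are valid.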
To stop the bot:
- Mac / Linux: `./docker-start.sh down`
- Windows: `docker-start.cmd down`

To restart after changes:
- Mac / Linux: `./docker-start.sh restart`
- Windows: `docker-start.cmd restart`
If you had TalkBot running before, here's what changed and how to upgrade.
- All secrets and settings are now in `.env`: `config/auth.json` and the TTS settings from `config/config.json` are gone. Everything is in one file.
- Docker startup changed: instead of `docker compose up -d`, use the start scripts (`docker-start.sh` or `docker-start.cmd`). They read your `.env` and launch the right containers automatically.
- TTS providers are enabled in `.env`: set `TTS_AMAZON_ENABLED=true` (or whichever provider you use) instead of editing `config.json`.
- New local TTS option: Kokoro runs a TTS model on your machine, no cloud API keys needed.
1. Pull the latest code:

   ```bash
   git pull
   git checkout v2
   ```

   Or re-download the ZIP from GitHub (make sure the branch says `v2`).

2. Create your `.env` file: copy `.env.example` to `.env`, then move your credentials across:

   | Old location (`auth.json` / `config.json`) | New location (`.env`) |
   |---|---|
   | `bot_token` | `DISCORD_TOKEN` |
   | `client_id` | `CLIENT_ID` |
   | `dev_ids` (array) | `DEV_IDS` (comma-separated) |
   | `command_char` | `COMMAND_CHAR` |
   | Amazon `accessKeyId` | `AWS_ACCESS_KEY_ID` |
   | Amazon `secretAccessKey` | `AWS_SECRET_ACCESS_KEY` |
   | Azure `subscriptionKey` | `AZURE_SUBSCRIPTION_KEY` |
   | Watson `apikey` | `WATSON_API_KEY` |
   | Watson `serviceUrl` | `WATSON_SERVICE_URL` |
   | `tts.amazon.enabled: true` | `TTS_AMAZON_ENABLED=true` |

3. Start the bot:

   ```bash
   # Mac / Linux:
   ./docker-start.sh

   # Windows:
   docker-start.cmd
   ```

4. Delete `auth.json`. It's no longer used.

Your .server files (per-guild state in config/) are unchanged and will continue to work. Voice settings, permissions, and sound effects are all preserved.
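Moving the credentials across is mostly a rename, with one format change: the developer ID array becomes a comma-separated string. The sketch below shows that conversion for the bot-level fields from the mapping table. The `OldAuth` interface is an assumed, flattened view of the v1 file (the real nesting of `auth.json` is not shown in this document), and `authToEnvLines` is a hypothetical helper, not a migration tool shipped with TalkBot.

```typescript
// Assumed flat view of the v1 credentials; field names follow the mapping table.
interface OldAuth {
  bot_token?: string;
  client_id?: string;
  dev_ids?: string[];
  command_char?: string;
}

// Produce the .env lines for whichever old fields are present.
function authToEnvLines(auth: OldAuth): string[] {
  const lines: string[] = [];
  if (auth.bot_token) lines.push(`DISCORD_TOKEN=${auth.bot_token}`);
  if (auth.client_id) lines.push(`CLIENT_ID=${auth.client_id}`);
  // v1 stored developer IDs as an array; v2 expects one comma-separated value.
  if (auth.dev_ids && auth.dev_ids.length > 0) {
    lines.push(`DEV_IDS=${auth.dev_ids.join(",")}`);
  }
  if (auth.command_char) lines.push(`COMMAND_CHAR=${auth.command_char}`);
  return lines;
}
```

The provider credentials (Amazon, Azure, Watson) follow the same pattern and are left out only to keep the example short.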
If you'd rather run the bot directly on your computer without Docker, follow these steps.
1. Install Node.js 22 or newer from nodejs.org (download the LTS version).

2. Open a terminal in the talkbot folder.

3. Install dependencies:

   ```bash
   npm install
   ```

4. Create your settings file:
   - Copy `.env.example` and rename the copy to `.env`
   - Open `.env` in a text editor and add your Discord bot token and TTS credentials

5. Build and start:

   ```bash
   npm run build
   npm start
   ```

The bot will start and connect to Discord. To stop it, press Ctrl+C.
1. Go to the Discord Developer Portal and create a new application.
2. Under Bot, create a bot and copy the token, then paste it into `DISCORD_TOKEN` in `.env`.
3. Under Bot > Privileged Gateway Intents, enable:
   - Message Content Intent (required: the bot reads message text for TTS)
   - Server Members Intent (optional, for member lookups)
4. Under OAuth2 > URL Generator, select:
   - Scopes: `bot`
   - Permissions: Send Messages, Connect, Speak, Manage Messages
5. Use the generated URL to invite the bot to your server.
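The URL Generator combines the chosen scope and permissions into a single invite link. A minimal sketch of the same calculation follows, assuming Discord's documented permission bit values and the `discord.com/oauth2/authorize` endpoint. `buildInviteUrl` is a hypothetical helper and the client ID in the usage note is a placeholder; the portal remains the authoritative way to generate the link.

```typescript
// Permission bit flags as listed in Discord's permission documentation.
const PERMISSIONS = {
  SEND_MESSAGES: 1 << 11,
  MANAGE_MESSAGES: 1 << 13,
  CONNECT: 1 << 20,
  SPEAK: 1 << 21,
} as const;

// Build an invite link with the `bot` scope and the four permissions TalkBot asks for.
function buildInviteUrl(clientId: string): string {
  const permissions =
    PERMISSIONS.SEND_MESSAGES |
    PERMISSIONS.MANAGE_MESSAGES |
    PERMISSIONS.CONNECT |
    PERMISSIONS.SPEAK;
  return (
    "https://discord.com/oauth2/authorize" +
    `?client_id=${encodeURIComponent(clientId)}` +
    "&scope=bot" +
    `&permissions=${permissions}`
  );
}
```

Calling `buildInviteUrl("123456789012345678")` yields a link whose `permissions` value is `3155968`, the bitwise OR of the four flags.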
You need at least one provider enabled. Set the API credentials and TTS_<PROVIDER>_ENABLED=true in .env.
Amazon Polly:

1. Create an IAM user with `polly:SynthesizeSpeech` and `polly:DescribeVoices` permissions.
2. Add to `.env`:

   ```
   AWS_ACCESS_KEY_ID=your_key
   AWS_SECRET_ACCESS_KEY=your_secret
   AWS_REGION=us-east-1
   TTS_AMAZON_ENABLED=true
   ```

Google Cloud TTS:

1. Create a service account and download the JSON key file.
2. Place the key file in `config/google-auth.json`.
3. Add to `.env`:

   ```
   GOOGLE_APPLICATION_CREDENTIALS=./config/google-auth.json
   TTS_GOOGLE_ENABLED=true
   ```

Azure Cognitive Services:

1. Create a Speech resource in the Azure portal.
2. Add to `.env`:

   ```
   AZURE_SUBSCRIPTION_KEY=your_key
   AZURE_ENDPOINT=https://eastus.tts.speech.microsoft.com/
   TTS_AZURE_ENABLED=true
   ```

IBM Watson:

1. Create a Text to Speech service instance.
2. Add to `.env`:

   ```
   WATSON_API_KEY=your_key
   WATSON_SERVICE_URL=https://api.au-syd.text-to-speech.watson.cloud.ibm.com/instances/your-guid
   TTS_WATSON_ENABLED=true
   ```

Tencent Cloud:

1. Create API credentials in the Tencent Cloud console.
2. Add to `.env`:

   ```
   TENCENT_ACCESS_KEY_ID=your_key
   TENCENT_SECRET_ACCESS_KEY=your_secret
   TTS_TENCENT_ENABLED=true
   ```

The Alibaba provider is non-functional due to missing audio processing dependencies. It is hardcoded to enabled: false.
Runs a local TTS model via Kokoro-FastAPI. No API keys required — the model runs on your machine. The first startup downloads the model image (~4GB) and may take 5-20 minutes depending on your internet speed. Only CPU mode has been tested so far — GPU mode should work with NVIDIA hardware but is experimental.
1. Add to `.env`:

   ```
   TTS_KOKORO_ENABLED=true
   KOKORO_DEVICE=cpu
   ```

   Set `KOKORO_DEVICE=gpu` if you have an NVIDIA GPU with nvidia-container-toolkit installed.

2. Start everything:

   ```bash
   ./docker-start.sh     # Mac / Linux
   docker-start.cmd      # Windows
   ```

   The start script reads your `.env` and automatically launches the Kokoro container alongside the bot. On first run it will prompt to confirm the ~4GB model download.

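Once the Kokoro container is running, the bot talks to it over HTTP. The sketch below builds the kind of request a client might send, assuming Kokoro-FastAPI's OpenAI-compatible `/v1/audio/speech` route and its `model`, `input`, `voice` and `response_format` fields. How TalkBot itself calls the server is not described here, so treat `buildKokoroRequest` and the request shape as assumptions; only the two defaults (`http://kokoro-tts:8880` and `af_heart`) come from this README.

```typescript
interface SpeechRequest {
  url: string;
  body: { model: string; input: string; voice: string; response_format: string };
}

// Build (but do not send) a speech request from the Kokoro-related settings.
function buildKokoroRequest(
  text: string,
  env: Record<string, string | undefined>
): SpeechRequest {
  // Fall back to the documented defaults; drop trailing slashes from the base URL.
  const baseUrl = (env["KOKORO_BASE_URL"] || "http://kokoro-tts:8880").replace(/\/+$/, "");
  return {
    url: `${baseUrl}/v1/audio/speech`, // assumed OpenAI-compatible route
    body: {
      model: "kokoro", // assumed model name
      input: text,
      voice: env["KOKORO_DEFAULT_VOICE"] || "af_heart",
      response_format: "mp3",
    },
  };
}
```

The returned `body` would be sent as JSON in a POST request to `url`. Sending is left out so the example stays runnable without a Kokoro server.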
TalkBot uses prefix commands (default !). Type !help in any text channel the bot can see.
1. Join a voice channel
2. Type `!follow`: the bot joins your channel
3. Type any message: the bot reads it aloud
4. Type `!unfollow` when done

| Command | Description |
|---|---|
| `!follow` | Bot joins your voice channel |
| `!unfollow` | Bot leaves voice channel |
| `!tts <message>` | Force the bot to speak a message |
| `!stop` | Stop current playback |
| `!mute [@user]` | Mute yourself or someone else |
| `!unmute [@user]` | Unmute yourself or someone else |
| `!sidle` | Take over as bot master |
| `!transfer @user` | Transfer master to someone else |

| Command | Description |
|---|---|
| `!myvoice <voice>` | Set your TTS voice (provider/voice or just voice) |
| `!mypitch <-20..20>` | Set voice pitch |
| `!myspeed <0.25..4>` | Set voice speed |
| `!tolang <lang>` | Translate your text (e.g. en, fr, de) |
| `!defaults` | Reset all your voice settings |
| `!mytitle <title>` | Set your display title |
| `!puberty on\|off` | Toggle random pitch/speed per message |
| `!announceme on\|off` | Toggle join/leave voice announcements |
| `!myprefix set <text>` | Prepend text before your messages |
| `!mysuffix set <text>` | Append text after your messages |

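The pitch and speed commands take a number inside a fixed range (`-20..20` and `0.25..4`). A small sketch of that range check is below. Whether TalkBot rejects out-of-range values or clamps them is not stated here, so this version assumes rejection; `parseInRange`, `parsePitch` and `parseSpeed` are hypothetical names.

```typescript
type RangeResult = { ok: true; value: number } | { ok: false; error: string };

// Parse a numeric argument and check it against an inclusive range.
function parseInRange(raw: string, min: number, max: number, name: string): RangeResult {
  const trimmed = raw.trim();
  const value = Number(trimmed);
  if (trimmed === "" || !Number.isFinite(value)) {
    return { ok: false, error: `${name} must be a number` };
  }
  if (value < min || value > max) {
    return { ok: false, error: `${name} must be between ${min} and ${max}` };
  }
  return { ok: true, value };
}

// Ranges taken from the command table above.
const parsePitch = (raw: string): RangeResult => parseInRange(raw, -20, 20, "pitch");
const parseSpeed = (raw: string): RangeResult => parseInRange(raw, 0.25, 4, "speed");
```

For example, `parsePitch("5")` succeeds with the value `5`, while `parsePitch("25")` and `parseSpeed("fast")` both return an error result.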
| Command | Description |
|---|---|
| `!help` | Show all commands |
| `!help <group>` | Show commands for a group (control, personalization, info, server) |
| `!ping` | Check if the bot is alive |
| `!who` | Show master and permitted users |
| `!voices` | Browse voices: pick provider, language, see voice list |
| `!voices set` | Set your voice via interactive dropdowns |
| `!voices samples` | Open the online voice sample database |
| `!details [@user]` | Show voice settings |
| `!stats` | Show server TTS usage statistics |
| `!invite` | Get the bot invite link |

| Command | Description |
|---|---|
| `!permit @user` | Allow a user to speak through the bot |
| `!unpermit @user` | Remove speak permission |
| `!adminrole @role` | Set which role can manage the bot |
| `!commandchar <char>` | Change the command prefix (default `!`) |
| `!restrict [#channel]` | Restrict bot to specific text channels |
| `!keep <count>` | Auto-delete old messages (keep last N) |
| `!sfx set <word> <url>` | Add a sound effect triggered by a word/emoji |
| `!sfx list` | List all sound effects |
| `!sfx del <word>` | Remove a sound effect |
| `!textrule add <find> -> <replace>` | Add a text replacement rule |
| `!textrule list` | List all text rules |
| `!bind add <id>` | Auto-follow when users join a channel/role |
| `!twitch permit <channel> <who>` | Connect Twitch chat to voice |

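The `!textrule add <find> -> <replace>` syntax splits its argument on the arrow. As an illustration of how such a rule could be parsed and applied before text reaches the TTS provider, here is a minimal sketch. It assumes plain substring replacement; whether TalkBot matches whole words, ignores case, or accepts patterns is not documented here, and `parseTextRule` and `applyTextRules` are hypothetical names.

```typescript
interface TextRule {
  find: string;
  replace: string;
}

// Split "<find> -> <replace>" on the first arrow; return null if it is malformed.
function parseTextRule(args: string): TextRule | null {
  const idx = args.indexOf("->");
  if (idx < 0) return null;
  const find = args.slice(0, idx).trim();
  const replace = args.slice(idx + 2).trim();
  if (find === "") return null; // nothing to search for
  return { find, replace };
}

// Apply each rule in order, replacing every occurrence of its `find` text.
function applyTextRules(text: string, rules: TextRule[]): string {
  // split/join performs a literal (non-regex) replace of all occurrences.
  return rules.reduce((acc, rule) => acc.split(rule.find).join(rule.replace), text);
}
```

With the rule parsed from `brb -> be right back`, the message `brb, food` would be spoken as `be right back, food`.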
All configuration is done through .env. Copy .env.example to get started.
| Variable | Description |
|---|---|
| `DISCORD_TOKEN` | Bot token from Discord Developer Portal |

| Variable | Default | Description |
|---|---|---|
| `CLIENT_ID` | | Application ID (needed for slash command registration) |
| `DEV_IDS` | | Comma-separated Discord user IDs for bot developers |
| `SUPPORT_SERVER_ID` | | Your support Discord server ID |

| Variable | Default | Description |
|---|---|---|
| `COMMAND_CHAR` | `!` | Prefix character for commands |
| `DEFAULT_TITLE` | `master` | Default title for the bot master |
| `NEGLECT_TIMEOUT` | `3600000` | Ms before bot leaves inactive voice (default 1 hour) |
| `NEGLECT_TIMEOUT_MESSAGES` | `Talkbot inactivity timeout` | Comma-separated messages spoken before timeout |
| `PESTER_THRESHOLD` | `10000000` | Character count before self-hosting nag |
| `TWITCH_AUDIO_QUEUE_LIMIT` | `10` | Max queued Twitch messages |
| `ADVERTISE_STREAMER` | | Streamer name to advertise |
| `TAGLINE` | | Bot tagline |

| Variable | Default | Description |
|---|---|---|
| `LOG_OUTPUT` | `true` | Enable info logging |
| `LOG_ERRORS` | `true` | Enable error logging |
| `NODE_ENV` | `development` | Set to `production` for deployed instances |
| `LOG_LEVEL` | `info` | Logging verbosity |

| Variable | Provider | Description |
|---|---|---|
| `GOOGLE_APPLICATION_CREDENTIALS` | Google | Path to service account JSON |
| `AWS_ACCESS_KEY_ID` | Amazon Polly | IAM access key |
| `AWS_SECRET_ACCESS_KEY` | Amazon Polly | IAM secret key |
| `AWS_REGION` | Amazon Polly | AWS region (default `us-east-1`) |
| `AZURE_SUBSCRIPTION_KEY` | Azure | Speech resource subscription key |
| `AZURE_ENDPOINT` | Azure | Speech endpoint URL |
| `WATSON_API_KEY` | Watson | IAM API key |
| `WATSON_SERVICE_URL` | Watson | Service instance URL |
| `TENCENT_ACCESS_KEY_ID` | Tencent | API access key |
| `TENCENT_SECRET_ACCESS_KEY` | Tencent | API secret key |
| `ALIBABA_APP_KEY` | Alibaba | Application key |
| `ALIBABA_TOKEN` | Alibaba | Access token |
| `ALIBABA_ENDPOINT` | Alibaba | TTS endpoint URL |
| `KOKORO_BASE_URL` | Kokoro | Server URL (default `http://kokoro-tts:8880`) |
| `KOKORO_DEFAULT_VOICE` | Kokoro | Default voice (default `af_heart`) |
| `KOKORO_DEVICE` | Kokoro | `cpu` or `gpu` (picks Docker image) |

Each provider has ENABLED, ENFORCE_LIMIT, and LIMIT settings. Format: TTS_<PROVIDER>_<SETTING>.
| Variable | Default | Description |
|---|---|---|
| `TTS_GOOGLE_ENABLED` | `false` | Enable Google Cloud TTS |
| `TTS_GOOGLE_ENFORCE_LIMIT` | `false` | Block requests after limit hit |
| `TTS_GOOGLE_LIMIT` | `5000000` | Character limit |
| `TTS_AMAZON_ENABLED` | `true` | Enable Amazon Polly |
| `TTS_AMAZON_ENFORCE_LIMIT` | `false` | Block requests after limit hit |
| `TTS_AMAZON_LIMIT` | `5000000` | Character limit |
| `TTS_AZURE_ENABLED` | `false` | Enable Azure Cognitive Services |
| `TTS_AZURE_ENFORCE_LIMIT` | `false` | Block requests after limit hit |
| `TTS_AZURE_LIMIT` | `5000000` | Character limit |
| `TTS_WATSON_ENABLED` | `false` | Enable IBM Watson |
| `TTS_WATSON_ENFORCE_LIMIT` | `false` | Block requests after limit hit |
| `TTS_WATSON_LIMIT` | `5000000` | Character limit |
| `TTS_TENCENT_ENABLED` | `false` | Enable Tencent Cloud |
| `TTS_TENCENT_ENFORCE_LIMIT` | `true` | Block requests after limit hit |
| `TTS_TENCENT_LIMIT` | `1000000` | Character limit |
| `TTS_ALIBABA_ENABLED` | `false` | Enable Alibaba Cloud |
| `TTS_ALIBABA_ENFORCE_LIMIT` | `true` | Block requests after limit hit |
| `TTS_ALIBABA_LIMIT` | `1000000` | Character limit |
| `TTS_KOKORO_ENABLED` | `false` | Enable Kokoro local TTS |
| `TTS_KOKORO_ENFORCE_LIMIT` | `false` | Block requests after limit hit |
| `TTS_KOKORO_LIMIT` | `5000000` | Character limit |

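The three per-provider settings follow one naming pattern, so they can be read generically. The sketch below shows one way to do that and how the limit might gate a request. It is written in plain TypeScript rather than the Zod schema the bot uses, and the gating rule in `mayCall` (block only when enforcement is on and the request would exceed the limit) is an assumption about what "Block requests after limit hit" means. `readProviderSettings` and `mayCall` are hypothetical names.

```typescript
interface ProviderSettings {
  enabled: boolean;
  enforceLimit: boolean;
  limit: number;
}

// Read TTS_<PROVIDER>_ENABLED / _ENFORCE_LIMIT / _LIMIT, falling back to the given defaults.
function readProviderSettings(
  provider: string,
  env: Record<string, string | undefined>,
  defaults: ProviderSettings
): ProviderSettings {
  const prefix = `TTS_${provider.toUpperCase()}_`;
  const bool = (raw: string | undefined, fallback: boolean): boolean =>
    raw === undefined || raw === "" ? fallback : raw.toLowerCase() === "true";
  const limitRaw = env[`${prefix}LIMIT`];
  const limit = limitRaw === undefined || limitRaw === "" ? defaults.limit : Number(limitRaw);
  return {
    enabled: bool(env[`${prefix}ENABLED`], defaults.enabled),
    enforceLimit: bool(env[`${prefix}ENFORCE_LIMIT`], defaults.enforceLimit),
    limit: Number.isFinite(limit) ? limit : defaults.limit, // ignore non-numeric limits
  };
}

// Assumed semantics: block only when enforcement is on and the limit would be exceeded.
function mayCall(settings: ProviderSettings, charsUsed: number, charsRequested: number): boolean {
  if (!settings.enabled) return false;
  if (!settings.enforceLimit) return true;
  return charsUsed + charsRequested <= settings.limit;
}
```

With the Tencent defaults from the table (`false`, `true`, `1000000`) and only `TTS_TENCENT_ENABLED=true` set, a 20-character request is allowed at 0 characters used and refused at 999,990.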
Coolify is a self-hosted PaaS that can deploy Docker apps.
1. Create a new resource in Coolify and connect your GitHub repo (or use the public URL).

2. Set build pack to Docker Compose.

3. Add environment variables in Coolify's UI: add all the variables from your `.env` file. At minimum:
   - `DISCORD_TOKEN`
   - Your TTS provider credentials (e.g. `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`)
   - `TTS_<PROVIDER>_ENABLED=true` for the provider(s) you want

4. Persistent storage: create a volume mount for the config directory so `.server` files (per-guild state) persist across deployments:

   ```
   /data/talkbot/config → /app/config
   ```

5. Deploy: Coolify will build the Docker image and start the container.

6. Check logs in Coolify's UI to verify the bot connected and loaded TTS providers.

Tips:
- Coolify passes env vars to the container automatically, so you don't need a `.env` file inside the container.
- For Google TTS, upload `google-auth.json` to your persistent volume and set `GOOGLE_APPLICATION_CREDENTIALS=/app/config/google-auth.json`.
- The bot stores per-server state in `config/*.server` files. Make sure the volume is persistent so these survive redeployments.

- Node.js 22+
- npm
```bash
git clone https://github.com/nullabork/talkbot.git
cd talkbot
npm install
cp .env.example .env
# Edit .env with your credentials
```

| Command | Description |
|---|---|
| `npm run build` | Compile TypeScript to `dist/` |
| `npm start` | Run the compiled bot |
| `npm run dev` | Watch mode with auto-restart |
| `npm test` | Run tests (Vitest) |
| `npm run lint` | Type-check (`tsc --noEmit`) |

1. `npm run build` compiles TypeScript from `src/` to `dist/` using `tsc`, then rewrites path aliases with `tsc-alias`, then copies `src/lang.json` to `dist/`.
2. `npm start` runs `node dist/src/index.js`, which loads `.env`, validates env vars with Zod, initializes TTS providers, connects to Discord, and listens for messages.
3. When a user types in a text channel, the bot checks if it's a command (`!follow`, etc.) or regular text. Regular text is sent to the TTS provider and played in the voice channel.

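The command-or-text decision described above can be sketched as a small pure function. This is an illustration of the idea only: `classifyMessage` and the `Parsed` type are hypothetical and do not mirror the actual dispatch code in `src/commands/index.ts`, and the choice to treat a bare prefix as ordinary speech is an assumption.

```typescript
type Parsed =
  | { kind: "command"; name: string; args: string }
  | { kind: "speech"; text: string };

// Decide whether a message is a prefix command or text to be read aloud.
function classifyMessage(content: string, commandChar: string = "!"): Parsed {
  const trimmed = content.trim();
  if (trimmed.indexOf(commandChar) === 0) {
    const body = trimmed.slice(commandChar.length);
    const space = body.search(/\s/); // first whitespace separates name from arguments
    const name = (space < 0 ? body : body.slice(0, space)).toLowerCase();
    const args = space < 0 ? "" : body.slice(space + 1).trim();
    // A bare prefix ("!" or "! hello") has no command name, so treat it as speech.
    if (name !== "") return { kind: "command", name, args };
  }
  return { kind: "speech", text: trimmed };
}
```

For example, `classifyMessage("!tts hello there")` gives a command named `tts` with arguments `hello there`, while `classifyMessage("hello there")` is classified as speech. Passing a different second argument models a server that changed its prefix with `!commandchar`.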
src/
index.ts — Bot entry point (event handlers, startup)
env.ts — Zod-validated environment variables
config-loader.ts — Loads config/config.json
paths.ts — Resolved filesystem paths
commands/
index.ts — Command registry and dispatch
modules/ — Individual command handlers (37 commands)
helpers/
bot-stuff.ts — Discord client setup
common.ts — Logging and string utilities
ffmpeg.ts — MP3 → Opus conversion (WASM)
ssml-dictionary.ts — Discord markdown → SSML mappings
models/
Server.ts — Per-guild state, voice connection, TTS pipeline
World.ts — All servers, presence, lifecycle
Command.ts — Base command class
BotCommand.ts — Legacy command data class
MessageDetails.ts — Command context wrapper
MessageSSML.ts — SSML message builder
services/
TextToSpeechService.ts — Abstract TTS base class + provider registry
tts/ — Provider implementations (Google, Amazon, Azure, Watson, Tencent, Alibaba)
types/ — TypeScript interfaces
config/
config.json — Server overrides (per-guild settings)
config.example.json — Template
default.textrules.json — Default text replacement rules
lang.json — i18n string overrides
*.server — Per-guild state files (auto-generated)
- Make sure `DISCORD_TOKEN` is set in `.env`.
- Check that Message Content Intent is enabled in the Discord Developer Portal.
- Verify the bot has Send Messages permission in the channel.

- Check that at least one TTS provider is enabled (`TTS_<PROVIDER>_ENABLED=true`) and has valid credentials in `.env`.
- Check logs for TTS API errors.
- Make sure the bot has Connect and Speak permissions in the voice channel.

- Ensure `patches/` directory exists (needed by `patch-package`).
- Check that `config/default.textrules.json` and `config/lang.json` exist.

- `GOOGLE_APPLICATION_CREDENTIALS` must point to a valid service account JSON file.
- In Docker, the config volume mount (`./config:/app/config`) must contain `google-auth.json`.

- First message after startup is slower (WASM ffmpeg cold load for Amazon Polly).
- Subsequent messages are faster.
- Google Cloud TTS is generally the fastest provider.
- See Migrating to v2 for the full walkthrough.
- WootoSmash
- FaxWang
- GreenLionVoltronPilot
- Kingk22
- Kelinmiriel
See LICENSE for details.