diff --git a/.cursor/skills/scaffold-elevenlabs-example/SKILL.md b/.cursor/skills/scaffold-elevenlabs-example/SKILL.md
new file mode 100644
index 00000000..35f867d0
--- /dev/null
+++ b/.cursor/skills/scaffold-elevenlabs-example/SKILL.md
@@ -0,0 +1,84 @@
+---
+name: scaffold-elevenlabs-example
+description: Scaffold prompt-driven examples in this repository using the existing example patterns and repo skills in `.agents/skills`. Use when adding a new example directory, creating matching `README.md`, `PROMPT.md`, and `setup.sh` files, or preparing a new example for `pnpm run generate`.
+---
+
+# Scaffold ElevenLabs Example
+
+Use this skill when a user wants a new example scaffold in this repo.
+
+## Defaults
+
+- Ignore the deprecated root `examples/` folder.
+- Put new examples under `//`.
+- The parent example directory owns authoring files; generated code lives in `example/`.
+
+## Inputs to confirm
+
+- destination path
+- product and runtime
+- whether the example needs bundled `assets/`
+- whether the user wants scaffold only or scaffold plus a generated `example/`
+
+Ask concise follow-ups only when these are missing.
+
+## Workflow
+
+1. Read [reference.md](reference.md).
+2. Inspect available repo skills in `.agents/skills/` and choose the best fit for the requested example.
+3. Prefer the helper scaffold:
+
+```bash
+python3 .cursor/skills/scaffold-elevenlabs-example/scripts/scaffold_example.py \
+  --path text-to-speech/nextjs/my-example
+```
+
+Add `--with-assets` when the example should ship sample files, or `--reference ` to copy from a specific existing example.
+
+4. Edit the scaffolded `README.md`, `PROMPT.md`, and `setup.sh` until they match the requested example.
+5. Treat the helper output as a copy of the closest reference. Adapt all three files for the new example.
+6. Keep `PROMPT.md` terse:
+
+- first line invokes the most relevant repo skill found in `.agents/skills/`; for current examples this is often `/text-to-speech`, `/speech-to-text`, or `/agents`, but do not assume that list is exhaustive
+- sections are file-by-file using `## \`path/to/file\``
+- bullets call out concrete SDKs, env handling, models, voice IDs, UI states, and error handling
+- do not restate repo preamble like `example/`-only rules or `DESIGN.md`; the generator adds that
+
+7. Keep `setup.sh` aligned with current patterns:
+
+- use `set -euo pipefail`
+- derive `DIR` and `REPO_ROOT`
+- clean `example/` but preserve cache dirs (`node_modules`, `.venv`, `.next`) when relevant
+- seed from `templates//`
+- copy `README.md` into `example/README.md`
+- copy `assets/` and local `.env` only when present
+- install dependencies at the end
+- for `nextjs`, fetch latest ElevenLabs package versions at setup time and patch `package.json`
+
+8. Keep `README.md` aligned with the closest current reference:
+
+- always include a heading, one-sentence summary, `## Setup`, and `## Run`
+- add `## Usage` for interactive examples such as Next.js and agents demos
+- commands should work from inside `example/`
+
+9. Recommended when shipping the example: add it to the root `README.md`.
+10. Verify:
+
+- `bash /setup.sh`
+- inspect the generated `example/`
+- run `pnpm run generate ` only when the user wants full prompt validation or generated output
+
+## Constraints
+
+- Stay inside the current template matrix unless the user explicitly asks for a new base template.
+- Reuse the closest existing example instead of inventing a new file format.
+- Do not add application code directly under the parent example directory.
+- Do not use the deprecated root `examples/` folder for new work.
+
+## Output checklist
+
+- [ ] new example directory exists at the requested path
+- [ ] `README.md`, `PROMPT.md`, and `setup.sh` exist
+- [ ] `setup.sh` uses the correct shared template
+- [ ] `PROMPT.md` matches the terse style of the current examples
+- [ ] the scaffold is ready for `pnpm run generate `
diff --git a/.cursor/skills/scaffold-elevenlabs-example/reference.md b/.cursor/skills/scaffold-elevenlabs-example/reference.md
new file mode 100644
index 00000000..ccb0e92d
--- /dev/null
+++ b/.cursor/skills/scaffold-elevenlabs-example/reference.md
@@ -0,0 +1,105 @@
+# Example Generator Reference
+
+## Generator behavior
+
+- `pnpm run generate` runs `scripts/generate-examples.sh`.
+- The script finds every `PROMPT.md` outside ignored folders.
+- If a target folder has `setup.sh` and no `example/`, it runs `setup.sh` first.
+- The generator prepends repo context before sending the prompt:
+  - check existing `example/` first
+  - `setup.sh` already ran
+  - implement in `example/` only
+  - read root `DESIGN.md` for UI work
+- The generator auto-loads repo SDK skills from `.agents/skills` when the prompt contains backticked skill names such as `/text-to-speech`. Keep the backticks and leading slash.
+
+## Available repo skills
+
+Inspect `.agents/skills/*/SKILL.md` before finalizing the first line of `PROMPT.md`.
+
+Current repo skills:
+
+- `/agents`
+- `/music`
+- `/setup-api-key`
+- `/sound-effects`
+- `/speech-to-text`
+- `/text-to-speech`
+
+## Directory shape
+
+```text
+///
+|-- README.md
+|-- PROMPT.md
+|-- setup.sh
+|-- assets/  # optional
+|-- .env  # optional, local only
+`-- example/  # generated output
+```
+
+Ignore the deprecated root `examples/` folder for new work.
+
+## Current example matrix
+
+| Path | Shared template | Prompt sections | Setup extras |
+| -------------------------------------- | ---------------------- | ------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- |
+| `text-to-speech/typescript/quickstart` | `templates/typescript` | `index.ts` | Copies `.env`, preserves `node_modules`, installs with `pnpm` |
+| `text-to-speech/python/quickstart` | `templates/python` | `main.py` | Copies `.env`, preserves `.venv`, installs with `pip` |
+| `speech-to-text/typescript/quickstart` | `templates/typescript` | `index.ts` | Optional `assets/`, copies `.env`, preserves `node_modules` |
+| `speech-to-text/python/quickstart` | `templates/python` | `main.py` | Optional `assets/`, copies `.env`, preserves `.venv` |
+| `music/typescript/quickstart` | `templates/typescript` | `index.ts` | Copies `.env`, preserves `node_modules`, installs with `pnpm` |
+| `music/nextjs/quickstart` | `templates/nextjs` | `app/api/generate-music/route.ts`, `app/page.tsx` | Adds `@elevenlabs/elevenlabs-js`, copies `.env.local`, preserves `node_modules` and `.next` |
+| `speech-to-text/nextjs/realtime` | `templates/nextjs` | `app/api/scribe-token/route.ts`, `app/page.tsx` | Adds `@elevenlabs/react` and `@elevenlabs/elevenlabs-js`, copies `.env.local`, preserves `node_modules` and `.next` |
+| `agents/nextjs/quickstart` | `templates/nextjs` | `app/api/agent/route.ts`, `app/api/conversation-token/route.ts`, `app/page.tsx` | Same Next.js setup pattern, removes `@elevenlabs/client` if present |
+| `agents/nextjs/guardrails` | `templates/nextjs` | `app/api/agent/route.ts`, `app/api/conversation-token/route.ts`, `app/page.tsx` | Same as quickstart, but prompt targets guardrails and `onGuardrailTriggered` |
+
+## Runtime setup rules
+
+| Runtime | Seed template | Preserve on clean | Env copied into `example/` | Install step |
+| ------------ | ----------------------- | ----------------------- | -------------------------- | ---------------------------------------------------------------------------- |
+| `typescript` | `templates/typescript/` | `node_modules` | `.env` | `pnpm install --config.confirmModulesPurge=false` |
+| `python` | `templates/python/` | `.venv` | `.env` | create `.venv`, upgrade `pip`, `pip install -r requirements.txt` |
+| `nextjs` | `templates/nextjs/` | `node_modules`, `.next` | `.env.local` | patch `package.json`, then `pnpm install --config.confirmModulesPurge=false` |
+
+## Prompt rules
+
+- Start with `Before writing any code, invoke the \`/skill-name\` skill...`.
+- Choose that skill by checking `.agents/skills/` first; do not assume only `/text-to-speech`, `/speech-to-text`, and `/agents` exist.
+- Use `## \`path/to/file\``headings only for files inside`example/`.
+- Keep prompts short and implementation-focused. Current prompts are direct checklists, not essays.
+- Mention the concrete SDK client, env loading, output format, model ids, voice ids, API route security, and UI behavior when those details are known.
+- Do not repeat repo-wide context that the generator already injects.
+
+## README rules
+
+- Always include a title, one-sentence summary, `## Setup`, and `## Run`.
+- Add `## Usage` for interactive or multi-step examples such as Next.js and agents demos.
+- Keep commands valid from inside `example/`.
+- Use the closest current example as the formatting reference.
+
+## Best reference by request
+
+- Simple CLI text-to-speech script: start from `text-to-speech/typescript/quickstart` or `text-to-speech/python/quickstart`.
+- Simple CLI music script: start from `music/typescript/quickstart`.
+- Simple Next.js music prompt form: start from `sound-effects/nextjs/quickstart`.
+- CLI transcription or file-based Scribe example: start from the speech-to-text quickstarts.
+- Realtime microphone UI: start from `speech-to-text/nextjs/realtime`.
+- Voice agent creation and conversation UI: start from `agents/nextjs/quickstart`.
+- For specialized agent behavior, start from `agents/nextjs/quickstart` and consult `agents/nextjs/guardrails` only as an existing reference, not as a scaffold mode.
+
+## Scaffold helper
+
+The helper script creates a new example directory by copying `PROMPT.md`, `README.md`, and `setup.sh` from the closest existing example. It auto-detects the best reference by scanning the repo, or accepts an explicit `--reference`.
+
+```bash
+python3 .cursor/skills/scaffold-elevenlabs-example/scripts/scaffold_example.py \
+  --path agents/nextjs/my-agent-demo
+```
+
+Useful flags:
+
+- `--reference sound-effects/nextjs/quickstart` to copy from a specific example
+- `--with-assets` to create an `assets/` directory
+- `--force` to overwrite scaffold files in an existing directory
+
+The helper copies files verbatim from the reference. Edit `PROMPT.md`, `README.md`, and `setup.sh` after scaffolding to match the new example.
diff --git a/.cursor/skills/scaffold-elevenlabs-example/scripts/scaffold_example.py b/.cursor/skills/scaffold-elevenlabs-example/scripts/scaffold_example.py
new file mode 100644
index 00000000..b6f5d12d
--- /dev/null
+++ b/.cursor/skills/scaffold-elevenlabs-example/scripts/scaffold_example.py
@@ -0,0 +1,146 @@
+#!/usr/bin/env python3
+"""Scaffold a new example directory by copying files from the closest existing example."""
+from __future__ import annotations
+
+import argparse
+import re
+from pathlib import Path
+
+REPO_ROOT = Path(__file__).resolve().parents[4]
+SCAFFOLD_FILES = ("PROMPT.md", "README.md", "setup.sh")
+IGNORED_DIRS = {"examples", "node_modules", ".venv", ".next", "example", "templates"}
+
+
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(
+        description="Scaffold a new example by copying from the closest existing one."
+    )
+    parser.add_argument(
+        "--path",
+        required=True,
+        help="Relative path like product/runtime/my-example",
+    )
+    parser.add_argument(
+        "--reference",
+        help="Explicit reference example path (e.g. music/nextjs/quickstart). "
+        "Auto-detected from the repo when omitted.",
+    )
+    parser.add_argument(
+        "--with-assets",
+        action="store_true",
+        help="Create an assets/ directory in the new example.",
+    )
+    parser.add_argument(
+        "--force",
+        action="store_true",
+        help="Overwrite scaffold files in an existing directory.",
+    )
+    return parser.parse_args()
+
+
+def parse_example_path(path_text: str) -> tuple[str, str, str]:
+    clean_path = path_text.strip().strip("/")
+    parts = [part for part in clean_path.split("/") if part]
+    if len(parts) != 3:
+        raise SystemExit(
+            "Example paths must look like //, for example "
+            "text-to-speech/nextjs/my-example."
+        )
+
+    product, runtime, slug = parts
+    if any(part in {".", ".."} for part in parts):
+        raise SystemExit("Path segments cannot contain '.' or '..'.")
+
+    if not re.fullmatch(r"[a-z0-9][a-z0-9-]*", slug):
+        raise SystemExit("Slug must use lowercase letters, numbers, and hyphens only.")
+
+    return product, runtime, slug
+
+
+def find_existing_examples() -> list[tuple[str, str, str, Path]]:
+    """Scan repo for directories containing PROMPT.md and return (product, runtime, slug, dir)."""
+    results: list[tuple[str, str, str, Path]] = []
+    for prompt_file in REPO_ROOT.rglob("PROMPT.md"):
+        rel = prompt_file.parent.relative_to(REPO_ROOT)
+        if any(part in IGNORED_DIRS for part in rel.parts):
+            continue
+        parts = rel.parts
+        if len(parts) == 3:
+            results.append((parts[0], parts[1], parts[2], prompt_file.parent))
+    return results
+
+
+def find_reference(
+    product: str, runtime: str, examples: list[tuple[str, str, str, Path]]
+) -> Path | None:
+    """Pick the best existing example to copy from.
+
+    Priority: same product+runtime > same runtime > first available.
+    """
+    same_product_runtime = [d for p, r, _, d in examples if p == product and r == runtime]
+    if same_product_runtime:
+        return same_product_runtime[0]
+
+    same_runtime = [d for _, r, _, d in examples if r == runtime]
+    if same_runtime:
+        return same_runtime[0]
+
+    if examples:
+        return examples[0][3]
+
+    return None
+
+
+def write_file(path: Path, content: str) -> None:
+    path.write_text(content.rstrip() + "\n", encoding="utf-8")
+
+
+def main() -> None:
+    args = parse_args()
+    product, runtime, slug = parse_example_path(args.path)
+
+    target_dir = (REPO_ROOT / args.path.strip().strip("/")).resolve()
+    if REPO_ROOT not in target_dir.parents:
+        raise SystemExit("Target path must stay inside the repository root.")
+
+    if target_dir.exists() and any(target_dir.iterdir()) and not args.force:
+        raise SystemExit(
+            f"Target directory '{args.path}' already exists and is not empty. "
+            "Use --force to overwrite scaffold files."
+        )
+
+    if args.reference:
+        ref_dir = REPO_ROOT / args.reference.strip().strip("/")
+        if not ref_dir.is_dir():
+            raise SystemExit(f"Reference directory '{args.reference}' does not exist.")
+    else:
+        examples = find_existing_examples()
+        ref_dir = find_reference(product, runtime, examples)
+
+    target_dir.mkdir(parents=True, exist_ok=True)
+    if args.with_assets:
+        (target_dir / "assets").mkdir(exist_ok=True)
+
+    copied = []
+    if ref_dir:
+        for filename in SCAFFOLD_FILES:
+            source = ref_dir / filename
+            if source.exists():
+                content = source.read_text(encoding="utf-8")
+                write_file(target_dir / filename, content)
+                copied.append(filename)
+
+    setup_path = target_dir / "setup.sh"
+    if setup_path.exists():
+        setup_path.chmod(0o755)
+
+    print(f"Scaffolded {args.path.strip().strip('/')}")
+    if ref_dir:
+        print(f"Copied {', '.join(copied)} from {ref_dir.relative_to(REPO_ROOT)}")
+    else:
+        print("No existing example found to copy from — created empty directory.")
+    print("Next: edit PROMPT.md, README.md, and setup.sh for the new example.")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/README.md b/README.md
index 25ffdedd..1f272f9b 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 ![ElevenLabs Examples Header](./examples-header.png)
 
-Prompt-driven ElevenLabs examples for text-to-speech and speech-to-text. Each project includes:
+Prompt-driven ElevenLabs examples for text-to-speech, speech-to-text, music, sound effects, and agents. Each project includes:
 
 - `PROMPT.md` — instructions for agent-driven generation
 - `setup.sh` — scaffolds the `example/` directory from a shared template
@@ -14,9 +14,20 @@ Shared base templates live in `templates/` (Next.js, Python, TypeScript). UI sty
 
 - [Text-to-Speech Quickstart (TypeScript)](text-to-speech/typescript/quickstart/example/README.md) — Generate an MP3 from text with the ElevenLabs JS SDK.
 - [Text-to-Speech Quickstart (Python)](text-to-speech/python/quickstart/example/README.md) — Generate an MP3 from text with the ElevenLabs Python SDK.
+- [Text to Speech Playground (Next.js)](text-to-speech/nextjs/quickstart/example/README.md) — Generate speech from text in a Next.js app and play it back in the browser.
 - [Speech-to-Text Quickstart (TypeScript)](speech-to-text/typescript/quickstart/example/README.md) — Transcribe local audio files with Scribe v2.
 - [Speech-to-Text Quickstart (Python)](speech-to-text/python/quickstart/example/README.md) — Transcribe local audio files with Scribe v2.
+- [Music Quickstart (TypeScript)](music/typescript/quickstart/example/README.md) — Generate an MP3 track from a text prompt with the ElevenLabs JS SDK.
+- [Music Quickstart (Python)](music/python/quickstart/example/README.md) — Generate an MP3 track from a text prompt with the ElevenLabs Python SDK.
+- [Music Playground (Next.js)](music/nextjs/quickstart/example/README.md) — Enter a prompt, generate a music track, and play it back in the browser.
+- [Sound Effects Quickstart (TypeScript)](sound-effects/typescript/quickstart/example/README.md) — Generate a sound effect MP3 from a text prompt with the ElevenLabs JS SDK.
+- [Sound Effects Quickstart (Python)](sound-effects/python/quickstart/example/README.md) — Generate a sound effect MP3 from a text prompt with the ElevenLabs Python SDK.
+- [Sound Effects Playground (Next.js)](sound-effects/nextjs/quickstart/example/README.md) — Enter a prompt, generate a sound effect, and play it back in the browser.
 - [Real-Time Speech-to-Text (Next.js)](speech-to-text/nextjs/realtime/example/README.md) — Live microphone transcription with VAD in a Next.js app.
+- [Real-Time Voice Agent (Next.js)](agents/nextjs/quickstart/example/README.md) — Live voice conversations with the ElevenLabs Agents Platform using the React Agents SDK.
+- [Voice Agent Guardrails Demo (Next.js)](agents/nextjs/guardrails/example/README.md) — Demonstrate custom guardrails and the `guardrail_triggered` client event in a live voice agent.
+- [Voice Isolator (Next.js)](voice-isolator/nextjs/quickstart/example/README.md) — Record your voice in the browser and remove background noise with the Voice Isolator API.
+- [Dubbing Recorder (Next.js)](dubbing/nextjs/quickstart/example/README.md) — Record your voice in the browser, dub it into another language, and play or download the result.
 
 ## Generate examples from prompts
 
diff --git a/agents/nextjs/guardrails/example/app/api/agent/route.ts b/agents/nextjs/guardrails/example/app/api/agent/route.ts
index 41e821bb..a0739156 100644
--- a/agents/nextjs/guardrails/example/app/api/agent/route.ts
+++ b/agents/nextjs/guardrails/example/app/api/agent/route.ts
@@ -87,7 +87,7 @@ export async function POST() {
         },
         tts: {
           voiceId: "JBFqnCBsd6RMkjVDRZzb",
-          modelId: "eleven_turbo_v2",
+          modelId: "eleven_flash_v2",
         },
         conversation: {
           textOnly: false,
diff --git a/agents/nextjs/quickstart/example/app/api/agent/route.ts b/agents/nextjs/quickstart/example/app/api/agent/route.ts
index e702f6d5..f4bd2fd5 100644
--- a/agents/nextjs/quickstart/example/app/api/agent/route.ts
+++ b/agents/nextjs/quickstart/example/app/api/agent/route.ts
@@ -81,7 +81,7 @@ export async function POST() {
         },
         tts: {
           voiceId: "JBFqnCBsd6RMkjVDRZzb",
-          modelId: "eleven_turbo_v2",
+          modelId: "eleven_flash_v2",
         },
         conversation: {
           textOnly: false,
diff --git a/dubbing/nextjs/quickstart/PROMPT.md b/dubbing/nextjs/quickstart/PROMPT.md
new file mode 100644
index 00000000..fa2530cc
--- /dev/null
+++ b/dubbing/nextjs/quickstart/PROMPT.md
@@ -0,0 +1,45 @@
+Before writing any code, invoke the `/text-to-speech` skill to learn the correct ElevenLabs SDK patterns.
+
+## 1. `app/api/dubbing/route.ts`
+
+Secure POST endpoint that starts a dubbing job from an uploaded recording.
+
+- Read `ELEVENLABS_API_KEY` from `process.env`. Return 500 if missing.
+- Accept `audio` (File), `targetLang` (string), and optional `sourceLang` (string, default `auto`) from request `FormData`.
+- Return 400 for missing or invalid audio, or a missing `targetLang`.
+- Use `ElevenLabsClient` and call `client.dubbing.create({ file: audio, targetLang, sourceLang: sourceLang === "auto" ? undefined : sourceLang, name: "Browser dubbing demo" })`.
+- Read the job id from the SDK response (`dubbingId`) and return JSON `{ dubbingId, expectedDurationSec }`.
+- Wrap failures in readable JSON errors.
+
+## 2. `app/api/dubbing/[dubbingId]/route.ts`
+
+Secure GET endpoint that returns dubbing status metadata for polling.
+
+- Read and validate `dubbingId` from the route params.
+- Call `client.dubbing.get(dubbingId)`.
+- Return JSON with `status`, `error`, `sourceLanguage`, and `targetLanguages`.
+- Keep the response small and friendly for client polling.
+
+## 3. `app/api/dubbing/[dubbingId]/audio/[languageCode]/route.ts`
+
+Secure GET endpoint that proxies the dubbed audio file.
+
+- Read and validate `dubbingId` and `languageCode` from the route params.
+- Call `client.dubbing.audio.get(dubbingId, languageCode)`.
+- Collect the returned stream into a `Buffer` and respond with `audio/mpeg`.
+- Return readable JSON errors when the dub is not ready or fails.
+
+## 4. `app/page.tsx`
+
+In-browser voice recorder and dubbing page.
+
+- Use a compact curated language list in the page: `auto` for source detection plus English, Spanish, French, German, Italian, Portuguese, Japanese, Korean, and Hindi.
+- Record from the microphone with `navigator.mediaDevices.getUserMedia({ audio: true })` and `MediaRecorder`, using a browser-supported mime type (prefer `audio/webm;codecs=opus`, then `audio/webm`, then `audio/mp4`).
+- After stopping, convert the recorded blob to a WAV `File` in the browser before upload. Do not send raw `audio/webm;codecs=opus` to `/api/dubbing`, because the Dubbing API rejects that content type.
+- Show clear states: idle, recording, preparing, polling, ready, and error. While recording, show elapsed time and a pulsing red indicator.
+- After recording, show the original audio player plus source-language and target-language selects. Prevent choosing the same explicit source and target language.
+- On **Dub Recording**, `POST` `FormData` with the converted WAV file to `/api/dubbing`.
+- Poll `/api/dubbing/${dubbingId}` every 5 seconds until the status is `dubbed`; stop early and show the API error if one is returned.
+- When ready, fetch `/api/dubbing/${dubbingId}/audio/${targetLang}`, create an object URL, and render a dubbed `