🎵 Timbre AI – AI-Powered Text-to-Speech and Voice Cloning SaaS

A modern AI-powered Text-to-Speech SaaS built with Next.js, TypeScript, TailwindCSS, and Prisma/Postgres.
Timbre AI allows users to generate realistic speech, clone voices instantly, and manage audio workflows with team-based access and usage-based billing.

🗣️ Live Demo

Try it here:
👉 https://timbre-ai.vercel.app/

🚀 Features

📝 Text-to-Speech – Generate speech with adjustable creativity, variety, expression, and flow parameters
🗣️ Zero-Shot Voice Cloning – Clone any voice from a 10s+ sample instantly, no fine-tuning required
🎤 20 Built-in Voices – Pre-seeded system voices across 12 categories and 5 locales
📊 Waveform Audio Player – Play, pause, seek, and download with WaveSurfer.js visualization
👥 Multi-Tenant – Team access via Clerk Organizations with full data isolation
💸 Usage-Based Billing – Metered pricing with Polar for characters and voice generations
🕒 Generation History – Browse and replay past audio generations with voice metadata
📱 Fully Responsive – Mobile-first, adaptive layouts, and compact controls

🏗️ Tech Stack

Getting Started

Prerequisites

Node.js 20.9 or later
Prisma Postgres database
Clerk account (with Organizations enabled)
Cloudflare R2 bucket
Modal account (for GPU-hosted TTS)
Polar account (for billing)

1. Clone and install

git clone https://github.com/code-with-antonio/resonance.git
cd resonance
npm install

2. Configure environment

cp .env.example .env

Fill in the blank values in .env. Sensible defaults (Clerk routes, Polar meter names, APP_URL, etc.) are pre-filled.

3. Set up Polar billing

In your Polar dashboard, create two meters under Meters:

Voice Creation meter
- Filter: Name equals voice_creation
- Aggregation: Count
Text-to-Speech Characters meter
- Filter: Name equals tts_generation
- Aggregation: Sum over characters

Then create a new product with Recurring subscription pricing. Under Price Type, add two metered prices:

Click Add metered price and select the Text-to-Speech Characters meter
- Set the Amount per unit (price per character, e.g. $0.003)
- Optionally set a Cap amount (e.g. $100)
Click Add metered price again and select the Voice Creation meter
- Set the Amount per unit (price per voice generation, e.g. $0.25)
- Optionally set a Cap amount (e.g. $100)

With only metered prices, the subscription starts at $0/month and scales with usage. If you want a baseline subscription fee (e.g. $20/month), add a third price to the same product — select a fixed price instead of a metered price. This requires no code changes since fixed prices are handled entirely by Polar.

Ensure Allow multiple subscriptions is turned off under Settings > Billing (this is the Polar default).

Copy the product ID into POLAR_PRODUCT_ID. The meter filter names and aggregation property must match the POLAR_METER_* env variables.

4. Set up the database

npx prisma migrate deploy

5. Deploy the TTS engine

The included chatterbox_tts.py is adapted from Modal's official Chatterbox TTS example, modified to read voice reference audio directly from your R2 bucket instead of a Modal Volume.

Before deploying, update chatterbox_tts.py with your R2 credentials:

R2_BUCKET_NAME = "<your-r2-bucket-name-here>"
R2_ACCOUNT_ID = "<your-r2-account-id-here>"

Then create the required secrets in your Modal dashboard:

Secret Name	Keys	Description
`cloudflare-r2`	`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`	R2 API credentials (used for bucket mount)
`chatterbox-api-key`	`CHATTERBOX_API_KEY`	API key to protect the endpoint (use any strong random string)
`hf-token`	`HF_TOKEN`	Hugging Face token (for downloading the Chatterbox model weights)

Deploy to Modal:

modal deploy chatterbox_tts.py

This deploys Chatterbox TTS to a serverless NVIDIA A10G GPU on Modal. The container mounts your R2 bucket read-only for direct access to voice reference audio. Use the resulting Modal URL as CHATTERBOX_API_URL in your .env.local.

Note: The first request after a period of inactivity may take longer due to cold starts as Modal provisions the GPU container.

Once deployed, generate the type-safe Chatterbox client from the OpenAPI spec:

npm run sync-api

6. Seed voices

npx prisma db seed

Seeds 20 built-in voices to the database and R2. The system voice WAV files are included in the repository and originate from Modal's voice sample pack.

7. Run

npm run dev

Open http://localhost:3000.

Self-Hosting

Resonance is designed to be self-hosted. You'll need:

A PostgreSQL database - Prisma Postgres (recommended), or any managed Postgres
Cloudflare R2 - For audio storage (S3-compatible, generous free tier)
Modal - For serverless GPU inference (pay-per-second billing)
Clerk - For authentication and multi-tenancy
Polar - For metered billing (use sandbox mode with card 4242 4242 4242 4242 for testing)

Deploy the Next.js app to any Node.js host (Railway, Docker, etc.).

Project Structure

src/
├── app/                        # Next.js App Router
│   ├── (dashboard)/            # Protected routes (home, TTS, voices)
│   ├── api/                    # Audio proxy routes + tRPC handler
│   ├── sign-in/                # Clerk auth pages
│   └── sign-up/
├── components/                 # Shared UI components (shadcn/ui + custom)
├── features/
│   ├── dashboard/              # Home page, quick actions
│   ├── text-to-speech/         # TTS form, audio player, settings, history
│   ├── voices/                 # Voice library, creation, recording
│   └── billing/                # Usage display, checkout
├── hooks/                      # App-wide hooks
├── lib/                        # Core: db, r2, polar, env, chatterbox client
├── trpc/                       # tRPC routers, client, server helpers
├── generated/                  # Prisma client
└── types/                      # Generated API types

Scripts

Command	Description
`npm run dev`	Start dev server
`npm run build`	Production build
`npm run start`	Start production server
`npm run lint`	Lint with ESLint
`npm run sync-api`	Regenerate Chatterbox API types from OpenAPI spec

Acknowledgements

Chatterbox TTS by Resemble AI - the open-source zero-shot voice cloning model powering speech generation
Modal - serverless GPU deployment example and voice sample pack

📜 License

👨‍💻 Author

⭐ If you like Timbre AI, consider giving it a star!

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
.vscode		.vscode
__pycache__		__pycache__
prisma		prisma
public		public
scripts		scripts
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
chatterbox_tts.py		chatterbox_tts.py
components.json		components.json
eslint.config.mjs		eslint.config.mjs
global.d.ts		global.d.ts
next.config.ts		next.config.ts
package-lock.json		package-lock.json
package.json		package.json
postcss.config.mjs		postcss.config.mjs
prisma.config.ts		prisma.config.ts
screenshot.png		screenshot.png
sentry.edge.config.ts		sentry.edge.config.ts
sentry.server.config.ts		sentry.server.config.ts
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🎵 Timbre AI – AI-Powered Text-to-Speech and Voice Cloning SaaS

🗣️ Live Demo

🚀 Features

🏗️ Tech Stack

Getting Started

Prerequisites

1. Clone and install

2. Configure environment

3. Set up Polar billing

4. Set up the database

5. Deploy the TTS engine

6. Seed voices

7. Run

Self-Hosting

Project Structure

Scripts

Acknowledgements

📜 License

👨‍💻 Author

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🎵 Timbre AI – AI-Powered Text-to-Speech and Voice Cloning SaaS

🗣️ Live Demo

🚀 Features

🏗️ Tech Stack

Getting Started

Prerequisites

1. Clone and install

2. Configure environment

3. Set up Polar billing

4. Set up the database

5. Deploy the TTS engine

6. Seed voices

7. Run

Self-Hosting

Project Structure

Scripts

Acknowledgements

📜 License

👨‍💻 Author

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages