104 changes: 104 additions & 0 deletions frontend-angular-ai/voice-generator/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
# voice-generator

Generate a short script with an **LLM** (ChatGPT or Claude), then turn it into speech with a
**Text-to-Speech (TTS)** engine. Two TTS providers are supported and interchangeable at request time:

| Provider | `tts` value | Auth header | Response | Notes |
|--------------|---------------|------------------------------|-------------------|-----------------------------------------|
| ElevenLabs | `elevenlabs` | `xi-api-key: <key>` | streamed MP3 | Default. `eleven_multilingual_v2` model |
| 60dB | `60db` | `Authorization: Bearer <key>`| JSON (base64 MP3) | `/tts-synthesize`, decoded to disk |
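
The 60dB row differs from ElevenLabs mainly in how the audio comes back: a JSON body carrying base64 text rather than a byte stream. A minimal sketch of that decode step follows. The field names `success`, `message`, and `audio_base64` come from the adapter in this change; `decodeAudioPayload` and the sample payload are hypothetical and only illustrate the idea.

```javascript
// Sketch only: validate a 60dB-style JSON body and decode it to raw bytes.
// `decodeAudioPayload` is a hypothetical helper, not a function from the repo.
function decodeAudioPayload(payload) {
  const { success, message, audio_base64 } = payload || {};
  if (!success || !audio_base64) {
    throw new Error(message || 'Invalid 60db response (audio_base64 missing)');
  }
  // Buffer.from(..., 'base64') turns the base64 string back into binary data.
  return Buffer.from(audio_base64, 'base64');
}

// Fabricated payload for illustration; the bytes are not real audio.
const samplePayload = {
  success: true,
  audio_base64: Buffer.from('ID3 demo').toString('base64'),
};
const audioBytes = decodeAudioPayload(samplePayload);
console.log(audioBytes.length, 'bytes decoded');
```

In the real adapter the resulting buffer is written to the output path with `fs.writeFileSync`.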

---

## 🧩 Architecture

```
Angular frontend ──POST /api/llm/:type/:llm──► LLM service ──► storage/data/<name>-<llm>.json
│ │
└────POST /api/voice/:llm?tts=<provider>──► voice route ──reads JSON──┘
ElevenLabs OR 60dB adapter ──► storage/voices/<name>-<llm>.mp3
◄──── { success, data: <public MP3 url> } ──────────┘ (served from /storage)
```

The voice route does **not** receive the text directly — it reads the script JSON produced by the
LLM step, then hands the text to the selected TTS adapter.
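
Because both steps derive their file names from the same `name` and `llm` values, the paths in the diagram can be sketched as below. `safeFilename` mirrors the helper in `voice.routes.js`; `storagePaths` and its `root` parameter are hypothetical names used only for this illustration, and the real route builds paths with `path.join(process.cwd(), ...)` rather than string templates.

```javascript
// Same slug logic as the route: lowercase, whitespace runs become hyphens.
function safeFilename(name, llm) {
  return `${name.toLowerCase().replace(/\s+/g, '-')}-${llm}`;
}

// Hypothetical helper: where the script is read from, where the MP3 is
// written, and the public URL returned to the frontend.
function storagePaths(name, llm, root = '.') {
  const base = safeFilename(name, llm);
  return {
    script: `${root}/storage/data/${base}.json`,
    audio: `${root}/storage/voices/${base}.mp3`,
    publicUrl: `/storage/voices/${base}.mp3`,
  };
}

console.log(storagePaths('Ridley Scott', 'chatgpt'));
```

If the script JSON for a given name and LLM has not been generated yet, the voice step has nothing to read, so the LLM request has to run first.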

---

## 🔗 API

| Method | Endpoint | Description |
|--------|---------------------------------------|--------------------------------------------------------|
| POST | `/api/llm/:type/:llm` | Generate a script (`type` = `biography`/`summary`) |
| POST | `/api/voice/:llm?tts=<provider>` | Synthesize the script to MP3. `tts` defaults to `elevenlabs`; unrecognized values also fall back to it |
| GET | `/api/voice/health/tts` | ElevenLabs connectivity/key check |

**Voice request example**

```bash
# ElevenLabs (default — ?tts can be omitted)
curl -X POST "http://localhost:3000/api/voice/chatgpt" \
-H "Content-Type: application/json" -d '{"name":"Ridley Scott"}'

# 60dB
curl -X POST "http://localhost:3000/api/voice/chatgpt?tts=60db" \
-H "Content-Type: application/json" -d '{"name":"Ridley Scott"}'
```

Provider selection lives in `backend-javascript/src/routes/voice.routes.js` (`getTtsProvider()`),
mirroring the LLM `getProvider()` pattern. Each adapter exposes the same
`generateVoice(text, voiceId, outputPath)` signature:

- `src/services/voice/voice.service.js` — ElevenLabs
- `src/services/voice/sixtydb.service.js` — 60dB
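
The shared signature is what makes the providers interchangeable: the route only needs a lookup that returns a function. The sketch below illustrates that pattern with stub adapters. It is not the code from `voice.routes.js`; `adapters` and `pickAdapter` are hypothetical names, and the own-property check is an optional hardening (a plain `providers[tts]` lookup would also resolve inherited keys such as `constructor`).

```javascript
// Stub adapters standing in for the real services; each follows the shared
// generateVoice(text, voiceId, outputPath) shape and just echoes its inputs.
const adapters = {
  elevenlabs: async (text, voiceId, outputPath) => ({ via: 'elevenlabs', voiceId, outputPath }),
  '60db': async (text, voiceId, outputPath) => ({ via: '60db', voiceId, outputPath }),
};

// Hypothetical variant of the lookup: normalise the value, then accept only
// keys defined on the registry itself before falling back to ElevenLabs.
function pickAdapter(tts) {
  const key = String(tts ?? 'elevenlabs').toLowerCase();
  const known = Object.prototype.hasOwnProperty.call(adapters, key);
  return known ? adapters[key] : adapters.elevenlabs;
}

pickAdapter('60DB')('Hello', '', 'out.mp3').then((result) => console.log(result));
```

Adding a third provider under this pattern would mean one new adapter file and one new registry entry, with no change to the route body.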

---

## 🛠 Configuration

`backend-javascript/.env` (see `.env.template`):

```env
# true => local mocks, no API calls | false => real provider APIs
USE_MOCK=true

# ElevenLabs
ELEVENLABS_API_KEY=eleven-your-key
ELEVENLABS_VOICE_ID=eleven-voice-id-xxxxxxxx

# 60dB (VOICE_ID optional — blank uses the 60dB system default voice)
SIXTYDB_API_KEY=sixtydb-your-key
SIXTYDB_VOICE_ID=
```

When `USE_MOCK=true`, the backend copies a pre-recorded sample instead of calling any provider, so
no API key is needed to demo the flow.
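
One detail worth keeping in mind with a flag like `USE_MOCK`: environment variables are always strings, so the value `"false"` is truthy if tested directly. How the backend actually parses the flag is not shown in this change, so the helper below is an assumption (`isMockEnabled` is a hypothetical name) sketching one explicit way to read it.

```javascript
// Hypothetical helper: treat only the literal string "true" (any case,
// surrounding whitespace ignored) as enabling mock mode.
function isMockEnabled(env = process.env) {
  return String(env.USE_MOCK ?? '').trim().toLowerCase() === 'true';
}

console.log(isMockEnabled({ USE_MOCK: 'true' }));  // true
console.log(isMockEnabled({ USE_MOCK: 'false' })); // false
console.log(isMockEnabled({}));                    // false (unset)
```

Under this reading, an unset flag means real provider calls; a project that prefers mock-by-default would invert the comparison.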

---

## 🎚 Frontend

In the Angular UI, a **Voix (TTS)** ("Voice (TTS)") dropdown selects the provider (`ElevenLabs` / `60dB`). The choice
is sent to the backend as the `?tts=` query parameter, and the voice buttons and status messages are relabeled to match the active provider.

> Note: in mock mode the bundled sample audio is ElevenLabs-style; switching to 60dB relabels the UI
> but plays the same sample. Real synthesis (`USE_MOCK=false`) uses the selected provider end to end.
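
The request URL the frontend ends up calling can be sketched as a plain function. The `?tts=` encoding matches what `AiService.generateVoice` does in this change; `buildVoiceUrl` and the example base URL are hypothetical, and encoding `llm` as well is an extra precaution not present in the service.

```javascript
// Hypothetical standalone version of the URL construction used by the service.
function buildVoiceUrl(baseUrl, llm, tts = 'elevenlabs') {
  return `${baseUrl}/voice/${encodeURIComponent(llm)}?tts=${encodeURIComponent(tts)}`;
}

console.log(buildVoiceUrl('http://localhost:3000/api', 'chatgpt'));
console.log(buildVoiceUrl('http://localhost:3000/api', 'claude', '60db'));
```

Defaulting the third argument keeps existing two-argument callers working unchanged, which is the same choice the service signature makes.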

---

## ⚙️ Quick start

```bash
# Backend
cd backend-javascript
npm install
npm start # http://localhost:3000

# Frontend
cd frontend-angular
npm install
npm start # http://localhost:4200
```
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,11 @@ DEEPSEEK_API_KEY=deepseek-your-key
ELEVENLABS_API_KEY=eleven-your-key
ELEVENLABS_VOICE_ID=eleven-voice-id-xxxxxxxx

+ # 60dB – Realistic voice synthesis & cloning (multi-language)
+ # VOICE_ID is optional: leave blank to use the 60dB system default voice
+ SIXTYDB_API_KEY=sixtydb-your-key
+ SIXTYDB_VOICE_ID=

# --------------------------------------------------
# AVATARS / VIDEO AI – Face & Speech Animation
# --------------------------------------------------
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ const aiServices = {

tts: [
{ type: 'elevenlabs', label: 'ElevenLabs', purpose: 'High-quality voice synthesis from text, multilingual' },
+ { type: '60db', label: '60dB', purpose: 'Voice synthesis and cloning from text, multilingual' },
// other TTS services...
],

Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ const aiServices = {

tts: [
{ type: 'elevenlabs', label: 'ElevenLabs', purpose: 'High-quality voice synthesis from text, multilingual' },
+ { type: '60db', label: '60dB', purpose: 'Voice synthesis and cloning from text, multilingual' },
],

avatar: [
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ const aiServices = {

tts: [
{ type: 'elevenlabs', label: 'ElevenLabs', purpose: 'High-quality voice synthesis from text, multilingual' },
+ { type: '60db', label: '60dB', purpose: 'Voice synthesis and cloning from text, multilingual' },
],

avatar: [
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@ import path from 'path';
import dotenv from 'dotenv';

import testElevenLabs from '../services/voice/test-elevenlabs.js';
- import generateVoice from '../services/voice/voice.service.js';
+ import generateVoiceElevenLabs from '../services/voice/voice.service.js';
+ import generateVoiceSixtyDb from '../services/voice/sixtydb.service.js';
import generateVoiceMock from '../mocks/voice/voice.mock.js';

dotenv.config();
Expand All @@ -16,11 +17,28 @@ function safeFilename(name, llm) {
return `${name.toLowerCase().replace(/\s+/g, '-')}-${llm}`;
}

+ function getTtsProvider(tts) {
+   const providers = {
+     elevenlabs: {
+       real: generateVoiceElevenLabs,
+       voiceId: () => process.env.ELEVENLABS_VOICE_ID || '21m00Tcm4TlvDq8ikWAM',
+     },
+     '60db': {
+       real: generateVoiceSixtyDb,
+       voiceId: () => process.env.SIXTYDB_VOICE_ID || '',
+     },
+   };
+
+   return providers[tts] || providers.elevenlabs;
+ }

router.post('/:llm', async (req, res) => {
const { llm } = req.params;
const { name } = req.body;

- const voiceId = process.env.ELEVENLABS_VOICE_ID || '21m00Tcm4TlvDq8ikWAM';
+ const tts = (req.query.tts || 'elevenlabs').toLowerCase();
+ const provider = getTtsProvider(tts);
+ const voiceId = provider.voiceId();
const fileName = safeFilename(name, llm);

const audioPath = path.join(process.cwd(), 'storage', 'voices', `${fileName}.mp3`);
Expand All @@ -47,8 +65,8 @@ router.post('/:llm', async (req, res) => {
await generateVoiceMock(text, voiceId, audioPath);
console.log('🟡 TTS MOCK -', audioPath);
} else {
- await generateVoice(text, voiceId, audioPath);
- console.log('✅ Real TTS -', audioPath);
+ await provider.real(text, voiceId, audioPath);
+ console.log(`✅ Real TTS (${tts}) -`, audioPath);
}

const publicPath = `/storage/voices/${fileName}.mp3`;
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@

import axios from 'axios';
import fs from 'fs';

async function generateVoice(text, voiceId, outputPath) {
  const url = 'https://api.60db.ai/tts-synthesize';

  try {
    const body = {
      text: text,
      output_format: 'mp3',
    };

    // voice_id is optional: when omitted, 60dB uses its system default voice
    if (voiceId) {
      body.voice_id = voiceId;
    }

    const response = await axios.post(
      url,
      body,
      {
        headers: {
          Authorization: `Bearer ${process.env.SIXTYDB_API_KEY}`,
          'Content-Type': 'application/json',
        },
      },
    );

    const { success, message, audio_base64 } = response.data || {};

    if (!success || !audio_base64) {
      throw new Error(message || 'Invalid 60db response (audio_base64 missing)');
    }

    // Decode the base64 payload and write the MP3 to disk
    fs.writeFileSync(outputPath, Buffer.from(audio_base64, 'base64'));
    console.log('✅ Audio saved:', outputPath);

    return outputPath;

  } catch (error) {
    const status = error.response?.status;

    if (status) {
      console.error(`❌ 60db error ${status}`);
    } else {
      console.error('❌ Unknown error:', error.message);
    }

    throw error;
  }
}

export default generateVoice;
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ export class AiService {
);
}

- generateVoice(llm: string, name: string): Observable<VoiceGenerationResponse> {
+ generateVoice(llm: string, name: string, tts = 'elevenlabs'): Observable<VoiceGenerationResponse> {
if (environment.useMock) {
const safeName = name.toLowerCase().replace(/\s+/g, '-');
const voiceMockPath = `assets/voices/${safeName}-${llm}.mp3`;
Expand All @@ -57,7 +57,7 @@ export class AiService {
}).pipe(delay(1000));
}

- const url = `${this.baseUrl}/voice/${llm}`;
+ const url = `${this.baseUrl}/voice/${llm}?tts=${encodeURIComponent(tts)}`;
const body = { name };

return this.http.post<VoiceGenerationResponse>(url, body).pipe(
Original file line number Diff line number Diff line change
Expand Up @@ -15,20 +15,26 @@ <h1 class="text-center text-primary">voice-generator</h1>
<option value="summary">Résumé de film</option>
</select>
</div>
- <div class="col-12 col-lg-3">
+ <div class="col-12 col-lg-2">
<label class="form-label" for="style">Style</label>
<select class="form-select" [(ngModel)]="style" (ngModelChange)="onStyleChange($event)" [disabled]="useMock">
<option *ngFor="let s of styleOptions" [value]="s.value">{{ s.label }}</option>
</select>
</div>
- <div class="col-12 col-lg-3">
+ <div class="col-12 col-lg-2">
<label class="form-label" for="length">Longueur</label>
<select class="form-select" [(ngModel)]="length" (ngModelChange)="onLengthChange($event)" [disabled]="useMock">
<option value="short">Courte</option>
<option value="medium">Moyenne</option>
<option value="long">Longue</option>
</select>
</div>
+ <div class="col-12 col-lg-2">
+ <label class="form-label" for="tts">Voix (TTS)</label>
+ <select class="form-select" [(ngModel)]="tts" (ngModelChange)="onTtsChange($event)">
+ <option *ngFor="let t of ttsOptions" [value]="t.value">{{ t.label }}</option>
+ </select>
+ </div>
</div>

<div class="row g-4 mb-3">
Expand Down Expand Up @@ -79,7 +85,7 @@ <h1 class="text-center text-primary">voice-generator</h1>
</div>
<div class="card p-4 m-1">
<div class="d-flex justify-content-between align-items-center mb-3 btn-group-responsive">
- <button class="btn btn-outline-primary" (click)="loadVoice('chatgpt')">Voix - ElevenLabs</button>
+ <button class="btn btn-outline-primary" (click)="loadVoice('chatgpt')">Voix - {{ ttsLabel }}</button>
<span *ngIf="voiceChatgpt && !voiceChatgptLoading" class="badge bg-primary">Voix OK ✓</span>
<span *ngIf="voiceChatgptDuration > 0" class="text-primary ms-auto small-text">Réponse en {{
voiceChatgptDuration.toFixed(1) }}s</span>
Expand All @@ -96,12 +102,12 @@ <h1 class="text-center text-primary">voice-generator</h1>
</div>
<div class="mt-3">
<div *ngIf="voiceChatgptLoading" class="alert alert-info alert-dismissible fade show" role="alert">
- 📨 Requête envoyée à ElevenLabs...
+ 📨 Requête envoyée à {{ ttsLabel }}...
<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Fermer"></button>
</div>
<div *ngIf="voiceChatgptLoading && !voiceChatgpt && !voiceChatgptError"
class="alert alert-warning alert-dismissible fade show" role="alert">
- ⏳ Réponse de ElevenLabs en cours de traitement...
+ ⏳ Réponse de {{ ttsLabel }} en cours de traitement...
<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Fermer"></button>
</div>
<div *ngIf="voiceChatgptError" class="alert alert-danger alert-dismissible fade show" role="alert">
Expand All @@ -110,7 +116,7 @@ <h1 class="text-center text-primary">voice-generator</h1>
</div>
<div *ngIf="voiceChatgpt && !voiceChatgptError" class="alert alert-success alert-dismissible fade show"
role="alert">
- ✅ Réponse de ElevenLabs reçue avec succès.
+ ✅ Réponse de {{ ttsLabel }} reçue avec succès.
<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Fermer"></button>
</div>
</div>
Expand Down Expand Up @@ -171,7 +177,7 @@ <h1 class="text-center text-primary">voice-generator</h1>

<div class="card p-4 m-1">
<div class="d-flex justify-content-between align-items-center mb-3 btn-group-responsive">
- <button class="btn btn-outline-success" (click)="loadVoice('claude')">Voix - ElevenLabs</button>
+ <button class="btn btn-outline-success" (click)="loadVoice('claude')">Voix - {{ ttsLabel }}</button>
<span *ngIf="voiceClaude && !voiceClaudeLoading" class="badge bg-success">Voix OK ✓</span>
<span *ngIf="voiceClaudeDuration > 0" class="text-success ms-auto small-text">Réponse en {{
voiceClaudeDuration.toFixed(1) }}s</span>
Expand All @@ -188,12 +194,12 @@ <h1 class="text-center text-primary">voice-generator</h1>
</div>
<div class="mt-3">
<div *ngIf="voiceClaudeLoading" class="alert alert-info alert-dismissible fade show" role="alert">
- 📨 Requête envoyée à ElevenLabs...
+ 📨 Requête envoyée à {{ ttsLabel }}...
<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Fermer"></button>
</div>
<div *ngIf="voiceClaudeLoading && !voiceClaude && !voiceClaudeError"
class="alert alert-warning alert-dismissible fade show" role="alert">
- ⏳ Réponse de ElevenLabs en cours de traitement...
+ ⏳ Réponse de {{ ttsLabel }} en cours de traitement...
<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Fermer"></button>
</div>
<div *ngIf="voiceClaudeError" class="alert alert-danger alert-dismissible fade show" role="alert">
Expand All @@ -202,7 +208,7 @@ <h1 class="text-center text-primary">voice-generator</h1>
</div>
<div *ngIf="voiceClaude && !voiceClaudeError" class="alert alert-success alert-dismissible fade show"
role="alert">
- ✅ Réponse de ElevenLabs reçue avec succès.
+ ✅ Réponse de {{ ttsLabel }} reçue avec succès.
<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Fermer"></button>
</div>
</div>