# Authorized Voice Generation Adapter

Purpose: define a provider-neutral interface for generating per-segment narration with an authorized teacher voice.

## Current Asset

- Speaker ID: `teacher-liu-authorized`
- Consent record: `storage/voice-rights/teacher-liu-authorized/consent.md`
- Training manifest: `storage/voice-rights/teacher-liu-authorized/training-manifest.json`
- Voice model config: `storage/voice-rights/teacher-liu-authorized/voice-model.json`
- Cleaned source audio: `storage/voice-rights/teacher-liu-authorized/cleaned-audio/reference-youtube-v2EN9GVEeMc-clean.wav`

## Interface

Input:

```json
{
  "speakerId": "teacher-liu-authorized",
  "text": "這一課要先抓住題目...",
  "language": "zh-TW",
  "style": {
    "pace": "teacher_explanation",
    "emotion": "warm_precise",
    "pauseAfterBoardPointMs": 400
  },
  "outputPath": "storage/childhood-wonder-video/segment-01-audio.mp3"
}
```

Output:

```json
{
  "ok": true,
  "speakerId": "teacher-liu-authorized",
  "audioPath": "storage/childhood-wonder-video/segment-01-audio.mp3",
  "durationSeconds": 42.7,
  "provider": "provider-name",
  "providerVoiceId": "provider-specific-id"
}
```

Failure fallback:

```json
{
  "ok": false,
  "reason": "provider_not_configured",
  "fallback": {
    "mode": "edge-tts",
    "voice": "zh-TW-YunJheNeural"
  }
}
```
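The three JSON shapes above can be written down as types so that callers must handle the failure case explicitly. This is a minimal sketch: the type names (`VoiceRequest`, `VoiceSuccess`, `VoiceFailure`, `VoiceResult`) and the `describeResult` helper are hypothetical and do not come from the existing code; only the field names are taken from the examples.

```typescript
// Hypothetical type names; the fields mirror the JSON examples above.
interface VoiceStyle {
  pace: string;
  emotion: string;
  pauseAfterBoardPointMs: number;
}

interface VoiceRequest {
  speakerId: string;
  text: string;
  language: string;
  style: VoiceStyle;
  outputPath: string;
}

interface VoiceSuccess {
  ok: true;
  speakerId: string;
  audioPath: string;
  durationSeconds: number;
  provider: string;
  providerVoiceId: string;
}

interface VoiceFailure {
  ok: false;
  reason: string;
  fallback: { mode: string; voice: string };
}

// Discriminated union: checking `ok` narrows to one of the two shapes.
type VoiceResult = VoiceSuccess | VoiceFailure;

// Hypothetical helper that turns a result into a one-line log message.
function describeResult(result: VoiceResult): string {
  if (result.ok) {
    return `${result.audioPath} (${result.durationSeconds.toFixed(1)}s via ${result.provider})`;
  }
  return `failed: ${result.reason}; fall back to ${result.fallback.mode} voice ${result.fallback.voice}`;
}
```

Because `ok` is a literal `true` or `false`, the compiler rejects any access to `audioPath` on a failure result, which keeps the fallback path from being skipped by accident.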

## Provider Requirements

Any provider implementation must:

- Verify that the speaker's consent record (`consentPath`, e.g. `storage/voice-rights/teacher-liu-authorized/consent.md`) exists before generating voice-clone audio.
- Never print API keys or provider secrets.
- Accept text from rewritten scripts only.
- Generate one audio file per segment.
- Normalize output loudness.
- Convert output to MP3 if the provider returns WAV.
- Return the audio duration as measured by `ffprobe`.
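
Several of these requirements can be sketched as small helpers. The function names below are hypothetical, and the consent path layout is assumed from the "Current Asset" list. The `ffmpeg` and `ffprobe` arguments use the standard `loudnorm` filter and `-show_entries format=duration` option, but the exact loudness target and encoder settings are left open here and would need to be chosen per project.

```typescript
import { existsSync } from "node:fs";

// Assumption: consent records follow the layout listed under "Current Asset".
function consentPathFor(speakerId: string, root = "storage/voice-rights"): string {
  return `${root}/${speakerId}/consent.md`;
}

// Throws before any provider call when the consent record is absent.
function assertConsent(speakerId: string, root?: string): string {
  const consentPath = consentPathFor(speakerId, root);
  if (!existsSync(consentPath)) {
    throw new Error(`Consent record not found: ${consentPath}`);
  }
  return consentPath;
}

// ffmpeg arguments: normalize loudness with the loudnorm filter and encode to MP3.
// Assumes an ffmpeg build that includes the libmp3lame encoder.
function normalizeToMp3Args(inputPath: string, outputPath: string): string[] {
  return ["-y", "-i", inputPath, "-af", "loudnorm", "-codec:a", "libmp3lame", outputPath];
}

// ffprobe arguments that print only the container duration, in seconds.
function ffprobeDurationArgs(audioPath: string): string[] {
  return [
    "-v", "error",
    "-show_entries", "format=duration",
    "-of", "default=noprint_wrappers=1:nokey=1",
    audioPath,
  ];
}

// Parses ffprobe's stdout and rejects empty or non-numeric values such as "N/A".
function parseDuration(stdout: string): number {
  const seconds = Number(stdout.trim());
  if (!Number.isFinite(seconds) || seconds <= 0) {
    throw new Error(`Unexpected ffprobe output: ${JSON.stringify(stdout)}`);
  }
  return seconds;
}
```

The argument arrays are meant to be passed to `execFile("ffmpeg", ...)` and `execFile("ffprobe", ...)` from `node:child_process`. Building them as plain arrays keeps paths out of a shell string and makes the helpers testable without the binaries installed.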

## Segment Generation Loop

```text
for each segment:
  read segment.narration
  generate authorized voice audio
  normalize/convert to segment-{id}-audio.mp3
  ffprobe duration
  store duration for video composition
```
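
The loop above might look like the following in TypeScript. This is a sketch under assumptions: `generateSegmentAudio`, `Synthesize`, and `ProbeDuration` are hypothetical names, and the provider call and the `ffprobe` call are injected as callbacks so that no particular provider is implied. The `synthesize` callback is expected to leave a normalized MP3 at the given path.

```typescript
// Hypothetical segment shape; only `narration` is named in this document.
interface Segment {
  id: string;
  narration: string;
}

// Writes normalized MP3 audio for `text` to `outputPath` (provider-specific).
type Synthesize = (text: string, outputPath: string) => Promise<void>;

// Returns the duration of the file in seconds, e.g. by running ffprobe.
type ProbeDuration = (audioPath: string) => Promise<number>;

// Generates one audio file per segment, in order, and collects durations by segment id.
async function generateSegmentAudio(
  segments: Segment[],
  outputDir: string,
  synthesize: Synthesize,
  probeDuration: ProbeDuration,
): Promise<Record<string, number>> {
  const durations: Record<string, number> = {};
  for (const segment of segments) {
    const audioPath = `${outputDir}/segment-${segment.id}-audio.mp3`;
    await synthesize(segment.narration, audioPath);
    durations[segment.id] = await probeDuration(audioPath);
  }
  return durations;
}
```

Segments are processed sequentially here, which keeps provider rate limits and log output simple; running them concurrently is possible but is not assumed by anything in this document.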

## Integration With V1

The current V1 script uses `prepareNarrationAudio(segment, audioPath)` in `produceChildhoodWonderVideo.ts` to prepare each segment's audio.

Implemented behavior:

1. If `AUTHORIZED_VOICE_SPEAKER_ID` is set, the script checks:
   - `storage/voice-rights/<speaker-id>/generated-audio/segment-XX-audio.mp3`
2. If the authorized MP3 exists:
   - copy it into `storage/childhood-wonder-video/segment-XX-audio.mp3`
   - derive timing from that MP3
3. If the authorized MP3 is missing and `AUTHORIZED_VOICE_REQUIRED` is not `true`:
   - fall back to Edge-TTS
4. If the authorized MP3 is missing and `AUTHORIZED_VOICE_REQUIRED=true`:
   - fail fast instead of falling back
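
The decision described in these steps can be expressed as one pure function. The sketch below is not the code in `produceChildhoodWonderVideo.ts`; `resolveAudioSource`, `AudioSource`, and `AuthorizedVoiceSettings` are hypothetical names. It also assumes that `XX` is a two-digit, zero-padded segment number and that an unset speaker ID means Edge-TTS is used, neither of which is stated explicitly here.

```typescript
// Where the narration audio for a segment should come from.
type AudioSource =
  | { kind: "authorized"; sourcePath: string }
  | { kind: "edge-tts" };

// Hypothetical settings object derived from the AUTHORIZED_VOICE_* variables.
interface AuthorizedVoiceSettings {
  speakerId: string | undefined;
  required: boolean;
}

function resolveAudioSource(
  segmentNumber: number,
  settings: AuthorizedVoiceSettings,
  exists: (path: string) => boolean,
): AudioSource {
  // Assumption: with no speaker ID configured, the script uses Edge-TTS.
  if (!settings.speakerId) {
    return { kind: "edge-tts" };
  }
  // Assumption: XX is a zero-padded two-digit segment number.
  const xx = segmentNumber < 10 ? `0${segmentNumber}` : String(segmentNumber);
  const sourcePath =
    `storage/voice-rights/${settings.speakerId}/generated-audio/segment-${xx}-audio.mp3`;
  if (exists(sourcePath)) {
    return { kind: "authorized", sourcePath };
  }
  if (settings.required) {
    // Fail fast: never substitute another voice when the authorized one is required.
    throw new Error(`Authorized audio missing: ${sourcePath}`);
  }
  return { kind: "edge-tts" };
}
```

Passing `exists` as a parameter (for example `existsSync` from `node:fs`) keeps the function free of file-system access, so each of the four cases can be checked in isolation.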

Remaining provider step:

- Add a provider-specific command or script that turns `segment.narration` into the generated MP3 files under `storage/voice-rights/<speaker-id>/generated-audio/`.

Recommended env vars:

```bash
AUTHORIZED_VOICE_SPEAKER_ID=teacher-liu-authorized
AUTHORIZED_VOICE_REQUIRED=false
AUTHORIZED_VOICE_PROVIDER=
```
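
One way to read these variables is sketched below. `readAuthorizedVoiceConfig` and `AuthorizedVoiceConfig` are hypothetical names, and treating only the exact string `true` as "required" and an empty value as "not set" are assumptions about the intended semantics.

```typescript
// Hypothetical parsed form of the AUTHORIZED_VOICE_* variables.
interface AuthorizedVoiceConfig {
  speakerId: string | undefined;
  required: boolean;
  provider: string | undefined;
}

// Treats missing, empty, and whitespace-only values as "not set".
function nonEmpty(value: string | undefined): string | undefined {
  const trimmed = value === undefined ? "" : value.trim();
  return trimmed === "" ? undefined : trimmed;
}

function readAuthorizedVoiceConfig(
  env: Record<string, string | undefined>,
): AuthorizedVoiceConfig {
  return {
    speakerId: nonEmpty(env["AUTHORIZED_VOICE_SPEAKER_ID"]),
    // Assumption: only the exact string "true" enables fail-fast behavior.
    required: env["AUTHORIZED_VOICE_REQUIRED"] === "true",
    provider: nonEmpty(env["AUTHORIZED_VOICE_PROVIDER"]),
  };
}
```

In a Node script this would typically be called as `readAuthorizedVoiceConfig(process.env)`. The function reads only these three variables, so provider secrets never enter the parsed config or any log line built from it.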

Provider-specific env vars should be added only after choosing the provider.