F5-TTS Thai

In development

Self-hosted Thai TTS inference service using the F5-TTS flow-matching model, served via REST API with output stored on Cloudflare R2

Role: Developer

Python · FastAPI · Docker · Cloudflare R2

Overview

F5-TTS Thai is a self-hosted Text-to-Speech inference service for the Thai language, built out of the need for high-quality TTS without relying on third-party APIs that charge per request and handle Thai poorly.

The service runs F5-TTS, an open-source flow-matching TTS model that produces natural-sounding speech. It accepts Thai text and either streams the audio back or uploads the result to Cloudflare R2 and returns its URL.

Why Self-Hosted

  • Mainstream TTS services charge per character
  • Their Thai support is poor (wrong prosody, unnatural accent)
  • Self-hosting gives full control over voice, speed, and pitch
  • Output is stored on R2 and served immediately via cdn.kksz.dev

API

POST /tts
Content-Type: application/json
 
{
  "text": "สวัสดีครับ นี่คือบริการ TTS ภาษาไทย",
  "speaker": "default",
  "speed": 1.0
}
 
# Response: audio stream (WAV/MP3) or R2 URL
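
As a rough client-side illustration of the request above, the sketch below builds the POST /tts call with Python's standard library. The base URL is a placeholder (the page does not say where the service is hosted), and the helper that tells an audio stream apart from an R2 URL response relies on an assumed convention: that streamed audio arrives with an audio/* content type.

```python
import json
import urllib.request

# Hypothetical base URL; the page does not say where the service is hosted.
BASE_URL = "http://localhost:8000"


def build_tts_request(text, speaker="default", speed=1.0, base_url=BASE_URL):
    """Build (but do not send) a POST /tts request with the JSON body shown above."""
    payload = {"text": text, "speaker": speaker, "speed": speed}
    # ensure_ascii=False keeps the Thai text readable in the encoded body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/tts",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_audio_response(content_type):
    """Assumed convention: audio/* means streamed audio; anything else carries an R2 URL."""
    return content_type.split(";")[0].strip().lower().startswith("audio/")


req = build_tts_request("สวัสดีครับ นี่คือบริการ TTS ภาษาไทย")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen(req) would send it. The response's Content-Type header can then be checked with is_audio_response to decide whether to save the bytes as audio or read a URL from the body.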

Stack

Model:   F5-TTS (flow-matching, open-source)
API:     Python + FastAPI
Runtime: Docker
Storage: Cloudflare R2
         bucket: tts-kimss (APAC)
         CDN:    https://cdn.kksz.dev
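
To make the storage path more concrete, here is a minimal sketch of how a generated file could map to a public CDN URL. Only the CDN host comes from the stack listing above. The key layout (a tts/ prefix plus a hash of the request parameters) is a hypothetical naming scheme, and the sketch assumes the CDN maps URL paths one-to-one onto bucket keys.

```python
import hashlib
from urllib.parse import quote

CDN_BASE = "https://cdn.kksz.dev"  # CDN host from the stack listing


def object_key(text, speaker="default", speed=1.0, ext="wav"):
    """Hypothetical naming scheme: hash the request parameters into a stable key."""
    digest = hashlib.sha256(f"{speaker}|{speed}|{text}".encode("utf-8")).hexdigest()
    return f"tts/{speaker}/{digest[:16]}.{ext}"


def cdn_url(key):
    """Public URL for a key, assuming CDN paths map one-to-one onto bucket keys."""
    # quote() leaves "/" unescaped by default, so the key's path structure is kept.
    return f"{CDN_BASE}/{quote(key)}"


key = object_key("สวัสดีครับ")
print(cdn_url(key))
```

One consequence of a deterministic key like this is that identical requests resolve to the same object, so a service could check R2 for an existing file before running inference again. Whether the actual service does this is not stated here.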