SuperSonic WebUI

SuperSonic WebUI is a local text-to-speech gateway and browser interface for Supertone's Supertonic TTS engine. It exposes a mostly OpenAI-compatible /v1/audio/speech endpoint, maps OpenAI-style voice names to Supertonic voices, and includes a compact WebUI for generating speech, changing language, selecting output format, and using simulated streaming playback.

The app runs locally with FastAPI and serves both the API and static WebUI from the same process.

Features

Local Supertonic TTS through supertonic>=1.2.0
OpenAI-style speech endpoint: POST /v1/audio/speech
WebUI at /
Voice aliases: alloy, echo, fable, nova, onyx, shimmer
Native Supertonic voices: M1-M5, F1-F5
Output formats: wav, mp3, opus, flac, pcm
Simulated streaming endpoint: POST /v1/audio/stream
31 language options
Text normalization for numbers, currencies, units, and phone numbers
Custom voice-style JSON upload support

Requirements

Python 3.10+ recommended
ffmpeg for MP3, Opus, and FLAC output
The first Supertonic model load may download model assets into ~/.cache/supertonic3

Local Setup

python3 -m venv .venv
.venv/bin/python -m pip install -r requirements.txt
.venv/bin/python server.py

Open:

http://127.0.0.1:8880

Health check:

curl http://127.0.0.1:8880/health

Docker

Build and run with Docker Compose:

docker compose up -d --build

Open:

http://127.0.0.1:8880

The compose file keeps the Supertonic model cache in a Docker volume so the model is not downloaded every time:

supertonic-cache:/root/.cache

Custom uploaded voices are persisted through:

./voices:/app/voices

Stop the container:

docker compose down

Stop the container and also remove the model cache volume:

docker compose down -v

API Usage

Generate WAV audio:

curl -o speech.wav \
  -X POST http://127.0.0.1:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello from SuperSonic.",
    "voice": "alloy",
    "response_format": "wav",
    "speed": 1.0,
    "language": "en",
    "steps": 8
  }'
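
The same request can be made from Python using only the standard library. This is a minimal sketch, assuming the server is reachable at the default local address used in this README; `build_speech_request` and `synthesize` are hypothetical helper names for illustration, not part of SuperSonic.

```python
import json
import urllib.request

# Assumption: the default local address used elsewhere in this README.
BASE_URL = "http://127.0.0.1:8880"


def build_speech_request(text, voice="alloy", response_format="wav",
                         speed=1.0, language="en", steps=8, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {
        "model": "tts-1",
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
        "language": language,
        "steps": steps,
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path, **options):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **options)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

With the server running, `synthesize("Hello from SuperSonic.", "speech.wav")` should write the same file as the curl command above.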

List voices and languages:

curl http://127.0.0.1:8880/v1/audio/voices

List models:

curl http://127.0.0.1:8880/v1/models

Stream simulated PCM frames over Server-Sent Events:

curl -N \
  -X POST "http://127.0.0.1:8880/v1/audio/stream?chunk_size=300" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "This is a simulated streaming test.",
    "voice": "echo",
    "language": "en",
    "speed": 1.0,
    "steps": 8
  }'
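
A client has to split the SSE response into individual events before it can use the frames. The sketch below is a generic Server-Sent Events reader that yields the `data:` payload of each event. It does not decode the payload, because the layout of SuperSonic's event data is not documented in this README; `iter_sse_data` is a hypothetical helper name.

```python
def iter_sse_data(lines):
    """Yield the data payload of each Server-Sent Event from an iterable of text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
        elif line.startswith(":"):
            # Comment or keep-alive line: skip it.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # Lenient choice: flush a trailing event even without a final blank line.
    if data_parts:
        yield "\n".join(data_parts)
```

To read from a live response, decode each line of the HTTP body as UTF-8 and pass the resulting iterable to `iter_sse_data`. How each payload should be interpreted (for example as JSON or encoded PCM) depends on the server and is left open here.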

OpenAI Compatibility

Compatible with basic text-to-speech clients that call:

POST /v1/audio/speech

Supported OpenAI-style fields:

model
input
voice
response_format
speed

SuperSonic-specific fields:

language
steps

Important differences:

model is accepted, but the app always uses Supertonic regardless of its value.
Authentication is not required or validated.
Error responses are FastAPI-style, not exact OpenAI error objects.
Streaming uses a custom /v1/audio/stream SSE format.
Transcription and translation endpoints are not implemented.
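
Clients written against OpenAI may expect errors shaped like `{"error": {"message": ...}}`, while FastAPI's default error bodies use a top-level `detail` key. The sketch below reads a message from either shape. It assumes SuperSonic returns FastAPI's default error bodies, which this README does not spell out; `error_message` is a hypothetical helper name.

```python
def error_message(body):
    """Return a readable message from a parsed JSON error body."""
    if not isinstance(body, dict):
        return str(body)
    # OpenAI-style: {"error": {"message": "...", ...}}
    error = body.get("error")
    if isinstance(error, dict) and "message" in error:
        return str(error["message"])
    # FastAPI-style: {"detail": "..."} or {"detail": [{"loc": [...], "msg": "..."}]}
    detail = body.get("detail")
    if isinstance(detail, str):
        return detail
    if isinstance(detail, list):
        parts = []
        for item in detail:
            if isinstance(item, dict) and "msg" in item:
                loc = ".".join(str(p) for p in item.get("loc", []))
                parts.append(f"{loc}: {item['msg']}" if loc else str(item["msg"]))
            else:
                parts.append(str(item))
        return "; ".join(parts)
    # Unknown shape: fall back to the raw body.
    return str(body)
```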

Configuration

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `SUPERSONIC_HOST` | `0.0.0.0` | Bind address |
| `SUPERSONIC_PORT` | `8880` | HTTP port |
| `SUPERSONIC_LOG_LEVEL` | `INFO` | Python logging level |
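
As an illustration of how these variables and their defaults fit together, the sketch below resolves them from the environment. It is not taken from `server.py`; `load_config` and the returned dictionary keys are hypothetical.

```python
import os

# Defaults as listed in the table above.
DEFAULTS = {
    "SUPERSONIC_HOST": "0.0.0.0",
    "SUPERSONIC_PORT": "8880",
    "SUPERSONIC_LOG_LEVEL": "INFO",
}


def load_config(environ=None):
    """Resolve settings from the environment, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    return {
        "host": env.get("SUPERSONIC_HOST", DEFAULTS["SUPERSONIC_HOST"]),
        # Ports arrive as strings in the environment, so convert explicitly.
        "port": int(env.get("SUPERSONIC_PORT", DEFAULTS["SUPERSONIC_PORT"])),
        "log_level": env.get("SUPERSONIC_LOG_LEVEL", DEFAULTS["SUPERSONIC_LOG_LEVEL"]).upper(),
    }
```

For example, starting the server with `SUPERSONIC_PORT=9000` set in the environment would be expected to move the HTTP port from 8880 to 9000.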

Custom Voices

Upload a Supertonic voice-style JSON file from the WebUI, or call:

curl -X POST http://127.0.0.1:8880/v1/audio/voices/upload \
  -F "file=@voice.json"

Uploaded voices are stored in voices/ and can be used by name in voice.
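
The upload can also be scripted in Python. The standard library has no multipart helper, so this sketch builds the `multipart/form-data` body by hand, using the same `file` form field as the curl command. `build_upload_request` is a hypothetical helper name, and the `application/json` part type is an assumption.

```python
import uuid
import urllib.request


def build_upload_request(filename, content, base_url="http://127.0.0.1:8880"):
    """Build (but do not send) a multipart upload request for a voice-style JSON file."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/json\r\n\r\n"  # assumed part type
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/audio/voices/upload",
        data=head + content + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

To send it, read the file as bytes, pass them as `content`, and hand the resulting request to `urllib.request.urlopen`.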

Notes

WAV and PCM do not require ffmpeg.
MP3, Opus, and FLAC require ffmpeg.
The streaming endpoint simulates streaming: Supertonic first synthesizes each text chunk in full, then the server slices the generated PCM into small frames for browser playback.
CPU inference works, but generation speed depends on machine performance.
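
The frame-slicing step of simulated streaming can be sketched as follows. This is an illustration, not the code in `server.py`: it assumes 16-bit PCM samples (2 bytes each), and `slice_pcm` and `frame_samples` are hypothetical names.

```python
def slice_pcm(pcm, frame_samples, sample_width=2):
    """Yield consecutive frames of already-synthesized PCM audio.

    sample_width=2 assumes 16-bit samples; the last frame may be shorter.
    """
    if frame_samples <= 0 or sample_width <= 0:
        raise ValueError("frame_samples and sample_width must be positive")
    frame_bytes = frame_samples * sample_width
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]
```

Because the audio for a text chunk already exists before slicing starts, the delay before the first frame is the synthesis time of that chunk, not of a single frame.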
