I'm using AsyncFishAudio with tts.stream_websocket to stream TTS audio over WebSocket in a conversational application.
When generating relatively long responses, I frequently hit a WebSocketNetworkError coming from httpx_ws during ws.receive_bytes(). After some investigation, I found that increasing keepalive_ping_timeout_seconds on the underlying aconnect_ws call from the default 20 seconds to 60 seconds completely eliminates these errors for my use case.
Right now, the only way I can change this timeout is by modifying the library source directly, which is not maintainable.
Environment
- Library: fishaudio (AsyncFishAudio)
- Feature: AsyncTTSClient.stream_websocket / client.tts.stream_websocket(...)
- Transport: WebSocket via httpx_ws.aconnect_ws
- Use case: Long-form, streamed conversational TTS (responses can be quite long)
Current behavior
AsyncTTSClient.stream_websocket internally calls:
    async with aconnect_ws(
        "/v1/tts/live",
        client=self._client.client,
        headers={"model": model, "Authorization": f"Bearer {self._client.api_key}"}
    ) as ws:
        ...
- For long TTS generations, there can be periods where no audio chunks or other frames are received for more than 20 seconds.
- In those cases, httpx_ws raises a WebSocketNetworkError, which bubbles up to my application and breaks the TTS stream.
Workaround
If I patch the library locally and change the aconnect_ws call to:
    async with aconnect_ws(
        "/v1/tts/live",
        client=self._client.client,
        headers={"model": model, "Authorization": f"Bearer {self._client.api_key}"},
        keepalive_ping_timeout_seconds=60,
    ) as ws:
...then the WebSocketNetworkError no longer occurs, and long TTS responses stream successfully. However, this requires modifying the installed package, which is fragile and hard to maintain across upgrades.
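A possible stopgap that avoids editing the installed files is to rebind the name at runtime. This is only a sketch and I have not verified it against the SDK: it assumes the fishaudio module that defines stream_websocket looks up aconnect_ws as a module-level name (for example via `from httpx_ws import aconnect_ws`). `patch_keepalive` is a hypothetical helper of mine, and `fake_sdk` is a stand-in used only to show the effect.

```python
import functools
import types


def patch_keepalive(module, timeout_seconds, attr="aconnect_ws"):
    """Rebind module.<attr> to a partial that pre-sets the keepalive timeout.

    Returns the original callable so the patch can be undone later.
    Hypothetical helper: not part of fishaudio or httpx_ws.
    """
    original = getattr(module, attr)
    patched = functools.partial(
        original, keepalive_ping_timeout_seconds=timeout_seconds
    )
    setattr(module, attr, patched)
    return original


# Stand-in for the SDK module; it simply echoes the keyword arguments it receives.
fake_sdk = types.SimpleNamespace(aconnect_ws=lambda url, **kwargs: kwargs)

original = patch_keepalive(fake_sdk, 60)
result = fake_sdk.aconnect_ws("/v1/tts/live", headers={})

# Undo the patch by restoring the saved callable.
fake_sdk.aconnect_ws = original
```

In a real application the first argument would be the SDK module that contains the aconnect_ws call, not httpx_ws itself. This still depends on SDK internals, so it does not replace a supported option.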
Requested / expected behavior
It would be great if the keepalive timeout were configurable from the public API, for example by:
- Adding an optional parameter to:
    async def stream_websocket(
        self,
        text_stream: AsyncIterable[Union[str, TextEvent, FlushEvent]],
        *,
        reference_id: Optional[str] = None,
        references: Optional[List[ReferenceAudio]] = None,
        format: Optional[AudioFormat] = None,
        latency: Optional[LatencyMode] = None,
        speed: Optional[float] = None,
        config: TTSConfig = TTSConfig(),
        model: Model = "s1",
        keepalive_ping_timeout_seconds: int | None = None,  # for example
    ):
and passing it through to aconnect_ws, with a sensible default (e.g. current behavior).
- Or alternatively, exposing this via some configuration or RequestOptions-like object.
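To illustrate the first option, the pass-through could be as small as the sketch below. `build_ws_kwargs` is a hypothetical helper, not existing fishaudio code. It forwards the timeout only when the caller sets it, so leaving it unset keeps whatever default httpx_ws applies today.

```python
from typing import Any, Dict, Optional


def build_ws_kwargs(
    model: str,
    api_key: str,
    keepalive_ping_timeout_seconds: Optional[float] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for aconnect_ws (hypothetical helper).

    The keepalive key is added only when a value is given, so the
    library default stays in effect otherwise.
    """
    kwargs: Dict[str, Any] = {
        "headers": {"model": model, "Authorization": f"Bearer {api_key}"},
    }
    if keepalive_ping_timeout_seconds is not None:
        kwargs["keepalive_ping_timeout_seconds"] = keepalive_ping_timeout_seconds
    return kwargs


default_kwargs = build_ws_kwargs("s1", "KEY")
long_kwargs = build_ws_kwargs("s1", "KEY", keepalive_ping_timeout_seconds=60)
```

Inside stream_websocket, the result could then be unpacked into the existing call as `aconnect_ws("/v1/tts/live", client=self._client.client, **kwargs)`.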
Questions
- Is keepalive_ping_timeout_seconds intended to be user-configurable for long-running TTS streams?
- Would you be open to a PR that adds an optional parameter (or configuration mechanism) to control this timeout without patching the library source?
- Is there a recommended pattern in this library for configuring WebSocket-level timeouts for TTS streaming?
Having an official way to configure this timeout would make it much easier to support long-form conversational TTS without resorting to local patches. Thanks!