TTS Speaker API
- tetos.get_speaker(name: str) -> type[Speaker]
Get a speaker by name.
- Parameters:
name (str) – The lowercase name of the speaker.
- Raises:
ValueError – If the speaker is not found.
- Returns:
The speaker class.
- Return type:
type[Speaker]
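The lookup contract described above (lowercase name in, speaker class out, `ValueError` on a miss) can be sketched with a plain dictionary registry. This is an illustrative stand-in, not the tetos implementation: `Speaker`, `EchoSpeaker`, `_REGISTRY`, and `lookup_speaker` are hypothetical names invented for the example.

```python
class Speaker:
    """Hypothetical base class standing in for the library's Speaker."""


class EchoSpeaker(Speaker):
    """Hypothetical speaker used only to populate the registry."""


# Hypothetical registry keyed by lowercase speaker name.
_REGISTRY: dict[str, type[Speaker]] = {"echo": EchoSpeaker}


def lookup_speaker(name: str) -> type[Speaker]:
    """Return the speaker class registered under ``name``.

    Raises ValueError when no speaker is registered under that name.
    """
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"Speaker {name!r} not found") from None
```

Because the function returns a class rather than an instance, the caller still has to construct the speaker with whatever arguments that class requires.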
Azure
- class tetos.azure.AzureSpeaker(speech_key: str, speech_region: str, voice: str | None = None)
Azure TTS speaker.
- Parameters:
- classmethod get_command() -> Command
Return a Click command for the speaker.
- Returns:
The Click command.
- Return type:
Command
- say(text: str, out_file: str | Path | None = None, lang: str = 'en-US') -> float
A synchronous version of synthesize().
- async stream(text: str, lang: str = 'en-US') -> AsyncGenerator[bytes, None]
Generate speech from text as a stream.
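A method typed as `AsyncGenerator[bytes, None]` is consumed with `async for`, and a blocking caller can drive it through `asyncio.run`. The sketch below shows that pattern with a stand-in generator so it runs without credentials or network access. `fake_stream`, `collect`, and `collect_sync` are hypothetical helpers, not part of tetos.

```python
import asyncio
from collections.abc import AsyncGenerator, AsyncIterator


async def fake_stream(text: str) -> AsyncGenerator[bytes, None]:
    """Hypothetical stand-in for a speaker's stream(): yields byte chunks."""
    for word in text.split():
        yield word.encode("utf-8")


async def collect(chunks: AsyncIterator[bytes]) -> bytes:
    """Join every chunk of an async byte stream into one buffer."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
    return bytes(buffer)


def collect_sync(text: str) -> bytes:
    """Blocking wrapper that runs the async pipeline to completion."""
    return asyncio.run(collect(fake_stream(text)))
```

With a real speaker, the same `collect` coroutine should accept the object returned by `stream(...)`, assuming the method behaves as its signature suggests.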
Baidu
- class tetos.baidu.BaiduSpeaker(api_key: str, secret_key: str, voice: str | None = None, speed: int = 5, pitch: int = 5, volume: int = 5)
Baidu TTS speaker.
- Parameters:
api_key (str) – The Baidu API key.
secret_key (str) – The Baidu secret key.
voice (str) – The voice to use.
speed (int) – The speed of speech, from 0 to 15. Defaults to 5.
pitch (int) – The pitch of speech, from 0 to 15. Defaults to 5.
volume (int) – The volume of speech, from 0 to 9 (basic) or from 0 to 15 (high quality). Defaults to 5.
- classmethod get_command() -> Command
Return a Click command for the speaker.
- Returns:
The Click command.
- Return type:
Command
- say(text: str, out_file: str | Path | None = None, lang: str = 'en-US') -> float
A synchronous version of synthesize().
- async stream(text: str, lang: str = 'en-US') -> AsyncGenerator[bytes, None]
Generate speech from text as a stream.
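The documented ranges for `speed`, `pitch`, and `volume` can be checked before a speaker is constructed. The following is a minimal sketch of such a check, not library code: `check_baidu_params` is a hypothetical helper, and the `high_quality` flag is an assumption added for the example, since the constructor exposes no such parameter and the source does not say how the voice tier is determined.

```python
def check_baidu_params(
    speed: int = 5,
    pitch: int = 5,
    volume: int = 5,
    high_quality: bool = False,  # hypothetical flag, not a BaiduSpeaker argument
) -> None:
    """Raise ValueError if a value falls outside its documented range."""
    # Volume tops out at 9 for basic voices and 15 for high-quality voices.
    volume_max = 15 if high_quality else 9
    limits = {
        "speed": (speed, 15),
        "pitch": (pitch, 15),
        "volume": (volume, volume_max),
    }
    for name, (value, upper) in limits.items():
        if not 0 <= value <= upper:
            raise ValueError(f"{name} must be in [0, {upper}], got {value}")
```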
Edge
- class tetos.edge.EdgeSpeaker(voice: str | None = None, rate: str = '+0%', pitch: str = '+0Hz', volume: str = '+0%')
Edge TTS speaker.
- Parameters:
- classmethod get_command() -> Command
Return a Click command for the speaker.
- Returns:
The Click command.
- Return type:
Command
- say(text: str, out_file: str | Path | None = None, lang: str = 'en-US') -> float
A synchronous version of synthesize().
- async stream(text: str, lang: str = 'en-US') -> AsyncGenerator[bytes, None]
Generate speech from text as a stream.
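The defaults `'+0%'` and `'+0Hz'` suggest that `rate`, `pitch`, and `volume` are signed offset strings. Assuming that format, the sketch below builds such strings from integers with Python's `+d` format specifier, which always emits the sign. `signed_percent` and `signed_hertz` are hypothetical helpers, and the source does not state which other forms, if any, are accepted.

```python
def signed_percent(value: int) -> str:
    """Format an integer offset as a signed percentage, e.g. '+0%'."""
    return f"{value:+d}%"


def signed_hertz(value: int) -> str:
    """Format an integer offset as a signed frequency, e.g. '+0Hz'."""
    return f"{value:+d}Hz"
```

For example, `signed_percent(-10)` yields `'-10%'` and `signed_hertz(5)` yields `'+5Hz'`.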
Google
Synthesizes speech from the input string of text.
- class tetos.google.GoogleSpeaker(*, voice: str | None = None, speaking_rate: float = 1.0, pitch: float = 0.0, volume_gain_db: float = 0.0)
Google TTS speaker.
- Parameters:
voice (str) – The voice to use. Defaults to “en-US-Studio-M”.
speaking_rate (float) – Optional. Input only. Speaking rate/speed, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice. 2.0 is twice as fast, and 0.5 is half as fast. If unset (0.0), defaults to the native 1.0 speed. Any other value below 0.25 or above 4.0 returns an error.
pitch (float) – Optional. Input only. Speaking pitch, in the range [-20.0, 20.0]. 20 means increase 20 semitones from the original pitch. -20 means decrease 20 semitones from the original pitch.
volume_gain_db (float) – Optional. Input only. Volume gain (in dB) of the normal native volume supported by the specific voice, in the range [-96.0, 16.0]. If unset, or set to a value of 0.0 (dB), will play at normal native signal amplitude. A value of -6.0 (dB) will play at approximately half the amplitude of the normal native signal amplitude. A value of +6.0 (dB) will play at approximately twice the amplitude of the normal native signal amplitude. It is strongly recommended not to exceed +10 (dB), as there is usually no effective increase in loudness for any value greater than that.
- classmethod get_command() -> Command
Return a Click command for the speaker.
- Returns:
The Click command.
- Return type:
Command
- say(text: str, out_file: str | Path | None = None, lang: str = 'en-US') -> float
A synchronous version of synthesize().
- async stream(text: str, lang: str = 'en-US') -> AsyncGenerator[bytes, None]
Generate speech from text as a stream.
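The "approximately half" and "approximately twice" figures quoted for `volume_gain_db` follow from the standard decibel relation for amplitude, ratio = 10^(dB / 20). The helper below is a small worked check of that arithmetic; `gain_db_to_amplitude_ratio` is a hypothetical name and is not part of tetos or the Google client.

```python
def gain_db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude ratio."""
    return 10 ** (gain_db / 20)
```

Evaluating it at -6.0 gives roughly 0.50 and at +6.0 roughly 2.00, matching the documented description, while 0.0 gives exactly 1.0.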
OpenAI
- class tetos.openai.OpenAISpeaker(*, model: str = 'tts-1', voice: str | None = None, speed: float | None = None, api_key: str | None, api_base: str | None)
OpenAI TTS speaker.
- Parameters:
- classmethod get_command() -> Command
Return a Click command for the speaker.
- Returns:
The Click command.
- Return type:
Command
- say(text: str, out_file: str | Path | None = None, lang: str = 'en-US') -> float
A synchronous version of synthesize().
- async stream(text: str, lang: str = 'en-US') -> AsyncGenerator[bytes, None]
Generate speech from text as a stream.
Volcengine
- class tetos.volc.VolcSpeaker(access_key: str, secret_key: str, app_key: str, *, voice: str | None = None, sample_rate: int = 24000, speech_rate: int = 0, pitch_rate: int = 0)
Volcengine TTS speaker.
- Parameters:
access_key (str) – The access key ID.
secret_key (str) – The access secret key.
app_key (str) – The app key.
voice (str, optional) – The voice to use.
sample_rate (int, optional) – The sample rate. Available values: [8000, 16000, 22050, 24000, 32000, 44100, 48000]. Defaults to 24000.
speech_rate (int, optional) – The speech rate. It should be in range [-50,100]. 100 means 2x speed and -50 means half speed. Defaults to 0.
pitch_rate (int, optional) – The pitch rate. It should be in range [-12,12]. Defaults to 0.
- classmethod get_command() -> Command
Return a Click command for the speaker.
- Returns:
The Click command.
- Return type:
Command
- say(text: str, out_file: str | Path | None = None, lang: str = 'en-US') -> float
A synchronous version of synthesize().
- async stream(text: str, lang: str = 'en-US') -> AsyncGenerator[bytes, None]
Generate speech from text as a stream.
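The three reference points given for `speech_rate` (-50 is half speed, 0 is the default, 100 is 2x speed) all lie on the line 1 + rate / 100. The sketch below uses that linear mapping to turn a `speech_rate` value into a speed multiplier. The linearity between the documented points is an assumption, and `speech_rate_to_multiplier` is a hypothetical helper rather than part of tetos.

```python
def speech_rate_to_multiplier(speech_rate: int) -> float:
    """Map a speech_rate in [-50, 100] to an approximate speed multiplier.

    Assumes a linear relation through the documented points
    (-50 -> 0.5x, 0 -> 1.0x, 100 -> 2.0x).
    """
    if not -50 <= speech_rate <= 100:
        raise ValueError(f"speech_rate must be in [-50, 100], got {speech_rate}")
    return 1 + speech_rate / 100
```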
Minimax
- class tetos.minimax.MinimaxSpeaker(api_key: str, group_id: str, model: str = 'speech-01', voice: str | None = None, timber_weights: list[TimberWeight] | None = None, speed: float | None = None, vol: float | int | None = None, pitch: int | None = None)
MiniMax TTS speaker.
- Parameters:
api_key (str) – The MiniMax API key.
group_id (str) – The MiniMax group ID.
model (str) – The model to use. Defaults to “speech-01”.
voice (str) – The voice to use.
timber_weights (list[TimberWeight]) – The timbre weights.
speed (float) – The speed of speech. Range [0.5, 2.0]. Defaults to 1.0.
vol (float | int) – The volume of speech. Range (0, 10]. Defaults to 1.
pitch (int) – The pitch of speech. Range [-12, 12]. Defaults to 0.
- classmethod get_command() -> Command
Return a Click command for the speaker.
- Returns:
The Click command.
- Return type:
Command
- say(text: str, out_file: str | Path | None = None, lang: str = 'en-US') -> float
A synchronous version of synthesize().
- async stream(text: str, lang: str = 'en-US') -> AsyncGenerator[bytes, None]
Generate speech from text as a stream.
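The signature defaults `speed`, `vol`, and `pitch` to `None`, while the parameter descriptions list defaults of 1.0, 1, and 0. One way to reconcile the two is to treat `None` as "use the documented default" and then check each range, noting that the volume range (0, 10] excludes zero. The sketch below does that. `resolve_minimax_params` is a hypothetical helper, and treating `None` this way is an assumption: the library may instead omit unset fields and let the service apply its own defaults.

```python
from typing import Optional, Union


def resolve_minimax_params(
    speed: Optional[float] = None,
    vol: Union[float, int, None] = None,
    pitch: Optional[int] = None,
) -> dict:
    """Fill in documented defaults and validate the documented ranges."""
    speed = 1.0 if speed is None else speed
    vol = 1 if vol is None else vol
    pitch = 0 if pitch is None else pitch
    if not 0.5 <= speed <= 2.0:
        raise ValueError(f"speed must be in [0.5, 2.0], got {speed}")
    # Half-open interval: zero is excluded, ten is included.
    if not 0 < vol <= 10:
        raise ValueError(f"vol must be in (0, 10], got {vol}")
    if not -12 <= pitch <= 12:
        raise ValueError(f"pitch must be in [-12, 12], got {pitch}")
    return {"speed": speed, "vol": vol, "pitch": pitch}
```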