Balabolka :: Utility for speech synthesis services

This command line application lets you use online text-to-speech services: text files or subtitles can be converted to audio files. The utility can also be used for testing, to help you choose a cloud service that meets your needs. A separate application for Yandex SpeechKit is available for download, because Yandex is a Russian IT company with close government ties.

Online services with speech technologies:

Google Cloud TTS;
Amazon Polly;
Baidu TTS;
CereVoice Cloud;
Descript TTS;
IBM Watson TTS;
Iciba TTS;
iTranslate TTS;
Microsoft Azure;
Naver TTS;
OpenAI TTS;
Youdao TTS;
Yandex SpeechKit.

Download Balabolka (application for online services)

License: Freeware

Command line utility for Yandex SpeechKit: Download
The program converts text or subtitles to audio files using the Yandex service.
To perform operations through the Yandex API, you must authenticate with an API key.

Command line

The utility accepts a number of command line parameters for reading text aloud or saving it as an audio file. Parameters must be separated by a space and begin with "-" (a hyphen). Run bal4web with the -? or -h option to get help on the command line syntax and parameters.

-s service_name: Sets the name of the online TTS service ("google" or "g", "amazon" or "a", "baidu" or "b", "cerevoice" or "c", "descript" or "d", "ibm" or "i", "iciba" or "k", "itranslate" or "t", "microsoft" or "m", "naver" or "n", "openai" or "o", "youdao" or "y"). The default is "google".
-l language_name: Sets the language name for the online TTS service. The name combines an ISO 639 two-letter lowercase culture code for the language with an ISO 3166 two-letter uppercase subculture code for the country or region. For example: fr-FR, en-US, de-DE. The default is "en-US".
Note: Descript TTS and OpenAI TTS identify the language of the input text themselves, so these services currently ignore this option. They can recognize several dozen languages on their own.
-g gender: Sets the gender for the online TTS service (if supported). The available values are "female" or "f", and "male" or "m". No default value is defined. This parameter is supported by the following services: Amazon Polly, CereProc TTS, Descript TTS, Google TTS, IBM Watson TTS, iTranslate TTS, Microsoft Azure, Naver TTS, OpenAI TTS. If a voice name is specified, there is no need to set its gender.
-n voice_name: Sets the voice name for the online TTS service (if supported). No default value is defined. This parameter is supported by the following services: Amazon Polly, CereProc TTS, Descript TTS, Google Cloud TTS, IBM Watson TTS, Microsoft Azure, Naver TTS, OpenAI TTS.
-r speech_rate: Sets the rate of the synthesized speech (if supported).
The default is "1.0" (the average rate of spoken speech).
Amazon Polly: from "0.20" to "2.00".
CereProc TTS: from "0.30" to "4.00".
Descript TTS, Naver TTS, OpenAI TTS, Youdao TTS: from "0.70" to "2.00".
Google TTS, IBM Watson TTS, Microsoft Azure: from "0.10" to "3.00".
Google Cloud: from "0.25" to "4.00".
iTranslate TTS: from "0.50" to "2.00".
-p integer: Sets the speaking pitch in a range of -20 to 20 (if supported). The default is 0.
This option is supported by Amazon Polly, CereProc TTS, Google Cloud TTS, IBM Watson TTS, Microsoft Azure.
-v integer: Sets the volume in a range of 0 to 200 (the default is 100).
-st style: Sets the voice-specific speaking style. The voice can express emotions like cheerfulness, empathy or calmness. This option is supported by some voices in Microsoft Azure. Styles are not available if the WebSocket protocol for Microsoft Azure is used.
--style-degree style_degree or -sd style_degree: Sets the intensity of the speaking style in a range of "0.01" to "2.00" (for styles supported by Microsoft Azure). The default is "1.00". The option lets you specify a stronger or softer style to make the speech more expressive or more subdued.
-m: Prints the list of supported languages (with genders and voice names, if available) for the online TTS service.
-f file_name: Sets the name of the input text file. The command line may contain several -f options.
-fl file_name: Sets the name of a text file containing a list of input text files (one file name per line). The command line may contain several -fl options.
-w file_name: Sets the name of the output file in WAV format.
-c: Takes the text from the clipboard.
-t text: Takes the text from the command line. The command line may contain several -t options.
-i: Takes the text from the standard input stream (STDIN).
-o: Writes the audio data to the standard output stream (STDOUT). If this option is specified, the -w option is ignored.
--encoding encoding or -enc encoding: Sets the encoding of the text read from the standard input stream ("ansi", "utf8" or "unicode"). If the option is not specified, the program detects the text encoding.
--silence-begin integer or -sb integer: Sets the length of the pause at the beginning of the audio file (in milliseconds). The default is 0.
--silence-end integer or -se integer: Sets the length of the pause at the end of the audio file (in milliseconds). The default is 0.
-ln integer: Selects a line from the text file by its line number. Line numbering starts at "1". A range of numbers can be used to select more than one line (for example, "26-34"). The command line may contain several -ln options.
-e integer: Sets the length of pauses between sentences (in milliseconds). The value should be less than 5000. If the option is not specified, the service uses its default pauses between sentences. This parameter is supported by Microsoft Azure only.
-d file_name: Applies a dictionary for pronunciation correction (*.BXD, *.DIC or *.REX). The command line may contain several -d options.
-lrc: Creates the LRC file. Lyrics will be synchronized with the speech in the output audio file.
-srt: Creates the SRT file. Subtitles will be synchronized with the speech in the output audio file.
-sub: Treats the text as subtitles and converts it to an audio file, taking the specified pauses into account. The option may be useful when the -i or -c option is specified on the command line.
-host host_name: Sets the hostname of the proxy server.
-port integer: Sets the port number of the proxy server.
-fr integer: Sets the output audio sampling frequency in kHz (8, 11, 16, 22, 24, 32, 44, 48). If the option is not specified, the default value of the selected voice is used.
-ae audio_encoding: Sets the audio encoding for data returned by Google Cloud or Microsoft Azure ("linear16", "mp3" or "oggopus"). This setting can improve the sound quality. The option is available if the API key is specified. Using it without a specific need is not recommended: apply it for testing purposes only.
--ignore-square-brackets or -isb: Ignores text in [square brackets].
--ignore-curly-brackets or -icb: Ignores text in {curly brackets}.
--ignore-angle-brackets or -iab: Ignores text in <angle brackets>.
--ignore-round-brackets or -irb: Ignores text in (round brackets).
--ignore-url or -iu: Ignores URLs.
--ignore-comments or -ic: Ignores comments in the text. Single-line comments start with // and continue to the end of the line. Multiline comments start with /* and end with */.
-dp: Displays progress information in the console window.
-cfg file_name: Sets the name of the configuration file with the command line options (a text file where each line contains one option). If the option is not specified, the file bal4web.cfg in the same folder as the utility is used.
-h: Prints the list of command line options.
--lrc-length integer: Sets the maximum length of text lines for the LRC file (in characters).
--lrc-fname file_name: Sets the name of the LRC file. The option may be useful when the -o option is specified on the command line.
--lrc-enc encoding: Sets the encoding for the LRC file ("ansi", "utf8" or "unicode"). The default is "ansi".
--lrc-offset integer: Sets the time offset for the LRC file (in milliseconds).
--lrc-artist text: Sets an ID tag for the LRC file: artist.
--lrc-album text: Sets an ID tag for the LRC file: album.
--lrc-title text: Sets an ID tag for the LRC file: title.
--lrc-author text: Sets an ID tag for the LRC file: author.
--lrc-creator text: Sets an ID tag for the LRC file: creator of the LRC file.
--lrc-sent: Inserts blank lines after sentences in the LRC file.
--lrc-para: Inserts blank lines after paragraphs in the LRC file.
--srt-length integer: Sets the maximum length of text lines for the SRT file (in characters).
--srt-fname file_name: Sets the name of the SRT file. The option may be useful when the -o option is specified on the command line.
--srt-enc encoding: Sets the encoding for the SRT file ("ansi", "utf8" or "unicode"). The default is "ansi".
--raw: Outputs the audio data as raw PCM, without the WAV header. The option is used together with the -o option.
--ignore-length or -il: Omits the length of the audio data from the WAV header. The option is used together with the -o option.
--wss: Uses the WebSocket protocol for Microsoft Azure. This can improve the sound quality of audio files (24 kHz instead of 16 kHz). The option is ignored if the subscription key for Microsoft Azure Cognitive Services is defined. Use the -m option to check whether a voice supports the WebSocket protocol.
--sub-format text: Sets the subtitle format ("srt", "lrc", "ssa", "ass", "smi" or "vtt"). If the option is not specified, the format is determined from the extension of the subtitle file.
--sub-fit or -sf: Automatically increases the speech rate to fit the time intervals (when the program converts subtitles to an audio file). The SoundTouch library is used to change the tempo.
--sub-max integer or -sm integer: Sets the maximum speech rate in a range of 110% to 200% (when the program converts subtitles to an audio file). The program automatically increases the speech rate without exceeding this value.
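
The numeric limits listed for -r, -p, -v and --sub-max can be checked before a request is sent. The following Python sketch copies the ranges from the option descriptions above. It is an illustration only, not part of bal4web: the function names are hypothetical, "google_cloud" is a made-up key for the separate "Google Cloud" range, and mapping "CereProc TTS" to the "cerevoice" service name is an assumption.

```python
# Speech-rate limits copied from the -r option description.
# Keys follow the names accepted by -s, except "google_cloud",
# which is a hypothetical key for the separate "Google Cloud" range.
# Mapping "CereProc TTS" to "cerevoice" is an assumption.
RATE_LIMITS = {
    "amazon": (0.20, 2.00),
    "cerevoice": (0.30, 4.00),
    "descript": (0.70, 2.00),
    "naver": (0.70, 2.00),
    "openai": (0.70, 2.00),
    "youdao": (0.70, 2.00),
    "google": (0.10, 3.00),
    "ibm": (0.10, 3.00),
    "microsoft": (0.10, 3.00),
    "google_cloud": (0.25, 4.00),
    "itranslate": (0.50, 2.00),
}

# Integer ranges copied from the -p, -v and --sub-max descriptions.
INT_LIMITS = {
    "-p": (-20, 20),
    "-v": (0, 200),
    "--sub-max": (110, 200),
}


def rate_is_valid(service, rate):
    """Return True or False for listed services, None if no range is documented."""
    limits = RATE_LIMITS.get(service.lower())
    if limits is None:
        return None
    low, high = limits
    return low <= rate <= high


def int_option_is_valid(option, value):
    """Check an integer option against its documented range."""
    low, high = INT_LIMITS[option]
    return low <= value <= high
```

Services without a documented rate range (for example Baidu TTS) yield None rather than a guess.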

--aws-keyid text or -ak text: Sets the AWS access key ID for Amazon Polly. Using this key is recommended if you have one.
--aws-secret text or -as text: Sets the AWS secret access key for Amazon Polly.
--aws-region text or -ar text: Sets the AWS region for Amazon Polly.
--crv-email text or -ce text: Sets the email address used when registering on the CereProc website. This information is required for CereVoice Cloud API authorization. Using this email address is recommended if you have one.
--crv-pwd text or -cp text: Sets the password used when registering on the CereProc website. This information is required for CereVoice Cloud API authorization. Using this password is recommended if you have one.
--gc-apikey text or -gk text: Sets the API key for Google Cloud. Using this key is recommended if you have one.
--ms-apikey text or -mk text: Sets the subscription key for Microsoft Azure Cognitive Services. Using this key is recommended if you have one.
--ms-region text or -mr text: Sets the subscription region for Microsoft Azure Cognitive Services.
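
When the utility is called from a script, the credential options above can be assembled from environment variables instead of being typed into every command. The Python sketch below shows one possible way to do this for Amazon Polly. The environment variable names and the function name are hypothetical; only the option names come from the list above.

```python
import os


def amazon_credential_args(env=None):
    """Build the Amazon Polly credential options from environment variables.

    The variable names are hypothetical; use whatever names suit your setup.
    """
    env = os.environ if env is None else env
    mapping = [
        ("--aws-keyid", "BAL4WEB_AWS_KEYID"),
        ("--aws-secret", "BAL4WEB_AWS_SECRET"),
        ("--aws-region", "BAL4WEB_AWS_REGION"),
    ]
    args = []
    for option, variable in mapping:
        value = env.get(variable)
        if value:  # skip options whose variable is unset or empty
            args += [option, value]
    return args
```

The returned list can be appended to the rest of the bal4web arguments, for example after "-s", "amazon".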

Examples

Create the text file LANGUAGE.TXT with the list of all languages supported by the Google TTS service:

bal4web -s Google -m > language.txt

Convert the text of BOOK.TXT to speech and save it as BOOK.WAV:

bal4web -f "d:\Text\book.txt" -w "d:\Sound\book.wav" -s Google -l en-US -g female

Convert subtitles to speech and save them as MOVIE.WAV:

bal4web -f "d:\Subtitles\movie.srt" -w "d:\Sound\movie.wav" -s m -l de-DE -n Conrad -r 1.1

bal4web -f "d:\Subtitles\movie.srt" -w "d:\Sound\movie.wav" -s m -l de-DE -n Conrad --sub-fit
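
For batch jobs, the subtitle command above can be built programmatically. This Python sketch returns the argument list for the --sub-fit variant and optionally adds --sub-max. The function name is hypothetical, and passing --sub-max as a bare integer (without a percent sign) together with --sub-fit is an assumption based on the option descriptions, not confirmed behavior.

```python
def subtitle_command(srt_path, wav_path, language, voice, max_rate=None):
    """Return the bal4web argument list for a subtitles-to-audio conversion."""
    args = ["bal4web", "-f", srt_path, "-w", wav_path,
            "-s", "m", "-l", language, "-n", voice, "--sub-fit"]
    if max_rate is not None:
        # Documented range for --sub-max is 110% to 200%.
        if not 110 <= max_rate <= 200:
            raise ValueError("--sub-max must be between 110 and 200")
        args += ["--sub-max", str(max_rate)]
    return args
```

The list can be passed to subprocess.run(..., check=True) on a machine where bal4web is on the PATH.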

An example of using the application with the LAME.EXE utility:

bal4web -f d:\book.txt -s Baidu -l en-US -o --raw | lame -r -s 16 -m m -h - d:\book.mp3

An example of using the application with the OGGENC2.EXE utility:

bal4web -f d:\book.txt -s Baidu -l en-US -o -il | oggenc2 --ignorelength - -o d:\book.ogg
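
The same kind of pipe can be set up from a script. The Python sketch below reproduces the LAME example with the standard subprocess module: the first function only builds the two argument lists, the second connects the STDOUT of one process to the STDIN of the other. The function names are hypothetical, and both bal4web and lame are assumed to be on the PATH.

```python
import subprocess


def lame_pipeline_commands(text_path, mp3_path):
    """Return the two argument lists used in the LAME example."""
    producer = ["bal4web", "-f", text_path, "-s", "Baidu", "-l", "en-US",
                "-o", "--raw"]
    consumer = ["lame", "-r", "-s", "16", "-m", "m", "-h", "-", mp3_path]
    return producer, consumer


def run_pipeline(producer, consumer):
    """Pipe the producer's STDOUT into the consumer's STDIN."""
    first = subprocess.Popen(producer, stdout=subprocess.PIPE)
    second = subprocess.Popen(consumer, stdin=first.stdout)
    # Close our copy so the producer sees a broken pipe if the consumer exits.
    first.stdout.close()
    second.wait()
    first.wait()
    return first.returncode, second.returncode
```

Calling run_pipeline(*lame_pipeline_commands(r"d:\book.txt", r"d:\book.mp3")) would then correspond to the LAME command line shown above.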

Configuration file

Command line options can be stored in a configuration file named "bal4web.cfg" in the same folder as the console application.

A sample configuration file:

-f d:\Text\book.txt
-w d:\Sound\book.wav
-s Google
-l de-DE
-g female
-d d:\Dict\rules.bxd
-lrc
--lrc-length 75
--lrc-enc utf8

The program can combine options from the configuration file with those from the command line.
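
A configuration file of this shape (one option per line) is easy to generate from a script. The Python sketch below renders a list of option and value pairs as file text. The function name is hypothetical, and values are written unquoted as in the sample above; how the utility treats values containing spaces is not covered here.

```python
def format_config(options):
    """Render (option, value) pairs as configuration file text.

    A value of None marks a switch without an argument, such as -lrc.
    """
    lines = []
    for option, value in options:
        lines.append(option if value is None else f"{option} {value}")
    return "\n".join(lines) + "\n"


sample = format_config([
    ("-s", "Google"),
    ("-l", "de-DE"),
    ("-lrc", None),
    ("--lrc-enc", "utf8"),
])
```

Writing the result to bal4web.cfg next to the utility, or to another file passed with -cfg, makes the options available without repeating them on the command line.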

License

Right to non-commercial use of the application:

individuals – without restriction,
legal entities – with the restrictions stated in the Balabolka Software License Agreement.

Commercial use of the software requires the permission of the copyright holder.