TTS CLI: Convert Text to Audio in One Command

April 30, 2026 · 9 min read

TTSBuddy CLI makes converting text to high-quality audio simple and fast. With support for 58 voices across 10 languages, it processes up to 500,000 characters in 10–30 seconds. You can choose playback speeds from 0.5x to 1.5x, or even up to 4.0x in some cases. It integrates with top TTS engines like ElevenLabs, OpenAI, and Google Gemini, while also offering an offline option with Kokoro ONNX.

Key features include:

Multiple input methods: Command-line text, stdin, clipboard, or file monitoring.
Customizable output: Save in formats like MP3, WAV, or FLAC, and specify file paths.
Advanced tools: AI agent support, cost estimation, and YAML-based profiles for tailored workflows.
Accessibility focus: Free plan includes 120 minutes/month, helping users with visual impairments, dyslexia, or ADHD.

Installation is quick with options like Homebrew, Go, or prebuilt binaries. Whether you’re creating audiobooks, automating workflows, or improving accessibility, TTSBuddy CLI simplifies the process. Just type ttsbuddy "your text here" to get started.

Installation and Setup

Supported Platforms

TTSBuddy CLI works seamlessly on macOS, Linux, and Windows, and the best part? It doesn't require any extra libraries or system packages. Thanks to its zero-dependency architecture, you won’t need to install Python, Node.js, or any other additional software.

All you need to get started is a free account at ttsbuddy.com (you can sign up using your email or through Google/GitHub SSO) and an active internet connection.

This streamlined, zero-dependency setup ensures compatibility across all major platforms. Once you're ready, follow the installation instructions below to get started.

How to Install TTSBuddy CLI


There are three easy ways to install TTSBuddy CLI:

Using Homebrew (for macOS or Linux users):
Run the following command in your terminal:
```
brew install ttsbuddy
```
With Go Installed:
If you have Go installed, you can use this command:
```
go install github.com/ttsbuddy/cli@latest
```
Downloading Prebuilt Binaries:
Visit the TTSBuddy website and download the prebuilt binary for your operating system.

After installation, confirm that everything is set up by running this command in your terminal:

ttsbuddy --version

Once verified, you’re ready to convert text into audio. Most users create their first audio file within just 5 minutes of setup[1].
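If you want the same check inside a setup script, a minimal sketch could look like the following. The function name check_ttsbuddy is made up for illustration, and the only CLI behavior it relies on is the --version flag shown above.

```shell
#!/bin/sh
# Succeeds only if a `ttsbuddy` command is available to this shell.
# check_ttsbuddy is a hypothetical helper name, not part of the CLI.
check_ttsbuddy() {
    if command -v ttsbuddy >/dev/null 2>&1; then
        # Print whatever version string the CLI reports.
        ttsbuddy --version
    else
        echo "ttsbuddy not found on PATH; re-check the install step" >&2
        return 1
    fi
}
```

A script can then bail out early with check_ttsbuddy || exit 1 before attempting any conversion.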

Converting Text to Audio with One Command

Command Syntax

TTSBuddy CLI makes turning text into audio straightforward. All you need to do is type:

ttsbuddy "your text here"

into your terminal and hit enter. The tool processes your input and creates high-quality audio using neural TTS technology. It offers access to over 58 voices in 14+ languages for a wide range of use cases [2].

The CLI is designed to handle complex formatting effortlessly. It converts Markdown tables into spoken descriptions, restructures bullet points into natural-sounding sentences, and adjusts how code blocks are narrated [2]. By default, the output is in MP3 format, but you can tweak playback speeds between 0.5x and 1.5x. This flexibility makes it useful for various scenarios - slower speeds are great for mastering a new language, while faster speeds can help you skim through material efficiently [2][1].

Now, let’s look at how you can customize your audio output.

Output Options

TTSBuddy CLI gives you control over how and where your audio files are saved. To specify a file path, use the -o flag [4]:

ttsbuddy "your text" -o /path/to/file.mp3

If MP3 isn’t your preferred format, the CLI supports six audio formats in total: MP3, WAV, FLAC, AAC, Opus, and PCM. Just use the -fmt flag to choose your desired format [4]. For more advanced use, the CLI can stream raw audio directly to stdout or generate JSON output. These features are especially handy for automation workflows or when integrating text-to-speech functionality into larger projects.
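As a rough sketch of how the two flags combine, the helper below renders the same text once per requested format. It uses only the -o and -fmt flags described above. The function name render_formats and the lowercase format names passed to -fmt (mp3, wav, flac) are assumptions, so check the CLI's help output for the exact spellings.

```shell
#!/bin/sh
# Render the same text once per format using -fmt and -o.
# render_formats is a hypothetical helper; the format spellings
# passed to -fmt are assumed, not confirmed by the CLI docs.
render_formats() {
    text=$1
    outdir=$2
    shift 2
    mkdir -p "$outdir" || return 1
    for fmt in "$@"; do
        # Stop at the first failed conversion so errors are not hidden.
        ttsbuddy "$text" -fmt "$fmt" -o "$outdir/sample.$fmt" || return 1
    done
}
```

For example, render_formats "Hello from the terminal" ./audio mp3 wav flac would write one file per format into ./audio.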

Input Modes and Configuration

Input Methods

TTSBuddy CLI is built to fit into various workflows by offering three input methods. You can input text directly as a command-line argument for quick conversions, use local files like .md or .txt for lengthier content, or pipe input to seamlessly integrate it into automated processes.

Markdown content is automatically cleaned up during processing. For batch tasks, you can chain commands by piping output directly into TTSBuddy CLI.
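Chaining commands might look something like the sketch below, which pipes the output of any command into the CLI. It assumes that ttsbuddy reads its text from stdin when no text argument is given, which the article implies but does not spell out. The helper name speak_output is hypothetical.

```shell
#!/bin/sh
# Run a command and pipe whatever it prints into ttsbuddy.
# Assumption: with no text argument, ttsbuddy reads text from stdin.
# speak_output is a hypothetical helper name.
speak_output() {
    dest=$1
    shift
    # "$@" is the producer command; its stdout becomes the text to speak.
    "$@" | ttsbuddy -o "$dest"
}
```

A call such as speak_output changelog.mp3 cat CHANGELOG.md would narrate the file's contents, and any other command that prints text can stand in for cat.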

Configuration Settings

Once you've selected your input method, you can refine the audio output using a range of customizable settings.

TTSBuddy CLI follows a clear priority system for resolving settings: CLI flags take precedence, followed by per-provider settings in the configuration file, then legacy global settings, environment variables, and finally, the built-in defaults. This hierarchy allows you to establish baseline preferences while easily overriding them for specific tasks.
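To make the precedence order concrete, here is a small illustration of "first non-empty value wins". It does not reproduce the CLI's internals. The function resolve_setting and the sample speed values are invented for the example.

```shell
#!/bin/sh
# Return the first non-empty candidate. Callers pass candidates from
# highest to lowest priority, mirroring the order described above:
# CLI flag, per-provider config, legacy global config, env var, default.
resolve_setting() {
    for candidate in "$@"; do
        if [ -n "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example: no flag or config value is set, so the value standing in for
# an environment variable wins over the built-in default of 1.0.
flag_speed=""
provider_speed=""
legacy_speed=""
env_speed="1.25"
speed=$(resolve_setting "$flag_speed" "$provider_speed" "$legacy_speed" "$env_speed" "1.0")
echo "effective speed: $speed"
```

Setting flag_speed to a value such as 0.75 would make that value win instead, which matches the idea that flags override everything else.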

To make switching between workflows effortless, you can set up named profiles in a YAML configuration file. For instance, you might create a podcast profile with a specific voice, slower playback speed, and WAV format. Meanwhile, a notifications profile could use a Flash voice, faster playback, and MP3 output. These profiles save you from repeatedly entering the same settings and let you adapt to different audio production needs with a single command.

Use Cases for Accessibility and Automation

Accessibility Support

TTSBuddy CLI makes it easier for individuals with visual impairments, dyslexia, or ADHD to interact with digital content independently. Whether it's reviewing menus, filling out applications, or reading emails, this tool eliminates the need for manual text adjustments [5].

"Text-to-speech (TTS) technology provides people with vision loss an essential tool for accessing digital and printed content independently." – TTS Buddy Documentation [5]

For those with ADHD or dyslexia, the ability to convert text into speech can significantly reduce cognitive effort. The CLI also includes playback speed options ranging from 0.5x to 1.5x, allowing users to either slow down for detailed comprehension or speed up for quicker overviews [1].

Considering that over 1 billion people worldwide face accessibility challenges, TTSBuddy CLI addresses both personal needs and compliance with regulations. For example, U.S. municipalities with populations over 50,000 are required to meet WCAG 2.1 Level AA standards by April 24, 2026. To support these efforts, the tool’s free plan offers 120 minutes of text-to-speech conversion per month - no credit card required [5].

Developer and Content Workflows

Beyond accessibility, TTSBuddy CLI simplifies developer workflows with its automation-friendly features. Developers can use it to convert technical documentation or blog posts written in Markdown into natural-sounding audio. This is particularly useful for large-scale documentation projects, as most audio files are generated in just 10 to 30 seconds [2]. The -c flag even allows users to merge files, making it a great option for creating audiobooks or multi-chapter podcasts.

The tool integrates seamlessly into CI/CD pipelines, enabling teams to automate the creation of audio versions of updated documentation using platforms like GitHub Actions. For high-volume tasks, the -r flag can set API rate limits, ensuring uninterrupted service. Developers can also use TTSBuddy for spoken status updates by integrating it into "Stop hooks" [7]. Additionally, the CLI supports local TTS engines, such as macOS's say, which is ideal for converting sensitive documents on-device [6].
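A CI job that regenerates audio for a docs folder could be sketched roughly as follows. The helper name docs_to_audio is hypothetical, and the script assumes that ttsbuddy reads a document from stdin when no text argument is supplied. Only the -o flag from earlier in the article is used.

```shell
#!/bin/sh
# Convert every Markdown file in a directory to an MP3 with the same
# base name. docs_to_audio is a hypothetical helper.
# Assumption: ttsbuddy reads the document from stdin when no text
# argument is given.
docs_to_audio() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for doc in "$src_dir"/*.md; do
        [ -e "$doc" ] || continue   # directory contained no .md files
        name=$(basename "$doc" .md)
        # Fail the whole job if any single conversion fails.
        ttsbuddy -o "$out_dir/$name.mp3" < "$doc" || return 1
    done
}
```

In a pipeline step this would be called as docs_to_audio docs audio, with the resulting audio directory uploaded as a build artifact.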

Conclusion

TTSBuddy CLI makes text-to-speech conversion incredibly straightforward, packing powerful features into a single command. Whether you're working with a lengthy 500,000-character novel chapter or a quick Markdown file, its integration with multiple TTS providers gives you the flexibility to choose what works best for your needs.

For those focused on accessibility, the free plan provides 120 minutes of conversion time each month - no credit card needed. Developers will appreciate automation-friendly tools like watch mode, batch processing, and event hooks, which integrate smoothly into CI/CD pipelines [3].

The par-tts doctor command takes the frustration out of troubleshooting by checking audio backends and API setups before you even begin [3]. Plus, AI-powered content sanitization saves time by cleaning up Markdown tables and code blocks automatically [2][3]. And it’s compatible across all major platforms [3].

Want to test the waters? Use the --dry-run or --estimate-cost options to preview operations without incurring charges [3]. TTSBuddy CLI is ready to simplify your workflow and boost digital accessibility - all with just one command.

FAQs

How do I pick a specific voice and language?

To choose a specific voice and language in TTSBuddy, follow these steps:

Open the dashboard and locate the Voice dropdown menu.
Explore the available voices categorized by language, such as American English, British English, or Spanish.
Click the Preview button to listen to samples and pick the voice that fits your needs.
Make sure the selected voice aligns with the language of your source text for the most natural-sounding output.

TTSBuddy offers a variety of languages, each with tailored voice options.

Can I run TTSBuddy CLI fully offline?

TTSBuddy CLI offers fully offline text-to-speech conversion, ensuring privacy and reliability. It relies on local models, such as VoxCPM2, which operate without requiring an internet connection or external API calls. This makes it perfect for secure or remote settings where online access isn't an option.

How do I automate TTSBuddy CLI in scripts or CI?

You can integrate TTSBuddy CLI into your scripts or CI workflows through its command-line interface. Here’s how to get started:

Install the CLI using one of the methods from the installation section, for example:
```
brew install ttsbuddy
```
Make sure the job can use your TTSBuddy account, then confirm the CLI is available:
```
ttsbuddy --version
```
Execute commands like:
```
ttsbuddy "Your text here" -o /path/to/file.mp3
```

The generated audio file is saved to the path given with -o. This setup lets you automate the text-to-speech process or include it as a step in your CI pipelines.
