
I built a free, local-AI tool to turn your stories into audiobooks: introducing Alexandria

Finrandojin

Hi everyone,

I'm a long-time reader and dev. I've tried most TTS services and programs that convert books to audio, and I just couldn't find one that satisfied me. I wanted something that felt more like a directed performance and less like a flat narrator reading a spreadsheet, so I built Alexandria.

It is 100% free and open source. It runs locally on your own hardware, so there are no character limits, no subscriptions, and no one is looking over your shoulder at what you're generating.

Audio Sample: https://vocaroo.com/1cG82gVS61hn (Uses the built-in Sion LoRA)

GitHub Repository: https://github.com/Finrandojin/alexandria-audiobook/


The Feature Set:


Natural Non-Verbal Sounds
Unlike most tools that just skip over emotional cues or use tags like [gasp], the scripting engine in Alexandria actually writes out pronounceable vocalizations. It can handle things like gasps, laughter, sighs, crying, and heavy breathing. Because it uses Qwen3-TTS, it doesn't treat these as "tags" but as actual audio to be performed alongside the dialogue.

LLM-Powered Scripting
The tool uses a local LLM to parse your manuscript into a structured script. It identifies the different speakers and narration automatically. It also writes specific "vocal directions" for every line so the delivery matches the context of the scene.
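
To make the idea of a "structured script" concrete, here is a minimal sketch in Python of what one parsed line could look like. This is an illustration only, not Alexandria's actual schema: the `ScriptLine` class, its field names, and the JSON layout are all assumptions made up for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class ScriptLine:
    # All field names here are hypothetical, chosen for illustration.
    speaker: str    # character name, or "NARRATOR" for narration
    text: str       # the words to perform, including written-out vocalizations
    direction: str  # vocal direction describing how the line should be delivered


def parse_script(raw_json: str) -> list[ScriptLine]:
    """Turn an LLM's JSON reply (a list of objects) into ScriptLine entries."""
    lines = []
    for entry in json.loads(raw_json):
        lines.append(
            ScriptLine(
                speaker=entry["speaker"].strip(),
                text=entry["text"].strip(),
                # A missing direction falls back to an empty string.
                direction=entry.get("direction", "").strip(),
            )
        )
    return lines


# A made-up reply in the assumed format.
example_reply = """
[
  {"speaker": "NARRATOR", "text": "The door creaked open.", "direction": "low and tense"},
  {"speaker": "Mira", "text": "Hah... who is there?", "direction": "shaking with fear"}
]
"""

for line in parse_script(example_reply):
    print(f"{line.speaker} ({line.direction}): {line.text}")
```

Keeping the speaker, the text, and the direction as separate fields is what lets an editor later change only the delivery of a line without touching its words.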

Advanced Voice System
  • Custom Voices: Includes 9 high-quality built-in voices with full control over emotion, tone, and pacing.
  • Cloning: You can clone a voice from any 5 to 15 second audio clip.
  • LoRA Training: Includes a pipeline to train permanent, custom voice identities from your own datasets.
  • Voice Design: You can describe a voice in plain text, like "a deep male voice with a raspy, tired edge," and generate it on the fly.
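
If you want to check a reference clip against the 5 to 15 second range before trying to clone from it, a rough sketch using only Python's standard library could look like this. It is not part of Alexandria; the function names are hypothetical, and it only handles uncompressed WAV files.

```python
import wave


def clip_duration_seconds(path: str) -> float:
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path: str, min_s: float = 5.0, max_s: float = 15.0) -> bool:
    """True if the clip length falls inside the 5 to 15 second range."""
    return min_s <= clip_duration_seconds(path) <= max_s


def write_silent_wav(path: str, seconds: float, rate: int = 16000) -> None:
    """Helper for trying the check out: writes a mono 16-bit clip of silence."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
```

For other formats such as MP3 you would need a third-party audio library, which this sketch deliberately leaves out.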

Production Editor
You get full control over the final output. You can review and edit lines and change their delivery instructions. If a specific "gasp" or "laugh" doesn't sound right, you can regenerate the line or try a different instruction like "shaking with fear" or "breathless and exhausted."

Local and Private
Everything runs via Qwen3-TTS on your own machine. Your stories stay private and you never have to worry about a "usage policy" flagging your content.

Export Options
You can export as a single MP3 or as a full Audacity project. The Audacity export puts every character on their own track, with a label for every line of dialogue, so you can see on the timeline what is being said and search it for specific dialogue. This makes it easy to add background music or fine-tune the timing between lines.
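
For anyone curious how per-character labels can work, here is a small sketch that groups timed lines by speaker and formats them in the plain-text label format Audacity can import (one label per row: start time, end time, and text, separated by tabs). This is an assumption-laden illustration, not the exporter Alexandria actually uses; the function name and the input tuple layout are made up.

```python
from collections import defaultdict


def build_label_tracks(lines):
    """Group (speaker, start_s, end_s, text) tuples into one label file per speaker.

    Returns a dict mapping each speaker to the text of a label file, where every
    row is "start<TAB>end<TAB>label" with times in seconds.
    """
    tracks = defaultdict(list)
    for speaker, start, end, text in lines:
        # Tabs and newlines inside the label would break the row format.
        label = text.replace("\t", " ").replace("\n", " ")
        tracks[speaker].append(f"{start:.6f}\t{end:.6f}\t{label}")
    return {speaker: "\n".join(rows) + "\n" for speaker, rows in tracks.items()}


timed_lines = [
    ("Mira", 0.0, 1.5, "Who's there?"),
    ("NARRATOR", 1.5, 4.0, "The door creaked."),
    ("Mira", 4.0, 5.25, "Hello?"),
]

for speaker, label_text in build_label_tracks(timed_lines).items():
    print(f"--- {speaker} ---")
    print(label_text, end="")
```

Because each label carries the line's text, searching the labels is enough to jump to a specific piece of dialogue on the timeline.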

Supported configurations

GPU    | OS      | Status       | Driver Requirement      | Notes
NVIDIA | Windows | Full support | Driver 550+ (CUDA 12.8) | Flash attention included for faster encoding
NVIDIA | Linux   | Full support | Driver 550+ (CUDA 12.8) | Flash attention + triton included
AMD    | Linux   | Full support | ROCm 6.3                | ROCm optimizations applied automatically
AMD    | Windows | CPU only     | N/A                     | GPU acceleration is not supported; the app runs in CPU mode. For GPU acceleration with AMD, use Linux
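
If it helps, the same support matrix can be written as a small lookup, for example to sanity-check a machine before installing. This is only a restatement of the list above in Python; the names `SUPPORT` and `expected_mode` are hypothetical and are not part of Alexandria.

```python
# (GPU vendor, OS) -> (mode, driver requirement), copied from the list above.
SUPPORT = {
    ("nvidia", "windows"): ("gpu", "Driver 550+ (CUDA 12.8)"),
    ("nvidia", "linux"): ("gpu", "Driver 550+ (CUDA 12.8)"),
    ("amd", "linux"): ("gpu", "ROCm 6.3"),
    ("amd", "windows"): ("cpu", "N/A"),
}


def expected_mode(gpu_vendor: str, os_name: str) -> str:
    """Return "gpu", "cpu", or "unknown" for a vendor/OS combination."""
    entry = SUPPORT.get((gpu_vendor.lower(), os_name.lower()))
    return entry[0] if entry else "unknown"


print(expected_mode("AMD", "Windows"))
print(expected_mode("NVIDIA", "Linux"))
```

Combinations not covered above come back as "unknown" rather than being guessed.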


I'm around to answer any technical questions or help with setup if anyone runs into issues.
 
