Making my own text-to-speech software

Introduction

Welcome back to the channel! Today, I’m diving into an exciting science experiment—creating my own text-to-speech software. Lately, I’ve been experimenting with various algorithms that aim to mimic my voice, but unfortunately, they’ve all fallen short.

For instance, I tried a scripting tool, but everything it produced sounded extremely monotone. I also experimented with ElevenLabs, which unfortunately doesn’t sound like me at all; it feels a bit mentally unhinged, to be honest. This led me to wonder: how hard can it really be to create something better?

The Plan

My plan is pretty simple: record one sound for every letter of the alphabet, then map each letter of the text I want to generate to its recorded sound. The idea seemed so straightforward that I was puzzled as to why it has been such a difficult endeavor in the past.

After coding the entire process completely by hand, the only remaining task was to record all the necessary sounds for the alphabet:

A: ah
B: duh
C: eh

And so on…
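
The code itself isn't shown here, but the letter-to-sound idea could be sketched roughly like this in Python. Everything in it is an assumption made for illustration: the `clips` folder, the one-WAV-per-letter file names, and the helper names `letters_to_clips` and `speak_to_file` are hypothetical, and the sketch assumes every recording shares the same sample rate and format.

```python
import string
import wave
from pathlib import Path

# Hypothetical layout: one recording per letter, e.g. clips/a.wav ... clips/z.wav.
CLIP_DIR = Path("clips")


def letters_to_clips(text, clip_dir=CLIP_DIR):
    """Return the clip path for each letter in text, skipping everything else."""
    return [
        clip_dir / f"{ch}.wav"
        for ch in text.lower()
        if ch in string.ascii_lowercase
    ]


def speak_to_file(text, out_path, clip_dir=CLIP_DIR):
    """Concatenate the per-letter clips for text into a single WAV file."""
    clips = letters_to_clips(text, clip_dir)
    if not clips:
        raise ValueError("text contains no letters to speak")
    with wave.open(str(out_path), "wb") as out:
        for i, clip in enumerate(clips):
            with wave.open(str(clip), "rb") as src:
                if i == 0:
                    # Assumes every clip shares the first clip's format.
                    out.setparams(src.getparams())
                out.writeframes(src.readframes(src.getnframes()))
```

With the recordings in place, something like `speak_to_file("hello", "hello.wav")` would glue five clips together back to back. Spaces and punctuation are simply skipped in this sketch, so words run into each other with no pause.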

Testing the Creation

Once I had recorded the sounds, it was time for a quick test. I decided to use the classic phrase, “The quick brown fox jumps over the lazy dog.” This phrase is fantastic for testing text-to-speech because it uses every letter of the English alphabet.
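
The claim that the phrase covers the whole alphabet is easy to check. Here is a small sketch in Python; the helper name `missing_letters` is made up for this example.

```python
import string


def missing_letters(phrase):
    """Return the English letters that never appear in phrase, in sorted order."""
    return sorted(set(string.ascii_lowercase) - set(phrase.lower()))


phrase = "The quick brown fox jumps over the lazy dog."
print(missing_letters(phrase))  # prints []
```

An empty list means every one of the 26 recorded sounds gets played at least once during the test.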

To my surprise, the first run was beautiful—probably better than I anticipated. It sounded perfect right out of the box! I couldn't help but run it back for a few more trials. The output was impressive, and I could see the potential to improve further. Still, there was a part of me that wanted to embrace the imperfections because, at that moment, everything felt just right.

In conclusion, my little text-to-speech project turned out to be successful—far more than I expected on the first try!

Keywords

Text-to-Speech
Science Experiment
Voice Mimicry
Alphabet Sounds
Recording Process
Audio Testing
Computer Algorithms

FAQ

Q: What inspired you to create your own text-to-speech software?

A: After trying various text-to-speech algorithms that didn't meet my expectations in terms of voice fidelity and tone, I decided to try and create my own to achieve better results.

Q: What is the basic process you followed to create your software?

A: The process involves recording a unique sound for each letter of the alphabet and then mapping each letter of the input text to its sound to generate the speech.

Q: How did the first test of your system go?

A: The first test exceeded my expectations; it produced a beautiful and accurate representation of the phrase I used for testing.

Q: Are there any plans for further improvement on your text-to-speech software?

A: While there is always potential for improvement, I felt the initial results were satisfactory and may not require significant changes at this moment.