Make your computer speak - Part 4 Text-to-Speech

Make your computer speak - Part 4
Text-to-Speech Voices
What is this about
This is the fourth post in a 5-part series on making your computer speak using Text-to-Speech technology. The five parts are:
• Part 1: Getting started
• Part 2: Free speaking software
• Part 3: Commercial speaking software
• Part 4: Text-to-speech voices
• Part 5: Text-to-speech on mobile devices
You can get an overview of the whole series, including links, in this mind map.
In this fourth part, we will have a look at how to give your computer some new
voices.
What is a computer voice
As I said in the first part, a computer voice works kind of like a font. You install it
on your computer and different software can use it.
All computers come with a default free voice. This means that you can start using
text-to-speech right away. These free voices are usable but not the best you can
get.
But unlike with fonts, there are not many free voices everybody can download, so
if you want a better voice for your computer, you have to buy it. They cost between
£15 and £30 so you will want to choose carefully. Luckily, you can always try before
you buy.
How a computer voice works
It’s a lot more difficult to make a new computer voice than it is to make a new
font. To start with, there are only 26 letters in English but 44 sounds (called
phonemes). But it doesn’t stop there. If you only created synthetic sounds and added
them up, the voice would sound nothing like a human. So a voice is made up of
combinations of sounds – sometimes combinations of parts of sounds.
Next you need to collect lots of samples of a human voice because voices that
are made by computer alone sound too robotic.
Finally, you need to teach your voice to understand as much English as
possible. It has to guess at the right pauses and intonation. How will it know that
‘read’ in ‘I like reading’ and ‘I live in Reading’ is pronounced differently? How about
the stress in ‘object’ when it is a noun (‘an object’) or a verb (‘to object’)?
The more a voice understands of what you’re trying to say, the better it will sound.
Some voices are better at understanding than others. And even good voices
make mistakes.
No wonder all the best voices cost money.
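To get a feel for the kind of guessing a voice has to do, here is a toy sketch in Python. It is not how any real voice works. It only looks at the word before ‘object’ and picks a stress pattern from two small word lists, which are my own assumptions for illustration. Real voices analyse far more of the sentence, and even then they make mistakes.

```python
# Toy sketch: pick the stress for the word "object" from the word before it.
# The cue word lists below are illustrative assumptions, not a real rule set.

NOUN_CUES = {"a", "an", "the", "no", "this", "that", "my", "your"}
VERB_CUES = {"to", "i", "you", "we", "they", "not"}


def stress_for_object(sentence):
    """Return 'OB-ject' (noun), 'ob-JECT' (verb) or 'unknown'."""
    # Lower-case each word and strip punctuation and quote marks from its ends.
    words = [w.strip(".,?!'\"‘’").lower() for w in sentence.split()]
    if "object" not in words:
        return "unknown"
    position = words.index("object")
    previous = words[position - 1] if position > 0 else ""
    if previous in NOUN_CUES:
        return "OB-ject"
    if previous in VERB_CUES:
        return "ob-JECT"
    # No cue word found: a real voice would have to look at more context.
    return "unknown"


for example in ("Do you object to my accent?", "Is money no object?"):
    print(example, "->", stress_for_object(example))
```

A rule this simple breaks quickly. It has no answer for ‘Object if you must’, for example, which is one reason good voices need so much work.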
How to choose the right voice
The best way to choose if a voice is right for you is to try it for a few days. Ideally,
you should forget it’s there and just listen to the content.
The good news is that it’s possible to get used to even a lesser quality voice.
They’re definitely good enough for their key purpose, which is to let you listen to your
texts instead of reading them.
Just don’t give up after a few words. It may feel a bit awkward at first, but
once you get used to it, it will feel completely natural.
Here’s an example of a text you can try. It tests some of the things voices often get
wrong. Some things like ‘read’ in ‘I read every day’ and ‘I read yesterday’ are hard for
all voices. Other things like ‘an object’ vs. ‘to object’ or reading a list are only hard for
some voices.
Hi, I’m a computer voice X. Try me and you can hear how I deal with things like:
• Intonation
• Punctuation, and
• Lists
How about the difference between 'I read yesterday.' and 'I read every day.'?
Can you tell this sentence is a question. Even when someone forgot to put in a
question mark?
How about the difference between 'Do you object to my accent?' and 'Is money no
object?' Did I get the stress right in object as a verb and as a noun?
Can I spell out acronyms like USA and UK but read out ones like OFSTED? How do
I treat common text features like i.e. and e.g.?
You should also try me at different speeds and different levels of pitch. Just don't
give up if I sound a bit strange at first or don't get it all right the first time.
Hope to talk to you soon again.
Free voices
As I said, there are not that many great choices of free voices. In fact, the default
voices on your computer are probably the best free voices you can find.
Here’s what Anna, who is on most Windows computers, sounds like:
<iframe width="100%" height="166" scrolling="no" frameborder="no"
src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/160007393&color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false"></iframe>
Mac users are lucky: their machines come preinstalled with a number of good
voices.
If you bought a computer after 2013, it’s likely it has Windows 8 on it. Windows 8
comes with only one voice, Hazel, but it’s really good.
<iframe width="100%" height="166" scrolling="no" frameborder="no"
src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/160002889&color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false"></iframe>
If you’d still like to try some more free voices, the Balabolka site lists a lot of free
options.
You can also try eSpeak, which is a completely free, fully synthetic voice. It sounds
very robotic but some people like to listen to it at high speeds.
<iframe width="100%" height="166" scrolling="no" frameborder="no"
src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/160004484&color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false"></iframe>
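If you have eSpeak installed and are comfortable running a small script, you can hear for yourself what it sounds like at different speeds. The sketch below is only one way to do it and makes a few assumptions: that the program is on your path under the name `espeak` (on some newer systems it may be called `espeak-ng` instead), and that the helper names `espeak_command` and `speak_at_speeds` are mine, made up for this example. It uses eSpeak's `-s` option for speed in words per minute and `-p` for pitch from 0 to 99.

```python
import shutil
import subprocess


def espeak_command(text, words_per_minute=175, pitch=50):
    """Build the argument list for one eSpeak call.

    -s sets the speed in words per minute and -p sets the pitch (0 to 99).
    """
    if not 0 <= pitch <= 99:
        raise ValueError("pitch must be between 0 and 99")
    return ["espeak", "-s", str(words_per_minute), "-p", str(pitch), text]


def speak_at_speeds(text, speeds=(120, 175, 300)):
    """Speak the text once per speed, but only if eSpeak is installed."""
    if shutil.which("espeak") is None:
        print("eSpeak does not seem to be installed.")
        return
    for speed in speeds:
        # check=False: carry on to the next speed even if one call fails.
        subprocess.run(espeak_command(text, words_per_minute=speed), check=False)
```

Calling `speak_at_speeds("Hi, I'm a computer voice.")` should read the sentence three times, from slow to fast. The three speeds are arbitrary starting points, so change them to whatever you find comfortable.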
The best free voices you can install are not free for everyone.
JISC TechDis Voices are available for free but only for people in further or
higher education. See if you’re eligible for them on their website.
<iframe width="100%" height="166" scrolling="no" frameborder="no"
src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/160003190&color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false"></iframe>
If you live in Scotland, see if you or your school is eligible for Scottish Voice.
Paid voices
Here’s a list of the key providers of commercial voices. The links will take you to
the websites where you can try and buy the voices. Some offer trial installs;
others you can only try through the web interface.
It’s impossible to recommend one. You need to find a voice that works for you.
Personally, I’ve bought the voice Brian from Ivona because he sounds like a radio
presenter. But you may like a voice that sounds more youthful. Or prefer a female
voice, or a voice with a Welsh accent. There are many options for you.
<iframe width="100%" height="166" scrolling="no" frameborder="no"
src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/160003037&color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false"></iframe>
• Acapela
• Cepstral
• Cereproc
• Ivona
• Nuance
• Natural Voices
What’s next
Enjoy listening to your computer. Next time, we will conclude the series by talking
about text-to-speech on your mobile device.