Tech Thursday: Speaking to your computer Part 2: Free speech recognition with Google Docs
This is Part 2 in a three-part series on speech-recognition tools. The three parts are:
1. Getting started: Using voice for simple searches, calculations and notes
2. Free dictation on mobile and desktop: Using Google Docs dictation
3. Full commercial dictation: Dragon NaturallySpeaking
Note
This entire document was created by dictation in this Google Doc. It was then copied
into Microsoft Word for editing and formatting.
I recorded the entire process of dictation in a screencast. Unfortunately, due to
technical difficulties, only the video was recorded. It is good enough to show you the
process but you won’t hear me speak what is transcribed.
What is this about?
In the second part of a series on speaking to your computer we will look at a free
option using Google Docs. This should give you some idea about what it's like to
dictate a whole document.
Dictating documents to Google Docs requires two things:
1. Free Google Drive account
2. Using Google Chrome as your browser
Of course, you also have to be connected to the Internet at the same time.
Google speech recognition does not work offline.
How to start a dictation
Starting dictation is actually very easy. Simply start a new Google Doc and then go
to the Tools menu and select Voice typing.
You will then see a microphone icon, and after you click on it, simply
start speaking. You'll be amazed at how accurate the speech recognition is. No
training is required, but it is likely to work better the more you use it.
The great thing about this is that it works just as well on your phone, tablet, or your
desktop or laptop computer.
This is a very new feature that was announced by Google only this summer. Before,
you could only speak in short segments.
The quality of the speech recognition is actually very good. But there are still
some shortcomings when it comes to navigating and controlling the process of
dictation. In this, the main competitor, actually the only competitor, Dragon
NaturallySpeaking, is far superior.
But if you learn a few tricks, you can get really good results with Google Docs
dictation, and can dictate entire documents.
This whole document was created simply by voice with only the final editing and
formatting done with the keyboard.
I recorded the whole process on video so you can see what it's like to dictate
a document. It works really well, but you still need to use some workarounds for the
missing features. (Note: The sound was not recorded due to a technical glitch.)
Of course, you also have to change the way you speak. You cannot just speak
fluently without first thinking and formulating what you want to say.
Finally, let's have a look at some of the key tips and tricks that will let you be more
successful with Google Docs.
Google Docs dictation tips and tricks
Punctuation
The most annoying feature of Google Docs at the moment is its approach
to punctuation. If you watch the video, you will see that Google Docs dictation
often writes out the name of the punctuation mark, such as ‘period’, ‘comma’ or
‘colon’, even though you almost always want to use them just as punctuation.
The only reliable way to enter punctuation is to say it right at the end of the last
word of the part of the sentence that you want to punctuate. If you wait even for
a second, Google Docs interprets the name of the punctuation mark as a word.
The only way to fix that is to delete the last word and say it again quickly
followed by the name of the punctuation mark. If you watch the video you will see
me doing that quite often (unfortunately, you cannot hear that).
The punctuation marks that Google Docs can recognise are:
• period
• colon
• semicolon
• question mark
• exclamation point
But if you watch the video, you will see that punctuation is the most frequent cause
of error.
To start a new paragraph, just say ‘new paragraph’ or ‘enter’.
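If a dictated draft still contains written-out punctuation names, one possible workaround is to clean them up afterwards with a small script. The following is only a rough Python sketch, not a feature of Google Docs: the function name `fix_spoken_punctuation` and the mapping are made up for illustration, and it assumes you have copied the dictated text out of the document as plain text. It will also wrongly replace genuine uses of words like ‘period’ or ‘colon’, so the result still needs a read-through.

```python
import re

# Hypothetical mapping from spoken punctuation names to marks.
# The names are the ones mentioned in this post; the mapping is an assumption.
SPOKEN_MARKS = {
    "exclamation point": "!",
    "question mark": "?",
    "semi colon": ";",
    "semicolon": ";",
    "period": ".",
    "comma": ",",
    "colon": ":",
}

# Try longer names first so multi-word names are preferred.
_NAMES = sorted(SPOKEN_MARKS, key=len, reverse=True)
_PATTERN = re.compile(
    r"\s+(" + "|".join(re.escape(name) for name in _NAMES) + r")\b",
    flags=re.IGNORECASE,
)


def fix_spoken_punctuation(text: str) -> str:
    """Replace written-out punctuation names with the marks themselves.

    The whitespace before the name is removed so the mark attaches to the
    previous word. Legitimate uses of these words are replaced too.
    """
    return _PATTERN.sub(lambda m: SPOKEN_MARKS[m.group(1).lower()], text)


if __name__ == "__main__":
    print(fix_spoken_punctuation("It works well period But check it comma please"))
```

Running the script on its own prints the cleaned-up sample sentence. For a real document, you would paste the dictated text into a string or read it from a file, and then check every replacement by eye.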
Correction
The other big missing feature in Google Docs dictation is correction, deletion and
selection by voice. This is very common in all the other speech recognition software
out there. And we can only hope that Google add that soon.
This makes it very difficult to correct things while you are still dictating them. This
means that when you get to correct the whole document, you sometimes come
across passages you don’t understand any more.
It also makes it easier to use Voice Typing at the computer, because there you can
quickly use the keyboard to fix things it got wrong. It’s still possible on your phone or
tablet but much more limiting.
Dictating whole sentences
The key thing to remember when you dictate to Google Docs is that Google is better
at recognising complete sentences rather than individual words.
Initially, you may even be better off not actually watching the screen as what
you're saying is being transcribed. This is because the Google Docs speech recognition
engine is doing a lot of guessing based on context, and it is doing an excellent job.
But because of that, it often needs to change its mind, and will go back and fix words
that looked wrong when it first transcribed them.
Again, you can see that on the video recording of me dictating this document.
Where does the Google speech recognition fail?
If you watch the video, you will see that the Google Docs speech recognition makes
very few mistakes. It even does a very good job of recognising words such as
product names and place names, which are often a stumbling block for speech
recognition.
It is most likely to make errors with short words such as ‘if’ or ‘but’. Sometimes it will
leave out or change ‘not’. This can lead to a change in meaning that is hard to
spot.
It will struggle with endings such as the plural –s and the past tense –ed.
And as you can see on the video, it also doesn't do very well with metalanguage,
even though it is not bad with specialist vocabulary in general.
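Because mistakes with short words are hard to spot, it may help to make those words stand out while proofreading. Here is a rough Python sketch of that idea, again working on text copied out of the document. The function name `flag_risky_words` and the `[[...]]` marker are made up for illustration. Note that it can only highlight words that are present in the text; it cannot detect a ‘not’ that was left out entirely.

```python
import re

# Short words this post singles out as error-prone; extend the list as needed.
RISKY_WORDS = ("if", "but", "not")


def flag_risky_words(text: str, words=RISKY_WORDS) -> str:
    """Wrap each risky word in [[...]] so it is easy to find when proofreading."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in words) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"[[{m.group(0)}]]", text)


if __name__ == "__main__":
    print(flag_risky_words("It is not bad, but check it."))
```

Whole-word matching means that words such as ‘nothing’ or ‘button’ are left alone.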
Next time
In the third part of the series, we will look at the commercial solution, Dragon
NaturallySpeaking, and compare it to the free speech recognition which is built into
Windows.