Speech Recognition Software

What is speech?

Humans are the only species that actually uses speech to communicate. When we speak, our voices produce sounds called “phones” (e.g. the word “learn” produces phones that correspond roughly to the sounds “l”, “er”, and “n”). “Phonemes”, on the other hand, are the basic units of sound that words are built from. The difference between the two is this: phones are the real pieces of sound that we make, whereas phonemes exist only in our minds as abstract categories (they are never really spoken). When we listen to speech, our brain catches and processes the phones and turns them into actual words. A computer that recognizes speech has to do much the same.

Why is speech so hard to handle?

Why, really? Let’s see:

Voices have to be separated from the background noise of our everyday surroundings.
Some people talk very fast.
Voices differ, both from one moment to the next and from person to person (e.g. different accents, high- and low-pitched voices, different ages, etc.).
Some words sound very similar to each other, such as “now” and “know”.
A more complex sentence can be misheard as something else completely (although this doesn’t happen often).
The syntax of what is said has to be taken into account, and its semantics, or meaning, has to be understood.
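The problem of similar-sounding words can be made concrete with a small sketch. It assumes a tiny hand-written pronunciation lexicon using ARPAbet-style phoneme symbols; the lexicon contents and the names LEXICON, words_by_pronunciation, and phoneme_differences are illustrative assumptions, not part of any real speech recognition library. It shows that “now” and “know” differ by a single phoneme, and that “know” and “no” share exactly the same phonemes, so sound alone cannot tell them apart.

```python
from collections import defaultdict

# Hypothetical hand-written lexicon: word -> phoneme sequence.
# The symbols follow ARPAbet conventions; the entries are illustrative only.
LEXICON = {
    "learn": ("L", "ER", "N"),
    "know": ("N", "OW"),
    "no": ("N", "OW"),
    "now": ("N", "AW"),
}


def words_by_pronunciation(lexicon):
    """Invert the lexicon: phoneme sequence -> sorted list of matching words."""
    index = defaultdict(list)
    for word, phonemes in lexicon.items():
        index[phonemes].append(word)
    return {phonemes: sorted(words) for phonemes, words in index.items()}


def phoneme_differences(a, b):
    """Count mismatched positions, plus any difference in length."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches + abs(len(a) - len(b))


INDEX = words_by_pronunciation(LEXICON)

# Words that share one pronunciation can only be told apart by context.
for phonemes, words in INDEX.items():
    if len(words) > 1:
        print(" ".join(phonemes), "->", words)

# "now" and "know" are close, but not identical, in sound.
print(phoneme_differences(LEXICON["now"], LEXICON["know"]))
```

A real recognizer would rely on a much larger lexicon and on the surrounding words (the syntax and semantics mentioned above) to choose between candidates that sound alike.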