Using Web-based Speech Recognition Technologies to Improve English Pronunciation
Howard Chen 陳浩然
English Department 師大英語系
National Taiwan Normal University
Better Listening and Speaking Skills: GEPT and
 Writing Section: There will be one 30-minute
and one 20-minute essay.
 Speaking Section: Six open-ended speaking
questions require test takers to speak into a
microphone.
 Integrated Language: Some sections of the
new test will combine four basic
communication skills. For example, a test
taker might listen to a lecture and read a
passage, then write or speak about it.
 Score: 1-120 (human graders for essay and
speaking section).
Time: 3.5 hours
Different Solutions
 Some universities hire more native English
speakers and provide students with more
opportunities to interact with these teachers.
 Some colleges have reduced class sizes
in the expectation of more teacher-student
interaction in the target language.
 Some universities have also begun to explore the
power of new computer technologies:
pronunciation tutors and ASR.
The Development of ASR in CALL
 A few years ago, very few CALL
(computer-assisted language learning) programs
claimed to incorporate new speech
recognition technologies.
 Within the past 5 years, however, automatic speech
recognition (ASR) technologies have become widely
used in various language teaching and
learning programs.
Some Programs Available in Taiwan
ASR Provides the Following Benefits:
 Students will have more opportunities to produce the
target language and have extensive interaction with
 Students will have individual attention, and they do
not need to compete with other classmates.
 Students can learn to communicate in a less
threatening environment, and they can often get
feedback from computers quickly.
 Students are provided with various kinds of direct or
indirect feedback from computers.
 Students can hear the models provided by different
native speakers.
 Students can have better control over their learning
pace and might gain confidence.
TeLL me More pro
The Scores Provided by MyET
Program-via web
Prices Are Too High and Limited
 Although these PC-based speech recognition
software programs are quite useful, many
students cannot afford to buy them.
(Traci Talk or TeLLmeMore costs around
100 US dollars.)
 Some schools have purchased these
programs, but students can only use them
in language centers or language labs;
they cannot access them from their dormitories or
Microsoft Speech Application Development Kit
 The new Microsoft speech application
development kit allows programmers and
researchers to develop different web-based
speech applications. The kit requires a
programming environment based on the
integration of the following software programs:
 Microsoft .NET Speech SDK
 Microsoft Internet Information Services (IIS).
 Microsoft Internet Explorer 6.0 or later version.
 Microsoft Visual Studio .NET Professional
 Microsoft .NET Framework
The Flowchart of Interaction
NTNU Pronunciation Practice Web
NTNU Dept of English: ASR
Choose the Right Answer
Write and Say
Flash and Speech Recognition
Personal Learning Records
Checking Students’ Performances
Learners’ Evaluation
 The online system was completed after
extensive tests around January 2005.
 We then invited 25 students (non-English
majors) who were taking a Freshman English
course to use this web site in the spring
 All of these college students had graduated
from vocational high schools, and most of
them have difficulties in English
 In a survey, 74% of these students felt that
their pronunciation was poor.
Users Survey
Other Useful Information
 Most students indicated that they tried
3-5 times before giving up.
 If they were not sure about the
pronunciation of a certain word, many
students (65%) would choose to listen to the
audio provided by the system.
 It is also interesting to note that 61% of
students found that they could pass the shorter
sentences more easily. They also found
“identify the object” to be the easiest
exercise (39%), followed by “listen and
repeat” (35%) and “choose the appropriate
Some Positive Comments
Suggestions for Improvement
Reflection 1
 The user interface can be made more interesting and
attractive. Some of the sections with TTS (text-to-speech)
sounds can be replaced with human voices.
The hardware of the web site server can be upgraded to
provide better and more reliable performance.
 If the learning environment can be changed to a game-like
environment, it would be more attractive to users.
We are currently developing a 2-D learning environment
and using Flash animation to make this site more
 As for the problematic items, there seem to be some bugs
in certain items, and we need to find solutions for them.
Identifying and correcting these items quickly
will help reduce students' frustration in
interacting with this site.
Reflection 2: Feedback Quality- Lower
 In addition, the system so far can only give a
pass or fail judgment. Students at lower
proficiency levels might sometimes
find the system very demanding about their
 Furthermore, the system cannot clearly
pinpoint the problems of individual speakers.
The students can only listen to the models
carefully and try again and again. Perhaps
we can figure out a better way to make the
speech recognition system less demanding
(e.g., lowering the sensitivity of the speech
recognition engine); however, we would then
face the problem of setting the reasonable
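The trade-off raised in Reflection 2 can be illustrated with a minimal sketch. It assumes, hypothetically, that the recognition engine reports a confidence score between 0 and 1 for each utterance. The function names, the threshold values, the three feedback bands, and the sample scores are all illustrative assumptions; none of them come from the NTNU system or the Microsoft kit.

```python
from typing import List


def pass_fail(confidence: float, threshold: float) -> bool:
    """Binary judgment: pass when the (assumed) confidence reaches the threshold."""
    return confidence >= threshold


def graded_feedback(confidence: float, threshold: float, margin: float = 0.15) -> str:
    """Hypothetical three-band alternative to a bare pass/fail judgment."""
    if confidence >= threshold:
        return "pass"
    if confidence >= threshold - margin:
        return "almost: listen to the model and try again"
    return "fail: review the model pronunciation"


def pass_rate(confidences: List[float], threshold: float) -> float:
    """Share of attempts that pass at a given threshold."""
    if not confidences:
        return 0.0
    return sum(pass_fail(c, threshold) for c in confidences) / len(confidences)


# Made-up confidence scores for one learner's attempts (illustrative only).
attempts = [0.42, 0.55, 0.61, 0.68, 0.74, 0.80]

# A lower threshold is less demanding, so more attempts pass.
for threshold in (0.8, 0.7, 0.6):
    rate = pass_rate(attempts, threshold)
    print(f"threshold={threshold:.1f}  pass rate={rate:.2f}")
```

Lowering the threshold raises the pass rate, but it also lets weaker pronunciations through, which is the threshold-setting problem noted above. A graded message of the kind sketched in `graded_feedback` is one possible way to give lower-proficiency students more than a bare pass or fail.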
How to Use the System in English
 Diagnosis: A screening device
 One possible use of this ASR system is to assess
the pronunciation abilities of large numbers of
students. The system might help to quickly identify
students who are relatively weak in pronunciation,
so that teachers and tutors can provide extra help.
 Motivating learners to listen and speak: a tool for
oral practice (with animation and other
multimedia support). Students can practice with any
kind of sentences and phrases (not confined to
fixed patterns).
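The screening idea above could be prototyped along the following lines. This is only a sketch under stated assumptions: it supposes that the site's learning records can be exported as (student ID, passed) pairs, one per attempt. The record layout, the `flag_weak_students` helper, the cutoff values, and the sample data are hypothetical and are not taken from the NTNU site.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed record layout: (student_id, passed), one entry per attempt.
Attempt = Tuple[str, bool]


def pass_rates(attempts: Iterable[Attempt]) -> Dict[str, float]:
    """Compute each student's share of passed attempts."""
    passed: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for student_id, ok in attempts:
        total[student_id] += 1
        if ok:
            passed[student_id] += 1
    return {s: passed[s] / total[s] for s in total}


def flag_weak_students(
    attempts: Iterable[Attempt], cutoff: float = 0.5, min_attempts: int = 5
) -> List[str]:
    """Return students whose pass rate is below the cutoff, lowest rate first.

    Students with fewer than min_attempts attempts are skipped, because a
    handful of tries is too little evidence for a screening decision.
    """
    records = list(attempts)
    counts: Dict[str, int] = defaultdict(int)
    for student_id, _ in records:
        counts[student_id] += 1
    rates = pass_rates(records)
    flagged = [
        s for s, rate in rates.items()
        if counts[s] >= min_attempts and rate < cutoff
    ]
    return sorted(flagged, key=lambda s: (rates[s], s))


# Made-up records for three students (illustrative only).
records = (
    [("s01", True)] * 4 + [("s01", False)]
    + [("s02", False)] * 4 + [("s02", True)]
    + [("s03", False)] * 2
)

print(flag_weak_students(records))
```

A teacher could then review the flagged students individually. The cutoff and the minimum number of attempts would need to be tuned against human judgments before such a list is relied on.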
Thank you!
Questions and Feedback Welcome!