Telecommunications Relay Services in Speech-to-Speech translation system

ITU-T Workshop on
“Telecommunications relay services for persons with disabilities ”
(Geneva, 25 November 2011)
Telecommunications Relay Services
in Speech-to-Speech translation system
in accordance with Recommendations F.745 and H.625
Chiori Hori Ph.D.
Spoken Language Communication Laboratory
National Institute of Information and Communications Technology
(NICT)
2016/5/30
MASTAR Project, Universal Communication Research Institute
E-mail: info@mastar.jp
1
Network-based S2ST systems
Speech-to-Speech Translation
Speech-to-Speech Translation (S2ST) technologies
are an effective means to break through language
barriers between people who do not speak the
same language.
Communication across more languages can be
realized with S2ST technology by connecting
distributed S2ST servers (i.e., ASR, MT, and TTS) located all over the
world.
[Figure: Network-based S2ST between a speaker of Language A and a speaker of Language B (communication between users who speak different languages). Each speaker's MC client digitizes the speech signal and sends it over the network to MC servers and to ASR, MT, and TTS servers.]
Example (Japanese to English):
Automatic Speech Recognition (ASR): converts phonemes to words, "w a t a sh i w a g a xtu k o o n i….." to 「私は学校に行く」
Machine Translation (MT): converts Japanese text to English text, 「私は 学校に行く」 to "I go to school"
Speech Synthesis (TTS): converts text to a waveform, English "I go to school"
Large amount of training data for machine learning: Japanese speech and text corpora, Japanese-to-English parallel corpora, English speech corpora
Server roles, A to B: ASR server (speech signal to text in Language A), MT server (text in A to text in B), TTS server (text in B to speech signal)
Server roles, B to A: ASR server (speech signal to text in Language B), MT server (text in B to text in A), TTS server (text in A to speech signal)
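The ASR, MT, and TTS chain described on this slide can be sketched as three functions composed in sequence. This is only a minimal illustration under stated assumptions: the function names (`toy_asr`, `toy_mt`, `toy_tts`, `translate_speech`) are hypothetical, the "speech signal" is simulated as UTF-8 bytes of the text, and the lookup table stands in for statistical models trained on the corpora listed above. It is not the MASTAR implementation.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for a parallel corpus used by an MT server.
TOY_PARALLEL_CORPUS: Dict[Tuple[str, str], Dict[str, str]] = {
    ("ja", "en"): {"私は学校に行く": "I go to school"},
    ("en", "ja"): {"I go to school": "私は学校に行く"},
}


def toy_asr(speech: bytes, lang: str) -> str:
    """Stand-in for an ASR server: speech signal to text in the source language.

    Assumption: the 'speech signal' is simply the UTF-8 bytes of the text.
    """
    return speech.decode("utf-8")


def toy_mt(text: str, src: str, tgt: str) -> str:
    """Stand-in for an MT server: text in the source language to text in the target."""
    return TOY_PARALLEL_CORPUS[(src, tgt)][text]


def toy_tts(text: str, lang: str) -> bytes:
    """Stand-in for a TTS server: text to a 'waveform' (again UTF-8 bytes)."""
    return text.encode("utf-8")


def translate_speech(speech: bytes, src: str, tgt: str) -> bytes:
    """Chain the three stages in the order shown on the slide: ASR, MT, TTS."""
    text_src = toy_asr(speech, src)
    text_tgt = toy_mt(text_src, src, tgt)
    return toy_tts(text_tgt, tgt)


if __name__ == "__main__":
    out = translate_speech("私は学校に行く".encode("utf-8"), "ja", "en")
    print(out.decode("utf-8"))
```

In a networked setting each of the three functions would be a call to a remote server, which is why the stages are kept separate instead of being merged into one translation routine.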
Network-based Speech Translation System
in accordance with ITU-T Recommendations F.745 and H.625
Network-based S2ST application via multilateral translation on smartphone/tablet/PC/TV
Remote communication:
Japanese: お父さん,お母さん お元気ですか? (Dad, Mom, how are you?)
French: Papa, maman, comment vas-tu?
On-site communication:
Japanese speaker's device: 飲み水は13:00から市役所前で配給します.
English speaker's device: Water to drink will be provided in front of the city hall from 13:00.
Chinese speaker's device: 从下午一点开始,在市政府门前供应饮用水。
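Multilateral translation, as in the on-site example on this slide, amounts to translating one utterance once per listener language and delivering each result to the matching device. The sketch below illustrates that fan-out under stated assumptions: the names `TOY_TRANSLATIONS`, `translate`, and `broadcast` and the device identifiers are hypothetical, and the lookup table replaces real MT servers.

```python
from typing import Dict, Tuple

JA_MESSAGE = "飲み水は13:00から市役所前で配給します."

# Hypothetical lookup table standing in for MT servers; the sentences are
# the example messages shown on the slide.
TOY_TRANSLATIONS: Dict[Tuple[str, str], Dict[str, str]] = {
    ("ja", "en"): {
        JA_MESSAGE: "Water to drink will be provided in front of the city hall from 13:00."
    },
    ("ja", "zh"): {
        JA_MESSAGE: "从下午一点开始,在市政府门前供应饮用水。"
    },
}


def translate(text: str, src: str, tgt: str) -> str:
    """Return the text unchanged for same-language listeners, else look it up."""
    if src == tgt:
        return text
    return TOY_TRANSLATIONS[(src, tgt)][text]


def broadcast(text: str, src: str, device_langs: Dict[str, str]) -> Dict[str, str]:
    """Translate one utterance for every device, keyed by device name."""
    return {
        device: translate(text, src, lang)
        for device, lang in device_langs.items()
    }


if __name__ == "__main__":
    devices = {"japanese_device": "ja", "english_device": "en", "chinese_device": "zh"}
    for device, message in broadcast(JA_MESSAGE, "ja", devices).items():
        print(device, message)
```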
Modality Conversion Markup Language (MCML)
XML schema in the ITU-T namespace (http://www.itu.int/xml-namespace/itu-t/H.645/MCML.xsd)
MCML carries the information needed for communication between multiple people who use different
modalities, e.g., speech, text, image, or video data input by users or output by MCML servers
such as ASR, MT, TTS, and sign recognition systems.
http://www.itu.int/rec/T-REC-F.745-201010-I/en
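To make the idea of an XML message that tags data with its modality and language more concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`ExampleMessage`, `Request`, `Input`, `service`, `modality`, `language`) are hypothetical and chosen for illustration only; they are not taken from the MCML schema, which is defined in the ITU-T Recommendations.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def build_request(service: str, modality: str, language: str, payload: str) -> str:
    """Serialize a request for one server (e.g. ASR, MT, TTS) as XML text.

    All element and attribute names here are hypothetical, not MCML's.
    """
    root = ET.Element("ExampleMessage")
    request = ET.SubElement(root, "Request", {"service": service})
    data = ET.SubElement(request, "Input", {"modality": modality, "language": language})
    data.text = payload
    return ET.tostring(root, encoding="unicode")


def read_request(xml_text: str) -> Dict[str, Optional[str]]:
    """Parse the XML text back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    request = root.find("Request")
    data = request.find("Input")
    return {
        "service": request.get("service"),
        "modality": data.get("modality"),
        "language": data.get("language"),
        "payload": data.text,
    }


if __name__ == "__main__":
    message = build_request("MT", "text", "ja", "私は学校に行く")
    print(message)
    print(read_request(message))
```

Carrying the modality and language alongside the payload is what would let one message format serve speech, text, image, and video data alike.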
U-STAR Consortium
The Universal Speech Translation Advanced Research (U-STAR) Consortium has been
established as an international research collaboration with the goal of developing a
worldwide network-based speech-to-speech translation system. The consortium's
objective is to create a basic infrastructure for spoken language communication that
overcomes the language barriers that exist around the world. Currently, the members
are 15 institutes from 14 countries.
Plan for field experiment
Period:
One year from April 2012, including the
2012 London Olympics
Application:
Multiparty conversation via a network-based S2ST
system on iPhones and Android phones (free)
MCML servers:
ASR, MT, and TTS servers will be provided by U-STAR
members
Potential languages:
Chinese, Dzongkha, English, Filipino, Hindi,
Indonesian, Japanese, Korean, Mongolian, Malay,
Nepali, Sinhala, Thai, Urdu, Vietnamese, and some
European languages
The U-STAR members
Institute   Country       Language
DITT        Bhutan        Dzongkha
UPD         Philippines   Filipino
CDAC        India         Hindi
BPPT        Indonesia     Indonesian
NICT        Japan         Japanese
ETRI        Korea         Korean
MUST        Mongolia      Mongolian
NUM         Mongolia      Mongolian
I2R         Singapore     Malay
LTK         Nepal         Nepali
UCSC        Sri Lanka     Sinhala
NECTEC      Thailand      Thai
KICS-UET    Pakistan      Urdu
IOIT        Vietnam       Vietnamese
CASIA       China         Chinese
Potential European languages
French, German, Italian, Portuguese, Spanish, Turkish, British English