Thai Language Processing and Its Summarization Applications

The Fourth International Conference on Digital Information and Communication Technology
and its Applications (DICTAP2014)
May 6-8, 2014, University of the Thai Chamber of Commerce, Bangkok, Thailand
Thanaruk Theeramunkong
Sirindhorn International Institute of Technology, Thammasat University
131 Moo 5 Tiwanont Rd. Bangkadi Muang Pathumthani 12000
thanaruk@siit.tu.ac.th
This presentation describes a series of works on Thai language processing and its applications to
summarization. The four areas explored are Thai word segmentation, Thai named entity (NE)
corpus construction and NE recognition, relation discovery in Thai news texts, and Thai text
summarization. As in several other Asian languages, including Chinese, Japanese and Korean,
written Thai has no explicit word boundaries. Furthermore, Thai text has no explicit phrase or
sentence boundaries, and its phrase and sentence structure is flexible. These characteristics raise
several issues in processing running Thai text, particularly in word segmentation and named
entity recognition, which are fundamental tasks for analyzing or understanding Thai texts.
Toward Thai text summarization for news collections, techniques for extracting relations among
Thai news texts and summarizing them are explored and reported.
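To illustrate why the absence of word boundaries makes segmentation hard, here is a minimal sketch of greedy longest-match (dictionary-based) segmentation. It is a textbook baseline and is not claimed to be the method used in the works presented. The function name `segment_longest_match` and the four-word `TOY_LEXICON` are assumptions made only for this illustration.

```python
def segment_longest_match(text, lexicon):
    """Greedy left-to-right longest-match segmentation (illustrative baseline)."""
    # Longest lexicon entry bounds how far ahead we need to look.
    max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink one character at a time.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            # Unknown character: emit it as a single-character token.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy lexicon: "ตา" (eye), "กลม" (round), "ตาก" (to expose), "ลม" (wind).
TOY_LEXICON = {"ตา", "กลม", "ตาก", "ลม"}

# "ตากลม" can be read as "ตา|กลม" (round eyes) or "ตาก|ลม" (exposed to wind).
print(segment_longest_match("ตากลม", TOY_LEXICON))
```

With this toy lexicon the greedy strategy commits to "ตาก" followed by "ลม", even though "ตา" followed by "กลม" is an equally valid split. Choosing between such readings requires context, which is one reason word segmentation is treated as a research problem in its own right.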
BIOGRAPHY
Thanaruk Theeramunkong is currently a professor and the Head of the School of Information, Computer
and Communication Technology at Sirindhorn International Institute of Technology (SIIT), Thammasat
University, Bangkok, Thailand. He is also the Program Director of the Information and Communication
Technology for Embedded Systems (ICTES) program for TAIST Tokyo Tech, National Science and Technology
Development Agency (NSTDA). He received his bachelor's degree in Electrical and Electronics Engineering
and his master's and doctoral degrees in Computer Science from Tokyo Institute of Technology in 1990,
1992 and 1995, respectively. He was a research associate at Japan Advanced Institute of Science and
Technology from 1996 to 1998 and an MIS manager at CP company in Thailand in 1999 before joining SIIT,
Thammasat University. He has received several awards, including the Very Good Research Award in the
engineering field from Thammasat University in 2008, 2009 and 2010. He has also received several best
paper awards from conferences and societies, including the Japanese Society for
Artificial Intelligence, PAKDD, and KICSS. His research interests are natural language processing, data mining,
text mining, machine learning and applications to service science. He is also a member of the Steering
Committee of the Pacific-Asia Conferences on Knowledge Discovery and Data Mining (PAKDD). He is an
associate editor of the Institute of Electronics, Information and Communication Engineers (IEICE). He is the
author of more than 40 papers in a number of journals with impact factors and more than 100 conference papers.