Ann Clifton - Computing Science

Ann Clifton
Ph.D. candidate
Natural Language Lab
School of Computing Science
Simon Fraser University
ann clifton@sfu.ca
www.cs.sfu.ca/~aca69/personal
Phone: +1-778-782-3208
Education
•
Ph.D., Computing Science, expected completion Winter 2014
Simon Fraser University, CGPA: 4.33 (4.33 maximum)
•
M.Sc., Computing Science, 2010
Simon Fraser University
Thesis: Unsupervised Morphological Segmentation for Statistical Machine Translation.
•
Post-Baccalaureate, Computer Science, 2008
Portland State University
•
B.A., Linguistics, 2001
Reed College
Thesis: Nominal Constituents and the Obviation Hierarchy in Tenejapa Tseltal.
Skills
•
Methods and techniques: applying Machine Learning methods to Natural Language
Processing problems; convex optimization, Lagrangian relaxation, approximation
methods for non-convex optimization, discriminative modeling, latent structured
learning, online algorithms
•
Programming: Python, Perl, Unix scripting, C++, C, Java
•
Technical writing and presentation
•
Tools: MATLAB, Vowpal Wabbit, NLTK, Moses, SRILM
Research Experience
Research Assistant, Simon Fraser University (September 2008 - present)
•
Conducted research on a variety of NLP projects under the supervision of Anoop
Sarkar
Research Team Member, Center for Speech and Language Processing Summer Workshop
2012, Johns Hopkins University
•
Collaborated on research projects as a member of the Domain Adaptation in Statistical Machine Translation (SMT) research team, under the supervision of Hal
Daumé III (University of Maryland), Alexander Fraser (University of Stuttgart),
Marine Carpuat (National Research Council Canada), and Chris Quirk (Microsoft Research)
In this collaboration, I worked on domain adaptation for SMT, focusing on Machine Learning techniques to use phrase-sense disambiguation classifiers to inform
an adaptation model.
•
Collaborated on research projects under the supervision of Chris Quirk (Microsoft
Research)
In this collaboration, I leveraged topic modeling for SMT, using unsupervised topic
modeling techniques to learn lexical weighting models for dynamic domain adaptation.
Publications
•
An Online Algorithm for Learning over Constrained Latent Representations using Multiple Views. Ann Clifton, Max Whitney, and Anoop Sarkar. In
Proceedings of the 6th International Joint Conference on Natural Language Processing, Nagoya, Japan, October 14-18, 2013.
•
Domain Adaptation in Machine Translation: Final Report. Marine Carpuat,
Hal Daumé III, Alexander Fraser, Chris Quirk, Fabienne Braune, Ann Clifton, Ann
Irvine, Jagadeesh Jagarlamudi, John Morgan, Majid Razmara, Aleš Tamchyna, Katharine
Henry and Rachel Rudinger. Technical report.
•
Kriya - The SFU System for Translation Task at WMT-12. Majid Razmara,
Baskaran Sankaran, Ann Clifton and Anoop Sarkar. In Proceedings of the 7th
Workshop on Statistical Machine Translation, Montreal, Canada, June 7-8, 2012.
•
Making the Most of a Distributed Perceptron for NLP. Max Whitney, Ann
Clifton, Anoop Sarkar and Alexandra Fedorova. The Pacific Northwest Regional
NLP Workshop. May 11, 2012.
•
Combining Morpheme-based Machine Translation with Post-processing
Morpheme Prediction. Ann Clifton and Anoop Sarkar. In Proceedings of the
49th Annual Meeting of the Association for Computational Linguistics: Human
Language Technologies. Portland, OR, USA. June 19-24, 2011.
•
Unsupervised Morphological Segmentation for Statistical Machine Translation. Ann Clifton. M.Sc. Thesis, Simon Fraser University, July, 2010.
•
Morphology Generation for Statistical Machine Translation Using Conditional Random Fields. Ann Clifton and Anoop Sarkar. The Pacific Northwest
Regional NLP Workshop. April 23, 2010.
Other Reports and Presentations
•
Latent Variable Discriminative Training for Joint Word Alignment and
Segmentation. Ann Clifton, Max Whitney, and Anoop Sarkar. Women in Machine
Learning 2012.
•
Document Level Information in MT. Ann Clifton, Chris Quirk, and Hal Daumé
III. Closing presentation for the Center for Speech and Language Processing Summer
Workshop 2012.
•
Multilingual Statistical Machine Translation. Ann Clifton. Ph.D. Depth
Report, Simon Fraser University, April 2012.
Selected Projects
Morphology for SMT
•
Proposed novel techniques for Statistical Machine Translation for morphologically
rich languages
•
Used Conditional Random Fields for morphology prediction, as well as unsupervised
morphological segmentation for SMT
•
Achieved new state of the art scores for an English-Finnish translation task
Discriminative Language Modeling
•
Implemented an online large-margin learner for language modeling using multiple
latent structures
•
Developed a classifier that distinguishes between human-generated and high-quality
synthetic language data
•
Showed that this model is faster and more accurate than a similar batch algorithm
Computational Topologies for the Distributed Perceptron
•
Implemented and compared computational topologies for parallelizing the averaged
perceptron
•
Examined the theoretical properties as well as the accuracy and efficiency
•
Found that in addition to the greater accuracy of the distributed version, the choice
of topology can lead to significantly improved run-time efficiency
Discriminative Alignment with Latent Variables
•
Implemented a discriminative alignment model that jointly learns the word segmentation with the word alignment of parallel data
•
Showed that the segmentation that maximizes the accuracy of the alignment model
does not necessarily correspond to a linguistically-motivated segmentation of natural
language data
Lexical Weighting using Generative and Discriminative Topic Models for
SMT
•
Learned topic models to inform static and dynamic domain adaptation for SMT
•
Showed that topic-aware lexical weighting can improve the log likelihood of out-of-domain data without the benefit of in-domain parallel data
•
Worked on the development of a novel discriminative hierarchical bilingual topic
model
Grants Awarded
•
Simon Fraser University President’s PhD Scholarship, $6250, Simon Fraser University, Summer 2014
•
Faculty of Applied Sciences Graduate Fellowship, $3125, Simon Fraser University,
Spring 2014
•
Grace Hopper Celebration Scholarship, Fall 2013
•
Simon Fraser University Graduate Fellowship, $6250, Simon Fraser University, Summer 2013
•
Simon Fraser University Travel and Minor Research Award, Simon Fraser University,
Spring 2013
•
Simon Fraser University Graduate Fellowship, $6250, Simon Fraser University, Spring
2012
•
Faculty of Applied Sciences Graduate Fellowship, $3125, Simon Fraser University,
Fall 2011
•
Faculty of Applied Sciences Graduate Fellowship, $3125, Simon Fraser University,
Fall 2009
•
National Science Foundation, KDI program, “Cross-Modal Analysis of Signal and
Sense: Multimedia Corpora and Tools for Gesture, Speech, and Gaze Research,”
subcontract to Reed College from Wright State University, June-August 2000
Other Related Activities
Co-chair, The Pacific Northwest Regional NLP Workshop, April 2014.
Contributor, Dagstuhl Seminar: Statistical Techniques for Translating to Morphologically Rich Languages, February 2014.
Reviewer, Association for Computational Linguistics 2014; Phonology, Morphology and
Word Segmentation track.
Reviewer, European Chapter of the Association for Computational Linguistics 2012;
Phonology, Morphology, Tagging, Chunking, and Segmentation track.
Teaching Assistant, Simon Fraser University, September 2008 - present. Courses:
Computational Linguistics, Intro Computing Science and Programming, Software and
Programming, Social Implications for Computing Science, Technical Writing for Computing Science.
Volunteer, Let’s Talk Science, August 2012 - present. I lead a team of Computer Science
graduate students who visit local schools to speak with students about our research and run
collaborative activities that inspire interest in and enthusiasm for Computer
Science.
May 6, 2014