Ann Clifton
Ph.D. candidate
Natural Language Lab
School of Computing Science
Simon Fraser University
ann clifton@sfu.ca
www.cs.sfu.ca/~aca69/personal
Phone: +1-778-782-3208

Education
• Ph.D., Computing Science, expected completion Winter 2014
  Simon Fraser University, CGPA: 4.33 (4.33 maximum)
• M.Sc., Computing Science, 2010
  Simon Fraser University
  Thesis: Unsupervised Morphological Segmentation for Statistical Machine Translation.
• Post-Baccalaureate, Computer Science, 2008
  Portland State University
• B.A., Linguistics, 2001
  Reed College
  Thesis: Nominal Constituents and the Obviation Hierarchy in Tenejapa Tseltal.

Skills
• Methods and techniques: applying Machine Learning methods to Natural Language Processing problems; convex optimization, Lagrangian relaxation, approximation methods for non-convex optimization, discriminative modeling, latent structured learning, online algorithms
• Programming: Python, Perl, Unix scripting, C++, C, Java
• Technical writing and presentation
• Tools: MATLAB, Vowpal Wabbit, NLTK, Moses, SRILM

Research Experience
Research Assistant, Simon Fraser University (September 2008 - present)
• Conducted research on a variety of NLP projects under the supervision of Anoop Sarkar
Research Team Member, Center for Speech and Language Processing Summer Workshop 2012, Johns Hopkins University
• Collaborated on research projects as a member of the Domain Adaptation in Statistical Machine Translation (SMT) research team, under the supervision of Hal Daumé III (University of Maryland), Alexander Fraser (University of Stuttgart), Marine Carpuat (National Research Council Canada), and Chris Quirk (Microsoft Research). In this collaboration, I worked on domain adaptation for SMT, focusing on Machine Learning techniques that use phrase-sense disambiguation classifiers to inform an adaptation model.
• Collaborated on research projects under the supervision of Chris Quirk (Microsoft Research). In this collaboration, I leveraged topic modeling for SMT, using unsupervised topic modeling techniques to learn lexical weighting models for dynamic domain adaptation.

Publications
• An Online Algorithm for Learning over Constrained Latent Representations using Multiple Views. Ann Clifton, Max Whitney, and Anoop Sarkar. In Proceedings of the 6th International Joint Conference on Natural Language Processing, Nagoya, Japan, October 14-18, 2013.
• Domain Adaptation in Machine Translation: Final Report. Marine Carpuat, Hal Daumé III, Alexander Fraser, Chris Quirk, Fabienne Braune, Ann Clifton, Ann Irvine, Jagadeesh Jagarlamudi, John Morgan, Majid Razmara, Aleš Tamchyna, Katharine Henry and Rachel Rudinger. Technical report.
• Kriya - The SFU System for Translation Task at WMT-12. Majid Razmara, Baskaran Sankaran, Ann Clifton and Anoop Sarkar. In Proceedings of the 7th Workshop on Statistical Machine Translation, Montreal, Canada, June 7-8, 2012.
• Making the Most of a Distributed Perceptron for NLP. Max Whitney, Ann Clifton, Anoop Sarkar and Alexandra Fedorova. The Pacific Northwest Regional NLP Workshop. May 11, 2012.
• Combining Morpheme-based Machine Translation with Post-processing Morpheme Prediction. Ann Clifton and Anoop Sarkar. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, OR, USA. June 19-24, 2011.
• Unsupervised Morphological Segmentation for Statistical Machine Translation. Ann Clifton. M.Sc. Thesis, Simon Fraser University, July 2010.
• Morphology Generation for Statistical Machine Translation Using Conditional Random Fields. Ann Clifton and Anoop Sarkar. The Pacific Northwest Regional NLP Workshop. April 23, 2010.

Other Reports and Presentations
• Latent Variable Discriminative Training for Joint Word Alignment and Segmentation. Ann Clifton, Max Whitney, and Anoop Sarkar.
Women in Machine Learning 2012.
• Document Level Information in MT. Ann Clifton, Chris Quirk, and Hal Daumé III. Closing presentation for the Center for Speech and Language Processing Summer Workshop 2012.
• Multilingual Statistical Machine Translation. Ann Clifton. Ph.D. Depth Report, Simon Fraser University, April 2012.

Selected Projects
Morphology for SMT
• Proposed novel techniques for Statistical Machine Translation for morphologically rich languages
• Used Conditional Random Fields for morphology prediction, as well as unsupervised morphological segmentation for SMT
• Achieved new state-of-the-art scores for an English-Finnish translation task
Discriminative Language Modeling
• Implemented an online large-margin learner for language modeling using multiple latent structures
• Developed a classifier that distinguishes between human-generated and high-quality synthetic language data
• Showed that this model is faster and more accurate than a similar batch algorithm
Computational Topologies for the Distributed Perceptron
• Implemented and compared computational topologies for parallelizing the averaged perceptron
• Examined the theoretical properties as well as the accuracy and efficiency
• Found that, in addition to the greater accuracy of the distributed version, the choice of topology can lead to significantly improved run-time efficiency
Discriminative Alignment with Latent Variables
• Implemented a discriminative alignment model that jointly learns the word segmentation with the word alignment of parallel data
• Showed that the segmentation that maximizes the accuracy of the alignment model does not necessarily correspond to a linguistically-motivated segmentation of natural language data
Lexical Weighting using Generative and Discriminative Topic Models for SMT
• Learned topic models to inform static and dynamic domain adaptation for SMT
• Showed that topic-aware lexical weighting can improve the log likelihood of out-of-domain data without the benefit of
in-domain parallel data
• Worked on the development of a novel discriminative hierarchical bilingual topic model

Grants Awarded
• Simon Fraser University President’s PhD Scholarship, $6250, Simon Fraser University, Summer 2014
• Faculty of Applied Sciences Graduate Fellowship, $3125, Simon Fraser University, Spring 2014
• Grace Hopper Celebration Scholarship, Fall 2013
• Simon Fraser University Graduate Fellowship, $6250, Simon Fraser University, Summer 2013
• Simon Fraser University Travel and Minor Research Award, Simon Fraser University, Spring 2013
• Simon Fraser University Graduate Fellowship, $6250, Simon Fraser University, Spring 2012
• Faculty of Applied Sciences Graduate Fellowship, $3125, Simon Fraser University, Fall 2011
• Faculty of Applied Sciences Graduate Fellowship, $3125, Simon Fraser University, Fall 2009
• National Science Foundation, KDI program, “Cross-Modal Analysis of Signal and Sense: Multimedia Corpora and Tools for Gesture, Speech, and Gaze Research,” subcontract to Reed College from Wright State University, June-August 2000

Other Related Activities
Co-chair, The Pacific Northwest Regional NLP Workshop, April 2014.
Contributor, Dagstuhl Seminar: Statistical Techniques for Translating to Morphologically Rich Languages, February 2014.
Reviewer, Association for Computational Linguistics 2014; Phonology, Morphology and Word Segmentation track.
Reviewer, European Chapter of the Association for Computational Linguistics 2012; Phonology, Morphology, Tagging, Chunking, and Segmentation track.
Teaching Assistant, Simon Fraser University, September 2008 - present. Courses: Computational Linguistics, Intro Computing Science and Programming, Software and Programming, Social Implications for Computing Science, Technical Writing for Computing Science.
Volunteer, Let’s Talk Science, August 2012 - present.
I lead a team of Computer Science grad students who visit local schools to speak to students about our research and devise collaborative activities with the students to inspire interest and enthusiasm for Computer Science.

May 6, 2014