Ling 354 Language and Computers
Spring 2016
MWF 11:00am–11:50am
Room HT-022
Schedule # 21912

This course offers an introduction and overview of natural language processing and computational linguistics. Topics to be covered include speech recognition and generation, spelling and grammar checkers, information retrieval and search engines, conversational agents, and machine translation.

After successful completion of this course, students will be able to:
• Recognize the use of Natural Language Processing technology in everyday applications
• Describe finite state machines and noisy channel models, the two primary techniques for building Natural Language Processing systems
• Formulate solutions to new Natural Language Processing problems using these techniques
• Discuss the technical and social challenges that limit the development of Natural Language Processing applications

Courses that fulfill the 9-unit requirement for Explorations in General Education take the goals and skills of GE foundations courses to a more advanced level. Your three upper division courses in Explorations will provide greater interdisciplinarity, more complex and in-depth theory, deeper investigation of local problems, and wider awareness of global challenges. More extensive reading, written analysis involving complex comparisons, well-developed arguments, considerable bibliography, and use of technology are appropriate in many Explorations courses. This is an Explorations course in Social and Behavioral Sciences.
Completing this course will help you learn to do the following with greater depth: explore and recognize basic terms, concepts, and domains of the social and behavioral sciences; comprehend diverse theories and methods of the social and behavioral sciences; identify human behavioral patterns across space and time and discuss their inter-relatedness and distinctiveness; and enhance your understanding of the social world through the application of conceptual frameworks from the social and behavioral sciences to first-hand engagement with contemporary issues.

Instructor
Rob Malouf
Office: SHW-244
Office hours: Mon 10:00–11:00, Wed 1:00–2:00, or by appointment
Email/GTalk: rmalouf@mail.sdsu.edu
AIM: maloufsdsu
Phone: (619) 594-7111

Requirements
The final grade will be based on problem sets (10%), four quizzes (15% each), and a final exam (30%). The problem sets are a small part of the grade but will be excellent practice for the quizzes and the final exam. Late homework will be accepted (with a grade penalty) for up to one week after the deadline. Quizzes will be announced in advance. If you can’t make it to a quiz or exam, let me know beforehand! There will be no make-ups without prior arrangements.

No form of academic dishonesty, including cheating or plagiarism, will be tolerated in the class. Following Executive Order 1006, all instances of academic dishonesty will be reported to the Center for Student Rights and Responsibilities for investigation. For more information about the judicial process, see http://csrr.sdsu.edu. For more information about what plagiarism is and how to avoid it, see http://its.sdsu.edu/tech/plagiarism.html.

If you are a student with a disability and believe you will need accommodations for this class, it is your responsibility to contact Student Disability Services at (619) 594-6473. To avoid any delay in the receipt of your accommodations, you should contact Student Disability Services as soon as possible.
Please note that accommodations are not retroactive, and that accommodations based upon disability cannot be provided until you have presented your instructor with an accommodation letter from Student Disability Services. Your cooperation is appreciated.

All course information, readings, assignments, slides, etc. will be available on the course website: http://blackboard.sdsu.edu
All students will need an active Blackboard account, so make sure you can log in!

Readings
The required textbooks for this course are:
• Department of Linguistics. 2011. Language Files. 11th edition. Ohio State University Press. http://www.ling.ohio-state.edu/publications/files/
• Markus Dickinson, Chris Brew, and Detmar Meurers. 2012. Language and Computers. Wiley-Blackwell.
They are for sale in the campus bookstore and at Amazon, etc. Additional readings will be made available in class or via the course web page.

Proposed schedule
• Week 1 Introduction
  Background ⋅ What is computational linguistics? ⋅ What’s it good for? ⋅ Linguistics
• Weeks 2–5 Processing words
  Regular expressions ⋅ Finite state machines ⋅ Word structure ⋅ Computational morphology
• Weeks 6–7 Processing sounds
  Sound patterns ⋅ Phonological and orthographic rules ⋅ Speech synthesis
• Weeks 8–10 Statistical NLP
  Noisy channel models ⋅ Spelling correction ⋅ Machine translation
• Weeks 11–12 Information extraction
  Parts of speech ⋅ Taggers ⋅ Grammars (context-free and beyond)
• Weeks 13–14 Text mining
  Document classification ⋅ Spam detection ⋅ Data mining ⋅ Question answering
• Week 15 Review