Facultatea de Științe Economice și Gestiunea Afacerilor
Str. Teodor Mihali nr. 58-60
Cluj-Napoca, RO-400951
Tel.: 0264-41.86.52-5
Fax: 0264-41.25.70
econ@econ.ubbcluj.ro
www.econ.ubbcluj.ro

DETAILED SYLLABUS
Natural Language Processing

1. Information about the study program
1.1 University: Babeș-Bolyai University
1.2 Faculty: Faculty of Economics and Business Administration
1.3 Department: Business Information Systems
1.4 Field of study: Business Information Systems
1.5 Program level (bachelor or master): Master
1.6 Study program / Qualification: Business Modeling and Distributed Computing

2. Information about the subject
2.1 Subject title: Natural Language Processing
2.2 Course activities professor: Assoc. Prof. Liana Stanca
2.3 Seminar activities professor: Assoc. Prof. Liana Stanca
2.4 Year of study: II
2.5 Semester: I
2.6 Type of assessment: ES (i.e., summative examination)
2.7 Subject regime: elective

3. Total estimated time (teaching hours per semester)
3.1 Number of hours per week: 4, out of which: 3.2 course: 2; 3.3 seminar/laboratory: 2
3.4 Total number of hours in the curriculum: 56, out of which: 3.5 course: 28; 3.6 seminar/laboratory: 28
Time distribution (hours):
- Study based on textbook, course support, references and notes: 38
- Additional documentation in the library, through specialized databases and field activities: 24
- Preparing seminars/laboratories, essays, portfolios and reports: 45
- Tutoring: 8
- Assessment (examinations): 4
- Other activities: -
3.7 Total hours for individual study: 119
3.8 Total hours per semester: 175
3.9 Number of credits: 7

4. Preconditions (if necessary)
4.1 Curriculum: Methods in data science. Descriptive statistics.
4.2 Skills: Basic programming skills

5. Conditions (if necessary)
5.1 For course development: The courses should be held in a room with simultaneous access to a computer-projector and a board.
5.2 For seminar / laboratory development: The seminars should be held in a room with simultaneous access to a computer-projector and a board.
In addition, the students need access to computers.

NOTE: This document represents an informal translation performed by the faculty.

6. Acquired specific competences
Professional competences:
- The ability to process a text corpus with natural language processing techniques and tools, at both the syntactic and the semantic level
- The ability to derive models for natural language
Transversal competences:
- Acquiring a set of scientific research skills allowing further professional development at doctoral level
- Systematic and advanced knowledge of quantitative and qualitative modeling methods and their application to solving complex research problems

7. Subject objectives (arising from the acquired specific competences)
7.1 Subject’s general objective: This course covers the linguistic and algorithmic foundations of natural language processing. The course uses corpus data to illustrate concepts such as language modelling, part-of-speech tagging, syntactic processing and semantic processing. The course is based on the Romanian language experience.
7.2 Specific objectives: The students should understand:
- syntactic and semantic processing of written text
- the NLP process towards speech recognition or text synthesis
- techniques for authorship and topic identification

8. Contents
8.1 Course
- Creating and annotating language corpora: markup, annotation, evaluation measures, web tools.
  Teaching methods: The professor gives a talk and encourages discussions on the themes.
  Observations: 2 courses
- Language modeling. Hidden Markov models.
  Teaching methods: The professor gives a talk and encourages discussions on the themes.
  Observations: 2 courses
- Part-of-speech tagging. Viterbi algorithm, smoothing.
  Teaching methods: The professor gives a talk and encourages discussions on the themes.
  Observations: 2 courses
- Syntax processing: context-free grammars, chart parsing, constituency, subcategorization, dependencies, feature representation, lexicalized grammar formalism.
  Teaching methods: The professor gives a talk and encourages discussions on the themes.
  Observations: 3 courses
- Semantic processing: compositionality, argument structure, word sense disambiguation, anaphora resolution.
  Teaching methods: The professor gives a talk and encourages discussions on the themes.
  Observations: 3 courses
- Information retrieval and NLP.
  Teaching methods: The professor gives a talk and encourages discussions on the themes.
  Observations: 2 courses

References for 8.1 Course:
1. D. Jurafsky, J. Martin, Speech and Language Processing, Second edition, Prentice Hall, 2009
2. X. Huang, A. Acero, H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall, 2001

8.2 Seminar/laboratory
- Corpora and language models. Python modeling.
  Teaching methods: Practical laboratory
  Observations: 3 laboratories
- Smoothing and authorship identification. Python modeling.
  Teaching methods: Practical laboratory
  Observations: 2 laboratories
- HMM construction and use.
  Teaching methods: Practical laboratory
  Observations: 2 laboratories
- Parsing tools.
  Teaching methods: Practical laboratory
  Observations: 3 laboratories
- Collocation and mutual information.
  Teaching methods: Practical laboratory
  Observations: 2 laboratories
- Word sense disambiguation.
  Teaching methods: Practical laboratory
  Observations: 2 laboratories

References:
1. D. Jurafsky, J. Martin, Speech and Language Processing, Second edition, Prentice Hall, 2009
2. S. Bird, E. Klein, E. Loper, Natural Language Processing with Python, O’Reilly Media, 2009

9. Corroboration / validation of the subject’s content in relation to the expectations coming from representatives of the epistemic community, of the professional associations and of the representative employers in the program’s field.
There is accelerated growth in the research conducted at the intersection of computer science and linguistics. The Romanian language needs tools for speech recognition and synthesis. Such technologies are of great use in society, in all fields, including business (automatic response to phone calls), medicine, courtrooms, etc.

10. Assessment (examination)
10.4 Course
  10.1 Assessment criteria: The degree to which the students correctly acquired the concepts, notions and tools of natural language processing.
  10.2 Assessment methods: Written final exam.
  10.3 Weight in the final grade: 50%
10.5 Seminar/laboratory
  10.1 Assessment criteria: The degree to which the students correctly acquired the concepts, notions and tools of NLP. The ability of the students to use these concepts, notions and tools to solve practical problems, analyze real-life business and economics situations, etc. The capacity of the students to take economic/financial/business decisions based on the results of their analysis and by suitably applying the theories and algorithms they have studied.
  10.2 Assessment methods: The assessment of the homework projects. The assessment tries to measure the degree to which the students acquired the theory and the ability to apply it in practical examples and real-life situations. Completing the homework projects is a condition for receiving the final grade.
  10.3 Weight in the final grade: 50%
10.6 Minimum performance standard
• It is necessary to obtain a minimum final grade of 5 (five) in order to pass this subject;
• The grades being granted are between 1 (one) and 10 (ten);
• Students must address each item (question, problem) on the written exam sheet;
• The exam is written and takes approximately 120 minutes.

Date of filling: 26 January 2015
Signature of the course professor: Conf.dr. Liana Stanca
Signature of the seminar professor: Conf.dr. Liana Stanca
Date of approval by the department: 28 January 2015
Head of department’s signature: Prof.dr. Gheorghe Cosmin Silaghi