Natural Language Processing

Facultatea de Științe Economice și Gestiunea Afacerilor
Str. Teodor Mihali nr. 58-60
Cluj-Napoca, RO-400951
Tel.: 0264-41.86.52-5
Fax: 0264-41.25.70
econ@econ.ubbcluj.ro
www.econ.ubbcluj.ro
DETAILED SYLLABUS
Natural Language Processing
1. Information about the study program
1.1 University: Babeș-Bolyai University
1.2 Faculty: Faculty of Economics and Business Administration
1.3 Department: Business Information Systems
1.4 Field of study: Business Information Systems
1.5 Program level (bachelor or master): Master
1.6 Study program / Qualification: Business Modeling and Distributed Computing
2. Information about the subject
2.1 Subject title: Natural Language Processing
2.2 Course activities professor: Assoc. Prof. Liana Stanca
2.3 Seminar activities professor: Assoc. Prof. Liana Stanca
2.4 Year of study: II
2.5 Semester: I
2.6 Type of assessment: ES (i.e., summative examination)
2.7 Subject regime: elective
3. Total estimated time (teaching hours per semester)
3.1 Number of hours per week: 4, out of which: 3.2 course: 2; 3.3 seminar/laboratory: 2
3.4 Total number of hours in the curriculum: 56, out of which: 3.5 course: 28; 3.6 seminar/laboratory: 28
Time distribution (hours):
Study based on textbook, course support, references and notes: 38
Additional documentation in the library, through specialized databases and field activities: 24
Preparing seminars/laboratories, essays, portfolios and reports: 45
Tutoring: 8
Assessment (examinations): 4
Other activities: -
3.7 Total hours for individual study: 119
3.8 Total hours per semester: 175
3.9 Number of credits: 7
4. Preconditions (if necessary)
4.1 Curriculum: Methods in data science. Descriptive statistics.
4.2 Skills: Basic programming skills
5. Conditions (if necessary)
5.1 For course development: The courses should be held in a room with simultaneous access to a computer-projector and a board.
5.2 For seminar / laboratory development: The seminars should be held in a room with simultaneous access to a computer-projector and a board. The students also need access to computers.
NOTE: This document represents an informal translation performed by the faculty.
6. Acquired specific competences
Professional competences
• The ability to process a text corpus with natural language processing techniques and tools, at both the syntactic and the semantic level
• The ability to derive models for natural language
Transversal competences
• Acquiring a set of scientific research skills allowing further professional development at doctoral level
• Systematic and advanced knowledge of quantitative and qualitative modeling methods and their application to solving complex research problems
7. Subject objectives (arising from the acquired specific competences)
7.1 Subject’s general objective
This course covers the linguistic and algorithmic foundations of natural language processing. It uses corpus data to illustrate concepts such as language modelling, part-of-speech tagging, syntactic processing and semantic processing. The course draws on experience with the Romanian language.
7.2 Specific objectives
The students should understand:
- syntactic and semantic processing of written text
- the NLP processes leading towards speech recognition or text synthesis
- techniques for authorship and topic identification
8. Contents
8.1 Course
Teaching methods (all topics): The professor gives a talk and encourages discussions on the themes.
Topics (Observations: number of courses):
- Creating and annotating language corpora: markup, annotation, evaluation measures, web tools (2 courses)
- Language modeling. Hidden Markov models (2 courses)
- Part-of-speech tagging. Viterbi algorithm, smoothing (2 courses)
- Syntax processing: context-free grammars, chart parsing, constituency, subcategorization, dependencies, feature representation, lexicalized grammar formalism (3 courses)
- Semantic processing: compositionality, argument structure, word sense disambiguation, anaphora resolution (3 courses)
- Information retrieval and NLP (2 courses)
References:
1. D. Jurafsky, J. Martin, Speech and Language Processing, Second edition, Blackwell, 2009
2. Xuedong Huang, Alex Acero, H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall, 2001
8.2 Seminar/laboratory
Teaching methods (all topics): Practical laboratory.
Topics (Observations: number of laboratories):
- Corpora and language models. Python modeling (3 laboratories)
- Smoothing and authorship identification. Python modeling (2 laboratories)
- HMM construction and use (2 laboratories)
- Parsing tools (3 laboratories)
- Collocation and mutual information (2 laboratories)
- Word sense disambiguation (2 laboratories)
References:
1. D. Jurafsky, J. Martin, Speech and Language Processing, Second edition, Blackwell, 2009
2. S. Bird, E. Klein, E. Loper, Natural Language Processing with Python, O’Reilly Media, 2009
9. Corroboration / validation of the subject’s content in relation to the expectations coming from
representatives of the epistemic community, of the professional associations and of the representative
employers in the program’s field.
Research at the intersection of computer science and linguistics is growing at an accelerating pace.
The Romanian language needs tools for speech recognition and synthesis. Such technologies are widely useful
across society, in all fields, including business (automatic response to phone calls), medicine, court rooms, etc.
10. Assessment (examination)
10.4 Course
10.1 Assessment criteria: The degree to which the students correctly acquired the concepts, notions and tools of natural language processing.
10.2 Assessment methods: Written final exam.
10.3 Weight in the final grade: 50%
10.5 Seminar/laboratory
10.1 Assessment criteria:
- The degree to which the students correctly acquired the concepts, notions and tools of NLP.
- The ability of the students to use these concepts, notions and tools to solve practical problems, analyze real-life business and economics situations, etc.
- The capacity of the students to take economic/financial/business decisions based on the results of their analysis and on suitably applying the theories and algorithms they have studied.
10.2 Assessment methods: The assessment of the homework projects. The assessment tries to measure the degree to which the students acquired the theory and the ability to apply it in practical examples and real-life situations. Completing the homework projects is a condition for the final grade.
10.3 Weight in the final grade: 50%
10.6 Minimum performance standard
• A minimum final grade of 5 (five) is required to pass this subject;
• Grades are granted on a scale from 1 (one) to 10 (ten);
• Students must attempt each item (question, problem) on the written exam sheet;
• The exam is written and takes approximately 120 minutes.
Date of completion: 26 January 2015
Signature of the course professor: Assoc. Prof. Dr. Liana Stanca
Signature of the seminar professor: Assoc. Prof. Dr. Liana Stanca
Date of approval by the department: 28 January 2015
Head of department’s signature: Prof. Dr. Gheorghe Cosmin Silaghi