Ronan Collobert
Jason Weston
Léon Bottou
Michael Karlen
Koray Kavukcuoglu
Pavel Kuksa
Common Approaches in NLP
 Using Task-specific features
 Injecting knowledge about the structure of the data
 Expertise from Linguists
Approach used in the paper
 No task-specific feature engineering
 Minimal prior Knowledge
Part-Of-Speech (POS) Tagging
 Syntactic Parsing
 Shallow Parsing
Named Entity Recognition
 Person, Location etc.
Semantic Role Labelling
Words to Feature Vectors
 Look up Table
 Random initialization vs Unsupervised Pre-training
 Extending to any Discrete Features
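A minimal sketch of the lookup-table idea, assuming toy vocabularies and vector sizes that are purely illustrative (the names build_lookup_table and embed are hypothetical, not from the paper). Each discrete feature gets its own table, and the vectors for one token are concatenated. A table obtained by unsupervised pre-training could be dropped in place of the randomly initialized one.

```python
import random

def build_lookup_table(vocabulary, dim, seed=0):
    """Map every item of a discrete vocabulary to a randomly initialized vector."""
    rng = random.Random(seed)
    return {item: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for item in vocabulary}

def embed(tokens, tables):
    """Concatenate, for each token, the vectors from one lookup table per feature.

    tokens: list of tuples holding one discrete value per feature, e.g. (word, caps).
    tables: list of lookup tables, one per feature, in the same order.
    """
    return [
        [x for value, table in zip(token, tables) for x in table[value]]
        for token in tokens
    ]

# Hypothetical toy vocabularies; the dimensions 5 and 2 are arbitrary.
word_table = build_lookup_table(["the", "cat", "sat", "UNKNOWN"], dim=5)
caps_table = build_lookup_table(["lower", "initial_cap"], dim=2, seed=1)

sentence = [("the", "initial_cap"), ("cat", "lower"), ("sat", "lower")]
vectors = embed(sentence, [word_table, caps_table])  # 3 vectors of size 5 + 2
```

In a trained model the table entries are parameters and are updated together with the rest of the network.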
Extracting Higher-Level Features from Word Feature Vectors
 Window Approach
 Sentence Approach
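The two approaches can be sketched as follows, assuming word feature vectors are plain lists of floats. This is a simplified illustration, not the paper's full architecture: the window approach concatenates the vectors around one word, while the sentence approach applies the same linear layer to every window and keeps the maximum over positions for each output unit. The function names, the padding vector, and the weights are hypothetical.

```python
def window_features(vectors, position, half_width, padding):
    """Window approach: concatenate the vectors of the words around `position`."""
    features = []
    for i in range(position - half_width, position + half_width + 1):
        # Positions outside the sentence are filled with the padding vector.
        features.extend(vectors[i] if 0 <= i < len(vectors) else padding)
    return features

def sentence_features(vectors, half_width, padding, weights):
    """Sentence approach: one linear layer per window, then max over positions."""
    conv = []
    for t in range(len(vectors)):
        window = window_features(vectors, t, half_width, padding)
        conv.append([sum(w * x for w, x in zip(row, window)) for row in weights])
    # Max over time yields a fixed-size vector whatever the sentence length.
    return [max(column) for column in zip(*conv)]

# Toy sentence of three 2-dimensional word vectors.
vectors = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
padding = [0.0, 0.0]

first_window = window_features(vectors, 0, 1, padding)

# Two output units: one reads the left neighbour, one reads the right neighbour.
weights = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]
pooled = sentence_features(vectors, 1, padding, weights)
```

The window output grows with the window size only, so it suits tasks where a tag depends on nearby words; the pooled output lets a tag depend on the whole sentence.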
Word-Level Log Likelihood
 Each word is considered independently of the others when optimizing the weights
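As a rough illustration, assuming the network outputs one score per tag for the word being labeled, the word-level log-likelihood is the score of the correct tag minus the log-sum-exp over all tag scores (a softmax). The function names are hypothetical.

```python
import math

def log_add(scores):
    """Numerically stable log(sum(exp(s) for s in scores))."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def word_log_likelihood(scores, tag):
    """log p(tag | word) when tag probabilities are a softmax over the scores."""
    return scores[tag] - log_add(scores)

# Toy scores for three tags; the correct tag is assumed to be index 1.
scores = [0.5, 2.0, -1.0]
ll = word_log_likelihood(scores, tag=1)
```

The training objective sums this quantity over all labeled words, with no term linking the tag of one word to the tag of its neighbours.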
Sentence-Level Log Likelihood
 The objective takes into account all the tags in a sentence as well as the transitions between tags
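A minimal sketch of a sentence-level objective of this kind, assuming per-word tag scores (called emissions here), a tag-to-tag transition score matrix, and a start score per tag. All names and toy values are hypothetical. The log-likelihood of a tag path is its score minus the log-sum-exp over the scores of all paths, which a forward recursion computes without enumerating them.

```python
import math

def log_add(scores):
    """Numerically stable log(sum(exp(s) for s in scores))."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def path_score(emissions, transitions, start, tags):
    """Sum of start, per-word, and transition scores along one tag path."""
    score = start[tags[0]] + emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return score

def log_partition(emissions, transitions, start):
    """Log-sum-exp of the scores of all tag paths, by forward recursion."""
    n_tags = len(start)
    delta = [start[k] + emissions[0][k] for k in range(n_tags)]
    for t in range(1, len(emissions)):
        delta = [
            emissions[t][k]
            + log_add([delta[j] + transitions[j][k] for j in range(n_tags)])
            for k in range(n_tags)
        ]
    return log_add(delta)

def sentence_log_likelihood(emissions, transitions, start, tags):
    """log p(tags | sentence) under the path-score model above."""
    return path_score(emissions, transitions, start, tags) - log_partition(
        emissions, transitions, start
    )

# Toy example: 3 words, 2 tags.
emissions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
transitions = [[0.2, -0.1], [0.0, 0.3]]  # transitions[i][j]: score of tag i -> tag j
start = [0.0, 0.0]
ll = sentence_log_likelihood(emissions, transitions, start, [0, 1, 1])
```

The recursion costs time linear in the sentence length and quadratic in the number of tags, whereas the number of paths grows exponentially with sentence length.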
Stochastic Gradient
 Standard Optimization Algorithm
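A toy sketch of stochastic gradient training, assuming a linear tag scorer and the word-level log-likelihood as the objective. This is an illustration with hypothetical names, data, and learning rate, not the paper's network. Each step picks one example and moves the weights a small amount along the gradient of that example's log-likelihood.

```python
import math
import random

def softmax(scores):
    """Turn scores into probabilities in a numerically stable way."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def tag_probabilities(weights, x):
    """Tag probabilities of a linear scorer with one weight row per tag."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def sgd_step(weights, x, tag, learning_rate):
    """One gradient ascent step on log p(tag | x); updates weights in place."""
    probs = tag_probabilities(weights, x)
    for k, row in enumerate(weights):
        # d log p(tag | x) / d score_k = 1[k == tag] - p_k
        grad_k = (1.0 if k == tag else 0.0) - probs[k]
        for i, xi in enumerate(x):
            row[i] += learning_rate * grad_k * xi
    return math.log(probs[tag])  # log-likelihood before the update

# Hypothetical toy data: two 2-dimensional inputs, one per tag.
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(0)
history = []
for epoch in range(25):
    order = list(range(len(data)))
    rng.shuffle(order)  # visit the examples in a random order each epoch
    for idx in order:
        x, tag = data[idx]
        history.append(sgd_step(weights, x, tag, learning_rate=0.5))
```

Because each update uses a single example, the cost per step does not depend on the size of the training set.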