Ronan Collobert
Jason Weston
Leon Bottou
Michael Karlen
Koray Kavukcuoglu
Pavel Kuksa

Common Approaches in NLP
 Using Task-specific features
  Injecting prior knowledge about the structure of the data
 Expertise from Linguists

Approach used in the paper
 No task-specific feature engineering
 Minimal prior Knowledge

Part-Of-Speech (POS) Tagging
  Labels each word with its syntactic role (e.g. noun, verb, adverb)

Chunking
 Shallow Parsing

Named Entity Recognition
 Person, Location etc.

Semantic Role Labelling

Words to Feature Vectors
 Look up Table
  Random initialization vs. unsupervised pre-training
 Extending to any Discrete Features
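A minimal sketch of the lookup-table idea, assuming plain Python lists as vectors. The function names, the toy vocabulary, the `UNKNOWN` fallback entry and the capitalization feature are illustrative assumptions, not details taken from the slides. Each discrete symbol gets its own small vector, and the vectors of a word and of its extra discrete feature are concatenated.

```python
import random

def make_lookup_table(symbols, dim, seed=0):
    """One small random vector per discrete symbol (random initialization)."""
    rng = random.Random(seed)
    return {s: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for s in symbols}

def caps_feature(word):
    """Hypothetical discrete feature: capitalization pattern of the word."""
    if word.isupper():
        return "allcaps"
    if word[:1].isupper():
        return "initcap"
    return "lower"

def embed(words, word_table, caps_table):
    """Concatenate each word vector with the vector of its discrete feature."""
    vectors = []
    for w in words:
        # Words missing from the table fall back to a shared UNKNOWN vector.
        word_vec = word_table.get(w.lower(), word_table["UNKNOWN"])
        vectors.append(word_vec + caps_table[caps_feature(w)])
    return vectors

word_table = make_lookup_table(["the", "cat", "sat", "UNKNOWN"], dim=4)
caps_table = make_lookup_table(["allcaps", "initcap", "lower"], dim=2, seed=1)
features = embed(["The", "cat", "purred"], word_table, caps_table)
```

With unsupervised pre-training, the random vectors in `word_table` would instead be replaced by vectors learned beforehand on unlabeled text; the lookup step itself stays the same.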

Extracting Higher-Level Features from Word Feature Vectors
 Window Approach
 Sentence Approach
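The two approaches can be sketched as follows, assuming word vectors are plain Python lists. The function names, the zero padding vector and the tiny weight matrix are illustrative assumptions. The window approach concatenates the vectors around each word; the sentence approach applies the same linear map to every window and then takes a per-dimension maximum over all positions, which yields a fixed-size vector for a sentence of any length.

```python
def window_features(vectors, half_width, pad):
    """Window approach: concatenate the vectors around each position."""
    padded = [pad] * half_width + vectors + [pad] * half_width
    width = 2 * half_width + 1
    return [sum(padded[i:i + width], []) for i in range(len(vectors))]

def linear(x, weights):
    """Apply one weight row per output dimension to the input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def max_over_time(rows):
    """Per-dimension maximum over all positions in the sentence."""
    return [max(column) for column in zip(*rows)]

def sentence_features(vectors, half_width, pad, weights):
    """Sentence approach: shared linear map on each window, then max pooling."""
    windows = window_features(vectors, half_width, pad)
    return max_over_time([linear(w, weights) for w in windows])

vectors = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
pad = [0.0, 0.0]
windows = window_features(vectors, half_width=1, pad=pad)
# Toy weights: first output reads the left neighbour, second the centre word.
weights = [[1, 0, 0, 0, 0, 0],
           [0, 0, 0, 1, 0, 0]]
pooled = sentence_features(vectors, 1, pad, weights)
```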

Word-Level Log Likelihood
  Each word is treated independently when optimizing the weights
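As a rough illustration, the word-level criterion can be computed by turning the tag scores of each word into log-probabilities with a softmax and summing the log-probability of the correct tag over the words. The function names and the toy scores are assumptions made for this sketch.

```python
import math

def log_softmax(scores):
    """Log-probabilities from raw tag scores (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def word_level_log_likelihood(score_rows, tags):
    """Sum over words of the log-probability of each word's correct tag."""
    return sum(log_softmax(scores)[t] for scores, t in zip(score_rows, tags))

# Two words, two tags, all scores equal: each correct tag has probability 0.5.
ll = word_level_log_likelihood([[0.0, 0.0], [0.0, 0.0]], [0, 1])
```

Because every word contributes its own term, the tag chosen for one word has no influence on the term for its neighbours.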

Sentence-Level Log Likelihood
  The objective takes into account all the tags in a sentence as well as the transitions between consecutive tags
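A sketch of one way to compute such a criterion, under stated assumptions: a tag path is scored as the sum of per-word tag scores, tag-to-tag transition scores and a start score for the first tag, and the log-likelihood is that path score minus the log-sum-exp over all possible paths, obtained with a forward recursion. The names `scores`, `trans` and `start` and the toy numbers are illustrative, not taken from the slides.

```python
import math

def logsumexp(values):
    """Stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def path_score(scores, trans, start, tags):
    """Score of one tag sequence: start + per-word scores + transitions."""
    total = start[tags[0]] + scores[0][tags[0]]
    for t in range(1, len(tags)):
        total += trans[tags[t - 1]][tags[t]] + scores[t][tags[t]]
    return total

def log_partition(scores, trans, start):
    """Log-sum-exp of the scores of all tag sequences (forward recursion)."""
    n_tags = len(start)
    alpha = [start[k] + scores[0][k] for k in range(n_tags)]
    for t in range(1, len(scores)):
        alpha = [scores[t][k]
                 + logsumexp([alpha[j] + trans[j][k] for j in range(n_tags)])
                 for k in range(n_tags)]
    return logsumexp(alpha)

def sentence_level_log_likelihood(scores, trans, start, tags):
    """Log-probability of the whole tag sequence given the sentence."""
    return path_score(scores, trans, start, tags) - log_partition(scores, trans, start)

scores = [[1.0, 0.2], [0.3, 0.8], [0.5, 0.5]]   # 3 words, 2 tags
trans = [[0.4, -0.6], [-0.2, 0.7]]              # trans[previous][current]
start = [0.1, -0.1]
ll = sentence_level_log_likelihood(scores, trans, start, [0, 1, 1])
```

The forward recursion keeps the cost linear in sentence length, whereas enumerating every tag sequence would grow exponentially.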

Stochastic Gradient
  Standard optimization algorithm: repeatedly pick a training example and take a gradient step on it
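To make the update rule concrete, here is a small sketch of stochastic gradient ascent on the word-level log-likelihood of a linear softmax classifier. The model, learning rate, epoch count and toy data are all assumptions chosen for illustration; the networks described in these slides are deeper, but the per-example update has the same shape.

```python
import math
import random

def softmax(scores):
    """Tag probabilities from raw scores (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def tag_scores(W, x):
    """Linear scores: one weight row per tag."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sgd_train(examples, n_tags, dim, lr=0.1, epochs=50, seed=0):
    """Visit examples in random order, one gradient step per example."""
    rng = random.Random(seed)
    W = [[0.0] * dim for _ in range(n_tags)]
    for _ in range(epochs):
        for x, y in rng.sample(examples, len(examples)):
            p = softmax(tag_scores(W, x))
            for k in range(n_tags):
                # Gradient of log p(y | x) with respect to score k.
                g = (1.0 if k == y else 0.0) - p[k]
                for d in range(dim):
                    W[k][d] += lr * g * x[d]
    return W

def predict(W, x):
    """Index of the highest-scoring tag."""
    s = tag_scores(W, x)
    return s.index(max(s))

examples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
W = sgd_train(examples, n_tags=2, dim=2)
```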