Weakly Supervised Models of Aspect-Sentiment for Online Course Discussion Forums
ARTI RAMESH
SHACHI H. KUMAR
JAMES FOULDS
LISE GETOOR
registrants
Low completion
rate
▪ submitting le
▪ submitting as
help model enga
MOOCs
• Massive: attracts thousands of participants
• Open: open access, content, and assessment
• Online: hosted online by education companies in partnership with top universities
Classroom
• Classroom
– Face-to-face interaction between instructor and students
MOOCs
• MOOC Discussion Forums
– Primary means of interaction between instructor and students
• Large number of students and posts: hard to monitor manually
• Posts discuss problems in the course: course material, errors, feedback
Example MOOC Posts
MOOC Post → Fine-grained Topic
• “The video is very choppy. Can somebody fix this?” → Lecture-Video
• “Will subtitles be made available for the lectures for this week? I liked the transcripts from last week.” → Lecture-Subtitles
• “Will everyone get a certificate or only people in the signature track?” → Certificate
• “When is quiz 4 due?” → Quiz-Deadlines
Predicting fine-grained problems: Challenges
• Labeled data hard to obtain
– Only 5-10% of posts contain problems
– Privacy concerns around data sharing
– Problems differ across courses
• Unsupervised/weakly supervised approaches desirable
– System not fine-tuned to one course, but can adapt across courses
Related Work
Aspect-sentiment in Online Reviews
• Semi-supervised generative model, with seed words to identify aspect clusters [Mukherjee et al., 2012]
• Unsupervised Aspect-Sentiment Model for Online Reviews [Brody et al., 2012]
• Hierarchical Aspect-Sentiment Model for Online Reviews [Kim et al., 2013]
MOOCs
• Predicting Instructor Intervention in MOOC Forums [Chaturvedi et al., 2014]
SeededLDA for MOOC Forums
SeededLDA [Jagarlamudi et al. 2010]
• Guide topic discovery by specifying representative seed words
• SeededLDA uses seeds to bias topic-word and word-document distributions
• SeededLDA gathers words related to seed words
SeededLDA for MOOCs
• Many classes but a common set of seed words
• Seed words for MOOCs from syllabus and forums
Hinge-loss Markov Random Fields & Probabilistic Soft Logic
• Hinge-loss Markov Random Fields (HL-MRFs)
– Logic-based MRFs that can reason about both discrete and continuous graph data scalably and accurately
– Efficient inference: convex optimization in continuous space
• Probabilistic Soft Logic (PSL)
– Templating language for HL-MRFs
– Weighted logical rules to model dependencies
– Continuous variables in [0,1]
[Bach et al. 2012]
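The hinge-loss idea above can be illustrated with a minimal Python sketch. It assumes the standard Łukasiewicz relaxation used by PSL, where a rule body → head over truth values in [0,1] contributes a weighted penalty equal to how far the rule is from being satisfied. The function names, the example rule, and the weight are hypothetical and chosen only for illustration.

```python
def lukasiewicz_and(*truths):
    """Soft conjunction of truth values in [0, 1]: max(0, sum - (n - 1))."""
    return max(0.0, sum(truths) - (len(truths) - 1))


def distance_to_satisfaction(body, head):
    """How far the rule `body -> head` is from being satisfied."""
    return max(0.0, body - head)


def rule_penalty(weight, body_truths, head_truth, squared=False):
    """Weighted hinge-loss potential contributed by one ground rule."""
    d = distance_to_satisfaction(lukasiewicz_and(*body_truths), head_truth)
    return weight * (d ** 2 if squared else d)


# Hypothetical ground rule (not from the slides):
#   SeedFine(p, video) & SeedCoarse(p, lecture) -> Fine(p, video)
penalty = rule_penalty(weight=2.0, body_truths=[0.9, 0.8], head_truth=0.4)
print(round(penalty, 3))
```

Because each potential is a hinge of a linear function of the variables, the total penalty is convex, which is what makes inference a convex optimization problem.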
Predicting fine-grained problems and sentiment: Joint Prediction Problem
• Analogous to predicting aspect-sentiment in online reviews
• Aspect hierarchy connecting course elements
• HL-MRF framework
– Combining different features
– Encoding coarse-to-fine aspect hierarchy
– Encoding dependencies between aspect and sentiment
• Jointly modeling aspect and sentiment
Our Contributions
• Identify fine-grained aspects in online courses
• Extract course-specific features from posts using SeededLDA
• Construct coarse-to-fine aspect hierarchy to model aspect dependencies
• Construct weakly-supervised joint model for aspect-sentiment using HL-MRFs
• Validate system using crowdsourced posts sampled from 12 courses
MOOC Aspect-Sentiment Models: SeededLDA
• Coarse Aspect seeds
LECTURE: lecture, video, download, transcript, slide, note
QUIZ: quiz, assignment, question, midterm, exam, submission
CERTIFICATE: certificate, score, statement, signature
SOCIAL: name, course, introduction, study, group
• Sentiment seeds
POSITIVE: interest, exciting, thank, great, happy, glad, enjoy
NEGATIVE: problem, difficult, error, issue, unable, misunderstand
NEUTRAL: coursera, class, hello, everyone, greet, name
SeededLDA Model
• Fine Aspect seeds
LECTURE-VIDEO: video, problem, download, play, player,
LECTURE-AUDIO: volume, low, headphone, sound, audio, hear
LECTURE-LECTURER: professor, fast, speak, pace, follow, speed
LECTURE-SUBTITLES: transcript, subtitle, slide, note, lecture,
LECTURE-CONTENT: typo, error, mistake, wrong, right, incorrect
QUIZ-CONTENT: question, challenge, difficult, understand, typo
QUIZ-SUBMISSION: submission, submit, quiz, error, unable, resubmit
QUIZ-GRADING: answer, question, answer, grade, assignment, quiz
QUIZ-DEADLINE: due, deadline, miss, extend, late
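One common way to approximate seeding is to give each topic an asymmetric prior that places extra mass on its seed words. The sketch below builds such a prior for two of the fine aspects listed above. It is a simplification for illustration, not necessarily the exact SeededLDA formulation; the function name and the `base` and `boost` values are assumptions.

```python
# Seed words copied from two of the fine-aspect lists above.
FINE_SEEDS = {
    "LECTURE-AUDIO": ["volume", "low", "headphone", "sound", "audio", "hear"],
    "QUIZ-DEADLINE": ["due", "deadline", "miss", "extend", "late"],
}


def seeded_topic_word_prior(seeds, vocabulary, base=0.01, boost=1.0):
    """Return {topic: {word: prior}} with extra prior mass on each topic's seeds.

    `base` and `boost` are illustrative values, not taken from the slides.
    """
    prior = {}
    for topic, seed_words in seeds.items():
        seed_set = set(seed_words)
        prior[topic] = {
            word: base + (boost if word in seed_set else 0.0)
            for word in vocabulary
        }
    return prior


# Toy vocabulary: every seed word plus one non-seed word ("week").
vocabulary = sorted({w for words in FINE_SEEDS.values() for w in words} | {"week"})
prior = seeded_topic_word_prior(FINE_SEEDS, vocabulary)
```

A prior of this shape could be passed to an LDA implementation that accepts a per-topic word prior, so that each topic starts out biased toward its own seed words while still gathering related non-seed words from the data.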
PSL-Joint: Combining Features
• SeededLDA scores for fine aspect and coarse aspect to predict the fine aspect of post P
• SeededLDA scores for sentiment and fine aspect to predict the fine aspect
PSL-Joint: Encoding Dependencies
• Dependency between coarse aspect and fine aspect
• Dependency between sentiment and fine aspect
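The rule formulas on these slides did not survive extraction, so the following Python sketch uses hypothetical rules of the same flavor: two feature rules that combine SeededLDA scores, one hierarchy rule tying a fine aspect to its coarse parent, and a weak negative prior. It finds the best value of a single variable Fine(post, aspect) by grid search, which is a stand-in for the convex optimization an HL-MRF solver would perform. All names, weights, and scores are assumptions.

```python
def hinge(x):
    """Hinge function max(0, x)."""
    return max(0.0, x)


def total_penalty(fine, obs, weights):
    """Sum of weighted hinge penalties for one candidate value of Fine(post, aspect)."""
    # Feature rule (hypothetical): SeedFine & SeedCoarse -> Fine
    r1 = hinge(hinge(obs["seed_fine"] + obs["seed_coarse"] - 1.0) - fine)
    # Feature rule (hypothetical): SeedNegative & SeedFine -> Fine
    r2 = hinge(hinge(obs["seed_negative"] + obs["seed_fine"] - 1.0) - fine)
    # Dependency rule (hypothetical): Fine -> Coarse
    r3 = hinge(fine - obs["coarse"])
    # Weak prior that a post raises no problem: !Fine
    r4 = fine
    return (weights["features"] * (r1 + r2)
            + weights["hierarchy"] * r3
            + weights["prior"] * r4)


def map_fine(obs, weights, steps=100):
    """Grid-search the value in [0, 1] with the lowest total penalty."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda v: total_penalty(v, obs, weights))


weights = {"features": 1.0, "hierarchy": 2.0, "prior": 0.1}
strong_parent = {"seed_fine": 0.9, "seed_coarse": 0.8,
                 "seed_negative": 0.7, "coarse": 0.75}
weak_parent = dict(strong_parent, coarse=0.2)

print(map_fine(strong_parent, weights), map_fine(weak_parent, weights))
```

With the same SeededLDA scores, lowering the coarse-aspect value pulls the fine-aspect value down with it, which is the kind of coarse-to-fine dependency the slides describe.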
Experimental Evaluation
F-1 scores for SeededLDA and PSL-Joint for coarse aspects
Model       Lecture   Quiz    Certificate   Social
SeededLDA   0.632     0.657   0.459         0.654
PSL-Joint   0.630     0.706   0.621         0.659
F-1 scores for SeededLDA and PSL-Joint for sentiment
Model       Positive   Negative   Neutral
SeededLDA   0.182      0.517      0.356
PSL-Joint   0.189      0.615      0.434
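For reference, a per-class F-1 score can be computed as in the sketch below. The slides do not state exactly how the reported scores were computed, so treating them as per-class F-1 over predicted labels is an assumption, and the gold and predicted labels here are made-up toy data.

```python
def f1_per_class(gold, predicted, label):
    """F-1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (not from the evaluation data).
gold = ["lecture", "quiz", "quiz", "social", "lecture"]
pred = ["lecture", "quiz", "lecture", "social", "quiz"]

scores = {label: f1_per_class(gold, pred, label)
          for label in ("lecture", "quiz", "social")}
```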
• PSL-Joint outperforms SeededLDA for most coarse aspects and sentiment
Fine-grained aspects under coarse aspect “lecture”
Model       Content   Video   Audio   Lecturer   Subtitles
SeededLDA   0.08      0.240   0.684   0.06       0.397
PSL-Joint   0.410     0.485   0.582   0.323      0.461
Fine-grained aspects under coarse aspect “quiz”
Model       Content   Submission   Deadlines   Grading
SeededLDA   0.011     0.437        0.214       0.514
PSL-Joint   0.36      0.416        0.611       0.550
• PSL-Joint distinguishes between lecture-content and quiz-content
• Significant improvement in scores for lecture-lecturer and quiz-deadlines
Interpreting PSL-Joint Predictions
“There is a typo or other mistake in the assignment instructions (e.g. essential information omitted).”
SeededLDA Prediction: Lecture-content
PSL-Joint Prediction: Quiz-content
“Thanks for the suggestion about downloading the video and referring to the subtitles. The audio is barely audible, even when the volume is set to 100%”
SeededLDA Prediction: Lecture-subtitles
PSL-Joint Prediction: Lecture-audio
Conclusion: Fine-grained aspect-sentiment in MOOC forums
• Automatically detecting problems in forum posts is useful for instructors
• Weakly supervised probabilistic framework to automatically detect aspect and sentiment in online courses
– SeededLDA and PSL-Joint models as means to encode domain information and predict aspect and sentiment
• PSL-Joint significantly outperforms SeededLDA for many fine aspects, coarse aspects, and sentiment
– Structural dependencies among aspect and sentiment help in prediction