PCI 2014
Quality in Education Technologies
Title
Course opinion mining methodology for knowledge discovery, based on web social media
Authors
Sotirios Kontogiannis
Ioannis Kazanidis
Stavros Valsamidis
Alexandros Karakos
Outline
• Introduction
• Method
• Case study
• Results
• Proposed framework
• Proposed opinion mining system architecture
• Discussion and conclusions

Introduction
• Initially, the authors used the LMS platforms of academic institutions, where course knowledge and course interest meet (apart from the classroom)
• Then the authors focused on students' appreciation of courses by using questionnaires and course grades
• The results of this twofold evaluation were sometimes contradictory

• Since LMS course evaluation is based on a scientific evaluation approach, the authors concluded that a similar approach is needed to replace the questionnaires with a less guided and less manipulative scientific methodology
• That is, knowledge extraction and discovery from existing social networks, where people freely express unguided opinions about an academic course
This paper
• This paper proposes a framework for applying opinion mining in social networks.
• The goals of the proposed framework are to
(a) extract useful textual information from social networks of blogs regarding learning course activities or processes and
(b) apply opinion mining techniques on the extracted text in order to discover the positive or negative opinions concerning each course.
Framework
Figure: The 4 steps for opinion mining in a social network
Case study
Study population and context
• A microblog, followed for posts related to the educational level and institutional services of a Greek Technological Educational Institute (TEI)
• A period of six months
• Comments by different commentators from the same department, in the Greek language
• It can be accessed at https://www.facebook.com/groups/69887509784/
Figure: View of the experimental microblog
1st process
Create, train and store the classification model for automatic
categorization of text as positive or negative
2nd process
Apply the model to new data so that it is automatically categorized as positive or negative
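The two processes can be illustrated with a minimal sketch. It assumes a multinomial Naive Bayes classifier with add-one smoothing and whitespace tokenization, neither of which is confirmed by the slides for the case study. The training sentences are invented for illustration and only reuse the sentiment words reported in the results. The function names `tokenize`, `train` and `classify` are hypothetical.

```python
import math
import pickle
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization; real Greek text
    # would need language-aware preprocessing.
    return text.lower().split()


def train(labeled_comments):
    """Build a multinomial Naive Bayes model from (text, label) pairs."""
    doc_counts = Counter()   # number of training comments per label
    word_counts = {}         # label -> Counter of token frequencies
    vocab = set()
    for text, label in labeled_comments:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return {"doc_counts": doc_counts, "word_counts": word_counts, "vocab": vocab}


def classify(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for tok in tokenize(text):
            score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# 1st process: create, train and store the model.
training_set = [  # illustrative sentences, not data from the study
    ("the course was a success and full of interest", "positive"),
    ("great prosperity for graduates", "positive"),
    ("painful exams that discourage students", "negative"),
    ("unemployment and damage to our future", "negative"),
]
model = train(training_set)
stored = pickle.dumps(model)  # these bytes could be written to a file

# 2nd process: load the stored model and apply it to new comments.
loaded = pickle.loads(stored)
new_comments = ["a painful lecture", "real interest and success"]
labels = [classify(loaded, c) for c in new_comments]
print(labels)
```

Keeping training and application separate, with the stored model as the only link between them, mirrors the split into a 1st and a 2nd process.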

Results
• Words like “painful”, “discourage”, “damage”, “unemployment”, etc. were found to have negative sentiment, whereas words like “profitability”, “prosperity”, “interest”, “success”, etc. were found to have positive sentiment in comments.
• The comments from the microblog of our study were split in half between positive and negative regarding the services and knowledge offered by the educational institute.
• In other words, our sample was evenly split in sentiment.
Proposed Framework
Twofold methodology for the evaluation of an academic course
• LMS course web usage mining evaluation process, with the use of a three-tier evaluation architecture and the measures, metrics and algorithms
• Opinion mining process
Opinion mining process
Step 1
• Source selection and monitoring of UGC sources
  - educational institution channels
  - general source channels
Step 2
• Source crawling engine and initialization mechanism - crawling design and crawled content storage capabilities
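Steps 1 and 2 could be sketched as a small source registry plus a store for crawled content. This is only an illustration under stated assumptions: the slides do not describe the storage layer, so the SQLite schema, the example URLs, and the `init_store`, `add_source`, `crawl` and `fake_fetch` names are all hypothetical, and the fetcher is a stub rather than a real crawler.

```python
import sqlite3


def init_store(conn):
    # Hypothetical schema: one table of monitored sources, one of crawled posts.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sources ("
        "id INTEGER PRIMARY KEY, url TEXT UNIQUE, channel_type TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, source_id INTEGER REFERENCES sources(id), "
        "body TEXT, UNIQUE(source_id, body))"
    )


def add_source(conn, url, channel_type):
    """Step 1: register a source to monitor (duplicates are ignored)."""
    conn.execute(
        "INSERT OR IGNORE INTO sources (url, channel_type) VALUES (?, ?)",
        (url, channel_type),
    )


def count_posts(conn):
    return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]


def crawl(conn, fetch):
    """Step 2: fetch posts for every source and store the unseen ones.

    Returns the number of newly stored posts.
    """
    before = count_posts(conn)
    for source_id, url in conn.execute("SELECT id, url FROM sources").fetchall():
        for body in fetch(url):
            conn.execute(
                "INSERT OR IGNORE INTO posts (source_id, body) VALUES (?, ?)",
                (source_id, body),
            )
    return count_posts(conn) - before


def fake_fetch(url):
    # Stub standing in for a real crawler; returns the same posts every time.
    return ["first comment", "second comment"]


conn = sqlite3.connect(":memory:")
init_store(conn)
add_source(conn, "https://example.org/institution-channel", "educational institution")
add_source(conn, "https://example.org/general-channel", "general")
first_run = crawl(conn, fake_fetch)   # stores two posts per source
second_run = crawl(conn, fake_fetch)  # nothing new, so nothing is stored
print(first_run, second_run)
```

The uniqueness constraint on stored posts lets the crawler be re-run periodically for monitoring without duplicating content.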
Step 3
• Semantic enrichment engine - semantic enrichment design
• If the text content is semi-structured, natural language processing (NLP) or other text analysis techniques are used to interpret (grammatically and syntactically process) each sentence before a sentiment is assigned to it, where possible
• The effectiveness of the different approaches largely depends on the quality of the raw text to be analyzed; in general, NLP, and therefore semantic enrichment, is effective on syntactically correct texts, while it falls short on ill-formed sentences or when Internet dialects are used

Step 4
• Sentiment analysis processes, metrics and algorithms
• This step involves the use of opinion mining over an adequate level of enriched sentences of user text. For this process to be accurate, a very well-trained dataset of positive and negative sentences is required, with a high level of polarity among those datasets
• For such functionality to be performed in an automatic and real-time manner, or even to act as a self-trained feedback mechanism, appropriate metrics or scores need to be defined, and polarity judgment algorithms need to be proposed and validated

Step 4 (continued)
• If a polarized dataset of high confidence is available, then using a Part Of Speech (POS) sentence tagger (or NLP clustering of a sentence) and a trained Bayesian sentence classifier, we can determine whether a sentence's tokens belong to a class of sentences by looking at the tokens' probabilities
• The likelihood of a sentence being negative can be calculated as the number of negative sentences in that class over the total number of negative sentences in all classes
Likelihood_sentence_Negative = Number of class negative sentences / Total number of negative sentences
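As a worked example of the likelihood ratio above, the following sketch computes it for three sentence classes. The class names and counts are invented purely for illustration, and `negative_likelihood` is a hypothetical helper name.

```python
def negative_likelihood(class_negative_counts, class_name):
    """Negative sentences in one class over negative sentences in all classes."""
    total_negative = sum(class_negative_counts.values())
    if total_negative == 0:
        raise ValueError("no negative sentences in the training data")
    return class_negative_counts[class_name] / total_negative


# Hypothetical number of negative training sentences in each sentence class.
negative_counts = {"class_a": 30, "class_b": 10, "class_c": 60}

likelihoods = {
    name: negative_likelihood(negative_counts, name) for name in negative_counts
}
print(likelihoods)
```

With these counts, class_a gets 30 / 100, class_b 10 / 100 and class_c 60 / 100, so the likelihoods over all classes sum to one.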

Step 5
• Visualization and course ranking mechanism based on opinion results
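One possible ranking mechanism is sketched below. The slides do not define the ranking score, so this sketch assumes the share of positive comments per course as the score and a plain text bar as the visualization. The course names and labels are made up, and `rank_courses` is a hypothetical name.

```python
from collections import defaultdict


def rank_courses(classified_comments):
    """Rank courses by their share of positive comments, highest first.

    Expects (course, label) pairs where label is "positive" or "negative".
    """
    tallies = defaultdict(lambda: {"positive": 0, "negative": 0})
    for course, label in classified_comments:
        tallies[course][label] += 1
    scores = {}
    for course, tally in tallies.items():
        total = tally["positive"] + tally["negative"]
        scores[course] = tally["positive"] / total
    # Sort by descending score; ties are broken alphabetically by course name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative classifier output, not data from the study.
comments = [
    ("Course A", "positive"), ("Course A", "positive"), ("Course A", "negative"),
    ("Course B", "negative"), ("Course B", "negative"),
    ("Course C", "positive"),
]

ranking = rank_courses(comments)
for course, share in ranking:
    bar = "#" * round(share * 10)  # crude text bar as a stand-in for a chart
    print(f"{course:10s} {share:.2f} {bar}")
```

A share-based score ignores how many comments a course received, so a production ranking would probably also need a minimum comment count or a confidence adjustment.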
Discussion and Conclusion
This paper proposes
• a framework and a testbed system used for applying opinion mining in blogs regarding course educational content.
• a context-sensitive sentiment analysis methodology which provides human-like sentiment analysis based on semi-supervised learning structures
The expected benefits of applying such a framework are the following:
• Qualitative presentation of people's concerns over a course and of course user preferences.
• Recording users' problems and negative or positive user opinions concerning the educational courses they are interested in, without spatial and temporal restrictions
Research Limitations
• The approach depends on the accuracy of the WSD (Word Sense Disambiguation) program (OpenNLP), so that the exact sense of each term can be identified and exact sentiment scores can be calculated
• The framework has only been applied to a specific microblog for a set of three courses. In order to better benchmark it, it must also be applied to other blogs
THANK YOU FOR YOUR ATTENTION