Process Mining from Educational Data (Chapter 9)

CurriM: Curriculum Mining
Mykola Pechenizkiy
TU Eindhoven
Learning Analytics Innovation
10 October 2012
SURFfoundation, Utrecht, the Netherlands
Initial Motivation for CurriM
• Current practice:
– We think we know what our curriculum is and
how the students study. But is this true?
• CurriM aims at providing tools to analyze
– how the students actually study
• Who would benefit from our tool?
– Directors of education, study advisers, students
• Goal: showcase the potential and feasibility
– Data mining and process mining techniques
– 10 years of TUE administrative data; exam grades
Learning Analytics @Surf
10 October 2012, Utrecht,
CurriM: Curriculum Mining
Mykola Pechenizkiy, Eindhoven University of Technology
Questions for CurriM to Answer
• What is the real academic curriculum (study
program)?
• How do students really study?
• Is there a typical (or the best) way to study?
• Do current prerequisites make sense?
• Is the particular curriculum constraint obeyed?
• How likely is it that a student will finish the
studies successfully or will drop out?
• What is my expected time to finish?
• Should I now take courses A & B & C or C & D?
Refocused to Target Students as Users
(based on the received feedback)
Awareness tool supporting interactive querying:
• How does a course relate to the program?
– Prerequisites, follow up dependencies
• How am I doing wrt the averages, top 10%?
– Aggregates/OLAP
• What is my expected time to finish?
– Predictive modeling
• Should I now take courses A & B & C or C & D?
– Collaborative filtering style recommendations
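A collaborative filtering style recommendation for the "A & B & C or C & D?" question could look roughly like the sketch below. It is not the CurriM implementation: the course ids, the `history` data and the function names are made up for illustration, and similarity is simply the Jaccard overlap between sets of passed courses.

```python
from collections import defaultdict

# Hypothetical data: courses each past student passed (course ids are made up).
history = {
    "s1": {"A", "B", "C", "D"},
    "s2": {"A", "B", "D"},
    "s3": {"A", "C", "E"},
    "s4": {"B", "D", "E"},
}

def jaccard(x, y):
    """Overlap between two course sets, from 0 (disjoint) to 1 (identical)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0

def recommend(taken, history, top_n=2):
    """Score untaken courses by the similarity of the past students who passed them."""
    scores = defaultdict(float)
    for courses in history.values():
        sim = jaccard(taken, courses)
        for course in courses - taken:
            scores[course] += sim
    # Highest score first; ties are broken alphabetically by course id.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [course for course, score in ranked[:top_n] if score > 0]

print(recommend({"A", "B"}, history))
```

A student who has passed A and B is pointed to the courses that the most similar past students went on to pass. A real recommender would also have to respect the curriculum constraints and use grades, not only pass/fail sets.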
CurriM UI Demo
Where is EDM/LA?
(hidden from the users behind GUI)
Curriculum model:
• Codified constraints with Colored Petri net and LTL
– Prerequisites, follow up dependencies, 3 out of 5
selection, number of attempts, mandatory courses etc.
– Input: domain knowledge and output of pattern mining
• Awareness and automated conformance checking
– Does the currently chosen path comply with the official
guidelines and follow data-driven recommendations?
– Computed aggregates and mined patterns from the data
• Data-driven recommendations and predictions
– What is my expected time to finish?
– Should I now take courses A & B & C or C & D?
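The constraint types listed above (prerequisites, number of attempts, mandatory courses) can be illustrated with plain Python predicates over one student's trace. This is only a sketch: CurriM codifies constraints with Colored Petri nets and LTL, which the code below does not use, and the course ids, the trace layout and the function names are assumptions made for illustration.

```python
# Hypothetical trace: (period index, course id, passed?) in chronological order.
trace = [
    (1, "CALC1", False),
    (2, "CALC1", True),
    (2, "PROG1", True),
    (3, "CALC2", True),
    (4, "ALGO", True),
]

def check_prerequisite(trace, before, after):
    """`before` must be passed in a strictly earlier period than the first attempt at `after`."""
    passed_at = None
    for period, course, ok in trace:
        if course == before and ok and passed_at is None:
            passed_at = period
        if course == after:
            return passed_at is not None and passed_at < period
    return True  # `after` was never attempted, so the rule is not violated

def check_max_attempts(trace, course, limit):
    """The course may be attempted at most `limit` times."""
    return sum(1 for _, c, _ in trace if c == course) <= limit

def check_mandatory(trace, required):
    """Every course in `required` must have been passed."""
    passed = {c for _, c, ok in trace if ok}
    return set(required) <= passed

def conformance_report(trace):
    """Evaluate a small, hypothetical rule set against one trace."""
    return {
        "CALC1 before CALC2": check_prerequisite(trace, "CALC1", "CALC2"),
        "PROG1 before ALGO": check_prerequisite(trace, "PROG1", "ALGO"),
        "at most 2 attempts at CALC1": check_max_attempts(trace, "CALC1", 2),
        "mandatory courses passed": check_mandatory(trace, {"CALC1", "PROG1", "ALGO"}),
    }

for rule, ok in conformance_report(trace).items():
    print(rule, "->", "ok" if ok else "VIOLATED")
```

Run on a partial trace, the same checks give the awareness view: which rules the currently chosen path already violates.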
Main Results
• Software prototype – CurriM as a ProM plugin
– Focus on GUI + architecture/interfaces
– Demonstrates the concept
• Experiments with TUE dataset
– Prerequisites, bottleneck/predictive courses
– Recommendations
– Data quality is the key
• Clear motivation and need for a continuation
– The concept is found to be promising
– Potential and feasibility are shown
– Roadmap
Why Do Students Like the Concept?
CurriM is a tool that
• Provides orientation:
– Curriculum as a guide and motivation
– See the connections and dependencies
• Provides awareness and recommendations
– Global: how good is their personal education route,
where they currently are, where they are heading,
how well they do in comparison with others
– Local: what would it mean to take course X
• Enables better planning and regular monitoring
– Focus on what looks important, not just interesting
Main Lessons Learnt
Data quality is the key
• Administrative DBs and existing data collection
organization do not keep EDM/LA in mind
• Lots of preprocessing and reorganization is required
Meta-data is the other key (lacking codifiability)
• Everything that is scattered in study guides and minds of
study advisors should become easy to codify
Curriculum changes more often than we tend to think
• Semesters-trimesters-quartiles, courses & course ids
Being too “flexible” (written vs. unwritten rules)
• Effectively means no formal curriculum
Conclusions
• CurriM can become a big success
– The students seem to like the idea
– It is promising and it is feasible; but it is a long way
from the current concept to a fully functional and
usable tool
• Surf funding opportunity in LA was nice
– Triggered us to take concrete practical steps, a tool
rather than techniques development;
– But a more serious commitment is needed to
make a real breakthrough and bring CurriM into
the educational practice
Continuation Roadmap
Conditional on funding opportunities
• Working out the full cycle of the information
flows including pattern mining, predictions and
recommendations, and its
integration/parallelization with the administrative
processes
• Working out different views and functionality for
students vs. educators, HCI/usability aspects
• Improving the quality of data collection
• Facilitating knowledge base construction (metadata, mappings)
• Facilitating curriculum formalization for faculties
(tooling)
Project Team
Project leader:
• dr. Mykola Pechenizkiy – educational data mining expert
Driving force:
• Pedro Toledo – software developer, applied researcher
Technology experts:
• Prof. dr. Paul De Bra – Human-computer interaction and databases
expert
• dr. Toon Calders – pattern mining expert, assistant professor
• dr. Nikola Trcka – collaborator on curriculum mining, postdoc
• dr. Boudewijn van Dongen – process mining expert, assistant
professor
• dr. Eric Verbeek – ProM software expert, scientific programmer
Domain experts
• Several domain experts, i.e. responsible educators, are available for
CurriM on request: dr. Karen Ali (STU), Prof. dr. Mark de Berg (CSE)
Additional slides
• Including some from the original proposal
Learning Analytics @Surf
29 February 2012, Utrecht,
CurriM: Curriculum Mining Project Proposal
Mykola Pechenizkiy, Eindhoven University of Technology
Execution plan
Task 1. Developing the first software prototype for
academic curriculum modeling. As mini R&D cycles:
• identifying types of curriculum specific patterns we
need to mine from the event logs (in collaboration with
the domain experts) and to include in the curriculum
modeling and developing corresponding pattern
mining and pattern assembling techniques;
• implementing the techniques and integrating them with
ProM, which provides an important process mining
foundation framework and many of the building blocks
for curriculum modeling software;
• testing a particular piece of software.
Execution plan
Task 2. Case study: modeling the curriculum of the
Department of Computer Science, TUE; Goals:
• Validating the correctness and usefulness (to the
end users, i.e. teachers, study advisers, students)
of the developed curriculum mining techniques
and their implementations.
• Developing guidelines for managing the
curriculum related data to avoid the problems we
will encounter or envision during the case study.
• Task 1 and Task 2 will run simultaneously
ensuring timely feedback.
Execution plan
Task 3. Creating a roadmap for further study and
development of the curriculum modeling toolset
• Develop R&D agenda for the coming years.
• This includes identification of not only research
challenges i.e. answering the question
– “what kind of new data mining and process mining
techniques are needed to address the peculiarities of
the curriculum mining domain?”
• but also the strategy of the smooth technology
transfer to the prospective end users, i.e.
– early adopters (e.g. TUE or 3TU departments) that
would help to validate the usability and usefulness of
the curriculum mining software “in the wild”.
Learning Analytics Seminar,
August 30-31, Utrecht, NL
Educational Data Mining & Learning Analytics for All: Potential, Dangers, Challenges
Mykola Pechenizkiy, Eindhoven University of Technology
Educational Process Mining Toolbox
Intuition suggests that curriculum is
• Structured and easy to understand as we think
there are not that many options to choose from
– It may look just like this one:
• but the data may suggest that it looks different…
… data may suggest that students show
somewhat more diverse behaviour:
Two Different Tasks
Isolate a set of standard curriculum patterns and based on these patterns
• mine the curriculum as an executable quantified formal model and
analyze it, or
• first (manually) devise a formal model of the assumed curriculum and test
it against the data.
[Diagram: curriculum mining workflow. Labels: data log (event log in MXML format, supported by ProM); educators (typical forms of requirements in the curriculum, pre-authored pattern templates); pattern mining; pattern set; process assembling; process model (Colored Petri net); conformance checking; model extension; online monitoring.]
Application Scenarios
Scenario 1: Find the most common types of
behavior (and cluster them)
Scenario 2: Find emerging patterns, i.e. patterns
that capture significant
– differences in behavior of students who
graduated vs. those students who did not
– changes in behaviour of students from year
2006-07 to 2007-08.
– in both cases we search for patterns whose
support increases significantly from one dataset
to another (i.e. in space in the first case and in
time in the second case)
Scenario 3: After finding a bottleneck, find
frequent patterns that describe it, i.e. for which
students it is the bottleneck and why
Student  Timestamp  Events
A        S1         2, 3, 5
A        S2         6, 1
A        S3         1
B        S1         4, 5, 6
B        S3         2
B        S4         7, 8, 1, 2
B        S5         1, 6
C        S1         1, 8, 7

Student  Graduated
A        Yes
B        No
C        Yes
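The emerging-pattern search of Scenario 2 can be sketched as follows. The data is made up but has the same shape as the student/events/graduated example on this slide: one set of event ids per student, split by outcome. The thresholds, the function names and the use of a simple growth rate (target support divided by base support) are assumptions for illustration, not the project's actual technique.

```python
from itertools import combinations

# Hypothetical data: the set of event ids observed per student, split by outcome.
graduated = [{1, 2, 3}, {1, 2, 5}, {1, 2, 6}, {2, 3, 5}]
not_graduated = [{4, 5, 6}, {1, 4, 6}, {4, 6, 7}, {1, 2, 4}]

def support(pattern, dataset):
    """Fraction of students whose events contain every item of `pattern`."""
    return sum(pattern <= events for events in dataset) / len(dataset)

def emerging_patterns(base, target, max_size=2, min_support=0.5, min_growth=2.0):
    """Patterns frequent in `target` whose support grew by at least `min_growth` over `base`."""
    items = sorted(set().union(*base, *target))
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            s_target = support(pattern, target)
            if s_target < min_support:
                continue  # not frequent enough in the target dataset
            s_base = support(pattern, base)
            growth = float("inf") if s_base == 0 else s_target / s_base
            if growth >= min_growth:
                found.append((combo, s_base, s_target, growth))
    # Strongest growth first; ties are broken by the pattern itself.
    return sorted(found, key=lambda row: (-row[3], row[0]))

for combo, s_base, s_target, growth in emerging_patterns(not_graduated, graduated):
    print(combo, s_base, s_target, growth)
```

The same function covers both cases from the slide: pass the two outcome groups for the "in space" comparison, or the 2006-07 and 2007-08 cohorts as `base` and `target` for the "in time" comparison.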
Example 2-out-of-3 Pattern Check
• At least 2 courses from {2Y420, 2F725, 2IH20} must
be taken before graduation:
• A higher-level abstraction can be developed in the
longer run to avoid we aim at developing a
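A 2-out-of-3 rule of this kind can be checked on a single student trace with a few lines of Python. The sketch below assumes a trace is a chronological list of course codes and that graduation appears as an event named `GRADUATION`; both the trace layout and that event name are hypothetical. The three course codes are the ones from the slide.

```python
POOL = {"2Y420", "2F725", "2IH20"}  # course codes taken from the slide

def k_out_of_n_before(trace, pool=POOL, k=2, end_event="GRADUATION"):
    """True if at least k distinct courses from `pool` occur before `end_event`.

    A trace without `end_event` is treated as still running and is not
    reported as a violation.
    """
    seen = set()
    for event in trace:
        if event == end_event:
            return len(seen) >= k
        if event in pool:
            seen.add(event)
    return True

print(k_out_of_n_before(["2Y420", "2IH20", "GRADUATION"]))
print(k_out_of_n_before(["2Y420", "GRADUATION", "2IH20"]))
```

Repeated attempts at the same course count once, because only distinct pool courses are collected.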
Process Discovery Example
Which Courses Are Difficult/Easy for Which
Students?
References
• Trčka, N., Pechenizkiy, M. & van der Aalst, W. (2010) Process Mining from
Educational Data (Chapter 9), In Handbook of Educational Data Mining, pp.
123-142. London: CRC Press.
• Pechenizkiy, M., Trčka, N., Vasilyeva, E., van der Aalst, W. & De Bra, P.
(2009) Process Mining Online Assessment Data, In Proceedings of 2nd
International Conference on Educational Data Mining (EDM'09), pp. 279-288.
• Trčka, N. & Pechenizkiy, M. (2009) From Local Patterns to Global Models:
Towards Domain Driven Educational Process Mining, In Proceedings of Ninth
International Conference on Intelligent Systems Design and Applications
(ISDA'09), pp. 1114-1119.
• Bose, R.P.J.C., van der Aalst, W.M.P., Zliobaite, I. & Pechenizkiy, M.
(2011) Handling Concept Drift in Process Mining, In Proceedings of 23rd
International Conference on Advanced Information Systems Engineering
(CAiSE'2011), Lecture Notes in Computer Science 6741, Springer, pp. 391-405.
• Dekker, G., Pechenizkiy, M. & Vleeshouwers, J. (2009) Predicting Students
Drop Out: a Case Study, In Proceedings of the 2nd International Conference
on Educational Data Mining (EDM'09), pp. 41-50.
• http://www.processmining.org/
Short CV of the Project Leader
Mykola Pechenizkiy
Assistant Professor at Dept. of Computer Science, TU/e
Research interests: data mining and knowledge discovery,
particularly predictive analytics for information systems
serving industry, commerce, medicine and education.
http://www.win.tue.nl/~mpechen/ - projects, pubs, talks etc.
Major recent EDM-related activities:
Confirmed interest in CurriM at TUE
• Dr. Karen S. Ali - Director of Education and
Student Service Center, STU
• Prof. Dr. Mark de Berg - Director of the
graduate program, Dept. of Computer Science
• Dr. Marloes van Lierop - Director of the
bachelor program, Dept. of Computer Science
• Study advisers at different faculties