Presentation - Effective Learning Analytics

Developing metrics and predictive algorithms for your institution – the Marist story
JISC LEARNING ANALYTICS NETWORK EVENT
Sandeep Jayaprakash
LEAD DATA SCIENTIST
MARIST COLLEGE & APEREO
Twitter - @sandeep_jay1
Presentation Overview
Open Academic Analytics Initiative (OAAI)
◦ Objectives
◦ Data extraction & preparation
◦ Predictive Models and Results
◦ Impact on Student Success
◦ Delivering insights to end users
Predictive Analytics in Action:
Open Academic Analytics Initiative
PRACTICAL EXAMPLE: EARLY ALERT SYSTEM
Open Academic Analytics Initiative
EDUCAUSE Next Generation Learning Challenges (NGLC) in the United States
Funded by the Bill & Melinda Gates Foundation
$250,000 over a 15-month period
Goal: Leverage Big Data and analytics to create
an open-source academic early alert system and
research “scaling factors”
Input Data Considerations
A predictive model is only as good as its training data
Good:
◦ Volume - Have lots of data (multiple semesters)
◦ Variety - Diverse data
◦ Veracity – the data must be trustworthy and accurate
Not so good:
◦ Data Quality Issues
◦ Unbalanced classes (at Marist, only ~6% of students are at risk – good for the
student body, bad for training predictive models; see the sketch below)
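To see why this imbalance hurts, here is a minimal sketch on synthetic data (not Marist's): a baseline that always predicts "not at risk" scores about 94% accuracy while catching zero at-risk students, so accuracy alone is a misleading target.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, recall_score

# Synthetic cohort: 1,000 students, ~6% labelled at risk (1).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))               # five arbitrary predictors
y = (rng.random(1000) < 0.06).astype(int)

# A "classifier" that always predicts the majority class (not at risk).
baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = baseline.predict(X)

print(f"accuracy: {accuracy_score(y, pred):.2f}")               # ~0.94
print(f"recall (at-risk caught): {recall_score(y, pred):.2f}")  # 0.00
```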
Learning Analytics preparedness
Learning analytics is interdisciplinary and requires coordination among
different groups
High-level buy-in from management is needed for smooth execution
Data comes from a wide range of systems
Ethics and policy should go hand in hand
Open Academic Analytics Initiative (OAAI)
Jisc Data Specification Github link
Predictive Modeling process
Feature Extraction - Data Quality Issues
Variability in instructors' assessment criteria
Variability in workload criteria across modules
Variability in the period used for prediction (early detection)
Variability in grading criteria across modules (partial grades
with variable contribution)
Data Quality Issues
Variability in VLE tool deployment by instructors
Variability in tool usage by students
Result - missing values and holes in your data
[Figure: modules × tools matrix illustrating the gaps]
How do we address them?
Handling variability - use ratios and class averages (see the sketch after this list)
◦ Activity - percent of usage over the average percent of usage per course
◦ Grades - effective weighted score / average effective weighted score
Handling missing values
◦ Follow an 80/20 rule for the selection of metrics
◦ Perform data imputation to further enrich data quality
◦ Build cohort-based models to leverage more predictors
Sampling - balance the datasets
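A minimal pandas sketch of the ideas above (hypothetical column names, not the actual OAAI schema): ratio features against course averages, mean imputation for the gaps, and naive oversampling to balance the classes.

```python
import pandas as pd

# Toy gradebook/VLE extract; column names are illustrative only.
df = pd.DataFrame({
    "course_id":      ["A", "A", "A", "B", "B", "B"],
    "vle_visits":     [40, 10, None, 5, 25, 30],
    "weighted_score": [0.9, 0.4, 0.7, None, 0.8, 0.6],
    "at_risk":        [0, 1, 0, 1, 0, 0],
})

# Handle variability: express each metric relative to its course average.
course_avg = df.groupby("course_id")[["vle_visits", "weighted_score"]].transform("mean")
df["activity_ratio"] = df["vle_visits"] / course_avg["vle_visits"]
df["score_ratio"] = df["weighted_score"] / course_avg["weighted_score"]

# Handle missing values: impute gaps with the overall mean of each ratio.
for col in ["activity_ratio", "score_ratio"]:
    df[col] = df[col].fillna(df[col].mean())

# Balance the dataset: naively oversample the minority (at-risk) class.
minority, majority = df[df["at_risk"] == 1], df[df["at_risk"] == 0]
balanced = pd.concat(
    [majority, minority.sample(len(majority), replace=True, random_state=0)]
)
print(balanced["at_risk"].value_counts())  # now 50/50
```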
Predictors of Student Risk
◦ VLE predictors were measured relative to course averages
◦ Some predictors were discarded if not enough data was available
Machine Learning Classifiers
C4.5/C5.0 Boosted Decision Tree
Logistic Regression
Support Vector Machines
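A scikit-learn sketch of the three classifier families above, run on synthetic data (scikit-learn ships no C4.5/C5.0, so a gradient-boosted tree ensemble stands in for the boosted decision tree):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the Marist data: ~6% positive (at-risk) class.
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.94], random_state=0)

models = {
    "boosted decision tree": GradientBoostingClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "support vector machine": SVC(kernel="rbf"),
}

for name, model in models.items():
    # Recall on the at-risk class matters more than raw accuracy here.
    scores = cross_val_score(model, X, y, cv=5, scoring="recall")
    print(f"{name}: mean recall = {scores.mean():.2f}")
```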
Predictive Performance of Marist Model
Research Design
Models were developed based on Marist data
◦ 85% accuracy in capturing at-risk students
Deployed the OAAI system to 2,200 students across four institutions
◦ Two community colleges (FE institutions)
◦ Two Historically Black Colleges and Universities (BAME institutions)
Design: one instructor taught three sections of the same module
◦ One section served as the control; the other two were treatment groups
Each instructor received an Academic Alert Report (AAR) three times during the semester
◦ Intervals were 25%, 50% and 75% into the semester
Institutional Profiles
Predictive Model Portability Findings
Conclusion
1. Predictive models are more "portable" than anticipated.
2. It is possible to create generic models, and the process can be "ported" for use at specific types of institutions.
3. This opens up the possibility of a library of open predictive models and techniques that could be shared across institutions to Learn @ Scale.
Intervention Research Findings – Final Course Grades
Analysis showed a statistically significant positive impact on final course grades
◦ No difference between treatment groups
A larger impact was seen in spring than in fall
A similar trend held among low-income students
Intervention Research Findings – Content Mastery
Students in intervention groups were statistically more likely to
"master the content" than those in controls
◦ Content mastery = grade of C or better
Similar for low-income students
Intervention Research Findings – Withdrawals
Students in intervention groups
withdrew more frequently than
controls
Possibly due to students avoiding
withdrawal penalties.
Consistent with findings from
Purdue University
Instructor Feedback
"Not only did this project directly assist my students by guiding
students to resources to help them succeed, but as an
instructor, it changed my pedagogy; I became more vigilant
about reaching out to individual students and providing
them with outlets to master necessary skills.
P.S. I have to say that this semester, I received the highest
volume of unsolicited positive feedback from students,
who reported that they felt I provided them exceptional
individual attention!"
More Research Findings…
Jayaprakash, S. M., Moody, E. W., Lauría, E. J. M., Regan, J. R., & Baron, J. D. (2014). Early alert of academically at-risk students: An open source analytics initiative. Journal of Learning Analytics, 1(1), 6–47.
Dashboards – Deliver Insights
Learning Activity Radar Chart
Short Demo video link
Radar chart - Low Risk pattern example
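A minimal matplotlib sketch of such a radar chart; the activity categories and the student's values are illustrative, not taken from the actual dashboard:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical activity categories; values are ratios vs. the course average.
categories = ["Logins", "Content views", "Forum posts",
              "Assignments", "Quiz attempts"]
student = [0.9, 0.8, 0.7, 1.0, 0.85]
course_avg = [1.0] * len(categories)

# Evenly spaced angles, repeating the first point to close each polygon.
angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for label, values in [("Student", student), ("Course average", course_avg)]:
    vals = values + values[:1]
    ax.plot(angles, vals, label=label)
    ax.fill(angles, vals, alpha=0.15)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(categories)
ax.legend(loc="lower right")
plt.show()
```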
Early Alert Insights – Risk Quadrant
Students are plotted on two axes (Student Performance vs. Student Engagement), yielding four quadrants:
◦ High Performance / High Engagement
◦ High Performance / Low Engagement
◦ Low Performance / High Engagement
◦ Low Performance / Low Engagement
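A small sketch of the quadrant logic (thresholds and names are illustrative): each student is binned by performance and engagement relative to the course average, where a ratio of 1.0 means average.

```python
def risk_quadrant(performance_ratio: float, engagement_ratio: float) -> str:
    """Bin a student into one of the four quadrants above."""
    perf = "High" if performance_ratio >= 1.0 else "Low"
    eng = "High" if engagement_ratio >= 1.0 else "Low"
    return f"{perf} Performance / {eng} Engagement"

# Hypothetical students as (performance_ratio, engagement_ratio) pairs.
students = {
    "alice": (1.2, 1.1),  # above average on both axes
    "bob":   (1.1, 0.6),  # performing, but disengaged
    "carol": (0.7, 0.5),  # the highest-risk quadrant
}
for name, (perf, eng) in students.items():
    print(f"{name}: {risk_quadrant(perf, eng)}")
```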
Future Research
◦ Expand our feature set
◦ Try to further reduce the percentage of false alarms we raise
◦ Scalability enhancements leveraging Hadoop/Spark
◦ Dynamic modeling capabilities
◦ More UX research on building intuitive dashboards
Join the mailing list!
analytics@apereo.org
(subscribe by sending a message to
analytics+subscribe@apereo.org)
Want the latest updates?
Apereo Learning Analytics Initiative
Wiki: https://confluence.sakaiproject.org/x/rIB_BQ
GitHub: https://github.com/Apereo-Learning-Analytics-Initiative
Sandeep Jayaprakash: Sandeep.Jayaprakash1@marist.edu