
Understanding and predicting the behaviors of teachers using an
online curriculum planning tool
Ogheneovo Dibie, Institute of Cognitive Science and Department of Computer Science, University of Colorado
Boulder, 1777 Exposition Drive Boulder, CO 80309, ogheneovo.dibie@colorado.edu
Abstract: A key interest of the learning sciences and analytics community is in developing
approaches to understanding the practices of teachers using educational technologies. This
knowledge is particularly useful in improving the adoption and diffusion of these
technologies. This paper presents a computational approach to understanding and predicting
the behaviors of Earth science teachers using a web based curriculum planning tool. It
illustrates a methodology for characterizing user behaviors, understanding usage trends and
frequent patterns of use, and predicting long term behaviors of teachers from small time
windows.
Introduction
From supporting educators in the classroom, to providing students with learning opportunities outside the
classroom, educational technologies play an important role in learning today (Brusilovsky, Farzan, & Ahn,
2005; Sumner & CCS Team, 2010). Understanding the complex work that constitutes teaching is an important
area of research in the Learning Sciences (Fishman & Davis, 2006). As many of the processes involved in
teaching move online, it is necessary to develop computational methods that improve our understanding of
teaching practices within online environments. This research demonstrates a computational approach to
understanding and predicting teacher behavior in an online curriculum planning tool based on clickstream data.
It builds upon research on computational approaches to understanding the use of educational digital resources.
One such approach is the work of Maull, Saldivar, & Sumner (2011) where a computational approach for
understanding teacher use of a web-based curriculum planning tool is investigated. They generated a typology
of user behaviors by clustering the clickstream of users. A user typology refers to the categorization of
aggregate user behaviors into distinct user types (Brandtzaeg, 2010). Each user type gives an overview of the
manner of use of members of that type (Brandtzaeg, 2010). For example, a Facebook “power user” could be an
individual who spends a lot of time on Facebook and uses it in a variety of contexts such as gaming, sharing
video, and instant messaging among others.
While current usage theories give insight into user types, they do not account for the evolution of such
behavior and practices. A user who is a power user after a year of using a product (one who spends a lot of time on it and exercises most of
its features) may not have exhibited this behavior from first use. This research
aims at shedding light on the evolution of a user type by understanding how the patterns of use that describe it
change over time and what frequent patterns can be observed within it. Furthermore, this research introduces
computational models for predicting a user's behavior based on knowledge of a user typology. This could be
especially useful in assisting platform providers in supporting the needs of users based on predicted behaviors.
This support could be in the form of better professional development (PD) training or improved system
design/usability.
Research Questions
The following questions will be addressed in this research:
1. Does the usage pattern (in terms of usage feature values) that describes a user type remain the same
or does it vary, and if so, how?
2. What are the frequent patterns of use that can be observed within a user type?
3. How well can computational models predict a user's eventual user type from smaller time windows?
Question one examines how the values of the usage features that describe a user type change over time.
Question two examines which usage features go together within a particular user type. It provides a better
understanding of what users of a particular type are likely doing when doing something else. For question three, a
set of classifiers for predicting the user type of users at the end of a school year from earlier time frames, such as
the first month of use and the first semester of use, is explored. The aim is to discover which time frame and usage
features are most predictive of a user's user type.
Research Context
The research proposed in this paper is based on teacher use of a web based curriculum planning tool called the
Curriculum Customization Service (CCS). The CCS provides middle and high school Earth Science teachers
with access to digital versions of their instructional textbook (publisher material), curriculum-relevant high
quality digital library resources (interactive resources) and community contributed resources (my stuff & shared
stuff) (Sumner & CCS Team, 2010).
Figure 1: Snapshot of the CCS indicating its core components. Interactive Resource (open tab), Publisher materials (Earth Comm Activities
tab), Community resources (Shared stuff and My Stuff tab)
Background
This research draws upon theoretical models and computational approaches of determining technology use.
Theoretical Models of Technology Use
Technology use research follows from a rich history of work in technology adoption. Technology adoption
occurs when an individual decides that a technology is of value and decides to use it. Technology adoption
theories include Rogers' innovation diffusion theory (Rogers, 1995), the concerns-based adoption model (Fuller,
1975), and the technology acceptance model (Legris, Ingham, & Collerette, 2003).
A key theory on technology use is the theory of use-diffusion. Originally proposed by Ram and Jung
(Ram & Jung, 1990) and expanded on by Shih & Venkatesh (Shih & Venkatesh, 2004), it measures technology
use on two dimensions: frequency and variety. Usage frequency refers to how often a product is being used
while variety refers to the different applications or contexts within which a product is being used.
Computational Approaches for Determining Technology Use
Unlike the works of Ram & Jung and Shih & Venkatesh which analyzed usage from self-reported data,
computational approaches have been investigated to generate a typology of user behavior based on clustering of
actual usage. Clustering is a data mining technique for automatically grouping related items into bins (Han &
Kamber, 2006). Xu (2010) examined the use of clustering techniques to generate fine grained user typologies
within a web based instructional tool known as the Instructional Architect (IA). The IA is an educational digital
library service designed to facilitate the creation of simple instructional projects using web resources from the
National Science Digital Library (NSDL) and the web in general (ibid). Working with the Curriculum
Customization Service (CCS), an online curriculum planning tool incorporating digital library resources, Maull
et al. (2011) developed a typology of user behaviors observed in the CCS. This typology is inspired by a use-diffusion based methodology and characterizes use based on frequency and variety type metrics observable
through server usage logs. Frequency based metrics include the number of sessions and hours spent; while
variety based metrics include areas of the CCS that were accessed such as interactive resources, publisher
materials, and user-contributed resources.
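As a rough illustration of how frequency and variety metrics could be derived from a clickstream, the sketch below turns one user's click events into session counts, hours spent and per-area click counts. The event format, the area names and the 30-minute inactivity threshold used to split sessions are all assumptions made for this example; the source does not describe how the CCS logs were sessionized.

```python
from collections import Counter

# Assumed inactivity gap (in seconds) that starts a new session.
SESSION_GAP = 30 * 60

def usage_features(events):
    """Summarize one user's clicks.

    events: list of (timestamp_in_seconds, area_name) tuples, in any order.
    """
    events = sorted(events)
    sessions = 0
    active_seconds = 0
    previous = None
    for timestamp, _area in events:
        if previous is None or timestamp - previous > SESSION_GAP:
            sessions += 1  # long gap (or first click): a new session begins
        else:
            active_seconds += timestamp - previous  # time within a session
        previous = timestamp
    clicks = Counter(area for _timestamp, area in events)
    return {
        "sessions": sessions,               # frequency metric
        "hours": active_seconds / 3600,     # frequency metric
        "clicks_by_area": dict(clicks),     # variety metrics
    }

# Hypothetical events for a single teacher.
events = [
    (0, "publisher"), (600, "interactive"), (1200, "publisher"),
    (10000, "shared_stuff"), (10900, "my_stuff"),
]
features = usage_features(events)
print(features)
```

Here the first three clicks form one session and the last two a second one, since the gap between them exceeds the assumed threshold.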
This research extends the use-diffusion methodology used by Maull et al. (2011) for understanding
technology use in the following ways: (1) This work provides an understanding of how usage features that
characterize user types trend within the time period of the observed user type. A user type is defined by the
values of a set of usage features (described in the Research Design section). (2) This work introduces a market-basket analysis¹ of usage features to understand which features are likely used in concert per user type. This
answers the question: when users of type Y are doing X, what else are they likely to be doing? (3) This
work explores computational models to predict an educator's behavior from small time windows of usage data.
It aims to provide an understanding of the earliest time window of usage data and the usage features that best predict a user's eventual type.
¹ Market basket analysis is a data mining technique for understanding consumer behavior through their transaction patterns (Han & Kamber,
2006). It seeks to uncover meaningful and interesting associations in customer purchase data. A canonical example of the use of market-basket analysis is a grocery store analyzing consumer purchases to understand which items shoppers usually purchase together.
Research Design
The analyses conducted in this research are based on the use of CCS by 80 Earth science teachers during the
2011-2012 school year. Usage is analyzed through the lens of six usage features that cover the core areas of the
CCS. Following the use-diffusion methodology, these features are frequency based: number of sessions, hours
spent; and variety based: publisher material activity, interactive resource activity, my stuff activity and shared
stuff activity. Before describing each analysis, I introduce the idea of usage feature
discretization, which informs the analyses performed.
Usage Feature Discretization
Each usage feature was discretized into three equal-frequency bins: high, mid and low. Given the
relatively small size of the dataset, binning features allowed us to manage the wide distribution of continuous
feature values in a consistent fashion. To illustrate this, consider the set of usage features {Number of sessions,
Hours spent, Interactive Resource Activity, My Stuff Activity, Shared Stuff Activity, Publisher Material
Activity} with values {20,4,90,23,45,55} that define a user's aggregate usage at the end of an observation
period. The feature set here indicates that the user spent 4 hours on the platform across 20 sessions and
performed 90, 23, 45 and 55 click actions within the interactive resources, my stuff, shared stuff and publisher
material areas of the system respectively. Discretized feature values such as {high,high,mid,low,high} give a
sense of how a user’s usage compares with everyone else's. Discretized feature values inform our characterization of
clusters generated from clustering usage, trend analysis of usage features, understanding of frequent patterns of
use and prediction of user types.
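Equal-frequency binning of a single usage feature can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the tie-breaking rule (users with equal values are ordered by their position in the list) and the session counts are assumptions made for the example.

```python
def equal_frequency_bins(values, labels=("low", "mid", "high")):
    """Label each value so that every bin holds roughly the same number of users."""
    n = len(values)
    k = len(labels)
    # Rank users by feature value; ties fall back to list position (an assumption).
    order = sorted(range(n), key=lambda i: (values[i], i))
    binned = [None] * n
    for rank, index in enumerate(order):
        # The lowest third of ranks maps to labels[0], the next third to labels[1], etc.
        binned[index] = labels[min(rank * k // n, k - 1)]
    return binned

# Hypothetical "number of sessions" values for six teachers.
sessions = [20, 3, 45, 8, 12, 60]
session_bins = equal_frequency_bins(sessions)
print(session_bins)
```

Because bin boundaries depend on rank rather than on fixed thresholds, a label such as "high" only states where a user stands relative to the other users in the same observation period.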
Characterizing user types
Following a clustering approach similar to that of Maull et al. (2011), a user typology was generated by clustering the
clickstream of all users. The discretized feature values of each cluster were mapped to the user types discovered by
Maull et al. (2011) to generate a label. If a direct mapping did not exist, a new user type was generated.
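The source does not name the clustering algorithm, so the sketch below uses k-means purely as a stand-in to show how users could be grouped by their usage features. The two-feature vectors (sessions, hours), the choice of k = 3 and the deterministic farthest-point initialization are all assumptions made for this example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iterations=100):
    centroids = farthest_point_init(points, k)
    assignment = []
    for _ in range(max_iterations):
        # Assignment step: each user joins the nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break  # converged
        centroids = updated
    return assignment, centroids

# Hypothetical (sessions, hours) vectors for nine teachers.
users = [(1, 1), (2, 1), (1, 2), (20, 10), (21, 11), (19, 9),
         (50, 40), (52, 41), (51, 39)]
labels, centers = kmeans(users, k=3)
print(labels)
```

Each resulting cluster could then be summarized by its discretized feature values and compared against an existing typology to assign a label.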
Study 1: Detecting usage pattern trends
This study addresses the first research question. As noted earlier, a user type is described by the values of its
set of usage features. The aim of this analysis is to understand how the mean value of each usage feature for each
user type changes over time. To do this, each usage feature is observed on a semester-semester, bi-monthly and month to month basis. This gives a good idea of which usage features remain stable (in terms
of use) and which vary per user type. This is particularly useful in detecting strong features that define a
user type, as features that remain relatively stable could be good markers of a user type.
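One way to picture this trend analysis is to compute the mean of a usage feature per user type and month, then look at how much that mean moves between months. The records, the month labels and the use of a simple max-minus-min spread as a stability measure are assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (user_type, month, number_of_sessions) records.
records = [
    ("power user", "Sep", 30), ("power user", "Sep", 34),
    ("power user", "Oct", 31), ("power user", "Oct", 35),
    ("limited use", "Sep", 4), ("limited use", "Sep", 6),
    ("limited use", "Oct", 1), ("limited use", "Oct", 1),
]

def monthly_means(records):
    """Mean feature value for every (user type, month) pair."""
    grouped = defaultdict(list)
    for user_type, month, value in records:
        grouped[(user_type, month)].append(value)
    return {key: mean(values) for key, values in grouped.items()}

def spread(means, user_type):
    """Difference between the highest and lowest monthly mean of a user type."""
    values = [m for (t, _month), m in means.items() if t == user_type]
    return max(values) - min(values)

means = monthly_means(records)
for user_type in ("power user", "limited use"):
    print(user_type, spread(means, user_type))
```

The same grouping could be repeated with semester or bi-monthly keys in place of months, and on binned rather than raw values.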
Study 2: Frequent patterns of use
This study addresses the second research question. The aim of this study is to understand what actions of
members of a specific user type usually go together. As an example, this study addresses the question of what
members of the power user user type are likely doing when they spend a high amount of time on publisher
materials. This would give a good idea of what usage feature correlations occur within a user type. I take a
market-basket analysis approach to understand what frequent patterns occur within each user type by
considering the feature values per month for each user as a 6-item transaction.
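The two standard measures behind market-basket analysis, support and confidence, can be sketched on such 6-item transactions as follows. The feature names, the four user-months and their bin values are made up for illustration, and no particular rule-mining algorithm from the study is implied.

```python
FEATURES = ("sessions", "hours", "interactive", "my_stuff", "shared_stuff", "publisher")

def to_transaction(levels):
    """Turn one user-month of binned values into a 6-item transaction."""
    return frozenset(f"{name}={level}" for name, level in zip(FEATURES, levels))

def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

# Hypothetical binned user-months for one user type.
transactions = [
    to_transaction(("high", "high", "mid", "high", "mid", "high")),
    to_transaction(("high", "high", "high", "mid", "high", "high")),
    to_transaction(("mid", "high", "high", "high", "high", "mid")),
    to_transaction(("high", "high", "mid", "mid", "high", "high")),
]

heavy_use = {"sessions=high", "hours=high"}
rule_support = support(transactions, heavy_use)
rule_confidence = confidence(transactions, heavy_use, {"publisher=high"})
print(rule_support, rule_confidence)
```

A rule such as "high sessions and high hours imply high publisher material activity" would be considered interesting when both its support and its confidence are high.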
Study 3: Predicting user types
This study addresses the third research question posed. It examines the use of machine learning classifiers for
predicting a user’s behavior from small windows of time. The aim of this study is two-fold:
• Determine the earliest window of time that provides the best prediction of a user’s eventual class. A
user’s eventual class is the user type they hold membership in at the end of the year.
• Determine the usage feature(s) that are most predictive of a user’s eventual type.
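The comparison of time windows can be sketched as follows. The source does not name the classifiers used, so a 1-nearest-neighbor classifier with leave-one-out evaluation stands in here; the window names, the two-feature vectors and all values are hypothetical.

```python
def nearest_neighbor_predict(training, query):
    """Return the label of the training example closest to query."""
    closest = min(
        training,
        key=lambda example: sum((a - b) ** 2 for a, b in zip(example[0], query)),
    )
    return closest[1]

def leave_one_out_accuracy(examples):
    """Predict each user's eventual type from all other users."""
    hits = 0
    for i, (features, label) in enumerate(examples):
        others = examples[:i] + examples[i + 1:]
        hits += nearest_neighbor_predict(others, features) == label
    return hits / len(examples)

# Hypothetical (sessions, publisher clicks) accumulated within each window,
# paired with the user type each teacher holds at the end of the year.
windows = {
    "Sep": [
        ((8, 20), "power"), ((3, 4), "power"),
        ((4, 6), "community"), ((5, 9), "community"),
        ((2, 3), "limited"), ((1, 1), "limited"),
    ],
    "Sep-Oct": [
        ((20, 60), "power"), ((22, 55), "power"),
        ((10, 20), "community"), ((9, 25), "community"),
        ((2, 3), "limited"), ((1, 5), "limited"),
    ],
}

accuracy = {name: leave_one_out_accuracy(data) for name, data in windows.items()}
for name, value in accuracy.items():
    print(name, round(value, 3))
```

Repeating the evaluation with individual features removed would give a rough indication of which features carry the most predictive signal in a given window.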
Results
Characterizing User Types
The clustering analysis performed produced three distinct user types that roughly map to user types discovered
by Maull et al. (2011): power users, community seekers, and limited use users. The power users had a
high value in each of the usage features examined. The community seekers used the CCS at a lower intensity
than the power users; however, they had higher values in the shared stuff and my stuff
areas. The limited use user type featured a low value in all features. Many users in this category either used the
CCS platform sporadically or discontinued use after just a month or two.
Usage Pattern Trends
Power users: All usage features had a high value on a semester-semester and bi-monthly basis. However, when
analyzing the usage features on a month to month basis, interactive resource activity, my stuff activity and
shared stuff activity tended to vary between the mid and high value bins. This indicates that high values in the
number of sessions, hours spent and publisher material activity features are good indicators of a power user.
Community seekers: Semester-semester, bi-monthly and month to month analysis of the usage
features of users in this user type showed that, although all usage features tend to oscillate between the
low and mid value ranges, shared stuff activity, publisher activity and my stuff activity tended to be the most
stable usage features.
Limited Use: While users in this category generally tend to have a low value of use in all areas, on a month to
month basis usage feature values tend to vary between the low and mid value bins.
Frequent Patterns of Use
As a reminder, the goal of this study is to understand how usage features are used in concert. It aims to answer
the question: when a particular feature is used to a certain degree, what else are users likely to be doing? As
in the usage pattern trend analysis, frequent patterns were analyzed on a user type basis. I discuss the most
interesting associations discovered from the analysis below.
Power user type: On average, users in this category have a high value in all usage features. Considering the
monthly usage of each user as an item set in the transaction, it was discovered that when users exhibit a high
frequency of use (high values in the hours spent and number of sessions features), they were likely spending most of
their time accessing publisher materials, which also had a high value.
Community Seeker user type: Users in this group are more likely to use community resources than users in
the other groups. However, when they have a low frequency of use, they are more likely to be accessing
community resources.
Limited use: Users in this group featured a low use of the CCS on average. No particularly interesting
association rules were discovered; all rules tended to reinforce the fact that when users have a low use of the
system, they are also likely to have a low use of other system areas.
Predicting User Behavior
In predicting user behavior, a set of machine learning classifiers was applied to semester, bi-monthly
and monthly time windows. The months of March-May were the most predictive of a user’s type, with a
prediction accuracy of 87.5%. However, this prediction is not very useful, as it comes at the tail end of the school year.
October was found to be the earliest month where a fairly good prediction accuracy of 72.5% could be achieved.
Furthermore, the number of sessions, shared stuff activity and publisher material activity were the most
predictive features for that month.
Discussion & Conclusion
This paper introduced a methodology for characterizing user types, detecting trends in usage via changes in
usage feature values, understanding frequent patterns of use and predicting long-term usage behavior as
described by user types. This work uses clickstream data to capture teacher behaviors, getting at what they do in
an online system, but not at why (a challenge for learning analytics research). Future work will examine a
different metric for usage time. This work uses months of use. An alternative could be to consider the number of
clicks as a measure of usage time. Thus, instead of considering usage at the end of a school year, analysis could be carried
out on the first 1000 clicks of each user.
References
Brusilovsky, P., Farzan, R., & Ahn, J.-w. (2005). Comprehensive Personalized Information Access in an
Educational Digital Library. Joint Conference on Digital Libraries, 9-18.
Fishman, B., & Davis, E. A. (2006). Teacher learning research and the learning sciences. In R. K. Sawyer (Ed.),
The Cambridge handbook of the learning sciences (pp. 535-550). New York: Cambridge University
Press.
Fuller, F. F. (1975). Concerns of teachers: A developmental conceptualization. Teaching Effectiveness: Its
Meaning, Assessment, and Improvement, 175.
Gerardo, M. S. (2012). Teacher Adoption of a web-based instructional planning system. Ph.D. dissertation.
Boulder: University of Colorado Boulder.
Legris, P., Ingham, J., & Collerette, P. (2003). Why do people use information technology? A critical review of
the technology acceptance model. Information & management, 191-204.
Maull, K. E., Saldivar, M. G., & Sumner, T. (2011). Understanding digital library adoption: a use diffusion
approach. Joint Conference on Digital Libraries. Ottawa.
Ram, S., & Jung, H.-S. (1990). Journal of the Academy of Marketing Science, 18(1), 67-76.
Rogers, E. M. (1995). Diffusion of innovations. Simon and Schuster.
Shih, C.-F., & Venkatesh, A. (2004). Beyond adoption: development and application of a use-diffusion model.
Journal of Marketing, 68(1), 59-72.
Sumner, T., & CCS Team. (2010). Customizing science instruction with educational digital libraries. Joint
Conference on Digital libraries. New York.
Xu, B. (2010). Understanding Teacher Use of a Digital Library Service: A Clustering Approach. Logan, Utah:
Utah State University.