Context-enriched Activity Modelling
PhD Work (End of 1st Year)
Dimoklis Despotakis
Supervisor: Dr. Vania Dimitrova
University of Leeds, School of Computing
September 2010

Outline
• Motivation & Research Problem
• Research Goal & Hypothesis
• Related Work
• Methodology
• Research Contribution
• Discussion - Questions

Motivation & Research Problem
• Research and development in simulated environments
  • Platforms that are used for training (educational and professional)
  • High impact on current and future learning technologies
  • People learn through the simulated interaction experience
  • Does this experience reflect reality?
• Digital content published on the web
  • Rich content
  • People search, share and exchange comments (e.g. YouTube)
  • Content and social web interaction intuitively mirror reality
• How would you learn now, and in the future?
  • How is digital content, enriched with knowledge, represented in social web spaces?
  • Can this combination feed simulated settings?
ImREAL: Immersive Reflective Experience-based Adaptive Learning, in Seventh Framework Programme, ICT Call 5. 2009.

Motivation & Research Problem
• Searching the web
  • Why?
    • Exemplify your work?
    • Learn from other people's experience
  • How?
  • Effectiveness?
    • Precision and recall
    • Awareness of knowledge
    • Adaptation to individual perspectives

Motivation & Research Problem
• Application domain
  • Job-related activities
  • Impact of learning technologies in the workplace
• Experiential Learning Theory
  • Learning from personal experience
  • Learning from other people's experience
  • Reflection on everyday job practices
• Experience equals knowledge
  • Knowledge of job-related activities
  • Knowledge is "hidden" in job-related contextual descriptions and personal experience
  • Can we capture this knowledge?
  • Can digital content motivate and leverage knowledge capturing?
Kolb, A.Y. and D.A. Kolb, Experiential Learning Theory: A Dynamic, Holistic Approach to Management Learning, Education and Development, S.J. Armstrong and C.V. Fukami, Editors.
2009, Sage: London. p. 42-68.

Research Goal & Hypothesis
Goal
• Augment digital content related to job activities with multi-perspective contextual information in order to:
  • improve training, and
  • enable context-aware intelligent search
Hypothesis
• Semantically enriched methods can be developed to capture personal experiences and extract contextual descriptions of activities embedded in digital content
• Augmenting digital resources on the web with an extended context model will increase the effectiveness of content retrieval methods

Some Visualization
[Figure; labels: Digital Content presenting job-related activities, Context, Knowledge, Augmented Digital Content]

Research Questions
• How to represent contextual knowledge and turn a digital resource into an augmented and reflective digital object? This includes:
  • changing the knowledge structure when new information is added (i.e. new contextual descriptions);
  • providing appropriate knowledge views (i.e. parts of the knowledge structure) according to different user requirements.
• What knowledge perspectives of an activity should be captured? This includes:
  • defining the main actions in a particular job-related activity;
  • identifying their importance;
  • identifying what connections may exist;
  • deciding how to capture and represent actions related to different individual experiences and contexts.
• How to elicit contextual knowledge related to digital content with job activities? This includes:
  • deciding what type of digital content is most appropriate to exemplify job-related activities;
  • deciding which technique can be applied to capture knowledge nuggets embedded in human descriptions and comments;
  • deciding how to derive a knowledge structure representing the extended context of the activity embedded in a digital resource.
• How to use contextual knowledge to retrieve digital content related to a specific situation?
  This includes:
  • discovering structural connections between different pieces of contextual knowledge and finding similarities between context models;
  • defining effective algorithms for context matching.

Conceptual Framework

Related Work - Projects
AWESOME: Sharing Dissertation Writing Experiences
Challenge:
• support of the complex dissertation writing process in all stages;
• the resolution of issues that individual characteristics of the writing process hinder;
• the analysis and evaluation of methods to support the writing process
Work:
• AWESOME Dissertation Environment
• support dynamic content creation from knowledge and experience contributions
• social scaffolding as a pedagogical solution
• Semantic MediaWiki technology
Bajanki, S., et al., Use of Semantics to Build an Academic Writing Community Environment, in Proceedings of the 2009 Conference on Artificial Intelligence in Education: Building Learning Systems that Care: From Knowledge Representation to Affective Modelling. 2009, IOS Press. p. 357-364.

Related Work - Projects
MATURE: Organizational Knowledge Maturing
Challenge:
• evolution and eventual standardization of knowledge artefacts in parallel with the characteristics of learning processes
• reflexive knowledge base responsive to learners' needs
• career guidance
Work:
• based on the Semantic MediaWiki
• support annotation of content
• knowledge construction monitoring
• knowledge retrieval
• user collaboration services
• enrich the keyword index of an article with recommended tags
Weber, N., et al., Knowledge Maturing in the Semantic MediaWiki: A Design Study in Career Guidance, in Learning in the Synergy of Multiple Disciplines, U. Cress, V. Dimitrova, and M. Specht, Editors. 2009, Springer Berlin / Heidelberg. p. 700-705.

Related Work - Projects
APOSDLE: Aiding Task-based Self-directed Learning
Challenge:
• support the user (worker) at the very beginning of his work
• adapting the system's functionality to his learning needs
Work:
• system which recommends workplace documents
• Domain Model, Task Model (processes in the workplace), Competence Performance Model (hierarchy of tasks), Knowledge Capital (repository of annotated documents)
Ghidini, C., et al. APOSDLE: learn@work with Semantic Web Technology. in I-Know '07. 2007. Graz, Austria.

Related Work - Projects
KP-LAB: Collective Knowledge Creation for Workplace Learning
Challenge:
• knowledge creation and exchange for workplace learning
• records of job-related activities to create pedagogical scenarios for experiential learning
Work:
• Shared Space (collaborative environment)
• Semantic Wiki
• Tools to support document annotation
• Shared Authentic Objects
Markkanen, H., H. Barclay, and K. Schrey-Niemenmaa, Knowledge Practices Laboratory (KP-Lab) Overview. 2008.

Related Work - Project Advantages & Limitations
Advantages
• collective knowledge elicitation has been promoted to address the need for more holistic views of a domain;
• support the learner as an individual, rather than implementing general learner models;
• aid the learner in all steps of domain-specific processes
Limitations
• how is the user-learner connected with the knowledge base?
• how can we aid learning in domains where the activity does not involve finite states?
• what solutions can be developed to bypass the limitations of existing technologies to support domain knowledge modelling?
Approach
• user perspectives of digital content annotation and views to be implemented within the knowledge base structure
• provide different views of contextual descriptions, as opposed to having the user as an object based on which a system aims to adapt its functionality
• new means of adaptation will be provided by enriching the knowledge-augmented resources, enabling more effective content and knowledge retrieval
• augment digital records with an evolving and dynamic activity-context model
• test more effective methods of knowledge capturing, which adequately overcome the limitations that Semantic Wikis or loose knowledge elicitation schemas impose

Related Work – Capturing Knowledge
Semantic (Media) Wikis
Advantages
• Collaborative environment
• Collective knowledge assembly: multiple users can add knowledge nuggets and annotate wiki pages to produce links between them
• Consistency of content: maintain changes of content
• Accessing knowledge: accessibility of knowledge from users and search for content
• Reusing knowledge: ontology extraction
Limitations
• Flexibility of coding and error handling
  • Modularized coding will increase the maintainability and extensibility of the framework.
• "On the fly" interaction with the knowledge base and the current support of ontology extraction mechanisms should provide the opportunity for reflection
• Extend auto-completion and tag concept recommendation with more intelligent input recommendation to increase knowledge awareness
  • Reasoning over the knowledge base to construct agents that can recommend possible input to guide the elicitation or the retrieval.
• Support for individual querying algorithms and integration with more sophisticated approaches
  • Current development includes field value selection and keyword matching.
• User profiling mechanisms
  • A key concept in CRAM is the user.
  • Semantic Wikis provide only basic user authentication mechanisms, while profiling is missing
  • Capturing user perspectives (extracting a user model) from the content that a user has contributed is missing.
Krötzsch, M., D. Vrandečić, and M. Völkel, Semantic MediaWiki. 2006. p. 935-942.

Related Work – Capturing Knowledge
Information Extraction
Advantages
• retrieve certain types of information from unstructured text, usually in the form of natural language (this has also been applied to semi-structured text, e.g. web pages)
• entities and concepts (classes of objects and events), and relationships that may exist between them
• convert human-generated content (text) into a machine-understandable form
Limitations
• Unclassified concepts and relations
• Complex extraction rules have to be applied

Related Work – Capturing Knowledge
Information Extraction (2)
Approach: Ontology-based Information Extraction
• Ontologies provide a formal conceptualization of a domain
• "Processes unstructured or semi-structured natural language text through a mechanism guided by ontologies to extract certain types of information and presents the output using ontologies."
Types
• Semantic annotation of text: this concerns the fully guided process of adding semantics to text concepts using an ontology: here, the text is variable and the ontology is stable.
• Ontology population from text: here the ontology changes by adding instances found in the text to the corresponding ontology classes.
• Ontology learning from text: this type concerns building an ontology from scratch, using text mining techniques.
Wimalasuriya, D.C. and D. Dou, Ontology-based information extraction: An introduction and a survey of current approaches. Journal of Information Science, 2010. 36(3): p. 306-323.

Methodology
Application Domain Selection
Criteria
a. Content exists and/or can be collected quickly: existing digital records presenting human experiences (i.e. videos, images, text) or content that can be created and collected by the user.
b. Reasonable quality of content: content quality refers to digital quality (e.g. image resolution), richness of activity, richness of context allocation.
c. Users are available for experience capturing: comment on digital content.
d. Users are available for evaluation: evaluate the model and the retrieval.
e. Ethical issues (can we handle them reasonably?): access to activity records; content of activity records; social aspects (e.g. sensitivity, health, religion).
f. Emotions are embedded (Emotional Intelligence): potential to capture the effect of emotional states in human experiences.

Methodology
Application Domain Selection
• Volunteering
• Research
• Job Interviews
  • Great amount of digital content available freely
  • Wealth of contextual knowledge can be captured (students preparing for job placements, career advisers, interviewers)

Methodology
Platform to support the CRAM Framework
Input:
• Digital resources: videos of job interviews
• User input: textual descriptions of the video resource (comments and stories)

Methodology
How to represent contextual knowledge and turn a digital resource into an augmented and reflective digital object
• The main constructive component of CRAM is the Context Rich Activity Object (CRAO).
• A CRAO is a digital object that consists of a semantic multi-perspective knowledge wrapper around a digital record, representing job-related activities and individual experiences.
• Potential relations between CRAOs will be explored to build a semantic graph of objects and provide a classification framework.
• CRAO will be modelled as an ontology (also CRAM).
• Activity and Context will be modelled using ontologies.
• The representation of Activity will follow Activity Theory principles.
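
A minimal sketch of the CRAO idea, assuming plain Python dataclasses rather than the ontology representation planned in this work. It shows one digital record wrapped with user contributions, which can then be grouped into one view per contributor. The names `Comment`, `CRAO`, `record_uri` and `perspectives`, and the example URL, are hypothetical and not part of CRAM.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Comment:
    """One user contribution attached to a digital record (hypothetical)."""
    user: str  # who contributed the text
    kind: str  # "comment" or "story", the two kinds of user input
    text: str


@dataclass
class CRAO:
    """Hypothetical wrapper: one digital record plus user contributions."""
    record_uri: str
    comments: list[Comment] = field(default_factory=list)

    def add(self, user: str, kind: str, text: str) -> None:
        # Reject anything other than the two input kinds assumed here.
        if kind not in ("comment", "story"):
            raise ValueError(f"unknown contribution kind: {kind}")
        self.comments.append(Comment(user, kind, text))

    def perspectives(self) -> dict[str, list[Comment]]:
        """Group contributions by user: one view per contributor."""
        views: dict[str, list[Comment]] = defaultdict(list)
        for c in self.comments:
            views[c.user].append(c)
        return dict(views)


# Illustrative data only; the URL and texts are made up.
crao = CRAO("http://example.org/videos/interview-01")
crao.add("adviser", "comment", "The candidate avoids eye contact.")
crao.add("student", "story", "I froze on the same question last year.")
crao.add("adviser", "comment", "Good closing question.")

for user, items in crao.perspectives().items():
    print(user, len(items))
```

Grouping by user is only the simplest possible notion of a perspective; an ontology-based CRAO would attach typed concepts and relations instead of raw text.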

Methodology
What knowledge perspectives of an activity should be captured
Defining the main actions in a particular job-related activity
• An activity is represented by a job interview video resource [Activity: A];
• Actions are represented as segments (snippets) of the video [Action: a; A = [a1, a2, ..., an]];
• Each snippet has a start point (Ts) and an end point (Te) time stamp, which may be equal when the snippet is a single captured image [Ts ≤ Te];
• Each snippet is defined by the users who have commented or participated;
• Dealing with overlapped actions (time)
  • Allen's Interval Algebra
  • Retrieve content and knowledge according to input queries
• Identifying context dimensions of actions
  • description of the action presented in the video resource (CDR internal)
  • user's personal experiences (CU external) related to that action
• Identification and coverage of context dimensions
  • Formal discussion with domain experts to derive a core structure of the activity
  • Iterations of testing and evaluation
Identifying connections that may exist
• Micro-level
  • Sequence (time)
  • Similar
  • Connected
  • Contextually (knowledge views)
  • Discovering semantic relations using existing tools (WordNet, DISCO)
  • Derive patterns of annotated actions
• Macro-level
  • Connections between different digital content

Methodology
What knowledge perspectives of an activity should be captured (2)
Deciding how to capture and represent actions related to different individual experiences and contexts
Three types of Context will be implemented in CRAM:
• User Context (CU): this class refers to the contextual descriptions of personal job-related experiences and profile records.
• Digital-Record Context (CDR): this class refers to the contextual descriptions of the activity embedded in the digital content and intuitively represents real-life human experiences in job-related settings.
• Simulated Context (CS): this class refers to the contextual descriptions of (a) simulated environments for experiential learning and (b) structures of particular contextual queries for content retrieval.
CU and CDR will be dynamic and evolving structures, while CS will be predefined.
Multi-perspectiveness is defined here as the set of different views of contextual descriptions related to different user models (CU).

Methodology
How to elicit contextual knowledge related to digital content with job activities
• defining the type of resources that will be used as records of real job-related activities
  • a sample of video files of different types has been collected: guides, stories, examples
• defining the user input and the users' role
  • Capture and annotate actions
  • Provide personal experiences
  • Query the system
• developing a prototype to test the hypothesis of capturing multi-perspective knowledge and start collecting a corpus of resources and user input data
  • CRAM is the output ontology from the elicitation mechanism.
  • Corpus is the set of comments for a particular video resource that includes descriptions of the presented activity and descriptions of personal experiences.
  • Guide is a set of existing ontologies that will leverage the knowledge construction process.
  • Extraction Module is the Information Extraction method, driven by the Guide.

Methodology (continued)
The GATE toolkit will be used for the Extraction Module to:
• semantically annotate the Corpus based on the CRAM ontology
• populate the CRAM ontology
• enrich the CRAM ontology with segments of the Guide ontologies (i.e. a second similar module will be integrated in the Extraction Module to semantically annotate the Corpus based on the Guide ontologies)
Tools from GATE
• Partially ANNIE
• OntoRoot Gazetteer for automatic semantic annotation of text
• The GATE ontology API for ontology population
Cunningham, H., et al. GATE: A framework and graphical development environment for robust NLP tools and applications.
in Proceedings of the 40th Annual Meeting of the ACL. 2002.

Methodology
How to use contextual knowledge to retrieve digital content related to a specific situation
• Provide keywords (semi-structured): with a single input field for keywords, an ontology reasoner can be used to derive ontology components (classes, properties) and automatically generate SPARQL queries. Users can then select a matching query and retrieve the results to collect the appropriate CRAO(s).
• Provide input fields (structured): offering existing categories and properties as input fields, with recommended values, will also generate the input graph to match against the CRAM ontology. In this way, the SPARQL query is effectively defined by the user.
As an ultimate task for this work, the functionality of the model will be evaluated not only with individuals' context but also with the contextual activity aspects of simulated environments, in order to align the simulated experiences with real-world job-related aspects. This task will involve the extraction of the context model (by experts in the field) and application of the above technique to CRAM.

Methodology
Plan & Evaluation
[Figure: project plan; labels: Prototype, Experimentation, Domain Experts, Core Model, Core User Model, Research & Development, Research & Development on content and knowledge retrieval, Data collection, Domain Experts Evaluation, Extensive Evaluation with Users, Publication, Context Model, Multi-perspective Context Model, CRAM]

Research Contribution
• Contribution to Context-Aware Systems: developing an advanced method for capturing a holistic context-enriched activity model that augments digital resources and enables intelligent context-aware retrieval.
• Contribution to User Modelling and User-adaptive Systems: capturing different user perspectives of a job activity to derive user models and provide user-adapted content retrieval.
• Contribution to Semantic Web: an innovative ontological approach to semantically augment and link digital content.
It is important to state that the contribution to the above 'technical' areas is driven by a key application problem – advancing technologies for learning.

Discussion - Questions