PlanetData
Network of Excellence
FP7 - 257641

D6.2 Open training infrastructure

Coordinator: Carolina Fortuna (STI)
With contributions from: Mitja Jermol (JSI), Davor Orlič (JSI), Lyndon Nixon (STI), Simeona Pellkvist (STI)
1st Quality reviewer: [Pablo Mendes]
2nd Quality reviewer: [Freddy Priyatna]

Deliverable nature: Report (R)
Dissemination level (Confidentiality): Public (PU)
Contractual delivery date: 31.3.2011
Actual delivery date: 28.3.2011
Version: 0.5
Total number of pages: 33
Keywords: knowledge, virtual community, video lectures, platform, open infrastructure, training, open content, curriculum, personalization, contextualization, training portal, opencast, filming, captioning, video repository, online training

PlanetData Deliverable D6.2

Abstract

This deliverable represents the first version of the Open Training Infrastructure and provides an overview of the infrastructure created to support training in the PlanetData network. The infrastructure will be made available on the Web via the Web portal created in WP7.

The concept of the Knowledge Sharing Community (KSC) presents a vital motivation force for any type of organisation (traditional, networked, and virtual) as well as for the operation of virtual communities such as the Professional Virtual Community. It has been successfully adopted in the on-going COIN IP project, which is currently developing an ICT integrated solution (the COIN system) starting from notable existing research results in the fields of Enterprise Interoperability (EI) and Enterprise Collaboration (EC).

The KSC concept, its definition and its user validation as presented in this deliverable rely heavily on “D2.3.1. Training setup and assessment deliverable”, whose content has been updated to fit the purpose of this PlanetData deliverable, “D6.2 Open training infrastructure”.
[End of abstract]

Executive summary

This document describes the plan for the open training infrastructure of PlanetData. The plan comprises the set-up of the PlanetData Open Training Infrastructure and the migration of all the materials available in the REASE portal into a one-stop-shop entry point at VideoLectures.NET, where the new video materials, presentations and educational management solutions will blend together with ideas, findings and developments freely communicated inside the community. This way the already gathered REASE materials will also make a stronger appearance in the PlanetData Open Training Infrastructure (OTI) community.

The document describes the concept, strategy and detailed plan for the Open Training Infrastructure activities that will be performed in the time frame of the project and as such presents a rough scenario for the sustainability of the KSC (Knowledge Sharing Community).

The document focuses on:
- the aim and goals of the PlanetData Open Training Infrastructure,
- personalised knowledge transfer that defines the training delivery approach,
- building of the PlanetData Open Training Infrastructure community via the PlanetData KSC.

The document is addressed to:
- PlanetData consortium partners, and in particular the impact creation cluster,
- PlanetData KSC members and participants in the training events.

The document consists of:
- an initial introductory section,
- a chapter dedicated to the description of the open training infrastructure in PlanetData,
- a chapter describing the PlanetData KSC management approach and structure.

The deliverable concludes with a brief status report and the future sustainability plan.
Document Information

IST Project Number: FP7 - 257641
Acronym: PlanetData
Full Title: PlanetData
Project URL: http://www.planet-data.eu/
Document URL:
EU Project Officer: Leonhard Maqua
Deliverable: Number D6.2, Title Open training infrastructure
Work Package: Number WP6, Title Training
Date of Delivery: Contractual M6, Actual M6
Status: version 0.1, final □
Nature: prototype □, report X, dissemination □
Dissemination level: public X, consortium □
Authors (Partner): STI2
Responsible Author: Davor Orlič (JSI), e-mail davor.orlic@ijs.si, phone +3931844381
Responsible Author: Mitja Jermol (JSI), e-mail mitja.jermol@ijs.si, phone +39 1 477 3593
Responsible Author: Simeona Pellkvist (STI), e-mail simeona.pellkvist@sti2.org, phone +39 1 2364 002 14
Responsible Author: Lyndon Nixon (STI), e-mail lyndon.nixon@sti2.org, phone +39 1 2364 002 13

Abstract (for dissemination): This document describes the plan for open training infrastructure of PlanetData which comprises a set-up for the PlanetData Open Training Infrastructure and the migration of all the materials available in the REASE portal into a one-stop-shop entry point at VideoLectures.NET where the new video materials, presentations and educational management solutions will blend together with ideas, findings and developments freely communicated inside the community. This way the already gathered REASE materials will also make a stronger appearance in the PlanetData Open Training Infrastructure (OTI) community.

Keywords: knowledge, virtual community, video lectures, platform, open infrastructure, training, open content, curriculum, personalization, contextualization, training portal, opencast, filming, captioning, video repository, online training

Version Log

Issue Date | Rev. No. | Author | Change
1.3.2011 | 1 | Davor Orlič | document draft, TOC draft
24.3.2011 | 2 | Mitja Jermol | change of TOC, formatting, content updates, paragraph exclusion
28.3.2011 | 3 | Davor Orlič | TOC and content update, addition of screenshots, website statistics update
28.3.2011 | 4 | Mitja Jermol | JSI internal review
31.3.2011 | 5 | Lyndon Nixon | Minor comments on deliverable authors and other references

Table of Contents

Executive summary
Document Information
Table of Contents
List of figures and list of tables
Abbreviations
Definitions
1 Introduction
  1.1 Purpose
  1.2 The aim of PlanetData
2 PlanetData open training infrastructure requirements
  2.1 Functions and properties of PlanetData KSC
    2.1.1 Functions and services
    2.1.2 Knowledge Sharing Community organisation
    2.1.3 Social and societal issues concerning Knowledge Sharing Communities
    2.1.4 Technology level and information infrastructure
  2.2 Curriculum and target public
  2.3 Training materials, delivery methods and channels
    2.3.1 Training materials
    2.3.2 Content delivery methods
    2.3.3 Content delivery channels
  2.4 Basic building blocks
    2.4.1 VideoLectures.NET educational video website
      2.4.1.1 Long term content/context preservation
      2.4.1.2 Services
    2.4.2 REASE repository
3 PlanetData OTI design
  3.1 Selected integration model
  3.2 Basic Structure
  3.3 Current status of the training portal
  3.4 Opencast Matterhorn project
  3.5 Content Personalization
  3.6 Context Search
4 User validation methods and KSC development plan
5 Conclusion
References

List of figures and list of tables

Figure 1: The structure of the curriculum for COIN IP
Figure 2: The general structure of the training activities
Figure 3: VideoLectures.NET website
Figure 4: Platform workflow architecture
Figure 5: The editorial page knowledge storage preparation - recordings tracking
Figure 6: The editorial page knowledge storage preparation - lecture editing
Figure 7: Platform editorial office
Figure 8: The editorial page knowledge storage preparation - event statistics
Figure 9: The REASE website
Figure 10: OTI infrastructure layers
Figure 11: Current personalization loop at VideoLectures.NET
Figure 12: Quintelligence Miner Architecture at VideoLectures.NET
Figure 13: Visitor Statistics
Figure 14: Opencast Matterhorn's features
Figure 15: Planned personalization loop at VideoLectures.NET
Figure 16: SearchPoint query for “leopard”

Table 1: Language related statistics
Table 2: User validation activities

Abbreviations

KSC: Knowledge Sharing Community
OTC: Open Training Community
ICT: Information and Communication Technologies
DC: Dublin Core
CMS: Content Management System

Definitions

Traditional tutorial is a single-topic lecture (one to two hours) that is performed in the traditional way within a non-PlanetData training event, e.g. a conference.

Traditional lecture or seminar is a traditional training event which is part of a formal educational program performed either at universities or at companies as an internal training program.

Self learning course is an ICT based training course that can be accessed in the KSC. It is a set of training activities and materials that are prepared for self learning and discovering.

Distance learning course is an ICT based training course that can be accessed in the KSC. A distance learning course combines training activities that are supported with electronic and traditional materials and are guided by a tutor.

Self-assessment course is an ICT based training course aimed at trainees' self-testing. This training course combines several assessment methods, from questions and quizzes to simulation and activity games.
Quest module is a training approach that enables trainees to explore new knowledge by “travelling” freely through a semantically structured network of knowledge atoms. A particular node in the network can be passive or a PlanetData element requiring activities ranging from learning to solving problems. The Quest module is an ICT based training module.

Traditional workshop is a one to two day training event organised for a particular target audience. It is organised in the traditional way, with a set of training lectures given by experts on the topics of the PlanetData project.

1 Introduction

The aim of the open training platform in PlanetData is to achieve the transfer of knowledge and best practices within the project as well as outside the project. This will be achieved and presented in this deliverable by:
- Building, managing and maintaining the Open Training Infrastructure (OTI Community) and its activities, supported by a training portal, ICT based training tools and ‘radar’ activities to gather information from other relevant projects, programmes and initiatives,
- Migrating and implementing atoms of knowledge from other ICT platforms such as REASE for the benefit of the PlanetData community,
- Disseminating project results and lessons learned through the training programme and curriculum.

1.1 Purpose

The PlanetData curriculum (D6.1) on data gathering will be one of the main training results in PlanetData training. The developed curriculum will form a body of knowledge that will be useful for education and training in the wide area of semantic web and data technologies. For that purpose, during the timeframe of the project the training courses and corresponding training materials will be put in the ICT supported OTI Community.

1.2 The aim of PlanetData

PlanetData aims to establish a sustainable European community of researchers that supports organizations in exposing their data in new and useful ways.
The ability to effectively and efficiently make sense out of the enormous amounts of data continuously published online, including data streams, (micro) blog posts, digital archives, eScience resources, public sector data sets, and the Linked Open Data Cloud, is a crucial ingredient for Europe’s transition to a knowledge society. It allows businesses, governments, communities and individuals to take decisions in an informed manner, ensuring competitive advantages and general welfare.

Research will concentrate on three key challenges that need to be addressed for effective data exposure in a usable form at global scale. We will provide representations for stream-like data, and scalable techniques to publish, access and integrate such data sources on the Web. We will establish mechanisms to assess, record, and, where possible, improve the quality of data through repair. To further enhance the usefulness of data, in particular when it comes to the effectiveness of data processing and retrieval, we will define means to capture the context in which data is produced and understood, including space, time and social aspects. Finally, we will develop access control mechanisms: in order to attract exposure of certain types of valuable data sets, it is necessary to take proper account of their owners’ concerns to maintain control and respect for privacy and provenance, while not hampering non-contentious use.

The project commits to test all of the above on a highly scalable data infrastructure, supporting relational, RDF, and stream processing, and on novel data sets exposed through the network, and to derive best practices for data owners. By providing these key precursors, complemented by a comprehensive training, dissemination, standardization and networking program, we will enable and promote effective exposure of data at planetary scale.
2 PlanetData open training infrastructure requirements

The PlanetData Knowledge Sharing Community coordinates its actions within the PlanetData impact creation strategy. The PlanetData KSC operates on similar principles to Professional Virtual Communities, which present a vital motivation force for any type of organisation (traditional, networked, and virtual) as well as for virtual community operation. The following chapter mainly focuses on the ICT based training which is accessible through the OTI entry point in the video training portal (http://videolectures.net/planetdata).

2.1 Functions and properties of PlanetData KSC

The PlanetData Knowledge Sharing Community aims at:
- joining relevant experts, researchers, business decision makers, engineers, industry representatives, students, scholars, interested parties from the public sector and the general public to share and create new knowledge, and to share experiences and lessons learned,
- motivating, supporting and fostering the establishment of a community of interested organizations, groups and individuals locally and cross-border throughout Europe,
- building the PlanetData training infrastructure, a one-stop-shop entry point where new ideas, findings and developments are freely communicated inside the community,
- establishing a learning environment to support knowledge transfer (1) between PlanetData project members, (2) from the community to PlanetData and (3) from PlanetData to the wider community,
- disseminating the project results to the research community, business environment and government bodies (local and national),
- setting up the base for project results promotion and sustainable development after the PlanetData EU funding period,
- testing and approving the generated results of PlanetData in the wider community.
To achieve the above goals the following activities have to be supported:
- Sharing PlanetData knowledge and results with the help of traditional and ICT based training methods in OTI,
- Stimulating the creation of new knowledge and the exchange of existing knowledge between KSC members by using ICT based support,
- Growing the community of members,
- Promoting the usability of PlanetData results to the research and business communities,
- Connecting to other relevant projects and initiatives.

The KSC is not a training program but a framework for PlanetData that supports knowledge generation and sharing activities in the research domains relevant to PlanetData. That framework consists of:
- the strategy for establishing, running and evolving the KSC,
- the OTI-based framework and a set of tools and services,
- the corresponding organisation, rules, roles and processes,
- the general plan for training content acquisition and development.

KSC development and management is focused on two main elements:
- KSC community management: user acquisition and retention, and
- OTI service development and content organisation.

All activities depend on each other and are aimed at establishing a sustainable environment that will enable the creation of a platform where the PlanetData educational-creative potential will be exposed.

Building a learning community involves two key elements:
- Setting up strategies that will integrate social, economic, educational and cultural developments,
- Forming strategic partnerships that involve all stakeholders.

Based on that the following stages are foreseen:
- Planning (the stage described in this document),
- Forming partnerships of stakeholders, i.e. building up a user community of teachers and learners and boosting their motivation,
- OTI promotion,
- Participation, which depends on relevant societal and social mechanisms,
- Economic success, which is linked to the sustainability plan,
- Quality of learning, i.e. content, services, knowledge gathered, knowledge use, etc.,
- Monitoring and assessment, which defines the strategy and mechanisms for sustainable improvement of community services.

In the following subchapters, the corresponding basic functions and services are defined and elaborated in more detail across the three community levels: technological, organisational and societal.

2.1.1 Functions and services

As defined already, the basic function of the PlanetData KSC is to build and manage the OTI Community of learners by supporting and fostering unconditional knowledge transfer (traditional and virtual) between community members. The following services for the materials will be supported by the PlanetData OTI:
- Individual and group training:
  - Self-learning from prepared courses, tutorials, documents, papers, etc. (asynchronous)
  - Self-assessment support (asynchronous)
  - Questing and discovering from the knowledge databases (asynchronous)
  - Distance learning with the help of tutors (synchronous)
- Communication and collaborative problem solving:
  - Communication tools (bulletin board, discussion forums, community catalogues) for two-way communication
  - Informing (one-way communication)
  - Simple collaborative problem solving support using available tools
- Knowledge preparation, gathering and storing:
  - Knowledge atoms authoring tools
  - Courses authoring and curriculum support
  - Reference databases
  - Simple vocabulary maintenance support
- Community support activities:
  - Content exchange service, i.e. Wiki or YouTube style free but controlled upload
  - Invitation network support
  - Information service providing information about the developments in similar and/or complementary projects
- OTI management services:
  - Content and content spaces management
  - User rights management
  - Analysis of user progress, needs and preferences
Core functionalities that will be kept from the REASE platform and are generally available in VideoLectures.NET are:
- User registration, authentication and profile management,
- Learning material upload, indexing and management,
- Local storage of learning materials (or links to external material by URL),
- Access control and access logging, with a statistics function.

Functionalities specified for the usage of REASE on the VideoLectures.NET platform are:
- Use of metadata for local data such as learning object annotation,
- Embedding of the repository materials in other sites,
- Quality control process for new material,
- Support for live as well as recorded media delivery (audio, video) to users,
- Digital rights management.

Additional functionalities added in the REASE platform which should be preserved in the migration are:
- A consistent and identifiable style and use of logos as a form of branding the portal,
- Maintenance of the Semantic Web topics hierarchy by which learning materials are annotated, in order to improve the search,
- Preservation of industry resource as a learning material type.

2.1.2 Knowledge Sharing Community organisation

The organisational level deals with the organisation, roles and governance principles needed to manage and support the KSC. The major governance principles characterising a human centric organization can be defined as follows:
- “Empowerment”. All decisions and responsibility must be taken, as far as possible, at the lowest hierarchical level.
- “Voluntary approach”. The members’ participation in KSC activities, in terms of both quantity and typology, is decided by the members themselves on a voluntary basis.
- “Self organisation”. The organisation of KSC collaborative activities is left, as much as possible, to the actors actually performing those activities.
- “Peer assessment”. Any time there is a need for subjective evaluation, this is done by assembling an ad hoc peer committee which is empowered to take the decision.
The application of the above governance principles to the Knowledge Sharing Community concept led to the identification of the following basic elements of the PlanetData OTI:
- Reference “rules”: KSC bylaw, KSC Member Agreement, KSC Behaviour Rules,
- Organisational entities: Members, Programme committee, Administration, Organisation for delivering Support Services to members, Steering Board,
- Major governance processes: membership management and promotion, training content management, knowledge and training process management, general management and support services.

In the PlanetData OTI the following user types will be supported:
- OTI administrators,
- OTI programme committee,
- PlanetData project community members,
- OTI registered members,
- Guests and visitors.

The following roles are foreseen:
- Content/course owners and creators,
- Tutors,
- Learners,
- Moderators and Administrators.

2.1.3 Social and societal issues concerning Knowledge Sharing Communities

The social level is the most comprehensive one since it deals with user motivation to participate and share knowledge, with cultural differences and with different geography dependent contexts.
The following social and societal issues have been identified and studied widely:
- Impact on personal welfare and well-being in the following areas:
  - Flexibility
  - Gender issue
  - Patterns of work
  - Welfare net
  - Quality of life
- Understanding the integration of members in the virtual community (VC), which depends on:
  - Voluntary approach
  - Job security
  - Risks prevention
- Understanding the relationships between members:
  - Conflicts between members
  - Engagement and motivations
  - Self organization
- Dealing with the social and virtual environment:
  - Managing time
  - Sharing between virtual and physical communities
  - Creating and stimulating trust between members
  - Avoiding isolation and a negative “digital past” in a virtual world
  - Bridging learning between the physical and virtual environment
  - Guiding multicultural and diverse members within the social community

The corresponding models will be adapted for the purpose of the PlanetData OTI.

2.1.4 Technology level and information infrastructure

The technology level will be mainly supported by one of the available ICT frameworks for technology enhanced learning. For the purpose of the PlanetData OTI we will use VideoLectures.NET. The platform has been developed using the open source software Django (http://www.djangoproject.com/), which is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. The platform currently supports the following services:
- video and other content type distribution,
- synchronizing slides with video,
- lecture editing and customisation,
- training courses generation,
- user management and user rights management,
- video processing and uploading,
- community services,
- open and networked video production,
- file distribution on a content distribution network,
- resource management and user statistics.
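To make one of the listed services more concrete, the following is a minimal sketch of how slides could be synchronized with video playback. It is an illustration only and not the VideoLectures.NET implementation: the function name `slide_at` and the example timings are assumptions, and the sketch simply assumes that each slide has a recorded start time in seconds.

```python
from bisect import bisect_right


def slide_at(slide_start_times, playback_seconds):
    """Return the index of the slide visible at the given playback time.

    slide_start_times must be sorted in ascending order; each entry is the
    second at which the corresponding slide first appears (assumed data).
    Returns None if playback has not yet reached the first slide.
    """
    # bisect_right counts how many slides have started at or before this time.
    started = bisect_right(slide_start_times, playback_seconds)
    return started - 1 if started > 0 else None


# Hypothetical timings: slide 0 at 0 s, slide 1 at 95 s, slide 2 at 240 s.
timings = [0.0, 95.0, 240.0]
print(slide_at(timings, 120.0))
```

A binary search keeps the lookup cheap even for long lectures with many slides, so a player could call it on every seek or time update.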
All content objects are stored on several servers (different slots for text, pictures, courses, structures and media, video formats), and every content object is described with extended DC metadata in one database, namely the one at the VideoLectures.NET website. The portal is supported by a set of Linux and MS Windows servers (currently 8) with Windows Media video streaming and Flash video streaming.

The complete platform solution is Open Source (under the LGPL license), while the content and website are registered under the Creative Commons license. The specific licence used at this moment is the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0. This licence version makes it free to copy, distribute and transmit the work, and it requires that the user attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse the user or the user's use of the work). Issues involved in the use of this licence could arise in the context of commercial usage of the materials, as the work cannot be used for commercial purposes and may not be altered, transformed, or built upon.

In the following year all required PlanetData OTI services will be developed, as well as additional functionalities to enrich the video presentation features and interaction with the community. The platform will also make it possible for content in the PlanetData OTI to be used simultaneously by other platforms.

2.2 Curriculum and target public

The aim of this task is to draw up a curriculum for the training activities of PlanetData (D6.1), which will form the basis for a globally applicable education and training program on large-scale data management. This program should not only cover the latest state of the art in the different fields related to data management but also the unifying methodologies developed by the PlanetData NoE.
It should be noted that learning material will not only comprise the traditional sets of slides, but also multimedia material such as video blogs and podcasts, software with corresponding data sets to be used, and self-assessment material. Ensuring easy access to these training materials across multiple channels will make it easy for learners in the area of large-scale data management to learn and will also enable self-learning. Figure 1 shows an already existing training curriculum and training programmes for COIN IP at VideoLectures.NET.

The target public will be the academic and research community, comprising the PlanetData partners and consortium, and the general scientific community around the world.

Figure 1: The structure of the curriculum for COIN IP

2.3 Training materials, delivery methods and channels

Training is essential to ensure the transfer and sustainability of knowledge, high quality contributors to the network and the long term scientific success of researchers across all partner institutions. The training materials that will be prepared in PlanetData follow the strategy for the training program development. Referring to Figure 2, each training program is composed of a set of training courses that can take the form of a traditional course, an ICT supported course or a mix of both. Each course is supplemented with training materials that are prepared for that course and stored in the training content repositories.

2.3.1 Training materials

Before starting training activities it is crucial to ensure that high quality training materials are available. A flexible curriculum for the different training events will be developed as part of this work package, along with the accompanying learning materials. Ensuring easy access to these training materials across multiple channels will make it easy for learners to study and will also enable self-learning.
This work package will develop an open training infrastructure that will use multiple channels, including slides, videos, serious games, podcasts, webinars, and tutorials. This infrastructure will provide a feedback mechanism for learners and also enable closer communication between trainers and learners. Professional experts in the network could act as mentors to researchers pursuing doctoral or postdoctoral research. In addition to mentoring, researchers in the network will have access to tailored career development planning and face-to-face meetings with fellow junior researchers and senior mentors.

[Figure 2 diagram: a training program consisting of basic training courses (1 to n), additional training courses, specific training courses, and satellite training courses (A to Z)]
Figure 2: The general structure of the training activities

2.3.2 Content delivery methods

The training in PlanetData will be delivered through traditional training events, ICT based training and blended training (e.g. a summer school supported with ICT based training). All training events should be of high didactic value. In relation to the categories of courses and content, every training event will also fall into one of the following three categories: basic training event, additional training event (an addition to a basic training event) and specific training event (an addition to either a basic or an additional training event). The training event type can be:
traditional training: workshop, tutorial, lecture, seminar or summer school;
ICT based training: self-learning course, distance learning course, self-assessment course, discovering and quest;
blended training: any combination of traditional and ICT based training.
This task will also establish a “video scientific journal” that builds on the recent explosion of video lectures freely available through different channels on the Web, with an emphasis on training innovation and delivery approaches. The journal will cover the research and industrial topics addressed by the project as well as other related areas.

2.3.3 Content delivery channels

Although PlanetData will use all potentially available training channels, the most important one will be the OTI at http://videolectures.net/planetdata. Most of the content will be available in the form of textual documents and PowerPoint presentations, filmed presentations and tutorials, and prototypes and demos. Several screencasts will be prepared to ease the use and adaptation of the developed prototypes. Videos will also be available in the form of training DVDs.

2.4 Basic building blocks

2.4.1 VideoLectures.NET educational video website

With the rapid technological advances in the recording, hosting and production of digital video content, an extremely important European academic digital video library has emerged: VideoLectures.NET (http://videolectures.net). VideoLectures.NET was founded in 2001 as an internally funded project and is now run by the dedicated Center for Transfer in Information Technologies at the Jožef Stefan Institute (JSI), Ljubljana. It is a free and open-access digital library of educational video lectures, currently offering more than 12,000 video lectures archived for long-term preservation. This rich resource offers lectures by distinguished scholars and scientists at some of the world’s most prestigious universities, as well as presentations at internationally significant conferences, summer schools, workshops and science promotional events from many fields of science.
The portal is aimed at promoting science, exchanging ideas, and fostering knowledge sharing by providing high quality didactic content not only to the scientific community but also to the general public. In order to fulfil this aim the following challenges need to be addressed:

2.4.1.1

Improving the quality of service by offering innovative browsing and viewing applications that are directly related to the problem of preservation;
Improving the quality of materials, where the quality problem ranges from the quality of the lecture itself to the technical quality and the quality of the distribution channel or method;
Long-term content/context preservation: VideoLectures.NET is committed to keeping each lecture at a fixed URL, preferably forever, or at least until the owner (author, organization, event organizer, etc.) of the lecture wishes to remove the video.

VideoLectures.NET gathers different types of digital objects such as PowerPoint presentations, emails, Word and PDF documents, RSS feeds, social networking data from Twitter and Facebook and, most importantly, multimedia objects containing video, audio and pictures. These digital objects are reusable and are not discarded. Most of the training materials are being developed within the FP5, FP6 and FP7 European Framework Programmes, where VideoLectures.NET serves as an educational platform for several past and ongoing EU funded research projects such as PASCAL NoE, Ecolead IP, SEKT IP, ACTIVE IP, COIN IP, E4 STREP, Euridice IP, NeOn IP, PetaMedia NoE, SMART STREP, Tool East STREP and many others. It also hosts content from the field of Open Education, from providers such as MIT OpenCourseWare, the OpenCourseWare Consortium, CERN, CMU, Stanford, Yale and many other educational institutions. This gives an indication of the diversity and, especially, the quantity of the video content on offer.
In addition, the two most active academic and university networks, the OpenCourseWare Consortium (OCWC) (http://www.ocwconsortium.org/) and the Opencast Community (http://www.opencastproject.org/), are considering VideoLectures.NET as a default hosting service for their materials. At this moment VideoLectures.NET provides services that let speakers manage their own personalized video lectures site at the portal. VideoLectures.NET also provides speakers with a basic training of a few minutes on how not to behave when being video recorded. This is usually done before the recording session, and the guidance is also available as a one-page sheet.

Videos are partly enriched with metadata (extended Dublin Core). By introducing additional metadata, search and retrieval mechanisms, and in particular the L1, L2 and L3 scenarios, VideoLectures.NET expects the value of the service to increase significantly.

Figure 3: VideoLectures.NET website

2.4.1.2 Services

Currently the editorial part of the software behind the VideoLectures.NET portal is capable of interlinking the contents of different training events with the different services provided on the VideoLectures.NET platform. Figures 4 to 8 illustrate the current editorial architecture and how the editorial flow is handled within the software platform. Figure 4 shows the general platform workflow architecture.
[Figure 4 diagram labels: Video, Project, Event, Course, Tape, Metadata, Content, Lecture, File Dispatch Service, Processed Digital Files, Windows Media Files, Flash Files, Mpg4 Files]
Figure 4: Platform workflow architecture

Figure 5: The editorial page knowledge storage preparation – recordings tracking

Figure 6: The editorial page knowledge storage preparation – lecture editing

Figure 7: Platform editorial office

Figure 8: The editorial page knowledge storage preparation – event statistics

2.4.2 REASE repository

REASE (http://rease.semanticweb.org) is the repository for Semantic Web learning units. It has been hosted by the University of Hannover on behalf of the educational association EASE since its founding as part of the Reasoning Web and Knowledge Web networks of excellence. REASE is shown in Figure 9 and was eventually partly migrated into the frame of the SARIT project (http://sarit.sti2.org/), a web based repository of training materials. It is a unique repository containing a diverse set of learning resources, ranging from annotated slides to video recordings, and from one-hour tutorials to fully-fledged university courses, for both academic and industrial audiences. It aims to accommodate the heterogeneous requirements of different users trying to learn about the Semantic Web. 214 resources are available on REASE, including materials collected from the Reasoning Web and Knowledge Web networks and from events organised by them, such as summer schools. The main target group is researchers and students, but providers can also mark their material as "industrial". In early 2009, it was decided to close the association EASE and no longer maintain the REASE repository.
The University of Hannover was then approached by STI International about the possibility of hosting REASE on STI International's servers, as part of a general strategy to maintain a Web portal for access to training materials created for and used in STI training events (such as those presented in SARIT D2.3 and D2.4). The advantages of taking over REASE included acquiring its users as well as the learning materials hosted on it. REASE was running on the learning object platform EducaNext. The version of EducaNext that REASE was built on was four years old at the time, and the then new release of EducaNext was a completely different system, so no direct upgrade was possible. An attempt to upgrade EducaNext failed, which prompted the deployment of REASE on a new learning object platform. It was decided that REASE should run on a new server at STI International to ensure the required performance and stability. The installation of DSpace and the migration of the REASE data into DSpace have been successfully performed in tests. However, a further step before migrating the data onto the “live” version of REASE was to give the vanilla DSpace installation a new design, branding the page as REASE.

Figure 9: The REASE website

3 PlanetData OTI design

The Open Training Infrastructure (OTI) to be developed will provide easy access to the PlanetData training material. This infrastructure will include a feedback mechanism to enable closer communication between trainers and learners. Following an evaluation, it was decided that the main platform for the training infrastructure will be VideoLectures.NET (hosted by JSI), which via a material migration will also include the content from REASE (hosted by STI International). The open training infrastructure will be a sub-site of the PlanetData website created in task T3.2.1.
A subpage at http://videolectures.net/planetdata/ will be set up by the end of the first year of the project; training events will gradually build up there and will fit into the current video training portal. After that, the migration of materials from the REASE portal will begin, which will also help assure the sustainability of the community. In line with the strategy behind VideoLectures.NET, PlanetData will benefit from the network of top world academic institutions and EU related research projects that are connected in a network of free and high quality training materials. If copyright permits, the Open Training Infrastructure will make all training content freely available, whether produced from PlanetData project materials, taken from REASE, or drawn from repositories like VideoLectures.NET and other sources. All materials will be available for the consortium's internal training purposes, while their external use for training purposes will have to be approved by the PlanetData OTI programme.

3.1 Selected integration model

In order to begin the preparations for the migration, an integration model and a migration plan need to be set up. The integration of REASE into VideoLectures.NET should follow the basic matching of content and files between both repositories. REASE is foreseen by STI and VideoLectures.NET to have two purposes:
To form a publicly accessible repository of learning materials on Semantic Web topics, for both academic and industry audiences, regulated by user registration and log-in;
To act as the training material repository for STI activities, including workshops, tutorials and training courses, which may be public or have restricted access. This will include the SARIT training materials set.

3.2 Basic Structure

PlanetData OTI will be structured in three levels.
In order to ensure consistency, every KSC service mentioned above and every KSC structure element will need to be defined and implemented according to these three levels. Moreover, to enable user-centred learning, OTI services and functionality have to be personalized. The three levels are:
The technology level: tools and systems to support the storage of and access to information, user interfaces, CMS and distance learning tools;
The organizational level: rules, procedures and organization to support the KSC operation management structure and procedures;
The social level: the strategy and its implementation procedures to build trust between members and to shape the behaviour of people in creating, sharing and using knowledge within the KSC, Networked Intelligence (NQ) building procedures, community/group member.

Figure 10 below presents the infrastructure layers.

[Figure 10 diagram labels: Application (Django software / VideoLectures); Services (Nginx, Apache, PostgreSQL, Memcached, Flash streaming server); System level (Ubuntu Linux Server, Windows Media Server, Windows Server); further labels: Linux servers, web server, database, development, storage, processing, Flash video streaming, Windows streaming, Amazon S3, cloud storage, static web hosting]
Figure 10: OTI infrastructure layers

3.3 Current status of the training portal

As video is rapidly becoming a first-class citizen of the web, it is also becoming an indispensable learning tool with a strong educational impact (see Figure 13). What sets VideoLectures.NET apart from its peers in the web based video world is that most of the content was filmed by its own video team around the world and is hosted on its own cloud. The portal also provides synchronization of slides with video and will soon offer multilingual support. VideoLectures.NET is currently one of the most visited and referenced video portals offering free and open access to high quality training videos and courses.
The content is strongly related to information technologies, semantic technologies, networked organisations and other topics that are relevant for PlanetData. For this reason the users visiting the site are mainly related to the PlanetData domain and form our first and most important target audience. At this moment the VideoLectures.NET database contains: 792 events, 9,206 authors, 14,031 lectures, 12,919 videos, 106 English subtitles, 4,302 organizations, 8,348 recorded tapes from 2008 until today, 76,050 attached files, 356,540 other files, 393 categories, 14,986 registered users and 3,502 user related comments. The visitor statistics are the following: 5,672,588 visits, 15,843,994 page views and 58.67% new visits. Language related statistics are given in Table 1.

Table 1: Language related statistics

Category      English   Slovene   French   German   Dutch
Events            427        88        2        1       2
Lectures         7910       858       12        9      18
Tutorials         273         3        -        -       -
Keynotes          425         2        1        1       -
Interviews        118        85        -        -       -
Externals           -         -       34        1       -
Other             871       175        -        -       -
All             10024      1184       53       14      20

Figure 11: Current personalization loop at VideoLectures.NET

Quintelligence Miner (QM) is a decision support environment that integrates several data mining techniques and OLAP-style splits of large-scale data stores containing structured information (e.g. customer information). QM is used by online publishers, and by VideoLectures.NET, to model and understand users: their needs, preferences and behaviour characteristics. Its modelling is based on the content users read in each web session. It can model a particular user, cluster users into interest groups and derive the most common user scenarios, which are particularly important for the personalization of training.
A unique part of the tool is its ability to handle unstructured data such as text. In the default setting, the analyst can filter and split the data along any dimension of the structured data (e.g. gender or age) and the unstructured data (e.g. topic, keywords, named entities). The system can aggregate all other dimensions for each split and visualize them using standard techniques for structured data (e.g. pie charts, histograms, world map). Aggregation of unstructured data is done using text mining techniques such as clustering, feature extraction and text visualization. An additional result of each split is a machine learning model which can be used to predict database fields (e.g. the gender of a customer based on their shopping or reading behaviour). QM is a successful approach to user modelling and understanding user behaviour, which is the entry point and first part of the functionality in the forthcoming personalized training loop (Figure 15). The tool may be used for the segmentation of the PlanetData OTI community and the detection of website accessibility trends.

Figure 12: Quintelligence Miner Architecture at VideoLectures.NET

Figure 13: Visitor Statistics from 25.2.2008 until 27.3.2011

3.4 Opencast Matterhorn project

The PlanetData KSC will try to make use of the technology provided by the open source project Matterhorn. Matterhorn is being developed by the Opencast Community as an end-to-end, open source platform that supports the scheduling, capture, management, encoding and delivery of educational audio and video content. This opens a window of opportunity to use Matterhorn as a possible solution for captioning video content within the PlanetData consortium. The video output from this project could then be disseminated through the educational video repository VideoLectures.NET (http://videolectures.net/).
The main purpose would be to lower costs and to introduce a new technology that PlanetData partners could adopt for captioning videos of training events even after the end of the project. The current release, Matterhorn 1.01, which is being tested in the community at the moment, could be adopted for the PlanetData KSC needs and includes the following features:
Administrative tools for scheduling automated recordings, manually uploading files, and managing metadata, captioning and processing functions;
Recommended capture agent hardware specifications;
Integration with recording devices in the classroom for managing automated capture;
Processing and encoding services that prepare and package the media files according to configurable specifications;
Distribution to local streaming and download servers, and configuration capability for distribution to channels such as YouTube, iTunes or a campus course or content management system;
Rich media user interface for learners to engage with content, including slide preview, content-based search and captioning.

Matterhorn's features are best explained by following the story of a recording, from the settings determined when Matterhorn is installed, through scheduling, capture and processing, to the point where the recording reaches the learner. The steps in a theoretical user scenario, from scheduling to the final product on the web, are the following:
Schedule a class to be recorded;
Upload Recording;
Hold for review;
Add Captions;
Slide Thumbnail Navigation;
Embed;
In-Video Text Search;
Feeds;
View the Whole Classroom Experience.

Matterhorn is also a framework of media services and is therefore highly configurable to meet individual institutional needs.
For institutions that want to produce audio and video webcasts and podcasts easily, Matterhorn appears to lower the technical and cost barriers to entry significantly. The Matterhorn 1.0 release is an easy-to-install, out-of-the-box system with automated workflows. For institutions that want to replace, expand or evolve their existing commercial or home-grown systems, Matterhorn is highly configurable and services-based, so that organizations can choose to implement only the components they need, or replace default service implementations with their own to meet specific institutional needs. Matterhorn is backed by a large community of world-class experts in many domains relevant to audio and video technology and to academic content production and delivery. Through the engagement of the Opencast Community, Matterhorn will continue to evolve in response to advances in technology, emerging needs of end users, and lessons learned along the way. The Matterhorn project was inspired by a common set of challenges faced by the community, including:
High costs and constraints of vendor solutions;
Proprietary code lock-in (closed source);
A patchwork quilt of technologies and research projects: great systems that cannot play together well;
Limited enterprise integration with SIS or CLE/LMS;
Rich media accessibility (and associated benefits);
Lack of learning tools;
Media preservation;
Rapidly changing technology.

Figure 14: Opencast Matterhorn's features

3.5 Content Personalization

The current state of the art in educational technologies is mostly based on personalization of the user environment rather than of the content. Technological platforms, VideoLectures.NET among them, and learning devices such as the iPad and Kindle enable only superficial or design oriented personalization, which affects the learner's viewing and interaction but not the full educational experience.
In order to shift the focus from superficial or design oriented personalization to an experience derived from deep knowledge, we plan to introduce a new set of applications for the PlanetData KSC that will provide the learner with a completely new learning experience: content personalized to their specific needs on the basis of deeper profiling information. For this purpose we decided to use a suite of tools comprising a service oriented text enrichment tool, a user targeting data mining tool and a content recommendation system based on user preferences.

Figure 15: Planned personalization loop at VideoLectures.NET

3.6 Context Search

As the text and video in the vast video library at VideoLectures.NET represent knowledge or basic fragments of it, there is a need to extract and display, in a personalized way, the information users might be looking for or need. The personalization of the user's experience must therefore be advanced with existing technologies that have not yet been used in the educational sphere. For that purpose we will integrate a search engine add-on built on a novel concept of search: SearchPoint, at http://searchpoint.ijs.si. In a usual search scenario we write a query and receive a ranked result set in return. Instead of using just one ranking per query, SearchPoint (Figure 16) creates a "Ranking Space", a space in which each point represents a unique ranking, thus providing a continuum of different rankings for one query. A ranking space can be obtained automatically, per query, in several ways. It consists of topics of interest and is drawn on the screen. A user can easily navigate this space, effectively choosing any ranking they desire. At the same time the add-on is no hindrance, as the default position in the ranking space corresponds to the original ranking of the search engine.
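The ranking space idea can be illustrated with a small sketch. This is not SearchPoint's actual algorithm, which is not specified here; it is a minimal illustration under stated assumptions: topics are placed at hypothetical 2D positions, each result carries hypothetical topic affinities, the default position is the origin, and the `Result` class, the `rerank` function and the example data are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class Result:
    title: str
    base_score: float          # score assigned by the underlying search engine
    topics: Dict[str, float]   # hypothetical topic affinities in [0, 1]


def rerank(results: List[Result],
           topic_positions: Dict[str, Point],
           focus: Point = (0.0, 0.0)) -> List[Result]:
    """Return results ordered for the ranking selected by `focus`.

    Assumption: a topic boosts a result in proportion to how far the focus
    has moved in that topic's direction (a clipped dot product). At the
    origin every boost is zero, so the original ranking is reproduced.
    """
    fx, fy = focus

    def score(result: Result) -> float:
        boost = 0.0
        for topic, affinity in result.topics.items():
            tx, ty = topic_positions[topic]
            pull = max(0.0, fx * tx + fy * ty)  # no penalty for opposite topics
            boost += affinity * pull
        return result.base_score + boost

    # sorted() is stable, so ties keep the search engine's original order.
    return sorted(results, key=score, reverse=True)


# Invented example data for an ambiguous query.
TOPIC_POSITIONS = {
    "software": (1.0, 0.0),
    "animals": (0.0, 1.0),
    "military": (-1.0, 0.0),
}
RESULTS = [
    Result("Leopard operating system", 3.0, {"software": 1.0}),
    Result("Leopard (big cat)", 2.0, {"animals": 1.0}),
    Result("Leopard tank", 1.0, {"military": 1.0}),
]

if __name__ == "__main__":
    for focus in [(0.0, 0.0), (0.0, 2.0), (-3.0, 0.0)]:
        print(focus, [r.title for r in rerank(RESULTS, TOPIC_POSITIONS, focus)])
```

In this sketch, leaving the focus at the origin returns the results in their original order, while moving it towards the "animals" topic brings the big cat result to the top. Each focus point therefore selects one ranking out of a continuum.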
SearchPoint is able to obtain the topics directly from a result set; moreover, it can extract them from a predefined directory or even an ontology, effectively enabling us to search inside a given context.

Figure 16: SearchPoint query for “leopard”

4 User validation methods and KSC development plan

User validation (including user needs analysis, usability testing, user satisfaction measurement and other methods) is a mature and well documented discipline. For specific domains and development groups the approach and methods to be used are well established and need not be introduced in detail. The issues specific to the KSC will be considered and appropriate approaches proposed. The user validation methods were adopted from the larger research undertaken in D2.3.1 for COIN IP. The common approach to user validation described at www.vnet5.org is representative of the organisation of user-centred activities in development projects and has also been widely adopted for training activities. A key property is that the actual process of user validation is tailored for each training module in response to the specific goals and parameters of the training. The potential user validation activities for the KSC are described in the following table.

Table 2: User validation activities

Method: Contextual inquiry
Description: In-depth analysis and observation of user behaviour in a realistic setting, using the personalisation facilities in the KSC.
Remarks: Dependent on the quality of the acquired data and the number of visits per KSC member.
Further information: VNET5 resources. Beyer, H. & Holtzblatt, K. (1998). Contextual Design: Defining Customer-Centred Systems. San Francisco: Morgan Kaufmann.

Method: Focus groups
Description: Structured and directed meetings with representative users and training domain experts. The goal is to elicit needs and requirements from the user representatives.
Remarks: A background in group management (mediation) is recommended; otherwise the validity of results may suffer.
Further information: Caplan, S. (1990). Using focus groups methodology for ergonomic design. Ergonomics 33(5), 527-533.

Method: Context of use checklist
Description: Checklists for analysing all relevant aspects of the physical and organisational context of use.
Remarks: Efficient and recommended for analysing the interCOIN learning modules.
Further information: Bevan, N., & MacLeod, M. (1994). Usability measurement in context. Behaviour and Information Technology, 13, 132–145.

Method: Heuristic evaluation
Description: Inspection of training programs by experts, following guidelines centred on “heuristics” identified from a large number of training program evaluations.
Remarks: To be carried out by 1-3 experts unconnected to the KSC program developers.
Further information: VNET5 resources.

Method: Cognitive walkthrough
Description: Systematic inspection of the cognitive processes of users in the particular learning session.
Remarks: Dependent on the quality of the acquired data and the number of visits per KSC member.
Further information: VNET5 resources.

Method: Style guides
Description: Various style guides exist for application environments or minimum requirements.
Remarks: May be partly applicable.
Further information: VNET5 resources guidelines.

Method: Usability tests of prototypes
Description: A participative test with users under controlled conditions. Data on user problems is collected by observation and “thinking aloud”.
Remarks: Experience in experimental behaviour research desirable. A more controlled and rigorous approach is used in systems tests.
Further information: VNET5 resources.

5 Conclusion

This document described the plan for the Open Training Infrastructure of PlanetData, which comprises building the PlanetData KSC setup and migrating all the materials available in the REASE portal into a one-stop-shop entry point at VideoLectures.NET.
The document also provides a brief status of VideoLectures.NET and a future sustainability plan, which includes an attempt by the PlanetData KSC to make use of the technology provided by the open source project Matterhorn, as well as some novel approaches to content and context personalisation for the PlanetData learner.

References

[1] http://www.ics.forth.gr/isl/PlanetData/
[2] http://videolectures.net/
[3] http://sarit.sti2.org/results
[4] http://www.opencastproject.org/
[5] http://searchpoint.ijs.si/
[6] http://enrycher.ijs.si
[7] http://www.djangoproject.com/
[8] SARIT D1.3 Training Repository
[9] COIN D2.3.1 Training set-up and assessment deliverable
[10] COIN D2.3.3b Training content and activities
[11] Grobelnik M., Pajntar B., SearchPoint: a New Paradigm of Web Search
[12] Orlič D., Jermol M., Training personalization with Knowledge Technologies and Contextualization