VideoWorks – Blueprint for the functioning of a National Video DataGrid

David Shotton, Danny Torbica and John Pybus
[e-mail: david.shotton@zoo.ox.ac.uk, danny.torbica@zoo.ox.ac.uk, john.pybus@zoo.ox.ac.uk]
Image BioInformatics Laboratory, Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS

Key words to describe the work: Video e-services, editing, format conversion, semantic content analysis

Key Objectives: To provide an integrated suite of video e-services to enable authorized users to analyse, edit, customize, transcode and repurpose digital video files located in distant video archives and databases

Motivation for the work (problems addressed): User authentication and access control between Grid services and third-party databases outside the Grid community; interactive processing and two-way database access; video streaming and file transfer using the GridFTP protocol; task scheduling in a client-server-slave processor farm system using Grid middleware.

VideoWorks Project Description

Grid-stretching aspects of the VideoWorks project

VideoWorks for the Grid is an Oxford e-Science Centre project that will provide an integrated suite of video e-services to enable authorized users to analyse, edit, customize, transcode and repurpose digital video files located in distant video archives and databases, thereby adding value to the holdings in these databases. Each video file to be processed is first uploaded into VideoWorks, and is then analysed structurally to identify scene change key frames, and converted into a low-resolution streaming preview.
Subsequent interactive processing can take two forms: the VANQUIS service permits content analysis of the video, generating semantic metadata to support subsequent query by content, while the VIDOS service permits repurposing of the video through spatial and temporal editing and/or format conversion (transcoding), creating a customized output video file for local use, for example in teaching material or research. Industrial Partner software permits video structural analysis, format conversion and streaming.

The VideoWorks for the Grid project combines the data complexity of the Semantic Web with the computational complexity of ‘classic’ Grid applications. It involves gathering large video data files from distant databases and initiating computationally complex processing activities and analyses upon them. Existing Grid middleware meets few of the requirements for such a Grid database project.

For the prototype, we will work with two academic video databases: the BUFVC MAAS Media Online archive of rights-cleared videos for UK academic use (http://www.bufvc.ac.uk/maas/index.html), to be located at the Edinburgh EDINA National Data Centre (http://edina.ac.uk), and the BioImage Database (www.bioimage.org), a database for multidimensional images and videos of biological specimens. Both will be enabled to provide VideoWorks services. The complex series of interactions that will occur between a user, the BioImage Database and the VideoWorks system is illustrated in Fig 1.

User authentication

Providing a unified authentication regime for users of third-party video databases linking into VideoWorks services, and securing the VideoWorks services themselves, will require an external gateway that triggers an appropriate digital certificate. The X.509 digital certificate system employed in GSI is only appropriate for Grid-enabled participants, and it is unrealistic to expect individual database users each to obtain a certificate from a Certification Authority.
Such authorised access must be sufficiently persistent to enable a user to start a long conversion job, log off, and then reconnect later to review progress or download the customized video.

Database access

Unlike most Grid projects involving databases, VideoWorks requires not just the passive extraction of data, but also the ability to communicate interactively with the VideoWorks system in order to specify customisation or analysis parameters, and to write newly created semantic metadata back into the video database of origin, while ensuring the security and integrity of that database.

Video streaming and file transfer

Existing Grid middleware cannot allocate bandwidth to facilitate streamed video delivery, and GridFTP does not provide the data buffering, flow control or staged delivery services that might assist the asynchronous transfer of large video data files.

VideoWorks task scheduling

Within the VideoWorks system, Grid middleware or Condor will be used to schedule and prioritise concurrent users’ jobs having different requirements, distributing them to the VideoWorks slave processors in the most efficient manner according to each processor’s capabilities with regard to disc space, processing speed or access to particular video codec implementations.

The presentation will include a live demonstration of the VIDOS service within the VideoWorks prototype, and discussion of our progress in integrating the various other software components and in surmounting some of the obstacles to Grid integration mentioned above.

Figure 1 Interactions between a user, the BioImage Database and the VideoWorks system (Arrow widths approximate the volumes of data being transferred)

1 A BioImage user selects a video in the BioImage Database and chooses to use the VideoWorks services.
2 The selected video is uploaded from the distant database to the VideoWorks file store using GridFTP.
3 An interactive session is established between the user and the VideoWorks server.
4 The video is sent to both the conversion and the analysis slaves.
5 The video preview version is passed to the streaming server.
6, 7 The preview and keyframe storyboard are sent to the user to permit interactive processing.
8 The user’s customization parameters are sent to the VIDOS conversion slave; the customized video is created and sent to the VideoWorks server.
9 The customized video is downloaded to the user.
10 And/or: interactive semantic analysis of the video content is undertaken by the user.
11 Semantic content metadata are written to VideoStore.
12 Semantic metadata are transferred to BioImage for long-term storage and subsequent Query by Content.

Current Status and Development

Substantial progress has been made with VideoWorks, and we now have a stable VIDOS video editing and transcoding system that permits a user to upload a file from the local machine, or from any Web site using FTP or HTTP, through a Web interface. The current system has been tested extensively on the most commonly used hardware platforms and with the most commonly used browsers. Bug fixes have made the system reliable enough to use in conjunction with BioImage and with EDINA, now that a substantial proportion of the videos from the British Universities Film and Video Council (http://www.bufvc.ac.uk/maas/) have been digitized. Access to EDINA through a Web service interface based on the SOAP and WSDL standards is to be implemented in the next phase of VideoWorks development. Publication is planned in the journal Animal Behaviour of work that has led to the development of an automated procedure for tracking fish in aquaria through a video analysis module of VideoWorks.
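
The task scheduling requirement described above (matching concurrent jobs to slave processors by disc space, processing speed and codec availability) can be illustrated with a minimal sketch. This is not the VideoWorks scheduler, nor Condor's matching mechanism: the names `Slave`, `Job`, `pick_slave` and `schedule`, the numeric priority scheme and the "fastest eligible slave" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slave:
    """Hypothetical description of one slave processor."""
    name: str
    free_disc_gb: float
    speed: float          # relative processing speed; higher is faster
    codecs: frozenset     # codec implementations available on this slave
    busy: bool = False


@dataclass
class Job:
    """Hypothetical description of one user's conversion or analysis job."""
    job_id: str
    priority: int         # assumed scheme: lower number = more urgent
    disc_gb: float        # disc space the job needs
    codec: str            # codec the job requires


def pick_slave(job: Job, slaves: List[Slave]) -> Optional[Slave]:
    """Return the fastest idle slave that satisfies the job, or None."""
    eligible = [
        s for s in slaves
        if not s.busy
        and s.free_disc_gb >= job.disc_gb
        and job.codec in s.codecs
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.speed)


def schedule(jobs: List[Job], slaves: List[Slave]) -> Dict[str, Optional[str]]:
    """Assign jobs in priority order; unplaceable jobs map to None."""
    assignments: Dict[str, Optional[str]] = {}
    for job in sorted(jobs, key=lambda j: j.priority):
        slave = pick_slave(job, slaves)
        if slave is None:
            assignments[job.job_id] = None   # job would wait in a queue
            continue
        slave.busy = True                    # one job per slave at a time
        slave.free_disc_gb -= job.disc_gb    # reserve the disc space
        assignments[job.job_id] = slave.name
    return assignments
```

In this sketch a job that no idle slave can satisfy is simply reported as unassigned; a real system would queue it and retry as slaves become free.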