XML for taming data to be visualized over the Grid
within the gViz project

R&D in Grids & e-science is transforming:-
- data access and
- discipline-oriented data, e.g.:-
  - Marine XML
  - Earth Science Markup Language
  - NERC Data Grid
  - CCLRC Data Portal

But there is still a gap to be bridged between:
multiple data formats and models
and
multiple preferred visualization systems

[Diagram: data sources (precious legacy data; satellite data, HDF5; new data; Joe Bloggs' data; application-oriented XML) and visualization systems (programming script oriented, e.g. Matlab; MVE, e.g. Iris Explorer; toolkit, e.g. VisAD, PV3)]

Conventional approaches:-
- "That's easy, I'll write a converter"
- "Collaborating team uses just one viz system"

But Grid-enabled VO encourages teams that:
- form, change and disperse
- are multidisciplinary

So we would prefer to have:-

[Diagram: the same data sources and visualization systems, with "??" between them]

…. move from A×m×n to B×m + C×n + D (and avoid making B, C, D too big)

Investigating an approach which:-
•Processes structure by a sequence of filters
•Relies on effective coordination of filters (e.g. skML?)
•Processes bulk data (e.g. re-sequence) as late as possible (lazy transformation), e.g. use handles
•Bulk data can be text-based or binary
•Uses an intermediate XML-based language, e.g. XDF or GGF's DFDL (Data Format Definition Language), to express structure
•If the source data is not XML, expresses its structure in XML as early as possible
•Each filter should be straightforward, e.g. splitter/combiners; XML transforms (XSLT); regular expressions
•Uses existing tools, e.g. XSLT, OGSA-DAI
•Each filter should be expressible as a Web/Grid Service

Current work, investigating:-
•Feasibility for a small set of diverse data sources and viz systems
•Extent of performance loss for bulk data cf. existing alternatives
•Convenience for a new combination of data source D and viz system V
•Framework for processing components

Future:-
•Feasibility and benefit for highly structured data
•XQuery / XPath 2.0
•Frequently updated data (simulations, experiments)
•More general metadata (not only structure)

I am interested in testing the approach with diverse datasets. Please contact Julian.Gallop@rl.ac.uk

[Diagram labels: e.g. large legacy XML; front end; subset; handle; intermediate language; viz system-specific; viz system]
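
The filter-and-handle approach described on this poster could be sketched roughly as follows. This is only an illustration under stated assumptions, not the gViz implementation: the `Handle` class, the filter functions, and the `<dataset>`/`<array>`/`<field>` element names are all hypothetical, and plain ElementTree manipulation stands in for an XSLT transform because the Python standard library has no XSLT processor. Each filter works on the XML description of the structure only; the re-sequencing of the bulk data is recorded on the handle and carried out when the data is finally resolved.

```python
import xml.etree.ElementTree as ET


class Handle:
    """Hypothetical lazy reference to bulk data (assumption, not a gViz API)."""

    def __init__(self, loader, steps=()):
        self.loader = loader          # callable that reads the bulk data
        self.steps = tuple(steps)     # deferred bulk operations, in order

    def then(self, step):
        # Record a bulk operation without touching the data.
        return Handle(self.loader, self.steps + (step,))

    def resolve(self):
        # Only here is the bulk data read and transformed.
        data = self.loader()
        for step in self.steps:
            data = step(data)
        return data


def rename_for_viz(state):
    """Structure-only filter: map <array> elements to hypothetical <field> elements."""
    structure, handles = state
    out = ET.Element("vizInput")
    for arr in structure.iter("array"):
        ET.SubElement(out, "field", dict(arr.attrib))
    return out, handles


def resequence(state):
    """Update the described ordering now; defer the bulk re-sequencing."""
    structure, handles = state
    new_handles = dict(handles)
    for field in structure.iter("field"):
        if field.get("order") == "row-major":
            rows, cols = int(field.get("rows")), int(field.get("cols"))

            def to_col_major(data, rows=rows, cols=cols):
                return [data[r * cols + c] for c in range(cols) for r in range(rows)]

            ref = field.get("handle")
            new_handles[ref] = handles[ref].then(to_col_major)
            field.set("order", "column-major")
    return structure, new_handles


def run_pipeline(state, filters):
    """Apply each filter in sequence to the (structure, handles) pair."""
    for f in filters:
        state = f(state)
    return state


loads = []  # records each time the bulk data is actually read


def load_bulk():
    loads.append(1)
    return [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # stand-in for a large file


description = ET.fromstring(
    '<dataset><array handle="h1" rows="2" cols="3" order="row-major"/></dataset>'
)
structure, handles = run_pipeline(
    (description, {"h1": Handle(load_bulk)}),
    [rename_for_viz, resequence],
)
loads_before_resolve = len(loads)   # the pipeline itself has not read any bulk data
values = handles["h1"].resolve()    # bulk data is read and re-sequenced here
```

In this sketch the whole pipeline runs on the small XML description, and `load_bulk` is called once, at `resolve()`. Adding a new data source or viz system would then mean adding one filter rather than one converter per pairing.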