DCT/PS/VDO SITools Enhanced use of laboratory services and data - Architecture and first feedback – PV 2005 T. Levoir CNES 18, Avenue E.Belin 31401 Toulouse Cedex France Email:Thierry.Levoir@cnes.fr 1 Introduction For several years CNES has been developing generic software systems in order to set up Data Centers, in particular SIPAD(-NG) The SITools project aims to enhance this offer. SITools : – an « open-source » Framework, – to be installed in scientific laboratories and managed by them. – allows access to services and data » previously inaccessible » or accessible through specific system hard to maintain. • SITools aims to reduce the number of such specific systems, and federate the effort of the laboratories. SITools - Architecture and first feedback CNES 2 SITools SITools is a technical layer, – based on a concept of services interconnected via a Web Services virtual bus – accessed by one or more client applications » (SITools initially includes one client application – a Web interface -). Five types of basic services have been implemented – (On-line catalog, Off-line catalog, Repository, Command and User Space). Accept specific services : Added-Value Services (AVS) as plug-in. – independent software programs, interfacing with the system to provide new features (graphs, data-mining, 3D navigation, etc.). SITools - Architecture and first feedback CNES 3 SITools All the services may use the already installed facilities (database, services, …). The Web Services bus gives a native interoperability. – may run locally, on a single machine, – be distributed over » several machines » even several laboratories. – It enables the creation of a distributed data centre. All the services have to be configured : » to access data, » to specialize the GUI (CSS, JSP, …) »… – to create what we call “a SITools Instance” i.e. a data access system. SITools - Architecture and first feedback CNES 4 Site A Site A Repository Client Application (Web Server) Site C Description of Site B services Description of Site A services Description of services Internet browser Synonyms dictionnary Site A Connection bus between the various services Client Catalog ... Catalog 1 Catalog 2 Data 1 Site A Site C Added-Value Services 1 Added-Value Services 2 Site A Site C User Space Service 1 Command Processing Site B Catalog off-line Catalog 1 Catalog 2 Data 1 Site B AVS 3 Site N SITools - Architecture and first feedback Added-Value CNES 5 SITools Instance Repository Catalog Service Client Application Catalog 1 Meta Model Super Catalog DataSet Model 1 DataSet Model 2 DataSet Model … Meta-data DataSet 1 Meta-data DataSet 2 Catalog 2 Meta Model Dictionary Associeted Synonyms Associeted Synonyms Catalog Service Catalog Client Interface Associeted Synonyms DataSet Model 1 DataSet Model 2 DataSet Model … Meta-data DataSet 1 Meta-data DataSet 2 Association between Dictionary/catalogs when starting the SiTools instance Identification of catalogs and access rights when user authentication is complete Transparent access to catalogs SITools - Architecture and first feedback CNES 6 Utilisation 2 types of user : – The end user, » the scientist who consults the final system set up to look for and download useful data. – The manager, » the entity (scientific laboratory, organization, etc.) which will set up and run the system to provide access to its data. The manager creates a SITools Instance for end users – main steps » catalogs configuration » repository configuration. » GUI SITools - Architecture and first feedback CNES 7 Catalogs services Data – SITools main goals : (!) » give access to data. – SITools uses a database to refer and possibly to store the data. In the first case, – the data are in files described by metadata. » those metadata are in database. » allow the end-user to make a research by criterion. In the second case – the data are directly in database. » will also be used as metadata for research. SITools - Architecture and first feedback CNES 8 Catalogs services One dataset, one table – SITools considers a dataset as a unit of homogeneous data, » same source » same level of treatment, » identical attributes and criteria. – To a dataset, a table in the catalogue will correspond. » For each new dataset inserted in a SITools instance, • a new table will be created. to offer the access to them automatically, – each table is described in technical tables » dataset is initially declared in the table “DataSets”. » each field of the dataset (column of the table of the set) is described using the table “Attribut” SITools - Architecture and first feedback CNES 9 Table : HIRES SITools - Architecture and first feedback CNES 10 Table : DataSet Table : HIRES SITools - Architecture and first feedback CNES 11 Table : Attribut Table : DataSet Table : HIRES SITools - Architecture and first feedback CNES 12 Catalogs services / Table Attribut Colonne dataset_name name label tooltip type class size keyindex criterion Type Varchar Varchar Varchar Varchar Varchar Varchar Int Int Int display Int advanced Int mandatory updatable default_value min_value max_value comment Int Int Varchar Varchar Varchar Varchar Description Data set name Attribute name Displayed name More information to be displayed Attribute Type (see below) Attribute class (see below) size (for display information) Indicates if it is a key Indicates if this attribute is to be used as a criteria for the end user. Indicates if this attribute is to be displayed to the end user. Indicates if this attribute may be displayed if the end user asks for it. Indicates if the attribute is mandatory Indicates if the attribute is updatable Main information : type and class. – make possible for SITools to know how to manage this information, how to present it like criteria and how to present it as results. SITools - Architecture and first feedback CNES 13 Catalogs services / Type (1/2) Type addresses – traditional basic types : » boolean, Float, Geometry, string, int, length, timestamp – and specific types : – – – – – – – – – – – – – – ids_child ids_parent multi_string multi_int multi_long multi_float multi_date multi_timestamp multi_inter_int multi_inter_long multi_inter_float multi_inter_date multi_inter_timestamp Serial Auto-increment Idents of the child data Idents of the parent data Multiple values string Multiple values integer Multiple values big integer Multiple values decimal Multiple values date Multiple values date time Multiple values of interval integer Multiple values of interval big integer Multiple values of interval decimal Multiple values of interval date Multiple values of interval date time id type » make possible for SITools to manage the data of a dataset in a single table, while offering the power of a relational database. • SITools automatically create additional tables according to the type of the data. SITools - Architecture and first feedback CNES 14 Catalogs services / Types (2/2) Example: type "multi _..." : – Field "string" with several values (numerical, string, ...) separated by the separator ";". » It is obvious that with such a string, it is impossible to make an elaborate request (like Greater than). DataSet Table SITools - Architecture and first feedback CNES 15 Catalogs services / Types (2/2) Example: type "multi _..." : – Field "string" with several values (numerical, string, ...) separated by the separator ";". » It is obvious that with such a string, it is impossible to make an elaborate request (like Greater than). Table : Attribut DataSet Table SITools - Architecture and first feedback CNES 16 Catalogs services / Types (2/2) Example: type "multi _..." : – Field "string" with several values (numerical, string, ...) separated by the separator ";". » It is obvious that with such a string, it is impossible to make an elaborate request (like Greater than). Table : Attribut DataSet Table SITools automatically create the tables necessary to answer requests efficiently. SITools - Architecture and first feedback CNES 17 Catalogs services / Class The concept of class makes it possible to specialize a type to give it a specific behavior. It adds semantics to the type which is purely syntactic. – For example for a type "string", we can use the class "URL" » allows the client application to present it in the form of URL. – With each class is associated a template in the system (in the form of jsp) » defining the presentation of the attribute • as well in its criterion representation (for input in a user search) • as in its result representation. – Pre-defined class : applet_carto checkbox date date_short date_time fichier_dav histogram ids_asso image liste_choix –- Map Selector Display True/False/All choice Full date (dd-mm-yyyy) Short date (dd-mm-yy) full date (dd-mm-yyyy) with time (hh:mm:ss) HTTP link to files stored into DAV popup that display histogram Idents of the parent and child data Image to be displayed Single Selection list SITools - Architecture and first feedback liste_choix_multi liste_choix_dyn liste_choix_multi_dyn multiple_criterion multiple_values multiple_intervalle number_float number_integer password quicklook text url url_filedata zone_texte Multiple Selection list Single Selection list filled dynamically Multiple Selection list filled dynamically Multiple values for criterion field only Multiple values field Multiple values field Float number values field Long or integer values field Password field Display quicklook with a link to data Text field (default class) HTTP link HTTP link to the external data Text field with vertical scrollbar CNES 18 Repository service (1/2) Through configuration XML files, – link all elements of one SITools instance – declaration of all the catalogs connected. » SITools will determine automatically the common research criteria of the various datasets. – declaration of dictionaries of synonyms » link between attributes of different datasets, » convertor features (for instance, conversion of °C to °F). <synonym word=“dataSetDate"> <synonym>Date</synonym> <convertor classe="com.cnes.sitools.webapp.supercat.ConvertisseurIntervalle"> <parametre> <name>min</name><value>From</value> </parametre> <parametre> <name>max</name><value>To</value> </parametre> </convertor> <operator>GE_LE</operator> </synonym> SITools - Architecture and first feedback CNES 19 Repository service (2/2) – declaration of groups of datasets, » graphs of navigation, for example by mission/experiment. – declaration of user rights – declaration of Added Values Services » For instance : zip <service nom="zip" internal="false" batch="false" synchron="true" copyUS="false"> <display>Extraction (zip)</display> <adress>http://localhost:8080/zip/servlet.zip</adress> <picto>/images/picto_zip.gif</picto> </service> SITools - Architecture and first feedback CNES 20 SITools OK, but how does it look like ? SITools - Architecture and first feedback CNES 21 SITools - Architecture and first feedback CNES 22 SITools - Architecture and first feedback CNES 23 SITools - Architecture and first feedback CNES 24 SITools - Architecture and first feedback CNES 25 SITools - Architecture and first feedback CNES 26 SITools - Architecture and first feedback CNES 27 SITools - Architecture and first feedback CNES 28 SITools And what about Added Values Services ?? SITools - Architecture and first feedback CNES 29 SITools - Architecture and first feedback CNES 30 SITools - Architecture and first feedback CNES 31 SITools - Architecture and first feedback CNES 32 SITools - Architecture and first feedback CNES 33 SITools - Architecture and first feedback CNES 34 SITools - Architecture and first feedback CNES 35 SITools - Architecture and first feedback CNES 36 SITools - Architecture and first feedback CNES 37 SITools - Architecture and first feedback CNES 38 SITools - Architecture and first feedback CNES 39 Experimentation SITools V2 (full version) was released in september 2005. – As it was created as an R&D product, it has to be fully tested and qualified by some laboratories before being available. – SITools is currently tested or used by 3 CNRS laboratories and by one project at CNES. » Data accessed through those SITools instances are very heterogeneous. – So far, one of the main issues is to install all the needed production tools on a linux platform. » Apache / DAV / JK – Tomcat - MySQL / PostGres SITools - Architecture and first feedback CNES 40 SITools & Virtual Observatory As SITools design was done in VO logical concepts (from the interoperability point of view), it’s not a big challenge to integrate VO capabilities. – All the services are Web-services, – catalogs are linked through synonyms, which could be UCDs, SITools has also the capability to use an added value service to handle VO protocol queries (SIAP, SSAP, VOQL, …) and to transform results in VOtable format. A study is undergoing for, initially, developing an AVS which can reply to a SIAP request. SITools - Architecture and first feedback CNES 41 SITools Thank you Questions ??? SITools - Architecture and first feedback CNES 42