Creating Smart Clients with the Collaboration Notebook
  Greg Quinn, Principal Investigator
  Desktop and Mobile Data Management
  San Diego Supercomputer Center

Topics
  Data overload
  The Collaboration Notebook: a data management solution
  Smart client development framework
  Future work

Science research: it’s about the DATA!
  Locally generated, e.g. observations of a researcher at the lab bench
  Theoretical data
  Archived local data, collected from many researchers
  Data in legacy databases
  Data from collaborators, both near and far
  Data from Internet sources such as NCBI (NIH)

The Encyclopedia of Life Project
  More than 800 genomes that have been completely or partially sequenced
  For each publicly available protein sequence derived from genomic data, EOL attempts to locate structural domains and correlate this data with other publicly available sequence annotation
  A large amount of information for end-users to collate
  Data management app

Many Excellent Online Data Sites

BLAST
  NCBI BLAST
  Protein NR updated nightly (~2 million sequences)
  XML

DATA OVERLOAD!
  Online data sources
  Legacy and archived data
  Collaborative research
  Research notes and observations

DATA OVERLOAD!
  No meaningful way to store data
  No simple way to re-purpose online data
  No mechanism to share data
  Difficult to keep track of data acquired during an online session
  Resource-intensive searches are frequently repeated
  Retrieved data can be difficult to manipulate and/or search
  No simple way to annotate downloaded data

The Collaboration Notebook
  A desktop application that helps scientific researchers and knowledge workers use network information resources and manage data
  Leverages features of Windows and .NET development
  Powerful local database with search functionality
  “Knowledge” of data types through the use of ontologies
  Ability to annotate stored data
  Peer-to-peer querying of stored data and annotations
  Data export to popular formats
  Unattended/automatic data updates via web services and HTTP
  User notification of new data
  Plugin API for data visualization components; includes basic data viewers for popular bio-data types, e.g. text and DNA sequences
  Smart client framework for SOAP-based web services
  Point-and-click interface to support Tablet PCs and ink data types

Data Sources
  (diagram labels: XML doc; Local datastore; personal database (x3); Data presentation and Smart client for network data services)

The Connected Research Environment
  Web Interfaces
  SOAP Services
  Other Data Sources
  Report and paper preparation
  P2P
  Research data access, input & annotation
  Group collaboration

GUI Design schematic
  Data browser
  Smart Client display area
  N-App availability
  P2P Collaboration group
  Fast search options
  Winforms
  Avalon
  Embedded apps

Collaboration Notebook N-Apps
  (diagram labels: Networked Data services; N-App Server; EOL; PDB; NCBI; Workbench; Work bench; N-App Collection; …)

Smart Client..?
  Local resources
  Connected
  Offline capable
  Intelligent deployment and update

Collaboration Notebook
  (diagram labels: N-App; Interface class; Reference; Data service wrappers; Available Services; Run request; Data services; Service wrapper / SOAP services; Service wrapper / Web Pages; Service wrapper / DBs; data; Data changed event manager; Service wrapper; Smart Client; Local DB; Collaboration subsystem; Indigo; Local Instrumentation)

Stages in Smart Client Dev
  Write your application with Windows Forms or Avalon (control library project)
  Reference the Notebook interface class and the Notebook data objects class to gain access to Notebook services (e.g. data access, persistence and retrieval)
  Write a wrapper for the online service (if not using a pre-existing one)
  Write a data translator to convert incoming data into the internal canonical format

Debugging Smart Clients
  (diagram labels: Service wrapper; Test Harness; Data Service)

Embedding a Pre-existing Application
  Create a new Class Library project in Visual Studio
  Add a reference to the executable
  Use the Visual Studio object browser to determine the available methods
  Add references to the Notebook interface class and the Notebook data objects class

Wrapper communicates with Embedded app
  (diagram labels: Wrapper; Worldwind.menuItemEditPaste.PerformClick(); N-App; System Clipboard; WorldWind://lat=x&long=y; Lat/Long data; Notebook; Service wrapper; Data Service)

Example N-Apps
  BLAST interface: Winforms
  BioConductor Interface: Avalon (XAML)
  World Wind: Embedded

BLAST interface
  BLAST: Basic Local Alignment Search Tool
  Several BLAST SOAP services available: NCBI, PDB, EBI, DDBJ
  Unified interface to these services

BLAST Interface Smart Client

Microarray Data Analysis
Developed by Dr. Robert Byrnes
  Statistical analysis of microarray data can be used to determine up- and down-regulation of genes, expression signatures, etc.
  The Bioconductor package is a widely used set of biostatistics routines for analyzing microarray data
  Written in the “R” language
  Current GUIs for Bioconductor are relatively crude

XAML-based Applications
  XAML:
    <FlowPanel xmlns="http://schemas.microsoft.com/2003/xaml">
      <Text>Hello World</Text>
      <Button>Click me!</Button>
    </FlowPanel>
  +
  C#:
    public partial class MyPanel
    {
        public string Hello() { return "Hello!"; }
    }

Microarray Data Analysis
Developed by Dr. Robert Byrnes
  (diagram labels: R Interpreter; Notebook App; Bioconductor packages; Avalon Control; XML Metadata; SQL SERVER; Microarray App)

Microarray Data Analysis
Developed by Dr. Robert Byrnes

Embedded Applications
  .NET-based apps can be embedded with no modification required
  Need to creatively use existing methods within the application for data analysis, persistence and retrieval

Embedded Applications

Smart Clients Under Development
  Protein Data Bank (PDB)
  Next Generation Biology WorkBench
  National Ecological Observatory Network (NEON)
  Homeland Security projects

(roadmap chart; timeline axis: 2004, 2005, 2006)
CORE APP DEV
  ADVANCED FEATURES
  CORE FEATURES
  DATA PLUGIN API
  SERVICE PLUGIN API
  XAML DEV FRAMEWORK
  SQL SERVER
  FIXED INTERNAL DATA MAPPING
  P2P COMMUNICATIONS
  INTEGRATION OF DATA MAPPING SOLUTION
  NOTEBOOK RUNTIME SERVICE
  BACKGROUND DATA UPDATES
  INTEGRATION INTO “LONGHORN” PLATFORM
DATA MAPPING & DISTRIBUTED QUERY
  ADVANCED QUERY PROCESSING LAB
  ONTOLOGY-BASED DATA MAPPING
  DISTRIBUTED QUERY OPTIMIZATION
  CROSS-DOMAIN QUERY
N-APP DEV
  CHARTER N-APPS
  MICROARRAY DATA ANALYSIS APP
  GEOSCIENCES-BASED APP
  OTHERS…
FIELD TRIALS & TESTING
  INTERNAL UCSD RESEARCH LABS
  SOCIAL MONITORING
  PUBLIC BETA TESTING
  BUG TRACKING

Hands-on One Day Workshop: Creating applications with the Collaboration Notebook
  Date: October 2005
  Location: San Diego Supercomputer Center
  Further info: notebook@sdsc.edu

Acknowledgements: Programmers
  Blair Jennings (lead)
  Robert Byrnes
  Martin Dubcovsky
  Kevin Fowler

Acknowledgements: Project Support
  Dan Fay & Microsoft Research
  Mark Miller, SDSC
  The SDSC Synthesis Center

Questions?
  Greg Quinn
  quinn@sdsc.edu
  http://www.notebookproject.org

© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
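
Sketch: service wrapper and data translator

The "Stages in Smart Client Dev" slide asks for a wrapper around the online service and a translator that converts incoming data into the internal canonical format. The slides do not show the Notebook interface class or its data objects, so the following is only a minimal sketch of how those two pieces might fit together. Every type name here (CanonicalHit, IServiceWrapper, IDataTranslator, FakeBlastWrapper, XmlHitTranslator) and the XML layout are assumptions made for illustration, not the Notebook API. The wrapper returns canned XML instead of calling a real SOAP service so the example runs offline, and it uses current C# syntax rather than the 2005-era toolchain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical canonical record; the Notebook's real internal format is not shown in the slides.
public sealed class CanonicalHit
{
    public string Source { get; }
    public string Accession { get; }
    public double Score { get; }

    public CanonicalHit(string source, string accession, double score)
    {
        Source = source;
        Accession = accession;
        Score = score;
    }
}

// Hypothetical wrapper contract: hides how a particular online service is called.
public interface IServiceWrapper
{
    string Name { get; }
    string Run(string query); // returns the raw service payload
}

// Hypothetical translator contract: raw payload in, canonical records out.
public interface IDataTranslator
{
    IReadOnlyList<CanonicalHit> Translate(string source, string rawPayload);
}

// Stand-in for a SOAP call so the sketch runs offline; the XML layout is invented.
public sealed class FakeBlastWrapper : IServiceWrapper
{
    public string Name => "FakeBlast";

    public string Run(string query) =>
        "<hits>" +
        "<hit accession=\"P12345\" score=\"98.5\"/>" +
        "<hit accession=\"Q67890\" score=\"42\"/>" +
        "</hits>";
}

// Parses the invented XML layout into canonical records.
public sealed class XmlHitTranslator : IDataTranslator
{
    public IReadOnlyList<CanonicalHit> Translate(string source, string rawPayload)
    {
        return XDocument.Parse(rawPayload)
            .Descendants("hit")
            .Select(h => new CanonicalHit(
                source,
                (string)h.Attribute("accession"),
                (double)h.Attribute("score")))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        IServiceWrapper wrapper = new FakeBlastWrapper();
        IDataTranslator translator = new XmlHitTranslator();

        // Run the (fake) service, then translate its payload before storing or displaying it.
        foreach (var hit in translator.Translate(wrapper.Name, wrapper.Run("MKTAYIAKQR")))
        {
            Console.WriteLine($"{hit.Source}\t{hit.Accession}\t{hit.Score}");
        }
    }
}
```

Keeping the wrapper and the translator behind separate interfaces means a second service with a different payload format would need only a new wrapper and translator pair, while the code that consumes canonical records stays unchanged. It also lets a test harness substitute a canned wrapper for the live service, in the spirit of the "Debugging Smart Clients" slide.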