Grids for Chemical Informatics - Community Grids Lab

Grids for Chemical
Informatics
Chemistry, IU Bloomington
Oct. 21 2005
Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University Bloomington IN 47401
gcf@indiana.edu
http://www.infomall.org
Why are Grids Important







Grids are important for Chemistry because they support key
functionalities that grow in importance as we are deluged with
data from instruments and simulations
Grids provide information access, storage and management
Grids manage multiple simulations with different defining
parameters
Grids allow complex workflows with data flowing between
filters
Grids define models for portals
Grids are built on top of commodity web service technology
with broad industry support – the next generation
information technology
Grids are used in multiple NIH and other life
science/chemistry projects across the world (BIRN, caBIG,
myGrid, Comb-e-Chem )
Internet Scale Distributed Services



Grids use Internet technology and are distinguished by
managing or organizing sets of network connected resources
• Classic Web allows independent one-to-one access to
individual resources
• Grids integrate and manage multiple Internet-connected resources: people, sensors, computers, data
systems
Organization can be explicit as in
• TeraGrid which federates many supercomputers;
• Deep Web Technologies IR Grid which federates multiple
data resources;
• CrisisGrid which federates first responders, commanders,
sensors, GIS, (Tsunami) simulations, science/public data
Organization can be implicit as in Internet resources such as
curated databases and simulation resources that “harmonize a
community”
Different Visions of the Grid






Grid just refers to the technologies
• Or Grids represent the full system/Applications
DoD’s vision of Network Centric Computing can be considered a
Grid (linking sensors, warfighters, commanders, backend
resources) and they are building the GiG (Global Information
Grid)
Utility Computing or X-on-demand (X=data, computer ..) is
major computer Industry interest in Grids and this is key part of
enterprise or campus Grids
e-Science or Cyberinfrastructure are virtual organization Grids
supporting global distributed science (note that sensors, instruments
and people are all distributed)
Skype (Kazaa) VOIP system is a Peer-to-peer Grid (and
VRVS/GlobalMMCS like Internet A/V conferencing are
Collaboration Grids)
Commercial 3G Cell-phones and DoD ad-hoc network initiative
are forming mobile Grids
Types of Computing Grids









Running “Pleasingly Parallel Jobs” as in United Devices, Entropia
(Desktop Grid) “cycle stealing systems”
Can be managed (“inside” the enterprise as in Condor) or more
informal (as in SETI@Home)
Computing-on-demand in Industry where jobs spawned are
perhaps very large (SAP, Oracle …)
Support distributed file systems as in Legion (Avaki), Globus with
(web-enhanced) UNIX programming paradigm
• Particle Physics will run some 30,000 simultaneous jobs
Distributed Simulation HLA/RTI style Grids
Linking Supercomputers as in TeraGrid
Pipelined applications linking data/instruments, compute,
visualization
Seamless Access where Grid portals allow one to choose one of
multiple resources with a common interface
Parallel Computing typically NOT suited for a Grid (latency)
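The “pleasingly parallel” case listed above can be sketched as a parameter sweep in which every job is independent of the others. This is only an illustration: `simulate` is a hypothetical stand-in for a real simulation, and a local thread pool stands in for Grid resources. It does not show how Condor or any desktop Grid product is actually driven.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(temperature: float) -> float:
    """Hypothetical stand-in for one independent simulation job."""
    return temperature ** 2 / 100.0


def run_sweep(parameters, max_workers=4):
    """Run one job per parameter value; jobs never talk to each other."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with parameters
        results = list(pool.map(simulate, parameters))
    return dict(zip(parameters, results))


if __name__ == "__main__":
    for temperature, value in run_sweep([100.0, 200.0, 300.0]).items():
        print(temperature, value)
```

Because the jobs exchange no messages, network latency only matters when a job starts and finishes, which is why this style suits a Grid while tightly coupled parallel computing does not.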
Old Style Metacomputing Grid
[Figure labels: Analysis and Visualization; Advanced Visualization, Analysis; Large Disks; Computational Resources; Large-Scale Databases; Large Scale Parallel Computers]
Original: Spread a single large Problem over multiple supercomputers
Now-1: Control multiple smallish jobs each on independent Computers
Now-2: Choose which of a few supercomputers to use
Towards an International
Compute Grid
Infrastructure
[Figure: map of US TeraGrid sites (SDSC, NCSA, PSC) and UK NGS sites (Leeds, Manchester, Oxford, RAL), plus UCL, linked through Starlight (Chicago), Netherlight (Amsterdam) and UKLight to SC05 and to local laptops in Seattle and the UK. All sites connected by production network (not all shown). Legend: Computation, Steering clients, Network PoP, Service Registry]
Information/Knowledge Grids


Distributed (10’s to 1000’s) of data sources (instruments,
file systems, curated databases …)
Data Deluge: 1 (now) to 100’s petabytes/year (2012)
• Moore’s law for Sensors

Possible filters assigned dynamically (on-demand)
• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data



Needs decision support front end with “what-if”
simulations
Metadata (provenance)
critical to annotate data
Integrate across experiments
as in multi-wavelength
astronomy
Data Deluge comes from pixels/year available
Data Deluged Science


Now particle physics will get 100 petabytes from CERN using
around 30,000 CPU’s simultaneously 24X7
Exponential growth in data; compare to:
• The Bible = 5 Megabytes
• Annual refereed papers = 1 Terabyte
• Library of Congress = 20 Terabytes
• Internet Archive (1996 – 2002) = 100 Terabytes
Weather, climate, solid earth (EarthScope)
Bioinformatics curated databases (Biocomplexity only 1000’s of
data points at present)
Virtual Observatory and SkyServer in Astronomy
Environmental Sensor nets
In the past, HPCC community worried about data in the form of
parallel I/O or MPI-IO, but we didn’t consider it as an enabler
of new science and new ways of computing
Data assimilation was not central to HPCC
DoE ASCI set up because didn’t want test data!
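The size comparisons on this slide can be made concrete with a little arithmetic. The sketch below assumes decimal (SI) byte units, which the slide does not specify, and uses only the figures quoted above.

```python
# Decimal (SI) byte units; an assumption, since the slide does not say
MB, TB, PB = 10**6, 10**12, 10**15

cern_bytes = 100 * PB  # "100 petabytes from CERN" on this slide

references = {
    "The Bible": 5 * MB,
    "Annual refereed papers": 1 * TB,
    "Library of Congress": 20 * TB,
    "Internet Archive (1996-2002)": 100 * TB,
}

# How many copies of each reference collection fit in the CERN data set
ratios = {name: cern_bytes // size for name, size in references.items()}

for name, ratio in ratios.items():
    print(f"100 PB is about {ratio:,} x {name}")
```

Under that unit assumption, 100 petabytes is 5,000 times the Library of Congress figure and 1,000 times the Internet Archive figure.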
Virtual Observatory Astronomy Grid
Integrate Experiments
Radio
Far-Infrared
Visible
Dust Map
Visible + X-ray
Galaxy Density Map
International Virtual
Observatory Alliance
• Reached international agreements on Astronomical
Data Query Language, VOTable 1.1, UCD 1+,
Resource Metadata Schema
• Image Access Protocol, Spectral Access Protocol
and Spectral Data Model, Space-Time Coordinates
definitions and schema
• Interoperable registries by Jan 2005 (NVO,
AstroGrid, AVO, JVO) using OAI publishing and
harvesting
• So each Community of Interest builds data AND
service standards that build on GS-* and WS-*
• Imminent ‘deluge’ of data
• Highly heterogeneous
• Highly complex and inter-related
• Convergence of data and literature archives
myGrid Project
The Williams Workflows
A: Identification of overlapping sequence
B: Characterisation of nucleotide sequence
C: Characterisation of protein sequence
Web services

Programs
Computational resources
service logic
BPEL, Java, .NET
Databases
resources
Humans
<env:Envelope>
<env:Header>
...
</env:Header>
<env:Body>
...
</env:Body>
</env:Envelope>
SOAP messages
message processing

Web Services build loosely-coupled, distributed applications (wrapping existing codes and databases) based on SOA (service oriented architecture) principles.
Web Services interact by exchanging messages in SOAP format.
The contracts for the message exchanges that implement those interactions are described via WSDL interfaces.
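The envelope structure shown on this slide can be sketched with Python's standard `xml.etree.ElementTree` module. This is a minimal illustration, not a full SOAP stack: the `GetMolecularWeight` operation and the `urn:example:chem` namespace are hypothetical, and the SOAP 1.1 envelope namespace is assumed.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace
CHEM_NS = "urn:example:chem"  # hypothetical application namespace


def build_request(smiles: str) -> bytes:
    """Build a SOAP envelope carrying one hypothetical operation call."""
    ET.register_namespace("env", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{CHEM_NS}}}GetMolecularWeight")
    ET.SubElement(call, f"{{{CHEM_NS}}}smiles").text = smiles
    return ET.tostring(envelope, encoding="utf-8")


def read_request(message: bytes) -> tuple[str, str]:
    """Service-side message processing: find the operation and its argument."""
    envelope = ET.fromstring(message)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]
    operation = call.tag.split("}")[1]  # strip the "{namespace}" prefix
    return operation, call[0].text


if __name__ == "__main__":
    print(read_request(build_request("CCO")))
```

In a real deployment a toolkit would generate and parse these messages from the WSDL contract; the point here is only that the message is plain XML that either side can process.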
SOAP and WSDL

Devices
A typical Web Service


In principle, services can be in any language (Fortran .. Java ..
Perl .. Python) and the interfaces can be method calls, Java RMI
Messages, CGI Web invocations, totally compiled away (inlining)
The simplest implementations involve XML messages (SOAP) and
programs written in net friendly languages like Java and Python
Web Services
WSDL interfaces
Portal
Service
Security
WSDL interfaces
Web Services
Payment
Credit Card
Catalog
Warehouse
Shipping
control
Two-level Programming I
• The Web Service (Grid) paradigm implicitly assumes a
two-level Programming Model
• We make a Service (same as a “distributed object” or
“computer program” running on a remote computer) using
conventional technologies
– C++ Java or Fortran Monte Carlo module
– Data streaming from a sensor or Satellite
– Specialized (JDBC) database access
• Such services accept and produce data from users files and
databases
Service
Data
• The Grid is built by coordinating such services assuming
we have solved problem of programming the service
Two-level Programming II




The Grid is discussing the composition of distributed
services with the runtime interfaces to Grid as
opposed to UNIX pipes/data streams
[Figure: Service1, Service2, Service3 and Service4 linked together]
Familiar from use of UNIX Shell, PERL or Python
scripts to produce real applications from core programs
Such interpretative environments are the single
processor analog of Grid Programming
Some projects like GrADS from Rice University are
looking at integration between service and composition
levels but dominant effort looks at each level separately
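The analogy with UNIX pipes can be sketched in a few lines. In this illustration each “service” is just a local Python function, and `compose` plays the role of the composition level; all function names are hypothetical, and real Grid services would be remote and invoked through their message interfaces.

```python
from functools import reduce
from typing import Callable

Service = Callable[[list], list]


def calibrate(readings: list) -> list:
    """Hypothetical service: apply a fixed calibration offset."""
    return [r + 0.5 for r in readings]


def drop_negative(readings: list) -> list:
    """Hypothetical filter service: discard unphysical values."""
    return [r for r in readings if r >= 0]


def summarize(readings: list) -> list:
    """Hypothetical analysis service: reduce the data to its mean."""
    return [sum(readings) / len(readings)]


def compose(*services: Service) -> Service:
    """Chain services left to right, like `a | b | c` in a UNIX shell."""
    return lambda data: reduce(lambda acc, svc: svc(acc), services, data)


workflow = compose(calibrate, drop_negative, summarize)

if __name__ == "__main__":
    print(workflow([-2.0, 1.0, 2.0]))
```

The two levels stay separate: each function is written with conventional techniques, and the workflow only decides the order in which data flows between them.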
Grid of Grids: Research Grid and Education Grid
[Figure labels: Repositories, Federated Databases; Sensors, Streaming Data; Field Trip Data; Database Grid; Sensor Grid; Compute Grid; SERVOGrid Research and Education; Data Filter Services; Research Simulations; GIS; Discovery Grid Services; Customization Services (From Research to Education); Analysis and Visualization; Portal; Education Grid; Computer Farm]
SERVOGrid Requirements


Seamless Access to Data repositories and large scale
computers
Integration of multiple data sources including sensors,
databases, file systems with analysis system
• Including filtered OGSA-DAI (Grid database access)





Rich meta-data generation and access with
SERVOGrid specific Schema extending openGIS
(Geography as a Web service) standards and using
Semantic Grid
Portals with component model for user interfaces and
web control of all capabilities
Collaboration to support world-wide work
Basic Grid tools: workflow and notification
NOT metacomputing
SERVOGrid
Portal Screen
Shots
Earthquake Grid / DoD NCOW Grid
[Figure: layered diagram with the labels below]
CoI Specific Grids/Services: Earthquake Data & Simulation Service; ServoIS; C2 (JBI CEE etc.); NCOW-IS Services
Grids: Information Grid; Compute Grid; GIS Grid; Sensor Grid
Numbered services: 1: Management; 2: Security; 3: Messaging; 4: Discovery; 5: Mediation; 6: Collaboration Grid; 7: Portals; 8: Data Access/Storage; 9: Application Services; 10: Policy (ECS); 11: Metadata
Core Low Level Grid Services
Physical Network
n: Service refers to core services identified by DoD
CoI: Community of Interest
GIS: Geographical Information System
BioInformatics Grid / Chemical Informatics Grid
[Figure: layered diagram with the labels below]
Domain Specific Grids/Services: Sequencing Tools; Biocomplexity Simulations; BIS; HTS Tools; Quantum Calculations; CIS; …
Grids: Compute Grid; MIS Grid; Instrument Grid; Information Grid
Numbered services: 1: Management; 2: Security; 3: Messaging; 4: Discovery; 5: Workflow; 6: Collaboration Grid; 7: Portals; 8: Data Access/Storage; 9: Application Services; 10: Policy; 11: Metadata
Core Low Level Grid Services
Physical Network
M(B,C)IS: Molecular (Bio, Chem) Information System
GIS Grid with WMS, WFS, data sources and GML
<gml:featureMember>
  <fault>
    <name> Northridge2 </name>
    <segment> Northridge2 </segment>
    <author> Wald D. J.</author>
    <gml:lineStringProperty>
      <gml:LineString srsName="null">
        <gml:coordinates>
          -118.72,34.243 118.591,34.176
        </gml:coordinates>
      </gml:LineString>
    </gml:lineStringProperty>
  </fault>
</gml:featureMember>
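A client receiving a feature like the one above could extract its geometry with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree`; it assumes the `gml` prefix is bound to `http://www.opengis.net/gml` (the slide fragment omits the declaration, so one is added here), and it copies the coordinate values exactly as printed on the slide.

```python
import xml.etree.ElementTree as ET

GML_NS = {"gml": "http://www.opengis.net/gml"}

# The slide's fragment, with a namespace declaration added so it parses
FEATURE = """
<gml:featureMember xmlns:gml="http://www.opengis.net/gml">
  <fault>
    <name> Northridge2 </name>
    <author> Wald D. J.</author>
    <gml:lineStringProperty>
      <gml:LineString srsName="null">
        <gml:coordinates>-118.72,34.243 118.591,34.176</gml:coordinates>
      </gml:LineString>
    </gml:lineStringProperty>
  </fault>
</gml:featureMember>
"""


def fault_coordinates(xml_text: str):
    """Return the fault name and its list of (x, y) coordinate pairs."""
    root = ET.fromstring(xml_text)
    name = root.findtext("fault/name").strip()
    raw = root.findtext(".//gml:coordinates", namespaces=GML_NS)
    # Pairs are separated by whitespace, values within a pair by commas
    points = [tuple(float(v) for v in pair.split(",")) for pair in raw.split()]
    return name, points


if __name__ == "__main__":
    print(fault_coordinates(FEATURE))
```

The same pattern would apply to a chemistry dialect such as CML: only the namespace and element names change.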
[Figure: a WMS Client sends GetFeature requests to the WFS Server and receives FeatureCollections; the WFS Server issues SQL Queries against data sources for Railroads [a-b], Interstate Highways [12-18], Rivers [a-d] and Bridges [1-5]]
GML becomes CML, CellML, SBML
Electric Power and Natural Gas data from LANL
Interdependent Critical Infrastructure Simulations
Zoom-in
Zoom-out
FeatureInfo mode
Measure distance mode
Clear Distance
Drag and Drop mode
Refresh to initial map
Google maps
can be
integrated with
Web Feature
Service
Archives to
filter and
browse seismic
records.
Integrating
Archived Web
Feature Services
and Google Maps
What is Happening?







Grid ideas are being developed in (at least) four communities
• Web Service – W3C, OASIS, (DMTF)
• Grid Forum (High Performance Computing, e-Science)
• Enterprise Grid Alliance (Commercial “Grid Forum” with a
near term focus)
Service Standards are being debated
Grid Operational Infrastructure is being deployed
Grid Architecture and core software being developed
• Apache has several important projects as do academia; large
and small companies
Particular System Services are being developed “centrally” –
OGSA or GS-* framework for this in GGF; WS-* for
OASIS/W3C/Microsoft-IBM
Lots of fields are setting domain specific standards and building
domain specific services
USA started but now Europe is probably in the lead and Asia
will soon catch USA if momentum (roughly zero for USA)
continues
The Grid and Web Service Institutional Hierarchy
4: Application or Community of Interest Specific Services
such as “Run BLAST” or “Look at Houses for sale”
3: Generally Useful Services and Features
such as “Access a Database” or “Submit a Job” or “Manage
Cluster” or “Support a Portal” or “Collaborative Visualization”
(OGSA GS-* and some WS-*; GGF/W3C/….)
2: System Services and Features
Handlers like WS-RM, Security, Programming Models like BPEL
or Registries like UDDI
(WS-* from OASIS/W3C/Industry)
1: Container and Run Time (Hosting) Environment
(Apache Axis, .NET etc.)
Must set standards to get interoperability
Location of software for Grid Projects in
Community Grids Laboratory






http://www.naradabrokering.org provides Web service
(and JMS) compliant distributed publish-subscribe
messaging (software overlay network)
http://www.globalmmcs.org is a service oriented (Grid)
collaboration environment (audio-video conferencing)
http://www.crisisgrid.org is an OGC (open geospatial
consortium) Geographical Information System (GIS)
compliant GIS and Sensor Grid (with POLIS center)
http://www.opengrids.org has WS-Context, Extended
UDDI etc.
The work is still in progress but NaradaBrokering is
quite mature
All software is open source and freely available
Project Goals

Establish Requirements from stakeholders
• Research
• Pharmaceutical Industry
• Government

Consider educational implications
• e-Science v Bio/Chem/Molecular Informatics






Consider other national and international projects to ensure we
either lead or use best practice
Design a Grid architecture and staged implementation
Start pilot projects led by Chemistry/Chemical Informatics
Evaluate and iterate
Design and implement ?(Chem, Life Science, Science, Molecular)
Informatics educational program that will attract students
Write winning center grant in 2006-7
Web Services Introduction
• What are “Web Services”?
– A distributed invocation system that Grid
computing builds on
• Independent of platform and programming
language
• Built on existing Web standards
– A service oriented architecture with
• Interfaces based on Internet protocols
• Messages in XML (except for binary data
attachments)
Web Services Introduction
• A web-based architecture providing for
interoperability among resources
– Centralized service registry
– Solves problems associated with finding, using, and
combining online resources
• Employ standard Internet protocols for:
– Communication with resources
– Automated discovery using centralized registries
• Communicate with devices, people, and each
other with the protocols and computer
languages
Service Oriented Architecture
(SOA)
• Goal is to achieve loose coupling among
interacting software agents
• Define service: a unit of work done by a
service provider to achieve desired end
results for a service consumer
• Both provider and consumer are roles
played by software agents on behalf of
their owners.
How does SOA work?
• Two architectural constraints are
employed
– Small set of simple and ubiquitous interfaces
to all participating software agents
– Descriptive messages constrained by an
extensible schema delivered through the
interfaces
Web Services Architectures
• Individual services are registered globally
– Broken down into individual services with
inputs and outputs specified
• Services are published
• Services are requested
• Open registry, publishing, and requesting
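The register, publish and request cycle above can be sketched with a small in-memory registry. This is a hypothetical stand-in written for illustration only; it is not UDDI and does not use any real registry API, and the `count_elements` service is invented for the example.

```python
from typing import Callable


class ServiceRegistry:
    """In-memory stand-in for a central registry (hypothetical, not UDDI)."""

    def __init__(self):
        self._services = {}

    def publish(self, name: str, inputs: list, outputs: list, endpoint: Callable):
        """A provider registers a service with its declared inputs and outputs."""
        self._services[name] = {
            "inputs": inputs, "outputs": outputs, "endpoint": endpoint,
        }

    def find(self, output: str) -> list:
        """A consumer discovers services by what they produce."""
        return sorted(n for n, s in self._services.items() if output in s["outputs"])

    def request(self, name: str, **kwargs):
        """Invoke a discovered service, checking the declared inputs first."""
        service = self._services[name]
        missing = [i for i in service["inputs"] if i not in kwargs]
        if missing:
            raise ValueError(f"missing inputs: {missing}")
        return service["endpoint"](**kwargs)


registry = ServiceRegistry()
# Hypothetical service: count element symbols (capital letters) in a formula
registry.publish("count_elements", ["formula"], ["element_count"],
                 lambda formula: sum(ch.isupper() for ch in formula))

if __name__ == "__main__":
    for name in registry.find("element_count"):
        print(name, registry.request(name, formula="C2H6O"))
```

Declaring inputs and outputs at publication time is what lets a consumer find and call a service it has never seen before.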
Service-Oriented Architecture
• From Curcin et al. DDT, 2005, 10(12), 867
Web Services for Science
• Invisible Services, Semantic Web, and
Grid
• Easy-to-use tools for any scientist
• High throughput, resource intensive
computing done for low cost/resources
• Shared community
– Collaborations between labs and fields
– Shared data
– Shared tools
e-Science and the Grid 1
• e-Science: Major UK Program
– global collaboration in key areas of science and the
next generation of infrastructure that will enable it
• reflects growing importance of international
laboratories, satellites and sensors and their
integrated analysis by distributed teams
• total investment of some £200M over the five-year
period from 2001 to 2006
• CyberInfrastructure: the analogous US initiative
• Grid Technology: supports e-Science &
Cyberinfrastructure
Basic Architectures:
Servlets/CGI and Web Services
[Figure: two architectures side by side. Servlets/CGI: Browser, HTTP GET/POST, Web Server, JDBC, DB or MPI Appl. Web Services: Browser or GUI Client, Web Server, SOAP calls to WSDL-described Web Servers, JDBC, DB or MPI Appl.]
Importance of Web Services
• Building a true science community
• Enabling interoperability between tools
and the integration of data
• Less time coding, more time for science
• Change the way scientists work by
achieving new levels of integration
When To Use Web Services?
• Applications do not have severe restrictions on
reliability and speed.
• Two or more organizations need to cooperate.
– One needs to write an application that uses another’s service.
• Services can be upgraded independently of
clients.
• Services can be easily expressed with simple
request/response semantics and simple state.
Web Services Benefits
• Web services provide a clean separation
between a capability and its user interface.
• Increase in productivity
• Increase in flexibility
• Rapid return on investment
• Integration across multiple applications
Web Services Advantages
• Output in human- and computer-readable
formats
• I/O formats based on standard Internet
protocols
• Resources accessible server to server
allow automated I/O
• Integration based on specific services: you
select services or data needed without
downloading the entire data set
Web Services Advantages
• Description protocols provide details of
service provided and interface
components
• Semantic Web standards increase
efficiency
• Use a central registry and standardized
description of services
• Quality and status of the information is
dynamically available
Web Services Drawbacks
• Based on new technologies
• Time and commitment required to learn
• Standards still in a state of rapid flux
• Issues with quality of data (and for chemistry, quantity of open data), security,
and privacy
Components of Web Services
• Protocols
– SOAP
– WSDL
– UDDI
• XML as a basis for the protocols
• Ontologies
– OWL: Web Ontology Language
• Semantic Web
Components of the Semantic Web
for Chemistry
• XML – eXtensible Markup Language
• RDF – Resource Description Framework
• RSS – Rich Site Summary
• Dublin Core – allows metadata-based newsfeeds
• OWL – for ontologies
• BPEL4WS – for workflow and web services
– Murray-Rust et al. Org. Biomol. Chem. 2004, 2, 3192-3203.
SOAP: Simple Object Access
Protocol
• Flexible protocol to communicate
information between server and server or
client and server using XML
• Supports Remote Procedure Calls
• Allows layers (security, authentication,
transactions) over the basic SOAP
elements
WSDL: Web Services Description
Language
• Describes a service’s interface to clients
• Services register themselves with Web
Services
• WSDL describes how to contact and
interact with services
– I/O, operations and messages to aid
interaction with client
WSDL Overview
• An XML-based Interface Definition Language.
– You can define the APIs for all of your services in WSDL.
• WSDL docs are broken into five major parts:
– Data definitions (in XML) for custom types
– Abstract message definitions (request, response)
– Organization of messages into “ports” and “operations”
(classes and methods).
– Protocol bindings (to SOAP, for example)
– Service point locations (URLs)
• Some interesting features
– A single WSDL document can describe several versions of an
interface.
– A single WSDL doc can describe several related services.
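The five parts listed above can be seen in a skeleton WSDL 1.1 document. The sketch below is illustrative only: every service name, message name and URL in the document is hypothetical, the binding is left incomplete, and the parsing uses Python's standard `xml.etree.ElementTree` rather than a real Web Services toolkit.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Skeleton WSDL 1.1 document; every name and URL in it is hypothetical
WSDL_DOC = """
<definitions name="ChemService" targetNamespace="urn:example:chem"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="urn:example:chem"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <types/>
  <message name="WeightRequest"><part name="smiles" type="xsd:string"/></message>
  <message name="WeightResponse"><part name="weight" type="xsd:double"/></message>
  <portType name="ChemPort">
    <operation name="GetMolecularWeight">
      <input message="tns:WeightRequest"/>
      <output message="tns:WeightResponse"/>
    </operation>
  </portType>
  <binding name="ChemSoapBinding" type="tns:ChemPort">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="ChemService">
    <port name="ChemPort" binding="tns:ChemSoapBinding">
      <soap:address location="http://example.org/chem"/>
    </port>
  </service>
</definitions>
"""


def summarize_wsdl(text: str) -> dict:
    """Pull out one item from each of the five parts of the document."""
    root = ET.fromstring(text)
    return {
        "has_types": root.find("wsdl:types", NS) is not None,
        "messages": [m.get("name") for m in root.findall("wsdl:message", NS)],
        "operations": [o.get("name")
                       for o in root.findall("wsdl:portType/wsdl:operation", NS)],
        "bindings": [b.get("name") for b in root.findall("wsdl:binding", NS)],
        "locations": [a.get("location")
                      for a in root.findall(".//soap:address", NS)],
    }


if __name__ == "__main__":
    print(summarize_wsdl(WSDL_DOC))
```

Reading the document top to bottom follows the list: types, messages, port types with operations, bindings, and finally the service location.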
UDDI: Universal Description,
Discovery, and Integration
• Provides ways for clients and services to interact
with other services
• Uses XML
• Defines the means of access, e.g.,
– URL
– E-Mail
• Defines services hosted by an entity
• Business-oriented tags
• Uses SOAP for communicating
XML: eXtensible Markup Language
• Allows definitions of types of documents
• Tags are used to specify components of
documents
• Allows specification of namespaces to
differentiate between identical tag names
• Tag names do not provide semantics other
than simple hierarchical relations
XML Overview
• A language for building languages
• Basic rules: be well formed and be valid
• Particular XML “dialects” are defined by XML
schemas.
– XML itself is defined by its own schema.
• Extensible via namespaces
• Many non-Web services dialects
– RDF, SVG, GML, CML, XForms, XHTML
• Many basic tools available: parsers, XPath
and XQuery for searching/querying, etc.
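Two of the points above, well-formedness and namespaces, can be shown with Python's standard `xml.etree.ElementTree` parser. This is a minimal sketch: both namespace URIs are hypothetical, and the check covers well-formedness only, since this module does not validate against a schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """True if the text parses as XML; this does not check schema validity."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Two dialects both use a <name> tag; namespaces keep them apart.
# Both namespace URIs are hypothetical.
DOC = """
<record xmlns:chem="urn:example:chem" xmlns:pub="urn:example:pub">
  <chem:name>ethanol</chem:name>
  <pub:name>Journal of Examples</pub:name>
</record>
"""

NS = {"chem": "urn:example:chem", "pub": "urn:example:pub"}
root = ET.fromstring(DOC)
compound = root.findtext("chem:name", namespaces=NS)
journal = root.findtext("pub:name", namespaces=NS)

if __name__ == "__main__":
    print(is_well_formed(DOC), compound, journal)
```

The path expressions here use the limited XPath subset that `ElementTree` supports; full XPath or XQuery needs a dedicated library.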
XML and Web services
• XML lends itself to distributed computing:
– It’s just a data description.
– Platform, programming language independent
• Web Services Description Language
(WSDL)
– Describes how to invoke a service
– Can bind to SOAP, other protocols for actual
invocation
• Simple Object Access Protocol (SOAP)
– Wire protocol extension for conveying RPC calls
– Can be carried over HTTP, SMTP
OWL: Web Ontology Language
• Builds on RDF and RDFS and adds a
means for richer descriptions of properties
and classes
– Disjoint classes
– Cardinality of classes
– Characteristics of relations, like symmetry
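What these richer descriptions buy can be illustrated without any ontology tooling. The sketch below stores facts as plain tuples and hand-codes two of the features listed above, a symmetric relation and a pair of disjoint classes. It is not an OWL reasoner and uses no RDF library; all class, property and individual names are hypothetical.

```python
# Triples are (subject, property, object); all names are hypothetical
SYMMETRIC = {"bondedTo"}
DISJOINT = [("Metal", "NobleGas")]

triples = {
    ("C1", "bondedTo", "O1"),
    ("Fe", "type", "Metal"),
    ("Ar", "type", "NobleGas"),
}


def symmetric_closure(facts: set) -> set:
    """Add (o, p, s) for every (s, p, o) whose property is symmetric."""
    return facts | {(o, p, s) for (s, p, o) in facts if p in SYMMETRIC}


def disjointness_violations(facts: set) -> list:
    """List individuals typed with two classes declared disjoint."""
    types = {}
    for s, p, o in facts:
        if p == "type":
            types.setdefault(s, set()).add(o)
    return sorted(s for s, cls in types.items()
                  for a, b in DISJOINT if a in cls and b in cls)


if __name__ == "__main__":
    closed = symmetric_closure(triples)
    print(("O1", "bondedTo", "C1") in closed)
    print(disjointness_violations(closed))
```

Declaring the property symmetric lets software infer the reverse fact, and declaring classes disjoint lets it flag inconsistent data, neither of which plain XML tag names can express.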
Standards for Web Services
• Business Process Execution Language for
Web Services (BPEL4WS)
• Semantic Markup for Web Services
(OWL-S)
• Web Service Modeling Ontology (WSMO)
Standards Setting Boards
• OASIS: Organization for the Advancement of
Structured Information Standards
– ebXML: e-business XML
– UDDI: Universal Description, Discovery and
Integration
• Global Grid Forum
– community of users, developers, and vendors
leading the global standardization effort for
grid computing
Standards Setting Boards
• W3C: World Wide Web Consortium
– OWL: Web Ontology Language
– RDF/RDFS: Resource Description
Framework/Schema
– SOAP: Simple Object Access Protocol
– URI/URL/URN: Uniform Resource
Identifier/Locator/Name
– WSDL: Web Services Description Language
– XML: eXtensible Markup Language
SWWS: Semantic Web-Enabled
Web Services
• Main objectives:
– Provide a comprehensive Web Service
description framework
– Define a Web Service discovery framework
– Provide a scalable Web Service mediation
middleware
• A program of the European Commission to
run 2002-2005
– http://swws.semanticweb.org
Web Services Integration Projects:
Biosciences
• myGrid
– http://www.mygrid.org.uk/
• BIOPIPE
– http://biopipe.org/
• BioMOBY
– http://biomoby.org/
Web Services for Chemistry:
Problems
• Performance and scalability
• Proprietary data
• Competition from high-performance desktop
applications
-- Geoff Hutchison, it’s a puzzle blog, 2005-01-05
• ALSO:
– Lack of a substantial body of trustworthy Open
Access databases
– Non-standard chemical data formats (over 40 in
regular use and requiring normalization to one
another)
Missing Ingredients in Chemistry
• Chemical communities to assemble Open
Access databases
– Well-defined quality assurance procedures
performed by distributed peer-review systems
– Software underlying the databases needs to
be open source.
Chemistry Databases on the Web
• Marc Nicklaus lists 37 databases as of
October 2001
– Must have structure searching and at least
100 molecules
– http://cactus.nci.nih.gov/ncidb2/chem_www.html
• SoaringBear’s List has 15 databases
– http://geocities.com/soaringbear/biomed/chem.html
Institutional Repositories
• NARSTO Quality Systems Science Center
– http://cdiac.esd.ornl.gov/programs/NARSTO/
– Pollutant species in the troposphere over
North America
– Part of the Carbon Dioxide Information
Analysis Center at ORNL
– NARSTO Data and Information Sharing Tool
• http://mercury.ornl.gov/narsto/
Public Data Repositories
• Developmental Therapeutics Program/NCI
– Some assay data for download
– Structures for over 200,000 compounds
• http://dtp.nci.nih.gov/docs/dtp_search.html
• Zinc and other screening databases
• NIST computational chemistry database
• Environmental fate and exposure
databases
Other Public Repositories 1
• ChemExper Chemical Directory
– > 200,000 substances; > 10,000 IR spectra
– http://chemexper.com/
• HIC-Up; Hetero-Compound Identification Centre
– Uppsala
– 5384 substances as of 1/15/05
– http://xray.bmc.uu.se/hicup/
• Chemicals with Pharmaceutical Activity; a 3D
Structural Database
– 400 3D structures
– http://www.chem.ox.ac.uk/mom/chemical-database/
Other Public Repositories 2
• Cheminformatics.org
– 41 data sets in 9 categories as of 8/18/05
– http://www.cheminformatics.org/
• WebReactions
– http://webreactions.net/
Other Public Repositories 3
• MolTable
– http://www.moltable.org/
• MatWeb Materials Property Data
– http://www.matweb.com/index.asp?ckck=1
• Spectral Database for Organic Compounds (SDBS)
– Over 32,000 compounds
– Has EI-MS, FT-IR, 1H NMR, 13C NMR, Raman, ESR
– http://www.aist.go.jp/RIODB/SDBS/cgi-bin/cre_index.cgi
• NMRShiftDB (Christoph Steinbeck)
– 14,753 structures as of 8/19/05
– Features peer-reviewed submission of data sets
– http://www.nmrshiftdb.org/
Other Public Repositories:
Commercial Teasers
• FTIRsearch.com (Thermo Electron)
– Demo file of 575 spectra from 87,000 in the full database
– https://ftirsearch.com/default3.htm
• ChemACX
– 30 of >350 suppliers catalog data
– http://chemacx.cambridgesoft.com/chemacx/index.asp
• Sunset Molecular Discovery, LLC
– Wombat (World of Molecular BioAcTivity)
• 117,007 entries with over 230,000 biological activities
– Wombat PK
• Database for Clinical Pharmacokinetics: 643 substances with 4668
measurements
– Three sample files from Wombat containing 341 Histamine-1 receptor
antagonists
– http://www.sunsetmolecular.com/
BlueObelisk.org
• A group of chemists, programmers, and
informaticians working collaboratively on
projects such as:
– Chemistry Development Kit (CDK)
– JChemPaint
– Jmol
– JUMBO
– NMRShiftDB
– Octet
– Open Babel
– QSAR
– World Wide Molecular Matrix (WWMM)
Indiana University Existing Projects
• System for the Integration of
Bioinformatics Services (SIBIOS)
– http://sibios.engr.iupui.edu
• PlatCom: A Platform for Computational
Comparative Genomics
– http://bio.informatics.indiana.edu/sunkim/Platcom/
• Reciprocal Net
– http://www.reciprocalnet.org/index.html
Indiana University Planned Projects
• Design of a Grid-based distributed data
architecture
• Development of tools for HTS data analysis and
virtual screening
• Database for quantum mechanical simulation
data
• Chemical prototype projects
– Novel routes to enzymatic reaction mechanisms
– Mechanism-based drug design
– Data-inquiry-based development of new methods in
natural product synthesis
Web Services Future
• Depends on
– Adoption of standards
– Incorporation of WS in current and newly
developed applications
– Security, privacy, quality of data issues
– Development of WS tools and resources for
e-Science