Supplementary Text and Figures

Supplementary Table 1: Sharing Initiatives

Commercial

| Project name | Sharing functionality | Sharing trends |
| --- | --- | --- |
| DNAnexus http://sra.dnanexus.com/ | Hosts National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) for raw sequence data from next-gen sequencing platforms. | Built user interface and mirrors 300-400 terabytes of SRA data (no medical data) on Google’s cloud. Offers proprietary cloud-based analysis and visualization tools that users can share. |
| Illumina https://basespace.illumina.com/home | Data-sharing, analysis and storage for Illumina platform users. | BaseSpace, genomics data-sharing space on Amazon’s AWS cloud infrastructure, in beta testing phase. Requires user registration. |
| Life Technologies Ion Torrent Community http://www.iontorrent.com/community/ | User portal to share data, protocols and code. | Sharing portal Ion Torrent Community requires user registration. |
| Complete Genomics http://www.completegenomics.com/services/data-management-analysis/ | Offers sequencing services, data management, analysis, and results sharing. | Downstream analysis and data-sharing services invade the market of software service providers. |
| Genedata http://www.genedata.com/professional-service/data-analysis.html | Software has built-in sharing functionality for power users and workflow users. | Positioned for pharma outsourcing and public-private projects, such as Europe’s InnoMedPredTox. http://www.imi.europa.eu/content/pilot-projectinnomed |
| GenomeQuest http://www.genomequest.com/technology | Sharing functionality built into analysis tools. | As customer sharing behavior changes, less interest in data storage. More sharing of analysis results than sharing of raw data. |
| ID Business Solutions http://www.idbs.com/products-andservices/inforsense-suite/ | Software and consulting firm with platform for data analysis and data integration, InforSense Suite. | Lung Genomics Research Consortium expanded one of the suite’s components, ClinicalSense, for its data analysis and sharing portal. |

Non-commercial alliances

| Project name | Description | Projects |
| --- | --- | --- |
| Pistoia Alliance http://www.pistoiaalliance.org/ | Collaborative group of pharma and life sciences companies exploring precompetitive data-sharing. | Launches data-sharing projects for next-gen sequence data, biomarker exchange standards. Runs competitions, for example the Sequence Squeeze Competition seeking an algorithm to compress next-gen sequence data. |
| BioIT Alliance http://bioitalliance.org/ | Founded by Microsoft, now a non-profit organization. | Seeks to create standards (data models and transmission standards) to enable data-sharing in translational medicine. |

Non-profit initiatives

| Project name | Description | Projects |
| --- | --- | --- |
| BioSharing http://Biosharing.org | International network of organizations geared toward data-sharing and standardization in the life sciences. | Developed standard called ISA Commons to streamline data sharing. |
| crowdLabs http://www.crowdlabs.org/ | Repository for computational workflows (not only life sciences); offers access to high performance computing. | Uses VisTrails, an open source workflow system. |
| Galaxy http://galaxy.psu.edu/ | Web and cloud-based open source sequence analysis tools. | Galaxy Pages lets users see, re-use, and extend workflows. http://wiki.g2.bx.psu.edu/Learn/Galaxy%20Pages |
| myExperiment Virtual Research Environment http://www.myexperiment.org/ | Collaboration between the universities of Southampton, Manchester and Oxford in the UK. | Platform to share workflows. Users can share workflows openly or keep them private. |
| National Center for Biotechnology Information (NCBI) http://www.ncbi.nlm.nih.gov | Online resources with databases and analysis tools. A division of the National Library of Medicine at the National Institutes of Health. | DNA sequence resource, GenBank, run by NIH, EMBL and DNA DataBank of Japan. Dozens of terabytes of data are downloaded from NCBI resources every day. |
| Sage Bionetworks http://sagebase.org/ | A non-profit focused on sharing science, founded by former Merck researchers Stephen Friend and Eric Schadt. | Launches research collaborations, for example the public-private CommonMind Consortium to share neuropsychiatric disease data. |
| World Wide Web Consortium Semantic Web http://www.w3.org/2001/sw/ | Part of the international community organization World Wide Web Consortium (W3C). | Has groups devoted to data-sharing in the life sciences, for example the Semantic Web Health Care and Life Sciences Interest Group. |
| Workflow 4 Ever http://www.wf4everproject.org/web/guest/home | Web-based resource to preserve and share methods and workflows. | Has partners in genomics and astronomy. Complementary to SHIWA (Sharing Interoperable Workflows for large-scale scientific simulations on distributed computing infrastructures). |

Sharing networks and repositories

| Project name | Description | Projects |
| --- | --- | --- |
| BioPortal http://bioportal.bioontology.org/ | Repository run by The National Center for Biomedical Ontology, part of the National Centers for Biomedical Computing. | Portal stores over 300 controlled vocabularies and ontologies in biomedicine. Users can download ontologies and upload them to share with others. |
| Concept Web Alliance http://www.nbic.nl/aboutnbic/affiliatedorganisations/cwa/introduction/ | Group effort addressing semantic web applications, based at The Netherlands Bioinformatics Centre. | Establishing uniform, user-friendly online platform for text-mining from published texts, databases, and offline resources. |
| Cytoscape http://www.cytoscape.org/ | Open-source software to analyze and visualize biological networks. | Developers are working on a database for sharing network models. |
| DataCite http://datacite.org/ | A non-profit, international organization of libraries. | Offers service for data publishers to mint Digital Object Identifiers (DOIs) for data-sharing. DOIs are also available for datasets. DataCite is compiling a list of research data repositories and working on ways to use DOIs to retrieve metadata. |
| Force11 http://force11.org/ | A group of editors, publishers, scientists, librarians, and research funders. | Formed in 2011 to explore new ways to share, create, and communicate scholarly knowledge. |
| Genocoding Project http://text.soe.ucsc.edu/ | A data harvesting initiative based at the University of California at Santa Cruz and the University of Manchester. | Software tool scans journal papers for genomic identifiers and maps them to the human genome. |
| Nanopublications http://www.nanopub.org | A venture seeking to use semantic tools to harvest assertions and associate them with DOIs. | Nanopublications are being tested in Open Pharmacological Concepts Triple Store (Open PHACTS), a European public-private venture. http://www.openphacts.org |
EMBL: European Molecular Biology Laboratory. Sources: Nature Biotechnology research, Frost & Sullivan, company data