MIAPE API.

New tools for MIAPE Generation
Emilio Salazar Doñate
Bioinformatics Group
CNB – CSIC
Overview
The Motivation
What is an API?
What is the MIAPE API?
How can I use it?
A use-case of the MIAPE API. The Pride MIAPE Converter.
Future Plans.
Motivation
The report of MIAPEs (Minimum Information About Protein
Experiments) is recommended by the most prestigious journals.
Creating these reports is currently laborious because it is done
manually, and scientists very often skip it.
Motivation
On top of that, the duplication of data is clearly unnecessary:
if the data of the experiment are already there, why must I report
them again?
Raw Data, XML Files…
The logical approach is to automate the process of generation of
MIAPEs. That is what the Java MIAPE API aims to do.
API (Application Programming Interface)
An API is an abstraction that defines and describes an interface
for the interaction with a set of functions used by components of
a software system.
The software that provides the functions described by an API is
said to be an implementation of the API.
Therefore an API is not an application, but a helpful tool to be
used programmatically by third party users.
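The difference between an API and its implementation can be sketched in Java. The names below (Sorter, InsertionSorter, ApiDemo) are hypothetical and chosen only for illustration; they are not part of any library mentioned in these slides.

```java
import java.util.Arrays;

class ApiDemo {
    public static void main(String[] args) {
        // Client code depends only on the interface, so the
        // implementation behind it can be swapped freely.
        Sorter sorter = new InsertionSorter();
        System.out.println(Arrays.toString(sorter.sorted(new int[] {2, 71, 38, 16})));
        // prints [2, 16, 38, 71]
    }
}

/** The API: describes what can be done, not how it is done. */
interface Sorter {
    /** Returns a sorted copy of the input and leaves the input untouched. */
    int[] sorted(int[] values);
}

/** One possible implementation of the API (a plain insertion sort). */
class InsertionSorter implements Sorter {
    @Override
    public int[] sorted(int[] values) {
        int[] result = Arrays.copyOf(values, values.length);
        for (int i = 1; i < result.length; i++) {
            int current = result[i];
            int j = i - 1;
            // Shift larger elements one position to the right.
            while (j >= 0 && result[j] > current) {
                result[j + 1] = result[j];
                j--;
            }
            result[j + 1] = current;
        }
        return result;
    }
}
```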
API (Application Programming Interface)
Depending on how they are used, APIs can be:
• Local: They are installed on the end user's machine (the client)
and need no online interaction whatsoever.
- For example, the full set of APIs bundled in the libraries of a
programming language, like Java, C….
• Remote: They are provided by another machine (a server)
somewhere on the internet. Normally these servers have
great computational power and are a great help for running
heavy processes which otherwise could not be executed by
normal computers.
- For example, the Blast Service at EBI can be accessed
programmatically via a web service.
• Hybrid: Some components need to be downloaded to the
client and others need remote access.
- For example, the Google Maps API.
API (Application Programming Interface)
Why use an API?
To illustrate the problem, let’s see a typical situation:
How to sort a list of Integers?
{2, 71, 38, 16, …… An -1, An}
a) Beginners in programming:
Repeatedly search for the smallest remaining number and place it
at the right end of the sorted list
Complexity: O(n²)
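The beginner's approach described above is essentially a selection sort. A minimal Java sketch follows; the class and method names are hypothetical, and the input is the first four numbers of the example list.

```java
import java.util.Arrays;

class SelectionSortDemo {
    /** Sorts the array in place by repeatedly selecting the smallest remaining element. */
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            // Scan the unsorted part for the smallest element.
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) {
                    min = j;
                }
            }
            // Move it to the end of the sorted part.
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] numbers = {2, 71, 38, 16};
        selectionSort(numbers);
        System.out.println(Arrays.toString(numbers)); // prints [2, 16, 38, 71]
    }
}
```

The two nested loops are where the quadratic cost comes from: each of the n positions requires a scan over the remaining elements.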
API (Application Programming Interface)
b) Computer Geeks:
function mergesort(array A[x..y])
begin
  if (y - x >= 1):
    integer m := int((x + y) / 2)
    array A1 := mergesort(A[x..m])
    array A2 := mergesort(A[(m + 1)..y])
    return merge(A1, A2)
  else:
    return A
end

function merge(array A1[0..n1], array A2[0..n2])
begin
  integer p1 := 0
  integer p2 := 0
  array R[0..(n1 + n2 + 1)]
  while (p1 <= n1 or p2 <= n2):
    if (p2 > n2 or (p1 <= n1 and A1[p1] <= A2[p2])):
      R[p1 + p2] := A1[p1]
      p1 := p1 + 1
    else:
      R[p1 + p2] := A2[p2]
      p2 := p2 + 1
  return R
end
Complexity: O(n log n)
If n = 1000:
a) Complexity: 1,000,000 (n²)
b) Complexity: about 9,966 (n log₂ n, less than 1% of a)
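The comparison for n = 1000 can be checked with a few lines of Java. This sketch assumes the logarithm is taken in base 2 and that the figures are simple operation counts; the class and method names are hypothetical.

```java
class ComplexityDemo {
    /** Operation count for a quadratic algorithm: n * n. */
    static long quadratic(long n) {
        return n * n;
    }

    /** Operation count for an n log n algorithm, base-2 logarithm, rounded. */
    static long linearithmic(long n) {
        return Math.round(n * (Math.log(n) / Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1000;
        long a = quadratic(n);    // 1,000,000
        long b = linearithmic(n); // 9,966 after rounding
        System.out.println("n^2       = " + a);
        System.out.println("n log2(n) = " + b);
        System.out.println("ratio (%) = " + (100.0 * b / a));
    }
}
```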
API (Application Programming Interface)
c) Using the Java Collections API:
Collections.sort(List list)
Its documentation states:
The sorting algorithm is a modified mergesort (in which the
merge is omitted if the highest element in the low sublist is
less than the lowest element in the high sublist). This
algorithm offers guaranteed n log(n) performance.
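In practice, option c) is a single call. The sketch below uses only java.util classes; the class name, the sortedCopy helper and the sample values are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CollectionsSortDemo {
    /** Returns a sorted copy so that the caller's list is left untouched. */
    static List<Integer> sortedCopy(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        Collections.sort(copy); // sorts in place, ascending natural order
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(2, 71, 38, 16);
        System.out.println(sortedCopy(numbers)); // prints [2, 16, 38, 71]
    }
}
```

The caller writes no sorting logic at all and still gets the n log(n) behaviour that the API documentation promises.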
Java MIAPE API
The Java MIAPE API is a set of libraries and web services
which provide functionality to retrieve, store and exchange
MIAPE data in an efficient manner.
It has four main modules:
• XML module: parses and creates XML files with MIAPE
information.
• Database Manager: includes a web service which connects
to the ProteoRed database.
• MIAPE Factory: allows the creation of MIAPE data
regardless of the source of information.
• Entity MIAPE: the connection between modules.
Java MIAPE API
API Architecture:
Java MIAPE API
Why use the Java MIAPE API:
• It is developed in Java, a widely used, platform-independent
language.
• Its implementation is based on the most popular design patterns
to ensure performance.
• It is built in a modular way to encourage extension and reuse of
code.
• It provides a storage system for the user (although it allows a
customized one).
• The software can store and retrieve some of the HUPO-PSI XML
standards (Pride at present) in MIAPE format.
• It is fully tested and documented.
Java MIAPE API Usage

• Manually behind a user interface: the user sets the MIAPE data
by hand.
- It is really tedious and time consuming.
- It is very accurate, as the information is complete and not
redundant.
- It does not need intermediate sources like XML.
- Already available at http://www.proteored.org
• Automatically by parsing an XML file.
- It saves the user time, although it increases the
computational time.
- The accuracy is lower, as the mapping between data is far
from 100%.
- If the experiment information is stored as XML, this seems a
natural way to use it.
Java MIAPE API Usage
• Integrated in third party software:
- The mapping between data is done by the programmers,
which means more work for them, but no work at all for the
users.
- The accuracy might be 100% when properly used.
- Full functionality only with Java.
A Use-Case: Pride-MIAPE Converter
[Diagram: Converter, Database, MIAPE Generator Tool]
A Use-Case: Pride-MIAPE Converter
Most common XML Format: Pride.
Wide range of file sizes: from less than 1MB to 13GB
Issues:
• Mapping problems: The XML allows anything as a user
parameter, which makes accurate identification
difficult.
<cvParam cvLabel="PSI" accession="PSI:1000008"
name="Ionization Type" value="ESI" />
• Text size problems: The XML allows text of any size in the
fields, but the database has a limit, so storing long text can
throw exceptions or truncate the text, with loss of
information.
• Problems uploading files: A traditional web application has
limitations for uploading files, and since files might be
as large as 13GB, the development must find a workaround.
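As an illustration of the first two issues, the sketch below reads cvParam elements with the standard Java streaming parser (StAX), which does not load the whole file into memory and so is one way to cope with very large files. It is not the converter's actual code: the class and method names, the wrapping `<example>` element and the 255-character limit are assumptions made for this example.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CvParamScanDemo {
    /** Hypothetical column limit; the real database limit is not stated here. */
    static final int MAX_VALUE_LENGTH = 255;

    /** Streams through the XML and returns "name=value" for every cvParam element. */
    static List<String> readCvParams(Reader xml) {
        List<String> params = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "cvParam".equals(reader.getLocalName())) {
                    String name = reader.getAttributeValue(null, "name");
                    String value = reader.getAttributeValue(null, "value");
                    params.add(name + "=" + value);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return params;
    }

    /** True if the text can be stored without truncation under the assumed limit. */
    static boolean fitsColumn(String text) {
        return text == null || text.length() <= MAX_VALUE_LENGTH;
    }

    public static void main(String[] args) {
        String xml = "<example><cvParam cvLabel=\"PSI\" accession=\"PSI:1000008\" "
                + "name=\"Ionization Type\" value=\"ESI\" /></example>";
        for (String param : readCvParams(new StringReader(xml))) {
            // Check the length before storing instead of letting the database truncate.
            System.out.println(param + " (fits: " + fitsColumn(param) + ")");
        }
        // prints Ionization Type=ESI (fits: true)
    }
}
```

Reading the parameters is the easy part; deciding which MIAPE field a free-form name such as "Ionization Type" belongs to is the mapping problem itself, and this sketch does not address it.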
A Use-Case: Pride-MIAPE Converter
• It is definitely impossible to have a one-to-one mapping
between the Pride file and the MS and MSI MIAPEs, but it
can provide us with a very good starting point, including the
spectra and the identification of proteins and peptides
(manually, this task is nearly impossible).
The MIAPE Generator Web is available to complete the
missing information.
• Instead of a traditional web site, the application is a
client-server application in which the file is transferred
via FTP (File Transfer Protocol), which is far more reliable
than HTTP. The drawback is that the user needs to have
Java installed on the computer.
A Use-Case: Pride-MIAPE Converter:
Current State
A beta version is available at
http://proteo.cnb.csic.es:9999/miape-webservice-pride/
It uses a test database and a test version of the MIAPE
Generator Tool until a definitive version is online.
It should accept files of any size, although the times vary a lot
depending on network traffic.
The most illustrative example is to map to both the MS and MSI
MIAPEs, as it uses all the information in the MIAPE and generates
the mapping between the two MIAPE types.
A Use-Case: Pride-MIAPE Converter:
Current State
Launch the application with Java Web Start.
A Use-Case: Pride-MIAPE Converter:
Current State
First enter a user name and password (they must match an existing account in the
ProteoRed database).
Select a PRIDE file from your local computer and the type of MIAPE to
generate, and press start.
A Use-Case: Pride-MIAPE Converter:
Current State
After some time the application will have uploaded the file to the FTP server and
will have stored it as a MIAPE.
A browser will then open at the corresponding address, showing what was stored in the
database.
Future Plans
The API is not fully functional yet; it is expected to be finished in
May.
More XML formats will be supported for storage as MIAPEs:
•mzML
•mzIdentML
•GelML
Thank you!