Sting Prototype Product Specification

Running Head: Lab 2 – Sting Product Description
SCORPION – Blue Team
Old Dominion University
CS411W – Janet Brunelle
Author: David Eason
Last Modified: November 12, 2014
Version: 2.0
Lab 1 – SCORPION Product Description
TABLE OF CONTENTS
1 Introduction
1.1 Purpose
1.2 Scope
1.3 Definitions, Acronyms, and Abbreviations
1.4 References
1.5 Overview
2 General Description
2.1 Prototype Architecture Description
2.2 Prototype Functional Description
2.3 External Interfaces
3 Specific Requirements
LIST OF FIGURES
Figure 1 - Website GUI
Figure 2 - Using the API
LIST OF TABLES
Table 1 - RWP vs Prototype
Table 2 - User Capabilities
1 INTRODUCTION
1.1 PURPOSE
At Old Dominion University, a program named SCORPION is used to predict the secondary
structure of a protein, and it is gaining popularity. This program is currently being used by
researchers around the world in an effort to aid in cancer research. A very important part of
cancer research and treatment involves the study of protein folding. A protein fold is made of a
sequence of proteins, and the proteins are made of a sequence of amino acids. Currently, the
most popular method of studying these sequences is slow and expensive. SCORPION gives a
researcher the ability to accurately predict a protein’s secondary structure from its sequence
of amino acids. With SCORPION as a tool for researchers, studies can be done more efficiently
and at lower cost.
Sting, the product being developed by the fall 2014 Blue team, will provide a new method to
access SCORPION, as well as address issues with the current website. SCORPION was
originally designed by Dr. Ashraf Yaseen as a graduate project that became popular with cancer
researchers and computational biologists due to its high accuracy. Because of this accuracy, a
growing number of people use SCORPION to predict secondary structures, yet they have only
one way to access it. Currently, a user visits the website, manually enters a series of
characters that represent amino acids, enters an email address, and waits for the results to
be sent back. Sting will streamline the use of SCORPION by providing an Application
Programming Interface (API). The API will allow users to submit queries to SCORPION
programmatically, so that it can be used quickly and efficiently.
Sting will also provide a redesigned public website that addresses issues of compliance and
professional image. The most important issue is that the current website does not meet
government accessibility requirements. SCORPION is partly funded by federal grants and must
be compliant with the Rehabilitation Act of 1973. Specifically, SCORPION must comply with
Section 508 of this Act, which was put in place so that persons with accessibility needs can
use electronic media such as video and text. Sting will ensure compliance by providing
alternative text for images, using a color scheme friendly to color-blind users, and including
descriptive text for screen readers. The new Section 508 compliant website will have a
user-friendly and professional look while still providing the intended functionality.
Both the product and the prototype will incorporate user logins and provide usage data to
administrators. Website users will be able to log in to see all of their previously submitted
sequences as well as the results, which will help them track and organize queries. Logged-in
administrators will also have access to previous submissions and results, along with current
website statistics. These statistics will include page views and user locations by country,
allowing the customer to better understand who is using the website.
1.2 SCOPE
The API will provide a new method of accessing SCORPION and will interface with the existing
software. To reach as many users as possible, the API will be language agnostic, meaning it
can be used from any programming language. The goal is for users to send submissions to
SCORPION for processing efficiently and with little effort. This allows other software to
interface with SCORPION and take advantage of its secondary structure prediction accuracy.
Sting will give SCORPION a new, professional look and a new means of access. Because
SCORPION is closely associated with Old Dominion University, the website affects the
institution’s image. As SCORPION continues to grow in popularity, it should professionally
represent the organizations that own it. So that SCORPION can be funded by federal grants
and pass an audit, the website will be Section 508 compliant. The website will also
incorporate login functionality, which helps users organize submissions and feel more
connected to SCORPION. The login functionality will also give administrators access to
information about website usage. For both the prototype and the final product, a super-user
account will be provided.
All statistical information, as well as user credentials, will come from third-party sources.
This will keep security concerns and website maintenance to a minimum. For the prototype, the
user will log in with Google, and submissions will be stored in a SQLite 3 database. Using
the unique ID that Google provides for each user, every submission can be tied to a user and
kept in the database.
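As an illustration only, the storage described above might look like the following Python sketch. The table name, column names, and helper functions are hypothetical and are not taken from the Sting prototype; the sketch shows only how a Google-provided unique ID could tie submissions to a user in SQLite 3.

```python
import sqlite3

# Hypothetical schema: the table and column names are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS submissions (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    google_id TEXT NOT NULL,
    email     TEXT NOT NULL,
    sequence  TEXT NOT NULL
)
"""


def open_db(path=":memory:"):
    """Open (or create) the SQLite database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_submission(conn, google_id, email, sequence):
    """Store one submission under the user's Google ID; return the row ID."""
    cur = conn.execute(
        "INSERT INTO submissions (google_id, email, sequence) VALUES (?, ?, ?)",
        (google_id, email, sequence),
    )
    conn.commit()
    return cur.lastrowid


def submissions_for(conn, google_id):
    """Return every sequence a given user has submitted, oldest first."""
    rows = conn.execute(
        "SELECT sequence FROM submissions WHERE google_id = ? ORDER BY id",
        (google_id,),
    )
    return [row[0] for row in rows]
```

Parameterized queries (the `?` placeholders) keep user-supplied sequences and email addresses from being interpreted as SQL.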
1.3 DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
Amino Acids- Building blocks of proteins
API- Application Programming Interface (an abstract way for services to communicate)
Data cleansing- The process of removing non-representative instances from the data set.
GeoIP- A lookup table of Internet Protocol (IP) addresses with known municipalities and
providers, used to match an IP address to its origin
GUI- Graphical User Interface
OAuth2- An open standard for delegated authorization, used here to let users log in with an
existing Google account
Predict time- The calculated amount of time for SCORPION to receive input, predict a
secondary protein structure, and send results back to the user.
Protein Fold- A group of proteins made of amino acids that are formed into a functional shape.
Sanitize- Removing invalid amino acid characters from input.
SCORPION- SeCOndaRy structure PredictION
Section 508 Compliance- Guidelines established to make website content equally accessible to
people with disabilities. Section 508 is part of the Rehabilitation Act of 1973.
1.4 REFERENCES
Biological Macromolecular Resource. (n.d.). RCSB Protein Data Bank. Retrieved February 20,
2014, from http://www.rcsb.org/pdb/home/home.do
Blue Team. (n.d.). SCORPION Protein Prediction Timed Experiment. Retrieved February 11,
2014, from www.cs.odu.edu/~410blue/CS410SCORPIONProteinPredictionTimeExperiment.xlsx
Cancer Research Funding - National Cancer Institute. (2013, August 23). Cancer Research
Funding National Cancer Institute. Retrieved May 8, 2014, from
http://www.cancer.gov/cancertopics/factsheet/NCI/research-funding
Freitas, R. (1998, January 1). Nanomedicine. Chapter 3 page 1. Retrieved May 8, 2014, from
http://www.foresight.org/Nanomedicine/Ch03_1.html
Lab 1 - SCORPION. Version 2. (2014, October). STING. Blue Team. CS411W: David Eason
Murphy, S. (2013, May 8). Deaths: Final Data for 2010. Retrieved May 8, 2014, from
http://www.cdc.gov/nchs/data/nvsr/nvsr61/nvsr61_04.pdf
RCSB PDB - Histograms. (n.d.). RCSB PDB - Histograms. Retrieved May 8, 2014, from
http://www.rcsb.org/pdb/statistics/histogram.do?mdcat=mvStructure&mditem=residueCount&name=Residue%20Count
Section 508. (n.d.). United States Department of Health and Human Services. Retrieved March
15, 2014, from http://www.hhs.gov/web/508/index.html
Section 508 Of The Rehabilitation Act. (n.d.). Section 508 Home. Retrieved March 15, 2014,
from http://www.section508.gov/Section-508-Of-The-Rehabilitation-Act
Yaseen, A., & Li, Y. Context-based Features Enhance Protein Secondary Structure Prediction
Accuracy.
1.5 OVERVIEW
This product specification describes the software configuration, required program
libraries, interfaces, and features of the Sting prototype. The remaining sections of this
document provide a detailed description of the tasks and items needed to meet the functional
requirements. The product specification requirements of Lab II Section 3.1 are provided in a
separate document.
2 GENERAL DESCRIPTION
Sting is a two-part solution that allows individuals to access SCORPION via the Internet.
Most users will access SCORPION through the new website, which will replace the existing one.
The new website will automatically check user input for errors, which will improve the overall
performance of SCORPION by eliminating wasted computation time. The new website will also
feature optional login capabilities for users and administrators.
The second part of Sting is a public API for anyone wishing to streamline SCORPION
sequence submission. The public API follows a REST design, which is expected to require less
network communication than other API formats. The API will handle sequence submission to
SCORPION as well as retrieval of submission results. The documentation for the API endpoints
will be simple and concise, allowing new users to quickly learn how to use the services.
2.1 PROTOTYPE ARCHITECTURE DESCRIPTION
The SCORPION website will be redesigned and a public API will be integrated into the
system. The website will feature a navigation menu, user logins, Section 508 compliance, and
cleansing of submitted sequences. Website administrators will be shown website statistics and
user information. Users will have access to all of their previously submitted sequences as
well as the results. When a user submits a sequence through the website, embedded JavaScript
code will check it for invalid characters and remove any that are found. The integrated API
will run on the same server as the website and will be accessible to anyone with access to the
Internet. The API will allow for streamlined submissions to SCORPION, reducing the time it
takes researchers to map protein folds. After each submission, the API returns a unique ID
that the user can use to retrieve the results of the completed job.
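The character check described above could be sketched as follows. The prototype performs this check in embedded JavaScript; Python is used here only to illustrate the logic. The set of valid characters (the 20 standard one-letter amino acid codes) and the choice to upper-case input are assumptions, since this document does not list the characters SCORPION accepts, and `sanitize_sequence` is a hypothetical name.

```python
# Assumption: the 20 standard one-letter amino acid codes are the valid set.
VALID_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def sanitize_sequence(raw):
    """Return (cleaned sequence, list of removed characters).

    Input is upper-cased first (an assumption), then every character that
    is not a valid amino acid code is dropped and reported.
    """
    upper = raw.upper()
    cleaned = "".join(ch for ch in upper if ch in VALID_AMINO_ACIDS)
    removed = [ch for ch in upper if ch not in VALID_AMINO_ACIDS]
    return cleaned, removed
```

Returning the removed characters alongside the cleaned sequence makes it possible to tell the user exactly what was stripped from the submission.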
The differences between the product and prototype are functionality based and are
illustrated in Table 1. The prototype will only use Google for user authentication, whereas the
product will also support Yahoo, Facebook, and Amazon. For the prototype, submissions are sent
through a simulated version of SCORPION for demonstration purposes. The API will function the
same way in both the prototype and the product.
Function                             Real World Product   Prototype
Google Authentication                x                    x
Yahoo Authentication                 x
Facebook Authentication              x
Amazon Authentication                x
Administrator Profiles               x                    x
User Profiles                        x                    x
Automatic Gathering of User Data*    x
Website Statistics                   x                    x
Public API                           x                    x
* Users will manually enter locations with the prototype
TABLE 1 - RWP VS PROTOTYPE
2.2 PROTOTYPE FUNCTIONAL DESCRIPTION
The Sting prototype will give SCORPION a new website with a new look and feel and more
functionality for sequence submissions. The home page, illustrated in Figure 1, shows the
layout each web page will have. This format will be consistent, and the contents will be filled
dynamically with information for a logged-in user or administrator. When a user enters an
amino acid sequence, the sequence will be checked for invalid characters; if any are found,
they will be removed and the user will be alerted.
FIGURE 1- WEBSITE GUI
Another feature of the prototype website will be user logins. Authentication will
be provided by the Google OAuth API, which allows the user to authenticate with an existing
Google account. After the user logs in, Google sends back the user’s name, email address, and
unique ID to the web page. With this information, users will be greeted on the home page by
the name associated with their Google account. When submitting a protein sequence, users will
not be required to enter an email address manually; if none is entered, the email address
associated with the login is used by default. Upon submission, the database will be updated
with the email address, unique ID, and submitted sequence. Logged-in users will also be able
to see their previous submissions and results, which are stored at submission time.
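The login-related behavior in this paragraph can be sketched in Python as shown below. The `GoogleProfile` type, its field names, the greeting text, and the `build_submission` helper are hypothetical; they model only the three values the text says Google returns and the rule that the login email address is the default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoogleProfile:
    """The three values returned after login (field names are assumptions)."""
    unique_id: str
    name: str
    email: str


def greeting(profile):
    """Greet the user by the name associated with the Google account."""
    return f"Welcome, {profile.name}"


def build_submission(profile, sequence, email=""):
    """Build the record stored at submission time.

    If the user leaves the email field blank, fall back to the email
    address associated with the login.
    """
    chosen_email = email.strip() or profile.email
    return {
        "user_id": profile.unique_id,
        "email": chosen_email,
        "sequence": sequence,
    }
```

The resulting record holds the email address, unique ID, and submitted sequence, which are the fields the database is updated with upon submission.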
Logged-in administrators will have a section on the website showing user statistics as
well as previous submission data. Administrators will also be able to use the website in the
same way as all other users. Each web page will include a section of code that sends data
about each visitor to Google Analytics. This data will later be retrieved using the Google
Analytics API to report page views and user locations. Table 2 illustrates the abilities of
users who are not logged in, logged-in users, and administrators.
                                    User Type
Function                            Not Logged In   Logged In   Administrator
Submit sequence                     x               x           x
Greet message                                       x           x
View previous submissions                           x           x
View previous submission results                    x           x
Access to page views                                            x
Access to user locations                                        x
TABLE 2- USER CAPABILITIES
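One way to encode the capabilities of Table 2 in code is a simple lookup from user type to permitted functions, sketched below in Python. The enum, the capability identifiers, and the `can` helper are hypothetical names, and the mapping reflects a reading of Table 2 in which each user type gains the capabilities of the one before it.

```python
from enum import Enum


class UserType(Enum):
    NOT_LOGGED_IN = "not logged in"
    LOGGED_IN = "logged in"
    ADMINISTRATOR = "administrator"


# Capability identifiers are illustrative names for the rows of Table 2.
_ANONYMOUS = frozenset({"submit_sequence"})
_LOGGED_IN = _ANONYMOUS | {
    "greet_message",
    "view_previous_submissions",
    "view_previous_submission_results",
}
_ADMINISTRATOR = _LOGGED_IN | {"access_page_views", "access_user_locations"}

CAPABILITIES = {
    UserType.NOT_LOGGED_IN: _ANONYMOUS,
    UserType.LOGGED_IN: _LOGGED_IN,
    UserType.ADMINISTRATOR: _ADMINISTRATOR,
}


def can(user_type, capability):
    """Return True if the given user type is allowed the capability."""
    return capability in CAPABILITIES[user_type]
```

Building each set from the previous one keeps the table's pattern explicit: administrators can do everything a logged-in user can, plus view statistics.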
The website will be Section 508 compliant. The current website does not conform to the
government standards required for federally funded projects, and an audit could result in
financial penalties for Old Dominion University. For compliance, the prototype will contain
accompanying text that allows users with disabilities to use the website with accessibility
software. Users who are vision impaired will be able to use a screen reader, which reads aloud
the accompanying text and website directions, making it possible to navigate the page and
submit sequences.
The public API prototype will send submissions through the same system as the website.
The user will be able to submit an amino acid sequence, accompanied by a subject and an email
address, over standard Internet protocols. Upon submitting, users will receive a unique ID
that can be used to retrieve results programmatically.
The API will be accessed via the Internet and will use the Hypertext Transfer Protocol
(HTTP) methods GET and POST. The process for sending data to and receiving data from the Sting
API is illustrated in Figure 2. The POST method will be used to submit a sequence and will
return a unique ID. The GET method will be used to retrieve submission results based on that
ID. All other HTTP methods will be refused, increasing security on the system.
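The request flow described above might be sketched with Python's standard `http.server` module as follows. This is not the Sting implementation: the `/submissions` path, the JSON field names, the status codes, and the in-memory `JOBS` store are all assumptions made for illustration.

```python
import json
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

JOBS = {}  # hypothetical in-memory store: unique ID -> submission record


class StingHandler(BaseHTTPRequestHandler):
    """Accepts POST (submit) and GET (retrieve) only.

    BaseHTTPRequestHandler answers any method without a do_<METHOD>
    handler with 501, so all other HTTP methods are refused.
    """

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        if self.path != "/submissions":
            return self._send_json(404, {"error": "not found"})
        length = int(self.headers.get("Content-Length", 0))
        try:
            data = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            return self._send_json(400, {"error": "invalid JSON"})
        if not isinstance(data, dict) or not data.get("sequence"):
            return self._send_json(400, {"error": "sequence is required"})
        job_id = uuid.uuid4().hex  # unique ID returned to the caller
        JOBS[job_id] = {
            "sequence": data["sequence"],
            "subject": data.get("subject", ""),
            "email": data.get("email", ""),
            "status": "queued",
        }
        self._send_json(201, {"id": job_id})

    def do_GET(self):
        prefix = "/submissions/"
        job = JOBS.get(self.path[len(prefix):]) if self.path.startswith(prefix) else None
        if job is None:
            return self._send_json(404, {"error": "unknown ID"})
        self._send_json(200, job)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real server would log requests
```

To try the sketch locally, one could run `HTTPServer(("127.0.0.1", 8080), StingHandler).serve_forever()` and send requests with any HTTP client, which also illustrates the language-agnostic goal of the API.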
FIGURE 2- USING THE API
The prototype website and API will submit sequences to a mock SCORPION algorithm. The real
version of SCORPION takes too long to process submissions and return results to be practical
for testing the prototype. Instead, Sting will use an algorithm that takes the same input as
the SCORPION neural network but returns a random sequence of amino acid characters. The
prototype will be architected so that the real SCORPION system can be easily integrated.
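A minimal Python sketch of such a stand-in is shown below. The function names are hypothetical, and two details are assumptions not stated in this document: that the random output has the same length as the input, and that it is drawn from the 20 standard one-letter amino acid codes.

```python
import random

# Assumption: output characters come from the 20 standard one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mock_scorpion(sequence, rng=None):
    """Stand-in predictor: ignores the content of the input and returns a
    random string of amino acid characters of the same length."""
    if rng is None:
        rng = random.Random()
    return "".join(rng.choice(AMINO_ACIDS) for _ in sequence)


def run_prediction(sequence, predictor=mock_scorpion):
    """Run a submission through whichever predictor is plugged in.

    Passing the predictor as a parameter is one way to let the real
    SCORPION system replace the mock without changing the callers.
    """
    return predictor(sequence)
```

With this arrangement, the website and API would call only `run_prediction`, so swapping in the real system would mean supplying a different `predictor` function.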
2.3 EXTERNAL INTERFACES
No external interfaces are needed to construct the prototype. The final product will not
require any external interfaces either.
3 SPECIFIC REQUIREMENTS
All requirements for the Sting prototype are in a separate document named “Lab 2
Section 3”. The functional requirements are explicitly listed in Section 3.1, and all aspects
of the prototype are detailed there. Specifications on how to configure external applications
are also listed in that document.