Software Requirements Specification
1. Introduction
1.1 Purpose
This software requirements specification (SRS) describes the functions and
requirements of the web-based ESTMD system. The document is used during the
specification process and serves as the baseline for developing, validating, and
testing the software.
The intended audience for the SRS is the faculty and researchers of the Biology
department at Kansas State University.
1.2 Scope
This software requirements specification defines the requirements of the
web-based ESTMD system. The system will focus on efficiently searching
sequences at different levels (such as raw sequences, cleaned sequences, and
assembled sequences) and related information (such as Gene Ontology and
pathway data).
1.3 Definitions
ESTMD – Expressed Sequence Tag Model Database.
Java Servlets – Java programs that run within a web server environment and
handle client requests.
JSP – JavaServer Pages. A technology that provides a simplified, fast way to
create dynamic web content.
XML – Extensible Markup Language. A subset of SGML constituting a particular
text markup language for interchange of structured data. The Unicode Standard is
the reference character set for XML content.
XSLT – Extensible Stylesheet Language Transformations. An XSLT style sheet
specifies the presentation of a class of XML documents by describing how an
instance of the class is transformed into an XML document that uses a formatting
vocabulary, such as (X)HTML or XSL-FO.
1.4 References
IEEE STD 830-1998, “IEEE Recommended Practice for Software Requirements
Specifications”, 1998 Edition, IEEE, 1998
Marty Hall, “Core Servlets and JavaServer Pages”, Prentice Hall PTR, 2000
ESTAP, “http://www.vbi.vt.edu/~estap”
1.5 Overview
This document provides a description of the requirements for the web-based
ESTMD system. Section 2 gives an overall description of the system, including
its major components and product design. Section 3 provides the specific
requirements of the individual components and the performance criteria.
2. Overall Description
This section provides an overview of key web-based ESTMD system requirements. It
is intended for general information only, and does not describe all the details of the
various items.
2.1 Product perspectives
This product is a web-based database system. Its main function is to let users
query information by entering keywords or IDs through web interfaces. Users
may also download sequences from the database or submit data to the
database.
2.1.1 System Interfaces
• Access data from database
• Handle data submission
2.1.2 User Interfaces
All the user interfaces are web-based.
• Main page
Figure 1. Snapshot of the main page, showing all the functions and search tools.
• Search in Detail
Figure 2. Snapshot of the “Search in Detail” page.
• Search by Keyword
Figure 3. Snapshot of the “Search by Keyword” page.
• Gene Ontology
Figure 4. Snapshot of the “Gene Ontology” page.
• GO Classification
Figure 5. Snapshot of the “Gene Ontology Classification” page.
• Pathway
Figure 6. Snapshot of the “Pathway” page.
• Downloads
Figure 7. Snapshot of the “Downloads” page.
• Data Submission
Figure 8. Snapshot of the “Data Submission” page.
2.1.3 Hardware Interfaces
Server side:
• Speed: Pentium 4 Processor at 2.8 GHz
• CPU Architecture: x86
• Network/connection architecture: TCP/IP, HTTP protocol
• Storage: 120 GB Ultra ATA/100 Hard Drive
• Memory: 1 GB PC 1066 RDRAM
Client side:
• Network connection
2.1.4 Software Interfaces
Server side:
• Operating system
The software can be run on multiple platforms such as Microsoft
Windows, Linux, and UNIX systems.
Name               Mnemonic  Version   Source
Microsoft Windows  Windows   2000, XP  Microsoft Corp.
UNIX               UNIX      V1.1.7    Sun Corp.
Red Hat Linux      Linux     V9.0      Red Hat Corp.
• Web server: Apache 2.0
• Database server: MySQL 4.0
Client side:
• Internet browser: Internet Explorer or Netscape
2.2 Product functions (use case)
[Use case diagram. Project: ESTMD System. Author: Yinghua Dong. Actor: User.
Use cases: Login, Search in Detail, Search by Keyword, Contig View, Gene
Ontology, Tree View, GO Classification, Pathway, Download, Data Submission.]
Figure 9. Use Case for ESTMD System
The product has the following main functions, as shown in Figure 9:
• Search detail information on different levels of sequences
• Search general information on different levels of sequences
• Search gene ontology information
• Classify gene ontology of the sequences
• Search pathway information
• Download or submit data
2.3 User characteristics
Users need to know how to use a mouse, a keyboard, and an Internet browser. The
user interface will be friendly enough to guide users through the system. Basic
knowledge of biology is assumed.
2.4 Constraints
The main constraint of the project is the MySQL 4.0 database management system.
MySQL is faster than Oracle on small to medium-sized databases and is easy to
administer, but it is less powerful for complex queries.
Another constraint is that some data are not yet available. Some related
databases need to be downloaded from other web sites or obtained from the labs
of the Biology department.
3. Specific Requirements
3.1 External Interfaces
There are seven main web pages serving as user interfaces in this system, in
addition to the login page. The detailed descriptions of all inputs and outputs
of the system are as follows:
3.1.0 Login
Inputs: user name and password.
Outputs: if the credentials are correct, the main page is shown; otherwise, an
error message is shown.
3.1.1 Search in Detail
Inputs: choose items from drop-down boxes; type a gene symbol, gene name, or
any type of ID (such as unique sequence ID, clone ID, FlyBase ID, GenBank
ID, or accession ID); and check the check boxes of the features that users
want in the results.
Outputs: the features corresponding to the users' selections.
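The feature check boxes described above map naturally onto the column list of a
database query. The following is a minimal sketch under stated assumptions, not
the ESTMD implementation: the class DetailQueryBuilder, the table name
sequence_info, and all column names are hypothetical, since this document does
not give the schema. The sketch keeps only features found on a fixed list and
leaves the user-typed ID as a "?" placeholder, to be bound later through a JDBC
PreparedStatement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: builds the SELECT text for a "Search in Detail" request.
final class DetailQueryBuilder {
    // Assumed column names; the real ESTMD schema is not given in this document.
    private static final Set<String> ALLOWED_COLUMNS = new LinkedHashSet<>(
            Arrays.asList("clone_id", "gene_name", "gene_symbol", "flybase_id", "genbank_id"));

    // Returns SQL with a single '?' placeholder for the ID typed by the user.
    static String buildQuery(String idColumn, List<String> checkedFeatures) {
        if (!ALLOWED_COLUMNS.contains(idColumn)) {
            throw new IllegalArgumentException("Unknown ID type: " + idColumn);
        }
        List<String> columns = new ArrayList<>();
        for (String feature : checkedFeatures) {
            // Skip unknown features and duplicates instead of copying them into the SQL.
            if (ALLOWED_COLUMNS.contains(feature) && !columns.contains(feature)) {
                columns.add(feature);
            }
        }
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("No valid features selected");
        }
        return "SELECT " + String.join(", ", columns)
                + " FROM sequence_info WHERE " + idColumn + " = ?";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("clone_id", Arrays.asList("gene_name", "gene_symbol")));
    }
}
```

Because only names from the fixed list ever reach the SQL text, the value typed
by the user never becomes part of the statement itself.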
3.1.2 Search by Keyword
Inputs: choose items from drop-down boxes and type a keyword.
Outputs: a table that includes clone ID, raw sequence length, cleaned sequence
length, unique sequence ID, unique sequence length, gene name, and gene
symbol.
3.1.3 Gene Ontology Search
Inputs: choose items from drop-down boxes; type a single gene symbol/name
or ID, or choose a local file containing a batch of sequence IDs or FlyBase
IDs; and select the radio buttons for the gene ontology type and the “sort by”
option.
Outputs: a results table that includes GO ID, term, type, sequence ID, hit ID
(FlyBase ID), and gene symbol. The hyperlinks on the terms show the Gene
Ontology tree structure.
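The batch input described above requires the server to extract IDs from an
uploaded file. The sketch below shows one way this could be done. It assumes one
ID per line, which this document does not specify; the class BatchIdReader is
hypothetical, and the IDs in the example are placeholders rather than real
records.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: extracts a batch of IDs from uploaded text.
final class BatchIdReader {
    // Assumes one ID per line; trims whitespace, drops blank lines and duplicates,
    // and keeps the IDs in the order in which they first appear.
    static List<String> readIds(Reader source) {
        Set<String> ids = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String id = line.trim();
                if (!id.isEmpty()) {
                    ids.add(id);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        // Placeholder IDs for illustration only.
        String upload = "FBgn0000001\n\n FBgn0000002 \nFBgn0000001\n";
        System.out.println(readIds(new StringReader(upload)));
    }
}
```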
3.1.4 Gene Ontology Classification
Inputs: type a batch of gene symbols/names, or choose a local file containing
sequence IDs; and check the check boxes of the gene ontology types that
users want to classify.
Outputs: a table that includes gene ontology type, subtype, sequence count, and
percentage of sequences.
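The sequence count and percentage columns of this output can be derived with a
simple tally. The sketch below is illustrative only. It assumes each sequence
contributes exactly one subtype to the input list, which is a simplification
(this document does not say how sequences with several GO terms are counted).
The class name and the subtype labels in the example are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: tallies sequences per gene ontology subtype.
final class GoClassificationSummary {
    // Counts how many sequences fall under each subtype, in first-seen order.
    static Map<String, Integer> countBySubtype(List<String> subtypePerSequence) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String subtype : subtypePerSequence) {
            counts.merge(subtype, 1, Integer::sum);
        }
        return counts;
    }

    // Percentage of sequences in one subtype relative to the total; 0 for an empty input.
    static double percentage(int count, int total) {
        return total == 0 ? 0.0 : 100.0 * count / total;
    }

    public static void main(String[] args) {
        // Example labels only; one entry per sequence.
        List<String> subtypes = Arrays.asList("binding", "catalytic activity", "binding", "binding");
        Map<String, Integer> counts = countBySubtype(subtypes);
        for (Map.Entry<String, Integer> row : counts.entrySet()) {
            System.out.printf("%-20s %3d %6.1f%%%n",
                    row.getKey(), row.getValue(), percentage(row.getValue(), subtypes.size()));
        }
    }
}
```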
3.1.5 Pathway Search
Inputs: choose items from drop-down boxes; type a single gene symbol/name,
ID, EC number, or pathway name, or choose a local file containing a batch of
sequence IDs or FlyBase IDs; and select the radio buttons for “sort by” and
“search scope”.
Outputs: a results table that includes pathway name, category, sequence ID, EC
number, and enzyme count.
3.1.6 Downloads
Inputs: click the item that the user wants to download.
Outputs: the corresponding sequence information.
3.1.7 Data Submission
Inputs: data information and user information.
Outputs: a success or failure message.
3.2 Functions
The validity of inputs will be checked on the client side. Errors and exceptions
will be handled on the server side.
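The server-side half of this requirement can be sketched as follows. This is a
hedged illustration rather than the ESTMD code: the class SearchRequestHandler
and the keyword rule (1 to 50 letters, digits, spaces, hyphens, or underscores)
are assumptions made for the example. The sketch repeats the check on the server
and converts a failure into a message for the user instead of letting the
exception escape.

```java
// Hypothetical server-side handler for a keyword search request.
final class SearchRequestHandler {
    // Assumed rule: 1 to 50 letters, digits, spaces, hyphens, or underscores.
    static String validateKeyword(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("Keyword is missing");
        }
        String keyword = raw.trim();
        if (!keyword.matches("[A-Za-z0-9 _-]{1,50}")) {
            throw new IllegalArgumentException("Keyword is empty or contains unsupported characters");
        }
        return keyword;
    }

    // Converts a validation failure into an error message for the user.
    static String handle(String rawKeyword) {
        try {
            String keyword = validateKeyword(rawKeyword);
            return "OK: searching for '" + keyword + "'";
        } catch (IllegalArgumentException e) {
            return "Error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("  kinase "));
        System.out.println(handle("a'; DROP TABLE x"));
    }
}
```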
3.3 Logical Database requirements
Figure 10 shows the Entity-Relationship model of ESTMD.
3.4 Software System Attributes
3.4.1 Efficiency
With traditional CGI, a new process is started for each HTTP request.
However, with servlets, the Java virtual machine stays running and handles
each request with a lightweight Java thread. If there are N requests to the same
CGI program, the code for the CGI program is loaded into memory N times.
With servlets, however, only a single copy of the servlet class would be
loaded. This approach reduces server memory requirements and saves time by
instantiating fewer objects. Servlets remain in memory even after they
complete a response, so it is straightforward to store arbitrarily complex data
between client requests.
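The threading model described above can be illustrated without a servlet
container. In the sketch below, a plain Java object stands in for the servlet:
a single instance is created, many threads call its handle method, and a field
keeps data between calls. The class SharedHandlerDemo, the pool size, and the
query keys are assumptions made for the illustration; the Servlet API itself is
not used.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for a servlet: one long-lived object whose method is called by many threads.
final class SharedHandlerDemo {
    private final AtomicInteger requestsServed = new AtomicInteger();
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Called once per request; data stored in the fields survives between requests.
    String handle(String key) {
        requestsServed.incrementAndGet();
        return cache.computeIfAbsent(key, k -> "result for " + k);
    }

    int requestsServed() { return requestsServed.get(); }
    int cachedEntries() { return cache.size(); }

    // Sends n requests to a single handler instance from a small thread pool.
    static SharedHandlerDemo run(int n) {
        SharedHandlerDemo handler = new SharedHandlerDemo();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < n; i++) {
            final String key = "query" + (i % 3);
            pool.execute(() -> handler.handle(key));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handler;
    }

    public static void main(String[] args) {
        SharedHandlerDemo handler = run(100);
        System.out.println(handler.requestsServed() + " requests, "
                + handler.cachedEntries() + " cached entries");
    }
}
```

Because the instance is shared, its fields must be safe for concurrent access,
which is why the sketch uses AtomicInteger and ConcurrentHashMap.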
3.4.2 Platform-independence
Servlets are the Java platform technology of choice for extending and
enhancing web servers. They provide a component-based, platform-independent
method for building web-based applications.
Figure 10. E-R Model for ESTMD
3.4.3 Convenience
Web interfaces make the system easy to use. Users only need to know how to
use a web browser and do not need to download, install, or learn any special
software.
3.4.4 Reliability
HTML with JavaScript will validate user input on the client side. Exceptions
and errors on the server side will be handled by Java exception handling.
3.4.5 Security
In traditional CGI, the programs are often executed by operating system
shells and processed by languages that do not automatically check array or
string bounds. Servlets suffer from neither of these problems. Even if a servlet
executes a system call to invoke a program on the local operating system, it
does not use a shell to do so. In addition, array bounds checking and other
memory protection features are a central part of the Java programming language.
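The point about invoking a program without a shell can be sketched with the
standard ProcessBuilder class. This is an illustration under stated
assumptions: the class NoShellInvocation is hypothetical, and "blastall" is
only an example program name that is never executed here. Each element of the
command list reaches the program as one argument, so on Unix-like systems
characters such as ";" are not interpreted as shell syntax.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: prepares an external command without going through a shell.
final class NoShellInvocation {
    // Each list element is handed to the program as a single argument.
    static ProcessBuilder buildCommand(String program, String userSuppliedArg) {
        return new ProcessBuilder(Arrays.asList(program, userSuppliedArg));
    }

    public static void main(String[] args) {
        // The command is only constructed and inspected; start() is never called.
        ProcessBuilder pb = buildCommand("blastall", "seq1; rm -rf /tmp/x");
        List<String> command = pb.command();
        System.out.println(command.size() + " elements, argument = " + command.get(1));
    }
}
```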
A three-tier structure helps keep the data safe. The client tier is not in
direct communication with the database. In order to send or receive data, it
must communicate with the application-server tier, which in turn communicates
with the data-server tier.