directions for selected sections in the master's project proposal

Project Proposal Outline
i. Title Page - student does
ii. Approval Page - student does
iii. Table of Contents - student does
1. Introduction and Background
1.1 Problem Statement - student does
1.2 Previous Work - student extracts from materials
1.3 Background - student extracts from materials
1.4 Glossary - student extracts from materials
2. Project Description
2.1 Functional Specification
2.1.1 Functions Performed - customize from below
2.1.2 Limitations and Restrictions - customize from below
2.1.3 User Interface Design [if required] - n/a
2.1.4 Other User Inputs [if required] - n/a
2.1.5 Other User Outputs [if required] - n/a
2.1.6 System Data Files - provided below
2.2 Design Specification
2.2.1 System Data Flow Diagrams - customize from this document
2.2.2 System Structure Chart - student does
2.2.3 System Data Dictionary - provided below
2.2.4 Equipment Configuration - provided below
2.2.5 Implementation Languages - provided below
2.3 Implementation Plan
2.3.1 Deliverable Items - student does, notes provided below
2.3.2 Milestone Descriptions - student does
2.3.3 Milestone Completion Criteria - customize from below
2.3.4 Schedule of Milestone Completion - provided below
3. References - student does
4. Qualifications - student does
4.1 Personal Background
4.2 Courses Taken
4.3 Programs Written
4.4 Investigations
4.5 Projects
5. Grading Criteria - provided below
Detailed Description of Outline Sections
You will be integrating a library system (“XXX” in the text below) into the IntegraL Digital
Library Integration infrastructure. Your master's project will be to write this “XXX Integrator” (also
called the “IntegraL Plugin”, wrappers, and document schema mappers). Your integrator will run
in the background, so you do not need to write any user interface. Instead you will create the linking
rules, parameters and document input that the IntegraL infrastructure uses to create the link anchors
and links.
from Computer Science Project Guide - Page 2
You will need to customize each of these descriptions for your XXX Integrator.
1. Introduction and Background
1.1 Problem Statement - student does
1.2 Previous Work - student extracts from materials
1.3 Background - student extracts from materials
1.4 Glossary - student extracts from materials
Your four sections need to be different from everybody else’s, including your project partner’s.
Please write them from scratch yourself. The glossary, especially, should match your project.
2. Project Description
The purpose of this section is to describe the proposed project in detail: what will
you do, how will you do it, and when will you do it.
2.1 Functional Specification
This is a detailed specification of functions performed by the proposed
system, from an external or user perspective, not from an internal or
programmer viewpoint. Thus, the system is regarded as a black box with
various inputs and outputs related by the functions performed by the system.
The description should be sufficient for another programmer to implement
the system.
2.1.1 Functions Performed
List and briefly describe each of the functions which the system will
be designed to perform for its user: What the system will do.
The XXX Integrator ensures that IntegraL can automatically add link anchors to the pages
generated by the XXX system. The XXX Integrator will perform the following functions.
1. Parsing pages
Whenever a page of information is about to be displayed to the user, the XXX Integrator will parse the
page to identify the elements-of-interest, determine their locations and assign each a unique ID. The
XXX Integrator will create an XML message that includes the page, its elements-of-interest, and
their locations and IDs. It will also pass the page to the IntegraL lexical analyzer to gather keyword
elements-of-interest and receive the results.
To distinguish your report from your partner’s, you should list the pages you will be integrating
here. State that more details will be given in section 2.3.1.
2. Pass commands to the XXX system
Whenever the user selects a link to the XXX system from the list of links generated by IntegraL,
the XXX Integrator will pass that command to the XXX system. These links are generated by the
linking rules (also called mapping or relationship rules) described later.
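The parsing function described in item 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names, the `<span class="author">` page markup, and the XML message layout are all invented for the example and are not the IntegraL message format. Your own integrator must follow the format the IntegraL team prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: find elements-of-interest in a page and describe them in XML. */
final class PageParserSketch {

    /** One element-of-interest: a unique ID, a semantic type, its text, and its offset in the page. */
    static final class Element {
        final String id;
        final String type;
        final String text;
        final int start;

        Element(String id, String type, String text, int start) {
            this.id = id;
            this.type = type;
            this.text = text;
            this.start = start;
        }
    }

    // Assumed page markup; the pages of a real XXX system will look different.
    private static final Pattern AUTHOR =
            Pattern.compile("<span class=\"author\">([^<]*)</span>");

    /** Scans the page and assigns each match a sequential unique ID. */
    static List<Element> findElements(String page) {
        List<Element> found = new ArrayList<Element>();
        Matcher m = AUTHOR.matcher(page);
        while (m.find()) {
            String id = "eoi-" + (found.size() + 1);
            found.add(new Element(id, "author", m.group(1), m.start(1)));
        }
        return found;
    }

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Builds an XML message (invented layout) carrying the page and its elements. */
    static String toXml(String page, List<Element> elements) {
        StringBuilder xml = new StringBuilder("<message>\n");
        xml.append("  <page>").append(escape(page)).append("</page>\n");
        for (Element e : elements) {
            xml.append("  <element id=\"").append(e.id)
               .append("\" type=\"").append(e.type)
               .append("\" start=\"").append(e.start).append("\">")
               .append(escape(e.text)).append("</element>\n");
        }
        return xml.append("</message>").toString();
    }

    public static void main(String[] args) {
        String page = "<p>By <span class=\"author\">Ada Lovelace</span></p>";
        System.out.println(toXml(page, findElements(page)));
    }
}
```

A regular expression is used here only to keep the sketch short. Since IntegraL hands the plugin a tokenized HTML stream, a real integrator would more likely work on those tokens than on raw text.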
2.1.2 Limitations and Restrictions
List and describe each of the internal (self) and external
(environment) limitations and/or restrictions on the range of system
functions: What will the system not do. DO NOT INSULT THE
READER BY INCLUDING ITEMS THAT WOULD NOT BE A
SURPRISE.
1. There is no user interface design involved in this project. IntegraL provides the user interface.
The XXX Integrator will only provide background functionality.
2. The XXX Integrator only parses documents of certain types. Currently these are HTML
documents. A final determination will be made in September (based on work outside this
master's project) whether to also parse PDF and MS Word documents. If so, the appropriate
parsing tools will be provided.
2.1.3 User Interface Design
Give a detailed description of the system user interface including
diagrams of all the ``work'' windows (or screens or panes), a table of
operations for each work window, and precise descriptions of each
operation that the user would regard as unfamiliar. A work window
is one that contains data the user is editing, browsing or viewing.
This section is required for all programs that engage the user
interactively. Refer to the sample in Section 3.4 of this document.
2.1.4 Other User Inputs
Give a precise description of the other inputs to the system including
source (human or storage) syntax (format) and semantics (meaning).
Give examples. This section is required for all programs that obtain
input from their environment non interactively.
2.1.5 Other User Outputs
Give a precise description of the other outputs of the system
including syntax and semantics. Correlate the outputs with the inputs
and the functions performed. Give examples. This section is required
for all programs that obtain input from their environment non
interactively.
n/a (sections 2.1.3, 2.1.4 and 2.1.5 do not apply to this project)
2.1.6 System Data Files
Give a precise description of the data files created or maintained by
the system. Thus, for example, you would include files in a database
and you would exclude executable files and text files.
configuration file: Configuration options for the module, including all URLs, parsing
constants, database connection parameters (if any), logging options, etc.
log4j.properties file: IntegraL uses log4j as the logging framework. All logging statements
must conform to this specification. Logging configuration options are also read from the
log4j.properties file, which must be present in the classpath of the application.
Database table structure (if any) used by the module.
NOTE: No option shall be hard-coded into the source code. Delivering code that has
hard-coded constants or configuration parameters will result in a significant
reduction in grade. List all module-specific configuration options here.
System-wide configuration options:
o Configuration for the Mapping Rules Engine.
o Mapping Rules registry – an XML file that contains a dictionary of types and their
applicable mapping rules.
o Logging options for IntegraL
o Database connection parameters
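One way to honor the rule against hard-coded options is to read every option from the configuration file through a small helper. The sketch below is a minimal illustration using the standard java.util.Properties class. The class name and the option keys (xxx.baseUrl, xxx.timeoutSeconds) are hypothetical, and the actual file format and key names for your module are yours to define and document.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sketch: read module options from a properties source instead of hard-coding them. */
final class IntegratorConfigSketch {

    private final Properties props = new Properties();

    /** Loads options from any Reader; in practice this would be a FileReader on the configuration file. */
    IntegratorConfigSketch(Reader source) {
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns a required option, failing loudly if it is missing. */
    String require(String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalStateException("Missing configuration option: " + key);
        }
        return value.trim();
    }

    /** Returns an optional integer option, or the supplied default when the key is absent. */
    int intOption(String key, int defaultValue) {
        String value = props.getProperty(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        // Invented keys and values, inlined here only to keep the example self-contained.
        String text = "xxx.baseUrl=http://xxx.example.org/\nxxx.timeoutSeconds=30\n";
        IntegratorConfigSketch config = new IntegratorConfigSketch(new StringReader(text));
        System.out.println(config.require("xxx.baseUrl"));
        System.out.println(config.intOption("xxx.timeoutSeconds", 10));
    }
}
```

Failing at startup on a missing required option makes configuration mistakes visible early, which is usually preferable to silently falling back to a value buried in the code.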
2.2 Design Specification
This is a top level preliminary or provisional indication of the proposed
system architecture and flow. You should correlate system functions with
system structure and interface specifications.
2.2.1 System Data Flow Diagrams
This is a hierarchical (or leveled) set of diagrams showing the flow of
data elements into and out of the functional units of the program,
data stores and environmental sources and sinks. Labeled arrows
denote data flows. This diagram is complementary to the structure
chart described next. Refer to the sample in Section 3.4 of this
document.
IntegraL is a loosely coupled system built on the IBM Web-Based Intermediary (WBI) proxy
platform. IntegraL is a proxy server that allows users to browse the WWW through it.
[Figure: IntegraL data flow diagram. Labeled components: IntegraL, Request Editor, Web, HTMLTokenizer, Plugin, Mapping Rules Engine.]
When the user browses to a page, the browser sends an HTTP request to the website via the IntegraL
proxy server. IntegraL modifies the request and stores user parameters for user-browsing analysis.
As the destination web server returns its response, the response passes through the IntegraL
tokenizer, which parses the HTML stream into HTML tokens.
The XXX Integrator (a.k.a. the IntegraL plugin) then receives this tokenized stream and marks up its
elements of interest. This marked-up stream of HTML tokens is then sent to the Mapping Rules
Engine (MRE). The MRE parses these marked-up tokens and supplements them with links based on the
semantic type of the elements of interest (specified by the mapping rules, also called linking or
relationship rules).
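The last step of this flow, choosing links by semantic type, can be pictured with the following sketch. It is not the real Mapping Rules Engine: the class, the rule structure (a label plus a URL template) and the example URL are assumptions made purely for illustration, and real linking rules carry more parameters than are shown here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of type-based link generation; not the real Mapping Rules Engine. */
final class MappingRulesSketch {

    /** A simplified linking rule: a label plus a URL template in which {value} is replaced. */
    static final class Rule {
        final String label;
        final String urlTemplate;

        Rule(String label, String urlTemplate) {
            this.label = label;
            this.urlTemplate = urlTemplate;
        }
    }

    // Registry: semantic type -> rules that apply to elements of that type.
    private final Map<String, List<Rule>> registry = new HashMap<String, List<Rule>>();

    /** Adds a rule for the given semantic type. */
    void register(String semanticType, Rule rule) {
        List<Rule> rules = registry.get(semanticType);
        if (rules == null) {
            rules = new ArrayList<Rule>();
            registry.put(semanticType, rules);
        }
        rules.add(rule);
    }

    /** Returns "label -> url" strings for one marked-up element; empty if no rule matches its type. */
    List<String> linksFor(String semanticType, String value) {
        List<String> links = new ArrayList<String>();
        List<Rule> rules = registry.get(semanticType);
        if (rules == null) {
            return links;
        }
        for (Rule rule : rules) {
            links.add(rule.label + " -> " + rule.urlTemplate.replace("{value}", encode(value)));
        }
        return links;
    }

    /** URL-encodes the element text so it can be placed safely in a query string. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        MappingRulesSketch engine = new MappingRulesSketch();
        // Invented service URL standing in for a search command on the XXX system.
        engine.register("author", new Rule("Search XXX by author",
                "http://xxx.example.org/search?author={value}"));
        System.out.println(engine.linksFor("author", "Ada Lovelace"));
    }
}
```

Keeping the rules in a registry keyed by semantic type means an element can receive links from any system that registered a rule for that type, not only from the system whose page it appeared on.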
2.2.2 System Structure Chart(s)
This is a (set of) chart(s) showing the functional units of the system
hierarchically organized to show which units call, use or contain
other units. Each interface between two units (a call) is annotated
with small arrows and data item labels to show the data exchanged
between the units. Refer to the sample in Section 3.4 of this
document.
[The student should add a flowchart describing the core units of the XXX Integrator]
To distinguish your report from your partner’s, you should include the modules from the XXX
system that your project will be integrating within the structure charts.
2.2.3 System Data Dictionary
This is a comprehensive dictionary of all the data items that appear
in the system data flow diagrams and the structure charts. At a
minimum it contains, for each data item, its identifier, any
abbreviation used instead of the identifier, the name of the type of the
data, and a definition of the data item in the form of either a symbolic
expression or a precise description. Refer to the sample in Section
3.4 of this document.
IntegraL's data model is based on the HTTP request/response structure. Each request and each
response consists of a structured part and a stream part. The structured part corresponds to the
header and the stream part corresponds to the body.
HEADER (DocumentInfo)
POST http://www.ibm.com/java HTTP/1.0
User-agent: MyBrowser
Accept: text/html
Content-length: 15
BODY (MegInputStream)
My name is Paul
Figure 1. An HTTP request containing both a header (structured) part and a body (stream) part.
Figure 1 shows an HTTP request that contains both a header and a body. When IntegraL receives
an HTTP request, it is parsed into these two parts. The header information is stored in an object of
class DocumentInfo and the body information is made available through an object of class
MegInputStream. The body information can then be read from the MegInputStream using its read(...)
methods.
HEADER (DocumentInfo)
HTTP/1.0 200 Ok
Server: MyWebServer
Content-type: text/html
Content-length: 36
BODY (MegInputStream)
<html>
<h1>Hello, world</h1>
</html>
Figure 2. An HTTP response containing both a header (structured) part and a body (stream) part.
Figure 2 shows a typical HTTP response. When IntegraL receives this response, it is parsed in the
same way as a request. The header information is stored in a DocumentInfo object and, as with the
request, the body is made available through a MegInputStream object.
To produce new requests and responses, an IntegraL plugin is given a DocumentInfo object and a
MegOutputStream object to manipulate. One may set a property of the DocumentInfo object using either
the setRequestHeader(...) or setResponseHeader(...) methods. The HttpHeader, HttpRequest and
HttpResponse classes are designed to make producing such headers easier.
When the header information has been set appropriately, the IntegraL plugin may begin writing the
body content to the MegOutputStream using its write(...) methods.
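To make the structured-part/stream-part split concrete, here is a small stand-in written with plain java.io classes. It deliberately does not use DocumentInfo, MegInputStream or any other IntegraL or WBI class, because their exact signatures are not given in this document. The class HttpMessageSketch and its members are hypothetical and exist only to show the idea on the request from Figure 1.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for the header/body split; it does not use the real IntegraL classes. */
final class HttpMessageSketch {
    final String startLine;              // request line or status line
    final Map<String, String> headers;   // structured part, keyed by lower-cased field name
    final InputStream body;              // stream part

    private HttpMessageSketch(String startLine, Map<String, String> headers, InputStream body) {
        this.startLine = startLine;
        this.headers = headers;
        this.body = body;
    }

    /** Splits a raw message at the first blank line into header fields and a body stream. */
    static HttpMessageSketch parse(String raw) {
        int split = raw.indexOf("\r\n\r\n");
        String head = split < 0 ? raw : raw.substring(0, split);
        String bodyText = split < 0 ? "" : raw.substring(split + 4);
        String[] lines = head.split("\r\n");
        Map<String, String> headers = new LinkedHashMap<String, String>();
        for (int i = 1; i < lines.length; i++) {   // line 0 is the start line, not a header field
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
                headers.put(name, lines[i].substring(colon + 1).trim());
            }
        }
        return new HttpMessageSketch(lines[0], headers,
                new ByteArrayInputStream(bodyText.getBytes(StandardCharsets.UTF_8)));
    }

    /** Reads the remaining body stream into a string; a second call returns an empty string. */
    String readBody() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            int n;
            while ((n = body.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "POST http://www.ibm.com/java HTTP/1.0\r\n"
                + "User-agent: MyBrowser\r\nAccept: text/html\r\nContent-length: 15\r\n"
                + "\r\nMy name is Paul";
        HttpMessageSketch msg = parse(raw);
        System.out.println(msg.headers);
        System.out.println(msg.readBody());
    }
}
```

The body is exposed as a stream rather than a string because, as in IntegraL, a plugin usually reads the body incrementally instead of holding a whole page in memory.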
2.2.4 Equipment Configuration
Describe the equipment you will use to support the operation and
development of your system.
All development will be on standard PCs or laptops, using a Java editing environment. Code will
be checked into CVS. The final product will run on Linux with an Apache server. Development
work can be conducted on any platform, however, and ported to the test and production server.
2.2.5 Implementation Languages
List the programming languages you plan to use for the
implementation of your project and give reasons for choosing each
language.
Implementation will be in HTML, XML and Java. Pages are displayed to Web browsers in HTML.
XML is a standard industry message passing format and the prescribed format for the IntegraL
system. Java is the standard programming language used within the IntegraL system and thus
required for compatibility with the system.
2.3 Implementation Plan
This is a description of the plan for implementing the project. Here you
commit yourself to a course of action and specify the criteria by which your
performance is to be judged. Your final grade will depend, in large measure,
upon your success in achieving the goals agreed upon between you and your
project advisor.
2.3.1 Deliverable Items
List and describe each of the items you will submit in fulfillment of
the project requirements. Deliverable items include, but are not
limited to, program executable file(s), program data file(s), program
listings, program documentation, user manual and sample program
runs.
1. XXX Integrator system that parses the following types of pages generated by the XXX system:
Here list the types of pages in the XXX system that you will integrate. Carefully go through the
XXX system and list each kind of page the system generates - including the home page, query
pages, query results, content pages, help pages, etc. Note that most systems have a set number
of pages between 8 and 30. Most screens will fall under one of these types. For people working
in pairs on their system, split these pages between the two project proposals. Also list the kinds
of elements on each page that IntegraL could place links on (using linking rules - see #2 below).
Note that these elements may receive links to services from other systems, so don’t worry if you
cannot think of any services from the XXX system for a given element type. Include it anyway.
To distinguish your report from your partner’s, you should provide a detailed description (1
paragraph or more) of each of these pages. After the page description, list the actual elements
on that page that you will place links on.
2. Linking rules for the following services and elements within the XXX system:
Here identify the kinds of services that the XXX system can provide for each type of
element-of-interest or object. Each service for each type of element will be one linking rule (also
called a mapping rule). For example, if a library system were presented with a keyword, it could search
for all documents with that keyword. If a library system were presented with an author, it could
search for all documents written by that author. If a library system were presented with an ISBN, it
could check whether it has that ISBN available. Figure out all the kinds of services your system
provides for any kind of object. For people working in pairs on their system, split these services
between the two project proposals.
To distinguish your report from your partner’s, you should provide a detailed description of each
linking rule. Note from the PowerPoint presentation and some of the documentation that each
linking rule has 6 parameters. Give your first cut at the values of these 6 parameters. For the
relationship metadata, you can give a short “semantic description” of the link. For the condition,
you can state “only for authenticated users” if XXX is a subscription database that the library pays
for, or “none” if it is a generally available system.
3. The names and locations of any glossaries and thesauri the XXX system provides:
See if any kind of glossary or thesaurus is available within your system and list these here. If so
state that you will provide the IntegraL team with details about these glossaries/thesauri. Not all
systems will have glossaries/thesauri. For people working in pairs on their system, only one of
you should list this.
Only one partner in a team will work on glossary/thesauri integration. This is a fairly trivial task.
This should not appear in the proposal of the other partner anywhere. Include a description of any
glossaries or thesauri the XXX system has - their location and what they contain. Write this
yourself. It should not be the same as the description for the integrations that other teams are
doing.
4. Search integration for the XXX Integrator.
Most systems provide a search API. You will identify the search API that the XXX system uses
and provide it to the IntegraL team. If the XXX system has no search API then you will need to
write a search wrapper, which would be similar in detail to 2 page parsers. (We’ll help you do
this.) Determine whether your system has a search API for your proposal. For people working
in pairs on their system, one of you must do this. If your system has a search API, then this will
be no additional work for you once identified. If your system needs a search wrapper, one
partner should plan to integrate 3 fewer pages (item #1) than the other partner.
Only one partner in a team will work on search integration. This should not appear in the proposal
of the other partner anywhere. Include a description of the integration you will have to perform.
Write this yourself. It should not be the same as the description for the integrations that other
teams are doing.
5. Program code for the XXX Integrator.
6. Detailed documentation for the XXX Integrator.
7. Detailed documentation for each linking rule.
8. Detailed documentation of search interface.
Only one partner in a team will work on search integration.
9. others?
Only include this if you know what “others” you are going to do, and then list them explicitly!
2.3.2 Milestone Identification
Identify each of the milestones or check points that mark the
completion of some phase of project implementation. Milestones
include, but are not limited to, detailed system analysis, system
design, file design, module design, system test design, module coding,
working breadboard with stubs, working system with stubs, system
testing and documentation.
List each of the deliverables from section 2.3.1 in the following items.
1. Detailed analysis of XXX system.
2. Detailed analysis of how to parse each type of page.
3. Detailed analysis of each linking rule.
4. Implementation of parsing each type of page.
5. Implementation of each linking rule.
6. Integration of search into IntegraL’s MetaFed environment.
7. whatever else is relevant (e.g., from paragraph description above)
Only include #7 if you know “whatever else” you are going to do, and then list them explicitly!
2.3.3 Milestone Completion Criteria
List the criteria by which the completion of each milestone is to be
judged. If an objective measure is available then it should be
specified. If a personal judgment is required then indicate who will
make the determination. This information may be given in tabular
form if desired.
Read this and only include what is relevant to your project. Don’t just copy the text directly.




o Deliver a technical specification describing the class structure, data flow diagram and
database structure (if any) by Week 5.
o During development, review any issues or problems you may be having with the Project Leader.
o Deliver a working version of your project by Week 12. If your project cannot be executed, this
should have been made clear to the Project Leader during the development phase. There
should not be any surprises!
o Deliver a project report in the departmental format by the end of Week 13.
2.3.4 Schedule of Milestone Completion
Prepare a diagram or table giving the proposed completion date for
each of the milestones listed in the previous two sections. See the
sample in section 3.4 of this document.
List each of the deliverables from section 2.3.1 in the following items.





o Weeks 1-3: Research & Analysis
o Weeks 3-5: Preparing Technical Specification, obtaining signoff from the Project Leader
o Weeks 6-11: Development
o Week 12: Testing and final review
o Week 13: Final changes, if any
5. Grading Criteria
In this section you establish and define the criteria governing the grading of your
project. Here you specify the relative emphasis you wish to be placed upon the
different phases of your project. Assign a weight to each of the deliverable items
and/or milestones listed in section 5.3 of the proposal so that the weights sum to
one. Display this information in a table which your advisor will use in determining
your grade for the project. Refer to the sample in section 3.4 of this document.
You can note that the following criteria have been assigned by Professor Bieber.




o Does the student show an understanding of the basic principles of IntegraL? This must be
clear from the documentation to be delivered – 40%
o Is the project functional? – 10%
o Programming style – 10%
o Documentation (technical specification, project report, code documentation, comments, etc.) – 40%