Good_Final

Stukova 1
Summer 2010 CS 350 Final Exam
By Inna Stukova
1. The client has decided that it may be desirable to add additional rows of tiles to
the existing structure with the result being a protein sheet. We currently have a
row which describes the edge of the protein sheet and a row that can be built
upon. The client wants to be able to input an integer R, which represents the
number of rows of length N, and search for and display the largest protein sheet
possible given a particular data file.
New Requirements:
- User-inputted R that represents the number of rows for the protein sheet.
  Thus, the protein sheet will consist of an initial strand and R-1 complement
  strands of length N.
- Similarly to the initial requirement, each complementary strand will begin and
  end with a Type Single tile and consist of Type Zero tiles.
Proposal:
The new requirements posed by the user will affect the following epics: User
Input, Solver, and Validation. The Creation epic will remain unchanged since
there are no new requirements about the tile and bucket structure.
User Input Epic:
As set up initially, the Helix application allows a file name to be passed in as a
parameter. This file enables the user to input tile data directly for the Type
Double, Type Single, and Type Zero buckets. The user will then be prompted to
specify N, the length of the protein sheet, and R, the number of rows for the protein
sheet. The user will have an option to enter R and thus receive an output as a
protein sheet of size NxR, or to let the program search for and display the largest
protein sheet possible without specifying R.
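The optional R prompt could be handled along the following lines. This is only a minimal sketch: the function name `parse_rows` is hypothetical, and it assumes an empty entry (pressing Enter) means "search for the largest sheet", as described in story card S-U1.

```python
from typing import Optional


def parse_rows(raw: str) -> Optional[int]:
    """Return R as a positive int, or None if the user just pressed Enter.

    Hypothetical helper: None signals "search for the largest sheet".
    """
    text = raw.strip()
    if text == "":
        return None  # no R given
    rows = int(text)  # raises ValueError if the entry is not an integer
    if rows <= 0:
        raise ValueError("R must be greater than 0")
    return rows
```

A caller would pass the raw line it read from the prompt, for example `parse_rows(input("Rows R (Enter for largest): "))`.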
Solver Epic:
The initial strand of size N is created (solved) in accordance with the initial
design. The left side Type Double tile is the root of a created tree. From left to
right, each tile is added to the previous tile until the number of tiles reaches N-1.
A Type Double tile is the last in the strand of size N. The complement strand is then
found according to the initial story cards. The bonding rules remain unchanged.
However, after the first complement strand is found, the same process is
repeated to solve for the rest of the complement strands until a protein sheet of
size NxR is built. If the user does not specify R and chooses to find the largest protein
sheet possible given a particular data file, then the process of solving the
complement strands will stop only if there are no more possible complement
strands to create.
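The stopping rule for the Solver can be sketched as a simple loop. The names `build_sheet` and `next_complement` are assumptions made for illustration; `next_complement` stands in for the existing complement-strand solver and is expected to return `None` when no further strand can be built.

```python
from typing import Callable, List, Optional

Strand = List[str]  # a strand is modelled here as a list of tile names


def build_sheet(
    initial: Strand,
    next_complement: Callable[[Strand], Optional[Strand]],
    rows: Optional[int] = None,
) -> List[Strand]:
    """Add complement strands until the sheet has `rows` rows, or, when
    `rows` is None, until no more complement strands can be built."""
    sheet = [initial]
    while rows is None or len(sheet) < rows:
        strand = next_complement(sheet[-1])
        if strand is None:
            break  # tiles exhausted: no further complement strand exists
        sheet.append(strand)
    return sheet
```

When R is specified, the caller can compare `len(sheet)` with R to see whether a full NxR sheet was actually reached.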
Validation Epic:
The validation stage will check the built protein sheet against the other sheets
created. In case of specified R, if there are any tiles in the new sheet that
correspond in edges to the tiles in the previous sheet, the new protein sheet will
be disregarded as a valid solution. If the user did not specify R, the Validation
process will also compare each protein sheet in terms of the number of rows in
the sheet. The output of this process, the largest protein sheet, will then be
displayed to the user. Otherwise, the output will be all possible solutions. The
exact format of the output is still under consideration.
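A rough sketch of the two validation steps follows. It assumes, for illustration only, that a sheet is stored as a tuple of rows and that two sheets are duplicates when they are identical; the edge-correspondence rule described above would replace that equality test in the real program. Both function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Sheet = Tuple[Tuple[str, ...], ...]  # rows of tile names (assumed layout)


def unique_sheets(sheets: Sequence[Sheet]) -> List[Sheet]:
    """Drop repeated sheets, keeping the first occurrence of each."""
    seen = set()
    kept = []
    for sheet in sheets:
        if sheet not in seen:
            seen.add(sheet)
            kept.append(sheet)
    return kept


def largest_sheet(sheets: Sequence[Sheet]) -> Sheet:
    """Return the sheet with the most rows (first one wins on a tie)."""
    return max(sheets, key=len)
```

With R specified, only `unique_sheets` would be needed; without R, `largest_sheet` picks the solution to display.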
The following Story and Test cards are added:
S-U1
Prompt the user to enter the number of rows for the desired protein sheet.
Allocate the space for integer R. The user will have an option to skip this
step by pressing Enter. In this case the solution will be the largest possible
protein sheet.
Estimated LOC: 20
Estimated Time: 30 min
T-U1
Check integer R to make sure that R>0.

S-S1
Once the first complement strand is built, continue the same process of solving
the rest of the complement strands until the number of rows reaches R, if R is
specified, or until there can be no more complement strands built.
Estimated LOC: 50
Estimated Time: 75 min
T-S1
If R is specified, check that the number of complement strands is equal to R-1.

S-V1
Check for duplicate protein sheet solutions.
Estimated LOC: 30
Estimated Time: 45 min
T-V1
Ensure that only duplicates are removed from the list and all other solutions
are left untouched.

S-V2
If R is not specified, check each protein sheet solution for the number of rows.
Output the solution with the largest protein sheet built.
Estimated LOC: 30
Estimated Time: 45 min
T-V2
Calculate R for each protein sheet solution. Find the max R and compare it to
the R of the output for accuracy.
Project Plan Summary
Epics          Initial LOC   Added LOC   Total   Total % Increase
User Input     200           20          220     9.1
Random Input   410           0           410     0
Creation       715           0           715     0
Solver         495           50          545     9.2
Validation     85            60          145     41.4
Total          1905          130         2035    6.4%
2. The client has been contacted by the FBI and has been asked to modify our code to
be used on fragmented DNA sequences of length N. Imagine that instead of an
empty list of tiles we are provided with a number of substring fragments that may
occur in either of the two rows. The buckets now contain the DNA acid
components and an exhaustive list is provided. In some special cases, letters
besides A, T, C, and G are present in a sequence. These letters represent
ambiguity. Of all the molecules sampled, there is more than one kind of
nucleotide at that position. What is desired is for the user to receive a report of
all possible solutions; of equal importance is a report of what is NOT possible, in
order to find any negative matches within a set of potential DNA sequences and
eliminate suspects. False negatives are not acceptable. For example, given a
particular input, the substring AATC may not be possible in the existing gaps.
New Requirements:
- Create tiles of types A, T, C, G, and N
- Create buckets for each tile type
- Create look up list for DNA bonding rules
Proposal:
The new requirements will eliminate the Random Input epic as the information to
be analyzed will be provided in a user-specified file. All other epics will be
affected in the following manner:
Creation:
Tile creation: a tile is an object of type A, T, C, G, or N.
Bucket creation: five bucket objects are created, one for each tile type.
User Input Epic:
The user will be prompted to enter the name of the file that will provide a number
of substring fragments that occur in either of the two rows. The information
provided will look similar to this:
(Row 1) A C G A T
(Row 2) T   C
Solver Epic:
Below is the look up table that will be available for the Solver Epic as well as the
Validation Epic.
A = adenine
C = cytosine
G = guanine
T = thymine
R = G A (purine)
Y = T C (pyrimidine)
K = G T (keto)
M = A C (amino)
S = G C (strong bonds)
W = A T (weak bonds)
B = G T C (all but A)
D = G A T (all but C)
H = A C T (all but G)
V = G C A (all but T)
N = A G C T (any)
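The look up table above maps directly onto a dictionary. The sketch below is one possible encoding; the names `IUPAC`, `expand`, and `could_match` are illustrative and not part of the existing code.

```python
# Each code maps to the set of plain bases it may stand for (table above).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "GA", "Y": "TC", "K": "GT", "M": "AC",
    "S": "GC", "W": "AT",
    "B": "GTC", "D": "GAT", "H": "ACT", "V": "GCA",
    "N": "AGCT",
}


def expand(code: str) -> set:
    """Return the set of bases a (possibly ambiguous) code can represent."""
    return set(IUPAC[code.upper()])


def could_match(code: str, base: str) -> bool:
    """True if `base` is one of the bases allowed by `code`."""
    return base.upper() in expand(code)
```

For example, `could_match("B", "A")` is false because B means "all but A". Treating an ambiguous position as a set of candidates, rather than guessing one base, helps avoid ruling out a sequence that is in fact possible.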
Following the rule of DNA bonding (A binds with T and C binds with G), the
Solver will first analyze the first row. For each tile in the top row it will check the
tile in the bottom row. If the tile is correct, it will move on to the next tile in the top
row; if the bottom row tile is missing, the appropriate tile will be inserted. If the top
tile is absent, the Solver will analyze the next tile pair. Once the top row ends, the
Solver will then move on to the bottom row and pair every bottom tile with the
appropriate tile on the top. It will skip all blanks.
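The two passes described above can be sketched in a single loop over tile positions. Several things here are assumptions for illustration: rows are equal-length strings, a blank is written as `_`, only the plain bases A, C, G, and T are handled, and a mismatched pair raises an error (the proposal does not say what happens in that case).

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
BLANK = "_"  # assumed marker for a missing tile


def fill_rows(top: str, bottom: str):
    """Fill each blank whose partner tile is known; leave double blanks."""
    if len(top) != len(bottom):
        raise ValueError("rows must have the same length")
    new_top, new_bottom = list(top), list(bottom)
    for i, (t, b) in enumerate(zip(top, bottom)):
        if t != BLANK and b != BLANK:
            if COMPLEMENT[t] != b:  # both present: they must bond
                raise ValueError(f"tiles at position {i} do not bond")
        elif t != BLANK:
            new_bottom[i] = COMPLEMENT[t]  # top-row pass
        elif b != BLANK:
            new_top[i] = COMPLEMENT[b]  # bottom-row pass
    return "".join(new_top), "".join(new_bottom)
```

Using the example from story cards S-S1 and S-S2, `fill_rows("ACG_T_AG", "___A____")` gives `("ACGTT_AG", "TGCAA_TC")`; position 5 stays blank in both rows because neither tile is known.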
Validation Epic:
This process will search for every invalid solution in the following way. If the
following sequence is to be validated:
AACT GA
TTGA CT
such a solution will be considered invalid due to the break in the strand.
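A break check of this kind might look like the sketch below. It assumes the two rows are equal-length strings in which a blank is written as `_`; the function names are hypothetical.

```python
def break_positions(top: str, bottom: str, blank: str = "_") -> list:
    """Return the positions where either row still has a blank."""
    return [
        i for i, (t, b) in enumerate(zip(top, bottom))
        if t == blank or b == blank
    ]


def is_valid_solution(top: str, bottom: str) -> bool:
    """A solution is valid only if neither strand has a break."""
    return not break_positions(top, bottom)
```

For the sequence above, written as `AACT_GA` over `TTGA_CT`, the break is reported at position 4 (counting from 0), so the solution is rejected.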
The following Story and Test cards will be in effect for this project:
S-U1
Prompt the user to enter the name of the file that contains a number of
substring fragments that may occur in either of the two rows.
Estimated LOC: 5
Estimated Time: 10 min
T-U1
Check that the file specified by the user does exist.

S-C1
Create a tile object. Four tile types are A, C, G, T.
Estimated LOC: 30
Estimated Time: 45 min
T-C1
Verify that only types A, C, G, and T are accepted by the program.

S-C2
Create a bucket for tile type A. Implement bucket as an indexed list (vector).
Read a line from the file and call tile builder passing it the information. Add
the returning tile to the bucket.
Estimated LOC: 60
Estimated Time: 90 min
T-C2
Check that the bucket only contains tiles of type A.

S-C3
Create a bucket for tile type C. Implement bucket as an indexed list (vector).
Read a line from the file and call tile builder passing it the information. Add
the returning tile to the bucket.
Estimated LOC: 60
Estimated Time: 90 min
T-C3
Check that the bucket only contains tiles of type C.

S-C4
Create a bucket for tile type G. Implement bucket as an indexed list (vector).
Read a line from the file and call tile builder passing it the information. Add
the returning tile to the bucket.
Estimated LOC: 60
Estimated Time: 90 min
T-C4
Check that the bucket only contains tiles of type G.

S-C5
Create a bucket for tile type T. Implement bucket as an indexed list (vector).
Read a line from the file and call tile builder passing it the information. Add
the returning tile to the bucket.
Estimated LOC: 60
Estimated Time: 90 min
T-C5
Check that the bucket only contains tiles of type T.

S-S1
Starting with the top row, check each tile for its complemented tile on the
bottom row and fill the bottom row according to the look up table provided.
Skip the blanks and move to the next tile until the end of the row.
Example
ACG T AG
↓↓↓ ↓ ↓↓
TGCAA TC
Estimated LOC: 200
Estimated Time: 300 min
T-S1
Check the look up table to ensure that all tiles are matched correctly with
their complements.

S-S2
Starting with the first tile on the bottom row, check each tile for its
complement on the top row and fill in the top row blanks with the appropriate
tiles according to the look up table provided. If there is a blank, skip it.
Example
ACGTT AG
   ↑
TGCAA TC
Estimated LOC: 200
Estimated Time: 300 min
T-S2
Check the look up table to ensure that all tiles are matched correctly with
their complements.

S-V1
Check the solution for blanks. If found, the solution becomes invalid as there
is a break in the DNA strand. If no breaks are found, display the completed
DNA strand as a valid solution.
Estimated LOC: 100
Estimated Time: 150 min
T-V1
Check that only solutions with breaks in the strand are considered invalid.
Project Plan Summary
Epics          Initial LOC   New LOC   New Total %
User Input     200           10        5
Random Input   410           0         0
Creation       715           270       37.8
Solver         495           400       80.8
Validation     85            100       117.6
Total          1905          780       40.9%
3. The client is interested in adapting the software to move from 4 bonds on each
tile to 6 bonds on a cube. Furthermore, these bonds can rotate on a surface
(imagine a box with a pencil in one side; you can spin the box around that axis a
full 2 PI radians). This can lead to strange three-dimensional tree-like structures,
so just as our 2-d tiles have a North orientation, each cube has a north face and
each face has a rotation angle around one of the three Cartesian coordinate axes
(e.g., 6 reference angles for each cube structure). Additionally, there will now be 4
buckets, one for each of the three current buckets and one more for 3 blocks.
The 3 block-cubes are arranged so that all surfaces with a block share a
common vertex (or corner). As in the original project, a size N is entered and all
possible 3d solutions of that size are presented.
New Requirements:
- Create an additional Type 3 bucket that will contain 3-block cubes, Type 3
- Solve and verify all possible 3-d solutions.
Proposal:
To satisfy the new requirements, no changes have to be made to the existing
epics. However, additions are necessary for all epics in the following way.
User Input:
The user input file enables the user to input data directly for the Type 3, Type 2,
Type 1, and Type 0 buckets, rather than have it randomly generated. Essentially, the
input file is split into four new files, one for each tile type. In addition, the user will
still be prompted to specify N, the length of a strand that forms a solution. The
user will also have to specify whether a 2d or 3d solution is desired. If a 2d
solution is desired, then the program will follow the original algorithm for solving
for initial and complement strands. If a 3d solution is requested, the program will
follow a new algorithm.
Creation:
In addition to creating 2d tiles in accordance with the initial story cards, a new 3d
cube with 3 blocks will be created. The cubes are arranged so that all surfaces
with a block share a common vertex. The structure of the cube consists of the
following information: ID (char Type T, int seqNum), Name, Orientation. A new
bucket will also be created to contain these Type T cubes.
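The common-vertex rule can be checked with a small helper. Three faces of a cube meet at one corner exactly when no two of them are opposite each other. The sketch below assumes a cube's keys are stored in a dictionary keyed by face name and that a block is stored as key 0 (as in story card S-R1); the face names and the function name are illustrative.

```python
OPPOSITE = {
    "top": "bottom", "bottom": "top",
    "front": "back", "back": "front",
    "left": "right", "right": "left",
}


def is_type3_cube(faces: dict) -> bool:
    """True if exactly three faces are blocks (key 0) sharing a vertex."""
    if set(faces) != set(OPPOSITE):
        return False  # all six faces must be present
    blocks = [name for name, key in faces.items() if key == 0]
    if len(blocks) != 3:
        return False
    # Faces share a corner only if none of them is opposite another.
    return all(OPPOSITE[name] not in blocks for name in blocks)
```

A check like this could back test cards T-R1 and T-C1, which verify that a cube has exactly three block keys.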
Random Input:
In addition to randomly creating Type 2, Type 1, and Type 0 tiles, a new
algorithm will be implemented to randomly create 3-block cubes, Type 3.
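One way to generate such a cube at random is to pick one face from each pair of opposite faces as a block, which guarantees that the three blocks share a corner. The sketch below follows story card S-R1 for the key range (-11 to 11, excluding 0) and the name format; the dictionary layout and function name are assumptions, and the Orientation field is left out.

```python
import random

OPPOSITE_PAIRS = (("top", "bottom"), ("front", "back"), ("left", "right"))
NONZERO_KEYS = [k for k in range(-11, 12) if k != 0]


def random_cube(seq_num: int, rng=None) -> dict:
    """Build a random Type 3 cube: 3 block faces (key 0) at one corner."""
    if rng is None:
        rng = random.Random()
    faces = {}
    for pair in OPPOSITE_PAIRS:
        # One face of each opposite pair becomes a block, the other a key.
        block, other = rng.sample(pair, 2)
        faces[block] = 0
        faces[other] = rng.choice(NONZERO_KEYS)
    return {
        "type": "T",
        "seq_num": seq_num,
        "name": f"RandT{seq_num}",
        "faces": faces,
    }
```

Passing a seeded `random.Random` instance makes the output repeatable, which is convenient when writing the T-R1 checks.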
Solver:
The Solver will build an initial strand that meets the following requirements: it has
the specified length N and consists of cubes from the Type 3 bucket; the beginning
cubes will have blocks on the top, front and left sides and the end cubes will have
blocks on the top, front and right sides. Each interior cube must have a block on
top, a left key that complements the right key of the cube on its left, and a
right key that complements the left key of the cube on its right. The complement
strand will be built in a similar way. Additionally, each cube’s top key must
complement the bottom key of the appropriate cube from the initial strand. The
first and last cubes of the complement strand will have blocks on the left and right
sides respectively.
Validation:
Similarly to the initial algorithm, the validation process will check for
duplicates. If there are any cubes in the new solution that correspond in edges to
cubes in previous solutions, the new solution will be disregarded.
The following Story and Test cards will be added to the project:
S-U1
Prompt the user to specify whether the 2d or 3d solution is needed.
Estimated LOC: 10
Estimated Time: 20 min
T-U1

S-R1
If the user chooses to randomly generate the tiles and cubes, build the 3d cube
with the following attributes:
ID (char Type T, int seqNum), Name (RandT + seqNum), Orientation.
Fill Type with T.
Assign 0 key to any three sides of the cube that share a common corner;
randomly assign numbers from -11 to 11 excluding 0 to the other sides of the cube.
Output cubes to the cube data file.
Estimated LOC: 100
Estimated Time: 150 min
T-R1
Check that Type field is T.
Only three sides contain a block key.
All other sides contain non-zero integers from -11 to 11.

S-C1
Create 3d cubes Type 3. Each cube has three block keys that share a common
vertex. The structure of the cube consists of the following information:
ID (char Type T, int seqNum), Name (RandT + seqNum), Orientation. Each cube
will look similar to this:
0 0 5 0 4 1
Estimated LOC: 200
Estimated Time: 300 min
T-C1
Verify that Type 3 cube is created.
Verify that there are 3 block keys.

S-C2
Initialize a Type 3 bucket for the Type 3 cube. Implement bucket as an indexed
list (vector). Read the cube data input file. Each line contains the input data
to build one cube.
Estimated LOC: 150
Estimated Time: 200 min
T-C2
Verify that the bucket contains only Type 3 elements.

S-S1
Initial and complement strands of length N consist of only Type 3 elements. The
beginning cubes will have blocks on the top, front and left sides and the end
cubes will have blocks on the top, front and right sides. Each interior cube must
have a block on top, a left key that complements the right key of the cube on
its left, and a right key that complements the left key of the cube on its right.
Estimated LOC: 300
Estimated Time: 400 min
T-S1
Check that each key of the interior cubes is matched correctly with its
complement. Check that the beginning cubes have blocks on the top, front and
left sides and the end cubes have blocks on the top, front and right sides.

S-S2
The complement strand is built similarly to the initial strand. Each cube’s top
key must complement the bottom key of the appropriate cube from the initial
strand. The first and last cubes of the complement strand will have blocks on the
left and right sides respectively.
Estimated LOC: 300
Estimated Time: 400 min
T-S2
Check that each key of the interior cubes is matched correctly with its
complement in the complement strand as well as with its initial strand cube
complement. Check that the beginning cubes have blocks on the left side and the
end cubes have blocks on the right side.

S-V1
If there are any cubes in the new solution that correspond in edges to cubes in
previous solutions, the new solution will be disregarded. Output the valid 3d
solution to the user.
Estimated LOC: 50
Estimated Time: 75 min
T-V1
Check that only duplicate solutions are eliminated.
Project Plan Summary
Epics          Initial LOC   Added LOC   Total   Total % Increase
User Input     200           10          210     4.8
Random Input   410           100         510     19.6
Creation       715           350         1065    32.9
Solver         495           600         1095    54.8
Validation     85            50          135     37.0
Total          1905          1110        3015    36.8%