CS 205 - Programming for the Sciences, Spring 2011 - Final Exam Programming Project

50 points
Out: April 20, 2011
Due: May 2, 2011 (Monday of finals week), no later than 9:00am, no late work accepted
Honor Code
This final exam programming project is to be your own work. Any assistance you receive must be from the instructor. Assistance may be "purchased" for a penalty of 0-3 points depending on the type and complexity of the assistance. Questions regarding interpretation and clarification of the assignment or provided code will not incur penalties and are encouraged. Questions regarding previous assignments and examples also will not incur penalties.
The last class period (Monday, April 25) will be an open lab, project work day. Assistance received during the class period will be heavily "discounted" and likely will not incur any penalty. This is to encourage students to start early on the project and to ask questions of interpretation and clarification whose answers can then be conveyed to the entire class.
Note: Example usages of almost all of the code you are asked to write for this project are contained in the in-class exercises, programming assignments and projects, and (practice) exams given out throughout the entire term.
Logistics
While this project is large, it consists of pieces that are similar to programs that we have written before. Lecture notes from the following projects may be particularly illustrative:
● Simple RPN Calculator: popping up message boxes in response to errors
● Point Class Demo: class constructors, properties, and methods
● Graph Function: translating world coordinates to screen coordinates; drawing lines between a series of points
● Game of Life: reading data from a file
● Traveling Salesman Problem: finding the minimum value
● Draw Polygons: use of ArrayList
The project is laid out below as a series of parts to be completed. It is suggested that you do the parts in the order given as generally an earlier part must be completed to make a later part work. After each part, you should be able to test your program and see that what you have written works.
The instructor will have regular office hours 2pm-4pm on Tuesday, April 26, and extra office hours 2pm-4pm on Thursday, April 28, and Friday, April 29. If you are not able to meet during these times, send email to the instructor to make an appointment.
When you have finished your project or on Monday, May 2, at 9:00am, whichever comes first, please make sure your name is in the comments as indicated in both the NameSurfer.cs and NameData.cs files. Create a zipfile of your entire NameSurfer project folder and submit it using the submission system as explained in the handout Submission Instructions for CS 205. NO LATE WORK WILL BE ACCEPTED. Final exam programming project scores will be posted to Blackboard and final course grades will be posted to WebAdvisor no later than 5pm on Wednesday, May 4.
Note: As this project is in place of the final exam, it is worth 10% of the final course grade. Final grades are based on the final weighted score percentage as explained in the syllabus. The grading scale will be no higher than 90/80/70/60 and may be lower depending on overall class performance.
Background
The Social Security Administration provides a neat web site showing the distribution of names chosen for children in the US (http://www.ssa.gov/OACT/babynames/). Among the statistics presented is data giving the 200 most popular boy and girl names for children born in the US for each decade starting with the 1880s. For this project, we will use similar data starting from the 1900s for the top 1000 names. The data has been boiled down to a single text file with a format as shown below. On each line we have the name, followed by the rank of that name in the decades starting in 1900, 1910, 1920, ..., 2000 (11 numbers). A rank of 1 was the most popular name that decade, while a rank of 997 was not very popular. A rank of 0 means the name did not appear in the top 1000 that decade at all. The elements on each line are separated from each other by a single space. The lines are in alphabetical order, although we will not depend on that. Here is a sample of the file:
...
Sam 58 69 99 131 168 236 278 380 467 408 466
Samantha 0 0 0 0 0 0 272 107 26 5 7
Samara 0 0 0 0 0 0 0 0 0 0 886
Samir 0 0 0 0 0 0 0 0 920 0 798
Sammie 537 545 351 325 333 396 565 772 930 0 0
Sammy 0 887 544 299 202 262 321 395 575 639 755
Samson 0 0 0 0 0 0 0 0 0 0 915
Samuel 31 41 46 60 61 71 83 61 52 35 28
Sandi 0 0 0 0 704 864 621 695 0 0 0
Sandra 0 942 606 50 6 12 11 39 94 168 257
...
We see that "Sam" was #58 in 1900 and is slowly moving down. "Samantha" popped on the scene in 1960 and is moving up strong to #7. "Samuel" has been fairly popular throughout the century. "Samir" barely appears in 1980, but in 2000 is up to #798. The database is for children born in the US, so ethnic trends tend to show up when immigrants have children.
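As a quick check on the file format, the following minimal sketch finds the first decade in which a name has a non-zero rank. The class and method names (SampleLineSketch, FirstRankedYear) are hypothetical and are not part of the provided project; the start year of 1900 is taken from the format description above.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the project files.
public static class SampleLineSketch
{
    // Returns the first year of the earliest decade with a non-zero rank,
    // or -1 if every rank on the line is 0.
    public static int FirstRankedYear(string line)
    {
        const int startYear = 1900;            // first decade in the data file
        string[] data = line.Split(' ');       // data[0] is the name
        for (int i = 1; i < data.Length; i++)
        {
            if (int.Parse(data[i]) != 0)
            {
                return startYear + 10 * (i - 1);
            }
        }
        return -1;
    }
}
```

For the sample line for Samantha, this returns 1960, which matches the reading of the data given above.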
Ultimately, we want to organize the data to graph it as shown below (with the names Sam and Samantha; the figures are shrunk from the actual interface, so they are a little fuzzy). There are around 4500 names in the database. Each name has at least one decade where it was ranked in the top 1000. The data just records literally what people put on the forms, so there are things like "A" and "Baby" recorded as names (the data is more cleaned up in the later years). We will not worry about that, and we will not combine names that are similar in some sense. E.g., "Cathy" and "Catherine" and "Kathryn" and "Katie" and "Kati" will all count as different names.
Project Overview
The project file NameSurfer.zip is to be downloaded from the course webpage. The application will run, but none of the buttons do anything. The goal of this project is to complete this application. The application consists of the following files:
● Point.cs: this is the same Point class we have been using for other graphics projects and is complete. The only difference from previous versions is that this definition is in the NameSurfer namespace.
● NameData.cs: the NameData class holds the data for one name. You are responsible for declaring the private data of this class, completing the explicit-value constructor, and writing several properties and methods for this class.
● NameSurfer.cs: this is the main form program file. Several methods are provided. You are responsible for declaring the ArrayLists that will hold the name data, completing the default constructor, and writing the button Click event handlers and the panel Paint event handler.
There are comments in the files that indicate where the code for the various parts is to be added.
Part 1a (20 points): NameData class
The NameData class is used to encapsulate the data for one name: the name and its ranks over the decades. This is essentially the data of one line from the file shown above. The start of the NameData class is contained in file NameData.cs. It currently contains the following items:
● Definition of public integer constants NUM_DECADES, START_YEAR, and MAX_RANK, so that if we were to change the number of decades, the year of the first decade, or the maximum possible rank, respectively, we can do so easily by changing the constants. Outside of the class file, these constants are accessed with the class name as a prefix, e.g., NameData.MAX_RANK. These constants are currently defined with values 11, 1900, and 1000, respectively, to match the current data file specifications.
● The start of an explicit-value constructor that takes a string argument. The argument must be in the format of a line of the data file (i.e., a name followed by 11 numbers). The code for splitting this string between the spaces into a local array of strings (data) where each element is one of the "words" of the input string has been provided. (I.e., data[0] will be the name, data[1] will be a string with the digits of the rank of the first decade, etc.)
You are to implement the following for the NameData class (i.e., all of this code goes in NameData.cs):
● Declare two private attributes where indicated. One is a string for the name (name); the other is an array of integers to store the rank numbers (rank). The rank attribute should be initialized to a new array of NUM_DECADES elements (using the new operator).
● Complete the explicit-value constructor to store the data into these attributes. As noted above, data[0] stores the name. For the rank numbers you will need to loop through the data elements indexed 1 to NUM_DECADES, parsing each data string into an integer rank to be stored in the rank elements indexed 0 to NUM_DECADES-1.
● A public property Name with only a get operation that returns the name attribute.
● A public method RankInDecade that receives an integer representing a decade and returns the rank of the name in the given decade (also an integer). We will use the convention that an argument of 0 is the START_YEAR decade (currently 1900), an argument of 1 is the next decade (currently 1910), and so on. Hint: compare the index of the rank for each decade with the argument value.
● A public method BestRank that does not receive anything and returns the (integer) rank of the decade where the name was most popular, using the earliest decade in the event of a tie. For example, from the data above, Sam's rank when it was most popular (1900) is 58 and Samantha's is 5 (from 1990). Note: we are looking for the minimum value in the rank array. However, a rank of 0 represents a rank number beyond the maximum rank, so it should be treated the same as the maximum rank and not as the minimum value. It is safe to assume that every name has at least one decade with a non-zero rank.
● A public method BestDecade that does not receive anything and returns the first year of the decade where the name was most popular (i.e., lowest rank number that is not 0), using the earliest decade in the event of a tie. This method should return the actual year, e.g., 1920 when the lowest rank number is in the element indexed by 2. For the data above, Sam's best decade is 1900, while Samantha's best decade is 1990. This method is similar to BestRank.
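The pieces listed above fit together roughly as in the following sketch. It is one possible arrangement under stated assumptions, not the expected solution: the class name NameDataSketch and the private helpers BestIndex and Effective are hypothetical, and the constructor repeats the split step that the provided file already supplies. The sketch treats a rank of 0 as MAX_RANK + 1 rather than MAX_RANK, a small departure from the text made so that an actual rank of 1000 still beats an unranked decade.

```csharp
using System;

// Sketch only; the real class is NameData in NameData.cs.
public class NameDataSketch
{
    public const int NUM_DECADES = 11;
    public const int START_YEAR = 1900;
    public const int MAX_RANK = 1000;

    private string name;
    private int[] rank = new int[NUM_DECADES];

    // line must look like "Sam 58 69 99 131 168 236 278 380 467 408 466".
    public NameDataSketch(string line)
    {
        string[] data = line.Split(' ');
        name = data[0];
        for (int i = 1; i <= NUM_DECADES; i++)
        {
            rank[i - 1] = int.Parse(data[i]);
        }
    }

    public string Name
    {
        get { return name; }
    }

    // decade 0 is START_YEAR, decade 1 is START_YEAR + 10, and so on.
    public int RankInDecade(int decade)
    {
        return rank[decade];
    }

    // Assumption of this sketch: an unranked decade (0) sorts after every real rank.
    private static int Effective(int r)
    {
        return r == 0 ? MAX_RANK + 1 : r;
    }

    // Index of the earliest decade holding the smallest non-zero rank.
    private int BestIndex()
    {
        int best = 0;
        for (int i = 1; i < NUM_DECADES; i++)
        {
            // Strict comparison keeps the earliest decade when ranks tie.
            if (Effective(rank[i]) < Effective(rank[best]))
            {
                best = i;
            }
        }
        return best;
    }

    public int BestRank()
    {
        return rank[BestIndex()];
    }

    public int BestDecade()
    {
        return START_YEAR + 10 * BestIndex();
    }
}
```

Sharing one search between BestRank and BestDecade keeps the tie-breaking rule in a single place; writing two separate loops, as the text suggests, works equally well.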
Part 1b (5 points): Reading from the data file
The data for this program is in a text file named names-data.txt. (You can open this file in Visual Studio, if you want to see all the entries.) Since we do not know exactly how many names are in the file, we will store the data from this file into an ArrayList where each element is a NameData object containing data from one line of the data file. We will call this list databaseList.
The code to read in the data goes in the NameSurfer class constructor where indicated in the comments. The code to open the data file and attach it to the StreamReader object inputFile is provided. The input file is assumed to be located in the same folder as the executable program, which is the bin\Debug folder of the NameSurfer project during development. If you copy the executable to somewhere else, be sure to copy the data file as well.
For this part, you are to do the following:
● Declare a private ArrayList variable databaseList in the file NameSurfer.cs where indicated in the comments, and initialize it to a new ArrayList. Recall that you also will need to resolve the use of ArrayList.
● In the NameSurfer constructor, implement the loop that will read each line of the file, create a NameData object with the line (using the new operator), and add this NameData object to databaseList using the Add method. The place where this code goes is indicated in the comments.
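A read loop of this kind might look like the sketch below. To keep it self-contained, it stores each line as a plain string and accepts any TextReader (StreamReader is one), whereas the project would add a new NameData object built from each line. The names LoadSketch and ReadAllLines are hypothetical.

```csharp
using System;
using System.Collections;   // ArrayList lives in this namespace
using System.IO;

// Hypothetical helper for illustration only; not part of the project files.
public static class LoadSketch
{
    // Reads every line from inputFile and stores one list element per line.
    public static ArrayList ReadAllLines(TextReader inputFile)
    {
        ArrayList databaseList = new ArrayList();
        string line = inputFile.ReadLine();
        while (line != null)                 // ReadLine returns null at end of input
        {
            databaseList.Add(line);          // the project would add a NameData built from line
            line = inputFile.ReadLine();
        }
        return databaseList;
    }
}
```

Because ArrayList grows as elements are added, the loop does not need to know the number of names in advance.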
When you have finished both parts of Part 1, you can run the program with debugging and use the debugger to look at the database list. This is done as follows:
● Set a breakpoint at the end of the NameSurfer constructor by clicking in the far left margin next to the last closing curly brace of the constructor. This will put a red dot in the left margin.
● Run the program using Start Debugging, and the program will stop at the red dot.
● In the bottom left corner should be a window for viewing variable values. Click on the Watch1 tab, then type in databaseList, then Enter, to see the database ArrayList. The plus signs to the left of the variable allow you to "open" up the object and see the values of the individual parts of the variable.
● Check the first few NameData objects in databaseList to see they have the correct name and rank data in them from the data file.
If the data is not correct, choose Stop Debugging under the Debug menu, fix your code, and run it again. When you are done debugging, delete the breakpoint by clicking on the red dot.
Part 2 (5 points): Best Decade button Click event handler
GUI notes: The Textbox where a name to be graphed is input is named txtName. The results area is a ListBox named lbxResults. The application window has its minimum and maximum size set to the current size so that its size cannot be changed while it is running. Double-click on the Best Decade button to create a method stub for the Click event handler (btnBestDecade_Click). Implement this handler to do the following:
● If the input textbox is empty, pop up a message box saying that a name must be entered and go back to waiting.
● Otherwise use a for-loop to search through the database list for the name given in txtName.Text by comparing it to the Name property of the list element. Remember that an item from an ArrayList must be cast to its actual type in order to be accessed.
● If it finds the name, it should display the following in the results area: the name, the decade of the name's highest rank (obtained by calling the BestDecade method of the list element) and the name's rank for its best decade (obtained by calling the BestRank method). Afterwards, the handler should return, so that the application goes back to waiting for input.
● If it does not find the name after looping through all of the database list, it should display an error message in the results area saying that the name was not found in the database list.
Be sure to hand check the results with the data file to make sure the BestDecade and BestRank methods are working correctly.
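The search-with-cast step described above can be sketched without any GUI code. The sketch below uses a small stand-in class (EntrySketch) in place of NameData and returns an index instead of writing to the results area; both class names and the IndexOfName method are hypothetical.

```csharp
using System;
using System.Collections;

// Stand-in for NameData, used only to keep this sketch self-contained.
public class EntrySketch
{
    private string name;

    public EntrySketch(string name)
    {
        this.name = name;
    }

    public string Name
    {
        get { return name; }
    }
}

public static class SearchSketch
{
    // Returns the index of the first element whose Name equals target, or -1 if none matches.
    public static int IndexOfName(ArrayList list, string target)
    {
        for (int i = 0; i < list.Count; i++)
        {
            // ArrayList elements are typed as object, so cast back before using Name.
            EntrySketch item = (EntrySketch)list[i];
            if (item.Name == target)
            {
                return i;
            }
        }
        return -1;
    }
}
```

In the handler itself, the "found" branch would display its results and return, so that reaching the code after the loop means the name was not found.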
Part 3a (5 points): Graph button Click event handler
GUI note: The area where the graph is drawn is a Panel named pnlGraph that is the same width as the application. Graphing the rank data for a name is a two-step process involving the Click event handler for the Graph button and the Paint event handler for the panel. Double-click on the Graph button to create the handler stub (btnGraph_Click).
The application is to keep track of the names to be graphed. Since we do not know how many names there will be, we will use another ArrayList to store the NameData objects of the names to be graphed.
● Declare another private ArrayList variable namesToGraph in the file NameSurfer.cs where indicated in the comments, and initialize it to a new ArrayList.
The handler for the Graph button should do the following:
● If the input textbox is empty, pop up a message box saying that a name must be entered and go back to waiting.
● Otherwise use a for-loop to search through the database list for the name in txtName.Text, as is done in the Best Decade button Click event handler.
● If it finds the name, it should add the NameData object to namesToGraph, then invalidate pnlGraph (to force it to be redrawn).
● If it does not find the name, it should display an error message in the results area saying that the name was not found in the database list.
Notes: Translate method
To graph the rank data, we can consider a world coordinate system as follows. Since there are 11 decades of data for each name, the x-axis of the world coordinates has range [0, 11]. The y-axis is the rank of a name for each decade, so it has range [1, 1000], the possible rank values. Variables xRange and yRange have been declared and initialized representing these ranges.
For the screen coordinate system, the x-axis has range [0, pnlGraph.Width], the width of the panel, and the x-offset is 0. The y-axis of the screen coordinate system is a bit trickier, since the graph area is not the entire panel. For this project, we want a horizontal line drawn 20 pixels from the top and 20 pixels from the bottom of the panel to mark off space for labels. A constant integer LABEL_OFFSET has been defined with value 20. Thus the y-axis of the screen coordinates has range [0, pnlGraph.Height - 2*LABEL_OFFSET] with a y-offset of LABEL_OFFSET. Since lower rank numbers are to be graphed higher on the grid and screen y-coordinates increase downward, the translated world y-coordinates do not have to be negated. For example, a name with a rank of 1 is at the top of the graph, a rank of 475 would be near the middle of the graph, and a rank of 997 is at the bottom of the graph.
The translation from world coordinates to screen coordinates is encapsulated in the provided Translate method. This method receives a (world-coordinate) Point object and returns a (screen-coordinate) Point object. It is to be used in the Paint event handler.
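The arithmetic behind such a translation can be sketched as a linear scaling followed by an offset. This is not the provided Translate method, whose body is not shown here; the names TranslateSketch, ScreenX, and ScreenY are hypothetical, and the ranges and panel size are passed in as parameters rather than read from xRange, yRange, and pnlGraph.

```csharp
using System;

// Hypothetical illustration of world-to-screen scaling; not the provided Translate method.
public static class TranslateSketch
{
    public const int LABEL_OFFSET = 20;   // value stated in the notes above

    // Maps a decade position in [0, xRange] to a pixel column in [0, panelWidth].
    public static double ScreenX(double worldX, double xRange, int panelWidth)
    {
        return worldX / xRange * panelWidth;
    }

    // Maps a rank in [0, yRange] to a pixel row between the two label bands.
    // Small rank numbers land near the top because screen y grows downward.
    public static double ScreenY(double worldY, double yRange, int panelHeight)
    {
        double graphHeight = panelHeight - 2 * LABEL_OFFSET;
        return worldY / yRange * graphHeight + LABEL_OFFSET;
    }
}
```

With a hypothetical panel height of 440 pixels and a y-range of 1000, a rank of 500 maps to row 220, the middle of the panel, and a rank of 1000 maps to row 420, the lower grid line.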
Part 3b (10 points): pnlGraph Paint event handler
GUI notes: An array of Pen objects (pens) and an array of SolidBrush objects (brushes) have been created and initialized with various colors in the NameSurfer constructor.
The start of the Paint event handler (pnlGraph_Paint) is provided. It obtains a Graphics object from the panel and calls the DrawGridLines method to draw the grid lines of the graph. This method shows an example of drawing strings. The DrawString method of the Graphics object receives the string to be drawn, a font, a brush (not a pen), the x-coordinate of the upper-left corner of the bounding box, and the y-coordinate of the upper-left corner of the bounding box. Variable fontTNR10 has been declared and initialized to a 10-point Times New Roman Font object.
In general, the world coordinates of the points to be graphed are of the form "(decade index, the rank in decade)", where the rank in decade is obtained by calling the RankInDecade method of the NameData object being graphed. However, if the rank is 0, then the y-coordinate should be set to the maximum rank (NameData.MAX_RANK), so that it is graphed at the bottom of the panel. The Translate method then is used to obtain the screen coordinate equivalent.
For this part, you are to complete the implementation of the Paint event handler to graph the rank data for the names in the name list by doing the following:
● Use a for-loop to index through namesToGraph and for each NameData item in the list do the following:
  ○ Cast the current ArrayList item into a NameData object.
  ○ Compute the world coordinate Point, then screen coordinate Point for the first decade (i.e., the first left endpoint, which has a decade index of 0).
  ○ Draw the name and first decade rank next to the point using DrawString.
  ○ Use another for-loop to index the rest of the decade ranks (i.e., starting with 1) and do the following:
    ■ Compute the world coordinate Point, then screen coordinate Point of the current decade rank (i.e., the right endpoint).
    ■ Draw the name and current decade rank next to the right endpoint.
    ■ Draw a line from the left endpoint to the right endpoint.
    ■ Set the left endpoint variable to the right endpoint value. (I.e., the right endpoint becomes the new left endpoint for the next iteration.)
For now, use the black pen (pens[0]) or black brush (brushes[0]) as appropriate. As noted above, DrawString interprets the coordinates given to it as the upper left corner of the box around the text to be drawn. We would like these strings to be drawn above the point rather than below the point, so you will need to adjust the y-coordinate given to DrawString. The examples shown use LABEL_OFFSET as the adjustment.
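The left-endpoint/right-endpoint pattern in the steps above can be tried out separately from the drawing calls. The sketch below works in world coordinates only and collects each segment as a four-element array instead of calling DrawLine; the names PolylineSketch and Segments are hypothetical.

```csharp
using System;
using System.Collections;

// Hypothetical illustration of the endpoint hand-off; no drawing is performed.
public static class PolylineSketch
{
    public const int MAX_RANK = 1000;

    // Returns one int[] {leftX, leftY, rightX, rightY} per pair of adjacent decades.
    public static ArrayList Segments(int[] ranks)
    {
        ArrayList segments = new ArrayList();
        int leftX = 0;
        int leftY = ranks[0] == 0 ? MAX_RANK : ranks[0];   // unranked plots at the bottom
        for (int i = 1; i < ranks.Length; i++)
        {
            int rightY = ranks[i] == 0 ? MAX_RANK : ranks[i];
            segments.Add(new int[] { leftX, leftY, i, rightY });
            leftX = i;          // the right endpoint becomes the next left endpoint
            leftY = rightY;
        }
        return segments;
    }
}
```

An array of n ranks yields n - 1 segments, so the 11 decades of one name produce 10 line segments.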
At this point, the program should graph names as they are added using the Graph button. As we add names, they will tend to draw on top of each other, especially at the very top and very bottom. Since the name string is repeated each decade, it still is possible to figure out which line is which. However, it would be nicer if the graph lines were in a few different colors. The arrays pens and brushes contain NUM_PENS pens/brushes of different colors. Instead of always using 0 as the index to these arrays, we can rotate through the colors by keeping track of a currentPenIndex that is initialized to 0, then is "incremented" after each name is graphed. As we have seen before, we want to do a modular increment with respect to NUM_PENS so that "incrementing" the last array index rolls over back to 0.
● Add code to the handler to rotate pen/brush color as it graphs the data.
As more pens/brushes are added to the arrays, more names can be graphed before a color repeats itself. If you want to add more colors, change the value of NUM_PENS and add initialization code in the NameSurfer constructor for the new pens/brushes as shown for the additional pens/brushes.
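A modular increment of the kind described can be written in one line. In this sketch the value of NUM_PENS is a placeholder (the actual value is set in the provided code), and the names PenRotationSketch and NextPenIndex are hypothetical.

```csharp
// Hypothetical illustration of a modular increment.
public static class PenRotationSketch
{
    public const int NUM_PENS = 4;   // placeholder value for this sketch only

    // Returns the index after currentPenIndex, wrapping from NUM_PENS - 1 back to 0.
    public static int NextPenIndex(int currentPenIndex)
    {
        return (currentPenIndex + 1) % NUM_PENS;
    }
}
```

Because the remainder is taken with respect to NUM_PENS, the same expression keeps working if more pens are added and only the constant changes.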
Part 4 (5 points): Clear All and Clear First button handlers
In the figure below, we see the names "Bertha" and "Wendy". Bertha starts strong in 1900 and trails off to 0 in 1990. Wendy is at 0 until 1940.
In the next figure below, we add "John" who is very near 1 the whole time, and "Bethany" who comes on the scene only starting in 1950. Both Wendy and Bethany are 0's in 1900, 1910, ... so they draw on top of each other there. That is fine; we draw what we can and if they draw on top of each other, so be it.
Finally, after graphing a few names, the graph gets very messy. Implement the Click event handlers for the last two buttons as follows:
● The Clear All button Click event handler should reset the application to its initial state by erasing the entire list of names to be graphed and clearing the results area (both using the Clear method), resetting the input textbox to an empty string, and invalidating the panel (to cause a repaint).
● The Clear First button Click event handler should remove the earliest added name (i.e., the first one at index 0) from the list of names to be graphed (using the RemoveAt method that receives the index of the item to be removed) and invalidate the panel (to cause a repaint).
For example, if the names Samantha, Wendy, and John are added to the name list, then clicking the Clear First button removes Samantha. It is fine that the color that is used with each name changes as names are added and removed.
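The list side of the Clear First behavior can be sketched as follows. The check for an empty list is an addition of this sketch, not something the assignment text asks for: RemoveAt(0) on an empty ArrayList throws an ArgumentOutOfRangeException. The names ClearSketch and RemoveFirst are hypothetical, and invalidating the panel is left out.

```csharp
using System.Collections;

// Hypothetical illustration of removing the earliest-added element.
public static class ClearSketch
{
    // Removes the element at index 0 if the list is not empty.
    public static void RemoveFirst(ArrayList namesToGraph)
    {
        if (namesToGraph.Count > 0)   // guard added by this sketch
        {
            namesToGraph.RemoveAt(0); // later elements shift down by one index
        }
    }
}
```

After the removal, the name that was second in the list is at index 0, which is why the color paired with each remaining name can change on the next repaint.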
Acknowledgments
This assignment is based on a similar Java assignment developed by Nick Parlante at Stanford University that was presented during a Nifty Assignment session at the 2005 SIGCSE Conference.
04/20/2011
D. Hwang