Maria Cicenas
Applied Mathematics
Weekly Summary 1
June 9 – June 16, 2009
Gerrymandering
This week my goal was simply to gather as much information as possible in order to evaluate the direction
in which I would like to further explore the topic of gerrymandering. I began by researching constitutional
requirements on congressional districts, current districts and current statistics on those districts. On the first
day of class, we discussed various parameters which could/do affect the creation of congressional districts;
therefore, I also located reliable sources for data on the possible parameters we discussed.
After looking at the congressional districts for the state of Kansas, and after the advice of fellow
classmates, I decided finding information on a more local level might not only be more interesting, but
would also give me a better understanding of local factors that might need to come into play when
creating districts. Therefore, as done on the national level, I researched restrictions and located data for
creating districts on both the state and city level. As a side note, finding this information was much more
difficult than finding national information/statistics.
Finally, I researched mathematical approaches already attempted by others with regards to the
gerrymandering problem. I read the available research, and have listed those that I found credible and
possibly valuable in the attached Gerrymandering Resource File. (The file Kansas Maps includes maps
that I cited in the resource file, but saved as one document so I could access them quickly)
Unfortunately, after all that I have done, I feel that writing a program to create districts might be beyond my limited coding experience (however, the idea still slightly intrigues me). The new idea I would like to explore for this upcoming week revolves around the idea of "compactness" (not in the mathematical sense). From what I have read, the general public wants their districts "compact." If I can create a mathematical standard for the "compactness" of a district, I can hopefully prove or disprove whether the mathematical approaches already available indeed result in compact districts.
Maria Cicenas
Applied Mathematics
Weekly Summary 2
June 17 – June 23, 2009
Gerrymandering
This week I began my work by brainstorming for ideas on how I would like to define a compact
congressional district. I decided on expressing compactness as a ratio:
$$c = \frac{\text{area of district}}{\text{area of smallest circumscribed rectangle}}$$
After testing a few examples, it seemed logical to further say that when $0.7 \le c \le 1$ a district is considered compact, and when $0 \le c < 0.7$ a district is not compact. For some states, odd coastlines and panhandles make it nearly impossible for all their districts to fall into the compact category. At first I considered weighting state borders differently than district borders, but then that isn't very fair to states like Kansas! Besides, the main purpose of creating my definition was as a means to compare current districts to other possible, more mathematical, districts.
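To make the ratio concrete, here is a rough Python sketch. The polygon coordinates are invented, and the axis-aligned bounding box stands in for the smallest circumscribed rectangle (which in general could be rotated):

```python
def polygon_area(points):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def bounding_rectangle_area(points):
    """Area of the axis-aligned rectangle enclosing the polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def compactness(points):
    return polygon_area(points) / bounding_rectangle_area(points)

square = [(0, 0), (4, 0), (4, 4), (0, 4)]                # blocky district
ell = [(0, 0), (10, 0), (10, 1), (1, 1), (1, 6), (0, 6)] # L-shaped district
print(compactness(square))   # 1.0  -> compact (c >= 0.7)
print(compactness(ell))      # 0.25 -> not compact (c < 0.7)
```

Even this crude version separates a blocky district from an L-shaped one, which is all the definition needs to do for comparison purposes.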
My next task was to collect the data needed to begin calculating "c" for current congressional districts as
well as the districts created by the mathematicians Dr. Warren D. Smith and Jan Kok (using the shortest line
algorithm) and Brian Olson (an algorithm based on finding the districts with the shortest distance to their
center). I found complete data on the area of the districts only for 13 of the 50 states. The data for the
other states was either incomplete or nonexistent. I used the data I did find and printed maps of the current
districts so that I could trace rectangles around them and calculate the compactness ratio. Although I did
have data for all 53 of California's districts, the small size of California's districts with respect to the large
size of the state made finding a good quality map and measuring the corresponding rectangles nearly
impossible. Therefore, I left California out of my data set.
I began finding the ratios for the shortest line algorithm as well. This process was much more time
consuming because I did not have the areas of the created districts. I estimated these areas by breaking
the districts into rectangles, trapezoids and triangles (Heron's Formula). Due to time constraints I did not
finish this process for the 12 states, and I have not yet begun the calculations on Mr. Olson's districts.
Because the borders of Mr. Olson's districts are mostly curves, and not mostly lines like the shortest line
districts, they should be even more difficult to calculate.
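As a sketch of the estimation step, here is the piecewise-area idea in Python. Every measurement below is hypothetical, standing in for lengths read off a printed map:

```python
import math

def heron_area(a, b, c):
    """Triangle area from its three side lengths (Heron's formula)."""
    s = (a + b + c) / 2.0                  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

# A district approximated by three simple pieces (made-up measurements):
rectangle = 3.2 * 1.5                      # width * height
trapezoid = 0.5 * (2.0 + 3.4) * 1.1       # (base1 + base2) / 2 * height
triangle = heron_area(2.1, 1.8, 2.6)      # three measured sides
print(rectangle + trapezoid + triangle)   # estimated district area
```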
I am turning in a condensed version of the Excel spreadsheet I created (with only the data I collected and
not all 50 states) as well as a scanned example of the work I did by hand. I do have all of the work I did by
hand and will turn in a hard copy of that information, but I did not think it would be necessary to scan all of
them (I doubt my classmates would need to see it); however, I will do so if you like.
For the upcoming week, I would like to examine the work of my Gerrymandering peers and see how I
could possibly contribute to their work. On the side, I will probably also work on finishing collecting the data
I started collecting this week so that I can have a complete set, or at least as close to complete as possible.
I will give a full analysis of my results then.
Maria Cicenas
Applied Mathematics
Weekly Summary 3
June 24 – June 29, 2009
Gerrymandering
While I didn't take the time to fully complete the set of data on the congressional districts, I did take
a few more measurements so that I could at least begin to make some comparisons between the three sets
of districts: current districts, shortest line algorithm districts, and smallest geographic center districts.
Since I had seven states for which I had already calculated the compact ratio for the first two types of
districts, I set out to find the compact ratio for the smallest geographic center districts for those same seven
states. I did so, with the exception of Kansas, because the "center" district method gave Kansas the wrong
number of congressional districts.
After taking the needed measurements (using the same method as last week), I updated the Excel spreadsheet and added a sheet where I summarized my results. Only 19 percent of the current districts tested qualify as compact. The shortest line algorithm improves the current districts by 22 percent on average, and the geographic center algorithm does an even better job, boasting a 27 percent improvement and having the best overall results in 4 of the 6 states.
Overall the mathematical algorithms do produce positive results; however, I do have a few concerns. To my knowledge, the districts created do not take into account the city, county and state districts that must also be formed. Current districts, although oddly shaped, have the needed smaller districts formed as subsets within the larger federal districts. In order to truly see if the mathematical algorithms create more compact districts, one would have to start on the local level and build up (otherwise organizing election days might be a disaster!). My other concerns are with my calculations. They were done by hand, which by default introduces error. In addition, the resolution of the maps from which I gathered the information was not great. The "center" maps were of particularly poor quality.
In addition to working with my data, I also read the article Rich found which used the Hamiltonian to
create congressional districts: "Preferential Diffusion Method for Assigning Congressional Districts." I
enjoyed the article because it showed me the math behind all the ideas I had for creating the districts, but
didn't know how to implement. I still wouldn't be able to write such an extensive program; however, I do
feel that I understood the Hamiltonian and could find ways to use it in the future. I am really glad I read it.
Other Stuff
After working on Gerrymandering, I wanted to practice the Monte Carlo Methods you demonstrated
in class. In order to get familiar with Excel, I first duplicated the demos done in class. Then I created two samples of my own: one on the Golden Ratio and one on Benford's Law. Creating valid ideas was the most
difficult part. I really wanted to create a simulation for the Tower of Hanoi, but I couldn’t find a way to
account for the varying ring sizes.
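For reference, here is a small Python analogue of the Benford's Law sample. The setup is my own invention, not a copy of the Excel demo: products of many independent random factors tend to have leading digits following $P(d) = \log_{10}(1 + 1/d)$.

```python
import math
import random

random.seed(0)

trials = 100_000
counts = [0] * 10
for _ in range(trials):
    x = 1.0
    for _ in range(20):                    # multiply 20 random factors
        x *= random.uniform(1.0, 10.0)
    counts[int(f"{x:e}"[0])] += 1          # leading digit via sci. notation

# Compare the simulated leading-digit frequencies to Benford's prediction.
for d in range(1, 10):
    print(d, counts[d] / trials, round(math.log10(1 + 1 / d), 4))
```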
I have also started working on creating a Markov Chain for the game Candyland. I have never
played the game before, so I had to search for all the basic information. During my search I found a website that gives the results from a Markov Chain they did. I hope to use that site to check my answers when I am done…just like using the back of the book! (Don't worry, I am pretty sure they didn't show any work, so I can't cheat…and wouldn't.)
Other than finishing the Markov Chain, I am not sure what I will be working on this week. I guess I
will start on one of the other problems.
Maria Cicenas
Applied Mathematics
Weekly Summary 4
June 30-July 7, 2009
CandyLand Markov Chain
After collecting information on the game of CandyLand last week, this week I created the needed 136×136 matrix for a Markov Chain. There are 134 spaces on the game board, but I added extra spaces for the starting and winning locations. The only rule of the game that I knowingly changed was how you shuffle. In order to keep the probabilities manageable, I worked under the assumption that after drawing a card, players must return the card to the deck and reshuffle. I did account for the two shortcuts, six candy spaces, and the double moves. I also attempted to account for the three spaces on the board which require the player to stay at the current spot until a card matching the color of the current square is drawn. I question whether or not I did this in a valid manner, because the probabilities in these three columns do not sum to one. All other columns sum to one.
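To illustrate the draw-and-reshuffle assumption, here is a toy Python version of the construction on an invented seven-square board (the real matrix is 136×136, and the real deck has many more cards):

```python
import numpy as np

COLORS = ["red", "blue", "yellow"]      # toy deck, one card per color
board = ["start", "red", "blue", "yellow", "red", "blue", "win"]

n = len(board)
A = np.zeros((n, n))                    # column-stochastic: A[j, i] = P(i -> j)

def next_square(pos, color):
    """First square of the drawn color past pos; none ahead means 'win'."""
    for j in range(pos + 1, n - 1):
        if board[j] == color:
            return j
    return n - 1

for i in range(n - 1):
    for color in COLORS:                # reshuffling makes every draw independent
        A[next_square(i, color), i] += 1.0 / len(COLORS)
A[n - 1, n - 1] = 1.0                   # the winning square is absorbing

assert np.allclose(A.sum(axis=0), 1.0)  # every column must sum to one
print(A)
```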
I created the matrix in Excel instead of Matlab because I thought it would be easier to see the matrix and keep track of where I was in the matrix as well as the game. It was; however, since I have not performed matrix operations in Excel in the past, I am a bit concerned that the size of the matrix might be too much for Excel. I guess I will see how those calculations turn out next week. Perhaps I will have to construct the matrix in Matlab.
Also, I am sending the electronic copy of the matrix, but I did not print it because it is 24 pages long! I am
providing a paper copy of the notes I worked from to make the matrix.
Contest Judging Problem
I also started work on the Math Contest Judging Problem. I tested several ideas I had with some by-hand calculations and with the use of Excel. I think I have finalized how I would like to account for any major discrepancies in grading amongst the judges. I used Excel to experiment with ideas by using the random number generator and putting various restrictions on eight judges: six "average" judges, one who scored high, and one who scored low. However, I have not yet finalized the process or processes I would like the judges
to go through in order to go from the P papers submitted to the W winners. After finalizing that process, I
would like to create a Hamiltonian which will determine the number of judges needed for the job given the
amount of money available, the number of papers submitted and the number of winners selected.
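As a hypothetical sketch of the eight-judge setup (the bias and noise numbers are invented, and this Python version only approximates what the Excel experiment did):

```python
import random

random.seed(1)

# Eight judges scoring on a 0-100 scale: six unbiased, one roughly 10
# points high, one roughly 10 points low. All numbers are illustrative.
biases = [0, 0, 0, 0, 0, 0, 10, -10]

def score(quality, bias, noise=5.0):
    """A judge's score: a paper's true quality plus bias plus noise."""
    return max(0.0, min(100.0, quality + bias + random.uniform(-noise, noise)))

qualities = [random.uniform(40, 95) for _ in range(20)]    # 20 papers
raw = [[score(q, b) for q in qualities] for b in biases]   # raw[judge][paper]

for j, judge_scores in enumerate(raw):
    print(f"judge {j}: mean score {sum(judge_scores) / len(judge_scores):5.1f}")

# One simple way to account for a discrepancy: center every judge on the
# overall mean, so a systematically high or low grader no longer skews
# the rankings.
overall = sum(sum(js) for js in raw) / (len(raw) * len(qualities))
adjusted = [[s - sum(js) / len(js) + overall for s in js] for js in raw]
```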
Other Stuff
Last week I also submitted the integral problem discussed in class. Sorry, I forgot to send it electronically, and I turned in my only copy of the work.
Maria Cicenas
Applied Mathematics
Weekly Summary 5
July 8-July 13, 2009
CandyLand Markov Chain
Using xi 1  Axi , I began using the matrix I created last week to obtain the probability vectors. It
looked as though I was on the right track because the results claimed that a player has a slim chance of
winning by the fourth move, which was also claimed at the website, "A Mathematical Analysis of
Candyland." Unfortunately, after 33 moves (where I had roughly a 16% chance of winning the game), the
probably of winning the game began to decline instead of increase. I extended the game out to 45 moves
and the probability of winning was still on the decline. I am not sure where I went wrong, but there must be
a mistake or misunderstanding somewhere. I suspect it has to do with the "sticky" positions I attempted to
include in the game. At those positions my probability columns do not total one. However, I am not sure
the correct way to fix the problem. I will probably ask you about it at some point this week.
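One observation that may explain the decline: if a column of $A$ sums to less than one, each multiplication by $A$ shrinks the total probability $\sum_i x_i$, so every entry, including the winning probability, can eventually decay. A plausible fix, sketched below on a three-square toy chain, is to give each sticky square a self-loop: with the probability of drawing the matching color the player leaves, and with the remaining probability the player stays put.

```python
import numpy as np

# Three-square toy chain with one "sticky" square. The self-loop of 1 - p
# keeps every column summing to one, so no probability leaks out.
p = 1.0 / 3.0              # chance of drawing the required color
A = np.array([
    [0.0, 0.0,     0.0],   # square 0 is always left on the first draw
    [1.0, 1.0 - p, 0.0],   # sticky square: self-loop of 1 - p
    [0.0, p,       1.0],   # "win" is absorbing
])
assert np.allclose(A.sum(axis=0), 1.0)

x = np.array([1.0, 0.0, 0.0])      # all probability starts on square 0
for move in range(1, 11):
    x = A @ x                      # x_{i+1} = A x_i
    print(move, round(x[-1], 4))   # P(win) now only ever increases
```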
Contest Judging Problem
I also continued work on the contest judging problem. I came up with two methods/processes for judging the contest that seemed fair and practical. In order to explore and compare these two options more thoroughly, I programmed their processes into Excel and created a log to record the data. I had to enter and record the data manually (I couldn't find a practical way to generate it randomly); however, the calculations all happened automatically in Excel. I tested the methods given different numbers for the total number of papers submitted and the number of desired winners (Total Papers 20, 100, 500, 1000 and Total Winners 1, 2, 3, and 10), then evaluated the total papers each judge would have to grade (time constraint) and the total papers graded overall (money constraint) for varying numbers of judges in order to locate the optimum solution. The calculations took place in the Excel sheet Judging Process. The different trials are numbered and color-coded with a corresponding key as to the given constraints. A total of 157 trials were completed.
I began to analyze the data in the record log by highlighting the column where the total papers graded were minimized and the papers graded per judge were less than 50% of the total papers submitted. In cases where these per-judge numbers were still high (500 and 1000 papers submitted), I also highlighted a more reasonable optimal situation, if it existed. I also created line graphs in Excel to help me better compare the results of the two examples.
Overall, Example 5 worked much better than Example 6. In most cases, it had fewer papers per judge and in total. The program for it was also easier to write and had fewer kinks along the way. Example 6 was about equal with Example 5 in the smallest population, 20 papers. Given enough judges, Example 6 will also eventually outshine Example 5 in the total number of papers being graded; however, this only happens because it reduces the contest to only 2 rounds, which is a great cost to the integrity of the contest (Example 5's routine holds the number of rounds steady, while Example 6's number of rounds is always changing).
Next week I hope to figure out how to fix the Candyland problem and continue working with the
math contest parameters.