STAT 512: Final Project Topic Proposal Jason Gabl, Matt Steiner

1) Description of the problem
We plan to investigate which factors contribute to US citizens voting in the general presidential election.
Our dependent variable is the percentage of eligible voters who voted. Data show that typically only about
half of all eligible voters actually cast a vote on Election Day. We will look at the following demographic
information, which we believe to be independent of political affiliation:
o Average income / cost of living (non-dimensionalized to represent disposable income)
o Average highest achieved grade level
o Average age
We will take average values of these data from the largest non-capital city in each state, with each city
serving as one data point, for a total of 50 data points. Voting data will be taken for the 2012
presidential election and demographic data from the 2010 US Census. We assume that the population
demographics did not change significantly over the two-year gap.
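As a minimal sketch of what one data point might look like, the Python snippet below stores the three explanatory variables and the turnout for a single city and computes the dimensionless income-to-cost ratio. The class name, the field names, the choice of annual dollar amounts, and the example values are all assumptions made for illustration; they are placeholders, not census figures.

```python
from dataclasses import dataclass


@dataclass
class CityRecord:
    """One data point: a city's demographics plus its voter turnout."""
    name: str
    avg_income: float       # average annual income in dollars (assumed unit)
    cost_of_living: float   # average annual cost of living in dollars (assumed unit)
    avg_grade_level: float  # average highest achieved grade level
    avg_age: float          # average age in years
    turnout_pct: float      # percentage of eligible voters who voted

    def income_ratio(self) -> float:
        """Dimensionless income / cost-of-living ratio (dollars cancel)."""
        if self.cost_of_living <= 0:
            raise ValueError("cost_of_living must be positive")
        return self.avg_income / self.cost_of_living


# Placeholder values for illustration only, not real data.
example = CityRecord("Example City", 45_000.0, 30_000.0, 13.2, 36.5, 52.0)
print(example.income_ratio())
```

A ratio above 1 would indicate that average income exceeds the average cost of living, which is one way to read "disposable income" on a scale that is comparable across cities.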
2) Preliminary analysis that helps in that description
In many states, the capital is not the largest city (for example, in New York the capital is Albany,
which has around 1% of the population of New York City). Additionally, capital cities have a large
population of government workers and politicians, who may have more of a vested interest in election
outcomes and might therefore skew the data. For this reason we decided to take the largest
non-capital city in each state.
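The selection rule can be sketched in a few lines of Python. The function name, the dictionary layout, and the `is_capital` flag are hypothetical, and the populations are rounded, approximate 2010 figures included only to illustrate the rule.

```python
def largest_non_capital(cities):
    """Return the name of the most populous city that is not the state capital."""
    candidates = [c for c in cities if not c["is_capital"]]
    if not candidates:
        raise ValueError("no non-capital city listed")
    return max(candidates, key=lambda c: c["population"])["name"]


# Rounded, approximate populations for illustration only.
states = {
    "New York": [
        {"name": "New York City", "population": 8_200_000, "is_capital": False},
        {"name": "Albany", "population": 98_000, "is_capital": True},
        {"name": "Buffalo", "population": 261_000, "is_capital": False},
    ],
    "Arizona": [
        {"name": "Phoenix", "population": 1_446_000, "is_capital": True},
        {"name": "Tucson", "population": 520_000, "is_capital": False},
    ],
}

selected = {state: largest_non_capital(cities) for state, cities in states.items()}
print(selected)
```

In a state where the capital is also the largest city, the rule falls through to the next-largest city, so every state still contributes exactly one data point.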
Data and analyses from government sources are currently very limited due to the government
shutdown. However, numerous third-party census databases are open to public access. Throughout
our analysis, we will assume that these census data and voting statistics were measured accurately
and without bias.
3) Preliminary plans of analysis
Our analysis will consist of a multiple regression analysis using the explanatory variables listed above.
We will first analyze each variable separately to determine its individual correlation with the dependent
variable, and then use multiple regression to determine the strength of the explanatory variables when
used in conjunction with one another. To make sure that the explanatory variables are not strongly
correlated with one another (multicollinearity), we will analyze the correlations between them. Finally,
if the data do not follow a linear relationship, we will attempt to transform the data to find a
significant relationship.
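The planned steps might look like the following Python sketch, which uses only the standard library so that it stays self-contained; in practice a statistics package would do the same work. The data here are synthetic, generated from made-up coefficients with a fixed random seed purely so the code runs, and the helper names (`pearson`, `solve`, `fit_ols`) are our own for illustration. Nothing below is a result.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting.

    A singular system (perfectly collinear predictors) raises ZeroDivisionError.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Least-squares fit; returns ([intercept, b1, b2, ...], R^2)."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    my = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic stand-in for the 50 city data points (NOT real data).
rng = random.Random(512)
income_ratio = [rng.uniform(1.0, 2.0) for _ in range(50)]
grade_level = [rng.uniform(11.0, 15.0) for _ in range(50)]
age = [rng.uniform(30.0, 45.0) for _ in range(50)]
turnout = [10 + 8 * r + 2 * g + 0.3 * a + rng.gauss(0, 1)
           for r, g, a in zip(income_ratio, grade_level, age)]

predictors = {"income_ratio": income_ratio, "grade_level": grade_level, "age": age}
names = list(predictors)

# Step 1: individual correlation of each predictor with turnout.
for name in names:
    print(name, "vs turnout:", round(pearson(predictors[name], turnout), 3))

# Step 2: correlations between predictors (multicollinearity check).
for i, first in enumerate(names):
    for second in names[i + 1:]:
        print(first, "vs", second, ":",
              round(pearson(predictors[first], predictors[second]), 3))

# Step 3: multiple regression on all three predictors together.
coefs, r_squared = fit_ols(list(predictors.values()), turnout)
print("coefficients:", [round(c, 3) for c in coefs], "R^2:", round(r_squared, 3))
```

If a relationship looked nonlinear, a transformed column (for instance the logarithm of a predictor) could be passed to `fit_ols` in place of the raw one and the fits compared.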