Agent 008,

Welcome to the government’s top secret statistical organization. Thanks to technology that is classified
above top secret, we have discovered how to view alternative dimensions to see what would happen if
we went to war. I could explain more and then shoot you.
What I can tell you is that we have seen what would happen if a war were waged against Latveria. Our
observations have recorded many variables concerning the war and the number of enemy and civilian
casualties that would result. It is safe to assume that our dimension will be similar to the others,
although random fluctuations occur between the dimensions. Your job is to determine how each
variable affects the number of enemy casualties and civilian casualties.
Variables:
Enemy: The number of enemy casualties. We would like to maximize this number.
Civilian: The number of civilian casualties. We would like to minimize this number.
Stock: The American stock index. The value of the stock market plays a role in the morale of the troops
as well as the options for spending within the units.
Terrorism: Whether the Terrorist threat level is determined to be Low, Medium, or High. This
determination is made by a committee with input from the President of the United States.
Firepower: A ratio of the number of weapons used toward enemy targets divided by the intensity of
each weapon measured in kilotons of TNT.
Payload: The impact explosive force allotted to each unit measured in Newtons.
Weapons: The budget for weapons allotted per unit as determined by the conflict economic committee.
Bombs: The number of bombs used for the initial strike on the capital city.
Missiles: The number of missiles used to clear suspected enemy strongholds outside city limits.
FirstAid: The number of first aid centers that will be set up behind the front lines per unit.
Spies: This value is numerical and greater than zero; in extreme cases it can be quite high. Other than
that, the meaning behind this number is classified.
Media: The percentage of suppression placed on the media coverage of the war. While media
suppression limits the enemy knowledge it also lowers troop morale.
Personnel: The number of personnel who assist the soldiers from their desks in Washington, D.C. This
includes the President of the United States, which is not negotiable.
Temperature: The temperature in Latveria at the time of the initial assault, measured in degrees Fahrenheit. Due
to the dynamic nature of Latverian weather, this temperature could be quite hot, or even negative.
Napalm: The number of gallons of napalm given to the main front attack unit.
IG88: Classified. The values are “None” when the IG88 guidance system is not in place.
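The definitions above imply a few range checks that may help surface data-entry errors before any modelling. What follows is a minimal Python sketch, not a prescribed method. The function name `find_suspect_values`, the one-dictionary-per-row layout, the assumption that Media is recorded on a 0 to 100 scale, and the two sample rows are all hypothetical; the actual data file is not described in this briefing.

```python
# Hypothetical range checks derived from the variable definitions above.
# Assumes each row is a dict keyed by the variable names in the list.

TERRORISM_LEVELS = {"Low", "Medium", "High"}
COUNT_COLUMNS = ["Enemy", "Civilian", "Bombs", "Missiles", "FirstAid", "Personnel"]


def find_suspect_values(rows):
    """Return (row_index, column, value) triples that contradict the definitions."""
    problems = []
    for i, row in enumerate(rows):
        if row["Terrorism"] not in TERRORISM_LEVELS:
            problems.append((i, "Terrorism", row["Terrorism"]))
        if not row["Spies"] > 0:              # defined as greater than zero
            problems.append((i, "Spies", row["Spies"]))
        if not 0 <= row["Media"] <= 100:      # assumed to be a 0-100 percentage
            problems.append((i, "Media", row["Media"]))
        for col in COUNT_COLUMNS:             # counts cannot be negative
            if row[col] < 0:
                problems.append((i, col, row[col]))
        # Temperature is deliberately not checked: negative values are legitimate.
    return problems


# Made-up sample rows, for illustration only.
good = {"Enemy": 1200, "Civilian": 300, "Terrorism": "High", "Spies": 4.2,
        "Media": 35.0, "Bombs": 10, "Missiles": 25, "FirstAid": 3,
        "Personnel": 40, "Temperature": -12.0}
bad = dict(good, Terrorism="Hgih", Spies=-3, Bombs=-1)

print(find_suspect_values([good, bad]))
# [(1, 'Terrorism', 'Hgih'), (1, 'Spies', -3), (1, 'Bombs', -1)]
```

Flagged values still need a justified resolution (correct, drop, or keep); the check only locates candidates.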
Your mission (should you choose to accept it):
General Deilppa: Agent, you are being given two numerical responses, 12 numerical covariates,
and 2 categorical covariates. I want you to analyze both responses separately. I expect a high R² score,
above 90% for both responses.
I want to see that you have considered the following principles, and I want an explanation of how you
chose to handle these issues:
• Errors in the data (or problems reading the data into your program)
• Non-straight-line regression (polynomial or log-transformed)
• Categorical variables (affecting the intercept or the slopes)
• Interactions (between numerical or categorical variables)
• Model selection (including dealing with multicollinearity)
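Several of the principles listed above come down to which columns are placed in the regression design matrix. The sketch below uses NumPy on synthetic, noise-free data with made-up coefficients. The helper name `build_design_matrix`, the choice of "Low" as the baseline level, and the decision to give Temperature a squared term, Spies a log term, and Bombs an interaction with Terrorism are illustrative assumptions only, not a claim about the right model for either response.

```python
import numpy as np


def build_design_matrix(bombs, temperature, spies, terrorism):
    """Columns: intercept, Bombs, Temperature, Temperature^2, log(Spies),
    Medium, High, Medium*Bombs, High*Bombs (baseline level: Low)."""
    bombs = np.asarray(bombs, dtype=float)
    temperature = np.asarray(temperature, dtype=float)
    spies = np.asarray(spies, dtype=float)
    terrorism = np.asarray(terrorism)
    medium = (terrorism == "Medium").astype(float)   # dummy: shifts the intercept
    high = (terrorism == "High").astype(float)
    return np.column_stack([
        np.ones_like(bombs),
        bombs,
        temperature,
        temperature ** 2,        # polynomial (curved) effect
        np.log(spies),           # log transform; Spies is > 0 by definition
        medium,
        high,
        medium * bombs,          # interaction: changes the Bombs slope
        high * bombs,
    ])


# Synthetic, noise-free data with made-up coefficients, only to show the fit.
rng = np.random.default_rng(0)
n = 60
bombs = rng.uniform(0, 50, n)
temperature = rng.uniform(-20, 110, n)
spies = rng.uniform(0.5, 30, n)
terrorism = rng.choice(["Low", "Medium", "High"], n)

X = build_design_matrix(bombs, temperature, spies, terrorism)
true_beta = np.array([100.0, 2.0, 0.5, -0.01, 3.0, 10.0, 20.0, 0.5, 1.5])
y = X @ true_beta

# Ordinary least squares; with no noise it recovers true_beta.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With this coding, the Medium and High coefficients are intercept shifts relative to Low, and the interaction coefficients are changes in the Bombs slope relative to Low.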
For each variable in your model (categorical or numerical) you must explain how that variable affects the
response. As part of the conclusion state how you would maximize enemy casualties (and minimize
civilian casualties).
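If a log transformation ends up in a model, the effect of that variable is usually stated as a percentage, and the reasoning is a short calculation. This is a general sketch: whether the response or a covariate is logged is for the analysis to decide, and the symbols $y$, $x$, and $\beta$ are generic rather than tied to any variable above.

```latex
% Logged response: a one-unit increase in x multiplies y by e^{\beta}
\ln y = \beta_0 + \beta x
\;\Longrightarrow\;
\frac{y(x+1)}{y(x)} = e^{\beta},
\qquad
\text{percent change in } y = 100\,\bigl(e^{\beta} - 1\bigr)\%.

% Logged covariate: a 1% increase in x adds about \beta/100 to y
y = \beta_0 + \beta \ln x
\;\Longrightarrow\;
y(1.01\,x) - y(x) = \beta \ln(1.01) \approx 0.00995\,\beta .
```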
General Erolpxe: We want to get a feel for the accuracy of your findings. In each model, find the variable
whose estimated slope is closest to zero relative to its standard error, and make a 95% confidence interval
for the value of that slope. Then explain what that confidence interval means.
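A confidence interval for a slope has the usual form: estimate plus or minus a t critical value times the standard error. The NumPy sketch below illustrates that computation on synthetic data. The function name `slope_intervals` is hypothetical, and the critical value 2.012 is an approximate table value for 47 degrees of freedom that the caller is assumed to look up; any statistics package will report the same quantities directly.

```python
import numpy as np


def slope_intervals(X, y, t_crit):
    """OLS estimates, standard errors, and beta +/- t_crit * SE intervals.

    X must already contain an intercept column. t_crit is the 0.975 quantile
    of the t distribution with n - p degrees of freedom, supplied by the
    caller (from a table or a statistics package).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    s2 = residuals @ residuals / (n - p)              # residual variance
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se, beta - t_crit * se, beta + t_crit * se


# Synthetic example: x1 matters, x2 is pure noise (true slope 0).
rng = np.random.default_rng(1)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 5.0 + 3.0 * x1 + rng.normal(size=n)

t_crit = 2.012  # approximate t(0.975, df = 47); an assumed table value
beta, se, lower, upper = slope_intervals(X, y, t_crit)

# Slope closest to zero relative to its error: smallest |estimate / SE|.
weakest = 1 + int(np.argmin(np.abs(beta[1:] / se[1:])))  # skip the intercept
```

The interval `(lower[weakest], upper[weakest])` is the range of slope values that the data cannot rule out at the 95% level, holding the other variables in the model fixed.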
Use the first row of your data to make a prediction for what each response should be in that row
according to your model. Comment on the accuracy of your prediction based on the actual response in
the first row.
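The first-row prediction is the fitted equation evaluated at that row's covariate values, and the gap between it and the actual response is that row's residual. A minimal NumPy sketch on a tiny made-up data set (four rows, one covariate) shows the mechanics; `predict_first_row` is a hypothetical helper name.

```python
import numpy as np


def predict_first_row(X, y, beta):
    """Return (prediction, actual, residual) for the first row of the data."""
    prediction = float(np.asarray(X, dtype=float)[0] @ np.asarray(beta, dtype=float))
    actual = float(np.asarray(y, dtype=float)[0])
    return prediction, actual, actual - prediction


# Tiny made-up example: intercept column plus one covariate.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0, 3.0])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # intercept 1.1, slope 0.6

prediction, actual, residual = predict_first_row(X, y, beta)
print(prediction, actual, residual)
```

In this toy data the prediction is about 1.1 against an actual value of 1, a residual of about -0.1. That is small next to the model's standard deviation of about 0.32, which is one way to comment on accuracy.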
General Laciteroeht: You must attach an appendix showing the computer output for the final
model you chose. Be sure it includes betas, standard errors, p-values, R², the standard deviation for the
model, and the residuals. I’m sure General Deilppa only wants the relevant output in your report, but
with the appendix I can look at your model to check that you are truly loyal to your country.
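For reference, two of the requested quantities can be computed directly from the residuals: R² is one minus the ratio of residual to total sum of squares, and the model's standard deviation S is the square root of the residual sum of squares divided by n minus the number of fitted coefficients. The NumPy sketch below uses a tiny made-up data set, and `fit_summary` is a hypothetical helper name. Statistical software reports the same numbers in its regression summary.

```python
import numpy as np


def fit_summary(X, y):
    """Return residuals, R^2 (as a fraction), and the residual standard deviation S."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape                                   # p counts the intercept too
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    sse = float(residuals @ residuals)               # residual sum of squares
    sst = float(np.sum((y - y.mean()) ** 2))         # total sum of squares
    r_squared = 1.0 - sse / sst
    s = float(np.sqrt(sse / (n - p)))
    return residuals, r_squared, s


# Tiny made-up example: intercept column plus one covariate.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0, 3.0])

residuals, r_squared, s = fit_summary(X, y)
print(f"R^2 = {100 * r_squared:.1f}%, S = {s:.3f}")   # R^2 = 90.0%, S = 0.316
```

S is in the units of the response, so in context it reads as the typical size of a prediction error in casualties.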
When you choose your final model only include variables that are statistically significant unless you can
justify why that variable is supposed to stay in your model.
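Significance tests can mislead when covariates are strongly correlated with one another, which is the multicollinearity issue named among the principles. One common diagnostic is the variance inflation factor (VIF). The sketch below is a NumPy illustration on synthetic columns, not part of the assignment's required method; the function name is hypothetical, and the often-quoted rule of thumb that a VIF above roughly 10 deserves attention is a convention rather than a hard rule.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF for each column of X (covariates only, no intercept column).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on the remaining columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)


# Synthetic check: x2 is built to track x1 closely; x3 is unrelated to both.
rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
```

Here the first two VIFs come out large and the third close to 1. In a real analysis, a pair like x1 and x2 is a candidate for dropping one variable before judging significance.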
General Tsenoh: You are not allowed to discuss your data with other students; that qualifies as
cheating and will be treated as treason. You are not a spy, so this needs to be your own work to show
us where your abilities in statistics and in writing currently stand. You MAY ask me, Jared, or Kim any
questions you like about the data, the analysis, or your paper. We do not promise to give you
answers, but we can at least have a discussion about it. You may have someone (who is not in the
class) check the grammar, English, or readability. We will handle their brainwashing later.
The fate of the free world is in your hands. Make us proud.
Rubric for Final Project (out of 100 points – numbers show how many could be subtracted per item)
Points  Item
     2  Formatting (title, neat, paper copy, stapled)
     4  Readability, English, grammar, punctuation
     3  Intro - what client wants, value of project, future implications
     2  What program was used
     5  Finding the errors & justify the resolution
     2  Defined α used to select variables

Enemy  Civilian  Item (scored separately for the Enemy Casualty model and the Civilian Casualty model)
    3         3  Describe Enemy / Civilian distribution
    4         4  Explain use (or lack) of polynomials
    4         4  Explain use (or lack) of logs
    4         4  Explain use (or lack) of interactions
    2         2  Effect of Stock
    2         2  Effect of Terrorism
    2         2  Effect of Firepower
    2         2  Effect of Payload
    2         2  Effect of Weapons
    2         2  Effect of Bombs
    2         2  Effect of Missiles
    2         2  Effect of First Aid
    2         2  Effect of Spies
    2         2  Effect of Media
    2         2  Effect of Personnel
    2         2  Effect of Temperature
    2         2  Effect of Napalm
    2         2  Effect of IG88
    5         5  Justified with residuals
    3         3  S explained in context
    3         3  R² as a percentage
    4         4  CI for small β
    2         2  Prediction equation
    3         3  Prediction for first row
    5         5  How to maximize (Enemy) / How to minimize (Civilian)

Points  Item
    10  Explained linear effects using the value
    10  Explained logged effects as a percentage
    10  Explained polynomial effect graphically
    10  Explained categorical effects using the value
    10  Explained categorical interactions (value or graphically)
    10  Explained numerical interactions (descriptively or graphically)
    10  Interpreted the meaning of confidence interval
    10  Discussed accuracy of predictions compared to actual values
     5  Conclusion discusses value of report and ideas for future research
    10  All output is explained in the paper
    10  Appendix shows summaries and residual plots by variable