Trevor Larsen
Term Project – Part 2
Confidence Level Intervals
For this next section, we will be looking into what confidence intervals are. A confidence interval is the
range of values that corresponds to a certain confidence level. Example: I’m 95% sure that I will get
between 80 and 99 out of 100 on the next test. The confidence level tells you how confident the person
who made that statement is.
Here are a few problems to illustrate my point:
Interpretation & Discussion:
Problem 1:
For the first problem, we need to construct a 95% confidence interval for the mean starting
package of students graduating in Engineering. Our alpha is simply one minus the confidence level, which
is .05. We first find the sample mean and standard deviation of the fifty students graduating in the
field using StatCrunch. Then we calculate the error, E = tα/2 × s/√n, using the standard deviation s, the sample size n,
which is 50, and tα/2 from the t-distribution table with n − 1 = 49 degrees of freedom. Remembering that this is a two-tailed problem, the
5% that is unsure is really 2.5% at each end of the t-distribution graph. Be sure that when
using the t-table you are looking under .05 in two tails, not one tail. We subtract the error from the
sample mean and add it to the sample mean. The confidence interval is $58,425 < μ < $62,903. So overall, we are 95%
confident that the population mean starting package for students graduating in Engineering is within this
specific range.
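The steps above can be sketched in a few lines of Python. This is only a minimal sketch, not the StatCrunch output: the function name is made up for illustration, the sample mean is simply the midpoint of the reported interval, and the standard deviation of 7,878 is not given in this report, so it is an assumed value back-calculated to land on the reported interval. The critical value is typed in by hand, the same way it would be read off the t-table.

```python
import math

def mean_confidence_interval(sample_mean, sample_sd, n, t_crit):
    """Return (lower, upper) for a t-based confidence interval for the mean."""
    # Error E = t(alpha/2) * s / sqrt(n)
    error = t_crit * sample_sd / math.sqrt(n)
    return sample_mean - error, sample_mean + error

# Only n = 50 and the final interval come from the report; the rest is assumed.
lower, upper = mean_confidence_interval(
    sample_mean=60_664,  # midpoint of the reported interval
    sample_sd=7_878,     # ASSUMED: back-calculated, not stated in the report
    n=50,
    t_crit=2.0096,       # approximate t-value for .05 in two tails, 49 df
)
print(f"${lower:,.0f} < mu < ${upper:,.0f}")
```

Passing the t-value in as an argument keeps the sketch free of extra libraries and mirrors looking it up in the table.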
Problem 2:
For the second problem, we are to construct a 99% confidence interval for the standard
deviation of starting packages for Computer Science. Our alpha is .01 and, again, this is a two-tailed problem,
so 0.5% is at each end. We start by finding the sample standard deviation s, the degrees of freedom (n − 1), and
the left and right chi-squared values. The chi-squared values can be obtained using the chi-squared distribution
chart. Since this is two-tailed, use .005 and .995. The equation for the range works differently: there is no error
to add or subtract. For each end, we take the square root of the degrees of freedom times the standard deviation squared, divided by
one of the chi-squared values: √((n − 1)s²/χ²R) < σ < √((n − 1)s²/χ²L). The right value goes on the left end of the interval because a
bigger denominator gives a smaller result. This whole thing gives us 3667 < σ < 6179. Overall, we are 99% confident that
the population standard deviation for students graduating in Computer Science is within
this specific range.
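Here is a minimal sketch of the chi-squared interval described above. The sample size and standard deviation for Computer Science are not stated in this report, so the inputs below (n = 51 and s = 4,600) are hypothetical and will not reproduce the reported interval. The two chi-squared values are the table entries for 50 degrees of freedom at .995 and .005, and the function name is made up for illustration.

```python
import math

def sd_confidence_interval(sample_sd, n, chi2_left, chi2_right):
    """Return (lower, upper) for a chi-squared interval for the standard deviation."""
    df = n - 1
    # The larger (right) chi-squared value gives the smaller (lower) end.
    lower = math.sqrt(df * sample_sd**2 / chi2_right)
    upper = math.sqrt(df * sample_sd**2 / chi2_left)
    return lower, upper

# HYPOTHETICAL inputs: n = 51 (so df = 50) and s = 4,600 are not from the report.
lower, upper = sd_confidence_interval(
    sample_sd=4_600,
    n=51,
    chi2_left=27.991,   # table value, 50 df, area .995 to the right
    chi2_right=79.490,  # table value, 50 df, area .005 to the right
)
print(f"{lower:.0f} < sigma < {upper:.0f}")
```

Notice that the interval is not centered on s, which is why there is no single error to add and subtract.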
Problem 3:
For the final problem in this section, we are to construct a 90% confidence interval for the
proportion of all graduating students who will start out with a starting package valued at over
$50,000. We know our alpha to be .10, and that this is a two-tailed problem, with 5% at each end.
Our sample size is all the students in the table, which is 350. We need to find the sample proportion
(the probability of success) by taking the number of students that have over $50,000 and dividing by the
sample size. The sample proportion is 149/350, or .426, the chance that a random student in the sample makes over $50,000.
Next, we find zα/2 using the normal distribution table (hint: there’s a smaller table with the most
common confidence levels and their z-scores). Now you can find the error. The equation is your z-score
times the square root of the success probability times the failure probability divided by the sample size.
The interval now states that we are 90% confident that the population
proportion is within this range: .383 < p < .470.
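The proportion interval can be sketched the same way, this time using the counts given in the problem (149 out of 350). This is a minimal sketch with a made-up function name. It gets the z-score from Python's built-in `statistics.NormalDist` instead of the table, and it does not round the sample proportion to .426 first, so the last digit of each end can differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """Return (lower, upper) for a z-based confidence interval for a proportion."""
    p_hat = successes / n
    alpha = 1 - confidence
    # z(alpha/2): the z-score with area alpha/2 in the right tail
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    error = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - error, p_hat + error

lower, upper = proportion_confidence_interval(successes=149, n=350, confidence=0.90)
print(f"{lower:.3f} < p < {upper:.3f}")
```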
Hypothesis Testing
This next section is all about using analysis to test a hypothesis, which is a claim about whether a
certain condition is met. We test this claim at a certain significance level, because a sample will
almost never match the claim exactly. Depending on whether the test statistic falls in the critical region or not,
the null hypothesis is either ‘rejected’ or we ‘fail to reject’ it. These claims, confirmations, and rejections are
sometimes presented in front of very important people or scientists, so the conclusion is always put in a formal
statement.
Here are the problems we are given and their answers:
Interpretation & Discussion:
Problem 1:
For the first problem, we are given this claim with a .05 significance level: “Students graduating
in Human and Social Sciences will start off with an average of under $38,000.” First we need to
understand what we are testing here. We always test our null hypothesis. Since our claim fits under the
alternate hypothesis (μ < $38,000), our null will say our mean is equal to $38,000. Alpha will be our significance level,
and we can get the sample mean and standard deviation from StatCrunch. We now solve
for our test statistic. We find this by subtracting the hypothesized population mean ($38,000) from our sample mean, then
dividing by the standard deviation divided by the square root of the sample size.
The next step can be done with one of two methods: the critical value method or the P-value
method. For convenience, we will use the critical value method for this problem. First, by using the
significance level and the degrees of freedom (sample size – 1) in our t-distribution chart, we find the
critical value that separates the safe zone from the critical zone. An important note to take into
account: the claim is not an equality, it’s a less-than. This tells us that this is a left-tailed problem, so
make sure you look under the one-tail column for .05 and treat the critical value as negative. Since the test statistic falls in the safe zone, we
fail to reject the null hypothesis. Here is the final statement to answer that question:
“There is not sufficient evidence to support the claim that students graduating in Human and
Social Sciences will start off with an average of under $38,000.”
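The critical value method for this left-tailed test can be sketched as follows. The sample mean and standard deviation are not listed in this report, so the values below (37,200 and 6,000) are hypothetical and only show the mechanics. The hypothesized mean, the sample size of 50, and the .05 significance level are from the problem. The critical value is the approximate one-tail table entry for 49 degrees of freedom, made negative because the test is left-tailed. The function names are made up for illustration.

```python
import math

def one_sample_t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (x-bar - mu0) / (s / sqrt(n))"""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))

def left_tailed_decision(t_stat, t_crit):
    """t_crit is the negative critical value that starts the critical zone."""
    return "reject H0" if t_stat <= t_crit else "fail to reject H0"

t_stat = one_sample_t_statistic(
    sample_mean=37_200,        # HYPOTHETICAL, not from the report
    hypothesized_mean=38_000,  # from the null hypothesis
    sample_sd=6_000,           # HYPOTHETICAL, not from the report
    n=50,
)
decision = left_tailed_decision(t_stat, t_crit=-1.677)  # .05 in one tail, 49 df
print(round(t_stat, 3), decision)
```

With these made-up inputs the test statistic sits to the right of the critical value, so it lands in the safe zone, which matches the kind of result described above.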
Problem 2:
For the last problem, we are given this claim with a .01 significance level: “80% of the students
graduating with a college degree will find a starting package valued over $40,000”. Again, alpha is the
significance level. This time our claim fits right in line with a null hypothesis of p = .80. Our alternate
hypothesis is the exact opposite, p ≠ .80, which makes this a two-tailed test. First, we get our sample proportions for success and failure.
Our sample size is all the students. We now solve for our test statistic. We find this by
subtracting the population proportion from the sample proportion, then dividing by the square root of the
population proportion of success times the population proportion of failure divided by the sample size.
For this next step we will use the P-value method. The P-value represents the probability of
observing a test statistic equal to or more ‘extreme’ than what was observed, assuming the null
hypothesis is true. For our problem, the test statistic is a z-score, which can be used to find the P-value.
If the P-value is less than or equal to the significance level, we reject the null hypothesis. If it’s more
than the significance level, we fail to reject. Since our P-value is more than the significance level, we
fail to reject the null hypothesis, and we formally write this statement:
“There is not sufficient evidence to warrant rejection of the claim that 80% of the students
graduating with a college degree will find a starting package valued over $40,000”.
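The P-value method can be sketched like this, using the counts reported in the conditions section of this report (269 of the 350 students making over $40,000). This is a minimal sketch with a made-up function name. It assumes a two-tailed test, since the alternate hypothesis is the opposite of p = .80, and it uses Python's built-in `statistics.NormalDist` in place of the normal table.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Return (z, two-tailed P-value) for testing H0: p = p0."""
    p_hat = successes / n
    # The standard error uses the hypothesized proportion p0, not p_hat.
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    # Two-tailed (assumed): double the area beyond |z| in one tail.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

z, p_value = proportion_z_test(successes=269, n=350, p0=0.80)
alpha = 0.01
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(round(z, 2), round(p_value, 4), decision)
```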
Conditions for Confidence Intervals and Hypothesis Tests
Conditions for the Confidence Intervals for Proportions
1. The sample is a simple random sample.
2. The conditions for the binomial distribution are satisfied: a fixed number of trials and only two possible categories of outcomes.
3. There are at least 5 successes and at least 5 failures.
Conditions for the Confidence Intervals for the Mean
1. The sample is a simple random sample.
2. The population is normally distributed or n > 30.
Conditions for the Confidence Intervals for Standard Deviations
1. The sample is a simple random sample.
2. The population must have normally distributed values.
Proportions: Our third problem satisfies all three conditions. The sample is random. The binomial distribution is met, as you make over $50,000 or you don’t. Finally, there are at least 5 successes and at least 5 failures, with 149 making over $50,000 and 201 not making over $50,000.
Proportions: Our fifth problem (Problem 2 of the hypothesis testing section) satisfies all three conditions. The sample is random. The binomial distribution is met, as you make over $40,000 or you don’t. Finally, there are at least 5 successes and at least 5 failures, with 269 making over $40,000 and 81 not making over $40,000.
Mean: Our first and fourth problems (the fourth being Problem 1 of the hypothesis testing section) satisfy both conditions. The samples are random, and either the distribution is normal or the sample size is greater than 30. Both of them had a sample size of 50 students.
Standard deviations: Our second problem does have a random sample. However, right now the sample data are not normally distributed. They are skewed to the right. By speculation, if we expanded our sample size substantially, the skew might go away and the data might look normal, which would satisfy the second condition. A larger sample only shows the population’s shape more clearly, though, so until that is checked the second condition is not confirmed for this problem.
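The countable parts of these conditions can be written as simple checks. This is a minimal sketch with made-up function names. It cannot check whether a sample is truly random or whether a population is normal; those still have to be judged from how the data were collected and what the graph looks like.

```python
def proportion_conditions_met(successes, failures, minimum=5):
    """At least `minimum` successes and at least `minimum` failures."""
    return successes >= minimum and failures >= minimum

def mean_conditions_met(n, population_is_normal=False):
    """Population is normal, or the sample size is greater than 30."""
    return population_is_normal or n > 30

third_problem = proportion_conditions_met(149, 201)  # over / not over $50,000
fifth_problem = proportion_conditions_met(269, 81)   # over / not over $40,000
lopsided_case = proportion_conditions_met(20, 1)     # hypothetical: too few failures
mean_problems = mean_conditions_met(n=50)            # first and fourth problems
print(third_problem, fifth_problem, lopsided_case, mean_problems)
```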
There are a few ways that errors could have been made during these problems.
• The sampling method might not have been random. With any other sampling method, there
would probably be different results.
• The other thing that might be uncertain is the distribution behind the standard deviation problem. In
our sample, the graph appears skewed to the right. If we increased the sample size
dramatically, it might eventually look like a normally distributed graph. But this is all
speculation regardless, because we have no idea what the population actually looks like.
• The sample size might not be large enough. If we decided to base our results off only 5 people, the
result would be off because a sample that small doesn’t reflect the statistics for the whole
population. On that same note, if we had 20 successes and 1 failure for a binomial
distribution, the result would be far from accurate because the sample is almost entirely
successes. Without at least 5 failures, the condition for proportions is not met, so we could not get an accurate reading
and base the population on this statistic.
I conclude that we will have better statistical results if we plan on finding the confidence intervals
and hypothesis tests for the proportions and means of the population, rather than the standard deviation. There would be fewer ways to
make errors this way. The one thing we need to be careful of, regardless of which method we choose, is
the sampling method. It needs to be solidly random; otherwise, lots of potentially big errors will be
made.