Construction Engineering 221 Probability and Statistics

Problem 10
• Solution to problem 10 on page 82
– First, recognize that it is a binomial (counting) problem
– Second, recognize that the binomial calculation will be problematic because n is large (n = 100)
– Third, use the normal probability distribution as an approximation of the binomial distribution
Problem 10
• Normal probability approximation:
– µ=nπ, or µ = 100*.3 = 30
– sd = √(nπ(1-π)), or √(100*.3*.7) = 4.58
– For 40 hits: z = (40-30)/4.58 = 2.18
– A(z) at z = 2.18 is .48537 (table area between the mean and z)
– Probability of 40 or more hits is 1-(.5 + .48537)
– P(40 or more) = .0146, or 1.46%
– I believe the book’s answer is incorrect
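The arithmetic above can be checked with a short sketch. It assumes the question asks for the upper tail (40 or more hits), which is what 1-(.5 + .48537) computes, and it adds an exact binomial tail for comparison; that comparison is not part of the original solution. The names `normal_cdf`, `tail_normal`, and `tail_exact` are illustrative only.

```python
import math

n, p = 100, 0.3   # number of trials and success probability from the slide
k = 40            # number of hits of interest

mu = n * p                        # binomial mean, n*pi
sd = math.sqrt(n * p * (1 - p))   # binomial standard deviation, sqrt(n*pi*(1-pi))

def normal_cdf(z):
    """Standard normal cumulative probability, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = (k - mu) / sd
tail_normal = 1.0 - normal_cdf(z)   # area above z, no continuity correction

# Exact binomial tail P(X >= k), added here only as a cross-check (assumption:
# the problem asks for "k or more" hits).
tail_exact = sum(
    math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
)

print(f"mu = {mu:.0f}, sd = {sd:.2f}, z = {z:.2f}")
print(f"normal approximation of the upper tail: {tail_normal:.4f}")
print(f"exact binomial upper tail:              {tail_exact:.4f}")
```

Any gap between the two printed tails is one way to judge how well the normal approximation works for this n and π.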
Linear Regression
• Sometimes we need to make predictions about
the likelihood of an event (flood, traffic accident,
inflation, disease, etc.)
• We can use statistics to sort variance into recognizable patterns that help us interpret what is “random” variance and what is “sample” variance.
• Random variance is distributed throughout the population at random. Sample variance is created by membership in a sample (for example, people who smoke and get lung cancer).
Linear Regression
• The correlation coefficient for sample variance ranges from -1 to +1. If a high score on one variable is correlated (occurs frequently within the sample) with a low score on the other, then the correlation coefficient is negative. If a high score occurs frequently with a high score, the data are positively correlated.
Linear Regression
• What type of correlation would you expect
between:
– IQ and salary?
– GPA and hours studying?
– GPA and hours drinking/partying?
– Price of tea in China and number of wins in a
season by the Chicago Cubs?
– Socio-economic standing and crime rate?
Linear Regression
• Correlation coefficient r = Σ(x-x̄)(y-ȳ) / √([Σ(x-x̄)²] × [Σ(y-ȳ)²])
• Alternate formula: eq. 9-2 on page 109
• Assumptions:
– relationship is linear
– both variables are random
– conditional variances are equal
– variables are bivariate normal
Linear regression
• Example of correlation

Height   Weight
65       185
67       200
69       215
62       140
71       220
77       250
75       245
79       235
70       220
Linear Regression
           Column 1   Column 2
Column 1   1
Column 2   0.909022   1
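As a check on the 0.909022 entry, here is a minimal sketch that applies the deviation-from-the-mean definition of r to the nine height and weight pairs in the example. The function name `pearson_r` is illustrative, and the units (inches, pounds) are assumed.

```python
import math

# Heights and weights from the example table (units assumed: inches, pounds)
heights = [65, 67, 69, 62, 71, 77, 75, 79, 70]
weights = [185, 200, 215, 140, 220, 250, 245, 235, 220]

def pearson_r(xs, ys):
    """Correlation coefficient from sums of deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(heights, weights)
print(f"r = {r:.6f}")
```

The printed value should agree with the off-diagonal entry of the correlation matrix to the digits shown there.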
Linear Regression
• Can be done with Excel spreadsheets
• Linear regression is a special form of correlation that attempts to find the regression line: the line through the correlated data that best fits the data. The regression line can then be used to predict outcomes.
• Regression has the formula y = bx + a, where
– y is the dependent variable, x is the independent variable, b is the regression coefficient (slope), and a is a constant (intercept)
Linear Regression
• When one predictor (independent) variable is used, it is called simple regression; when more than one predictor is used, it is called multiple regression.
• Restatement of the regression formula in common terms:
– Expected value of the variable to be predicted = intercept + (slope × value of predictor variable), where the slope is the regression coefficient.
Linear Regression
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.909022
R Square            0.826322
Adjusted R Square   0.801511
Standard Error      15.15392
Observations        9
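The regression statistics can be reproduced from the nine data points. The sketch below assumes the standard definitions for a simple (one-predictor) regression: R Square = 1 - SS_residual/SS_total, Adjusted R Square = 1 - (1 - R²)(n - 1)/(n - 2), and Standard Error = √(SS_residual/(n - 2)). The variable names are illustrative.

```python
import math

# Heights and weights from the example table (units assumed: inches, pounds)
heights = [65, 67, 69, 62, 71, 77, 75, 79, 70]
weights = [185, 200, 215, 140, 220, 250, 245, 235, 220]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# Least-squares slope and intercept for weight = intercept + slope * height
sxx = sum((x - mean_x) ** 2 for x in heights)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Sums of squares about the mean (total) and about the fitted line (residual)
ss_total = sum((y - mean_y) ** 2 for y in weights)
ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(heights, weights))

r_squared = 1 - ss_resid / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)   # one predictor
std_error = math.sqrt(ss_resid / (n - 2))                  # residual standard error

print(f"Multiple R        {math.sqrt(r_squared):.6f}")
print(f"R Square          {r_squared:.6f}")
print(f"Adjusted R Square {adj_r_squared:.6f}")
print(f"Standard Error    {std_error:.5f}")
print(f"Observations      {n}")
```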
Linear Regression
ANOVA
             df   SS         MS         F          Significance F
Regression   1    7648.067   7648.067   33.30441   0.000684
Residual     7    1607.489   229.6413
Total        8    9255.556

             Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept    -176.3         67.5124          -2.61137   0.034844   -335.941    -16.6582    -335.941      -16.6582
X Variable   5.506608       0.954187         5.770997   0.000684   3.250317    7.762899    3.250317      7.762899

Formula is: weight = -176.3 + 5.51 × height (height in inches, weight in pounds)
So if a new person joined the team and all we knew was that he was 6’-10” (82 inches), we would guess his weight at w = -176.3 + (82)(5.51), or approximately 275 pounds.
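The fit and the prediction can be sketched in a few lines. This assumes ordinary least squares (slope = Σ(x-x̄)(y-ȳ)/Σ(x-x̄)², intercept = ȳ - slope × x̄), which reproduces the spreadsheet coefficients for this data; the function name `fit_line` is illustrative.

```python
# Heights and weights from the example table (units assumed: inches, pounds)
heights = [65, 67, 69, 62, 71, 77, 75, 79, 70]
weights = [185, 200, 215, 140, 220, 250, 245, 235, 220]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(heights, weights)

# A 6'-10" player: convert feet and inches to inches before predicting
new_height = 6 * 12 + 10
predicted_weight = intercept + slope * new_height

print(f"weight = {intercept:.1f} + {slope:.2f} * height")
print(f"predicted weight at {new_height} in: {predicted_weight:.0f} lb")
```

Note that 82 inches lies above the tallest player in the sample (79 inches), so this prediction extrapolates slightly beyond the observed data.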