26133: BUSINESS STATISTICS
EXAM NOTES
1 TYPES OF DATA
1.1 Data Quality (Nominal, Ordinal, Interval, Ratio)
1.2 Method of data collection
1.3 Types of graphs
2 DESCRIPTIVE STATISTICS, NUMERICAL MEASURES
2.1 Numerical Data Summaries
2.2 Finding Outliers
3 PROBABILITY [1]
3.1 Miscellaneous Laws
4 DEPENDENCE (CHI2 TEST)
5 PROBABILITY [2]: DISCRETE PROBABILITY DISTRIBUTIONS
5.1 Binomial Distribution
5.2 Poisson Distributions
6 PROBABILITY [3]: CONTINUOUS DISTRIBUTIONS
6.1 Uniform Distribution
6.2 Normal Distribution
6.3 Exponential Distributions
7 SAMPLING AND SAMPLING DISTRIBUTIONS
7.1 Can the sample be assumed to be normal?
7.2 Standard error of a sample mean
7.3 Finite correction factor
8 INTERVAL ESTIMATION
8.1 Estimating the population mean with a large N, using “z”
8.2 Estimating the population mean, using “t-statistic” (σ unknown)
8.3 Estimating the population proportion
8.4 Estimating population variance
8.5 Estimating sample size
9 HYPOTHESIS TESTING [1 POPULATION]
9.1 Methodology
9.2 Rejection and non-rejection regions
9.3 Types of questions
10 HYPOTHESIS TESTING [2+ POPULATIONS]
11 REGRESSION [1]
12 REGRESSION [2]
1 TYPES OF DATA
1.1 DATA QUALITY (NOMINAL, ORDINAL, INTERVAL, RATIO)
• Nominal (purely descriptive)
• Ordinal (ordered)
• Interval (equal spacing between values, no true zero)
• Ratio (has a true zero point)
1.2 METHOD OF DATA COLLECTION
• Sampling (small group to represent population)
  o Cheap
• Population (everyone)
  o Thorough
• Time-series (over time)
  o Shows change
• Cross-sectional (once/a snapshot)
  o Cheap/where time is irrelevant
1.3 TYPES OF GRAPHS
• Bar chart
  o Sectional comparison/growth
• Line graph
• Ogive
  o Cumulative frequency (percentage less than)
• Pie chart
  o Percentages
• Scatter plot
  o Infer trends
2 DESCRIPTIVE STATISTICS, NUMERICAL MEASURES
2.1 NUMERICAL DATA SUMMARIES
2.1.1 Mode
The most frequently occurring value.
2.1.2 Median
The middle value of the ordered data.
2.1.3 Mean
$$\text{Mean} = \mu = \bar{x} = \frac{\sum \text{observations}}{\text{number of observations}}$$
(μ denotes a population mean, x̄ a sample mean.)
Calculator keystrokes:
1. SD Mode: [MODE], [MODE], [1].
2. Stat clear: [SHIFT], [MODE], [1].
3. Enter values: [OBSERVATION], [SHIFT], [,], [NUMBER OF OBSERVATIONS], [M+] (repeat for each observation value).
4. Calculate: [SHIFT], [2] (S-VAR), [1] (x̄), [=].
2.1.4 Variance
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n}$$
Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$$
Calculator keystrokes:
1. Stat clear: [SHIFT], [MODE], [1].
2. Enter values: [OBSERVATION], [SHIFT], [,], [NUMBER OF OBSERVATIONS], [M+] (repeat for each observation value).
3. Calculate standard deviation: [SHIFT], [2] (S-VAR), [2] (xσn) OR [3] (xσn-1), [=].
4. Square for variance: [x²], [=].
2.1.5 Standard Deviation
$$\text{Standard Deviation} = \sigma = \sqrt{\sigma^2} \ \text{(population)}, \qquad s = \sqrt{s^2} \ \text{(sample)}$$
Calculator keystrokes:
1. Stat clear: [SHIFT], [MODE], [1].
2. Enter values: [OBSERVATION], [SHIFT], [,], [NUMBER OF OBSERVATIONS], [M+] (repeat for each observation value).
3. Calculate standard deviation: [SHIFT], [2] (S-VAR), [2] (xσn) OR [3] (xσn-1), [=].
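The summaries in 2.1.1 to 2.1.5 can also be checked off the calculator. The sketch below uses Python's standard `statistics` module; the data set is hypothetical and chosen only so the answers come out as round numbers.

```python
import statistics

# Hypothetical observations, not taken from the notes.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mode = statistics.mode(data)          # most frequently occurring value
median = statistics.median(data)      # middle value of the ordered data
mean = statistics.mean(data)          # sum of observations / number of observations

pop_var = statistics.pvariance(data)  # sigma^2: divides by n
samp_var = statistics.variance(data)  # s^2: divides by n - 1
pop_sd = statistics.pstdev(data)      # sigma = sqrt(sigma^2)
samp_sd = statistics.stdev(data)      # s = sqrt(s^2)

print(mode, median, mean, pop_var, samp_var, pop_sd, samp_sd)
```

The `p`-prefixed functions correspond to the xσn key and the unprefixed ones to xσn-1.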
2.1.6 Coefficient of Variation
Measure of relative spread (the standard deviation as a percentage of the mean); most useful where the data set is positive.
$$\text{Coefficient of Variation} = \frac{s}{\bar{x}}(100) \ \text{(sample)} \quad \text{or} \quad \frac{\sigma}{\mu}(100) \ \text{(population)}$$
2.2 FINDING OUTLIERS
2.2.1 Z-Score
The z-score describes the distance of a value from the mean in terms of standard deviations.
$$z_i = \frac{x_i - \bar{x}}{s}$$
• For outliers, $|z_i| > 3$
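As a rough illustration of the z-score rule, the sketch below flags any value more than 3 sample standard deviations from the mean. The data and the helper name `z_scores` are hypothetical.

```python
import statistics

def z_scores(values):
    """Return (x_i - x_bar) / s for every value, using the sample standard deviation."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x - x_bar) / s for x in values]

# Hypothetical data: twenty ordinary readings plus one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11] * 2 + [40]

outliers = [x for x, z in zip(data, z_scores(data)) if abs(z) > 3]
print(outliers)
```

With very small samples no value can reach a z-score of 3, so this rule only becomes useful once there are more than about ten observations.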
2.2.2 Box and whisker plot
Use for irregular/asymmetrical data.
Describes the data set in terms of 5 points: min, q1, median, q3, max, with $IQR = q_3 - q_1$.
• min (lower whisker limit) = $q_1 - 1.5(IQR)$
• $q_1$ = split again (median of the lower half of the data)
• median = central data point
• $q_3$ = split again (median of the upper half of the data)
• max (upper whisker limit) = $q_3 + 1.5(IQR)$
Values beyond the whisker limits are treated as outliers.
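The whisker limits can be sketched in Python as follows. Quartile conventions differ between textbooks and software; this example assumes the "inclusive" interpolation method of `statistics.quantiles`, which may not match the split-the-halves method exactly. The data and the function name `whisker_limits` are hypothetical.

```python
import statistics

def whisker_limits(values):
    """Return (lower, upper) limits: q1 - 1.5*IQR and q3 + 1.5*IQR."""
    # Assumption: quartiles via the 'inclusive' method; other conventions give slightly different q1/q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]

lower, upper = whisker_limits(data)
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)
```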
3 PROBABILITY [1]
3.1 MISCELLANEOUS LAWS
• Sum of probabilities = 1 = 100%
• Complement: $p' = 1 - p$
Probability table:
          A           A'          Total
B         P(A∩B)      P(A'∩B)     P(B)
B'        P(A∩B')     P(A'∩B')    P(B')
Total     P(A)        P(A')       1
3.1.1 Intersection
Both occur: $P(A \cap B)$
3.1.2 Union
Either A or B or both occurring: $P(A \cup B)$
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
3.1.3 Conditional Probability
Probability of A occurring given that B has already occurred:
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
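A short numeric check of the complement, union and conditional probability laws is sketched below. The three input probabilities are hypothetical values chosen for illustration.

```python
# Hypothetical inputs: P(A), P(B) and P(A and B).
p_a = 0.35
p_b = 0.78
p_a_and_b = 0.21

p_not_a = 1 - p_a                 # complement: p' = 1 - p
p_a_or_b = p_a + p_b - p_a_and_b  # union: P(A) + P(B) - P(A and B)
p_a_given_b = p_a_and_b / p_b     # conditional: P(A and B) / P(B)

print(round(p_not_a, 4), round(p_a_or_b, 4), round(p_a_given_b, 4))
```

Subtracting the intersection in the union formula avoids counting the overlap between A and B twice.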
4 DEPENDENCE (CHI2 TEST)
4.1.1 Observed Data
Insert observed data into a probability table (W' means "not W"):
Observed data    W        W'        Total
Retail           420      280       700     (sum rp)
Sale             1140     160       1300    (sum rp')
Total            1560     440       2000    (TT)
                 (sum w)  (sum w')
4.1.2 Probability from observations
$$\text{Probability from observation} = \frac{\text{Observation}}{\sum\sum}$$
where ΣΣ is the grand total of all observations (TT).
Probability    P(W)     P(W')    Total
P(Retail)      0.21     0.14     0.35    P(RP)
P(Sale)        0.57     0.08     0.65    P(RP')
Total          0.78     0.22     1       (TT)
4.1.3 Predicted results if events are independent
$$\text{Predicted result if events are independent} = \frac{\text{Column total} \times \text{Row total}}{\sum\sum}$$
For example, Retail and W: 1560 × 700 / 2000 = 546.
Events as independent    W        W'        Total
Retail                   546      154       700     (sum rp)
Sale                     1014     286       1300    (sum rp')
Total                    1560     440       2000    (TT)
                         (sum w)  (sum w')
4.1.4 Chi2 Test
1. Create table: for each cell,
$$\chi^2_{\text{cell}} = \frac{(\text{observed result} - \text{predicted result if independent})^2}{\text{predicted result if independent}}$$
2. Total all cells: total = Chi2 value.
3. Compare the Chi2 value with the Chi2 critical value [found by entering the degrees of freedom, (number of rows − 1)(number of columns − 1), and the alpha value, (1 − certainty required), into the chi2 tables]. If chi2 > chi2 critical value, then the variables are dependent.
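The whole chi2 procedure can be sketched in a few lines of Python using the observed Retail/Sale table from 4.1.1. The critical value 3.841 is the standard table entry for 1 degree of freedom at alpha = 0.05; choosing a 95% certainty level is an assumption made for this example.

```python
# Observed counts from 4.1.1: rows = Retail, Sale; columns = W, W'.
observed = [[420, 280],
            [1140, 160]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Predicted count if independent = column total * row total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi2 value = sum over all cells of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical = 3.841  # chi2 table value for df = 1, alpha = 0.05 (assumed certainty of 95%)
dependent = chi2 > critical

print(expected, round(chi2, 3), df, dependent)
```

The predicted counts reproduce the table in 4.1.3, and the chi2 value is far above the critical value, so the two variables would be judged dependent.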
5 PROBABILITY [2]: DISCRETE PROBABILITY DISTRIBUTIONS
Countable number of possible outcomes.
1. Determine the type of distribution
   a. Binomial Distribution
   b. Poisson Distribution
2. What is the question?
   a. Probability of x? Probability of more or less than?
   b. DRAW
3. Get the formula
4. Apply the terms
5.1 BINOMIAL DISTRIBUTION
$$P(x) = \binom{n}{x} p^x q^{n-x} = \frac{n!}{x!(n-x)!}\, p^x q^{n-x}$$
x = number of successes required
n = number of trials
p = probability of success
q = 1 − p = probability of failure
$$f(x = a) = {}_{n}C_{a} \cdot p^a \cdot q^{n-a}$$
f(x) = probability of x successes in n trials
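A minimal sketch of the binomial formula follows, with a cumulative helper for "at most x" questions. The function names and the coin-flip numbers are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n trials), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

def binomial_cdf(x, n, p):
    """P(at most x successes): sum of the pmf from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

# Hypothetical question: exactly 3 heads in 10 fair coin flips.
exactly_three = binomial_pmf(3, 10, 0.5)
# "More than 3" is the complement of "at most 3".
more_than_three = 1 - binomial_cdf(3, 10, 0.5)

print(exactly_three, more_than_three)
```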
5.2 POISSON DISTRIBUTIONS
$$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$$
λ = mean of Poisson distribution
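The Poisson formula translates directly into code. In this sketch the function name and the example rate (a mean of 3 events per interval) are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(exactly x events) when the mean number of events is lam."""
    return lam ** x * exp(-lam) / factorial(x)

# Hypothetical question: mean of 3 arrivals per hour, probability of exactly 2.
p_two = poisson_pmf(2, 3)
# "At most 2" adds up the probabilities for x = 0, 1, 2.
p_at_most_two = sum(poisson_pmf(k, 3) for k in range(3))

print(round(p_two, 4), round(p_at_most_two, 4))
```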
6 PROBABILITY [3]: CONTINUOUS DISTRIBUTIONS
Working strictly with probabilities (percentages etc.), found as areas under the distribution curve.
6.1 UNIFORM DISTRIBUTION
This one looks like a rectangle; you merely need to find the area.
6.2 NORMAL DISTRIBUTION
6.2.1 Probability density function of the normal distribution
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left[\frac{x-\mu}{\sigma}\right]^2}$$
6.2.2 Standardization (z-scores)
$$z = \frac{x - \mu}{\sigma}$$
Then look the z-score up in the z distribution table (the table gives a one-sided area).
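Instead of a printed z table, the standard normal area can be read from `statistics.NormalDist` in the Python standard library. The mean, standard deviation and x value below are hypothetical.

```python
from statistics import NormalDist

# Hypothetical population: mean 100, standard deviation 15.
mu, sigma = 100, 15
x = 130

z = (x - mu) / sigma            # standardize: z = (x - mu) / sigma
p_below = NormalDist().cdf(z)   # area to the left of z under the standard normal curve
p_above = 1 - p_below           # right-tail area

print(z, round(p_below, 4), round(p_above, 4))
```

Note that `cdf` returns the area from the far left up to z; printed tables that give the area between 0 and z will differ from this by 0.5.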
6.3 EXPONENTIAL DISTRIBUTIONS
6.3.1 Probability density function of the exponential distribution
$$f(x) = \lambda e^{-\lambda x}$$
x must be at least zero and λ must be greater than zero.
6.3.2 Probability of the right tail of the exponential distribution
$$P(x \ge x_0) = e^{-\lambda x_0}$$
x0 must be at least 0.
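The two exponential formulas can be sketched as below; the function names and the rate of 0.5 per unit of time are hypothetical.

```python
from math import exp

def exponential_pdf(x, lam):
    """Density f(x) = lam * e^(-lam * x), for x >= 0 and lam > 0."""
    return lam * exp(-lam * x)

def exponential_right_tail(x0, lam):
    """P(x >= x0) = e^(-lam * x0)."""
    return exp(-lam * x0)

# Hypothetical question: rate 0.5 per minute, probability of waiting at least 2 minutes.
p_wait = exponential_right_tail(2, 0.5)
# The left tail is the complement of the right tail.
p_less = 1 - p_wait

print(round(p_wait, 4), round(p_less, 4))
```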
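7 SAMPLING AND SAMPLING DISTRIBUTIONS
7.1 CAN THE SAMPLE BE ASSUMED TO BE NORMAL?
The sampling distribution of the sample mean can be treated as normal:
If: sample size n > 30, yes
If: population is normal, yes
7.2 STANDARD ERROR OF A SAMPLE MEAN
For infinite population
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
For finite population
$$\sigma_{\bar{x}} = \left(\frac{\sigma}{\sqrt{n}}\right)\left(\sqrt{\frac{N-n}{N-1}}\right)$$
N = observations in population
n = observations in sample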
7.3 FINITE CORRECTION FACTOR
This is necessary when
$$\frac{n}{N} > 0.05$$
For proportions
$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}\sqrt{\frac{N-n}{N-1}}$$
$$\hat{p} = \text{proportion} = \frac{x}{n}$$
x = number of items in sample with the requisite characteristic
For quantitative data
$$\sigma_{\bar{x}} = \left(\frac{\sigma}{\sqrt{n}}\right)\left(\sqrt{\frac{N-n}{N-1}}\right)$$
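The standard error formulas, with the finite correction factor applied only when n/N > 0.05, can be sketched as follows. The function names and the example numbers are hypothetical.

```python
from math import sqrt

def finite_correction(n, N):
    """Finite correction factor sqrt((N - n) / (N - 1)); 1.0 when it is not needed."""
    if N is None or n / N <= 0.05:
        return 1.0
    return sqrt((N - n) / (N - 1))

def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean; N=None means an infinite population."""
    return sigma / sqrt(n) * finite_correction(n, N)

def standard_error_proportion(p, n, N=None):
    """Standard error of the sample proportion, with q = 1 - p."""
    return sqrt(p * (1 - p) / n) * finite_correction(n, N)

# Hypothetical example: sigma = 20, sample of 100.
se_infinite = standard_error_mean(20, 100)          # no correction
se_finite = standard_error_mean(20, 100, N=1000)    # n/N = 0.10, correction applied
se_large_pop = standard_error_mean(20, 100, N=10000)  # n/N = 0.01, correction skipped

print(se_infinite, round(se_finite, 4), se_large_pop)
```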
8 INTERVAL ESTIMATION
8.1 ESTIMATING THE POPULATION MEAN WITH A LARGE N, USING “Z”
8.1.1 Basic form
$$\text{point estimate} \pm \text{critical value} \times \text{standard error}$$
If
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
and sample mean can be greater or less than the population mean, the confidence interval is:
$$\mu = \bar{x} \pm z\frac{\sigma}{\sqrt{n}}$$
8.1.2 Estimating μ
$$\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
$z_{\alpha/2}$ = z-score of the one-sided area outside of the confidence interval
Or
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Usually, $z_{\alpha/2}$ is for a confidence of 95% ($z_{\alpha/2} = 1.96$); see 8.1.3.
8.1.3 Finding $z_{\alpha/2}$
1. Draw
2. Look up the tail area α/2 in the z-tables to find $z_{\alpha/2}$
8.1.4 Add a finite correction factor
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$
8.1.5 If n is small (<30), then you can only use the above formulae if the population is normal
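A sketch of the z confidence interval, with an optional finite correction factor, is given below. It uses `statistics.NormalDist().inv_cdf` in place of the z table; the function name and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95, N=None):
    """Return (lower, upper) for mu: x_bar +/- z_(alpha/2) * standard error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z with area alpha/2 in the upper tail
    se = sigma / sqrt(n)
    if N is not None:                         # finite population: apply the correction factor
        se *= sqrt((N - n) / (N - 1))
    margin = z * se
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, known sigma 10, n = 100.
low, high = z_confidence_interval(50, 10, 100)
low_fpc, high_fpc = z_confidence_interval(50, 10, 100, N=500)

print(round(low, 2), round(high, 2), round(low_fpc, 2), round(high_fpc, 2))
```

The corrected interval is narrower because sampling a large share of a finite population leaves less uncertainty about its mean.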
8.2 ESTIMATING THE POPULATION MEAN, USING “T-STATISTIC” (σ UNKNOWN)
8.2.1 T distribution
A distribution that describes the standardized sample mean when σ is unknown and the population is normal.
8.2.2 T value
Tool used to reach conclusions about the null hypothesis.
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
8.2.3 T distribution table
To read the table we need the degrees of freedom and the tail area:
$$\text{Degrees of freedom} = n - 1$$
$$\text{Tail area} = \alpha/2$$
8.2.4 Confidence intervals to estimate the population mean using the t-stat
$$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$
8.3 ESTIMATING THE POPULATION PROPORTION
$$z = \frac{\hat{p} - p}{\sqrt{\frac{\hat{p}\hat{q}}{n}}}$$
$\hat{p}$ = sample proportion
$\hat{q}$ = 1 − $\hat{p}$
p = population proportion
n = sample size
8.3.1 Confidence interval to estimate p
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
8.4 ESTIMATING POPULATION VARIANCE
$$s^2 = \frac{\sum(x - \bar{x})^2}{n-1}$$
8.4.1 Chi2 formula for variance
NB: Distribution must be normal to use this formula
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
$$df = (n-1)$$
8.4.2 Confidence interval to estimate the population variance
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$
$$df = (n-1)$$
Work χ² out using $\chi^2_{\alpha/2,\,df}$ and $\chi^2_{1-\alpha/2,\,df}$ and the χ² tables.
8.5 ESTIMATING SAMPLE SIZE
This is used to find the minimum sample size needed to meet a particular confidence level within a certain amount of error.
8.5.1 Sample size when estimating µ
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{E^2} = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
$$E = (\bar{x} - \mu) = \text{Error of Estimation}$$
You either need to work out E, or it can be given, e.g. “to be within .03 of the true population proportion”.
Always round up, since you can’t have half-people.
8.5.2 Sample size when estimating p
$$n = \frac{z^2 pq}{E^2}$$
Work out the z-stat from the confidence level and the tables.
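Both sample size formulas are sketched below, rounding up as the notes require. The function names are hypothetical, and defaulting p to 0.5 when no estimate is available is an assumption (it gives the largest, most cautious sample size).

```python
from math import ceil
from statistics import NormalDist

def z_critical(confidence):
    """z_(alpha/2) for a two-sided confidence level, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_mean(sigma, error, confidence=0.95):
    """n = (z * sigma / E)^2, rounded up."""
    return ceil((z_critical(confidence) * sigma / error) ** 2)

def sample_size_proportion(error, p=0.5, confidence=0.95):
    """n = z^2 * p * q / E^2, rounded up. p = 0.5 is an assumed worst case."""
    return ceil(z_critical(confidence) ** 2 * p * (1 - p) / error ** 2)

# "To be within .03 of the true population proportion" at 95% confidence.
n_prop = sample_size_proportion(0.03)
# Hypothetical mean example: sigma = 15, to be within 2 units of mu.
n_mean = sample_size_mean(15, 2)

print(n_prop, n_mean)
```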
9 HYPOTHESIS TESTING [1 POPULATION]
9.1 METHODOLOGY
1. Specify the thing of interest
2. Formulate H0 and Ha
   a. Draw
3. Define the level of significance
   a. 1 sided or 2 sided test?
      i. 1 sided for greater or less
      ii. 2 sided for equals
4. Test
   a. Determine the appropriate statistical test
   b. Establish the decision rule
   c. Gather sample data
   d. Analyze the data
5. Conclude/business application
9.2 REJECTION AND NON-REJECTION REGIONS
Via critical values (inside is non-rejection, outside is rejection region)
9.3 USING Z-STAT
9.3.1 Testing hypothesis about a population mean using the z-stat
Z test for a single mean
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
To find the critical z, subtract the tail area from 0.5 or 1 (depending on the table), find that area in the body of the z table, then read off the row/column (i.e. the reverse of finding the area for a z score).
9.3.1.1 EXAMPLE QUESTION
CPAs’ average net income for a sole proprietor is $74914 [statistic from 10 years ago]
Test again, n=112, σ=$14530
STEP 1: HYPOTHESISE
H0: µ=$74914
Ha: µ≠$74914
STEP 2: WHICH TEST TO USE?
Sample size is large (n>30), sample mean as stat, therefore z-stat.
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
STEP 3: WHAT ARE THE CRITICAL VALUES?
Accuracy required: 95%, therefore α=.05
This test involves an = sign, not a ≤ or ≥ sign, so it is a two tailed test
α/2=.05/2=.025
Each side therefore has a .475 non-rejection area and a .025 rejection area.
Look up the .025 tail area in the z table to find zα/2 → +/- 1.96
STEP 4: FIND TEST STATISTIC
Sample mean = $78695, n = 112, µ = $74914, σ=$14530
$$z = \frac{78695 - 74914}{14530/\sqrt{112}} = 2.75$$
STEP 5: COMPARE TO CRITICAL VALUES
Non-rejection range = +/- 1.96; 2.75 is not in this range, so reject the null hypothesis
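The example can be reproduced in Python as a check on the arithmetic. The figures are those given in the question; the p-value line is an addition not used in the notes, shown only as an alternative way to reach the same decision.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 74914    # hypothesised population mean (H0)
x_bar = 78695   # sample mean
sigma = 14530   # population standard deviation
n = 112         # sample size
alpha = 0.05    # two-tailed test at 95%

z = (x_bar - mu_0) / (sigma / sqrt(n))       # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96
reject_h0 = abs(z) > z_crit                   # outside +/- z_crit is the rejection region

# Extra check (not in the notes): two-tailed p-value for the observed z.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(z, 2), round(z_crit, 2), reject_h0, round(p_value, 4))
```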
9.3.2 Testing the mean with a finite population
$$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}}$$
9.4 USING T-STAT
9.4.1 T-test for µ
(see p320)
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
$$df = n - 1$$
9.5 HYPOTHESIS ABOUT A PROPORTION
$$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$$
9.6 HYPOTHESIS ABOUT A VARIANCE
(see p331)
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
$$df = n - 1$$
9.7 TYPE 2 ERRORS
Failing to reject the null hypothesis when it is false
See p 334
10 HYPOTHESIS TESTING [2+ POPULATIONS]
p399
10.1 Z FORMULA FOR THE DIFFERENCE IN TWO SAMPLE MEANS AND POPULATION VARIANCES
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
Under H0, $\mu_1 - \mu_2 = 0$
10.1.1 Confidence intervals in estimate of $\mu_1 - \mu_2$
(see p360)
10.2 T STAT FOR THE DIFFERENCE IN TWO SAMPLE MEANS (VARIANCES UNKNOWN)
(see p365)
10.2.1 Confidence intervals in estimate of $\mu_1 - \mu_2$
(see p369)
10.3 STATISTICAL INFERENCES FOR RELATED POPULATIONS
(see p373)
10.4 STATISTICAL INFERENCES FOR TWO POPULATION PROPORTIONS
(see p383)
10.5 STATISTICAL INFERENCES FOR TWO POPULATION VARIANCES
(see p390)
Ratio of two sample variances gives F value
11 ANOVA
12 REGRESSION [1]
12.1 SINGLE REGRESSION
$$y = (\text{intercept}) + c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$$
If the regression output “p-value” for a coefficient is smaller than .05, reject the null hypothesis (that the coefficient is 0) and use the variable in the formula
R^2 shows “goodness” of model (0=bad, 1=good)
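For a single predictor, the intercept, coefficient and R^2 can be computed by hand with the usual least-squares formulas. These formulas are standard but are not spelled out in the notes, and the data and function names below are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """R^2 = 1 - (sum of squared residuals) / (total sum of squares)."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data set.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = fit_line(xs, ys)
r2 = r_squared(xs, ys, intercept, slope)
print(round(intercept, 2), round(slope, 2), round(r2, 2))
```

Here the model explains 60% of the variation in y. The coefficient p-values mentioned above come from regression software output and are not computed in this sketch.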
12.2 MULTIPLE REGRESSION
In multiple regression R^2 is inaccurate (it rises whenever a variable is added), so we have to use adjusted R^2
12.3 PROBLEMS
Multicollinearity (predictor variables overlap, i.e. are correlated with each other)
13 REGRESSION [2] MORE PROBLEMS
Residual is the difference between actual and predicted results
13.1 F-TEST
H0: all of the coefficients = 0
If F-stat > critical F, reject H0
If significance F < alpha, reject H0
Testing each coefficient: change one at a time to 0, see if there is a change