Applied Business Statistics 4th Edition

TREVOR WEGNER
www.jutaacademic.co.za
4th Edition
SOLUTIONS MANUAL
TREVOR WEGNER
Applied Business Statistics: Methods and Excel-based Applications: Solutions Manual
Print edition first published in 1993
Reprinted 2000 and 2003
Second Edition 2008
Third Edition 2012
Fourth edition 2015 (Web PDF)
Juta and Company (Pty) Ltd
First Floor
Sunclare Building
21 Dreyer Street
Claremont
7708
PO Box 14373, Lansdowne 7779, Cape Town, South Africa
© 2015 Juta & Company (Pty) Ltd
ISBN 978 1 48511 788 9 (Web PDF)
All rights reserved. No part of this publication may be reproduced or transmitted in any form or
by any means, electronic or mechanical, including photocopying, recording, or any information
storage or retrieval system, without prior permission in writing from the publisher. Subject to
any applicable licensing terms and conditions in the case of electronically supplied
publications, a person may engage in fair dealing with a copy of this publication for his or her
personal or private use, or his or her research or private study. See section 12(1)(a) of the
Copyright Act 98 of 1978.
The author and the publisher believe on the strength of due diligence exercised that this work
does not contain any material that is the subject of copyright held by another person. In the
alternative, they believe that any protected pre-existing material that may be comprised in it
has been used with appropriate authority or has been used in circumstances that make such
use permissible under the law.
CHAPTER 1
STATISTICS IN MANAGEMENT
1.1
It is a decision support tool. It generates evidence-based information through the analysis of
data to inform management decision making.
1.2
Descriptive statistics summarises (profiles) sample data; inferential statistics generalises
sample findings to a broader population (to estimate values or confirm relationships).
1.3
Statistical modelling explores and quantifies relationships between variables for
estimation or prediction purposes.
1.4
Data quality is influenced by: (i) Data source; (ii) Data collection method; and (iii) Data type
1.5
Different statistical methods are valid for different data types.
1.6
In data preparation, consider (i) Data relevancy; (ii) Data cleaning; and (iii) Data enrichment.
1.7
(a) Random variable: Performance appraisal system used
(b) Population:
All JSE companies
(c) Sample:
The 68 HR managers surveyed
(d) Sampling unit:
a JSE-listed company
(e) 46% is a sample statistic
(f) Random sampling is necessary to allow valid inferences to be drawn based on the sample
evidence.
1.8
(a) Random variable: Female magazine readership
(b) Population:
All female magazine readers
(c) Sample:
The 2000 randomly selected female magazine readers
(d) Sampling unit:
a female reader of a female magazine
(e) 35% (700/2000) is a sample statistic
(f) Inferential statistics – as its purpose is to test the belief that market share = 38%
1.9
(a) Three (3) random variables. They are:
(i) weekly sales volume; (ii) number of ads placed per week; (iii) advertising media used.
(b) Dependent variable = weekly sales volume
(c) Independent variables = number of ads placed per week; advertising media used.
(d) Statistical model building (predict sales volume from ads placed and media used)
1.10
Scenario 1   Inferential statistics
Scenario 2   Descriptive statistics
Scenario 3   Descriptive statistics
Scenario 4   Inferential statistics
Scenario 5   Inferential statistics
Scenario 6   Inferential statistics
Scenario 7   Inferential statistics
1.11
(a) numeric, ratio-scaled, continuous {21,4 years; 34,6 years}
(b) numeric, ratio-scaled, continuous {416,2m²; 3406,8m²}
(c) categorical, ordinal-scaled, discrete {matric; diploma}
(d) categorical, nominal-scaled, discrete {married; single}
(e) categorical, nominal-scaled, discrete {Boeing; Airbus}
(f) categorical, nominal-scaled, discrete {verbal; emotional}
(g) numeric, ratio-scaled, discrete {41; 62}
(h) categorical, ordinal-scaled, discrete {salary only; commission only}
(i) (i) categorical, ordinal-scaled, discrete {1 = apple; 2 = orange}
    (ii) categorical, nominal-scaled, discrete {yes; no}
    (iii) categorical, nominal-scaled, discrete {train; bus}
    (iv) numeric, interval-scaled, discrete {2; 5}
(j) numeric, ratio-scaled, continuous {12,4kg; 7,234kg}
(k) categorical, nominal-scaled, discrete {Nescafe; Jacobs}
(l) numeric, ratio-scaled, continuous {26,4 min; 38,66 min}
(m) categorical, ordinal-scaled, discrete {Super; Standard}
(n) numeric, ratio-scaled, continuous {R85,47; R2315,22}
(o) numeric, ratio-scaled, discrete {75; 23}
(p) numeric, ratio-scaled, discrete {5; 38}
(q) numeric, ratio-scaled, continuous {9,54 hours; 10,12 hours}
(r) numeric, interval-scaled, discrete {2; 6}
(s) numeric, ratio-scaled, discrete {75; 238}
(t) categorical, nominal-scaled, discrete {Growth funds; Industrial funds}
1.12
(a) 11 random variables
(b) Random variable       Data type                                (c) Illustration value
Economic sector           categorical, nominal-scaled, discrete    {retail}
Head office region        categorical, nominal-scaled, discrete    {Gauteng}
Company size              numeric, ratio-scaled, discrete          {242}
Turnover                  numeric, ratio-scaled, continuous        {R3 432 562}
Share price               numeric, ratio-scaled, continuous        {R18.48}
Earnings per share        numeric, ratio-scaled, continuous        {R2.16 / share}
Dividends per share       numeric, ratio-scaled, continuous        {R0.86 / share}
Number of shares          numeric, ratio-scaled, discrete          {12 045 622}
ROI (%)                   numeric, ratio-scaled, continuous        {8.64%}
Inflation index (%)       numeric, ratio-scaled, continuous        {6.75%}
Year established          numeric, ratio-scaled, discrete          {1988}
1.13
(a) 13 random variables
(b) Random variable       Data type                                (c) Illustration value
Gender                    categorical, nominal-scaled, discrete    {female}
Home language             categorical, nominal-scaled, discrete    {Xhosa}
Position                  categorical, ordinal-scaled, discrete    {middle manager}
Join date                 numeric, ratio-scaled, discrete          {1998}
Status                    categorical, ordinal-scaled, discrete    {gold status}
Claimed                   categorical, nominal-scaled, discrete    {yes}
Problems                  categorical, nominal-scaled, discrete    {yes}
Yes problem               categorical, nominal-scaled, discrete    {online access difficult}
Services - airlines       numeric, interval-scaled, discrete       {2}
Services - car rentals    numeric, interval-scaled, discrete       {5}
Services - hotels         numeric, interval-scaled, discrete       {4}
Services - financial      numeric, interval-scaled, discrete       {2}
Services - telecomms      numeric, interval-scaled, discrete       {2}
1.14
Financial Analysis data: mainly numeric (quantitative), ratio-scaled.
Voyager Services Quality data: mainly categorical (qualitative); but when rating scales
are used, such as in Question 8, the data is numeric, but interval-scaled and discrete.
---ooOoo---
CHAPTER 2
EXPLORATORY DATA ANALYSIS
SUMMARISING DATA SUMMARY TABLES AND GRAPHS
A picture is worth a thousand words.
Exercise 2.1
Exercise 2.2
(a) bar (or pie) chart
(b) multiple (or stacked) bar chart
(c) histogram
(d) scatter plot
Exercise 2.3
Cross-tabulation table (or joint frequency table; or two-way pivot table).
Exercise 2.4
Bar chart
(i) displays data on a categorical variable
(ii) categories can be displayed in any order
(iii) width of bars is arbitrary (but all of equal widths)
Histogram
(i) displays numerical data only
(ii) intervals must be continuous (and constant width) and sequential
(iii) width of bars is determined by interval width
Exercise 2.5
Line graph
Exercise 2.6
File: X2.6 - magazines.xlsx
(a) Magazine preferences by female teenagers
Magazine      Count      %
True Love        95    19%
Seventeen       146    29%
Heat            118    24%
Drum             55    11%
You              86    17%
Total           500   100%
[Pie chart: Percent of Female Teenagers per magazine]
(b) Interpretation
Seventeen is the most popular teenage magazine (29% of female teenagers prefer it).
Almost a quarter of the female readers surveyed prefer reading Heat (24%), while the least
preferred magazine is Drum with only 11% of female magazine readers preferring it.
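The percentage column in the table above is each count divided by the total of 500. A minimal Python sketch of that calculation follows; the function name `percent_table` is hypothetical and the counts are copied from the table.

```python
# Counts copied from the Exercise 2.6 frequency table.
counts = {"True Love": 95, "Seventeen": 146, "Heat": 118, "Drum": 55, "You": 86}


def percent_table(counts):
    """Return each category's share of the total as a percentage (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percentages = percent_table(counts)
for name, pct in percentages.items():
    print(f"{name:<10} {counts[name]:>4} {pct:>6.1f}%")
```

The table in the solution shows these shares rounded to whole percentages (for example, 29.2% is shown as 29%).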
Exercise 2.7
File: X2.6 - magazines.xlsx
(a) Magazine preferences by female teenagers
Magazine      Count      %
True Love        95    19%
Seventeen       146    29%
Heat            118    24%
Drum             55    11%
You              86    17%
Total           500   100%
[Bar chart: % of Female Teenagers per Magazine]
(b) Heat is preferred by 24% of all female teenage readers.
Exercise 2.8
File: X2.8 - job grades.xlsx
(a) and (b) Categorical Frequency Table - Job Grades
Job grade     Count        %
A                14     35.0%
B                11     27.5%
C                 6     15.0%
D                 9     22.5%
Total            40    100.0%
(c) 22.5% of employees are in job grade D
(d) Bar Chart and Pie Chart - Job Grades
[Bar chart and pie chart: % of Employees per Job Grade]
Exercise 2.9
File: X2.9 - office rentals.xlsx
(a) and (b) Numerical Frequency Distribution and Cumulative Frequency Distribution
Rentals        Count    % Count     Cum %
≤ 200              4      13.3      13.3%
201 - ≤250         8      26.7      40.0%
251 - ≤300         9      30.0      70.0%
301 - ≤350         6      20.0      90.0%
351 - ≤400         3      10.0     100.0%
More               0       0.0     100.0%
Total             30     100.0
(c)
(i) 13.3% of all office space costs less than or equal to R200 / m²
(ii) 70% of all office space costs at most R300 / m²
(iii) 10% of all office space costs more than R350 / m²
(iv) 9 office buildings have rentals between R300 and R400 / m²
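The cumulative % column is a running total of the interval counts expressed as a share of all 30 buildings. A short Python sketch, assuming the counts from the table above (the variable names are illustrative only):

```python
from itertools import accumulate

# Interval labels and counts copied from the Exercise 2.9 table.
labels = ["<=200", "201-250", "251-300", "301-350", "351-400", "More"]
counts = [4, 8, 9, 6, 3, 0]

total = sum(counts)
cum_counts = list(accumulate(counts))  # running totals of the counts
cum_pct = [round(100 * c / total, 1) for c in cum_counts]

for label, c, cp in zip(labels, counts, cum_pct):
    print(f"{label:<8} {c:>3} {cp:>6.1f}%")

# Share of buildings renting above R350 / m2: everything beyond the 4th interval.
above_350 = round(100 - cum_pct[3], 1)
```

Reading answers such as (c)(ii) and (c)(iii) then amounts to looking up, or subtracting from 100, the appropriate cumulative percentage.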
Exercise 2.10
File: X2.10 - storage dams.xlsx
(a) Cape Town water storage dams capacities
Storage Dam        Capacity (Ml)        %
Wemmershoek              158644      16.9
Steenbras                 95284      10.2
Voelvlei                 244122      26.0
Theewaterskloof          440255      46.9
Total capacity           938305     100.0
[Pie chart: Capacity of Cape Town Storage Dams (in million litres)]
(b)
(i) Voelvlei dam supplies 26% of Cape Town's water.
(ii) Wemmershoek and Steenbras dams together provide 27.1% of Cape Town's water.
Exercise 2.11
File: X2.11 - taste test.xlsx
(a) Taste test preferences for fruit juices
Blind Label    Brand             Number        %
A              Liqui Fruit           45     18.0
B              Fruiti Drink          26     10.4
C              Yum Yum               64     25.6
D              Fruit Quencher        38     15.2
E              Go Fruit              77     30.8
               Total                250    100.0
Bar Chart - Fruit Juice Preferences
[Bar chart: Consumer Preferences for Fruit Juice Brands (% of consumers)]
Pie Chart - Fruit Juice Preferences
[Pie chart: Consumer Preferences - Fruit Juices]
(b) 18% of the sampled consumers prefer Liqui Fruit.
(c) 56.4% of the sampled consumers prefer either Yum Yum or Go Fruit.
Exercise 2.12
File: X2.12 - annual car sales.xlsx
Manufacturer     Annual Sales    % Sales
Toyota                  96959       19.4
Nissan                  63172       12.6
Volkswagen              88028       17.6
Delta                   62796       12.6
Ford                    74155       14.8
MBSA                    37268        7.5
BMW                     51724       10.4
MMI                     25354        5.1
Total Sales            499456      100.0
(a) Bar Chart - Annual Car Sales by Manufacturer
[Bar chart: Annual Sales of Passenger Cars by Manufacturer]
(b) Percentage Pie Chart - Annual Car Sales by Manufacturer
[Pie chart: % Annual Sales of Passenger Cars by Manufacturer]
(c) Total % held by top three manufacturers - Toyota (19.4%), Volkswagen (17.6%)
and Ford (14.8%) - represents 51.8% of the total passenger car market.
Exercise 2.13
File: X2.13 - half-yearly car sales.xlsx
Manufacturer    First half    Second half    % Change
Toyota               42661          54298        27.3
Nissan               35376          27796       -21.4
VW                   45774          42254        -7.7
Delta                26751          36045        34.7
Ford                 32628          41527        27.3
MBSA                 19975          17293       -13.4
BMW                  24206          27518        13.7
MMI                  14307          11047       -22.8
(a) Multiple bar chart - Car Sales by Half-Year and Manufacturer
[Multiple bar chart: Half-yearly Car Sales by Manufacturer (First half, Second half)]
(b) First half-year best performers: Nissan; Volkswagen; MBSA and MMI
(c) Delta showed the largest % increase from the first half to the second half of 34.7%.
Refer to the above Table for the % Change between First and Second Half-Year Sales.
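The % Change column is (second half - first half) / first half x 100. The following Python sketch reproduces it from the sales figures in the table; the dictionary layout and names are illustrative assumptions.

```python
# (first half, second half) sales copied from the Exercise 2.13 table.
sales = {
    "Toyota": (42661, 54298),
    "Nissan": (35376, 27796),
    "VW": (45774, 42254),
    "Delta": (26751, 36045),
    "Ford": (32628, 41527),
    "MBSA": (19975, 17293),
    "BMW": (24206, 27518),
    "MMI": (14307, 11047),
}

# Percentage change relative to the first half-year, rounded to 1 decimal.
pct_change = {
    maker: round(100 * (second - first) / first, 1)
    for maker, (first, second) in sales.items()
}

largest_increase = max(pct_change, key=pct_change.get)
first_half_best = [m for m, (first, second) in sales.items() if first > second]
print(largest_increase, pct_change[largest_increase])
print(first_half_best)
```

Manufacturers with a negative % change are the ones whose first half-year outperformed their second, which is how answer (b) can be read off.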
Exercise 2.14
File: X2.14 - television brands.xlsx
(a) Categorical Frequency Table - Television Brands Owned
Brands           Total (%)
Daewoo             16.0%
LG                 30.4%
Philips            10.4%
Sansui             24.0%
Sony               19.2%
Grand Total       100.0%
(b) Percentage Bar Chart - Television Brands Owned
[Bar chart: % of TV Brands Owned]
(c) Philips is the least preferred brand (preferred by only 10.4% of households surveyed).
(d) The most popular brand is LG that is owned by 30.4% of the households surveyed.
Exercise 2.15
File: X2.15 - estate agents.xlsx
(a) Frequency Count Table
House sales    Count of House sales
3                  12
4                  15
5                   6
6                   7
7                   5
8                   3
Grand Total        48
(b) Histogram - Residential Properties Sold per Estate Agent
[Histogram: Residential Properties Sold per Agent]
(c) The most frequently sold number of properties per estate agent was 4.
4 properties each were sold by (15/48) 31.3% of all estate agents
(d) The same frequency count table (a) and histogram (b) are produced.
Exercise 2.16
File: X2.16 - fast foods.xlsx
(a) Percentage Bar Chart - Consumer Preferences of Fast Food Outlet
Fast Food Outlet    Count        %
KFC                    56     17.2
St Elmo's              58     17.8
Steers                 45     13.8
Nandos                 64     19.7
Ocean B                24      7.4
Butler's               78     24.0
Total                 325    100.0
[Bar chart: Percentage of Consumers per Fast Food Outlet]
(b) Percentage Pie Chart - Consumer Preferences of Food Type
Firstly produce the categorical frequency table of Food Type Preferences. Sum
the frequency counts of the different food types (e.g. Chicken = 56 + 64 = 120) and
express the total count as a % of the total number of customers (e.g. 120/325 = 36.9%).
Categorical Frequency Table - Consumer Preferences of Food Type
Food type        %
Chicken       36.9
Pizza         41.8
Burger        13.8
Fish           7.4
[Pie chart: Consumer preference (%) by Food Type]
(c) Brief Report
Pizza (42%) and Chicken (37%) dominate almost 80% of
the fast food market with Pizzas being slightly more favoured by fast food consumers.
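The regrouping in (b) can be sketched in Python as follows. The outlet-to-food-type mapping is an assumption: the solution only states that Chicken = 56 + 64 (KFC and Nandos); the other assignments are inferred because they reproduce the published percentages.

```python
from collections import defaultdict

# Outlet counts copied from the Exercise 2.16 table.
outlet_counts = {
    "KFC": 56, "St Elmo's": 58, "Steers": 45,
    "Nandos": 64, "Ocean B": 24, "Butler's": 78,
}

# ASSUMED mapping of outlets to food types (only Chicken is stated in the text).
food_type = {
    "KFC": "Chicken", "Nandos": "Chicken",
    "St Elmo's": "Pizza", "Butler's": "Pizza",
    "Steers": "Burger", "Ocean B": "Fish",
}

# Sum the outlet counts within each food type.
type_counts = defaultdict(int)
for outlet, n in outlet_counts.items():
    type_counts[food_type[outlet]] += n

total = sum(outlet_counts.values())
type_pct = {t: round(100 * n / total, 1) for t, n in type_counts.items()}
print(type_pct)
```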
Exercise 2.17
File: X2.17 - airlines.xlsx
(a) Two-way Pivot Table - Counts, Row % (by Airline), Column % (by Passenger)
Airline    Data                   Business    Tourist    Grand Total
Comair     Count of Passenger          12          8             20
           Row %                    60.0%      40.0%         100.0%
           Column %                 33.3%      23.5%          28.6%
           Total %                  17.1%      11.4%          28.6%
Kulula     Count of Passenger           4         16             20
           Row %                    20.0%      80.0%         100.0%
           Column %                 11.1%      47.1%          28.6%
           Total %                   5.7%      22.9%          28.6%
SAA        Count of Passenger          20         10             30
           Row %                    66.7%      33.3%         100.0%
           Column %                 55.6%      29.4%          42.9%
           Total %                  28.6%      14.3%          42.9%
Total      Count of Passenger          36         34             70
           Row %                    51.4%      48.6%         100.0%
           Column %                100.0%     100.0%         100.0%
           Total %                  51.4%      48.6%         100.0%
(b) Two-way Pivot table - Row Percentage by Airline
Airline        Business    Tourist    Grand Total
Comair            60.0%      40.0%         100.0%
Kulula            20.0%      80.0%         100.0%
SAA               66.7%      33.3%         100.0%
Grand Total       51.4%      48.6%         100.0%
(c) Multiple Bar Chart - Passenger Type by Airline
[Multiple bar chart: % of passengers per airline (Comair, Kulula, SAA) by passenger type (Business, Tourist)]
(d) 42.9% of passengers prefer to fly with SAA.
(e) Kulula is most preferred by tourists (47.1% of tourists prefer Kulula).
(f) Not true. Most business travellers prefer SAA (55.6%).
More than half (55.6%) of all business travellers prefer SAA.
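Row percentages divide each count by its airline total, while column percentages divide by the passenger-type total. The sketch below recomputes both from the counts in the pivot table; the nested-dictionary layout is only one possible representation.

```python
# Counts copied from the Exercise 2.17 pivot table.
counts = {
    "Comair": {"Business": 12, "Tourist": 8},
    "Kulula": {"Business": 4, "Tourist": 16},
    "SAA": {"Business": 20, "Tourist": 10},
}
passenger_types = ["Business", "Tourist"]

# Row %: share of each passenger type within an airline.
row_pct = {
    airline: {p: round(100 * n / sum(row.values()), 1) for p, n in row.items()}
    for airline, row in counts.items()
}

# Column %: share of each airline within a passenger type.
col_totals = {p: sum(row[p] for row in counts.values()) for p in passenger_types}
col_pct = {
    airline: {p: round(100 * row[p] / col_totals[p], 1) for p in passenger_types}
    for airline, row in counts.items()
}

print(row_pct["Comair"])   # share of Comair passengers by type
print(col_pct["Kulula"])   # Kulula's share of each passenger type
```

Answers (e) and (f) use column percentages (which airline a passenger type prefers), whereas the chart in (c) uses row percentages.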
Exercise 2.18
File: X2.18 - car occupants.xlsx
(a) Random Variable - Number of occupants per car
Data Type - Numerical, discrete, ratio-scaled
(b) (i) Numeric % Frequency Distribution (and Cumulative Frequencies)
Occupants    Count        %    Cum Count    Cum %
1               23     38.3           23     38.3
2               15     25.0           38     63.3
3               10     16.7           48     80.0
4                5      8.3           53     88.3
5                7     11.7           60    100.0
Total           60    100.0
(b) (ii) Histogram - Occupants per Car
[Histogram: Occupants per Car; x-axis: Occupants; y-axis: No. of Cars]
(b) (iii) "Less-Than" Ogive (see (a) above) and Cumulative Frequency Polygon
[Cumulative Frequency Polygon - Car Occupants; x-axis: No. of occupants; y-axis: No. of Cars]
(c)
(i) 38.3% of motorists travel alone.
(ii) 36.7% of vehicles have 3 or more occupants
(iii) 63.3% of vehicles have no more than 2 occupants.
Exercise 2.19
File: X2.19 - courier trips.xlsx
(a) Random variable - distance travelled (in kms) per courier trip
Data type - numerical, continuous, ratio-scaled
(b) (i) (ii) Numeric % Frequency Distribution (and Cumulative Frequencies)
Distance      Count      %    Cum Count    Cum %
≤10               4      8            4        8
11 - ≤15          7     14           11       22
16 - ≤20         15     30           26       52
21 - ≤25         12     24           38       76
26 - ≤30          9     18           47       94
31 - ≤35          3      6           50      100
Total            50    100
Histogram - Courier Travelling Distances per Trip
[Histogram: Courier Travelling Distances; x-axis: Distance (in km); y-axis: No. of trips]
(b) (iii) Cumulative % Frequency Polygon
[Cumulative % frequency polygon: Distances Travelled per Trip; x-axis: Distance (in km); y-axis: Percent of trips]
(c)
(i) 18% of deliveries (9 trips) were between 25 km and 30 km.
(ii) 76% of deliveries (38 trips) were within a 25 km radius.
(iii) 48% of deliveries (24 trips) were beyond a 20 km radius.
(iv) Reading off the Cumulative % Frequency Table (or Polygon) above,
52% of trips were no more than 20 km from the depot.
(v) The longest 24% of trips were above 25 km.
(d) The percentage of trips above 30 km is only 6%.
Hence there is adherence to the company policy.
Exercise 2.20
File: X2.20 - fuel bills.xlsx
(a) Random variable - monthly fuel expenditure (in Rands)
Data type - numerical, continuous, ratio-scaled
(b) (i) (ii) Numeric % Frequency Distribution (and (d) the Ogive)
Fuel bill     Count    % Count    Cum Count    Cum %
≤300              7         14            7       14
301 - 400        15         30           22       44
401 - 500        13         26           35       70
501 - 600         7         14           42       84
601 - 700         5         10           47       94
701 - 800         2          4           49       98
800+              1          2           50      100
Total            50        100
(b) (iii) Histogram of Fuel costs / motorist (Rand)
[Histogram: Motorists' Monthly Fuel Costs; x-axis: Fuel bill (in Rands); y-axis: no. of motorists]
(c) 14% (7 motorists) spend between R500 and R600 per month on fuel.
(d) Cumulative % Frequency Polygon - Motorist fuel bills per month.
(See the Ogive in (b) for Cumulative Frequencies)
[Cumulative % Frequency Polygon - Fuel Bills per Month]
(e) From (d), approx. 77% of motorists spend less than R550 per month on fuel.
(f) From (b) (and (d)), 30% (15 motorists) spend more than R500 per month on fuel.
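The "approx. 77%" in (e) is a reading between two plotted points of the cumulative polygon: 70% at R500 and 84% at R600. A minimal sketch of that reading, assuming straight-line interpolation between interval upper limits (the function name `cum_pct_at` is hypothetical, and the open-ended 800+ class is left out):

```python
# Interval upper limits (Rand) and cumulative % from the Exercise 2.20 table.
upper_limits = [300, 400, 500, 600, 700, 800]
cum_pct = [14, 44, 70, 84, 94, 98]


def cum_pct_at(x, limits, cum):
    """Linearly interpolate the cumulative % at x between tabulated limits."""
    points = list(zip(limits, cum))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the tabulated range")


below_550 = cum_pct_at(550, upper_limits, cum_pct)
above_500 = 100 - cum_pct_at(500, upper_limits, cum_pct)
print(below_550, above_500)
```

The interpolation assumes fuel bills are spread evenly within each interval, which is why the manual answer is quoted as approximate.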
Exercise 2.21
File:
Data
(a)
Quarters
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
X2.21 - car sales.xlsx
Corsa Sales
37
25
41
29
31
28
30
36
38
62
53
63
43
39
52
61
58
65
73
52
61
46
49
54
Time Line graph - Quarterly Vehicle Sales
[Line graph: Opel Corsa Quarterly Sales; x-axis: Quarters (1 to 24); y-axis: units sold]
(b)
Yes, Corsa sales are showing a general upward trend.
Exercise 2.22
File: X2.22 - market shares.xlsx
Data
Year      VW    Toyota
1       13.4       9.9
2       11.6       9.6
3        9.8      11.2
4       14.4      12.0
5       17.4      11.6
6       18.8      13.1
7       21.3      11.7
8       19.4      14.2
9       19.6      16.0
10      19.2      16.9
(a) Trend Line graphs of Market Shares (%) per car type (VW, Toyota)
[Line graphs: Market Share (%) by year. Legend: Top graph - VW; Bottom graph - Toyota]
(b) VW shows a higher sales level but at a declining growth rate.
Toyota shows a lower sales level, but at a rising growth rate.
(c) Choice of franchise is not clearcut, but suggest choosing Toyota because
of its more consistent (steady) growth rate.
Exercise 2.23
File: X2.23 - defects.xlsx
(a) Scatter graph - Inspection time (x) vs Defects found (y)
[Scatter graph: no. of defects found per batch against inspection time (minutes)]
(b) Yes, there appears to be a moderate to strong positive linear relationship
between the inspection time of a batch and the no. of defects found per batch.
Data
Consignment    Time    Defects
AA               48         17
AB               50          9
AC               43         12
AD               36          7
AE               45          8
AF               49         10
AG               55         14
AH               63         18
AI               55         19
AJ               36          6
AK               40          8
AL               46         14
AM               32         10
AN               50         15
AO               42         14
AP               36          8
AQ               48         12
AR               38          8
AS               45         10
AT               30          6
AU               34          9
AV               43         11
AW               53         16
AX               48         16
AY               56         15
AZ               40         12
BA               33          7
BB               35         10
BC               50         16
BD               48         18
Exercise 2.24
(a)
File:
X2.24 - leverage.xlsx
Scatter Graph - Profit Growth (y) vs Leverage Ratio (x)
[Scatter graph: profit growth (y) against leverage ratio (x)]
(b)
Yes, there is a clear moderate to strong positive linear relationship between
the leverage ratio of a company and its profit growth.
Data
Leverage
40.8
42.3
43.2
37.9
36.2
35.6
36.4
39.5
42.6
42.1
37.8
34.4
36.5
38.3
39.3
36.4
33.5
32.4
35.4
35.4
35.7
35.2
35.3
44.9
35.9
38.0
36.7
39.2
41.1
38.7
Profit Growth
111
116
132
105
69
40
58
118
104
125
97
76
98
100
75
88
20
25
78
65
84
88
86
115
50
92
110
72
128
86
46
Exercise 2.25
File: X2.25 - roi%.xlsx
(a)
Sector         Average    Std dev
Mining            9.87       4.58
Services         11.33       2.99
Grand Total      10.70       3.78
(b)
Service companies have a higher average ROI% (11.33%) than mining companies (9.87%).
The volatility of ROI% amongst service companies (2.99%) is far lower than amongst mining companies (4.58%)
By inspection, there is a high overlap of ROI% between the two sectors (based on a two-standard deviation
interval around each sample mean). Thus it is likely that there is no statistically significant difference in
mean ROI% between the two sectors.
Exercise 2.26
File: X2.26 - product location.xlsx
(a)
                      Shelf position
Aisle                 Middle      Top     Total
Front     Average       6.08     5.08      5.58
          Std dev      0.890    0.622     0.895
Middle    Average       4.24     2.74      3.49
          Std dev      1.387    0.297     1.232
Back      Average       4.66     4.10      4.38
          Std dev      1.193    0.648     0.952
Total     Average       4.99     3.97      4.48
          Std dev      1.359    1.114     1.327
(b)
Based on shelf position alone, middle shelf positions generate higher average sales (R4.99) than top shelf
positions (R3.97).
Based on aisle location alone, a front-of-store aisle location generates the highest average sales (R5.58), followed
by a back-of-store aisle location (R4.38).
The lowest average sales occur when the product is displayed in a middle-of-store aisle location (R3.49).
In combination, a front-of-store aisle location on a middle shelf position generates the highest average sales
(R6.08), while a top shelf position in a middle-of-store aisle is the least desirable product location with an average
sales volume of only R2.74.
Sales variability is relatively consistent across aisle locations (0.895 to 1.232) as well as between shelf positions
(1.114 to 1.359).
In combination however, sales volumes show highest variability when the product is positioned in a middle shelf
position in a middle-of-store location (1.387) while the lowest variability in sales volumes occur when positioned in
a top shelf position in a middle-of-store aisle location (0.297).
The large differences in average sales volumes (ranging from R6.08 to R2.74) are evidence of a likely statistically
significant difference in sales volumes due to choice of aisle location and shelf positioning.
(c)
Recommendation:
A middle shelf position in a front-of-store aisle location is the most preferred product
display location.
Exercise 2.27
File: X2.27 - property portfolio.xlsx
(a) Numeric Frequency Distribution and Cumulative % of NP%
Intervals    Count    Cumulative %
-5               0            0.0%
-2.5             4            1.2%
0                7            3.4%
2.5              4            4.6%
5               22           11.4%
7.5             87           38.3%
10             109           71.9%
12.5            46           86.1%
15              31           95.7%
17.5            12           99.4%
20               1            100%
More             1            100%
Total          324            100%
[Histogram - Net Profit %; x-axis: Bin NP%; y-axis: Frequency, with cumulative % line]
(b) and (c)
                       Type of business usage
Region                 Commercial    Industrial    Retail    Total
A         Average             7.5           4.9      10.2      7.9
          Std dev             2.5           3.1       3.6      3.5
          Minimum            -2.4          -4.2       0.8     -4.2
          Maximum            14.6           8.1      18.4     18.4
          Count               104            40        70      214
B         Average            12.3           6.8       8.5      9.8
          Std dev             3.1           4.2       1.8      3.5
          Minimum             2.7          -3.4       4.4     -3.4
          Maximum            20.3          10.2      13.2     20.3
          Count                46            16        48      110
Total     Average             9.0           5.4       9.5      8.6
          Std dev             3.5           3.5       3.1      3.6
          Minimum            -2.4          -4.2       0.8     -4.2
          Maximum            20.3          10.2      18.4     20.3
          Count               150            56       118      324
(d)
Profile of property portfolio:
The company has almost twice as many properties in region A (214 or 66%) compared to region B (110 or 34%).
Almost half of their properties are commercial (46%) followed by retail (37%) and then industrial (17%).
Of all the properties in the portfolio, the largest segment is commercial properties in region A (104 or 32%),
followed by retail properties in region A (70 or 22%).
The smallest component of their property portfolio consists of industrial properties in region B (only 16 or 5%).
(e)
Portfolio performance :
Net profit % across the entire portfolio is normally distributed (histogram) with an average return of 8.6% and a
standard deviation of 3.6%. NP% ranged from the lowest of -4.2% to the highest of 20.3%.
From the cumulative % distribution, 75% of all properties (cumulative 86.1% - cumulative 11.4%) earned a NP% of
between 5% and 12.5% p.a.
Region B (9.8%) has outperformed region A (7.9%) by almost 2% on average, while commercial (9.0%) and retail (9.5%)
have significantly outperformed industrial properties (5.4%).
The worst performing segment is industrial properties in region A (4.9%) while the best performing properties are
commercial properties in region B (12.3%).
There are 15 (4.6%) properties that are under-performing (with a net profit % p.a. of 2.5% or less; see histogram).
Volatility of NP% is fairly consistent across the segments (approx. 3.5%), except for higher variability noted in the
industrial properties of region B (4.2%).
Growth potential (high NP% p.a. segments) is mainly in commercial properties in region B (which represent only 14% of the current portfolio)
and retail properties in region A (which currently constitute only 22% of the current portfolio).
(f)
Recommendations:
Dispose of the worst performing industrial properties in both regions A and B and
purchase more commercial properties in region B followed by retail properties in region A.
CHAPTER 3
EXPLORATORY DATA ANALYSIS - DESCRIBING DATA
NUMERIC DESCRIPTIVE STATISTICS
Exercise 3.1
(a) median
(b) mode
(c) mean
Exercise 3.2
Upper quartile
Exercise 3.3
Statements (c) and (f). The mode would be more appropriate (both are categorical)
Exercise 3.4
(a) False (b) False (c) True (d) False (e) False
The new median mass will depend only on the masses of parcels in the 3rd and 4th
ordered positions out of the 6 positions (after adding the extra parcel).
Also, the rank order position of this extra parcel's mass is unknown. It could be the
4th, 5th or 6th ranked mass, but this depends on the masses of parcels that are heavier
than the current median mass.
If the 4th ranked mass is also equal to 6.5 kg, then the new median will not increase.
If, on the other hand, the 4th ranked mass is greater than 6.5 kg, then the median will
increase. Therefore the only statement that can be made with complete certainty is (c),
(i.e. that it is impossible for the new median mass to be less than it was.)
Exercise 3.5
Correct method is (b). Use the formula for the arithmetic mean (Formula 3.1)
By definition, Mean = Σx / n
Given Mean = 4.1 and n = 9245, it is possible to
compute Σx (total number of persons) = Mean x n .
i.e. total no. persons in Mossel Bay = 4.1 x 9245 = 37905 (rounded)
Exercise 3.6
File: X3.6 - equity returns.xls
General Equity Unit Trust % Returns
        % Returns    Deviation    Deviation²
              9.2         -0.1          0.01
              8.4         -0.9          0.81
             10.2          0.9          0.81
              9.6          0.3          0.09
              8.9         -0.4          0.16
             10.5          1.2          1.44
              8.3         -1.0          1.00
Sum          65.1                       4.32
n               7
Mean = 65.1/7 = 9.3                     Using Excel: =AVERAGE(9.2,...,8.3)
Std Dev = √[4.32/(7-1)] = 0.8485        Using Excel: =STDEV(9.2,...,8.3)
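As a cross-check on the manual calculation, the same mean and sample standard deviation (dividing by n - 1, as Excel's STDEV does) can be obtained with Python's standard `statistics` module. This is only an illustrative alternative to the Excel functions used in the solution.

```python
import statistics

# % returns copied from the Exercise 3.6 table.
returns = [9.2, 8.4, 10.2, 9.6, 8.9, 10.5, 8.3]

mean = statistics.mean(returns)
std_dev = statistics.stdev(returns)  # sample standard deviation (n - 1 divisor)

# Sum of squared deviations, as tabulated in the manual solution.
sum_sq_dev = sum((x - mean) ** 2 for x in returns)

print(round(mean, 2), round(sum_sq_dev, 2), round(std_dev, 4))
```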
Exercise 3.7
File: X3.7 - luggage weights.xls
        Mass (kg)    Deviation    Deviation²
               11         0.43        0.1849
               12         1.43        2.0449
                8        -2.57        6.6049
               10        -0.57        0.3249
               13         2.43        5.9049
               11         0.43        0.1849
                9        -1.57        2.4649
Sum            74                    17.7143
n               7
(a) Mean = 74/7 = 10.57                       Using Excel: =AVERAGE(11,...,9)
On average, each passenger's hand luggage weighs 10.57 kg.
(b) Std Dev = √[17.7143/(7-1)] = 1.7183       Using Excel: =STDEV(11,...,9)
68.3% of all hand luggage is likely to weigh between 8.85 kg and 12.29 kg.
(This corresponds to one standard deviation limits about the mean).
(c) Coefficient of Variation = (1.7183/10.57)% = 16.26%
(d) The variation in hand luggage weights is moderate (close together).
Exercise 3.8
File: X3.8 - bicycle sales.xls
        Bicycles sold    Deviation    Deviation²    Sorted data
                   25          1.4          1.96             16
                   18         -5.6         31.36             18
                   30          6.4         40.96             18
                   36         12.4        153.76             19
                   18         -5.6         31.36             20
                   20         -3.6         12.96             24
                   16         -7.6         57.76             25
                   24          0.4          0.16             30
                   30          6.4         40.96             30
                   19         -4.6         21.16             36
Sum               236                      392.4
n                  10
(a) Mean = 236/10 = 23.6
On average, 23.6 bicycles are sold each month.
Median = (20 + 24)/2 = 22 (Median sales lies in the 5.5th position)
For half of the months (i.e. 5 months), bicycle sales were
less than 22 bicycles per month.
(b) Range = 36 - 16 = 20
The range of sales between the worst and best months was 20 bicycles.
(i.e. In the worst sales month, 16 were sold; in the best month, 36 were sold.)
Variance = 392.4/(10-1) = 43.6
Standard deviation = √43.6 = 6.603
68.3% of all monthly bicycle sales are likely to lie between 17 and 30.2.
(c) Lower Quartile (Q1) = 18 (2.5th position)
25% of monthly bicycle sales were less than or equal to 18.
Or: No more than 18 bicycles per month were sold in 25% of the months.
Note: Using Excel: QUARTILE(data values,1) = 18.25
Upper Quartile (Q3) = 27.5 (7.5th position)
25% of monthly bicycle sales were above 27.5.
Or: More than 28 (27.5) bicycles per month were sold in 25% of the months.
Note: Using Excel: QUARTILE(data values,3) = 28.75
(d) Approximate Skewness = 3x(23.6 - 22)/6.603 = 0.7269
There is moderate positive skewness in monthly bicycle sales.
(i.e. There are one / two months with relatively high bicycle sales)
(e) Box plot of monthly bicycle sales
Interpretation
Monthly bicycle sales range between 16 and 36.
The median monthly sales is 22. The positive skewness shows a wider spread
of monthly sales toward the months of high sales.
(f) Opening monthly stock level = 23.6 + 6.603 = 30.2 bicycles in stock
If orders = 30, then the dealer will have sufficient bicycle stock to meet demand.
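The quartiles and the approximate (Pearson) skewness can be cross-checked in Python. In this sketch, `statistics.quantiles` with `method="inclusive"` is used because it reproduces the Excel QUARTILE values quoted above (18.25 and 28.75), not the manual position-based values (18 and 27.5); the two interpolation conventions differ slightly.

```python
import statistics

# Monthly bicycle sales copied from the Exercise 3.8 table.
sales = [25, 18, 30, 36, 18, 20, 16, 24, 30, 19]

mean = statistics.mean(sales)
median = statistics.median(sales)
std_dev = statistics.stdev(sales)  # sample standard deviation (n - 1 divisor)

# Quartiles by the "inclusive" interpolation method.
q1, q2, q3 = statistics.quantiles(sales, n=4, method="inclusive")

# Approximate skewness as used in (d): 3 x (mean - median) / std dev.
skewness = 3 * (mean - median) / std_dev

print(mean, median, round(std_dev, 3))
print(q1, q3, round(skewness, 4))
```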
Exercise 3.9
File: X3.9 - setting times.xls
(a)
        Setting time    Deviation    Deviation²
                  27            3             9
                  18           -6            36
                  21           -3             9
                  22           -2             4
                  20           -4            16
                  28            4            16
                  31            7            49
                  25            1             1
                  24            0             0
Sum              216                        140
n                  9
Mean = 216/9 = 24 minutes
Std dev = √[140/(9-1)] = 4.183 minutes
(b) Coefficient of Variation = 4.183/24 = 17.43 %
(c) No, since the consistency index is greater than 10%, this consignment
will not be approved for dispatch to the clients.
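The approval decision rests on the coefficient of variation, CV = (standard deviation / mean) x 100. A minimal sketch follows; the helper name `coefficient_of_variation` is hypothetical, and treating exactly 10% as acceptable is an assumption, since the solution only states that a value above 10% is rejected.

```python
import statistics

# Setting times (minutes) copied from the Exercise 3.9 table.
setting_times = [27, 18, 21, 22, 20, 28, 31, 25, 24]


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


cv = coefficient_of_variation(setting_times)

# ASSUMED rule: approve only if the consistency index does not exceed 10%.
approved = cv <= 10
print(round(cv, 2), approved)
```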
Exercise 3.10
File: X3.10 - wage increases.xls
% Increases   Deviation   Deviation²   Sorted %
    5.6         -0.83       0.6910        3.4
    7.3          0.87       0.7547        4.8
    4.8         -1.63       2.6610        5.3
    6.3         -0.13       0.0172        5.6
    8.4          1.97       3.8760        5.8
    3.4         -3.03       9.1885        5.8
    7.2          0.77       0.5910        5.8
    5.8         -0.63       0.3985        6.2
    8.8          2.37       5.6110        6.3
    6.2         -0.23       0.0535        7.2
    7.2          0.77       0.5910        7.2
    5.8         -0.63       0.3985        7.3
    7.6          1.17       1.3660        7.4
    7.4          0.97       0.9385        7.6
    5.3         -1.13       1.2797        8.4
    5.8         -0.63       0.3985        8.8
Sum 102.9                  28.8144
n    16
Using manual computations
(a)
Mean = 102.9/16 = 6.43 %
Median = (6.2+6.3)/2 = 6.25 % (lies in the 8.5th position)
(b)
Variance = 28.8144/(16-1) = 1.921
Std dev = √(28.8144/(16-1)) = 1.386 %
(c)
Lower limit = 6.43-2*(1.386) = 3.66
Upper limit = 6.43+2*(1.386) = 9.20
95.5% of all agreed wage increases lie between 3.66% and 9.2%.
(d)
CV = (1.386/6.43) % = 21.56 %
Agreed wage increases are only moderately consistent.
Using Excel
Excel's Data Analysis option
(a) (b)
Wage increases                   Excel's Function Keys
Mean                  6.43       =AVERAGE(data values)
Standard Error        0.346
Median                6.25       =MEDIAN(data values)
Mode                  5.8
Standard Deviation    1.386      =STDEV(data values)
Sample Variance       1.921      =VAR(data values)
Kurtosis              0.196
Skewness             -0.286
Range                 5.4
Minimum               3.4
Maximum               8.8
Sum                   102.9
Count                 16
(c) and (d) must be computed manually.
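The Excel functions listed above have close counterparts in Python's statistics module. The sketch below is one way to reproduce parts (a) to (d) of Exercise 3.10; it assumes the sixteen wage increases from the table, and the variable names are illustrative.

```python
from statistics import mean, median, stdev, variance

# Agreed wage increases (%) from file X3.10
wage_increases = [5.6, 7.3, 4.8, 6.3, 8.4, 3.4, 7.2, 5.8,
                  8.8, 6.2, 7.2, 5.8, 7.6, 7.4, 5.3, 5.8]

avg = mean(wage_increases)          # like =AVERAGE(...)
mid = median(wage_increases)        # like =MEDIAN(...)
var = variance(wage_increases)      # sample variance, like =VAR(...)
sd = stdev(wage_increases)          # sample std dev, like =STDEV(...)

# (c) limits two standard deviations either side of the mean
lower_limit = avg - 2 * sd
upper_limit = avg + 2 * sd

# (d) coefficient of variation as a percentage
cv_percent = 100 * sd / avg

print(round(avg, 2), mid, round(var, 3), round(sd, 3))
print(round(lower_limit, 2), round(upper_limit, 2), round(cv_percent, 2))
```

The CV here is computed from unrounded values, so it can differ from the manual figure in the last decimal place.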
Exercise 3.11
Bank Trainee Exam Performance
                            Group 1                 Group 2
Mean                        76                      64
Variance                    110                     88
Sample size                 34                      26
Std deviation               √110 = 10.488           √88 = 9.381
Coefficient of Variation    (10.488/76)% = 13.8     (9.381/64)% = 14.66
Interpretation
Both groups performed consistently well. The difference in CV% measures is marginal.
However, group 1 trainees performed marginally more consistently than group 2 trainees.
Exercise 3.12
File: X3.12 - meal values.xls
(a)
Random variable - value of a restaurant meal (in Rand)
Data type - numerical, continuous, ratio-scaled
(b)
Mean = 1119/20 = R55.95
∑(deviation)² = 3902.95
n = 20
Variance = 3902.95/(20-1) = 205.42
Std deviation = √205.42 = 14.33
(c)
Median
Average the Rand values in the 10th and 11th positions.
= (51+55)/2
= R53
Half of the meals were valued at R53 or less.
Ranked Position   Meal Value
       1              35
       2              36
       3              44
       4              44
       5              44
       6              47
       7              48
       8              48
       9              50
      10              51
      11              55
      12              56
      13              58
      14              62
      15              65
      16              65
      17              69
      18              72
      19              80
      20              90
Median is the midpoint between the two middle values (R51 and R55).
(Using Data Analysis in Excel)
Meal values
Mean                  55.95
Standard Error        3.20
Median                53
Mode                  44
Standard Deviation    14.33
Sample Variance       205.42
Kurtosis              0.26
Skewness              0.74
Range                 55
Minimum               35
Maximum               90
Sum                   1119
Count                 20
(d)
Mode = R44, which occurs 3 times (see ranked values in (c) above).
(e)
There is moderate positive skewness caused by two high meal values (i.e. R80 and R90).
Hence recommend the median as the most representative central location meal value.
Exercise 3.13
(a)
File: X3.13 - days absent.xls
Mean = 237/23 = 10.3 days absent
Median
Median is found in the (23+1)/2 th position, i.e. 12th position
Median = 9 days absent
Position   Days absent
    1           2
    2           4
    3           4
    4           5
    5           5
    6           5      Q1 position and value
    7           6
    8           6
    9           6
   10           8
   11           9
   12           9      Median position and value
   13          10
   14          10
   15          10
   16          12
   17          15      Q3 position and value
   18          15
   19          15
   20          16
   21          17
   22          18
   23          30
(Using Data Analysis in Excel)
days absent
Mean                  10.30
Standard Error        1.33
Median                9
Mode                  5
Standard Deviation    6.36
Sample Variance       40.49
Skewness              1.38
Range                 28
Minimum               2
Maximum               30
Sum                   237
Count                 23
Q1                    5.5
Q3                    15
Mode
There are 4 possible modal values (5; 6; 10 and 15 days).
All occur with a frequency of 3.
Note: Only the first modal value is reported in Excel.
The mode is an unreliable measure of central location in this study.
Interpretation
On average, an employee is absent for 10.3 days over this 9-month period.
Half the employees were absent for up to 9 days.
The most common number of days absent was 5 (or 6, or 10 or 15) days.
(b)
Lower Quartile (approximated manually)
Q1 position = (23/4) = 5.75th position
Q1 value in this position is = 5+(0.75*(5 - 5)) = 5 days absent
Using Excel: =QUARTILE(data range,1) = 5.5 days absent
25% of employees were absent for no more than 5 (or 5.5) days altogether.
Upper Quartile (approximated manually)
Q3 position = (23*3/4) = 17.25th position
Q3 value in this position is = 15+(0.25*(15 - 15)) = 15 days absent
Using Excel: =QUARTILE(data range,3) = 15 days absent
25% of employees were absent for more than 15 days over this 9-month period.
(c)
Average per 9-month period = 10.3 days
Average per month = (10.3 / 9) = 1.14 days absent per month
Since the monthly average is above 1 day (actually 1.14 days), the company is
not successfully managing its absenteeism levels.
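The manual and Excel quartiles in (b) differ because they use different position rules. The sketch below illustrates both on the days-absent data. The helper manual_quartile is a hypothetical name for the n x fraction position rule used in the manual working, and it assumes the position falls strictly inside the data (no bounds checks). For this data, statistics.quantiles with method="inclusive" gives the same values as the Excel QUARTILE results quoted above.

```python
from statistics import quantiles

# Ranked days absent for the 23 employees (file X3.13)
days_absent = [2, 4, 4, 5, 5, 5, 6, 6, 6, 8, 9, 9, 10, 10, 10,
               12, 15, 15, 15, 16, 17, 18, 30]


def manual_quartile(sorted_values, fraction):
    """Interpolate at position n * fraction (1-based), as in the manual working."""
    position = len(sorted_values) * fraction
    whole = int(position)               # rank just below the position
    part = position - whole             # fractional step towards the next rank
    below = sorted_values[whole - 1]    # convert 1-based rank to 0-based index
    above = sorted_values[whole]
    return below + part * (above - below)


manual_q1 = manual_quartile(days_absent, 0.25)   # 5.75th position
manual_q3 = manual_quartile(days_absent, 0.75)   # 17.25th position

# Inclusive method: interpolates at position (n - 1) * fraction + 1
excel_q1, excel_median, excel_q3 = quantiles(days_absent, n=4, method="inclusive")

print(manual_q1, manual_q3)
print(excel_q1, excel_median, excel_q3)
```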
Exercise 3.14
File: X3.14 - bad debts.xls
(a)
Average = 79.3/17 = 4.665%
∑(deviation)² = 54.85882
n = 17
Variance = 54.85882/(17-1) = 3.43
Std deviation = √(3.43) = 1.85
(b)
Median
Median is found in the (17+1)/2 th position, i.e. 9th position
Median = 5.40%
Rank Position   Ordered bad debts %
      1              1.8
      2              2.2
      3              2.2
      4              2.4
      5              2.6      Lower Quartile position and value
      6              3.4
      7              4.4
      8              4.7
      9              5.4      Median position and value
     10              5.7
     11              5.7
     12              5.8
     13              6.1      Upper Quartile position and value
     14              6.3
     15              6.6
     16              6.8
     17              7.2
(Using Data Analysis in Excel)
bad debts %
Mean                  4.665
Standard Error        0.45
Median                5.4
Mode                  2.2
Standard Deviation    1.85
Sample Variance       3.43
Kurtosis             -1.49
Skewness             -0.35
Range                 5.4
Minimum               1.8
Maximum               7.2
Sum                   79.3
Count                 17
Q1                    2.6
Q3                    6.1
(c)
Average: On average, each furniture retailer has a bad debt % of 4.665%.
Median: 50% of furniture retailers have a bad debt % of 5.4% or less.
Since the mean < median, there is evidence of negative skewness.
Hence propose the use of the median as the representative central value.
(d)
There are two modal values (2.2% and 5.7%) both occurring with frequency of 2.
This makes the mode an unreliable measure of central location.
(e)
Skewness coefficient (Formula 3.14)
Values required for Formula 3.14
∑(x - x(bar))³ = -31.2227
n = 17
s (std dev) = 1.85
Then Skp = (17*(-31.2227))/((17-1)*(17-2)*1.85³) = -0.35
Since the skewness coefficient is close to zero, there is only
moderate negative skewness in the data on bad debts %
(i.e. only a few fairly low bad debt % values - the majority are higher).
(f)
Lower Quartile (approximated manually)
Q1 position = (17/4) = 4.25th position
Q1 value in this position is = 2.4+(0.25*(2.6 - 2.4)) = 2.45%
Using Excel: =QUARTILE(data range,1) = 2.6%
25% of furniture retailers have a bad debts % of no more than 2.45% (or 2.6%).
Upper Quartile (approximated manually)
Q3 position = (17*3/4) = 12.75th position
Q3 value in this position is = 5.8+(0.75*(6.1 - 5.8)) = 6.025%
Using Excel: =QUARTILE(data range,3) = 6.1%
25% of furniture retailers have a bad debts % of more than 6.025% (or 6.1%).
(g)
The average % bad debts is 4.665% while the median % bad debts is 5.4%. Since
there is moderate negative skewness (Skp = -0.35), the median should be used
as the representative central value. Thus, since the median % of bad debts,
is above 5% (median = 5.4%), an advisory note should be sent out.
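The skewness coefficient in part (e) of Exercise 3.14 can be recomputed from the raw data. The sketch below assumes the expression shown in the worked line, n∑(x - x̄)³ / ((n-1)(n-2)s³), which the manual refers to as Formula 3.14; skewness_coefficient is a hypothetical helper name.

```python
from statistics import mean, stdev

# Ordered bad debts % for the 17 furniture retailers (file X3.14)
bad_debts = [1.8, 2.2, 2.2, 2.4, 2.6, 3.4, 4.4, 4.7, 5.4,
             5.7, 5.7, 5.8, 6.1, 6.3, 6.6, 6.8, 7.2]


def skewness_coefficient(values):
    """n * sum of cubed deviations / ((n - 1) * (n - 2) * s^3)."""
    n = len(values)
    x_bar = mean(values)
    s = stdev(values)                                  # sample std dev
    cubed_deviations = sum((x - x_bar) ** 3 for x in values)
    return n * cubed_deviations / ((n - 1) * (n - 2) * s ** 3)


skp = skewness_coefficient(bad_debts)
print(round(skp, 2))
```

A negative value is consistent with the mean (4.665%) lying below the median (5.4%).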
Exercise 3.15
File: X3.15 - fish shop.xls
(a)
Average (or Mean) (Formula 3.5)
T/O bins       midpoint (x)   freq (f)      xf
500 - 750           625          15        9375
750 - 1000          875          23       20125
1000 - 1250        1125          55       61875
1250 - 1500        1375          92      126500
1500 - 1750        1625          65      105625
1750 - 2000        1875          50       93750
Total                           300      417250
Average = 417250/300 = R 1,390.80
Interpretation The average daily turnover for the fish shop is R1390.80 per day.
(b)
Median
T/O bins       freq (f)    ∑f
500 - 750         15        15
750 - 1000        23        38
1000 - 1250       55        93     Q1 interval
1250 - 1500       92       185     Median interval
1500 - 1750       65       250     Q3 interval
1750 - 2000       50       300
Total            300
Median position = (300/2) = 150th position out of 300 positions
Median value therefore lies in the 4th interval [i.e. between R1250 and R1500]
Median = 1250+(250*(150-93)/(185-93)) = R 1,404.90 (based on Formula 3.2)
(c)
Mode
Modal value lies in the 4th interval [1250 - 1500]
as it has the highest frequency count of 92 days
Mode = 1250+(250*(92-55)/(2*92-55-65)) = R 1,394.50 (based on Formula 3.3)
(d)
Lower Quartile (using Formula 3.7)
Q1 position = (300/4) = 75th position which lies in the interval [1000 - 1250]
Q1 value in this position is = (1000+250*(75-38)/(93-38)) = R 1,168.18
The maximum turnover for the slowest 25% of trading days was R1168.18.
(e)
Upper Quartile (using Formula 3.8)
Q3 position = (300*3/4) = 225th position which lies in the interval [1500 - 1750]
Q3 value in this position is = (1500+250*(225-185)/(250-185)) = R 1,653.85
A minimum turnover of R1653.85 was generated on the 25% of busiest trading days.
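The median and quartile calculations for grouped data in Exercise 3.15 all follow one interpolation pattern: find the interval containing the target position, then move proportionally into it. The sketch below illustrates that pattern. grouped_quantile is a hypothetical helper name, and it assumes equal-width intervals as in the turnover table. Results may differ from the quoted rand values in the last cent because the manual's figures appear to be rounded.

```python
# Turnover intervals from the frequency table (file X3.15)
lower_limits = [500, 750, 1000, 1250, 1500, 1750]
frequencies = [15, 23, 55, 92, 65, 50]
width = 250


def grouped_quantile(lower_limits, width, frequencies, position):
    """Interpolate the value at a given position within grouped data."""
    cumulative = 0
    for lower, freq in zip(lower_limits, frequencies):
        if cumulative + freq >= position:
            # Move proportionally into the interval holding the position
            return lower + width * (position - cumulative) / freq
        cumulative += freq
    raise ValueError("position exceeds the total frequency")


n = sum(frequencies)
median_value = grouped_quantile(lower_limits, width, frequencies, n / 2)
q1_value = grouped_quantile(lower_limits, width, frequencies, n / 4)
q3_value = grouped_quantile(lower_limits, width, frequencies, 3 * n / 4)

# Grouped mean: sum of midpoint * frequency divided by n
midpoints = [lower + width / 2 for lower in lower_limits]
grouped_mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n

print(round(grouped_mean, 2), round(median_value, 2))
print(round(q1_value, 2), round(q3_value, 2))
```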
Exercise 3.16
File: X3.16 - grocery spend.xls
(a)
(b) (i)
Average (Mean) (Formula 3.5)
% Spend    midpoint (x)   freq (f)    xf
10 - 20         15            6        90
20 - 30         25           14       350
30 - 40         35           16       560
40 - 50         45           10       450
50 - 60         55            4       220
Total                        50      1670
Average = 1670/50 = 33.4%
Interpretation
On average, a family spends 33.4% of their income on groceries.
Median
% Spend    freq (f)    ∑f
10 - 20        6         6
20 - 30       14        20     Q1 interval
30 - 40       16        36     Median interval
40 - 50       10        46     Q3 interval
50 - 60        4        50
Total         50
Median position = (50/2) = 25th
Median value therefore lies in the 3rd interval [30 - 40]
Median = 30+(10*(25-20)/(36-20)) = 33.125% (Formula 3.2)
Interpretation
50% of families spend no more than 33.125% of their income on groceries.
(b) (ii) Lower Quartile (Q1) (using Formula 3.7)
Q1 position = (50/4) = 12.5th position which lies in the interval [20 - 30]
Q1 value in this position is = 20+(10*(12.5-6)/(20-6)) = 24.64%
Interpretation
25% of families spend no more than 24.64% of their income on groceries.
(c)
Upper Quartile (Q3) (using Formula 3.8)
Q3 position = (50*3/4) = 37.5th position which lies in the interval [40 - 50]
Q3 value in this position is = 40+(10*(37.5-36)/(46-36)) = 41.5%
Interpretation
25% of families spend more than 41.5% of their income on groceries.
Exercise 3.17
File: X3.17 - equity portfolio.xls
Shares    Price    Value
  40        15      600
  10        20      200
   5        40      200
  50        10      500
Total 105          1500
Average value / equity = 1500/105 = R 14.29 (Using Formula 3.5)
(This is a weighted average measure)
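The weighted average in Exercise 3.17 can be sketched in a few lines of Python. The list names are illustrative; the share counts and prices come from the table above.

```python
# Equity portfolio from file X3.17
shares = [40, 10, 5, 50]
prices = [15, 20, 40, 10]

# Weight each price by the number of shares held at that price
total_value = sum(n * p for n, p in zip(shares, prices))
total_shares = sum(shares)
average_value_per_share = total_value / total_shares

print(total_value, total_shares, round(average_value_per_share, 2))
```

A simple unweighted average of the four prices would be R21.25, which overstates the portfolio's average value because most shares are held at the lower prices.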
Exercise 3.18
File: X3.18 - car sales.xls
Cars sold    Price     Value
    5        25000    125000
   12        34000    408000
    3        55000    165000
Total 20              698000
Average Price / Car = 698000/20 = R34 900 (Using Formula 3.5)
(This is a weighted average measure)
Exercise 3.19
File: X3.19 - rental increases.xls
(Use Formula 3.4)
Geometric mean = ⁴√(1.16*1.14*1.1*1.08) - 1 = 0.11955
Interpretation
The average annual % escalation rate in office rentals is 11.955%
Using Excel
=GEOMEAN(1.16,1.14,1.1,1.08) - 1 = 0.11955
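The geometric mean in Exercise 3.19 can also be checked in Python. This sketch uses statistics.geometric_mean (available from Python 3.8) on the four growth factors and compares it with the explicit fourth-root form; the variable names are illustrative.

```python
from math import prod
from statistics import geometric_mean

# Growth factors for annual rental escalations of 16%, 14%, 10% and 8%
growth_factors = [1.16, 1.14, 1.10, 1.08]

# Library route, comparable to =GEOMEAN(...) - 1
average_rate = geometric_mean(growth_factors) - 1

# Explicit route: nth root of the product of the factors, minus 1
explicit_rate = prod(growth_factors) ** (1 / len(growth_factors)) - 1

print(round(average_rate, 5), round(explicit_rate, 5))
```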
Exercise 3.20
File: X3.20 - sugar increases.xls
(a)
(Using Formula 3.4)
Geometric mean = ⁶√(1.05*1.12*1.06*1.04*1.09*1.03) - 1 = 0.06456
Interpretation
The average annual % increase in the sugar price (per kg) has been 6.456%
Using Excel
=GEOMEAN(1.05,1.12,1.06,1.04,1.09,1.03) - 1 = 0.06456
(b)
The geometric mean is more appropriate since the base value of each
percentage change is different.
Each year's percentage change is based on the previous year's sugar price.
Exercise 3.21
File:
X3.21 - water usage.xls
Using Excel 's Descriptive Statistics option in Data Analysis
(a) (b) (c)
water usage
Mean                  21.2
Standard Error        1.751
Median                19.5
Mode                  25
Standard Deviation    9.590
Sample Variance       91.959
Kurtosis              1.744
Skewness              1.165
Range                 42
Minimum               8
Maximum               50
Sum                   636
Count                 30
Using Excel's Function Key - QUARTILE
Lower Quartile - Q1   15       =QUARTILE(data range,1)
Upper Quartile - Q3   25.75    =QUARTILE(data range,3)
(d)
Interpretation - central location measures
The average monthly water consumption per household is 21.2 kl.
50% of households consume no more than 19.5 kl per month.
The most frequently occurring monthly consumption is 25 kl. (Note that this modal
value is a misleading and unrepresentative measure as it only occurs 3 times.)
Interpretation - dispersion measures
Based on the mean and standard deviation values, approximately 95.5% of
households consume between 2.02 kl and 40.38 kl per month.
Note that the data is heavily skewed to the right (see skewness = 1.165), implying
that there are a few households that consume high volumes of water per month.
This skewness makes the interpretation given above unreliable, as both the mean
and the standard deviation are likely to be inflated (or over-estimated) due to the
presence of a few high-valued outliers.
Interpretation - quartiles
25% of households consume at most 15 kl of water per month.
The 25% of heavy water consumers use at least 25.75 kl per month.
(e)
Households in Paarl suburb = 750
- expected total usage per month = 21.2 x 750 = 15 900 kl
- expected total usage per year = 15900 x 12 = 190 800 kl
Exercise 3.22
File:
X3.22 - veal dishes.xls
(a)
Random variable - cost of a veal cordon bleu meal at a Durban restaurant
Data type - numerical, continuous, ratio-scaled.
(b)
Using Excel 's Descriptive Statistics option in Data Analysis
cordon bleu meal price
Mean
61.25
Standard Error
1.99
Median
59
Mode
48
Standard Deviation
10.54
Sample Variance
111.08
Kurtosis
0.62
Skewness
0.78
Range
45
Minimum
45
Maximum
90
Sum
1715
Count
28
Interpretation
On average, a patron can expect to pay R61.25 for a veal cordon bleu meal
at a Durban restaurant.
50% of Durban restaurants charge no more than R59 for a veal cordon bleu meal.
(c)
There are two modal values (R48 and R55). Both occur with a frequency of 3.
It is a misleading value because of its low frequency of occurrence.
Note: Excel only shows the first occurrence of a modal value (i.e. R48) in its output.
(d)
Standard deviation =
R 10.54
This means that 68.3% of Durban restaurants are likely to charge between
R50.71 (61.25 - 10.54) and R71.79 (61.25 + 10.54) for a veal cordon bleu meal.
(e)
Skewness Coefficient (Formula 3.14) = 0.78
Skewness coefficient (approx) (Formula 3.15) = 0.64
The relatively high positive skewness is caused by two Durban restaurants
charging high prices (R80 and R90) for a veal cordon bleu meal.
(f)
Both measures indicate that there is moderate-to-high positive skewness, hence
the median cost of R59 would be a more representative measure of central location.
(g)
Upper Quartile (Q3) = R68     =QUARTILE(data range,3)
25% of Durban restaurants charge at least R68 for a veal cordon bleu meal.
(h)
Lower Quartile (Q1) = R54.75     =QUARTILE(data range,1)
25% of Durban restaurants charge no more than R54.75 for a veal cordon bleu meal.
(i)
90th percentile value = R72.90     =PERCENTILE(data range,0.9)
Exercise 3.23
File: X3.23 - fuel bills.xls
(a)
Using Excel's Descriptive Statistics option in Data Analysis
fuel bill
Mean                  418          (i) Mean
Standard Error        13.128
Median                398          (ii) Median
Mode                  350
Standard Deviation    113.693      (iv) Standard deviation
Sample Variance       12926.054    (iii) Variance
Kurtosis             -0.503
Skewness              0.601        (v) Skewness
Range                 420
Minimum               256
Maximum               676
Sum                   31350
Count                 75
(b)
Interpretation - central location
The average monthly fuel bill per motorist is R418; while half of the motorists
spend no more than R398 on their monthly fuel.
Interpretation - standard deviation
68.3% of monthly fuel bills range between R304.31 and R531.69 (one std dev from mean)
95.5% of monthly fuel bills range between R190.61 and R645.39 (within 2 std devs of mean)
(c)
Interpretation - skewness
There is moderate positive skewness (Skp = 0.601) caused by 8 motorists who spend
more than R600 per month - well above the majority of motorists' fuel spend.
(d)
Coefficient of Variation CV% = 113.693/418 = 27.2%
There is only moderate consistency across the monthly fuel bills of motorists.
This greater relative spread could be caused by different vehicle sizes and varying distances travelled.
(e)
Lower Quartile - Q1
R332.5
=QUARTILE(data range,1)
Upper Quartile - Q3
=QUARTILE(data range,3)
R502.5
Inter Quartile Range = (Q3 - Q1)
R170
The middle 50% of monthly fuel bills span R170 from a low of R332.50 to a high of R502.50.
(f)
Five-Number Summary table (using the Excel Function Keys)
Minimum              R256      =MIN(data range) or =QUARTILE(data range,0)
Lower Quartile Q1    R332.5    =QUARTILE(data range,1)
Median               R398      =MEDIAN(data range) or =QUARTILE(data range,2)
Upper Quartile Q3    R502.5    =QUARTILE(data range,3)
Maximum              R676      =MAX(data range) or =QUARTILE(data range,4)
(g)
Box plot
(h)
Interpretation of Box Plot
There is clear evidence of moderate positive skewness (skewed-to-the-right) in fuel bills.
There are a few motorists who spend a large amount on fuel a month.
(i)
Total amount of fuel consumed monthly by Paarl motorists for commuting to work:
Expected total monthly fuel bill = (average bill / motorist x no. motorists)
= R418 * 25000 = R 10,450,000
Expected total fuel consumed (in litres) = (Expected total monthly cost / cost per litre)
= R10450000/10 = 1 045 000 litres
Exercise 3.24
(a)
(b)
File:
X3.24 - service periods.xls
Using Excel 's Descriptive Statistics option in Data Analysis
(i)
Mean
(ii)
Median
(iii)
Std Dev
(iv)
Skewness
service periods
Mean
7
Standard Error
0.384
Median
6
Mode
6
Standard Deviation
3.838
Sample Variance
14.727
Kurtosis
-0.437
Skewness
0.553
Range
15
Minimum
1
Maximum
16
Sum
700
Count
100
Interpretation - central location
On average, each engineer has 7 years of service with a company
Half of all engineers spend no more than 6 years with a company
Interpretation - standard deviation
68.3% of engineers spend between 3.16 and 10.84 years with a company.
Similar intervals can be computed for 2 and 3 std devs from the mean.
Note: when the lower limit is computed to be negative, in practice it is zero.
Interpretation - skewness
There is a moderate positive skewness (Skp = 0.553) in length of service periods,
meaning that a few engineers have long service periods with their company.
(c)
Using Excel 's Histogram option in Data Analysis
Frequency Distribution - Years of Service
Years of Service
0-3
3-6
6-9
9 - 12
12 - 15
15 - 18
Count
21
30
24
15
8
2
(d)
Lower limit = 7 - 3.838 = 3.16 years
Upper limit = 7 + 3.838 = 10.84 years
68.3% of engineers spend between 3.16 years and 10.84 years with a company.
(e)
Percent of members with less than 3 years of service
Cumulative % up to 3 years =
21/100 =
21%
Percent of members with more than 12 years of service
Cumulative % above 12 years =
10/100 =
10%
They meet the guidelines for "new blood" (21%), but have
far fewer "experienced" members (only 10%) than their guidelines require.
Exercise 3.25
File:
X3.25 - dividend yields.xls
(a)
Random variable       dividend yield (%) of a company
Data type             numeric, ratio-scaled
(b)
Using Excel's Descriptive Statistics option in Data Analysis
Dividend yields
Mean                  4.273     (i) Mean
Standard Error        0.218
Median                4.1       (ii) Median
Mode                  2.8       (iii) Mode
Standard Deviation    1.444     (iv) Std Dev
Sample Variance       2.084
Kurtosis             -0.359
Skewness              0.259     (v) Skewness
Range                 6.1
Minimum               1.5
Maximum               7.6
Sum                   188
Count                 44
(c)
Mean
The average dividend yield per company is 4.273%
Median
Half of the dividend yields are at or below 4.1%
Mode
A misleading measure because there are 4 dividend yield values
(viz. 2.8; 3.6; 4.1; 5.1) of equal modal frequency (of 3)
Std dev
68.3% of all dividend yields lie between 2.83% and 5.72%
Similarly for 2 and 3 std devs from the mean
Skewness
There is very slight positive (right) skewness (Skp = 0.259).
The histogram can be assumed to be normally distributed.
(d)
Bin yields    Frequency
Below 2%          3
2 - 3.5%         11
3.5 - 5%         16
5 - 6.5%         11
6.5 - 8%          3
(e)
The Mean (Average), as there is minimal skewness present in the data.
(f)
Five-Number Summary Table
Minimum              1.5      =QUARTILE(data range,0)
Lower Quartile Q1    3.175    =QUARTILE(data range,1)
Median               4.1      =QUARTILE(data range,2)
Upper Quartile Q3    5.15     =QUARTILE(data range,3)
Maximum              7.6      =QUARTILE(data range,4)
(g)
Box plot (Dividend Yield %)
The very slight positive skewness can be seen from the longer tail
on the right of the box plot. It implies a few companies have achieved
significantly higher dividend yields than the majority of the sample.
(h)
Minimum dividend yield achieved by the top performing 10% of companies
6.17%
=PERCENTILE(data range,0.90)
The top 10% of companies achieved at least a dividend yield of 6.17%
(i)
% of companies who did not declare more than a 3.5% dividend yield
Cumulative frequency up to upper limit of 3.5 = (3 + 11) = 14.
% Cumulative =
14 / 44 % =
31.8%
Almost 1/3 (31.8%) of companies did not declare more than a 3.5% dividend yield.
Exercise 3.26
File: rosebuds.xls
(a)
Random variable - the unit price of a rosebud (in cents)
Data type - numerical, continuous, ratio-scaled
(b)
Using Excel's Descriptive Statistics option in Data Analysis
(i) (ii) (iii) (iv)
selling price
Mean                  60.312
Standard Error        0.319
Median                59.95
Mode                  60.6
Standard Deviation    3.188
Sample Variance       10.160
Kurtosis              1.770
Skewness              0.932
Range                 18.3
Minimum               55.2
Maximum               73.5
Sum                   6031.2
Count                 100
Interpretation
Mean: The average selling price of rosebuds is 60.312 cents.
Median: 50% of rosebuds sold for no more than 59.95c.
Std dev: 68.3% of rosebud unit prices are likely to lie between 57.12c and 63.5c.
Skewness: There is excessive positive skewness (one very high unit price)
(c)
Coefficient of Variation CV%
3.188/60.312% =
5.3%
There is very low variability amongst unit selling prices of rose buds
(d)
Lower Quartile Q1    57.7 cents     =QUARTILE(data range,1)
Upper Quartile Q3    62.45 cents    =QUARTILE(data range,3)
(e)
The highest unit selling price of the cheapest 25% of sales is 57.7 cents (i.e. Q1)
(f)
The minimum unit selling price of the highest-priced 25% of sales is 62.45 cents (i.e. Q3)
Overall interpretation of (a), (d), (e) and (f)
Unit selling prices ranged from 55.2c to 73.5c where 50% of the selling prices
were above 59.95c. 25% of sales were below 57.7c while the most expensive 25%
of unit selling prices were above 62.45c.
(g)
90th percentile
64.6c
=PERCENTILE(data range,0.9)
(h)
10th percentile
56.6c
=PERCENTILE(data range,0.1)
(i)
Five-Number Summary Table and Boxplot
Minimum              55.2     =QUARTILE(data range,0)
Lower Quartile Q1    57.7     =QUARTILE(data range,1)
Median               59.95    =QUARTILE(data range,2)
Upper Quartile Q3    62.45    =QUARTILE(data range,3)
Maximum              73.5     =QUARTILE(data range,4)
Boxplot
Interpretation
The distribution of unit selling prices of rosebuds is skewed to the right.
There is also one extremely high unit selling price of 73.5c which can be
considered an outlier.
Overall, unit selling prices of rosebuds ranged between 55.2c and 73.5c.
25% of unit selling prices lay below 57.7 c (i.e. lower quartile); while
the middle 50% of unit selling prices ranged from 57.7c to 62.45c.
The "best" 25% of unit selling prices achieved were above 62.45c (Q3).
Finally, the median (middle) unit selling price achieved was 59.95c.
Mini Case Study 3.27
File: X3.27 - savings balances.xlsx
1 (a)
Frequency Distribution
Savings Intervals    Frequency
0 - ≤ 200               21
201 - ≤ 400             71
401 - ≤ 600             56
601 - ≤ 800             19
801 - ≤1000              4
More                     4
[Histogram of Savings Balances (R10's): Number of Clients by Savings Intervals]
1 (b)
Descriptive Statistics
Savings Balance (R10's)
Mean
421.86
Standard Error
16.247
Median
385
Mode
326
Standard Deviation
214.93
Sample Variance
46195.1765
Kurtosis
4.727
Skewness
1.634
Range
1383
Minimum
85
Maximum
1468
Sum
73826
Count
175
Lower Quartile        266.5
Upper Quartile        520.5
1 (c)
Count of Gender and Marital Status
               Female    Male    Grand Total
Married          23       83        106
Single           41       28         69
Grand Total      64      111        175
1 (d)
Breakdown Table of Average Savings by Gender and Marital Status
               Female    Male    Grand Total
Married         512.6    368.0      399.4
Single          563.7    299.3      456.4
Grand Total     545.3    350.7      421.9
2 (a)
Gender                Categorical    Nominal-scaled
Marital status        Categorical    Nominal-scaled
Month-end balances    Numeric        Ratio-scaled
2 (b)
Refer to 1 (a) (Histogram) and 1 (b) (Descriptive Statistics)
The average savings balance of bank clients is R421.86.
50% of these clients have month-end balances of less than R385 (median).
From the histogram, 21 clients have month-end balances of less than R200 and 8 are above R800.
The 8 banking clients with relatively high month-end savings balances (skewness = 1.634 > 1.0) skew the average
towards these higher balances and distort the overall picture.
Therefore the median month-end balance of R385 is a more representative indicator of savings balances.
25% of their clients have month-end balances of less than R266.5, while the top 25% of savers
have month-end balances in excess of R520.5, with 4 clients above R1000 (maximum = R1468).
Refer to 1 (c) (Count of Gender and Marital Status)
Amongst the female clients, the majority (64%) are single (41/64), while amongst the male clients,
the majority (75%) are married.
Thus they are attracting mainly single females; and married males.
Refer to 1 (d) (Breakdown Table of Average Savings by Gender and Marital Status)
Single females are saving the most (average of R563.7) while single males are the worst savers with
an average savings balance of only R299.3 (compared to the overall average of R421.9).
Also females save more, on average (R545) than males (R350).
Single bank clients save more, on average (R456.4) than married bank clients (R399.4).
2 (c)
Plan of Action by Bank
The bank should target females in general (as they comprise only 37% of the current client base
but have the largest average balances of all clients).
Single females (23%) in particular should be targeted to attract more high savers to the bank.
They should encourage their main client base - married males - to save more.
Mini Case Study 3.28
File: X3.28 - medical claims.xlsx
1 (a)
Frequency Distribution
Claims Ratio    Count      %
Below 0.1          6       4.0
0.1 - 0.5         35      23.3
0.5 - 0.9         26      17.3
0.9 - 1.3         36      24.0
1.3 - 1.7         30      20.0
1.7 - 2.1         14       9.3
2.1 - 2.5          2       1.3
Above 2.5          1       0.7
Total            150     100
[Histogram of Claims Ratios: No. of Members by Claims Ratio Intervals]
1 (b)
Descriptive Statistics
Claims Ratio
Mean                  0.974
Median                0.996
Mode                  #N/A
Standard Deviation    0.594
Sample Variance       0.353
Skewness              0.242
Range                 2.672
Minimum               0.013
Maximum               2.685
Sum                   146.1
Count                 150
Lower Quartile        0.460
Upper Quartile        1.421
1(c) and (d) Cross-tabulation Table / Breakdown table
                              Age
Marital                    26 - 35    36 - 45    46 - 55    Row totals
Married          average    1.326      0.915      1.123       1.112
                 std dev    0.552      0.589      0.629       0.610
                 minimum    0.147      0.051      0.013       0.013
                 maximum    2.685      1.989      2.408       2.685
                 count        23         27         35          85
Single           average    0.919      0.474      0.947       0.793
                 std dev    0.543      0.356      0.500       0.524
                 minimum    0.014      0.024      0.218       0.014
                 maximum    1.905      1.262      1.814       1.905
                 count        36         19         10          65
Column totals    average    1.078      0.733      1.084       0.974
                 std dev    0.577      0.547      0.602       0.594
                 minimum    0.014      0.024      0.013       0.013
                 maximum    2.685      1.989      2.408       2.685
                 count        59         46         45         150
2 (a)
Marital Status       Qualitative / Categoric    Nominal
Age Group (bands)    Qualitative / Categoric    Ordinal
Claims Ratio         Quantitative / Numeric     Ratio, Continuous
2 (b)
Claims Ratio Pattern of Members
The average claims ratio is 0.974. This means that, on average, members are claiming
as much as they contribute. The median claims ratio is a similar value (0.996).
This means that at least half the members are claiming more than they contribute.
The distribution (see histogram) appears to be bi-modal showing a low claims ratio (i.e. less than half their contributions)
by at least a quarter of members (Q1 = 0.46) and a high claims ratio by at least half of the
members (50% greater than median claims ratio of 0.996). At least 25% of members are
claiming significantly more than they contribute (Q3 = 1.42).
Only 4% of members claim less than 10% of their contributions (first interval of histogram).
Also at least 11% (top three intervals of histogram) are claiming at least twice as much as they contribute.
Refer to 1(c) and (d)
It would appear that married members claim significantly more (1.112) than single members (0.793).
Married members also make up the majority of members (85/150 = 57%).
Also, the younger members (26 - 35) and older members (46 - 55) claim significantly more
(1.078 and 1.084 respectively) than the middle-aged members (36 - 45) (0.733), on average.
These two 'high-claiming' age segments of members represent 69% (104/150) of all members.
The highest claiming segments are the married younger (26-35) (1.326) and married older members (46 - 55) (1.123).
Members from these two segments also have the maximum claims ratio of all members (2.685 and 2.408 respectively).
Collectively these members represent 39% (58/150) of all members of the medical scheme.
The lowest claiming segment is the middle-aged single members (0.474) but they only
comprise 13% (19/150) of all scheme members.
The range of claims ratios is highest amongst married (26 - 35 years) (0.147 - 2.685) and married (46 - 55 years) (0.013 - 2.408).
It is lowest amongst single (36 - 45 year) members (0.024 - 1.262).
In conclusion:
Younger and older married members claim more on average (and above their contribution levels in most cases).
They also make up the bulk of membership (57%).
The single members tend to claim less than the married members, but only make up 43% of the membership base.
2 (c)
There is cause for concern about the financial viability of the Scheme because:
The claims ratio of married members exceed 1 on average
Within the married group, the age groups 26 -35 and 46 - 55 are claiming more than they contribute
The married group represent the majority of members (57%)
The two age groups within the married group represent 68% (58/85) of all married members (and are claiming more than they contribute).
Overall the scheme is operating close to its breakeven of financial non-viability (overall mean claims ratio = 0.974)
There is significant cross-subsidization of the married younger and older members by mostly
the middle-aged single members.
The management of this medical scheme needs to review member contributions from the married younger and older members.
CHAPTER 4
BASIC PROBABILITY CONCEPTS
Exercise 4.1
P(A) = 0.2 means that an event has a 20% chance of occurring.
Exercise 4.2
Mutually exclusive events.
Exercise 4.3
The outcome of one event does not influence / nor is influenced
by / the outcome of the other event.
Exercise 4.4
P(A or B) = P(A) + P(B) - P(A ∩ B) = 0.26 + 0.35 - 0.14 = 0.47
Exercise 4.5
P(X / Y) = P(X ∩ Y) / P(Y) = 0.27 / 0.36 = 0.75
P(Y / X) = P(Y ∩ X) / P(X) = 0.27 / 0.54 = 0.50
Thus P(X / Y) ≠ P(Y / X)
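The two conditional probabilities in Exercise 4.5 follow directly from the definition P(X / Y) = P(X ∩ Y) / P(Y). A minimal sketch, with illustrative variable names and the probabilities quoted in the working above:

```python
# Probabilities taken from the working in Exercise 4.5
p_x = 0.54
p_y = 0.36
p_x_and_y = 0.27

# Conditional probability: joint probability divided by the given event's probability
p_x_given_y = p_x_and_y / p_y
p_y_given_x = p_x_and_y / p_x

print(round(p_x_given_y, 2), round(p_y_given_x, 2))
```

The two results differ because the joint probability is divided by a different marginal probability in each case.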
Exercise 4.6
(a)
File: X4.6 - economic sectors.xls
Sector        Count    % count
Mining          45       18.0
Financial       72       28.8
IT              32       12.8
Production     101       40.4
Total          250      100
(b)
P(Financial) = 28.8%
(c)
P(Not Production) = 100 - 40.4 = 59.6%
(d)
P(Mining or IT) = 18 + 12.8 = 30.8%
In (b), the marginal probability was computed.
In (c), used the Complementary probability rule
In (d), used the Addition rule for mutually exclusive events.
Exercise 4.7
File: X4.7 - apple grades.xls
(a)
Apple Grades    Quantity      %
A                  795       53.0
B                  410       27.3
C                  106        7.1
D                  189       12.6
Total             1500      100
(b)
P(Grade A) = 53%
(c)
P(Grade B or D) = 27.3 + 12.6 = 39.9%
(d)
P(not (Grade C or D)) = 100 - (7.1 + 12.6) = 80.3%
(e)
In (b), the marginal probability was computed.
In (c), used the Addition rule for mutually exclusive events.
In (d), used the Complementary probability rule.
Exercise 4.8
File: X4.8 - employment sectors.xls
(a)
Sector                     Count    % count
Formal Business             6678      53.0
Commercial Agriculture      1492      11.8
Subsistence Agriculture      653       5.2
Informal Business           2865      22.7
Domestic Service             914       7.3
Total                      12602     100
(b)
P(Domestic Service) = 7.3%
(c)
P(Commercial or Subsistence Agric) = 11.8 + 5.2 = 17%
(d)
P(Informal Business / Business) = 2865/(2865+6678) = 30.02%
(e)
In (b), the marginal probability was computed.
In (c), the Addition rule for mutually exclusive events.
In (d), the Conditional Probability rule.
Exercise 4.9
File: X4.9 - qualification levels.xls
(a)
Random variable 1    Managerial level (categorical, ordinal-scaled and discrete)
Random variable 2    Qualification level (categorical, ordinal-scaled and discrete)
(b)
                          Managerial Level
Qualification    Section Head    Dept Head    Division Head    Total
Matric                28            14              8            50
Diploma               20            24              6            50
Degree                 5            10             14            29
Total                 53            48             28           129
(c)
(i)     P(Matric) = 50/129 = 38.76%
(ii)    P(Section head ∩ Degree) = 5/129 = 3.88%
(iii)   P(Dept head / Diploma) = 24/50 = 48%
(iv)    P(Division head) = 28/129 = 21.71%
(v)     P(Division head U Section head) = (53+28)/129 = 62.79%
(vi)    P(Matric U Diploma U Degree) = (50+50+29)/129 = 100%
(vii)   P(Degree / Dept head) = 10/48 = 20.83%
(viii)  P(Division head U Diploma U both) = (28+50-6)/129 = 55.81%
(d)
Probability Types and Rules
(i)     Marginal probability
(ii)    Joint probability and Multiplication rule
(iii)   Conditional probability
(iv)    Marginal probability
(v)     Addition rule for mutually exclusive events
(vi)    Collectively exhaustive set of events and Addition rule for mutually exclusive events
(vii)   Conditional probability
(viii)  Addition rule for non-mutually exclusive events
(e)
Yes, since these outcomes cannot occur simultaneously.
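The marginal, joint, conditional and union probabilities in Exercise 4.9 can all be read off the cross-tabulation. The sketch below stores the table as a nested dictionary and recomputes four of them; the dictionary layout and helper names are illustrative choices, not part of the original solution.

```python
# Counts from file X4.9: qualification (rows) by managerial level (columns)
counts = {
    "Matric":  {"Section Head": 28, "Dept Head": 14, "Division Head": 8},
    "Diploma": {"Section Head": 20, "Dept Head": 24, "Division Head": 6},
    "Degree":  {"Section Head": 5,  "Dept Head": 10, "Division Head": 14},
}


def row_total(qualification):
    """Total managers holding a given qualification."""
    return sum(counts[qualification].values())


def column_total(level):
    """Total managers at a given managerial level."""
    return sum(row[level] for row in counts.values())


grand_total = sum(row_total(q) for q in counts)

# (i) marginal probability
p_matric = row_total("Matric") / grand_total
# (ii) joint probability
p_section_and_degree = counts["Degree"]["Section Head"] / grand_total
# (iii) conditional probability: restrict to the Diploma row
p_dept_given_diploma = counts["Diploma"]["Dept Head"] / row_total("Diploma")
# (viii) addition rule for non-mutually exclusive events
p_division_or_diploma = (
    column_total("Division Head")
    + row_total("Diploma")
    - counts["Diploma"]["Division Head"]   # remove the double-counted cell
) / grand_total

for p in (p_matric, p_section_and_degree, p_dept_given_diploma, p_division_or_diploma):
    print(f"{100 * p:.2f}%")
```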
Exercise 4.10
File: X4.10 - bonus options.xls
              Cash bonus    Profit-sharing    Share options    Total
Admin         28            44                68               140
Production    56            75                29               160
Total         84            119               97               300
(a) P(Cash bonus) = 84/300 = 28.00%                  Marginal probability
(b) P(Share option) = 97/300 = 32.33%                Marginal probability
(c) P(Production ∩ Cash bonus) = 56/300 = 18.67%     Joint probability
(d) P(Share option / Admin) = 68/140 = 48.57%        Conditional probability
(e) P(Production / Cash bonus) = 56/84 = 66.67%      Conditional probability
(f)
Let A = Share option and B = Admin.
Is P(A/B) = P(A) ?
P(A/B) = 68/140 = 48.57%
P(A) = 97/300 = 32.33%
Since P(A/B) ≠ P(A), the two events are statistically dependent.
Conclusion: The choice of bonus option and employee work function are associated.
(g)
See (a) to (e) above
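The dependence test used here, comparing P(A/B) with P(A), can be sketched in a few lines of Python. The function name and the tolerance are illustrative assumptions; the counts are those of Exercise 4.10, with A = Share option and B = Admin.

```python
def compare_conditional_to_marginal(joint_count, b_count, a_count, grand_total, tol=1e-9):
    """Return (independent?, P(A/B), P(A)) from contingency-table counts.

    joint_count: count in the (A and B) cell
    b_count:     total count for event B (the conditioning event)
    a_count:     total count for event A
    """
    p_a_given_b = joint_count / b_count
    p_a = a_count / grand_total
    # Events are statistically independent only if P(A/B) equals P(A).
    return abs(p_a_given_b - p_a) <= tol, p_a_given_b, p_a


# Exercise 4.10: A = Share option, B = Admin
independent, p_a_given_b, p_a = compare_conditional_to_marginal(68, 140, 97, 300)
print(independent, round(p_a_given_b, 4), round(p_a, 4))
```

With sample data an exact equality is rarely observed, so this comparison is descriptive only; a formal test of association is a separate topic.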
Exercise 4.11
File: X4.11 - age profile.xls
         Department
Age      Production    Sales    Administration    Total
<30      60            25       18                103
30-50    70            29       25                124
>50      30            8        35                73
Total    160           62       78                300
(a) (i)   P(< 30 years) = 103/300 = 34.33%                       Marginal probability
    (ii)  P(Production) = 160/300 = 53.33%                       Marginal probability
    (iii) P(Sales ∩ (30-50) years) = 29/300 = 9.67%              Joint probability
    (iv)  P(>50 / Admin) = 35/78 = 44.87%                        Conditional probability
    (v)   P(Production U <30 U both) = (160+103-60)/300 = 67.67%  Addition rule for non-mutually exclusive events
(b)
No, since outcomes of these two events can occur simultaneously.
(c)
Let A = >50 and B = Admin.
Is P(A/B) = P(A) ?
P(A/B) = 35/78 = 44.87%
P(A) = 73/300 = 24.33%
Since P(A/B) ≠ P(A), the two events are statistically dependent.
Conclusion: There is an association between the Age of an employee and the Department
in which they are employed (e.g. younger employees tend to be in Production).
(d)
See (a) above
Exercise 4.12
File: X4.12 - digital cameras.xls
                Digital Camera Brand Preference
Usage           Canon    Nikon    Pentax    Total
Professional    48       15       27        90
Personal        30       95       65        190
Total           78       110      92        280
(a) P(Professional) = 90/280 = 32.14%          Marginal probability
(b) P(Nikon User) = 110/280 = 39.29%           Marginal probability
(c) P(Pentax / Personal) = 65/190 = 34.21%     Conditional probability
(d)
Let A = Professional usage and B = Canon preference.
Is P(A/B) = P(A) ?
P(A/B) = 48/78 = 61.54%
P(A) = 90/280 = 32.14%
Since P(A/B) ≠ P(A), the two events are statistically dependent.
Conclusion: Type of usage and choice of brand are associated.
(e.g. Professionals may prefer Canon while Nikon is favoured by Personal users).
(e) P(Canon ∩ Professional) = 48/280 = 17.14%  Joint probability
(f) P(Professional U Nikon U both) = (90+110-15)/280 = 66.07%
Addition rule for non-mutually exclusive events
(g)
No, as outcomes of the two events can occur simultaneously as illustrated in (f) above.
Exercise 4.13
(a) Probability Tree
Component A                Component B                Joint outcomes
P(F1) = 0.20 (Fail)        P(F2) = 0.15 (Fail)        P(F1 ∩ F2) = 0.20 x 0.15 = 0.03
                           P(S2) = 0.85 (Not fail)    P(F1 ∩ S2) = 0.20 x 0.85 = 0.17
P(S1) = 0.80 (Not fail)    P(F2) = 0.15 (Fail)        P(S1 ∩ F2) = 0.80 x 0.15 = 0.12
                           P(S2) = 0.85 (Not fail)    P(S1 ∩ S2) = 0.80 x 0.85 = 0.68
Total                                                 1.00
(b)
P(A) = 0.20
P(B) = 0.15
P(A ∩ B) = P(A) x P(B) = 0.2 * 0.15 = 0.03 (3% chance)
(c)
P(Fail) = P(A U B) = P(A) + P(B) - P(A∩B) = 0.2 + 0.15 - 0.03 = 0.32
Hence P(not Fail) = 1 - P(A U B) = 1 - 0.32 = 0.68 (68% chance)
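The end nodes of a two-stage probability tree with independent stages can be generated mechanically. The sketch below is in Python and assumes, as the solution does, that the two components fail independently; the function name joint_outcomes is hypothetical.

```python
from itertools import product


def joint_outcomes(p_fail_a, p_fail_b):
    """Joint probabilities for two independent components (tree end nodes)."""
    branches_a = {"F1": p_fail_a, "S1": 1 - p_fail_a}
    branches_b = {"F2": p_fail_b, "S2": 1 - p_fail_b}
    # Multiplication rule for independent events: P(a and b) = P(a) x P(b)
    return {
        (a, b): pa * pb
        for (a, pa), (b, pb) in product(branches_a.items(), branches_b.items())
    }


tree = joint_outcomes(0.20, 0.15)
p_both_fail = tree[("F1", "F2")]   # part (b)
p_no_fail = tree[("S1", "S2")]     # both components work
p_any_fail = 1 - p_no_fail         # part (c), complementary rule

for outcome, prob in tree.items():
    print(outcome, round(prob, 2))
```

Computing P(A U B) as 1 - P(S1 ∩ S2) gives the same 0.32 as the addition rule used in (c).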
Exercise 4.14
(a) Probability Tree
Workshop                    Exam Result           Joint outcomes
P(Y) = 0.30 (Attend)        P(S) = 0.80 (Pass)    P(Y ∩ S) = 0.30 x 0.80 = 0.24
                            P(F) = 0.20 (Fail)    P(Y ∩ F) = 0.30 x 0.20 = 0.06
P(N) = 0.70 (Not attend)    P(S) = 0.60 (Pass)    P(N ∩ S) = 0.70 x 0.60 = 0.42
                            P(F) = 0.40 (Fail)    P(N ∩ F) = 0.70 x 0.40 = 0.28
Total                                             1.00
(b)
Refer to the end nodes of the Probability Tree to answer question 4.14 (b)
(i)
P(Pass and Attended workshop) = P(Y ∩ S) = 0.30 x 0.80 = 0.24 (24%)
(ii)
P(Pass) = P(S) = P(S ∩ Y) + P(S ∩ N)
Reading off the Probability Tree:
P(S ∩ Y) = 0.30 x 0.80 = 0.24
P(S ∩ N) = 0.70 x 0.60 = 0.42
Thus P(S) = 0.24 + 0.42 = 0.66 (66%)
Exercise 4.15
Using Excel
(i)    6! = 6.5.4.3.2.1 = 720                         =FACT(6)
(ii)   3! 5! = (3.2.1)(5.4.3.2.1) = 720               =FACT(3)*FACT(5)
(iii)  4! 2! 3! = (4.3.2.1)(2.1)(3.2.1) = 288         =FACT(4)*FACT(2)*FACT(3)
(iv)   7C4 = 7!/(4!*(7-4)!) = 35                      =FACT(7)/(FACT(4)*FACT(3))
(v)    9C6 = 9!/(6!*(9-6)!) = 84                      =FACT(9)/(FACT(6)*FACT(3))
(vi)   8P3 = 8!/(8-3)! = 336                          =FACT(8)/FACT(5)
(vii)  5P2 = 5!/(5-2)! = 20                           =FACT(5)/FACT(3)
(viii) 7C7 = 7!/(7!*(7-7)!) = 1                       =FACT(7)/(FACT(7)*FACT(0))
(ix)   7P4 = 7!/(7-4)! = 840                          =FACT(7)/FACT(3)
Likely scenarios
(i)   Number of ways of arranging 6 cars on a showroom floor.
(ii)  Number of ways of arranging the seating plan of 8 persons,
      consisting of 3 males and 5 females.
(iii) Number of sequences of visiting 9 stores, consisting of 4 clothing stores,
      2 home décor stores and 3 coffee shops.
(iv)  Selecting all combinations of 4 holiday destinations from a possible 7 destinations.
      Similar for (v) and (viii)
(vi)  Selecting a committee of 3 persons from 8 candidates, where the first person
      selected is the chairman, the second is the secretary and the third is a member.
      Similar for (vii) and (ix)
Exercise 4.16
Assume each advertisement contains a different combination of 7 out of 12 products.
12C7
=
12!/(7!*(12-7)!) =
792 different combinations
=FACT(12)/(FACT(7)*FACT(5)) (Using Excel )
Exercise 4.17
A different permutation of 3 soup brands on 5 shelves is required.
5P3
=
5!/(5-3)! = 60 distinct orderings of 3 soup brands on 5 shelves.
=FACT(5)/FACT(2)
(Using Excel )
Exercise 4.18
(a)
9C4 = 9!/(4!*(9-4)!) = 126 separate portfolios of 4 equities.
=FACT(9)/(FACT(4)*FACT(5)) (Using Excel )
(b)
P(3,5,7,8) = 1/126 = 0.007937
0.794% chance of getting this combination.
Exercise 4.19
No. of permutations of 5 screws = 5!
5.4.3.2.1 =
120
Thus the probability of replacing them in exactly the same order =
1/120 =
0.00833 (0.833% chance)
Exercise 4.20
(a)
10C3
=
10!/(3!*(10-3)!) =
120 different selections
of 3 tourist attractions from 10 options.
=FACT(10)/(FACT(3)*FACT(7)) (Using Excel )
(b)
P(a given combination of 3 out of 10) =
1/120 =
0.0083
(0.833% chance)
Exercise 4.21
(a)
4 C2
x 7 C4 =
(4!/(2!*2!))*(7!/(4!*3!)) =
210
different committees
=FACT(4)/(FACT(2)*FACT(2))*FACT(7)/(FACT(4)*FACT(3))
(Using Excel )
(b)
(4C2 x 7C4) x 2 = (4!/(2!*2!))*(7!/(4!*3!)) x 2 =
420
different committees
=2*FACT(4)/(FACT(2)*FACT(2))*FACT(7)/(FACT(4)*FACT(3))
(Using Excel )
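The counting results in Exercises 4.16 to 4.21 can be cross-checked outside Excel. A minimal sketch in Python follows, assuming Python 3.8 or later, where math.comb and math.perm are available; the variable names are illustrative.

```python
import math

# Exercise 4.16: combinations of 7 products out of 12
adverts = math.comb(12, 7)

# Exercise 4.17: ordered placement of 3 soup brands on 5 shelves
shelf_orderings = math.perm(5, 3)

# Exercise 4.18: portfolios of 4 equities out of 9, and the chance of one given portfolio
portfolios = math.comb(9, 4)
p_specific_portfolio = 1 / portfolios

# Exercise 4.19: orderings of 5 screws
screw_orderings = math.factorial(5)

# Exercise 4.21 (a): the multiplication of two independent selections, 4C2 x 7C4
committees = math.comb(4, 2) * math.comb(7, 4)

print(adverts, shelf_orderings, portfolios, screw_orderings, committees)
print(round(p_specific_portfolio, 6))
```

math.comb(n, r) corresponds to =FACT(n)/(FACT(r)*FACT(n-r)) and math.perm(n, r) to =FACT(n)/FACT(n-r) in the Excel formulas above.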
Exercise 4.22
Project Scoping Study
APPROACH 1: Using a PROBABILITY TREE (Bayes Theorem)
T = On time; L = Late
S = Scope change; NS = No scope change
Marginal probabilities    Conditional probabilities    Joint probabilities
P(T) = 0.7                P(S/T) = 0.4                 P(S and T) = 0.28
                          P(NS/T) = 0.6                P(NS and T) = 0.42
P(L) = 0.3                P(S/L) = 0.8                 P(S and L) = 0.24
                          P(NS/L) = 0.2                P(NS and L) = 0.06
Total                                                  1
Bayes Application
Using the Joint Probabilities from the Probability Tree
Given   P(T) = 0.7 (Prior Probability)
Find    P(T/S) = P(T and S)/P(S) (Posterior Probability)
(i)     P(S) = P(S and T) + P(S and L) = 0.52
(ii)    P(T and S) = 0.28
Then    P(T/S) = P(T and S)/P(S) = 0.5385
There is a 53.85% chance that a 'scope-changed' project will be completed on time.
---ooOoo---
APPROACH 2: Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)
Additional Information    T       L       Total
S                         0.28    0.24    0.52
NS                        0.42    0.06    0.48
Total                     0.7     0.3     1
---ooOoo---
P(T/S) = P(T and S)/P(S) = 0.28/(0.28+0.24) = 0.5385
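Both approaches reduce to the same calculation: the joint probability of interest divided by the total probability of the observed event. A minimal sketch in Python, for a two-branch prior (the function name posterior and its parameter names are illustrative assumptions):

```python
def posterior(prior, likelihood, likelihood_given_complement):
    """Bayes' theorem for an event with two prior branches.

    prior:                       P(H), e.g. P(T) = 0.7
    likelihood:                  P(E/H), e.g. P(S/T) = 0.4
    likelihood_given_complement: P(E/not H), e.g. P(S/L) = 0.8
    """
    joint = prior * likelihood                                   # P(H and E)
    evidence = joint + (1 - prior) * likelihood_given_complement  # P(E)
    return joint / evidence                                      # P(H/E)


# Exercise 4.22: P(on time / scope change)
p_t_given_s = posterior(0.7, 0.4, 0.8)
print(round(p_t_given_s, 4))
```

The same function applies to Exercises 4.23 to 4.26 by substituting their priors and conditional probabilities, for example posterior(0.6, 0.4, 0.3) for Exercise 4.23.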
Exercise 4.23
Married Couples Sporting Habits Study
APPROACH 1: Using a PROBABILITY TREE (Bayes Theorem)
HS = Husband plays sport; HNS = Husband does not play sport
WS = Wife plays sport; WNS = Wife does not play sport
Marginal probabilities    Conditional probabilities    Joint probabilities
P(HS) = 0.6               P(WS/HS) = 0.4               P(HS and WS) = 0.24
                          P(WNS/HS) = 0.6              P(HS and WNS) = 0.36
P(HNS) = 0.4              P(WS/HNS) = 0.3              P(HNS and WS) = 0.12
                          P(WNS/HNS) = 0.7             P(HNS and WNS) = 0.28
Total                                                  1
Bayes Application
Using the Joint Probabilities from the Probability Tree
Given   P(HS) = 0.6 (Prior Probability)
Find    P(HS/WS) = P(HS and WS)/P(WS) (Posterior Probability)
(i)     P(WS) = P(WS and HS) + P(WS and HNS) = 0.36
(ii)    P(HS and WS) = 0.24
Then    P(HS/WS) = P(HS and WS)/P(WS) = 0.6667
There is a 66.67% chance that a husband plays sport if the wife also plays sport.
---ooOoo---
APPROACH 2: Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)
Additional Information    HS      HNS     Total
WS                        0.24    0.12    0.36
WNS                       0.36    0.28    0.64
Total                     0.6     0.4     1
---ooOoo---
P(HS/WS) = P(HS and WS)/P(WS) = 0.24/(0.24+0.12) = 0.6667
Exercise 4.24
Airline Departure Times Study
APPROACH 1: Using a PROBABILITY TREE (Bayes Theorem)
A = Airline A; B = Airline B
T = Leaves on Time; L = Leaves Late
Marginal probabilities    Conditional probabilities    Joint probabilities
P(A) = 0.6                P(T/A) = 0.8                 P(A and T) = 0.48
                          P(L/A) = 0.2                 P(A and L) = 0.12
P(B) = 0.4                P(T/B) = 0.65                P(B and T) = 0.26
                          P(L/B) = 0.35                P(B and L) = 0.14
Total                                                  1
Bayes Application
Using the Joint Probabilities from the Probability Tree
Given   P(A) = 0.6 (Prior Probability)
Find    P(A/T) = P(A and T)/P(T) (Posterior Probability)
(i)     P(T) = P(A and T) + P(B and T) = 0.74
(ii)    P(A and T) = 0.48
Then    P(A/T) = P(A and T)/P(T) = 0.6486
There is a 64.86% chance that it is Airline A, if the aircraft that has just left, left on time.
---ooOoo---
APPROACH 2: Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)
Additional Information    A       B       Total
T                         0.48    0.26    0.74
L                         0.12    0.14    0.26
Total                     0.6     0.4     1
---ooOoo---
P(A/T) = P(A and T)/P(T) = 0.48/(0.48+0.26) = 0.6486
Exercise 4.25
New Business Venture Study
APPROACH 1: Using a PROBABILITY TREE (Bayes Theorem)
G = NBV started by Graduate; NG = NBV started by non-Graduate
S = Successful; F = Failure
Marginal probabilities    Conditional probabilities    Joint probabilities
P(G) = 0.6                P(S/G) = 0.8                 P(G and S) = 0.48
                          P(F/G) = 0.2                 P(G and F) = 0.12
P(NG) = 0.4               P(S/NG) = 0.65               P(NG and S) = 0.26
                          P(F/NG) = 0.35               P(NG and F) = 0.14
Total                                                  1
Bayes Application
Using the Joint Probabilities from the Probability Tree
Given   P(G) = 0.6 (Prior Probability)
Find    P(G/F) = P(G and F)/P(F) (Posterior Probability)
(i)     P(F) = P(G and F) + P(NG and F) = 0.26
(ii)    P(G and F) = 0.12
Then    P(G/F) = P(G and F)/P(F) = 0.4615
There is a 46.15% chance that a NBV was started by a Graduate given that it has failed.
---ooOoo---
APPROACH 2: Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)
Additional Information    G       NG      Total
S                         0.48    0.26    0.74
F                         0.12    0.14    0.26
Total                     0.6     0.4     1
---ooOoo---
P(G/F) = P(G and F)/P(F) = 0.12/(0.12+0.14) = 0.4615
Exercise 4.26
On-line Airline Tickets Purchase Study
APPROACH 1: Using a PROBABILITY TREE (Bayes Theorem)
E = e-Ticket purchase; NE = non e-Ticket purchase
B = Business Traveller; NB = non-Business Traveller
Marginal probabilities    Conditional probabilities    Joint probabilities
P(E) = 0.4                P(B/E) = 0.8                 P(E and B) = 0.32
                          P(NB/E) = 0.2                P(E and NB) = 0.08
P(NE) = 0.6               P(B/NE) = 0.45               P(NE and B) = 0.27
                          P(NB/NE) = 0.55              P(NE and NB) = 0.33
Total                                                  1
Bayes Application
Using the Joint Probabilities from the Probability Tree
Given   P(E) = 0.4 (Prior Probability)
Find    P(E/B) = P(E and B)/P(B) (Posterior Probability)
(i)     P(B) = P(E and B) + P(NE and B) = 0.59
(ii)    P(E and B) = 0.32
Then    P(E/B) = P(E and B)/P(B) = 0.5424
There is a 54.24% chance that an e-ticket was bought given that it was bought by a business traveller.
---ooOoo---
APPROACH 2: Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)
Additional Information    E       NE      Total
B                         0.32    0.27    0.59
NB                        0.08    0.33    0.41
Total                     0.4     0.6     1
---ooOoo---
P(E/B) = P(E and B)/P(B) = 0.32/(0.32+0.27) = 0.5424
CHAPTER 5
PROBABILITY DISTRIBUTIONS
Exercise 5.1
Exercise 5.2
Binomial probability distribution
Poisson probability distribution
(a) continuous (e.g. 35.142 gm)
(b) discrete (e.g. 132 employees)
(c) discrete (e.g. 7046 households)
(d) continuous (e.g. 514.68 km)
Exercise 5.3
(a) (i)   n = 7; p = 0.2; x = 3
P(x = 3) = 7C3 (0.2)^3 (0.8)^(7-3) = 0.1147 (11.47%)
(a) (ii)  n = 10; p = 0.2; x = 4
P(x = 4) = 10C4 (0.2)^4 (0.8)^(10-4) = 0.0881 (8.81%)
(a) (iii) n = 12; p = 0.3; x ≤ 4
P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) + P(x = 4)
= 0.01384+0.07118+0.16779+0.2397+0.23114 = 0.7237 (72.37%)
(a) (iv)  n = 10; p = 0.05; x = 2 or 3
P(x = 2) + P(x = 3)
= 0.07464+0.01048 = 0.0851 (8.51%)
(a) (v)   n = 8; p = 0.25; x ≥ 3
1 - (P(x = 0) + P(x = 1) + P(x = 2))
= 1 - (0.10011+0.26697+0.31146) = 0.3215 (32.15%)
(b) (i)   =BINOMDIST(3,7,0.2,0)                             0.1147
(b) (ii)  =BINOMDIST(4,10,0.2,0)                            0.0881
(b) (iii) =BINOMDIST(4,12,0.3,1)                            0.7237
(b) (iv)  =BINOMDIST(2,10,0.05,0)+BINOMDIST(3,10,0.05,0)    0.0851
(b) (v)   =1-BINOMDIST(2,8,0.25,1)                          0.3215
Exercise 5.4
(a)
Binomial distribution
There are only two possible outcomes (in-stock; out-of-stock)
This outcome is observed 6 times (n = 6 stores)
The probability of observing the "out-of-stock" outcome, p = 0.20, is constant.
The stores (trials) are independent of each other
(b)
P(x = 1) = 6C1 (0.2)^1 (0.8)^(6-1) =
0.3932
39.32% =BINOMDIST(1,6,0.2,0)
(c)
P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2)
0.2621 + 0.3932 + 0.2458 =
0.9011
90.11% =BINOMDIST(2,6,0.2,1)
(d)
P(x = 0) = 6C0 (0.2)^0 (0.8)^(6-0) =
0.2621
26.21% =BINOMDIST(0,6,0.2,0)
(e)
Mean (binomial) =
6 x 0.2 =
1.2
On average, 1.2 stores out of the 6 stores surveyed can be expected to be out of stock in a week.
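The binomial calculations in Exercise 5.4 can be reproduced from the formula nCx p^x (1-p)^(n-x). The sketch below is in Python; the helper names binom_pmf and binom_cdf are illustrative stand-ins for Excel's BINOMDIST with a last argument of 0 and 1 respectively.

```python
import math


def binom_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """P(X <= x): cumulative sum of the individual probabilities from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


n, p = 6, 0.20                    # 6 stores, P(out of stock) = 0.20
p_exactly_one = binom_pmf(1, n, p)   # part (b)
p_at_most_two = binom_cdf(2, n, p)   # part (c)
p_none = binom_pmf(0, n, p)          # part (d)
mean = n * p                         # part (e)

print(round(p_exactly_one, 4), round(p_at_most_two, 4), round(p_none, 4), round(mean, 1))
```

"At least" questions such as Exercise 5.3 (a)(v) follow from the complement, for example 1 - binom_cdf(2, 8, 0.25).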
Exercise 5.5
Binomial distribution
(a)
n = 12   p = 0.15
P(x = 0) = 12C0 (0.15)^0 (0.85)^(12-0) = 0.1422
=BINOMDIST(0,12,0.15,0)
(b)
n = 15   p = 0.15
P(x < 3) = P(x = 0) + P(x = 1) + P(x = 2)
= 0.0874 + 0.2312 + 0.2857 = 0.6042
=BINOMDIST(2,15,0.15,1)
Exercise 5.6
(a)
Binomial distribution
n = 10 p = 0.30
(probability of preferring the deluxe model)
P(x = 3) = 10C3 (0.30)^3 (0.70)^(10-3) =
0.2668
=BINOMDIST(3,10,0.3,0)
(b)
n = 10 p = 0.70
(probability of preferring the standard model)
P(x > 2) = 1 - P(x ≤ 2) = 1 - (P(x = 0) + P(x = 1) + P(x = 2)) =
= 1 - (0.0000059 + 0.000138 + 0.00145) =
0.9984
=1-BINOMDIST(2,10,0.7,1)
Exercise 5.7
(a)
Binomial distribution
n = 8   p = 0.05 (probability of a defective Tata truck)
P(x = 1) = 8C1 (0.05)^1 (0.95)^(8-1) = 0.2793
=BINOMDIST(1,8,0.05,0)
(b)
n = 8   p = 0.05 (probability of a defective Tata truck)
P(x = 0) = 8C0 (0.05)^0 (0.95)^(8-0) = 0.6634
=BINOMDIST(0,8,0.05,0)
(c)
n = 8   p = 0.05 (probability of a defective Tata truck)
P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2)
= 0.66342 + 0.27934 + 0.05146 = 0.9942
=BINOMDIST(2,8,0.05,1)
(d)
Mean(binomial) = 64 x 0.05 = 3.2
Based on 64 sales, the dealer can expect 3.2 trucks to be returned for repairs to assembly defects.
Exercise 5.8
(a)
Binomial distribution
n = 6   p = 0.80 (probability of a UT out-performing the JSE Index)
P(x = 6) = 6C6 (0.80)^6 (0.20)^(6-6) = 0.2621
=BINOMDIST(6,6,0.8,0)
(b)
n = 6   p = 0.80 (probability of a UT out-performing the JSE Index)
P(x = (2 U 3)) = P(x = 2) + P(x = 3)
= 0.01536 + 0.08192 = 0.0973
=BINOMDIST(2,6,0.8,0)+BINOMDIST(3,6,0.8,0)
(c)
n = 6   p = 0.20 (probability of a UT under-performing the JSE Index)
P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2)
= 0.26214 + 0.39322 + 0.24576 = 0.9011
=BINOMDIST(2,6,0.2,1)
Exercise 5.9
(a)
Binomial distribution
n = 12   p = 0.20 (probability of a person participating in a focus group)
P(x = 2) = 12C2 (0.20)^2 (0.80)^(12-2) = 0.2835
=BINOMDIST(2,12,0.2,0)
(b)
n = 12   p = 0.20 (probability of a person participating in a focus group)
P(x = 5) = 12C5 (0.20)^5 (0.80)^(12-5) = 0.0532
=BINOMDIST(5,12,0.2,0)
(c)
n = 12   p = 0.20 (probability of a person participating in a focus group)
P(x ≥ 6) = 1 - P(x ≤ 5) = 1 - (P(x=0)+P(x=1)+P(x=2)+P(x=3)+P(x=4)+P(x=5))
= 1 - (0.0687 + 0.2062 + 0.2835 + 0.2362 + 0.1329 + 0.0532) = 0.0194
=1 - BINOMDIST(5,12,0.2,1)
Exercise 5.10
(a) (i)
Binomial distribution
n = 10
p = 0.10 (probability of a general population person being a heavy reader )
P(x < 2) = P(x=0)+P(x=1) = 10C0 (0.10)^0 (0.90)^(10-0) + 10C1 (0.10)^1 (0.90)^(10-1)
= 0.34868 + 0.38742 = 0.7361
=BINOMDIST(1,10,0.1,1)
(a) (ii)
n = 10
p = 0.35 (probability of a pensioner person being a heavy reader )
P(x < 2) = P(x=0)+P(x=1) = 10C0 (0.35)^0 (0.65)^(10-0) + 10C1 (0.35)^1 (0.65)^(10-1)
= 0.01346 + 0.07249 = 0.0860
=BINOMDIST(1,10,0.35,1)
(b)
n = 280
p = 0.65 (probability of a pensioner person not being a heavy reader )
Expected number (mean) of "non-heavy" readers =
280 x 0.65 =
182
Exercise 5.11
(a)
Poisson distribution
P(x = 5 / a = 3) = e^-3 3^5 / 5! = 0.1008 (10.08%)
=POISSON(5,3,0)
(b)
P(x ≥ 4 / a = 3) = 1 - P(x ≤ 3) = 1 - (P(x=0)+P(x=1)+P(x=2)+P(x=3))
= 1 - (e^-3 3^0 / 0! + e^-3 3^1 / 1! + e^-3 3^2 / 2! + e^-3 3^3 / 3!)
= 1 - (0.04979 + 0.14936 + 0.22404 + 0.22404)
= 1 - 0.64723 = 0.3528 (35.28%)
=1-POISSON(3,3,1)
(c)
P(x = 0 / a = 3) = e^-3 3^0 / 0! = 0.0498 (4.98%)
=POISSON(0,3,0)
Exercise 5.12
(a)
Poisson distribution
P(x ≤ 2 / a = 4) = P(x=0)+P(x=1)+P(x=2)
= e^-4 4^0 / 0! + e^-4 4^1 / 1! + e^-4 4^2 / 2!
= 0.01832 + 0.07326 + 0.14653 = 0.2381 (23.81%)
=POISSON(2,4,1)
(b)
P(x ≥ 4 / a = 4) = 1 - P(x ≤ 3) = 1 - (P(x=0)+P(x=1)+P(x=2)+P(x=3))
= 1 - (0.2381 (from (a) above) + e^-4 4^3 / 3!)
= 1 - (0.2381 + 0.19537)
= 1 - 0.43347 = 0.5665 (56.65%)
=1-POISSON(3,4,1)
Exercise 5.13
(a) (i)
Poisson distribution
P(x = 1 / a = 6) = e^-6 6^1 / 1! = 0.0149 (1.49%)
=POISSON(1,6,0)
(a) (ii)
P(x ≤ 3 / a = 6) = P(x=0)+P(x=1)+P(x=2)+P(x=3)
= e^-6 6^0 / 0! + e^-6 6^1 / 1! + e^-6 6^2 / 2! + e^-6 6^3 / 3!
= 0.00248 + 0.01487 + 0.04462 + 0.08924 = 0.1512 (15.12%)
=POISSON(3,6,1)
(a) (iii)
P(x ≥ 3 / a = 6) = 1 - P(x ≤ 2) = 1 - (P(x=0)+P(x=1)+P(x=2))
= 1 - (e^-6 6^0 / 0! + e^-6 6^1 / 1! + e^-6 6^2 / 2!)
= 1 - (0.00248 + 0.01487 + 0.04462)
= 1 - 0.06197 = 0.938 (93.80%)
=1-POISSON(2,6,1)
(b)
Note: a = 3 since the mean orders must refer to a given half-day interval (i.e. 6/2 = 3)
P(x = 1 / a = 3) = e^-3 3^1 / 1! = 0.1494 (14.94%)
=POISSON(1,3,0)
(c)
Mean = 6 orders/day
Std dev = √6 = 2.449 orders/day
Exercise 5.14
(a)
Poisson distribution
P(x ≥ 3 / a = 1.8) = 1 - P(x ≤ 2) = 1 - (P(x = 0) + P(x = 1) + P(x = 2))
= 1 - (e^-1.8 1.8^0 / 0! + e^-1.8 1.8^1 / 1! + e^-1.8 1.8^2 / 2!)
= 1 - (0.1653 + 0.29754 + 0.26778)
= 1 - 0.7306 = 0.2694 (26.94%)
=1-POISSON(2,1.8,1)
(b)
P(x < 4 / a = 1.8) = P(x ≤ 3) = P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3)
= e^-1.8 1.8^0 / 0! + e^-1.8 1.8^1 / 1! + e^-1.8 1.8^2 / 2! + e^-1.8 1.8^3 / 3!
= 0.1653 + 0.29754 + 0.26778 + 0.1607 = 0.8913 (89.13%)
=POISSON(3,1.8,1)
Exercise 5.15
(a)
Poisson distribution
P(x ≤ 5 / a = 7) = P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) + P(x = 4) + P(x = 5)
= e^-7 7^0 / 0! + e^-7 7^1 / 1! + e^-7 7^2 / 2! + e^-7 7^3 / 3! + e^-7 7^4 / 4! + e^-7 7^5 / 5!
= 0.00091 + 0.00638 + 0.02234 + 0.05213 + 0.09123 +0.12772 =
0.3007
=POISSON(5,7,1)
30.07%
(b)
P(x = 6 or x = 9 / a = 7) = P(x=6) + P(x=9)
= e^-7 7^6 / 6! + e^-7 7^9 / 9!
= 0.1490 + 0.1014 = 0.2504 (25.04%)
=POISSON(6,7,0)+POISSON(9,7,0)
(c)
Note: a = 14 since the time interval is a given two-day period (i.e. 7 x 2 = 14)
P(x > 20 / a = 14) = 1 - P(x ≤ 20)
= 1 - (P(x = 0) + P(x = 1) + P(x = 2) + …… + P(x = 19) + P(x = 20))
Use Excel only.
=1 - POISSON(20,14,1) =
= 1 - 0.9521 =
0.0479
4.79%
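The Poisson answers in Exercise 5.15, including part (c) where the manual defers to Excel, can be checked with the formula e^-a a^x / x!. A minimal sketch in Python; poisson_pmf and poisson_cdf are hypothetical helper names playing the role of POISSON(x, a, 0) and POISSON(x, a, 1).

```python
import math


def poisson_pmf(x, a):
    """P(X = x) for a Poisson distribution with mean a."""
    return math.exp(-a) * a**x / math.factorial(x)


def poisson_cdf(x, a):
    """P(X <= x): cumulative sum of the individual probabilities from 0 to x."""
    return sum(poisson_pmf(k, a) for k in range(x + 1))


daily_mean = 7
p_at_most_five = poisson_cdf(5, daily_mean)                               # part (a)
p_six_or_nine = poisson_pmf(6, daily_mean) + poisson_pmf(9, daily_mean)   # part (b)

# Part (c): the mean must match the interval, so a two-day period has a = 7 x 2 = 14.
two_day_mean = daily_mean * 2
p_more_than_twenty = 1 - poisson_cdf(20, two_day_mean)

print(round(p_at_most_five, 4), round(p_six_or_nine, 4), round(p_more_than_twenty, 4))
```

Rescaling the mean before evaluating the formula is the same step taken in Exercise 5.13 (b), where a half-day interval gives a = 6/2 = 3.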
Exercise 5.16
Normal distribution
(a)
Use the Standard Normal (z) table
(i)   P(0 < z < 1.83) = 0.4664
(ii)  P(z > -0.48) = 0.5 + P(0 < z < 0.48) = 0.5 + 0.1844 = 0.6844
(iii) P(-2.25 < z < 0) = P(0 < z < 2.25) = 0.4878
(iv)  P(1.22 < z) = 0.5 - P(0 < z < 1.22) = 0.5 - 0.3888 = 0.1112
(v)   P(-2.08 < z < 0.63) = P(0 < z < 2.08) + P(0 < z < 0.63) = 0.4812 + 0.2357 = 0.7169
(vi)  P(z < -0.68) = 0.5 - P(0 < z < 0.68) = 0.5 - 0.2517 = 0.2483
(vii) P(0.33 < z < 1.5) = P(0 < z < 1.5) - P(0 < z < 0.33) = 0.4332 - 0.1293 = 0.3039
(b)
Using Excel
(i)   =NORMSDIST(1.83)-0.5                   0.4664
(ii)  =1-NORMSDIST(-0.48)                    0.6844
(iii) =0.5-NORMSDIST(-2.25)                  0.4878
(iv)  =1-NORMSDIST(1.22)                     0.1112
(v)   =NORMSDIST(0.63)-NORMSDIST(-2.08)      0.7169
(vi)  =NORMSDIST(-0.68)                      0.2483
(vii) =NORMSDIST(1.5)-NORMSDIST(0.33)        0.3039
Exercise 5.17
Normal distribution
(a)
Use the Standard Normal (z) table
(i)   P(z < ?) = 0.9147       Look up area 0.9147 - 0.5 = 0.4147       z = 1.37
(ii)  P(z > ?) = 0.5319       Look up area 0.5319 - 0.5 = 0.0319       z = -0.08
      Since area to right of z is greater than 0.5, z must be negative.
(iii) P(0 < z < ?) = 0.4015   Look up area 0.4015                      z = 1.29
(iv)  P(? < z < 0) = 0.4803   Look up area 0.4803                      z = -2.06
(v)   P(? < z ) = 0.0985      Look up area 0.5 - 0.0985 = 0.4015       z = 1.29
(vi)  P(z < ?) = 0.2517       Look up area 0.5 - 0.2517 = 0.2483       z = -0.67
      Since area to left of z is less than 0.5, z must be negative.
(vii) P(? < z ) = 0.6331      Look up area 0.6331 - 0.5 = 0.1331       z = -0.34
      Since area to right of z is greater than 0.5, z must be negative.
(b)
Using Excel
(i)   =NORMSINV(0.9147)          1.3703
(ii)  =NORMSINV(1-0.5319)        -0.0800
(iii) =NORMSINV(0.5+0.4015)      1.2901
(iv)  =NORMSINV(0.5-0.4803)      -2.0600
(v)   =NORMSINV(1-0.0985)        1.2901
(vi)  =NORMSINV(0.2517)          -0.6691
(vii) =NORMSINV(1-0.6331)        -0.3401
Exercise 5.18
(a)
Use the Standard normal (z) table
μ = 64   σ = 2.5
(i)   P(x < 62) = P(z < (62-64)/2.5) = P(z < -0.80) = 0.5 - 0.2881 = 0.2119
(ii)  P(x > 67.4) = P(z > (67.4-64)/2.5) = P(z > 1.36) = 0.5 - 0.4131 = 0.0869
(iii) P(59.6 < x < 62.8) = P((59.6-64)/2.5 < z < (62.8-64)/2.5) = P(-1.76 < z < -0.48)
      = 0.4608 - 0.1844 = 0.2764
(iv)  P(x > ?) = 0.1026
      Look up area = 0.3974 giving z ≈ 1.267
      then x = 64+1.267*2.5 = 67.168 min
(v)   P(x > ?) = 0.9772
      Look up area = 0.9772 - 0.5 = 0.4772 giving z = -2.00
      then x = 64-2.00*2.5 = 59 min
(vi)  P(60.2 < x < ?) = 0.6652
      Find P(60.2 < x < 64) = P(-1.52 < z < 0) giving area = 0.4357.
      Look up area = 0.6652 - 0.4357 = 0.2295 giving z ≈ 0.611
      then x = 64+0.611*2.5 = 65.528 min
(b)
Using Excel NORMDIST and NORMINV
(i)   =NORMDIST(62,64,2.5,1)                                0.2119
(ii)  =1-NORMDIST(67.4,64,2.5,1)                            0.0869
(iii) =NORMDIST(62.8,64,2.5,1)-NORMDIST(59.6,64,2.5,1)      0.2764
(iv)  =NORMINV(1-0.1026,64,2.5)                             67.167
(v)   =NORMINV(1-0.9772,64,2.5)                             59.002
(vi)  =NORMDIST(60.2,64,2.5,1)+0.6652                       0.729455
      Apply area = 0.729455 to =NORMINV(0.729455,64,2.5)    65.528
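The NORMDIST and NORMINV steps in Exercise 5.18 have direct counterparts in Python's standard library. The sketch below assumes Python 3.8 or later, where statistics.NormalDist provides cdf (cumulative area to the left) and inv_cdf (the x-value for a given left-tail area); the variable names are illustrative.

```python
from statistics import NormalDist

dist = NormalDist(mu=64, sigma=2.5)

# (i) area to the left of 62
p_below_62 = dist.cdf(62)
# (ii) area to the right of 67.4, via the complement
p_above_67_4 = 1 - dist.cdf(67.4)
# (iii) area between two x-values is the difference of two left-tail areas
p_between = dist.cdf(62.8) - dist.cdf(59.6)

# (iv), (v): an upper-tail area must be converted to a left-tail area first
x_upper_1026 = dist.inv_cdf(1 - 0.1026)
x_upper_9772 = dist.inv_cdf(1 - 0.9772)

# (vi): add the required area to the left-tail area at 60.2, then invert
x_vi = dist.inv_cdf(dist.cdf(60.2) + 0.6652)

print(round(p_below_62, 4), round(p_above_67_4, 4), round(p_between, 4))
print(round(x_upper_1026, 3), round(x_upper_9772, 3), round(x_vi, 3))
```

Small differences from the manual answers (for example 67.167 versus 67.168) come from rounding z when reading the table.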
Exercise 5.19
Gym Attendance Duration
Normal distribution
x ≡ N(80, 20)
(a)
P(x > 120) = P(z > (120-80)/20) = P(z > 2) =
0.5 - 0.4772 =
0.0228
2.28%
(b)
P(x < 60) = P(z < (60-80)/20) = P(z < -1) =
0.5 - 0.3413 =
0.1587
15.87%
(c)
P(x > k) = 0.60
Look up area = 0.60 - 0.50 = 0.10 giving z ≈ -0.253
Then
x = 80 - 0.253*20 ≈
Using Excel
NORMDIST or NORMINV
(a)
=1-NORMDIST(120,80,20,1)
0.0228
(b)
=NORMDIST(60,80,20,1)
0.1587
(c)
=NORMINV(1-0.6,80,20)
74.93
minutes
74.94 min
Exercise 5.20
Automatic Washing Machine Lifespan
Normal distribution
x ≡ N(3.1; 1.1)
(a)
P(x < 1) = P(z < (1-3.1)/1.1) = P(z < -1.9091) =
0.5 - 0.4719 =
0.0281
2.81%
(b) (i)
P(x > 4) = P(z > (4-3.1)/1.1) = P(z > 0.819) =
0.5 - 0.2939 =
0.2061
20.61%
0.5 - 0.4854 =
0.0146
1.46%
(b) (ii) P(x > 5.5) = P(z > (5.5-3.1)/1.1) = P(z > 2.182) =
(c)
P(x < k) = 0.05
Look up area = 0.50 - 0.05 = 0.45 giving z ≈ -1.645
Then
x = 3.1 - 1.645*1.1 ≈
1.29 years
Using Excel
NORMDIST or NORMINV
(a)
=NORMDIST(1,3.1,1.1,1)
0.0281
(b) (i)
=1-NORMDIST(4,3.1,1.1,1)
0.2061
(b) (ii) =1-NORMDIST(5.5,3.1,1.1,1)
(c)
=NORMINV(0.05,3.1,1.1)
0.0146
1.29 years
Exercise 5.21
(a)
(i)
Household Daily Water Usage
Normal distribution
x ≡ N(220; 45)
P(x > 300) = P(z > (300-220)/45) = P(z > 1.778) =
0.5 - 0.4625 =
0.0375
3.75%
(ii)
P(x < 100) = P(z < (100-220)/45) = P(z < -2.667) =
0.5 - 0.4962 =
0.0038
0.38%
(iii)
P(x < k) = 0.15
Look up area = 0.50 - 0.15 = 0.35 giving z ≈ -1.038 (approx)
Then
x = 220 - 1.038*45 ≈
173.29 litres
(iv)
P(x > k) = 0.20
Look up area = 0.50 - 0.20 = 0.30 giving z ≈ 0.842 (approx)
Then
x = 220 + 0.842*45 ≈
257.89 litres
Using Excel
NORMDIST
(b)
(a) (i)
(a) (ii)
(c)
=1-NORMDIST(300,220,45,1)
=NORMDIST(100,220,45,1)
Using Excel
0.0377
0.0038
NORMINV
(a) (iii) =NORMINV(0.15,220,45)
(a) (iv) =NORMINV(1-0.2,220,45)
173.36
257.87
litres
litres
Exercise 5.22
(a)
(i)
(ii)
(iii)
(iv)
Long-distance Truck Drivers' Reaction Times
Normal distribution x ≡ N(1.4; 0.25)
Variance = 0.0625 Std dev = 0.25
P(x > 2) = P(z > (2-1.4)/0.25) = P(z > 2.4) =
0.5 - 0.4918 =
0.0082
P(1.2 < x < 1.4) = P((1.2-1.4)/0.25 < z < (1.4-1.4)/0.25) = P(-0.80 < z < 0) =
0.2881
P(x < 0.9) = P(z < (0.9-1.4)/0.25) = P(z < -2.00) =
0.5 - 0.4772 =
0.0228
P(0.5 < x < 1.0) = P((0.5-1.4)/0.25 < z < (1.0-1.4)/0.25) = P(-3.6 < z < -1.6)
= 0.49984 - 0.4452 = 0.0546
(b)
P(x > 1.8) = P(z > (1.8-1.4)/0.25) = P(z > 1.6) =
(c)
P(x > 1.7) = P(z > (1.7-1.4)/0.25) = P(z > 1.2) =
0.82%
28.81%
2.28%
5.46%
0.5 - 0.4452 =
0.0548
5.48%
No. of truck drivers = 120 * 0.0548 = 6.576 drivers
7 drivers (approx)
0.5 - 0.3849 =
0.1151
11.51%
No. of truck drivers = 360 * 0.1151 = 41.436 drivers
42 drivers (approx)
NORMDIST
(a)
Using Excel
(i)
(ii)
(iii)
(iv)
=1-NORMDIST(2,1.4,0.25,1)
=0.5-NORMDIST(1.2,1.4,0.25,1)
=NORMDIST(0.9,1.4,0.25,1)
=NORMDIST(1,1.4,0.25,1)-NORMDIST(0.5,1.4,0.25,1)
0.0082
0.2881
0.0228
0.0546
0.82%
28.81%
2.28%
5.46%
(b)
=1-NORMDIST(1.8,1.4,0.25,1)
truck drivers
0.0548
6.576
5.48%
7
(c)
=1-NORMDIST(1.7,1.4,0.25,1)
truck drivers
41.425
0.115
42
Exercise 5.23
Hair dye Containers Mass
Normal distribution
x ≡ N(18.2; 0.7)
(a) P(x < 18) = P(z < (18-18.2)/0.7) = P(z < -0.2857) =
0.5 - 0.1124 (approx) =
or
0.3876
38.76%
Look up area = 0.50 - 0.15 = 0.35 giving z ≈ 1.038 (approx)
Then
x = 18.2 + 1.038 * 0.7 ≈
18.927 gm
(b) P(x > k) = 0.15
Using Excel
Variance = 0.49 Std dev = 0.7
NORMDIST and NORMINV
(a) =NORMDIST(18,18.2,0.7,1)
(b) =NORMINV(0.85,18.2,0.7)
0.3875
18.926
%
gm
Exercise 5.24
Motor Vehicle Service Time
Normal distribution
x ≡ N(70; 9)
Variance = 81 Std dev = 9
(a)
P(x < 60) = P(z < (60-70)/9) = P(z < -1.11) =
0.5 - 0.3665 =
0.1335
13.35%
(b)
P(x > 90) = P(z > (90-70)/9) = P(z > 2.22) =
0.5 - 0.4868 =
0.0132
1.32%
(c)
P(50 < x < 60) = P((50-70)/9 < z < (60-70)/9) = P(-2.22 < z < -1.11) =
0.4868 - 0.3665 =
0.1203
12.03%
(d)
P(x > 80) = P(z > (80-70)/9) = P(z > 1.11) =
No. of customers =
0.1335 * 80 =
0.1335
10.68
13.35%
customers
(e)
P(z > (80-μ)/9) = 0.05
Then
Look up area = 0.50 - 0.05 = 0.45 giving z ≈ 1.645
1.645 = (80 - μ)/9
giving μ = 80 - 1.645 * 9 = 65.195 min
Using Excel NORMDIST
(a) =NORMDIST(60,70,9,1)                        0.13326     13.33%
(b) =1-NORMDIST(90,70,9,1)                      0.013134    1.31%
(c) =NORMDIST(60,70,9,1)-NORMDIST(50,70,9,1)    0.120126    12.01%
(d) =1-NORMDIST(80,70,9,1)                      0.13326     13.33%
(e) This answer cannot be computed from the Excel function NORMDIST.
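Part (e) asks for the mean rather than a probability, so it needs the inverse step followed by rearranging z = (x - μ)/σ. A minimal sketch in Python, using the standard normal inverse from statistics.NormalDist (Python 3.8 or later); the function name required_mean is hypothetical.

```python
from statistics import NormalDist


def required_mean(x, sigma, upper_tail_prob):
    """Mean μ for which P(X > x) equals upper_tail_prob, given σ.

    Rearranges z = (x - μ)/σ to μ = x - z*σ, where z cuts off
    upper_tail_prob in the right tail of the standard normal.
    """
    z = NormalDist().inv_cdf(1 - upper_tail_prob)
    return x - z * sigma


# Exercise 5.24 (e): only 5% of services may exceed 80 minutes, σ = 9
mu = required_mean(80, 9, 0.05)
check = 1 - NormalDist(mu, 9).cdf(80)   # should reproduce the 5% tail

print(round(mu, 3), round(check, 4))
```

The result agrees with the manual answer of about 65.195 min; the last digits depend on how many decimals of z = 1.645 are carried.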
Exercise 5.25
Coffee Dispensing Machine - Cup Fill
Normal distribution
x ≡ N(230; 10)
(a)
(i)   P(x > 235) = P(z > (235-230)/10) = P(z > 0.5) = 0.5 - 0.1915 = 0.3085 (30.85%)
(ii)  P(235 < x < 245) = P((235-230)/10 < z < (245-230)/10) = P(0.5 < z < 1.5)
      = 0.4332 - 0.1915 = 0.2417 (24.17%)
(iii) P(x < 220) = P(z < (220-230)/10) = P(z < -1.00) = 0.5 - 0.3413 = 0.1587 (15.87%)
(b)
P(x > k) = 0.15
Look up area = 0.50 - 0.15 = 0.35 giving z ≈ 1.038 (approx)
Then x = 230 + 1.038 * 10 = 240.38 ml
(c)
P(z < (220-μ)/10) = 0.10
Look up area = 0.50 - 0.10 = 0.40 giving z ≈ -1.282
Then -1.282 = (220 - μ)/10
giving μ = 220 + 1.282 * 10 = 232.82 ml
Using Excel NORMDIST and NORMINV
(a) (i)   =1-NORMDIST(235,230,10,1)                         0.3085    30.85%
(a) (ii)  =NORMDIST(245,230,10,1)-NORMDIST(235,230,10,1)    0.2417    24.17%
(a) (iii) =NORMDIST(220,230,10,1)                           0.1587    15.87%
(b)       =NORMINV(1-.15,230,10)                            240.36 ml
Exercise 5.26
Normal distribution
Car Battery Lifespan
x ≡ N(28; 4)
(a)
P(30 < x < 34) = P((30-28)/4 < z < (34-28)/4) = P(0.50 < z < 1.50)
= 0.4332 - 0.1915 = 0.2417 or 24.17%
(b)
P(x < 24) = P(z < (24-28)/4) = P(z < -1.00) = 0.5 - 0.3413 = 0.1587 or 15.87%
(c)
P(x > k) = 0.60
Look up area = 0.60 - 0.50 = 0.10 giving z ≈ -0.253 (approx)
Then x = 28 - 0.253 * 4 ≈ 26.988 months
(d)
P(z < (x-28)/4) = 0.05
Look up area = 0.50 - 0.05 = 0.45 giving z ≈ -1.645
Then -1.645 = (x - 28)/4
giving x = 28 - 1.645 * 4 = 21.42 months
Using Excel NORMDIST and NORMINV
(a) =NORMDIST(34,28,4,1) - NORMDIST(30,28,4,1)    0.2417    24.17%
(b) =NORMDIST(24,28,4,1)                          0.1587    15.87%
(c) =NORMINV(0.4,28,4)                            26.987 mths
(d) =NORMINV(.05,28,4)                            21.421 mths
CHAPTER 6
SAMPLING AND SAMPLING DISTRIBUTIONS
6.1
To generalise sample findings to the target population.
6.2
Sampling methods; Concept of the sampling distribution.
6.3
subset.
6.4
representative.
6.5
Non-probability sampling: sample members are chosen using non-random criteria, meaning
that some members of the target population have no chance of being included in the
sample.
Probability sampling: sample members are chosen using random selection processes so that
each target population member stands a chance of being included in the sample.
6.6
Random (probability) sampling methods. Every member of the target population has a
chance of being included in the sample. This is likely to result in a more representative
sample than if a non-probability sampling method was used.
6.7
Unrepresentativeness of the target population;
Not possible to measure sampling error.
6.8
random (chance)
6.9
equal
6.10
Simple random sampling
6.11
Systematic random sampling
6.13
Stratified random sampling
6.14
Cluster random sampling
6.15
It results in a smaller sampling error
6.16
sample statistic; population parameter
6.17
standard error
6.18
95.5%
6.19
Normal
6.20
n = 30 or larger
6.21
Central Limit Theorem.
6.22
Sampling error is the error made when estimating the true population parameter (e.g.
population mean) when using the sample statistic (e.g. sample mean).
CHAPTER 7
CONFIDENCE INTERVAL ESTIMATION
Exercise 7.1
To estimate a population parameter value by defining an Interval within which the true
population value is likely to fall at a stated level of confidence.
Exercise 7.2
Use the z-statistic since the population standard deviation, σ, is known (Given σ = 8)
Standard error = 8/√64 = 1
95% Confidence level: z(0.95) = 1.96 (Use NORMSINV(0.975))
Margin of error = z x SE = 1.96
Lower 95% confidence limit = 85 - 1.96(1) = 83.04
Upper 95% confidence limit = 85 + 1.96(1) = 86.96
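The z-based interval in Exercise 7.2 follows the pattern sample mean ± z x (σ/√n). A minimal sketch in Python, assuming Python 3.8 or later for statistics.NormalDist; the function name z_interval is hypothetical, and the inverse call mirrors NORMSINV(0.975).

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Confidence limits for a population mean when σ is known."""
    # Two-sided interval: put (1 - confidence)/2 in each tail.
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    standard_error = sigma / math.sqrt(n)
    margin = z_crit * standard_error
    return sample_mean - margin, sample_mean + margin


# Exercise 7.2: x(bar) = 85, σ = 8, n = 64, 95% confidence
lower, upper = z_interval(85, 8, 64, 0.95)
print(round(lower, 2), round(upper, 2))
```

Lowering the confidence level shrinks z_crit and therefore the margin, which is the narrowing effect noted in Exercise 7.7 (b).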
Exercise 7.3
t statistic
Exercise 7.4
Use the t-statistic since the population standard deviation, σ, is unknown (Given s = 6)
Standard error (approx) = 6/√25 = 1.2
Degrees of freedom = 25 - 1 = 24
90% Confidence level: t(0.10,24) = 1.711 (Use TINV(0.10,24))
Note: TINV requires tail probability
Margin of error = t x SE = 2.05
Lower 90% confidence limit = 54 - 1.711(1.2) = 51.95
Upper 90% confidence limit = 54 + 1.711(1.2) = 56.05
Exercise 7.5
(a)
Manually
x(bar) = 24.4
σ = 10.8
std error =
z crit (95%) =
95% Confidence level =
10.8/√144 =
1.96
1.96 *0.9 =
Lower 95% confidence limit
Upper 95% confidence limit
Interpretation
n = 144
0.9
1.764
24.4 - 1.764 =
24.4 + 1.764 =
22.636
26.164
There is a 95% chance that the interval (22.636 to 26.164) covers
the actual average number of employees per SME in Gauteng.
(b)
z-crit (0.95)
=NORMSINV(0.975) =
1.96
(c)
95% Confidence level
=CONFIDENCE(0.05,10.8,144)
1.764
Exercise 7.6
(a)
Manually
x(bar) = 131.6
σ = 25
std error = 2.68
25/√87 =
z crit (90%) =
1.645
90% Confidence level = 1.645*2.68
Lower 90% confidence limit
Upper 90% confidence limit
n = 87
2.68
4.409
127.191
136.009
131.6 - 4.409 =
131.6 + 4.409 =
Interpretation
There is a 90% chance that the interval (127.191 to 136.009) covers the
actual average no. of palettes per order received by a sugar mill in Durban.
(b)
z-crit (0.90)
=NORMSINV(0.95)
1.645
(c)
90% Confidence level
=CONFIDENCE(0.10,25,87)
4.409
(d)
90% confidence limits of total palettes shipped in a year
Lower 90% limit
Upper 90% limit
127.191*720 =
136.009*720 =
91577.52
97926.48
Interpretation
Total palettes shipped in a year, on average, is likely to be
between 91 577 and 97 926 palettes, with 90% confidence.
palettes
palettes
Exercise 7.7
(a)
Manually
x(bar) = R356
σ = R44
std error =
44/√256 =
z crit (95%) =
1.96
95% Confidence level = 1.96 * 2.75 =
Lower 95% confidence limit
Upper 95% confidence limit
n = 256
2.75
5.39
R350.61
R361.39
356 - 5.39 =
356 + 5.39 =
Interpretation
There is a 95% chance that the interval (R350.61 to R361.39) covers the
actual average monthly car insurance premium of medium-sized cars.
(b)
z crit (90%) =
1.645
90% Confidence level = 1.645*2.75 =
Lower 90% confidence limit
Upper 90% confidence limit
4.524
R351.48
R360.52
356 - 4.524 =
356 + 4.524 =
Interpretation
There is a 90% chance that the interval (R351.48 to R360.52) covers the
actual average monthly car insurance premium of medium-sized cars.
The 90% confidence interval limits are closer together than
the 95% confidence interval limits.
The lower confidence results in a more precise set of interval limits (narrower).
(c)
z-crit (0.95)
=NORMSINV(0.975)
(d)
95% Confidence level
=CONFIDENCE(0.05,44,256)
(e)
95% confidence limits of total monthly premium income
Lower 95% confidence limit
Upper 95% confidence limit
350.61*3000
361.39*3000
R1 051 830
R1 084 170
Interpretation
Total monthly premium income, on average, is likely to be between
R1 051 830 and R1 084 170 with 95% confidence.
1.96
5.3899
Exercise 7.8
(a)
Manually
x(bar) = 4.985
σ = 0.04
n = 50
std error = 0.04/√50 = 0.005657
z crit (99%) = 2.58
99% Confidence level = 2.58 * 0.005657 = 0.01460
Lower 99% confidence limit = 4.985 - 0.0146 = 4.9704
Upper 99% confidence limit = 4.985 + 0.0146 = 4.9996
Interpretation
There is a 99% chance that the interval (4.9704 litres to 4.9996 litres)
covers the actual average volume of paint in all five-litre cans.
(b)
Yes. The whole interval lies below 5 litres (upper limit = 4.9996 litres), so the store
owner has statistical evidence that the average volume of paint in 5-litre cans is
most likely to be below 5 litres.
(c)
z-crit (0.99)
=NORMSINV(0.995)
2.576
(d)
99% Confidence level
=CONFIDENCE(0.01,0.04,50)
0.0146
Exercise 7.9
(a)
Manually
x(bar) = 3.8
σ = 0.6
n = 24
std error = 0.6/√24 = 0.122474
z crit (90%) = 1.645
90% Confidence level = 0.20147
Lower 90% confidence limit = 3.8 - 0.20147 = 3.599
Upper 90% confidence limit = 3.8 + 0.20147 = 4.001
Interpretation
There is a 90% chance that the interval (3.599 to 4.001) covers
the actual average inventory turnover rate of all convenience stores.
(b)
z-crit (0.90)
=NORMSINV(0.95)
1.645
(c)
90% Confidence level
=CONFIDENCE(0.10,0.6,24)
0.20145
Exercise 7.10
(a)
Manually
x(bar) = 166.2
s = 22.8
n = 21
std error (estimated) = 22.8/√21 = 4.975368
t crit (95%,20) = 2.086
95% Confidence level = 10.3786
Lower 95% confidence limit = 166.2 - 10.3786 = 155.82
Upper 95% confidence limit = 166.2 + 10.3786 = 176.58
Interpretation
There is a 95% chance that between 155.8 and 176.6 calls, on
average, will be received by the call centre daily.
(b)
Manually
x(bar) = 166.2
s = 22.8
n = 21
std error (estimated) = 22.8/√21 = 4.975368
t crit (99%,20) = 2.845
99% Confidence level = 14.1549
Lower 99% confidence limit = 166.2 - 14.1549 = 152.05
Upper 99% confidence limit = 166.2 + 14.1549 = 180.35
Interpretation
There is a 99% chance that between 152.1 and 180.4 calls, on
average, will be received by the call centre daily.
(c)
The 95% confidence interval is more precise (narrower), but less reliable than the
99% confidence interval which is less precise (wider), but more reliable.
(d)
t-crit (0.95, 20)
=TINV(0.05,20)
2.0860
(e)
Over a 30 day period
Lower 95% confidence limit = 155.82*30 = 4674.60
Upper 95% confidence limit = 176.58*30 = 5297.40
Interpretation
There is a 95% chance that between 4675 and 5297 calls, on
average, will be received over a 30-day period.
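When σ is unknown, the same limits follow from the sample standard deviation and a t critical value. The sketch below is an illustration only: it takes the tabulated value 2.086 from the solution as an input rather than computing it (Python's standard library has no t-distribution), and the function name `t_confidence_interval` is hypothetical.

```python
from math import sqrt


def t_confidence_interval(mean, s, n, t_crit):
    """Return (lower, upper) limits using an estimated standard error."""
    std_error = s / sqrt(n)       # estimated standard error, s / sqrt(n)
    margin = t_crit * std_error   # margin of error
    return mean - margin, mean + margin


# Exercise 7.10 (a): x(bar) = 166.2, s = 22.8, n = 21, t crit (95%, 20) = 2.086.
daily_low, daily_high = t_confidence_interval(166.2, 22.8, 21, 2.086)

# Exercise 7.10 (e): scale the daily limits to a 30-day period.
monthly_low, monthly_high = 30 * daily_low, 30 * daily_high

print(f"Daily:  {daily_low:.2f} to {daily_high:.2f} calls")
print(f"30-day: {monthly_low:.0f} to {monthly_high:.0f} calls")
```

The 30-day limits differ from the manual figures only in the decimals, because the manual answer multiplies the already rounded daily limits.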
Exercise 7.11
(a)
Manually
x(bar) = 12.5
s = 3.4
n = 28
std error (estimated) = 3.4/√28 = 0.64254
t crit (90%, 27) = 1.703
90% Confidence level = 1.0943
Lower 90% confidence limit = 12.5 - 1.0943 = 11.406
Upper 90% confidence limit = 12.5 + 1.0943 = 13.594
Interpretation
There is a 90% chance that the actual mean dividend yield of
all JSE-listed companies lies between 11.41% and 13.59%.
(b)
t-crit (0.90, 27)
=TINV(0.10,27)
1.7033
Exercise 7.12
(a)
Manually
x(bar) = 0.981
s = 0.052
n = 18
std error (estimated) = 0.052/√18 = 0.01226
t crit (99%, 17) = 2.8982
99% Confidence level = 0.0355
Lower 99% confidence limit = 0.981 - 0.0355 = 0.9455
Upper 99% confidence limit = 0.981 + 0.0355 = 1.0165
Interpretation
There is a 99% chance that the average fill of 1-litre cartons
of milk lies between 0.9455 litres and 1.0165 litres.
Since the interval covers 1 litre, it can be concluded that
the cartons do contain one litre of milk on average.
(b)
Manually
x(bar) = 0.981
s = 0.052
n = 18
std error (estimated) = 0.052/√18 = 0.01226
t crit (95%, 17) = 2.1098
95% Confidence level = 0.0259
Lower 95% confidence limit = 0.981 - 0.0259 = 0.9551
Upper 95% confidence limit = 0.981 + 0.0259 = 1.0069
Interpretation
There is a 95% chance that the average fill of 1-litre cartons
of milk lies between 0.9551 litres and 1.0069 litres.
The 95% confidence interval is more precise (narrower), but less reliable than the
99% confidence interval which is less precise (wider), but more reliable.
(c)
t-crit (0.99, 17)
=TINV(0.01,17)
2.8982
t-crit (0.95, 17)
=TINV(0.05,17)
2.1098
Exercise 7.13
(a)
Manually
x(bar) = 1420
s = 160
n = 50
std error (estimated) = 160/√50 = 22.62742
t crit (90%, 49) = 1.676 (use df = 50 from Table 2, Appendix)
90% Confidence level = 37.9235
Lower 90% confidence limit = 1420 - 37.9235 = 1382.08
Upper 90% confidence limit = 1420 + 37.9235 = 1457.92
Interpretation
There is a 90% chance that the average monthly wage of
union members lies between R1382.08 and R1457.92.
(b)
Manually
x(bar) = 1420
s = 160
n = 50
std error (estimated) = 160/√50 = 22.62742
t crit (99%, 49) = 2.678 (use df = 50 from Table 2, Appendix)
99% Confidence level = 60.5962
Lower 99% confidence limit = 1420 - 60.5962 = 1359.4
Upper 99% confidence limit = 1420 + 60.5962 = 1480.6
Interpretation
There is a 99% chance that the average monthly wage of
union members lies between R1359.40 and R1480.60.
(c)
The 90% confidence interval is more precise (narrower), but less reliable than the
99% confidence interval which is less precise (wider), but more reliable.
(d)
t-crit (0.90, 49)
=TINV(0.10,49)
1.6766
t-crit (0.99, 49)
=TINV(0.01,49)
2.6800
Exercise 7.14
x = 84
n = 200
p = 84/200 = 0.42
std error = √[(0.42)(0.58)/200] = 0.0349
z crit (95%) = 1.96
95% Confidence level = 0.0684
Lower 95% confidence limit = 0.42 - 0.0684 = 0.3516
Upper 95% confidence limit = 0.42 + 0.0684 = 0.4884
Interpretation
There is a 95% chance that the percentage of manufacturing firms that
meet the employment equity charter lies between 35.2% and 48.8%.
Exercise 7.15
x = 68
n = 160
p = 68/160 = 0.425
std error = √[(0.425)(0.575)/160] = 0.03908
z crit (95%) = 1.96
95% Confidence level = 0.0766
Lower 95% confidence limit = 0.425 - 0.0766 = 0.3484
Upper 95% confidence limit = 0.425 + 0.0766 = 0.5016
Interpretation
There is a 95% chance that the percentage of cash-paying customers
lies between 34.8% and 50.2%.
Exercise 7.16
x = (365-78) = 287
n = 365
p = 287/365 = 0.786
std error = √[(0.786)*(0.214)/365] = 0.021467
z crit (90%) = 1.645
90% Confidence level = 0.03532
Lower 90% confidence limit = 0.786 - 0.0353 = 0.7507
Upper 90% confidence limit = 0.786 + 0.0353 = 0.8213
Interpretation
There is a 90% chance that the percentage of non-overdrawn cheque accounts
at the Tshwane branch lies between 75.1% and 82.1%.
Exercise 7.17
x = 120
n = 300
p = 120/300 = 0.4
std error = √[(0.4)*(0.6)/300] = 0.02828
z crit (90%) = 1.645
90% Confidence level = 0.04652
Lower 90% confidence limit = 0.4 - 0.04652 = 0.3535
Upper 90% confidence limit = 0.4 + 0.04652 = 0.4465
Interpretation
There is a 90% chance that the percentage of shoppers who frequent a shopping
mall primarily because of its store mix lies between 35.4% and 44.7%.
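Exercises 7.14 to 7.17 all apply the same interval for a single proportion. A minimal sketch of that calculation follows, assuming Python's `statistics.NormalDist` for the z critical value; the function name `proportion_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(x, n, level):
    """Return (lower, upper) limits for a population proportion."""
    p = x / n                                            # sample proportion
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)   # two-sided z critical value
    margin = z_crit * sqrt(p * (1 - p) / n)              # z * standard error of p
    return p - margin, p + margin


# Exercise 7.17 inputs: x = 120 shoppers out of n = 300, 90% confidence.
low, high = proportion_confidence_interval(120, 300, 0.90)
print(f"{low:.4f} to {high:.4f}")
```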
Exercise 7.18
(a)
File: X7.18 - cashier absenteeism.xlsx
Descriptive Statistics - Absent Days (Using Data Analysis in Excel)
Absent Days
Mean                        9.379
Standard Error              0.625
Median                      9
Mode                        7
Standard Deviation          3.364
Sample Variance             11.315
Kurtosis                    -0.919
Skewness                    -0.009
Range                       12
Minimum                     3
Maximum                     15
Sum                         272
Count                       29
Confidence Level (95.0%)    1.280
The average number of days absent was 9.4 days.
68% of employees were absent between 6 and 12.7 days (within 1 std dev).
The data is symmetrical about the mean (skewness = -0.009).
Absenteeism per employee ranged between 3 (min) and 15 days (max) last year.
(b)
Lower 95% Confidence Limit = 9.379 - 1.28 = 8.10
Upper 95% Confidence Limit = 9.379 + 1.28 = 10.66
Interpretation
There is a 95% chance that the average number of days absent last year by
all cashiers in a supermarket was between 8.1 days and 10.7 days.
(c)
Since the 95% confidence interval covers 10 days (8.1 < μ < 10.66), it is
possible that the mean number of days absent per employee does exceed 10 days.
Thus the company's policy is not being strictly adhered to.
(d)
t-crit (0.95, 28)
=TINV(0.05,28)
2.0484
Given standard error = 0.6250
Confidence level (95%) = 0.625 * 2.0484 = 1.2803
Exercise 7.19
(a)
File: X7.19 - parcel masses.xlsx
Descriptive Statistics - Parcel Weights (kg) (Using Data Analysis in Excel)
Parcel Masses (kg)
Mean                        2.829
Standard Error              0.091
Median                      2.78
Mode                        3.22
Standard Deviation          0.5998
Sample Variance             0.360
Kurtosis                    -0.972
Skewness                    -0.108
Range                       2.17
Minimum                     1.75
Maximum                     3.92
Sum                         121.63
Count                       43
Confidence Level (90.0%)    0.154
The average parcel weight was 2.83 kg.
68% of parcels weigh between 2.23 kg and 3.43 kg (within 1 std dev).
The data is symmetrical about the mean (skewness = -0.1076).
Parcel weights ranged between 1.75 kg (min) and 3.92 kg (max).
(b)
Lower 90% Confidence Limit = 2.829 - 0.154 = 2.675
Upper 90% Confidence Limit = 2.829 + 0.154 = 2.983
Interpretation
There is a 90% chance that the average parcel weight is between 2.68 kg and 2.98 kg.
(c)
Since the 90% confidence interval lies below 3 kg (2.675 < μ < 2.983), it is
highly likely that the mean parcel weight does not exceed 3 kg.
Thus Post-Net is adhering to its policy.
(d)
t-crit (0.90, 42)
=TINV(0.10,42)
1.682
Given standard error = 0.0910
Confidence level (90%) = 0.091 * 1.682 = 0.154
Exercise 7.20
(a)
File:
Descriptive Statistics - Cost-to-Income Ratio
X7.20 - cost-to-income.xlsx
(Using Data Analysis in Excel )
Cost-to-Income Ratio
Mean
71.240
Standard Error
1.997
Median
68
Mode
84
Standard Deviation
14.123
Sample Variance
199.451
Kurtosis
-1.145
Skewness
0.139
Range
53
Minimum
44
Maximum
97
Sum
3562
Count
50
Confidence Level (95.0%)
4.014
The average cost-to-income ratio per company is 71.24%.
68% of companies have a cost-to-income ratio of between 57.1% and 85.4% (within 1 std dev).
The data is reasonably symmetrical about the mean (skewness = 0.139).
Cost-to-income ratios ranged between 44% and 97% for the sample of public companies.
(b)
Lower 95% Confidence Limit
Upper 95% Confidence Limit
71.24 - 4.014 =
71.24 + 4.014 =
67.226
75.254
Interpretation
There is a 95% chance that the average cost-to-income ratio of public companies
is between 67.2% and 75.3%.
(c)
Find P(x > 75) using "μ" = 71.24 and "σ" = 14.123
Standardise x = 75 to t-stat
t-stat = (75 - 71.24)/14.123 = 0.26623
Use Excel to find P(t > 0.26623)
=TDIST(0.26623,49,1)
0.39559
Interpretation
39.6% of all public companies are likely to have a cost-to-income ratio
in excess of the 75% rule of thumb.
(d)
t-crit (0.95, 49)
=TINV(0.05,49)
2.0096
Given standard error = 1.9970
Confidence level (95%) = 1.997 * 2.0096 = 4.0132
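The tail probability P(x > 75) in Exercise 7.20 can be approximated without a t-table. The sketch below swaps in a normal distribution for the t-distribution (49 df) used in the solution, so it is an approximation and not the TDIST calculation itself; with 49 degrees of freedom the two differ only in the third decimal place.

```python
from statistics import NormalDist

# Treat the sample mean and standard deviation as the population values,
# as the solution does ("mu" = 71.24, "sigma" = 14.123).
ratio = NormalDist(mu=71.24, sigma=14.123)

# Standardised value of x = 75.
standardised = (75 - 71.24) / 14.123

# Upper-tail probability under the normal approximation.
p_above_75 = 1 - ratio.cdf(75)

print(f"standardised value = {standardised:.5f}")
print(f"P(x > 75) is roughly {p_above_75:.4f}")
```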
Sample size determination
Exercise 7.21
z = 1.96
e = 10
σ = 50
n = 96
Exercise 7.22
(a)
z = 2.58
e = 0.1
σ = 1
n = 666
(b)
z = 2.58
e = 0.15
σ = 1
n = 296
(c)
z = 2.58
e = 0.2
σ = 1
n = 166
Exercise 7.23
z = 1.645
p = 0.5
e = 0.03
n = 752
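The sample sizes above are consistent with the standard formulas n = (zσ/e)² for a mean and n = z²p(1 − p)/e² for a proportion. The sketch below is an illustration under that assumption; the function names are hypothetical, and the z-values are the tabulated ones used in the solutions.

```python
def sample_size_mean(z, sigma, e):
    """Unrounded sample size needed to estimate a mean to within +/- e."""
    return (z * sigma / e) ** 2


def sample_size_proportion(z, p, e):
    """Unrounded sample size needed to estimate a proportion to within +/- e."""
    return z ** 2 * p * (1 - p) / e ** 2


n_721 = sample_size_mean(1.96, 50, 10)             # Exercise 7.21
n_722a = sample_size_mean(2.58, 1, 0.1)            # Exercise 7.22 (a)
n_722b = sample_size_mean(2.58, 1, 0.15)           # Exercise 7.22 (b)
n_722c = sample_size_mean(2.58, 1, 0.2)            # Exercise 7.22 (c)
n_723 = sample_size_proportion(1.645, 0.5, 0.03)   # Exercise 7.23

for label, n in [("7.21", n_721), ("7.22a", n_722a), ("7.22b", n_722b),
                 ("7.22c", n_722c), ("7.23", n_723)]:
    print(label, round(n, 2), "->", round(n))
```

The reported answers match the unrounded values rounded to the nearest whole number. Rounding up instead (for example with `math.ceil`) is the more conservative convention, since it guarantees the margin of error is not exceeded.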
CHAPTER 8
HYPOTHESIS TESTS
SINGLE POPULATION (MEANS, PROPORTIONS AND VARIANCES)
Exercise 8.1
To test whether a claim / statement made about a population parameter
value is probably true or false, based on sample evidence.
Exercise 8.2
The “closeness” of the sample statistics to the claimed population parameter value.
Exercise 8.3
The Five Steps of Hypothesis Testing
Step 1: Define the statistical hypotheses (the null and alternative hypotheses).
Step 2: Determine the region of acceptance of the null hypothesis.
Step 3: Compute the sample test statistic.
Step 4: Compare the sample test statistic to the region of acceptance.
Step 5: Draw the statistical and management conclusions.
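The five steps can be traced in a few lines of code. The sketch below is an illustration only, using the figures of Exercise 8.6 (i) (a one-sided upper tailed z test with σ known) and Python's `statistics.NormalDist` in place of `NORMSINV`; the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: define the hypotheses, H0: mu <= 560 versus H1: mu > 560.
mu_0, x_bar, sigma, n, alpha = 560, 577, 86, 120, 0.05

# Step 2: region of acceptance is z <= z_crit (upper tailed test).
z_crit = NormalDist().inv_cdf(1 - alpha)

# Step 3: compute the sample test statistic.
z_stat = (x_bar - mu_0) / (sigma / sqrt(n))

# Step 4: compare the test statistic to the region of acceptance.
reject_h0 = z_stat > z_crit

# Step 5: draw the statistical conclusion.
decision = "Reject H0" if reject_h0 else "Do not reject H0"
print(f"z-stat = {z_stat:.3f}, z-crit = {z_crit:.3f}: {decision}")
```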
Exercise 8.4
Level of significance (α)
(and sample size when the population standard deviation is unknown)
Exercise 8.5
Reject H0 in favour of H1 at the 5% level of significance.
Exercise 8.6
(i)
H0: µ ≤ 560
H1: µ > 560
x (bar) = 577
σ = 86
n = 120
α = 0.05
(a) One-sided upper tailed test
Use z test statistic since σ is known.
Area of Acceptance: z ≤ 1.645
Read off 0.45 from z-table; or
=NORMSINV(0.95) [using Excel ]
(b) z-stat = (577-560)/(86/√(120)) = 2.165
Since z-stat (2.165) > z-crit (1.645), there is sufficient sample evidence at the
5% level of significance to reject H0 in favour of H1. (i.e. Reject H0)
Conclude that the population mean value is significantly larger than 560.
(c) p -value = P(z > 2.165) = 0.5 - 0.4848 = 0.0152
From z-table; or
=1-NORMSDIST(2.165)
(ii)
H0: π ≥ 0.72
H1: π < 0.72
x = 216
n = 330
α = 0.10
Derive p = 216/330 = 0.6545
(a) One-sided lower tailed test
Use z test statistic for proportions
Area of Acceptance: z ≥ -1.28
Read off 0.4 from z-table; or
=NORMSINV(0.10) [using Excel ]
(b) z-stat = (0.6545 - 0.72)/√[(0.72)(1-0.72)/330] = -2.65
Since z-stat (-2.65) < z-crit (-1.28), there is sufficient sample evidence at the
10% level of significance to reject H0 in favour of H1. (i.e. Reject H0)
Conclude that the true population proportion is significantly less than 0.72.
(c) p -value = P(z < -2.65) = 0.5 - 0.496 = 0.0040
From z-table; or
=NORMSDIST(-2.65)
(iii)
H0: µ = 8.2
H1: µ ≠ 8.2
x (bar) = 9.6
s = 2.9
n = 30
α = 0.01
(a) Two-sided test
Use t test statistic since σ is unknown
(only the sample standard deviation, s, is given)
Area of Acceptance: -2.756 ≤ t ≤ +2.756
Read off T(0.005,29) from t-table;
or =TINV(0.01,29) [using Excel ]
(b) t-stat = (9.6 - 8.2)/(2.9/√(30)) = 2.644
Since t-stat (2.644) falls within the area of acceptance, there is insufficient sample
evidence at the 1% level of significance to reject H0 in favour of H1. (i.e. Accept H0)
Conclude that the true mean value is equal to 8.2.
(c) p -value = 2 x P(t > 2.644) = 0.0131
=TDIST(2.644,29,2)
(iv)
H0: µ ≥ 18
H1: µ < 18
x (bar) = 14.6
s = 3.4
n = 12
α = 0.01
(a) One-sided lower tailed test
Use t test statistic since σ is unknown
(only have the sample standard deviation, s and n is small (n < 30))
Area of Acceptance: t ≥ -2.718
Read off T(0.01,11) from t-table;
or =TINV(0.02,11) [using Excel ]
(b) t-stat = (14.6 - 18)/(3.4/√(12)) = -3.464
Since t-stat (-3.464) < t-crit (-2.718), there is sufficient sample evidence at the
1% level of significance to reject H0 in favour of H1. (Reject H0)
Conclude that the true mean value is significantly below 18.
(c) p -value = P(t < -3.464) = 0.0026
=TDIST(-(-3.464),11,1)
(v)
H0: π = 0.32
H1: π ≠ 0.32
x = 68
n = 250
α = 0.05
Derive p = 68/250 = 0.272
(a) Two-sided test
Use z test statistic for proportions
Area of Acceptance: -1.96 ≤ z ≤ +1.96
Read off z(0.975) from z-table; or
=NORMSINV(0.975) [Excel ]
(b) z-stat = (0.272 - 0.32)/√[(0.32)(1-0.32)/250] = -1.627
Since z-stat (-1.627) falls within the area of acceptance, there is insufficient sample
evidence at the 5% level of significance to reject H0 in favour of H1. (i.e. Accept H0)
Conclude that the true population proportion is equal to 0.32.
(c) p -value = 2 x P(z < -1.627) = 0.1037
=2*NORMSDIST(-1.627)
Exercise 8.7
(a)
H0: µ = 85
H1: µ ≠ 85
Two-sided test
(b)
Use the z test statistic since σ is known (σ = 25 min (given))
(c)
Use α = 0.05
Area of Acceptance: -1.96 ≤ z ≤ 1.96
Read off 0.475 from z-table; or
=NORMSINV(0.975) [using Excel]
z-stat = (80.5 - 85)/(25/√(132)) = -2.068
Statistical conclusion
Since z-stat (-2.068) lies below -z-crit (-1.96), there is sufficient sample evidence at
the 5% level of significance to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that the population mean value is significantly different from 85 minutes.
Visitors to the Knysna shopping mall do not spend 85 minutes on average in the mall.
(d)
p -value = 2 x P(z < -2.068) = 0.0386
=2*NORMSDIST(-2.068)
(e)
Since p -value (0.0386) < α (0.05), there is strong sample evidence to
conclude that visitors to the Knysna shopping mall do not spend 85 minutes,
on average, in the mall.
By inspection of the sample mean, it appears that visitors to the Knysna
shopping mall spend significantly less than 85 minutes in the mall.
Exercise 8.8
(a)
H0: µ ≥ 30 min
H1: µ < 30 min
One-sided lower tailed test
(b)
Use the z test statistic since σ is known (σ = 10.5 min (given))
(c)
Use α = 0.01
Area of Acceptance: z ≥ -2.33
Read off 0.49 from z-table; or
=NORMSINV(0.01) [using Excel]
z-stat = (27.9 - 30)/(10.5/√(86)) = -1.855
Statistical conclusion
Since z-stat (-1.855) lies above z-crit (-2.33), there is insufficient sample evidence at
the 1% level of significance to reject H0 in favour of H1 (Do not reject H0).
Management conclusion
Conclude that the population mean value is at least 30 minutes.
The supermarket manager's belief is therefore valid.
(d)
p -value = P(z < -1.855) =
0.0318 =NORMSDIST(-1.855)
Since p -value (0.0318) > α (0.01) the sample evidence does not refute H0.
The sample evidence is not strong enough to refute the belief that customers
spend 30 minutes or more, on average, doing their purchases at the supermarket.
Hence conclude that customers are likely to spend at least half-an-hour,
on average, in the supermarket doing their grocery shopping.
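The p-value comparisons in Exercises 8.7 and 8.8 follow one rule: reject H0 only when the p-value is below α. A minimal sketch of the p-value calculation for a z-stat is given below, assuming Python's `statistics.NormalDist` in place of `NORMSDIST`; the function name `z_test_p_value` and its `tail` argument are hypothetical.

```python
from statistics import NormalDist


def z_test_p_value(z_stat, tail):
    """Return the p-value of a z-stat; tail is 'lower', 'upper' or 'two-sided'."""
    std_normal = NormalDist()
    if tail == "lower":
        return std_normal.cdf(z_stat)                    # like =NORMSDIST(z)
    if tail == "upper":
        return 1 - std_normal.cdf(z_stat)                # like =1-NORMSDIST(z)
    if tail == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z_stat)))     # both tails
    raise ValueError("tail must be 'lower', 'upper' or 'two-sided'")


# Exercise 8.8: lower tailed test, z-stat = -1.855, alpha = 0.01.
p_88 = z_test_p_value(-1.855, "lower")
# Exercise 8.7: two-sided test, z-stat = -2.068, alpha = 0.05.
p_87 = z_test_p_value(-2.068, "two-sided")

print(f"8.8: p-value = {p_88:.4f}, reject H0: {p_88 < 0.01}")
print(f"8.7: p-value = {p_87:.4f}, reject H0: {p_87 < 0.05}")
```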
Exercise 8.9
(a)
H0: µ ≤ 72 hours
H1: µ > 72 hours
One-sided upper tailed test
(b)
Use the z test statistic since σ is known (σ = 18 hours (given))
Use α = 0.10
Area of Acceptance: z ≤ 1.28
Read off 0.40 from z-table; or
=NORMSINV(0.90) [using Excel]
z-stat = (75.9 - 72)/(18/√(46)) = 1.470
Statistical conclusion
Since z-stat (1.47) > z-crit (1.28), there is sufficient sample evidence at
the 10% level of significance to reject H0 in favour of H1 (Reject H0).
Management conclusion
Conclude that the local importer's claim is valid.
Consignments are taking significantly longer than 72 hours to clear customs.
(c)
p -value = P(z > 1.47) =
0.0708 =1-NORMSDIST(1.47)
Since p -value (0.0708) < α (0.10) this confirms support for H1.
There is moderate sample evidence (relative to α = 0.10) to conclude that
consignment clearance times, on average, are significantly longer than 72 hours.
Exercise 8.10
(a)
Use the z test statistic since σ is known (σ = 14.7% (given))
(b)
H0: µ ≤ 40%
H1: µ > 40%
One-sided upper tailed test
(c)
Use α = 0.01
Area of Acceptance: z ≤ 2.33
Read off 0.49 from z-table; or
=NORMSINV(0.99) [using Excel]
z-stat = (44.1 - 40)/(14.7/√(76)) = 2.431
Statistical conclusion
Since z-stat (2.431) > z-crit (2.33), there is sufficient sample evidence at
the 1% level of significance to reject H0 in favour of H1 (Reject H0).
Management conclusion
Conclude that the Department of Health's concern is justified.
The average markup is significantly greater than 40%.
(d)
p -value = P(z > 2.431) =
0.0075
=1-NORMSDIST(2.431)
Since p -value (0.0075) < α (0.01), it confirms H1. There is strong sample
evidence to conclude that the average % markup is significantly greater than 40%.
Exercise 8.11
(a)
Use the t test statistic since σ is unknown (only s = 21 gms is given)
H0: µ = 700 gms
H1: µ ≠ 700 gms
Two-sided test
Use α = 0.05 with degrees of freedom = (n -1) = 63
Area of Acceptance
-2.00 ≤ t ≤ 2.00
Read off t(0.025,63) from t-table; or
'=TINV(0.05,63) [using Excel]
t-stat =
(695 - 700)/(21/√(64)) =
-1.905
Statistical conclusion
Since t-stat (-1.905) lies within the acceptance area, there is insufficient sample
evidence at the 5% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that Ryeband Bakery is both legally compliant and not wasting ingredients.
The average weight of all white loaves produced by Ryeband Bakery is 700 gms.
(b)
p -value = 2 x P(t < -1.905) =
0.0613
=TDIST(-(-1.905),63,2)
Since p -value (0.0613) > α (0.05) H0 cannot be rejected as the sample evidence
is weak in favour of H1.
Hence it can be concluded that, on average, the weight of white bread
loaves produced by the Ryeband Bakery is 700 gm.
(c)
H0: µ ≥ 700 gms
H1: µ < 700 gms
One-sided lower tailed test
Use α = 0.05 with degrees of freedom = (n -1) = 63
Area of Acceptance
t ≥ -1.671
Read off t(0.05,63) from t-table; or
=TINV(0.10,63) [using Excel]
t-stat =
(695 - 700)/(21/√(64)) =
-1.905
Statistical conclusion
Since t-stat (-1.905) < t-crit (-1.671), there is sufficient sample evidence at
the 5% level of significance to reject H0 in favour of H1 (i.e. Reject H0).
Management conclusion
Conclude that Ryeband Bakery is not legally compliant. The average weight of all
white bread loaves produced by Ryeband Bakery is significantly below 700 gms.
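Parts (a) and (c) of Exercise 8.11 use the same t-stat but reach different conclusions, because the two-sided test splits α between both tails. The sketch below illustrates this with the tabulated critical values quoted in the solution (2.00 and 1.671) taken as given inputs; the function name `t_statistic` is hypothetical.

```python
from math import sqrt


def t_statistic(x_bar, mu_0, s, n):
    """One-sample t-stat using the estimated standard error s / sqrt(n)."""
    return (x_bar - mu_0) / (s / sqrt(n))


# Exercise 8.11: x(bar) = 695, hypothesised mean = 700, s = 21, n = 64.
t_stat = t_statistic(695, 700, 21, 64)

# (a) Two-sided test, alpha = 0.05: acceptance region is -2.00 <= t <= 2.00.
reject_two_sided = abs(t_stat) > 2.00

# (c) Lower tailed test, alpha = 0.05: acceptance region is t >= -1.671.
reject_lower_tailed = t_stat < -1.671

print(f"t-stat = {t_stat:.3f}")
print("Two-sided test rejects H0:", reject_two_sided)
print("Lower tailed test rejects H0:", reject_lower_tailed)
```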
Exercise 8.12
(a)
Use the t test statistic since σ is unknown (only s = R788 is given)
H0: µ ≥ 5500
H1: µ < 5500
One-sided lower tailed test
Use α = 0.10 with degrees of freedom = (n -1) = 17
Area of Acceptance
t ≥ -1.33
Read off t(0.10,17) from t-table; or
=TINV(0.20,17) [using Excel]
t-stat =
(5275 - 5500)/(788/√(18)) =
-1.211
Statistical conclusion
Since t-stat (-1.211) lies within the acceptance area, there is insufficient sample
evidence at the 10% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that mean weekly sales of the new pudding flavour is not less than R5500.
The company should therefore not withdraw the product at this stage.
(b)
p -value = P(t < -1.211) =
0.1212
=TDIST(-(-1.211),17,1)
Since p -value (0.1212) > α (0.10), H0 cannot be rejected as the sample evidence
is weak in favour of H1.
Hence it can be concluded that, on average, the weekly sales of the new pudding
flavour is not less than R5500 and the product should not be withdrawn.
Exercise 8.13
(a)
Use the t test statistic since σ is unknown (only s = 3.6 is given)
H0: µ ≤ 80 kg
H1: µ > 80 kg
One-sided upper tailed test
Use α = 0.05 and degrees of freedom = (n -1) = 25
Area of Acceptance
t ≤ 1.708
Read off t(0.05,25) from t-table; or
=TINV(0.10,25) [using Excel]
t-stat =
(81.3 - 80)/(3.6/√(26)) =
1.841
Statistical conclusion
Since t-stat (1.841) > t-crit (1.708), it lies in the region of rejection.
Therefore there is sufficient sample evidence at
the 5% level of significance to reject H0 in favour of H1 (Reject H0)
Management conclusion
Conclude that mean tensile strength of the consignment of wire is more than 80 kg.
Marathon Products should accept this consignment as it meets quality specs.
(b)
p -value = P(t > 1.841) =
0.0388
=TDIST(1.841,25,1)
Since p -value (0.0388) < α (0.05) there is strong sample evidence to reject H0
in favour of H1.
Hence it can be concluded that, on average, the tensile strength of
the wire in the consignment exceeds 80 kg. Hence accept the consignment.
Exercise 8.14
(a)
Use the t test statistic since σ is unknown (only s = 0.068 is given)
H0: µ ≥ 1
H1: µ < 1
One-sided lower tailed test
Use α = 0.05 and degrees of freedom = (n -1) = 19
Area of Acceptance
t ≥ -1.729
Read off t(0.05,19) from t-table; or
=TINV(0.10,19) [using Excel]
std error = 0.068/√(20) = 0.0152
t-stat = (0.982 - 1)/(0.0152) = -1.1842
Statistical conclusion
Since t-stat (-1.1842) > t-crit (-1.729), it lies within the region of acceptance.
There is therefore insufficient sample evidence at the
5% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that the mean fill of one-litre milk containers is not less than 1 litre.
The Consumer Council's claim that containers are being underfilled is not valid.
(b)
p -value = P(t < -1.1842) =
0.1255
=TDIST(-(-1.1842),19,1)
Since p -value (0.1255) > α (0.05), H0 cannot be rejected as there is insufficient
sample evidence to support H1.
Hence it can be concluded that, on average, the mean fill of milk containers
is at least 1 litre. Thus there is no statistical support for the claim.
Exercise 8.15
(a)
Use the z test statistic for proportions
H0: π ≥ 0.30
H1: π < 0.30
Use α = 0.05
One sided lower tailed test
Area of Acceptance: z ≥ -1.645
Read off z(0.45) from z-table; or
=NORMSINV(0.05) [using Excel ]
x = 106
n = 400
p = 106/400 = 0.265
std error = √[(0.3*0.7)/400] = 0.0229
z-stat = (0.265 - 0.3)/(0.0229) = -1.5284
Statistical conclusion
Since z-stat (-1.5284) > z-crit (-1.645) there is insufficient sample evidence at
the 5% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that 30% or more of listeners tune into the news broadcast of this station.
The company should advertise in this radio station's news timeslots.
(b)
p -value =
P(z < -1.5284) =
0.0632
'=NORMSDIST(-1.5284)
Since p-value (0.0632) > α (0.05), the sample evidence is too weak to
reject H0 in favour of H1. (i.e. Accept H0)
The company can accept the radio station's claim as valid.
(c)
p -value =
P(z < -1.5284) =
0.0632
=NORMDIST(0.265,0.3,0.0229,1)
Exercise 8.16
(a)
Use the z test statistic for proportions
H0: π ≤ 0.60
H1: π > 0.60
Use α = 0.05
Area of Acceptance: z ≤ 1.645
Read off z(0.45) from z-table; or
=NORMSINV(0.95) [using Excel ]
x = 54
n = 150
p = (150-54)/150 = 0.64
std error = √[(0.6*0.4)/150] = 0.04
z-stat = (0.64 - 0.6)/(0.04) = 1.000
Statistical conclusion
Since z-stat (1.00) < z-crit (1.645) there is insufficient sample evidence at
the 5% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that no more than 60% of Cape Town motorists are without vehicle insurance.
The motor vehicle advisor's claim is not valid.
(b)
p -value =
P(z > 1.000) =
0.1587
=1-NORMSDIST(1)
Since p -value (0.1587) > α (0.05), the sample evidence does not support H1.
Hence the insurance advisor's claim has no statistical validity.
Exercise 8.17
(a)
Use the z test statistic for proportions
H0: π ≤ 0.15
H1: π > 0.15
Use α = 0.10
Area of Acceptance: z ≤ 1.28
Read off z(0.40) from z-table; or
=NORMSINV(0.90) [using Excel]
x = 96
n = 560
p = 96/560 = 0.1714
std error = √[(0.15*0.85)/560] = 0.0151
z-stat = (0.1714 - 0.15)/(0.0151) = 1.417
Statistical conclusion
Since z-stat (1.417) > z-crit (1.28) there is sufficient sample evidence at
the 10% level of significance to reject H0 in favour of H1 (i.e. Reject H0)
Management conclusion
Conclude that the churn rate in the telecommunications industry exceeds 15%.
(b)
p -value =
P(z > 1.417) =
0.0782
=1-NORMSDIST(1.417)
Since p -value (0.0782) < α (0.10), there is moderate sample evidence to
reject H0 in favour of H1.
The same management conclusion applies as in (a) above.
(c)
p -value =
P(z > 1.417) =
0.0782
=1-NORMDIST(0.1714,0.15,0.0151,1)
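The proportion tests in Exercises 8.15 to 8.18 share one recipe: the standard error is built from the hypothesised proportion, not the sample proportion. The sketch below illustrates the upper tailed case of Exercise 8.17, assuming Python's `statistics.NormalDist` for the p-value; the function name `proportion_z_test_upper` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test_upper(x, n, pi_0):
    """Return (z_stat, p_value) for H0: pi <= pi_0 versus H1: pi > pi_0."""
    p = x / n                                   # sample proportion
    std_error = sqrt(pi_0 * (1 - pi_0) / n)     # uses the hypothesised pi_0
    z_stat = (p - pi_0) / std_error
    p_value = 1 - NormalDist().cdf(z_stat)      # upper-tail area
    return z_stat, p_value


# Exercise 8.17: x = 96 churned customers out of n = 560, pi_0 = 0.15, alpha = 0.10.
z_stat, p_value = proportion_z_test_upper(96, 560, 0.15)
print(f"z-stat = {z_stat:.3f}, p-value = {p_value:.4f}, reject H0: {p_value < 0.10}")
```

The results differ slightly from the manual figures in the third decimal place because the manual answer rounds p and the standard error before dividing.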
Exercise 8.18
(a)
Use the z test statistic for proportions
H0: π ≥ 0.90
H1: π < 0.90
Use α = 0.01
Area of Acceptance: z ≥ -2.33
Read off z(0.49) from z-table; or
=NORMSINV(0.01) [using Excel]
x = 260
n = 300
p = 260/300 = 0.8667
std error = √[(0.9*0.1)/300] = 0.0173
z-stat = (0.8667 - 0.9)/(0.0173) = -1.9249
Statistical conclusion
Since z-stat (-1.9249) > z-crit (-2.33) there is insufficient sample evidence at
the 1% level of significance to reject H0 in favour of H1 (i.e. Do not reject H0)
Management conclusion
Conclude that the germination rate of the barley seed is at least 90%.
The cooperative can justify buying the barley seed from this merchant.
The cooperative can accept the seed merchant's claim.
(b)
p -value = P(z < -1.9249) =
0.0271
=NORMSDIST(-1.9249)
Since p -value (0.0271) > α (0.01), the sample evidence is too weak at
the 1% level of significance to reject H0 in favour of H1.
Hence do not reject H0.
The same management conclusion applies as in (a) above.
(c)
p -value = P(z < -1.9249) =
0.0271
=NORMDIST(0.8667,0.9,0.0173,1)
Exercise 8.19
File: X8.19 - cost-to-income.xlsx
Use the t test statistic since σ is unknown
H0: µ ≥ 75%
H1: µ < 75%
Use α = 0.05 and degrees of freedom = (n -1) = 49
Area of Acceptance: t ≥ -1.676
Read off t(0.05,49) from t-table; or
=TINV(0.10,49) [using Excel]
std error = 1.9973
t-stat = (71.24 - 75)/(1.9973) = -1.8825
Cost-to-income (%)
Mean
71.24
Standard Error
1.9973
Median
68
Mode
84
Standard Deviation
14.1227
Sample Variance
199.451
Kurtosis
-1.1450
Skewness
0.1391
Range
53
Minimum
44
Maximum
97
Sum
3562
Count
50
Statistical conclusion
Since t-stat (-1.8825) < t-crit (-1.676) there is sufficient sample evidence at
the 5% level of significance to reject H0 in favour of H1 (i.e. Reject H0)
Management conclusion
Conclude that the mean cost-to-income ratio of JSE companies is less than 75%.
JSE companies are therefore adhering to the rule of thumb.
Exercise 8.20
(a)
File: X8.20 - kitchenware.xlsx
Normality assumption check
Sales values      Count
≤ 75              5
76 - ≤ 100        6
101 - ≤ 125       8
126 - ≤ 150       11
151 - ≤ 175       9
176 - ≤ 200       6
201 - ≤ 225       4
> 225             1
The distribution appears to be normal.
(b)
Use the t test statistic since σ is unknown
H0: µ ≥ R150
H1: µ < R150
Use α = 0.05 and degrees of freedom = (n -1) = 49
Area of Acceptance: t ≥ -1.676
Read off t(0.05,49) from t-table; or
=TINV(0.10,49) [using Excel]
std error = 6.657
t-stat = (137.12 - 150)/(6.657) = -1.9348
Sales Transaction value
Mean
137.12
Standard Error
6.657
Median
138.5
Mode
184
Standard Deviation
47.074
Sample Variance
2215.944
Kurtosis
-0.741
Skewness
0.022
Range
190
Minimum
52
Maximum
242
Sum
6856
Count
50
Statistical conclusion
Since t-stat (-1.9348) < t-crit (-1.676) there is sufficient sample evidence at
the 5% level of significance to reject H0 in favour of H1 (Reject H0).
Management conclusion
Conclude that the mean transaction value at the Claremont branch is significantly
below R150.
Management is advised to close the Claremont branch since it is unprofitable.
Exercise 8.21
(a)
File: X8.21 - flight delays.xlsx
Normality assumption check
delays            Count
≤ 5               0
5.1 - ≤ 7.5       8
7.6 - ≤ 10        28
10.1 - ≤ 12.5     29
12.6 - ≤ 15       13
15.1 - ≤ 17.5     2
> 17.5            0
[Figure: Histogram of Flight Delay Times (x-axis: Delay time intervals; y-axis: No. of Flights)]
The histogram distribution appears normal. The assumption is satisfied.
(b)
Use the t test statistic since σ is unknown
H0: µ ≤ 10 min
H1: µ > 10 min
Use α = 0.10 and degrees of freedom = (n -1) = 79
Area of Acceptance: t ≤ 1.292
Read off t(0.10,79) from t-table; or
=TINV(0.20,79) [using Excel]
std error = 0.261
t-stat = (10.324 - 10)/(0.261) = 1.241
flight delays
Mean
10.324
Standard Error
0.261
Median
10.3
Mode
11
Standard Deviation
2.333
Sample Variance
5.444
Kurtosis
-0.550
Skewness
0.084
Range
10.3
Minimum
5.3
Maximum
15.6
Sum
825.9
Count
80
Statistical conclusion
Since t-stat (1.241) < t-crit (1.292) there is insufficient sample evidence at
the 10% level of significance to reject H0 in favour of H1 (i.e. Accept H0).
Management conclusion
Conclude that flight delay times, on average, do not exceed 10 minutes.
ACASA management do not need to conduct an in-depth investigation on flight delays.
Exercise 8.22
(a)
File: X8.22 - medical claims.xls
Normality assumption check
Claims intervals  Count
≤ 100             5
101 - ≤ 125       3
126 - ≤ 150       9
151 - ≤ 175       21
176 - ≤ 200       25
201 - ≤ 225       18
226 - ≤ 250       8
251 - ≤ 300       11
[Figure: Histogram of Daily Claims Processed (x-axis: Claims processed intervals; y-axis: No. of Claims)]
The distribution appears to be reasonably normal. The assumption is satisfied.
(b)
Use the t test statistic since σ is unknown
H0: µ ≤ 180 claims
H1: µ > 180 claims
Use α = 0.01 and degrees of freedom = (n -1) = 99
Area of Acceptance: t ≤ 2.365
Read off t(0.01,99) from t-table; or
=TINV(0.02,99) [using Excel]
std error = 4.568
t-stat = (190.39 - 180)/(4.568) = 2.275
Medical Claims
Mean
190.39
Standard Error
4.568
Median
190
Mode
210
Standard Deviation
45.680
Sample Variance
2086.685
Kurtosis
-0.092
Skewness
0.096
Range
199
Minimum
92
Maximum
291
Sum
19039
Count
100
Statistical conclusion
Since t-stat (2.275) < t-crit (2.365) there is insufficient sample evidence at
the 1% level of significance to reject H0 in favour of H1 (Thus accept H0).
Management conclusion
Conclude that the average number of claims processed daily does not exceed 180.
Thus the supervisor has no statistical grounds for requesting additional staff.
Exercise 8.23
File: X8.23 - newspaper readership.xlsx
Guardian Newspaper Claim
(a)
One-Way Pivot Table (Frequency Table) and Bar Chart
Tabloid       Count     %
Sun           19        15.8%
Guardian      42        35.0%
Mail          31        25.8%
Voice         28        23.3%
Total         120       100%
[Figure: Bar Chart of Tabloid Readership (y-axis: % of Readers; bars: Sun, Guardian, Mail, Voice)]
Interpretation
Based on the sample of 120 tabloid readers, the Guardian newspaper has the
largest share of 35%. The Sun has the lowest percentage of readers at 15.8%.
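The percentage column of the one-way frequency table can be reproduced from the counts alone. The sketch below is only an illustration of that arithmetic, not of Excel's PivotTable feature; the variable names are hypothetical.

```python
# Tabloid readership counts from the frequency table in Exercise 8.23 (a).
counts = {"Sun": 19, "Guardian": 42, "Mail": 31, "Voice": 28}

total = sum(counts.values())

# Percentage share of each tabloid, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in counts.items()}

largest = max(shares, key=shares.get)
smallest = min(shares, key=shares.get)

for name, share in shares.items():
    print(f"{name:<10}{counts[name]:>5}{share:>8.1f}%")
print(f"Largest share: {largest}; smallest share: {smallest}")
```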
(b)
Use the z test statistic for proportions
H0: π ≥ 0.40
H1: π < 0.40
Test claim that Guardian newspaper has at least a 40% market share.
(c)
Use α = 0.05
Area of Acceptance: z ≥ -1.645
Read off z(0.45) from z-table; or
=NORMSINV(0.05) [using Excel]
x = 42
n = 120
p = 42/120 = 0.35
std error = √[(0.4*0.6)/120] = 0.0447
z-stat = (0.35 - 0.4)/(0.0447) = -1.1186
Statistical conclusion
Since z-stat (-1.1186) > z-crit (-1.645) there is insufficient sample evidence at
the 5% level of significance to reject H0 in favour of H1 (Thus accept H0)
Management conclusion
Hence conclude that the Guardian's market share is at least 40%.
The Guardian newspaper's claim is justified statistically.
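The one-sample proportion test above can be checked with a short Python sketch. The function name and its arguments are illustrative assumptions, not part of the textbook; only the Python standard library is used. Because the sketch does not round the standard error to 0.0447 first, the z-stat comes out marginally different in the fourth decimal from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_z(x, n, pi0):
    """Return (p, std_error, z_stat) for a one-sample proportion z test."""
    p = x / n                                # sample proportion
    std_error = sqrt(pi0 * (1 - pi0) / n)    # based on the hypothesised proportion
    z_stat = (p - pi0) / std_error
    return p, std_error, z_stat


# Exercise 8.23: 42 Guardian readers out of 120, H0: pi >= 0.40
p, std_error, z_stat = one_sample_proportion_z(x=42, n=120, pi0=0.40)
z_crit = NormalDist().inv_cdf(0.05)          # lower-tail critical value for alpha = 0.05
reject_h0 = z_stat < z_crit                  # lower-tailed decision rule
print(round(p, 2), round(std_error, 4), round(z_stat, 3), round(z_crit, 3), reject_h0)
```

The z-stat stays above the critical value, so the sketch reaches the same decision (do not reject H0) as the solution.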
Exercise 8.24
(a)
File:
X8.24 - citrus products.xlsx
One-Way Pivot Table (Frequency Table) and Bar Chart
Awareness    Count      %
Low            72     42.35%
Moderate       64     37.65%
High           34     20%
Total         170     100%
[Bar Chart of Awareness Levels: % of respondents by awareness level (Low 42.35, Moderate 37.65, High 20)]
Interpretation
80% of sampled consumers have a low or moderate awareness. Only 20% of
the sampled consumers indicated a high awareness of the nutritional benefits of citrus.
(b) (i)
Use the z test statistic for proportions
H0: π ≤ 0.15
H1: π > 0.15
(b) (ii)
Use α = 0.01
Read off z(0.49) from z-table; or =NORMSINV(0.99) [using Excel]
Area of Acceptance: z ≤ 2.326
n = 170
x = 34
p = 34/170 = 0.2
std error = √((0.15*0.85)/170) = 0.0274
z-stat = (0.20 - 0.15)/(0.0274) = 1.825
Statistical conclusion
Since z-stat (1.825) < z-crit (2.326) there is insufficient sample evidence at
the 1% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
(b) (iii)
Management conclusion
Since the level of high consumer awareness does not exceed 15%, it is recommended that
Fruitco should launch a national awareness campaign.
Exercise 8.25
(a)
File:
X8.25 - aluminium scrap.xlsx
Histogram of Daily % Scrap of Machine
Daily % Scrap    Count
≤ 2.8               6
2.8 - ≤ 3.2        12
3.2 - ≤ 3.6        13
3.6 - ≤ 4.0        11
4.0 - ≤ 4.4         5
4.4 - ≤ 4.8         3
Total              50
[Histogram of Machine Daily % Scrap: No. of days (vertical axis) by Scrap % intervals (horizontal axis)]
Interpretation
The assumption of normality is largely satisfied.
The histogram is only moderately skewed to the right.
(b)
95% Confidence Limits for Machine
Machine - Daily % Scrap
Mean
3.483
Standard Error
0.0723
Median
3.41
Mode
2.97
Standard Deviation
0.511
Sample Variance
0.261
Kurtosis
-0.851
Skewness
0.331
Range
1.83
Minimum
2.67
Maximum
4.5
Sum
174.14
Count
50
Confidence Level(95.0%)
0.145
Lower 95% confidence limit = 3.483 - 0.145 = 3.34
Upper 95% confidence limit = 3.483 + 0.145 = 3.63
There is a 95% chance that the average daily % scrap produced by the machine
lies between 3.34% and 3.63%.
(c)
Let μ = population mean daily % scrap produced by the machine.
Use the t test statistic since σ is unknown (only s is given)
H0: μ ≥ 3.75%
H1: μ < 3.75%
One sided lower tailed test
Region of Acceptance
Use α = 0.05 with df = (50 - 1) = 49
t-crit = t(0.05)(49) = -1.677    [=TINV(0.1,49)]
Decision rule
Do not reject H0 in favour of H1 if -1.677 ≤ t-stat
t-stat
std error = 0.0723 (see Table above)
t-stat = (3.483 - 3.75)/0.0723 = -3.693
Statistical conclusion
Since t-stat (-3.693) lies well within the region of rejection, there is strong sample
evidence at the 5% significance level to reject H0 in favour of H1. (Thus reject H0).
Management conclusion
Conclude that the average daily % scrap produced by the machine is less than 3.75%.
The machine is not yet due for a full maintenance service.
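As a check on part (c), the following minimal Python sketch recomputes the standard error and t-stat from the summary statistics in the table. The function name is a hypothetical helper, and the critical value is simply the tabulated figure quoted in the solution rather than something the sketch derives.

```python
from math import sqrt


def one_sample_t(mean, mu0, s, n):
    """Return (std_error, t_stat) for a one-sample t test of the mean."""
    std_error = s / sqrt(n)
    t_stat = (mean - mu0) / std_error
    return std_error, t_stat


# Exercise 8.25 (c): sample mean, standard deviation and count from the table
std_error, t_stat = one_sample_t(mean=3.483, mu0=3.75, s=0.511, n=50)
t_crit = -1.677              # t(0.05)(49) as quoted in the solution (not computed here)
reject_h0 = t_stat < t_crit  # lower-tailed decision rule
print(round(std_error, 4), round(t_stat, 2), reject_h0)
```

The result agrees with the hand calculation to two decimals; the third decimal differs slightly because the solution rounds the standard error to 0.0723 before dividing.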
Exercise 8.26
Hypothesis Test for a Single Population Variance, σ²
5m pipe length variability analysis
Problem Characteristics
Variable x = length of pipe (specification = 5m)
Data type numerical, ratio scaled, continuous.
Data
n = 26 pipes
α = 0.05
s = 3.46
σ0 = 3
Step 1
H0 : σ² ≤ 9    (Management Question in H0)
H1 : σ² > 9
One sided Upper tailed test
Step 2
Region of Non-Rejection
Upper Χ²-crit = 37.652    (Use Chi-Square)
Do not reject H0 if Χ²-stat ≤ 37.652
Step 3
Sample test statistic Χ²-stat    (Formula 8.4)
Χ²-stat = 33.254
Also p-value = 0.12    (Use CHISQ.DIST.RT(x, df))
Steps 4, 5 Statistical Conclusion
Since X2-stat < upper X2-crit (i.e. lies in region of non-rejection of H0),
do not reject H0 at 5% level of significance.
Management Conclusion
The production manager can be 95% confident that the variation
in pipe lengths is within the limits of the product specification.
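The Χ²-stat above is reproduced by (n - 1)s²/σ0², which is presumably what Formula 8.4 states; that identification is an assumption based on the numbers matching. A minimal Python sketch, with a hypothetical function name and the critical value taken from the solution rather than computed:

```python
def chi_square_variance_stat(n, s, sigma0_sq):
    """Chi-square statistic for a single variance: (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * s ** 2 / sigma0_sq


# Exercise 8.26: n = 26 pipes, s = 3.46, hypothesised variance = 9
chi2_stat = chi_square_variance_stat(n=26, s=3.46, sigma0_sq=9)
chi2_crit = 37.652                 # upper 5% point for df = 25, as quoted in the solution
reject_h0 = chi2_stat > chi2_crit  # upper-tailed decision rule
print(round(chi2_stat, 3), reject_h0)
```

The same helper applies to the other single-variance exercises by passing the sample standard deviation (or the square root of s² when the variance is given).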
Exercise 8.27
Hypothesis Test for a Single Population Variance, σ2
Problem Characteristics
Variable x = unknown - but numeric
Data type numerical, ratio scaled, continuous
Data
n = 20
α = 0.1
s² = 49.3
σ²0 = 30
Step 1
H0 : σ² ≤ 30
H1 : σ² > 30
One sided Upper tailed test
Step 2
Region of Acceptance
Upper Χ²-crit = 27.204    (Use Chi-Square)
Do not reject H0 if Χ²-stat ≤ 27.204
Step 3
Sample test statistic Χ²-stat    (Formula 8.4)
Χ²-stat = 31.223
Also p-value = 0.0382    (Use CHISQ.DIST.RT(x, df))
Steps 4, 5 Statistical Conclusion
Since X2-stat > X2-crit (i.e. lies in region of rejection of H0),
reject H0 at 10% level of significance.
Management Conclusion
We are 90% confident that the population variance is significantly greater
than the specified value of 30.
Exercise 8.28
Hypothesis Test for a Single Population Variance, σ²
Insurance claim values variability analysis
Problem Characteristics
Variable x = claim values (in Rand)
Data type numerical, ratio scaled - implies means and standard deviations
Data
n = 32 claims
α = 0.05
s = 84
σ²0 = 5625
Step 1
H0 : σ² = 5625    (Management Question in H0)
H1 : σ² ≠ 5625
Two-tailed test
Step 2
Region of Non-Rejection
Lower Χ²-crit = 17.539    (Use Chi-Square)
Upper Χ²-crit = 48.232
Do not reject H0 if 17.539 ≤ Χ²-stat ≤ 48.232
Step 3
Sample test statistic Χ²-stat    (Use Formula 8.4)
Χ²-stat = 38.886
Also p-value = 0.1561    (Use CHISQ.DIST.RT(x, df))
Steps 4, 5 Statistical Conclusion
Since Χ²-stat lies between the lower and upper Χ²-crit (i.e. lies in region of non-rejection of H0),
do not reject H0 at 5% level of significance.
Management Conclusion
The insurance industry can be 95% confident that the variation in claim values is still σ² = R5625.
---ooOoo---
Exercise 8.29
File:
X8.29 - pain relief.xlsx
Problem Characteristics
Variable x = time to pain relief (in minutes)
Data type numerical, ratio scaled, continuous
Data
n = 16 patients
α = 0.01
s = 0.9004
σ²0 = 1.8
Step 1
H0 : σ² ≥ 1.8    (Management Question in H0)
H1 : σ² < 1.8
One sided lower tailed test
Step 2
Region of Non-Rejection
Lower Χ²-crit = 5.229    (Use Chi-Square)
Do not reject H0 if Χ²-stat ≥ 5.229
Step 3
Sample test statistic Χ²-stat    (Use Formula 8.4)
Χ²-stat = 6.756
Also p-value = 0.0359    (Use CHISQ.DIST(x, df, TRUE))
Steps 4, 5 Statistical Conclusion
Since X2-stat > lower X2-crit (i.e. lies in region of non-rejection of H0),
do not reject H0 at 1% level of significance.
Management Conclusion
The pharmaceutical company can be 99% confident that
the variation in time to pain relief from the new headache pill is not significantly less than
the current headache pill (of σ² = 1.8 min).
(i.e. it does not significantly reduce variation in time to pain relief).
CHAPTER 9
HYPOTHESIS TESTS
COMPARISON BETWEEN TWO POPULATIONS (MEANS, VARIANCES AND PROPORTIONS)
Exercise 9.1
When the population standard deviations of the two populations are unknown.
Exercise 9.2
Use Formula 9.1
z-stat = [(72 - 66) - 0]/√(20²/40 + 10²/50)
z-stat = 1.7321
Exercise 9.3
(a) t-crit (0.05,38) = 2.024    (Refer to Appendix 1 Table 2)
(b) t-crit (0.05,90) = 1.987    (Refer to Appendix 1 Table 3)
Exercise 9.4
When the two samples are not independent.
Exercise 9.5
H0: π1 ≥ π2
H1: π1 < π2
One sided lower tailed test.
Exercise 9.6
(a)
Test for equality of variances
H0 : σ²1 = σ²2
H1 : σ²1 ≠ σ²2
Two tailed test
F-crit = F(0.05/2, 23, 18) = 2.50    (from Table 4(b), Appendix 1)
         or =F.INV.RT(0.025,23,18) = 2.515
Decision rule: Do not reject H0 if F-stat ≤ 2.50
F-stat = 4.14²/3.32² = 1.555    (Use rule of larger variance in numerator)
Since F-stat (1.555) < F-crit (2.515), do not reject H0 in favour of H1.
Conclude,at α = 0.05, that the two population variances are equal.
Therefore, use the pooled-variances t -test to test for differences in means.
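The "larger variance in the numerator" rule used for the F-stat can be sketched in a few lines of Python. The function name is illustrative only, and the critical value is the table figure quoted above, not something the sketch computes.

```python
def variance_ratio(s1, s2):
    """F statistic with the larger sample variance in the numerator."""
    v1, v2 = s1 ** 2, s2 ** 2
    return max(v1, v2) / min(v1, v2)


# Exercise 9.6 (a): sample standard deviations 3.32 and 4.14
f_stat = variance_ratio(3.32, 4.14)
f_crit = 2.50                          # F(0.025, 23, 18) from the table quoted in the solution
equal_variances_plausible = f_stat <= f_crit
print(round(f_stat, 3), equal_variances_plausible)
```

Because the helper always divides the larger variance by the smaller, the order in which the two standard deviations are passed does not matter.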
(b)
Test for difference between two population means
Let population 1 =
Manufacturers
Let population 2 =
Retailers
Let μi = population mean earnings yield (%) per sector i .
Use the t test statistic since σi's are unknown (only s1 and s2 are given)
(c)
H0 : μ1 = μ2
H1 : μ1 ≠ μ2
Two tailed test
Region of Acceptance
Use α = 0.05 with df = (19+24-2) = 41
t-crit = t(0.05)(41) = 2.021
Decision rule
Do not reject H0 in favour of H1 if -2.021 ≤ t-stat ≤ 2.021
t-stat
s² (pooled variance) = ((19-1)*3.32² + (24-1)*4.14²)/(19+24-2) = 14.454
std error = √(14.454*(1/19+1/24)) = 1.16747
t-stat = ((8.45-10.22)-(0))/1.16747 = -1.51609
Conclusion
Since t-stat (-1.51609) lies within the region of acceptance, there is
insufficient sample evidence at the 5% level of significance to
reject H0 in favour of H1. (i.e. Accept H0).
Conclude that there is no difference in the mean earnings yield (%) between
manufacturing companies and retail companies.
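The pooled-variances calculation recurs throughout this chapter, so a small Python sketch of it may be useful as a check. The function name and argument order are assumptions made for illustration; the critical value is the one quoted in the solution.

```python
from math import sqrt


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (pooled_variance, std_error, t_stat, df) for the pooled-variances t test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_error
    return pooled_var, std_error, t_stat, df


# Exercise 9.6 (b): population 1 = Manufacturers, population 2 = Retailers
pooled_var, std_error, t_stat, df = pooled_t(8.45, 3.32, 19, 10.22, 4.14, 24)
t_crit = 2.021                       # two-tailed critical value used in the solution
reject_h0 = abs(t_stat) > t_crit
print(round(pooled_var, 3), round(std_error, 5), round(t_stat, 4), df, reject_h0)
```

Feeding in the summary statistics of Exercises 9.7, 9.12, 9.13 or 9.14 reproduces their pooled variances and t-stats in the same way.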
Exercise 9.7
(a)
Let population 1 =
DIY Consumers
Let population 2 =
Non-DIY Consumers
Let μi = population mean age of each consumer group i .
Use the t test statistic since σi's are unknown (only s1 and s2 are given)
(b)
H0: μ1 ≥ μ2
H1: μ1 < μ2
One sided lower tailed test
Region of Acceptance
Use α = 0.10 with df = (29+34-2) = 61
t-crit = t(0.10)(61) = -1.296
Decision rule
Do not reject H0 in favour of H1 if -1.296 ≤ t-stat
t-stat
s² (pooled variance) = ((29-1)*15.9² + (34-1)*16.2²)/(29+34-2) = 258.0197
std error = √(258.0197*(1/29+1/34)) = 4.060301
t-stat = ((41.8-47.4)-(0))/4.0603 = -1.379
(c)
Statistical conclusion
Since t-stat (-1.379) lies outside (below) the region of acceptance,
there is sufficient sample evidence at the 10% level of significance
to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that the mean age of DIY consumers is significantly lower than
the mean age of non-DIY consumers at the 10% level of significance.
(d)
Region of Acceptance (new)
Use α = 0.05 with df = (29+34-2) = 61
t-crit = t(0.05)(61) = -1.671
Decision rule
Do not reject H0 in favour of H1 if -1.671 ≤ t-stat
Statistical conclusion
t-stat (-1.379) now lies within the (new) region of acceptance.
Thus there is insufficient sample evidence at the 5% level of significance
to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that there is no significant difference in the mean ages between
DIY consumers and non-DIY consumers.
Exercise 9.8
(a)
Let population 1 =
Bus commuters
Let population 2 =
Train commuters
Let μi = population mean commuting time for each transport mode i .
Use the t test statistic since σi's are unknown (only s1 and s2 are given)
(b)
H0: μ1 ≤ μ2
H1: μ1 > μ2
One sided upper tailed test
Test for equality of variances
H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2
Two tailed test
F-crit = F(0.05/2, 21, 35) = 2.10    (from Table 4(b), Appendix 1)
         or =F.INV.RT(0.025,21,35) = 2.105
Decision rule: Do not reject H0 if F-stat ≤ 2.10
F-stat = 7.8²/4.6² = 2.875    (Use rule of larger variance in numerator)
Since F-stat (2.875) > F-crit (2.10), reject H0 in favour of H1.
Conclude,at α = 0.05, that the two population variances are different.
Therefore, use the unequal-variances t -test to test for differences in means.
(c)
Region of Acceptance
df (num) = 11.24417
df (denom) = 0.374049
df = 30    (Use df formula 9.9)
Use α = 0.01 with df = 30
t-crit = t(0.01)(30) = 2.457
Decision rule
Do not reject H0 in favour of H1 if t-stat ≤ 2.457
t-stat
std error = √(7.8²/22 + 4.6²/36) = 1.831183
t-stat = ((35.3-31.8)-(0))/1.8312 = 1.911
Statistical conclusion
Since t-stat (1.911) lies within the region of acceptance,
there is insufficient sample evidence at the 1% level of significance
to reject H0 in favour of H1. (i.e. Accept H0).
(d)
Management conclusion
Conclude that there is no difference in the mean commuting times between
bus and train commuters.
Recommendation
Since there is no difference in mean commuting times, either mode of transport can be
prioritised for upgrading (or both can be upgraded simultaneously).
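Evaluating √(7.8²/22 + 4.6²/36) and the degrees-of-freedom expression directly is a useful check on the unequal-variances calculation. The Python sketch below assumes that "df formula 9.9" is the usual Welch-Satterthwaite approximation; the function name is hypothetical.

```python
from math import sqrt


def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Return (std_error, t_stat, df) for the unequal-variances t test."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2      # per-sample variance of the mean
    std_error = sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / std_error
    # Welch-Satterthwaite approximation (assumed to match formula 9.9)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return std_error, t_stat, df


# Exercise 9.8: population 1 = bus commuters, population 2 = train commuters
std_error, t_stat, df = welch_t(35.3, 7.8, 22, 31.8, 4.6, 36)
print(round(std_error, 4), round(t_stat, 3), round(df, 1))
```

The approximate df rounds to 30, and the t-stat remains below any one-sided 1% critical value for that df, so H0 is not rejected.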
Exercise 9.9
Let population 1 =
Mastercard users
Let population 2 =
Visa Card users
Let μi = population mean month-end credit card balance (in Rands) for each card type i .
Use the z test statistic since σi's are known
H0: μ1 = μ2
H1: μ1 ≠ μ2
Two sided test
Region of Acceptance
Use α = 0.05
z-crit = z(0.05) = 1.96
Decision rule
Do not reject H0 in favour of H1 if -1.96 ≤ z-stat ≤ 1.96
z-stat
std error = √(294²/45+336²/66) = 60.26065
z-stat = ((922-828)-(0))/60.26065 = 1.5599
Statistical conclusion
Since z-stat (1.5599) lies within the region of acceptance, there is insufficient sample
evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that there is no difference in the mean month-end credit card balances
between Mastercard holders and Visa Card holders.
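When the population standard deviations are known, the two-sample z test needs nothing beyond the normal distribution, so it can be sketched with the Python standard library alone. The function name is an illustrative assumption; the two-sided p-value is an extra not shown in the solution.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sigma1, n1, mean2, sigma2, n2):
    """Return (std_error, z_stat) for a difference in means with known sigmas."""
    std_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return std_error, (mean1 - mean2) / std_error


# Exercise 9.9: population 1 = Mastercard users, population 2 = Visa Card users
std_error, z_stat = two_sample_z(922, 294, 45, 828, 336, 66)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)              # two-sided, alpha = 0.05
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))        # two-sided p-value
reject_h0 = abs(z_stat) > z_crit
print(round(std_error, 5), round(z_stat, 4), round(z_crit, 2), reject_h0)
```

For the one-sided tests in Exercises 9.10 and 9.11, the same z-stat would instead be compared with `NormalDist().inv_cdf(0.05)`.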
Exercise 9.10
Let population 1 =
non-attendees of job enrichment workshops
Let population 2 =
attendees of job enrichment workshops
Let μi = population mean job satisfaction rating for each employee category i .
Use the z test statistic since σi's are known
H0: μ1 ≥ μ2
H1: μ1 < μ2    ← non-attendees have lower job satisfaction than attendees.
One-sided lower tailed test
Region of Acceptance
Use α = 0.05
z-crit = z(0.05) = -1.645
Decision rule
Do not reject H0 in favour of H1 if -1.645 ≤ z-stat
z-stat
std error = √(1.1²/22+0.8²/25) = 0.283901
z-stat = ((6.9-7.5)-(0))/0.283901 = -2.1134
Statistical conclusion
Since z-stat (-2.1134) lies outside (below) the region of acceptance, there is sufficient
sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that the mean job satisfaction score for non-attendees is significantly
lower than the mean job satisfaction score for job enrichment attendees.
Thus, the statistical evidence supports the view that the job enrichment
workshops significantly increased job satisfaction levels of sales consultants.
Exercise 9.11
(a)
95% Confidence Limits - Explorer Fund only
Std error = √(2.3²/15) = 0.5939
z (0.95) = 1.96
Lower 95% confidence limit = 12.4 - 1.96 (0.5939) = 11.24 days
Upper 95% confidence limit = 12.4 + 1.96 (0.5939) = 13.56 days
There is a 95% chance that the true average time to settlement for
claims lodged against the Explorer Fund lies between 11.24 days and 13.56 days.
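The confidence limits in part (a) follow the pattern mean ± z × standard error. A minimal Python sketch of that calculation is shown below; the function name and its `confidence` parameter are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, confidence=0.95):
    """Return (lower, upper) confidence limits for a mean with known sigma."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin


# Exercise 9.11 (a): Explorer Fund, mean 12.4 days, sigma 2.3, n = 15
lower, upper = z_confidence_interval(12.4, 2.3, 15)
print(round(lower, 2), round(upper, 2))
```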
(b)
Let population 1 =
Green-Aid Medical Fund
Let population 2 =
Explorer Medical Fund
Let μi = population mean time to settlement of claims by each medical fund i .
Use the z test statistic since σi's are known
H0: μ1 ≥ μ2
H1: μ1 < μ2    ← Green-Aid Fund settles quicker than Explorer Fund
One-sided lower tailed test
Region of Acceptance
(Use α = 0.05)
z-crit = z(0.05) = -1.645
Decision rule
Do not reject H0 in favour of H1 if -1.645 ≤ z-stat
z-stat
std error = √(3.2²/14+2.3²/15) = 1.041199
z-stat = ((10.8-12.4)-(0))/1.041199 = -1.5367
Statistical conclusion
Since z-stat (-1.5367) lies within the region of acceptance
(or p-value = 0.0622 > α = 0.05), there is insufficient sample
evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
There is no difference in the mean claims settlement time between the two Funds.
Thus, the statistical evidence does not support the view that the Green-Aid Medical
Fund settles claims sooner, on average, than the Explorer Medical Fund.
Exercise 9.12
Let population 1 =
Gas ovens
Let population 2 =
Electric ovens
Let μi = population mean baking time for each oven type i .
Use the t test statistic since σi's are unknown (only s1 and s2 are given)
H0: μ1 ≥ μ2
H1: μ1 < μ2    ← gas ovens bake faster than electric ovens
One sided lower tailed test
Region of Acceptance
Use α = 0.05 with df = (5+5-2) = 8
t-crit = t(0.05)(8) = -1.86
Decision rule
Do not reject H0 in favour of H1 if t-stat ≥ -1.86
t-stat
s² (pooled variance) = ((5-1)*0.16² + (5-1)*0.09²)/(5+5-2) = 0.01685
std error = √(0.01685*(1/5+1/5)) = 0.08210
t-stat = ((0.75-0.89)-(0))/0.0821 = -1.705
Statistical conclusion
Since t-stat (-1.705) lies within the region of acceptance, there is insufficient sample
evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that there is no difference in the mean bread baking time between gas
and electric ovens, at the 5% significance level.
Gas ovens are therefore not faster, on average, than electric ovens.
Exercise 9.13
(a)
Test for equality of variances
H0: σ21 = σ22
H1: σ²1 ≠ σ²2
Rule of thumb test: F-stat < 3?
F-stat = 152.2²/121.5² = 1.569
Since F-stat (1.569) < 3, do not reject H0 in favour of H1.
Conclude that the two population variances are equal.
Therefore, use the pooled-variances t -test to test for differences in means.
(b)
Let population 1 =
Cape Town branch
Let population 2 =
Durban branch
Let μi = population mean size of orders received by each branch i .
Use the t test statistic since σi's are unknown (only s1 and s2 are given)
H0: μ1 ≤ μ2
H1: μ1 > μ2    ← CT branch performing better than Durban branch
One sided upper tailed test
Region of Acceptance
Use α = 0.10 with df = (18+15-2) = 31
t-crit = t(0.10)(31) = 1.309
Decision rule
Do not reject H0 in favour of H1 if t-stat ≤ 1.309
t-stat
s² (pooled variance) = ((18-1)*121.5² + (15-1)*152.2²)/(18+15-2) = 18556.97
std error = √(18556.97*(1/18+1/15)) = 47.6243
t-stat = ((335.2-265.6)-(0))/47.6243 = 1.4614
Statistical conclusion
Since t-stat (1.4614) lies outside (above) the region of acceptance, there is sufficient
sample evidence at the 10% significance level to reject H0 in favour of H1 (i.e. Reject H0).
Management conclusion
Conclude that the mean size of orders received by the Cape Town branch is
significantly larger than the mean size of orders received by the Durban branch.
Thus the Cape Town branch is performing better than the Durban branch in
terms of average order size.
(c)
Region of Acceptance (new)
Use α = 0.05 with df = (18+15-2) = 31
t-crit = t(0.05)(31) = 1.696
Decision rule (new)
Do not reject H0 in favour of H1 if t-stat ≤ 1.696
Statistical conclusion
t-stat (= 1.4614) now lies within the (new) region of acceptance.
Thus there is insufficient sample evidence at the 5% level of significance to
reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that there is no significant difference in the mean order sizes between the
Cape Town branch and the Durban branch.
Thus there is no evidence to believe that the Cape Town branch is performing better
than the Durban branch in terms of average order size.
(d)
Findings based on the 5% significance level are more meaningful than those based
on the 10% significance level because a test conducted at the 5% level requires
stronger (more convincing) sample evidence before the null hypothesis is rejected.
The operations manager can be more confident that there is no difference in mean
performance between the two branches (conclusion based on (c)).
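The reversal between the 10% and 5% significance levels comes purely from the critical value moving, since the t-stat itself does not change. A minimal Python sketch makes this explicit, using the tabulated critical values quoted in the solution; the helper name is illustrative.

```python
def decision(t_stat, t_crit):
    """Upper-tailed rule: reject H0 only when t-stat exceeds t-crit."""
    return "reject H0" if t_stat > t_crit else "do not reject H0"


t_stat = 1.4614
# t(0.10)(31) and t(0.05)(31) as quoted in the solution, not computed here
critical_values = {0.10: 1.309, 0.05: 1.696}
decisions = {alpha: decision(t_stat, t_crit) for alpha, t_crit in critical_values.items()}
for alpha, outcome in decisions.items():
    print(f"alpha = {alpha:.2f}: {outcome}")
```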
Exercise 9.14
File:
X9.14 - package designs.xlsx
First, test for equality of variances
H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2
Two tailed test
F-crit = F(0.05/2, 7, 7) = 4.99    (from Table 4(b), Appendix 1)
         or =F.INV.RT(0.025,7,7) = 4.995
Decision rule: Do not reject H0 if F-stat ≤ 4.99
F-stat = 5.706²/4.862² = 1.377    (Use rule of larger variance in numerator)
Since F-stat (1.377) < F-crit (4.99), do not reject H0 in favour of H1.
Conclude,at α = 0.05, that the two population variances are equal.
Therefore, use the pooled-variances t -test to test for differences in means.
Now conduct the t-test for equal means, using the pooled-variances t-test approach.
Let population 1 =
Pyramid-shaped carton
Let population 2 =
Barrel-shaped carton
Let μi = population mean sales volume of one-litre cartons for each carton shape i .
Use the t test statistic since σi's are unknown (only s1 and s2 are given)
H0: μ1 ≥ μ2
H1: μ1 < μ2    ('Pyramid' sales are less than 'Barrel' sales)
One sided lower tailed test
Region of Acceptance
Use α = 0.05 with df = (8+8-2) = 14
t-crit = t(0.05)(14) = -1.761
Decision rule
Do not reject H0 in favour of H1 if -1.761 ≤ t-stat
t-stat
s² (pooled variance) = ((8-1)*4.862² + (8-1)*5.706²)/(8+8-2) = 28.09874
std error = √(28.09874*(1/8+1/8)) = 2.650412
t-stat = ((23.75-27.375)-(0))/2.650412 = -1.3677
Statistical conclusion
Since t-stat (-1.3677) lies within the region of acceptance, there is insufficient sample
evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that there is no difference in the mean weekly sales of one-litre cartons
of apple juice between the pyramid-shaped and barrel-shaped carton designs.
Thus the marketer can choose either package design, as neither is shown to achieve higher weekly sales.
Exercise 9.15
Let population 1 =
Fruit Puffs consumers
Let population 2 =
Fruity Wheat consumers
Let πi = population proportion of consumers who prefer fruit-flavoured wheat cereal i .
Use the z test statistic
H0: π1 ≤ π2
H1: π1 > π2    (Fruit Puffs is preferred by more consumers than Fruity Wheat.)
One sided upper tailed test
Region of Acceptance
Use α = 0.05
z-crit = z(0.05) = 1.645
Decision rule
Do not reject H0 in favour of H1 if z-stat ≤ 1.645.
Sample data
                  n      x      pi
Fruit Puffs      175     54    0.309
Fruity Wheat     150     36    0.240
z-stat
π(hat) (pooled proportion) = (54+36)/(175+150) = 0.2769
std error = √(0.2769*(1-0.2769)*(1/175+1/150)) = 0.04978949
z-stat = (0.309-0.24)/0.049789 = 1.3858
Statistical conclusion
Since z-stat (= 1.3858) lies within the region of acceptance, there is insufficient sample
evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that there is no difference in the percentage of consumers who prefer
each type of fruit-flavoured wheat cereal.
The marketer's view that Fruit Puffs is more preferred than Fruity Wheat cannot be
validated based on the statistical evidence at the 5% level of significance.
The marketer can therefore choose to launch either fruit flavour of wheat cereal.
Exercise 9.16
(a)
Let population 1 =
Male respondents
Let population 2 =
Female respondents
Let πi = population proportion who prefer jazz for each gender i .
Use the z test statistic
H0: π1 = π2
H1: π1 ≠ π2
Two sided test (equal preference)
Region of Acceptance
Use α = 0.05
z-crit = 1.96
Decision rule
Do not reject H0 in favour of H1 if -1.96 ≤ z-stat ≤ 1.96.
Sample data
            n      x      pi
Male       140     46    0.329
Female     110     21    0.191
z-stat
π(hat) (pooled proportion) = (46+21)/(140+110) = 0.268
std error = √(0.268*(1-0.268)*(1/140+1/110)) = 0.056432928
z-stat = (0.329-0.191)/0.056433 = 2.445
Statistical conclusion
Since z-stat (2.445) lies outside (above) the region of acceptance, there is sufficient
sample evidence at the 5% significance level to reject H0 in favour of H1 (i.e. Reject H0)
Management conclusion
Conclude that there is a difference in the proportion of males compared to the
proportion of females who enjoy listening to jazz.
By inspection , proportionately more males than females enjoy listening to jazz.
(b)
p -value =
=(1-NORMSDIST(2.445))*2
0.0145
Since the p-value (=0.0145) < α = 0.05, there is strong sample evidence in
support of H1. Hence conclude that there is a difference in the proportion of males
compared to the proportion of females who enjoy listening to jazz.
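The pooled two-proportion z test and its two-sided p-value can be checked with the Python sketch below. The function name is an illustrative assumption. The sketch works from the raw counts rather than the rounded proportions 0.329 and 0.191, so its z-stat and p-value differ from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled_proportion, std_error, z_stat) for a difference in proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return pooled, std_error, (p1 - p2) / std_error


# Exercise 9.16: 46 of 140 males and 21 of 110 females prefer jazz
pooled, std_error, z_stat = two_proportion_z(46, 140, 21, 110)
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))    # two-sided p-value
reject_h0 = p_value < 0.05
print(round(pooled, 3), round(std_error, 6), round(z_stat, 2), reject_h0)
```

The decision is unaffected by the rounding difference: the p-value is well below 0.05 either way.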
Exercise 9.17
(a)
Let population 1 =
Status Cheque Account clients
Let population 2 =
Elite Cheque Account clients
Let πi = population proportion of clients for each account type i who are overdrawn.
Use the z test statistic
H0: π1 ≥ π2
H1: π1 < π2    ('Status' proportion less than 'Elite' proportion)
One sided lower tailed test
Region of Acceptance
Use α = 0.05
z-crit = -1.645
Decision rule
Do not reject H0 in favour of H1 if z-stat ≥ -1.645.
Sample data
            n      x      pi
Status     300     48    0.16
Elite      250     55    0.22
z-stat
π(hat) (pooled proportion) = (48+55)/(300+250) = 0.1873
std error = √(0.1873*(1-0.1873)*(1/300+1/250)) = 0.0334
z-stat = (0.16-0.22)/0.0334 = -1.796
Statistical conclusion
Since z-stat (= -1.796) lies outside (below) the region of acceptance, there is sufficient
sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that proportionately more Elite cheque account clients are overdrawn
compared to Status cheque account clients.
(b)
p -value =
=NORMSDIST(-1.796)
0.0363
Since the p-value (=0.0363) < α = 0.05, there is strong sample evidence in
support of H1. Hence conclude that proportionately more Elite cheque account clients
are overdrawn compared to Status cheque account clients.
Exercise 9.18
(a) (i)
File:
X9.18 - aluminium scrap.xlsx
Test for equality of variances
H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2
Two tailed test
Since F-stat (1.702) < F-crit (1.77), do not reject H0 in favour of H1.
Conclude,at α = 0.05, that the two population variances are equal.
Use the pooled-variances t -test to test for differences in means.
F-Test Two-Sample for Variances (Data Analysis - Excel)
                        Machine 1    Machine 2
Mean                      3.483        3.668
Variance                  0.261        0.153
Observations              50           30
df                        49           29
F-stat                    1.702
P(F<=f) one-tail          0.0639
F Critical one-tail       1.777
(a) (ii)
Let population 1 =
machine 1 daily % scrap
Let population 2 =
machine 2 daily % scrap
Let μi = population average daily % scrap produced by each machine i .
(a) (iii) Use the pooled-variances t test statistic since σi's are unknown (only s1 and s2 are given)
H0: μ1 ≥ μ2
H1: μ1 < μ2    (Machine 1 produces lower % scrap than machine 2)
One sided lower tailed test
Test performed manually using Descriptive Statistics from Data Analysis
Region of Acceptance
Use α = 0.05 with df = (50+30-2) = 78
t-crit = t(0.05)(78) = -1.664    [TINV(0.1,78)]
Decision rule
Do not reject H0 in favour of H1 if -1.664 ≤ t-stat
Sample data - descriptive statistics
                        Machine 1    Machine 2
Mean                      3.483        3.668
Standard Error            0.072        0.072
Median                    3.41         3.705
Mode                      2.97         3.77
Standard Deviation        0.511        0.392
Sample Variance           0.261        0.153
Skewness                  0.331        0.014
Range                     1.83         1.59
Minimum                   2.67         2.88
Maximum                   4.5          4.47
Sum                       174.14       110.03
Count                     50           30
t-stat
s² (pooled variance) = ((50-1)*0.511²+(30-1)*0.392²)/(50+30-2) = 0.221169038
std error = √(0.221169*(1/50+1/30)) = 0.108607919
t-stat = ((3.483-3.668)-(0))/0.10861 = -1.7032
Test performed using t-Test Assuming Equal Variances in Data Analysis
Using Data Analysis (in Excel)
t-Test: Two-Sample Assuming Equal Variances
                                 Machine 1    Machine 2
Mean                               3.483        3.668
Variance                           0.261        0.153
Observations                       50           30
Pooled Variance                    0.221
Hypothesized Mean Difference       0
df                                 78
t Stat                             -1.703
P(T<=t) one-tail                   0.046
t Critical one-tail                -1.665
P(T<=t) two-tail                   0.093
t Critical two-tail                1.991
(a) (iv)
Statistical conclusion
Since t-stat (= -1.7032) lies outside (below) the region of acceptance
(or p-value = 0.0463 < α = 0.05 (see below)), there is sufficient
sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that machine 1 has a significantly lower average daily % scrap than
machine 2.
(b)
p-value = 0.0463
=TDIST(-(-1.7032),78,1)    or    =T.DIST(-1.7032,78,TRUE)
Since the p-value (=0.046) < α = 0.05, there is strong sample evidence in
support of H1. Conclude that machine 1 produces scrap at a significantly lower
average daily rate than machine 2. This conclusion is valid at the 5% significance level.
Exercise 9.19
File:
X9.19 - water purification.xlsx
(a) (i) Test for equality of variances
H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2
Two tailed test
Since F-stat (1.063) < F-crit (1.924), do not reject H0 in favour of H1.
Conclude,at α = 0.05, that the two population variances are equal.
Use the pooled-variances t -test to test for differences in means.
F-Test Two-Sample for Variances (Data Analysis - Excel)
                        Free State    KZN
Mean                      27.458     26.448
Variance                  1.563      1.470
Observations              24         29
df                        23         28
F-stat                    1.063
P(F<=f) one-tail          0.4342
F Critical one-tail       1.924
(a) (ii) Let population 1 =
Free State plant daily impurities levels
Let population 2 =
KZN plant daily impurities levels
Let μi = population average daily impurities level per plant i .
(a) (iii) Use the pooled-variances t test statistic since σi's are unknown (only s1 and s2 are given)
H0: μ1 ≤ μ2
H1: μ1 > μ2    (FS treatment plant has higher level of impurities than the KZN water treatment plant.)
One sided upper tailed test
Test performed manually using Descriptive Statistics from Data Analysis
Region of Acceptance
Use α = 0.01 with df = (24+29-2) = 51
t-crit = t(0.01)(51) = 2.402    [TINV(0.02,51)]
Decision rule
Do not reject H0 in favour of H1 if t-stat ≤ 2.402
Descriptive Statistics
                            Free State    KZN
Mean                          27.458     26.448
Standard Error                0.255      0.225
Median                        27.5       27
Mode                          28         27
Standard Deviation            1.250      1.213
Sample Variance               1.563      1.470
Skewness                      0.030      -0.193
Range                         5          5
Minimum                       25         24
Maximum                       30         29
Sum                           659        767
Count                         24         29
Confidence Level(99.0%)       0.717      0.622
t-stat
s² (pooled variance) = ((24-1)*1.25²+(29-1)*1.213²)/(24+29-2) = 1.51247
std error = √(1.51247*(1/24+1/29)) = 0.33937
t-stat = ((27.458-26.448)-(0))/0.33937 = 2.9764
Test performed using t-Test Assuming Equal Variances in Data Analysis
Using Data Analysis (in Excel)
t-Test: Two-Sample Assuming Equal Variances
                                 Free State    KZN
Mean                               27.458     26.448
Variance                           1.563      1.470
Observations                       24         29
Pooled Variance                    1.512
Hypothesized Mean Difference       0
df                                 51
t Stat                             2.9764
P(T<=t) one-tail                   0.00223
t Critical one-tail                2.402
P(T<=t) two-tail                   0.004
t Critical two-tail                2.676
(a) (iv) Statistical conclusion
Since t-stat (2.9764) lies outside (above) the region of acceptance, there is sufficient
sample evidence at the 1% significance level to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that the KZN plant produces water of a higher quality (fewer impurities, on average)
than the Free State plant. Thus the KZN plant manager's claim can be supported
statistically at the 1% level of significance.
(b)
p-value = 0.00223
=TDIST(2.9761,51,1)    or    =T.DIST.RT(2.9764,51)
Since the p-value (=0.00223) << α = 0.01, there is overwhelming sample evidence in
support of H1. Same conclusion as in (a) applies.
Exercise 9.20
(a)
File:
X9.20 - herbal tea.xlsx
Test for equality of variances
H0: σ21 = σ22
Two tailed test
H1: σ²1 ≠ σ²2
Since F-stat (1.227) < F-crit (2.098), do not reject H0 in favour of H1.
Conclude,at α = 0.05, that the two population variances are equal.
Use the pooled-variances t -test to test for differences in means.
F-Test Two-Sample for Variances (Data Analysis - Excel)
                        Freshpak    Yellow Label
Mean                      7.689       6.917
Variance                  2.170       1.768
Observations              19          23
df                        18          22
F-stat                    1.227
P(F<=f) one-tail          0.3205
F Critical one-tail       2.098
(b) (i)
Let population 1 = Freshpak brand
Let population 2 = Yellow Label brand
Let μi = population mean level of quercetin in mg/kg in each brand i .
Use the t test statistic since σi's are unknown (only s1 and s2 are given)
H0: μ1 = μ2    (No difference)
H1: μ1 ≠ μ2
Two sided test
Test performed manually using Descriptive Statistics from Data Analysis
Region of Acceptance
Use α = 0.05 with df = (19+23-2) = 40
t-crit = t(0.05)(40) = 2.021
Decision rule
Do not reject H0 in favour of H1 if -2.021 ≤ t-stat ≤ 2.021
Descriptive Statistics
                        Freshpak    Yellow Label
Mean                      7.689       6.917
Standard Error            0.3379      0.2772
Median                    7.9         7.1
Mode                      7.9         5.7
Standard Deviation        1.4731      1.3296
Sample Variance           2.1699      1.7679
Skewness                  -0.1752     0.1030
Range                     5.1         5.3
Minimum                   5           4.5
Maximum                   10.1        9.8
Sum                       146.1       159.1
Count                     19          23
t-stat
s² (pooled variance) = ((19-1)*1.473²+(23-1)*1.3296²)/(19+23-2) = 1.948688
std error = √(1.948688*(1/19+1/23)) = 0.43277
t-stat = ((7.689-6.917)-(0))/0.43277 = 1.784
Test performed using t-Test Assuming Equal Variances in Data Analysis
Using Data Analysis (in Excel)
t-Test: Two-Sample Assuming Equal Variances
                                 Freshpak    Yellow Label
Mean                               7.6895      6.9174
Variance                           2.1699      1.7679
Observations                       19          23
Pooled Variance                    1.9488
Hypothesized Mean Difference       0
df                                 40
t Stat                             1.7840
P(T<=t) one-tail                   0.0410
t Critical one-tail                1.6839
P(T<=t) two-tail                   0.0820
t Critical two-tail                2.0211
Statistical conclusion
Since t-stat (1.784) lies within the region of acceptance, there is insufficient sample
evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that there is no difference in the mean quercetin content (in mg/kg)
between the Freshpak and Yellow Labels brands of rooibos tea.
(b) (ii)
Let population 1 = Freshpak brand (FP)
Let population 2 = Yellow Label brand (YL)
H0: μ1 ≤ μ2
H1: μ1 > μ2 (FP contains more quercetin than YL)
One sided upper tailed test
Using Data Analysis (in Excel )
t-Test: Two-Sample Assuming Equal Variances
                                Freshpak    Yellow Label
Mean                            7.6895      6.9174
Variance                        2.1699      1.7679
Observations                    19          23
Pooled Variance                 1.9488
Hypothesized Mean Difference    0
df                              40
t Stat                          1.7840
P(T<=t) one-tail                0.0410
t Critical one-tail             1.6839
P(T<=t) two-tail                0.0820
t Critical two-tail             2.0211
Statistical conclusion
Since t-stat (= 1.784) lies outside (above) the region of acceptance (t-stat ≤ 1.6839),
(see table above), there is sufficient sample evidence at the 5% level of
significance to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that Freshpak's claim that their brand contains more quercetin, on average,
than the Yellow Label brand, can be supported statistically at the 5% significance level.
Exercise 9.21
(a) (i)
File:
X9.21 - meat fat.xlsx
Test for equality of variances
H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2 (Two tailed test)
Since F-stat (1.261) < F-crit (2.066), do not reject H0 in favour of H1.
Conclude, at α = 0.05, that the two population variances are equal.
Use the pooled-variances t -test to test for differences in means.
F-Test Two-Sample for Variances (Data Analysis - Excel)
                       Namibia    Little Karoo
Mean                   27.074     30.333
Variance               33.840     26.833
Observations           27         21
df                     26         20
F                      1.261
P(F<=f) one-tail       0.3002
F Critical one-tail    2.066
(a) (ii)
Let population 1 = Namibian meat producer
Let population 2 = Little Karoo meat producer
Let μi = population average fat content of meat supplied by each producer i .
Use the t test statistic since σi's are unknown (only s1 and s2 are given)
(a) (iii)
H0: μ1 ≥ μ2
H1: μ1 < μ2
One sided lower tailed test
Test performed manually using Descriptive Statistics from Data Analysis
Region of Acceptance
Use α = 0.01 with df = (27+21-2) = 46
t-crit = -t(0.01)(46) = -2.410 (in Excel, =TINV(0.02,46) returns 2.410)
Decision rule
Do not reject H0 in favour of H1 if t-stat ≥ -2.410
t-stat
s² (pooled variance) = ((27-1)*5.8173² + (21-1)*5.1801²)/(27+21-2) = 30.794
std error = √(30.794*(1/27+1/21)) = 1.61459
t-stat = ((27.0741-30.3333)-(0))/1.61459 = -2.0186
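For a lower tailed test the region of acceptance lies above the negative critical value, i.e. t-stat ≥ -t-crit. The sketch below makes the three possible decision rules explicit. It is an illustration only: the function name and the tail labels are assumptions, and the critical value 2.4102 is the value Excel returns for =TINV(0.02,46).

```python
def in_region_of_acceptance(t_stat, t_crit, tail):
    """Return True if t_stat lies in the region of acceptance (do not reject H0).

    t_crit is the critical value from the t-table (its sign is ignored);
    tail is "lower", "upper" or "two".
    """
    t_crit = abs(t_crit)
    if tail == "lower":
        return t_stat >= -t_crit
    if tail == "upper":
        return t_stat <= t_crit
    if tail == "two":
        return -t_crit <= t_stat <= t_crit
    raise ValueError("tail must be 'lower', 'upper' or 'two'")

# Exercise 9.21: lower tailed test at α = 0.01 with df = 46
decision_9_21 = in_region_of_acceptance(-2.0186, 2.4102, "lower")
print("Do not reject H0" if decision_9_21 else "Reject H0")
```

Here t-stat = -2.0186 is above -2.4102, so H0 is not rejected at the 1% significance level.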
Test performed using t-Test Assuming Equal Variances in Data Analysis
Using Data Analysis (in Excel )
t-Test: Two-Sample Assuming Equal Variances
                                Namibia     Little Karoo
Mean                            27.0741     30.3333
Variance                        33.8405     26.8333
Observations                    27          21
Pooled Variance                 30.7939
Hypothesized Mean Difference    0
df                              46
t Stat                          -2.0186
P(T<=t) one-tail                0.0247
t Critical one-tail             -2.4102
P(T<=t) two-tail                0.0494
t Critical two-tail             2.6870
(a) (iv)
Statistical conclusion
Since t-stat (-2.0186) lies within the region of acceptance, there is insufficient sample
evidence at the 1% significance level to reject H0 in favour of H1 (i.e. do not reject H0).
Management conclusion
Conclude that the mean fat content of meat from the Namibian producer is not
significantly lower than that of meat from the Little Karoo producer.
There is therefore no statistical justification, at the 1% significance level, to sign an
exclusive agreement with the Namibian producer.
(b)
p-value = 0.0247
In Excel: =TDIST(-(-2.0186),46,1) or =T.DIST(-2.0186,46,TRUE)
This is the same p-value as shown in the Data Analysis output for a one-tailed test.
Since p-value = 0.0247 > α = 0.01 (see Table above), there is insufficient sample evidence
to reject H0 in favour of H1 at the 1% level of significance.
Same management conclusion as in (a) (iii) above.
Exercise 9.22
File:
X9.22 - disinfectant sales.xlsx
(a)
Matched pairs test
The same retail outlets were surveyed both before and after the
promotional campaign. Thus the two samples are not independent.
(b)
Define
x1 = Sales per outlet before the promotional campaign
x2 = Sales per outlet after the promotional campaign
Let d = (x1 - x2), i.e. "before" - "after"
Let μd = population mean difference in sales from before to after the promotional campaign.
Use the matched pairs t test statistic
(c)
H0: μd ≥ 0
H1: μd < 0 ('before' sales are lower than the 'after' sales)
One sided lower tailed test
Region of Acceptance
Use α = 0.05 with df = (12-1) = 11
t-crit = t(0.05)(11) = -1.796
Decision rule
Do not reject H0 in favour of H1 if -1.796 ≤ t-stat
Sample data
Σxd = -8
n = 12
x(bar)d = -0.667
sd = 1.231
t-stat = (-0.667 - 0)/(1.231/√12) = -1.877
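The matched pairs t-stat depends only on the sum of the differences, their standard deviation and the number of pairs. A minimal Python sketch, assuming the sample values quoted above (the function name paired_t_stat is illustrative):

```python
import math

def paired_t_stat(sum_d, sd_d, n, hypothesized_mean=0.0):
    """Return (mean difference, t-stat) for a matched pairs t-test."""
    mean_d = sum_d / n
    std_error = sd_d / math.sqrt(n)
    return mean_d, (mean_d - hypothesized_mean) / std_error

# Exercise 9.22: d = "before" - "after" for 12 retail outlets
mean_d, t_stat = paired_t_stat(sum_d=-8, sd_d=1.231, n=12)
print(round(mean_d, 3), round(t_stat, 3))
```

Using the unrounded mean difference (-8/12) gives a t-stat of about -1.876, which matches the Data Analysis output; the manual value of -1.877 comes from rounding the mean to -0.667 first.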
Statistical conclusion
Since t-stat (-1.877) lies outside (below) the region of acceptance, there is sufficient
sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that there has been a significant increase in mean sales of 500ml bottles
of disinfectant liquid from before to after the promotional campaign.
The promotional campaign has therefore been a success at significantly
increasing mean sales volume of the product - at the 5% significance level.
(d)
t-Test: Paired Two Sample for Means (using Data Analysis)
                                Before     After
Mean                            11.5       12.17
Variance                        4.455      3.606
Observations                    12         12
Pearson Correlation             0.817
Hypothesized Mean Difference    0
df                              11
t Stat                          -1.876
P(T<=t) one-tail                0.0437
t Critical one-tail             1.7959
P(T<=t) two-tail                0.0874
t Critical two-tail             2.2010
p-value = =TDIST(-(-1.877),11,1) = 0.0436
Statistical conclusion
Since p -value = 0.0437 < α = 0.05, there is moderately strong sample
evidence to reject H0 in favour of H1 and conclude that the promotional
campaign has been effective.
Exercise 9.23
File:
X9.23 - performance ratings.xlsx
(a)
The samples are dependent as the same employee is tested both before and
after the training sessions.
(b)
Matched pairs test
x1 = Performance rating before the training sessions
x2 = Performance rating after the training sessions
Let d = (x1 - x2), i.e. "before" - "after"
Let μd = population mean difference in performance rating scores from
before to after the training sessions.
Use the matched pairs t test statistic
H0: μd ≥ 0
H1: μd < 0 ('before' rating scores lower than 'after' rating scores)
One sided lower tailed test
Test performed manually
Region of Acceptance
Use α = 0.05 with df = (18-1) = 17
t-crit = t(0.05)(17) = -1.74
Decision rule
Do not reject H0 in favour of H1 if -1.74 ≤ t-stat
Sample data
Σxd = -6.4
n = 18
x(bar)d = -0.356
sd = 0.7114
t-stat = (-0.356 - 0)/(0.7114/√18) = -2.123
Data Analysis: t-Test: Paired Two Sample for Means
                                Before        After
Mean                            11.067        11.422
Variance                        6.024         5.835
Observations                    18            18
Pearson Correlation             0.95743865
Hypothesized Mean Difference    0
df                              17
t Stat                          -2.120
P(T<=t) one-tail                0.0245
t Critical one-tail             1.740
P(T<=t) two-tail                0.04898861
t Critical two-tail             2.10981556
Statistical conclusion
Since t-stat (-2.123) lies outside (below) the region of acceptance, there is sufficient
sample evidence at the 5% significance level to reject H0 in favour of H1 (i.e. Reject H0)
Management conclusion
Conclude that there has been a significant increase in the mean performance rating
scores of employees who attended the series of workshops and seminars.
The performance enhancement sessions have therefore been effective at increasing
motivation and productivity - at the 5% level of significance.
(c)
p -value =
=TDIST(-(-2.123),17,1)
0.0244
Also see t-Test Table above.
Since p-value = 0.0244 < α = 0.05, there is strong sample evidence to reject H0
in favour of H1 and conclude that the workshops have significantly increased
employee motivation and productivity.
Exercise 9.24
File:
X9.24 - household debt.xlsx
(a)
The samples are dependent as the same household is tested both a year ago
and at the current time period .
(b)
Matched pairs test
x1 = Household debt level a year ago.
x2 = Household debt level currently.
Let d = (x1 - x2), i.e. "year ago" - "current"
Let μd = population mean difference in household debt levels from a year ago
to the current period.
Use the matched pairs t test statistic
H0: μd ≤ 0
H1: μd > 0 (Debt higher a year ago than today (current period))
One sided upper tailed test
Test performed manually
Region of Acceptance
Use α = 0.05 with df = (10-1) = 9
t-crit = t(0.05)(9) = 1.833
Decision rule
Do not reject H0 in favour of H1 if t-stat ≤ 1.833
Sample data
Σxd = 12
n = 10
x(bar)d = 1.2
sd = 1.8135
t-stat = (1.2 - 0)/(1.8135/√10) = 2.0925
Data Analysis: t-Test: Paired Two Sample for Means
                                Year ago    Current
Mean                            40.5        39.3
Variance                        35.611      36.011
Observations                    10          10
Pearson Correlation             0.9541
Hypothesized Mean Difference    0
df                              9
t Stat                          2.0925
P(T<=t) one-tail                0.0330
t Critical one-tail             1.8331
P(T<=t) two-tail                0.0659
t Critical two-tail             2.2622
Statistical conclusion
Since t-stat (2.0925) lies outside (above) the region of acceptance, there is sufficient
sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that there has been a significant decrease in the average level of
household debt from a year ago.
The increase in prime interest rate (from 6% to 11%) has led to a significant decline
in the average level of household debt from a year ago - at the 5% significance level.
(c)
p -value =
=TDIST(2.0925,9,1)
0.033
Also see t-Test Table above
Since the p-value (0.033) < α = 0.05, there is strong sample evidence to reject H0 in
favour of H1 at the 5% level of significance.
Same statistical and management conclusions as (b) above.
Exercise 9.25
Process output variability study
Random variable: Hourly output per process (unit of measure: units produced per hour)
H0: σ²(1) = σ²(2) (Management question in H0)
H1: σ²(1) ≠ σ²(2)
Region of Acceptance (use α = 0.05)
Note: Set up the F-test as an upper tailed test (F-stat = larger s²/smaller s²)
F-crit (upper) = F(0.05,30,24) = 1.939
Decision rule: Do not reject H0 if F-stat (the sample evidence) ≤ 1.939
Sample data
s1² = 14.6    n1 = 25
s2² = 23.2    n2 = 31
α = 0.05
F-stat = larger s²/smaller s² = 23.2/14.6 = 1.589
p-value = 0.1240
Statistical conclusion
Since F-stat = 1.589 < F-crit = 1.939, there is insufficient sample evidence at the
5% level of significance to reject H0 in favour of H1.
Management conclusion
Therefore conclude, with 95% confidence, that the variability of hourly outputs
between the two production processes is the same.
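The upper tailed set-up (larger sample variance in the numerator) also fixes the order of the degrees of freedom. A short Python sketch of this step, assuming the sample variances and sizes given above; the function name variance_ratio_f is illustrative, and the critical value is simply the F(0.05,30,24) figure quoted in the solution.

```python
def variance_ratio_f(var_a, n_a, var_b, n_b):
    """Return (F-stat, numerator df, denominator df) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Exercise 9.25: process 1 (s² = 14.6, n = 25) versus process 2 (s² = 23.2, n = 31)
f_stat, df_num, df_den = variance_ratio_f(14.6, 25, 23.2, 31)
f_crit = 1.939  # F(0.05,30,24), taken from the solution above
reject_h0 = f_stat > f_crit
print(round(f_stat, 3), df_num, df_den, "Reject H0" if reject_h0 else "Do not reject H0")
```

Because process 2 has the larger variance, its degrees of freedom (31-1 = 30) belong to the numerator, which is why the critical value is read at (30,24).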
Exercise 9.26
File: X9.26 - milk yield.xlsx
Random variable: milk yield (in litres per week) per cow
H0: σ²(f) ≤ σ²(c)
H1: σ²(f) > σ²(c) (Management question in H1)
Region of Acceptance:
Use α = 0.05 with df1 = 16-1 = 15 and df2 = 16-1 = 15
F-crit = F(0.05,15,15) = 2.403 (See Excel output)
Decision rule: Do not reject H0 if F-stat ≤ 2.403
F-Test Two-Sample for Variances (Data Analysis in Excel)
                       Free grazing    Controlled Feed
Mean                   31.306          35.375
Variance               72.138          43.110
Observations           16              16
df                     15              15
F-stat                 1.673
P(F<=f) one-tail       0.1647
F Critical one-tail    2.403
F-stat = 1.673 (See Excel output)
p-value = 0.1647
Statistical conclusion
Since F-stat = 1.673 < F-crit = 2.403, there is insufficient sample evidence at the
5% level of significance to reject H0 in favour of H1.
Management conclusion
Therefore conclude, with 95% confidence, that there is no significant difference
in the variability in milk yields of cows between the two feeding practices.
Exercise 9.27
File: X9.27 - employee wellness.xlsx
Random variable: Hours spent exercising per week
H0: σ²(over 40) ≤ σ²(under 40)
H1: σ²(over 40) > σ²(under 40) (Management question in H1)
Region of Acceptance:
Use α = 0.05 with df1 = 23-1 = 22 and df2 = 21-1 = 20
F-crit = F(0.05,22,20) = 2.102 (See Excel output)
Decision rule: Do not reject H0 if F-stat ≤ 2.102
F-Test Two-Sample for Variances (Data Analysis (Excel))
                       Over 40    Under 40
Mean                   2.461      3.086
Variance               0.836      0.373
Observations           23         21
df                     22         20
F-stat                 2.240
P(F<=f) one-tail       0.0373
F Critical one-tail    2.102
F-stat = 2.24 (See Excel output)
p-value = 0.0373
Statistical conclusion
Since F-stat = 2.24 > F-crit = 2.102, there is sufficient sample evidence at the
5% level of significance to reject H0 in favour of H1.
Management conclusion
Therefore conclude, with 95% confidence, that 'over 40' employees do indeed
exercise more 'erratically' (i.e. show significantly greater variability in exercise times)
than 'under 40' employees.
Exercise 9.28
File: X9.28 - attrition rate.xlsx
Random variable: Attrition rates (%) per call center per month
H0: σ²(fin) = σ²(health) (Management question in H0)
H1: σ²(fin) ≠ σ²(health)
Region of Acceptance of H0:
Find only F-crit (lower) or F-crit (upper)
This depends on whether the smaller or the larger sample variance is in the numerator of F-stat
OR
For an F-crit (upper) only:
F-stat = Larger variance / Smaller variance
For an F-crit (lower) only:
F-stat = Smaller variance / Larger variance
F-Test Two-Sample for Variances (upper tailed set-up)
                       Health     Financial
Mean                   5.46       6.13
Variance               1.1413     0.6653
Observations           17         21
df                     16         20
F-stat                 1.715      (=1.1413/0.6653)
P(F<=f) one-tail       0.1263
F-crit (upper)         2.184
F-Test Two-Sample for Variances (lower tailed set-up)
                       Financial  Health
Mean                   6.13       5.46
Variance               0.6653     1.1413
Observations           21         17
df                     20         16
F-stat                 0.583      (=0.6653/1.1413)
P(F<=f) one-tail       0.1263
F-crit (lower)         0.458
F-crit (upper) = F(0.05,16,20) = 2.184
F-crit (lower) = 1/F(0.05,16,20) = 0.458 (or F(0.95,20,16))
Decision rule:
For upper tailed test: Do not reject H0 if F-stat ≤ 2.184
OR
For lower tailed test: Do not reject H0 if F-stat ≥ 0.458
Now F-stat = 1.715 (for an upper tailed test); or F-stat = 0.583 (for a lower tailed test)
Conclusion (based on an upper tailed F-test)
Since F-stat = 1.715 < F-crit (upper) = 2.184 (and hence lies within the acceptance region), there
is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1.
Same Conclusion (based on a lower tailed F-test)
Since F-stat = 0.583 > F-crit (lower) = 0.458 (and hence lies within the acceptance region), there
is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1.
Management conclusion
With 95% confidence, it can be concluded that there is no significant difference in the variability
in attrition rates between the two sectors (financial and health).
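The upper tailed and lower tailed set-ups always give the same decision, because both the F-stat and the critical value are simply inverted. The Python sketch below illustrates this with the sample variances from this exercise; the variable names are illustrative and the upper critical value is the F(0.05,16,20) figure quoted in the solution.

```python
var_health, var_fin = 1.1413, 0.6653
f_crit_upper = 2.184                 # F(0.05,16,20), taken from the solution above

f_upper = var_health / var_fin       # larger variance / smaller variance
f_lower = var_fin / var_health       # smaller variance / larger variance
f_crit_lower = 1 / f_crit_upper      # lower critical value is the reciprocal

upper_decision = f_upper <= f_crit_upper   # True means: do not reject H0
lower_decision = f_lower >= f_crit_lower   # True means: do not reject H0
print(round(f_upper, 3), round(f_lower, 3), round(f_crit_lower, 3),
      upper_decision, lower_decision)
```

Since f_lower = 1/f_upper and f_crit_lower = 1/f_crit_upper, the inequality f_upper ≤ f_crit_upper holds exactly when f_lower ≥ f_crit_lower.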
CHAPTER 10
CHI-SQUARE HYPOTHESIS TESTS
Exercise 10.1
Purpose: To test whether there is any statistically significant association between
the outcomes of two categorical variables. Stated differently, are the outcomes
associated with two categorical variables independent of each other or not?
Exercise 10.2
Categorical (nominal or ordinal-scaled) data
Exercise 10.3
H0: There is no statistical association between the two categorical variables
Exercise 10.4
Expected frequencies represent the null hypothesis of no association (or statistical
independence ) between the two categorical variables.
Exercise 10.5
χ²-crit (0.05,6) = 12.592
χ²-crit (0.10,6) = 10.645
Exercise 10.6
(a)
File:
X10.6 - motivation status.xlsx
Row Percentages
           Motivation level
           High    Moderate    Low     Total
Male       26.7    26.7        46.7    100
Female     47.5    30.0        22.5    100
Total      38.6    28.6        32.9    100
Interpretation (by inspection)
When compared to the general population profile, males tend to have low levels of
motivation, while females tend to be more highly motivated. It appears
therefore that a statistical association exists between gender and motivation level.
(b)
H0: There is no association between Gender and Motivation level
H1: There is an association between Gender and Motivation level
Region of Rejection
(Use α = 0.10 with degrees of freedom = (2-1)(3-1) = 2)
χ²-crit = χ²(0.10)(2) = 4.605
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 4.605
χ²-stat
Observed frequencies (fo)
           High     Moderate    Low      Total
Male       8        8           14       30
Female     19       12          9        40
Total      27       20          23       70
Expected frequencies (fe)
           High     Moderate    Low      Total
Male       11.57    8.57        9.86     30
Female     15.43    11.43       13.14    40
Total      27       20          23       70
Chi-Squared components
           High     Moderate    Low
Male       1.102    0.038       1.741
Female     0.827    0.029       1.306
χ²-stat = 5.0428
Conclusion
Since χ²-stat = 5.0428 > χ²-crit = 4.605, there is sufficient sample evidence at the
10% significance level to reject H0 in favour of H1. Therefore conclude that there is a
statistical association between the gender of an employee and their level of motivation.
The nature of the relationship is described in (a) above.
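The expected frequencies and the χ²-stat follow directly from the observed frequency table: each expected frequency is (row total × column total)/n. A minimal Python sketch using the observed frequencies from this exercise (the function name chi_square_independence is illustrative):

```python
def chi_square_independence(observed):
    """Return (expected frequencies, chi-squared statistic, df) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # fe = (row total * column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((fo - fe) ** 2 / fe
               for obs_row, exp_row in zip(observed, expected)
               for fo, fe in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df

# Rows: Male, Female; columns: High, Moderate, Low motivation
observed = [[8, 8, 14], [19, 12, 9]]
expected, chi2_stat, df = chi_square_independence(observed)
print([[round(fe, 2) for fe in row] for row in expected], round(chi2_stat, 4), df)
```

The same function can be applied to any of the observed frequency tables in this chapter's tests for independence or equality of proportions.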
Exercise 10.7
(a)
File:
X10.7 - internet shopping.xlsx
Row Percentages
             Internet shopping
             Yes     No      Total
full-time    24.3    75.7    100
at-home      18.5    81.5    100
Total        20.8    79.2    100
Interpretation (by inspection)
Since the row percentage profiles (between full-time employed and at-home customers) are
very similar to each other and to the general population profile, it can be concluded, by
observation, that the two attributes are not associated (i.e. they are statistically independent).
(b)
H0: There is no association between Employment Status and Use of Internet Shopping
H1: There is an association between Employment Status and Use of Internet Shopping
Region of Rejection
(Use α = 0.05 with degrees of freedom = (2-1)(2-1) = 1)
χ²-crit = χ²(0.05)(1) = 3.841
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 3.841
χ²-stat
Observed frequencies (fo)
             Yes    No     Total
full-time    35     109    144
at-home      40     176    216
Total        75     285    360
Expected frequencies (fe)
             Yes    No     Total
full-time    30     114    144
at-home      45     171    216
Total        75     285    360
Chi-Squared components
             Yes       No
full-time    0.8333    0.2193
at-home      0.5556    0.1462
χ²-stat = 1.7544
Conclusion
Since χ²-stat = 1.7544 < χ²-crit = 3.841, there is insufficient sample evidence at the
5% significance level to reject H0 in favour of H1. Therefore conclude that there is
no statistical association between the employment status of a customer and their
use of the internet for shopping purposes.
These two variables are statistically independent.
Exercise 10.8
(a)
File:
X10.8 - car size.xlsx
Row Percentages
            Car sizes bought
            Small    Medium    Large    Total
Under 30    15.2     33.3      51.5     100
30 - 45     21.1     36.8      42.1     100
Over 45     37.5     29.2      33.3     100
Total       26.3     33.0      40.7     100
Interpretation (by inspection)
With reference to the general population profile (Total row %),
under 30's tend to prefer larger cars; 30-45 year age car buyers marginally tend
towards medium to large cars, while over 45's strongly tend to prefer smaller cars.
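The row percentage table used for this inspection can be derived from the observed frequencies listed in part (b). A small Python sketch, assuming those observed counts (the function name row_percentages is illustrative):

```python
def row_percentages(observed):
    """Return row percentages for each row, plus a final 'Total' row profile."""
    column_totals = [sum(col) for col in zip(*observed)]
    rows = [list(row) for row in observed] + [column_totals]
    return [[round(100 * count / sum(row), 1) for count in row] for row in rows]

# Rows: Under 30, 30 - 45, Over 45; columns: Small, Medium, Large
observed = [[10, 22, 34], [24, 42, 48], [45, 35, 40]]
pct = row_percentages(observed)
for label, row in zip(["Under 30", "30 - 45", "Over 45", "Total"], pct):
    print(label, row)
```

Each age group's profile is then compared against the last row (the general population profile) to describe the nature of any association.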
(b)
H0: There is no association between Age of car buyer and Car Size bought.
H1: There is an association between Age of car buyer and Car Size bought.
Region of Rejection
(Use α = 0.01 with degrees of freedom = (3-1)(3-1) = 4)
χ²-crit = χ²(0.01)(4) = 13.277
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 13.277
χ²-stat
Observed frequencies (fo)
            Small    Medium    Large    Total
Under 30    10       22        34       66
30 - 45     24       42        48       114
Over 45     45       35        40       120
Total       79       99        122      300
Expected frequencies (fe)
            Small    Medium    Large    Total
Under 30    17.38    21.78     26.84    66
30 - 45     30.02    37.62     46.36    114
Over 45     31.6     39.6      48.8     120
Total       79       99        122      300
Chi-Squared components
            Small    Medium    Large
Under 30    3.134    0.002     1.910
30 - 45     1.207    0.510     0.058
Over 45     5.682    0.534     1.587
χ²-stat = 14.6247
Conclusion
Since χ²-stat = 14.6247 > χ²-crit = 13.277, there is sufficient sample evidence at the
1% significance level to reject H0 in favour of H1. Therefore conclude that there is a
statistical association between the age of a car buyer and the size of car bought.
(c)
Management Conclusion
The nature of the statistical relationship found in (b) is described in (a) above.
Under 30's tend to prefer larger cars; 30-45 year age car buyers marginally tend
towards medium to large cars, while over 45's strongly tend to prefer smaller cars.
Recommendation
Target larger cars to the younger market and smaller cars to the older market.
Exercise 10.9
(a)
File:
X10.9 - sports readership.xlsx
Let πi = proportion of people who read Sports News in each of the i regions.
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
Sample proportions
E Cape    W Cape    KZN
0.160     0.104     0.250
(b)
Region of Rejection
Use α = 0.01 with degrees of freedom = (2-1)(3-1) = 2
χ²-crit = χ²(0.01)(2) = 9.21
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 9.21
χ²-stat
Observed frequencies (fo)
         E Cape    W Cape    KZN      Total
No       84        86        78       248
Yes      16        10        26       52
Total    100       96        104      300
Expected frequencies (fe)
         E Cape    W Cape    KZN      Total
No       82.67     79.36     85.97    248
Yes      17.33     16.64     18.03    52
Total    100       96        104      300
Chi-Squared components
         E Cape    W Cape    KZN
No       0.0215    0.5556    0.7395
Yes      0.1026    2.6496    3.5267
χ²-stat = 7.5954
Conclusion
Since χ²-stat = 7.5954 < χ²-crit = 9.21, there is insufficient sample evidence at the
1% significance level to reject H0 in favour of H1. Therefore conclude that the
proportion of people who read Sports News is the same in each Geographical Region.
Reading Sports News and Geographical Region are statistically independent.
(c)
H0: There is no association between the propensity to read Sports News and Region
H1: There is an association between the propensity to read Sports News and Region
Exercise 10.10
(a)
File:
X10.10 - gym activity.xlsx
Row Percentages
          Gym activity
          Spinning    Swimming    Circuit    Total
Male      42.35       22.35       35.29      100
Female    52.73       29.09       18.18      100
Total     46.43       25.00       28.57      100
Interpretation (by inspection)
When compared to the general population of gym goers, more females than males
tend to prefer spinning and swimming; while more males relative to females tend to
prefer mainly doing the circuit.
The evidence is however not strongly convincing.
(b)
H0: There is no association between Gender and preferred Gym Activity
H1: There is an association between Gender and preferred Gym Activity
Region of Rejection
Use α = 0.10 with degrees of freedom = (2-1)(3-1) = 2
χ²-crit = χ²(0.10)(2) = 4.605
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 4.605
χ²-stat
Observed frequencies (fo)
          Spinning    Swimming    Circuit    Total
Male      36          19          30         85
Female    29          16          10         55
Total     65          35          40         140
Expected frequencies (fe)
          Spinning    Swimming    Circuit    Total
Male      39.46       21.25       24.29      85
Female    25.54       13.75       15.71      55
Total     65          35          40         140
Chi-Squared components
          Spinning    Swimming    Circuit
Male      0.304       0.238       1.345
Female    0.470       0.368       2.078
χ²-stat = 4.803
Conclusion
Since χ²-stat = 4.803 > χ²-crit = 4.605 (marginally), there is sufficient sample evidence
at the 10% significance level to reject H0 in favour of H1. Therefore conclude that
gender and preferred gym activity are associated.
The nature of the relationship is described in (a) above.
(c)
New rejection region
Use α = 0.05 with degrees of freedom = (2-1)(3-1) = 2
χ²-crit = χ²(0.05)(2) = 5.991
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 5.991
New decision
Do not reject H0 at the 5% significance level, since χ²-stat = 4.803 < χ²-crit = 5.991.
New conclusion
There is no statistical association between gender and gym activity
(i.e. they are statistically independent) at the 5% level of significance.
(d)
Let πi = proportion of females who prefer each gym activity i (spinning, swimming, circuit)
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
Sample proportions
Spin     Swim     Circuit
0.446    0.457    0.250
Statistical conclusion
The same statistical conclusion applies as in (b) (i.e. a statistical association exists)
or stated differently: at least one population proportion is different.
Management conclusion
By an inspection of the row percentage table in (a), it can be concluded that
spinning and swimming are the most preferred gym activities of females,
while doing the circuit is the least preferred gym activity of females.
Exercise 10.11
(a)
File:
X10.11 - supermarket visits.xlsx
Categorical Frequency Table - Supermarket Visits
Visits         Customers    Percent    Belief %
Daily          36           20.0       25
3 / 4 times    55           30.6       35
Twice          62           34.4       30
Once only      27           15.0       10
Total          180          100        100
Interpretation (by inspection)
Once-a-week visits are the least common shopping behaviour (only 15%).
Only one-in-five shoppers (20%) shop daily.
The most common shopping pattern is either 3/4 times a week (30.6%) or twice a week (34.4%).
These two shopping behaviours represent 65% of all the sampled shoppers.
(b)
Goodness-of-fit test for an empirical distribution
H0: The frequency of store visits per week is as per the manager's belief.
H1: The frequency of store visits per week differs significantly from the manager's belief.
Region of Rejection
(Use α = 0.05 with degrees of freedom = (4-1) = 3)
χ²-crit = χ²(0.05)(3) = 7.815
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 7.815
χ²-stat
               Customers
Visits         fo     %fe    fe     χ²-stat
Daily          36     25     45     1.8
3 / 4 times    55     35     63     1.016
Twice          62     30     54     1.185
Once only      27     10     18     4.5
Total          180    100    180    8.501
Conclusion
Since χ²-stat = 8.501 > χ²-crit = 7.815, there is sufficient sample evidence
at the 5% significance level to reject H0 in favour of H1. Therefore conclude that
the shopping frequency of customers differs significantly from the manager's belief.
The nature of the relationship is described in (a) above.
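In a goodness-of-fit test the expected frequencies come from applying the hypothesised percentages to the sample size, and the degrees of freedom are (number of categories - 1). A minimal Python sketch using the survey counts and the manager's belief percentages from this exercise (the function name chi_square_goodness_of_fit is illustrative):

```python
def chi_square_goodness_of_fit(observed, expected_pct):
    """Return (expected frequencies, components, chi-squared statistic, df)."""
    n = sum(observed)
    expected = [n * pct / 100 for pct in expected_pct]
    components = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
    return expected, components, sum(components), len(observed) - 1

# Categories: Daily, 3 / 4 times, Twice, Once only
observed = [36, 55, 62, 27]
belief_pct = [25, 35, 30, 10]
expected, components, chi2_stat, df = chi_square_goodness_of_fit(observed, belief_pct)
print(expected, [round(c, 3) for c in components], round(chi2_stat, 3), df)
```

The largest component (4.5) belongs to the 'Once only' category, which is where the survey departs most from the manager's belief.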
(c)
Interpretation
The manager's belief is that customers shop more frequently during a week.
The survey evidence, however, shows that customers shop less
frequently than the manager believes. For example, the survey found that
only 51% shop 3 or more times per week, while the manager assumed that this
percentage was 60%. Similarly, more customers prefer to shop only once or twice a
week (49%) compared to the manager's belief that this percentage was only 40%.
These differences are significant at the 5% level, but only marginally so.
Exercise 10.12
File:
X10.12 - equity portfolio.xlsx
Goodness-of-fit test for an empirical distribution
H0: There is no change in the equity portfolio mix between 2008 and 2012.
H1: There is a significant change in the equity portfolio mix between 2008 and 2012.
Region of Rejection
(Use α = 0.05 with degrees of freedom = (4-1) = 3)
χ²-crit = χ²(0.05)(3) = 7.815
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 7.815
χ²-stat
              Equities
Equity        fo (2012)    Ratio fe    fe (2008)    χ²-stat
Mining        900          2           900          0
Industrial    1400         3           1350         1.852
Retail        400          1           450          5.556
Financial     1800         4           1800         0
Total         4500         10          4500         7.407
Conclusion
Since χ²-stat = 7.407 < χ²-crit = 7.815, there is insufficient sample evidence
at the 5% significance level to reject H0 in favour of H1. Therefore conclude that
there is no significant change in the equity portfolio mix of the investor
between 2008 and 2012.
The equity portfolio profile is essentially the same in 2012 as it was in 2008.
Exercise 10.13
(a)
File:
X10.13 - payment method.xlsx
Goodness-of-fit test for an empirical distribution
H0: There is no change in the payment method for electronic goods.
H1: There is a significant change in the payment method for electronic goods.
Region of Rejection
Use α = 0.05 with degrees of freedom = (3-1) = 2
χ²-crit = χ²(0.05)(2) = 5.991
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 5.991
χ²-stat
               Payment Methods
Payment        fo     % fe    fe     χ²-stat
Cash           41     23      46     0.543
Debit Card     49     35      70     6.3
Credit Card    110    42      84     8.048
Total          200    100     200    14.891
Conclusion
Since χ²-stat = 14.891 >> χ²-crit = 5.991, there is sufficient sample evidence
at the 5% significance level to reject H0 in favour of H1. Therefore conclude that
there is a significant shift in payment practices from the past.
(b)
Management Interpretation
There is a significant shift in payment practices for electronic goods.
There is more emphasis on credit card payment (55%) today than in the past (42%).
Exercise 10.14
(a)
File:
XS10.14 - package sizes.xlsx
Goodness-of-fit test for an empirical distribution
H0: Limpopo sales pattern follows the national sales pattern.
H1: Limpopo sales pattern does not follow the national sales pattern.
Region of Rejection
(Use α = 0.05 with degrees of freedom = (3-1) = 2)
χ²-crit = χ²(0.05)(2) = 5.991
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 5.991
χ²-stat
           Package Size Sales
Package    fo     Ratio fe    fe     χ²-stat
Large      190    3           162    4.840
Midsize    250    5           270    1.481
Small      100    2           108    0.593
Total      540    10          540    6.914
Conclusion
Since χ²-stat = 6.914 > χ²-crit = 5.991, there is sufficient sample evidence
at the 5% significance level to reject H0 in favour of H1. Therefore conclude that
the Limpopo sales pattern of cereal package sizes differs significantly from the
national sales pattern of package sizes sold.
(b)
Management Interpretation
The Limpopo sales pattern differs significantly from the national sales pattern
of package sizes sold. By an inspection of the Limpopo sales profile relative to the
national pattern, Limpopo tends to sell proportionally more large sized packages
(35.2% of Limpopo sales versus 30% nationally).
Exercise 10.15
(a)
File:
X10.15 - compensation plan.xlsx
Column Percentages
           Regions
Plan       Cape    Gauteng    Free State    KZN     Total
Present    62      75.7       67.1          72.7    70.8
New        38      24.3       32.9          27.3    29.2
Total      100     100        100           100     100
Interpretation (by inspection)
When compared to the national population profile of all employees,
the present compensation plan enjoys more support in Gauteng, and the
least support in the Cape Province where the new plan is favoured more.
The evidence is however not overwhelming (i.e. the profile differences are not large)
(b)
Let πi = proportion of sales staff in favour of the present payment plan in each province i .
H0: π1 = π2 = π3 = π4
H1: At least one πi is different (i = 1,2,3,4)
Sample proportions
Cape    Gauteng    Free State    KZN
0.62    0.757      0.671         0.727
(c)
Region of Rejection
Use α = 0.10 with degrees of freedom = (2-1)(4-1) = 3
χ²-crit = χ²(0.10)(3) = 6.251
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 6.251
χ²-stat
Observed frequencies (fo)
Plan       Cape      Gauteng    Free State    KZN       Total
Present    62        140        47            80        329
New        38        45         23            30        136
Total      100       185        70            110       465
Expected frequencies (fe)
Plan       Cape      Gauteng    Free State    KZN       Total
Present    70.75     130.89     49.53         77.83     329
New        29.25     54.11      20.47         32.17     136
Total      100       185        70            110       465
Chi-Squared components
Plan       Cape      Gauteng    Free State    KZN
Present    1.0828    0.6337     0.1289        0.0606
New        2.6194    1.5330     0.3119        0.1466
χ²-stat = 6.5169
Conclusion
Since χ²-stat = 6.5169 > χ²-crit = 6.251 (marginally), there is sufficient sample evidence
at the 10% significance level to reject H0 in favour of H1. Therefore conclude that
the support for the present compensation plan is different in at least one of the provinces.
The nature of the relationship is described in (a) above.
(d)
Formulate as a test for independence of association between payment plan and province.
H0: There is no association between payment plan preference and province.
H1: There is an association between payment plan preference and province.
(e)
New rejection region
Use α = 0.05 with degrees of freedom = (2-1)(4-1) = 3
χ²-crit = χ²(0.05)(3) = 7.815
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 7.815
New decision
Do not reject H0 at the 5% significance level,
since χ²-stat = 6.5169 < χ²-crit = 7.815.
New conclusion
There is no statistical association between payment plan preference and province
(i.e. they are statistically independent) at the 5% level of significance.
The sample evidence is not strong enough (i.e. the sample proportion differences are
not great enough) to reject H0 in favour of H1 at the 5% level of significance.
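The change of decision between the 10% and 5% significance levels comes purely from the change in critical value; the χ²-stat itself is unchanged. A short Python sketch of the comparison, using the χ²-stat and the two critical values quoted in this exercise (the variable names are illustrative):

```python
chi2_stat = 6.5169
# χ²-crit for df = 3 at each significance level, taken from the solution above
chi2_crit = {0.10: 6.251, 0.05: 7.815}

decisions = {
    alpha: ("Reject H0" if chi2_stat >= crit else "Do not reject H0")
    for alpha, crit in chi2_crit.items()
}
for alpha, decision in decisions.items():
    print(f"alpha = {alpha:.2f}: chi2-crit = {chi2_crit[alpha]}, {decision}")
```

A result that is significant at 10% but not at 5% is best reported as weak evidence of an association.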
Exercise 10.16
(a)
File:
X10.16 - tyre defects.xlsx
Row Percentages
             Nature of defective tyre
             technical    mechanical    material    Total
Morning      22.1         61.8          16.2        100
Afternoon    30.2         46.5          23.3        100
Night        42.6         36.8          20.6        100
Total        31.5         48.2          20.3        100
Interpretation (by inspection)
When compared to the total production of defective tyres (total row %),
tyre defects due to mechanical problems tend to be more prevalent during morning shifts.
Technical defects however tend to be more prevalent during the night shift.
Thus there does appear to be an association between shift and nature of tyre defects.
(b)
Formulate as a test for independence of association between nature of defect and shift
H0: There is no association between nature of tyre defect and shift.
H1: There is an association between nature of tyre defect and shift.
Region of Rejection
Use α = 0.05 with degrees of freedom = (3-1)(3-1) = 4
χ²-crit = χ²(0.05)(4) = 9.488
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 9.488
χ²-stat
Observed frequencies (fo)
             technical    mechanical    material    Total
Morning      15           42            11          68
Afternoon    26           40            20          86
Night        29           25            14          68
Total        70           107           45          222
Expected frequencies (fe)
             technical    mechanical    material    Total
Morning      21.44        32.77         13.78       68
Afternoon    27.12        41.45         17.43       86
Night        21.44        32.77         13.78       68
Total        70           107           45          222
Chi-Squared components
             technical    mechanical    material
Morning      1.9351       2.5967        0.5622
Afternoon    0.0460       0.0508        0.3782
Night        2.6646       1.8443        0.0034
χ²-stat = 10.0812
(c)
Conclusion
Since χ²-stat = 10.0812 > χ²-crit = 9.488, there is sufficient sample evidence
at the 5% significance level to reject H0 in favour of H1. Therefore conclude that
the nature of tyre defects produced is related to the shift on which the defects occur.
The nature of the relationship is described in (a) above.
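The test above can be reproduced outside Excel. The following is a minimal Python sketch, not the author's method: the function name chi_squared_independence is hypothetical, the observed counts are those of the table above, and the critical value is hard-coded from the solution rather than computed.

```python
# Observed defect counts by shift, taken from the observed frequency table.
# Column order: technical, mechanical, material.
observed = {
    "Morning": [15, 42, 11],
    "Afternoon": [26, 40, 20],
    "Night": [29, 25, 14],
}


def chi_squared_independence(table):
    """Return (chi2_stat, degrees_of_freedom, expected) for a two-way table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    # fe = (row total x column total) / grand total
    expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(rows, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


chi2_stat, df, expected = chi_squared_independence(observed)
CHI2_CRIT = 9.488  # chi-squared(0.05)(4), read from the table as in the solution
reject_h0 = chi2_stat >= CHI2_CRIT
print(f"chi2-stat = {chi2_stat:.4f}, df = {df}, reject H0: {reject_h0}")
```

The same function can be applied to any of the two-way tables in this chapter by changing the observed dictionary and the critical value.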
(d)
Let πi = proportion of defective tyres caused by mechanical factors per shift i .
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
The hypothesis test procedure is identical to (b) above.
The sample proportions being compared are:
Morning      0.618
Afternoon    0.465
Night        0.368
Conclusion
Since H0 is rejected in favour of H1 at the 5% significance level, it can be concluded that there
is at least one shift that has a different proportion of defective tyres due to mechanical factors.
Based on the row percentages table in (a) above, it is clear that the morning shift
produces a proportionally larger percentage of defective tyres due to mechanical
factors than the afternoon or night shifts.
Exercise 10.17
(a)
File: X10.17 - flight delays.xlsx
Histogram
Delays (min)   Count
<5             0
5 - 7.5        8
7.5 - 10       28
10 - 12.5      29
12.5 - 15      13
15 - 17.5      2
Total          80
[Figure: Histogram of Flight Delay Times; x-axis: Delay Intervals (minutes); y-axis: No. of flights]
Interpretation
Flight delay times (in minutes) appear to be normally distributed.
(b)
Descriptive Statistics
flight delays
Mean                 10.324
Standard Error       0.261
Median               10.3
Mode                 11
Standard Deviation   2.333
Sample Variance      5.444
Kurtosis             -0.550
Skewness             0.084
Range                10.3
Minimum              5.3
Maximum              15.6
Sum                  825.9
Count                80
Interpretation
The low skewness coefficient (0.084) indicates approximate normality.
(c) (i) Goodness-of-fit test for Normality
H0: Flight time delays (in minutes) follow a normal distribution with μ = 10.324 min and σ = 2.333 min.
H1: Flight time delays (in minutes) do not follow a normal distribution with μ = 10.324 min and σ = 2.333 min.
Region of Rejection
Use α = 0.01 with degrees of freedom = (7-2-1) = 4
χ²-crit = χ²(0.01)(4) = 13.277
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 13.277
Expected frequencies using x ≡ N(10.324; 2.333) from Z-table
Delay Intervals (x)   Normal probability interval (z)   Probability   fo   fe      χ²-stat
-∞ < x < 5            P(z < -2.282)                     0.01130       0    0.90    0.9040
5 < x < 7.5           P(-2.282 < z < -1.21)             0.10180       8    8.14    0.0025
7.5 < x < 10          P(-1.21 < z < -0.139)             0.33120       28   26.50   0.0854
10 < x < 12.5         P(-0.139 < z < 0.933)             0.37950       29   30.36   0.0609
12.5 < x < 15         P(0.933 < z < 2.004)              0.15400       13   12.32   0.0375
15 < x < 17.5         P(2.004 < z < 3.076)              0.02117       2    1.69    0.0554
17.5 < x < +∞         P(z > 3.076)                      0.00103       0    0.08    0.0824
Total                                                   1             80   80      1.2282
χ²-stat = 1.2282
Conclusion
Since χ²-stat = 1.2282 << χ²-crit = 13.277, there is insufficient sample evidence
at the 1% significance level to reject H0 in favour of H1. Therefore conclude that
flight delay times (in minutes) follow a normal distribution with μ = 10.324 min and σ = 2.333 min.
(c) (ii)
Interval           Excel formula                                                   NORMDIST()   From Z-table
P(x < 5)           =NORMDIST(5,10.324,2.333,1)                                     0.01124      0.01130
P(5 < x < 7.5)     =NORMDIST(7.5,10.324,2.333,1) - NORMDIST(5,10.324,2.333,1)      0.10181      0.10180
P(7.5 < x < 10)    =NORMDIST(10,10.324,2.333,1) - NORMDIST(7.5,10.324,2.333,1)     0.33172      0.33120
P(10 < x < 12.5)   =NORMDIST(12.5,10.324,2.333,1) - NORMDIST(10,10.324,2.333,1)    0.37974      0.37950
P(12.5 < x < 15)   =NORMDIST(15,10.324,2.333,1) - NORMDIST(12.5,10.324,2.333,1)    0.15297      0.15400
P(15 < x < 17.5)   =NORMDIST(17.5,10.324,2.333,1) - NORMDIST(15,10.324,2.333,1)    0.02147      0.02117
P(x > 17.5)        =1-NORMDIST(17.5,10.324,2.333,1)                                0.00105      0.00103
Total                                                                              1            1
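The interval probabilities and the goodness-of-fit statistic can also be sketched in Python. This is an illustrative sketch under stated assumptions: the helper normal_cdf is a hypothetical name that plays the role of NORMDIST(x, μ, σ, 1) using math.erf, and the critical value is hard-coded from (c)(i). Because it uses exact normal probabilities rather than rounded Z-table values, the resulting χ²-stat differs slightly from 1.2282.

```python
import math

MU, SIGMA = 10.324, 2.333               # sample mean and standard deviation from (b)
EDGES = [5, 7.5, 10, 12.5, 15, 17.5]    # interval limits in minutes
OBSERVED = [0, 8, 28, 29, 13, 2, 0]     # fo per interval, including both open tails


def normal_cdf(x, mu, sigma):
    """Cumulative normal probability P(X < x)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Cumulative probabilities at each limit, padded with 0 and 1 for the open tails.
cdf = [0.0] + [normal_cdf(e, MU, SIGMA) for e in EDGES] + [1.0]
probs = [upper - lower for lower, upper in zip(cdf, cdf[1:])]

n = sum(OBSERVED)
expected = [n * p for p in probs]
chi2_stat = sum((fo - fe) ** 2 / fe for fo, fe in zip(OBSERVED, expected))

CHI2_CRIT = 13.277  # chi-squared(0.01)(4), as used in (c)(i)
for p, fe in zip(probs, expected):
    print(f"{p:.5f}  {fe:6.2f}")
print(f"chi2-stat = {chi2_stat:.4f}, reject H0: {chi2_stat >= CHI2_CRIT}")
```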
Exercise 10.18
(a)
File: newspaper sections.xlsx
Two-way Pivot table of Gender by Newspaper Section Most Preferred to Read.
                          Section
Gender         Data       Sport    Social   Business   Grand Total
Female         Count      14       28       18         60
               Row %      23.3%    46.7%    30.0%      100%
Male           Count      41       35       44         120
               Row %      34.2%    29.2%    36.7%      100%
Total Count               55       63       62         180
Total Row %               30.6%    35.0%    34.4%      100%
Interpretation (by inspection)
Females tend to read the Social section most and the Sport section least.
Males, by contrast, are marginally more interested in the Sport and Business sections.
These observations by inspection are, however, not conclusive.
(b)
Stacked Bar Chart
[Figure: Newspaper Section Read by Gender; 100% stacked bars for Female and Male; segments: Sport (23.3%, 34.2%), Social (46.7%, 29.2%), Business (30.0%, 36.7%)]
(c)
Formulate as a test for independence of association between gender and section read .
H0: There is no association between gender and the newspaper section most preferred.
H1: There is an association between gender and the newspaper section most preferred.
Region of Rejection
Use α = 0.10 with degrees of freedom = (2-1)(3-1) = 2
χ²-crit = χ²(0.10)(2) = 4.605
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 4.605
χ²-stat
Observed frequencies (fo)
          Sport   Social   Business   Total
female    14      28       18         60
male      41      35       44         120
Total     55      63       62         180
Expected frequencies (fe)
          Sport   Social   Business   Total
female    18.3    21.0     20.7       60
male      36.7    42.0     41.3       120
Total     55      63       62         180
Chi-Squared components
          Sport    Social   Business
female    1.0242   2.3333   0.3441
male      0.5121   1.1667   0.1720
χ²-stat = 5.5525
Conclusion
Since χ²-stat = 5.5525 > χ²-crit = 4.605 (marginal), there is sufficient sample evidence
at the 10% significance level to reject H0 in favour of H1. Therefore conclude that
gender and the newspaper section most preferred are statistically associated.
The nature of the relationship is described in (a) above.
(d)
Let πi = proportion of readers most preferring newspaper section i who are female.
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
The hypothesis test procedure is identical to (c) above.
The sample proportions being compared are:
Sport       0.255
Social      0.444
Business    0.290
Conclusion
Since H0 is rejected in favour of H1 at the 10% significance level, it can be concluded that
there is at least one newspaper section that females prefer differently to the other sections.
Based on the row percentages table in (a) above, it is clear that
females tend to read the Social section most, with the Sports section read the least.
Exercise 10.19
(a)
File: X10.19 - vehicle financing.xlsx
One-way Pivot table of Car Loan Sizes
Loan size      Count of Loan size   % of Total
Under 100      18                   6%
100 - <150     58                   19%
150 - <200     110                  37%
200 - <250     70                   23%
Above 250      44                   15%
Grand Total    300                  100%
Interpretation (by inspection)
The most popular car loan size was between R150 000 and R200 000 (37% of all
applications) followed by car loan sizes of between R200 000 and R250 000 (23%).
Only 6% of loan applications were for amounts below R100 000.
(b)
Bar Chart of Car Loan Applications
[Figure: bar chart; x-axis: Size of Loan Applications (R 1000); y-axis: proportion of applications; bars: Under 100 = 0.06, 100 - <150 = 0.19, 150 - <200 = 0.37, 200 - <250 = 0.23, Above 250 = 0.15]
(c)
Test for Goodness-of-Fit for an empirical distribution.
H0: There is no change in the size of car loan applications from four years ago.
H1: There is a significant shift in the size of car loan applications from four years ago.
(d)
Region of Rejection
Use α = 0.05 with degrees of freedom = (5-1) = 4
χ²-crit = χ²(0.05)(4) = 9.488
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 9.488
χ²-stat
Loan Size      fo    % fe   fe    χ²-stat
< R100         18    10     30    4.800
R100 - R150    58    20     60    0.067
R150 - R200    110   40     120   0.833
R200 - R250    70    20     60    1.667
> R250         44    10     30    6.533
Total          300   100    300   13.900
Conclusion
Since χ²-stat = 13.9 > χ²-crit = 9.488, there is sufficient sample evidence
at the 5% significance level to reject H0 in favour of H1. Therefore conclude that
there has been a significant shift in the size of car loan applications from 4 years ago.
The shift has been towards larger car loan applications.
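The goodness-of-fit arithmetic for an empirical distribution is short enough to sketch in Python. The variable names below are illustrative only; the observed counts and the percentage pattern of four years ago are the fo and % fe values used in the solution, and the critical value is hard-coded from the decision rule.

```python
observed = [18, 58, 110, 70, 44]        # fo: current loan applications per size class
historical_pct = [10, 20, 40, 20, 10]   # % fe: pattern of four years ago

n = sum(observed)
# Expected frequency per class if the historical pattern still applied.
expected = [n * pct / 100 for pct in historical_pct]
components = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi2_stat = sum(components)

CHI2_CRIT = 9.488  # chi-squared(0.05)(4), as used in the decision rule
reject_h0 = chi2_stat >= CHI2_CRIT
print([round(c, 3) for c in components], round(chi2_stat, 3), reject_h0)
```

The two largest components come from the smallest and largest loan classes, which is what supports the remark that the shift has been towards larger loans.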
Exercise 10.20
(a)
File: X10.20 - milk products.xlsx
Two-way Pivot table of milk type purchased and health-conscious status of consumers
                          Question 1
Question 2     Data       Fat-free   Low fat   Full cream   Total
Yes            Count      20         15        11           46
               Percent    43.5%      32.6%     23.9%        100%
No             Count      5          10        15           30
               Percent    16.7%      33.3%     50.0%        100%
Grand Total    Count      25         25        26           76
               Percent    32.9%      32.9%     34.2%        100%
Interpretation (by inspection)
The sample evidence points strongly towards health-conscious consumers purchasing more
fat-free dairy products, while non-health-conscious consumers tend to purchase more
full cream dairy products.
(b)
Stacked Bar Chart
[Figure: Milk Type Purchased and Health-Conscious Status; x-axis: Milk Categories (Fat-free, Low fat, Full cream); y-axis: % of Consumers; series: Yes (43.5%, 32.6%, 23.9%) and No (16.7%, 33.3%, 50.0%)]
(c)
Formulate as a test for independence of association between milk type purchased and
health-conscious status of consumers.
H0: There is no association between milk type purchased and health-conscious status.
H1: There is an association between milk type purchased and health-conscious status.
Region of Rejection
(Use α = 0.05 with degrees of freedom = (3-1)(2-1) = 2)
χ²-crit = χ²(0.05)(2) = 5.991
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 5.991
χ²-stat
Observed frequencies (fo)
              Yes     No      Total
fat-free      20      5       25
low fat       15      10      25
full cream    11      15      26
Total         46      30      76
Expected frequencies (fe)
              Yes     No      Total
fat-free      15.13   9.87    25
low fat       15.13   9.87    25
full cream    15.74   10.26   26
Total         46      30      76
Chi-Squared components
              Yes     No
fat-free      1.566   2.402
low fat       0.001   0.002
full cream    1.426   2.186
χ²-stat = 7.583
Conclusion
Since χ²-stat = 7.583 > χ²-crit = 5.991, there is sufficient sample evidence
at the 5% significance level to reject H0 in favour of H1. Therefore conclude that
there is a significant statistical association between the health-conscious status
of consumers and their preference for certain types of dairy milk products.
The nature of the relationship is described in (a) above.
(d)
Let πi = proportion of consumers purchasing milk type i who are health-conscious.
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
The hypothesis test procedure is identical to (c) above.
The sample proportions being compared are:
fat-free      0.800
low fat       0.600
full cream    0.423
Conclusion
Since H0 is rejected in favour of H1 at the 5% significance level, it can be concluded that
there is at least one milk type that is purchased by a different proportion of health-conscious
consumers. Based on the row percentages table in (a) above, it is clear that health-conscious
consumers tend to purchase more fat-free dairy products than full cream products.
CHAPTER 11
ANALYSIS OF VARIANCE
COMPARING MEANS ACROSS MULTIPLE POPULATIONS
Exercise 11.1
The purpose of one-factor Anova is to test for equality of means across
multiple (more than two) populations.
Exercise 11.2
Example: Compare the output performance of five identical machines.
Exercise 11.3
Variation between groups measures how similar or how different
(i.e. how close or how far apart) the sample means are from each other.
It is a measure of the level of influence of the treatment factor on the
response measure. Any differences can be attributed to (or explained by)
the influence of the treatment factor on the numeric response measure.
Exercise 11.4
SST = 25.5
SSE = 204.6 - 25.5 = 179.1
Numerator degrees of freedom = (4 - 1) = 3
Denominator degrees of freedom = (4x10 - 4) = 36
MST = 25.5 / 3 = 8.5
MSE = 179.1 / 36 = 4.975
F-stat = 8.5 / 4.975 = 1.7085
Exercise 11.5
F-crit = F(0.05, 3, 36) = 2.866
In Excel, use =FINV(0.05,3,36)
Exercise 11.6
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs
Decision Rule: Do not reject H0 if F-stat ≤ F-crit.
Since F-stat = 1.7085 < F-crit = 2.866, do not reject H0.
Conclusion: All population means are equal.
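The arithmetic of Exercises 11.4 to 11.6 can be traced in a few lines of Python. This is a sketch only: the function name one_factor_f_stat is hypothetical, SST here denotes the treatment (between-groups) sum of squares as in Exercise 11.4, and F-crit is hard-coded from Exercise 11.5 rather than computed.

```python
def one_factor_f_stat(ss_treatment, ss_total, k, n_per_group):
    """Return (MST, MSE, F-stat, df1, df2) for k equal-sized groups."""
    df_num = k - 1                      # numerator degrees of freedom
    df_den = k * n_per_group - k        # denominator degrees of freedom
    ss_error = ss_total - ss_treatment  # SSE = total SS - treatment SS
    mst = ss_treatment / df_num
    mse = ss_error / df_den
    return mst, mse, mst / mse, df_num, df_den


mst, mse, f_stat, df1, df2 = one_factor_f_stat(25.5, 204.6, k=4, n_per_group=10)
F_CRIT = 2.866  # F(0.05, 3, 36) from Exercise 11.5
reject_h0 = f_stat > F_CRIT
print(f"MST = {mst:.3f}, MSE = {mse:.3f}, F-stat = {f_stat:.4f}, reject H0: {reject_h0}")
```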
Exercise 11.7
(a)
File: X11.7 - car fuel efficiency.xlsx
Sample Average Fuel Consumption (l/100km)
           Σxi     ni    x(bar)i
Peugot     32.4    5     6.48
VW         29.3    4     7.325
Ford       34.4    5     6.88
Grand mean = 96.1/14 = 6.864 litres / 100 km
[Figure: Bar Chart of Fuel Efficiency Means; x-axis: Car Types; bars: Peugot 6.48, VW 7.325, Ford 6.88]
(b)
One-Factor Anova
Factor = Car Type (1 = Peugot; 2 = VW; 3 = Ford)
Response measure = Fuel Consumption (l/100km)
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)
Region of Rejection
(Use α = 0.05 with df1 = 3-1 = 2 and df2 = 14-3 = 11)
F-crit = F(0.05)(2,11) = 3.98
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 3.98
F-stat
SSW = (7-6.48)²+(6.3-6.48)²+(6-6.48)²+(6.4-6.48)²+(6.7-6.48)²+
      (6.8-7.325)²+(7.4-7.325)²+(7.9-7.325)²+(7.2-7.325)²+
      (7.6-6.88)²+(6.8-6.88)²+(6.4-6.88)²+(7-6.88)²+(6.6-6.88)²
    = 2.0635
SSB = 5(6.48-6.864)²+4(7.325-6.864)²+5(6.88-6.864)²
    = 1.5886
SST = 2.0635 + 1.5886 = 3.6521
Anova Table
Source of Variation   SS       df   MS        F-stat
Between               1.5886   2    0.79432   4.234
Within                2.0635   11   0.18759
Total                 3.6521   13
Conclusion
Since F-stat = 4.234 > F-crit = 3.98, there is sufficient sample evidence at the
5% significance level to reject H0 in favour of H1. Therefore conclude that there is at
least one motor vehicle type with a different average fuel consumption to the rest.
By inspection, it would appear that VW has an average fuel consumption that is significantly
different (higher, and hence least fuel efficient) from Peugot and Ford.
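As a cross-check on the manual calculation, the one-factor Anova can be recomputed from the raw observations in Python. This sketch makes two assumptions: the data values are read off the SSW expression in (b), and the third Ford observation is taken as 6.4, which is the value consistent with the Ford total of 34.4 and with SSW = 2.0635. The function name one_factor_anova is hypothetical.

```python
# Fuel consumption (l/100km) per car type, read from the SSW expression.
# Assumption: the third Ford value is 6.4, consistent with the Ford sum of 34.4.
samples = {
    "Peugot": [7.0, 6.3, 6.0, 6.4, 6.7],
    "VW": [6.8, 7.4, 7.9, 7.2],
    "Ford": [7.6, 6.8, 6.4, 7.0, 6.6],
}


def one_factor_anova(groups):
    """Return the sums of squares, degrees of freedom and F-stat."""
    all_values = [x for g in groups.values() for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    # Between-groups SS weights each squared deviation by its sample size.
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    # Within-groups SS sums squared deviations from each group's own mean.
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)
    return {
        "SSB": ssb, "SSW": ssw, "SST": ssb + ssw,
        "df1": k - 1, "df2": n - k,
        "F": (ssb / (k - 1)) / (ssw / (n - k)),
    }


result = one_factor_anova(samples)
print({key: round(value, 4) for key, value in result.items()})
```

Note that SSB weights each squared deviation of a sample mean from the grand mean by that sample's size (5, 4 and 5 here).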
(c)
New Region of Rejection
(Use α = 0.01 with df1 = 3-1 = 2 and df2 = 14-3 = 11)
F-crit = F(0.01)(2,11) = 7.21
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 7.21
New decision
Do not reject H0 at the 1% significance level,
since F-stat = 4.234 < F-crit = 7.21
New conclusion
There is no statistical evidence, at the 1% significance level, to conclude that average
fuel consumption differs across motor vehicle types.
Note: At a 1% level of significance the sample evidence must be more convincing
(i.e. larger differences between sample means) before one is prepared to reject the
null hypothesis in favour of the alternative hypothesis.
Here the sample evidence is not yet strong enough (the differences between sample
means are not large enough) to reject the null hypothesis of equal means.
Exercise 11.8
(a)
File: X11.8 - package design.xlsx
Assumption 1: Equal population variances.
Assumption 2: A normally distributed population for the response variable.
One-Factor Anova
Factor = Package designs (1 = A; 2 = B; 3 = C)
Response measure = Carton sales (units)
Let μi = population mean sales of a breakfast cereal packaged in design shape i
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)
Region of Rejection
(Use α = 0.05 with df1 = 3-1 = 2 and df2 = 21-3 = 18)
F-crit = F(0.05)(2,18) = 3.55
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 3.55
F-stat (Refer to formulae in Chapter 11)
Sample evidence
            Sample means
Design A    35.75
Design B    32.86
Design C    34.17
Grand mean = 721/21 = 34.33
SSW = (35-35.75)²+(37-35.75)²+(39-35.75)²+(36-35.75)²+(30-35.75)²+(39-35.75)²+(36-35.75)²+(34-35.75)²+
      (35-32.86)²+(34-32.86)²+(30-32.86)²+(31-32.86)²+(34-32.86)²+(32-32.86)²+(34-32.86)²+
      (38-34.17)²+(34-34.17)²+(32-34.17)²+(34-34.17)²+(34-34.17)²+(33-34.17)²
    = 101.1905
SSB = 8(35.75-34.33)²+7(32.86-34.33)²+6(34.17-34.33)²
    = 31.4762
SST = 101.1905 + 31.4762 = 132.6667
Anova Table
Source of Variation   SS         df   MS        F-stat
Between               31.4762    2    15.7381   2.7995
Within                101.1905   18   5.6217
Total                 132.6667   20
Conclusion
Since F-stat = 2.7995 < F-crit = 3.55, there is insufficient sample evidence at the
5% significance level to reject H0 in favour of H1. Therefore conclude that there is
no difference in the mean volume of sales across the 3 package designs.
(b)
Recommendation
There is no strong statistical evidence to conclude that sales volumes differ across the
three package designs. All are likely to generate the same average sales. Therefore the cereal
producer can choose any of the three package designs for their new muesli cereal.
Exercise 11.9
(a)
File: X11.9 - bank service.xlsx
One-Factor Anova
Factor = Bank (1 = X; 2 = Y; 3 = Z)
Response measure = Rating score (1 to 10)
Let μi = population mean service rating score for bank i
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)
Region of Rejection
(Use α = 0.10 with df1 = 3-1 = 2 and df2 = 27-3 = 24)
F-crit = F(0.10)(2,24) = 2.538     (In Excel: =FINV(0.1,2,24))
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 2.538
F-stat (Refer to the formulae in Chapter 11)
Sample evidence
          Sample means
Bank X    6.875
Bank Y    5.778
Bank Z    6.2
Grand mean = 169/27 = 6.259
SSW = (8-6.875)²+(6-6.875)²+...+(5-5.778)²+(6-5.778)²+…+(8-6.2)²+(7-6.2)²+…+(6-6.2)²
    = 24.0306
SSB = 8(6.875-6.259)²+9(5.778-6.259)²+10(6.2-6.259)²
    = 5.1546
SST = 24.0306 + 5.1546 = 29.1852
Anova Table
Source of Variation   SS        df   MS       F-stat
Between               5.1546    2    2.5773   2.574
Within                24.0306   24   1.0013
Total                 29.1852   26
Conclusion
Since F-stat = 2.574 > F-crit = 2.538, there is sufficient sample evidence at the
10% significance level to reject H0 in favour of H1. Therefore conclude that there is
at least one bank that has a different mean service rating score to the other banks.
By inspection, it would appear that Bank X has a significantly higher mean
service rating score than the other two banks.
(b)
New Region of Rejection
(Use α = 0.05 with df1 = 3-1 = 2 and df2 = 27-3 = 24)
F-crit = F(0.05)(2,24) = 3.40
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 3.40
New decision
Do not reject H0 at the 5% significance level,
since F-stat = 2.574 < F-crit = 3.40
New conclusion
There is no statistical evidence, at the 5% significance level, to conclude that the
mean service ratings are different across the three banks.
The three banks are perceived similarly by customers in terms of their service levels.
Note:
The reason for the change in conclusion between (a) and (b) is that the
statistical evidence is only weak (i.e. small differences in sample means): it is
sufficient at the 10% significance level, while at the 5% significance level it is not
seen as strong enough (i.e. differences are not large enough) to reject the null hypothesis.
Exercise 11.10
File: X11.10 - shelf height.xlsx
One-Factor Anova
Factor = Shelf Height (1 = Bottom; 2 = Waist; 3 = Shoulder; 4 = Top)
Response measure = Sales volume (units sold)
Let μi = population mean sales of a drinking chocolate product displayed at shelf height i
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs (i = 1, 2, 3, 4)
Region of Rejection
(Use α = 0.05 with df1 = 4-1 = 3 and df2 = 30-4 = 26)
F-crit = F(0.05)(3,26) = 2.990
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 2.990
F-stat (Refer to the formulae in Chapter 11)
Sample evidence
            Sample means
Bottom      76.143
Waist       81.375
Shoulder    82.444
Top         74.833
Grand mean = 2375/30 = 79.167
SSW = (78-76.143)²+….+(78-81.375)²+….+(83-82.444)²+….+(69-74.833)²+…+(75-74.833)²
    = 727.788
SSB = 7(76.143-79.167)²+8(81.375-79.167)²+9(82.444-79.167)²+6(74.833-79.167)²
    = 312.379
SST = 727.788 + 312.379 = 1040.167
Anova Table
Source of Variation   SS         df   MS        F-stat
Between               312.379    3    104.126   3.720
Within                727.788    26   27.992
Total                 1040.167   29
Conclusion
Since F-stat = 3.72 > F-crit = 2.99, there is sufficient sample evidence at the
5% significance level to reject H0 in favour of H1. Therefore conclude that there is
at least one shelf height that generates a different mean level of sales to the other shelves.
By inspection, it would appear that shoulder and waist high shelves generate higher average
sales of the drinking chocolate product than bottom or top shelves.
Exercise 11.11
(a)
File: X11.11 - machine evaluation.xlsx
One-Factor Anova
Factor = Labelling Machine (1 = A; 2 = B; 3 = C)
Response measure = Processing time (in minutes)
Let μi = population mean processing time for shaping and labelling machine i
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)
Anova: Single Factor
SUMMARY
Groups       Count   Sum   Average   Variance
Machine A    5       59    11.8      3.7
Machine B    5       56    11.2      5.7
Machine C    5       72    14.4      3.8
ANOVA
Source of Variation   SS       df   MS       F-stat   P-value   F crit
Between Groups        28.933   2    14.467   3.288    0.0727    2.8068
Within Groups         52.8     12   4.4
Total                 81.733   14
Conclusion
Since F-stat = 3.288 > F-crit = 2.8068, there is sufficient sample evidence at the
10% significance level to reject H0 in favour of H1. Therefore conclude that there is at least
one machine that has a different mean processing time.
By inspection, machine C has a significantly longer mean processing time than either
machine A or machine B. Machine C must not be considered for purchase.
(b)
Two-Sample Test of Mean Processing Times between Machines A (1) and B (2)
H0: μ1 = μ2     (two-tailed test)
H1: μ1 ≠ μ2
t-Test: Two-Sample Assuming Equal Variances
                               Machine A   Machine B
Mean                           11.8        11.2
Variance                       3.7         5.7
Observations                   5           5
Pooled Variance                4.7
Hypothesized Mean Difference   0
df                             8
t Stat                         0.4376
P(T<=t) one-tail               0.3366
t Critical one-tail            1.3968
P(T<=t) two-tail               0.6733
t Critical two-tail            1.8595
t-crit (0.10, 8) = 1.8595     (In Excel: =TINV(0.1,8))
Conclusion
Since t-stat = 0.4376 lies within the region of non-rejection of H0 (i.e. within ± 1.8595),
the sample evidence is not strong enough to reject H0 in favour of H1.
The population mean processing times of the two machines A and B are
likely to be identical. Thus the company can purchase either machine A or machine B.
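The Excel t-test output above can be reproduced from the summary statistics alone. The following Python sketch uses the standard pooled-variance formula; the function name pooled_t_stat is hypothetical and the critical value is hard-coded from the Excel output.

```python
import math


def pooled_t_stat(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t-stat assuming equal variances; returns (t, pooled variance, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, pooled_var, df


# Machine A versus Machine B: mean, variance and count from the Anova summary.
t_stat, pooled_var, df = pooled_t_stat(11.8, 3.7, 5, 11.2, 5.7, 5)
T_CRIT = 1.8595  # two-tailed critical value for alpha = 0.10 and df = 8
reject_h0 = abs(t_stat) > T_CRIT
print(f"pooled variance = {pooled_var:.1f}, df = {df}, t-stat = {t_stat:.4f}, reject H0: {reject_h0}")
```

The same function can be used for the later two-sample comparisons in this chapter by substituting the relevant means, variances and sample sizes.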
(c)
Recommendation
Based on the statistical evidence in (a) and (b), the company can purchase either
machine A or machine B - they are likely to be equally efficient in operation.
Exercise 11.12
(a)
File: X11.12 - earnings yield.xlsx
One-Factor Anova
Factor = Economic Sector (1 = Financial; 2 = Retail; 3 = Industrial; 4 = Mining)
Response measure = Earnings yield (%)
Let μi = population mean earnings yield per economic sector i
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs (i = 1, 2, 3, 4)
(b)
Anova: Single Factor
SUMMARY
Groups        Count   Sum     Average   Variance
Industrial    20      96.1    4.81      1.091
Retail        20      94.4    4.72      1.492
Financial     20      113.2   5.66      1.588
Mining        20      111     5.55      2.067
ANOVA
Source of Variation   SS         df   MS       F-stat   P-value   F crit
Between Groups        14.3894    3    4.7965   3.0757   0.0326    2.7249
Within Groups         118.5195   76   1.5595
Total                 132.9089   79
Conclusion
Since F-stat = 3.076 > F-crit = 2.725, there is sufficient sample evidence at the
5% significance level to reject H0 in favour of H1. Therefore conclude that there is
at least one sector with a different mean earnings yield relative to the other sectors.
(c)
Interpretation
By inspection, the Industrial and Retail sectors each appear to have significantly lower
mean earnings yields relative to the Financial and Mining sectors' mean earnings yields.
Exercise 11.13
(a)
File: X11.13 - advertising strategy.xlsx
One-Factor Anova
Factor = Advertising strategy (1 = Sophisticated; 2 = Athletic; 3 = Trendy)
Response measure = Sales (units of cans)
Let μi = population mean level of sales achieved under each advertising strategy i
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)
(b)
Anova: Single Factor
SUMMARY
Groups           Count   Sum    Average   Variance
Sophisticated    20      8380   419       5075.579
Athletic         20      7241   362.05    6201.313
Trendy           20      8030   401.5     3781.737
ANOVA
Source of Variation   SS          df   MS         F-stat   P-value   F-crit
Between Groups        34039.03    2    17019.52   3.391    0.0406    3.159
Within Groups         286113.95   57   5019.54
Total                 320152.98   59
Conclusion
Since F-stat = 3.391 > F-crit = 3.159, there is sufficient sample evidence at the 5% level of
significance to reject H0 in favour of H1. Therefore conclude that there is at least one
advertising strategy that results in a different mean level of deodorant sales relative
to the other strategies.
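When only the summary block of the Excel output is available, the Anova table can still be rebuilt from the group counts, means and variances. The sketch below assumes exactly that; anova_from_summary is a hypothetical helper name, and the inputs are the SUMMARY values for the three strategies.

```python
# (count, mean, variance) per advertising strategy, from the SUMMARY output.
groups = {
    "Sophisticated": (20, 419.0, 5075.579),
    "Athletic": (20, 362.05, 6201.313),
    "Trendy": (20, 401.5, 3781.737),
}


def anova_from_summary(summary):
    """Return (SSB, SSW, F-stat) using only group sizes, means and variances."""
    n_total = sum(n for n, _, _ in summary.values())
    k = len(summary)
    grand_mean = sum(n * mean for n, mean, _ in summary.values()) / n_total
    # Between groups: size-weighted squared deviations of group means.
    ssb = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in summary.values())
    # Within groups: each sample variance times its degrees of freedom.
    ssw = sum((n - 1) * var for n, _, var in summary.values())
    f_stat = (ssb / (k - 1)) / (ssw / (n_total - k))
    return ssb, ssw, f_stat


ssb, ssw, f_stat = anova_from_summary(groups)
print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}, F-stat = {f_stat:.3f}")
```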
(c)
Interpretation and recommendation
By inspection, the Athletic advertising strategy resulted in the lowest mean sales of all
three advertising strategies. On average, the Sophisticated and Trendy strategies
appear to be equally effective (the difference in sample means does not appear significant).
Recommendation:
Either the Trendy or the Sophisticated strategy can be adopted.
(d)
Two-Sample Test of Means between the Sophisticated (1) and Trendy (2) Strategies
H0: μ1 = μ2     (two-tailed test)
H1: μ1 ≠ μ2
t-Test: Two-Sample Assuming Equal Variances
                               Sophisticated   Trendy
Mean                           419             401.5
Variance                       5075.579        3781.737
Observations                   20              20
Pooled Variance                4428.6579
Hypothesized Mean Difference   0
df                             38
t Stat                         0.8316
P(T<=t) one-tail               0.2054
t Critical one-tail            1.6860
P(T<=t) two-tail               0.4108
t Critical two-tail            2.0244
t-crit (0.05,38) = 2.0244     (In Excel: =TINV(0.05,38))
Conclusion
Since t-stat = 0.8316 lies within the region of non-rejection of H0 (i.e. ± 2.0244), there is
insufficient sample evidence to reject H0 in favour of H1.
Therefore the population mean sales from the two strategies are likely to be identical.
The two strategies are therefore equally effective and either can be adopted by the company.
(e)
Recommendation
The Athletic strategy can be discarded. It appears the least effective.
The remaining two strategies (Sophisticated and Trendy) are equally effective and
therefore either can be adopted by the company to promote its new ladies deodorant.
Exercise 11.14
(a)
File: X11.14 - leverage ratio.xlsx
Descriptive Statistics
                           Technology   Construction   Banking   Manufacturing
Mean                       73.83        78.07          69.73     76.37
Standard Error             1.697        1.912          2.527     2.347
Median                     72.5         78.5           68        79.5
Mode                       60           80             68        81
Standard Deviation         9.296        10.471         13.841    12.853
Sample Variance            86.420       109.651        191.582   165.206
Kurtosis                   -1.309       -0.546         -0.453    0.495
Skewness                   0.010        0.051          0.103     -0.857
Range                      28           40             58        50
Minimum                    60           58             41        44
Maximum                    88           98             99        94
Sum                        2215         2342           2092      2291
Count                      30           30             30        30
Confidence Level(95.0%)    3.471        3.910          5.168     4.799
Interpretation
The mean leverage ratio is lowest for the banking sector and highest for the
construction and the manufacturing sectors.
These differences do appear to be significant.
(b)
One-Factor Anova
Factor = Economic Sector (1 = Technology; 2 = Construction; 3 = Banking; 4 = Manufacturing)
Response measure = Leverage ratio
Let μi = population mean leverage ratio per economic sector i
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs (i = 1, 2, 3, 4)
Anova: Single Factor
SUMMARY
Groups           Count   Sum    Average   Variance
Technology       30      2215   73.83     86.42
Construction     30      2342   78.07     109.65
Banking          30      2092   69.73     191.58
Manufacturing    30      2291   76.37     165.21
ANOVA
Source of Variation   SS         df    MS        F-stat   P-value   F crit
Between Groups        1181.13    3     393.711   2.849    0.0406    2.683
Within Groups         16032.87   116   138.214
Total                 17214      119
Conclusion
Since F-stat = 2.849 > F-crit = 2.683 (marginally), there is sufficient sample evidence at the
5% significance level to reject H0 in favour of H1. Therefore conclude that there is
at least one sector with a different mean leverage ratio relative to the other sectors.
By inspection, the banking sector has the lowest mean leverage ratio, while construction
and manufacturing appear to have similarly high mean leverage ratios.
Recommendation
The investor is advised to consider either the banking sector (with the lowest
mean leverage ratio) or the technology sector (with a marginally higher mean
leverage ratio). (This difference may not be statistically significant).
(c)
Two-Sample Test of Means - Leverage ratios between Technology (1) and Banking (2) sectors.
H0: μ1 = μ2     (two-tailed test)
H1: μ1 ≠ μ2
t-Test: Two-Sample Assuming Equal Variances
                               Technology   Banking
Mean                           73.833       69.733
Variance                       86.420       191.582
Observations                   30           30
Pooled Variance                139.001
Hypothesized Mean Difference   0
df                             58
t Stat                         1.347
P(T<=t) one-tail               0.092
t Critical one-tail            1.672
P(T<=t) two-tail               0.183
t Critical two-tail            2.002
t-crit (0.05, 58) = 2.002     (In Excel: =TINV(0.05,58))
Conclusion
Since t-stat = 1.347 lies within the region of non-rejection of H0 (i.e. within ±2.002), the sample evidence is
not strong enough to reject H0 in favour of H1. The population mean leverage ratios of the
Technology and the Banking sectors are therefore likely to be equal.
(d)
Recommendation
Based on the statistical evidence in (b) and (c), an investor is advised to consider either
the Technology sector or the Banking sector. Their mean leverage ratios are likely to be equal.
Since both sectors offer an investor the same lower risk, either or both can be chosen for investment.
Exercise 11.15
(a)
File: X11.15 - training methods.xlsx
Descriptive Statistics
                           On-the-Job   Lecture   Role Play   Audio-Visual
Mean                       9            8.75      9.2         8.85
Standard Error             0.082        0.112     0.094       0.114
Median                     9            8.65      9.2         8.9
Mode                       9            8.5       9.6         8.9
Standard Deviation         0.327        0.420     0.325       0.426
Sample Variance            0.107        0.177     0.105       0.181
Kurtosis                   0.109        0.339     -1.420      3.419
Skewness                   0.472        0.759     0.229       -1.522
Range                      1.2          1.5       0.900       1.7
Minimum                    8.5          8.2       8.8         7.7
Maximum                    9.7          9.7       9.7         9.4
Sum                        144          122.5     110.4       123.9
Count                      16           14        12          14
Confidence Level(95.0%)    0.174        0.243     0.206       0.246
Interpretation
The differences in mean performance scores appear marginal across the four
different training methods, with on-the-job training and role-play having the
highest average scores.
(b)
One-Factor Anova
Factor = Training Method (1 = On-the-Job; 2 = Lecture; 3 = Role Play; 4 = Audio-Visual)
Response measure = Performance score (1 - 10)
Let μi = population mean performance score per training method i
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs (i = 1, 2, 3, 4)
Anova: Single Factor
SUMMARY
Groups          Count   Sum     Average   Variance
On-the-Job      16      144     9         0.107
Lecture         14      122.5   8.75      0.177
Role Play       12      110.4   9.2       0.105
Audio-Visual    14      123.9   8.85      0.181
ANOVA
Source of Variation   SS      df   MS       F-stat   P-value   F-crit
Between Groups        1.487   3    0.4957   3.479    0.0223    2.7826
Within Groups         7.41    52   0.1425
Total                 8.897   55
Conclusion
Since F-stat = 3.479 > F-crit = 2.7826, there is sufficient sample evidence at the
5% significance level to reject H0 in favour of H1. Therefore conclude that there is at least one
training method with a different mean performance score relative to the other training methods.
By inspection, the lecture and audio-visual are the least effective (lower mean scores),
while on-the-job and the role-play methods are more effective (with higher mean scores).
Recommendation
The training manager is advised to consider either on-the-job training or use
role play methods (The difference does not appear to be statistically significant).
(c)
Two-Sample Test of Means
Performance scores between On-the-Job (1) and Role Play (2) methods.
H0: μ1 = μ2     (two-tailed test)
H1: μ1 ≠ μ2
t-Test: Two-Sample Assuming Equal Variances
                               On-the-Job    Role Play
Mean                           9             9.2
Variance                       0.10666667    0.105454545
Observations                   16            12
Pooled Variance                0.10615385
Hypothesized Mean Difference   0
df                             26
t Stat                         -1.6074361
P(T<=t) one-tail               0.06001825
t Critical one-tail            1.7056179
P(T<=t) two-tail               0.1200365
t Critical two-tail            2.05552942
t-crit (0.05,26) = 2.0555     (In Excel: =TINV(0.05,26))
Conclusion
Since t-stat = -1.6074 lies within the region of non-rejection of H0 (i.e. within ±2.0555), the sample
evidence is not strong enough to reject H0 in favour of H1. The population mean performance scores
of the two training methods are likely to be the same.
(d)
Recommendation
Based on the statistical evidence in (b) and (c), the HR manager is advised to select either
on-the-job training or role play methods of training.
Both are likely to produce similar high mean performance scores.
Exercise 11.16
In one-factor ANOVA, only one categorical factor is used to explain
possible differences between the observed sample means.
In two-factor ANOVA, two categorical factors are used to explain
possible differences between the observed sample means.
Exercise 11.17
Response variable (y): numeric, ratio-scaled
Factor 1 (x1): categorical (nominal / ordinal-scaled)
Factor 2 (x2): categorical (nominal / ordinal-scaled)
Exercise 11.18
Interaction term means that different combinations of the levels from the two factors
can have different influences (effects) on the numeric response variable.
Exercise 11.19
Interaction plot:
It visually displays the nature and strength of the interaction
effect of combinations of factor levels on the dependent response variable.
It is constructed from the sample means of the various combinations
of the different factor levels.
Exercise 11.20
(a) - (f)
Two-factor ANOVA with Interaction Table
Source of Variation
(g)
df
SS
MS
F-stat
p -value
F -crit
Factor A: PC O.S.
2
68
34
6.8
0.0025
5.08
Factor B: Laptop
3
42
14
2.8
0.0499
4.22
Interaction (AxB)
6
132
22
4.4
0.0013
3.20
Error (Residual)
Total
48
59
240
482
5
Factor A is statistically significant (at α = 0.01) since F-stat (6.8) > F-crit (5.08).
Alternatively, its p-value (0.0025) < α (0.01).
Factor B is not statistically significant (at α = 0.01) since F-stat (2.8) < F-crit (4.22).
Alternatively, its p-value (0.0499) > α (0.01).
The interaction effect between factor A and factor B is statistically significant (at α = 0.01)
since F-stat (4.4) > F-crit (3.20).
Alternatively, its p-value (0.0013) < α (0.01).
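The table arithmetic and the decisions above can be checked with a short Python sketch. The helper name `anova_rows` is hypothetical; the SS, df and F-crit values are copied from the table, and no F-distribution routine is used.

```python
def anova_rows(effects, ss_error, df_error):
    """Compute MS and F-stat for each effect from its SS and df."""
    ms_error = ss_error / df_error
    rows = {}
    for name, (ss, df) in effects.items():
        ms = ss / df
        rows[name] = {"MS": ms, "F": ms / ms_error}
    return ms_error, rows


# (SS, df) per source of variation, taken from the Exercise 11.20 table.
effects = {
    "Factor A": (68, 2),
    "Factor B": (42, 3),
    "Interaction AxB": (132, 6),
}
ms_error, rows = anova_rows(effects, ss_error=240, df_error=48)

# Critical values at alpha = 0.01, as listed in the table.
f_crit = {"Factor A": 5.08, "Factor B": 4.22, "Interaction AxB": 3.20}
significant = {name: rows[name]["F"] > f_crit[name] for name in rows}
```

Each effect is declared significant only when its F-stat exceeds the corresponding F-crit, which mirrors the decision rule used in (g).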
XS11.21
Exercise 11.21
(a) (i)
File: X11.21 - sales ability.xlsx
Factor: Qualifications
H0: μ(b) = μ(a) = μ(s)
H1: At least one μi differs (i = b, a, s)
Factor: Experience
H0: μ(under 3) = μ(over 3)
H1: At least one μi differs (i = under 3, over 3)
Factor: Interaction
H0: No interaction effect
H1: There is an interaction effect
Anova: Two-Factor With Replication (Data Analysis - Excel)
SUMMARY          Business      Arts          Social Science   Total
Under 3 years
Count            10            10            10               30
Sum              400.1         376.6         325.9            1102.6
Average          40.01         37.66         32.59            36.7533333
Variance         61.31656      23.07156      14.56766667      40.628092
Over 3 years
Count            10            10            10               30
Sum              446.2         357.6         415.8            1219.6
Average          44.62         35.76         41.58            40.6533333
Variance         28.37956      92.68711      42.13066667      64.626023
Total
Count            20            20            20
Sum              846.3         734.2         741.7
Average          42.315        36.71         37.085
Variance         48.08029      55.78305      48.12555263
ANOVA
Source of Variation             SS         df    MS         F-stat   p-value   F-crit
Sample (Experience)             228.15     1     228.150    5.222    0.0263    4.020
Columns (Qualifications)        392.73     2     196.365    4.494    0.0157    3.168
Interaction (Exper x Qualif)    300.26     2     150.131    3.436    0.0394    3.168
Within                          2359.38    54    43.692
Total                           3280.52    59
(a) (ii)
Factor: Qualifications
Since F-stat (4.494) > F-crit (3.168), reject H0 and conclude that qualifications
is a statistically significant factor (α = 0.05)
Business graduates generate significantly higher average sales (42.32)
than graduates with Arts (36.71) or Social Science (37.09) degrees.
(a) (iii)
Factor: Experience
Since F-stat (5.22) > F-crit (4.02), reject H0 and conclude that experience
is a statistically significant factor (α = 0.05)
Graduates with over 3 years experience generate significantly higher average sales (40.65)
than graduates with less than 3 years experience (36.75).
(a) (iv)
Factor: Interaction effect
Since F-stat (3.436) > F-crit (3.168), reject H0 and conclude that there is a significant
interaction effect between experience and qualifications on graduates' sales performance (α = 0.05)
Business graduates with over 3 years experience perform the best (44.62) while social science
graduates with less than 3 years experience perform the worst (32.59).
(b)
Summary of Sample Means Table
                  Under 3 years   Over 3 years
Business          40.01           44.62
Arts              37.66           35.76
Social Science    32.59           41.58
[Interaction Plot: average sales by qualification (Business, Arts, Social Science), one line each for Under 3 years and Over 3 years experience]
See (a) (iv) for an interpretation of the interaction plot.
(c)
Recommendation to HR manager:
Recruit predominantly business and social science graduates with over 3 years of work experience.
If Arts graduates are employed, they must be given intensive marketing training.
Exercise 11.22
(a) (i)
File:
X11.22 - dropped calls.xlsx
Random variable: % of daily dropped calls per network switch and transmission type
Factor 1 = Switch type (SW1, SW2, SW3, SW4)
Factor 2 = Transmission type (Voice, Data bundles)
Management Question: Is the % of daily dropped calls the same across all network switching
devices and / or transmission types?
Factor: Switch type
H0: μ(1) = μ(2) = μ(3) = μ(4)
H1: At least one μi differs (i = 1,2,3,4)
Factor: Transmission type
H0: μ(voice) = μ(data)
H1: At least one μi differs (i = voice, data)
Factor: Interaction
H0: No interaction effect
H1: There is an interaction effect
Anova: Two-Factor With Replication (Data Analysis - Excel)
SUMMARY      SW1         SW2         SW3         SW4         Total
Voice
Count        8           8           8           8           32
Sum          7.76        4.35        4.69        6.39        23.19
Average      0.97        0.54375     0.58625     0.79875     0.724688
Variance     0.098657    0.081884    0.086227    0.071155    0.106645
Data
Count        8           8           8           8           32
Sum          5.83        6.82        8.16        7.1         27.91
Average      0.72875     0.8525      1.02        0.8875      0.872188
Variance     0.047984    0.078707    0.057171    0.067279    0.067818
Total
Count        16          16          16          16
Sum          13.59       11.17       12.85       13.49
Average      0.849375    0.698125    0.803125    0.843125
Variance     0.083953    0.100363    0.11709     0.066703
ANOVA
Source of Variation             SS          df    MS          F-stat      P-value     F-crit
Sample (Transmission)           0.3481      1     0.3481      4.727498    0.033926    4.012973
Columns (Switch)                0.234819    3     0.078273    1.063014    0.372175    2.769431
Interaction (Trans x Switch)    1.050075    3     0.350025    4.753641    0.005055    2.769431
Within                          4.12345     56    0.073633
Total                           5.756444    63
(a) (ii)
Factor: Switching device
Since F-stat (1.063) < F-crit (2.769), do not reject H0 and conclude that switch devices are not statistically
significant (α = 0.05). Hence all switch devices are likely to have the same average dropped call rate.
(a) (iii)
Factor: Transmission
Since F-stat (4.73) > F-crit (4.01), reject H0 and conclude that transmission type
is a statistically significant factor (α = 0.05)
Data transmission leads to a higher average dropped call rate (0.872) than voice transmission (0.725).
(a) (iv)
Factor: Interaction effect
Since F-stat (4.75) > F-crit (2.769), reject H0 and conclude that there is a significant
interaction effect between switch devices and transmission type. (α = 0.05)
SW2 and SW3, transmitting voice (0.544 and 0.586), are likely to have the lowest average dropped call rate,
while SW1 transmitting voice (0.97) and SW3 transmitting data (1.02), are likely to have the
highest average dropped call rate.
(b)
Summary of Sample Means Table
         SW1      SW2      SW3      SW4
Voice    0.970    0.544    0.586    0.799
Data     0.729    0.853    1.020    0.888
[Interaction plot: average % dropped calls by switching device (SW1 to SW4), one line each for Voice and Data]
See (a) (iv) for an interpretation of the interaction plot.
(c)
Recommendation to the chief engineer:
For voice transmissions, use switching devices 2 and 3;
For data transmission, use only switching device 1.
Investigate the high % dropped call rates of switching device 3 (for data transmission)
and switching device 4 for both transmission types.
Exercise 11.23
(a)
File:
X11.23 - rubber wastage.xlsx
Random variable: % rubber wastage per week
Factor: Machine (TAM)
H0: μ(1) = μ(2) = μ(3)
H1: At least one μi differs (i = 1, 2, 3)
Factor: Tyre
H0: μ(R) = μ(B)
H1: At least one μi differs (i = Radial, Bias)
Factor: Interaction
H0: No interaction effect
H1: There is an interaction effect
Anova: Two-Factor With Replication (Data Analysis - Excel)
SUMMARY      TAM1        TAM2        TAM3        Total
Radial
Count        5           5           5           15
Sum          19.71       33.52       19.44       72.67
Average      3.942       6.704       3.888       4.844667
Variance     3.96097     6.56833     3.44237     5.844455
Bias
Count        5           5           5           15
Sum          13.47       38.6        43.77       95.84
Average      2.694       7.72        8.754       6.389333
Variance     1.35283     9.83965     8.96443     13.26548
Total
Count        10          10          10
Sum          33.18       72.12       63.21
Average      3.318       7.212       6.321
Variance     2.794329    7.579173    12.09134
ANOVA
Source of Variation         SS          df    MS          F-stat      p-value     F-crit
Sample (Tyres)              17.89496    1     17.89496    3.146037    0.088802    4.259677
Columns (TAMS)              83.25042    2     41.62521    7.317951    0.003301    3.402826
Interaction (Tyre x TAM)    47.77433    2     23.88716    4.1995      0.0273      3.402826
Within                      136.5143    24    5.688097
Total                       285.434     29
Statistical and Management Conclusions
Factor: Tyres
Since F-stat (3.15) < F-crit (4.26), do not reject H0 and conclude that tyre type is not
a statistically significant factor (α = 0.05).
Regardless of TAM used, average rubber wastage is likely to be the same across both tyre types produced.
Factor: TAMs
Since F-stat (7.318) > F-crit (3.403), reject H0 and conclude that TAM used
is a statistically significant factor (α = 0.05).
Regardless of tyre type produced, average rubber wastage is likely to be lowest on TAM1 (3.318),
compared to TAM2 (7.212) and TAM3 (6.321).
Factor: Interaction effect
Since F-stat (4.199) > F-crit (3.403), reject H0 and conclude that there is a significant
interaction effect between tyre type produced and TAM used (α = 0.05).
Average rubber wastage for bias tyres is lowest on TAM1 (2.694) but highest on TAM3 (8.754).
By contrast, average rubber wastage of radial tyres is lowest on TAM3 (3.888) and highest on TAM2 (6.704).
(b)
Interaction Plot
          TAM1     TAM2     TAM3
Radial    3.942    6.704    3.888
Bias      2.694    7.720    8.754
[Interaction plot: average % rubber wastage by machine (TAM1, TAM2, TAM3), one line each for Radial and Bias tyres]
See (a) for an interpretation of the interaction plot.
(c)
Recommendation to the production manager:
TAM1 machine is the most efficient in minimising wastage for both tyre types (purchase more of TAM1).
Allocate Radial tyres manufacturing to TAM3 (it has the minimum % wastage of all three machines on Radial tyres).
Investigate the high % wastage of TAM3 for Bias tyres and TAM2 for both tyre types.
Possibly replace TAM2 type machines with TAM1 type machines.
CHAPTER 12
SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS
Exercise 12.1
Regression analysis defines the structural relationship between two numeric
variables as a mathematical equation (usually a straight line equation).
Its purpose is to use the equation for estimation / prediction purposes.
Correlation analysis measures the strength of the relationship between the
two numeric variables used in the regression equation.
Exercise 12.2
Dependent variable, y
Exercise 12.3
An independent variable, x , is used as a predictor of the dependent variable.
Exercise 12.4
Scatter plot
Exercise 12.5
Method of Least Squares
Exercise 12.6
Strong inverse linear relationship
Exercise 12.7
H0: ρ = 0
H1: ρ ≠ 0
Degrees of freedom = 18 - 2 = 16
t-crit = t(0.05, 16) = ±2.12   (Use =TINV(0.05,16) in Excel)
Decision Rule: Do not reject H0 if -2.12 ≤ t-stat ≤ +2.12
t-stat = (0.42) * √[(18 - 2)/(1 - 0.42²)] = 1.851
Conclusion: Do not reject H0. There is no statistically
significant relationship between x and y.
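The test statistic used here, t = r√((n - 2)/(1 - r²)), can be reproduced with a few lines of Python. This is only a sketch: the function name `correlation_t_stat` is hypothetical, and the critical value is copied from the solution rather than computed.

```python
import math


def correlation_t_stat(r, n):
    """t-statistic for testing H0: rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


t_stat = correlation_t_stat(0.42, 18)
t_crit = 2.12  # t(0.05, 16), taken from the solution above
reject_h0 = abs(t_stat) > t_crit
```

Because the test is two-tailed, the decision compares the absolute value of t-stat with t-crit.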
Exercise 12.8
(a)
File:
X12.8 - training effectiveness.xlsx
Scatter plot - Training hours versus Productivity
[Scatter plot: output (units) against hours of training]
Interpretation
There is a strong positive relationship between hours of training
and worker output. The more the training received, the higher the output.
(b)
n = 10
       Training (x)   Output (y)   x²       xy       y²
       20             40           400      800      1600
       36             70           1296     2520     4900
       20             44           400      880      1936
       38             56           1444     2128     3136
       40             60           1600     2400     3600
       33             48           1089     1584     2304
       32             62           1024     1984     3844
       28             54           784      1512     2916
       40             63           1600     2520     3969
       24             38           576      912      1444
Σ      311            535          10213    17240    29649
Coefficients
b1 = (10*17240-(311*535))/(10*10213-(311²)) = 1.112
b0 = (535 - 1.112*311)/10 = 18.917
Thus ŷ = 18.917 + 1.112 x   where 20 ≤ x ≤ 40
(c)
Correlation coefficient (r) = (17240-(311*535)/10)/√((10213-(311²)/10)*(29649-(535²)/10)) = 0.8072
Coefficient of Determination (r²) = 0.8072² = 0.6516
Variation in training hours can explain 65.16% of the variability in worker output.
This is a high level of explained variation. Hence training input is very beneficial
to worker output and the training programmes should be continued.
(d)
x = 25:   ŷ = 18.917 + 1.112*25 = 46.717 units
A worker with 25 hours of training can be expected to produce 46.72 units of output, on average.
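The least squares calculations in (b) to (d) can be checked with the following Python sketch, which applies the same sum-based formulas to the ten data pairs. The function name `simple_regression` is hypothetical. Because the code carries b1 at full precision, b0 comes out at about 18.916 rather than the 18.917 obtained from the rounded slope.

```python
import math


def simple_regression(x, y):
    """Return (b0, b1, r) using the computational (sum-based) least squares formulas."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))
    b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b0 = (sy - b1 * sx) / n
    r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return b0, b1, r


training = [20, 36, 20, 38, 40, 33, 32, 28, 40, 24]  # hours of training (x)
output = [40, 70, 44, 56, 60, 48, 62, 54, 63, 38]    # worker output (y)

b0, b1, r = simple_regression(training, output)
y_hat_25 = b0 + b1 * 25  # estimate for x = 25, inside the domain 20 to 40
```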
Exercise 12.9
(a)
File:
X12.9 - capital utilisation.xlsx
Scatter plot - Earnings Yield versus Inventory Turnover
[Scatter plot: earnings yield against inventory turnover]
Interpretation
There is a strong positive relationship between inventory turnover and earnings yield.
As inventory turnover increases, earnings yield also increases.
(b)
n = 9
       Inv t/o (x)   E.Y. (y)   x²     xy     y²
       3             10         9      30     100
       5             12         25     60     144
       4             8          16     32     64
       7             13         49     91     169
       6             15         36     90     225
       4             10         16     40     100
       8             16         64     128    256
       6             13         36     78     169
       5             10         25     50     100
Σ      48            107        276    599    1327
Coefficients
b1 = (9*599-(48*107))/(9*276-(48²)) = 1.4167
b0 = (107-1.4167*48)/9 = 4.333
Thus ŷ = 4.333 + 1.4167 x   where 3 ≤ x ≤ 8
(c)
Correlation coefficient (r) = (599-(48*107)/9)/√((276-(48²)/9)*(1327-(107²)/9)) = 0.8552
There is a strong positive linear association between inventory turnover and earnings yield.
Thus the business analyst's view is supported by the strong sample evidence.
(d)
Coefficient of Determination (r²) =
0.8552² =
0.7313
i.e. 73.13%
Inventory turnover (capital utilisation) can explain 73.13% of the variability in a company's
earnings yield. This is a high level of explained variation. Hence inventory turnover has been
shown to have a significant direct effect on a company's earnings yield.
Yes, the regression equation can be used with confidence to estimate earnings yield
based on a company's level of inventory turnover.
(e)
H0: ρ = 0
H1: ρ ≠ 0
t-crit = t(α=0.05, df = 7) = ±2.364   (unrounded: 2.364624)
Area of Acceptance for H0: -2.364 ≤ t-stat ≤ +2.364
t-stat = (0.8552)*√((9-2)/(1-(0.8552²))) = 4.3655
Conclusion
Since t-stat (4.3655) lies outside the area of acceptance for H0, there is
sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Conclude that there is a strong positive association between inventory turnover
and earnings yield.
(f)
Expected earnings yield for x = 6
ŷ = 4.333 + 1.4167(6) =
12.833%
A company with an expected inventory turnover of 6 next year can expect to achieve
an earnings yield of 12.833%.
Exercise 12.10
File:
X12.10 - loan applications.xlsx
(a)
Dependent variable (y) =
No. of Loan applications received
Independent variable (x) =
interest rate (%)
Interest rate (%) is assumed to influence the number of loan applications received.
(b)
Scatter plot
Loan applications Received vs Interest Rate (%)
[Scatter plot: no. loan applications received against interest rate (%)]
Interpretation
A moderate to strong negative linear relationship between interest rate and
number of loan applications received is observed from the scatter plot.
(c)
n = 11
       Int rate % (x)   Applications (y)   x²        y²      xy
       7                18                 49        324     126
       6.5              22                 42.25     484     143
       5.5              30                 30.25     900     165
       6                24                 36        576     144
       8                16                 64        256     128
       8.5              18                 72.25     324     153
       6                28                 36        784     168
       6.5              27                 42.25     729     175.5
       7.5              20                 56.25     400     150
       8                17                 64        289     136
       6                21                 36        441     126
Σ      75.5             241                528.25    5507    1614.5
Correlation coefficient
r = (11*1614.5 - 75.5*241)/√((11*528.25-75.5²)*(11*5507-241²)) = -0.8302
Interpretation
This correlation shows a strong negative (inverse) association between
interest rates (%) and the number of loan applications received.
(d)
H0: ρ = 0
H1: ρ ≠ 0
t-crit = t(α=0.05, df = 9) = ±2.262
Area of Acceptance for H0: -2.262 ≤ t-stat ≤ +2.262
t-stat = (-0.8302)*√((11-2)/(1-(-0.8302)²)) = -4.4677
Conclusion
Since t-stat (-4.4677) lies outside the area of acceptance for H0, there is
sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Conclude that there is a strong negative relationship between interest rates (%)
and the number of loan applications received monthly.
(e)
Regression coefficients
b1 = (11*1614.5 - 75.5*241)/(11*528.25-75.5²) = -3.9457
b0 = (241 - (-3.9457)*(75.5))/11 = 48.991
Thus ŷ = 48.991 - 3.9457 x   where 5.5 ≤ x ≤ 8.5
(f)
Interpretation of b1 coefficient
For a one percentage point increase in the interest rate, about 3.95 fewer loan applications can be expected, on average.
(g)
x = 6:   ŷ = 48.991 - 3.9457*(6) = 25.32 applications
If the rate of interest is 6%, the bank can expect to receive 25.32 (say, 25) applications.
Exercise 12.11
File:
X12.11 - maintenance costs.xlsx
(a)
Dependent variable (y) =
Annual maintenance cost (Rand)
Independent variable (x) = Machine age (years)
Machine age is assumed to influence annual maintenance costs.
(b)
Scatter plot
Annual Maintenance Cost (R) vs Age of Machines
[Scatter plot: annual maintenance cost (R) against machine age (years)]
Interpretation
A strong direct (positive) linear relationship between the ages of machines and
their annual maintenance costs (in Rands) is observed in the scatter plot.
(c)
Correlation coefficient
n = 12
       Age (yrs) (x)   Annual Cost (R) (y)   x²     y²       xy
       4               45                    16     2025     180
       3               20                    9      400      60
       3               38                    9      1444     114
       8               65                    64     4225     520
       6               58                    36     3364     348
       7               50                    49     2500     350
       1               16                    1      256      16
       1               22                    1      484      22
       5               38                    25     1444     190
       2               26                    4      676      52
       4               30                    16     900      120
       6               35                    36     1225     210
Σ      50              443                   266    18943    2182
r = (12*2182 - 50*443)/√((12*266-50²)*(12*18943-443²)) = 0.870028
Interpretation
There is a very strong positive association between the ages
of machines and their annual maintenance costs (in Rands).
(d)
H0: ρ = 0
H1: ρ ≠ 0
t-crit = t(α=0.05, df = 10) = ±2.228
Area of Acceptance for H0: -2.228 ≤ t-stat ≤ +2.228
t-stat = (0.870028)*√((12-2)/(1-0.870028²)) = 5.5806
Conclusion
Since t-stat (5.5806) lies outside the area of acceptance for H0, there is
sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Conclude that there is a strong direct association between age of machines
and the level of annual maintenance costs (in Rands).
(e)
Regression coefficients
b1 = (12*2182 - 50*443)/(12*266 - 50²) = 5.8295
b0 = (443 - 5.8295*50)/12 = 12.627
Thus ŷ = 12.627 + 5.8295 x   where 1 ≤ x ≤ 8
(f)
Interpretation of b1 coefficient
If the age of a machine increases by 1 year, annual maintenance costs are expected to rise by R5.8295, on average.
(Alternatively, for every year older, the annual maintenance costs increase by R5.8295)
(g)
x = 5:   ŷ = 12.627 + 5.8295*(5) = R 41.77
For a 5 year old machine, annual maintenance costs are expected to be R41.77.
Exercise 12.12
(a)
File:
X12.12 - employee performance.xlsx
Scatter Plot
Performance Rating vs Aptitude Score
[Scatter plot: performance rating (0 - 100) against aptitude score (1 - 10)]
Interpretation
There appears to be a weak to moderate positive association between an employee's
aptitude score and their performance rating.
(b)
Correlation coefficient (r)
n = 12
       Aptitude (x)   Perf rating (y)   x²     y²       xy
       7              82                49     6724     574
       6              74                36     5476     444
       5              82                25     6724     410
       4              68                16     4624     272
       5              75                25     5625     375
       8              92                64     8464     736
       7              86                49     7396     602
       8              69                64     4761     552
       9              85                81     7225     765
       6              76                36     5776     456
       4              72                16     5184     288
       6              64                36     4096     384
Σ      75             925               497    72075    5858
r = (12*5858 - 75*925)/√((12*497-75²)*(12*72075-925²)) = 0.5194
Interpretation
There is a moderate positive association between the aptitude scores of
employees and their performance scores after one year.
(c)
H0: ρ = 0
H1: ρ ≠ 0
t-crit = t(α=0.05, df = 10) = ±2.228
Area of Acceptance for H0: -2.228 ≤ t-stat ≤ +2.228
t-stat = (0.5194)*√((12-2)/(1-0.5194²)) = 1.922
Conclusion
Since t-stat (1.922) lies inside the area of acceptance for H0, there is
not sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Conclude, at a 5% significance level, that there is no statistically significant association
between an employee's aptitude score and their job performance rating score one year later.
(d)
Regression coefficients
b1 = (12*5858-75*925)/(12*497-75²) = 2.7168
b0 = (925 - 2.7168*75)/12 = 60.1032
Thus ŷ = 60.1032 + 2.7168 x   where 4 ≤ x ≤ 9
Interpretation of b1 coefficient
If an employee's aptitude score increases by one unit, their performance rating
score is expected to increase by 2.7168 points, on average.
(e)
Estimation
x = 8:   ŷ = 60.1032 + 2.7168*(8) = 81.84
For an employee with an aptitude score of 8, they could expect a job performance rating
score of 81.84.
The association between aptitude score and performance rating is not statistically significant.
Therefore, the call centre manager should have low confidence in this
estimated performance rating score.
Exercise 12.13
(a)
File:
X12.13 - opinion polls.xlsx
Correlation coefficient (r)
n = 11
       Poll (%) (x)   Election (%) (y)   x²       y²       xy
       42             51                 1764     2601     2142
       34             31                 1156     961      1054
       59             56                 3481     3136     3304
       41             49                 1681     2401     2009
       53             68                 2809     4624     3604
       40             35                 1600     1225     1400
       65             54                 4225     2916     3510
       48             52                 2304     2704     2496
       59             54                 3481     2916     3186
       38             43                 1444     1849     1634
       62             60                 3844     3600     3720
Σ      541            553                27789    28933    28059
r = (11*28059 - 541*553)/√((11*27789-541²)*(11*28933-553²)) = 0.7448
Interpretation
There is a moderate to strong positive association between opinion poll predictions
and the actual election results.
(b)
H0: ρ = 0
H1: ρ ≠ 0
t-crit = t(α=0.05, df = 9) = ±2.262
Area of Acceptance for H0: -2.262 ≤ t-stat ≤ +2.262
t-stat = (0.7448)*√((11-2)/(1-0.7448²)) = 3.3484
Conclusion
Since t-stat (3.3484) lies outside the area of acceptance for H0, there is
sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Conclude, at a 5% significance level, that there is a statistically significant association
between opinion poll predictions and the actual election results.
(c)
Regression coefficients
b1 = (11*28059 - 541*553)/(11*27789-541²) = 0.729
b0 = (553 - 0.729*541)/11 = 14.4175
Thus ŷ = 14.4175 + 0.729 x   where 34 ≤ x ≤ 65
Interpretation of b1 coefficient
For a one percentage point increase in an opinion poll prediction, the actual
election percentage is likely to increase by 0.729 percentage points.
(d)
Coefficient of Determination r²
r² =
0.7448² =
0.5547
Interpretation
Opinion poll predictions can explain 55.47% of variation in actual election results percentages
(e)
Prediction
x = 58
ŷ=
14.4175 + 0.729*(58) =
56.70%
If an opinion poll predicts support at 58%, the actual election result is likely to be 56.7%.
(f)
x = 82
ŷ=
14.4175 + 0.729*(82) =
74.20%
If an opinion poll predicts support at 82%, the actual election result is likely to be 74.2%.
Because x = 82 is beyond the domain of x (34 ≤ x ≤ 65), this expected actual election result
is unreliable and possibly invalid due to extrapolation being used.
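The extrapolation warning in (f) can be made explicit in code. The sketch below uses the fitted coefficients and the observed range of x from this exercise; the function name `predict_election` and its parameter names are hypothetical.

```python
def predict_election(poll, b0=14.4175, b1=0.729, x_min=34, x_max=65):
    """Return (estimate, within_domain) for the fitted line y-hat = b0 + b1 * x.

    `within_domain` is False when `poll` lies outside the observed range of x,
    in which case the estimate is an extrapolation and should not be trusted.
    """
    y_hat = b0 + b1 * poll
    within_domain = x_min <= poll <= x_max
    return y_hat, within_domain


estimate_58, ok_58 = predict_election(58)  # interpolation, as in (e)
estimate_82, ok_82 = predict_election(82)  # extrapolation, as in (f)
```

Returning the domain flag alongside the estimate keeps the arithmetic identical to (e) and (f) while signalling when the regression line is being used outside the data it was fitted to.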
Exercise 12.14
File: X12.14 - capital investment.xlsx
(a)
Dependent variable (y) = return on investment (%)
Independent variable (x) = capital investment
(b)
Scatter Plot
Return on Investment (%) vs Capital Investment (%)
[Scatter plot: return on investment (%) against levels of capital investment (%)]
Interpretation
There appears to be a weak to moderate association between a company's level of capital
investment and its return on investment.
(c)
Excel's Data Analysis (Regression)
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.41447
R Square             0.17179
Adjusted R Square    0.15253
Standard Error       2.12495
Observations         45
ANOVA
              df    SS           MS          F-stat     p-value
Regression    1     40.27335     40.27335    8.91907    0.00465
Residual      43    194.16309    4.51542
Total         44    234.43644
             Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept    1.27412        1.19293          1.06806    0.29145    -1.13166    3.67990
Capital      0.06783        0.02271          2.98648    0.00465    0.02203     0.11363
Correlation coefficient (r) = 0.414473
Regression line (equation)
ŷ = 1.2741 + 0.06783 x   for 21.1 ≤ x ≤ 79.5
(d)
Coefficient of Determination r²
r² =
0.41447² =
0.171788
Interpretation
The variation in the level of capital investment explains only 17.18% of the variation
in return on investment .
(e)
H0: ρ = 0
H1: ρ ≠ 0
t-crit = t(α=0.05, df = 43) = ±2.016
Area of Acceptance for H0: -2.016 ≤ t-stat ≤ +2.016
t-stat = (0.41447)*√((45-2)/(1-0.41447²)) = 2.9865
Conclusion
Since t-stat (2.9865) lies outside the area of acceptance for H0, there is
sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Conclude, at a 5% significance level, that there is a statistically significant association
between a company's level of capital investment and its return on investment.
(f)
Interpretation of b1 coefficient
b1 = 0.06783
For a one percentage point change in capital investment , company return on investment
can be expected to change by 0.06783 percentage points.
(h)
Estimation
x = 55
ŷ = 1.27412 + 0.06783 (55) =
5.005%
The expected return on investment for a company with a 55% level of capital
investment is 5.005%.
Exercise 12.15
(a)
File: X12.15 - property valuations.xlsx
Dependent variable (y) = Market Values
Independent variable (x) = Council Valuations
Council valuations are assumed to have an influence on property market values.
(b)
Scatter Plot
Market Values vs Council Valuations
[Scatter plot: market values (R) against council valuations (R)]
Interpretation
There appears to be a strong positive correlation between the council's valuation of
a residential property in Bloemfontein and its market value.
(c)
Excel's Data Analysis (Regression)
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.78104
R Square             0.61002
Adjusted R Square    0.59975
Standard Error       26.40553
Observations         40
ANOVA
              df    SS          MS          F-stat     p-value
Regression    1     41444.82    41444.82    59.4402    2.7522E-09
Residual      38    26495.58    697.2521
Total         39    67940.4
                      Coefficients   Standard Error   t Stat    P-value     Lower 95%   Upper 95%
Intercept             71.3619        14.2370          5.0124    1.28E-05    42.541      100.183
Council valuations    1.0151         0.1317           7.7097    2.75E-09    0.749       1.282
Correlation coefficient (r) = 0.78104
Regression line (equation)
ŷ = 71.3619 + 1.0151 x   for 48 ≤ x ≤ 154
(d)
Coefficient of Determination r²
r² = 0.78104² = 0.61002
Interpretation
Variation in council valuations explains 61.002% of the variation in property market values.
(e)
H0: ρ = 0
H1: ρ ≠ 0
t-crit = t(α=0.05, df = 38) = ±2.024
Area of Acceptance for H0: -2.024 ≤ t-stat ≤ +2.024
t-stat = (0.78104)*√((40-2)/(1-0.78104²)) = 7.70975
Conclusion
Since t-stat (7.70975) lies well outside the area of acceptance for H0, there is
sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Conclude, at a 5% significance level, that there is a statistically significant association
between the Council's valuation and the resultant property market valuation.
(f)
Interpretation of b1 coefficient
b1 = 1.0151
For a R1 (in R1000) change (up / down) in council valuation , market valuation of a
property can be expected to change (up / down) by R1.0151 (in R1000) .
(g)
Estimation
x = 100
ŷ = 71.3619 + 1.0151 (100) =
R172.874 (in R1000s)
The expected market value of a property which the council values at R100 (in R1000s)
is likely to be R172.874 (in R1000s).
CHAPTER 13
MULTIPLE LINEAR REGRESSION
Exercise 13.1
Simple linear regression has only one independent variable (x1) whereas multiple linear
regression has two or more independent variables (x1, x2, x3, … , xk) that are assumed
to influence the outcome of the dependent variable, y .
Exercise 13.2
ANOVA
              df    SS     MS      F-stat    p-value     F-crit
Regression    5     84     16.8    4.541     0.002261    2.45
Residual      40    148    3.7
Total         45    232
(a)
R2 = 84/232 = 0.3621
36.2% of total variation in y can be explained by the 5 independent variables.
(b)
H0: β1 = β2 = β3 = β4 = β5 = 0   vs   H1: At least one βi ≠ 0 (i = 1,2,3,4,5)
or H0: ρ = 0   vs   H1: ρ ≠ 0
(c)
F-stat = 4.5412 and F-crit = F(0.05,5,40) = 2.45 (See Anova Table)
(d)
Reject H0. Conclude that the overall model is statistically significant.
(i.e. at least one x i is statistically significant in estimating y)
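The R² and F-stat values in this exercise follow directly from the sums of squares and degrees of freedom in the ANOVA table. A minimal Python sketch of that arithmetic is shown below; the variable names are hypothetical and the critical value is copied from the table rather than computed.

```python
# Values taken from the Exercise 13.2 ANOVA table.
ss_regression, df_regression = 84, 5
ss_residual, df_residual = 148, 40

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total           # share of variation explained

ms_regression = ss_regression / df_regression  # mean square for regression
ms_residual = ss_residual / df_residual        # mean square for residual (error)
f_stat = ms_regression / ms_residual

f_crit = 2.45  # F(0.05, 5, 40), as given in the table
overall_model_significant = f_stat > f_crit
```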
Exercise 13.3
             Coefficients   Std Error   t-stat   p-value   Lower 95%   Upper 95%
Intercept    1.82           1.12        1.63     0.1215    -0.53       3.92
A            0.68           0.28        2.44     0.0253    0.09        2.78
B            -2.35          0.984       -2.39    0.0140    -4.42       -0.25
C            0.017          0.012       1.42     0.1737    -0.01       2.12
D            1.96           1.16        1.69     0.1083    -0.48       4.06
t-crit = t(0.05,24-4-1) = t(0.05,19) = 2.093
(a)
For each xi variable (A, B, C and D), test: H0: βi = 0 against H1: βi ≠ 0 for i = A, B, C and D.
(b), (c), (d) t-crit = t(0.05,19) = ±2.093
For A : Since t-stat (2.44) > t-crit (+2.093); or p-value (0.0253) < α (0.05), or
{0.09 ≤ βA ≤ 2.78} does not cover zero, conclude variable A is statistically significant.
For B : Since t-stat (-2.39) < -t-crit (-2.093); or p-value (0.014) < α (0.05), or
{-4.42 ≤ βB ≤ -0.25} does not cover zero, conclude variable B is statistically significant.
For C : Since t-stat (1.42) < t-crit (+2.093); or p-value (0.1737) > α (0.05), or
{-0.01 ≤ βC ≤ 2.12} covers zero, conclude variable C is not statistically significant.
For D : Since t-stat (1.69) < t-crit (+2.093); or p-value (0.1083) > α (0.05), or
{-0.48 ≤ βD ≤ 4.06} covers zero, conclude variable D is not statistically significant.
Exercise 13.4
(a)
For x 3 variable, test: H0: β3 = 0 against H1: β3 ≠ 0
(b)
t-crit = t(0.05,30) = ±2.042. Hence, do not reject H0 if -2.042 ≤ t-stat ≤ +2.042.
(c)
Since t-stat (2.44) > t-crit (+2.042), reject H0 in favour of H1 at α = 0.05.
(d)
Conclude that the x 3 variable is statistically significant in estimating y .
Exercise 13.5
(a)
Holding all other variables constant, a unit increase in x2 will result in a 1.6 reduction in ŷ.
(b)
For x 2 variable, test: H0: β2 = 0 against H1: β2 ≠ 0
Yes, since the 95% confidence interval for β2 does not cover zero.
Exercise 13.6
Binary coding scheme (Choose 'Lean' as the base category)
F1 and F2 are the dummy variable names chosen.
Fuel type    F1    F2
Leaded       1     0
Unleaded     0     1
Lean         0     0
Exercise 13.7
Binary coding scheme (Choose 'spring' as the base category)
S1, S2 and S3 are the dummy variable names chosen.
Season    S1    S2    S3
summer    1     0     0
autumn    0     1     0
winter    0     0     1
spring    0     0     0
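The binary coding schemes in Exercises 13.6 and 13.7 follow one rule: a category with k levels needs k - 1 dummy variables, and the base category is coded as all zeros. The Python sketch below illustrates that rule; the function name `dummy_code` is hypothetical.

```python
def dummy_code(value, categories, base):
    """Return the 0/1 dummy codes for `value`, one per non-base category.

    The base category has no dummy variable of its own, so it is coded as all zeros.
    """
    return [1 if value == category else 0
            for category in categories if category != base]


seasons = ["summer", "autumn", "winter", "spring"]
season_codes = {s: dummy_code(s, seasons, base="spring") for s in seasons}

fuels = ["Leaded", "Unleaded", "Lean"]
fuel_codes = {f: dummy_code(f, fuels, base="Lean") for f in fuels}
```

Each list gives the values of (S1, S2, S3) or (F1, F2) in the order the non-base categories are listed.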
Exercise 13.8
File:
X13.8 - employee absenteeism.xlsx
Excel's Data Analysis - Regression
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.7384
R Square             0.5453
Adjusted R Square    0.5012
Standard Error       3.7204
Observations         35
ANOVA
              df    SS         MS         F-stat    p-value
Regression    3     514.467    171.489    12.390    0.00001707
Residual      31    429.076    13.841
Total         34    943.543
                Coefficients   Standard Error   t-stat    p-value        Lower 95%   Upper 95%
Intercept       36.411         5.929            6.141     8.22345E-07    24.318      48.504
Tenure          0.220          0.065            3.376     0.001997       0.087       0.352
Satisfaction    -0.184         0.109            -1.692    0.100765       -0.406      0.038
Commitment      -0.332         0.080            -4.130    0.000254       -0.497      -0.168
(a)
R2 = 514.467/943.543 = 54.53%
(b)
H0: βT = βS = βC = 0 versus H1: At least one βi ≠ 0
F-crit = F(0.05,3,31) = 2.92
F-stat = 12.39 and p-value = 0.00001707
Reject H0. Conclude the overall model is statistically significant
(i.e. at least one xi is statistically significant in estimating y)
(c), (d), (e)
For each x i variable, test: H0: βi = 0 against H1: βi ≠ 0
with t-crit = t(0.05,31) = ±2.04
Tenure : Since t-stat (3.376) > t-crit (+2.04); or p -value (0.001997) < α (0.05), or
{0.087 ≤ βT ≤ 0.352} does not cover zero, conclude Tenure is statistically significant.
Satisfaction : Since t-stat (-1.692) lies within t-crit (±2.04); or p -value (0.100765) > α (0.05), or
{-0.406 ≤ βS ≤ 0.038} covers zero, conclude Satisfaction is not statistically significant.
Commitment : Since t-stat (-4.13) < t-crit (-2.04); or p -value (0.000254) < α (0.05), or
{-0.497 ≤ βC ≤ -0.168} does not cover zero, conclude Commitment is statistically significant.
(f) (i)
No, organisational commitment is the most important explanatory factor because it has a
larger t-stat value (-4.13) and a smaller p- value (0.000254) than job tenure .
(f) (ii)
No, job satisfaction is not a statistically significant explanatory factor of employee
absenteeism (see (c), (d) and (e) above).
Yes, organisational commitment does play a statistically significant role in explaining
employee absenteeism (see (c), (d) and (e) above).
(g)
y(hat) = 36.411 + 0.22 (48) - 0.184 (50) - 0.332 (60) = 17.8008
Using t-crit = t(0.025,31) = ±2.04; standard error = 3.7204; n = 35
giving margin of error = 2.04 (3.7204)/√35 = 1.2826
Lower 95% confidence limit = 17.8008 - 1.2826 = 16.5182
Upper 95% confidence limit = 17.8008 + 1.2827 = 19.0835
{16.52 ≤ y(hat)(estimated) ≤ 19.08}
Management interpretation: We can be 95% confident that the true average number
of days absent per employee per annum is likely to lie between 16.5 days and 19.1 days.
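The interval arithmetic in (g) can be reproduced as follows. This sketch simply mirrors the calculation used in this solution (margin of error = t-crit × standard error / √n) and is not a general prediction-interval routine; the function name `approx_mean_interval` is hypothetical, and the point estimate is taken as given in (g).

```python
import math


def approx_mean_interval(y_hat, t_crit, std_error, n):
    """Return (lower, upper) limits using margin = t_crit * std_error / sqrt(n)."""
    margin = t_crit * std_error / math.sqrt(n)
    return y_hat - margin, y_hat + margin


# Values from Exercise 13.8 (g): point estimate, t-crit, regression standard error, n.
lower, upper = approx_mean_interval(17.8008, 2.04, 3.7204, 35)
```

Small differences in the fourth decimal place relative to the limits shown in (g) arise from rounding t-crit to 2.04.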
Exercise 13.9
File:
X13.9 - plastics wastage.xlsx
Excel's Data Analysis - Regression
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.8061
R Square             0.6498
Adjusted R Square    0.6109
Standard Error       0.5160
Observations         31
ANOVA
             df    SS         MS        F-stat     p-value
Regression   3     13.3384    4.4461    16.6992    2.466E-06
Residual     27    7.1887     0.2662
Total        30    20.5271
             Coefficients   Standard Error   t-stat     p-value   Lower 95%   Upper 95%
Intercept    1.8179         1.2189           1.4914     0.1474    -0.6830     4.3188
Dexterity    -0.1112        0.0286           -3.8816    0.0006    -0.1700     -0.0524
Speed        0.0173         0.0047           3.6770     0.0010    0.0077      0.0270
Viscosity    1.9189         1.2581           1.5252     0.1388    -0.6625     4.5004
(a)
R2 = 13.3384/20.5271 = 64.98%
(b)
H0: βD = βS = βV = 0 versus H1: At least one βi ≠ 0
F-crit = F(0.05,3,27) = 2.99
F-stat = 16.6992 and p-value = 0.000002466
Reject H0. Conclude the overall model is statistically significant
(i.e. at least one xi is statistically significant in estimating y).
(c), (d), (e)
For each xi variable, test: H0: βi = 0 against H1: βi ≠ 0 with t-crit = t(0.025,27) = ±2.052
Dexterity: Since t-stat (-3.8816) < t-crit (-2.052); or p-value (0.0006) < α (0.05), or
{-0.17 ≤ βD ≤ -0.0524} does not cover zero, conclude Dexterity is statistically
significant.
Speed: Since t-stat (3.677) > t-crit (+2.052); or p-value (0.001) < α (0.05), or
{0.0077 ≤ βS ≤ 0.027} does not cover zero, conclude Speed is statistically significant.
Viscosity: Since -t-crit (-2.052) < t-stat (1.525) < t-crit (+2.052); or p-value (0.1388) > α (0.05), or
{-0.6625 ≤ βV ≤ 4.5} covers zero, conclude Viscosity is not statistically significant.
(f)
The most important factor is operator dexterity (p-value = 0.0006), followed by machine speed
(p-value = 0.001). Plastic viscosity is not a significant influencing factor (p-value = 0.1388).
(g)
y(hat) = 1.8179 - 0.1112 (25) + 0.0173 (200) + 1.9189 (0.25) = 2.9826
Using t-crit = t(0.025,27) = ±2.052; standard error = 0.516; n = 31
giving margin of error = 2.052 (0.516)/√31 = 0.1902
Lower 95% confidence limit = 2.9826 - 0.1902 = 2.7924
Upper 95% confidence limit = 2.9826 + 0.1902 = 3.1728
{2.79% ≤ y(hat)(estimated) ≤ 3.17%}
Management interpretation: We can be 95% confident that the true average % of plastic
wastage per shift is likely to lie between 2.79% and 3.17%.
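The confidence limits in part (g) of Exercise 13.9 can be checked with a short Python sketch. It reproduces only the simplified margin of error used in this solution (t-crit × standard error / √n); the function name is hypothetical, and the t-crit value is typed in from the solution rather than computed.

```python
import math

def approx_confidence_limits(y_hat, t_crit, std_error, n):
    """Return (margin, lower, upper) with margin = t_crit * std_error / sqrt(n)."""
    margin = t_crit * std_error / math.sqrt(n)
    return margin, y_hat - margin, y_hat + margin

# Exercise 13.9 (g): inputs copied from the solution
margin, lower, upper = approx_confidence_limits(
    y_hat=2.9826, t_crit=2.052, std_error=0.516, n=31
)
print(f"margin = {margin:.4f}; limits = ({lower:.4f}, {upper:.4f})")
```

The same function can be reused for part (g) or (j) of the other Chapter 13 exercises by substituting their y(hat), t-crit, standard error and n.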
Exercise 13.10
File: X13.10 - employee performance.xlsx
(a)
Binary coding scheme (Choose 'method C' as the base category)
MA and MB are the dummy variable names chosen.
Method Code   MA   MB
A             1    0
B             0    1
C             0    0
(b)
Sample input data for first 6 consultants (showing the binary coded data)
Consultant   Productivity   Experience   MA   MB
1            24             9            1    0
2            30             4            0    1
3            26             10           1    0
4            37             12           0    1
5            29             10           0    0
6            28             6            0    0
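The binary coding scheme in (a) can be sketched in Python as follows. This is an illustration only: the function name and the "M" prefix argument are hypothetical, and the six method labels are read off the MA/MB columns of the sample data (1,0 = A; 0,1 = B; 0,0 = C).

```python
def dummy_code(category, levels, base, prefix="M"):
    """Return 0/1 indicator variables for every level except the base category."""
    if category not in levels:
        raise ValueError(f"unknown category: {category}")
    return {prefix + lvl: int(category == lvl) for lvl in levels if lvl != base}

# Marketing methods of the first six consultants, implied by the coded sample data
methods = ["A", "B", "A", "B", "C", "C"]
coded = [dummy_code(m, levels=["A", "B", "C"], base="C") for m in methods]
for consultant, row in enumerate(coded, start=1):
    print(consultant, row)
```

The base category (method C) gets no column of its own: it is the case where all dummy variables are zero, so each dummy coefficient is read as a difference from method C.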
Excel's Data Analysis - Regression (using the binary coded data as in (a))
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.7894
R Square             0.6231
Adjusted R Square    0.5524
Standard Error       2.3425
Observations         20
ANOVA
             df    SS         MS        F-stat    p-value
Regression   3     145.151    48.384    8.817     0.001108
Residual     16    87.799     5.487
Total        19    232.95
             Coefficients   Std error   t-stat    p-value     Lower 95%   Upper 95%
Intercept    26.387         1.940       13.601    3.29E-10    22.274      30.499
Experience   0.389          0.167       2.335     0.0329      0.036       0.742
MA           -3.659         1.376       -2.659    0.0172      -6.576      -0.742
MB           1.472          1.336       1.102     0.2867      -1.360      4.304
Estimated multiple regression equation:
y^ = 26.387 + 0.389 Experience - 3.659 MA + 1.472 MB (based on data recoded as in (a))
(c)
R2 = 145.151/232.95 = 62.31%
(d)
H0: βE = βMA = βMB = 0 versus H1: At least one βi ≠ 0
F-crit = F(0.05,3,16) = 3.24 and F-stat = 8.817
p-value = 0.001108
Reject H0. Conclude the overall model is statistically significant
(i.e. at least one xi is statistically significant in estimating y).
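The R Square, Adjusted R Square and F-stat values for Exercise 13.10 can be recomputed from the ANOVA sums of squares. The sketch below uses the standard formulas; the function name is hypothetical, and k denotes the number of independent variables (here 3: Experience, MA, MB).

```python
def anova_summary(ss_regression, ss_residual, n, k):
    """Return (R2, adjusted R2, F-stat) from the ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    r2 = ss_regression / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))
    return r2, adj_r2, f_stat

# Exercise 13.10: SS values from the ANOVA table, n = 20 consultants, k = 3 regressors
r2, adj_r2, f_stat = anova_summary(145.151, 87.799, n=20, k=3)
print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}, F-stat = {f_stat:.3f}")
```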
(e)
For Experience, test: H0: βE = 0 against H1: βE ≠ 0 with t-crit = t(0.025,16) = ±2.12
Since t-stat (2.335) > t-crit (2.12), conclude work experience is statistically significant at α = 0.05.
(f)
For each of MA and MB, test: H0: βi = 0 vs H1: βi ≠ 0 (i = MA, MB), with t-crit = t(0.025,16) = ±2.12
MA: Since t-stat (-2.659) < lower t-crit (-2.12), conclude MA is statistically significant (i.e. adopting
marketing method A results in significantly lower consultant productivity levels, on average,
compared to using marketing method C, the base category).
MB: Since -t-crit (-2.12) < t-stat (1.102) < +t-crit (+2.12), conclude MB is not statistically significant
(i.e. there is no statistically significant difference in average consultant productivity between
marketing method B and marketing method C, the base category).
Overall conclusion: the independent variable 'marketing method' is statistically significant, but
only for marketing method A (when compared to method C, the base category).
Marketing methods B and C can be combined as there is no statistically significant difference
between them with regard to their average productivity levels across consultants.
(g), (h)
For each xi variable, test: H0: βi = 0 against H1: βi ≠ 0
Experience: Since its p-value (0.0329) < α = 0.05, or {0.036 ≤ βE ≤ 0.742} does not cover zero,
conclude that work experience is statistically significant.
MA: Since its p-value (0.0172) < α = 0.05, or {-6.576 ≤ βMA ≤ -0.742} does not cover zero,
conclude that marketing method A is statistically significantly different from marketing method C
(i.e. the base category) in terms of consultants' average productivity levels.
MB: Since its p-value (0.2867) > α = 0.05, or {-1.36 ≤ βMB ≤ 4.304} covers zero,
conclude marketing method B is not statistically significantly different from marketing method C
(i.e. the base category) in terms of consultants' average productivity levels.
(i)
Employ consultants with longer work experience and avoid using marketing method A as it
produces lower productivity levels than either marketing method B or C.
(j)
y^ = 26.387 + 0.389 (8) - 3.659 (0) + 1.472 (1) = 30.9717
Using t-crit = t(0.025,16) = ±2.1199; standard error = 2.3425; n = 20
giving margin of error = 2.1199 (2.3425)/√20 = 1.1104
Lower 95% confidence limit = 30.9717 - 1.1104 = 29.86
Upper 95% confidence limit = 30.9717 + 1.1104 = 32.08
{29.86 deals ≤ y(hat)(estimated) ≤ 32.08 deals}
Management interpretation: The bank management can be 95% confident that the actual
average number of deals closed per month per consultant is likely to lie between 30 and 32 (rounded).
Exercise 13.11
(a)
File: X13.11 - corporate performance.xlsx
Binary coding scheme (Choose region 'KZN' as the base category)
R1 and R2 are the dummy variable names chosen.
Region     Code   R1   R2
Gauteng    1      1    0
Cape       2      0    1
KZN        3      0    0
Binary coding scheme (Choose sector 'Construction' as the base category)
S is the dummy variable name chosen.
Sector         Code   S
Agriculture    1      1
Construction   2      0
(b)
Sample input data for first 6 companies (showing the binary coded data)
ROC(%)   Sales   Margin%   Debt ratio(%)   R1   R2   S
19.7     7178    18.7      28.5            1    0    0
17.2     1437    18.5      24.3            1    0    1
17.1     3948    16.5      65.6            1    0    1
16.6     1672    16.2      26.4            1    0    1
16.6     2317    16.0      20.1            1    0    1
16.5     4123    15.6      46.4            0    1    0
Excel's Data Analysis - Regression (using the binary coded data as in (a))
Y = f(Sales; Margin %; Debt ratio(%); Region; Sector)
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.9125
R Square             0.8327
Adjusted R Square    0.7769
Standard Error       0.9524
Observations         25
ANOVA
             df    SS         MS         F-stat     p-value
Regression   6     81.2595    13.5433    14.9318    0.000004076
Residual     18    16.3261    0.9070
Total        24    97.5856
REGRESSION OUTPUT
                Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept       11.0146        0.8746           12.5936    0.0000    9.1771      12.8520
Sales           0.0002         0.0001           2.1668     0.0439    0.00001     0.00033
Margin%         0.1791         0.0672           2.6656     0.0158    0.03794     0.32030
Debt ratio(%)   0.0091         0.0154           0.5930     0.5606    -0.02321    0.04146
R1              3.1453         0.8389           3.7494     0.0015    1.38288     4.90770
R2              0.9213         0.5862           1.5716     0.1335    -0.31031    2.15295
S               -0.9230        0.4205           -2.1950    0.0415    -1.80651    -0.03956
y^ = 11.0146 + 0.0002 Sales + 0.1791 Margin % + 0.0091 Debt ratio % + 3.1453 R1 + 0.9213 R2 - 0.923 S
(based on data recoded as in (a))
(c)
R2 = 81.2595/97.5856 = 83.27%
(d)
H0: βSales = βM% = βDR = βR1 = βR2 = βSector = 0 versus H1: At least one βi ≠ 0
F-crit = F(0.05,6,18) = 2.66 and F-stat = 14.932
p-value = 0.000004076
Reject H0. Conclude that the overall model is statistically significant
(i.e. at least one xi is statistically significant in estimating y).
(e), (h), (i)
For all variables, test: H0: βi = 0 vs H1: βi ≠ 0 with t-crit = t(0.025,18) = ±2.101
Sales : Since t-stat (2.1668) > t-crit (2.101), or p -value (0.0439) < α = 0.05 or
{0.00001 ≤ βS ≤ 0.00033} does not cover zero, conclude sales is statistically significant.
Margin% : Since t-stat (2.6656) > t-crit (2.101), or p -value (0.0158) < α = 0.05 or
{0.0379 ≤ βM% ≤ 0.3203} does not cover zero, conclude margin% is statistically significant.
Debt ratio% : Since –t-crit (-2.101) < t-stat (0.593) < t-crit (2.101), or p -value (0.5606) > α = 0.05 or
{-0.0232 ≤ βDR% ≤ 0.0415} covers zero, conclude debt ratio% is not statistically significant.
(f)
Region: For each dummy variable R1 and R2, test: H0: βi = 0 against H1: βi ≠ 0 with t-crit = t(0.025,18) = ±2.101
R1 (Gauteng): Since t-stat (3.7494) > upper t-crit (2.101); or its p-value (0.0015) < α = 0.05, or
{1.38288 ≤ βR1 ≤ 4.9077} does not cover zero, conclude that companies that operate in the Gauteng region
have a statistically significantly higher return on capital (%), on average, than companies that operate
in the KZN region (i.e. the base region).
R2 (Cape): Since t-stat (1.5716) < upper t-crit (2.101); or its p-value (0.1335) > α = 0.05, or
{-0.31031 ≤ βR2 ≤ 2.15295} covers zero, conclude that the average return on capital (%) of companies
that operate in the Cape region is not statistically significantly different from that of companies
that operate in the KZN region (i.e. the base region).
Overall conclusion: the independent variable 'region' is statistically significant, but only with
respect to the Gauteng region (when compared to the KZN region, the base region).
The Cape and KZN regions can be merged into a single region as there is no statistically significant
difference in the average return on capital (%) of companies operating within these two regions.
(g)
Sector: For the dummy variable, S, test: H0: βS = 0 against H1: βS ≠ 0 with t-crit = t(0.025,18) = ±2.101
S (Agriculture) : Since t-stat (-2.195) < lower t-crit (-2.101); or its p -value (0.0415) < α = 0.05, or
{-1.80651 ≤ βS ≤ -0.03956} does not cover zero, conclude that companies that operate in the
agricultural sector have a statistically significantly lower return on capital (%), on average, than
companies that operate in the construction sector (i.e. the base sector).
Overall conclusion: the independent variable ‘sector’ is statistically significant.
(j)
Significant performance measures of ROC% are: Sales, Margin%, but not Debt ratio%.
For region, Gauteng has a significantly positive impact on average ROC% compared to Cape and KZN.
The agricultural sector has a significantly negative impact on average ROC% compared to the construction sector.
(j)
y^ = 11.0146 + 0.0002 (8862) + 0.1791 (10) + 0.0091 (22) + 3.1453 (0) + 0.9213 (1) - 0.923 (0) = 15.6995
Using t-crit = t(0.025,18) = ±2.101; standard error = 0.9524; n = 25
giving margin of error = 2.101 (0.9524)/√25 = 0.400198
Lower 95% confidence limit =
15.6995 - 0.400198 =
15.299
Upper 95% confidence limit =
15.6995 + 0.400198 =
16.100
{15.299% ≤ y(hat)(estimated) ≤ 16.10%}
Management interpretation: The investment analyst can be 95% confident that the actual
average return on capital (%) of companies with the given profile lies between 15.3% and 16.1%.
CHAPTER 14
INDEX NUMBERS
MEASURING BUSINESS ACTIVITY
Exercise 14.1
An index number is a single summary value that measures the overall
change in the level of activity of a single item or a basket of related
items from one time period to another.
Example: Consumer Price Index (CPI) - the inflation indicator.
Exercise 14.2
A price index measures changes in price levels over time, holding
quantities constant.
A quantity index measures changes in consumption levels over time, holding
prices constant.
Exercise 14.3
Items are 'weighted' in a basket to reflect the importance (or value) of each
item in the basket relative to the other items in the basket.
Exercise 14.4
Laspeyres weighting method and Paasche weighting method
Exercise 14.5
(i) The purpose (or scope) of the index
(ii) The selection of the basket of items (i.e. the mix of items)
(iii) The choice of item weights
(iv) The choice of a suitable base year
(v) The formulation of a substitution rule
Exercise 14.6
A link relative is a period-on-period (consecutive) change in level of activity.
A price relative is a change in the level of activity of an item in a given period
relative to a base period.
Exercise 14.7
Real (constant) values are found by dividing monetary values by an 'inflation' index.
This removes the influence of price increases.
Real (constant) values refer to the actual purchasing power of money / or the
real (actual) change in the level of activity.
Exercise 14.8
File: X14.8 - motorcycle sales.xlsx
Data
                      2009                          2010
Motorcycle   Unit price   Quantity         Unit price   Quantity
model        (R1000)      (units sold)     (R1000)      (units sold)
A            25           10               30           7
B            15           55               19           58
C            12           32               14           40
(a)
Motorcycle model   p1/p0*100      Price Relative
A                  30/25*100 =    120.0
B                  19/15*100 =    126.7
C                  14/12*100 =    116.7
Interpretation
Model A has risen in price by 20% from 2009 to 2010; model B by 26.7%
and model C by 16.7%.
(b) (i)
Laspeyres Weighted Aggregates Price Index
Motorcycle model   Base Value (p0*q0)   Current Value (p1*q0)
A                  25*10 = 250          30*10 = 300
B                  15*55 = 825          19*55 = 1045
C                  12*32 = 384          14*32 = 448
Totals             1459                 1793
Composite Price Index = 1793/1459*100 = 122.9
(b) (ii)
Laspeyres Weighted Average of Price Relatives Index
Motorcycle model   Base Value (p0*q0)   Price Relative (p1/p0)   Weighted Price Relatives
A                  25*10 = 250          30/25*100 = 120.0        120*250 +
B                  15*55 = 825          19/15*100 = 126.7        126.7*825 +
C                  12*32 = 384          14/12*100 = 116.7        116.7*384 =
Totals             1459                                          179300
Composite Price Index = 179300/1459 = 122.9
(c)
Interpretation
Motorcycle models A, B and C have risen in price by 22.9% on average
from 2009 to 2010.
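The two Laspeyres price index approaches of Exercise 14.8 (weighted aggregates and weighted average of price relatives) can be sketched in Python as below. The function names are hypothetical; the data are the 2009 (base) and 2010 motorcycle prices and the 2009 quantities.

```python
def laspeyres_price_index(p0, p1, q0):
    """Weighted aggregates: 100 * sum(p1*q0) / sum(p0*q0)."""
    current_value = sum(p * q for p, q in zip(p1, q0))
    base_value = sum(p * q for p, q in zip(p0, q0))
    return 100 * current_value / base_value

def laspeyres_price_index_from_relatives(p0, p1, q0):
    """Weighted average of price relatives, weighted by base values p0*q0."""
    base_values = [p * q for p, q in zip(p0, q0)]
    relatives = [100 * new / old for old, new in zip(p0, p1)]
    return sum(r * w for r, w in zip(relatives, base_values)) / sum(base_values)

p0, q0 = [25, 15, 12], [10, 55, 32]   # 2009 prices (R1000) and units sold, models A, B, C
p1 = [30, 19, 14]                     # 2010 prices (R1000)

aggregates = laspeyres_price_index(p0, p1, q0)
relatives = laspeyres_price_index_from_relatives(p0, p1, q0)
print(round(aggregates, 1), round(relatives, 1))
```

Both approaches give the same index when unrounded price relatives are used, which is why (b)(i) and (b)(ii) agree.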
Exercise 14.9
File: X14.8 - motorcycle sales.xlsx
Data
                      2009                          2010
Motorcycle   Unit price   Quantity         Unit price   Quantity
model        (R1000)      (units sold)     (R1000)      (units sold)
A            25           10               30           7
B            15           55               19           58
C            12           32               14           40
(a)
Motorcycle model   q1/q0*100      Quantity Relative
A                  7/10*100 =     70.0
B                  58/55*100 =    105.5
C                  40/32*100 =    125.0
Interpretation
Model A unit sales dropped 30% from 2009 to 2010; model B unit sales rose
by 5.45%; while model C unit sales rose the most by 25% over the year.
(b) (i)
Laspeyres Weighted Aggregates Quantity Index
Motorcycle model   Base Value (p0*q0)   Current Value (p0*q1)
A                  25*10 = 250          25*7 = 175
B                  15*55 = 825          15*58 = 870
C                  12*32 = 384          12*40 = 480
Totals             1459                 1525
Composite Quantity Index = 1525/1459*100 = 104.5
(b) (ii)
Laspeyres Weighted Average of Quantity Relatives Index
Motorcycle model   Base Value (p0*q0)   Quantity Relative (q1/q0)   Weighted Quantity Relatives
A                  25*10 = 250          7/10*100 = 70.0             70*250 +
B                  15*55 = 825          58/55*100 = 105.5           105.5*825 +
C                  12*32 = 384          40/32*100 = 125.0           125*384 =
Totals             1459                                             152537.5
Composite Quantity Index = 152537.5/1459 = 104.5
(c)
Interpretation
Unit sales of motorcycle models A, B and C have risen by 4.5% on average
from 2009 to 2010.
Exercise 14.10
File: X14.8 - motorcycle sales.xlsx
Data
                      2009                          2010
Motorcycle   Unit price   Quantity         Unit price   Quantity
model        (R1000)      (units sold)     (R1000)      (units sold)
A            25           10               30           7
B            15           55               19           58
C            12           32               14           40
(a) (i)
Paasche Weighted Aggregates Quantity Index
Motorcycle model   Base Value (p1*q0)   Current Value (p1*q1)
A                  30*10 = 300          30*7 = 210
B                  19*55 = 1045         19*58 = 1102
C                  14*32 = 448          14*40 = 560
Totals             1793                 1872
Composite Quantity Index = 1872/1793*100 = 104.41
(a) (ii)
Paasche Weighted Average of Quantity Relatives Index
Motorcycle model   Base Value (p1*q0)   Quantity Relative (q1/q0)   Weighted Quantity Relatives
A                  30*10 = 300          7/10*100 = 70.0             70*300 +
B                  19*55 = 1045         58/55*100 = 105.5           105.5*1045 +
C                  14*32 = 448          40/32*100 = 125.0           125*448 =
Totals             1793                                             187200
Composite Quantity Index = 187200/1793 = 104.41
(b)
Interpretation
Unit sales of motorcycle models A, B and C have risen by 4.41% on average
from 2009 to 2010.
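The Laspeyres and Paasche quantity indexes of Exercises 14.9 and 14.10 differ only in which period's prices are held constant as weights. A minimal Python sketch, with a hypothetical function name, makes this explicit:

```python
def quantity_index(prices, q0, q1):
    """Weighted aggregates quantity index: 100 * sum(p*q1) / sum(p*q0)."""
    return 100 * sum(p * q for p, q in zip(prices, q1)) / sum(
        p * q for p, q in zip(prices, q0)
    )

p0, q0 = [25, 15, 12], [10, 55, 32]   # 2009 prices (R1000) and units sold
p1, q1 = [30, 19, 14], [7, 58, 40]    # 2010 prices (R1000) and units sold

laspeyres = quantity_index(p0, q0, q1)  # base-period (2009) prices as weights
paasche = quantity_index(p1, q0, q1)    # current-period (2010) prices as weights
print(round(laspeyres, 1), round(paasche, 2))
```

Swapping the roles of prices and quantities in the call gives the corresponding price indexes.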
Exercise 14.11
File: X14.11 - Telkom services.xlsx
Data
                           2009                           2010                           2011
Telkom Services   Unit price     Quantity        Unit price     Quantity        Unit price     Quantity
                  (cents/call)   (100's calls)   (cents/call)   (100's calls)   (cents/call)   (100's calls)
TalkPlus          65             14              70             18              55             17
SmartAccess       35             27              40             29              45             24
ISDN              50             16              45             22              40             32
(a)
Telkom Services   Price relatives (2010)   Price relatives (2011)
TalkPlus          70/65*100 = 107.7        55/65*100 = 84.6
SmartAccess       40/35*100 = 114.3        45/35*100 = 128.6
ISDN              45/50*100 = 90.0         40/50*100 = 80.0
Interpretation
TalkPlus services increased in price by 7.7% from 2009 to 2010, but by 2011 the price was
15.4% below its 2009 level.
SmartAccess, on the other hand, showed an increase in price of 14.3% from 2009 to 2010
and of 28.6% from 2009 to 2011.
ISDN showed a decrease in price of 10% in the first year (from 2009 to 2010) and of 20%
over the two-year period from 2009 to 2011.
(b)
Laspeyres Weighted Aggregates Price Index
Telkom Services           Base Value   2010 Value   2011 Value
                          (p0*q0)      (p1*q0)      (p1*q0)
TalkPlus                  910          980          770
SmartAccess               945          1080         1215
ISDN                      800          720          640
Totals                    2655         2780         2625
Composite Price Indexes                104.71       98.87
Interpretation
(based on the Laspeyres approach)
The cost of Telkom services increased, on average, by 4.71% from 2009 to 2010,
while there was a net reduction in costs, on average, of 1.13% from 2009 to 2011.
(c)
Paasche Weighted Aggregates Quantity Index
                             2010 Prices                2011 Prices
Telkom Services              Base Value   2010 Value    Base Value   2011 Value
                             (p1*q0)      (p1*q1)       (p1*q0)      (p1*q1)
TalkPlus                     980          1260          770          935
SmartAccess                  1080         1160          1215         1080
ISDN                         720          990           640          1280
Totals                       2780         3410          2625         3295
Composite Quantity Indexes                122.7                      125.5
(d)
Interpretation
(based on the Paasche approach)
The printing company's usage of Telkom services has risen by 22.7% over one
year (from 2009 to 2010) and by 25.5%, on average, over two years (from 2009 to 2011).
Exercise 14.12
Data
File: X14.12 - computer personnel.xlsx
                   Annual salary (in R10 000)   No. of employees
Job categories     2008      2011               2008      2011
Systems analyst    42        50                 84        107
Programmer         29        36                 96        82
Network manager    24        28                 58        64
(a)
Laspeyres Weighted Aggregate Salary Index
IT Job Category                    Base Value     Current Value
                                   2008 (p0*q0)   2011 (p1*q0)
Systems analyst                    3528           4200
Programmer                         2784           3456
Network manager                    1392           1624
Totals                             7704           9280
Laspeyres Composite Salary Index                  120.46
(b)
Interpretation
On average, the overall remuneration has increased by
20.46% over the 3-year period from 2008 to 2011.
(c)
IT Job Category    Price relatives (p1/p0)
Systems analyst    119.05
Programmer         124.14
Network manager    116.67
Interpretation
Programmers have enjoyed the largest percentage increase
in remuneration of 24.14% from 2008 to 2011.
Exercise 14.13
Data
Job categories
Systems analyst
Programmer
Network manager
(a)
IT Job Category
Systems analyst
Programmer
Network manager
File:
X14.12 - computer personnel.xlsx
Annual salary (in R10 000)
2008
2011
42
50
29
36
24
28
No. of employees
2008
2011
84
107
96
82
58
64
Staff Relatives
(q1/q0)
127.4
85.4
110.3
Interpretation
The staff complement of Systems Analysts grew by 27.4% and that of Network
managers grew by 10.3% over the period from 2008 to 2011. The number of
Programmers, on the other hand, fell by 14.6% over this same period.
(b) (i)
Laspeyres Quantity index (Weighted Aggregates approach)
IT Job Category
Systems analyst
Programmer
Network manager
Base Value
2008 (p0*q0)
3528
2784
1392
7704
Current Value
2011 (p0*q1)
4494
2378
1536
8408
Composite Staff Complement (Quantity) Index
109.14
(b) (ii) Laspeyres Quantity index (Weighted Average of Relatives approach)
IT Job Category
Systems analyst
Programmer
Network manager
Base Value 2008 Staff Relatives
(p0*q0)
(q1/q0)
3528
2784
1392
7704
127.38
85.42
110.34
Composite Staff Complement (Quantity) Index
(c)
Weighted
Ave of
Quantity
Relatives
449400
237800
153600
840800
109.14
Interpretation
The overall IT staff complement across all job categories has increased by an
average of 9.14% from 2008 to 2011.
Exercise 14.14
File:
X14.14 - printer cartridges.xlsx
Data
Printer
cartridges
HQ21
HQ25
HQ26
HQ32
(a)
2008
Quantity
Unit price
used
145
24
172
37
236
12
314
10
Print
Cartridges
HQ26
HQ32
Unit price
(2008)
236
314
2009
2010
Unit price
Quantity used
Unit price
155
165
255
306
28
39
12
8
149
160
262
299
Unit price
(2010)
262
299
Quantity
used
36
44
14
11
Price relative 2010
262/236% =
299/314% =
111.02
95.22
Interpretation
The price of the HQ26 printer cartridge has increased by 11.02% from 2008 to 2010.
The price of the HQ32 printer cartridge has decreased by 4.78% from 2008 to 2010.
(b) (i)
Paasche Weighted Aggregates Price Index
Printer
cartridges
HQ21
HQ25
HQ26
HQ32
Totals
2009 Prices
Base Value Current Value
(2008)
(2009)
4060
4340
6708
6435
2832
3060
2512
2448
16112
16283
Composite Price Indexes
(b) (ii)
2010 Prices
Base Value
Current Value
(2008)
(2010)
5220
5364
7568
7040
3304
3668
3454
3289
19546
19361
101.06
99.05
Paasche Weighted Average of Relatives Price Index - for 2009
Printer
cartridges
Base Value
(2008)
HQ21
HQ25
HQ26
HQ32
Totals
4060
6708
2832
2512
16112
Price relative
(2009)
106.9
95.9
108.1
97.5
Paasche Composite Price Index =
Weighted Ave
(2009)
434000
643500
306000
244800
1628300
101.06
Paasche Weighted Average of Relatives Price Index - for 2010
Printer
cartridges
Base Value
(2008)
HQ21
HQ25
HQ26
HQ32
Totals
5220
7568
3304
3454
19546
Price relative
(2010)
102.8
93.0
111.0
95.2
Paasche Composite Price Index
Weighted Ave
(2010)
536400
704000
366800
328900
1936100
99.05
(c)
Interpretation
The average price of print cartridges increased marginally by 1.06% from 2008 to 2009.
However, the average price of print cartridges decreased by 0.95% from 2008 to 2010.
(d)
Composite Link Relatives
Composite price indexes
Composite link relatives
2008
100
2009
101.06
101.06
2010
99.05
98.01
Exercise 14.15
Data
File: X14.15 - electrical goods.xlsx
                                           2004    2005    2006    2007    2008    2009    2010
Composite Price Index (Electrical goods)   88      96      100     109     114     112     115
(a)
Reset base year to 2009 (each index value is multiplied by 100/112)
                                           2004    2005    2006    2007    2008    2009    2010
Composite Price Index (Electrical goods)   78.6    85.7    89.3    97.3    101.8   100     102.7
Interpretation
In 2004, the average price of electrical goods was 21.4% below the 2009 (base) price levels,
while in 2006, it was only 10.7% below the 2009 (base) price levels.
In 2010, prices were 2.7% higher on average than in the base period of 2009.
(b)
Link relatives
                                           2004    2005    2006    2007    2008    2009    2010
Composite Price Index (Electrical goods)   100     109.1   104.2   109     104.6   98.2    102.7
Interpretation
The annual average price changes in electrical goods from 2004 onwards were 9.1% (for 2005);
4.2% (for 2006); 9% (for 2007); 4.6% (for 2008); -1.8% (decrease in 2009); and 2.7% (for 2010).
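Re-basing an index series and deriving its link relatives, as done in Exercise 14.15, can be sketched in Python as follows. The function names are hypothetical; the series is the electrical goods composite price index for 2004 to 2010.

```python
def rebase(index_series, new_base_position):
    """Rescale the series so the value at new_base_position becomes 100."""
    base_value = index_series[new_base_position]
    return [100 * value / base_value for value in index_series]

def link_relatives(index_series):
    """Period-on-period relatives: 100 * current / previous (first period has none)."""
    return [100 * cur / prev for prev, cur in zip(index_series, index_series[1:])]

years = [2004, 2005, 2006, 2007, 2008, 2009, 2010]
index = [88, 96, 100, 109, 114, 112, 115]

rebased = [round(v, 1) for v in rebase(index, years.index(2009))]
links = [round(v, 1) for v in link_relatives(index)]
print(rebased)
print(links)
```

Re-basing does not change the link relatives, because every value is scaled by the same factor.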
Exercise 14.16
Data
File: X14.16 - insurance claims.xlsx
Year    Federal Insurance   Baltic Insurance
        (base = 2008)       (base = 2009)
2006    92.3                93.7
2007    95.4                101.1
2008    100                 98.2
2009    102.6               100
2010    109.4               104.5
2011    111.2               107.6
(a)
Year    Federal Insurance   Baltic Insurance
        (base = 2010)       (base = 2010)
2006    84.4                89.7
2007    87.2                96.7
2008    91.4                94
2009    93.8                95.7
2010    100                 100
2011    101.6               103
(b)
Federal Insurance showed an increase of 8.6/91.4 = 9.41% from 2008 to 2010.
Baltic Insurance showed an increase of 6.0/94 = 6.4% from 2008 to 2010.
Hence Federal Insurance showed the bigger claims increase from 2008 to 2010.
(c)
Link Relatives
Year    Federal Insurance   Baltic Insurance
2006    100                 100
2007    103.4               107.9
2008    104.8               97.1
2009    102.6               101.8
2010    106.6               104.5
2011    101.6               103
(d)
Baltic showed a year-on-year increase of 3% (2010 to 2011), while
Federal showed a year-on-year increase of only 1.6%. Hence Baltic Insurance.
(e)
Geometric mean
(f)
Federal: (1.034*1.048*1.026*1.066*1.016)^(1/5) - 1 = 3.785%
Baltic: (1.079*0.971*1.018*1.045*1.03)^(1/5) - 1 = 2.799%
Federal Insurance's claims processed increased by an average of 3.785% annually
between 2006 and 2011.
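The geometric mean of the link relatives in Exercise 14.16 can be verified with a short Python sketch. The function name is hypothetical; the inputs are the 2007 to 2011 link relatives for each insurer.

```python
import math

def average_growth_rate(link_relatives):
    """Geometric mean of link relatives (given as percentages), minus 1."""
    factors = [lr / 100 for lr in link_relatives]
    return math.prod(factors) ** (1 / len(factors)) - 1

federal = average_growth_rate([103.4, 104.8, 102.6, 106.6, 101.6])
baltic = average_growth_rate([107.9, 97.1, 101.8, 104.5, 103.0])
print(f"Federal: {federal:.3%}  Baltic: {baltic:.3%}")
```

The geometric mean is used rather than the arithmetic mean because year-on-year changes compound multiplicatively.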
Exercise 14.17
File:
Data
Food items in micro
market basket
Milk (litres)
Bread (loaves)
Sugar (kg)
Maize meal (kg)
(a)
Food items
Milk (litres)
Bread (loaves)
Sugar (kg)
Maize meal (kg)
Unit prices (in Rand)
2010
2011
7.29
7.89
4.25
4.45
2.19
2.45
5.25
5.59
2010
7.29
4.25
2.19
5.25
2011
7.89
4.45
2.45
5.59
X14.17 - micro-market basket.xlsx
Consumption
2010
2011
117
98
56
64
28
20
58
64
Price Relatives
108.23
104.71
111.87
106.48
Interpretation
The price of milk rose by 8.23%; bread by 4.71%; sugar by 11.87% and
maize meal by 6.48% per unit of measure from 2010 to 2011.
The largest price change (increase) was sugar with an 11.87% increase.
(b)
Paasche Weighted Average of Price Relatives Index
Food items
Milk (litres)
Bread (loaves)
Sugar (kg)
Maize meal (kg)
Total
Base Value
(p0*q1)
714.42
272
43.8
336
1366.22
Paasche Composite Price Index
Price Relatives
108.23
104.71
111.87
106.48
146478/1366.22 =
Weighted
Average
77322
28480
4900
35776
146478
107.21
On average, the price of the micro-basket of items increased by 7.21% from 2010 to 2011.
(c)
Food items
2010
2011
Milk (litres)
Bread (loaves)
Sugar (kg)
Maize meal (kg)
117
56
28
58
98
64
20
64
Quantity
Relatives
83.8
114.3
71.4
110.3
Interpretation
The consumption of milk decreased by 16.2%; bread consumption rose by 14.3%;
sugar consumption decreased significantly by 28.6% and maize meal consumption
rose by 10.3% from 2010 to 2011 respectively.
The largest consumption change (decrease) was sugar with a 28.6% decrease.
It is interesting to note that sugar showed the largest unit price increase while
simultaneously recording the largest decrease in consumption from 2010 to 2011.
(d)
Paasche Weighted Average of Quantity (Consumption) Relatives Index
Food items
Milk (litres)
Bread (loaves)
Sugar (kg)
Maize meal (kg)
Total
Base Value
(p1*q0)
923.13
249.20
68.60
324.22
1565.15
Paasche Composite Quantity Index
Quantity Relatives
83.8
114.3
71.4
110.3
146478/1565.15 =
Weighted
Average
77322
28480
4900
35776
146478
93.6
On average, consumption of the micro-basket items dropped by 6.4% from 2010 to 2011.
Exercise 14.18
Data
File: X14.18 - utilities usage.xlsx
               Prices (in Rand / unit)      Consumption (No. of units)
Utilities      2008     2009     2010       2008     2009     2010
Electricity    1.97     2.05     2.09       745      812      977
Sewage         0.62     0.68     0.72       68       56       64
Water          0.29     0.31     0.35       296      318      378
Telephone      1.24     1.18     1.06       1028     1226     1284
(a)
Utilities      Unit Price 2008   Unit Price 2010   Price relative 2010
Electricity    1.97              2.09              106.1
Sewage         0.62              0.72              116.1
Water          0.29              0.35              120.7
Telephone      1.24              1.06              85.5
Interpretation
From 2008 to 2010, electricity increased by 6.1%; sewage costs by 16.1%;
water by 20.7%; while telephone costs decreased by 14.5% over this period.
Electricity showed the smallest change (increase) of 6.1% from 2008 to 2010.
(b) (i) Laspeyres Weighted Aggregates Price Index
Utilities                           Base Value   Current Value   Current Value
                                    (2008)       (2009)          (2010)
Electricity                         1467.7       1527.3          1557.1
Sewage                              42.2         46.2            49.0
Water                               85.8         91.8            103.6
Telephone                           1274.7       1213.0          1089.7
Totals                              2870.4       2878.3          2799.3
Laspeyres Composite Price Indexes                100.3           97.5
(b) (ii) Laspeyres Weighted Average of Price Relatives Index
Utilities                           Base Value   Price relative   Price relative   Weighted     Weighted
                                    (2008)       (2009)           (2010)           Ave (2009)   Ave (2010)
Electricity                         1467.7       104.1            106.1            152782.4     155717.7
Sewage                              42.2         109.7            116.1            4625.0       4894.8
Water                               85.8         106.9            120.7            9176.3       10360.9
Telephone                           1274.7       95.2             85.5             121353.3     108988.6
Totals                              2870.4                                         287937.0     279961.9
Laspeyres Composite Price Indexes                                                  100.3        97.5
(c)
Interpretation
There was virtually no change in the average cost of household utilities between
2008 and 2009. A slight increase of only 0.3% was recorded.
From 2008 to 2010, however, the average cost of household utilities actually
decreased marginally by 2.5%.
Exercise 14.19
Data
File: X14.18 - utilities usage.xlsx
               Prices (in Rand / unit)      Consumption (No. of units)
Utilities      2008     2009     2010       2008     2009     2010
Electricity    1.97     2.05     2.09       745      812      977
Sewage         0.62     0.68     0.72       68       56       64
Water          0.29     0.31     0.35       296      318      378
Telephone      1.24     1.18     1.06       1028     1226     1284
(a)
Utilities      Usage 2008   Usage 2010   Quantity relative 2010
Electricity    745          977          131.1
Sewage         68           64           94.1
Water          296          378          127.7
Telephone      1028         1284         124.9
Interpretation
Consumption of electricity, water and telephone increased by 31.1%; 27.7% and
24.9% respectively from 2008 to 2010. Only sewage showed a decline in usage
by 5.9% from 2008 to 2010.
(b) (i)
Laspeyres Weighted Aggregates Quantity (Consumption) Index
Utilities                              Base Value   Current Value   Current Value
                                       (2008)       (2009)          (2010)
Electricity                            1467.7       1599.6          1924.7
Sewage                                 42.2         34.7            39.7
Water                                  85.8         92.2            109.6
Telephone                              1274.7       1520.2          1592.2
Totals                                 2870.4       3246.8          3666.2
Laspeyres Composite Quantity Indexes                113.1           127.7
(b) (ii)
Laspeyres Weighted Average of Quantity Relatives Index
Utilities                              Base Value   Quantity relative   Quantity relative   Weighted     Weighted
                                       (2008)       (2009)              (2010)              Ave (2009)   Ave (2010)
Electricity                            1467.7       109.0               131.1               159959       192468
Sewage                                 42.2         82.4                94.1                3474         3968
Water                                  85.8         107.4               127.7               9219         10962
Telephone                              1274.7       119.3               124.9               152074       159213
Totals                                 2870.4                                               324726       366610
Laspeyres Composite Quantity Indexes                                                        113.1        127.7
(c)
Interpretation
On average, household consumption of utilities increased by 13.1% in 2009 from 2008
and also showed an overall consumption increase of 27.7% from 2008 to 2010.
Exercise 14.20
File: X14.20 - leather goods.xlsx
Data
                       2005    2006    2007    2008    2009    2010    2011
Composite cost index   97      92      100     102     107     116     112
(a)
Re-base to 2009
Adjustment factor = 100/107 = 0.934579
                       2005    2006    2007    2008    2009    2010    2011
Composite cost index   90.7    86      93.5    95.3    100     108.4   104.7
(b)
[Line chart: Composite cost index (re-based, 2009 = 100) by year, 2005 to 2011; vertical axis 'Cost Index' from 60 to 120]
Interpretation
Prior to 2009, the average annual cost of leather goods inputs was below the 2009 level,
but showed a steady increase towards 2009 prices over this period.
Relative to 2009 unit costs, the cost of leather
goods inputs was higher by 8.4% and 4.7% respectively for 2010 and 2011.
(c)
The average increase in unit costs of leather goods inputs was only 4.7% between
2009 and 2011.
(d)
Link Relatives
                       2005    2006    2007    2008    2009    2010    2011
Composite cost index   100     94.8    108.7   102.0   104.9   108.4   96.6
The largest year-on-year change in overall unit costs of leather goods inputs was
between 2006 and 2007 with an increase of 8.7%.
(e)
Geometric mean
(0.948*1.087*1.02*1.049*1.0841*0.966)^(1/6) = 1.024244
(i.e. 2.424% increase on average per year)
Excel: =GEOMEAN(0.9485,1.087,1.02,1.049,1.0841,0.9655) = 1.024244
Exercise 14.21
(a)
(b)
(c)
File: X14.21 - accountants' salaries.xlsx
Year    Salary   CPI    Real Salary
2005    387      95     407.4
2006    406      100    406.0
2007    422      104    405.8
2008    448      111    403.6
2009    466      121    385.1
2010    496      126    393.7
2011    510      133    383.5
Interpretation
The base real salary is R406 (in R1000s). Since 2006, real salaries have
declined relative to 2006 (base) and are continuing to fall further behind inflation (CPI).
                        Link Relatives
Year    Salary   CPI    Salary   CPI      Diff (Sal - CPI)
2005    387      95
2006    406      100    104.9    105.3    -0.4
2007    422      104    103.9    104.0    -0.1
2008    448      111    106.2    106.7    -0.6
2009    466      121    104.0    109.0    -5.0
2010    496      126    106.4    104.1    2.3
2011    510      133    102.8    105.6    -2.7
Interpretation
On a year-on-year basis, salary increases have lagged behind the inflation rate (CPI)
in all years except 2010 when salary adjustments exceeded CPI by 2.3%.
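The real salary column in Exercise 14.21 is obtained by deflating each nominal salary by that year's CPI (base 2006 = 100). A minimal Python sketch, with a hypothetical function name:

```python
def deflate(nominal_values, price_index):
    """Real (constant-price) values: 100 * nominal / index, period by period."""
    return [100 * value / index for value, index in zip(nominal_values, price_index)]

salary = [387, 406, 422, 448, 466, 496, 510]   # R1000s, 2005 to 2011
cpi = [95, 100, 104, 111, 121, 126, 133]       # base 2006 = 100

real_salary = [round(v, 1) for v in deflate(salary, cpi)]
print(real_salary)
```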
Exercise 14.22
Data
File: X14.22 - school equipment.xlsx
                       2003    2004    2005    2006    2007    2008    2009    2010
Composite cost index   94.8    97.6    100     105.2   108.5   113.9   116.7   121.1
(a)
Re-base to 2008
Adjustment factor = 100/113.9 = 0.878
                       2003    2004    2005    2006    2007    2008    2009    2010
Composite cost index   83.2    85.7    87.8    92.4    95.3    100     102.5   106.3
(b) and (e)
Plot and Interpretation of Re-based Cost Index Series
[Line chart: Composite cost index (re-based, 2008 = 100) by year, 2003 to 2010; vertical axis from 75 to 110]
Interpretation
The average cost of school equipment has shown a steady increase of 3% to 5%
annually over the period 2003 to 2010.
(c)
Link Relatives
                        2003    2004      2005      2006      2007      2008      2009      2010
Composite price index   100     102.95    102.46    105.20    103.14    104.98    102.46    103.77
Interpretation
The link relatives confirm the year-on-year increases in the average cost of school
equipment of roughly 3% to 5% p.a. (from 2.46% to 5.20%).
(d)
The 2003 budget must be adjusted annually by the link relative (year-on-year) indexes.
Therefore multiply each previous year's budget by the next year's link relative index.
Year    Budget
2003    5000000
2004    5147500
2005    5274129
2006    5548384
2007    5722603
2008    6007589
2009    6155376
2010    6387434
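The budget escalation in Exercise 14.22 (d) can be sketched in Python as follows. The function name is hypothetical, and the sketch carries unrounded budgets from year to year, so later values may differ from the table by a rand or two.

```python
def escalate_budget(initial_budget, link_relatives):
    """Chain a starting budget forward using year-on-year link relatives (%)."""
    budgets = [initial_budget]
    for relative in link_relatives:
        budgets.append(budgets[-1] * relative / 100)
    return budgets

# Link relatives for 2004 to 2010 from part (c)
links = [102.95, 102.46, 105.20, 103.14, 104.98, 102.46, 103.77]
budgets = escalate_budget(5_000_000, links)
for year, budget in zip(range(2003, 2011), budgets):
    print(year, round(budget))
```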
Exercise 14.23
Data
Coffee types
Java
Colombia
Sumatra
Mocha
(a)
Coffee types
Java
Colombia
Sumatra
Mocha
File:
Unit price
(2007)
85
64
115
38
Quantity
(2007)
52
75
18
144
Unit price
(2007)
85
64
115
38
Unit price
(2010)
98
74
133
42
X14.23 - coffee imports.xlsx
Unit price
(2010)
98
74
133
42
Quantity
(2010)
46
90
20
168
Price Relatives
115.3
115.6
115.7
110.5
Interpretation
The coffee types of Java, Colombia and Sumatra increased by over 15% from
2007 to 2010, while Mocha increased by only 10.5% over this same period.
(b)
Laspeyres Composite Price Index (Weighted Aggregates approach)
Coffee type    Base Value (2007)    Current Value (2010)
Java           4420     5096
Colombia       4800     5550
Sumatra        2070     2394
Mocha          5472     6048
Total          16762    19088
Laspeyres Composite Price Index = (19088/16762) x 100 = 113.9
Interpretation
The cost of coffee imports has increased by 13.9% on average from 2007 to 2010.
(c)
Cheaper.
The cost of coffee imports has increased by only 13.9% on average, since 2007.
(d)
Laspeyres Composite Quantity Index (Weighted Aggregates approach)
Coffee type    Base Value (2007)    Current Value (2010)
Java           4420     3910
Colombia       4800     5760
Sumatra        2070     2300
Mocha          5472     6384
Total          16762    18354
Laspeyres Composite Quantity Index = (18354/16762) x 100 = 109.5
Interpretation
The quantity of coffee imported has increased by 9.5% on average from 2007 to 2010.
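The two weighted-aggregate calculations in Exercise 14.23 follow the same pattern and can be checked with a short Python sketch. The function names below are illustrative assumptions; the data are the coffee prices and quantities given above.

```python
# Unit prices (p) and quantities (q) for the base year 2007 (0) and 2010 (1).
p0 = {"Java": 85, "Colombia": 64, "Sumatra": 115, "Mocha": 38}
q0 = {"Java": 52, "Colombia": 75, "Sumatra": 18, "Mocha": 144}
p1 = {"Java": 98, "Colombia": 74, "Sumatra": 133, "Mocha": 42}
q1 = {"Java": 46, "Colombia": 90, "Sumatra": 20, "Mocha": 168}

def laspeyres_price_index(p0, p1, q0):
    """Current prices at base quantities over base prices at base quantities."""
    current = sum(p1[k] * q0[k] for k in p0)
    base = sum(p0[k] * q0[k] for k in p0)
    return 100 * current / base

def laspeyres_quantity_index(p0, q0, q1):
    """Current quantities at base prices over base quantities at base prices."""
    current = sum(p0[k] * q1[k] for k in p0)
    base = sum(p0[k] * q0[k] for k in p0)
    return 100 * current / base

price_index = laspeyres_price_index(p0, p1, q0)
quantity_index = laspeyres_quantity_index(p0, q0, q1)
print(round(price_index, 1), round(quantity_index, 1))
```

Both indexes share the same base value (16762); only the numerator changes, holding quantities fixed for the price index and prices fixed for the quantity index.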
Exercise 14.24
Data
File: X14.24 - medical claims.xlsx
Claim Type     Ave Value (R) (2008)    Claims (2008)    Ave Value (R) (2010)    Claims (2010)
GPs            220    20    255    30
Specialists    720    30    822    25
Dentists       580    10    615    15
Medicines      400    50    438    70
(a)
Laspeyres Composite Quantity Index (Weighted Aggregates approach)
(Use the Laspeyres Approach as the default method if none is specified)
Claim type     Base Value (2008)    Current Value (2010)
GP's           4400     6600
Specialists    21600    18000
Dentists       5800     8700
Medicines      20000    28000
Total          51800    61300
Laspeyres Composite Quantity Index = (61300/51800) x 100 = 118.34
Interpretation
The number of claims received has increased by 18.34% from 2008 to 2010.
(b)
Claim type     Claims (2008)    Claims (2010)    Quantity relatives
GP's           20    30    150.0
Specialists    30    25    83.33
Dentists       10    15    150.0
Medicines      50    70    140.0
Interpretation
GP's and Dentists showed the biggest increase in number of claims (both 50% increase).
(c)
Laspeyres Composite Price (Claims Value) Index (Weighted Aggregates approach)
Claim type     Base Value (2008)    Current Value (2010)
GP's           4400     5100
Specialists    21600    24660
Dentists       5800     6150
Medicines      20000    21900
Total          51800    57810
Laspeyres Composite Price Index = (57810/51800) x 100 = 111.6
(d)
Interpretation
The value of claims received between 2008 and 2010 has risen by 11.6%.
Exercise 14.25
Data
File: X14.25 - tennis shoes.xlsx
Shoe Model    Unit price (2009)    Pairs sold (2009)    Unit price (2010)    Pairs sold (2010)
Trainer       320    96     342    110
Balance       445    135    415    162
Dura          562    54     595    48
(a)
Quantity Relatives
Shoe model    Pairs sold (2009)    Pairs sold (2010)    Quantity relatives
Trainer       96     110    114.6
Balance       135    162    120.0
Dura          54     48     88.9
Interpretation
Dura's sales volume is down by 11.1% from 2009.
(b)
Laspeyres Weighted Aggregates Quantity (Volume) Index
Shoe model    Base Value (2009)    Current Value (2010)
Trainer       30720     35200
Balance       60075     72090
Dura          30348     26976
Total         121143    134266
Composite Quantity Index = 134266/121143*100 = 110.83
(c)
Interpretation
The average increase in shoes sold from 2009 to 2010 was only 10.83%.
Overall, sales volumes do not meet the required growth of at least 12% p.a.
Exercise 14.26
Data
File: X14.26 - energy fund.xlsx
Year    Fuel cost index
2001    100.0
2002    116.2
2003    122.4
2004    132.1
2005    135.7
2006    140.3
2007    142.8
2008    146.9
2009    153.4
2010    160.5
(a) Re-base to 2006
Adjustment factor = 100/140.3 = 0.7128
Year    Fuel cost index (re-based to 2006)
2001    71.28
2002    82.83
2003    87.25
2004    94.16
2005    96.73
2006    100.00
2007    101.79
2008    104.71
2009    109.34
2010    114.40
(b) Link Relatives
Year    Fuel cost index (link relative)
2001    100.00
2002    116.20
2003    105.34
2004    107.92
2005    102.73
2006    103.39
2007    101.78
2008    102.87
2009    104.42
2010    104.63
Line Graph - Fuel Cost Index, 2001 - 2010
(c) Geometric Mean
Geometric mean = ⁹√(1.162*1.0534*1.0792*1.0273*1.0339*1.0178*1.0287*1.0442*1.0463) = 1.05397
Annual Average Increase = 5.3974% p.a. on average.
Using Excel: =GEOMEAN(all link relatives) = 105.3976
(d) Interpretation
Annual fuel cost increases were high in the period 2001 to 2004 (between about 5% and 16% p.a.).
Thereafter, year-on-year increases were moderate and stable at between about 2% and 5% p.a.
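The geometric mean in part (c) of Exercise 14.26 can be reproduced with the following Python sketch, which plays the role of Excel's GEOMEAN here. Variable names are assumptions for illustration; the link relatives are computed from the unrounded index values, so the result matches the Excel figure (105.3976) rather than the hand calculation based on rounded link relatives.

```python
import math

# Fuel cost index, 2001 to 2010 (base 2001 = 100).
fuel_index = [100.0, 116.2, 122.4, 132.1, 135.7, 140.3, 142.8, 146.9, 153.4, 160.5]

# Link relatives as growth factors (current value / previous value).
factors = [cur / prev for prev, cur in zip(fuel_index, fuel_index[1:])]

# Geometric mean = n-th root of the product of the n growth factors.
geo_mean = math.prod(factors) ** (1 / len(factors))
avg_increase_pct = 100 * (geo_mean - 1)

print(round(avg_increase_pct, 4))
```

Because the link relatives telescope, the same answer follows from the first and last index values alone: (160.5/100) raised to the power 1/9.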
Exercise 14.27
Data
File: X14.27 - motorcycle distributor.xlsx
Average Selling Price (R)
Model      2007     2008     2009     2010     2011
Blitz      18050    19235    21050    21950    22400
Cruiser    25650    26200    27350    28645    31280
Classic    39575    42580    43575    43950    46750
Units Sold
Model      2007    2008    2009    2010    2011
Blitz      205     185     168     215     225
Cruiser    462     386     402     519     538
Classic    88      70      111     146     132
(a)
Price Relative Index series - Cruiser model
Model      2007     2008     2009     2010     2011
Cruiser    100.0    102.1    106.6    111.7    121.9
The price of the Cruiser model rose marginally (between 2% and 7%) for 2008
and 2009 relative to 2007; but rose strongly by 11.7% in 2010 and 21.9% in 2011.
(b)
Laspeyres Composite Price Index (Using Weighted Aggregates method)
Model          Base Value 2007    Value 2008    Value 2009    Value 2010    Value 2011
Blitz          3700250     3943175     4315250     4499750     4592000
Cruiser        11850300    12104400    12635700    13233990    14451360
Classic        3482600     3747040     3834600     3867600     4114000
Total          19033150    19794615    20785550    21601340    23157360
Price Index    100         104.0       109.2       113.5       121.7
Line Graph - Laspeyres Composite Price Index - Motorcycles, 2007 - 2011
(c)
Interpretation
There has been a reasonably constant increase in the average price of motorcycles
over the past 5 years. In 2011, motorcycles cost 21.7% more on average than
they did in 2007.
(d)
Link relative of Selling Prices - Classic Model
Model      2007    2008     2009     2010     2011
Classic    100     107.6    102.3    100.9    106.4
Interpretation
The Classic model has shown a very modest year-on-year price increase of
between 2.3% and 7.6% between 2007 and 2011. In 2010, the annual
(year-on-year) increase was only 0.9% (less than 1%).
(e)
Geometric Mean = ⁴√(1.076*1.023*1.009*1.064) - 1 = 4.263%
Using Excel: =GEOMEAN(link relatives) - 100 = 4.253
Prices of the Classic model have risen by an average of 4.263% per annum.
(f)
Laspeyres Composite Quantity Index (Using Weighted Aggregates method)
Model             Base Value 2007    Value 2008    Value 2009    Value 2010    Value 2011
Blitz             3700250     3339250     3032400     3880750     4061250
Cruiser           11850300    9900900     10311300    13312350    13799700
Classic           3482600     2770250     4392825     5777950     5223900
Total             19033150    16010400    17736525    22971050    23084850
Quantity Index    100         84.1        93.2        120.7       121.3
Line Graph - Laspeyres Composite Quantity Index - Motorcycles, 2007 - 2011
(g)
Interpretation
Sales of motorcycles dropped for two years after 2007 (i.e. by 15.9% in 2008 and
by 6.8% in 2009). Thereafter unit sales, on average per model, increased by 20.7%
in 2010 relative to 2007; but performed at the same level in 2011 compared to 2010.
Exercise 14.28
Data
File: X14.28 - tyre production.xlsx
Hillstone Tyre Production Costs and Volumes - Uitenhage Plant
Cost/Tyre
Month (2010)    Passenger    Light truck    Giant truck
Jan             210.69    376.45    1171.1
Feb             212.47    361.7     1109.6
March           210.73    361.76    1101.8
April           218.14    363.94    1119.7
May             219.22    363.62    1127.2
June            216.19    364.06    1120.32
July            225.92    376.9     1162.8
Aug             234       375.4     1181
Sept            229.89    375.55    1157.7
Oct             222.76    375.04    1166.75
Nov             223.96    375.59    1157.7
Dec             200.3     376.3     1148
Output (1000's)
Month (2010)    Passenger    Light truck    Giant truck
Jan             78     11    10
Feb             102    14    14
March           93     13    13
April           81     12    11
May             105    16    15
June            100    16    15
July            117    19    16
Aug             105    17    16
Sept            98     14    16
Oct             110    15    15
Nov             97     13    14
Dec             43     6     5
(a)
Cost Relative Index - Passenger Tyres
Month    Cost relative index series (Passenger tyre)
Jan      100.0
Feb      100.8
March    100.0
April    103.5
May      104.0
June     102.6
July     107.2
Aug      111.1
Sept     109.1
Oct      105.7
Nov      106.3
Dec      95.1
Interpretation
Passenger tyre costs increased steadily throughout the year reaching a peak of 11.1% above January
2010 levels in August. Thereafter costs declined steadily and ended the year 4.9% below the
starting level in January 2010.
(b)
PRODUCTION COST ANALYSIS
Laspeyres Composite Cost Index (Weighted Aggregates)
Month    Passenger    Light truck    Giant truck    Total Cost    Composite Cost Index
Jan      16433.82    4140.95    11711      32285.77    100.0
Feb      16573       3978.7     11096      31647       98.0
March    16437       3979.4     11018      31435       97.4
April    17015       4003.3     11197      32216       99.8
May      17099       3999.8     11272      32371       100.3
June     16862.8     4004.66    11203.2    32070.7     99.3
July     17622       4145.9     11628      33396       103.4
Aug      18250       4129       11812      34192       105.9
Sept     17931       4131.1     11577      33639       104.2
Oct      17375.3     4125.44    11667.5    33168.2     102.7
Nov      17469       4131.5     11577      33177       102.8
Dec      15623       4139       11479      31241       96.8
Line Graph - Composite Cost Index, Jan - Dec
Interpretation
Composite tyre manufacturing costs declined or remained constant for the first 6 months
until June 2010. Thereafter, costs, on average increased by 6% over July and August
before being brought under control. By December, composite costs were 3.2% below
the beginning of the year levels.
(c)
Link Relatives - Costs - Light Truck Tyres
Month    Light truck    Link Relatives
Jan      376.45    100.0
Feb      361.7     96.1
March    361.76    100.0
April    363.94    100.6
May      363.62    99.9
June     364.06    100.1
July     376.9     103.5
Aug      375.4     99.6
Sept     375.55    100.0
Oct      375.04    99.9
Nov      375.59    100.1
Dec      376.3     100.2
Interpretation
Light truck radial tyres costs have remained almost constant and unchanged throughout
the year with most adjustments not exceeding 0.5%. Only in Feb (decrease of 3.9%) and
July (increase of 3.5%) did costs fluctuate to any degree.
(d)
PRODUCTION VOLUME ANALYSIS
Laspeyres Quantity Index (Weighted Aggregates)
Month    Passenger    Light truck    Giant truck    Total Value    Volume Index
Jan      16433.82    4140.95    11711      32285.77    100.0
Feb      21490       5270.3     16395      43156       133.7
March    19594       4893.9     15224      39712       123.0
April    17066       4517.4     12882      34465       106.8
May      22122       6023.2     17567      45712       141.6
June     21069       6023.2     17566.5    44658.7     138.3
July     24651       7152.6     18738      50541       156.5
Aug      22122       6400       18738      47260       146.4
Sept     20648       5270.3     18738      44656       138.3
Oct      23175.9     5646.75    17566.5    46389.2     143.7
Nov      20437       4893.9     16395      41726       129.2
Dec      9060        2259       5856       17174       53.2
Line Graph - Laspeyres Composite Production Output (Volume) Index, Jan - Dec
(e)
Interpretation
Production volumes of all three makes of tyres showed a steady increase above the January
level by up to nearly 60% over the first half of the year. However output showed a steady
decline in the second half of the year ending at only 53.2% of the beginning of the year levels.
CHAPTER 15
TIME SERIES ANALYSIS
A FORECASTING TOOL
Exercise 15.1
Cross-sectional data is gathered at one point in time;
Time series data is recorded at fixed intervals over time.
Exercise 15.2
Monthly national new car sales;
Daily maximum temperature for Cape Town.
Exercise 15.3
A line graph
Exercise 15.4
Trend; Cycles; Seasonality; Irregular (random)
Seasonality tends to show the most regularity.
Exercise 15.5
The Moving Average method smooths out short-term
fluctuations in a time series to allow the longer-term
underlying trend and cyclical patterns to be revealed.
Exercise 15.6
Yes, averaging occurs over a longer time period (i.e. five periods)
producing a smoother curve.
Exercise 15.7
A seasonal index of 108 means that seasonal influences stimulate
the time series values by 8% above the trend / cyclical level.
Exercise 15.8
A seasonal index of 88 means that seasonal influences depress
the time series values by 12% below the trend / cyclical level.
Exercise 15.9
File: X15.9 - coal tonnage.xlsx
(a) and (b)
Uncentred 4-year moving totals (years 1-4, 2-5, ..., 13-16):
470, 484, 475, 489, 517, 545, 618, 697, 723, 754, 744, 689, 676
Year    Coal Tonnage    Centred 4-year Moving Average    Centred 5-year Moving Average
1       118
2       124
3       108    119.25     120.4
4       120    119.875    119.8
5       132    120.5      119.4
6       115    125.75     127.4
7       122    132.75     135.4
8       148    145.375    146.6
9       160    164.375    163.8
10      188    177.5      174.2
11      201    184.625    182.8
12      174    187.25     186.4
13      191    179.125    178
14      178    170.625    170
15      146
16      161
(c) and (d)
Line Graphs of Coal mined (100 000 tonnes), 4 and 5 year Moving Averages, years 1 - 16
(e)
Interpretation
The annual tonnage of coal mined in the Limpopo province was constant for the first
7 years after which there was an expansion phase for the next 4 years. Thereafter coal
production has been declining steadily. This could be evidence of a cyclical effect
caused by economic cycles in the demand for coal worldwide.
Overall, a moderate upward cyclical trend.
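The centred moving averages used in Exercise 15.9 can be sketched as follows. The function name is an assumption for illustration. For an even span the averages are centred by averaging two adjacent uncentred moving totals, as in the 4-year case above; for an odd span the simple moving average is already centred.

```python
def centred_moving_average(y, k):
    """Centred k-period moving average of the series y."""
    totals = [sum(y[i:i + k]) for i in range(len(y) - k + 1)]
    if k % 2 == 1:
        # Odd span: each moving total is already centred on a period.
        return [t / k for t in totals]
    # Even span: average two adjacent moving totals (the 2 x k step).
    return [(a + b) / (2 * k) for a, b in zip(totals, totals[1:])]

coal = [118, 124, 108, 120, 132, 115, 122, 148,
        160, 188, 201, 174, 191, 178, 146, 161]

ma4 = centred_moving_average(coal, 4)   # aligned with years 3 to 14
ma5 = centred_moving_average(coal, 5)   # aligned with years 3 to 14
print(ma4[0], ma5[0])
```

Both series lose two values at each end, which is why the averages in the table start in year 3 and stop in year 14.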
Exercise 15.10
(a)
File: X15.10 - franchise dealers.xlsx
Line Graph - New Franchise Dealers (no. of new dealers), time periods 1 - 10
(b)
n = 10
Period (x)    Dealers (y)    x²     xy
1             28     1      28
2             32     4      64
3             43     9      129
4             31     16     124
5             38     25     190
6             47     36     282
7             40     49     280
8             45     64     360
9             55     81     495
10            42     100    420
Σ  55         401    385    2372
b1 = ((10*2372)-(55*401))/(10*385-55²) = 2.0182
b0 = (401-2.0182*55)/10 = 29
ŷ = 29 + 2.0182 x, x = 1, 2, 3, …, 10
(c)
Trend Estimates
x     substitute into ŷ    ŷ
11    29 + 2.0182 (11)    51.20
12    29 + 2.0182 (12)    53.22
13    29 + 2.0182 (13)    55.24
The number of new franchise dealers is likely to be 51, 53 and 55 in
periods 11, 12 and 13 respectively, based on trend estimates.
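The least squares trend line of Exercise 15.10 can be reproduced from the same column totals with a short Python sketch. The function name `fit_trend` is an assumption for illustration; it simply applies the b1 and b0 formulas used above with x = 1, 2, …, n.

```python
def fit_trend(y):
    """Least squares trend line y-hat = b0 + b1*x for x = 1, 2, ..., n."""
    n = len(y)
    x = range(1, n + 1)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = (sum_y - b1 * sum_x) / n
    return b0, b1

dealers = [28, 32, 43, 31, 38, 47, 40, 45, 55, 42]
b0, b1 = fit_trend(dealers)

# Trend estimates for the next three periods.
forecasts = {x: b0 + b1 * x for x in (11, 12, 13)}
for x, estimate in forecasts.items():
    print(x, round(estimate, 2))
```

The same function can be reused for the other trend line exercises in this chapter by passing the relevant y series.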
Exercise 15.11
(a)
File: X15.11 - policy claims.xlsx
Line Graph - Household Policy Claims (no. of claims), quarters 1 - 16
(b)
n = 16
Period (x)    Claims (y)    x²     xy
1             84     1      84
2             53     4      106
3             60     9      180
4             75     16     300
5             81     25     405
6             57     36     342
7             51     49     357
8             73     64     584
9             69     81     621
10            37     100    370
11            40     121    440
12            77     144    924
13            73     169    949
14            46     196    644
15            39     225    585
16            63     256    1008
Σ  136        978    1496   7899
b1 = [(16)(7899) - (136)(978)]/[(16*1496)-(136)²] = -1.21765
b0 = (978-(-1.21765)*136)/16 = 71.475
ŷ = 71.475 - 1.21765 x
x = 1 in Q1 2008, 2 in Q2 2008, 3 in Q3 2008, …
Interpretation
There is a downward trend in household policy claims over the past 4 years.
(c)
Seasonal Indexes
Uncentred 4 period Moving Totals (2008 Q1 - Q4, 2008 Q2 - 2009 Q1, ..., 2011 Q1 - Q4):
272, 269, 273, 264, 262, 250, 230, 219, 223, 227, 236, 235, 221
Time Periods    Claims    2x4 period Moving Total    Centred 4 period Moving Average    Seasonal ratios
2008 Q1         84
2008 Q2         53
2008 Q3         60    541    67.625    88.725
2008 Q4         75    542    67.75     110.701
2009 Q1         81    537    67.125    120.670
2009 Q2         57    526    65.75     86.692
2009 Q3         51    512    64        79.688
2009 Q4         73    480    60        121.667
2010 Q1         69    449    56.125    122.940
2010 Q2         37    442    55.25     66.968
2010 Q3         40    450    56.25     71.111
2010 Q4         77    463    57.875    133.045
2011 Q1         73    471    58.875    123.992
2011 Q2         46    456    57        80.702
2011 Q3         39
2011 Q4         63
Quarter    Unadjusted Seasonal Indexes    Adjusted Seasonal Indexes    Seasonal Indexes
Q1         122.940    121.423    121.4
Q2         80.702     79.706     79.7
Q3         79.688     78.705     78.7
Q4         121.667    120.166    120.2
Totals     404.996    400.000
Interpretation
Household policy claims tend to increase significantly in Quarters 1 and 4 of each year,
by about 20% on average, while there is a significant decline in claims during
Quarters 2 and 3 by about 20% on average.
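The ratio-to-moving-average steps behind these seasonal indexes can be sketched in Python as below. This is a sketch under stated assumptions: the function name is hypothetical, the series is assumed to start in Quarter 1, and the median of each quarter's seasonal ratios is taken as the unadjusted index, which reproduces the unadjusted values shown for this exercise.

```python
from statistics import median

def seasonal_indexes(y, period=4):
    """Adjusted seasonal indexes via the ratio-to-moving-average method."""
    half = period // 2
    ratios = {season: [] for season in range(period)}
    for t in range(half, len(y) - half):
        # Two adjacent uncentred moving totals give the 2 x period total.
        first = sum(y[t - half:t + half])
        second = sum(y[t - half + 1:t + half + 1])
        centred_ma = (first + second) / (2 * period)
        ratios[t % period].append(100 * y[t] / centred_ma)
    unadjusted = [median(ratios[season]) for season in range(period)]
    # Scale the indexes so that they total 100 x number of seasons.
    factor = 100 * period / sum(unadjusted)
    return [u * factor for u in unadjusted]

claims = [84, 53, 60, 75, 81, 57, 51, 73, 69, 37, 40, 77, 73, 46, 39, 63]
indexes = seasonal_indexes(claims)
print([round(v, 1) for v in indexes])
```

The list is returned in quarter order (Q1 to Q4), so the first entry is the Quarter 1 index.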
(d)
Seasonally-adjusted trend estimate of household policy claims
x = 17 in Quarter 1 2012: Trend estimate ŷ = 71.475 - 1.21765 (17) = 50.77495
x = 18 in Quarter 2 2012: Trend estimate ŷ = 71.475 - 1.21765 (18) = 49.5573
Seasonally-adjusted trend estimate
Q1 2012: ŷ (adj) = 50.77495*1.214 = 61.64
Q2 2012: ŷ (adj) = 49.5573*0.797 = 39.5
Interpretation
The insurance company can expect to receive 62 and 40 (rounded) household policy
claims in the first and second quarters of 2012 respectively.
Exercise 15.12
(a)
File: X15.12 - hotel occupancy.xlsx
Line Graph - Monthly Hotel Occupancy Rate (%), months 1 - 10 (Sept - June)
(b)
n = 10
Month (x)    Rate (y)    x²     xy
1            74     1      74
2            82     4      164
3            70     9      210
4            90     16     360
5            88     25     440
6            74     36     444
7            64     49     448
8            69     64     552
9            58     81     522
10           65     100    650
Σ  55        734    385    3864
b1 = (10*3864 - 55*734)/(10*385 - 55²) = -2.09697
b0 = (734 -(-2.09697)*55)/10 = 84.933
ŷ = 84.933 - 2.09697 x
x = 1 in Sept, 2 in Oct, 3 in Nov, …
Interpretation
There is a downward trend in hotel occupancy over the past 10 months since September.
(c)
Trend estimate
Period    x     ŷ
July      11    61.87%
Aug       12    59.77%
Interpretation
The continued downward trend is reflected in the next two months' projections.
Exercise 15.13
(a), (e), (f)
File: X15.13 - electricity demand.xlsx
Line Graph - Electricity Demand for a City (Cape Town)
Line Graph - Cape Town Electricity Demand (1000 megawatts), quarters (Q1 2008 - Q4 2011), with trend line y = 4.95x + 20.8
Interpretation
Electricity demand in Cape Town shows a clear seasonal pattern, peaking in quarter 3
and bottoming out in quarter 4 of each year.
(b)
n = 16
Period (x)    Demand (y)    x²     xy
1             21      1      21
2             42      4      84
3             60      9      180
4             12      16     48
5             35      25     175
6             54      36     324
7             91      49     637
8             14      64     112
9             39      81     351
10            82      100    820
11            136     121    1496
12            28      144    336
13            78      169    1014
14            114     196    1596
15            160     225    2400
16            40      256    640
Σ  136        1006    1496   10234
b1 = (16*10234 - 136*1006)/(16*1496 - 136²) = 4.95
b0 = (1006 -(4.95)*136)/16 = 20.8
ŷ = 20.8 + 4.95 x
x = 1 in Q1 2008, 2 in Q2 2008, 3 in Q3 2008, …
(c)
Uncentred 4 period Moving Totals (2008 Q1 - Q4, 2008 Q2 - 2009 Q1, ..., 2011 Q1 - Q4):
135, 149, 161, 192, 194, 198, 226, 271, 285, 324, 356, 380, 392
Time Periods    Demand    2x4 period Moving Total    Centred 4 period Moving Average    Seasonal ratios
2008 Q1         21
2008 Q2         42
2008 Q3         60     284    35.5      169.01
2008 Q4         12     310    38.75     30.97
2009 Q1         35     353    44.125    79.32
2009 Q2         54     386    48.25     111.92
2009 Q3         91     392    49        185.71
2009 Q4         14     424    53        26.42
2010 Q1         39     497    62.125    62.78
2010 Q2         82     556    69.5      117.99
2010 Q3         136    609    76.125    178.65
2010 Q4         28     680    85        32.94
2011 Q1         78     736    92        84.78
2011 Q2         114    772    96.5      118.13
2011 Q3         160
2011 Q4         40
Seasonal Indexes
Quarter    Unadjusted Seasonal Indexes    Adjusted Seasonal Indexes
Q1         79.32      77.97
Q2         117.99     115.98
Q3         178.65     175.61
Q4         30.97      30.44
Totals     406.927    400.000
Interpretation
Electricity demand peaks in Q3 by 75% over the trend / cyclical level; and drops
to 70% below the trend /cyclical level during Q4.
(d)
Seasonally-adjusted trend estimate of Cape Town's electricity demand
Period     x     Trend ŷ    Seasonal Index    Seasonally adjusted Trend
Q3 2012    19    114.85     175.61    201.69
Q4 2012    20    119.8      30.44     36.47
Interpretation
Electricity demand in Cape Town is likely to peak at 201.69 thousand megawatts in
Q3 of 2012 and bottom out at 36.47 thousand megawatts in Q4 of 2012.
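The seasonally-adjusted trend estimates in part (d) combine the trend line from (b) with the adjusted seasonal indexes from (c). A minimal Python sketch, with a hypothetical function name, is:

```python
def seasonally_adjusted_forecast(b0, b1, x, seasonal_index):
    """Return (trend estimate, seasonally-adjusted estimate) for period x."""
    trend = b0 + b1 * x
    return trend, trend * seasonal_index / 100

# Trend line y-hat = 20.8 + 4.95x; adjusted indexes for Q3 and Q4.
trend_q3, adj_q3 = seasonally_adjusted_forecast(20.8, 4.95, 19, 175.61)
trend_q4, adj_q4 = seasonally_adjusted_forecast(20.8, 4.95, 20, 30.44)

print(round(trend_q3, 2), round(adj_q3, 2))
print(round(trend_q4, 2), round(adj_q4, 2))
```

The seasonal index is expressed as a percentage, hence the division by 100 when it is applied to the trend value.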
Exercise 15.14
File: X15.14 - hotel turnover.xlsx
(a)
Quarter    Seasonal Index    Actual    Deseasonalised
1          136    568    417.6
2          112    495    442.0
3          62     252    406.5
4          90     315    350.0
5          136    604    444.1
6          112    544    485.7
7          62     270    435.5
8          90     510    566.7
9          136    662    486.8
10         112    605    540.2
11         62     310    500.0
12         90     535    594.4
(b) and (e)
Line Graph - Hotel Industry Quarterly Turnover (R millions), Actual vs De-seasonalised, quarters (Summer 2008 - Spring 2010)
(c)
Trend line estimate ŷ
n = 12
Period (x)    T/over (y)    x²     xy
1             568     1      568
2             495     4      990
3             252     9      756
4             315     16     1260
5             604     25     3020
6             544     36     3264
7             270     49     1890
8             510     64     4080
9             662     81     5958
10            605     100    6050
11            310     121    3410
12            535     144    6420
Σ  78         5670    650    37666
12
(f)
b1 = (12*37666 - 78*5670)/(12*650 - 78²) = 5.6713
b0 = (5670 - (5.6713)*78)/12 = 435.64
ŷ = 435.64 + 5.6713 x
x = 1 in Summer 2008, 2 in Autumn 2008, 3 in Winter 2008, …
Seasonally-adjusted trend estimate of Hotel Industry Quarterly Turnover
Period         x     Trend ŷ    Seasonal Index    Seasonally adjusted Trend
Summer 2011    13    509.37     136    692.74
Autumn 2011    14    515.04     112    576.84
Trend line estimation using Excel's Chart Wizard
(d)
Line Graph - Hotel Industry Quarterly Turnover (R millions), Actual with Linear (Actual) trend line y = 5.6713x + 435.64, quarters 1 - 12
Exercise 15.15
File: X15.15 - farming equipment.xlsx
Plot of Farming Equipment Sales (2008 - 2011)
Line Graph - Quarterly Farming Implements Sales (no. units sold), quarters (2008 - 2011), with Linear (Sales) trend line y = 0.8544x + 52.05
(a)
Seasonal Indexes for Farming Equipment Sales
Uncentred 4 period Moving Totals (2008 Q1 - Q4, 2008 Q2 - 2009 Q1, ..., 2011 Q1 - Q4):
214, 217, 222, 225, 230, 235, 239, 244, 251, 250, 252, 252, 254
Periods    Sales    2x4 period Moving Total    Centred 4 period Moving Average    Seasonal Ratios
2008 Q1    57
2008 Q2    51
2008 Q3    50    431    53.88    92.81
2008 Q4    56    439    54.88    102.05
2009 Q1    60    447    55.88    107.38
2009 Q2    56    455    56.88    98.46
2009 Q3    53    465    58.13    91.18
2009 Q4    61    474    59.25    102.95
2010 Q1    65    483    60.38    107.66
2010 Q2    60    495    61.88    96.97
2010 Q3    58    501    62.63    92.61
2010 Q4    68    502    62.75    108.37
2011 Q1    64    504    63.00    101.59
2011 Q2    62    506    63.25    98.02
2011 Q3    58
2011 Q4    70
Seasonal Indexes
Season     Unadjusted Seasonal Indexes    Adjusted Seasonal Indexes
Summer     107.38    107.40
Autumn     96.97     97.00
Winter     92.61     92.63
Spring     102.95    102.97
Totals     399.92    400
(b)
Seasonal Influences
The influence of seasonal forces on farming equipment sales is modest.
There is a small stimulatory effect during Spring and Summer (no more than 7.5%) and
a small depressing effect (also no more than 7.5%) during Autumn and Winter.
(c)
n = 16
Period (x)    Sales (y)    x²     xy
1             57     1      57
2             51     4      102
3             50     9      150
4             56     16     224
5             60     25     300
6             56     36     336
7             53     49     371
8             61     64     488
9             65     81     585
10            60     100    600
11            58     121    638
12            68     144    816
13            64     169    832
14            62     196    868
15            58     225    870
16            70     256    1120
Σ  136        949    1496   8357
b1 = (16*8357-136*949)/(16*1496-136²) = 0.8544
b0 = (949 -(0.8544)*136)/16 = 52.05
ŷ = 52.05 + 0.8544 x
x = 1 in Summer 2008, 2 in Autumn 2008, 3 in Winter 2008, …
(d)
Seasonally-adjusted trend estimate of Farming Equipment Sales for 2012
Period    x     Trend ŷ    Seasonal Index    Seasonally adjusted Trend
Summer    17    66.57    107.40    71.50
Autumn    18    67.43    97.00     65.41
Winter    19    68.28    92.63     63.25
Spring    20    69.14    102.97    71.19
Interpretation
The company can expect to sell between 63 and 72 farming implements
each quarter during 2012 with the higher sales expected in Summer and Spring.
Exercise 15.16
File: X15.16 - energy costs.xlsx
(a), (e) and (f)
Line Graph of Office Complex Energy Costs (in R100 000)
Line Graph - Office Complex Energy Costs (R100 000), quarters (2009 - 2011), with trend line y = 0.0601x + 3.1091
(b)
Uncentred 4 period Moving Totals (2009 Q1 - Q4, 2009 Q2 - 2010 Q1, ..., 2011 Q1 - Q4):
13.3, 13.5, 13.8, 13.9, 14.0, 14.0, 14.4, 14.6, 14.7
Time Periods    Costs    2x4 period Moving Total    Centred 4 period Moving Average    Seasonal ratios
2009 Q1         2.4
2009 Q2         3.8
2009 Q3         4      26.80    3.35      119.40
2009 Q4         3.1    27.30    3.4125    90.84
2010 Q1         2.6    27.70    3.4625    75.09
2010 Q2         4.1    27.90    3.4875    117.56
2010 Q3         4.1    28.00    3.5       117.14
2010 Q4         3.2    28.40    3.55      90.14
2011 Q1         2.6    29.00    3.625     71.72
2011 Q2         4.5    29.30    3.6625    122.87
2011 Q3         4.3
2011 Q4         3.3
Seasonal Indexes
Season     Unadjusted Seasonal Indexes    Adjusted Seasonal Indexes
Summer     73.41     73.0
Autumn     120.21    119.5
Winter     118.27    117.6
Spring     90.49     90.0
Totals     402.39    400
Interpretation
Energy costs rise by nearly 20% (19.5% and 17.6%) over the colder months of
Autumn and Winter, and decline by 10% in Spring and by almost 30%
(27%) in Summer.
(c)
n = 12
Period (x)    Cost (y)    x²     xy
1             2.4    1      2.4
2             3.8    4      7.6
3             4      9      12
4             3.1    16     12.4
5             2.6    25     13
6             4.1    36     24.6
7             4.1    49     28.7
8             3.2    64     25.6
9             2.6    81     23.4
10            4.5    100    45
11            4.3    121    47.3
12            3.3    144    39.6
Σ  78         42     650    281.6
b1 = [(12*281.6)-(78*42)]/[12*650-78²] = 0.0601
b0 = (42 - 0.0601*78)/12 = 3.1091
ŷ = 3.1091 + 0.0601 x
x = 1 in Summer 2009, 2 in Autumn 2009, 3 in Winter 2009, …
(d)
Seasonally-adjusted trend estimate of Office Complex Energy Costs for 2012
Period    x     Trend ŷ    Seasonal Index    Seasonally adjusted Trend
Summer    13    3.89    72.97     2.84
Autumn    14    3.95    119.50    4.72
Winter    15    4.01    117.57    4.71
Spring    16    4.07    89.95     3.66
Interpretation
The office complex manager must budget between R284 000 and R472 000 each
quarter during 2012 with higher costs expected in the Autumn and Winter periods.
Exercise 15.17
(a)
File: X15.17 - business registrations.xlsx
Line Graph of New Business Registrations (2007 - 2011)
Line Graph - New Business Registrations (no. of new registrations), quarters (2007 - 2011)
(b) and (c)
4-Period Moving Average and Quarterly Seasonal Indexes
(b)
Uncentred 4 period Moving Totals (2007 Q1 - Q4, 2007 Q2 - 2008 Q1, ..., 2011 Q1 - Q4):
4724, 4892, 5041, 5199, 5376, 5517, 5677, 5826, 5980, 6125, 6265, 6422, 6569, 6714, 6880, 7034, 7176
Time Periods    New Registrations    2x4 period Moving Total    Centred 4 period Moving Average    Seasonal ratios
2007 Q1         1005
2007 Q2         1222
2007 Q3         1298    9616     1202.00    107.99
2007 Q4         1199    9933     1241.63    96.57
2008 Q1         1173    10240    1280.00    91.64
2008 Q2         1371    10575    1321.88    103.72
2008 Q3         1456    10893    1361.63    106.93
2008 Q4         1376    11194    1399.25    98.34
2009 Q1         1314    11503    1437.88    91.38
2009 Q2         1531    11806    1475.75    103.74
2009 Q3         1605    12105    1513.13    106.07
2009 Q4         1530    12390    1548.75    98.79
2010 Q1         1459    12687    1585.88    92.00
2010 Q2         1671    12991    1623.88    102.90
2010 Q3         1762    13283    1660.38    106.12
2010 Q4         1677    13594    1699.25    98.69
2011 Q1         1604    13914    1739.25    92.22
2011 Q2         1837    14210    1776.25    103.42
2011 Q3         1916
2011 Q4         1819
(c)
Seasonal Indexes
Quarter    Unadjusted Seasonal Indexes    Adjusted Seasonal Indexes
Q1         91.820     91.72
Q2         103.568    103.46
Q3         106.526    106.41
Q4         98.514     98.41
Totals     400.429    400
(d)
Interpretation of Seasonal Influences on new business registrations
New business registrations show a modest seasonal pattern, ranging from
8.3% below the annual average in quarter 1 to 6.4% above the annual average
in quarter 3.
(b)
4-Period Moving Average Line Graph and Original Data Line Graph
Line Graph - New Business Registrations (no. of new businesses), New Registrations vs Centred 4 period Moving Average, quarters (2007 - 2011)
(e)
Least Squares trend line (using Add Trendline in Excel )
ŷ = 1078.2 + 39.338 x
x = 1 in Q1 2007, 2 in Q2 2007, 3 in Q3 2007, …
Periods    y       Adjusted Seasonal Indexes    Trend Estimate ŷ    ŷ (adj)
2007 Q1    1005    91.72     1117.5    1025.0
2007 Q2    1222    103.46    1156.9    1196.9
2007 Q3    1298    106.41    1196.2    1272.9
2007 Q4    1199    98.41     1235.6    1215.9
2008 Q1    1173    91.72     1274.9    1169.3
2008 Q2    1371    103.46    1314.2    1359.7
2008 Q3    1456    106.41    1353.6    1440.3
2008 Q4    1376    98.41     1392.9    1370.8
2009 Q1    1314    91.72     1432.2    1313.7
2009 Q2    1531    103.46    1471.6    1522.5
2009 Q3    1605    106.41    1510.9    1607.8
2009 Q4    1530    98.41     1550.3    1525.6
2010 Q1    1459    91.72     1589.6    1458.0
2010 Q2    1671    103.46    1628.9    1685.3
2010 Q3    1762    106.41    1668.3    1775.2
2010 Q4    1677    98.41     1707.6    1680.5
2011 Q1    1604    91.72     1746.9    1602.3
2011 Q2    1837    103.46    1786.3    1848.1
2011 Q3    1916    106.41    1825.6    1942.6
2011 Q4    1819    98.41     1865.0    1835.3
(f)
Plot of Actual vs Seasonally adjusted Trend estimates (New Business Registrations)
Line Graph - New Business Registrations (Actual vs Trend Adjusted), no. of new registrations, y vs ŷ (adj), quarters (2007 Q1 - 2011 Q4)
(g)
Comment
The seasonally adjusted trend estimates track the actual number of new business
registrations very closely. It is a good fitting graph.
(h)
Seasonally-adjusted trend estimate of New Business Registrations (2012)
Period     x     Trend ŷ    Seasonal Index    Seasonally adjusted Trend
Q1 2012    21    1904.3    91.72     1746.6
Q2 2012    22    1943.6    103.46    2010.9
Q3 2012    23    1983.0    106.41    2110.1
Q4 2012    24    2022.3    98.41     1990.2
Exercise 15.18
File: X15.18 - engineering sales.xlsx
Total Sales (Estimate for next year)
Quarter    Seasonal Index    Trend ŷ    Seasonally Adjusted Trend Estimate ŷ(adj)
1          95     12    11.4
2          115    12    13.8
3          110    12    13.2
4          80     12    9.6
Total             48    48
Exercise 15.19
(a) and (b)
File: X15.19 - Table Mountain.xlsx
Period     Visitors    S Index    De-Seasonalised
Wi 2009    18.1    62     29.19
Sp 2009    26.4    89     29.66
Su 2009    41.2    162    25.43
Au 2009    31.6    87     36.32
Wi 2010    22.4    62     36.13
Sp 2010    33.2    89     37.30
Su 2010    44.8    162    27.65
Au 2010    32.5    87     37.36
Line Graph - Table Mountain Visitors (Actual vs De-Seasonalised), no. of visitors (1000's), Quarters (2009 - 2010)
Comment
The trend (after de-seasonalising the quarterly data) is only marginally upwards.
(c)
Table Mountain Visitors (per quarter for 2011)
Quarter    Seasonal Index    Trend ŷ    Seasonally Adjusted Trend Estimate ŷ(adj)
Winter     62     37.5    23.25
Spring     89     37.5    33.375
Summer     162    37.5    60.75
Autumn     87     37.5    32.625
Total             150     150
Exercise 15.20
(a)
File: X15.20 - gross domestic product.xlsx
Year    GDP (Millions)
2001    45
2002    47
2003    61
2004    64
2005    72
2006    74
2007    84
2008    81
2009    93
2010    90
2011    98
Line Graph - African Country Gross Domestic Product, GDP (in R100 millions), years (2001 - 2011), with Linear (GDP (Millions)) trend line y = 5.2636x + 41.964
(b)
Trend line (see graph in (a))
ŷ = 41.964 + 5.2636 x
x = 1 in 2001, 2 in 2002, 3 in 2003, …
(c)
Expected GDP for 2012 and 2013 (in R100 million)
Year    Year (x )    Trend ŷ                 Estimate (ŷ)
2012    12           41.964 + 5.2636 (12)    105.1
2013    13           41.964 + 5.2636 (13)    110.4
Exercise 15.21
File: X15.21 - pelagic fish.xlsx
(a) and (c)
(a)
Year    Months        Pelagic fish    3 Period Moving Ave    Seasonal ratio
2007    Jan - Apr     44
2007    May - Aug     36    38.00    94.74
2007    Sept - Dec    34    38.33    88.70
2008    Jan - Apr     45    40.33    111.57
2008    May - Aug     42    40.33    104.13
2008    Sept - Dec    34    38.00    89.47
2009    Jan - Apr     38    34.67    109.62
2009    May - Aug     32    32.33    98.97
2009    Sept - Dec    27    33.00    81.82
2010    Jan - Apr     40    32.67    122.45
2010    May - Aug     31    33.00    93.94
2010    Sept - Dec    28
(c)
Seasonal Indexes
Periods       Unadj Seasonal Index    Seasonal Index
Jan - Apr     111.57    112.65
May - Aug     96.85     97.79
Sept - Dec    88.70     89.56
Totals        297.12    300
Pelagic fish catches are:
- significantly higher (12.65%) than the trend /cyclical volumes in the Jan - April period.
- only marginally lower (2.21%) than the trend /cyclical volumes in the May - Aug period.
- more than 10% lower than the trend /cyclical volumes in the Sept - Dec period.
(b)
Line Graph - Pelagic Fish Catches (Actual vs 3-Period Moving Average), no. of tonnes, 4-monthly periods 1 - 12 (2007 - 2010)
Comment
The trend in pelagic fish catches over the past 4 years is decidedly downwards.
(d)
n = 12
Period (x)    Catch (y)    x²     xy
1             44     1      44
2             36     4      72
3             34     9      102
4             45     16     180
5             42     25     210
6             34     36     204
7             38     49     266
8             32     64     256
9             27     81     243
10            40     100    400
11            31     121    341
12            28     144    336
Σ  78         431    650    2654
b1 = (12*2654 - 78*431)/(12*650 - 78²) = -1.0315
b0 = (431 - (-1.031469)*78)/12 = 42.621
ŷ = 42.621 - 1.0315 x
x = 1 in Jan - Apr 2007, 2 in May - Aug 2007, 3 in Sept - Dec 2007, …
(e)
Seasonally-adjusted trend estimates of Pelagic Fish Catches for 2011
Period (2011)    x     Trend ŷ    Seasonal Index    Seasonally adjusted Trend
Jan - Apr        13    29.21    112.65    32.91
May - Aug        14    28.18    97.79     27.56
Sept - Dec       15    27.15    89.56     24.32
Exercise 15.22
File: X15.22 - share price.xlsx
Trend equation (derived from Excel 's Add Trendline)
ŷ = 89.429 - 2.7143 x
x = 1 in January, 2 in February, 3 in March, …
Month       Period    Trend Price calc          Trend Price    Share price
January     1         = 89.429 - 2.7143(1)      86.7    90
February    2         = 89.429 - 2.7143(2)      84.0    82
March       3         = 89.429 - 2.7143(3)      81.3    78
April       4         = 89.429 - 2.7143(4)      78.6    80
May         5         = 89.429 - 2.7143(5)      75.9    74
June        6         = 89.429 - 2.7143(6)      73.1    76
July        7         = 89.429 - 2.7143(7)      70.4    70
Aug         8         = 89.429 - 2.7143(8)      67.7
Sept        9         = 89.429 - 2.7143(9)      65.0
Oct         10        = 89.429 - 2.7143(10)     62.3
Nov         11        = 89.429 - 2.7143(11)     59.6
Threshold selling price = 90 - 1/3 (90) = 60c
Line Graph - Share Price Trend, price (cents), Share price vs Trend Price, months (Jan - Nov), y = 89.429 - 2.7143 x
Conclusion
The trend estimate of the share price is likely to fall below 60c (the selling level)
by November of the same year if the downward trend continues uninterrupted.
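The conclusion of Exercise 15.22 can be checked by fitting the trend line to the seven observed share prices and stepping forward until the trend estimate drops below the threshold. This is a sketch only; the helper names are assumptions, and the fit uses the same least squares formulas as the earlier trend exercises rather than Excel's Add Trendline.

```python
def fit_trend(y):
    """Least squares trend line y-hat = b0 + b1*x for x = 1, 2, ..., n."""
    n = len(y)
    x = range(1, n + 1)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = (sum_y - b1 * sum_x) / n
    return b0, b1

def first_period_below(b0, b1, threshold, max_period=1000):
    """First period whose trend estimate is below the threshold, else None."""
    for x in range(1, max_period + 1):
        if b0 + b1 * x < threshold:
            return x
    return None  # the trend never reaches the threshold within max_period

prices = [90, 82, 78, 80, 74, 76, 70]      # January to July (cents)
b0, b1 = fit_trend(prices)
threshold = prices[0] - prices[0] / 3      # one third below 90c = 60c
period = first_period_below(b0, b1, threshold)
print(round(b0, 3), round(b1, 4), period)
```

Period 11 corresponds to November, which agrees with the trend price table above.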
Exercise 15.23
(a)
File: X15.23 - Addo park.xlsx
Season     Actual    Seasonal Index    Deseasonalised
Su 2008    196    112    175.0
Au 2008    147    94     156.4
Wi 2008    124    88     140.9
Sp 2008    177    106    167.0
Su 2009    199    112    177.7
Au 2009    152    94     161.7
Wi 2009    132    88     150.0
Sp 2009    190    106    179.2
Su 2010    214    112    191.1
Au 2010    163    94     173.4
Wi 2010    145    88     164.8
Sp 2010    198    106    186.8
(b) and (d) Line Plots of Actual and De-Seasonalised Visitor Numbers (by Season)
Line Graph - Addo National Park Visitors (Actual vs De-Seasonalised), no. of visitors (in 1000s), quarters (Su 2008 - Sp 2010)
(c)
Conclusion
There is a very slight upward trend in visitors to the Addo National Park. There is
almost no growth in visitors over the past 3 years.
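The deseasonalised column in part (a) is each actual value divided by its seasonal index (as a percentage). A minimal Python sketch of that step, assuming the seasonal indexes 112, 94, 88 and 106 given in the table; the variable names are illustrative only.

```python
# Actual visitor numbers (in 1000s), Su 2008 to Sp 2010, from Exercise 15.23.
actual = [196, 147, 124, 177, 199, 152, 132, 190, 214, 163, 145, 198]

# Seasonal indexes (base 100) as given in the exercise.
seasonal_index = {"Su": 112, "Au": 94, "Wi": 88, "Sp": 106}
seasons = ["Su", "Au", "Wi", "Sp"] * 3  # three years of four seasons

# Deseasonalised value = actual / seasonal index * 100.
deseasonalised = [
    round(value / seasonal_index[season] * 100, 1)
    for value, season in zip(actual, seasons)
]

for season, year, value in zip(seasons, [2008] * 4 + [2009] * 4 + [2010] * 4, deseasonalised):
    print(season, year, value)
```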
Exercise 15.24
File:
X15.24 - healthcare claims.xlsx
Value of Healthcare Claims (in R millions)
(a)
Seasonal Indexes for Healthcare Claims
Periods
Claims
2007 Q1
2007 Q2
2007 Q3
2007 Q4
2008 Q1
2008 Q2
2008 Q3
2008 Q4
2009 Q1
2009 Q2
2009 Q3
2009 Q4
2010 Q1
2010 Q2
2010 Q3
2010 Q4
11.8
13.2
19.1
16.4
10.9
12.4
22.4
17.8
12.2
16.2
24.1
14.6
12.8
14.5
20.8
16.1
Uncentred
Centred 4
2x4 period
4 period
period
Moving
Moving
Moving
Total
Total
Average
60.5
59.6
58.8
62.1
63.5
64.8
68.6
70.3
67.1
67.7
66
62.7
64.2
120.1
118.4
120.9
125.6
128.3
133.4
138.9
137.4
134.8
133.7
128.7
126.9
Seasonal Indexes
(b)
n = 16
Σ
Period (x)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
136
Claims (y)
11.8
13.2
19.1
16.4
10.9
12.4
22.4
17.8
12.2
16.2
24.1
14.6
12.8
14.5
20.8
16.1
255.3
15.0125
14.8
15.1125
15.7
16.0375
16.675
17.3625
17.175
16.85
16.7125
16.0875
15.8625
Q1
Q2
Q3
Q4
x²
1
4
9
16
25
36
49
64
81
100
121
144
169
196
225
256
1496
Seasonal
Ratios
127.23
110.81
72.13
78.98
139.67
106.75
70.27
94.32
143.03
87.36
79.56
91.41
70.37
89.19
136.28
104.15
xy
11.8
26.4
57.3
65.6
54.5
74.4
156.8
142.4
109.8
162
265.1
175.2
166.4
203
312
257.6
2240.3
Unadjusted
Seasonal
Indexes
Adjusted
Seasonal
Indexes
139.67
106.75
72.13
91.41
136.28
104.15
70.37
89.19
409.96
400
(c)
b1 = (16*2240.3 - 136*255.3)/(16*1496 - 136²) = 0.2066
b0 = (255.3 - (0.206618)*136)/16 = 14.2
ŷ = 14.2 + 0.2066 x
x = 1 in Q1 2007, 2 in Q2 2007, 3 in Q3 2007, ...
Seasonally-adjusted trend estimate of Healthcare Claims for 2011
Period    x    Trend ŷ   Seasonal Index   Seasonally adjusted Trend
Q1 2011   17   17.71     70.37            12.46
Q2 2011   18   17.92     89.19            15.98
Q3 2011   19   18.13     136.28           24.71
Q4 2011   20   18.33     104.15           19.09
Interpretation
Healthcare claims are expected to rise during 2011 from a low of R12.46 million in Q1
to a peak of R24.71 million in Q3, before easing to R19.09 million in Q4.
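The 2011 estimates combine the trend line with the adjusted seasonal indexes: trend value times index, divided by 100. A minimal Python sketch, assuming the coefficients b0 = 14.2 and b1 = 0.206618 and the adjusted indexes quoted in this exercise; `adjusted_forecast` is a hypothetical helper name. Because the coefficients are not rounded here, the last decimal may differ slightly from the table.

```python
# Trend coefficients and adjusted seasonal indexes from Exercise 15.24.
B0, B1 = 14.2, 0.206618
SEASONAL_INDEX = {1: 70.37, 2: 89.19, 3: 136.28, 4: 104.15}  # quarter -> index


def adjusted_forecast(x):
    """Seasonally-adjusted trend estimate for time period x (x = 1 is Q1 2007)."""
    quarter = (x - 1) % 4 + 1          # periods 1, 5, 9, ... are first quarters
    trend = B0 + B1 * x                # straight-line trend estimate
    return trend * SEASONAL_INDEX[quarter] / 100


forecasts_2011 = {x: adjusted_forecast(x) for x in range(17, 21)}
for x, value in forecasts_2011.items():
    print(f"x = {x}: R{value:.2f} million")
```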
(d)
Line Plots of Healthcare Claims (Actual vs Seasonally-adjusted trend estimates)
Periods
2007 Q1
2007 Q2
2007 Q3
2007 Q4
2008 Q1
2008 Q2
2008 Q3
2008 Q4
2009 Q1
2009 Q2
2009 Q3
2009 Q4
2010 Q1
2010 Q2
2010 Q3
2010 Q4
2011 Q1
2011 Q2
2011 Q3
2011 Q4
Claims
11.8
13.2
19.1
16.4
10.9
12.4
22.4
17.8
12.2
16.2
24.1
14.6
12.8
14.5
20.8
16.1
S Index
70.37
89.19
136.28
104.15
70.37
89.19
136.28
104.15
70.37
89.19
136.28
104.15
70.37
89.19
136.28
104.15
70.37
89.19
136.28
104.15
Time
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Trend (ŷ)
14.41
14.61
14.82
15.03
15.23
15.44
15.65
15.85
16.06
16.27
16.47
16.68
16.89
17.09
17.30
17.51
17.71
17.92
18.13
18.33
Trend (ŷ-adj)
10.14
13.03
20.20
15.65
10.72
13.77
21.32
16.51
11.30
14.51
22.45
17.37
11.88
15.24
23.58
18.23
12.46
15.98
24.70
19.09
[Chart: Healthcare Claims (Actual vs Seasonally-adjusted Trend Estimates); y-axis: Claims (in R millions); x-axis: quarters (2007 - 2010); series: Claims, Trend (ŷ-adj)]
Exercise 15.25
(a)
File: X15.25 - financial advertising.xlsx
n = 7
Year   Time (x)   Exp (y)   x²     xy
2005   1          9.6       1      9.6
2006   2          11.8      4      23.6
2007   3          12        9      36
2008   4          13.6      16     54.4
2009   5          14.1      25     70.5
2010   6          15        36     90
2011   7          17.8      49     124.6
Sum    28         93.9      140    408.7
b1 = (7*408.7 - 28*93.9)/(7*140 - 28²) = 1.1821
b0 = (93.9 - 1.182143*28)/7 = 8.6857
ŷ = 8.6857 + 1.1821 x
x = 1 in 2005, 2 in 2006, 3 in 2007, ...
(b)
ŷ (2012) = 8.6857 + 1.1821(8) = 18.143
(c)
[Chart: Financial Services Sector Advertising Expenditure, with trend line y = 1.1821x + 8.6857; y-axis: Adspend (in R10 millions); x-axis: years (2005 - 2011)]
(d)
Trend line equation (using Excel's Add Trendline function)
ŷ = 8.6857 + 1.1821 x
x = 1 in 2005, 2 in 2006, 3 in 2007, ...
Year   Time (x)   Actual (y)   Trend (ŷ)
2005   1          9.6          9.87
2006   2          11.8         11.05
2007   3          12           12.23
2008   4          13.6         13.41
2009   5          14.1         14.60
2010   6          15           15.78
2011   7          17.8         16.96
2012   8                       18.14
(e)
[Chart: Financial Services Adspend; y-axis: adspend (in R10 million); x-axis: years (2005 - 2011); series: Actual (y), Trend (ŷ)]
Exercise 15.26
(a)
File:
X15.26 - policy surrenders.xlsx
Line Graph of Surrendered Endowment Policies (2008 - 2010)
[Chart: Surrendered Endowment Policies; y-axis: no. of policies; x-axis: quarters (2008 - 2010); series: Policies]
Interpretation
There is a distinct moderate downward trend in the number of surrendered policies
over the period 2008 - 2010. This reduction could be due to the improved client
communication strategy adopted by the company in recent years.
(b)
Time Periods   Policies   Uncentred 4 period   2x4 period     Centred 4 period   Seasonal
                          Moving Total         Moving Total   Moving Average     ratios
2008 Q1        212
2008 Q2        186
2008 Q3        192        795                  1564           195.5              98.21
2008 Q4        205        769                  1517           189.625            108.11
2009 Q1        186        748                  1473           184.125            101.02
2009 Q2        165        725                  1427           178.375            92.50
2009 Q3        169        702                  1387           173.375            97.48
2009 Q4        182        685                  1363           170.375            106.82
2010 Q1        169        678                  1349           168.625            100.22
2010 Q2        158        671                  1338           167.25             94.47
2010 Q3        162        667
2010 Q4        178
Seasonal Indexes
Quarter   Unadjusted Seasonal Indexes   Adjusted Seasonal Indexes
Q1        100.62                        100.77
Q2        93.49                         93.62
Q3        97.84                         97.99
Q4        107.47                        107.62
Totals    399.41                        400
Interpretation
Endowment policy surrenders are highest in Q4, at 7.62% above the quarterly average,
and lowest in Q2, at 6.38% below it. There is almost no seasonal impact in Q1
(0.77% above the quarterly average), and surrenders in Q3 are only 2.01% below the
quarterly average.
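The seasonal indexes in part (b) follow the ratio-to-moving-average steps: 4-period moving totals, centred moving averages, seasonal ratios, an average ratio per quarter, and a final scaling so the four indexes sum to 400. A minimal Python sketch of those steps, assuming the quarterly policy counts listed in the table; the function and variable names are illustrative. With only two ratios per quarter, the mean used here equals the median.

```python
# Surrendered policies, 2008 Q1 to 2010 Q4, from Exercise 15.26.
policies = [212, 186, 192, 205, 186, 165, 169, 182, 169, 158, 162, 178]


def centred_moving_average(y, k=4):
    """Centred k-period moving averages (k even); first value aligns with period k/2 + 1."""
    totals = [sum(y[i:i + k]) for i in range(len(y) - k + 1)]      # uncentred totals
    return [(totals[i] + totals[i + 1]) / (2 * k) for i in range(len(totals) - 1)]


cma = centred_moving_average(policies)

# Seasonal ratio = actual / centred moving average * 100; cma[i] belongs to policies[i + 2].
ratios = [100 * policies[i + 2] / m for i, m in enumerate(cma)]

# Collect the ratios by quarter (list index 0 is a Q1, index 2 is a Q3, ...).
by_quarter = {1: [], 2: [], 3: [], 4: []}
for i, ratio in enumerate(ratios):
    by_quarter[(i + 2) % 4 + 1].append(ratio)

unadjusted = {q: sum(v) / len(v) for q, v in by_quarter.items()}
factor = 400 / sum(unadjusted.values())          # force the indexes to sum to 400
adjusted = {q: u * factor for q, u in unadjusted.items()}

for q in sorted(adjusted):
    print(f"Q{q}: unadjusted {unadjusted[q]:.2f}, adjusted {adjusted[q]:.2f}")
```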
(c)
n = 12
Period (x)   Policies (y)   x²     xy
1            212            1      212
2            186            4      372
3            192            9      576
4            205            16     820
5            186            25     930
6            165            36     990
7            169            49     1183
8            182            64     1456
9            169            81     1521
10           158            100    1580
11           162            121    1782
12           178            144    2136
Σ 78         2164           650    13558
b1 = (12*13558 - 78*2164)/(12*650 - 78²) = -3.55245
b0 = (2164 - (-3.552448)*78)/12 = 203.42
ŷ = 203.42 - 3.5524 x
x = 1 in Q1 2008, 2 in Q2 2008, 3 in Q3 2008, ...
(d)
Seasonally-adjusted trend estimate of no. of Surrendered Endowment Policies in 2011
Period    x    Trend ŷ   Seasonal Index   Seasonally adjusted Trend   (Rounded)
Q1 2011   13   157.24    100.77           158.45                      158
Q2 2011   14   153.69    93.62            143.88                      144
Q3 2011   15   150.13    97.99            147.12                      147
Q4 2011   16   146.58    107.62           157.75                      158
Exercise 15.27
(a)
File: X15.27 - company liquidations.xlsx
Period   Liquidations   3-per M A   5-per M A
1        246
2        243            252.7
3        269            289.7       255.6
4        357            263.0       237.2
5        163            224.7       210.4
6        154            142.0       189.0
7        109            141.7       162.0
8        162            164.3       184.0
9        222            219.0       210.0
10       273            259.7       249.2
11       284            287.3       275.4
12       305            294.0       300.6
13       293            315.3       330.6
14       348            354.7       332.0
15       423            354.0       335.0
16       291            344.7       327.0
17       320            288.0       304.2
18       253            269.0       252.0
19       234            216.3       241.8
20       162            212.0       237.4
21       240            233.3       239.6
22       298            267.3       243.4
23       264            271.7       269.6
24       253            270.0       282.0
25       293            282.7       260.0
26       302            261.0
27       188
(b)
[Chart: Company Liquidations (Actual vs 3 and 5 period Moving Averages); y-axis: no. of liquidations; x-axis: periods; series: Liquidations, 3-per M A, 5-per M A]
(c)
Interpretation
The level of business liquidations shows no clear upward or downward trend over the
past 27 periods. There is a distinct cyclical pattern, with the longest cycle lasting from
period 7 to period 20.
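The 3-period and 5-period moving averages in part (a) can be reproduced as below. A minimal Python sketch, assuming the 27 liquidation counts from the table; `moving_average` is an illustrative helper, and each average is centred on the middle period of its window.

```python
# Company liquidations for periods 1 to 27, from Exercise 15.27.
liquidations = [
    246, 243, 269, 357, 163, 154, 109, 162, 222, 273, 284, 305, 293, 348,
    423, 291, 320, 253, 234, 162, 240, 298, 264, 253, 293, 302, 188,
]


def moving_average(values, k):
    """k-period moving averages; the i-th result covers values[i : i + k]."""
    return [round(sum(values[i:i + k]) / k, 1) for i in range(len(values) - k + 1)]


ma3 = moving_average(liquidations, 3)   # first value belongs to period 2
ma5 = moving_average(liquidations, 5)   # first value belongs to period 3

print(ma3[:3], "...", ma3[-1])
print(ma5[:3], "...", ma5[-1])
```

A longer window smooths more of the short-term variation, which is why the 5-period series shows the cycles more clearly than the 3-period series.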
Exercise 15.28
File:
X15.28 - passenger tyres.xlsx
Passenger Tyre Sales (Y)
(a)
Time
Periods
2005 Q1
2005 Q2
2005 Q3
2005 Q4
2006 Q1
2006 Q2
2006 Q3
2006 Q4
2007 Q1
2007 Q2
2007 Q3
2007 Q4
2008 Q1
2008 Q2
2008 Q3
2008 Q4
2009 Q1
2009 Q2
2009 Q3
2009 Q4
2010 Q1
2010 Q2
2010 Q3
2010 Q4
2011 Q1
2011 Q2
Quarterly Seasonal Ratios and Trend Line Equation
Tyre Sales
64876
58987
54621
62345
68746
66573
60927
71234
78788
71237
68098
74444
77659
76452
73456
78908
84563
81243
74878
86756
91556
85058
77035
80145
102923
96456
(a)
Trend line
Uncentred 4
period
Moving Total
2x4 period
Moving Total
Centred 4
period
Moving
Average
Seasonal
ratios
485528
496984
510876
526071
545002
559708
571543
581924
584005
588091
598664
608486
619854
631549
637762
647032
661873
672681
678653
674199
678955
701720
60691.0
62123.0
63859.5
65758.9
68125.3
69963.5
71442.9
72740.5
73000.6
73511.4
74833.0
76060.8
77481.8
78943.6
79720.3
80879.0
82734.1
84085.1
84831.6
84274.9
84869.4
87715.0
90.00
100.36
107.65
101.24
89.43
101.82
110.28
97.93
93.28
101.27
103.78
100.51
94.80
99.95
106.07
100.45
90.50
103.18
107.93
100.93
90.77
91.37
Q1
Q2
Q3
Q4
107.76
100.61
90.72
100.91
240829
244699
252285
258591
267480
277522
282186
289357
292567
291438
296653
302011
306475
313379
318170
319592
327440
334433
338248
340405
333794
345161
356559
Seasonal Indexes
Use Excel 's Add Trendline function
Unadjusted Adjusted
Seasonal Seasonal
Indexes
Indexes
90.64
100.81
107.65
100.51
90.72
100.91
107.76
100.61
399.62
400.00
[Chart: Passenger Car Tyre Sales, with linear trend line y = 1302x + 58114; y-axis: units sold; x-axis: quarters; series: Tyre Sales, Linear (Tyre Sales)]
(b)
ŷ = 58114 + 1302 x
x = 1 in Q1 2005, 2 in Q2 2005, 3 in Q3 2005, ...
(c)
Seasonally-adjusted Trend estimates of passenger car tyre sales (in units) for 2012 / 2013.
Time Periods   Time   Trend (ŷ)   Seasonal Indices   Seasonally Adjusted Trend Estimate
2012 Q1        29     95872       107.76             103312
2012 Q2        30     97174       100.61             97767
2012 Q3        31     98476       90.72              89337
2012 Q4        32     99778       100.91             100686
2013 Q1        33     101080      107.76             108924
2013 Q2        34     102382      100.61             103007
2013 Q3        35     103684      90.72              94062
2013 Q4        36     104986      100.91             105941
Interpretation
The pattern of passenger car tyre sales is very stable.
The trend is linear and upward, and seasonal variations are consistent over time.
Hence Hillstone management could have high confidence in the estimates.
Exercise 15.29
File:
X15.29 - outpatient attendances.xlsx
Outpatients Attendances - Butterworth Clinic
(a)
Quarterly Seasonal Ratios and Trend Line Equation
Time
Periods
Visits
2006 Q1
2006 Q2
2006 Q3
2006 Q4
2007 Q1
2007 Q2
2007 Q3
2007 Q4
2008 Q1
2008 Q2
2008 Q3
2008 Q4
2009 Q1
2009 Q2
2009 Q3
2009 Q4
2010 Q1
2010 Q2
2010 Q3
2010 Q4
2011 Q1
2011 Q2
2011 Q3
2011 Q4
12767
16389
19105
15780
14198
19868
20899
18304
14641
20204
21078
16077
14075
21259
20967
16183
12412
21824
22150
17979
17417
20568
24310
23118
Uncentred
4 period
Moving
Total
64041
65472
68951
70745
73269
73712
74048
74227
72000
71434
72489
72378
72484
70821
71386
72569
74365
79370
78114
80274
85413
67996
47428
2x4 period
Moving
Total
Centred 4
period
Moving
Average
129513
134423
139696
144014
146981
147760
148275
146227
143434
143923
144867
144862
143305
142207
143955
146934
153735
157484
158388
165687
153409
115424
16189.1
16802.9
17462.0
18001.8
18372.6
18470.0
18534.4
18278.4
17929.3
17990.4
18108.4
18107.8
17913.1
17775.9
17994.4
18366.8
19216.9
19685.5
19798.5
20710.9
19176.1
14428.0
118.01
93.91
81.31
110.37
113.75
99.10
78.99
110.53
117.56
89.36
77.73
117.40
117.05
91.04
68.98
118.82
115.26
91.33
87.97
99.31
126.77
160.23
Q1
Q2
Q3
Q4
79.10
110.69
117.46
92.75
Seasonal Indexes
Trend line Use Excel 's Add Trendline function
Seasonal
ratios
Unadjusted Adjusted
Seasonal Seasonal
Indexes
Indexes
117.31
92.62
78.99
110.53
117.46
92.75
79.10
110.69
399.46
400.00
[Chart: Outpatient Visits, with linear trend line y = 224.66x + 15591; y-axis: no. of patients; x-axis: quarters (2006 - 2011); series: Visits, Linear (Visits)]
(b)
ŷ = 15591 + 224.66 x
x = 1 in Q1 2006, 2 in Q2 2006, 3 in Q3 2006, ...
(c)
Seasonally-adjusted Trend estimates of Outpatient Visits (Butterworth)
for 2012 / 2013 (first half).
Time Periods   Time   Trend (ŷ)   Seasonal Indices   Seasonally Adjusted Trend Estimate
2012 Q1        25     21207.5     79.10              16775
2012 Q2        26     21432.2     110.69             23723
2012 Q3        27     21656.8     117.46             25438
2012 Q4        28     21881.5     92.75              20295
2013 Q1        29     22106.1     79.10              17486
2013 Q2        30     22330.8     110.69             24718
Interpretation
The pattern of outpatient attendances at the Butterworth Clinic is very stable.
It shows a steady upward trend with highly consistent seasonal variations.
Demand peaks in the winter months (Q3) and is lowest in the summer months (Q1).
Exercise 15.30
File:
X15.30 - construction absenteeism.xlsx
Construction Absenteeism
(a)
Quarterly Seasonal Ratios and Trend Line Equation
Time
Days_lost
Periods
933
865
922
864
967
936
931
902
892
845
907
801
815
715
779
711
822
856
762
722
785
715
740
704
2x4
period
Moving
Total
Centred 4
period
Moving
Average
3584
3618
3689
3698
3736
3661
3570
3546
3445
3368
3238
3110
3020
3027
3168
3151
3162
3125
2984
2962
2944
2159
1444
7202
7307
7387
7434
7397
7231
7116
6991
6813
6606
6348
6130
6047
6195
6319
6313
6287
6109
5946
5906
5103
3603
900.3
913.4
923.4
929.3
924.6
903.9
889.5
873.9
851.6
825.8
793.5
766.3
755.9
774.4
789.9
789.1
785.9
763.6
743.3
738.3
637.9
450.4
102.42
94.59
104.72
100.73
100.69
99.79
100.28
96.70
106.50
97.00
102.71
93.31
103.06
91.82
104.07
108.47
96.96
94.55
105.62
96.85
116.01
156.31
Q1
Q2
Q3
Q4
104.21
96.98
102.88
95.93
Seasonal Indexes
Seasonal
ratios
Unadjusted Adjusted
Seasonal Seasonal
Indexes
Indexes
102.74
95.80
104.07
96.85
102.88
95.93
104.21
96.98
399.45
400.00
Trend line Use Excel 's Add Trendline function
Construction Industry Days Lost due to Absenteeism
y = -9.917x + 952.75
1000
900
no. of days lost
2006 Q1
2006 Q2
2006 Q3
2006 Q4
2007 Q1
2007 Q2
2007 Q3
2007 Q4
2008 Q1
2008 Q2
2008 Q3
2008 Q4
2009 Q1
2009 Q2
2009 Q3
2009 Q4
2010 Q1
2010 Q2
2010 Q3
2010 Q4
2011 Q1
2011 Q2
2011 Q3
2011 Q4
Uncentred 4
period
Moving
Total
800
700
600
Days_lost
500
400
Linear (Days_lost)
1
2
3
4
5
6
7
8
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
quarters (2006 - 2011)
(b)
ŷ = 952.75 - 9.917 x
x = 1 in Q1 2006, 2 in Q2 2006, 3 in Q3 2006, ...
(c)
Seasonally-adjusted Trend estimates of Days Lost in Construction Industry for 2012.
Time Periods   Time   Trend (ŷ)   Seasonal Indices   Seasonally Adjusted Trend Estimate
2012 Q1        25     704.8       104.21             734
2012 Q2        26     694.9       96.98              674
2012 Q3        27     685.0       102.88             705
2012 Q4        28     675.1       95.93              648
Interpretation
The pattern of days lost due to absenteeism in the Construction industry shows
a distinct downward trend but with inconsistent seasonal variations.
CHAPTER 16
FINANCIAL CALCULATIONS
INTEREST, ANNUITIES and NPV
Exercise 16.1
Simple interest
Interest is computed on the original lump sum for each period.
Compound interest
For each period, interest is computed on the original lump sum
plus all accumulated interest of the preceding periods.
Exercise 16.2
No - a compounded amount will earn more interest than a
simple interest investment.
Exercise 16.3
Yes - compounding quarterly will result in interest being
capitalised sooner - and therefore earning more interest than
an annual compounded investment.
Exercise 16.4
Nominal interest rate is the quoted per annum interest rate
Effective interest rate is the actual interest rate achieved
when interest is compounded more than once per year.
Exercise 16.5
Annuity - an annuity is when a constant sum of money is paid
(or received) at regular intervals over a period of time.
Exercise 16.6
Ordinary annuity - regular payments begin the first period of the annuity term
Deferred annuity - regular payments begin only at some future period into
the term of the annuity.
Exercise 16.7
Ordinary annuity certain - the series of regular payments take place at the
end of each payment period.
Ordinary annuity due - the series of regular payments take place at the
beginning of each payment period.
Exercise 16.8
NPV (net present value) converts all cash inflows (and outflows) over time
into present value terms by discounting each one at the cost of capital
(the cash flow in year t is divided by (1 + i)^t) and subtracting the initial outlay.
It represents future cash flows in current terms.
Exercise 16.9
(a)
Fv = 15000*(1+0.08*5) = R21 000.00
(b)
Fv = 15000*(1+0.08)^5 = R22 039.92
(c)
Fv = 15000*(1+0.04)^10 = R22 203.66
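The three future values in Exercise 16.9 differ only in how interest is applied. A minimal Python sketch of the simple and compound interest formulas used above; the function names and the `periods_per_year` parameter are illustrative assumptions.

```python
def simple_fv(pv, annual_rate, years):
    """Future value under simple interest: Pv * (1 + i * n)."""
    return pv * (1 + annual_rate * years)


def compound_fv(pv, annual_rate, years, periods_per_year=1):
    """Future value under compound interest with m compounding periods per year."""
    rate_per_period = annual_rate / periods_per_year
    n_periods = years * periods_per_year
    return pv * (1 + rate_per_period) ** n_periods


fv_simple = simple_fv(15000, 0.08, 5)            # part (a)
fv_annual = compound_fv(15000, 0.08, 5)          # part (b)
fv_half_yearly = compound_fv(15000, 0.08, 5, 2)  # part (c): 4% per half year, 10 periods

print(round(fv_simple, 2), round(fv_annual, 2), round(fv_half_yearly, 2))
```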
Exercise 16.10
(a)
Pv = 150000/(1+0.12)^2 = R119 579.10
(b)
Pv = 150000/(1+0.06)^4 = R118 814
(c)
Pv = 150000/(1+0.01)^24 = R118 134.90
Exercise 16.11
(a)
n = (3/1-1)/0.16 =
12.5 years
(b)
n = log(3/1)/log(1+0.16) =
7.402 years
(c)
n = log(3/1)/log(1+0.04) =
28.011 quarters
(or 7.003 years)
Exercise 16.12
(a)
Pv = 10525/(1+0.14/12*30) =
R 7,796.30
(b)
Pv = 10525/(1+0.14)^2.5 =
R 7,585.08
Exercise 16.13
(a)
Effective Rate = (1+0.15/12)^12 - 1 =
16.0755%
(b)
=EFFECT(0.15,12)
16.0755%
Exercise 16.14
(a)
Effective Rate = (1+0.09/4)^4 - 1 =
9.3083%
(b)
=EFFECT(0.09,4)
9.3083%
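Exercises 16.13 and 16.14 both convert a nominal annual rate into an effective annual rate. A minimal Python sketch of that formula, intended to mirror what Excel's EFFECT function returns; `effective_rate` is an illustrative name.

```python
def effective_rate(nominal_rate, periods_per_year):
    """Effective annual rate for a nominal rate compounded m times per year."""
    return (1 + nominal_rate / periods_per_year) ** periods_per_year - 1


monthly_15 = effective_rate(0.15, 12)    # Exercise 16.13
quarterly_9 = effective_rate(0.09, 4)    # Exercise 16.14

print(f"{monthly_15:.4%}", f"{quarterly_9:.4%}")
```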
Exercise 16.15
n = log(58890/25000)/log(1+0.11/2) = 16.00267 half years
n = 8.0013 years
Exercise 16.16
Part 1 - First 3 months
Fv = 2000*(1+0.10/2)^0.5 =
R 2,049.39
Part 2 - Remaining 21 months
Fv = 2049.39*(1+0.12/12)^21 =
R 2,525.65
The value of the investment after 2 years is
R 2,525.65
Exercise 16.17
Quarterly rate (i) =
(10200/7500)^(1/(3*4)) - 1 =
0.02595 (2.595% per quarter)
Annual rate =
(0.02595*4) =
10.3819% p.a.
Exercise 16.18
Monthly rate (i) =
(8000/5000)^(1/(4*12)) - 1 =
0.00984 (0.984% per month)
Annual rate =
(0.00984*12) =
11.8078% p.a.
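Exercises 16.17 and 16.18 solve the compound interest formula for the rate per period and then scale it to a nominal annual rate. A minimal Python sketch under that reading; `periodic_rate` is an illustrative name, not a library function.

```python
def periodic_rate(pv, fv, n_periods):
    """Rate per period that grows pv to fv over n compounding periods."""
    return (fv / pv) ** (1 / n_periods) - 1


# Exercise 16.17: R7 500 grows to R10 200 over 3 years, compounded quarterly.
quarterly = periodic_rate(7500, 10200, 3 * 4)
nominal_16_17 = quarterly * 4

# Exercise 16.18: R5 000 grows to R8 000 over 4 years, compounded monthly.
monthly = periodic_rate(5000, 8000, 4 * 12)
nominal_16_18 = monthly * 12

print(f"{quarterly:.5f} per quarter, {nominal_16_17:.4%} p.a.")
print(f"{monthly:.5f} per month, {nominal_16_18:.4%} p.a.")
```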
Exercise 16.19
Pv =
25000/(1+0.09/4)^11 =
R 19,572.37
The investor must deposit R19 572.37 today.
Exercise 16.20
n =
log(30000/21353.4)/log(1+0.12) =
3 years
Exercise 16.21
Let Pv = R1 and Fv = R2
Quarterly rate (i) = (2/1)^(1/(7*4)) - 1 = 0.025064
Annual rate (%) = 0.025064*4 = 10.0257% p.a.
Exercise 16.22
Ordinary Annuity Certain
(a)
Fv = 1600*((1+0.12/12)^15 - 1)/(0.12/12) = R 25,755.03
(b)
=FV(0.12/12,15,1600,,0)
R 25,755.03
Ordinary Annuity Due
(c)
Fv = 1600*((1+0.12/12)^15 - 1)*(1+0.12/12)/(0.12/12) = R 26,012.58
(d)
=FV(0.12/12,15,1600,,1)
R 26,012.58
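The only difference between the two results in Exercise 16.22 is one extra period of interest on every payment when payments fall at the start of the period. A minimal Python sketch of the future value formula; `annuity_fv` and its `due` flag are illustrative names that play the role of the last argument of Excel's FV function.

```python
def annuity_fv(payment, rate_per_period, n_periods, due=False):
    """Future value of a level annuity; due=True means payments at period start."""
    factor = ((1 + rate_per_period) ** n_periods - 1) / rate_per_period
    if due:
        factor *= 1 + rate_per_period   # each payment earns one more period of interest
    return payment * factor


fv_certain = annuity_fv(1600, 0.12 / 12, 15)         # parts (a) and (b)
fv_due = annuity_fv(1600, 0.12 / 12, 15, due=True)   # parts (c) and (d)

print(round(fv_certain, 2), round(fv_due, 2))
```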
Exercise 16.23
Compound Interest
Fv1 = Pv*(1+0.07)^9
Simple Interest
Fv2 = Pv*(1+0.07*9)
Difference
Fv1 - Fv2 = 334.16
Pv*(1+0.07)^9 - Pv*(1+0.07*9) = 334.16
Pv = 334.16/((1+0.07)^9 - (1+0.07*9)) = 1602.999
Capital Sum (Pv) = R 1,603.00
Exercise 16.24
(a)
Car price in 3 years
Fv (Compound Interest) = 80000*(1+0.04)^3 = 89989.12
Invest at end of month
R = 89989.12/(((1+0.09/12)^(3*12) - 1)/(0.09/12)) = R 2,186.71
(b)
Invest at beginning of month
R = 89989.12/(((1+0.09/12)^(3*12) - 1)*(1+0.09/12)/(0.09/12)) = R 2,170.43
Exercise 16.25
(a)
R = 8500/((1-(1+0.18/12)^(-3*12))/(0.18/12)) = R 307.30
(b)
Total paid = 36*307.30 = 11062.8
Interest amt = 11062.8 - 8500 = R 2,562.80
% of debt = 2562.8/8500 = 30.15%
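The repayment in Exercise 16.25 comes from the present value formula for an ordinary annuity, solved for the regular payment. A minimal Python sketch; `loan_payment` is an illustrative name. Because the payment is not rounded to cents before totalling, the total and interest figures may differ from the manual's by a few cents.

```python
def loan_payment(principal, rate_per_period, n_periods):
    """Level end-of-period payment that repays principal over n periods."""
    pv_factor = (1 - (1 + rate_per_period) ** (-n_periods)) / rate_per_period
    return principal / pv_factor


n = 3 * 12
payment = loan_payment(8500, 0.18 / 12, n)
total_paid = payment * n
interest = total_paid - 8500
interest_pct = interest / 8500 * 100

print(f"payment R{payment:.2f}, total R{total_paid:.2f}, "
      f"interest R{interest:.2f} ({interest_pct:.2f}% of debt)")
```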
Exercise 16.26
Present value of an Ordinary Annuity Certain.
Pv = 8750*((1-(1+0.1/12)^(-(5*12)))/(0.1/12)) = R 411,821.98
The employee will receive a gratuity of R411 821.98.
Exercise 16.27
Ordinary Annuity Due
(a)
Fv = 750*((1+0.145/4)^(4*15) - 1)*(1+0.145/4)/(0.145/4) = R 160,149.71
(b)
=FV(0.145/4,60,750,,1)
(using Excel function)
R 160,149.71
Exercise 16.28
Ordinary annuity certain (OAC) for 2 years with R = 540.
Fv1 (OAC) = 540*((1+0.12/12)^(2*12) - 1)/(0.12/12) = R14 565.67
Then compute Fv on the capital sum after 2 years until maturity (for 7 years).
Fv1 (CI) = 14565.67*(1+0.12/12)^(7*12) = R33 598.96
Ordinary annuity certain (OAC) for 7 years with R = 750.
Fv2 (OAC) = 750*((1+0.12/12)^(7*12) - 1)/(0.12/12) = R98 004.21
Total Funds Available
Fv1 (CI) + Fv2 (OAC) = R131 603.17
Exercise 16.29
(a)
Ordinary Annuity Certain
Fv (monthly) = 1000*((1+0.085/12)^(1*12)-1)/(0.085/12) = R12 478.72
Fv (quarterly) = 3000*((1+0.10/4)^(1*4)-1)/(0.10/4) = R12 457.55
Conclusion
It is better to invest monthly.
(b)
Ordinary Annuity Certain - Using Excel's function, FV.
Fv (monthly) = FV(0.085/12,12,1000,,0) = R12 478.72
Fv (quarterly) = FV(0.10/4,4,3000,,0) = R12 457.55
(c)
Ordinary Annuity Due - Using Excel's function, FV.
Fv (monthly) = FV(0.085/12,12,1000,,1) = R12 567.11
Fv (quarterly) = FV(0.10/4,4,3000,,1) = R12 768.99
Conclusion
It is now better to invest quarterly.
Exercise 16.30
(a)
PV of an Ordinary Annuity Certain
Pv = 2200*(1-(1+0.09/12)^(-(4*12)))/(0.09/12) = R88 406.52
Deposit = R20 000.00
Total Purchase Price of Motor Vehicle = R108 406.52
(b)
Pv of an Ordinary Annuity Certain - Using Excel's function, PV
Purchase Price = PV(0.09/12,48,2200,,0) + deposit = R88 406.52 + R20 000 = R108 406.52
Exercise 16.31
(a)
Using Ordinary Annuity Certain for 2 years
Fv1 (2 years) = 1000*((1+0.08/12)^(2*12)-1)/(0.08/12) = R25 933.19
Using Compound Interest on Capital Sum for 1 year.
Fv1 (1 year) = 25933.19*(1+0.1/12)^(1*12) = R28 648.73
Using Ordinary Annuity Certain for 1 year
Fv2 (1 year) = 1000*((1+0.10/12)^(1*12)-1)/(0.10/12) = R12 565.57
Maturity value after 3 years
R28 648.73 + R12 565.57 = R41 214.30
(b)
Using Excel's function FV with a compound interest calculation
=FV(0.08/12,24,1000,,0)*(1+0.1/12)^12 + FV(0.1/12,12,1000,,0)
R41 214.30
Exercise 16.32
(a)
Months 1 - 5
FV1 = 200*((1+0.12/12)^5 - 1)/(0.12/12) = R1020.20
Withdrawal = R 300.00
Balance = R1020.20 - R300.00 = R 720.20
Compound Interest = 720.2*(1+0.12/12)^7 = R 772.15
Months 6 - 10
FV2 = 200*((1+0.12/12)^5 - 1)/(0.12/12) = R1020.20
Withdrawal = R 300.00
Balance = R1020.20 - R300.00 = R 720.20
Compound Interest = 720.2*(1+0.12/12)^2 = R 734.68
Months 11, 12
FV3 = 200*((1+0.12/12)^2 - 1)/(0.12/12) = R 402.00
Total Amount Available after 12 months =
R772.15 + R734.68 + R402.00 = R1908.83
(b)
Using Excel's Function, FV
Months 1 - 5 (Ordinary Annuity Certain - R300 + Compound Interest for 7 months)
=(-(FV(0.12/12,5,200,,0))-300)*(1+0.12/12)^7
R 772.15
Months 6 - 10 (Ordinary Annuity Certain - R300 + Compound Interest for 2 months)
=(-(FV(0.12/12,5,200,,0))-300)*(1+0.12/12)^2
R 734.68
Months 11, 12 (Ordinary Annuity Certain)
=(-FV(0.01,2,200,,0))
R 402.00
Total Amount Available after 12 months =
R772.15 + R734.68 + R402.00 = R1908.83
Exercise 16.33
(a)
Option 1
Repayment at the end of every month
R = 26000/((1-(1+0.14/4)^(-3*4))/(0.14/4)) = R2 690.58
(b)
Option 2
Repayment at the beginning of every month
R = 26000/(((1-(1+0.14/4)^(-3*4))*(1+0.14/4))/(0.14/4)) = R2 599.60
(c)
The student should choose to repay at the beginning of every month.
Total repayment will be less than repaying at the end of every month.
Exercise 16.34
Present Value (Pv) of a Deferred Annuity
Factor 1 = (1-(1+0.16/12)^(-(24+36)))/(0.16/12) = 41.12171
Factor 2 = (1-(1+0.16/12)^(-(36)))/(0.16/12) = 28.44381
Difference = 41.12171 - 28.44381 = 12.6779
R = 30000/12.6779 = R2 366.32
The repayment per month is R2 366.32.
Exercise 16.35
Rate (i) (half yearly) = (40697.7/18000)^(1/10) - 1 = 0.085
Rate (i) (nominal % p.a.) = 0.085*2 = 17% p.a.
Exercise 16.36
Re-arrange the Fv formula for an Ordinary Annuity Certain
Factor 1 = (103757.98/2000)*(0.12/12)+1 = 1.51879
n = LOG(1.51879)/LOG(1+0.12/12) = 42 months
The house owner will take 3 years and 6 months to save R103 757.98.
Exercise 16.37
INVESTMENT OPTIONS
                                  Trucking   Laundry
Initial investment (R)            60000      60000
Annual Cash Flow (R)
year 1                            32000      0
year 2                            38500      7500
year 3                            26000      45000
year 4                            13000      37500
year 5                            9500       55500
Present values at 12% p.a. cost of capital
1                                 28571      0
2                                 30692      5979
3                                 18506      32030
4                                 8262       23832
5                                 5391       31492
Total                             91422      93333
NPV                               R31 422    R33 333
Recommend the purchase of the Laundry business.
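The two NPV figures in Exercise 16.37 can be reproduced by discounting each year's cash flow at 12% and subtracting the initial investment. A minimal Python sketch, assuming year-end cash flows as in the table; `npv` is an illustrative helper, not Excel's NPV function.

```python
def npv(rate, initial_investment, cash_flows):
    """Net present value: discounted year-end cash flows minus the initial outlay."""
    present_values = [
        cf / (1 + rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    ]
    return sum(present_values) - initial_investment


COST_OF_CAPITAL = 0.12
trucking = [32000, 38500, 26000, 13000, 9500]
laundry = [0, 7500, 45000, 37500, 55500]

npv_trucking = npv(COST_OF_CAPITAL, 60000, trucking)
npv_laundry = npv(COST_OF_CAPITAL, 60000, laundry)

best = "Laundry" if npv_laundry > npv_trucking else "Trucking"
print(f"Trucking NPV: R{npv_trucking:,.0f}")
print(f"Laundry NPV:  R{npv_laundry:,.0f}")
print("Higher NPV:", best)
```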