Applied Business Statistics
METHODS AND EXCEL-BASED APPLICATIONS
4th Edition
SOLUTIONS MANUAL
TREVOR WEGNER
www.jutaacademic.co.za

Applied Business Statistics: Methods and Excel-based Applications: Solutions Manual
Print edition first published in 1993
Reprinted 2000 and 2003
Second Edition 2008
Third Edition 2012
Fourth edition 2015 (Web PDF)

Juta and Company (Pty) Ltd
First Floor Sunclare Building, 21 Dreyer Street, Claremont 7708
PO Box 14373, Lansdowne 7779, Cape Town, South Africa
© 2015 Juta & Company (Pty) Ltd
ISBN 978 1 48511 788 9 (Web PDF)

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publisher. Subject to any applicable licensing terms and conditions in the case of electronically supplied publications, a person may engage in fair dealing with a copy of this publication for his or her personal or private use, or his or her research or private study. See section 12(1)(a) of the Copyright Act 98 of 1978.

The author and the publisher believe on the strength of due diligence exercised that this work does not contain any material that is the subject of copyright held by another person. In the alternative, they believe that any protected pre-existing material that may be comprised in it has been used with appropriate authority or has been used in circumstances that make such use permissible under the law.

CHAPTER 1
STATISTICS IN MANAGEMENT

1.1 It is a decision support tool. It generates evidence-based information through the analysis of data to inform management decision making.
1.2 Descriptive statistics summarises (profiles) sample data; inferential statistics generalises sample findings to a broader population (to estimate values or confirm relationships).
1.3 Statistical modelling explores and quantifies relationships between variables for estimation or prediction purposes.
1.4 Data quality is influenced by: (i) Data source; (ii) Data collection method; and (iii) Data type.
1.5 Different statistical methods are valid for different data types.
1.6 In data preparation, consider (i) Data relevancy; (ii) Data cleaning; and (iii) Data enrichment.
1.7
(a) Random variable: Performance appraisal system used
(b) Population: All JSE companies
(c) Sample: The 68 HR managers surveyed
(d) Sampling unit: a JSE-listed company
(e) 46% is a sample statistic
(f) Random sampling is necessary to allow valid inferences to be drawn based on the sample evidence.
1.8
(a) Random variable: Female magazine readership
(b) Population: All female magazine readers
(c) Sample: The 2000 randomly selected female magazine readers
(d) Sampling unit: a female reader of a female magazine
(e) 35% (700/2000) is a sample statistic
(f) Inferential statistics, as its purpose is to test the belief that market share = 38%
1.9
(a) Three (3) random variables. They are: (i) weekly sales volume; (ii) number of ads placed per week; (iii) advertising media used.
(b) Dependent variable = weekly sales volume
(c) Independent variables = number of ads placed per week; advertising media used.
(d) Statistical model building (predict sales volume from ads placed and media used)

1.10
Scenario 1   Inferential statistics
Scenario 2   Descriptive statistics
Scenario 3   Descriptive statistics
Scenario 4   Inferential statistics
Scenario 5   Inferential statistics
Scenario 6   Inferential statistics
Scenario 7   Inferential statistics

1.11
(a) numeric, ratio-scaled, continuous {21,4 years; 34,6 years}
(b) numeric, ratio-scaled, continuous {416,2m²; 3406,8m²}
(c) categorical, ordinal-scaled, discrete {matric; diploma}
(d) categorical, nominal-scaled, discrete {married; single}
(e) categorical, nominal-scaled, discrete {Boeing; Airbus}
(f) categorical, nominal-scaled, discrete {verbal; emotional}
(g) numeric, ratio-scaled, discrete {41; 62}
(h) categorical, ordinal-scaled, discrete {salary only; commission only}
(i) (i) categorical, ordinal-scaled, discrete {1 = apple; 2 = orange}
    (ii) categorical, nominal-scaled, discrete {yes; no}
    (iii) categorical, nominal-scaled, discrete {train; bus}
    (iv) numeric, interval-scaled, discrete {2; 5}
(j) numeric, ratio-scaled, continuous {12,4kg; 7,234kg}
(k) categorical, nominal-scaled, discrete {Nescafe; Jacobs}
(l) numeric, ratio-scaled, continuous {26,4 min; 38,66 min}
(m) categorical, ordinal-scaled, discrete {Super; Standard}
(n) numeric, ratio-scaled, continuous {R85,47; R2315,22}
(o) numeric, ratio-scaled, discrete {75; 23}
(p) numeric, ratio-scaled, discrete {5; 38}
(q) numeric, ratio-scaled, continuous {9,54 hours; 10,12 hours}
(r) numeric, interval-scaled, discrete {2; 6}
(s) numeric, ratio-scaled, discrete {75; 238}
(t) categorical, nominal-scaled, discrete {Growth funds; Industrial funds}

1.12
(a) 11 random variables
(b) Data type and (c) Illustration value of each random variable
Random variable        (b) Data type                            (c) Illustration value
Economic sector        categorical, nominal-scaled, discrete    {retail}
Head office region     categorical, nominal-scaled, discrete    {Gauteng}
Company size           numeric, ratio-scaled, discrete          {242}
Turnover               numeric, ratio-scaled, continuous        {R3 432 562}
Share price            numeric, ratio-scaled, continuous        {R18.48}
Earnings per share     numeric, ratio-scaled, continuous        {R2.16 / share}
Dividends per share    numeric, ratio-scaled, continuous        {R0.86 / share}
Number of shares       numeric, ratio-scaled, discrete          {12 045 622}
ROI (%)                numeric, ratio-scaled, continuous        {8.64%}
Inflation index (%)    numeric, ratio-scaled, continuous        {6.75%}
Year established       numeric, ratio-scaled, discrete          {1988}

1.13
(a) 13 random variables
(b) Data type and (c) Illustration value of each random variable
Random variable          (b) Data type                             (c) Illustration value
Gender                   categorical, nominal-scaled, discrete     {female}
Home language            categorical, nominal-scaled, discrete     {Xhosa}
Position                 categorical, ordinal-scaled, discrete     {middle manager}
Join date                numeric, ratio-scaled, discrete           {1998}
Status                   categorical, ordinal-scaled, discrete     {gold status}
Claimed                  categorical, nominal-scaled, discrete     {yes}
Problems                 categorical, nominal-scaled, discrete     {yes}
Yes problem              categorical, nominal-scaled, discrete     {online access difficult}
Services - airlines      numeric, interval-scaled, discrete        {2}
Services - car rentals   numeric, interval-scaled, discrete        {5}
Services - hotels        numeric, interval-scaled, discrete        {4}
Services - financial     numeric, interval-scaled, discrete        {2}
Services - telecomms     numeric, interval-scaled, discrete        {2}

1.14
Financial Analysis data: mainly numeric (quantitative), ratio-scaled.
Voyager Services Quality data: mainly categorical (qualitative); but when rating scales are used, such as in Question 8, the data is numeric, but interval-scaled and discrete.

---ooOoo---

CHAPTER 2
EXPLORATORY DATA ANALYSIS
SUMMARISING DATA
SUMMARY TABLES AND GRAPHS

Exercise 2.1 A picture is worth a thousand words.

Exercise 2.2
(a) bar (or pie) chart
(b) multiple (or stacked) bar chart
(c) histogram
(d) scatter plot

Exercise 2.3 Cross-tabulation table (or joint frequency table; or two-way pivot table).
Exercise 2.4
Bar chart
(i) displays data on a categorical variable
(ii) categories can be displayed in any order
(iii) width of bars is arbitrary (but all bars are of equal width)
Histogram
(i) displays numerical data only
(ii) intervals must be continuous (and of constant width) and sequential
(iii) width of bars is determined by the interval width

Exercise 2.5 Line graph

Exercise 2.6 File: X2.6 - magazines.xlsx
(a) Magazine preferences by female teenagers
Magazine     Count     %
True Love       95    19%
Seventeen      146    29%
Heat           118    24%
Drum            55    11%
You             86    17%
Total          500   100%
[Pie chart: Percent of Female Teenagers, by magazine]
(b) Interpretation
Seventeen is the most popular teenage magazine (29% of female teenagers prefer it). Almost a quarter of the female readers surveyed prefer reading Heat (24%), while the least preferred magazine is Drum, with only 11% of female magazine readers preferring it.

Exercise 2.7 File: X2.6 - magazines.xlsx
(a) Magazine preferences by female teenagers
Magazine     Count     %
True Love       95    19%
Seventeen      146    29%
Heat           118    24%
Drum            55    11%
You             86    17%
Total          500   100%
[Bar chart: % of Female Teenagers per Magazine]
(b) Heat is preferred by 24% of all female teenage readers.
Exercise 2.8 File: X2.8 - job grades.xlsx
(a) and (b) Categorical Frequency Table - Job Grades
Job grade    Count      %
A               14    35%
B               11    27.5%
C                6    15%
D                9    22.5%
Total           40    100%
(c) 22.5% of employees are in job grade D
(d) Bar Chart and Pie Chart - Job Grades
[Bar chart: % of Employees per Job Grade]
[Pie chart: % Employees per Job Grade]

Exercise 2.9 File: X2.9 - office rentals.xlsx
(a) and (b) Numerical Frequency Distribution and Cumulative Frequency Distribution
Rentals (R / m²)    Count    % Count    Cum %
≤ 200                   4      13.3     13.3%
201 - ≤250              8      26.7     40.0%
251 - ≤300              9      30.0     70.0%
301 - ≤350              6      20.0     90.0%
351 - ≤400              3      10.0    100.0%
More                    0       0.0    100.0%
Total                  30     100.0
(c)
(i) 13.3% of all office space costs less than or equal to R200 / m²
(ii) 70% of all office space costs at most R300 / m²
(iii) 10% of all office space costs more than R350 / m²
(iv) 9 office buildings have rentals between R300 and R400 / m²

Exercise 2.10 File: X2.10 - storage dams.xlsx
(a) Cape Town water storage dams capacities
Storage Dam         Capacity (Ml)       %
Wemmershoek               158644     16.9
Steenbras                  95284     10.2
Voelvlei                  244122     26.0
Theewaterskloof           440255     46.9
Total capacity            938305    100.0
[Pie chart: Capacity of Cape Town Storage Dams (in million litres)]
(b)
(i) Voelvlei dam supplies 26% of Cape Town's water.
(ii) Wemmershoek and Steenbras dams together provide 27.1% of Cape Town's water.
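The percentage shares in Exercise 2.10 can be checked outside Excel. The sketch below is only an illustration in Python (the manual itself works in Excel); the variable names are hypothetical, and the capacities are the ones tabulated above.

```python
# Dam capacities in megalitres (Ml), as tabulated in Exercise 2.10.
capacity_ml = {
    "Wemmershoek": 158644,
    "Steenbras": 95284,
    "Voelvlei": 244122,
    "Theewaterskloof": 440255,
}

total = sum(capacity_ml.values())

# Percentage share of total capacity per dam, rounded to one decimal place.
share = {dam: round(100 * ml / total, 1) for dam, ml in capacity_ml.items()}

# Combined share of two dams, computed from the raw capacities rather than
# by adding the already-rounded percentages.
combined = round(
    100 * (capacity_ml["Wemmershoek"] + capacity_ml["Steenbras"]) / total, 1
)

for dam, pct in share.items():
    print(f"{dam:<16} {pct:5.1f}%")
print(f"Wemmershoek + Steenbras: {combined}%")
```

Computing the combined share from the raw capacities avoids compounding rounding errors, although here both routes give 27.1%.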
Exercise 2.11 File: X2.11 - taste test.xlsx
(a) Taste test preferences for fruit juices
Blind Label    Brand             Number       %
A              Liqui Fruit           45    18.0
B              Fruiti Drink          26    10.4
C              Yum Yum               64    25.6
D              Fruit Quencher        38    15.2
E              Go Fruit              77    30.8
Total                               250   100.0
[Bar chart: Consumer Preferences for Fruit Juice Brands (% of consumers)]
[Pie chart: Consumer Preferences - Fruit Juices (count, %)]
(b) 18% of the sampled consumers prefer Liqui Fruit.
(c) 56.4% of the sampled consumers prefer either Yum Yum or Go Fruit.

Exercise 2.12 File: X2.12 - annual car sales.xlsx
Manufacturer    Annual Sales    % Sales
Toyota                 96959       19.4
Nissan                 63172       12.6
Volkswagen             88028       17.6
Delta                  62796       12.6
Ford                   74155       14.8
MBSA                   37268        7.5
BMW                    51724       10.4
MMI                    25354        5.1
Total Sales           499456      100.0
(a) Bar Chart - Annual Car Sales by Manufacturer
[Bar chart: Annual Sales of Passenger Cars by Manufacturer]
(b) Percentage Pie Chart - Annual Car Sales by Manufacturer
[Pie chart: % Annual Sales of Passenger Cars by Manufacturer]
(c) The top three manufacturers, Toyota (19.4%), Volkswagen (17.6%) and Ford (14.8%), together hold 51.8% of the total passenger car market.
Exercise 2.13 File: X2.13 - half-yearly car sales.xlsx
Manufacturer    First half    Second half    % Change
Toyota               42661          54298        27.3
Nissan               35376          27796       -21.4
VW                   45774          42254        -7.7
Delta                26751          36045        34.7
Ford                 32628          41527        27.3
MBSA                 19975          17293       -13.4
BMW                  24206          27518        13.7
MMI                  14307          11047       -22.8
(a) Multiple bar chart - Car Sales by Half-Year and Manufacturer
[Multiple bar chart: Half-yearly Car Sales by Manufacturer (First half, Second half)]
(b) First half-year best performers: Nissan; Volkswagen; MBSA and MMI
(c) Delta showed the largest % increase (34.7%) from the first half to the second half. Refer to the table above for the % Change between First and Second Half-Year Sales.

Exercise 2.14 File: X2.14 - television brands.xlsx
(a) Categorical Frequency Table - Television Brands Owned
Brands         % of Brands
Daewoo             16%
LG                 30.4%
Philips            10.4%
Sansui             24%
Sony               19.2%
Grand Total       100%
(b) Percentage Bar Chart - Television Brands Owned
[Bar chart: % of TV Brands Owned]
(c) Philips is the least preferred brand (preferred by only 10.4% of households surveyed).
(d) The most popular brand is LG, which is owned by 30.4% of the households surveyed.

Exercise 2.15 File: X2.15 - estate agents.xlsx
(a) Frequency Count Table
House sales    Count
3                 12
4                 15
5                  6
6                  7
7                  5
8                  3
Grand Total       48
(b) Histogram - Residential Properties Sold per Estate Agent
[Histogram: Residential Properties Sold per Agent]
(c) The most frequently sold number of properties per estate agent was 4.
Four properties each were sold by 31.3% (15/48) of all estate agents.
(d) The same frequency count table (a) and histogram (b) are produced.

Exercise 2.16 File: X2.16 - fast foods.xlsx
Fast Food Outlet    Count       %
KFC                    56    17.2
St Elmo's              58    17.8
Steers                 45    13.8
Nandos                 64    19.7
Ocean B                24     7.4
Butler's               78    24.0
Total                 325   100.0
(a) Percentage Bar Chart - Consumer Preferences of Fast Food Outlet
[Bar chart: Percentage of Consumers per Fast Food Outlet]
(b) Percentage Pie Chart - Consumer Preferences of Food Type
First produce the categorical frequency table of Food Type Preferences. Sum the frequency counts of the outlets that sell each food type (e.g. Chicken = 56 + 64 = 120) and express the total count as a % of the total number of customers (e.g. 120/325 = 36.9%).
Categorical Frequency Table - Consumer Preferences of Food Type
Food type       %
Chicken      36.9
Pizza        41.8
Burger       13.8
Fish          7.4
[Pie chart: Consumer preference (%) by Food Type]
(c) Brief Report
Pizza (42%) and Chicken (37%) together account for almost 80% of the fast food market, with pizza being slightly more favoured by fast food consumers.
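The aggregation described in Exercise 2.16 (b) can be sketched as follows. This is a Python illustration rather than the manual's Excel method. The outlet-to-food-type mapping is an assumption: the solution only states Chicken = 56 + 64 (KFC and Nandos), and the remaining assignments are inferred from the published percentages.

```python
# Outlet counts as tabulated in Exercise 2.16.
outlet_counts = {
    "KFC": 56, "St Elmo's": 58, "Steers": 45,
    "Nandos": 64, "Ocean B": 24, "Butler's": 78,
}

# ASSUMED mapping of outlet to food type (only the Chicken grouping is
# stated in the solution; the rest is inferred from the reported shares).
food_type = {
    "KFC": "Chicken", "Nandos": "Chicken",
    "St Elmo's": "Pizza", "Butler's": "Pizza",
    "Steers": "Burger", "Ocean B": "Fish",
}

# Sum the outlet counts within each food type.
type_counts = {}
for outlet, count in outlet_counts.items():
    kind = food_type[outlet]
    type_counts[kind] = type_counts.get(kind, 0) + count

# Express each food type total as a percentage of all customers.
n = sum(outlet_counts.values())
type_pct = {kind: round(100 * c / n, 1) for kind, c in type_counts.items()}

for kind, pct in type_pct.items():
    print(f"{kind:<8} {type_counts[kind]:>4} {pct:5.1f}%")
```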
Exercise 2.17 File: X2.17 - airlines.xlsx
(a) Two-way Pivot Table - Counts, Row % (by Airline), Column % (by Passenger)
Airline    Data                  Business    Tourist    Grand Total
Comair     Count of Passenger          12          8             20
           Row %                    60.0%      40.0%         100.0%
           Column %                 33.3%      23.5%          28.6%
           Total %                  17.1%      11.4%          28.6%
Kulula     Count of Passenger           4         16             20
           Row %                    20.0%      80.0%         100.0%
           Column %                 11.1%      47.1%          28.6%
           Total %                   5.7%      22.9%          28.6%
SAA        Count of Passenger          20         10             30
           Row %                    66.7%      33.3%         100.0%
           Column %                 55.6%      29.4%          42.9%
           Total %                  28.6%      14.3%          42.9%
Total      Count of Passenger          36         34             70
           Row %                    51.4%      48.6%         100.0%
           Column %                100.0%     100.0%         100.0%
           Total %                  51.4%      48.6%         100.0%
(b) Two-way Pivot table - Row Percentage by Airline
Airline        Business    Tourist    Grand Total
Comair            60.0%      40.0%         100.0%
Kulula            20.0%      80.0%         100.0%
SAA               66.7%      33.3%         100.0%
Grand Total       51.4%      48.6%         100.0%
(c) Multiple Bar Chart - Passenger Type by Airline
[Multiple bar chart: % of passengers per airline (Comair, Kulula, SAA), by passenger type (Business, Tourist)]
(d) 42.9% of passengers prefer to fly with SAA.
(e) Kulula is most preferred by tourists (47.1% of tourists prefer Kulula).
(f) Not true. More than half (55.6%) of all business travellers prefer SAA.

Exercise 2.18 File: X2.18 - car occupants.xlsx
(a) Random Variable - Number of occupants per car
Data Type - Numerical, discrete, ratio-scaled
(b) (i) Numeric % Frequency Distribution (and Cumulative Frequencies)
Occupants    Count        %    Cum Count    Cum %
1               23     38.3           23     38.3
2               15     25.0           38     63.3
3               10     16.7           48     80.0
4                5      8.3           53     88.3
5                7     11.7           60    100.0
Total           60    100.0
(b) (ii) Histogram - Occupants per Car
[Histogram: Occupants per Car (No. of Cars against Occupants)]
(b) (iii) "Less-Than" Ogive (see the cumulative frequencies in the table above) and Cumulative Frequency Polygon
[Cumulative Frequency Polygon - Car Occupants (No. of Cars against No. of occupants)]
(c)
(i) 38.3% of motorists travel alone.
(ii) 36.7% of vehicles have 3 or more occupants.
(iii) 63.3% of vehicles have no more than 2 occupants.

Exercise 2.19 File: X2.19 - courier trips.xlsx
(a) Random variable - distance travelled (in km) per courier trip
Data type - numerical, continuous, ratio-scaled
(b) (i) Numeric % Frequency Distribution (and Cumulative Frequencies)
Distance (km)    Count      %    Cum Count    Cum %
≤10                  4      8            4        8
11 - ≤15             7     14           11       22
16 - ≤20            15     30           26       52
21 - ≤25            12     24           38       76
26 - ≤30             9     18           47       94
31 - ≤35             3      6           50      100
Total               50    100
(b) (ii) Histogram - Courier Travelling Distances per Trip
[Histogram: Courier Travelling Distances (No. of trips against Distance in km)]
(b) (iii) Cumulative % Frequency Polygon - Distances Travelled per Trip
[Cumulative % Frequency Polygon: Percent of trips against Distance in km]
(c)
(i) 18% of deliveries (9 trips) were between 25 km and 30 km.
(ii) 76% of deliveries (38 trips) were within a 25 km radius.
(iii) 48% of deliveries (24 trips) were beyond a 20 km radius.
Reading off the Cumulative % Frequency Table (or Polygon) above:
(iv) 52% of trips were no more than 20 km from the depot.
(v) The longest 24% of trips were above 25 km.
(d) The percentage of trips above 30 km is only 6%. Hence there is adherence to the company policy.
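The cumulative columns of the frequency distribution in Exercise 2.19 are running totals of the class counts. A minimal Python sketch (illustrative only; the manual builds these tables in Excel, and the variable names are hypothetical):

```python
from itertools import accumulate

# Class labels and counts from the courier-trip distribution in Exercise 2.19.
labels = ["≤10", "11 - ≤15", "16 - ≤20", "21 - ≤25", "26 - ≤30", "31 - ≤35"]
counts = [4, 7, 15, 12, 9, 3]

n = sum(counts)
cum_counts = list(accumulate(counts))        # running total of counts
pct = [100 * c / n for c in counts]          # % per class
cum_pct = [100 * c / n for c in cum_counts]  # cumulative %

for label, c, p, cc, cp in zip(labels, counts, pct, cum_counts, cum_pct):
    print(f"{label:<10} {c:>3} {p:5.1f} {cc:>3} {cp:6.1f}")

# Share of trips beyond a 20 km radius: complement of the cumulative %
# at the 16 - ≤20 class.
beyond_20km = 100 - cum_pct[2]
print(f"Beyond 20 km: {beyond_20km:.0f}%")
```

"More than" answers such as (c)(iii) follow by subtracting the relevant cumulative percentage from 100.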
Exercise 2.20 File: X2.20 - fuel bills.xlsx
(a) Random variable - monthly fuel expenditure (in Rands)
Data type - numerical, continuous, ratio-scaled
(b) (i) and (ii) Numeric % Frequency Distribution (and (d) the Ogive)
Fuel bill    Count    % Count    Cum Count    Cum %
≤300             7         14            7       14
301 - 400       15         30           22       44
401 - 500       13         26           35       70
501 - 600        7         14           42       84
601 - 700        5         10           47       94
701 - 800        2          4           49       98
800+             1          2           50      100
Total           50        100
(b) (iii) Histogram of Fuel costs / motorist (Rand)
[Histogram: Motorists' Monthly Fuel Costs (no. of motorists against Fuel bill in Rands)]
(c) 14% (7 motorists) spend between R500 and R600 per month on fuel.
(d) Cumulative % Frequency Polygon - Motorist fuel bills per month. (See the Ogive in (b) for Cumulative Frequencies)
[Cumulative % Frequency Polygon - Fuel Bills per Month]
(e) From (d), approx. 77% of motorists spend less than R550 per month on fuel.
(f) From (a) (and (d)), 30% (15 motorists) spend more than R500 per month on fuel.

Exercise 2.21 File: X2.21 - car sales.xlsx
Data
Quarter        1   2   3   4   5   6   7   8   9  10  11  12
Corsa Sales   37  25  41  29  31  28  30  36  38  62  53  63
Quarter       13  14  15  16  17  18  19  20  21  22  23  24
Corsa Sales   43  39  52  61  58  65  73  52  61  46  49  54
(a) Time Line graph - Quarterly Vehicle Sales
[Line graph: Opel Corsa Quarterly Sales (units sold against Quarters)]
(b) Yes, Corsa sales are showing a general upward trend.
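The "approx. 77%" in Exercise 2.20 (e) is consistent with reading the ogive by linear interpolation between class limits. The Python sketch below shows that reading under two assumptions: that the upper class limits are R300, R400, ..., R800, and that bills are spread evenly within each class. The function name `percent_below` is hypothetical.

```python
def percent_below(x, limits, cum_pct):
    """Cumulative % at x, by linear interpolation between class upper limits."""
    points = list(zip(limits, cum_pct))
    for (lo, lo_pct), (hi, hi_pct) in zip(points, points[1:]):
        if lo <= x <= hi:
            return lo_pct + (x - lo) / (hi - lo) * (hi_pct - lo_pct)
    raise ValueError("x lies outside the tabulated class limits")


# Upper class limits (Rand) and cumulative % from the table in Exercise 2.20.
limits = [300, 400, 500, 600, 700, 800]
cum_pct = [14, 44, 70, 84, 94, 98]

below_550 = percent_below(550, limits, cum_pct)        # part (e)
above_500 = 100 - percent_below(500, limits, cum_pct)  # part (f)
print(f"Below R550: {below_550:.0f}%  Above R500: {above_500:.0f}%")
```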
Exercise 2.22 File: X2.22 - market shares.xlsx
Data
Year        VW    Toyota
1         13.4       9.9
2         11.6       9.6
3          9.8      11.2
4         14.4      12.0
5         17.4      11.6
6         18.8      13.1
7         21.3      11.7
8         19.4      14.2
9         19.6      16.0
10        19.2      16.9
(a) Trend Line graphs of Market Shares (%) per car type (VW, Toyota)
[Line graphs: % market share against year. Legend: Top graph - VW; Bottom graph - Toyota]
(b) VW shows a higher sales level but at a declining growth rate. Toyota shows a lower sales level, but at a rising growth rate.
(c) Choice of franchise is not clear-cut, but suggest choosing Toyota because of its more consistent (steady) growth rate.

Exercise 2.23 File: X2.23 - defects.xlsx
(a) Scatter graph - Inspection time (x) vs Defects found (y)
[Scatter graph: no. of defects found per batch against inspection time (minutes)]
(b) Yes, there appears to be a moderate to strong positive linear relationship between the inspection time of a batch and the no. of defects found per batch.
Data
Consignment  AA AB AC AD AE AF AG AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX AY AZ BA BB BC BD
Time         48 50 43 36 45 49 55 63 55 36 40 46 32 50 42 36 48 38 45 30 34 43 53 48 56 40 33 35 50 48
Defects      17  9 12  7  8 10 14 18 19  6  8 14 10 15 14  8 12  8 10  6  9 11 16 16 15 12  7 10 16 18

Exercise 2.24 File: X2.24 - leverage.xlsx
(a) Scatter Graph - Profit Growth (y) vs Leverage Ratio (x)
[Scatter graph: profit growth against leverage ratio]
(b) Yes, there is a clear moderate to strong positive linear relationship between the leverage ratio of a company and its profit growth.
Data
Leverage       40.8 42.3 43.2 37.9 36.2 35.6 36.4 39.5 42.6 42.1 37.8 34.4 36.5 38.3 39.3 36.4 33.5 32.4 35.4 35.4 35.7 35.2 35.3 44.9 35.9 38.0 36.7 39.2 41.1 38.7
Profit Growth  111 116 132 105 69 40 58 118 104 125 97 76 98 100 75 88 20 25 78 65 84 88 86 115 50 92 110 72 128 86

Exercise 2.25 File: X2.25 - roi%.xlsx
(a)
Sector         Average    Std dev
Mining            9.87       4.58
Services         11.33       2.99
Grand Total      10.70       3.78
(b) Service companies have a higher average ROI% (11.33%) than mining companies (9.87%). The volatility of ROI% amongst service companies (2.99%) is far lower than amongst mining companies (4.58%). By inspection, there is a high overlap of ROI% between the two sectors (based on a two-standard deviation interval around each sample mean). Thus it is likely that there is no statistically significant difference in mean ROI% between the two sectors.

Exercise 2.26 File: X2.26 - product location.xlsx
(a)
                       Shelf position
Aisle                  Middle       Top     Total
Front     Average        6.08      5.08      5.58
          Std dev       0.890     0.622     0.895
Middle    Average        4.24      2.74      3.49
          Std dev       1.387     0.297     1.232
Back      Average        4.66      4.10      4.38
          Std dev       1.193     0.648     0.952
Total     Average        4.99      3.97      4.48
          Std dev       1.359     1.114     1.327
(b) Based on shelf position alone, middle shelf positions generate higher average sales (R4.99) than top shelf positions (R3.97). Based on aisle location alone, a front-of-store aisle location generates the highest average sales (R5.58), followed by a back-of-store aisle location (R4.38). The lowest average sales occur when the product is displayed in a middle-of-store aisle location (R3.49). In combination, a front-of-store aisle location on a middle shelf position generates the highest average sales (R6.08), while a top shelf position in a middle-of-store aisle is the least desirable product location with an average sales volume of only R2.74. Sales variability is relatively consistent across aisle locations (0.895 to 1.232) as well as between shelf positions (1.114 to 1.359).
In combination, however, sales volumes show the highest variability when the product is positioned in a middle shelf position in a middle-of-store aisle location (1.387), while the lowest variability in sales volumes occurs when it is positioned in a top shelf position in a middle-of-store aisle location (0.297). The large differences in average sales volumes (ranging from R6.08 to R2.74) are evidence of a likely statistically significant difference in sales volumes due to choice of aisle location and shelf positioning.
(c) Recommendation: A middle shelf position in a front-of-store aisle location is the most preferred product display location.

Exercise 2.27 File: X2.27 - property portfolio.xlsx
(a) Numeric Frequency Distribution and Cumulative % of NP%
Intervals    Count    Cumulative %
-5               0        0.0%
-2.5             4        1.2%
0                7        3.4%
2.5              4        4.6%
5               22       11.4%
7.5             87       38.3%
10             109       71.9%
12.5            46       86.1%
15              31       95.7%
17.5            12       99.4%
20               1      100%
More             1      100%
Total          324      100%
[Histogram - Net Profit % (Frequency and cumulative % against Bin NP%)]
(b) and (c)
                       Type of business usage
Region                 Commercial    Industrial    Retail    Total
A        Average              7.5           4.9      10.2      7.9
         Std dev              2.5           3.1       3.6      3.5
         Minimum             -2.4          -4.2       0.8     -4.2
         Maximum             14.6           8.1      18.4     18.4
         Count                104            40        70      214
B        Average             12.3           6.8       8.5      9.8
         Std dev              3.1           4.2       1.8      3.5
         Minimum              2.7          -3.4       4.4     -3.4
         Maximum             20.3          10.2      13.2     20.3
         Count                 46            16        48      110
Total    Average              9.0           5.4       9.5      8.6
         Std dev              3.5           3.5       3.1      3.6
         Minimum             -2.4          -4.2       0.8     -4.2
         Maximum             20.3          10.2      18.4     20.3
         Count                150            56       118      324
(d) Profile of property portfolio: The company has almost twice as many properties in region A (214 or 66%) compared to region B (110 or 34%). Almost half of their properties are commercial (46%), followed by retail (37%) and then industrial (17%). Of all the properties in the portfolio, the largest group is commercial properties in region A (104 or 32%), followed by retail properties in region A (70 or 22%).
The smallest component of their property portfolio consists of industrial properties in region B (only 16 or 5%).
(e) Portfolio performance: Net profit % across the entire portfolio is normally distributed (histogram) with an average return of 8.6% and a standard deviation of 3.6%. NP% ranged from the lowest of -4.2% to the highest of 20.3%. From the cumulative % distribution, 75% of all properties (cumulative 86.1% - cumulative 11.4%) earned a NP% of between 5% and 12.5% p.a. Region B (9.8%) has outperformed region A (7.9%) by almost 2% on average, while commercial (9.0%) and retail (9.5%) have significantly outperformed industrial properties (5.4%). The worst performing segment is industrial properties in region A (4.9%), while the best performing properties are commercial properties in region B (12.3%). There are 15 (4.6%) properties that are under-performing (with a net profit % of 2.5% p.a. or less; see the histogram). Volatility of NP% is fairly consistent across the segments (approx. 3.5%), except for higher variability noted in the industrial properties of region B (4.2%). Growth potential (high NP% p.a. segments) is mainly in commercial properties in region B (which represent only 14% of the current portfolio) and retail properties in region A (which currently constitute only 22% of the current portfolio).
(f) Recommendations: Dispose of the worst performing industrial properties in both regions A and B and purchase more commercial properties in region B, followed by retail properties in region A.

CHAPTER 3
EXPLORATORY DATA ANALYSIS - DESCRIBING DATA
NUMERIC DESCRIPTIVE STATISTICS

Exercise 3.1 (a) median (b) mode (c) mean

Exercise 3.2 Upper quartile

Exercise 3.3 Statements (c) and (f). The mode would be more appropriate (both are categorical).

Exercise 3.4 (a) False (b) False (c) True (d) False (e) False
The new median mass will depend only on the masses of parcels in the 3rd and 4th ordered positions out of the 6 positions (after adding the extra parcel).
Also, the rank order position of this extra parcel's mass is unknown. It could be the 4th, 5th or 6th ranked mass, but this depends on the masses of parcels that are heavier than the current median mass. If the 4th ranked mass is also equal to 6.5 kg, then the new median will not increase. If, on the other hand, the 4th ranked mass is greater than 6.5 kg, then the median will increase. Therefore the only statement that can be made with complete certainty is (c) (i.e. that it is impossible for the new median mass to be less than it was).

Exercise 3.5 Correct method is (b). Use the formula for the arithmetic mean (Formula 3.1).
By definition, Mean = Σx / n
Given Mean = 4.1 and n = 9245, it is possible to compute Σx (total number of persons) = Mean x n,
i.e. total no. of persons in Mossel Bay = 4.1 x 9245 = 37905 (rounded)

Exercise 3.6 File: X3.6 - equity returns.xls
General Equity Unit Trust % Returns
         % Returns    Deviation    Deviation²
               9.2         -0.1          0.01
               8.4         -0.9          0.81
              10.2          0.9          0.81
               9.6          0.3          0.09
               8.9         -0.4          0.16
              10.5          1.2          1.44
               8.3         -1.0          1.00
Sum           65.1                       4.32
n                7
                                            Using Excel
Mean       65.1/7 = 9.3                     =AVERAGE(9.2,...,8.3)
Std Dev    √[4.32/(7-1)] = 0.8485           =STDEV(9.2,...,8.3)

Exercise 3.7 File: X3.7 - luggage weights.xls
         Mass (kg)    Deviation    Deviation²
                11         0.43        0.1849
                12         1.43        2.0449
                 8        -2.57        6.6049
                10        -0.57        0.3249
                13         2.43        5.9049
                11         0.43        0.1849
                 9        -1.57        2.4649
Sum             74                    17.7143
n                7
                                            Using Excel
(a) Mean       74/7 = 10.57                 =AVERAGE(11,...,9)
    Std Dev    √[17.7143/(7-1)] = 1.7183    =STDEV(11,...,9)
(b) On average, each passenger's hand luggage weighs 10.57 kg. 68.3% of all hand luggage is likely to weigh between 8.85 kg and 12.29 kg. (This corresponds to one standard deviation limits about the mean.)
(c) Coefficient of Variation = (1.7183/10.57) x 100% = 16.26%
(d) The variation in hand luggage weights is moderate (the weights lie relatively close together).
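The mean, sample standard deviation and coefficient of variation in Exercise 3.7 can be cross-checked as follows. This is an illustrative Python sketch, not the manual's Excel route; `statistics.stdev` uses the n - 1 divisor, as Excel's STDEV does. Variable names are hypothetical.

```python
import statistics

# Hand luggage masses (kg) from Exercise 3.7.
mass_kg = [11, 12, 8, 10, 13, 11, 9]

mean = statistics.mean(mass_kg)       # 74 / 7
std_dev = statistics.stdev(mass_kg)   # sample standard deviation (n - 1)
cv = 100 * std_dev / mean             # coefficient of variation, in %

# One-standard-deviation limits about the mean.
lower, upper = mean - std_dev, mean + std_dev

print(f"Mean = {mean:.2f} kg, Std dev = {std_dev:.4f} kg, CV = {cv:.2f}%")
print(f"One std dev limits: {lower:.2f} kg to {upper:.2f} kg")
```

Because the code divides by the unrounded mean, its CV can differ in the second decimal from the 16.26% obtained above with the mean rounded to 10.57.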
Exercise 3.8 File: X3.8 - bicycle sales.xls
         Bicycles sold    Deviation    Deviation²    Sorted data
                    25          1.4          1.96             16
                    18         -5.6         31.36             18
                    30          6.4         40.96             18
                    36         12.4        153.76             19
                    18         -5.6         31.36             20
                    20         -3.6         12.96             24
                    16         -7.6         57.76             25
                    24          0.4          0.16             30
                    30          6.4         40.96             30
                    19         -4.6         21.16             36
Sum                236                     392.4
n                   10
(a) Mean = 236/10 = 23.6
On average, 23.6 bicycles are sold each month.
Median = (20 + 24)/2 = 22 (Median sales lies in the 5.5th position)
For half of the months (i.e. 5 months), bicycle sales were less than 22 bicycles per month.
(b) Range = 36 - 16 = 20
The range of sales between the worst and best months was 20 bicycles (i.e. in the worst sales month, 16 were sold; in the best month, 36 were sold).
Variance = 392.4/(10-1) = 43.6
Standard deviation = √43.6 = 6.603
68.3% of all monthly bicycle sales are likely to lie between 17 and 30.2.
(c) Lower Quartile (Q1) = 18 (2.5th position)
25% of monthly bicycle sales were less than or equal to 18. Or: No more than 18 bicycles per month were sold in 25% of the months.
Note: Using Excel: QUARTILE(data values,1) = 18.25
Upper Quartile (Q3) = 27.5 (7.5th position)
25% of monthly bicycle sales were above 27.5. Or: More than 28 (27.5) bicycles per month were sold in 25% of the months.
Note: Using Excel: QUARTILE(data values,3) = 28.75
(d) Approximate Skewness = 3 x (23.6 - 22)/6.603 = 0.7269
There is moderate positive skewness in monthly bicycle sales (i.e. there are one or two months with relatively high bicycle sales).
(e) Box plot of monthly bicycle sales
[Box plot: monthly bicycle sales]
Interpretation: Monthly bicycle sales range between 16 and 36. The median monthly sales is 22. The positive skewness shows a wider spread of monthly sales toward the months of high sales.
(f) Opening monthly stock level = 23.6 + 6.603 = 30.2 bicycles in stock
If orders = 30, then the dealer will have sufficient bicycle stock to meet demand.
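The Excel quartiles and the approximate skewness quoted in Exercise 3.8 can be reproduced with Python's standard library. This is a hedged sketch: it assumes that `statistics.quantiles` with `method="inclusive"` interpolates in the same way as Excel's QUARTILE, which it does for this data set. The manual's hand method, based on the 2.5th and 7.5th positions, gives slightly different values.

```python
import statistics

# Monthly bicycle sales from Exercise 3.8.
sales = [25, 18, 30, 36, 18, 20, 16, 24, 30, 19]

mean = statistics.mean(sales)
median = statistics.median(sales)
std_dev = statistics.stdev(sales)   # sample standard deviation (n - 1)

# Quartiles by the "inclusive" interpolation rule (assumed to match
# Excel's QUARTILE function).
q1, q2, q3 = statistics.quantiles(sales, n=4, method="inclusive")

# Approximate skewness: 3 x (mean - median) / standard deviation.
skewness = 3 * (mean - median) / std_dev

print(f"Mean = {mean}, Median = {median}, Std dev = {std_dev:.3f}")
print(f"Q1 = {q1}, Q3 = {q3}, Skewness = {skewness:.4f}")
```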
Exercise 3.9 File: X3.9 - setting times.xls
         Setting time    Deviation    Deviation²
                   27            3             9
                   18           -6            36
                   21           -3             9
                   22           -2             4
                   20           -4            16
                   28            4            16
                   31            7            49
                   25            1             1
                   24            0             0
Sum               216                        140
n                   9
(a) Mean = 216/9 = 24 minutes
Std dev = √[140/(9-1)] = 4.183 minutes
(b) Coefficient of Variation = 4.183/24 = 17.43%
(c) No, since the consistency index is greater than 10%, this consignment will not be approved for dispatch to the clients.

Exercise 3.10 File: X3.10 - wage increases.xls
         % Increases    Deviation    Deviation²    Sorted %
                 5.6        -0.83        0.6910         3.4
                 7.3         0.87        0.7547         4.8
                 4.8        -1.63        2.6610         5.3
                 6.3        -0.13        0.0172         5.6
                 8.4         1.97        3.8760         5.8
                 3.4        -3.03        9.1885         5.8
                 7.2         0.77        0.5910         5.8
                 5.8        -0.63        0.3985         6.2
                 8.8         2.37        5.6110         6.3
                 6.2        -0.23        0.0535         7.2
                 7.2         0.77        0.5910         7.2
                 5.8        -0.63        0.3985         7.3
                 7.6         1.17        1.3660         7.4
                 7.4         0.97        0.9385         7.6
                 5.3        -1.13        1.2797         8.4
                 5.8        -0.63        0.3985         8.8
Sum            102.9                    28.8144
n                 16
Using manual computations
(a) Mean = 102.9/16 = 6.43%
Median = (6.2 + 6.3)/2 = 6.25% (lies in the 8.5th position)
(b) Variance = 28.8144/(16-1) = 1.921
Std dev = √[28.8144/(16-1)] = 1.386%
(c) Lower limit = 6.43 - 2 x (1.386) = 3.66
Upper limit = 6.43 + 2 x (1.386) = 9.20
95.5% of all agreed wage increases lie between 3.66% and 9.2%.
(d) CV = (1.386/6.43)% = 21.56%
Agreed wage increases are only moderately consistent.
Using Excel
(a) and (b) Excel's Data Analysis option
Wage increases                        Excel's Function Keys
Mean                     6.43         =AVERAGE(data values)
Standard Error           0.346
Median                   6.25         =MEDIAN(data values)
Mode                     5.8
Standard Deviation       1.386        =STDEV(data values)
Sample Variance          1.921        =VAR(data values)
Kurtosis                 0.196
Skewness                -0.286
Range                    5.4
Minimum                  3.4
Maximum                  8.8
Sum                    102.9
Count                   16
(c) and (d) must be computed manually.

Exercise 3.11 Bank Trainee Exam Performance
                              Group 1              Group 2
Mean                               76                   64
Variance                          110                   88
Sample size                        34                   26
Std deviation            √110 = 10.488          √88 = 9.381
Coefficient of Variation  (10.488/76)% = 13.8   (9.381/64)% = 14.66
Interpretation
Both groups performed consistently well. The difference in CV% measures is marginal.
However, group 1 trainees performed marginally more consistently than group 2 trainees.

Exercise 3.12
File: X3.12 - meal values.xls
(a) Random variable - value of a restaurant meal (in Rand)
Data type - numerical, continuous, ratio-scaled

(b) Mean = 1119/20 = R55.95
∑(deviation)² = 3902.95
n = 20
Variance = 3902.95/(20-1) = 205.42
Std deviation = √205.42 = 14.33

(c) Median
Average the Rand values in the 10th and 11th positions = (51 + 55)/2 = R53
Half of the meals were valued at R53 or less.

Ranked position:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Meal value (R):  35 36 44 44 44 47 48 48 50 51 55 56 58 62 65 65 69 72 80 90
The median is the midpoint between the two middle values (R51 and R55).

(Using Data Analysis in Excel)
Meal values
Mean                  55.95
Standard Error         3.20
Median                53
Mode                  44
Standard Deviation    14.33
Sample Variance      205.42
Kurtosis               0.26
Skewness               0.74
Range                 55
Minimum               35
Maximum               90
Sum                 1119
Count                 20

(d) Mode = R44, which occurs 3 times (see the ranked values in (c) above).
(e) There is moderate positive skewness caused by two high meal values (i.e. R80 and R90). Hence recommend the median as the most representative central location meal value.

Exercise 3.13
File: X3.13 - days absent.xls
(a) Mean = 237/23 = 10.3 days absent
Median
The median is found in the (23+1)/2th position, i.e. the 12th position.
Median = 9 days absent

Position:     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Days absent:  2  4  4  5  5  5  6  6  6  8  9  9 10 10 10 12 15 15 15 16 17 18 30

(Using Data Analysis in Excel)
Days absent
Mean                  10.30
Standard Error         1.33
Median                 9
Mode                   5
Standard Deviation     6.36
Sample Variance       40.49
Skewness               1.38
Range                 28
Minimum                2
Maximum               30
Sum                  237
Count                 23
Q1                     5.5
Q3                    15

Mode
There are 4 possible modal values (5; 6; 10 and 15 days). All occur with a frequency of 3 (see the ranked values above). Note: Only the first modal value is reported in Excel. The mode is an unreliable measure of central location in this study.
Interpretation
On average, an employee is absent for 10.3 days over this 9-month period. Half the employees were absent for up to 9 days. The most common number of days absent was 5 (or 6, or 10, or 15) days.

(b) Lower Quartile (approximated manually)
Q1 position = (23/4) = 5.75th position
Q1 value in this position = 5 + (0.75*(5 - 5)) = 5 days absent
Using Excel: =QUARTILE(data range,1) = 5.5 days absent
25% of employees were absent for no more than 5 (or 5.5) days altogether.
Upper Quartile (approximated manually)
Q3 position = (23*3/4) = 17.25th position
Q3 value in this position = 15 + (0.25*(15 - 15)) = 15 days absent
Using Excel: =QUARTILE(data range,3) = 15 days absent
25% of employees were absent for more than 15 days over this 9-month period.

(c) Average per 9 months = 10.3 days
Average per month = (10.3/9) = 1.14 days absent per month
Since the monthly average is above 1 day (actually 1.14 days), the company is not successfully managing its absenteeism levels.

Exercise 3.14
File: X3.14 - bad debts.xls
(a) Average = 79.3/17 = 4.665%
∑(deviation)² = 54.85882
n = 17
Variance = 54.85882/(17-1) = 3.43
Std deviation = √(3.43) = 1.85

(b) Median
The median is found in the (17+1)/2th position, i.e. the 9th position.
Median = 5.40%

Rank position:        1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17
Ordered bad debts %: 1.8 2.2 2.2 2.4 2.6 3.4 4.4 4.7 5.4 5.7 5.7 5.8 6.1 6.3 6.6 6.8 7.2

(Using Data Analysis in Excel)
Bad debts %
Mean                   4.665
Standard Error         0.45
Median                 5.4
Mode                   2.2
Standard Deviation     1.85
Sample Variance        3.43
Kurtosis              -1.49
Skewness              -0.35
Range                  5.4
Minimum                1.8
Maximum                7.2
Sum                   79.3
Count                 17
Q1                     2.6
Q3                     6.1

(c) Average: On average, each furniture retailer has a bad debt % of 4.665%.
Median: 50% of furniture retailers have a bad debt % of 5.4% or less.
Since the mean is less than the median, there is evidence of negative skewness. Hence propose the use of the median as the representative central value.

(d) There are two modal values (2.2% and 5.7%), both occurring with a frequency of 2. This makes the mode an unreliable measure of central location.

(e) Skewness coefficient (Formula 3.14)
Values required for Formula 3.14:
∑(x - x(bar))³ = -31.2227
n = 17
s (std dev) = 1.85
Then Skp = (17*(-31.2227))/((17-1)*(17-2)*1.85³) = -0.35
Since the skewness coefficient is fairly close to zero, there is evidence of only moderate negative skewness in the data on bad debts % (i.e. only a few fairly low bad debt % values - the majority are higher).

(f) Lower Quartile (approximated manually)
Q1 position = (17/4) = 4.25th position
Q1 value in this position = 2.4 + (0.25*(2.6 - 2.4)) = 2.45%
Using Excel: =QUARTILE(data range,1) = 2.6%
25% of furniture retailers have a bad debts % of no more than 2.45% (or 2.6%).
Upper Quartile (approximated manually)
Q3 position = (17*3/4) = 12.75th position
Q3 value in this position = 5.8 + (0.75*(6.1 - 5.8)) = 6.025%
Using Excel: =QUARTILE(data range,3) = 6.1%
25% of furniture retailers have a bad debts % of more than 6.025% (or 6.1%).

(g) The average % bad debts is 4.665% while the median % bad debts is 5.4%. Since there is moderate negative skewness (Skp = -0.35), the median should be used as the representative central value. Thus, since the median % of bad debts is above 5% (median = 5.4%), an advisory note should be sent out.

Exercise 3.15
File: X3.15 - fish shop.xls
(a) Average (or Mean) (Formula 3.5)

T/O bins       midpoint (x)   freq (f)      xf
500 - 750          625           15        9375
750 - 1000         875           23       20125
1000 - 1250       1125           55       61875
1250 - 1500       1375           92      126500
1500 - 1750       1625           65      105625
1750 - 2000       1875           50       93750
Total                           300      417250

Average = 417250/300 = R1 390.83
Interpretation
The average daily turnover for the fish shop is R1 390.83 per day.
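The grouped-data mean in Exercise 3.15 (a) is a frequency-weighted average of the class midpoints. A minimal Python sketch of that calculation follows; the function name grouped_mean is an illustrative assumption, not something defined in the text or in Excel.

```python
# Turnover classes (Rand) and number of trading days per class, Exercise 3.15
bins = [(500, 750), (750, 1000), (1000, 1250),
        (1250, 1500), (1500, 1750), (1750, 2000)]
freqs = [15, 23, 55, 92, 65, 50]


def grouped_mean(bins, freqs):
    """Approximate the mean of grouped data as sum(midpoint * f) / sum(f)."""
    midpoints = [(low + high) / 2 for low, high in bins]
    total_xf = sum(x * f for x, f in zip(midpoints, freqs))
    return total_xf / sum(freqs)


mean_turnover = grouped_mean(bins, freqs)
print(f"Average daily turnover = R{mean_turnover:.2f}")
```

Because every value in a class is represented by its midpoint, the result is an approximation of the mean that would be obtained from the raw daily turnover figures.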
(b) Median

T/O bins       freq (f)    ∑f
500 - 750         15        15
750 - 1000        23        38
1000 - 1250       55        93     Q1 interval
1250 - 1500       92       185     Median interval
1500 - 1750       65       250     Q3 interval
1750 - 2000       50       300
Total            300

Median position = (300/2) = 150th position out of 300 positions
The median value therefore lies in the 4th interval [i.e. between R1250 and R1500].
Median = 1250 + (250*(150-93)/(185-93)) = R1 404.89 (based on Formula 3.2)

(c) Mode
The modal value lies in the 4th interval [1250 - 1500] as it has the highest frequency count of 92 days.
Mode = 1250 + (250*(92-55)/(2*92-55-65)) = R1 394.53 (based on Formula 3.3)

(d) Lower Quartile (using Formula 3.7)
Q1 position = (300/4) = 75th position, which lies in the interval [1000 - 1250]
Q1 value in this position = 1000 + 250*(75-38)/(93-38) = R1 168.18
The maximum turnover for the slowest 25% of trading days was R1 168.18.

(e) Upper Quartile (using Formula 3.8)
Q3 position = (300*3/4) = 225th position, which lies in the interval [1500 - 1750]
Q3 value in this position = 1500 + 250*(225-185)/(250-185) = R1 653.85
A minimum turnover of R1 653.85 was generated on the 25% of busiest trading days.

Exercise 3.16
File: X3.16 - grocery spend.xls
(a) Average (Mean) (Formula 3.5)

% Spend     midpoint (x)   freq (f)     xf
10 - 20         15             6         90
20 - 30         25            14        350
30 - 40         35            16        560
40 - 50         45            10        450
50 - 60         55             4        220
Total                         50       1670

Average = 1670/50 = 33.4%
Interpretation
On average, a family spends 33.4% of their income on groceries.

(b) (i) Median

% Spend     freq (f)    ∑f
10 - 20         6         6
20 - 30        14        20     Q1 interval
30 - 40        16        36     Median interval
40 - 50        10        46     Q3 interval
50 - 60         4        50
Total          50

Median position = (50/2) = 25th position
The median value therefore lies in the 3rd interval [30 - 40].
Median = 30 + (10*(25-20)/(36-20)) = 33.125% (Formula 3.2)
Interpretation
50% of families spend no more than 33.125% of their income on groceries.
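The median and quartile calculations for grouped data in Exercises 3.15 and 3.16 all follow the same interpolation pattern: find the class containing the required position, then move proportionally into that class. The Python sketch below illustrates this under the assumption of equal class widths; the function names grouped_quantile and grouped_mode are hypothetical helpers introduced only for this illustration.

```python
def grouped_quantile(lower_limits, width, freqs, p):
    """Interpolate the quantile at proportion p for grouped data.

    position = p * n; value = L + width * (position - cum_before) / f_class
    """
    position = p * sum(freqs)
    cum_before = 0
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= position:
            return lower + width * (position - cum_before) / f
        cum_before += f
    raise ValueError("p must lie between 0 and 1")


def grouped_mode(lower_limits, width, freqs):
    """Interpolate the mode inside the class with the highest frequency."""
    i = freqs.index(max(freqs))
    f_m = freqs[i]
    f_prev = freqs[i - 1] if i > 0 else 0
    f_next = freqs[i + 1] if i + 1 != len(freqs) else 0
    return lower_limits[i] + width * (f_m - f_prev) / (2 * f_m - f_prev - f_next)


# Exercise 3.15: daily turnover (Rand), class width 250
limits_315 = [500, 750, 1000, 1250, 1500, 1750]
freqs_315 = [15, 23, 55, 92, 65, 50]
median_315 = grouped_quantile(limits_315, 250, freqs_315, 0.5)
q1_315 = grouped_quantile(limits_315, 250, freqs_315, 0.25)
q3_315 = grouped_quantile(limits_315, 250, freqs_315, 0.75)
mode_315 = grouped_mode(limits_315, 250, freqs_315)

# Exercise 3.16: % of income spent on groceries, class width 10
limits_316 = [10, 20, 30, 40, 50]
freqs_316 = [6, 14, 16, 10, 4]
median_316 = grouped_quantile(limits_316, 10, freqs_316, 0.5)

print(round(median_315, 2), round(q1_315, 2), round(q3_315, 2), round(mode_315, 2))
print(median_316)
```

The same grouped_quantile call with p = 0.25 or p = 0.75 reproduces the quartile positions (n/4 and 3n/4) used in the manual solutions.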
(b) (ii) Lower Quartile (Q1) (using Formula 3.7)
Q1 position = (50/4) = 12.5th position, which lies in the interval [20 - 30]
Q1 value in this position = 20 + (10*(12.5-6)/(20-6)) = 24.64%
Interpretation
25% of families spend no more than 24.64% of their income on groceries.

(c) Upper Quartile (Q3) (using Formula 3.8)
Q3 position = (50*3/4) = 37.5th position, which lies in the interval [40 - 50]
Q3 value in this position = 40 + (10*(37.5-36)/(46-36)) = 41.5%
Interpretation
25% of families spend more than 41.5% of their income on groceries.

Exercise 3.17
File: X3.17 - equity portfolio.xls

Shares    Price    Value
  40        15      600
  10        20      200
   5        40      200
  50        10      500
Total 105          1500

(Using Formula 3.5)
Average value / equity = 1500/105 = R14.29 (This is a weighted average measure.)

Exercise 3.18
File: X3.18 - car sales.xls

Cars sold    Price     Value
    5        25000    125000
   12        34000    408000
    3        55000    165000
Total 20              698000

(Using Formula 3.5)
Average Price / Car = 698000/20 = R34 900 (This is a weighted average measure.)

Exercise 3.19
File: X3.19 - rental increases.xls
(Use Formula 3.4)
Geometric mean = ⁴√(1.16*1.14*1.1*1.08) - 1 = 0.11955
Interpretation
The average annual % escalation rate in office rentals is 11.955%.
Using Excel: =GEOMEAN(1.16,1.14,1.1,1.08) - 1 = 0.11955

Exercise 3.20
File: X3.20 - sugar increases.xls
(a) (Using Formula 3.4)
Geometric mean = ⁶√(1.05*1.12*1.06*1.04*1.09*1.03) - 1 = 0.06456
Interpretation
The average annual % increase in the sugar price (per kg) has been 6.456%.
Using Excel: =GEOMEAN(1.05,1.12,1.06,1.04,1.09,1.03) - 1 = 0.06456
(b) The geometric mean is more appropriate since the base value of each percentage change is different. Each year's percentage change is based on the previous year's sugar price.
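The geometric mean calculations in Exercises 3.19 and 3.20 can be sketched in Python as a cross-check on the GEOMEAN results. The function name average_growth_rate is an illustrative assumption; it takes the annual percentage changes, converts them to growth factors, and returns the n-th root of their product minus 1.

```python
import math


def average_growth_rate(pct_changes):
    """Geometric mean growth rate from period % changes (e.g. 16 means +16%)."""
    factors = [1 + p / 100 for p in pct_changes]
    return math.prod(factors) ** (1 / len(factors)) - 1


rental_rate = average_growth_rate([16, 14, 10, 8])       # Exercise 3.19
sugar_rate = average_growth_rate([5, 12, 6, 4, 9, 3])    # Exercise 3.20

print(f"Office rentals: {rental_rate:.5f}")
print(f"Sugar price:    {sugar_rate:.5f}")
```

Working with growth factors rather than the raw percentages reflects the point made in 3.20 (b): each year's change compounds on the previous year's value.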
Exercise 3.21
File: X3.21 - water usage.xls
(a) (b) (c) Using Excel's Descriptive Statistics option in Data Analysis

Water usage
Mean                  21.2
Standard Error         1.751
Median                19.5
Mode                  25
Standard Deviation     9.590
Sample Variance       91.959
Kurtosis               1.744
Skewness               1.165
Range                 42
Minimum                8
Maximum               50
Sum                  636
Count                 30

Using Excel's Function Key - QUARTILE
Lower Quartile - Q1    15       =QUARTILE(data range,1)
Upper Quartile - Q3    25.75    =QUARTILE(data range,3)

(d) Interpretation - central location measures
The average monthly water consumption per household is 21.2 kl. 50% of households consume no more than 19.5 kl per month. The most frequently occurring monthly consumption is 25 kl. (Note that this modal value is a misleading and unrepresentative measure as it only occurs 3 times.)
Interpretation - dispersion measures
Based on the mean and standard deviation values, approximately 95.5% of households consume between 2.02 kl and 40.38 kl per month. Note that the data is heavily skewed to the right (skewness = 1.165), implying that there are a few households that consume high volumes of water per month. This skewness makes the interpretation given above unreliable, as both the mean and the standard deviation are likely to be inflated (or over-estimated) due to the presence of a few high-valued outliers.
Interpretation - quartiles
25% of households consume at most 15 kl of water per month. The 25% of heavy water consumers use at least 25.75 kl per month.

(e) Households in Paarl suburb = 750
Expected total usage per month = 21.2 x 750 = 15 900 kl
Expected total usage per year = 15 900 x 12 = 190 800 kl

Exercise 3.22
File: X3.22 - veal dishes.xls
(a) Random variable - cost of a veal cordon bleu meal at a Durban restaurant
Data type - numerical, continuous, ratio-scaled.
(b) Using Excel's Descriptive Statistics option in Data Analysis

Cordon bleu meal price
Mean                  61.25
Standard Error         1.99
Median                59
Mode                  48
Standard Deviation    10.54
Sample Variance      111.08
Kurtosis               0.62
Skewness               0.78
Range                 45
Minimum               45
Maximum               90
Sum                 1715
Count                 28

Interpretation
On average, a patron can expect to pay R61.25 for a veal cordon bleu meal at a Durban restaurant. 50% of Durban restaurants charge no more than R59 for a veal cordon bleu meal.

(c) There are two modal values (R48 and R55). Both occur with a frequency of 3. The mode is a misleading value because of its low frequency of occurrence. Note: Excel only shows the first occurrence of a modal value (i.e. R48) in its output.

(d) Standard deviation = R10.54
This means that 68.3% of Durban restaurants are likely to charge between R50.71 (61.25 - 10.54) and R71.79 (61.25 + 10.54) for a veal cordon bleu meal.

(e) Skewness coefficient (Formula 3.14) = 0.78
Skewness coefficient (approx) (Formula 3.15) = 0.64
The relatively high positive skewness is caused by two Durban restaurants charging high prices (R80 and R90) for a veal cordon bleu meal.

(f) Both measures indicate that there is moderate-to-high positive skewness, hence the median cost of R59 would be a more representative measure of central location.

(g) Upper Quartile (Q3) = R68    =QUARTILE(data range,3)
25% of Durban restaurants charge at least R68 for a veal cordon bleu meal.

(h) Lower Quartile (Q1) = R54.75    =QUARTILE(data range,1)
25% of Durban restaurants charge no more than R54.75 for a veal cordon bleu meal.
(i) 90th percentile value = R72.90    =PERCENTILE(data range,0.9)

Exercise 3.23
File: X3.23 - fuel bills.xls
(a) Using Excel's Descriptive Statistics option in Data Analysis

Fuel bill
Mean                    418          (i) Mean
Standard Error           13.128
Median                  398          (ii) Median
Mode                    350
Standard Deviation      113.693      (iv) Standard deviation
Sample Variance       12926.054      (iii) Variance
Kurtosis                 -0.503
Skewness                  0.601      (v) Skewness
Range                   420
Minimum                 256
Maximum                 676
Sum                   31350
Count                    75

(b) Interpretation - central location
The average monthly fuel bill per motorist is R418, while half of the motorists spend no more than R398 on their monthly fuel.
Interpretation - standard deviation
68.3% of monthly fuel bills range between R304.31 and R531.69 (one std dev from the mean).
95.5% of monthly fuel bills range between R190.61 and R645.39 (within 2 std devs of the mean).

(c) Interpretation - skewness
There is moderate positive skewness (Skp = 0.601) caused by 8 motorists who spend more than R600 per month - well above the majority of motorists' fuel spend.

(d) Coefficient of Variation
CV% = 113.693/418 = 27.2%
There is only moderate consistency across monthly fuel bills of motorists. This greater relative spread could be caused by different size vehicles and varying distances travelled.

(e) Lower Quartile - Q1 = R332.5    =QUARTILE(data range,1)
Upper Quartile - Q3 = R502.5    =QUARTILE(data range,3)
Inter Quartile Range = (Q3 - Q1) = R170
The middle 50% of monthly fuel bills span R170 from a low of R332.50 to a high of R502.50.

(f) Five-Number Summary table (using the Excel Function Keys)
Minimum              R256      =MIN(data range) or =QUARTILE(data range,0)
Lower Quartile Q1    R332.5    =QUARTILE(data range,1)
Median               R398      =MEDIAN(data range) or =QUARTILE(data range,2)
Upper Quartile Q3    R502.5    =QUARTILE(data range,3)
Maximum              R676      =MAX(data range) or =QUARTILE(data range,4)

(g) Box plot

(h) Interpretation of Box Plot
There is clear evidence of moderate positive skewness (skewed-to-the-right) in fuel bills.
There are a few motorists who spend a large amount on fuel a month.

(i) Total amount of fuel consumed monthly by Paarl motorists for commuting to work:
Expected total monthly fuel bill = (average bill / motorist x no. of motorists) = R418 * 25000 = R10 450 000
Expected total fuel consumed (in litres) = (expected total monthly cost / cost per litre) = R10 450 000/10 = 1 045 000 litres

Exercise 3.24
File: X3.24 - service periods.xls
(a) Using Excel's Descriptive Statistics option in Data Analysis

Service periods
Mean                   7          (i) Mean
Standard Error         0.384
Median                 6          (ii) Median
Mode                   6
Standard Deviation     3.838      (iii) Std Dev
Sample Variance       14.727
Kurtosis              -0.437
Skewness               0.553      (iv) Skewness
Range                 15
Minimum                1
Maximum               16
Sum                  700
Count                100

(b) Interpretation - central location
On average, each engineer has 7 years of service with a company. Half of all engineers spend no more than 6 years with a company.
Interpretation - standard deviation
68.3% of engineers spend between 3.16 and 10.84 years with a company. Similar intervals can be computed for 2 and 3 std devs from the mean. Note: when the lower limit is computed to be negative, in practice it is zero.
Interpretation - skewness
There is moderate positive skewness (Skp = 0.553) in length of service periods, meaning that a few engineers have long service periods with their company.

(c) Using Excel's Histogram option in Data Analysis
Frequency Distribution - Years of Service

Years of Service    Count
0 - 3                 21
3 - 6                 30
6 - 9                 24
9 - 12                15
12 - 15                8
15 - 18                2

(d) Lower limit = 7 - 3.838 = 3.16 years
Upper limit = 7 + 3.838 = 10.84 years
68.3% of engineers spend between 3.16 years and 10.84 years with a company.

(e) Percent of members with less than 3 years of service
Cumulative % up to 3 years = 21/100 = 21%
Percent of members with more than 12 years of service
Cumulative % above 12 years = 10/100 = 10%
They meet the guidelines for "new blood" (21%), but have far fewer "experienced" members (only 10%) than their guidelines require.
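The intervals in Exercise 3.24 (b) and (d) are of the form mean ± k standard deviations, with a negative lower limit treated as zero in practice. A small Python sketch of that rule is given below; the helper name sd_interval and its floor parameter are assumptions made for illustration only.

```python
def sd_interval(mean, std_dev, k=1, floor=0.0):
    """Return (lower, upper) = mean -/+ k * std_dev, with the lower limit
    cut off at `floor` (service periods cannot be negative)."""
    lower = max(mean - k * std_dev, floor)
    upper = mean + k * std_dev
    return lower, upper


# Exercise 3.24: mean = 7 years, standard deviation = 3.838 years
for k in (1, 2, 3):
    lower, upper = sd_interval(7, 3.838, k)
    print(f"{k} std dev(s): {lower:.2f} to {upper:.2f} years")
```

For k = 1 this gives the 3.16 to 10.84 years quoted above; for k = 2 and k = 3 the computed lower limit is negative, so it is reported as zero.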
Exercise 3.25
File: X3.25 - dividend yields.xls
(a) Random variable - dividend yield (%) of a company
Data type - numeric, ratio-scaled

(b) Using Excel's Descriptive Statistics option in Data Analysis

Dividend yields
Mean                   4.273      (i) Mean
Standard Error         0.218
Median                 4.1        (ii) Median
Mode                   2.8        (iii) Mode
Standard Deviation     1.444      (iv) Std Dev
Sample Variance        2.084
Kurtosis              -0.359
Skewness               0.259      (v) Skewness
Range                  6.1
Minimum                1.5
Maximum                7.6
Sum                  188
Count                 44

(c) Mean: The average dividend yield per company is 4.273%.
Median: Half of the dividend yields are at or below 4.1%.
Mode: A misleading measure because there are 4 dividend yield values (viz. 2.8; 3.6; 4.1; 5.1) of equal modal frequency (of 3).
Std dev: 68.3% of all dividend yields lie between 2.83% and 5.72%. Similarly for 2 and 3 std devs from the mean.
Skewness: There is very slight positive (right) skewness (Skp = 0.259). The distribution can be assumed to be approximately normal.

(d) The Mean (Average), as there is minimal skewness present in the data.

(e) Bin yields    Frequency
Below 2%             3
2 - 3.5%            11
3.5 - 5%            16
5 - 6.5%            11
6.5 - 8%             3

(f) Five-Number Summary Table
Minimum              1.5      =QUARTILE(data range,0)
Lower Quartile Q1    3.175    =QUARTILE(data range,1)
Median               4.1      =QUARTILE(data range,2)
Upper Quartile Q3    5.15     =QUARTILE(data range,3)
Maximum              7.6      =QUARTILE(data range,4)

(g) Box plot of Dividend Yield %
The very slight positive skewness can be seen from the longer tail on the right of the box plot. It implies a few companies have achieved significantly higher dividend yields than the majority of the sample.

(h) Minimum dividend yield achieved by the top performing 10% of companies = 6.17%    =PERCENTILE(data range,0.90)
The top 10% of companies achieved a dividend yield of at least 6.17%.

(i) % of companies who did not declare more than a 3.5% dividend yield
Cumulative frequency up to the upper limit of 3.5 = (3 + 11) = 14
% Cumulative = 14/44 = 31.8%
Almost 1/3 (31.8%) of companies did not declare more than a 3.5% dividend yield.
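The cumulative percentage in Exercise 3.25 (i) comes from accumulating the histogram frequencies in part (e). A short Python sketch of this step follows; the variable names are illustrative assumptions.

```python
from itertools import accumulate

# Frequency distribution of dividend yields from Exercise 3.25 (e)
bins = ["Below 2%", "2 - 3.5%", "3.5 - 5%", "5 - 6.5%", "6.5 - 8%"]
freqs = [3, 11, 16, 11, 3]

n = sum(freqs)
cum_freqs = list(accumulate(freqs))              # running totals: 3, 14, 30, ...
cum_pcts = [100 * c / n for c in cum_freqs]      # cumulative % of companies

for label, c, pct in zip(bins, cum_freqs, cum_pcts):
    print(f"{label:10s} cumulative f = {c:2d}  cumulative % = {pct:5.1f}")
```

The second row of the output corresponds to the 14 companies (31.8%) with a dividend yield of no more than 3.5%.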
Exercise 3.26
File: rosebuds.xls
(a) Random variable - the unit price of a rosebud (in cents)
Data type - numerical, continuous, ratio-scaled

(b) Using Excel's Descriptive Statistics option in Data Analysis

Selling price
Mean                  60.312
Standard Error         0.319
Median                59.95
Mode                  60.6
Standard Deviation     3.188
Sample Variance       10.160
Kurtosis               1.770
Skewness               0.932
Range                 18.3
Minimum               55.2
Maximum               73.5
Sum                 6031.2
Count                100

Interpretation
Mean: The average selling price of rosebuds is 60.312 cents.
Median: 50% of rosebuds sold for no more than 59.95c.
Std dev: 68.3% of rosebud unit prices are likely to lie between 57.12c and 63.5c.
Skewness: There is excessive positive skewness (one very high unit price).

(c) Coefficient of Variation
CV% = (3.188/60.312)% = 5.3%
There is very low variability amongst unit selling prices of rosebuds.

(d) Lower Quartile Q1 = 57.7 cents     =QUARTILE(data range,1)
Upper Quartile Q3 = 62.45 cents    =QUARTILE(data range,3)

(e) The highest unit selling price of the cheapest 25% of sales is 57.7 cents (i.e. Q1).

(f) The minimum unit selling price of the highest-priced 25% of sales is 62.45 cents (i.e. Q3).
Overall interpretation of (a), (d), (e) and (f)
Unit selling prices ranged from 55.2c to 73.5c, where 50% of the selling prices were above 59.95c. 25% of sales were below 57.7c while the most expensive 25% of unit selling prices were above 62.45c.

(g) 90th percentile = 64.6c    =PERCENTILE(data range,0.9)

(h) 10th percentile = 56.6c    =PERCENTILE(data range,0.1)

(i) Five-Number Summary Table and Boxplot
Minimum              55.2     =QUARTILE(data range,0)
Lower Quartile Q1    57.7     =QUARTILE(data range,1)
Median               59.95    =QUARTILE(data range,2)
Upper Quartile Q3    62.45    =QUARTILE(data range,3)
Maximum              73.5     =QUARTILE(data range,4)

Boxplot
Interpretation
The distribution of unit selling prices of rosebuds is skewed to the right. There is also one extremely high unit selling price of 73.5c which can be considered an outlier.
Overall, unit selling prices of rosebuds ranged between 55.2c and 73.5c. 25% of unit selling prices lay below 57.7c (i.e. the lower quartile), while the middle 50% of unit selling prices ranged from 57.7c to 62.45c. The "best" 25% of unit selling prices achieved were above 62.45c (Q3). Finally, the median (middle) unit selling price achieved was 59.95c.

Mini Case Study 3.27
File: X3.27 - savings balances.xlsx

1 (a) Frequency Distribution

Savings Intervals    Frequency
0 - ≤ 200               21
201 - ≤ 400             71
401 - ≤ 600             56
601 - ≤ 800             19
801 - ≤ 1000             4
More                     4

Histogram of Savings Balances (R10's): Number of Clients by Savings Intervals

1 (b) Descriptive Statistics

Savings Balance (R10's)
Mean                    421.86
Standard Error           16.247
Median                  385
Mode                    326
Standard Deviation      214.93
Sample Variance       46195.1765
Kurtosis                  4.727
Skewness                  1.634
Range                  1383
Minimum                  85
Maximum                1468
Sum                   73826
Count                   175
Lower Quartile          266.5
Upper Quartile          520.5

1 (c) Count of clients by Gender and Marital Status

               Female    Male    Grand Total
Married          23       83        106
Single           41       28         69
Grand Total      64      111        175

1 (d) Average savings balance by Gender and Marital Status

               Female    Male    Grand Total
Married         512.6    368.0     399.4
Single          563.7    299.3     456.4
Grand Total     545.3    350.7     421.9

2 (a)
Gender                Categorical    Nominal-scaled
Marital status        Categorical    Nominal-scaled
Month-end balances    Numeric        Ratio-scaled

2 (b) Refer to 1 (a) (Histogram) and 1 (b) (Descriptive Statistics)
The average savings balance of bank clients is R421.86. 50% of these clients have month-end balances of less than R385 (median). From the histogram, 21 clients have month-end balances of less than R200 and 8 are above R800. The 8 banking clients with relatively high month-end savings balances (skewness = 1.634, which is greater than 1.0) skew the average towards these higher balances and distort the overall picture. Therefore the median month-end balance of R385 is a more representative indicator of savings balances.
25% of their clients have month-end balances of less than R266.5, while the top 25% of savers have month-end balances in excess of R520.5, with 4 clients above R1000 (maximum = R1468).
Refer to 1 (c) (Count of Gender and Marital Status)
Amongst the female clients, the majority (64%) are single (41/64), while amongst the male clients, the majority (75%) are married (83/111). Thus the bank is attracting mainly single females and married males.
Refer to 1 (d) (Breakdown Table of Average Savings by Gender and Marital Status)
Single females are saving the most (average of R563.7) while single males are the worst savers with an average savings balance of only R299.3 (compared to the overall average of R421.9). Also, females save more, on average (R545.3), than males (R350.7). Single bank clients save more, on average (R456.4), than married bank clients (R399.4).

2 (c) Plan of Action by Bank
The bank should target females in general, as they comprise only 37% of the current client base but have the largest average balances of all clients. Single females (23%) in particular should be targeted to attract more high savers to the bank. The bank should encourage its main client base - married males - to save more.

Mini Case Study 3.28
File: X3.28 - medical claims.xlsx

1 (a) Frequency Distribution

Claims Ratio    Count      %
Below 0.1          6       4.0
0.1 - 0.5         35      23.3
0.5 - 0.9         26      17.3
0.9 - 1.3         36      24.0
1.3 - 1.7         30      20.0
1.7 - 2.1         14       9.3
2.1 - 2.5          2       1.3
Above 2.5          1       0.7
Total            150     100

Histogram of Claims Ratios: No. of Members by Claims Ratio Intervals

1 (b) Descriptive Statistics

Claims Ratio
Mean                   0.974
Median                 0.996
Mode                   #N/A
Standard Deviation     0.594
Sample Variance        0.353
Skewness               0.242
Range                  2.672
Minimum                0.013
Maximum                2.685
Sum                  146.1
Count                150
Lower Quartile         0.460
Upper Quartile         1.421

1 (c) and (d) Cross-tabulation Table / Breakdown table

                                   Age
Marital status                26 - 35    36 - 45    46 - 55    Row totals
Married          average       1.326      0.915      1.123       1.112
                 std dev       0.552      0.589      0.629       0.610
                 minimum       0.147      0.051      0.013       0.013
                 maximum       2.685      1.989      2.408       2.685
                 count          23         27         35          85
Single           average       0.919      0.474      0.947       0.793
                 std dev       0.543      0.356      0.500       0.524
                 minimum       0.014      0.024      0.218       0.014
                 maximum       1.905      1.262      1.814       1.905
                 count          36         19         10          65
Column totals    average       1.078      0.733      1.084       0.974
                 std dev       0.577      0.547      0.602       0.594
                 minimum       0.014      0.024      0.013       0.013
                 maximum       2.685      1.989      2.408       2.685
                 count          59         46         45         150

2 (a)
Marital Status       Qualitative / Categoric    Nominal
Age Group (bands)    Qualitative / Categoric    Ordinal
Claims Ratio         Quantitative / Numeric     Ratio, Continuous

2 (b) Claims Ratio Pattern of Members
The average claims ratio is 0.974. This means that, on average, members are claiming almost as much as they contribute. The median claims ratio is a similar value (0.996). This means that about half the members are claiming at least as much as they contribute. The distribution (see histogram) appears to be bi-modal, showing a low claims ratio (i.e. less than half their contributions) by at least a quarter of members (Q1 = 0.46) and a high claims ratio by about half of the members (50% are above the median claims ratio of 0.996). At least 25% of members are claiming significantly more than they contribute (Q3 = 1.42). Only 4% of members claim less than 10% of their contributions (first interval of histogram). Also, about 11% (top three intervals of histogram) are claiming at least 1.7 times as much as they contribute.
Refer to 1 (c) and (d)
It would appear that married members claim significantly more (1.112) than single members (0.793). Married members also make up the majority of members (85/150 = 57%).
Also, the younger members (26 - 35) and older members (46 - 55) claim significantly more (1.078 and 1.084 respectively) than the middle-aged members (36 - 45) (0.733), on average. These two 'high-claiming' age segments of members represent 69% (104/150) of all members.
The highest claiming segments are the married younger members (26 - 35) (1.326) and the married older members (46 - 55) (1.123). Members from these two segments also have the maximum claims ratios of all members (2.685 and 2.408 respectively). Collectively these members represent 39% (58/150) of all members of the medical scheme.
The lowest claiming segment is the middle-aged single members (0.474), but they only comprise 13% (19/150) of all scheme members.
The range of claims ratios is highest amongst married (26 - 35 years) (0.147 - 2.685) and married (46 - 55 years) (0.013 - 2.408) members. It is lowest amongst single (36 - 45 years) members (0.024 - 1.262).
In conclusion: Younger and older married members claim more on average (and above their contribution levels in most cases). Married members also make up the bulk of membership (57%). The single members tend to claim less than the married members, but only make up 43% of the membership base.

2 (c) There is cause for concern about the financial viability of the Scheme because:
The claims ratio of married members exceeds 1 on average.
Within the married group, the age groups 26 - 35 and 46 - 55 are claiming more than they contribute.
The married group represents the majority of members (57%).
The two age groups within the married group represent 68% (58/85) of all married members (and are claiming more than they contribute).
Overall, the scheme is operating close to its breakeven point (overall mean claims ratio = 0.974).
There is significant cross-subsidization of the married younger and older members, mostly by the middle-aged single members.
The management of this medical scheme needs to review member contributions from the married younger and older members.

CHAPTER 4
BASIC PROBABILITY CONCEPTS

Exercise 4.1
P(A) = 0.2 means that an event has a 20% chance of occurring.

Exercise 4.2
Mutually exclusive events.

Exercise 4.3
The outcome of one event neither influences, nor is influenced by, the outcome of the other event.

Exercise 4.4
P(A or B) = P(A) + P(B) - P(A ∩ B) = 0.26 + 0.35 - 0.14 = 0.47

Exercise 4.5
P(X / Y) = P(X ∩ Y) / P(Y) = 0.27 / 0.36 = 0.75
P(Y / X) = P(Y ∩ X) / P(X) = 0.27 / 0.54 = 0.50
Thus P(X / Y) ≠ P(Y / X)

Exercise 4.6
File: X4.6 - economic sectors.xls
(a)
Sector        Count    % count
Mining          45       18.0
Financial       72       28.8
IT              32       12.8
Production     101       40.4
Total          250      100

(b) P(Financial) = 28.8%
(c) P(Not Production) = 100 - 40.4 = 59.6%
(d) P(Mining or IT) = 18 + 12.8 = 30.8%
(e) In (b), the marginal probability was computed.
In (c), the Complementary probability rule was used.
In (d), the Addition rule for mutually exclusive events was used.

Exercise 4.7
File: X4.7 - apple grades.xls
(a)
Apple Grades    Quantity      %
A                 795        53.0
B                 410        27.3
C                 106         7.1
D                 189        12.6
Total            1500       100

(b) P(Grade A) = 53%
(c) P(Grade B or D) = 27.3 + 12.6 = 39.9%
(d) P(not (Grade C or D)) = 100 - (7.1 + 12.6) = 80.3%
(e) In (b), the marginal probability was computed.
In (c), the Addition rule for mutually exclusive events was used.
In (d), the Complementary probability rule was used.
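The probability rules applied in Exercises 4.4 to 4.7 can be written as three small functions. The Python sketch below is illustrative only; the function names are assumptions, and the inputs are the probabilities given in those exercises.

```python
import math


def p_union(p_a, p_b, p_both=0.0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
    For mutually exclusive events, p_both is 0."""
    return p_a + p_b - p_both


def p_conditional(p_both, p_given):
    """Conditional probability: P(A / B) = P(A and B) / P(B)."""
    return p_both / p_given


def p_complement(p_a):
    """Complementary rule: P(not A) = 1 - P(A)."""
    return 1 - p_a


# Exercise 4.4: non-mutually exclusive events
p_a_or_b = p_union(0.26, 0.35, 0.14)

# Exercise 4.5: the two conditional probabilities differ
p_x_given_y = p_conditional(0.27, 0.36)
p_y_given_x = p_conditional(0.27, 0.54)

# Exercise 4.7 (c) and (d): apple grades are mutually exclusive
p_b_or_d = p_union(0.273, 0.126)
p_not_c_or_d = p_complement(p_union(0.071, 0.126))

print(round(p_a_or_b, 2), round(p_x_given_y, 2), round(p_y_given_x, 2))
print(round(p_b_or_d, 3), round(p_not_c_or_d, 3))
```

Passing p_both = 0 (the default) gives the simpler addition rule for mutually exclusive events used in Exercises 4.6 and 4.7.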
Exercise 4.8
File: X4.8 - employment sectors.xls
(a)
Sector                     Count    % count
Formal Business             6678      53.0
Commercial Agriculture      1492      11.8
Subsistence Agriculture      653       5.2
Informal Business           2865      22.7
Domestic Service             914       7.3
Total                      12602     100

(b) P(Domestic Service) = 7.3%
(c) P(Commercial or Subsistence Agric) = 11.8 + 5.2 = 17%
(d) P(Informal Business / Business) = 2865/(2865+6678) = 30.02%
(e) In (b), the marginal probability was computed.
In (c), the Addition rule for mutually exclusive events was used.
In (d), the Conditional Probability rule was used.

Exercise 4.9
File: X4.9 - qualification levels.xls
(a) Random variable 1: Managerial level (categorical, ordinal-scaled and discrete)
Random variable 2: Qualification level (categorical, ordinal-scaled and discrete)

(b)
                          Managerial Level
Qualification    Section Head    Dept Head    Division Head    Total
Matric                28             14             8            50
Diploma               20             24             6            50
Degree                 5             10            14            29
Total                 53             48            28           129

(c)
(i) P(Matric) = 50/129 = 38.76%
(ii) P(Section head ∩ Degree) = 5/129 = 3.88%
(iii) P(Dept head / Diploma) = 24/50 = 48%
(iv) P(Division head) = 28/129 = 21.71%
(v) P(Division head U Section head) = (53+28)/129 = 62.79%
(vi) P(Matric U Diploma U Degree) = (50+50+29)/129 = 100%
(vii) P(Degree / Dept head) = 10/48 = 20.83%
(viii) P(Division head U Diploma U both) = (28+50-6)/129 = 55.81%

(d) Probability Types and Rules
(i) Marginal probability
(ii) Joint probability and Multiplication rule
(iii) Conditional probability
(iv) Marginal probability
(v) Addition rule for mutually exclusive events
(vi) Collectively exhaustive set of events and Addition rule for mutually exclusive events
(vii) Conditional probability
(viii) Addition rule for non-mutually exclusive events

(e) Yes, since these outcomes cannot occur simultaneously.

Exercise 4.10
File: X4.10 - bonus options.xls

              Cash bonus    Profit-sharing    Share options    Total
Admin             28              44                68          140
Production        56              75                29          160
Total             84             119                97          300

(a) P(Cash bonus) = 84/300 = 28.00%                   Marginal probability
(b) P(Share option) = 97/300 = 32.33%                 Marginal probability
(c) P(Production ∩ Cash bonus) = 56/300 = 18.67%      Joint probability
(d) P(Share option / Admin) = 68/140 = 48.57%         Conditional probability
(e) P(Production / Cash bonus) = 56/84 = 66.67%       Conditional probability
(f) Is P(A/B) = P(A)? (Here A = Share option and B = Admin, as in (b) and (d).)
P(A/B) = 68/140 = 48.57%
P(A) = 97/300 = 32.33%
Since P(A/B) ≠ P(A), the two events are statistically dependent.
Conclusion: The choice of bonus option and employee work function are associated.
(g) See (a) to (e) above.

Exercise 4.11
File: X4.11 - age profile.xls

                          Department
Age          Production    Sales    Administration    Total
Under 30         60          25           18           103
30-50            70          29           25           124
Over 50          30           8           35            73
Total           160          62           78           300

(a)
(i) P(under 30 years) = 103/300 = 34.33%                       Marginal probability
(ii) P(Production) = 160/300 = 53.33%                          Marginal probability
(iii) P(Sales ∩ (30-50) years) = 29/300 = 9.67%                Joint probability
(iv) P(over 50 / Admin) = 35/78 = 44.87%                       Conditional probability
(v) P(Production U under 30 U both) = (160+103-60)/300 = 67.67%    Addition rule for non-mutually exclusive events
(b) No, since outcomes of these two events can occur simultaneously.
(c) Let A = over 50 and B = Admin. Is P(A/B) = P(A)?
P(A/B) = 35/78 = 44.87%
P(A) = 73/300 = 24.33%
Since P(A/B) ≠ P(A), the two events are statistically dependent.
Conclusion: There is an association between the Age of an employee and the Department in which they are employed (e.g. younger employees tend to be in Production).
(d) See (a) above.

Exercise 4.12
File: X4.12 - digital cameras.xls

                  Digital Camera Brand Preference
Usage            Canon    Nikon    Pentax    Total
Professional       48       15       27        90
Personal           30       95       65       190
Total              78      110       92       280

(a) P(Professional) = 90/280 = 32.14%          Marginal probability
(b) P(Nikon User) = 110/280 = 39.29%           Marginal probability
(c) P(Pentax / Personal) = 65/190 = 34.21%     Conditional probability
(d) Let A = Professional usage and B = Canon preference. Is P(A/B) = P(A)?
P(A/B) = 48/78 = 61.54%
P(A) = 90/280 = 32.14%
Since P(A/B) ≠ P(A), the two events are statistically dependent.
Conclusion: Type of usage and choice of brand are associated (e.g. Professionals may prefer Canon while Nikon is favoured by Personal users).
(e) P(Canon ∩ Professional) = 48/280 = 17.14%  Joint probability
(f) P(Professional U Nikon U both) = (90+110-15)/280 = 66.07%    Addition rule for non-mutually exclusive events
(g) No, as outcomes of the two events can occur simultaneously, as illustrated in (f) above.

Exercise 4.13
(a) Probability Tree

Component A                Component B                Joint outcomes
P(F1) = 0.20 (Fail)        P(F2) = 0.15 (Fail)        P(F1 ∩ F2) = 0.20 x 0.15 = 0.03
                           P(S2) = 0.85 (Not fail)    P(F1 ∩ S2) = 0.20 x 0.85 = 0.17
P(S1) = 0.80 (Not fail)    P(F2) = 0.15 (Fail)        P(S1 ∩ F2) = 0.80 x 0.15 = 0.12
                           P(S2) = 0.85 (Not fail)    P(S1 ∩ S2) = 0.80 x 0.85 = 0.68
                                                      Total = 1.00

(b) P(A) = 0.20, P(B) = 0.15
P(A ∩ B) = P(A) x P(B) = 0.2 * 0.15 = 0.03 (3% chance)
(c) P(Fail) = P(A U B) = P(A) + P(B) - P(A ∩ B) = 0.2 + 0.15 - 0.03 = 0.32
Hence P(not Fail) = 1 - P(A U B) = 1 - 0.32 = 0.68 (68% chance)

Exercise 4.14
(a) Probability Tree

Workshop                     Exam Result            Joint outcomes
P(Y) = 0.30 (Attend)         P(S) = 0.80 (Pass)     P(Y ∩ S) = 0.30 x 0.80 = 0.24
                             P(F) = 0.20 (Fail)     P(Y ∩ F) = 0.30 x 0.20 = 0.06
P(N) = 0.70 (Not attend)     P(S) = 0.60 (Pass)     P(N ∩ S) = 0.70 x 0.60 = 0.42
                             P(F) = 0.40 (Fail)     P(N ∩ F) = 0.70 x 0.40 = 0.28
                                                    Total = 1.00

(b) Refer to the end nodes of the Probability Tree to answer question 4.14 (b).
(i) P(Pass and Attended workshop) = P(Y ∩ S) = 0.30 x 0.80 = 0.24 (24%)
(ii) P(Pass) = P(S) = P(S ∩ Y) + P(S ∩ N)
Reading off the Probability Tree:
P(S ∩ Y) = 0.30 x 0.80 = 0.24
P(S ∩ N) = 0.70 x 0.60 = 0.42
Thus P(S) = 0.24 + 0.42 = 0.66 (66%)

Exercise 4.15
                                                   Using Excel
(i) 6! = 6.5.4.3.2.1 = 720                         =FACT(6)
(ii) 3! 5! = (3.2.1)(5.4.3.2.1) = 720              =FACT(3)*FACT(5)
(iii) 4! 2! 3! = (4.3.2.1)(2.1)(3.2.1) = 288       =FACT(4)*FACT(2)*FACT(3)
(iv) 7C4 = 7!/(4!*(7-4)!) = 35                     =FACT(7)/(FACT(4)*FACT(3))
(v) 9C6 = 9!/(6!*(9-6)!) = 84                      =FACT(9)/(FACT(6)*FACT(3))
(vi) 8P3 = 8!/(8-3)! = 336                         =FACT(8)/FACT(5)
(vii) 5P2 = 5!/(5-2)! = 20                         =FACT(5)/FACT(3)
(viii) 7C7 = 7!/(7!*(7-7)!) = 1                    =FACT(7)/(FACT(7)*FACT(0))
(ix) 7P4 = 7!/(7-4)! = 840                         =FACT(7)/FACT(3)

Likely scenarios
(i) Number of ways of arranging 6 cars on a showroom floor.
(ii) Number of ways of arranging the seating plan of 8 persons, consisting of 3 males and 5 females.
(iii) Number of sequences of visiting 9 stores, consisting of 4 clothing stores, 2 home décor stores and 3 coffee shops.
(iv) Selecting all combinations of 4 holiday destinations from a possible 7 destinations. Similar for (v) and (viii).
(vi) Selecting a committee of 3 persons from 8 candidates, where the first person selected is the chairman, the second is the secretary and the third is a member. Similar for (vii) and (ix).

Exercise 4.16
Assume each advertisement contains a different combination of 7 out of 12 products.
12C7 = 12!/(7!*(12-7)!) = 792 different combinations
(Using Excel) =FACT(12)/(FACT(7)*FACT(5))

Exercise 4.17
A different permutation of 3 soup brands on 5 shelves is required.
5P3 = 5!/(5-3)! = 60 distinct orderings of 3 soup brands on 5 shelves.
(Using Excel) =FACT(5)/FACT(2)

Exercise 4.18
(a) 9C4 = 9!/(4!*(9-4)!) = 126 separate portfolios of 4 equities.
(Using Excel) =FACT(9)/(FACT(4)*FACT(5))
(b) P(3,5,7,8) = 1/126 = 0.007937 (0.794% chance of getting this combination)

Exercise 4.19
No. of permutations of 5 screws = 5! = 5.4.3.2.1 = 120
Thus the probability of replacing them in exactly the same order = 1/120 = 0.00833 (0.833% chance)

Exercise 4.20
(a) 10C3 = 10!/(3!*(10-3)!) = 120 different selections of 3 tourist attractions from 10 options.
     =FACT(10)/(FACT(3)*FACT(7)) (Using Excel)
(b)  P(a given combination of 3 out of 10) = 1/120 = 0.0083 (0.833% chance)

Exercise 4.21
(a)  4C2 x 7C4 = (4!/(2!*2!))*(7!/(4!*3!)) = 210 different committees
     =FACT(4)/(FACT(2)*FACT(2))*FACT(7)/(FACT(4)*FACT(3)) (Using Excel)
(b)  (4C2 x 7C4) x 2 = (4!/(2!*2!))*(7!/(4!*3!)) x 2 = 420 different committees
     =2*FACT(4)/(FACT(2)*FACT(2))*FACT(7)/(FACT(4)*FACT(3)) (Using Excel)

Exercise 4.22    Project Scoping Study    Bayes Theorem

APPROACH 1    Using a PROBABILITY TREE
T = On time; L = Late; S = Scope change; NS = No scope change

     Marginal probabilities   Conditional probabilities   Joint probabilities
     P(T) = 0.7 (On time)     P(S/T) = 0.4                P(S and T) = 0.28
                              P(NS/T) = 0.6               P(NS and T) = 0.42
     P(L) = 0.3 (Late)        P(S/L) = 0.8                P(S and L) = 0.24
                              P(NS/L) = 0.2               P(NS and L) = 0.06
                                                          Total        1

Bayes Application (using the Joint Probabilities from the Probability Tree)
     Given (i)   P(T) = 0.7                       Prior Probability
     Find  (ii)  P(T/S) = P(T and S)/P(S)         Posterior Probability
     Then  P(S) = P(S and T) + P(S and L) = 0.52
           P(T and S) = 0.28
           P(T/S) = P(T and S)/P(S) = 0.5385
There is a 53.85% chance that a 'scope-changed' project will be completed on time.
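The probability-tree route to the posterior in Exercise 4.22 can also be checked outside Excel. The following is a minimal Python sketch, not part of the original solution; the function name `posterior` and the dictionary layout are illustrative assumptions. It multiplies each prior by its conditional probability to get the joint probabilities, sums them to get P(S), and divides.

```python
def posterior(prior, likelihood):
    """Return P(state | evidence) for every state using Bayes' theorem.

    prior      -- dict mapping each state to its marginal (prior) probability
    likelihood -- dict mapping each state to P(evidence | state)
    (Both names are hypothetical helpers for this illustration.)
    """
    # Joint probabilities: P(state and evidence) = P(state) * P(evidence | state)
    joint = {state: prior[state] * likelihood[state] for state in prior}
    # Marginal probability of the evidence: sum of the relevant end nodes of the tree
    evidence = sum(joint.values())
    # Posterior: joint probability divided by the probability of the evidence
    return {state: joint[state] / evidence for state in joint}


# Exercise 4.22: T = on time, L = late; the evidence is S = scope change
prior = {"T": 0.7, "L": 0.3}
p_scope_given = {"T": 0.4, "L": 0.8}

result = posterior(prior, p_scope_given)
print(round(result["T"], 4))  # P(T/S)
```

The same function applies to Exercises 4.23 to 4.26 by swapping in their priors and conditional probabilities.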
---ooOoo---

APPROACH 2    Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)

                 Additional Information
                 S        NS       Total
     T          0.28     0.42      0.7
     L          0.24     0.06      0.3
     Total      0.52     0.48      1

P(T/S) = P(T and S)/P(S) = 0.28/(0.28+0.24) = 0.5385

---ooOoo---

Exercise 4.23    Married Couples Sporting Habits Study    Bayes Theorem

APPROACH 1    Using a PROBABILITY TREE
HS = Husband plays sport; HNS = Husband does not play sport; WS = Wife plays sport; WNS = Wife does not play sport

     Marginal probabilities   Conditional probabilities   Joint probabilities
     P(HS) = 0.6              P(WS/HS) = 0.4              P(HS and WS) = 0.24
                              P(WNS/HS) = 0.6             P(HS and WNS) = 0.36
     P(HNS) = 0.4             P(WS/HNS) = 0.3             P(HNS and WS) = 0.12
                              P(WNS/HNS) = 0.7            P(HNS and WNS) = 0.28
                                                          Total            1

Bayes Application (using the Joint Probabilities from the Probability Tree)
     Given (i)   P(HS) = 0.6                           Prior Probability
     Find  (ii)  P(HS/WS) = P(HS and WS)/P(WS)         Posterior Probability
     Then  P(WS) = P(WS and HS) + P(WS and HNS) = 0.36
           P(HS and WS) = 0.24
           P(HS/WS) = P(HS and WS)/P(WS) = 0.6667
There is a 66.67% chance that a husband plays sport if the wife also plays sport.
---ooOoo---

APPROACH 2    Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)

                 Additional Information
                 WS       WNS      Total
     HS         0.24     0.36      0.6
     HNS        0.12     0.28      0.4
     Total      0.36     0.64      1

P(HS/WS) = P(HS and WS)/P(WS) = 0.24/(0.24+0.12) = 0.6667

---ooOoo---

Exercise 4.24    Airline Departure Times Study    Bayes Theorem

APPROACH 1    Using a PROBABILITY TREE
A = Airline A; B = Airline B; T = Leaves on Time; L = Leaves Late

     Marginal probabilities   Conditional probabilities   Joint probabilities
     P(A) = 0.6               P(T/A) = 0.8                P(A and T) = 0.48
                              P(L/A) = 0.2                P(A and L) = 0.12
     P(B) = 0.4               P(T/B) = 0.65               P(B and T) = 0.26
                              P(L/B) = 0.35               P(B and L) = 0.14
                                                          Total        1

Bayes Application (using the Joint Probabilities from the Probability Tree)
     Given (i)   P(A) = 0.6                       Prior Probability
     Find  (ii)  P(A/T) = P(A and T)/P(T)         Posterior Probability
     Then  P(T) = P(A and T) + P(B and T) = 0.74
           P(A and T) = 0.48
           P(A/T) = P(A and T)/P(T) = 0.6486
There is a 64.86% chance that an aircraft that has just left on time belongs to Airline A.

---ooOoo---

APPROACH 2    Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)

                 Additional Information
                 T        L        Total
     A          0.48     0.12      0.6
     B          0.26     0.14      0.4
     Total      0.74     0.26      1

P(A/T) = P(A and T)/P(T) = 0.48/(0.48+0.26) = 0.6486

---ooOoo---

Exercise 4.25    New Business Venture Study    Bayes Theorem

APPROACH 1    Using a PROBABILITY TREE
G = NBV started by Graduate; NG = NBV started by non-Graduate; S = Successful; F = Failure

     Marginal probabilities   Conditional probabilities   Joint probabilities
     P(G) = 0.6               P(S/G) = 0.8                P(G and S) = 0.48
                              P(F/G) = 0.2                P(G and F) = 0.12
     P(NG) = 0.4              P(S/NG) = 0.65              P(NG and S) = 0.26
                              P(F/NG) = 0.35              P(NG and F) = 0.14
                                                          Total         1

Bayes Application (using the Joint Probabilities from the Probability Tree)
     Given (i)   P(G) = 0.6                       Prior Probability
     Find  (ii)  P(G/F) = P(G and F)/P(F)         Posterior Probability
     Then  P(F) = P(G and F) + P(NG and F) = 0.26
           P(G and F) = 0.12
           P(G/F) = P(G and F)/P(F) = 0.4615
There is a 46.15% chance that a NBV was started by a Graduate given that it has failed.
---ooOoo---

APPROACH 2    Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)

                 Additional Information
                 S        F        Total
     G          0.48     0.12      0.6
     NG         0.26     0.14      0.4
     Total      0.74     0.26      1

P(G/F) = P(G and F)/P(F) = 0.12/(0.12+0.14) = 0.4615

---ooOoo---

Exercise 4.26    On-line Airline Tickets Purchase Study    Bayes Theorem

APPROACH 1    Using a PROBABILITY TREE
E = e-Ticket purchase; NE = non e-Ticket purchase; B = Business Traveller; NB = non-Business Traveller

     Marginal probabilities   Conditional probabilities   Joint probabilities
     P(E) = 0.4               P(B/E) = 0.8                P(E and B) = 0.32
                              P(NB/E) = 0.2               P(E and NB) = 0.08
     P(NE) = 0.6              P(B/NE) = 0.45              P(NE and B) = 0.27
                              P(NB/NE) = 0.55             P(NE and NB) = 0.33
                                                          Total          1

Bayes Application (using the Joint Probabilities from the Probability Tree)
     Given (i)   P(E) = 0.4                       Prior Probability
     Find  (ii)  P(E/B) = P(E and B)/P(B)         Posterior Probability
     Then  P(B) = P(E and B) + P(NE and B) = 0.59
           P(E and B) = 0.32
           P(E/B) = P(E and B)/P(B) = 0.5424
There is a 54.24% chance that an e-ticket was bought given that it was bought by a business traveller.

---ooOoo---

APPROACH 2    Using TABLE FORMAT (Applying Marginals and Conditional Probabilities)

                 Additional Information
                 B        NB       Total
     E          0.32     0.08      0.4
     NE         0.27     0.33      0.6
     Total      0.59     0.41      1

P(E/B) = P(E and B)/P(B) = 0.32/(0.32+0.27) = 0.5424

---ooOoo---

CHAPTER 5 PROBABILITY DISTRIBUTIONS

Exercise 5.1
Binomial probability distribution
Poisson probability distribution

Exercise 5.2
(a)  continuous   (e.g. 35.142 gm)
(b)  discrete     (e.g. 132 employees)
(c)  discrete     (e.g. 7046 households)
(d)  continuous   (e.g. 514.68 km)

Exercise 5.3
(a)  (i)   n = 7; p = 0.2; x = 3
           P(x = 3) = 7C3 (0.2)^3 (0.8)^(7-3) = 0.1147 (11.47%)
     (ii)  n = 10; p = 0.2; x = 4
           P(x = 4) = 10C4 (0.2)^4 (0.8)^(10-4) = 0.0881 (8.81%)
     (iii) n = 12; p = 0.3; x ≤ 4
           P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) + P(x = 4)
           = 0.01384 + 0.07118 + 0.16779 + 0.2397 + 0.23114 = 0.7237 (72.37%)
     (iv)  n = 10; p = 0.05; x = 2 or 3
           P(x = 2) + P(x = 3) = 0.07464 + 0.01048 = 0.0851 (8.51%)
     (v)   n = 8; p = 0.25; x ≥ 3
           1 - (P(x = 0) + P(x = 1) + P(x = 2)) = 1 - (0.10011 + 0.26697 + 0.31146) = 0.3215 (32.15%)
(b)  (i)   =BINOMDIST(3,7,0.2,0)                                 0.1147
     (ii)  =BINOMDIST(4,10,0.2,0)                                0.0881
     (iii) =BINOMDIST(4,12,0.3,1)                                0.7237
     (iv)  =BINOMDIST(2,10,0.05,0)+BINOMDIST(3,10,0.05,0)        0.0851
     (v)   =1-BINOMDIST(2,8,0.25,1)                              0.3215

Exercise 5.4
(a)  Binomial distribution
     There are only two possible outcomes (in-stock; out-of-stock).
     This outcome is observed 6 times (n = 6 stores).
     The probability of observing the "out-of-stock" outcome, p = 0.20, is constant.
     The stores (trials) are independent of each other.
(b)  P(x = 1) = 6C1 (0.2)^1 (0.8)^(6-1) = 0.3932 (39.32%)        =BINOMDIST(1,6,0.2,0)
(c)  P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2)
     = 0.2621 + 0.3932 + 0.2458 = 0.9011 (90.11%)                =BINOMDIST(2,6,0.2,1)
(d)  P(x = 0) = 6C0 (0.2)^0 (0.8)^(6-0) = 0.2621 (26.21%)        =BINOMDIST(0,6,0.2,0)
(e)  Mean (binomial) = 6 x 0.2 = 1.2
     On average, 1.2 stores out of the 6 stores surveyed can be expected to be out of stock in a week.
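As a cross-check on the BINOMDIST results in Exercise 5.4, the binomial formula can be evaluated directly. This is a minimal sketch in Python rather than Excel; the helper names `binom_pmf` and `binom_cdf` are assumptions made for this illustration and mirror BINOMDIST with its last argument set to 0 and 1 respectively.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for a binomial variable: nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """P(X <= x): sum of the individual probabilities from 0 up to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


# Exercise 5.4: n = 6 stores, p = 0.20 chance that a store is out of stock
n, p = 6, 0.2
p_one = binom_pmf(1, n, p)          # part (b)
p_at_most_two = binom_cdf(2, n, p)  # part (c)
p_none = binom_pmf(0, n, p)         # part (d)
mean = n * p                        # part (e)

print(round(p_one, 4), round(p_at_most_two, 4), round(p_none, 4), round(mean, 1))
```

"At least" probabilities, as in Exercise 5.3 (a)(v), follow from the complement: `1 - binom_cdf(2, 8, 0.25)`.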
Exercise 5.5    Binomial distribution
(a)  n = 12; p = 0.15
     P(x = 0) = 12C0 (0.15)^0 (0.85)^(12-0) = 0.1422             =BINOMDIST(0,12,0.15,0)
(b)  n = 15; p = 0.15
     P(x < 3) = P(x = 0) + P(x = 1) + P(x = 2)
     = 0.0874 + 0.2312 + 0.2857 = 0.6042                         =BINOMDIST(2,15,0.15,1)

Exercise 5.6    Binomial distribution
(a)  n = 10; p = 0.30 (probability of preferring the deluxe model)
     P(x = 3) = 10C3 (0.30)^3 (0.70)^(10-3) = 0.2668             =BINOMDIST(3,10,0.3,0)
(b)  n = 10; p = 0.70 (probability of preferring the standard model)
     P(x > 2) = 1 - P(x ≤ 2) = 1 - (P(x = 0) + P(x = 1) + P(x = 2))
     = 1 - (0.0000059 + 0.000138 + 0.00145) = 0.9984             =1-BINOMDIST(2,10,0.7,1)

Exercise 5.7    Binomial distribution
(a)  n = 8; p = 0.05 (probability of a defective Tata truck)
     P(x = 1) = 8C1 (0.05)^1 (0.95)^(8-1) = 0.2793               =BINOMDIST(1,8,0.05,0)
(b)  n = 8; p = 0.05 (probability of a defective Tata truck)
     P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2)
     = 0.66342 + 0.27934 + 0.05146 = 0.9942                      =BINOMDIST(2,8,0.05,1)
(c)  n = 8; p = 0.05 (probability of a defective Tata truck)
     P(x = 0) = 8C0 (0.05)^0 (0.95)^(8-0) = 0.6634               =BINOMDIST(0,8,0.05,0)
(d)  Mean (binomial) = 64 x 0.05 = 3.2
     Based on 64 sales, the dealer can expect 3.2 trucks to be returned for assembly defect repairs.

Exercise 5.8    Binomial distribution
(a)  n = 6; p = 0.80 (probability of a UT out-performing the JSE Index)
     P(x = 6) = 6C6 (0.80)^6 (0.20)^(6-6) = 0.2621               =BINOMDIST(6,6,0.8,0)
(b)  n = 6; p = 0.80 (probability of a UT out-performing the JSE Index)
     P(x = (2 U 3)) = P(x = 2) + P(x = 3)
     = 0.01536 + 0.08192 = 0.0973                                =BINOMDIST(2,6,0.8,0)+BINOMDIST(3,6,0.8,0)
(c)  n = 6; p = 0.20 (probability of a UT under-performing the JSE Index)
     P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2)
     = 0.26214 + 0.39322 + 0.24576 = 0.9011                      =BINOMDIST(2,6,0.2,1)

Exercise 5.9    Binomial distribution
(a)  n = 12; p = 0.20 (probability of a person participating in a focus group)
     P(x = 2) = 12C2 (0.20)^2 (0.80)^(12-2) = 0.2835             =BINOMDIST(2,12,0.2,0)
(b)  n = 12; p = 0.20 (probability of a person participating in a focus group)
     P(x = 5) = 12C5 (0.20)^5 (0.80)^(12-5) = 0.0532             =BINOMDIST(5,12,0.2,0)
(c)  n = 12; p = 0.20 (probability of a person participating in a focus group)
     P(x ≥ 6) = 1 - P(x ≤ 5) = 1 - (P(x=0)+P(x=1)+P(x=2)+P(x=3)+P(x=4)+P(x=5))
     = 1 - (0.0687 + 0.2062 + 0.2835 + 0.2362 + 0.1329 + 0.0532) = 0.0194     =1-BINOMDIST(5,12,0.2,1)

Exercise 5.10    Binomial distribution
(a)  (i)  n = 10; p = 0.10 (probability of a general population person being a heavy reader)
          P(x < 2) = P(x=0) + P(x=1) = 10C0 (0.10)^0 (0.90)^(10-0) + 10C1 (0.10)^1 (0.90)^(10-1)
          = 0.34868 + 0.38742 = 0.7361                           =BINOMDIST(1,10,0.1,1)
     (ii) n = 10; p = 0.35 (probability of a pensioner being a heavy reader)
          P(x < 2) = P(x=0) + P(x=1) = 10C0 (0.35)^0 (0.65)^(10-0) + 10C1 (0.35)^1 (0.65)^(10-1)
          = 0.01346 + 0.07249 = 0.0860                           =BINOMDIST(1,10,0.35,1)
(b)  n = 280; p = 0.65 (probability of a pensioner not being a heavy reader)
     Expected number (mean) of "non-heavy" readers = 280 x 0.65 = 182

Exercise 5.11    Poisson distribution
(a)  P(x = 5 / a = 3) = e^-3 3^5 / 5! = 0.1008 (10.08%)          =POISSON(5,3,0)
(b)  P(x ≥ 4 / a = 3) = 1 - P(x ≤ 3) = 1 - (P(x=0)+P(x=1)+P(x=2)+P(x=3))
     = 1 - (e^-3 3^0 / 0! + e^-3 3^1 / 1! + e^-3 3^2 / 2! + e^-3 3^3 / 3!)
     = 1 - (0.04979 + 0.14936 + 0.22404 + 0.22404)
     = 1 - 0.64723 = 0.3528 (35.28%)                             =1-POISSON(3,3,1)
(c)  P(x = 0 / a = 3) = e^-3 3^0 / 0! = 0.0498 (4.98%)           =POISSON(0,3,0)

Exercise 5.12    Poisson distribution
(a)  P(x ≤ 2 / a = 4) = P(x=0)+P(x=1)+P(x=2)
     = e^-4 4^0 / 0! + e^-4 4^1 / 1! + e^-4 4^2 / 2!
     = 0.01832 + 0.07326 + 0.14653 = 0.2381 (23.81%)             =POISSON(2,4,1)
(b)  P(x ≥ 4 / a = 4) = 1 - P(x ≤ 3) = 1 - (P(x=0)+P(x=1)+P(x=2)+P(x=3))
     = 1 - (0.2381 (from (a) above) + e^-4 4^3 / 3!)
     = 1 - (0.2381 + 0.19537)
     = 1 - 0.43347 = 0.5665 (56.65%)                             =1-POISSON(3,4,1)

Exercise 5.13    Poisson distribution
(a)  (i)   P(x = 1 / a = 6) = e^-6 6^1 / 1! = 0.0149 (1.49%)     =POISSON(1,6,0)
     (ii)  P(x ≤ 3 / a = 6) = P(x=0)+P(x=1)+P(x=2)+P(x=3)
           = e^-6 6^0 / 0! + e^-6 6^1 / 1! + e^-6 6^2 / 2! + e^-6 6^3 / 3!
           = 0.00248 + 0.01487 + 0.04462 + 0.08924 = 0.1512 (15.12%)     =POISSON(3,6,1)
     (iii) P(x ≥ 3 / a = 6) = 1 - P(x ≤ 2) = 1 - (P(x=0)+P(x=1)+P(x=2))
           = 1 - (e^-6 6^0 / 0! + e^-6 6^1 / 1! + e^-6 6^2 / 2!)
           = 1 - (0.00248 + 0.01487 + 0.04462)
           = 1 - 0.06197 = 0.938 (93.80%)                        =1-POISSON(2,6,1)
(b)  Note: a = 3 since the mean number of orders must refer to a given half-day interval (i.e. 6/2 = 3)
     P(x = 1 / a = 3) = e^-3 3^1 / 1! = 0.1494 (14.94%)          =POISSON(1,3,0)
(c)  Mean = 6 orders/day
     Std dev = √6 = 2.449 orders/day

Exercise 5.14    Poisson distribution
(a)  P(x ≥ 3 / a = 1.8) = 1 - P(x ≤ 2) = 1 - (P(x = 0) + P(x = 1) + P(x = 2))
     = 1 - (e^-1.8 1.8^0 / 0! + e^-1.8 1.8^1 / 1! + e^-1.8 1.8^2 / 2!)
     = 1 - (0.1653 + 0.29754 + 0.26778)
     = 1 - 0.7306 = 0.2694 (26.94%)                              =1-POISSON(2,1.8,1)
(b)  P(x < 4 / a = 1.8) = P(x ≤ 3) = P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3)
     = e^-1.8 1.8^0 / 0! + e^-1.8 1.8^1 / 1! + e^-1.8 1.8^2 / 2! + e^-1.8 1.8^3 / 3!
     = 0.1653 + 0.29754 + 0.26778 + 0.1607 = 0.8913 (89.13%)     =POISSON(3,1.8,1)

Exercise 5.15    Poisson distribution
(a)  P(x ≤ 5 / a = 7) = P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) + P(x = 4) + P(x = 5)
     = e^-7 7^0 / 0! + e^-7 7^1 / 1! + e^-7 7^2 / 2! + e^-7 7^3 / 3! + e^-7 7^4 / 4! + e^-7 7^5 / 5!
     = 0.00091 + 0.00638 + 0.02234 + 0.05213 + 0.09123 + 0.12772 = 0.3007 (30.07%)     =POISSON(5,7,1)
(b)  P(x = 6 or x = 9 / a = 7) = P(x=6) + P(x=9)
     = e^-7 7^6 / 6! + e^-7 7^9 / 9!
     = 0.1490 + 0.1014 = 0.2504 (25.04%)                         =POISSON(6,7,0)+POISSON(9,7,0)
(c)  Note: a = 14 since the time interval is a given two-day period (i.e. 7 x 2 = 14)
     P(x > 20 / a = 14) = 1 - P(x ≤ 20) = 1 - (P(x = 0) + P(x = 1) + P(x = 2) + … + P(x = 19) + P(x = 20))
     Use Excel only.
     =1-POISSON(20,14,1) = 1 - 0.9521 = 0.0479 (4.79%)

Exercise 5.16    Normal distribution
(a)  Use the Standard Normal (z) table
     (i)   P(0 < z < 1.83) = 0.4664
     (ii)  P(z > -0.48) = 0.5 + P(0 < z < 0.48) = 0.5 + 0.1844 = 0.6844
     (iii) P(-2.25 < z < 0) = P(0 < z < 2.25) = 0.4878
     (iv)  P(1.22 < z) = 0.5 - P(0 < z < 1.22) = 0.5 - 0.3888 = 0.1112
     (v)   P(-2.08 < z < 0.63) = P(0 < z < 2.08) + P(0 < z < 0.63) = 0.4812 + 0.2357 = 0.7169
     (vi)  P(z < -0.68) = 0.5 - P(0 < z < 0.68) = 0.5 - 0.2517 = 0.2483
     (vii) P(0.33 < z < 1.5) = P(0 < z < 1.5) - P(0 < z < 0.33) = 0.4332 - 0.1293 = 0.3039
(b)  Using Excel
     (i)   =NORMSDIST(1.83)-0.5                   0.4664
     (ii)  =1-NORMSDIST(-0.48)                    0.6844
     (iii) =0.5-NORMSDIST(-2.25)                  0.4878
     (iv)  =1-NORMSDIST(1.22)                     0.1112
     (v)   =NORMSDIST(0.63)-NORMSDIST(-2.08)      0.7169
     (vi)  =NORMSDIST(-0.68)                      0.2483
     (vii) =NORMSDIST(1.5)-NORMSDIST(0.33)        0.3039

Exercise 5.17    Normal distribution
(a)  Use the Standard Normal (z) table
                                Look up area                 z-value
     (i)   P(z < ?) = 0.9147    0.9147 - 0.5 = 0.4147         1.37
     (ii)  P(z > ?) = 0.5319    0.5319 - 0.5 = 0.0319        -0.08    Since area to right of z is greater than 0.5, z must be negative.
     (iii) P(0 < z < ?) = 0.4015    0.4015                    1.29
     (iv)  P(? < z < 0) = 0.4803    0.4803                   -2.06
     (v)   P(? < z) = 0.0985    0.5 - 0.0985 = 0.4015         1.29
     (vi)  P(z < ?) = 0.2517    0.5 - 0.2517 = 0.2483        -0.67    Since area to left of z is less than 0.5, z must be negative.
     (vii) P(? < z) = 0.6331    0.6331 - 0.5 = 0.1331        -0.34    Since area to right of z is greater than 0.5, z must be negative.
(b)  Using Excel                                 z-value
     (i)   =NORMSINV(0.9147)                      1.3703
     (ii)  =NORMSINV(1-0.5319)                   -0.0800
     (iii) =NORMSINV(0.5+0.4015)                  1.2901
     (iv)  =NORMSINV(0.5-0.4803)                 -2.0600
     (v)   =NORMSINV(1-0.0985)                    1.2901
     (vi)  =NORMSINV(0.2517)                     -0.6691
     (vii) =NORMSINV(1-0.6331)                   -0.3401

Exercise 5.18    μ = 64; σ = 2.5
(a)  Use the Standard Normal (z) table
     (i)   P(x < 62) = P(z < (62-64)/2.5) = P(z < -0.80) = 0.5 - 0.2881 = 0.2119
     (ii)  P(x > 67.4) = P(z > (67.4-64)/2.5) = P(z > 1.36) = 0.5 - 0.4131 = 0.0869
     (iii) P(59.6 < x < 62.8) = P((59.6-64)/2.5 < z < (62.8-64)/2.5) = P(-1.76 < z < -0.48) = 0.4608 - 0.1844 = 0.2764
     (iv)  P(x > ?) = 0.1026. Look up area = 0.3974, giving z ≈ 1.267; then x = 64 + 1.267*2.5 = 67.168 min
     (v)   P(x > ?) = 0.9772. Look up area = 0.9772 - 0.5 = 0.4772, giving z = -2.00; then x = 64 - 2.00*2.5 = 59 min
     (vi)  P(60.2 < x < ?) = 0.6652. Find P(60.2 < x < 64) = P(-1.52 < z < 0), giving area = 0.4357.
           Look up area = 0.6652 - 0.4357 = 0.2295, giving z ≈ 0.611; then x = 64 + 0.611*2.5 = 65.528 min
(b)  Using Excel NORMDIST and NORMINV
     (i)   =NORMDIST(62,64,2.5,1)                                0.2119
     (ii)  =1-NORMDIST(67.4,64,2.5,1)                            0.0869
     (iii) =NORMDIST(62.8,64,2.5,1)-NORMDIST(59.6,64,2.5,1)      0.2764
     (iv)  =NORMINV(1-0.1026,64,2.5)                             67.167
     (v)   =NORMINV(1-0.9772,64,2.5)                             59.002
     (vi)  =NORMDIST(60.2,64,2.5,1)+0.6652                       0.729455
           Apply area = 0.729455 to =NORMINV(0.729455,64,2.5)    65.528

Exercise 5.19    Gym Attendance Duration    Normal distribution    x ≡ N(80; 20)
(a)  P(x > 120) = P(z > (120-80)/20) = P(z > 2) = 0.5 - 0.4772 = 0.0228 (2.28%)
(b)  P(x < 60) = P(z < (60-80)/20) = P(z < -1) = 0.5 - 0.3413 = 0.1587 (15.87%)
(c)  P(x > k) = 0.60. Look up area = 0.60 - 0.50 = 0.10, giving z ≈ -0.253
     Then x = 80 - 0.253*20 ≈ 74.94 min
Using Excel NORMDIST or NORMINV
(a)  =1-NORMDIST(120,80,20,1)      0.0228
(b)  =NORMDIST(60,80,20,1)         0.1587
(c)  =NORMINV(1-0.6,80,20)         74.93 minutes

Exercise 5.20    Automatic Washing Machine Lifespan    Normal distribution    x ≡ N(3.1; 1.1)
(a)  P(x < 1) = P(z < (1-3.1)/1.1) = P(z < -1.9091) = 0.5 - 0.4719 = 0.0281 (2.81%)
(b)  (i)  P(x > 4) = P(z > (4-3.1)/1.1) = P(z > 0.818) = 0.5 - 0.2939 = 0.2061 (20.61%)
     (ii) P(x > 5.5) = P(z > (5.5-3.1)/1.1) = P(z > 2.182) = 0.5 - 0.4854 = 0.0146 (1.46%)
(c)  P(x < k) = 0.05. Look up area = 0.50 - 0.05 = 0.45, giving z ≈ -1.645
     Then x = 3.1 - 1.645*1.1 ≈ 1.29 years
Using Excel NORMDIST or NORMINV
(a)       =NORMDIST(1,3.1,1.1,1)        0.0281
(b) (i)   =1-NORMDIST(4,3.1,1.1,1)      0.2061
(b) (ii)  =1-NORMDIST(5.5,3.1,1.1,1)    0.0146
(c)       =NORMINV(0.05,3.1,1.1)        1.29 years

Exercise 5.21    Household Daily Water Usage    Normal distribution    x ≡ N(220; 45)
(a)  (i)   P(x > 300) = P(z > (300-220)/45) = P(z > 1.778) = 0.5 - 0.4625 = 0.0375 (3.75%)
     (ii)  P(x < 100) = P(z < (100-220)/45) = P(z < -2.667) = 0.5 - 0.4962 = 0.0038 (0.38%)
     (iii) P(x < k) = 0.15. Look up area = 0.50 - 0.15 = 0.35, giving z ≈ -1.038 (approx)
           Then x = 220 - 1.038*45 ≈ 173.29 litres
     (iv)  P(x > k) = 0.20. Look up area = 0.50 - 0.20 = 0.30, giving z ≈ 0.842 (approx)
           Then x = 220 + 0.842*45 ≈ 257.89 litres
(b)  Using Excel NORMDIST
     (a) (i)   =1-NORMDIST(300,220,45,1)     0.0377
     (a) (ii)  =NORMDIST(100,220,45,1)       0.0038
(c)  Using Excel NORMINV
     (a) (iii) =NORMINV(0.15,220,45)         173.36 litres
     (a) (iv)  =NORMINV(1-0.2,220,45)        257.87 litres

Exercise 5.22    Long-distance Truck Drivers' Reaction Times    Normal distribution    x ≡ N(1.4; 0.25)
Variance = 0.0625; Std dev = 0.25
(a)  (i)   P(x > 2) = P(z > (2-1.4)/0.25) = P(z > 2.4) = 0.5 - 0.4918 = 0.0082 (0.82%)
     (ii)  P(1.2 < x < 1.4) = P((1.2-1.4)/0.25 < z < (1.4-1.4)/0.25) = P(-0.80 < z < 0) = 0.2881 (28.81%)
     (iii) P(x < 0.9) = P(z < (0.9-1.4)/0.25) = P(z < -2.00) = 0.5 - 0.4772 = 0.0228 (2.28%)
     (iv)  P(0.5 < x < 1.0) = P((0.5-1.4)/0.25 < z < (1.0-1.4)/0.25) = P(-3.6 < z < -1.6) = 0.49984 - 0.4452 = 0.0546 (5.46%)
(b)  P(x > 1.8) = P(z > (1.8-1.4)/0.25) = P(z > 1.6) = 0.5 - 0.4452 = 0.0548 (5.48%)
     No. of truck drivers = 120 * 0.0548 = 6.576 drivers, i.e. 7 drivers (approx)
(c)  P(x > 1.7) = P(z > (1.7-1.4)/0.25) = P(z > 1.2) = 0.5 - 0.3849 = 0.1151 (11.51%)
     No. of truck drivers = 360 * 0.1151 = 41.436 drivers, i.e. 42 drivers (approx)
Using Excel NORMDIST
(a)  (i)   =1-NORMDIST(2,1.4,0.25,1)                              0.0082 (0.82%)
     (ii)  =0.5-NORMDIST(1.2,1.4,0.25,1)                          0.2881 (28.81%)
     (iii) =NORMDIST(0.9,1.4,0.25,1)                              0.0228 (2.28%)
     (iv)  =NORMDIST(1,1.4,0.25,1)-NORMDIST(0.5,1.4,0.25,1)       0.0546 (5.46%)
(b)  =1-NORMDIST(1.8,1.4,0.25,1)      0.0548 (5.48%); truck drivers = 6.576, i.e. 7
(c)  =1-NORMDIST(1.7,1.4,0.25,1)      0.115; truck drivers = 41.425, i.e. 42

Exercise 5.23    Hair dye Containers Mass    Normal distribution    x ≡ N(18.2; 0.7)
Variance = 0.49; Std dev = 0.7
(a)  P(x < 18) = P(z < (18-18.2)/0.7) = P(z < -0.2857) = 0.5 - 0.1124 (approx) = 0.3876 or 38.76%
(b)  P(x > k) = 0.15. Look up area = 0.50 - 0.15 = 0.35, giving z ≈ 1.038 (approx)
     Then x = 18.2 + 1.038 * 0.7 ≈ 18.927 gm
Using Excel NORMDIST and NORMINV
(a)  =NORMDIST(18,18.2,0.7,1)      0.3875
(b)  =NORMINV(0.85,18.2,0.7)       18.926 gm

Exercise 5.24    Motor Vehicle Service Time    Normal distribution    x ≡ N(70; 9)
Variance = 81; Std dev = 9
(a)  P(x < 60) = P(z < (60-70)/9) = P(z < -1.11) = 0.5 - 0.3665 = 0.1335 (13.35%)
(b)  P(x > 90) = P(z > (90-70)/9) = P(z > 2.22) = 0.5 - 0.4868 = 0.0132 (1.32%)
(c)  P(50 < x < 60) = P((50-70)/9 < z < (60-70)/9) = P(-2.22 < z < -1.11) = 0.4868 - 0.3665 = 0.1203 (12.03%)
(d)  P(x > 80) = P(z > (80-70)/9) = P(z > 1.11) = 0.5 - 0.3665 = 0.1335 (13.35%)
     No. of customers = 0.1335 * 80 = 10.68 customers
(e)  P(z > (80-μ)/9) = 0.05. Look up area = 0.50 - 0.05 = 0.45, giving z ≈ 1.645
     Then 1.645 = (80 - μ)/9, giving μ = 80 - 1.645 * 9 = 65.195 min
Using Excel NORMDIST
(a)  =NORMDIST(60,70,9,1)                          0.13326 (13.33%)
(b)  =1-NORMDIST(90,70,9,1)                        0.013134 (1.31%)
(c)  =NORMDIST(60,70,9,1)-NORMDIST(50,70,9,1)      0.120126 (12.01%)
(d)  =1-NORMDIST(80,70,9,1)                        0.13326 (13.33%)
(e)  This answer cannot be computed from the Excel function NORMDIST.

Exercise 5.25    Coffee Dispensing Machine - Cup Fill    Normal distribution    x ≡ N(230; 10)
(a)  (i)   P(x > 235) = P(z > (235-230)/10) = P(z > 0.5) = 0.5 - 0.1915 = 0.3085 (30.85%)
     (ii)  P(235 < x < 245) = P((235-230)/10 < z < (245-230)/10) = P(0.5 < z < 1.5) = 0.4332 - 0.1915 = 0.2417 (24.17%)
     (iii) P(x < 220) = P(z < (220-230)/10) = P(z < -1.00) = 0.5 - 0.3413 = 0.1587 (15.87%)
(b)  P(x > k) = 0.15. Look up area = 0.50 - 0.15 = 0.35, giving z ≈ 1.038 (approx)
     Then x = 230 + 1.038 * 10 = 240.38 ml
(c)  P(z < (220-μ)/10) = 0.10. Look up area = 0.50 - 0.10 = 0.40, giving z ≈ -1.282
     Then -1.282 = (220 - μ)/10, giving μ = 220 + 1.282 * 10 = 232.82 ml
Using Excel NORMDIST and NORMINV
(a) (i)   =1-NORMDIST(235,230,10,1)                             0.3085 (30.85%)
(a) (ii)  =NORMDIST(245,230,10,1)-NORMDIST(235,230,10,1)        0.2417 (24.17%)
(a) (iii) =NORMDIST(220,230,10,1)                               0.1587 (15.87%)
(b)       =NORMINV(1-0.15,230,10)                               240.36 ml

Exercise 5.26    Car Battery Lifespan    Normal distribution    x ≡ N(28; 4)
(a)  P(30 < x < 34) = P((30-28)/4 < z < (34-28)/4) = P(0.50 < z < 1.50) = 0.4332 - 0.1915 = 0.2417 or 24.17%
(b)  P(x < 24) = P(z < (24-28)/4) = P(z < -1.00) = 0.5 - 0.3413 = 0.1587 or 15.87%
(c)  P(x > k) = 0.60. Look up area = 0.60 - 0.50 = 0.10, giving z ≈ -0.253 (approx)
     Then x = 28 - 0.253 * 4 ≈ 26.988 months
(d)  P(z < (x-28)/4) = 0.05. Look up area = 0.50 - 0.05 = 0.45, giving z ≈ -1.645
     Then -1.645 = (x - 28)/4, giving x = 28 - 1.645 * 4 = 21.42 months
Using Excel NORMDIST and NORMINV
(a)  =NORMDIST(34,28,4,1)-NORMDIST(30,28,4,1)      0.2417 (24.17%)
(b)  =NORMDIST(24,28,4,1)                          0.1587 (15.87%)
(c)  =NORMINV(0.4,28,4)                            26.987 mths
(d)  =NORMINV(0.05,28,4)                           21.421 mths

CHAPTER 6 SAMPLING AND SAMPLING DISTRIBUTIONS

6.1 To generalise sample findings to the target population.
6.2 Sampling methods; Concept of the sampling distribution.
6.3 subset.
6.4 representative.
6.5 Non-probability sampling: sample members are chosen using non-random criteria, meaning that some members of the target population are excluded from being included in the sample.
    Probability sampling: sample members are chosen using random selection processes so that each target population member stands a chance of being included in the sample.
6.6 Random (probability) sampling methods. Every member of the target population has a chance of being included in the sample. This is likely to result in a more representative sample than if a non-probability sampling method were used.
6.7 Unrepresentativeness of the target population; Not possible to measure sampling error.
6.8 random (chance)
6.9 equal
6.10 Simple random sampling
6.11 Systematic random sampling
6.13 Stratified random sampling
6.14 Cluster random sampling
6.15 It results in a smaller sampling error
6.16 sample statistic; population parameter
6.17 standard error
6.18 95.5%
6.19 Normal
6.20 n = 30 or larger
6.21 Central Limit Theorem.
6.22 Sampling error is the error made when estimating the true population parameter (e.g. population mean) when using the sample statistic (e.g. sample mean).

CHAPTER 7 CONFIDENCE INTERVAL ESTIMATION

Exercise 7.1
To estimate a population parameter value by defining an interval within which the true population value is likely to fall at a stated level of confidence.
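The interval idea in Exercise 7.1 can be illustrated numerically. The sketch below is an illustration in Python, not the manual's Excel method; the function name `z_interval` is an assumption. It builds a confidence interval for a mean when σ is known, using the figures that appear in Exercise 7.2 (sample mean 85, σ = 8, n = 64). `NormalDist().inv_cdf` from the Python standard library plays the role of NORMSINV.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence):
    """Confidence limits for a population mean when sigma is known."""
    # Critical z-value: cuts off (1 - confidence)/2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error = z-value times the standard error sigma / sqrt(n)
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin


# Figures from Exercise 7.2: x(bar) = 85, sigma = 8, n = 64, 95% confidence
lower, upper = z_interval(85, 8, 64, 0.95)
print(round(lower, 2), round(upper, 2))
```

Raising the confidence level widens the interval, which is the precision versus reliability trade-off discussed in the later exercises.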
Exercise 7.2
Use the z-statistic since the population standard deviation, σ, is known (given σ = 8).
     Standard error                = 8/√64 = 1
     95% Confidence level          z(0.95) = 1.96          Use NORMSINV(0.975)
     Margin of error               = z x SE = 1.96
     Lower 95% confidence limit    85 - 1.96(1) = 83.04
     Upper 95% confidence limit    85 + 1.96(1) = 86.96

Exercise 7.3
t statistic

Exercise 7.4
Use the t-statistic since the population standard deviation, σ, is unknown (given s = 6).
     Standard error (approx)       = 6/√25 = 1.2
     Degrees of freedom            = 25 - 1 = 24
     90% Confidence level          t(0.10,24) = 1.711      Use TINV(0.10,24)    Note: TINV requires the two-tailed probability
     Margin of error               = t x SE = 1.711(1.2) = 2.05
     Lower 90% confidence limit    54 - 1.711(1.2) = 51.95
     Upper 90% confidence limit    54 + 1.711(1.2) = 56.05

Exercise 7.5
(a)  Manually    x(bar) = 24.4    σ = 10.8    n = 144
     std error = 10.8/√144 = 0.9
     z crit (95%) = 1.96
     95% margin of error = 1.96 * 0.9 = 1.764
     Lower 95% confidence limit    24.4 - 1.764 = 22.636
     Upper 95% confidence limit    24.4 + 1.764 = 26.164
     Interpretation: There is a 95% chance that the interval (22.636 to 26.164) covers the actual average number of employees per SME in Gauteng.
(b)  z-crit (0.95)           =NORMSINV(0.975) = 1.96
(c)  95% margin of error     =CONFIDENCE(0.05,10.8,144) = 1.764

Exercise 7.6
(a)  Manually    x(bar) = 131.6    σ = 25    n = 87
     std error = 25/√87 = 2.68
     z crit (90%) = 1.645
     90% margin of error = 1.645 * 2.68 = 4.409
     Lower 90% confidence limit    131.6 - 4.409 = 127.191
     Upper 90% confidence limit    131.6 + 4.409 = 136.009
     Interpretation: There is a 90% chance that the interval (127.191 to 136.009) covers the actual average no. of palettes per order received by a sugar mill in Durban.
(b)  z-crit (0.90)           =NORMSINV(0.95) = 1.645
(c)  90% margin of error     =CONFIDENCE(0.10,25,87) = 4.409
(d)  90% confidence limits of total palettes shipped in a year
     Lower 90% limit    127.191 * 720 = 91577.52 palettes
     Upper 90% limit    136.009 * 720 = 97926.48 palettes
     Interpretation: Total palettes shipped in a year, on average, is likely to be between 91 577 and 97 926 palettes, with 90% confidence.

Exercise 7.7
(a)  Manually    x(bar) = R356    σ = R44    n = 256
     std error = 44/√256 = 2.75
     z crit (95%) = 1.96
     95% margin of error = 1.96 * 2.75 = 5.39
     Lower 95% confidence limit    356 - 5.39 = R350.61
     Upper 95% confidence limit    356 + 5.39 = R361.39
     Interpretation: There is a 95% chance that the interval (R350.61 to R361.39) covers the actual average monthly car insurance premium of medium-sized cars.
(b)  z crit (90%) = 1.645
     90% margin of error = 1.645 * 2.75 = 4.524
     Lower 90% confidence limit    356 - 4.524 = R351.48
     Upper 90% confidence limit    356 + 4.524 = R360.52
     Interpretation: There is a 90% chance that the interval (R351.48 to R360.52) covers the actual average monthly car insurance premium of medium-sized cars.
     The 90% confidence interval limits are closer together than the 95% confidence interval limits. The lower confidence level results in a more precise (narrower) set of interval limits.
(c)  z-crit (0.95)           =NORMSINV(0.975) = 1.96
(d)  95% margin of error     =CONFIDENCE(0.05,44,256) = 5.3899
(e)  95% confidence limits of total monthly premium income
     Lower 95% confidence limit    350.61 * 3000 = R1 051 830
     Upper 95% confidence limit    361.39 * 3000 = R1 084 170
     Interpretation: Total monthly premium income, on average, is likely to be between R1 051 830 and R1 084 170, with 95% confidence.

Exercise 7.8
(a)  Manually    x(bar) = 4.985    σ = 0.04    n = 50
     std error = 0.04/√50 = 0.005657
     z crit (99%) = 2.58
     99% margin of error = 2.58 * 0.005657 = 0.01460
     Lower 99% confidence limit    4.985 - 0.0146 = 4.9704
     Upper 99% confidence limit    4.985 + 0.0146 = 4.9996
     Interpretation: There is a 99% chance that the interval (4.9704 litres to 4.9996 litres) covers the actual average volume of paint in all five-litre cans.
(b)  Yes, the store owner has statistical evidence to confirm that the average volume of paint in 5-litre cans is most likely to be below 5 litres.
(c) z-crit (0.99) =NORMSINV(0.995) 2.576 (d) 99% Confidence level =CONFIDENCE(0.01,0.04,50) 0.0146 Exercise 7.9 (a) Manually x(bar) = 3.8 σ = 0.6 std error = 0,122474 z crit (90%) = 90% Confidence level = 0.6/√24 = 1.645 0.20147 Lower 90% confidence limit Upper 90% confidence limit n = 24 0.122474 3.8 - 0.20147 = 3.8 + 0.20147 = 3.599 4.001 Interpretation There is a 90% chance that the interval (3.599 to 4.001) covers the actual average inventory turnover rate of all convenience stores. (b) z-crit (0.90) =NORMSINV(0.95) (c) 90% Confidence level =CONFIDENCE(0.10,0.6,24) 1.645 0.20145 Exercise 7.10 (a) Manually x(bar) = 166.2 std error (estimated) = t crit (95%,20) 95% Confidence level = s = 22.8 n = 21 22.8/√21 = 2.086 10.3786 4.975368 Lower 95% confidence limit Upper 95% confidence limit Interpretation (b) 166.2 - 10.3786 166.2 + 10.3786 There is a 95% chance that between 155.8 and 176.6 calls, on average, will be received by the call centre daily. Manually x(bar) = 166.2 std error (estimated) = t crit (99%,20) 99% Confidence level = s = 22.8 n = 21 22.8/√21 = 2.845 14.1549 4.975368 Lower 99% confidence limit Upper 99% confidence limit Interpretation 155.82 176.58 166.2 - 14.1549 166.2 + 14.1550 152.05 180.35 There is a 99% chance that between 152.1 and 180.4 calls, on average, will be received by the call centre daily. (c) The 95% confidence interval is more precise (narrower), but less reliable than the 99% confidence interval which is less precise (wider), but more reliable. (d) t-crit (0.95, 20) =TINV(0.05,20) (e) Over a 30 day period Lower 95% confidence limit Upper 95% confidence limit Interpretation There is a 95% chance that between 4675 and 5298 calls, on average, will be received over a 30-day period. 
t-crit (0.95, 20) = 2.0860
Lower 95% confidence limit (30 days) = 155.82*30 = 4674.60
Upper 95% confidence limit (30 days) = 176.58*30 = 5297.40

Exercise 7.11
(a) Manually
x(bar) = 12.5    s = 3.4    n = 28
std error (estimated) = 3.4/√28 = 0.64254
t crit (90%, 27) = 1.703
90% Confidence level = 1.703 * 0.64254 = 1.0943
Lower 90% confidence limit = 12.5 - 1.0943 = 11.406
Upper 90% confidence limit = 12.5 + 1.0943 = 13.594
Interpretation
There is a 90% chance that the actual mean dividend yield of all JSE-listed companies lies between 11.41% and 13.59%.
(b) t-crit (0.90, 27) =TINV(0.10,27) returns 1.7033

Exercise 7.12
(a) Manually
x(bar) = 0.981    s = 0.052    n = 18
std error (estimated) = 0.052/√18 = 0.01226
t crit (99%, 17) = 2.8982
99% Confidence level = 2.8982 * 0.01226 = 0.0355
Lower 99% confidence limit = 0.981 - 0.0355 = 0.9455
Upper 99% confidence limit = 0.981 + 0.0355 = 1.0165
Interpretation
There is a 99% chance that the average fill of 1-litre cartons of milk lies between 0.9455 litres and 1.0165 litres. Since the interval covers 1 litre, it can be concluded that the cartons do contain one litre of milk on average.
(b) Manually
x(bar) = 0.981    s = 0.052    n = 18
std error (estimated) = 0.052/√18 = 0.01226
t crit (95%, 17) = 2.1098
95% Confidence level = 2.1098 * 0.01226 = 0.0259
Lower 95% confidence limit = 0.981 - 0.0259 = 0.9551
Upper 95% confidence limit = 0.981 + 0.0259 = 1.0069
Interpretation
There is a 95% chance that the average fill of 1-litre cartons of milk lies between 0.9551 litres and 1.0069 litres.
The 95% confidence interval is more precise (narrower) but less reliable than the 99% confidence interval, which is less precise (wider) but more reliable.
(c) t-crit (0.99, 17) =TINV(0.01,17) returns 2.8982
t-crit (0.95, 17) =TINV(0.05,17) returns 2.1098

Exercise 7.13
(a) Manually
x(bar) = 1420    s = 160    n = 50
std error (estimated) = 160/√50 = 22.62742
t crit (90%, 49) = 1.676 (use df = 50 from Table 2, Appendix)
90% Confidence level = 1.676 * 22.62742 = 37.9235
Lower 90% confidence limit = 1420 - 37.9235 = 1382.08
Upper 90% confidence limit = 1420 + 37.9235 = 1457.92
Interpretation
There is a 90% chance that the average monthly wage of union members lies between R1382.08 and R1457.92.
(b) Manually
x(bar) = 1420    s = 160    n = 50
std error (estimated) = 160/√50 = 22.62742
t crit (99%, 49) = 2.678 (use df = 50 from Table 2, Appendix)
99% Confidence level = 2.678 * 22.62742 = 60.5962
Lower 99% confidence limit = 1420 - 60.5962 = 1359.4
Upper 99% confidence limit = 1420 + 60.5962 = 1480.6
Interpretation
There is a 99% chance that the average monthly wage of union members lies between R1359.40 and R1480.60.
(c) The 90% confidence interval is more precise (narrower) but less reliable than the 99% confidence interval, which is less precise (wider) but more reliable.
(d) t-crit (0.90, 49) =TINV(0.10,49) returns 1.6766
t-crit (0.99, 49) =TINV(0.01,49) returns 2.6800

Exercise 7.14
x = 84    n = 200
p = 84/200 = 0.42
std error = √[(0.42)(0.58)/200] = 0.0349
z crit (95%) = 1.96
95% Confidence level = 1.96 * 0.0349 = 0.0684
Lower 95% confidence limit = 0.42 - 0.0684 = 0.3516
Upper 95% confidence limit = 0.42 + 0.0684 = 0.4884
Interpretation
There is a 95% chance that the percentage of manufacturing firms that meet the employment equity charter lies between 35.2% and 48.8%.

Exercise 7.15
x = 68    n = 160
p = 68/160 = 0.425
std error = √[(0.425)(0.575)/160] = 0.03908
z crit (95%) = 1.96
95% Confidence level = 1.96 * 0.03908 = 0.0766
Lower 95% confidence limit = 0.425 - 0.0766
Upper 95% confidence limit = 0.425 + 0.0766
Interpretation
There is a 95% chance that the percentage of cash-paying customers lies between 34.8% and 50.2%.
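The proportion intervals in Exercises 7.14 and 7.15 all follow the same arithmetic: sample proportion, standard error, then z-crit times standard error. A minimal Python sketch of that arithmetic is given here as a cross-check; the function name `proportion_interval` is illustrative and not part of the manual, and the z critical value is taken from the standard library rather than from the z-table, so results may differ from the manual in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, confidence):
    """Large-sample z confidence interval for a single proportion.

    Returns (p, margin, lower, upper), where margin corresponds to the
    'Confidence level' figure used in the manual (z-crit * std error).
    """
    p = x / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(p * (1 - p) / n)
    margin = z_crit * std_error
    return p, margin, p - margin, p + margin


# Exercise 7.15 inputs: 68 cash-paying customers out of 160 sampled.
p, margin, lower, upper = proportion_interval(68, 160, 0.95)
print(f"p = {p:.3f}, margin = {margin:.4f}, limits = {lower:.4f} to {upper:.4f}")
```

Calling the same function with (84, 200, 0.95) gives the Exercise 7.14 interval.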
Exercise 7.15 confidence limits: 0.3484 (lower) and 0.5016 (upper)

Exercise 7.16
x = (365 - 78) = 287    n = 365
p = 287/365 = 0.786
std error = √[(0.786)*(0.214)/365] = 0.021467
z crit (90%) = 1.645
90% Confidence level = 1.645 * 0.021467 = 0.03532
Lower 90% confidence limit = 0.786 - 0.0353 = 0.7507
Upper 90% confidence limit = 0.786 + 0.0353 = 0.8213
Interpretation
There is a 90% chance that the percentage of non-overdrawn cheque accounts at the Tshwane branch lies between 75.1% and 82.1%.

Exercise 7.17
x = 120    n = 300
p = 120/300 = 0.4
std error = √[(0.4)*(0.6)/300] = 0.02828
z crit (90%) = 1.645
90% Confidence level = 1.645 * 0.02828 = 0.04652
Lower 90% confidence limit = 0.4 - 0.04652 = 0.3535
Upper 90% confidence limit = 0.4 + 0.04652 = 0.4465
Interpretation
There is a 90% chance that the percentage of shoppers who frequent a shopping mall primarily because of its store mix lies between 35.4% and 44.7%.

Exercise 7.18
(a) File: X7.18 - cashier absenteeism.xlsx
Descriptive Statistics - Absent Days (Using Data Analysis in Excel)
Absent Days
Mean 9.379; Standard Error 0.625; Median 9; Mode 7
Standard Deviation 3.364; Sample Variance 11.315
Kurtosis -0.919; Skewness -0.009
Range 12; Minimum 3; Maximum 15
Sum 272; Count 29
Confidence Level (95.0%) 1.280
The average number of days absent was 9.4 days.
68% of employees were absent between 6 and 12.7 days (within 1 std dev).
The data is symmetrical about the mean (skewness = -0.009).
Absenteeism per employee ranged between 3 (min) and 15 days (max) last year.
(b) Lower 95% Confidence Limit = 9.379 - 1.28 = 8.10
Upper 95% Confidence Limit = 9.379 + 1.28 = 10.66
Interpretation
There is a 95% chance that the average number of days absent last year by all cashiers in a supermarket was between 8.1 days and 10.7 days.
(c) Since the 95% confidence interval covers 10 days (8.1 < μ < 10.66), it is possible that the mean number of days absent per employee does exceed 10 days. Thus the company's policy is not being strictly adhered to.
(d) t-crit (0.95, 28) =TINV(0.05,28) returns 2.0484
Given standard error = 0.6250
Confidence level (95%) = 0.625 * 2.0484 = 1.2803

Exercise 7.19
(a) File: X7.19 - parcel masses.xlsx
Descriptive Statistics - Parcel Weights (kg) (Using Data Analysis in Excel)
Parcel Masses (kg)
Mean 2.829; Standard Error 0.091; Median 2.78; Mode 3.22
Standard Deviation 0.5998; Sample Variance 0.360
Kurtosis -0.972; Skewness -0.108
Range 2.17; Minimum 1.75; Maximum 3.92
Sum 121.63; Count 43
Confidence Level (90.0%) 0.154
The average parcel weight was 2.83 kg.
68% of parcels weigh between 2.23 kg and 3.43 kg (within 1 std dev).
The data is symmetrical about the mean (skewness = -0.1076).
Parcel weights ranged between 1.75 kg (min) and 3.92 kg (max).
(b) Lower 90% Confidence Limit = 2.829 - 0.154 = 2.675
Upper 90% Confidence Limit = 2.829 + 0.154 = 2.983
Interpretation
There is a 90% chance that the average parcel weight is between 2.68 kg and 2.98 kg.
(c) Since the 90% confidence interval lies below 3 kg (2.675 < μ < 2.983), it is highly likely that the mean parcel weight does not exceed 3 kg. Thus Post-Net is adhering to its policy.
(d) t-crit (0.90, 42) =TINV(0.10,42) returns 1.682
Given standard error = 0.0910
Confidence level (90%) = 0.091 * 1.682 = 0.154

Exercise 7.20
(a) File: X7.20 - cost-to-income.xlsx
Descriptive Statistics - Cost-to-Income Ratio (Using Data Analysis in Excel)
Cost-to-Income Ratio
Mean 71.240; Standard Error 1.997; Median 68; Mode 84
Standard Deviation 14.123; Sample Variance 199.451
Kurtosis -1.145; Skewness 0.139
Range 53; Minimum 44; Maximum 97
Sum 3562; Count 50
Confidence Level (95.0%) 4.014
The average cost-to-income ratio per company is 71.24%.
68% of companies have a cost-to-income ratio of between 57.1% and 85.4% (within 1 std dev).
The data is reasonably symmetrical about the mean (skewness = 0.139).
Cost-to-income ratios ranged between 44% and 97% for the sample of public companies.
(b) Lower 95% Confidence Limit = 71.24 - 4.014 = 67.226
Upper 95% Confidence Limit = 71.24 + 4.014 = 75.254
Interpretation
There is a 95% chance that the average cost-to-income ratio of public companies is between 67.2% and 75.3%.
(c) Find P(x > 75) using "μ" = 71.24 and "σ" = 14.123
Standardise x = 75 to t-stat: t-stat = (75 - 71.24)/14.123 = 0.26623
Use Excel to find P(t > 0.26623): =TDIST(0.26623,49,1) returns 0.39559
Interpretation
39.6% of all public companies are likely to have a cost-to-income ratio in excess of the 75% rule of thumb.
(d) t-crit (0.95, 49) =TINV(0.05,49) returns 2.0096
Given standard error = 1.9970
Confidence level (95%) = 1.997 * 2.0096 = 4.0132

Sample size determination
Exercise 7.21
z = 1.96    e = 10    σ = 50    n = 96
Exercise 7.22
(a) z = 2.58    e = 0.1    σ = 1    n = 666
(b) z = 2.58    e = 0.15    σ = 1    n = 296
(c) z = 2.58    e = 0.2    σ = 1    n = 166
Exercise 7.23
z = 1.645    p = 0.5    e = 0.03    n = 752

CHAPTER 8 HYPOTHESIS TESTS: SINGLE POPULATION (MEANS, PROPORTIONS AND VARIANCES)

Exercise 8.1
To test whether a claim / statement made about a population parameter value is probably true or false, based on sample evidence.

Exercise 8.2
The "closeness" of the sample statistic to the claimed population parameter value.

Exercise 8.3
The Five Steps of Hypothesis Testing
Step 1: Define the statistical hypotheses (the null and alternative hypotheses).
Step 2: Determine the region of acceptance of the null hypothesis.
Step 3: Compute the sample test statistic.
Step 4: Compare the sample test statistic to the region of acceptance.
Step 5: Draw the statistical and management conclusions.

Exercise 8.4
Level of significance (α) (and sample size when the population standard deviation is unknown)

Exercise 8.5
Reject H0 in favour of H1 at the 5% level of significance.

Exercise 8.6
(i) H0: µ ≤ 560    H1: µ > 560
x(bar) = 577    σ = 86    n = 120    α = 0.05
(a) One-sided upper tailed test
Use z test statistic since σ is known.
Area of Acceptance: z ≤ 1.645
Read off 0.45 from z-table; or =NORMSINV(0.95) [using Excel]
(b) z-stat = (577 - 560)/(86/√120) = 2.165
Since z-stat (2.165) > z-crit (1.645), there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1. (i.e. Reject H0)
Conclude that the population mean value is significantly larger than 560.
(c) p-value = P(z > 2.165) = 0.5 - 0.4848 = 0.0152
From z-table; or =1-NORMSDIST(2.165)

(ii) H0: π ≥ 0.72    H1: π < 0.72
x = 216    n = 330    α = 0.10
Derive p = 216/330 = 0.6545
(a) One-sided lower tailed test
Use z test statistic for proportions
Area of Acceptance: z ≥ -1.28
Read off 0.4 from z-table; or =NORMSINV(0.1) [using Excel]
(b) z-stat = (0.6545 - 0.72)/√[(0.72)(1-0.72)/330] = -2.65
Since z-stat (-2.65) < z-crit (-1.28), there is sufficient sample evidence at the 10% level of significance to reject H0 in favour of H1. (i.e. Reject H0)
Conclude that the true population proportion is significantly less than 0.72.
(c) p-value = P(z < -2.65) = 0.5 - 0.496 = 0.0040
From z-table; or =NORMSDIST(-2.65)

(iii) H0: µ = 8.2    H1: µ ≠ 8.2
x(bar) = 9.6    s = 2.9    n = 30    α = 0.01
(a) Two-sided test
Use t test statistic since σ is unknown (only the sample standard deviation, s, is given) and n is small (n ≤ 30).
Area of Acceptance: -2.756 ≤ t ≤ +2.756
Read off t(0.005,29) from t-table; or =TINV(0.01,29) [using Excel]
(b) t-stat = (9.6 - 8.2)/(2.9/√30) = 2.644
Since t-stat (2.644) falls within the area of acceptance, there is insufficient sample evidence at the 1% level of significance to reject H0 in favour of H1. (i.e. Accept H0)
Conclude that the true mean value is equal to 8.2.
(c) p-value = 2 x P(t > 2.644) = 0.0131
=TDIST(2.644,29,2)

(iv) H0: µ ≥ 18    H1: µ < 18
x(bar) = 14.6    s = 3.4    n = 12    α = 0.01
(a) One-sided lower tailed test
Use t test statistic since σ is unknown (only the sample standard deviation, s, is given) and n is small (n < 30).
Area of Acceptance: t ≥ -2.718
Read off t(0.01,11) from t-table; or =TINV(0.02,11) [using Excel]
(b) t-stat = (14.6 - 18)/(3.4/√12) = -3.464
Since t-stat (-3.464) < t-crit (-2.718), there is sufficient sample evidence at the 1% level of significance to reject H0 in favour of H1. (Reject H0)
Conclude that the true mean value is significantly below 18.
(c) p-value = P(t < -3.464) = 0.0026
=TDIST(-(-3.464),11,1)

(v) H0: π = 0.32    H1: π ≠ 0.32
x = 68    n = 250    α = 0.05
Derive p = 68/250 = 0.272
(a) Two-sided test
Use z test statistic for proportions
Area of Acceptance: -1.96 ≤ z ≤ +1.96
Read off z(0.975) from z-table; or =NORMSINV(0.975) [Excel]
(b) z-stat = (0.272 - 0.32)/√[(0.32)(1-0.32)/250] = -1.627
Since z-stat (-1.627) falls within the area of acceptance, there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1. (i.e. Accept H0)
Conclude that the true population proportion is equal to 0.32.
(c) p-value = 2 x P(z < -1.627) = 0.1037
=2*NORMSDIST(-1.627)

Exercise 8.7
(a) H0: µ = 85    H1: µ ≠ 85    Two-sided test
(b) Use the z test statistic since σ is known (σ = 25 min (given))
(c) Use α = 0.05
Area of Acceptance: -1.96 ≤ z ≤ 1.96
Read off 0.475 from z-table; or =NORMSINV(0.975) [using Excel]
z-stat = (80.5 - 85)/(25/√132) = -2.068
Statistical conclusion
Since z-stat (-2.068) lies below -z-crit (-1.96), there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that the population mean value is significantly different from 85 minutes. Visitors to the Knysna shopping mall do not spend 85 minutes on average in the mall.
(d) p-value = 2 x P(z < -2.068) = 0.0386
=2*NORMSDIST(-2.068)
(e) Since p-value (0.0386) < α (0.05), there is strong sample evidence to conclude that visitors to the Knysna shopping mall do not spend 85 minutes, on average, in the mall.
By inspection of the sample mean, it appears that visitors to the Knysna shopping mall spend significantly less than 85 minutes in the mall.

Exercise 8.8
(a) H0: µ ≥ 30 min    H1: µ < 30 min    One-sided lower tailed test
(b) Use the z test statistic since σ is known (σ = 10.5 min (given))
(c) Use α = 0.01
Area of Acceptance: z ≥ -2.33
Read off 0.49 from z-table; or =NORMSINV(0.01) [using Excel]
z-stat = (27.9 - 30)/(10.5/√86) = -1.855
Statistical conclusion
Since z-stat (-1.855) lies above z-crit (-2.33), there is insufficient sample evidence at the 1% level of significance to reject H0 in favour of H1 (Do not reject H0).
Management conclusion
Conclude that the population mean value is at least 30 minutes. The supermarket manager's belief is therefore valid.
(d) p-value = P(z < -1.855) = 0.0318
=NORMSDIST(-1.855)
Since p-value (0.0318) > α (0.01), the sample evidence does not refute H0. The sample evidence is not strong enough to refute the belief that customers spend 30 minutes or more, on average, doing their purchases at the supermarket. Hence conclude that customers are likely to spend at least half an hour, on average, in the supermarket doing their grocery shopping.

Exercise 8.9
(a) H0: µ ≤ 72 hours    H1: µ > 72 hours    One-sided upper tailed test
(b) Use the z test statistic since σ is known (σ = 18 hours (given))
Use α = 0.10
Area of Acceptance: z ≤ 1.28
Read off 0.40 from z-table; or =NORMSINV(0.90) [using Excel]
z-stat = (75.9 - 72)/(18/√46) = 1.470
Statistical conclusion
Since z-stat (1.47) > z-crit (1.28), there is sufficient sample evidence at the 10% level of significance to reject H0 in favour of H1 (Reject H0).
Management conclusion
Conclude that the local importer's claim is valid.
Consignments are taking significantly longer than 72 hours to clear customs.
(c) p-value = P(z > 1.47) = 0.0708
=1-NORMSDIST(1.47)
Since p-value (0.0708) < α (0.10), this confirms support for H1. There is moderate sample evidence (relative to α = 0.10) to conclude that consignment clearance times, on average, are significantly longer than 72 hours.

Exercise 8.10
(a) Use the z test statistic since σ is known (σ = 14.7% (given))
(b) H0: µ ≤ 40%    H1: µ > 40%    One-sided upper tailed test
(c) Use α = 0.01
Area of Acceptance: z ≤ 2.33
Read off 0.49 from z-table; or =NORMSINV(0.99) [using Excel]
z-stat = (44.1 - 40)/(14.7/√76) = 2.431
Statistical conclusion
Since z-stat (2.431) > z-crit (2.33), there is sufficient sample evidence at the 1% level of significance to reject H0 in favour of H1 (Reject H0).
Management conclusion
Conclude that the Department of Health's concern is justified. The average markup is significantly greater than 40%.
(d) p-value = P(z > 2.431) = 0.0075
=1-NORMSDIST(2.431)
Since p-value (0.0075) < α (0.01), it confirms H1. There is overwhelming sample evidence to conclude that the average % markup is significantly greater than 40%.

Exercise 8.11
(a) Use the t test statistic since σ is unknown (only s = 21 gms is given)
H0: µ = 700 gms    H1: µ ≠ 700 gms    Two-sided test
Use α = 0.05 with degrees of freedom = (n - 1) = 63
Area of Acceptance: -2.00 ≤ t ≤ 2.00
Read off t(0.025,63) from t-table; or =TINV(0.05,63) [using Excel]
t-stat = (695 - 700)/(21/√64) = -1.905
Statistical conclusion
Since t-stat (-1.905) lies within the acceptance area, there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that Ryeband Bakery is both legally compliant and not wasting ingredients. The average weight of all white loaves produced by Ryeband Bakery is 700 gms.
(b) p-value = 2 x P(t < -1.905) = 0.0613
=TDIST(-(-1.905),63,2)
Since p-value (0.0613) > α (0.05), H0 cannot be rejected as the sample evidence is weak in favour of H1. Hence it can be concluded that, on average, the weight of white bread loaves produced by the Ryeband Bakery is 700 gms.
(c) H0: µ ≥ 700 gms    H1: µ < 700 gms    One-sided lower tailed test
Use α = 0.05 with degrees of freedom = (n - 1) = 63
Area of Acceptance: t ≥ -1.671
Read off t(0.05,63) from t-table; or =TINV(0.10,63) [using Excel]
t-stat = (695 - 700)/(21/√64) = -1.905
Statistical conclusion
Since t-stat (-1.905) < t-crit (-1.671), there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (i.e. Reject H0).
Management conclusion
Conclude that Ryeband Bakery is not legally compliant. The average weight of all white bread loaves produced by Ryeband Bakery is significantly below 700 gms.

Exercise 8.12
(a) Use the t test statistic since σ is unknown (only s = R788 is given)
H0: µ ≥ 5500    H1: µ < 5500    One-sided lower tailed test
Use α = 0.10 with degrees of freedom = (n - 1) = 17
Area of Acceptance: t ≥ -1.33
Read off t(0.10,17) from t-table; or =TINV(0.20,17) [using Excel]
t-stat = (5275 - 5500)/(788/√18) = -1.211
Statistical conclusion
Since t-stat (-1.211) lies within the acceptance area, there is insufficient sample evidence at the 10% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that mean weekly sales of the new pudding flavour are not less than R5500. The company should therefore not withdraw the product at this stage.
(b) p-value = P(t < -1.211) = 0.1212
=TDIST(-(-1.211),17,1)
Since p-value (0.1212) > α (0.10), H0 cannot be rejected as the sample evidence is weak in favour of H1. Hence it can be concluded that, on average, the weekly sales of the new pudding flavour are not less than R5500 and the product should not be withdrawn.
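The lower-tailed t tests in Exercises 8.11(c) and 8.12 share one pattern: compute t-stat = (x(bar) - µ0)/(s/√n) and compare it with the negative critical value read from the t-table. A minimal Python sketch of that pattern follows; the function names are illustrative (not from the manual), and the critical values are passed in by hand from the t-table because the Python standard library has no t distribution.

```python
from math import sqrt


def t_statistic(sample_mean, hypothesised_mean, s, n):
    """One-sample t statistic using the estimated standard error s/sqrt(n)."""
    return (sample_mean - hypothesised_mean) / (s / sqrt(n))


def lower_tailed_decision(t_stat, t_crit):
    """Decision rule for H1: mu < mu0, where t_crit is the (negative) table value."""
    return "Reject H0" if t_stat < t_crit else "Do not reject H0"


# Exercise 8.11(c): x(bar) = 695, mu0 = 700, s = 21, n = 64, t-crit = -1.671
t_811 = t_statistic(695, 700, 21, 64)
decision_811 = lower_tailed_decision(t_811, -1.671)

# Exercise 8.12: x(bar) = 5275, mu0 = 5500, s = 788, n = 18, t-crit = -1.33
t_812 = t_statistic(5275, 5500, 788, 18)
decision_812 = lower_tailed_decision(t_812, -1.33)

print(f"8.11(c): t-stat = {t_811:.3f}, {decision_811}")
print(f"8.12:    t-stat = {t_812:.3f}, {decision_812}")
```

Keeping the critical value as an explicit argument mirrors the manual's workflow, where it comes from the t-table or from =TINV in Excel.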
Exercise 8.13
(a) Use the t test statistic since σ is unknown (only s = 3.6 is given)
H0: µ ≤ 80 kg    H1: µ > 80 kg    One-sided upper tailed test
Use α = 0.05 and degrees of freedom = (n - 1) = 25
Area of Acceptance: t ≤ 1.708
Read off t(0.05,25) from t-table; or =TINV(0.10,25) [using Excel]
t-stat = (81.3 - 80)/(3.6/√26) = 1.841
Statistical conclusion
Since t-stat (1.841) > t-crit (1.708), it lies in the region of rejection. Therefore there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (Reject H0)
Management conclusion
Conclude that the mean tensile strength of the consignment of wire is more than 80 kg. Marathon Products should accept this consignment as it meets quality specs.
(b) p-value = P(t > 1.841) = 0.0388
=TDIST(1.841,25,1)
Since p-value (0.0388) < α (0.05), there is strong sample evidence to reject H0 in favour of H1. Hence it can be concluded that, on average, the tensile strength of the wire in the consignment exceeds 80 kg. Hence accept the consignment.

Exercise 8.14
(a) Use the t test statistic since σ is unknown (only s = 0.068 is given)
H0: µ ≥ 1    H1: µ < 1    One-sided lower tailed test
Use α = 0.05 and degrees of freedom = (n - 1) = 19
Area of Acceptance: t ≥ -1.729
Read off t(0.05,19) from t-table; or =TINV(0.10,19) [using Excel]
std error = 0.068/√20 = 0.0152
t-stat = (0.982 - 1)/(0.0152) = -1.1842
Statistical conclusion
Since t-stat (-1.1842) > t-crit (-1.729), it lies within the region of acceptance. There is therefore insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that the mean fill of one-litre milk containers is not less than 1 litre. The Consumer Council's claim that containers are being underfilled is not valid.
(b) p-value = P(t < -1.1842) = 0.1255
=TDIST(-(-1.1842),19,1)
Since p-value (0.1255) > α (0.05), H0 cannot be rejected as there is no sample evidence to support H1.
Hence it can be concluded that, on average, the mean fill of milk containers is at least 1 litre. Thus there is no statistical support for the claim.

Exercise 8.15
(a) Use the z test statistic for proportions
H0: π ≥ 0.30    H1: π < 0.30    One-sided lower tailed test
Use α = 0.05
Area of Acceptance: z ≥ -1.645
Read off z(0.45) from z-table; or =NORMSINV(0.05) [using Excel]
n = 400    x = 106
p = 106/400 = 0.265
std error = √[(0.3*0.7)/400] = 0.0229
z-stat = (0.265 - 0.3)/(0.0229) = -1.5284
Statistical conclusion
Since z-stat (-1.5284) > z-crit (-1.645), there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that 30% or more of listeners tune into the news broadcast of this station. The company should advertise in this radio station's news timeslots.
(b) p-value = P(z < -1.5284) = 0.0632
=NORMSDIST(-1.5284)
Since p-value (0.0632) > α (0.05), there is only weak sample evidence to reject H0 in favour of H1. (i.e. Accept H0)
The company can accept the radio station's claim as valid.
(c) p-value = P(z < -1.5284) = 0.0632
=NORMDIST(0.265,0.3,0.0229,1)

Exercise 8.16
(a) Use the z test statistic for proportions
H0: π ≤ 0.60    H1: π > 0.60
Use α = 0.05
Area of Acceptance: z ≤ 1.645
Read off z(0.45) from z-table; or =NORMSINV(0.95) [using Excel]
n = 150    x = 54
p = (150 - 54)/150 = 0.64
std error = √[(0.6*0.4)/150] = 0.04
z-stat = (0.64 - 0.6)/(0.04) = 1.000
Statistical conclusion
Since z-stat (1.00) < z-crit (1.645), there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
Management conclusion
Conclude that not more than 60% of Cape Town motorists do not have vehicle insurance. The motor vehicle advisor's claim is not valid.
(b) p-value = P(z > 1.000) = 0.1587
=1-NORMSDIST(1)
Since p-value (0.1587) > α (0.05), the sample evidence does not support H1. Hence the insurance advisor's claim has no statistical validity.
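The single-proportion z tests in Exercises 8.15 and 8.16 can be cross-checked with a few lines of Python. The sketch below is illustrative only: `proportion_z_test` is a hypothetical helper name, and it uses the unrounded standard error, so for Exercise 8.15 it gives z of about -1.5275 rather than the manual's -1.5284 (which is based on the rounded value 0.0229).

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p, n, pi0, tail):
    """z test for a single proportion.

    p    : sample proportion
    n    : sample size
    pi0  : hypothesised population proportion (from H0)
    tail : "lower", "upper" or "two-sided", matching H1
    Returns (z_stat, p_value).
    """
    std_error = sqrt(pi0 * (1 - pi0) / n)  # uses pi0, as in the manual
    z_stat = (p - pi0) / std_error
    cdf = NormalDist().cdf(z_stat)
    if tail == "lower":
        p_value = cdf
    elif tail == "upper":
        p_value = 1 - cdf
    else:
        p_value = 2 * min(cdf, 1 - cdf)
    return z_stat, p_value


# Exercise 8.15: p = 106/400, H0: pi >= 0.30, lower tailed
z_815, p_815 = proportion_z_test(106 / 400, 400, 0.30, "lower")
# Exercise 8.16: p = (150 - 54)/150, H0: pi <= 0.60, upper tailed
z_816, p_816 = proportion_z_test((150 - 54) / 150, 150, 0.60, "upper")

print(f"8.15: z = {z_815:.4f}, p-value = {p_815:.4f}")
print(f"8.16: z = {z_816:.4f}, p-value = {p_816:.4f}")
```

In both exercises the p-value exceeds α = 0.05, which agrees with the manual's decision not to reject H0.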
Exercise 8.17
(a) Use the z test statistic for proportions
H0: π ≤ 0.15    H1: π > 0.15
Use α = 0.10
Area of Acceptance: z ≤ 1.28
Read off z(0.40) from z-table; or =NORMSINV(0.90) [using Excel]
n = 560    x = 96
p = 96/560 = 0.1714
std error = √[(0.15*0.85)/560] = 0.0151
z-stat = (0.1714 - 0.15)/(0.0151) = 1.417
Statistical conclusion
Since z-stat (1.417) > z-crit (1.28), there is sufficient sample evidence at the 10% level of significance to reject H0 in favour of H1 (i.e. Reject H0)
Management conclusion
Conclude that the churn rate in the telecommunications industry exceeds 15%.
(b) p-value = P(z > 1.417) = 0.0782
=1-NORMSDIST(1.417)
Since p-value (0.0782) < α (0.10), there is moderate sample evidence to reject H0 in favour of H1.
The same management conclusion applies as in (a) above.
(c) p-value = P(z > 1.417) = 0.0782
=1-NORMDIST(0.1714,0.15,0.0151,1)

Exercise 8.18
(a) Use the z test statistic for proportions
H0: π ≥ 0.90    H1: π < 0.90
Use α = 0.01
Area of Acceptance: z ≥ -2.33
Read off z(0.49) from z-table; or =NORMSINV(0.01) [using Excel]
n = 300    x = 260
p = 260/300 = 0.8667
std error = √[(0.9*0.1)/300] = 0.0173
z-stat = (0.8667 - 0.9)/(0.0173) = -1.9249
Statistical conclusion
Since z-stat (-1.9249) > z-crit (-2.33), there is insufficient sample evidence at the 1% level of significance to reject H0 in favour of H1 (i.e. Do not reject H0)
Management conclusion
Conclude that the germination rate of the barley seed is at least 90%. The cooperative can accept the seed merchant's claim and can justify buying the barley seed from this merchant.
(b) p-value = P(z < -1.9249) = 0.0271
=NORMSDIST(-1.9249)
Since p-value (0.0271) > α (0.01), the sample evidence is too weak at the 1% level of significance to reject H0 in favour of H1. Hence do not reject H0.
The same management conclusion applies as in (a) above.
(c) p-value = P(z < -1.9249) = 0.0271
=NORMDIST(0.8667,0.9,0.0173,1)

Exercise 8.19
File: X8.19 - cost-to-income.xlsx
Use the t test statistic since σ is unknown
H0: µ ≥ 75%    H1: µ < 75%
Use α = 0.05 and degrees of freedom = (n - 1) = 49
Read off t(0.05,49) from t-table; or =TINV(0.10,49) [using Excel]
Area of Acceptance: t ≥ -1.676
std error = 1.9973
t-stat = (71.24 - 75)/(1.9973) = -1.8825
Cost-to-income (%)
Mean 71.24; Standard Error 1.9973; Median 68; Mode 84
Standard Deviation 14.1227; Sample Variance 199.451
Kurtosis -1.1450; Skewness 0.1391
Range 53; Minimum 44; Maximum 97
Sum 3562; Count 50
Statistical conclusion
Since t-stat (-1.8825) < t-crit (-1.676), there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (i.e. Reject H0)
Management conclusion
Conclude that the mean cost-to-income ratio of JSE companies is less than 75%. JSE companies are therefore adhering to the rule of thumb.

Exercise 8.20
(a) File: X8.20 - kitchenware.xlsx
Normality assumption check
Sales values: Count
≤ 75: 5
76 - ≤ 100: 6
101 - ≤ 125: 8
126 - ≤ 150: 11
151 - ≤ 175: 9
176 - ≤ 200: 6
201 - ≤ 225: 4
> 225: 1
The distribution appears to be normal.
(b) Use the t test statistic since σ is unknown
H0: µ ≥ R150    H1: µ < R150
Use α = 0.05 and degrees of freedom = (n - 1) = 49
Read off t(0.05,49) from t-table; or =TINV(0.10,49) [using Excel]
Area of Acceptance: t ≥ -1.676
std error = 6.657
t-stat = (137.12 - 150)/(6.657) = -1.9348
Sales Transaction value
Mean 137.12; Standard Error 6.657; Median 138.5; Mode 184
Standard Deviation 47.074; Sample Variance 2215.944
Kurtosis -0.741; Skewness 0.022
Range 190; Minimum 52; Maximum 242
Sum 6856; Count 50
Statistical conclusion
Since t-stat (-1.9348) < t-crit (-1.676), there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (Reject H0).
Management conclusion
Conclude that the mean transaction value at the Claremont branch is significantly below R150.
Management is advised to close the Claremont branch since it is unprofitable.

Exercise 8.21
(a) File: X8.21 - flight delays.xlsx
Normality assumption check
Delays: Count
≤ 5: 0
5 - ≤ 7.5: 8
7.6 - ≤ 10: 28
10.1 - ≤ 12.5: 29
12.6 - ≤ 15: 13
15.1 - ≤ 17.5: 2
> 17.5: 0
[Histogram of Flight Delay Times: No. of Flights by delay time interval]
The histogram distribution appears normal. The assumption is satisfied.
(b) Use the t test statistic since σ is unknown
H0: µ ≤ 10 min    H1: µ > 10 min
Use α = 0.10 and degrees of freedom = (n - 1) = 79
Read off t(0.10,79) from t-table; or =TINV(0.20,79) [using Excel]
Area of Acceptance: t ≤ 1.292
std error = 0.261
t-stat = (10.324 - 10)/(0.261) = 1.241
Flight delays
Mean 10.324; Standard Error 0.261; Median 10.3; Mode 11
Standard Deviation 2.333; Sample Variance 5.444
Kurtosis -0.550; Skewness 0.084
Range 10.3; Minimum 5.3; Maximum 15.6
Sum 825.9; Count 80
Statistical conclusion
Since t-stat (1.241) < t-crit (1.292), there is insufficient sample evidence at the 10% level of significance to reject H0 in favour of H1 (i.e. Accept H0).
Management conclusion
Conclude that flight delay times, on average, do not exceed 10 minutes. ACASA management do not need to conduct an in-depth investigation into flight delays.

Exercise 8.22
(a) File: X8.22 - medical claims.xls
Normality assumption check
Claims intervals: Count
≤ 100: 5
101 - ≤ 125: 3
126 - ≤ 150: 9
151 - ≤ 175: 21
176 - ≤ 200: 25
201 - ≤ 225: 18
226 - ≤ 250: 8
251 - ≤ 300: 11
[Histogram of Daily Claims Processed: No. of Claims by claims processed interval]
The distribution appears to be reasonably normal.
The assumption is satisfied.
(b) Use the t test statistic since σ is unknown
H0: µ ≤ 180 claims    H1: µ > 180 claims
Use α = 0.01 and degrees of freedom = (n - 1) = 99
Read off t(0.01,99) from t-table; or =TINV(0.02,99) [using Excel]
Area of Acceptance: t ≤ 2.365
std error = 4.568
t-stat = (190.39 - 180)/(4.568) = 2.275
Medical Claims
Mean 190.39; Standard Error 4.568; Median 190; Mode 210
Standard Deviation 45.680; Sample Variance 2086.685
Kurtosis -0.092; Skewness 0.096
Range 199; Minimum 92; Maximum 291
Sum 19039; Count 100
Statistical conclusion
Since t-stat (2.275) < t-crit (2.365), there is insufficient sample evidence at the 1% level of significance to reject H0 in favour of H1 (Thus accept H0).
Management conclusion
Conclude that the average number of claims processed daily does not exceed 180. Thus the supervisor has no statistical grounds for requesting additional staff.

Exercise 8.23
File: X8.23 - newspaper readership.xlsx
Guardian Newspaper Claim
(a) One-Way Pivot Table (Frequency Table) and Bar Chart
Tabloid: Count, %
Sun: 19, 15.8%
Guardian: 42, 35.0%
Mail: 31, 25.8%
Voice: 28, 23.3%
Total: 120, 100%
[Bar Chart of Tabloid Readership: % of Readers by tabloid]
Interpretation
Based on the sample of 120 tabloid readers, the Guardian newspaper has the largest share at 35%. The Sun has the lowest percentage of readers at 16%.
(b) Test the claim that the Guardian newspaper has at least a 40% market share.
Use the z test statistic for proportions
H0: π ≥ 0.40    H1: π < 0.40
(c) Use α = 0.05
Area of Acceptance and test statistic (n, x, p, std error, z-stat):
Area of Acceptance: z ≥ -1.645
Read off z(0.45) from z-table; or =NORMSINV(0.05) [using Excel]
n = 120    x = 42
p = 42/120 = 0.35
std error = √[(0.4*0.6)/120] = 0.0447
z-stat = (0.35 - 0.4)/(0.0447) = -1.1186
Statistical conclusion
Since z-stat (-1.1186) > z-crit (-1.645), there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1 (Thus accept H0)
Management conclusion
Hence conclude that the Guardian's market share is at least 40%. The Guardian newspaper's claim is justified statistically.

Exercise 8.24
(a) File: X8.24 - citrus products.xlsx
One-Way Pivot Table (Frequency Table) and Bar Chart
Awareness: Count, %
Low: 72, 42.35%
Moderate: 64, 37.65%
High: 34, 20%
Total: 170, 100%
[Bar Chart of Awareness Levels: % of respondents by awareness level]
Interpretation
80% of sampled consumers have a low or moderate awareness. Only 20% of the sampled consumers indicated a high awareness of the nutritional benefits of citrus.
(b) (i) Use the z test statistic for proportions
H0: π ≤ 0.15    H1: π > 0.15
(b) (ii) Use α = 0.01
Area of Acceptance: z ≤ 2.326
Read off z(0.49) from z-table; or =NORMSINV(0.99) [using Excel]
n = 170    x = 34
p = 34/170 = 0.2
std error = √[(0.15*0.85)/170] = 0.0274
z-stat = (0.20 - 0.15)/(0.0274) = 1.825
Statistical conclusion
Since z-stat (1.825) < z-crit (2.326), there is insufficient sample evidence at the 1% level of significance to reject H0 in favour of H1 (i.e. Accept H0)
(b) (iii) Management conclusion
Since the level of high consumer awareness does not exceed 15%, it is recommended that Fruitco should launch a national awareness campaign.

Exercise 8.25
(a) File: X8.25 - aluminium scrap.xlsx
Histogram of Daily % Scrap of Machine
Daily % Scrap: Count
≤ 2.8: 6
2.8 - ≤ 3.2: 12
3.2 - ≤ 3.6: 13
3.6 - ≤ 4.0: 11
4.0 - ≤ 4.4: 5
4.4 - ≤ 4.8: 3
Total: 50
[Histogram of Machine Daily % Scrap]
(Histogram axes: No. of days against Scrap % intervals)
Interpretation
The assumption of normality is largely satisfied. The histogram is only moderately skewed to the right.
(b) 95% Confidence Limits for Machine
Machine - Daily % Scrap
Mean 3.483; Standard Error 0.0723; Median 3.41; Mode 2.97
Standard Deviation 0.511; Sample Variance 0.261
Kurtosis -0.851; Skewness 0.331
Range 1.83; Minimum 2.67; Maximum 4.5
Sum 174.14; Count 50
Confidence Level (95.0%) 0.145
Lower 95% confidence limit = 3.483 - 0.145 = 3.34
Upper 95% confidence limit = 3.483 + 0.145 = 3.63
There is a 95% chance that the average daily % scrap produced by the machine lies between 3.34% and 3.63%.
(c) Let μ = population mean daily % scrap produced by the machine.
Use the t test statistic since σ is unknown (only s is given)
H0: μ ≥ 3.75%    H1: μ < 3.75%    One-sided lower tailed test
Region of Acceptance
Use α = 0.05 with df = (50 - 1) = 49
t-crit = t(0.05)(49) = -1.677    =TINV(0.1,49)
Decision rule
Do not reject H0 in favour of H1 if -1.677 ≤ t-stat
t-stat
std error = 0.0723 (see Table above)
t-stat = (3.483 - 3.75)/0.0723 = -3.693
Statistical conclusion
Since t-stat (-3.693) lies well within the region of rejection, there is strong sample evidence at the 5% significance level to reject H0 in favour of H1. (Thus reject H0).
Management conclusion
Conclude that the average daily % scrap produced by the machine is less than 3.75%. The machine is not yet due for a full maintenance service.

Exercise 8.26
Hypothesis Test for a Single Population Variance, σ²
5m pipe length variability analysis
Problem Characteristics
Variable: x = length of pipe (specification = 5m)
Data type: numerical, ratio scaled, continuous.
Data
n = 26 pipes     s = 3.46     σ0 = 3 (σ0² = 9)     α = 0.05

Step 1
H0: σ² ≤ 9     Management question in H0
H1: σ² > 9     One-sided upper-tailed test

Step 2     Region of Non-Rejection
Upper χ²-crit = 37.652     (Use Chi-Square table, df = 26 - 1 = 25)
Do not reject H0 if χ²-stat ≤ 37.652

Step 3     Sample test statistic χ²-stat (Formula 8.4)
χ²-stat = (n - 1)s²/σ0² = (25)(3.46²)/9 = 33.254
Also p-value = 0.12     Use CHISQ.DIST.RT(x, df)

Steps 4, 5
Statistical Conclusion
Since χ²-stat (33.254) < upper χ²-crit (37.652) (i.e. it lies in the region of non-rejection of H0), do not reject H0 at the 5% level of significance.
Management Conclusion
The production manager can be 95% confident that the variation in pipe lengths is within the limits of the product specification.

Exercise 8.27
Hypothesis Test for a Single Population Variance, σ²

Problem Characteristics
Variable x = unknown - but numeric
Data type: numerical, ratio scaled, continuous

Data
n = 20     s² = 49.3     σ0² = 30     α = 0.10

Step 1
H0: σ² ≤ 30
H1: σ² > 30     One-sided upper-tailed test

Step 2     Region of Acceptance
Upper χ²-crit = 27.204     (Use Chi-Square table, df = 20 - 1 = 19)
Do not reject H0 if χ²-stat ≤ 27.204

Step 3     Sample test statistic χ²-stat (Formula 8.4)
χ²-stat = (n - 1)s²/σ0² = (19)(49.3)/30 = 31.223
Also p-value = 0.0382     Use CHISQ.DIST.RT(x, df)

Steps 4, 5
Statistical Conclusion
Since χ²-stat (31.223) > χ²-crit (27.204) (i.e. it lies in the region of rejection of H0), reject H0 at the 10% level of significance.
Management Conclusion
We are 90% confident that the population variance is significantly greater than the specified value of 30.
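The upper-tailed variance tests in Exercises 8.26 and 8.27 can be re-checked outside Excel. The following is a minimal Python sketch, not part of the original Excel-based solution. The function name chi_square_stat is hypothetical, and the critical values are copied from the worked solutions rather than computed.

```python
def chi_square_stat(n, sample_variance, hypothesised_variance):
    """Chi-square test statistic for one variance: (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * sample_variance / hypothesised_variance


# Exercise 8.26: n = 26, s = 3.46, sigma0^2 = 9, upper chi-square-crit = 37.652
stat_826 = chi_square_stat(26, 3.46 ** 2, 9)
reject_826 = stat_826 > 37.652  # upper-tailed test: reject H0 only above crit

# Exercise 8.27: n = 20, s^2 = 49.3, sigma0^2 = 30, upper chi-square-crit = 27.204
stat_827 = chi_square_stat(20, 49.3, 30)
reject_827 = stat_827 > 27.204

print(round(stat_826, 3), reject_826)
print(round(stat_827, 3), reject_827)
```

The sketch keeps the critical values as literals because the Python standard library has no chi-square quantile function; a statistics package would be needed to derive them.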
Exercise 8.28
Hypothesis Test for a Single Population Variance, σ²
Insurance claim values variability analysis

Problem Characteristics
Variable x = claim values (in Rand)
Data type: numerical, ratio scaled - implies means and standard deviations

Data
n = 32 claims     s = 84     σ0² = 5625     α = 0.05

Step 1
H0: σ² = 5625     Management question in H0
H1: σ² ≠ 5625     Two-tailed test

Step 2     Region of Non-Rejection     (Use Chi-Square table, df = 32 - 1 = 31)
Lower χ²-crit = 17.539
Upper χ²-crit = 48.232
Do not reject H0 if 17.539 ≤ χ²-stat ≤ 48.232

Step 3     Sample test statistic χ²-stat (Use Formula 8.4)
χ²-stat = (n - 1)s²/σ0² = (31)(84²)/5625 = 38.886
Also p-value = 0.1561     Use CHISQ.DIST.RT(x, df)

Steps 4, 5
Statistical Conclusion
Since lower χ²-crit (17.539) < χ²-stat (38.886) < upper χ²-crit (48.232) (i.e. it lies in the region of non-rejection of H0), do not reject H0 at the 5% level of significance.
Management Conclusion
The insurance industry can be 95% confident that the variation in claim values is still σ² = 5625.

---ooOoo---

Exercise 8.29
File: X8.29 - pain relief.xlsx

Problem Characteristics
Variable x = time to pain relief (in minutes)
Data type: numerical, ratio scaled, continuous

Data
n = 16 patients     s = 0.9004     σ0² = 1.8     α = 0.01

Step 1
H0: σ² ≥ 1.8     Management question in H0
H1: σ² < 1.8     One-sided lower-tailed test

Step 2     Region of Non-Rejection
Lower χ²-crit = 5.229     (Use Chi-Square table, df = 16 - 1 = 15)
Do not reject H0 if χ²-stat ≥ 5.229

Step 3     Sample test statistic χ²-stat (Use Formula 8.4)
χ²-stat = (n - 1)s²/σ0² = (15)(0.9004²)/1.8 = 6.756
Also p-value = 0.0359     Use CHISQ.DIST(x, df, TRUE)

Steps 4, 5
Statistical Conclusion
Since χ²-stat (6.756) > lower χ²-crit (5.229) (i.e. it lies in the region of non-rejection of H0), do not reject H0 at the 1% level of significance.
Management Conclusion
The pharmaceutical company can be 99% confident that the variation in time to pain relief from the new headache pill is not significantly less than that of the current headache pill (σ² = 1.8). That is, the new pill does not significantly reduce variation in time to pain relief.
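The two-tailed and lower-tailed decision rules in Exercises 8.28 and 8.29 differ only in which side of the non-rejection region is bounded. A minimal Python sketch of that check follows. The helper in_non_rejection_region is a hypothetical name, and the critical values are taken as given from the Chi-Square table values quoted in the solutions.

```python
def in_non_rejection_region(stat, lower=None, upper=None):
    """True when stat lies inside [lower, upper]; a bound of None is open-ended."""
    if lower is not None and stat < lower:
        return False
    if upper is not None and stat > upper:
        return False
    return True


# Exercise 8.28 (two-tailed): n = 32, s = 84, sigma0^2 = 5625
stat_828 = (32 - 1) * 84 ** 2 / 5625
keep_h0_828 = in_non_rejection_region(stat_828, lower=17.539, upper=48.232)

# Exercise 8.29 (lower-tailed): n = 16, s = 0.9004, sigma0^2 = 1.8
stat_829 = (16 - 1) * 0.9004 ** 2 / 1.8
keep_h0_829 = in_non_rejection_region(stat_829, lower=5.229)

print(round(stat_828, 3), keep_h0_828)
print(round(stat_829, 3), keep_h0_829)
```

An upper-tailed test would pass only the upper bound, so the same helper covers all three test directions.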
CHAPTER 9
HYPOTHESIS TESTS
COMPARISON BETWEEN TWO POPULATIONS (MEANS, VARIANCES AND PROPORTIONS)

Exercise 9.1
When the population standard deviations of the two populations are unknown.

Exercise 9.2
Use Formula 9.1
z-stat = [(72 - 66) - 0]/√(20²/40 + 10²/50)
z-stat = 1.7321

Exercise 9.3
(a) t-crit (0.05,38) = 2.024     Refer to Appendix 1 Table 2
(b) t-crit (0.05,90) = 1.987     Refer to Appendix 1 Table 3

Exercise 9.4
When the two samples are not independent.

Exercise 9.5
H0: π1 ≥ π2
H1: π1 < π2     One-sided lower-tailed test.

Exercise 9.6
(a) Test for equality of variances
H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2     Two-tailed test
F-crit = F(0.05/2, 23, 18) = 2.50     (from Table 4(b), Appendix 1)
Using Excel: =F.INV.RT(0.025,23,18) = 2.515
Decision rule: Do not reject H0 if F-stat ≤ 2.50
F-stat = 4.14²/3.32² = 1.555     Use rule of larger variance in numerator
Since F-stat (1.555) < F-crit (2.515), do not reject H0 in favour of H1.
Conclude, at α = 0.05, that the two population variances are equal.
Therefore, use the pooled-variances t-test to test for differences in means.

(b) Test for difference between two population means
Let population 1 = Manufacturers
Let population 2 = Retailers
Let μi = population mean earnings yield (%) per sector i.
Use the t test statistic since the σi's are unknown (only s1 and s2 are given)

(c) H0: μ1 = μ2
H1: μ1 ≠ μ2     Two-tailed test

Region of Acceptance
Use α = 0.05 with df = (19+24-2) = 41
t-crit = t(0.05)(41) = 2.021
Decision rule
Do not reject H0 in favour of H1 if -2.021 ≤ t-stat ≤ 2.021

t-stat
s² (pooled variance) = ((19-1)3.32² + (24-1)4.14²)/(19+24-2) = 14.454
std error = √(14.454*(1/19+1/24)) = 1.16747
t-stat = ((8.45-10.22)-(0))/1.16747 = -1.51609

Conclusion
Since t-stat (-1.51609) lies within the region of acceptance, there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1. (i.e. Accept H0).
Conclude that there is no difference in the mean earnings yield (%) between manufacturing companies and retail companies.
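The "larger variance in numerator" rule used in Exercise 9.6 (a) also fixes the order of the degrees of freedom in the F-crit lookup. A minimal Python sketch under that reading follows. The function name variance_ratio is hypothetical, and the critical value is the Excel figure quoted in the solution rather than one computed here.

```python
def variance_ratio(s1, n1, s2, n2):
    """Return (F-stat, numerator df, denominator df) with the larger variance on top."""
    var1, var2 = s1 ** 2, s2 ** 2
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1


# Exercise 9.6: manufacturers (s = 3.32, n = 19), retailers (s = 4.14, n = 24)
f_stat, df_num, df_den = variance_ratio(3.32, 19, 4.14, 24)

F_CRIT = 2.515  # quoted in the solution from =F.INV.RT(0.025,23,18)
equal_variances = f_stat <= F_CRIT  # True means the pooled-variances t-test applies

print(round(f_stat, 3), df_num, df_den, equal_variances)
```

Returning the degrees of freedom alongside the ratio makes it harder to look up F-crit with the numerator and denominator df swapped.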
Exercise 9.7 (a) Let population 1 = DIY Consumers Let population 2 = Non-DIY Consumers Let μi = population mean age of each consumer group i . Use the t test statistic since σi's are unknown (only s1 and s2 are given) (b) H0: μ1 ≥ μ2 H1: μ1 < μ2 One sided lower tailed test Region of Acceptance t-crit = Use α = 0.10 with df = (29+34-2) = 61 t(0.10)(61) = -1.296 Decision rule Do not reject H0 in favour of H1 if -1.296 ≤ t-stat t-stat s² (pooled variance) = ((29-1)15.9²+(34-1)16.2²)/(29+34-2) 258.0197 std error = √(258.0197*(1/29+1/34)) 4.060301 t-stat = ((41.8-47.4)-(0))/4.0603 -1.379 (c) Statistical conclusion Since t-stat (-1.379) lies outside (below) the region of acceptance, there is sufficient sample evidence at the 10% level of significance to reject H0 in favour of H1. (i.e. Reject H0). Management conclusion Conclude that the mean age of DIY consumers is significantly lower than the mean age of non-DIY consumers at the 10% level of significance. (d) Region of Acceptance (new) t-crit = Decision rule Use α = 0.05 with df = (29+34-2) = 61 t(0.05)(61) = -1.671 Do not reject H0 in favour of H1 if -1.671 ≤ t-stat Statistical conclusion t-stat (-1.379) now lies within the (new) region of acceptance. Thus there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1. (i.e. Accept H0). Management conclusion Conclude that there is no significant difference in the mean ages between DIY consumers and non-DIY consumers. Exercise 9.8 (a) Let population 1 = Bus commuters Let population 2 = Train commuters Let μi = population mean commuting time for each transport mode i . 
Use the t test statistic since σi's are unknown (only s1 and s2 are given) (b) H0: μ1 ≤ μ2 H1: μ1 > μ2 One sided upper tailed test Test for equality of variances 2 1 =σ 2 2 H0: σ H1: σ²1 ≠ σ²2 Two tailed test F-cri t = F(0.05/2, 21,35) = 2.10 Decision rule: Do not reject H0 if F-stat ≤ 2.10 (from Table 4(b), Appendix 1) F-stat = Use rule of larger variance in numerator = 7.8²/4.6² = 2.875 =F.INV.RT(0.025,21,35) 2.105 Since F-stat (2.875) > F-crit (2.10), reject H0 in favour of H1. Conclude,at α = 0.05, that the two population variances are different. Therefore, use the unequal-variances t -test to test for differences in means. (c) Region of Acceptance df = 30 (Use df formula 9.9) Use α = 0.01 with df = 30 t-crit = t(0.01)(30) = df (num) = 11.14938 df (denom) =0.374049 2.75 Decision rule Do not reject H0 in favour of H1 if t-stat ≤ 2.75 t-stat std error = √(7.82/22 + 4.62/36) 1.680545 t-stat = ((35.3-31.8)-(0))/1.6805 2.083 Statistical conclusion Since t-stat (2,083) lies within the region of acceptance, there is insufficient sample evidence at the 1% level of significance to reject H0 in favour of H1. (i.e. Accept H0). (d) Management conclusion Conclude that there is no difference in the mean commuting times between bus and train commuters. Recommendation Since there is no difference in mean commuting times, either mode of transport can be prioritised for upgrading (or both can be upgraded simultaneously). Exercise 9.9 Let population 1 = Mastercard users Let population 2 = Visa Card users Let μi = population mean month-end credit card balance (in Rands) for each card type i . 
Use the z test statistic since σi's are known H0: μ1 = μ2 H1: μ1 ≠ μ2 Region of Acceptance z-crit = Two sided test Use α = 0.05 z(0.05) = 1.96 Decision rule Do not reject H0 in favour of H1 if -1.96 ≤ z-stat ≤ 1.96 z-stat std error = √(294²/45+336²/66) 60.26065 z-stat = ((922-828)-(0))/60.26065 1.5599 Statistical conclusion Since z-stat (1.5599) lies within the region of acceptance, there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0). Management conclusion Conclude that there is no difference in the mean month-end credit card balances between Mastercard holders and Visa Card holders. Exercise 9.10 Let population 1 = non-attendees of job enrichment workshops Let population 2 = attendees of job enrichment workshops Let μi = population mean job satisfaction rating for each employee category i . Use the z test statistic since σi's are known H0: μ1 ≥ μ2 One-sided lower tailed test H1: μ1 < μ2 ← non-attendees have lower job satisfaction than attendees. Region of Acceptance z-crit = Use α = 0.05 z(0.05) = -1.645 Decision rule Do not reject H0 in favour of H1 if -1.645 ≤ z-stat z-stat std error = √(1.1²/22+0.8²/25) 0.283901 z-stat = ((6.9-7.5)-(0))/0.283901 -2.1134 Statistical conclusion Since z-stat (-2.1134) lies outside (below) the region of acceptance, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Reject H0). Management conclusion Conclude that the mean job satisfaction score for non-attendees is significantly lower than the mean job satisfaction score for job enrichment attendees. Thus, the statistical evidence supports the view that the job enrichment workshops significantly increased job satisfaction levels of sales consultants. 
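The two-sample z-tests of Exercises 9.9 and 9.10 follow the same three steps: standard error, z-stat, comparison with z-crit. A minimal Python sketch of those steps follows, using only the standard library. The function name two_sample_z is hypothetical, and NormalDist is used here in place of the z-table or NORMSINV.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, sigma1, n1, mean2, sigma2, n2):
    """z-stat for the difference between two means with known sigmas."""
    std_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (mean1 - mean2) / std_error


# Exercise 9.9 (two-sided, alpha = 0.05): Mastercard vs Visa balances
z_99 = two_sample_z(922, 294, 45, 828, 336, 66)
z_crit_two_sided = NormalDist().inv_cdf(1 - 0.05 / 2)  # about 1.96
reject_99 = abs(z_99) > z_crit_two_sided

# Exercise 9.10 (lower-tailed, alpha = 0.05): non-attendees vs attendees
z_910 = two_sample_z(6.9, 1.1, 22, 7.5, 0.8, 25)
z_crit_lower = NormalDist().inv_cdf(0.05)  # about -1.645
reject_910 = z_910 < z_crit_lower

print(round(z_99, 4), reject_99)
print(round(z_910, 4), reject_910)
```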
Exercise 9.11
(a) 95% Confidence Limits - Explorer Fund only
Std error = √(2.3²/15) = 0.5939
z(0.95) = 1.96
Lower 95% confidence limit = 12.4 - 1.96(0.5939) = 11.24 days
Upper 95% confidence limit = 12.4 + 1.96(0.5939) = 13.56 days
We can be 95% confident that the true average time to settlement for claims lodged against the Explorer Fund lies between 11.24 days and 13.56 days.

(b) Let population 1 = Green-Aid Medical Fund
Let population 2 = Explorer Medical Fund
Let μi = population mean time to settlement of claims by each medical fund i.
Use the z test statistic since the σi's are known
H0: μ1 ≥ μ2     One-sided lower-tailed test
H1: μ1 < μ2     ← Green-Aid Fund settles quicker than Explorer Fund

Region of Acceptance
(Use α = 0.05)     z-crit = z(0.05) = -1.645
Decision rule
Do not reject H0 in favour of H1 if -1.645 ≤ z-stat

z-stat
std error = √(3.2²/14+2.3²/15) = 1.041199
z-stat = ((10.8-12.4)-(0))/1.041199 = -1.5367

Statistical conclusion
Since z-stat (-1.5367) lies within the region of acceptance (or p-value = 0.0622 > α = 0.05) (see (iii) below), there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
There is no difference in the mean claims settlement time between the two Funds. Thus, the statistical evidence does not support the view that the Green-Aid Medical Fund settles claims sooner, on average, than the Explorer Medical Fund.

Exercise 9.12
Let population 1 = Gas ovens
Let population 2 = Electric ovens
Let μi = population mean baking time for each oven type i.
Use the t test statistic since σi's are unknown (only s1 and s2 are given) H0: μ1 ≥ μ2 One sided lower tailed test H1: μ1 < μ2 ← gas ovens bake faster than electric ovens Region of Acceptance t-crit = Use α = 0.05 with df = (5+5-2) = 8 t(0.05)(8) = -1.86 Decision rule Do not reject H0 in favour of H1 if t-stat ≥ -1.86 t-stat s² (pooled variance) = ((5-1)0.16²+(5-1)0.09²)/(5+5-2) 0.01685 std error = √(0.01685*(1/5+1/5)) 0.08210 t-stat = ((0.75-0.89)-(0))/0.0821 -1.705 Statistical conclusion Since t-stat (-1.705) lies within the region of acceptance, there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0). Management conclusion Conclude that there is no difference in the mean bread baking time between gas and electric ovens, at the 5% significance level. Gas ovens are therefore, not faster, on average, than electric ovens. Exercise 9.13 (a) Test for equality of variances H0: σ21 = σ22 H1: σ²1 ≠ σ²2 Rule of thumb test: F-stat < 3? = 152.2²/121.5² = F-stat = 1.569 Since F-stat (1.569) < 3, do not reject H0 in favour of H1. Conclude that the two population variances are equal. Therefore, use the pooled-variances t -test to test for differences in means. (b) Let population 1 = Cape Town branch Let population 2 = Durban branch Let μi = population mean size of orders received by each branch i . 
Use the t test statistic since σi's are unknown (only s1 and s2 are given) H0: μ1 ≤ μ2 H1: μ1 > μ2 One sided upper tailed test ← CT branch performing better than Durban branch t-crit = Use α = 0.10 with df = (18+15-2) = 31 t(0.10)(31) = 1.309 Region of Acceptance Do not reject H0 in favour of H1 if t-stat ≤ 1.309 Decision rule t-stat s² (pooled variance) = ((18-1)121.5²+(15-1)152.2²)/(18+15-2) 18556.97 std error = √(18556.97*(1/18+1/15)) 47.6243 t-stat = ((335.2-265.6)-(0))/47.6243 1.4614 Statistical conclusion Since t-stat (1.4614) lies outside (above) the region of acceptance, there is sufficient sample evidence at the 10% significance level to reject H0 in favour of H1 (i.e. Reject H0). Management conclusion Conclude that the mean size of orders received by the Cape Town branch is significantly larger than the mean size of orders received by the Durban branch. Thus the Cape Town branch is performing better than the Durban branch in terms of average order size. (c) Region of Acceptance (new) t-crit = Decision rule (new) Use α = 0.05 with df = (18+15-2) = 31 t(0.05)(31) = 1.696 Do not reject H0 in favour of H1 if t-stat ≤ 1.696 Statistical conclusion t-stat (= 1.4614) now lies within the (new) region of acceptance. Thus there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1. (i.e. Accept H0). Management conclusion Conclude that there is no significant difference in the mean order sizes between the Cape Town branch and the Durban branch. Thus there is no evidence to believe that the Cape Town branch is performing better than the Durban branch in terms of average order size. (d) Findings based on the 5% significance level are more meaningful than those based on the 10% significance level because it requires stronger (more convincing) sample evidence before tests conducted at 5% are prepared to reject the null hypothesis. 
The operations manager can be more confident that there is no difference in mean performance between the two branches (conclusion based on (c)).

Exercise 9.14
File: X9.14 - package designs.xlsx

First, test for equality of variances
H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2     Two-tailed test
F-crit = F(0.05/2, 7, 7) = 4.99     (from Table 4(b), Appendix 1)
Using Excel: =F.INV.RT(0.025,7,7) = 4.995
Decision rule: Do not reject H0 if F-stat ≤ 4.99
F-stat = 5.706²/4.862² = 1.377     Use rule of larger variance in numerator
Since F-stat (1.377) < F-crit (4.99), do not reject H0 in favour of H1.
Conclude, at α = 0.05, that the two population variances are equal.
Therefore, use the pooled-variances t-test to test for differences in means.

Now conduct the t-test for equal means, using the pooled-variances t-test approach.
Let population 1 = Pyramid-shaped carton
Let population 2 = Barrel-shaped carton
Let μi = population mean sales volume of one-litre cartons for each carton shape i.
Use the t test statistic since the σi's are unknown (only s1 and s2 are given)
H0: μ1 ≥ μ2     One-sided lower-tailed test
H1: μ1 < μ2     'Pyramid' sales are less than 'Barrel' sales

Region of Acceptance
Use α = 0.05 with df = (8+8-2) = 14
t-crit = t(0.05)(14) = -1.761
Decision rule
Do not reject H0 in favour of H1 if -1.761 ≤ t-stat

t-stat
s² (pooled variance) = ((8-1)4.862²+(8-1)5.706²)/(8+8-2) = 28.09874
std error = √(28.09874*(1/8+1/8)) = 2.650412
t-stat = ((23.75-27.375)-(0))/2.650412 = -1.3677

Statistical conclusion
Since t-stat (-1.3677) lies within the region of acceptance, there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that there is no difference in the mean weekly sales of one-litre cartons of apple juice between the pyramid-shaped and barrel-shaped carton designs. Thus the marketer can choose either package design, since neither is expected to achieve higher mean weekly sales.
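The pooled-variances t-stat for Exercise 9.14 can be reproduced step by step. The following is a minimal Python sketch, not the Excel Data Analysis route used elsewhere in this manual. The function name pooled_t is hypothetical, and t-crit is copied from the solution because the standard library has no t-distribution quantile function.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t-stat, pooled variance, std error) for the pooled-variances t-test."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, pooled_var, std_error


# Exercise 9.14: pyramid carton (23.75, s = 4.862, n = 8) vs barrel carton (27.375, s = 5.706, n = 8)
t_914, var_914, se_914 = pooled_t(23.75, 4.862, 8, 27.375, 5.706, 8)

T_CRIT = -1.761  # t(0.05)(14), lower-tailed, as quoted in the solution
reject_914 = t_914 < T_CRIT

print(round(var_914, 5), round(se_914, 6), round(t_914, 4), reject_914)
```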
Exercise 9.15 Let population 1 = Fruit Puffs consumers Let population 2 = Fruity Wheat consumers Let πi = population proportion of consumers who prefer fruit-flavoured wheat cereal i . Use the z test statistic H0: π1 ≤ π2 One sided upper tailed test H1: π1 > π2 Fruit Puffs is preferred by more consumers than Fruity Wheat. Region of Acceptance Decision rule Sample data z-stat z-crit = Use α = 0.05 z(0.05) = 1.645 Do not reject H0 in favour of H1 if z-stat ≤ 1.645. n x pi Fruit Puffs 175 54 0.309 Fruity Wheat 150 36 0.240 π(hat) (pooled proportion) = (54+36)/(175+150) = 0.2769 std error = √(0.2769*(1-0.2769)*(1/175+1/150)) 0.04978949 z-stat = (0.309-0.24)/0.049789 1.3858 Statistical conclusion Since z-stat (= 1.3858) lies within the region of acceptance, there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0). Management conclusion Conclude that there is no difference in the percentage of consumers who prefer each type of fruit-flavoured wheat cereal. The marketer's view that Fruit Puffs is more preferred than Fruity Wheat cannot be validated based on the statistical evidence at the 5% level of significance. The marketer can therefore choose to launch either fruit flavour of wheat cereal. Exercise 9.16 (a) Let population 1 = Male respondents Let population 2 = Female respondents Let πi = population proportion who prefer jazz for each gender i . Use the z test statistic H0: π1 = π2 H1: π1 ≠ π2 Use α = 0.05 z-crit = Region of Acceptance Decision rule Sample data z-stat Two sided test (equal preference) 1.96 Do not reject H0 in favour of H1 if -1.96 ≤ z-stat ≤ 1.96. 
Sample data
            n      x      pi
Male        140    46     0.329
Female      110    21     0.191

π(hat) (pooled proportion) = (46+21)/(140+110) = 0.268
std error = √(0.268*(1-0.268)*(1/140+1/110)) = 0.056432928
z-stat = (0.329-0.191)/0.056433 = 2.445

Statistical conclusion
Since z-stat (2.445) lies outside (above) the region of acceptance, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1 (i.e. Reject H0).
Management conclusion
Conclude that there is a difference in the proportion of males compared to the proportion of females who enjoy listening to jazz. By inspection, proportionately more males than females enjoy listening to jazz.

(b) p-value = 0.0145     =(1-NORMSDIST(2.445))*2
Since the p-value (= 0.0145) < α = 0.05, there is strong sample evidence in support of H1. Hence conclude that there is a difference in the proportion of males compared to the proportion of females who enjoy listening to jazz.

Exercise 9.17
(a) Let population 1 = Status Cheque Account clients
Let population 2 = Elite Cheque Account clients
Let πi = population proportion of clients for each account type i who are overdrawn.
Use the z test statistic
H0: π1 ≥ π2     One-sided lower-tailed test
H1: π1 < π2     'Status' proportion less than 'Elite' proportion

Region of Acceptance
Use α = 0.05     z-crit = -1.645
Decision rule
Do not reject H0 in favour of H1 if z-stat ≥ -1.645.

Sample data
            n      x      pi
Status      300    48     0.16
Elite       250    55     0.22

π(hat) (pooled proportion) = (48+55)/(300+250) = 0.1873
std error = √(0.1873*(1-0.1873)*(1/300+1/250)) = 0.0334
z-stat = (0.16-0.22)/0.0334 = -1.796

Statistical conclusion
Since z-stat (= -1.796) lies outside (below) the region of acceptance, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that proportionately more Elite cheque account clients are overdrawn compared to Status cheque account clients.
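The pooled-proportion z-tests of Exercises 9.16 and 9.17 can be checked from the raw counts. The following is a minimal Python sketch using only the standard library. The function name two_proportion_z is hypothetical, and NormalDist stands in for NORMSDIST. Because the sketch does not round the sample proportions to three decimals, its z-stat for Exercise 9.16 is slightly smaller than the 2.445 shown in the solution.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z-stat for the difference between two proportions using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


# Exercise 9.16 (two-sided): males 46 of 140, females 21 of 110
z_916 = two_proportion_z(46, 140, 21, 110)
p_916 = 2 * (1 - NormalDist().cdf(abs(z_916)))  # both tails

# Exercise 9.17 (lower-tailed): Status 48 of 300, Elite 55 of 250
z_917 = two_proportion_z(48, 300, 55, 250)
p_917 = NormalDist().cdf(z_917)  # lower tail only

print(round(z_916, 3), round(p_916, 4))
print(round(z_917, 3), round(p_917, 4))
```

In both exercises the p-value falls below α = 0.05, which agrees with the rejection of H0 reached through the critical-value rule.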
(b) p -value = =NORMSDIST(-1.796) 0.0363 Since the p-value (=0.0363) < α = 0.05, there is strong sample evidence in support of H1. Hence conclude that proportionately more Elite cheque account clients are overdrawn compared to Status cheque account clients. Exercise 9.18 (a) (i) File: X9.18 - aluminium scrap.xlsx Test for equality of variances 2 1 2 2 H0: σ =σ H1: σ²1 ≠ σ²2 Two tailed test Since F-stat (1.702) < F-crit (1.77), do not reject H0 in favour of H1. Conclude,at α = 0.05, that the two population variances are equal. Use the pooled-variances t -test to test for differences in means. (Data Analysis - Excel ) F-Test Two-Sample for Variances Machine 1 Machine 2 Mean 3.483 3.668 Variance 0.261 0.153 Observations 50 30 df 49 29 F-stat 1.702 P(F<=f) one-tail 0.0639 F Critical one-tail 1.777 (a) (ii) Let population 1 = machine 1 daily % scrap Let population 2 = machine 2 daily % scrap Let μi = population average daily % scrap produced by each machine i . (a) (iii) Use the pooled-variances t test statistic since σi's are unknown (only s1 and s2 are given) H0: μ1 ≥ μ2 One sided lower tailed test H1: μ1 < μ2 Machine 1 produces lower % scrap than machine 2 Test performed manually using Descriptive Statistics from Data Analysis Region of Acceptance t-crit = Use α = 0.05 with df = (50+30-2) = 78 t(0.05)(78) = TINV(0.1,78) Decision rule Do not reject H0 in favour of H1 if -1.664 ≤ t-stat Sample data - descriptive statistics Mean Standard Error Median Mode Standard Deviation Sample Variance Skewness Range Minimum Maximum Sum Count Machine 1 3.483 0.072 3.41 2.97 0.511 0.261 0.331 1.83 2.67 4.5 174.14 50 Machine 2 3.668 0.072 3.705 3.77 0.392 0.153 0.014 1.59 2.88 4.47 110.03 30 -1.664 t-stat s² (pooled variance) = std error = t-stat = (a) (iii) ((50-1)*0.511²+(30-1)*0.392²)/(50+30-2) 0.221169038 √(0.221169*(1/50+1/30)) 0.108607919 ((3.483-3.668)-(0))/0.10861 -1.7032 Test performed using t-Test Assuming Equal Variances in Data Analysis Using Data Analysis (in Excel ) 
t-Test: Two-Sample Assuming Equal Variances
                                Machine 1    Machine 2
Mean                            3.483        3.668
Variance                        0.261        0.153
Observations                    50           30
Pooled Variance                 0.221
Hypothesized Mean Difference    0
df                              78
t Stat                          -1.703
P(T<=t) one-tail                0.046
t Critical one-tail             -1.665
P(T<=t) two-tail                0.093
t Critical two-tail             1.991

(a) (iv) Statistical conclusion
Since t-stat (= -1.7032) lies outside (below) the region of acceptance (or p-value = 0.0463 < α = 0.05, see below), there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Reject H0).
Management conclusion
Conclude that machine 1 has a significantly lower average daily % scrap than machine 2.

(b) p-value = 0.0463     =TDIST(-(-1.7032),78,1) or =T.DIST(-1.7032,78,TRUE)
Since the p-value (= 0.046) < α = 0.05, there is strong sample evidence in support of H1. Conclude that machine 1 produces scrap at a significantly lower average daily rate than machine 2. This conclusion is valid at the 5% significance level.

Exercise 9.19
File: X9.19 - water purification.xlsx
(a) (i) Test for equality of variances
H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2     Two-tailed test
Since F-stat (1.063) < F-crit (1.924), do not reject H0 in favour of H1.
Conclude, at α = 0.05, that the two population variances are equal.
Use the pooled-variances t-test to test for differences in means.

F-Test Two-Sample for Variances (Data Analysis - Excel)
                        Free State    KZN
Mean                    27.458        26.448
Variance                1.563         1.470
Observations            24            29
df                      23            28
F-stat                  1.063
P(F<=f) one-tail        0.4342
F Critical one-tail     1.924

(a) (ii) Let population 1 = Free State plant daily impurities levels
Let population 2 = KZN plant daily impurities levels
Let μi = population average daily impurities level per plant i.

(a) (iii) Use the pooled-variances t test statistic since the σi's are unknown (only s1 and s2 are given)
H0: μ1 ≤ μ2     One-sided upper-tailed test
H1: μ1 > μ2     FS treatment plant has higher level of impurities than the KZN water treatment plant.
Test performed manually using Descriptive Statistics from Data Analysis Region of Acceptance t-crit = Use α = 0.01 with df = (24+29-2) = 51 t(0.01)(51) = TINV(0.02,51) 2.402 Decision rule Do not reject H0 in favour of H1 if t-stat ≤ 2.402 Descriptive Statistics Free State Mean 27.458 Standard Error 0.255 Median 27.5 Mode 28 Standard Deviation 1.250 Sample Variance 1.563 Skewness 0.030 Range 5 Minimum 25 Maximum 30 Sum 659 Count 24 Confidence Level(99.0%) 0.717 t-stat s² (pooled variance) = std error = t-stat = KZN 26.448 0.225 27 27 1.213 1.470 -0.193 5 24 29 767 29 0.622 ((24-1)*1.25²+(29-1)*1.213²)/(24+29-2) 1.51247 √(1.51247*(1/24+1/29)) 0.33937 ((27.458-26.448)-(0))/0.33937 2.9764 Test performed using t-Test Assuming Equal Variances in Data Analysis Using Data Analysis (in Excel ) t-Test: Two-Sample Assuming Equal Variances Mean Variance Observations Pooled Variance Hypothesized Mean Difference df t Stat P(T<=t) one-tail t Critical one-tail P(T<=t) two-tail t Critical two-tail Free State 27.458 1.563 24 1.512 0 51 2.9764 0.00223 2.402 0.004 2.676 KZN 26.448 1.470 29 (a) (iv) Statistical conclusion Since t-stat (2.9764) lies outside (above) the region of acceptance, there is sufficient sample evidence at the 1% significance level to reject H 0 in favour of H1. (i.e. Reject H0). Management conclusion Conclude that the KZN plant produces water of a higher quality (fewer impurities, on average) than the Free State plant. Thus the KZN plant manager's claim can be supported statistically at the 1% level of significance. (b) p -value = =T.DIST(2.9761),51,1) 0.00223 =T.DIST.RT(2.9764,51) Since the p-value (=0.00223) << α = 0.01, there is overwhelming sample evidence in support of H1. Same conclusion as in (a) applies. Exercise 9.20 (a) File: X9.20 - herbal tea.xlsx Test for equality of variances H0: σ21 = σ22 Two tailed test H1: σ²1 ≠ σ²2 Since F-stat (1.227) < F-crit (2.098), do not reject H0 in favour of H1. 
Conclude,at α = 0.05, that the two population variances are equal. Use the pooled-variances t -test to test for differences in means. (Data Analysis - Excel ) F-Test Two-Sample for Variances Freshpak Yellow Label Mean 7.689 6.917 Variance 2.170 1.768 Observations 19 23 df 18 22 F-stat 1.227 P(F<=f) one-tail 0.3205 F Critical one-tail 2.098 (b) (i) Let population 1 = Freshpak brand Let population 2 = Yellow Label brand Let μi = population mean level of quercetin in mg/kg in each brand i . Use the t test statistic since σi's are unknown (only s1 and s2 are given) H0: μ1 = μ2 H1: μ1 ≠ μ2 Two sided test (No difference) Test performed manually using Descriptive Statistics from Data Analysis Region of Acceptance Decision rule Use α = 0.05 with df = (19+23-2) = 40 t-crit = t(0.05)(40) = 2.021 Do not reject H0 in favour of H1 if -2.021 ≤ t-stat ≤ 2.021 Descriptive Statistics Mean Standard Error Median Mode Standard Deviation Sample Variance Skewness Range Minimum Maximum Sum Count Freshpak 7.689 0.3379 7.9 7.9 1.4731 2.1699 -0.1752 5.1 5 10.1 146.1 19 Yellow Label 6.917 0.2772 7.1 5.7 1.3296 1.7679 0.1030 5.3 4.5 9.8 159.1 23 t-stat s² (pooled variance) = ((19-1)*1.473²+(23-1)*1.3296²)/(19+23-2) 1.948688 std error = √(1.948688*(1/19+1/23)) 0.43277 t-stat = ((7.689-6.917)-(0))/0.43277 1.784 Test performed using t-Test Assuming Equal Variances in Data Analysis Using Data Analysis (in Excel ) t-Test: Two-Sample Assuming Equal Variances Mean Variance Observations Pooled Variance Hypothesized Mean Difference df t Stat P(T<=t) one-tail t Critical one-tail P(T<=t) two-tail t Critical two-tail Freshpak 7.6895 2.1699 19 1.9488 0 40 1.7840 0.0410 1.6839 0.0820 2.0211 Yellow Label 6.9174 1.7679 23 Statistical conclusion Since t-stat (1.784) lies within the region of acceptance, there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. (i.e. Accept H0). 
Management conclusion Conclude that there is no difference in the mean quercetin content (in mg/kg) between the Freshpak and Yellow Labels brands of rooibos tea. (b) (ii) Let population 1 = Let population 2 = Freshpak brand (FP) Yellow Label brand (YL) H0: μ1 ≤ μ2 H1: μ1 > μ2 One sided upper tailed test FP contains more quercetin than YL Using Data Analysis (in Excel ) t-Test: Two-Sample Assuming Equal Variances Freshpak Mean 7.6895 Variance 2.1699 Observations 19 Pooled Variance 1.9488 Hypothesized Mean Difference 0 df 40 t Stat 1.7840 P(T<=t) one-tail 0.0410 t Critical one-tail 1.6839 P(T<=t) two-tail 0.0820 t Critical two-tail 2.0211 Yellow Label 6.9174 1.7679 23 Statistical conclusion Since t-stat (= 1.784) lies outside (above) the region of acceptance (t-stat ≤ 1.6839), (see table above), there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1. (i.e. Reject H0). Management conclusion Conclude that Freshpak's claim that their brand contains more quercetin, on average, than the Yellow Label brand, can be supported statistically at the 5% significance level. Exercise 9.21 (a) (i) File: X9.21 - meat fat.xlsx Test for equality of variances H0: σ21 = σ22 Two tailed test H1: σ²1 ≠ σ²2 Since F-stat (1.261) < F-crit (2.066), do not reject H0 in favour of H1. Conclude,at α = 0.05, that the two population variances are equal. Use the pooled-variances t -test to test for differences in means. F-Test Two-Sample for Variances Namibia Mean 27.074 Variance 33.840 Observations 27 df 26 F 1.261 P(F<=f) one-tail 0.3002 F Critical one-tail 2.066 (a) (ii) (Data Analysis - Excel ) Little Karoo 30.333 26.833 21 20 Let population 1 = Namibian meat producer Let population 2 = Little Karoo meat producer Let μi = population average fat content of meat supplied by each producer i . 
Use the t test statistic since the σi's are unknown (only s1 and s2 are given)

(a) (iii) H0: μ1 ≥ μ2
H1: μ1 < μ2     One-sided lower-tailed test

Test performed manually using Descriptive Statistics from Data Analysis
Region of Acceptance
Use α = 0.01 with df = (27+21-2) = 46
t-crit = t(0.01)(46) = -2.410     =-TINV(0.02,46)
Decision rule
Do not reject H0 in favour of H1 if t-stat ≥ -2.410

t-stat
s² (pooled variance) = ((27-1)*5.8173²+(21-1)*5.1801²)/(27+21-2) = 30.794
std error = √(30.794*(1/27+1/21)) = 1.61459
t-stat = ((27.0741-30.3333)-(0))/1.61459 = -2.0186

Test performed using t-Test Assuming Equal Variances in Data Analysis
Using Data Analysis (in Excel)
t-Test: Two-Sample Assuming Equal Variances
                                Namibia     Little Karoo
Mean                            27.0741     30.3333
Variance                        33.8405     26.8333
Observations                    27          21
Pooled Variance                 30.7939
Hypothesized Mean Difference    0
df                              46
t Stat                          -2.0186
P(T<=t) one-tail                0.0247
t Critical one-tail             -2.4102
P(T<=t) two-tail                0.0494
t Critical two-tail             2.6870

(a) (iv) Statistical conclusion
Since t-stat (-2.0186) lies within the region of acceptance, there is insufficient sample evidence at the 1% significance level to reject H0 in favour of H1. (i.e. Accept H0).
Management conclusion
Conclude that the mean fat content of meat from the Namibian producer and from the Little Karoo producer is the same. There is therefore no statistical justification, at the 1% significance level, to sign an exclusive agreement with the Namibian producer.

(b) p-value = 0.0247     =TDIST(-(-2.0186),46,1) or =T.DIST(-2.0186,46,TRUE)
This is the same p-value as shown in the Data Analysis output for a one-tailed test.
Since p-value = 0.0247 > α = 0.01 (see Table above), there is insufficient sample evidence to reject H0 in favour of H1 at the 1% level of significance.
Same management conclusion as in (a) (iv) above.

Exercise 9.22
File: X9.22 - disinfectant sales.xlsx
(a) Matched pairs test
The same retail outlets were surveyed both before and after the promotional campaign.
Thus the two samples are not independent.

(b)
Define x1 = Sales per outlet before the promotional campaign
       x2 = Sales per outlet after the promotional campaign
Let d = (x1 - x2), i.e. "before" - "after"
Let μd = population mean difference in sales from before to after the promotional campaign.
Use the matched pairs t test statistic.

(c)
H0: μd ≥ 0
H1: μd < 0    One-sided lower-tailed test ('before' sales are lower than the 'after' sales)
Region of Acceptance
Use α = 0.05 with df = (12-1) = 11
t-crit = t(0.05)(11) = -1.796
Decision rule
Do not reject H0 in favour of H1 if t-stat ≥ -1.796
Sample data
Σxd = -8    n = 12    x(bar)d = -0.667    sd = 1.231
t-stat = (-0.667 - 0)/(1.231/√12) = -1.877

Statistical conclusion
Since t-stat (-1.877) lies outside (below) the region of acceptance, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1 (i.e. Reject H0).

Management conclusion
Conclude that there has been a significant increase in mean sales of 500 ml bottles of disinfectant liquid from before to after the promotional campaign. The promotional campaign has therefore been a success at significantly increasing the mean sales volume of the product, at the 5% significance level.

(d)
t-Test: Paired Two Sample for Means (using Data Analysis)
                                 Before    After
Mean                             11.5      12.17
Variance                         4.455     3.606
Observations                     12        12
Pearson Correlation              0.817
Hypothesized Mean Difference     0
df                               11
t Stat                           -1.876
P(T<=t) one-tail                 0.0437
t Critical one-tail              1.7959
P(T<=t) two-tail                 0.0874
t Critical two-tail              2.2010

p-value = 0.0436    (=TDIST(-(-1.877),11,1))

Statistical conclusion
Since p-value = 0.0437 < α = 0.05, there is moderately strong sample evidence to reject H0 in favour of H1 and conclude that the promotional campaign has been effective.

Exercise 9.23
File: X9.23 - performance ratings.xlsx
(a) The samples are dependent as the same employee is tested both before and after the training sessions.
(b) Matched pairs test
x1 = Performance rating before the training sessions
x2 = Performance rating after the training sessions
Let d = (x1 - x2), i.e. "before" - "after"
Let μd = population mean difference in performance rating scores from before to after the training sessions.
Use the matched pairs t test statistic.
H0: μd ≥ 0
H1: μd < 0    One-sided lower-tailed test ('before' rating scores are lower than 'after' rating scores)

Test performed manually
Region of Acceptance
Use α = 0.05 with df = (18-1) = 17
t-crit = t(0.05)(17) = -1.74
Decision rule
Do not reject H0 in favour of H1 if t-stat ≥ -1.74
Sample data
Σxd = -6.4    n = 18    x(bar)d = -0.356    sd = 0.7114
t-stat = (-0.356 - 0)/(0.7114/√18) = -2.123

Data Analysis: t-Test: Paired Two Sample for Means
                                 Before        After
Mean                             11.067        11.422
Variance                         6.024         5.835
Observations                     18            18
Pearson Correlation              0.95743865
Hypothesized Mean Difference     0
df                               17
t Stat                           -2.120
P(T<=t) one-tail                 0.0245
t Critical one-tail              1.740
P(T<=t) two-tail                 0.04898861
t Critical two-tail              2.10981556

Statistical conclusion
Since t-stat (-2.123) lies outside (below) the region of acceptance, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1 (i.e. Reject H0).

Management conclusion
Conclude that there has been a significant increase in the mean performance rating scores of employees who attended the series of workshops and seminars. The performance enhancement sessions have therefore been effective at increasing motivation and productivity, at the 5% level of significance.

(c)
p-value = 0.0244    (=TDIST(-(-2.123),17,1)). Also see the t-Test table above.
Since p-value = 0.0244 < α = 0.05, there is strong sample evidence to reject H0 in favour of H1 and conclude that the workshops have significantly increased employee motivation and productivity.

Exercise 9.24
File: X9.24 - household debt.xlsx
(a) The samples are dependent as the same household is surveyed both a year ago and in the current time period.
(b) Matched pairs test
x1 = Household debt level a year ago
x2 = Household debt level currently
Let d = (x1 - x2), i.e. "year ago" - "current"
Let μd = population mean difference in household debt levels from a year ago to the current period.
Use the matched pairs t test statistic.
H0: μd ≤ 0
H1: μd > 0    One-sided upper-tailed test (debt higher a year ago than today, i.e. the current period)

Test performed manually
Region of Acceptance
Use α = 0.05 with df = (10-1) = 9
t-crit = t(0.05)(9) = 1.833
Decision rule
Do not reject H0 in favour of H1 if t-stat ≤ 1.833
Sample data
Σxd = 12    n = 10    x(bar)d = 1.2    sd = 1.8135
t-stat = (1.2 - 0)/(1.8135/√10) = 2.0925

Data Analysis: t-Test: Paired Two Sample for Means
                                 Year ago    Current
Mean                             40.5        39.3
Variance                         35.611      36.011
Observations                     10          10
Pearson Correlation              0.9541
Hypothesized Mean Difference     0
df                               9
t Stat                           2.0925
P(T<=t) one-tail                 0.0330
t Critical one-tail              1.8331
P(T<=t) two-tail                 0.0659
t Critical two-tail              2.2622

Statistical conclusion
Since t-stat (2.0925) lies outside (above) the region of acceptance, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1 (i.e. Reject H0).

Management conclusion
Conclude that there has been a significant decrease in the average level of household debt from a year ago. The increase in the prime interest rate (from 6% to 11%) has led to a significant decline in the average level of household debt from a year ago, at the 5% significance level.

(c)
p-value = 0.033    (=TDIST(2.0925,9,1)). Also see the t-Test table above.
Since the p-value (0.033) < α = 0.05, there is strong sample evidence to reject H0 in favour of H1 at the 5% level of significance. Same statistical and management conclusions as (b) above.
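The manual matched pairs calculation in Exercise 9.24 can be checked with a few lines of code. The sketch below is illustrative only: the function name `paired_t_stat` is hypothetical (it is not part of the textbook or of Excel), and the critical value is simply copied from the solution above rather than computed.

```python
import math

def paired_t_stat(sum_d, n, sd_d, mu_d0=0.0):
    """Return (mean difference, t-stat) for a matched pairs t test.

    sum_d : sum of the paired differences (Σxd)
    n     : number of pairs
    sd_d  : sample standard deviation of the differences
    mu_d0 : hypothesised mean difference under H0 (assumed 0 here)
    """
    mean_d = sum_d / n
    std_error = sd_d / math.sqrt(n)
    return mean_d, (mean_d - mu_d0) / std_error

# Summary values from Exercise 9.24: Σxd = 12, n = 10, sd = 1.8135
mean_d, t_stat = paired_t_stat(sum_d=12, n=10, sd_d=1.8135)

# Upper-tailed test: t-crit = t(0.05)(9) = 1.833, taken from the solution above
t_crit = 1.833
reject_h0 = t_stat > t_crit

print(f"mean difference = {mean_d:.3f}, t-stat = {t_stat:.4f}, reject H0: {reject_h0}")
```

For a lower-tailed test such as Exercises 9.22 and 9.23, the comparison would instead be `t_stat < t_crit` with a negative critical value.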
Exercise 9.25
Process output variability study
Random variable: Hourly output per process (unit of measure: units produced per hour)
H0: σ²(1) = σ²(2)    Management question in H0
H1: σ²(1) ≠ σ²(2)
Region of Acceptance (use α = 0.05)
Note: Set up the F-test as an upper-tailed test (F-stat = larger s²/smaller s²)
F-crit (upper) = F(0.05,30,24) = 1.939
Decision rule: Do not reject H0 if F-stat (the sample evidence) ≤ 1.939
Sample data
s1² = 14.6    n1 = 25
s2² = 23.2    n2 = 31
F-stat = larger s²/smaller s² = 23.2/14.6 = 1.589
p-value = 0.1240

Statistical conclusion
Since F-stat = 1.589 < F-crit = 1.939, there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1.

Management conclusion
Therefore conclude, with 95% confidence, that the variability of hourly outputs between the two production processes is the same.

Exercise 9.26
File: X9.26 - milk yield.xlsx
Random variable: milk yield (in litres per week) per cow
H0: σ²(f) ≤ σ²(c)
H1: σ²(f) > σ²(c)    Management question in H1
Region of Acceptance: Use α = 0.05 with df1 = 16-1 = 15 and df2 = 16-1 = 15
F-crit = F(0.05,15,15) = 2.403 (see Excel output)
Decision rule: Do not reject H0 if F-stat ≤ 2.403

F-Test Two-Sample for Variances (Data Analysis in Excel)
                        Free grazing    Controlled Feed
Mean                    31.306          35.375
Variance                72.138          43.110
Observations            16              16
df                      15              15
F-stat                  1.673
P(F<=f) one-tail        0.1647
F Critical one-tail     2.403

F-stat = 1.673 (see Excel output)
p-value = 0.1647

Statistical conclusion
Since F-stat = 1.673 < F-crit = 2.403, there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1.

Management conclusion
Therefore conclude, with 95% confidence, that there is no significant difference in the variability in milk yields of cows between the two feeding practices.
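The upper-tailed set-up used in Exercise 9.25 (larger sample variance in the numerator) can be sketched as follows. This is a minimal illustration under stated assumptions: `variance_ratio` is a hypothetical helper name, and the critical value 1.939 is copied from the solution rather than computed.

```python
def variance_ratio(s2_a, n_a, s2_b, n_b):
    """Return (F-stat, numerator df, denominator df).

    The larger sample variance is always placed in the numerator so that
    only the upper critical value of the F distribution is needed.
    """
    if s2_a >= s2_b:
        return s2_a / s2_b, n_a - 1, n_b - 1
    return s2_b / s2_a, n_b - 1, n_a - 1

# Exercise 9.25 sample data: s1² = 14.6 (n1 = 25), s2² = 23.2 (n2 = 31)
f_stat, df_num, df_den = variance_ratio(14.6, 25, 23.2, 31)

# F-crit (upper) = F(0.05,30,24) = 1.939, taken from the solution above
f_crit = 1.939
reject_h0 = f_stat > f_crit

print(f"F-stat = {f_stat:.3f} with df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```

Note that the degrees of freedom follow the variances: because process 2 has the larger variance, its df (31 - 1 = 30) becomes the numerator df, which matches F(0.05,30,24) in the solution.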
Exercise 9.27
File: X9.27 - employee wellness.xlsx
Random variable: Hours spent exercising per week
H0: σ²(over 40) ≤ σ²(under 40)
H1: σ²(over 40) > σ²(under 40)    Management question in H1
Region of Acceptance: Use α = 0.05 with df1 = 23-1 = 22 and df2 = 21-1 = 20
F-crit = F(0.05,22,20) = 2.102 (see Excel output)
Decision rule: Do not reject H0 if F-stat ≤ 2.102

F-Test Two-Sample for Variances (Data Analysis in Excel)
                        Over 40    Under 40
Mean                    2.461      3.086
Variance                0.836      0.373
Observations            23         21
df                      22         20
F-stat                  2.240
P(F<=f) one-tail        0.0373
F Critical one-tail     2.102

F-stat = 2.24 (see Excel output)
p-value = 0.0373

Statistical conclusion
Since F-stat = 2.24 > F-crit = 2.102, there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1.

Management conclusion
Therefore conclude, with 95% confidence, that 'over 40' employees do indeed exercise more 'erratically' (i.e. show significantly greater variability in exercise times) than 'under 40' employees.

Exercise 9.28
File: X9.28 - attrition rate.xlsx
Random variable: Attrition rates (%) per call center per month
H0: σ²(fin) = σ²(health)    Management question in H0
H1: σ²(fin) ≠ σ²(health)
Region of Acceptance of H0: Find only F-crit (lower) or F-crit (upper).
This depends on whether the smaller or the larger sample variance is in the numerator of F-stat:
For an F-crit (upper) only: F-stat = Larger variance / Smaller variance
For an F-crit (lower) only: F-stat = Smaller variance / Larger variance

F-Test Two-Sample for Variances (upper-tailed set-up)
                        Health     Financial
Mean                    5.46       6.13
Variance                1.1413     0.6653
Observations            17         21
df                      16         20
F-stat                  1.715
P(F<=f) one-tail        0.1263
F-crit (upper)          2.184

F-Test Two-Sample for Variances (lower-tailed set-up)
                        Financial  Health
Mean                    6.13       5.46
Variance                0.6653     1.1413
Observations            21         17
df                      20         16
F-stat                  0.583
P(F<=f) one-tail        0.1263
F-crit (lower)          0.458

Upper-tailed: F-stat = 1.1413/0.6653; F-crit (upper) = F(0.05,16,20)
Lower-tailed: F-stat = 0.6653/1.1413; F-crit (lower) = 1/F(0.05,16,20) (or F(0.95,20,16))
Decision rule:
For upper tailed test: Do not reject H0 if
F-stat ≤ 2.184
OR
For lower tailed test: Do not reject H0 if F-stat ≥ 0.458
(Health sample: Mean 5.46, Variance 1.1413, Observations 17, df 16; F-crit (upper) = 2.184, F-crit (lower) = 0.458)

Now F-stat = 1.715 (for an upper tailed test); or F-stat = 0.583 (for a lower tailed test)

Conclusion (based on an upper tailed F-test)
Since F-stat = 1.715 < F-crit (upper) = 2.184 (and hence lies within the acceptance region), there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1.

Same Conclusion (based on a lower tailed F-test)
Since F-stat = 0.583 > F-crit (lower) = 0.458 (and hence lies within the acceptance region), there is insufficient sample evidence at the 5% level of significance to reject H0 in favour of H1.

Management conclusion
With 95% confidence, it can be concluded that there is no significant difference in the variability in attrition rates between the two sectors (financial and health).

CHAPTER 10 CHI-SQUARE HYPOTHESIS TESTS

Exercise 10.1
Purpose: To test whether there is any statistically significant association between the outcomes of two categorical variables. Stated differently, are the outcomes associated with two categorical variables independent of each other or not?

Exercise 10.2
Categorical (nominal or ordinal-scaled) data

Exercise 10.3
H0: There is no statistical association between the two categorical variables

Exercise 10.4
Expected frequencies represent the null hypothesis of no association (or statistical independence) between the two categorical variables.

Exercise 10.5
χ²-crit (0.05,6) = 12.592
χ²-crit (0.10,6) = 10.645

Exercise 10.6
(a) File: X10.6 - motivation status.xlsx
Row Percentages
            Motivation level
            High    Moderate    Low     Total
Male        26.7    26.7        46.7    100
Female      47.5    30.0        22.5    100
Total       38.6    28.6        32.9    100

Interpretation (by inspection)
When compared to the overall sample profile (Total row), males tend to have low levels of motivation, while females tend to be more highly motivated. It appears therefore that a statistical association exists between gender and motivation level.
(b)
H0: There is no association between Gender and Motivation level
H1: There is an association between Gender and Motivation level
Region of Rejection (Use α = 0.10 with degrees of freedom = (2-1)(3-1) = 2)
χ²-crit = χ²(0.10)(2) = 4.605
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 4.605
χ²-stat
Observed frequencies (fo)
            High    Moderate    Low     Total
Male        8       8           14      30
Female      19      12          9       40
Total       27      20          23      70

Expected frequencies (fe)
            High    Moderate    Low     Total
Male        11.57   8.57        9.86    30
Female      15.43   11.43       13.14   40
Total       27      20          23      70

Chi-Squared components
            High    Moderate    Low
Male        1.102   0.038       1.741
Female      0.827   0.029       1.306
χ²-stat = 5.0428

Conclusion
Since χ²-stat = 5.0428 > χ²-crit = 4.605, there is sufficient sample evidence at the 10% significance level to reject H0 in favour of H1. Therefore conclude that there is a statistical association between the gender of an employee and their level of motivation. The nature of the relationship is described in (a) above.

Exercise 10.7
(a) File: X10.7 - internet shopping.xlsx
Row Percentages
              Internet shopping
              Yes     No      Total
full-time     24.3    75.7    100
at-home       18.5    81.5    100
Total         20.8    79.2    100

Interpretation (by inspection)
Since the row percentage profiles (between full-time employed and at-home customers) are very similar to each other and to the overall sample profile, it can be concluded, by observation, that the two attributes are not associated (i.e. they are statistically independent).
(b)
H0: There is no association between Employment Status and Use of Internet Shopping
H1: There is an association between Employment Status and Use of Internet Shopping
Region of Rejection (Use α = 0.05 with degrees of freedom = (2-1)(2-1) = 1)
χ²-crit = χ²(0.05)(1) = 3.841
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 3.841
χ²-stat
Observed frequencies (fo)
              Yes     No      Total
full-time     35      109     144
at-home       40      176     216
Total         75      285     360

Expected frequencies (fe)
              Yes     No      Total
full-time     30      114     144
at-home       45      171     216
Total         75      285     360

Chi-Squared components
              Yes       No
full-time     0.8333    0.2193
at-home       0.5556    0.1462
χ²-stat = 1.7544

Conclusion
Since χ²-stat = 1.7544 < χ²-crit = 3.841, there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that there is no statistical association between the employment status of a customer and their use of the internet for shopping purposes. These two variables are statistically independent.

Exercise 10.8
(a) File: X10.8 - car size.xlsx
Row Percentages
              Car sizes bought
              Small   Medium  Large   Total
Under 30      15.2    33.3    51.5    100
30 - 45       21.1    36.8    42.1    100
Over 45       37.5    29.2    33.3    100
Total         26.3    33.0    40.7    100

Interpretation (by inspection)
With reference to the overall sample profile (Total row %), under 30's tend to prefer larger cars; car buyers aged 30-45 marginally tend towards medium to large cars, while over 45's strongly tend to prefer smaller cars.

(b)
H0: There is no association between Age of car buyer and Car Size bought.
H1: There is an association between Age of car buyer and Car Size bought.
Region of Rejection (Use α = 0.01 with degrees of freedom = (3-1)(3-1) = 4)
χ²-crit = χ²(0.01)(4) = 13.277
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 13.277
χ²-stat
Observed frequencies (fo)
              Small   Medium  Large   Total
Under 30      10      22      34      66
30 - 45       24      42      48      114
Over 45       45      35      40      120
Total         79      99      122     300

Expected frequencies (fe)
              Small   Medium  Large   Total
Under 30      17.38   21.78   26.84   66
30 - 45       30.02   37.62   46.36   114
Over 45       31.6    39.6    48.8    120
Total         79      99      122     300

Chi-Squared components
              Small   Medium  Large
Under 30      3.134   0.002   1.910
30 - 45       1.207   0.510   0.058
Over 45       5.682   0.534   1.587
χ²-stat = 14.6247

Conclusion
Since χ²-stat = 14.6247 > χ²-crit = 13.277, there is sufficient sample evidence at the 1% significance level to reject H0 in favour of H1. Therefore conclude that there is a statistical association between the age of a car buyer and the size of car bought.

(c) Management Conclusion
The nature of the statistical relationship found in (b) is described in (a) above. Under 30's tend to prefer larger cars; car buyers aged 30-45 marginally tend towards medium to large cars, while over 45's strongly tend to prefer smaller cars.
Recommendation
Target larger cars to the younger market and smaller cars to the older market.

Exercise 10.9
(a) File: X10.9 - sports readership.xlsx
Let πi = proportion of people who read Sports News in each of the i regions.
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
Sample proportions
E Cape 0.160    W Cape 0.104    KZN 0.250

(b)
Region of Rejection
Use α = 0.01 with degrees of freedom = (2-1)(3-1) = 2
χ²-crit = χ²(0.01)(2) = 9.21
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 9.21
χ²-stat
Observed frequencies (fo)
          E Cape    W Cape    KZN       Total
No        84        86        78        248
Yes       16        10        26        52
Total     100       96        104       300

Expected frequencies (fe)
          E Cape    W Cape    KZN       Total
No        82.67     79.36     85.97     248
Yes       17.33     16.64     18.03     52
Total     100       96        104       300

Chi-Squared components
          E Cape    W Cape    KZN
No        0.0215    0.5556    0.7395
Yes       0.1026    2.6496    3.5267
χ²-stat = 7.5954

Conclusion
Since χ²-stat = 7.5954 < χ²-crit = 9.21, there is insufficient sample evidence at the 1% significance level to reject H0 in favour of H1. Therefore conclude that the proportion of people who read Sports News is the same in each Geographical Region. These two variables (Sports News readership and region) are statistically independent.

(c)
H0: There is no association between the propensity to read Sports News and Region
H1: There is an association between the propensity to read Sports News and Region

Exercise 10.10
(a) File: X10.10 - gym activity.xlsx
Row Percentages
            Gym activity
            Spinning    Swimming    Circuit    Total
Male        42.35       22.35       35.29      100
Female      52.73       29.09       18.18      100
Total       46.43       25.00       28.57      100

Interpretation (by inspection)
When compared to the overall sample of gym goers, more females than males tend to prefer spinning and swimming, while more males relative to females tend to prefer doing the circuit. The evidence is, however, not strongly convincing.
(b)
H0: There is no association between Gender and preferred Gym Activity
H1: There is an association between Gender and preferred Gym Activity
Region of Rejection
Use α = 0.10 with degrees of freedom = (2-1)(3-1) = 2
χ²-crit = χ²(0.10)(2) = 4.605
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 4.605
χ²-stat
Observed frequencies (fo)
            Spinning    Swimming    Circuit    Total
Male        36          19          30         85
Female      29          16          10         55
Total       65          35          40         140

Expected frequencies (fe)
            Spinning    Swimming    Circuit    Total
Male        39.46       21.25       24.29      85
Female      25.54       13.75       15.71      55
Total       65          35          40         140

Chi-Squared components
            Spinning    Swimming    Circuit
Male        0.304       0.238       1.345
Female      0.470       0.368       2.078
χ²-stat = 4.803

Conclusion
Since χ²-stat = 4.803 > χ²-crit = 4.605 (marginally), there is sufficient sample evidence at the 10% significance level to reject H0 in favour of H1. Therefore conclude that gender and preferred gym activity are associated. The nature of the relationship is described in (a) above.

(c)
New rejection region
Use α = 0.05 with degrees of freedom = (2-1)(3-1) = 2
χ²-crit = χ²(0.05)(2) = 5.991
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 5.991
New decision
Do not reject H0 at the 5% significance level, since χ²-stat = 4.803 < χ²-crit = 5.991.
New conclusion
There is no statistical association between gender and gym activity (i.e. they are statistically independent) at the 5% level of significance.

(d)
Let πi = proportion of females who prefer each gym activity i (spinning, swimming, circuit)
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
Sample proportions
Spin 0.446    Swim 0.457    Circuit 0.250

Statistical conclusion
The same statistical conclusion applies as in (b) (i.e. a statistical association exists) or, stated differently, at least one population proportion is different.
Management conclusion
By an inspection of the row percentage table in (a), it can be concluded that spinning and swimming are the most preferred gym activities of females, while doing the circuit is the least preferred gym activity of females.

Exercise 10.11
(a) File: X10.11 - supermarket visits.xlsx
Categorical Frequency Table - Supermarket Visits
Visits          Customers    Percent    Belief %
Daily           36           20.0       25
3 / 4 times     55           30.6       35
Twice           62           34.4       30
Once only       27           15.0       10
Total           180          100        100

Interpretation (by inspection)
Once-a-week visits are the least common shopping behaviour (only 15%). Only one-in-five shoppers (20%) shop daily. The most common shopping pattern is either 3/4 times a week (30.6%) or twice a week (34.4%). These two shopping behaviours represent 65% of all the sampled shoppers.

(b) Goodness-of-fit test for an empirical distribution
H0: The frequency of store visits per week is as per the manager's belief.
H1: The frequency of store visits per week differs significantly from the manager's belief.
Region of Rejection (Use α = 0.05 with degrees of freedom = (4-1) = 3)
χ²-crit = χ²(0.05)(3) = 7.815
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 7.815
χ²-stat
Visits          fo      % fe    fe      χ²-stat
Daily           36      25      45      1.8
3 / 4 times     55      35      63      1.016
Twice           62      30      54      1.185
Once only       27      10      18      4.5
Total           180     100     180     8.501

Conclusion
Since χ²-stat = 8.501 > χ²-crit = 7.815, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that the shopping frequency of customers differs significantly from the manager's belief. The nature of the difference is described in (a) above.

(c) Interpretation
The manager's belief is that customers shop more frequently during a week than the survey indicates; that is, customers shop less frequently than the manager believes. For example, the survey found that only 51% shop three or more times per week (daily or 3/4 times), while the manager assumed that this percentage was 60%.
Similarly, more customers prefer to shop only once or twice a week (49%) compared to the manager's belief that this percentage was only 40%. These differences are, however, not strongly significant.

Exercise 10.12
File: X10.12 - equity portfolio.xlsx
Goodness-of-fit test for an empirical distribution
H0: There is no change in the equity portfolio mix between 2008 and 2012.
H1: There is a significant change in the equity portfolio mix between 2008 and 2012.
Region of Rejection (Use α = 0.05 with degrees of freedom = (4-1) = 3)
χ²-crit = χ²(0.05)(3) = 7.815
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 7.815
χ²-stat
Equities
Equity          fo (2012)    Ratio fe    fe (2008)    χ²-stat
Mining          900          2           900          0
Industrial      1400         3           1350         1.852
Retail          400          1           450          5.556
Financial       1800         4           1800         0
Total           4500         10          4500         7.407

Conclusion
Since χ²-stat = 7.407 < χ²-crit = 7.815, there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that there is no significant change in the equity portfolio mix of the investor between 2008 and 2012. The equity portfolio profile is essentially the same in 2012 as it was in 2008.

Exercise 10.13
(a) File: X10.13 - payment method.xlsx
Goodness-of-fit test for an empirical distribution
H0: There is no change in the payment method for electronic goods.
H1: There is a significant change in the payment method for electronic goods.
Region of Rejection
Use α = 0.05 with degrees of freedom = (3-1) = 2
χ²-crit = χ²(0.05)(2) = 5.991
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 5.991
χ²-stat
Payment Methods
Payment         % fe    fo      fe      χ²-stat
Cash            23      41      46      0.543
Debit Card      35      49      70      6.3
Credit Card     42      110     84      8.048
Total           100     200     200     14.891

Conclusion
Since χ²-stat = 14.891 >> χ²-crit = 5.991, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that there is a significant shift in payment practices from the past.
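The goodness-of-fit arithmetic in Exercise 10.13 (a) can be reproduced with a short script. This is an illustrative sketch only: the function name `goodness_of_fit` is hypothetical, and the critical value 5.991 is copied from the solution above rather than computed.

```python
def goodness_of_fit(observed, expected_pct):
    """Return (expected frequencies, chi-squared components, chi-squared stat).

    observed     : observed frequencies (fo) per category
    expected_pct : hypothesised percentages (% fe) per category, summing to 100
    """
    n = sum(observed)
    expected = [n * pct / 100 for pct in expected_pct]
    components = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
    return expected, components, sum(components)

# Exercise 10.13: Cash, Debit Card, Credit Card
observed = [41, 49, 110]
expected_pct = [23, 35, 42]
expected, components, chi2_stat = goodness_of_fit(observed, expected_pct)

# χ²-crit = χ²(0.05)(2) = 5.991, taken from the solution above
chi2_crit = 5.991
reject_h0 = chi2_stat >= chi2_crit

for name, fe, comp in zip(["Cash", "Debit Card", "Credit Card"], expected, components):
    print(f"{name:12s} fe = {fe:6.1f}  component = {comp:.3f}")
print(f"chi-squared stat = {chi2_stat:.3f}, reject H0: {reject_h0}")
```

The same function applies to Exercises 10.11, 10.12 and 10.14 once a ratio such as 2:3:1:4 is converted to percentages.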
(b) Management Interpretation
There is a significant shift in payment practices for electronic goods. There is more emphasis on credit card payment today (55%) than in the past (42%).

Exercise 10.14
(a) File: XS10.14 - package sizes.xlsx
Goodness-of-fit test for an empirical distribution
H0: Limpopo sales pattern follows the national sales pattern.
H1: Limpopo sales pattern does not follow the national sales pattern.
Region of Rejection (Use α = 0.05 with degrees of freedom = (3-1) = 2)
χ²-crit = χ²(0.05)(2) = 5.991
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 5.991
χ²-stat
Package Size Sales
Package     Ratio fe    fo      fe      χ²-stat
Large       3           190     162     4.840
Midsize     5           250     270     1.481
Small       2           100     108     0.593
Total       10          540     540     6.914

Conclusion
Since χ²-stat = 6.914 > χ²-crit = 5.991, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that the Limpopo sales pattern of cereal package sizes differs significantly from the national sales pattern of package sizes sold.

(b) Management Interpretation
The Limpopo sales pattern differs significantly from the national sales pattern of package sizes sold. By an inspection of the Limpopo sales profile relative to the national pattern, Limpopo tends to sell more large-sized packages (190 observed against 162 expected).

Exercise 10.15
(a) File: X10.15 - compensation plan.xlsx
Column Percentages
            Regions
Plan        Cape    Gauteng    Free State    KZN     Total
Present     62      75.7       67.1          72.7    70.8
New         38      24.3       32.9          27.3    29.2
Total       100     100        100           100     100

Interpretation (by inspection)
When compared to the overall profile of all sampled employees, the present compensation plan enjoys the most support in Gauteng and the least support in the Cape Province, where the new plan is favoured more. The evidence is, however, not overwhelming (i.e. the profile differences are not large).

(b)
Let πi = proportion of sales staff in favour of the present payment plan in each province i.
H0: π1 = π2 = π3 = π4
H1: At least one πi is different (i = 1,2,3,4)
Sample proportions
Cape 0.62    Gauteng 0.757    Free State 0.671    KZN 0.727

(c)
Region of Rejection
Use α = 0.10 with degrees of freedom = (2-1)(4-1) = 3
χ²-crit = χ²(0.10)(3) = 6.251
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 6.251
χ²-stat
Observed frequencies (fo)
Plan        Cape    Gauteng    Free State    KZN     Total
Present     62      140        47            80      329
New         38      45         23            30      136
Total       100     185        70            110     465

Expected frequencies (fe)
Plan        Cape    Gauteng    Free State    KZN     Total
Present     70.75   130.89     49.53         77.83   329
New         29.25   54.11      20.47         32.17   136
Total       100     185        70            110     465

Chi-Squared components
Plan        Cape      Gauteng    Free State    KZN
Present     1.0828    0.6337     0.1289        0.0606
New         2.6194    1.5330     0.3119        0.1466
χ²-stat = 6.5169

Conclusion
Since χ²-stat = 6.5169 > χ²-crit = 6.251 (marginally), there is sufficient sample evidence at the 10% significance level to reject H0 in favour of H1. Therefore conclude that the support for the present compensation plan is different in at least one of the provinces. The nature of the relationship is described in (a) above.

(d)
Formulate as a test for independence of association between payment plan and province.
H0: There is no association between payment plan preference and province.
H1: There is an association between payment plan preference and province.

(e)
New rejection region
Use α = 0.05 with degrees of freedom = (2-1)(4-1) = 3
χ²-crit = χ²(0.05)(3) = 7.815
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 7.815
New decision
Do not reject H0 at the 5% significance level, since χ²-stat = 6.5169 < χ²-crit = 7.815.
New conclusion
There is no statistical association between payment plan preference and province (i.e. they are statistically independent) at the 5% level of significance. The sample evidence is not strong enough (i.e. the sample proportion differences are not great enough) to reject H0 in favour of H1 at the 5% level of significance.
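The expected frequencies and χ²-stat for Exercise 10.15 follow the usual rule fe = (row total × column total)/n. The sketch below illustrates that calculation; `chi_square_independence` is a hypothetical helper name, and both critical values are copied from the solution above rather than computed.

```python
def chi_square_independence(table):
    """Return (expected frequencies, chi-squared stat, degrees of freedom).

    table : list of rows of observed frequencies (fo), without totals.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # fe = (row total * column total) / n for every cell
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2_stat = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2_stat, df

# Exercise 10.15 observed frequencies: rows = Present, New;
# columns = Cape, Gauteng, Free State, KZN
observed = [[62, 140, 47, 80],
            [38, 45, 23, 30]]
expected, chi2_stat, df = chi_square_independence(observed)

# Critical values taken from the solution above
reject_at_10pct = chi2_stat >= 6.251   # χ²(0.10)(3)
reject_at_5pct = chi2_stat >= 7.815    # χ²(0.05)(3)

print(f"chi-squared stat = {chi2_stat:.4f}, df = {df}")
print(f"reject H0 at 10%: {reject_at_10pct}; at 5%: {reject_at_5pct}")
```

Running the two comparisons side by side shows why the decision in (c) is reversed in (e): the same test statistic falls between the 10% and 5% critical values.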
Exercise 10.16
(a) File: X10.16 - tyre defects.xlsx
Row Percentages
              Nature of defective tyre
              technical    mechanical    material    Total
Morning       22.1         61.8          16.2        100
Afternoon     30.2         46.5          23.3        100
Night         42.6         36.8          20.6        100
Total         31.5         48.2          20.3        100

Interpretation (by inspection)
When compared to the total production of defective tyres (total row %), tyre defects due to mechanical problems tend to be more prevalent during morning shifts. Technical defects, however, tend to be more prevalent during the night shift. Thus there does appear to be an association between shift and nature of tyre defects.

(b)
Formulate as a test for independence of association between nature of defect and shift.
H0: There is no association between nature of tyre defect and shift.
H1: There is an association between nature of tyre defect and shift.
Region of Rejection
Use α = 0.05 with degrees of freedom = (3-1)(3-1) = 4
χ²-crit = χ²(0.05)(4) = 9.488
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 9.488
χ²-stat
Observed frequencies (fo)
              technical    mechanical    material    Total
Morning       15           42            11          68
Afternoon     26           40            20          86
Night         29           25            14          68
Total         70           107           45          222

Expected frequencies (fe)
              technical    mechanical    material    Total
Morning       21.44        32.77         13.78       68
Afternoon     27.12        41.45         17.43       86
Night         21.44        32.77         13.78       68
Total         70           107           45          222

Chi-Squared components
              technical    mechanical    material
Morning       1.9351       2.5967        0.5622
Afternoon     0.0460       0.0508        0.3782
Night         2.6646       1.8443        0.0034
χ²-stat = 10.0812

(c) Conclusion
Since χ²-stat = 10.0812 > χ²-crit = 9.488, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that the nature of tyre defects produced is related to the shift on which the defects occur. The nature of the relationship is described in (a) above.

(d)
Let πi = proportion of defective tyres caused by mechanical factors per shift i.
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
The hypothesis test procedure is identical to (b) above.
The sample proportions being compared are:
Morning 0.618    Afternoon 0.465    Night 0.368

Conclusion
Since H0 is rejected in favour of H1 at the 5% significance level, it can be concluded that there is at least one shift that has a different proportion of defective tyres due to mechanical factors. Based on the row percentages table in (a) above, it is clear that the morning shift produces a proportionally larger percentage of defective tyres due to mechanical factors than the afternoon or night shifts.

Exercise 10.17
File: X10.17 - flight delays.xlsx
(a) Histogram
Delays (min)    Count
<5              0
5 - 7.5         8
7.5 - 10        28
10 - 12.5       29
12.5 - 15       13
15 - 17.5       2
Total           80

[Figure: Histogram of Flight Delay Times. x-axis: Delay Intervals (minutes); y-axis: No. of flights]

Interpretation
Flight delay times (in minutes) appear to be normally distributed.

(b) Descriptive Statistics
flight delays
Mean                  10.324
Standard Error        0.261
Median                10.3
Mode                  11
Standard Deviation    2.333
Sample Variance       5.444
Kurtosis              -0.550
Skewness              0.084
Range                 10.3
Minimum               5.3
Maximum               15.6
Sum                   825.9
Count                 80

Interpretation
The low skewness coefficient (0.084) indicates approximate normality.

(c) (i) Goodness-of-fit test for Normality
H0: Flight time delays (in minutes) follow a normal distribution with μ = 10.324 min and σ = 2.333 min.
H1: Flight time delays (in minutes) do not follow a normal distribution with μ = 10.324 min and σ = 2.333 min.
Region of Rejection
Use α = 0.01 with degrees of freedom = (7-2-1) = 4
χ²-crit = χ²(0.01)(4) = 13.277
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 13.277

Expected frequencies using x ≡ N(10.324; 2.333) from Z-table
Delay interval (x)    fo    Normal probability interval (z)    Probability    fe       χ²-stat
-∞ < x < 5            0     P(z < -2.282)                      0.01130        0.90     0.9040
5 < x < 7.5           8     P(-2.282 < z < -1.21)              0.10180        8.14     0.0025
7.5 < x < 10          28    P(-1.21 < z < -0.139)              0.33120        26.50    0.0854
10 < x < 12.5         29    P(-0.139 < z < 0.933)              0.37950        30.36    0.0609
12.5 < x < 15         13    P(0.933 < z < 2.004)               0.15400        12.32    0.0375
15 < x < 17.5         2     P(2.004 < z < 3.076)               0.02117        1.69     0.0554
17.5 < x < +∞         0     P(z > 3.076)                       0.00103        0.08     0.0824
Total                 80                                       1              80       1.2282

χ²-stat = 1.2282

Conclusion
Since χ²-stat = 1.2282 << χ²-crit = 13.277, there is insufficient sample evidence at the 1% significance level to reject H0 in favour of H1. Therefore conclude that flight delay times (in minutes) follow a Normal distribution with μ = 10.324 min and σ = 2.333 min.

(c) (ii)
Interval              Excel formula                                                        NORMDIST()    From Z-table
P(x < 5)              =NORMDIST(5,10.324,2.333,1)                                          0.01124       0.01130
P(5 < x < 7.5)        =NORMDIST(7.5,10.324,2.333,1) - NORMDIST(5,10.324,2.333,1)           0.10181       0.10180
P(7.5 < x < 10)       =NORMDIST(10,10.324,2.333,1) - NORMDIST(7.5,10.324,2.333,1)          0.33172       0.33120
P(10 < x < 12.5)      =NORMDIST(12.5,10.324,2.333,1) - NORMDIST(10,10.324,2.333,1)         0.37974       0.37950
P(12.5 < x < 15)      =NORMDIST(15,10.324,2.333,1) - NORMDIST(12.5,10.324,2.333,1)         0.15297       0.15400
P(15 < x < 17.5)      =NORMDIST(17.5,10.324,2.333,1) - NORMDIST(15,10.324,2.333,1)         0.02147       0.02117
P(x > 17.5)           =1-NORMDIST(17.5,10.324,2.333,1)                                     0.00105       0.00103
Total                                                                                      1             1

Exercise 10.18
(a) File: newspaper sections.xlsx
Two-way Pivot table of Gender by Newspaper Section Most Preferred to Read.
Gender        Data     Sport    Social   Business   Grand Total
Female        Count    14       28       18         60
              Row %    23.3%    46.7%    30.0%      100%
Male          Count    41       35       44         120
              Row %    34.2%    29.2%    36.7%      100%
Total Count            55       63       62         180
Total Row %            30.6%    35.0%    34.4%      100%

Interpretation (by inspection)
Females tend to read the Social section most, with the Sport section read the least. Males, by contrast, are marginally more interested in the Sport and Business sections. These observational conclusions are, however, not overwhelmingly conclusive.

(b) Stacked Bar Chart
[Figure: Newspaper Section Read by Gender; stacked bars for Female and Male showing the row percentages for Sport, Social and Business from the table in (a)]

(c) Formulate as a test for independence of association between gender and section read.
H0: There is no association between gender and the newspaper section most preferred.
H1: There is an association between gender and the newspaper section most preferred.

Region of Rejection
Use α = 0.10 with degrees of freedom = (2-1)(3-1) = 2
χ²-crit = χ²(0.10)(2) = 4.605
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 4.605

χ²-stat
Observed frequencies (fo)
          Sport    Social   Business   Total
female    14       28       18         60
male      41       35       44         120
Total     55       63       62         180

Expected frequencies (fe)
          Sport    Social   Business   Total
female    18.3     21.0     20.7       60
male      36.7     42.0     41.3       120
Total     55       63       62         180

Chi-Squared components
          Sport    Social   Business
female    1.0242   2.3333   0.3441
male      0.5121   1.1667   0.1720
χ²-stat = 5.5525

Conclusion
Since χ²-stat = 5.5525 > χ²-crit = 4.605 (marginal), there is sufficient sample evidence at the 10% significance level to reject H0 in favour of H1. Therefore conclude that gender and the newspaper section most preferred are statistically associated. The nature of the relationship is described in (a) above.

(d) Let πi = proportion of females who most prefer each newspaper section i.
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
The hypothesis test procedure is identical to (c) above.
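The expected frequencies and the χ²-stat in (c) can be checked with a short script. This is a minimal sketch in plain Python rather than Excel; the function name `chi_squared_independence` is hypothetical, and the critical value is simply the 4.605 quoted in the solution, not computed.

```python
# Observed counts from the Gender x Section pivot table in Exercise 10.18.
observed = {
    "female": {"Sport": 14, "Social": 28, "Business": 18},
    "male": {"Sport": 41, "Social": 35, "Business": 44},
}


def chi_squared_independence(table):
    """Return (chi2_stat, degrees_of_freedom, expected) for a two-way table."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())

    # fe = (row total x column total) / grand total
    expected = {
        r: {c: row_totals[r] * col_totals[c] / grand_total for c in cols}
        for r in rows
    }
    # chi2-stat = sum over all cells of (fo - fe)^2 / fe
    chi2 = sum(
        (table[r][c] - expected[r][c]) ** 2 / expected[r][c]
        for r in rows
        for c in cols
    )
    df = (len(rows) - 1) * (len(cols) - 1)
    return chi2, df, expected


chi2_stat, df, expected = chi_squared_independence(observed)
CHI2_CRIT = 4.605  # value quoted in the solution for alpha = 0.10, df = 2

print(f"chi2-stat = {chi2_stat:.4f}, df = {df}")
print("Reject H0" if chi2_stat >= CHI2_CRIT else "Do not reject H0")
```

The same function applies to any two-way table of counts, so it could also be pointed at the milk type by health-conscious status table in Exercise 10.20.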
The sample proportions being compared are:

Sport      0.255
Social     0.444
Business   0.290

Conclusion
Since H0 is rejected in favour of H1 at the 10% significance level, it can be concluded that there is at least one newspaper section that females prefer differently to the other sections. Based on the row percentages table in (a) above, females tend to read the Social section most, with the Sport section read the least.

Exercise 10.19
(a) File: X10.19 - vehicle financing.xlsx
One-way Pivot table of Car Loan Sizes

Loan size      Count   %
Under 100      18      6%
100 - <150     58      19%
150 - <200     110     37%
200 - <250     70      23%
Above 250      44      15%
Grand Total    300     100%

Interpretation (by inspection)
The most popular car loan size was between R150 000 and R200 000 (37% of all applications), followed by car loan sizes of between R200 000 and R250 000 (23%). Only 6% of loan applications were for amounts below R100 000.

(b) Bar Chart of Car Loan Applications
[Figure: Bar chart of the proportions 0.06, 0.19, 0.37, 0.23, 0.15 for Under 100, 100 - <150, 150 - <200, 200 - <250, Above 250; x-axis: Size of Loan Applications (R 1000)]

(c) Test for Goodness-of-Fit for an empirical distribution.
H0: There is no change in the size of car loan applications from four years ago.
H1: There is a significant shift in the size of car loan applications from four years ago.

(d) Region of Rejection
Use α = 0.05 with degrees of freedom = (5-1) = 4
χ²-crit = χ²(0.05)(4) = 9.488
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 9.488

χ²-stat
Loan Size       fo     % fe   fe     χ²-stat
< R100          18     10     30     4.800
R100 - R150     58     20     60     0.067
R150 - R200     110    40     120    0.833
R200 - R250     70     20     60     1.667
> R250          44     10     30     6.533
Total           300    100    300    13.900

Conclusion
Since χ²-stat = 13.9 > χ²-crit = 9.488, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Therefore conclude that there has been a significant shift in the size of car loan applications from 4 years ago. The shift has been towards larger car loan applications.

Exercise 10.20
(a) File: X10.20 - milk products.xlsx
Two-way Pivot table of milk type purchased and health-conscious status of consumers

                          Question 2
Question 1     Data       Yes      No       Grand Total
Fat-free       Count      20       5        25
               Percent    43.5%    16.7%    32.9%
Low fat        Count      15       10       25
               Percent    32.6%    33.3%    32.9%
Full cream     Count      11       15       26
               Percent    23.9%    50.0%    34.2%
Total Count               46       30       76
Total Percent             100%     100%     100%

Interpretation (by inspection)
The sample evidence points strongly towards health-conscious consumers purchasing more fat-free dairy products, while non-health-conscious consumers tend to purchase more full cream dairy products.

(b) Stacked Bar Chart
[Figure: Milk Type Purchased and Health-Conscious Status; x-axis: Milk Categories (Fat-free, Low fat, Full cream); y-axis: % of Consumers; series: Yes, No, using the percentages from the table in (a)]

(c) Formulate as a test for independence of association between milk type purchased and health-conscious status of consumers.
H0: There is no association between milk type purchased and health-conscious status.
H1: There is an association between milk type purchased and health-conscious status.

Region of Rejection
Use α = 0.05 with degrees of freedom = (3-1)(2-1) = 2
χ²-crit = χ²(0.05)(2) = 5.991
Decision rule
Reject H0 in favour of H1 if χ²-stat ≥ 5.991

χ²-stat
Observed frequencies (fo)
              Yes      No       Total
fat-free      20       5        25
low fat       15       10       25
full cream    11       15       26
Total         46       30       76

Expected frequencies (fe)
              Yes      No       Total
fat-free      15.13    9.87     25
low fat       15.13    9.87     25
full cream    15.74    10.26    26
Total         46       30       76

Chi-Squared components
              Yes      No
fat-free      1.566    2.402
low fat       0.001    0.002
full cream    1.426    2.186
χ²-stat = 7.583

Conclusion
Since χ²-stat = 7.583 > χ²-crit = 5.991, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Therefore conclude that there is a significant statistical association between the health-conscious status of a consumer and their preference for certain types of dairy milk products. The nature of the relationship is described in (a) above.

(d) Let πi = proportion of health-conscious consumers who prefer each milk type i.
H0: π1 = π2 = π3
H1: At least one πi is different (i = 1,2,3)
The hypothesis test procedure is identical to (c) above.
The sample proportions being compared are:

fat-free      0.800
low-fat       0.600
full cream    0.423

Conclusion
Since H0 is rejected in favour of H1 at the 5% significance level, it can be concluded that there is at least one milk type that is purchased by a different proportion of health-conscious consumers. Based on the row percentages table in (a) above, health-conscious consumers tend to purchase more fat-free dairy products than full cream products.

CHAPTER 11 ANALYSIS OF VARIANCE: COMPARING MEANS ACROSS MULTIPLE POPULATIONS

Exercise 11.1
The purpose of one-factor Anova is to test for equality of means across multiple (more than two) populations.

Exercise 11.2
Example: Compare the output performance of five identical machines.

Exercise 11.3
Variation between groups measures how similar or how different (i.e. how close or how far apart) the sample means are from each other. It is a measure of the level of influence of the treatment factor on the response measure. Any differences can be attributed to (or explained by) the influence of the treatment factor on the numeric response measure.

Exercise 11.4
SST = 25.5                        Numerator degrees of freedom = (4 - 1) = 3
SSE = 204.6 - 25.5 = 179.1        Denominator degrees of freedom = (4x10 - 4) = 36
MST = 25.5 / 3 = 8.5
MSE = 179.1 / 36 = 4.975
F-stat = 8.5 / 4.975 = 1.7085

Exercise 11.5
F-crit = F(0.05, 3, 36) = 2.866
In Excel, use =FINV(0.05,3,36)

Exercise 11.6
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs
Decision Rule: Do not reject H0 if F-stat ≤ F-crit.
Since F-stat = 1.7085 < F-crit = 2.866, do not reject H0.
Conclusion: The population means are likely to be equal.

Exercise 11.7
(a) File: X11.7 - car fuel efficiency.xlsx
Sample Average Fuel Consumption (l/100km)

            Σxi     ni    x̄i
Peugeot     32.4    5     6.48
VW          29.3    4     7.325
Ford        34.4    5     6.88
Grand mean = 96.1/14 = 6.864 litres / 100 km

[Figure: Bar Chart of Fuel Efficiency Means; x-axis: Car Types; bars: Peugeot 6.48, VW 7.325, Ford 6.88]

(b) One-Factor Anova
Factor = Car Type (1 = Peugeot; 2 = VW; 3 = Ford)
Response measure = Fuel Consumption (l/100km)
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)

Region of Rejection
Use α = 0.05 with df1 = 3-1 = 2 and df2 = 14-3 = 11
F-crit = F(0.05)(2,11) = 3.98
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 3.98

F-stat
SSW = (7-6.48)²+(6.3-6.48)²+(6-6.48)²+(6.4-6.48)²+(6.7-6.48)²+
      (6.8-7.325)²+(7.4-7.325)²+(7.9-7.325)²+(7.2-7.325)²+
      (7.6-6.88)²+(6.8-6.88)²+(6.4-6.88)²+(7-6.88)²+(6.6-6.88)² = 2.0635
SSB = 5(6.48-6.864)²+4(7.325-6.864)²+5(6.88-6.864)² = 1.5886
SST = 2.0635 + 1.5886 = 3.6521

Anova Table
Source of Variation   SS        df    MS         F-stat
Between               1.5886    2     0.79432    4.234
Within                2.0635    11    0.18759
Total                 3.6521    13

Conclusion
Since F-stat = 4.234 > F-crit = 3.98, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that there is at least one motor vehicle type with a different average fuel consumption to the rest. By inspection, it would appear that VW has an average fuel consumption that is significantly different (higher, and hence least fuel efficient) from Peugeot and Ford.
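The sums of squares in Exercise 11.7 (b) can be reproduced directly from the raw readings. The sketch below is an illustration in plain Python, not the manual's Excel workflow; the function name `one_factor_anova` is hypothetical. The readings are read off the SSW expansion, and the third Ford reading is assumed to be 6.4, which is the value consistent with the stated Ford total of 34.4 and SSW of 2.0635. Note that each squared deviation in SSB is weighted by its sample size ni.

```python
# Fuel consumption readings (l/100km) read off the SSW expansion in Exercise 11.7.
samples = {
    "Peugeot": [7.0, 6.3, 6.0, 6.4, 6.7],
    "VW": [6.8, 7.4, 7.9, 7.2],
    "Ford": [7.6, 6.8, 6.4, 7.0, 6.6],  # third value 6.4 is an assumption (see lead-in)
}


def one_factor_anova(groups):
    """Return SSB, SSW, mean squares, F-stat and degrees of freedom."""
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    ssb = 0.0  # between-groups: n_i * (group mean - grand mean)^2
    ssw = 0.0  # within-groups: (x - group mean)^2
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ssb += len(values) * (group_mean - grand_mean) ** 2
        ssw += sum((x - group_mean) ** 2 for x in values)

    df_between, df_within = k - 1, n_total - k
    msb = ssb / df_between
    msw = ssw / df_within
    return {
        "SSB": ssb, "SSW": ssw, "MSB": msb, "MSW": msw,
        "F": msb / msw, "df": (df_between, df_within),
    }


result = one_factor_anova(samples)
F_CRIT = 3.98  # value quoted in the solution for alpha = 0.05, df = (2, 11)

print({key: round(val, 4) if isinstance(val, float) else val
       for key, val in result.items()})
print("Reject H0" if result["F"] >= F_CRIT else "Do not reject H0")
```

Replacing `F_CRIT` with 7.21 gives the 1% decision used in part (c).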
(c) New Region of Rejection
Use α = 0.01 with df1 = 3-1 = 2 and df2 = 14-3 = 11
F-crit = F(0.01)(2,11) = 7.21
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 7.21
New decision
Do not reject H0 at the 1% significance level, since F-stat = 4.234 < F-crit = 7.21
New conclusion
There is no statistical evidence, at the 1% significance level, to conclude that average fuel consumption differs across motor vehicle types.
Note: The sample evidence must be more convincing (i.e. larger differences between sample means) before one is prepared to reject the null hypothesis in favour of the alternative hypothesis. At the 1% level of significance, the sample evidence is not yet strong enough (the differences are not large enough) to reject the null hypothesis of equal means.

Exercise 11.8
(a) File: X11.8 - package design.xlsx
Assumption 1: Equal population variances.
Assumption 2: A normally distributed population for the response variable.

One-Factor Anova
Factor = Package designs (1 = A; 2 = B; 3 = C)
Response measure = Carton sales (units)
Let μi = population mean sales of a breakfast cereal packaged in design shape i
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)

Region of Rejection
Use α = 0.05 with df1 = 3-1 = 2 and df2 = 21-3 = 18
F-crit = F(0.05)(2,18) = 3.55
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 3.55

F-stat (Refer to formulae in Chapter 11)
Sample evidence
Sample means:  Design A 35.75   Design B 32.86   Design C 34.17
Grand mean = 721/21 = 34.33

SSW = (35-35.75)²+(37-35.75)²+(39-35.75)²+(36-35.75)²+(30-35.75)²+(39-35.75)²+(36-35.75)²+(34-35.75)²+
      (35-32.86)²+(34-32.86)²+(30-32.86)²+(31-32.86)²+(34-32.86)²+(32-32.86)²+(34-32.86)²+
      (38-34.17)²+(34-34.17)²+(32-34.17)²+(34-34.17)²+(34-34.17)²+(33-34.17)² = 101.1905
SSB = 8(35.75-34.33)²+7(32.86-34.33)²+6(34.17-34.33)² = 31.4762
SST = 101.1905 + 31.4762 = 132.6667

Anova Table
Source of Variation   SS          df    MS         F-stat
Between               31.4762     2     15.7381    2.7995
Within                101.1905    18    5.6217
Total                 132.6667    20

Conclusion
Since
F-stat = 2.7995 < F-crit = 3.55, there is insufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that there is no difference in the mean volume of sales across the 3 package designs.

(b) Recommendation
There is no strong statistical evidence to conclude that sales volumes differ across the three package designs. All are likely to generate the same average sales. Therefore the cereal producer can choose any of the three package designs for their new muesli cereal.

Exercise 11.9
(a) File: X11.9 - bank service.xlsx
One-Factor Anova
Factor = Bank (1 = X; 2 = Y; 3 = Z)
Response measure = Rating score (1 to 10)
Let μi = population mean service rating score for bank i
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)

Region of Rejection
Use α = 0.10 with df1 = 3-1 = 2 and df2 = 27-3 = 24
F-crit = F(0.10)(2,24) = 2.538   (=FINV(0.1,2,24))
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 2.538

F-stat (Refer to the formulae in Chapter 11)
Sample evidence
Sample means:  Bank X 6.875   Bank Y 5.778   Bank Z 6.2
Grand mean = 169/27 = 6.259

SSW = (8-6.875)²+(6-6.875)²+...+(5-5.778)²+(6-5.778)²+…+(8-6.2)²+(7-6.2)²+…+(6-6.2)² = 24.0306
SSB = 8(6.875-6.259)²+9(5.778-6.259)²+10(6.2-6.259)² = 5.1546
SST = 24.0306 + 5.1546 = 29.1852

Anova Table
Source of Variation   SS         df    MS        F-stat
Between               5.1546     2     2.5773    2.574
Within                24.0306    24    1.0013
Total                 29.1852    26

Conclusion
Since F-stat = 2.574 > F-crit = 2.538, there is sufficient sample evidence at the 10% significance level to reject H0 in favour of H1. Therefore conclude that there is at least one bank that has a different mean service rating score to the other banks. By inspection, it would appear that Bank X has a significantly higher mean service rating score than the other two banks.
(b) New Region of Rejection
Use α = 0.05 with df1 = 3-1 = 2 and df2 = 27-3 = 24
F-crit = F(0.05)(2,24) = 3.40
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 3.40
New decision
Do not reject H0 at the 5% significance level, since F-stat = 2.574 < F-crit = 3.40
New conclusion
There is no statistical evidence, at the 5% significance level, to conclude that the mean service ratings are different across the three banks. The three banks are perceived similarly by customers in terms of their service levels.
Note: The reason for the change in conclusion between (a) and (b) is that the statistical evidence is only weak (i.e. small differences in sample means). It is sufficient to reject the null hypothesis at the 10% significance level, while at the 5% significance level it is not seen as strong enough (i.e. differences are not large enough) to reject the null hypothesis.

Exercise 11.10
File: X11.10 - shelf height.xlsx
One-Factor Anova
Factor = Shelf Height (1 = Bottom; 2 = Waist; 3 = Shoulder; 4 = Top)
Response measure = Sales volume (units sold)
Let μi = population mean sales of a drinking chocolate product displayed at shelf height i
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs (i = 1, 2, 3, 4)

Region of Rejection
Use α = 0.05 with df1 = 4-1 = 3 and df2 = 30-4 = 26
F-crit = F(0.05)(3,26) = 2.990
Decision rule
Reject H0 in favour of H1 if F-stat ≥ 2.990

F-stat (Refer to the formulae in Chapter 11)
Sample evidence
Sample means:  Bottom 76.143   Waist 81.375   Shoulder 82.444   Top 74.833
Grand mean = 2375/30 = 79.167

SSW = (78-76.143)²+….+(78-81.375)²+….+(83-82.444)²+….+(69-74.833)²+…+(75-74.833)² = 727.788
SSB = 7(76.143-79.167)²+8(81.375-79.167)²+9(82.444-79.167)²+6(74.833-79.167)² = 312.379
SST = 727.788 + 312.379 = 1040.167

Anova Table
Source of Variation   SS          df    MS         F-stat
Between               312.379     3     104.126    3.720
Within                727.788     26    27.992
Total                 1040.167    29

Conclusion
Since F-stat = 3.72 > F-crit = 2.99, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Therefore conclude that there is at least one shelf height that generates a different mean level of sales to the other shelves. By inspection, it would appear that shoulder and waist high shelves generate higher average sales of the drinking chocolate product than bottom or top shelves.

Exercise 11.11
(a) File: X11.11 - machine evaluation.xlsx
One-Factor Anova
Factor = Labelling Machine (1 = A; 2 = B; 3 = C)
Response measure = Processing time (in minutes)
Let μi = population mean processing time for shaping and labelling machine i
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)

Anova: Single Factor
SUMMARY
Groups       Count   Sum   Average   Variance
Machine A    5       59    11.8      3.7
Machine B    5       56    11.2      5.7
Machine C    5       72    14.4      3.8

ANOVA
Source of Variation   SS        df    MS        F-stat   P-value   F crit
Between Groups        28.933    2     14.467    3.288    0.0727    2.8068
Within Groups         52.8      12    4.4
Total                 81.733    14

Conclusion
Since F-stat = 3.288 > F-crit = 2.8068, there is sufficient sample evidence at the 10% significance level to reject H0 in favour of H1. Therefore conclude that there is at least one machine that has a different mean processing time. By inspection, machine C has a significantly longer mean processing time than either machine A or machine B. Machine C should not be considered for purchase.

(b) Two-Sample Test of Mean Processing Times between Machines A (1) and B (2)
H0: μ1 = μ2 (two-tailed test)
H1: μ1 ≠ μ2

t-Test: Two-Sample Assuming Equal Variances
                                Machine A   Machine B
Mean                            11.8        11.2
Variance                        3.7         5.7
Observations                    5           5
Pooled Variance                 4.7
Hypothesized Mean Difference    0
df                              8
t Stat                          0.4376
P(T<=t) one-tail                0.3366
t Critical one-tail             1.3968
P(T<=t) two-tail                0.6733
t Critical two-tail             1.8595

t-crit (0.10, 8) = 1.8595   (=TINV(0.1,8))

Conclusion
Since t-stat = 0.4376 lies within the region of non-rejection of H0 (i.e. within ± 1.8595), the sample evidence is not strong enough to reject H0 in favour of H1. The population mean processing times between the two machines A and B are likely to be identical.
Thus the company can purchase either machine A or machine B.

(c) Recommendation
Based on the statistical evidence in (a) and (b), the company can purchase either machine A or machine B - they are likely to be equally efficient in operation.

Exercise 11.12
(a) File: X11.12 - earnings yield.xlsx
One-Factor Anova
Factor = Economic Sector (1 = Financial; 2 = Retail; 3 = Industrial; 4 = Mining)
Response measure = Earnings yield (%)
Let μi = population mean earnings yield per economic sector i
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs (i = 1, 2, 3, 4)

(b) Anova: Single Factor
SUMMARY
Groups        Count   Sum      Average   Variance
Industrial    20      96.1     4.81      1.091
Retail        20      94.4     4.72      1.492
Financial     20      113.2    5.66      1.588
Mining        20      111      5.55      2.067

ANOVA
Source of Variation   SS          df    MS        F-stat    P-value   F crit
Between Groups        14.3894     3     4.7965    3.0757    0.0326    2.7249
Within Groups         118.5195    76    1.5595
Total                 132.9089    79

Conclusion
Since F-stat = 3.076 > F-crit = 2.725, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that there is at least one sector with a different mean earnings yield relative to the other sectors.

(c) Interpretation
By inspection, the Industrial and Retail sectors each appear to have significantly lower mean earnings yields relative to the Financial and Mining sectors' mean earnings yields.
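When only Excel's SUMMARY block is to hand (count, sum and variance per group), the Anova table can still be rebuilt, because SSB depends only on the group means and sizes and SSW equals the sum of (ni - 1) times each sample variance. The following is a rough sketch in plain Python using the Exercise 11.12 summary figures; the helper name `anova_from_summary` is hypothetical. Small differences from the printed table are expected because the variances are quoted to three decimals.

```python
# (group, count, sum, sample variance) taken from the SUMMARY block of Exercise 11.12.
summary = [
    ("Industrial", 20, 96.1, 1.091),
    ("Retail", 20, 94.4, 1.492),
    ("Financial", 20, 113.2, 1.588),
    ("Mining", 20, 111.0, 2.067),
]


def anova_from_summary(groups):
    """Rebuild a one-factor Anova table from per-group count, sum and variance."""
    n_total = sum(n for _, n, _, _ in groups)
    grand_mean = sum(total for _, _, total, _ in groups) / n_total
    k = len(groups)

    # SSB: sample-size-weighted squared deviations of group means from the grand mean.
    ssb = sum(n * (total / n - grand_mean) ** 2 for _, n, total, _ in groups)
    # SSW: each sample variance multiplied back by its degrees of freedom.
    ssw = sum((n - 1) * var for _, n, _, var in groups)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return {"SSB": ssb, "SSW": ssw, "MSB": msb, "MSW": msw, "F": msb / msw}


table = anova_from_summary(summary)
F_CRIT = 2.7249  # F crit reported in the Excel output (alpha = 0.05, df = (3, 76))

for name, value in table.items():
    print(f"{name}: {value:.4f}")
print("Reject H0" if table["F"] >= F_CRIT else "Do not reject H0")
```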
Exercise 11.13
(a) File: X11.13 - advertising strategy.xlsx
One-Factor Anova
Factor = Advertising strategy (1 = Sophisticated; 2 = Athletic; 3 = Trendy)
Response measure = Sales (units of cans)
Let μi = population mean level of sales achieved under each advertising strategy i
H0: μ1 = μ2 = μ3
H1: At least one μi differs (i = 1, 2, 3)

(b) Anova: Single Factor
SUMMARY
Groups           Count   Sum     Average   Variance
Sophisticated    20      8380    419       5075.579
Athletic         20      7241    362.05    6201.313
Trendy           20      8030    401.5     3781.737

ANOVA
Source of Variation   SS           df    MS          F-stat   P-value   F-crit
Between Groups        34039.03     2     17019.52    3.391    0.0406    3.159
Within Groups         286113.95    57    5019.54
Total                 320152.98    59

Conclusion
Since F-stat = 3.391 > F-crit = 3.159, there is sufficient sample evidence at the 5% level of significance to reject H0 in favour of H1. Therefore conclude that there is at least one advertising strategy that results in a different mean level of deodorant sales relative to the other strategies.

(c) Interpretation and recommendation
By inspection, the Athletic advertising strategy resulted in the lowest mean sales of all three advertising strategies. On average, the Sophisticated and Trendy strategies appear to be equally effective (the difference in sample means does not appear significant).
Recommendation: Either the Trendy or the Sophisticated strategy can be adopted.

(d) Two-Sample Test of Means between the Sophisticated (1) and Trendy (2) Strategies
H0: μ1 = μ2 (two-tailed test)
H1: μ1 ≠ μ2

t-Test: Two-Sample Assuming Equal Variances
                                Sophisticated   Trendy
Mean                            419             401.5
Variance                        5075.579        3781.737
Observations                    20              20
Pooled Variance                 4428.6579
Hypothesized Mean Difference    0
df                              38
t Stat                          0.8316
P(T<=t) one-tail                0.2054
t Critical one-tail             1.6860
P(T<=t) two-tail                0.4108
t Critical two-tail             2.0244

t-crit (0.05, 38) = 2.0244   (=TINV(0.05,38))

Conclusion
Since t-stat = 0.8316 lies within the region of non-rejection of H0 (i.e. within ± 2.0244), there is insufficient sample evidence to reject H0 in favour of H1.
Therefore the population mean sales from each of the two strategies are likely to be identical. The two strategies are therefore equally effective and either can be adopted by the company.

(e) Recommendation
The Athletic strategy can be discarded. It appears the least effective. The remaining two strategies (Sophisticated and Trendy) are equally effective and therefore either can be adopted by the company to promote its new ladies deodorant.

Exercise 11.14
(a) File: X11.14 - leverage ratio.xlsx
Descriptive Statistics
                            Technology   Construction   Banking    Manufacturing
Mean                        73.83        78.07          69.73      76.37
Standard Error              1.697        1.912          2.527      2.347
Median                      72.5         78.5           68         79.5
Mode                        60           80             68         81
Standard Deviation          9.296        10.471         13.841     12.853
Sample Variance             86.420       109.651        191.582    165.206
Kurtosis                    -1.309       -0.546         -0.453     0.495
Skewness                    0.010        0.051          0.103      -0.857
Range                       28           40             58         50
Minimum                     60           58             41         44
Maximum                     88           98             99         94
Sum                         2215         2342           2092       2291
Count                       30           30             30         30
Confidence Level (95.0%)    3.471        3.910          5.168      4.799

Interpretation
The mean leverage ratio is lowest for the banking sector and highest for the construction and the manufacturing sectors. These differences do appear to be significant.

(b) One-Factor Anova
Factor = Economic Sector (1 = Technology; 2 = Construction; 3 = Banking; 4 = Manufacturing)
Response measure = Leverage ratio
Let μi = population mean leverage ratio per economic sector i
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs (i = 1, 2, 3, 4)

Anova: Single Factor
SUMMARY
Groups           Count   Sum     Average   Variance
Technology       30      2215    73.83     86.42
Construction     30      2342    78.07     109.65
Banking          30      2092    69.73     191.58
Manufacturing    30      2291    76.37     165.21

ANOVA
Source of Variation   SS          df     MS         F-stat   P-value   F crit
Between Groups        1181.13     3      393.711    2.849    0.0406    2.683
Within Groups         16032.87    116    138.214
Total                 17214       119

Conclusion
Since F-stat = 2.849 > F-crit = 2.683 (marginally), there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Therefore conclude that there is at least one sector with a different mean leverage ratio relative to the other sectors. By inspection, the banking sector has the lowest mean leverage ratio, while construction and manufacturing appear to have similarly high mean leverage ratios.

Recommendation
The investor is advised to consider either the banking sector (with the lowest mean leverage ratio) or the technology sector (with a marginally higher mean leverage ratio). (This difference may not be statistically significant.)

(c) Two-Sample Test of Means - Leverage ratios between Technology (1) and Banking (2) sectors.
H0: μ1 = μ2 (two-tailed test)
H1: μ1 ≠ μ2

t-Test: Two-Sample Assuming Equal Variances
                                Technology   Banking
Mean                            73.833       69.733
Variance                        86.420       191.582
Observations                    30           30
Pooled Variance                 139.001
Hypothesized Mean Difference    0
df                              58
t Stat                          1.347
P(T<=t) one-tail                0.092
t Critical one-tail             1.672
P(T<=t) two-tail                0.183
t Critical two-tail             2.002

t-crit (0.05, 58) = 2.002   (=TINV(0.05,58))

Conclusion
Since t-stat = 1.347 lies within the region of non-rejection of H0 (i.e. within ±2.002), the sample evidence is not strong enough to reject H0 in favour of H1. The population mean leverage ratios of the Technology and the Banking sectors are therefore likely to be equal.

(d) Recommendation
Based on the statistical evidence in (b) and (c), an investor is advised to consider either the Technology sector or the Banking sector. Their mean leverage ratios are likely to be equal. Since both sectors offer an investor the same lower risk, either or both can be chosen for investment.
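The pooled-variance t-stat reported by Excel's "t-Test: Two-Sample Assuming Equal Variances" can be recomputed from the means, variances and sample sizes alone. Below is a minimal sketch in plain Python for the Technology versus Banking comparison; the function name `pooled_t_test` is hypothetical, and the critical value is the 2.002 quoted in the solution rather than something the script derives.

```python
import math


def pooled_t_test(mean1, var1, n1, mean2, var2, n2):
    """Return (t_stat, df, pooled_variance) for an equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled variance: degrees-of-freedom-weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df, pooled_var


# Technology (1) versus Banking (2), figures from the Exercise 11.14 output.
t_stat, df, pooled_var = pooled_t_test(73.833, 86.420, 30, 69.733, 191.582, 30)
T_CRIT = 2.002  # two-tailed critical value quoted in the solution (alpha = 0.05, df = 58)

print(f"pooled variance = {pooled_var:.3f}, df = {df}, t-stat = {t_stat:.3f}")
print("Reject H0" if abs(t_stat) > T_CRIT else "Do not reject H0")
```

Calling `pooled_t_test(9, 0.10666667, 16, 9.2, 0.105454545, 12)` gives the unequal-sample-size case of Exercise 11.15 (c) in the same way.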
Exercise 11.15
(a) File: X11.15 - training methods.xlsx
Descriptive Statistics
                            On-the-Job   Lecture   Role Play   Audio-Visual
Mean                        9            8.75      9.2         8.85
Standard Error              0.082        0.112     0.094       0.114
Median                      9            8.65      9.2         8.9
Mode                        9            8.5       9.6         8.9
Standard Deviation          0.327        0.420     0.325       0.426
Sample Variance             0.107        0.177     0.105       0.181
Kurtosis                    0.109        0.339     -1.420      3.419
Skewness                    0.472        0.759     0.229       -1.522
Range                       1.2          1.5       0.900       1.7
Minimum                     8.5          8.2       8.8         7.7
Maximum                     9.7          9.7       9.7         9.4
Sum                         144          122.5     110.4       123.9
Count                       16           14        12          14
Confidence Level (95.0%)    0.174        0.243     0.206       0.246

Interpretation
The differences in mean performance scores appear marginal across the four different training methods, with on-the-job training and role-play having the highest average scores.

(b) One-Factor Anova
Factor = Training Method (1 = On-the-Job; 2 = Lecture; 3 = Role Play; 4 = Audio-Visual)
Response measure = Performance score (1 - 10)
Let μi = population mean performance score per training method i
H0: μ1 = μ2 = μ3 = μ4
H1: At least one μi differs (i = 1, 2, 3, 4)

Anova: Single Factor
SUMMARY
Groups          Count   Sum      Average   Variance
On-the-Job      16      144      9         0.107
Lecture         14      122.5    8.75      0.177
Role Play       12      110.4    9.2       0.105
Audio-Visual    14      123.9    8.85      0.181

ANOVA
Source of Variation   SS       df    MS        F-stat   P-value   F-crit
Between Groups        1.487    3     0.4957    3.479    0.0223    2.7826
Within Groups         7.41     52    0.1425
Total                 8.897    55

Conclusion
Since F-stat = 3.479 > F-crit = 2.7826, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Therefore conclude that there is at least one training method with a different mean performance score relative to the other training methods. By inspection, the lecture and audio-visual methods are the least effective (lower mean scores), while the on-the-job and role-play methods are more effective (with higher mean scores).

Recommendation
The training manager is advised to consider either on-the-job training or role play methods. (The difference does not appear to be statistically significant.)
(c) Two-Sample Test of Means - Performance scores between On-the-Job (1) and Role Play (2) methods.
H0: μ1 = μ2 (two-tailed test)
H1: μ1 ≠ μ2

t-Test: Two-Sample Assuming Equal Variances
                                On-the-Job    Role Play
Mean                            9             9.2
Variance                        0.10666667    0.105454545
Observations                    16            12
Pooled Variance                 0.10615385
Hypothesized Mean Difference    0
df                              26
t Stat                          -1.6074361
P(T<=t) one-tail                0.06001825
t Critical one-tail             1.7056179
P(T<=t) two-tail                0.1200365
t Critical two-tail             2.05552942

t-crit (0.05, 26) = 2.0555   (=TINV(0.05,26))

Conclusion
Since t-stat = -1.6074 lies within the region of non-rejection of H0 (i.e. within ±2.0555), the sample evidence is not strong enough to reject H0 in favour of H1. The population mean performance scores of the two training methods are likely to be the same.

(d) Recommendation
Based on the statistical evidence in (b) and (c), the HR manager is advised to select either on-the-job training or role play methods of training. Both are likely to produce similar high mean performance scores.

Exercise 11.16
In one-factor ANOVA, only one categorical factor is used to explain possible differences between the observed sample means. In two-factor ANOVA, two categorical factors are used to explain possible differences between the observed sample means.

Exercise 11.17
Response variable (y):  numeric, ratio-scaled
Factor 1 (x1):          categorical (nominal / ordinal-scaled)
Factor 2 (x2):          categorical (nominal / ordinal-scaled)

Exercise 11.18
An interaction term means that different combinations of the levels from the two factors can have different influences (effects) on the numeric response variable.

Exercise 11.19
Interaction plot: It visually displays the nature and strength of the interaction effect of combinations of factor levels on the dependent response variable. It is constructed from the sample means of the various combinations of the different factor levels.
Exercise 11.20
(a) - (f) Two-factor ANOVA with Interaction Table

Source of Variation    df    SS     MS    F-stat   p-value   F-crit
Factor A: PC O.S.      2     68     34    6.8      0.0025    5.08
Factor B: Laptop       3     42     14    2.8      0.0499    4.22
Interaction (AxB)      6     132    22    4.4      0.0013    3.20
Error (Residual)       48    240    5
Total                  59    482

(g) Factor A is statistically significant (at α = 0.01) since F-stat (6.8) > F-crit (5.08). Alternatively, its p-value (0.0025) < α (0.01).
Factor B is not statistically significant (at α = 0.01) since F-stat (2.8) < F-crit (4.22). Alternatively, its p-value (0.0499) > α (0.01).
The interaction effect between factor A and factor B is statistically significant (at α = 0.01) since F-stat (4.4) > F-crit (3.20). Alternatively, its p-value (0.0013) < α (0.01).

Exercise 11.21
(a) (i) File: X11.21 - sales ability.xlsx
Factor: Qualifications   H0: μ(b) = μ(a) = μ(s)             H1: At least one μi differs (i = b, a, s)
Factor: Experience       H0: μ(under 3) = μ(over 3)         H1: At least one μi differs (i = under 3, over 3)
Factor: Interaction      H0: No interaction effect          H1: There is an interaction effect

Anova: Two-Factor With Replication (Data Analysis - Excel)
SUMMARY            Business    Arts        Social Science   Total
Under 3 years
Count              10          10          10               30
Sum                400.1       376.6       325.9            1102.6
Average            40.01       37.66       32.59            36.7533333
Variance           61.31656    23.07156    14.56766667      40.628092
Over 3 years
Count              10          10          10               30
Sum                446.2       357.6       415.8            1219.6
Average            44.62       35.76       41.58            40.6533333
Variance           28.37956    92.68711    42.13066667      64.626023
Total
Count              20          20          20
Sum                846.3       734.2       741.7
Average            42.315      36.71       37.085
Variance           48.08029    55.78305    48.12555263

ANOVA
Source of Variation              SS         df    MS         F-stat   p-value   F-crit
Sample (Experience)              228.15     1     228.150    5.222    0.0263    4.020
Columns (Qualifications)         392.73     2     196.365    4.494    0.0157    3.168
Interaction (Exper x Qualif)     300.26     2     150.131    3.436    0.0394    3.168
Within                           2359.38    54    43.692
Total                            3280.52    59

(a) (ii) Factor: Qualifications
Since F-stat (4.494) > F-crit (3.168), reject H0 and conclude that qualifications is a statistically
significant factor (α = 0.05). Business graduates generate significantly higher average sales (42.32) than graduates with Arts (36.71) or Social Science (37.09) degrees.

(a) (iii) Factor: Experience
Since F-stat (5.22) > F-crit (4.02), reject H0 and conclude that experience is a statistically significant factor (α = 0.05). Graduates with over 3 years experience generate significantly higher average sales (40.65) than graduates with less than 3 years experience (36.75).

(a) (iv) Factor: Interaction effect
Since F-stat (3.436) > F-crit (3.168), reject H0 and conclude that there is a significant interaction effect between experience and qualifications on graduates' sales performance (α = 0.05). Business graduates with over 3 years experience perform the best (44.62), while social science graduates with less than 3 years experience perform the worst (32.59).

(b) Summary of Sample Means Table
                  Under 3 years   Over 3 years
Business          40.01           44.62
Arts              37.66           35.76
Social Science    32.59           41.58

[Figure: Interaction Plot; x-axis: Business, Arts, Social Science; lines: Under 3 years, Over 3 years]

See (a) (iv) for an interpretation of the interaction plot.

(c) Recommendation to HR manager:
Recruit predominantly business and social science graduates with over 3 years of work experience. If Arts graduates are employed, they must be given intensive marketing training.
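The two-factor Anova table for Exercise 11.21 can be rebuilt from the cell sums and cell variances in the SUMMARY block, which is a useful check on the Excel output. The sketch below is an illustration in plain Python and assumes a balanced design (n = 10 observations per cell, as in this exercise); the variable names are hypothetical and the formulas are the standard sums-of-squares identities for a balanced two-factor layout with replication.

```python
N_PER_CELL = 10  # replications per cell (balanced design, as in Exercise 11.21)

# Cell sums and sample variances from the SUMMARY block.
cell_sums = {
    "Under 3 years": {"Business": 400.1, "Arts": 376.6, "Social Science": 325.9},
    "Over 3 years": {"Business": 446.2, "Arts": 357.6, "Social Science": 415.8},
}
cell_vars = {
    "Under 3 years": {"Business": 61.31656, "Arts": 23.07156, "Social Science": 14.56766667},
    "Over 3 years": {"Business": 28.37956, "Arts": 92.68711, "Social Science": 42.13066667},
}

rows = list(cell_sums)                 # levels of Experience ("Sample" in Excel)
cols = list(cell_sums[rows[0]])        # levels of Qualifications ("Columns")
a, b, n = len(rows), len(cols), N_PER_CELL

grand_total = sum(cell_sums[r][c] for r in rows for c in cols)
correction = grand_total ** 2 / (a * b * n)

row_totals = [sum(cell_sums[r].values()) for r in rows]
col_totals = [sum(cell_sums[r][c] for r in rows) for c in cols]

ss_rows = sum(t ** 2 for t in row_totals) / (b * n) - correction
ss_cols = sum(t ** 2 for t in col_totals) / (a * n) - correction
ss_cells = sum(cell_sums[r][c] ** 2 for r in rows for c in cols) / n - correction
ss_inter = ss_cells - ss_rows - ss_cols
ss_within = (n - 1) * sum(cell_vars[r][c] for r in rows for c in cols)

ms_within = ss_within / (a * b * (n - 1))
f_rows = (ss_rows / (a - 1)) / ms_within
f_cols = (ss_cols / (b - 1)) / ms_within
f_inter = (ss_inter / ((a - 1) * (b - 1))) / ms_within

print(f"Experience:     SS = {ss_rows:.2f}, F = {f_rows:.3f}")
print(f"Qualifications: SS = {ss_cols:.2f}, F = {f_cols:.3f}")
print(f"Interaction:    SS = {ss_inter:.2f}, F = {f_inter:.3f}")
print(f"Within:         SS = {ss_within:.2f}, MS = {ms_within:.3f}")
```

Each F-stat is then compared with the F-crit values in the Excel output (4.020 for Experience, 3.168 for Qualifications and for the interaction).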
Exercise 11.22
(a) (i) File: X11.22 - dropped calls.xlsx
Random variable: % of daily dropped calls per network switch and transmission type
Factor 1 = Switch type (SW1, SW2, SW3, SW4)
Factor 2 = Transmission type (Voice, Data bundles)
Management Question: Is the % of daily dropped calls the same across all network switching devices and / or transmission types?

Factor: Switch type         H0: μ(1) = μ(2) = μ(3) = μ(4)    H1: At least one μi differs (i = 1,2,3,4)
Factor: Transmission type   H0: μ(voice) = μ(data)           H1: At least one μi differs (i = voice, data)
Factor: Interaction         H0: No interaction effect        H1: There is an interaction effect

Anova: Two-Factor With Replication (Data Analysis - Excel)
SUMMARY       SW1         SW2         SW3         SW4         Total
Voice
Count         8           8           8           8           32
Sum           7.76        4.35        4.69        6.39        23.19
Average       0.97        0.54375     0.58625     0.79875     0.724688
Variance      0.098657    0.081884    0.086227    0.071155    0.106645
Data
Count         8           8           8           8           32
Sum           5.83        6.82        8.16        7.1         27.91
Average       0.72875     0.8525      1.02        0.8875      0.872188
Variance      0.047984    0.078707    0.057171    0.067279    0.067818
Total
Count         16          16          16          16
Sum           13.59       11.17       12.85       13.49
Average       0.849375    0.698125    0.803125    0.843125
Variance      0.083953    0.100363    0.11709     0.066703

ANOVA
Source of Variation             SS          df    MS          F-stat      P-value     F-crit
Sample (Transmission)           0.3481      1     0.3481      4.727498    0.033926    4.012973
Columns (Switch)                0.234819    3     0.078273    1.063014    0.372175    2.769431
Interaction (Trans x Switch)    1.050075    3     0.350025    4.753641    0.005055    2.769431
Within                          4.12345     56    0.073633
Total                           5.756444    63

(a) (ii) Factor: Switching device
Since F-stat (1.063) < F-crit (2.769), do not reject H0 and conclude that switch device is not a statistically significant factor (α = 0.05). Hence all switch devices are likely to have the same average dropped call rate.

(a) (iii) Factor: Transmission
Since F-stat (4.73) > F-crit (4.01), reject H0 and conclude that transmission type is a statistically significant factor (α = 0.05). Data transmission leads to a higher average dropped call rate (0.872) than voice transmission (0.725).
(a) (iv) Factor: Interaction effect
Since F-stat (4.75) > F-crit (2.769), reject H0 and conclude that there is a significant interaction effect between switch devices and transmission type (α = 0.05). SW2 and SW3, transmitting voice (0.544 and 0.586), are likely to have the lowest average dropped call rates, while SW1 transmitting voice (0.97) and SW3 transmitting data (1.02) are likely to have the highest average dropped call rates.
(b) Summary of Sample Means Table
          SW1     SW2     SW3     SW4
Voice     0.970   0.544   0.586   0.799
Data      0.729   0.853   1.020   0.888
[Interaction plot: average % dropped calls by switch (SW1 to SW4), with one line for Voice and one for Data.]
See (a) (iv) for an interpretation of the interaction plot.
(c) Recommendation to the chief engineer: For voice transmission, use switching devices 2 and 3; for data transmission, use only switching device 1. Investigate the high % dropped call rates of switching device 3 (for data transmission) and switching device 4 (for both transmission types).
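The F-stat column of the ANOVA table in Exercise 11.22 follows directly from the sums of squares and degrees of freedom. As a rough check, the sketch below recomputes MS = SS/df and F = MS/MS(Within) in Python; the function name two_factor_f_stats is hypothetical, and the F-crit values are simply copied from the Excel output rather than computed.

```python
def two_factor_f_stats(ss, df):
    """Return MS and F-stat for each effect in a two-factor ANOVA with replication.

    ss, df: dictionaries keyed by source of variation; both must contain
    a 'Within' entry, which supplies the error mean square.
    """
    ms_within = ss["Within"] / df["Within"]
    table = {}
    for source in ("Sample", "Columns", "Interaction"):
        ms = ss[source] / df[source]
        table[source] = {"MS": ms, "F": ms / ms_within}
    return table


# Values copied from the ANOVA table of Exercise 11.22.
ss = {"Sample": 0.3481, "Columns": 0.234819, "Interaction": 1.050075, "Within": 4.12345}
df = {"Sample": 1, "Columns": 3, "Interaction": 3, "Within": 56}
f_crit = {"Sample": 4.012973, "Columns": 2.769431, "Interaction": 2.769431}

anova = two_factor_f_stats(ss, df)
decisions = {
    source: "reject H0" if anova[source]["F"] > f_crit[source] else "do not reject H0"
    for source in anova
}

for source, row in anova.items():
    print(f"{source}: MS = {row['MS']:.6f}, F = {row['F']:.4f}, {decisions[source]}")
```

The same function can be applied to the ANOVA table of any other two-factor exercise by substituting its SS, df and F-crit values.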
Exercise 11.23
(a) File: X11.23 - rubber wastage.xlsx
Random variable: % rubber wastage per week
Factor: Machine (TAM)    H0: μ(1) = μ(2) = μ(3)       H1: At least one μi differs (i = 1, 2, 3)
Factor: Tyre             H0: μ(R) = μ(B)              H1: At least one μi differs (i = Radial, Bias)
Factor: Interaction      H0: No interaction effect    H1: There is an interaction effect
Anova: Two-Factor With Replication (Data Analysis - Excel)
SUMMARY      TAM1       TAM2       TAM3       Total
Radial
Count        5          5          5          15
Sum          19.71      33.52      19.44      72.67
Average      3.942      6.704      3.888      4.844667
Variance     3.96097    6.56833    3.44237    5.844455
Bias
Count        5          5          5          15
Sum          13.47      38.6       43.77      95.84
Average      2.694      7.72       8.754      6.389333
Variance     1.35283    9.83965    8.96443    13.26548
Total
Count        10         10         10
Sum          33.18      72.12      63.21
Average      3.318      7.212      6.321
Variance     2.794329   7.579173   12.09134
ANOVA
Source of Variation         SS         df   MS         F-stat     p-value    F-crit
Sample (Tyres)              17.89496    1   17.89496   3.146037   0.088802   4.259677
Columns (TAMs)              83.25042    2   41.62521   7.317951   0.003301   3.402826
Interaction (Tyre x TAM)    47.77433    2   23.88716   4.1995     0.0273     3.402826
Within                      136.5143   24   5.688097
Total                       285.434    29
Statistical and Management Conclusions
Factor: Tyres
Since F-stat (3.15) < F-crit (4.26), do not reject H0 and conclude that tyre type is not a statistically significant factor (α = 0.05). Regardless of TAM used, average rubber wastage is the same across both tyre types produced.
Factor: TAMs
Since F-stat (7.318) > F-crit (3.403), reject H0 and conclude that TAM used is a statistically significant factor (α = 0.05). Regardless of tyre type produced, average rubber wastage is likely to be lowest on TAM1 (3.318), compared to TAM2 (7.212) and TAM3 (6.321).
Factor: Interaction effect
Since F-stat (4.199) > F-crit (3.403), reject H0 and conclude that there is a significant interaction effect between tyre type produced and TAM used (α = 0.05). Average rubber wastage for bias tyres is lowest on TAM1 (2.694) but highest on TAM3 (8.754).
By contrast, average rubber wastage of radial tyres is lowest on TAM3 (3.888) and highest on TAM2 (6.704).
(b) Interaction Plot
           TAM1    TAM2    TAM3
Radial     3.942   6.704   3.888
Bias       2.694   7.720   8.754
[Interaction plot: average % rubber wastage by machine (TAM1, TAM2, TAM3), with one line for Radial and one for Bias.]
See (a) for an interpretation of the interaction plot.
(c) Recommendation to the production manager: TAM1 is the most efficient machine in minimising wastage for both tyre types (purchase more of TAM1). Allocate Radial tyre manufacturing to TAM3 (it has the minimum % wastage of all three machines on Radial tyres). Investigate the high % wastage of TAM3 for Bias tyres and of TAM2 for both tyre types. Possibly replace TAM2 type machines with TAM1 type machines.

CHAPTER 12 SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS
Exercise 12.1
Regression analysis defines the structural relationship between two numeric variables as a mathematical equation (usually a straight line equation). Its purpose is to use the equation for estimation / prediction purposes. Correlation analysis measures the strength of the relationship between the two numeric variables used in the regression equation.
Exercise 12.2
Dependent variable, y
Exercise 12.3
An independent variable, x, is used as a predictor of the dependent variable.
Exercise 12.4
Scatter plot
Exercise 12.5
Method of Least Squares
Exercise 12.6
Strong inverse linear relationship
Exercise 12.7
H0: ρ = 0    H1: ρ ≠ 0
Degrees of freedom = 18 - 2 = 16
t-crit = t(0.05, 16) = ±2.12 (Use =TINV(0.05,16) in Excel)
Decision Rule: Do not reject H0 if -2.12 ≤ t-stat ≤ +2.12
t-stat = (0.42) * √[(18 - 2)/(1 - 0.42²)] = 1.851
Conclusion: Do not reject H0. There is no statistically significant relationship between x and y.
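The t-stat used in Exercise 12.7 to test H0: ρ = 0 can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name correlation_t_stat is hypothetical, and the critical value 2.12 is taken from the solution above rather than computed.

```python
import math


def correlation_t_stat(r, n):
    """t-stat for testing H0: rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r, n = 0.42, 18          # values used in Exercise 12.7
t_crit = 2.12            # t(0.05, 16), two-tailed, as quoted in the solution

t_stat = correlation_t_stat(r, n)
reject_h0 = abs(t_stat) > t_crit

print(f"t-stat = {t_stat:.3f}, reject H0: {reject_h0}")
```

The same function applies to the significance tests in the exercises that follow; only r, n and the critical value change.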
Exercise 12.8
(a) File: X12.8 - training effectiveness.xlsx
[Scatter plot: Training hours vs Productivity; x-axis: hours of training, y-axis: output (units).]
Interpretation: There is a strong positive relationship between hours of training and worker output. The more training received, the higher the output.
(b) n = 10
      Training (x)   Output (y)     x²       xy       y²
          20            40          400      800     1600
          36            70         1296     2520     4900
          20            44          400      880     1936
          38            56         1444     2128     3136
          40            60         1600     2400     3600
          33            48         1089     1584     2304
          32            62         1024     1984     3844
          28            54          784     1512     2916
          40            63         1600     2520     3969
          24            38          576      912     1444
Σ        311           535        10213    17240    29649
Coefficients
b1 = (10*17240 - (311*535))/(10*10213 - (311²)) = 1.112
b0 = (535 - 1.112*311)/10 = 18.917
Thus ŷ = 18.917 + 1.112 x where 20 ≤ x ≤ 40
(c) Correlation coefficient (r) = (17240 - (311*535)/10)/√((10213 - (311²)/10)*(29649 - (535²)/10)) = 0.8072
Coefficient of Determination (r²) = 0.8072² = 0.6516
Variation in training hours can explain 65.16% of the variability in worker output. This is a high level of explained variation. Hence training input is very beneficial to worker output and the training programmes should be continued.
(d) For x = 25: ŷ = 18.917 + 1.112*25 = 46.717 units
A worker with 25 hours of training can be expected to produce 46.72 units of output, on average.

Exercise 12.9
(a) File: X12.9 - capital utilisation.xlsx
[Scatter plot: Earnings Yield vs Inventory Turnover; x-axis: inventory turnover, y-axis: earnings yield.]
Interpretation: There is a strong positive relationship between inventory turnover and earnings yield. As inventory turnover increases, earnings yield also increases.
(b) n = 9
      Inv t/o (x)   E.Y. (y)    x²     xy     y²
          3            10        9     30    100
          5            12       25     60    144
          4             8       16     32     64
          7            13       49     91    169
          6            15       36     90    225
          4            10       16     40    100
          8            16       64    128    256
          6            13       36     78    169
          5            10       25     50    100
Σ        48           107      276    599   1327
Coefficients
b1 = (9*599 - (48*107))/(9*276 - (48²)) = 1.4167
b0 = (107 - 1.4167*48)/9 = 4.333
Thus ŷ = 4.333 + 1.4167 x where 3 ≤ x ≤ 8
(c) Correlation coefficient (r) = (599 - (48*107)/9)/√((276 - (48²)/9)*(1327 - (107²)/9)) = 0.8552
There is a strong positive linear association between inventory turnover and earnings yield. Thus the business analyst's view is supported by the strong sample evidence.
(d) Coefficient of Determination (r²) = 0.8552² = 0.7313, i.e. 73.13%
Inventory turnover (capital utilisation) can explain 73.13% of the variability in a company's earnings yield. This is a high level of explained variation. Hence inventory turnover has been shown to have a significant direct effect on a company's earnings yield. Yes, the regression equation can be used with confidence to estimate earnings yield based on a company's level of inventory turnover.
(e) H0: ρ = 0    H1: ρ ≠ 0
t-crit = t(α=0.05, df = 7) = ±2.364
Area of Acceptance for H0: -2.364 ≤ t-stat ≤ +2.364
t-stat = (0.8552)*√((9-2)/(1-(0.8552²))) = 4.3655
Conclusion: Since t-stat (4.3655) lies outside the area of acceptance for H0, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Conclude that there is a statistically significant, strong positive association between inventory turnover and earnings yield.
(f) Expected earnings yield for x = 6: ŷ = 4.333 + 1.4167(6) = 12.833%
A company with an expected inventory turnover of 6 next year can expect to achieve an earnings yield of 12.833%.

Exercise 12.10
File: X12.10 - loan applications.xlsx
(a) Dependent variable (y) = No. of loan applications received
Independent variable (x) = Interest rate (%)
Interest rate (%) is assumed to influence the number of loan applications received.
(b) Scatter plot
[Scatter plot: Loan Applications Received vs Interest Rate (%); x-axis: interest rate (%), y-axis: no. of loan applications received.]
Interpretation: A moderate to strong negative linear relationship between interest rate and number of loan applications received is observed from the scatter plot.
(c) Correlation coefficient, n = 11
      Int rate % (x)   Applications (y)     x²       y²       xy
          7                 18             49        324     126
          6.5               22             42.25     484     143
          5.5               30             30.25     900     165
          6                 24             36        576     144
          8                 16             64        256     128
          8.5               18             72.25     324     153
          6                 28             36        784     168
          6.5               27             42.25     729     175.5
          7.5               20             56.25     400     150
          8                 17             64        289     136
          6                 21             36        441     126
Σ        75.5              241            528.25    5507    1614.5
r = (11*1614.5 - 75.5*241)/√((11*528.25 - 75.5²)*(11*5507 - 241²)) = -0.8302
Interpretation: This correlation shows a strong negative (inverse) association between interest rates (%) and the number of loan applications received.
(d) H0: ρ = 0    H1: ρ ≠ 0
t-crit = t(α=0.05, df = 9) = ±2.262
Area of Acceptance for H0: -2.262 ≤ t-stat ≤ +2.262
t-stat = (-0.8302)*√((11-2)/(1-(-0.8302)²)) = -4.4677
Conclusion: Since t-stat (-4.4677) lies outside the area of acceptance for H0, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Conclude that there is a strong negative relationship between interest rates (%) and the number of loan applications received monthly.
(e) Regression coefficients
b1 = (11*1614.5 - 75.5*241)/(11*528.25 - 75.5²) = -3.9457
b0 = (241 - (-3.9457)*(75.5))/11 = 48.991
Thus ŷ = 48.991 - 3.9457 x where 5.5 ≤ x ≤ 8.5
(f) Interpretation of b1 coefficient: For a one percentage point increase in the interest rate, about 3.95 fewer loan applications can be expected, on average.
(g) For x = 6: ŷ = 48.991 - 3.9457*(6) = 25.32 applications
If the rate of interest is 6%, the bank can expect to receive 25.32 (say, 25) applications.

Exercise 12.11
File: X12.11 - maintenance costs.xlsx
(a) Dependent variable (y) = Annual maintenance cost (Rand)
Independent variable (x) = Machine age (years)
Machine age is assumed to influence annual maintenance costs.
(b) Scatter plot
[Scatter plot: Annual Maintenance Cost (R) vs Age of Machines; x-axis: machine age (years), y-axis: annual maintenance cost (R).]
Interpretation: A strong direct (positive) linear relationship between the ages of machines and their annual maintenance costs (in Rands) is observed in the scatter plot.
(c) Correlation coefficient, n = 12
      Age (yrs) (x)   Annual Cost (R) (y)    x²      y²      xy
          4                45               16     2025    180
          3                20                9      400     60
          3                38                9     1444    114
          8                65               64     4225    520
          6                58               36     3364    348
          7                50               49     2500    350
          1                16                1      256     16
          1                22                1      484     22
          5                38               25     1444    190
          2                26                4      676     52
          4                30               16      900    120
          6                35               36     1225    210
Σ        50               443              266    18943   2182
r = (12*2182 - 50*443)/√((12*266 - 50²)*(12*18943 - 443²)) = 0.870028
Interpretation: There is a strong positive association between the ages of machines and their annual maintenance costs (in Rands).
(d) H0: ρ = 0    H1: ρ ≠ 0
t-crit = t(α=0.05, df = 10) = ±2.228
Area of Acceptance for H0: -2.228 ≤ t-stat ≤ +2.228
t-stat = (0.870028)*√((12-2)/(1-0.870028²)) = 5.5806
Conclusion: Since t-stat (5.5806) lies outside the area of acceptance for H0, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Conclude that there is a strong direct association between the age of machines and the level of annual maintenance costs (in Rands).
(e) Regression coefficients
b1 = (12*2182 - 50*443)/(12*266 - 50²) = 5.8295
b0 = (443 - 5.8295*50)/12 = 12.627
Thus ŷ = 12.627 + 5.8295 x where 1 ≤ x ≤ 8
(f) Interpretation of b1 coefficient: If the age of a machine increases by 1 year, annual maintenance costs are expected to rise by R5.8295. (Alternatively, for every year older, the annual maintenance costs increase by R5.8295.)
(g) For x = 5: ŷ = 12.627 + 5.8295*(5) = R41.77
For a 5 year old machine, annual maintenance costs are expected to be R41.77.
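The hand calculations in Exercise 12.11 (sums of x, y, x², y² and xy, followed by b1, b0 and r) can be checked directly from the raw data. The sketch below is a minimal Python illustration using the same computational formulas as the solution; the function name simple_regression is hypothetical.

```python
import math


def simple_regression(x, y):
    """Least-squares slope, intercept and correlation from raw data.

    Uses the same 'sum of products' formulas as the worked solutions.
    """
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    sum_xy = sum(a * b for a, b in zip(x, y))

    numerator = n * sum_xy - sum_x * sum_y
    b1 = numerator / (n * sum_xx - sum_x ** 2)
    b0 = (sum_y - b1 * sum_x) / n
    r = numerator / math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return b0, b1, r


# Data from Exercise 12.11: machine age (years) and annual maintenance cost (R).
age = [4, 3, 3, 8, 6, 7, 1, 1, 5, 2, 4, 6]
cost = [45, 20, 38, 65, 58, 50, 16, 22, 38, 26, 30, 35]

b0, b1, r = simple_regression(age, cost)
y_hat_5 = b0 + b1 * 5  # expected cost for a 5 year old machine (within 1 <= x <= 8)

print(f"b0 = {b0:.3f}, b1 = {b1:.4f}, r = {r:.6f}, y_hat(5) = {y_hat_5:.2f}")
```

Passing the x and y columns of Exercises 12.8 to 12.13 to the same function gives a quick check of their coefficients as well.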
Exercise 12.12
(a) File: X12.12 - employee performance.xlsx
[Scatter plot: Performance Rating vs Aptitude Score; x-axis: aptitude score (1 - 10), y-axis: performance rating (0 - 100).]
Interpretation: There appears to be a weak to moderate positive association between an employee's aptitude score and their performance rating.
(b) Correlation coefficient (r), n = 12
      Aptitude (x)   Perf rating (y)    x²      y²      xy
          7               82           49     6724    574
          6               74           36     5476    444
          5               82           25     6724    410
          4               68           16     4624    272
          5               75           25     5625    375
          8               92           64     8464    736
          7               86           49     7396    602
          8               69           64     4761    552
          9               85           81     7225    765
          6               76           36     5776    456
          4               72           16     5184    288
          6               64           36     4096    384
Σ        75              925          497    72075   5858
r = (12*5858 - 75*925)/√((12*497 - 75²)*(12*72075 - 925²)) = 0.5194
Interpretation: There is a moderate positive association between the aptitude scores of employees and their performance scores after one year.
(c) H0: ρ = 0    H1: ρ ≠ 0
t-crit = t(α=0.05, df = 10) = ±2.228
Area of Acceptance for H0: -2.228 ≤ t-stat ≤ +2.228
t-stat = (0.5194)*√((12-2)/(1-0.5194²)) = 1.922
Conclusion: Since t-stat (1.922) lies inside the area of acceptance for H0, there is not sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Conclude, at a 5% significance level, that there is no statistically significant association between an employee's aptitude score and their job performance rating score one year later.
(d) Regression coefficients
b1 = (12*5858 - 75*925)/(12*497 - 75²) = 2.7168
b0 = (925 - 2.7168*75)/12 = 60.1032
Thus ŷ = 60.1032 + 2.7168 x where 4 ≤ x ≤ 9
Interpretation of b1 coefficient: If an employee's aptitude score increases by one unit, their performance rating score is expected to increase by 2.7168 points.
(e) Estimation
For x = 8: ŷ = 60.1032 + 2.7168*(8) = 81.84
An employee with an aptitude score of 8 could be expected to achieve a job performance rating score of 81.84. The association between aptitude score and performance rating is not statistically significant.
Therefore, the call centre manager should have low confidence in this estimated performance rating score.

Exercise 12.13
(a) File: X12.13 - opinion polls.xlsx
Correlation coefficient (r), n = 11
      Poll (%) (x)   Election (%) (y)    x²      y²      xy
          42              51           1764    2601    2142
          34              31           1156     961    1054
          59              56           3481    3136    3304
          41              49           1681    2401    2009
          53              68           2809    4624    3604
          40              35           1600    1225    1400
          65              54           4225    2916    3510
          48              52           2304    2704    2496
          59              54           3481    2916    3186
          38              43           1444    1849    1634
          62              60           3844    3600    3720
Σ        541             553          27789   28933   28059
r = (11*28059 - 541*553)/√((11*27789 - 541²)*(11*28933 - 553²)) = 0.7448
Interpretation: There is a moderate to strong positive association between opinion poll predictions and the actual election results.
(b) H0: ρ = 0    H1: ρ ≠ 0
t-crit = t(α=0.05, df = 9) = ±2.262
Area of Acceptance for H0: -2.262 ≤ t-stat ≤ +2.262
t-stat = (0.7448)*√((11-2)/(1-0.7448²)) = 3.3484
Conclusion: Since t-stat (3.3484) lies outside the area of acceptance for H0, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Conclude, at a 5% significance level, that there is a statistically significant association between opinion poll predictions and the actual election results.
(c) Regression coefficients
b1 = (11*28059 - 541*553)/(11*27789 - 541²) = 0.729
b0 = (553 - 0.729*541)/11 = 14.4175
Thus ŷ = 14.4175 + 0.729 x where 34 ≤ x ≤ 65
Interpretation of b1 coefficient: For a one percentage point increase in an opinion poll prediction, the actual election percentage is likely to increase by 0.729 percentage points.
(d) Coefficient of Determination r²
r² = 0.7448² = 0.5547
Interpretation: Opinion poll predictions can explain 55.47% of the variation in actual election result percentages.
(e) Prediction
For x = 58: ŷ = 14.4175 + 0.729*(58) = 56.70%
If an opinion poll predicts support at 58%, the actual election result is likely to be 56.7%.
(f) For x = 82: ŷ = 14.4175 + 0.729*(82) = 74.20%
If an opinion poll predicts support at 82%, the actual election result is likely to be 74.2%.
Because x = 82 is beyond the domain of x (34 ≤ x ≤ 65), this expected actual election result is unreliable and possibly invalid, since it relies on extrapolation.

Exercise 12.14
File: X12.14 - capital investment.xlsx
(a) Dependent variable (y) = return on investment (%)
Independent variable (x) = level of capital investment (%)
(b) Scatter Plot
[Scatter plot: Return on Investment (%) vs Capital Investment (%); x-axis: levels of capital investment (%), y-axis: return on investment (%).]
Interpretation: There appears to be a weak to moderate association between a company's level of capital investment and its return on investment.
(c) Excel's Data Analysis (Regression)
SUMMARY OUTPUT
Regression Statistics
Multiple R            0.41447
R Square              0.17179
Adjusted R Square     0.15253
Standard Error        2.12495
Observations          45
ANOVA
              df   SS           MS          F-stat    p-value
Regression     1   40.27335     40.27335    8.91907   0.00465
Residual      43   194.16309    4.51542
Total         44   234.43644
              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     1.27412        1.19293          1.06806   0.29145   -1.13166    3.67990
Capital       0.06783        0.02271          2.98648   0.00465    0.02203    0.11363
Correlation coefficient (r) = 0.414473
Regression line (equation): ŷ = 1.2741 + 0.06783 x for 21.1 ≤ x ≤ 79.5
(d) Coefficient of Determination r²
r² = 0.41447² = 0.171788
Interpretation: The variation in the level of capital investment explains only 17.18% of the variation in return on investment.
(e) H0: ρ = 0    H1: ρ ≠ 0
t-crit = t(α=0.05, df = 43) = ±2.016
Area of Acceptance for H0: -2.016 ≤ t-stat ≤ +2.016
t-stat = (0.41447)*√((45-2)/(1-0.41447²)) = 2.9865
Conclusion: Since t-stat (2.9865) lies outside the area of acceptance for H0, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1. Conclude, at a 5% significance level, that there is a statistically significant association between a company's level of capital investment and its return on investment.
(f) Interpretation of b1 coefficient
b1 = 0.06783
For a one percentage point change in capital investment, company return on investment can be expected to change by 0.06783 percentage points.
(h) Estimation
For x = 55: ŷ = 1.27412 + 0.06783 (55) = 5.005%
The expected return on investment for a company with a 55% level of capital investment is 5.005%.

Exercise 12.15
File: X12.15 - property valuations.xlsx
(a) Dependent variable (y) = Market Values
Independent variable (x) = Council Valuations
Council valuations are assumed to have an influence on property market values.
(b) Scatter Plot
[Scatter plot: Market Values vs Council Valuations; x-axis: council valuations (R), y-axis: market values (R).]
Interpretation: There appears to be a strong positive correlation between the council's valuation of a residential property in Bloemfontein and its market value.
(c) Excel's Data Analysis (Regression)
SUMMARY OUTPUT
Regression Statistics
Multiple R            0.78104
R Square              0.61002
Adjusted R Square     0.59975
Standard Error        26.40553
Observations          40
ANOVA
              df   SS          MS          F-stat     p-value
Regression     1   41444.82    41444.82    59.4402    2.7522E-09
Residual      38   26495.58    697.2521
Total         39   67940.4
                      Coefficients   Standard Error   t Stat   P-value    Lower 95%   Upper 95%
Intercept             71.3619        14.2370          5.0124   1.28E-05   42.541      100.183
Council valuations    1.0151         0.1317           7.7097   2.75E-09   0.749       1.282
Correlation coefficient (r) = 0.78104
Regression line (equation): ŷ = 71.3619 + 1.0151 x for 48 ≤ x ≤ 154
(d) Coefficient of Determination r²
r² = 0.78104² = 0.61002
Interpretation: Variation in council valuations explains 61.002% of the variation in property market values.
(e) H0: ρ = 0    H1: ρ ≠ 0
t-crit = t(α=0.05, df = 38) = ±2.024
Area of Acceptance for H0: -2.024 ≤ t-stat ≤ +2.024
t-stat = (0.78104)*√((40-2)/(1-0.78104²)) = 7.70975
Conclusion: Since t-stat (7.70975) lies well outside the area of acceptance for H0, there is sufficient sample evidence at the 5% significance level to reject H0 in favour of H1.
Conclude, at a 5% significance level, that there is a statistically significant association between the Council's valuation and the resultant property market valuation.
(f) Interpretation of b1 coefficient
b1 = 1.0151
For a R1 (in R1000s) change (up / down) in council valuation, the market valuation of a property can be expected to change (up / down) by R1.0151 (in R1000s).
(g) Estimation
For x = 100: ŷ = 71.3619 + 1.0151 (100) = R172.874 (in R1000s)
The expected market value of a property which the council values at R100 (in R1000s) is likely to be R172.874 (in R1000s).

CHAPTER 13 MULTIPLE LINEAR REGRESSION
Exercise 13.1
Simple linear regression has only one independent variable (x1) whereas multiple linear regression has two or more independent variables (x1, x2, x3, … , xk) that are assumed to influence the outcome of the dependent variable, y.
Exercise 13.2
ANOVA
              df   SS    MS     F-stat   p-value    F-crit
Regression     5    84   16.8   4.541    0.002261   2.45
Residual      40   148    3.7
Total         45   232
(a) R² = 84/232 = 0.3621
36.2% of the total variation in y can be explained by the 5 independent variables.
(b) H0: β1 = β2 = β3 = β4 = β5 = 0 vs H1: At least one βi ≠ 0 (i = 1,2,3,4,5)
or H0: ρ = 0 vs H1: ρ ≠ 0
(c) F-stat = 4.5412 and F-crit = F(0.05,5,40) = 2.45 (See Anova Table)
(d) Reject H0. Conclude that the overall model is statistically significant (i.e. at least one xi is statistically significant in estimating y).
Exercise 13.3
              Coefficients   Std Error   t-stat   p-value   Lower 95%   Upper 95%
Intercept      1.82          1.12         1.63    0.1215    -0.53        3.92
A              0.68          0.28         2.44    0.0253     0.09        2.78
B             -2.35          0.984       -2.39    0.0140    -4.42       -0.25
C              0.017         0.012        1.42    0.1737    -0.01        2.12
D              1.96          1.16         1.69    0.1083    -0.48        4.06
t-crit = t(0.05,24-4-1) = t(0.05,19) = 2.093
(a) For each xi variable (A, B, C and D), test: H0: βi = 0 against H1: βi ≠ 0 for i = A, B, C and D.
(b), (c), (d) t-crit = t(0.05,19) = ±2.093
For A: Since t-stat (2.44) > t-crit (+2.093); or p-value (0.0253) < α (0.05), or {0.09 ≤ βA ≤ 2.78} does not cover zero, conclude variable A is statistically significant.
For B: Since t-stat (-2.39) < -t-crit (-2.093); or p-value (0.014) < α (0.05), or {-4.42 ≤ βB ≤ -0.25} does not cover zero, conclude variable B is statistically significant.
For C: Since t-stat (1.42) < +t-crit (+2.093); or p-value (0.1737) > α (0.05), or {-0.01 ≤ βC ≤ 2.12} covers zero, conclude variable C is not statistically significant.
For D: Since t-stat (1.69) < +t-crit (+2.093); or p-value (0.1083) > α (0.05), or {-0.48 ≤ βD ≤ 4.06} covers zero, conclude variable D is not statistically significant.
Exercise 13.4
(a) For the x3 variable, test: H0: β3 = 0 against H1: β3 ≠ 0
(b) t-crit = t(0.05,30) = ±2.042. Hence, do not reject H0 if -2.042 ≤ t-stat ≤ +2.042.
(c) Since t-stat (2.44) > t-crit (+2.042), reject H0 in favour of H1 at α = 0.05.
(d) Conclude that the x3 variable is statistically significant in estimating y.
Exercise 13.5
(a) Holding all other variables constant, a unit increase in x2 will result in a 1.6 reduction in ŷ.
(b) For the x2 variable, test: H0: β2 = 0 against H1: β2 ≠ 0
Yes, since the 95% confidence interval for β2 does not cover zero.
Exercise 13.6
Binary coding scheme (Choose 'Lean' as the base category). F1 and F2 are the dummy variable names chosen.
Fuel type    F1   F2
Leaded        1    0
Unleaded      0    1
Lean          0    0
Exercise 13.7
Binary coding scheme (Choose 'spring' as the base category). S1, S2 and S3 are the dummy variable names chosen.
Season    S1   S2   S3
summer     1    0    0
autumn     0    1    0
winter     0    0    1
spring     0    0    0
Exercise 13.8
File: X13.8 - employee absenteeism.xlsx
Excel's Data Analysis - Regression
SUMMARY OUTPUT
Regression Statistics
Multiple R            0.7384
R Square              0.5453
Adjusted R Square     0.5012
Standard Error        3.7204
Observations          35
ANOVA
              df   SS         MS         F-stat   p-value
Regression     3   514.467    171.489    12.390   0.00001707
Residual      31   429.076    13.841
Total         34   943.543
                Coefficients   Standard Error   t-stat   p-value       Lower 95%   Upper 95%
Intercept       36.411         5.929             6.141   8.22345E-07   24.318      48.504
Tenure           0.220         0.065             3.376   0.001997       0.087       0.352
Satisfaction    -0.184         0.109            -1.692   0.100765      -0.406       0.038
Commitment      -0.332         0.080            -4.130   0.000254      -0.497      -0.168
(a) R² = 514.467/943.543 = 54.53%
(b) H0: βT = βS = βC = 0 versus H1: At least one βi ≠ 0
F-stat = 12.39 and F-crit = F(0.05,3,31) = 2.92; p-value = 0.00001707
Reject H0. Conclude the overall model is statistically significant (i.e. at least one xi is statistically significant in estimating y).
(c), (d), (e) For each xi variable, test: H0: βi = 0 against H1: βi ≠ 0 with t-crit = t(0.05,31) = ±2.04
Tenure: Since t-stat (3.376) > t-crit (+2.04); or p-value (0.001997) < α (0.05), or {0.087 ≤ βT ≤ 0.352} does not cover zero, conclude Tenure is statistically significant.
Satisfaction: Since t-stat (-1.692) lies within t-crit (±2.04); or p-value (0.100765) > α (0.05), or {-0.406 ≤ βS ≤ 0.038} covers zero, conclude Satisfaction is not statistically significant.
Commitment: Since t-stat (-4.13) < -t-crit (-2.04); or p-value (0.000254) < α (0.05), or {-0.497 ≤ βC ≤ -0.168} does not cover zero, conclude Commitment is statistically significant.
(f) (i) No, organisational commitment is the most important explanatory factor because it has a larger absolute t-stat value (-4.13) and a smaller p-value (0.000254) than job tenure.
(f) (ii) No, job satisfaction is not a statistically significant explanatory factor of employee absenteeism (see (c), (d) and (e) above).
Yes, organisational commitment does play a statistically significant role in explaining employee absenteeism (see (c), (d) and (e) above).
(g) ŷ = 36.411 + 0.22 (48) - 0.184 (50) - 0.332 (60) = 17.8008
Using t-crit = t(0.025,31) = ±2.04; standard error = 3.7204; n = 35, giving margin of error = 2.04 (3.7204)/√35 = 1.2826
Lower 95% confidence limit = 17.8008 - 1.2826 = 16.5182
Upper 95% confidence limit = 17.8008 + 1.2827 = 19.0835
{16.52 ≤ y(estimated) ≤ 19.08}
Management interpretation: We can be 95% confident that the true average number of days absent per employee per annum is likely to lie between 16.5 days and 19.1 days.

Exercise 13.9
File: X13.9 - plastics wastage.xlsx
Excel's Data Analysis - Regression
SUMMARY OUTPUT
Regression Statistics
Multiple R            0.8061
R Square              0.6498
Adjusted R Square     0.6109
Standard Error        0.5160
Observations          31
ANOVA
              df   SS         MS        F-stat    p-value
Regression     3   13.3384    4.4461    16.6992   2.466E-06
Residual      27   7.1887     0.2662
Total         30   20.5271
              Coefficients   Standard Error   t-stat    p-value   Lower 95%   Upper 95%
Intercept      1.8179        1.2189            1.4914   0.1474    -0.6830      4.3188
Dexterity     -0.1112        0.0286           -3.8816   0.0006    -0.1700     -0.0524
Speed          0.0173        0.0047            3.6770   0.0010     0.0077      0.0270
Viscosity      1.9189        1.2581            1.5252   0.1388    -0.6625      4.5004
(a) R² = 13.3384/20.5271 = 64.98%
(b) H0: βD = βS = βV = 0 versus H1: At least one βi ≠ 0
F-stat = 16.6992 and F-crit = F(0.05,3,27) = 2.99; p-value = 0.000002466
Reject H0. Conclude the overall model is statistically significant (i.e. at least one xi is statistically significant in estimating y).
(c), (d), (e) For each xi variable, test: H0: βi = 0 against H1: βi ≠ 0 with t-crit = t(0.05,27) = ±2.052
Dexterity: Since t-stat (-3.882) < -t-crit (-2.052); or p-value (0.0006) < α (0.05), or {-0.17 ≤ βD ≤ -0.0524} does not cover zero, conclude Dexterity is statistically significant.
Speed: Since t-stat (3.677) > t-crit (+2.052); or p-value (0.001) < α (0.05), or {0.0077 ≤ βS ≤ 0.027} does not cover zero, conclude Speed is statistically significant.
Viscosity: Since t-stat (1.525) < t-crit (+2.052); or p-value (0.1388) > α (0.05), or {-0.6625 ≤ βV ≤ 4.5} covers zero, conclude Viscosity is not statistically significant.
(f) The most important factor is operator dexterity (p-value = 0.0006), then machine speed (p-value = 0.001). Plastic viscosity is not a significant influencing factor (p-value = 0.1388).
(g) ŷ = 1.8179 - 0.1112 (25) + 0.0173 (200) + 1.9189 (0.25) = 2.9826
Using t-crit = t(0.025,27) = ±2.052; standard error = 0.516; n = 31, giving margin of error = 2.052 (0.516)/√31 = 0.1902
Lower 95% confidence limit = 2.9826 - 0.1902 = 2.7924
Upper 95% confidence limit = 2.9826 + 0.1902 = 3.1728
{2.79% ≤ y(estimated) ≤ 3.17%}
Management interpretation: We can be 95% confident that the true average % of plastic wastage per shift is likely to lie between 2.79% and 3.17%.

Exercise 13.10
File: X13.10 - employee performance.xlsx
(a) Binary coding scheme (Choose 'method C' as the base category). MA and MB are the dummy variable names chosen.
Method Code   MA   MB
A              1    0
B              0    1
C              0    0
(b) Sample input data for first 6 consultants (showing the binary coded data)
Consultant   Productivity   Experience   MA   MB
1            24              9            1    0
2            30              4            0    1
3            26             10            1    0
4            37             12            0    1
5            29             10            0    0
6            28              6            0    0
Excel's Data Analysis - Regression (using the binary coded data as in (a))
SUMMARY OUTPUT
Regression Statistics
Multiple R            0.7894
R Square              0.6231
Adjusted R Square     0.5524
Standard Error        2.3425
Observations          20
ANOVA
              df   SS         MS        F-stat   p-value
Regression     3   145.151    48.384    8.817    0.001108
Residual      16   87.799     5.487
Total         19   232.95
              Coefficients   Std error   t-stat    p-value    Lower 95%   Upper 95%
Intercept     26.387         1.940       13.601    3.29E-10   22.274      30.499
Experience     0.389         0.167        2.335    0.0329      0.036       0.742
MA            -3.659         1.376       -2.659    0.0172     -6.576      -0.742
MB             1.472         1.336        1.102    0.2867     -1.360       4.304
Estimated multiple regression equation (based on data recoded as in (a)):
ŷ = 26.387 + 0.389 Experience - 3.659 MA + 1.472 MB
(c) R² = 145.151/232.95 = 62.31%
(d) H0: βE = βMA = βMB = 0 versus H1: At least one βi ≠ 0
F-stat = 8.817 and F-crit = F(0.05,3,16) = 3.24; p-value = 0.001108
Reject H0. Conclude the overall model is statistically significant (i.e. at least one xi is statistically significant in estimating y).
(e) For Experience, test: H0: βE = 0 against H1: βE ≠ 0 with t-crit = t(0.05,16) = ±2.12
Since t-stat (2.335) > t-crit (2.12), conclude work experience is statistically significant at α = 0.05.
(f) For each of MA and MB, test: H0: βi = 0 vs H1: βi ≠ 0 (i = MA, MB), with t-crit = t(0.05,16) = ±2.12
MA: Since t-stat (-2.659) < lower t-crit (-2.12), conclude MA is statistically significant (i.e. adopting marketing method A results in significantly lower consultant productivity levels, on average, compared to using marketing method C, the base category).
MB: Since -t-crit (-2.12) < t-stat (1.102) < +t-crit (+2.12), conclude MB is not statistically significant (i.e. consultant productivity levels, on average, are the same for marketing method B and marketing method C; there is no difference to the base category average productivity level).
Overall conclusion: the independent variable 'marketing method' is statistically significant, but only for marketing method A (when compared to method C, the base category). Marketing methods B and C can be combined as there is no statistically significant difference between them with regard to their average productivity levels across consultants.
(g), (h) For each xi variable, test: H0: βi = 0 against H1: βi ≠ 0
Experience: Since its p-value (0.0329) < α = 0.05, or {0.036 ≤ βE ≤ 0.742} does not cover zero, conclude that work experience is statistically significant.
MA: Since its p-value (0.0172) < α = 0.05, or {-6.576 ≤ βMA ≤ -0.742} does not cover zero, conclude that marketing method A is statistically significantly different from marketing method C (i.e. the base category) in terms of consultants' average productivity levels.
MB: Since its p-value (0.2867) > α = 0.05, or {-1.36 ≤ βMB ≤ 4.304} covers zero, conclude marketing method B is not statistically significantly different from marketing method C (i.e. the base category) in terms of consultants' average productivity levels.
(i) Employ consultants with longer work experience and avoid using marketing method A as it produces lower productivity levels than either marketing method B or C.
(j) ŷ = 26.387 + 0.389 (8) - 3.659 (0) + 1.472 (1) = 30.9717
Using t-crit = t(0.025,16) = ±2.1199; standard error = 2.3425; n = 20, giving margin of error = 2.1199 (2.3425)/√20 = 1.1104
Lower 95% confidence limit = 30.9717 - 1.1104 = 29.86
Upper 95% confidence limit = 30.9717 + 1.1104 = 32.08
{29.86 deals ≤ y(estimated) ≤ 32.08 deals}
Management interpretation: The bank management can be 95% confident that the actual average number of deals closed per month per consultant is likely to lie between 30 and 32 (rounded).
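The estimate in Exercise 13.10 (j) combines the binary coding scheme from (a) with the fitted regression equation. The sketch below illustrates this in Python; the names COEF, encode_method and predict_productivity are hypothetical. It uses the rounded coefficients printed in the regression output, so the estimate differs slightly in the last decimals from the 30.9717 quoted above (which presumably reflects Excel's unrounded coefficients), and it reproduces the margin-of-error approximation t-crit × (standard error)/√n exactly as used in the solution.

```python
import math

# Rounded coefficients from the Excel regression output (base category: method C).
COEF = {"Intercept": 26.387, "Experience": 0.389, "MA": -3.659, "MB": 1.472}


def encode_method(method):
    """Binary (dummy) coding of marketing method, with 'C' as the base category."""
    return {"MA": 1 if method == "A" else 0, "MB": 1 if method == "B" else 0}


def predict_productivity(experience, method):
    """Point estimate from the fitted multiple regression equation."""
    dummies = encode_method(method)
    return (
        COEF["Intercept"]
        + COEF["Experience"] * experience
        + COEF["MA"] * dummies["MA"]
        + COEF["MB"] * dummies["MB"]
    )


# Consultant with 8 years' experience using marketing method B, as in (j).
y_hat = predict_productivity(8, "B")

# Margin of error as approximated in the solution: t-crit * s / sqrt(n).
t_crit, std_error, n = 2.1199, 2.3425, 20
margin = t_crit * std_error / math.sqrt(n)

print(f"y_hat = {y_hat:.3f}, 95% limits = ({y_hat - margin:.2f}, {y_hat + margin:.2f})")
```

Changing the method argument to "A" or "C" shows how the dummy variables shift the estimate relative to the base category.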
Exercise 13.11
File: X13.11 - corporate performance.xlsx
(a) Binary coding scheme (Choose region 'KZN' as the base category). R1 and R2 are the dummy variable names chosen.
Region     Code   R1   R2
Gauteng    1       1    0
Cape       2       0    1
KZN        3       0    0

Binary coding scheme (Choose sector 'Construction' as the base category). S is the dummy variable name chosen.
Sector          Code   S
Agriculture     1      1
Construction    2      0

(b) Sample input data for first 6 companies (showing the binary coded data)
ROC(%)   Sales   Margin%   Debt ratio(%)   R1   R2   S
19.7     7178    18.7      28.5             1    0   0
17.2     1437    18.5      24.3             1    0   1
17.1     3948    16.5      65.6             1    0   1
16.6     1672    16.2      26.4             1    0   1
16.6     2317    16.0      20.1             1    0   1
16.5     4123    15.6      46.4             0    1   0

Excel's Data Analysis - Regression (using the binary coded data as in (a))
Y = f(Sales; Margin %; Debt ratio(%); Region; Sector)
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.9125
R Square             0.8327
Adjusted R Square    0.7769
Standard Error       0.9524
Observations         25

ANOVA
              df    SS        MS        F-stat    p-value
Regression     6    81.2595   13.5433   14.9318   0.000004076
Residual      18    16.3261    0.9070
Total         24    97.5856

REGRESSION OUTPUT
                Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept       11.0146        0.8746           12.5936   0.0000     9.1771     12.8520
Sales            0.0002        0.0001            2.1668   0.0439     0.00001     0.00033
Margin%          0.1791        0.0672            2.6656   0.0158     0.03794     0.32030
Debt ratio(%)    0.0091        0.0154            0.5930   0.5606    -0.02321     0.04146
R1               3.1453        0.8389            3.7494   0.0015     1.38288     4.90770
R2               0.9213        0.5862            1.5716   0.1335    -0.31031     2.15295
S               -0.9230        0.4205           -2.1950   0.0415    -1.80651    -0.03956

ŷ = 11.0146 + 0.0002 Sales + 0.1791 Margin % + 0.0091 Debt ratio % + 3.1453 R1 + 0.9213 R2 - 0.923 S (based on data recoded as in (a))
(c) R² = 81.2595/97.5856 = 83.27%
(d) H0: βSales = βM% = βDR = βR1 = βR2 = βSector = 0 versus H1: At least one βi ≠ 0
F-crit = F(0.05,6,18) = 2.66 and F-stat = 14.932; p-value = 0.000004076
Reject H0. Conclude that the overall model is statistically significant (i.e.
at least one x_i is statistically significant in estimating y)
(e), (h), (i) For all variables, test: H0: βi = 0 vs H1: βi ≠ 0 with t-crit = t(0.05,18) = ±2.101
Sales: Since t-stat (2.1668) > t-crit (2.101), or p-value (0.0439) < α = 0.05, or {0.00001 ≤ βS ≤ 0.00033} does not cover zero, conclude sales is statistically significant.
Margin%: Since t-stat (2.6656) > t-crit (2.101), or p-value (0.0158) < α = 0.05, or {0.0379 ≤ βM% ≤ 0.3203} does not cover zero, conclude margin% is statistically significant.
Debt ratio%: Since -t-crit (-2.101) < t-stat (0.593) < t-crit (2.101), or p-value (0.5606) > α = 0.05, or {-0.0232 ≤ βDR% ≤ 0.0415} covers zero, conclude debt ratio% is not statistically significant.
(f) Region: For each dummy variable R1 and R2, test: H0: βi = 0 against H1: βi ≠ 0 with t-crit = t(0.05,18) = ±2.101
R1 (Gauteng): Since t-stat (3.7494) > upper t-crit (2.101); or its p-value (0.0015) < α = 0.05, or {1.38288 ≤ βR1 ≤ 4.9077} does not cover zero, conclude that companies that operate in the Gauteng region have a statistically significantly higher return on capital (%), on average, than companies that operate in the KZN region (i.e. the base region).
R2 (Cape): Since t-stat (1.5716) < upper t-crit (2.101); or its p-value (0.1335) > α = 0.05, or {-0.31031 ≤ βR2 ≤ 2.15295} covers zero, conclude that the average return on capital (%) of companies that operate in the Cape region is not statistically significantly different from that of companies that operate in the KZN region (i.e. the base region).
Overall conclusion: the independent variable ‘region’ is statistically significant, but only with respect to the Gauteng region when compared to the KZN region (i.e. the base region). The Cape and KZN regions can be merged into a single region as there is no statistically significant difference in the average return on capital (%) of companies operating within these two regions.
(g) Sector: For the dummy variable, S, test: H0: βS = 0 against H1: βS ≠ 0 with t-crit = t(0.05,18) = ±2.101
S (Agriculture): Since t-stat (-2.195) < lower t-crit (-2.101); or its p-value (0.0415) < α = 0.05, or {-1.80651 ≤ βS ≤ -0.03956} does not cover zero, conclude that companies that operate in the agricultural sector have a statistically significantly lower return on capital (%), on average, than companies that operate in the construction sector (i.e. the base sector).
Overall conclusion: the independent variable ‘sector’ is statistically significant.
(j) Sales and Margin% are significant performance measures of ROC%; Debt ratio% is not. For region, Gauteng has a significantly positive impact on average ROC% compared to Cape and KZN. The agricultural sector has a significantly negative impact on average ROC% compared to the construction sector.
(j) ŷ = 11.0146 + 0.0002 (8862) + 0.1791 (10) + 0.0091 (22) + 3.1453 (0) + 0.9213 (1) - 0.923 (0) = 15.6995
Using t-crit = t(0.025,18) = ±2.101; standard error = 0.9524; n = 25
Margin of error = 2.101 (0.9524)/√25 = 0.400198
Lower 95% confidence limit = 15.6995 - 0.400198 = 15.299
Upper 95% confidence limit = 15.6995 + 0.400198 = 16.100
{15.299% ≤ ŷ (estimated) ≤ 16.10%}
Management interpretation: The investment analyst can be 95% confident that the actual average return on capital (%) of companies with the given profile lies between 15.3% and 16.1%.

CHAPTER 14 INDEX NUMBERS MEASURING BUSINESS ACTIVITY

Exercise 14.1
An index number is a single summary value that measures the overall change in the level of activity of a single item or a basket of related items from one time period to another. Example: Consumer Price Index (CPI) - the inflation indicator.
Exercise 14.2
A price index measures changes in price levels over time, holding quantities constant. A quantity index measures changes in consumption levels over time, holding prices constant.
Exercise 14.3
Items are 'weighted' in a basket to reflect the importance (or value) of each item in the basket relative to the other items in the basket.
Exercise 14.4
Laspeyres weighting method and Paasche weighting method.
Exercise 14.5
(i) The purpose (or scope) of the index
(ii) The selection of the basket of items (i.e. the mix of items)
(iii) The choice of item weights
(iv) The choice of a suitable base year
(v) The formulation of a substitution rule
Exercise 14.6
A link relative is a period-on-period (consecutive) change in level of activity. A price relative is a change in the level of activity of an item in a given period relative to a base period.
Exercise 14.7
Real (constant) values are found by dividing monetary values by an 'inflation' index. This removes the influence of price increases. Real (constant) values refer to the actual purchasing power of money, or to the real (actual) change in the level of activity.
Exercise 14.8
File: X14.8 - motorcycle sales.xlsx
Data
                    2009                          2010
Motorcycle   Unit price   Quantity         Unit price   Quantity
model        (R1000)      (units sold)     (R1000)      (units sold)
A            25           10               30            7
B            15           55               19           58
C            12           32               14           40

(a)
Motorcycle model   p1/p0*100       Price Relative
A                  30/25*100 =     120.0
B                  19/15*100 =     126.7
C                  14/12*100 =     116.7
Interpretation
Model A has risen in price by 20% from 2009 to 2010; model B by 26.7% and model C by 16.7%.
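As a minimal sketch, the price relatives in Exercise 14.8 (a) can be computed in Python from the 2009 and 2010 unit prices. The function name `price_relatives` is illustrative and does not come from the textbook.

```python
def price_relatives(base_prices, current_prices):
    """Price relative per item: current price / base price * 100 (1 decimal)."""
    return {
        item: round(current_prices[item] / base_prices[item] * 100, 1)
        for item in base_prices
    }


# Unit prices (R1000) from the Exercise 14.8 data table.
prices_2009 = {"A": 25, "B": 15, "C": 12}
prices_2010 = {"A": 30, "B": 19, "C": 14}

relatives = price_relatives(prices_2009, prices_2010)
print(relatives)
```

Passing quantities instead of prices to the same function gives quantity relatives.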
(b) (i) Laspeyres Weighted Aggregates Price Index
Motorcycle model   Base Value (p0*q0)    Current Value (p1*q0)
A                  25*10 =  250          30*10 =  300
B                  15*55 =  825          19*55 = 1045
C                  12*32 =  384          14*32 =  448
Totals                     1459                  1793
Composite Price Index = 1793/1459*100 = 122.9

(b) (ii) Laspeyres Weighted Average of Price Relatives Index
Motorcycle model   Base Value (p0*q0)    Price Relative (p1/p0)
A                  25*10 =  250          30/25*100 = 120.0
B                  15*55 =  825          19/15*100 = 126.7
C                  12*32 =  384          14/12*100 = 116.7
Totals                     1459
Weighted Price Relatives: 120*250 + 126.7*825 + 116.7*384 = 179300 (using unrounded price relatives)
Composite Price Index = 179300/1459 = 122.9
(c) Interpretation
Motorcycle models A, B and C have risen in price by 22.9% on average from 2009 to 2010.

Exercise 14.9
File: X14.8 - motorcycle sales.xlsx
Data
                    2009                          2010
Motorcycle   Unit price   Quantity         Unit price   Quantity
model        (R1000)      (units sold)     (R1000)      (units sold)
A            25           10               30            7
B            15           55               19           58
C            12           32               14           40

(a)
Motorcycle model   q1/q0*100       Quantity Relative
A                  7/10*100 =       70.0
B                  58/55*100 =     105.5
C                  40/32*100 =     125.0
Interpretation
Model A unit sales dropped 30% from 2009 to 2010; model B unit sales rose by 5.5%; while model C unit sales rose the most, by 25%, over the year.

(b) (i) Laspeyres Weighted Aggregates Quantity Index
Motorcycle model   Base Value (p0*q0)    Current Value (p0*q1)
A                  25*10 =  250          25*7  =  175
B                  15*55 =  825          15*58 =  870
C                  12*32 =  384          12*40 =  480
Totals                     1459                  1525
Composite Quantity Index = 1525/1459*100 = 104.5

(b) (ii) Laspeyres Weighted Average of Quantity Relatives Index
Motorcycle model   Base Value (p0*q0)    Quantity Relative (q1/q0)
A                  25*10 =  250          7/10*100  =  70.0
B                  15*55 =  825          58/55*100 = 105.5
C                  12*32 =  384          40/32*100 = 125.0
Totals                     1459
Composite Quantity Index = 152537.5/1459 = 104.5
(c) Interpretation
Unit sales of motorcycle models A, B and C have risen by 4.5% on average from 2009 to 2010.
Weighted Quantity Relatives (Exercise 14.9 (b) (ii)): 70*250 + 105.5*825 + 125*384 = 152537.5

Exercise 14.10
File: X14.8 - motorcycle sales.xlsx
Data
                    2009                          2010
Motorcycle   Unit price   Quantity         Unit price   Quantity
model        (R1000)      (units sold)     (R1000)      (units sold)
A            25           10               30            7
B            15           55               19           58
C            12           32               14           40

(a) (i) Paasche Weighted Aggregates Quantity Index
Motorcycle model   Base Value (p1*q0)    Current Value (p1*q1)
A                  30*10 =  300          30*7  =  210
B                  19*55 = 1045          19*58 = 1102
C                  14*32 =  448          14*40 =  560
Totals                     1793                  1872
Composite Quantity Index = 1872/1793*100 = 104.41

(a) (ii) Paasche Weighted Average of Quantity Relatives Index
Motorcycle model   Base Value (p1*q0)    Quantity Relative (q1/q0)
A                  30*10 =  300          7/10*100  =  70.0
B                  19*55 = 1045          58/55*100 = 105.5
C                  14*32 =  448          40/32*100 = 125.0
Totals                     1793
Weighted Quantity Relatives: 70*300 + 105.5*1045 + 125*448 = 187200 (using unrounded quantity relatives)
Composite Quantity Index = 187200/1793 = 104.41
(b) Interpretation
Unit sales of motorcycle models A, B and C have risen by 4.41% on average from 2009 to 2010.

Exercise 14.11
File: X14.11 - Telkom services.xlsx
Data
                        2009                       2010                       2011
Telkom         Unit price    Quantity      Unit price    Quantity      Unit price    Quantity
Services       (cents/call)  (100's calls) (cents/call)  (100's calls) (cents/call)  (100's calls)
TalkPlus       65            14            70            18            55            17
SmartAccess    35            27            40            29            45            24
ISDN           50            16            45            22            40            32

(a)
Telkom Services   Price relatives (2010)    Price relatives (2011)
TalkPlus          70/65*100 = 107.7         55/65*100 =  84.6
SmartAccess       40/35*100 = 114.3         45/35*100 = 128.6
ISDN              45/50*100 =  90.0         40/50*100 =  80.0
Interpretation
TalkPlus services increased by 7.7% in price from 2009 to 2010, but by 2011 were 15.4% cheaper than in 2009. SmartAccess, on the other hand, showed an increase in price of 14.3% from 2009 to 2010 and of 28.6% from 2009 to 2011. ISDN showed a decrease in price of 10% in the first year (from 2009 to 2010) and of 20% over the two-year period from 2009 to 2011.
(b) Laspeyres Weighted Aggregates Price Index
Telkom Services            Base Value (p0*q0)   2010 Value (p1*q0)   2011 Value (p1*q0)
TalkPlus                    910                  980                  770
SmartAccess                 945                 1080                 1215
ISDN                        800                  720                  640
Totals                     2655                 2780                 2625
Composite Price Indexes                          104.71               98.87
Interpretation (based on the Laspeyres approach)
The cost of Telkom services increased, on average, by 4.71% from 2009 to 2010, while there was a net reduction in costs, on average, of 1.13% from 2009 to 2011.

(c) Paasche Weighted Aggregates Quantity Index
2011 quantities (100's calls): TalkPlus 17, SmartAccess 24, ISDN 32
                              2010 Prices                                2011 Prices
Telkom Services               Base Value (p1*q0)   2010 Value (p1*q1)    Base Value (p1*q0)   2011 Value (p1*q1)
TalkPlus                       980                 1260                   770                  935
SmartAccess                   1080                 1160                  1215                 1080
ISDN                           720                  990                   640                 1280
Totals                        2780                 3410                  2625                 3295
Composite Quantity Indexes                          122.7                                      125.5
(d) Interpretation (based on the Paasche approach)
The printing company's usage of Telkom services has risen by 22.7% over one year (from 2009 to 2010) and by 25.5%, on average, over two years (from 2009 to 2011).

Exercise 14.12
File: X14.12 - computer personnel.xlsx
Data
                     Annual salary (in R10 000)    No. of employees
Job categories       2008     2011                 2008     2011
Systems analyst      42       50                   84       107
Programmer           29       36                   96        82
Network manager      24       28                   58        64

(a) Laspeyres Weighted Aggregate Salary Index
IT Job Category      Base Value 2008 (p0*q0)    Current Value 2011 (p1*q0)
Systems analyst      3528                       4200
Programmer           2784                       3456
Network manager      1392                       1624
Totals               7704                       9280
Laspeyres Composite Salary Index = 120.46
(b) Interpretation
On average, the overall remuneration has increased by 20.46% over the 3-year period from 2008 to 2011.
(c)
IT Job Category      Price relatives (p1/p0)
Systems analyst      119.05
Programmer           124.14
Network manager      116.67
Interpretation
Programmers have enjoyed the largest percentage increase in remuneration of 24.14% from 2008 to 2011.
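The Laspeyres salary index of Exercise 14.12 can be checked with the sketch below, which also shows that the weighted aggregates method and the weighted average of relatives method give the same composite index when unrounded relatives are used. The function names are hypothetical and chosen for illustration only.

```python
def laspeyres_weighted_aggregates(p0, p1, q0):
    """Laspeyres price index: sum(p1*q0) / sum(p0*q0) * 100."""
    base_value = sum(p0[k] * q0[k] for k in p0)
    current_value = sum(p1[k] * q0[k] for k in p0)
    return current_value / base_value * 100


def laspeyres_weighted_relatives(p0, p1, q0):
    """Same index via price relatives weighted by base values (p0*q0)."""
    base_values = {k: p0[k] * q0[k] for k in p0}
    weighted_sum = sum((p1[k] / p0[k] * 100) * base_values[k] for k in p0)
    return weighted_sum / sum(base_values.values())


# Exercise 14.12 data: annual salary (in R10 000) and 2008 staff numbers.
salary_2008 = {"Systems analyst": 42, "Programmer": 29, "Network manager": 24}
salary_2011 = {"Systems analyst": 50, "Programmer": 36, "Network manager": 28}
staff_2008 = {"Systems analyst": 84, "Programmer": 96, "Network manager": 58}

index_aggregates = laspeyres_weighted_aggregates(salary_2008, salary_2011, staff_2008)
index_relatives = laspeyres_weighted_relatives(salary_2008, salary_2011, staff_2008)
print(round(index_aggregates, 2), round(index_relatives, 2))
```

Small differences between the two methods in the worked tables arise only when the relatives are rounded before weighting.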
Exercise 14.13
File: X14.12 - computer personnel.xlsx
Data
                     Annual salary (in R10 000)    No. of employees
Job categories       2008     2011                 2008     2011
Systems analyst      42       50                   84       107
Programmer           29       36                   96        82
Network manager      24       28                   58        64

(a)
IT Job Category      Staff Relatives (q1/q0)
Systems analyst      127.4
Programmer            85.4
Network manager      110.3
Interpretation
The staff complement of Systems Analysts grew by 27.4% and that of Network managers grew by 10.3% over the period from 2008 to 2011. The number of Programmers, on the other hand, reduced by 14.6% over this same period.

(b) (i) Laspeyres Quantity index (Weighted Aggregates approach)
IT Job Category      Base Value 2008 (p0*q0)    Current Value 2011 (p0*q1)
Systems analyst      3528                       4494
Programmer           2784                       2378
Network manager      1392                       1536
Totals               7704                       8408
Composite Staff Complement (Quantity) Index = 109.14

(b) (ii) Laspeyres Quantity index (Weighted Average of Relatives approach)
IT Job Category      Base Value 2008 (p0*q0)    Staff Relatives (q1/q0)    Weighted Ave of Quantity Relatives
Systems analyst      3528                       127.38                     449400
Programmer           2784                        85.42                     237800
Network manager      1392                       110.34                     153600
Totals               7704                                                  840800
Composite Staff Complement (Quantity) Index = 109.14
(c) Interpretation
The overall IT staff complement across all job categories has increased by an average of 9.14% from 2008 to 2011.

Exercise 14.14
File: X14.14 - printer cartridges.xlsx
Data
                        2008                        2009                        2010
Printer cartridges   Unit price  Quantity used   Unit price  Quantity used   Unit price  Quantity used
HQ21                 145         24              155         28              149         36
HQ25                 172         37              165         39              160         44
HQ26                 236         12              255         12              262         14
HQ32                 314         10              306          8              299         11

(a)
Print Cartridges   Unit price (2008)   Unit price (2010)   Price relative 2010
HQ26               236                 262                 262/236*100 = 111.02
HQ32               314                 299                 299/314*100 =  95.22
Interpretation
The price of the HQ26 printer cartridge has increased by 11.02% from 2008 to 2010. The price of the HQ32 printer cartridge has decreased by 4.78% from 2008 to 2010.
(b) (i) Paasche Weighted Aggregates Price Index
                         2009 Prices                            2010 Prices
Printer cartridges   Base Value (2008)  Current Value (2009)   Base Value (2008)  Current Value (2010)
HQ21                  4060               4340                   5220               5364
HQ25                  6708               6435                   7568               7040
HQ26                  2832               3060                   3304               3668
HQ32                  2512               2448                   3454               3289
Totals               16112              16283                  19546              19361
Composite Price Indexes                  101.06                                     99.05

(b) (ii) Paasche Weighted Average of Relatives Price Index - for 2009
Printer cartridges   Base Value (2008)   Price relative (2009)   Weighted Ave (2009)
HQ21                  4060               106.9                    434000
HQ25                  6708                95.9                    643500
HQ26                  2832               108.1                    306000
HQ32                  2512                97.5                    244800
Totals               16112                                       1628300
Paasche Composite Price Index = 101.06

Paasche Weighted Average of Relatives Price Index - for 2010
Printer cartridges   Base Value (2008)   Price relative (2010)   Weighted Ave (2010)
HQ21                  5220               102.8                    536400
HQ25                  7568                93.0                    704000
HQ26                  3304               111.0                    366800
HQ32                  3454                95.2                    328900
Totals               19546                                       1936100
Paasche Composite Price Index = 99.05
(c) Interpretation
The average price of print cartridges increased marginally by 1.06% from 2008 to 2009. However, the average price of print cartridges decreased by 0.95% from 2008 to 2010.
(d) Composite Link Relatives
                             2008    2009     2010
Composite price indexes      100     101.06   99.05
Composite link relatives             101.06   98.01

Exercise 14.15
File: X14.15 - electrical goods.xlsx
Data
                                             2004   2005   2006   2007   2008   2009   2010
Composite Price Index (Electrical goods)     88     96     100    109    114    112    115
(a) Reset base year to 2009 (adjustment factor = 100/112)
                                             2004   2005   2006   2007   2008   2009   2010
Composite Price Index (Electrical goods)     78.6   85.7   89.3   97.3   101.8  100    102.7
Interpretation
In 2004, the average price of electrical goods was 21.4% below the 2009 (base) price levels, while in 2006, it was only 10.7% below the 2009 (base) price levels. In 2010, prices were 2.7% higher on average than in the base period of 2009.
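The re-basing step in Exercise 14.15 (a) multiplies every index value by 100 divided by the index value of the new base year. A minimal Python sketch, with an illustrative function name:

```python
def rebase(index_series, new_base_year):
    """Re-base an index series so that `new_base_year` equals 100."""
    factor = 100 / index_series[new_base_year]  # adjustment factor, e.g. 100/112
    return {year: round(value * factor, 1) for year, value in index_series.items()}


# Composite price index for electrical goods (Exercise 14.15 data).
electrical_goods = {2004: 88, 2005: 96, 2006: 100, 2007: 109,
                    2008: 114, 2009: 112, 2010: 115}

rebased = rebase(electrical_goods, 2009)
print(rebased)
```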
(b) Link relatives
                                             2004   2005    2006    2007   2008    2009   2010
Composite Price Index (Electrical goods)     100    109.1   104.2   109    104.6   98.2   102.7
Interpretation
The annual average price changes in electrical goods starting in 2004 were 9.1% (for 2005); 4.2% (for 2006); 9% (for 2007); 4.6% (for 2008); -1.8% (decrease in 2009); and 2.7% (for 2010).

Exercise 14.16
File: X14.16 - insurance claims.xlsx
Data
                                     2006   2007    2008   2009    2010    2011
Federal Insurance (base = 2008)      92.3   95.4    100    102.6   109.4   111.2
Baltic Insurance (base = 2009)       93.7   101.1   98.2   100     104.5   107.6
(a)
                                     2006   2007    2008   2009    2010    2011
Federal Insurance (base = 2010)      84.4   87.2    91.4   93.8    100     101.6
Baltic Insurance (base = 2010)       89.7   96.7    94     95.7    100     103
(b) Federal Insurance showed an 8.6/91.4 (9.41%) increase from 2008 to 2010. Baltic Insurance showed a 6.0/94 (6.4%) increase from 2008 to 2010. Hence Federal Insurance showed the bigger claims increase from 2008 to 2010.
(c) Link Relatives
                        2006   2007    2008    2009    2010    2011
Federal Insurance       100    103.4   104.8   102.6   106.6   101.6
Baltic Insurance        100    107.9   97.1    101.8   104.5   103
(d) From 2010 to 2011, Baltic showed a year-on-year increase of 3%, while Federal showed a year-on-year increase of only 1.6%. Hence Baltic Insurance showed the larger increase.
(e) Geometric mean
(f) Federal Insurance's claims processed increased by an average of 3.785% annually between 2006 and 2011.
Geometric mean calculations (Exercise 14.16 (e)):
Federal: (1.034*1.048*1.026*1.066*1.016)^(1/5) - 1 = 3.785%
Baltic:  (1.079*0.971*1.018*1.045*1.03)^(1/5) - 1 = 2.799%

Exercise 14.17
File: X14.17 - micro-market basket.xlsx
Data
Food items in           Unit prices (in Rand)    Consumption
micro market basket     2010     2011            2010    2011
Milk (litres)           7.29     7.89            117     98
Bread (loaves)          4.25     4.45             56     64
Sugar (kg)              2.19     2.45             28     20
Maize meal (kg)         5.25     5.59             58     64

(a)
Food items          2010    2011    Price Relatives
Milk (litres)       7.29    7.89    108.23
Bread (loaves)      4.25    4.45    104.71
Sugar (kg)          2.19    2.45    111.87
Maize meal (kg)     5.25    5.59    106.48
Interpretation
The price of milk rose by 8.23%; bread by 4.71%; sugar by 11.87% and maize meal by 6.48% per unit of measure from 2010 to 2011. The largest price change (increase) was for sugar, with an 11.87% increase.

(b) Paasche Weighted Average of Price Relatives Index
Food items          Base Value (p0*q1)   Price Relatives   Weighted Average
Milk (litres)        714.42              108.23             77322
Bread (loaves)       272                 104.71             28480
Sugar (kg)            43.8               111.87              4900
Maize meal (kg)      336                 106.48             35776
Total               1366.22                                146478
Paasche Composite Price Index = 146478/1366.22 = 107.21
On average, the price of the micro-basket of items increased by 7.21% from 2010 to 2011.

(c)
Food items          2010    2011    Quantity Relatives
Milk (litres)       117     98       83.8
Bread (loaves)       56     64      114.3
Sugar (kg)           28     20       71.4
Maize meal (kg)      58     64      110.3
Interpretation
From 2010 to 2011, the consumption of milk decreased by 16.2%; bread consumption rose by 14.3%; sugar consumption decreased markedly by 28.6% and maize meal consumption rose by 10.3%. The largest consumption change (decrease) was for sugar, with a 28.6% decrease. It is interesting to note that sugar showed the largest unit price increase while simultaneously recording the largest decrease in consumption from 2010 to 2011.
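The Paasche composite price index of Exercise 14.17 (b) weights both periods' prices by the current-period (2011) quantities. The sketch below uses the weighted aggregates form, which gives the same result as the weighted average of price relatives when unrounded relatives are used; the function name is an assumption for illustration.

```python
def paasche_price_index(p0, p1, q1):
    """Paasche price index: sum(p1*q1) / sum(p0*q1) * 100."""
    base_value = sum(p0[k] * q1[k] for k in p0)
    current_value = sum(p1[k] * q1[k] for k in p0)
    return current_value / base_value * 100


# Exercise 14.17 data: unit prices (Rand) and 2011 consumption.
price_2010 = {"Milk": 7.29, "Bread": 4.25, "Sugar": 2.19, "Maize meal": 5.25}
price_2011 = {"Milk": 7.89, "Bread": 4.45, "Sugar": 2.45, "Maize meal": 5.59}
consumption_2011 = {"Milk": 98, "Bread": 64, "Sugar": 20, "Maize meal": 64}

paasche_index = paasche_price_index(price_2010, price_2011, consumption_2011)
print(round(paasche_index, 2))
```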
(d) Paasche Weighted Average of Quantity (Consumption) Relatives Index
Food items          Base Value (p1*q0)   Quantity Relatives   Weighted Average
Milk (litres)        923.13               83.8                 77322
Bread (loaves)       249.20              114.3                 28480
Sugar (kg)            68.60               71.4                  4900
Maize meal (kg)      324.22              110.3                 35776
Total               1565.15                                   146478
Paasche Composite Quantity Index = 146478/1565.15 = 93.6
On average, consumption of the micro-basket items dropped by 6.4% from 2010 to 2011.

Exercise 14.18
File: X14.18 - utilities usage.xlsx
Data (price columns ordered to agree with the base and current values calculated in (b))
                 Prices (in Rand / unit)     Consumption (No. of units)
Utilities        2008    2009    2010        2008    2009    2010
Electricity      1.97    2.05    2.09         745     812     977
Sewage           0.62    0.68    0.72          68      56      64
Water            0.29    0.31    0.35         296     318     378
Telephone        1.24    1.18    1.06        1028    1226    1284

(a)
Utilities        Unit Price 2008   Unit Price 2010   Price relative 2010
Electricity      1.97              2.09              106.1
Sewage           0.62              0.72              116.1
Water            0.29              0.35              120.7
Telephone        1.24              1.06               85.5
Interpretation
From 2008 to 2010, electricity costs increased by 6.1%; sewage costs by 16.1%; water costs by 20.7%; while telephone costs decreased by 14.5% over this period. Electricity showed the smallest change (increase) of 6.1% from 2008 to 2010.

(b) (i) Laspeyres Weighted Aggregates Price Index
Utilities        Base Value (2008)   Current Value (2009)   Current Value (2010)
Electricity      1467.7              1527.3                 1557.1
Sewage             42.2                46.2                   49.0
Water              85.8                91.8                  103.6
Telephone        1274.7              1213.0                 1089.7
Totals           2870.4              2878.3                 2799.3
Laspeyres Composite Price Indexes     100.3                   97.5

(b) (ii) Laspeyres Weighted Average of Price Relatives Index
Utilities        Base Value (2008)   Price relative (2009)   Price relative (2010)   Weighted Ave (2009)   Weighted Ave (2010)
Electricity      1467.7              104.1                   106.1                   152782.4              155717.7
Sewage             42.2              109.7                   116.1                     4625.0                4894.8
Water              85.8              106.9                   120.7                     9176.3               10360.9
Telephone        1274.7               95.2                    85.5                   121353.3              108988.6
Totals           2870.4                                                              287937.0              279961.9
Laspeyres Composite Price Indexes                                                    100.3                 97.5
(c) Interpretation
There was virtually no change in the average cost of household utilities between 2008 and 2009. A slight increase of only 0.3% was recorded.
From 2008 to 2010, however, the average cost of household utilities decreased by 2.5%.

Exercise 14.19
File: X14.18 - utilities usage.xlsx
Data (price columns ordered to agree with the base values calculated in (b))
                 Prices (in Rand / unit)     Consumption (No. of units)
Utilities        2008    2009    2010        2008    2009    2010
Electricity      1.97    2.05    2.09         745     812     977
Sewage           0.62    0.68    0.72          68      56      64
Water            0.29    0.31    0.35         296     318     378
Telephone        1.24    1.18    1.06        1028    1226    1284

(a)
Utilities        Usage 2008   Usage 2010   Quantity relative 2010
Electricity       745          977         131.1
Sewage             68           64          94.1
Water             296          378         127.7
Telephone        1028         1284         124.9
Interpretation
Consumption of electricity, water and telephone increased by 31.1%, 27.7% and 24.9% respectively from 2008 to 2010. Only sewage showed a decline in usage, of 5.9%, from 2008 to 2010.

(b) (i) Laspeyres Weighted Aggregates Quantity (Consumption) Index
Utilities        Base Value (2008)   Current Value (2009)   Current Value (2010)
Electricity      1467.7              1599.6                 1924.7
Sewage             42.2                34.7                   39.7
Water              85.8                92.2                  109.6
Telephone        1274.7              1520.2                 1592.2
Totals           2870.4              3246.8                 3666.2
Laspeyres Composite Quantity Indexes  113.1                  127.7

(b) (ii) Laspeyres Weighted Average of Quantity Relatives Index
Utilities        Base Value (2008)   Quantity relative (2009)   Quantity relative (2010)   Weighted Ave (2009)   Weighted Ave (2010)
Electricity      1467.7              109.0                      131.1                      159959                192468
Sewage             42.2               82.4                       94.1                        3474                  3968
Water              85.8              107.4                      127.7                        9219                 10962
Telephone        1274.7              119.3                      124.9                      152074                159213
Totals           2870.4                                                                    324726                366610
Laspeyres Composite Quantity Indexes                                                       113.1                 127.7
(c) Interpretation
On average, household consumption of utilities increased by 13.1% from 2008 to 2009, and by 27.7% overall from 2008 to 2010.
Exercise 14.20
File: X14.20 - leather goods.xlsx
Data
                         2005   2006   2007   2008   2009   2010    2011
Composite cost index     97     92     100    102    107    116     112
(a) Re-base to 2009: Adjustment factor = 100/107 = 0.934579
                         2005   2006   2007   2008   2009   2010    2011
Composite cost index     90.7   86     93.5   95.3   100    108.4   104.7
(b) [Line chart: Composite cost index (re-based to 2009), 2005 to 2011]
Interpretation
Prior to 2009, the average annual cost of leather goods inputs was below the 2009 level, but rose towards the 2009 level over this period (after an initial dip in 2006). Relative to 2009 unit costs, the cost of leather goods inputs was higher by 8.4% and 4.7% respectively for 2010 and 2011.
(c) The average increase in unit costs of leather goods inputs was only 4.7% between 2009 and 2011.
(d) Link Relatives
                         2005   2006   2007    2008    2009    2010    2011
Composite cost index     100    94.8   108.7   102.0   104.9   108.4   96.6
The largest year-on-year change in overall unit costs of leather goods inputs was between 2006 and 2007, with an increase of 8.7%.
(e) Geometric mean
(0.948*1.087*1.02*1.049*1.0841*0.966)^(1/6) = 1.024244 (i.e. 2.424% increase on average per year)
Using Excel: =GEOMEAN(0.9485,1.087,1.02,1.049,1.0841,0.9655) = 1.024244

Exercise 14.21
File: X14.21 - accountants' salaries.xlsx
(a), (b), (c)
Year    Salary   CPI   Real Salary
2005    387       95   407.4
2006    406      100   406.0
2007    422      104   405.8
2008    448      111   403.6
2009    466      121   385.1
2010    496      126   393.7
2011    510      133   383.5
Interpretation
The base real salary is R406 (in R1000s). Since 2006, real salaries have declined relative to 2006 (base) and are continuing to fall further behind inflation (CPI).
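The real salaries in Exercise 14.21 are obtained by dividing each nominal salary by that year's CPI and multiplying by 100. A minimal sketch, assuming the salary and CPI series from the table above (the function name `deflate` is illustrative):

```python
def deflate(nominal, price_index):
    """Real (constant) values: nominal value / price index * 100 (1 decimal)."""
    return {year: round(nominal[year] / price_index[year] * 100, 1) for year in nominal}


# Exercise 14.21 data: salaries (in R1000s) and CPI (base 2006 = 100).
salary = {2005: 387, 2006: 406, 2007: 422, 2008: 448, 2009: 466, 2010: 496, 2011: 510}
cpi = {2005: 95, 2006: 100, 2007: 104, 2008: 111, 2009: 121, 2010: 126, 2011: 133}

real_salary = deflate(salary, cpi)
print(real_salary)
```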
                            Link Relatives
Year    Salary   CPI    Salary    CPI      Diff (Sal - CPI)
2005    387       95
2006    406      100    104.9     105.3    -0.4
2007    422      104    103.9     104.0    -0.1
2008    448      111    106.2     106.7    -0.6
2009    466      121    104.0     109.0    -5.0
2010    496      126    106.4     104.1     2.3
2011    510      133    102.8     105.6    -2.7
Interpretation
On a year-on-year basis, salary increases have lagged behind the inflation rate (CPI) in all years except 2010, when salary adjustments exceeded CPI by 2.3 percentage points.

Exercise 14.22
File: X14.22 - school equipment .xlsx
Data
                         2003   2004   2005   2006    2007    2008    2009    2010
Composite cost index     94.8   97.6   100    105.2   108.5   113.9   116.7   121.1
(a) Re-base to 2008: Adjustment factor = 100/113.9 = 0.878
                         2003   2004   2005   2006    2007    2008    2009    2010
Composite cost index     83.2   85.7   87.8   92.4    95.3    100     102.5   106.3
(b) and (e) Plot and Interpretation of Re-based Cost Index Series
[Line chart: Composite cost index (re-based to 2008), 2003 to 2010]
Interpretation
The average cost of school equipment has shown a steady increase of roughly 2.5% to 5% annually over the period 2003 to 2010.
(c) Link Relatives
                         2003   2004     2005     2006     2007     2008     2009     2010
Composite price index    100    102.95   102.46   105.20   103.14   104.98   102.46   103.77
Interpretation
The link relatives confirm year-on-year increases in the average cost of school equipment of between roughly 2.5% and 5% p.a.
(d) The 2003 budget must be adjusted annually by the link relative (year-on-year) indexes. Therefore multiply each previous year's budget by the next year's link relative index.
Budgets
2003   5000000
2004   5147500
2005   5274129
2006   5548384
2007   5722603
2008   6007589
2009   6155376
2010   6387434

Exercise 14.23
File: X14.23 - coffee imports.xlsx
Data
Coffee types   Unit price (2007)   Quantity (2007)   Unit price (2010)   Quantity (2010)
Java            85                  52                98                  46
Colombia        64                  75                74                  90
Sumatra        115                  18               133                  20
Mocha           38                 144                42                 168

(a)
Coffee types   Unit price (2007)   Unit price (2010)   Price Relatives
Java            85                  98                 115.3
Colombia        64                  74                 115.6
Sumatra        115                 133                 115.7
Mocha           38                  42                 110.5
Interpretation
The Java, Colombia and Sumatra coffee types each increased in price by over 15% from 2007 to 2010, while Mocha increased by only 10.5% over this same period.

(b) Laspeyres Composite Price Index (Weighted Aggregates approach)
Coffee type    Base Value (2007)   Current Value (2010)
Java            4420                5096
Colombia        4800                5550
Sumatra         2070                2394
Mocha           5472                6048
Total          16762               19088
Laspeyres Composite Price Index = 19088/16762*100 = 113.9
Interpretation
The cost of coffee imports has increased by 13.9% on average from 2007 to 2010.
(c) Cheaper. The cost of coffee imports has increased by only 13.9% on average, since 2007.

(d) Laspeyres Composite Quantity Index (Weighted Aggregates approach)
Coffee type    Base Value (2007)   Current Value (2010)
Java            4420                3910
Colombia        4800                5760
Sumatra         2070                2300
Mocha           5472                6384
Total          16762               18354
Laspeyres Composite Quantity Index = 18354/16762*100 = 109.5
Interpretation
The quantity of coffee imported has increased by 9.5% on average from 2007 to 2010.
Exercise 14.24
File: X14.24 - medical claims.xlsx
Data
Claim Type     Ave Value (R) (2008)   Claims (2008)   Ave Value (R) (2010)   Claims (2010)
GPs            220                    20              255                    30
Specialists    720                    30              822                    25
Dentists       580                    10              615                    15
Medicines      400                    50              438                    70

(a) Laspeyres Composite Quantity Index (Weighted Aggregates approach)
(Use the Laspeyres Approach as the default method if none is specified)
Claim type     Base Value (2008)   Current Value (2010)
GPs             4400                6600
Specialists    21600               18000
Dentists        5800                8700
Medicines      20000               28000
Total          51800               61300
Laspeyres Composite Quantity Index = 61300/51800*100 = 118.34
Interpretation
The number of claims received has increased by 18.34% on average from 2008 to 2010.

(b)
Claim type     Claims (2008)   Claims (2010)   Quantity relatives
GPs            20              30              150.0
Specialists    30              25               83.33
Dentists       10              15              150.0
Medicines      50              70              140.0
Interpretation
GPs and Dentists showed the biggest increase in number of claims (both a 50% increase).

(c) Laspeyres Composite Price (Claims Value) Index (Weighted Aggregates approach)
Claim type     Base Value (2008)   Current Value (2010)
GPs             4400                5100
Specialists    21600               24660
Dentists        5800                6150
Medicines      20000               21900
Total          51800               57810
Laspeyres Composite Price Index = 57810/51800*100 = 111.6
(d) Interpretation
The value of claims received between 2008 and 2010 has risen by 11.6% on average.

Exercise 14.25
File: X14.25 - tennis shoes.xlsx
Data
Shoe Model   Unit price (2009)   Pairs Sold (2009)   Unit price (2010)   Pairs Sold (2010)
Trainer      320                  96                 342                 110
Balance      445                 135                 415                 162
Dura         562                  54                 595                  48

(a) Quantity Relatives
Shoe model   Pairs sold (2009)   Pairs sold (2010)   Quantity relatives
Trainer       96                 110                 114.6
Balance      135                 162                 120.0
Dura          54                  48                  88.9
Interpretation
Dura's sales volume is down by 11.1% from 2009.
(b) Laspeyres Weighted Aggregates Quantity (Volume) Index
Shoe model   Base Value (2009)   Current Value (2010)
Trainer       30720               35200
Balance       60075               72090
Dura          30348               26976
Total        121143              134266
Composite Quantity Index = 134266/121143*100 = 110.83
(c) Interpretation
The average increase in shoes sold from 2009 to 2010 was only 10.83%. Overall, sales volumes do not meet the required growth of at least 12% p.a.

Exercise 14.26
File: X14.26 - energy fund.xlsx
Data
                   2001    2002    2003    2004    2005    2006    2007    2008    2009    2010
Fuel cost index    100.0   116.2   122.4   132.1   135.7   140.3   142.8   146.9   153.4   160.5
(a) Re-base to 2006: Adjustment factor = 100/140.3 = 0.7128
                   2001    2002    2003    2004    2005    2006     2007     2008     2009     2010
Fuel cost index    71.28   82.83   87.25   94.16   96.73   100.00   101.79   104.71   109.34   114.40
(b) Link Relatives
                   2001     2002     2003     2004     2005     2006     2007     2008     2009     2010
Fuel cost index    100.00   116.20   105.34   107.92   102.73   103.39   101.78   102.87   104.42   104.63
[Line chart: Fuel Cost Index, 2001 to 2010]
(c) Geometric Mean
(1.162*1.0534*1.0792*1.0273*1.0339*1.0178*1.0287*1.0442*1.0463)^(1/9) = 1.05397
Annual Average Increase = 5.3974% p.a. on average.
Using Excel: =GEOMEAN(all link relatives) = 105.3976
(d) Interpretation
Annual fuel cost increases were high in the period 2001 to 2004 (between 5% and 16% p.a.). Thereafter, year-on-year increases were moderate and stable at between roughly 2% and 5% p.a.
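As a check on Exercise 14.26 (c), the geometric mean of the link relatives can be computed directly from the fuel cost index series. The sketch below uses unrounded link relatives, so it reproduces the Excel GEOMEAN figure of about 5.3976% rather than the hand-calculated 5.3974%. The function name is illustrative; `math.prod` requires Python 3.8 or later.

```python
import math


def average_annual_growth(index_series):
    """Geometric mean of consecutive ratios, expressed as a % change per period."""
    ratios = [curr / prev for prev, curr in zip(index_series, index_series[1:])]
    geometric_mean = math.prod(ratios) ** (1 / len(ratios))
    return (geometric_mean - 1) * 100


# Fuel cost index, 2001 to 2010 (Exercise 14.26 data).
fuel_cost_index = [100.0, 116.2, 122.4, 132.1, 135.7, 140.3, 142.8, 146.9, 153.4, 160.5]

growth = average_annual_growth(fuel_cost_index)
print(round(growth, 2))
```

Because the product of consecutive ratios telescopes, the result depends only on the first and last index values and the number of periods.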
Exercise 14.27
File: X14.27 - motorcycle distributor.xlsx
Data
Average Selling Price (R)
Model      2007    2008    2009    2010    2011
Blitz      18050   19235   21050   21950   22400
Cruiser    25650   26200   27350   28645   31280
Classic    39575   42580   43575   43950   46750
Units Sold
Model      2007    2008    2009    2010    2011
Blitz      205     185     168     215     225
Cruiser    462     386     402     519     538
Classic     88      70     111     146     132

(a) Price Relative Index series - Cruiser model
Model      2007    2008    2009    2010    2011
Cruiser    100.0   102.1   106.6   111.7   121.9
The price of the Cruiser model rose marginally (between 2% and 7%) for 2008 and 2009 relative to 2007; but rose strongly by 11.7% in 2010 and 21.9% in 2011.

(b) Laspeyres Composite Price Index (Using Weighted Aggregates method)
Model         Base Value 2007   Value 2008   Value 2009   Value 2010   Value 2011
Blitz          3700250           3943175      4315250      4499750      4592000
Cruiser       11850300          12104400     12635700     13233990     14451360
Classic        3482600           3747040      3834600      3867600      4114000
Total         19033150          19794615     20785550     21601340     23157360
Price Index   100               104.0        109.2        113.5        121.7
[Line chart: Laspeyres Composite Price Index - Motorcycles, 2007 to 2011]
(c) Interpretation
There has been a reasonably constant increase in the average price of motorcycles over the past 5 years. In 2011, motorcycles cost 21.7% more on average than they did in 2007.
(d) Link relative of Selling Prices - Classic Model
Model      2007   2008    2009    2010    2011
Classic    100    107.6   102.3   100.9   106.4
Interpretation
The Classic model has shown modest year-on-year price increases of between 0.9% and 7.6% between 2007 and 2011. The smallest annual (year-on-year) increase, in 2010, was only 0.9% (less than 1%).
(e) Geometric Mean
(1.076*1.023*1.009*1.064)^(1/4) - 1 = 4.263%
Using Excel: =GEOMEAN(link relatives) - 100 = 4.253
Classic model prices have risen by an average of 4.263% per annum.
(f) Laspeyres Composite Quantity Index (Using Weighted Aggregates method) Model Blitz Cruiser Classic Total Quantity Index Base Value Value Value 2008 2007 2009 2010 3700250 3339250 3032400 3880750 11850300 9900900 10311300 13312350 3482600 2770250 4392825 5777950 19033150 16010400 17736525 22971050 2007 100 2008 84.1 2009 93.2 Value 2011 4061250 13799700 5223900 23084850 2010 120.7 2011 121.3 Laspeyres Composite Quantity Index - Motorcycles 130 120 110 100 90 80 Quantity Index 70 60 2007 2008 2009 2010 2011 (g) Interpretation Sales of motorcycles dropped for two years after 2007 (i.e. by 15.9% in 2008 and by 6.8% in 2009). Thereafter unit sales, on average per model, increased by 20.7% in 2010 relative to 2007; but performed at the same level in 2011 compared to 2010. Exercise 14.28 File: Data X14.28 - tyre production.xlsx Hillstone Tyre Production Costs and Volumes - Uitenhage Plant Cost/Tyre Passenger Light truck Giant truck 2010-Jan Feb March April May June July Aug 210.69 212.47 210.73 218.14 219.22 216.19 225.92 234 376.45 361.7 361.76 363.94 363.62 364.06 376.9 375.4 1171.1 1109.6 1101.8 1119.7 1127.2 1120.32 1162.8 1181 Output (1000's) Passenger Light truck Giant truck 2010-Jan 78 11 10 (a) Cost relative index series Passenger tyre Feb 102 14 14 March 93 13 13 April 81 12 11 May 105 16 15 June 100 16 15 July 117 19 16 Aug 105 17 16 Sept Oct Nov Dec 229.89 222.76 223.96 200.3 375.55 375.04 375.59 376.3 1157.7 1166.75 1157.7 1148 Sept 98 14 16 Oct 110 15 15 Nov 97 13 14 Dec 43 6 5 Cost Relative Index - Passenger Tyres Jan Feb March April May June July Aug Sept Oct Nov Dec 100.0 100.8 100.0 103.5 104.0 102.6 107.2 111.1 109.1 105.7 106.3 95.1 Interpretation Passenger tyre costs increased steadily throughout the year reaching a peak of 11.1% above January 2010 levels in August. Thereafter costs declined steadily and ended the year 4.9% below the starting level in January 2010. 
(b) Passenger Light truck PRODUCTION COST ANALYSIS Jan 16433.82 Feb March 16573 16437 April 17015 Laspeyres Composite Cost Index (Weighted Aggregates) May June 17099 16862.8 July Aug 17622 18250 4140.95 3978.7 3979.4 4003.3 3999.8 4004.66 4145.9 Sept Oct 17931 17375.3 Nov Dec 17469 15623 4129 4131.1 4125.44 4131.5 4139 Giant truck 11711 11096 11018 11197 11272 11203.2 11628 11812 11577 11667.5 11577 11479 Total Cost 32285.77 31647 31435 32216 32371 32070.7 33396 34192 33639 33168.2 33177 31241 Jan Feb March April May June July Aug Sept Oct Nov Dec 100.0 98.0 97.4 99.8 100.3 99.3 103.4 105.9 104.2 102.7 102.8 96.8 Composite Cost Index Composite Cost Index 108.0 106.0 104.0 102.0 100.0 98.0 96.0 94.0 Composite Cost Index 92.0 Jan Feb March April May June July Aug Sept Oct Nov Dec Interpretation Composite tyre manufacturing costs declined or remained constant for the first 6 months until June 2010. Thereafter, costs, on average increased by 6% over July and August before being brought under control. By December, composite costs were 3.2% below the beginning of the year levels. (c) Light truck Link Relatives Link Relatives - Costs - Light Truck Tyres Jan 376.45 100.0 Feb 361.7 96.1 March April May 361.76 363.94 363.62 100.0 100.6 99.9 June 364.06 100.1 July 376.9 103.5 Aug 375.4 99.6 Sept 375.55 100.0 Oct 375.04 99.9 Nov Dec 375.59 376.3 100.1 100.2 Interpretation Light truck radial tyres costs have remained almost constant and unchanged throughout the year with most adjustments not exceeding 0.5%. Only in Feb (decrease of 3.9%) and July (increase of 3.5%) did costs fluctuate to any degree. 
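The weighted aggregates calculation behind the composite cost index in Exercise 14.28 (b) can be sketched in Python as follows. This is an illustrative check only: the function name laspeyres_index and the dictionary layout are assumptions, and only three of the twelve months (January, February and December) are entered, using the unit costs and January output volumes given in the exercise data.

```python
# January output (1000's) is the fixed base-period weight for every month
base_qty = {"Passenger": 78, "Light truck": 11, "Giant truck": 10}

# Cost per tyre for selected months (from the exercise data)
cost = {
    "Jan": {"Passenger": 210.69, "Light truck": 376.45, "Giant truck": 1171.1},
    "Feb": {"Passenger": 212.47, "Light truck": 361.7, "Giant truck": 1109.6},
    "Dec": {"Passenger": 200.3, "Light truck": 376.3, "Giant truck": 1148.0},
}

def laspeyres_index(prices_now, prices_base, qty_base):
    """Laspeyres index: current prices and base prices, both weighted by base quantities."""
    current_value = sum(prices_now[k] * qty_base[k] for k in qty_base)
    base_value = sum(prices_base[k] * qty_base[k] for k in qty_base)
    return current_value / base_value * 100

composite = {m: laspeyres_index(cost[m], cost["Jan"], base_qty) for m in cost}
for month, value in composite.items():
    print(month, round(value, 1))
```

Holding the weights at January volumes means that any movement in the index reflects cost changes only, which is why the December value can fall below 100 even though December output was far lower.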
(d) Passenger Light truck Giant truck Total Value Volume Index PRODUCTION VOLUME ANALYSIS Laspeyres Quantity Index (Weighted Aggregates) Jan Feb March April May June July Aug Sept Oct Nov Dec 16433.82 21490 19594 17066 22122 21069 24651 22122 20648 23175.9 20437 9060 4140.95 5270.3 4893.9 4517.4 6023.2 6023.2 7152.6 6400 5270.3 5646.75 4893.9 2259 11711 16395 15224 12882 17567 17566.5 18738 18738 18738 17566.5 16395 5856 32285.77 43156 39712 34465 45712 44658.7 50541 47260 44656 46389.2 41726 17174 Jan 100.0 Feb 133.7 March 123.0 April 106.8 May 141.6 June 138.3 July 156.5 Aug 146.4 Sept 138.3 Oct 143.7 Nov 129.2 Laspeyres Composite Production Output (Volume) Index 180.0 160.0 140.0 120.0 100.0 80.0 60.0 40.0 20.0 Volume Index 0.0 Jan (e) Feb March April May June July Aug Sept Oct Nov Dec Interpretation Production volumes of all three makes of tyres showed a steady increase above the January level by up to nearly 60% over the first half of the year. However output showed a steady decline in the second half of the year ending at only 53.2% of the beginning of the year levels. Dec 53.2 CHAPTER 15 TIME SERIES ANALYSIS A FORECASTING TOOL Exercise 15.1 Cross-sectional data is gathered at one point in time; Time series data is recorded at fixed intervals over time. Exercise 15.2 Monthly national new car sales; Daily maximum temperature for Cape Town. Exercise 15.3 A line graph Exercise 15.4 Trend; Cycles; Seasonality; Irregular (random) Seasonality tends to show the most regularity. Exercise 15.5 The Moving Average method smooths out short-term fluctuations in a time series to allow the longer-term underlying trend and cyclical patterns to be revealed. Exercise 15.6 Yes, averaging occurs over a longer time period (i.e. five periods) producing a smoother curve. Exercise 15.7 A seasonal index of 108 means that seasonal influences stimulate the time series values by 8% above the trend / cyclical level. 
Exercise 15.8
A seasonal index of 88 means that seasonal influences depress the time series values by 12% below the trend / cyclical level.

Exercise 15.9
File: X15.9 - coal tonnage.xlsx

(a) to (d) Moving averages of coal tonnage
(Each uncentred 4-year moving total is shown against the third year of its four-year window.)

Year   Coal       Uncentred 4-year   Centred 4-year    Centred 5-year
       Tonnage    Moving Total       Moving Average    Moving Average
  1    118
  2    124
  3    108        470                119.25            120.4
  4    120        484                119.875           119.8
  5    132        475                120.5             119.4
  6    115        489                125.75            127.4
  7    122        517                132.75            135.4
  8    148        545                145.375           146.6
  9    160        618                164.375           163.8
 10    188        697                177.5             174.2
 11    201        723                184.625           182.8
 12    174        754                187.25            186.4
 13    191        744                179.125           178
 14    178        689                170.625           170
 15    146        676
 16    161

[Line graph: Coal mined (100 000 tonnes) by year (1 to 16), with centred 4-year and centred 5-year moving averages]

(e) Interpretation
The annual tonnage of coal mined in the Limpopo province was constant for the first 7 years, after which there was an expansion phase for the next 4 years. Thereafter coal production has been declining steadily. This could be evidence of a cyclical effect caused by economic cycles in the demand for coal worldwide. Overall, there is a moderate upward cyclical trend.

Exercise 15.10
File: X15.10 - franchise dealers.xlsx

(a) Line Graph - New Franchise Dealers
[Line graph: no. of new dealers against time periods 1 to 10, with trend line]

(b) n = 10

Period (x)   Dealers (y)   x²     xy
  1          28              1     28
  2          32              4     64
  3          43              9    129
  4          31             16    124
  5          38             25    190
  6          47             36    282
  7          40             49    280
  8          45             64    360
  9          55             81    495
 10          42            100    420
Σ 55         401           385   2372

b1 = ((10*2372)-(55*401))/(10*385-55²) = 2.0182
b0 = (401-2.0182*55)/10 = 29
ŷ = 29 + 2.0182 x      x = 1, 2, 3, …, 10

(c) Trend Estimates

x     substitute into ŷ      ŷ
11    29 + 2.0182 (11)       51.20
12    29 + 2.0182 (12)       53.22
13    29 + 2.0182 (13)       55.24

The number of new franchise dealers is likely to be 51, 53 and 55 in periods 11, 12 and 13 respectively, based on trend estimates.
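The least squares calculation in Exercise 15.10 (b) and (c) follows the same summation formulas used throughout this chapter. A minimal Python sketch of those formulas is given below as a cross-check; the function name fit_trend is an assumption made for illustration, and the time variable is coded x = 1, 2, 3, ... as in the solution.

```python
def fit_trend(y):
    """Least squares trend line y-hat = b0 + b1*x with x coded 1, 2, ..., n."""
    n = len(y)
    x = range(1, n + 1)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b0 = (sum_y - b1 * sum_x) / n
    return b0, b1

dealers = [28, 32, 43, 31, 38, 47, 40, 45, 55, 42]
b0, b1 = fit_trend(dealers)

# Trend estimates for periods 11, 12 and 13
forecasts = {x: b0 + b1 * x for x in (11, 12, 13)}
print(round(b0, 4), round(b1, 4))
print({x: round(v, 2) for x, v in forecasts.items()})
```

The same function can be reused for the other trend line exercises by passing in the relevant y series, since the x coding is generated from the length of the data.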
Exercise 15.11 (a) File: X15.11 - policy claims.xlsx Line Graph - Household Policy Claims no. of claims Line Graph - Household Policy Claims 90 80 70 60 50 40 30 20 10 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 quarters (b) n = 16 Σ Period (x) Claims (y) 1 84 2 53 3 60 4 75 5 81 6 57 7 51 8 73 9 69 10 37 11 40 12 77 13 73 14 46 15 39 16 63 136 978 x² 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 1496 xy 84 106 180 300 405 342 357 584 621 370 440 924 949 644 585 1008 7899 b1 = =[(16)(7899) - (136)(978)]/[(16*1496)-(136)2] b0 = =(978-(-1.21765)*136)/16 ŷ= 71.475 - 1.21765 x x= 1 in Q1 2008 2 in Q2 2008 3 in Q3 2008 -1.21765 71.475 Interpretation There is a downward trend in household policy claims over the past 4 years. 16 (c) Time Periods Claims 2008 Q1 2008 Q2 2008 Q3 2008 Q4 2009 Q1 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2 2010 Q3 2010 Q4 2011 Q1 2011 Q2 2011 Q3 2011 Q4 84 53 60 75 81 57 51 73 69 37 40 77 73 46 39 63 Uncentred 4 period Moving Total 2x4 period Moving Total 272 269 273 264 262 250 230 219 223 227 236 235 221 541 542 537 526 512 480 449 442 450 463 471 456 Seasonal Indexes Q1 Q2 Q3 Q4 Centred 4 period Moving Average 67.625 67.75 67.125 65.75 64 60 56.125 55.25 56.25 57.875 58.875 57 Seasonal ratios Unadjusted Adjusted Seasonal Seasonal Indexes Indexes 88.725 110.701 120.670 86.692 79.688 121.667 122.940 66.968 71.111 133.045 123.992 80.702 79.688 121.667 122.940 80.702 78.705 120.166 121.423 79.706 Totals 404.996 400.000 121.4 79.7 78.7 120.2 Interpretation Household policy claims tend to increase significantly in Quarters 1 and 4 of each year, by about 20% on average, while there is a significant decline in claims during Quarters 2 and 3 by about 20% on average. 
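The ratio-to-moving-average steps of Exercise 15.11 (c) can be sketched in Python as shown below. This is an illustrative reconstruction, not the author's worksheet: the function name seasonal_indexes is an assumption, and the unadjusted index for each quarter is taken as the median of that quarter's seasonal ratios, which is the choice that appears to reproduce the unadjusted figures in the table above.

```python
from statistics import median

def seasonal_indexes(y, period=4):
    """Adjusted seasonal indexes via the ratio-to-moving-average method (even period)."""
    half = period // 2
    ratios = {s: [] for s in range(period)}
    for t in range(half, len(y) - half):
        # 2 x period moving total: two overlapping uncentred totals around position t
        total = sum(y[t - half:t + half]) + sum(y[t - half + 1:t + half + 1])
        centred_ma = total / (2 * period)
        ratios[t % period].append(y[t] / centred_ma * 100)
    unadjusted = [median(ratios[s]) for s in range(period)]
    scale = 100 * period / sum(unadjusted)   # force the indexes to sum to 400
    return [u * scale for u in unadjusted]

claims = [84, 53, 60, 75, 81, 57, 51, 73, 69, 37, 40, 77, 73, 46, 39, 63]
adjusted = seasonal_indexes(claims)
print([round(v, 1) for v in adjusted])   # order: Q1, Q2, Q3, Q4
```

Using the median rather than the mean limits the influence of an unusual quarter, such as the very low Q2 2010 ratio, on the final index.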
(d) Seasonally-adjusted trend estimate of household policy claims

Trend estimates
x = 17 in Quarter 1 2012:   ŷ = 71.475 - 1.21765 (17) = 50.77495
x = 18 in Quarter 2 2012:   ŷ = 71.475 - 1.21765 (18) = 49.5573

Seasonally-adjusted trend estimates
Q1 2012:   ŷ (adj) = 50.77495*1.214 = 61.64
Q2 2012:   ŷ (adj) = 49.5573*0.797 = 39.5

Interpretation
The insurance company can expect to receive 62 and 40 (rounded) household policy claims in the first and second quarters of 2012 respectively.

Exercise 15.12
File: X15.12 - hotel occupancy.xlsx

(a) Line Graph - Monthly Hotel Occupancy Rate (%)
[Line graph: occupancy rate (%) against months 1 to 10 (Sept - June)]

(b) n = 10

Month (x)   Rate (y)   x²     xy
  1         74           1     74
  2         82           4    164
  3         70           9    210
  4         90          16    360
  5         88          25    440
  6         74          36    444
  7         64          49    448
  8         69          64    552
  9         58          81    522
 10         65         100    650
Σ 55        734        385   3864

b1 = (10*3864 - 55*734)/(10*385 - 55²) = -2.09697
b0 = (734 -(-2.09697)*55)/10 = 84.933
ŷ = 84.933 - 2.09697 x      x = 1 in Sept, 2 in Oct, 3 in Nov, ...

Interpretation
There is a downward trend in hotel occupancy over the past 10 months since September.

(c) Trend estimate

Period   x     ŷ
July     11    61.87%
Aug      12    59.77%

Interpretation
The continued downward trend is reflected in the projections for the next two months.

Exercise 15.13
File: X15.13 - electricity demand.xlsx

(a), (e), (f) Line Graph - Electricity Demand for a City (Cape Town)
[Line graph: Cape Town electricity demand (1000 megawatts) against quarters 1 to 16 (Q1 2008 - Q4 2011), with trend line y = 4.95x + 20.8]

Interpretation
Electricity demand in Cape Town shows a clear seasonal pattern, peaking in quarter 3 and bottoming out in quarter 4 of each year.
(b) n = 16

Period (x)   Demand (y)   x²      xy
  1           21            1      21
  2           42            4      84
  3           60            9     180
  4           12           16      48
  5           35           25     175
  6           54           36     324
  7           91           49     637
  8           14           64     112
  9           39           81     351
 10           82          100     820
 11          136          121    1496
 12           28          144     336
 13           78          169    1014
 14          114          196    1596
 15          160          225    2400
 16           40          256     640
Σ 136        1006         1496   10234

b1 = (16*10234 - 136*1006)/(16*1496 - 136²) = 4.95
b0 = (1006 -(4.95)*136)/16 = 20.8
ŷ = 20.8 + 4.95 x      x = 1 in Q1 2008, 2 in Q2 2008, 3 in Q3 2008, ...

(c) Seasonal Indexes
(Each uncentred 4 period moving total is shown against the third period of its four-period window.)

Time       Demand   Uncentred 4 period   2x4 period     Centred 4 period   Seasonal
Periods             Moving Total         Moving Total   Moving Average     ratios
2008 Q1     21
2008 Q2     42
2008 Q3     60      135                  284            35.5               169.01
2008 Q4     12      149                  310            38.75               30.97
2009 Q1     35      161                  353            44.125              79.32
2009 Q2     54      192                  386            48.25              111.92
2009 Q3     91      194                  392            49                 185.71
2009 Q4     14      198                  424            53                  26.42
2010 Q1     39      226                  497            62.125              62.78
2010 Q2     82      271                  556            69.5               117.99
2010 Q3    136      285                  609            76.125             178.65
2010 Q4     28      324                  680            85                  32.94
2011 Q1     78      356                  736            92                  84.78
2011 Q2    114      380                  772            96.5               118.13
2011 Q3    160      392
2011 Q4     40

Quarter   Unadjusted Seasonal Indexes   Adjusted Seasonal Indexes
Q1         79.32                         77.97
Q2        117.99                        115.98
Q3        178.65                        175.61
Q4         30.97                         30.44
Totals    406.927                       400.000

Interpretation
Electricity demand peaks in Q3 at 75% above the trend / cyclical level, and drops to 70% below the trend / cyclical level during Q4.

(d) Seasonally-adjusted trend estimate of Cape Town's electricity demand

Period    x     Trend ŷ   Seasonal Index   Seasonally adjusted Trend
Q3 2012   19    114.85    175.61           201.69
Q4 2012   20    119.8      30.44            36.47

Interpretation
Electricity demand in Cape Town is likely to peak at 201.69 (in 1000 megawatts) in Q3 of 2012 and bottom out at 36.47 (in 1000 megawatts) in Q4 of 2012.
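The final step of Exercise 15.13 (d) multiplies each trend estimate by its seasonal index (expressed as a proportion). A minimal Python sketch of that step follows; the names trend and adjusted_forecast are illustrative assumptions, while the coefficients and adjusted seasonal indexes are the values derived in parts (b) and (c).

```python
# Trend line coefficients from part (b): y-hat = 20.8 + 4.95 x
B0, B1 = 20.8, 4.95

# Adjusted seasonal indexes from part (c)
seasonal_index = {"Q1": 77.97, "Q2": 115.98, "Q3": 175.61, "Q4": 30.44}

def trend(x):
    """Trend estimate for time period x (x = 1 in Q1 2008)."""
    return B0 + B1 * x

def adjusted_forecast(x, quarter):
    """Seasonally adjusted trend estimate = trend estimate * seasonal index / 100."""
    return trend(x) * seasonal_index[quarter] / 100

q3_2012 = adjusted_forecast(19, "Q3")
q4_2012 = adjusted_forecast(20, "Q4")
print(round(q3_2012, 2), round(q4_2012, 2))
```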
Exercise 15.14 File: (a) Seaonal Index 136 112 62 90 136 112 62 90 136 112 62 90 Actual 568 495 252 315 604 544 270 510 662 605 310 535 (b) and (e) X15.14 - hotel turnover.xlsx Deseasonalised 417.6 442.0 406.5 350.0 444.1 485.7 435.5 566.7 486.8 540.2 500.0 594.4 Hotel Industry Quarterly Turnover turnover (R millions) 700 600 500 400 300 200 Actual 100 0 1 2 3 4 5 6 7 De-seasonalised 8 9 10 11 quarters (Summer 2008 - Spring 2010) (c) Trend line estimate ŷ n = 12 Σ Period (x) 1 2 3 4 5 6 7 8 9 10 11 12 78 T/over (y) 568 495 252 315 604 544 270 510 662 605 310 535 5670 x² 1 4 9 16 25 36 49 64 81 100 121 144 650 xy 568 990 756 1260 3020 3264 1890 4080 5958 6050 3410 6420 37666 12 (f) (12*37666 - 78*5670)/(12*650 - 78²) = 5.6713 b0 = (5670 - (5.6713)*78)/12 = 435.64 ŷ= 435.64 + 5.6713 x 1 in Summer 2008 2 in Autumn 2008 3 in Winter 2008 x= Seasonally-adjusted trend estimate of Hotel Industry Quarterly Turnover Period x Summer 2011 Autumn 2011 13 14 Seasonal Index Trend ŷ 509.37 515.04 136 112 Seasonally adjusted Trend 692.74 576.84 Trend line estimation using Excel's Chart Wizard Hotel Industry Quarterly Turnover y = 5.6713x + 435.64 700 turnover (R millions) (d) b1 = 600 500 400 300 200 100 Actual 1 2 3 4 5 6 7 quarters 8 Linear (Actual) 9 10 11 12 Exercise 15.15 File: X15.15 - farming equipment.xlsx Plot of Farming Equipment Sales (2008 - 2011) Quarterly Farming Implements Sales y = 0.8544x + 52.05 75 70 no. 
units sold 65 60 55 50 45 Sales 40 Linear (Sales) 35 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 quarters (2008 - 2011) (a) Seasonal Indexes for Farming Equipment Sales Periods Sales 2008 Q1 2008 Q2 2008 Q3 2008 Q4 2009 Q1 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2 2010 Q3 2010 Q4 2011 Q1 2011 Q2 2011 Q3 2011 Q4 57 51 50 56 60 56 53 61 65 60 58 68 64 62 58 70 Uncentred 4 period Moving Total 214 217 222 225 230 235 239 244 251 250 252 252 254 2x4 period Moving Total 431 439 447 455 465 474 483 495 501 502 504 506 Seasonal Indexes (b) Centred 4 period Moving Average 53.88 54.88 55.88 56.88 58.13 59.25 60.38 61.88 62.63 62.75 63.00 63.25 Summer Autumn Winter Spring Seasonal Ratios Unadjusted Adjusted Seasonal Seasonal Indexes Indexes 92.81 102.05 107.38 98.46 91.18 102.95 107.66 96.97 92.61 108.37 101.59 98.02 92.61 102.95 107.38 96.97 92.63 102.97 107.40 97.00 Totals 399.92 400 107.40 97.00 92.63 102.97 Seasonal Influences The influence of seasonal forces on farming equipment sales is modest. There is a small stimulatory effect during Spring and Summer (no more than 7%) and a small depressing effect (also no more than 7%) during Autumn and Winter. (c) n = 16 Σ (d) Period (x) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 136 Sales (y) 57 51 50 56 60 56 53 61 65 60 58 68 64 62 58 70 949 x² 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 1496 xy 57 102 150 224 300 336 371 488 585 600 638 816 832 868 870 1120 8357 b1 = (16*8357-136*949)/(16*1496-136²) = 0.8544 b0 = (949 -(0.8544)*136)/16 = 52.05 ŷ= 52.05 + 0.8544 x x= 1 in Summer 2008 2 in Autumn 2008 3 in Winter 2008 Seasonally-adjusted trend estimate of Farming Equipment Sales for 2012 Period x Summer Autumn Winter Spring 17 18 19 20 Trend ŷ 66.57 67.43 68.28 69.14 Seasonal Index 107.40 97.00 92.63 102.97 Seasonally adjusted Trend 71.50 65.41 63.25 71.19 Interpretation The company can expect to sell between 63 and 72 farming implements each quarter during 2012 with the higher sales expected in Summer and Spring. 
Exercise 15.16
File: X15.16 - energy costs.xlsx

(a), (e) and (f) Line Graph of Office Complex Energy Costs (in R100 000)
[Line graph: energy cost (R100 000) against quarters 1 to 12 (2009 - 2011), with trend line y = 0.0601x + 3.1091]

(b) Seasonal Indexes
(Each uncentred 4 period moving total is shown against the third period of its four-period window.)

Time       Costs   Uncentred 4 period   2x4 period     Centred 4 period   Seasonal
Periods            Moving Total         Moving Total   Moving Average     ratios
2009 Q1    2.4
2009 Q2    3.8
2009 Q3    4       13.3                 26.80          3.35               119.40
2009 Q4    3.1     13.5                 27.30          3.4125              90.84
2010 Q1    2.6     13.8                 27.70          3.4625              75.09
2010 Q2    4.1     13.9                 27.90          3.4875             117.56
2010 Q3    4.1     14.0                 28.00          3.5                117.14
2010 Q4    3.2     14.0                 28.40          3.55                90.14
2011 Q1    2.6     14.4                 29.00          3.625               71.72
2011 Q2    4.5     14.6                 29.30          3.6625             122.87
2011 Q3    4.3     14.7
2011 Q4    3.3

Season    Unadjusted Seasonal Indexes   Adjusted Seasonal Indexes
Summer     73.41                         73.0
Autumn    120.21                        119.5
Winter    118.27                        117.6
Spring     90.49                         90.0
Totals    402.39                        400

Interpretation
Energy costs rise by nearly 20% (19,5% and 17,6%) over the colder months of Autumn and Winter, and decline by between 10% (in Spring) and almost 30% (27%) in Summer.

(c) n = 12

Period (x)   Cost (y)   x²     xy
  1          2.4          1     2.4
  2          3.8          4     7.6
  3          4            9    12
  4          3.1         16    12.4
  5          2.6         25    13
  6          4.1         36    24.6
  7          4.1         49    28.7
  8          3.2         64    25.6
  9          2.6         81    23.4
 10          4.5        100    45
 11          4.3        121    47.3
 12          3.3        144    39.6
Σ 78         42         650   281.6

b1 = [(12*281.6)-(78*42)]/[12*650-78²] = 0.0601
b0 = (42 - 0.06014*78)/12 = 3.1091
ŷ = 3.1091 + 0.0601 x      x = 1 in Summer 2009, 2 in Autumn 2009, 3 in Winter 2009, ...

(d) Seasonally-adjusted trend estimate of Office Complex Energy Costs for 2012

Period   x     Trend ŷ   Seasonal Index   Seasonally adjusted Trend
Summer   13    3.89       72.97           2.84
Autumn   14    3.95      119.50           4.72
Winter   15    4.01      117.57           4.71
Spring   16    4.07       89.95           3.66

Interpretation
The office complex manager must budget between R284 000 and R472 000 each quarter during 2012, with higher costs expected in the Autumn and Winter periods.
Exercise 15.17 (a) File: X15.17 - business registrations.xlsx Line Graph of New Business Registrations (2007 - 2011) Line Graph - New Business Registrations no. of new registrations 2500 2000 1500 1000 500 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 quarters (2007 - 2011) (b) and (c) 4-Period Moving Average and Quarterly Seasonal Indexes Uncentred 4 Time New period Moving Periods Registrations Total 2007 Q1 2007 Q2 2007 Q3 2007 Q4 2008 Q1 2008 Q2 2008 Q3 2008 Q4 2009 Q1 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2 2010 Q3 2010 Q4 2011 Q1 2011 Q2 2011 Q3 2011 Q4 1005 1222 1298 1199 1173 1371 1456 1376 1314 1531 1605 1530 1459 1671 1762 1677 1604 1837 1916 1819 4724 4892 5041 5199 5376 5517 5677 5826 5980 6125 6265 6422 6569 6714 6880 7034 7176 2x4 period Moving Total (b) Centred 4 period Moving Average 9616 9933 10240 10575 10893 11194 11503 11806 12105 12390 12687 12991 13283 13594 13914 14210 1202.00 1241.63 1280.00 1321.88 1361.63 1399.25 1437.88 1475.75 1513.13 1548.75 1585.88 1623.88 1660.38 1699.25 1739.25 1776.25 Seasonal Indexes Q1 Q2 Q3 Q4 (c) Seasonal ratios Unadjusted Adjusted Seasonal Seasonal Indexes Indexes 107.99 96.57 91.64 103.72 106.93 98.34 91.38 103.74 106.07 98.79 92.00 102.90 106.12 98.69 92.22 103.42 106.526 98.514 91.820 103.568 106.41 98.41 91.72 103.46 Totals 400.429 400 91.72 103.46 106.41 98.41 (d) Interpretation of Seasonal Influences on new business registrations New business registrations show a modest seasonal pattern, ranging between 8,3% below the annual average in quarter 1 to 6.4% above the annual average in quarter 3. (b) 4-Period Moving Average Line Graph and Original Data Line Graph Line Graph - New Business Registrations no. 
of new businesses 2500 2000 1500 1000 New Registrations 500 Centred 4 period Moving Average 0 quarters (2007 - 2011) (e) Least Squares trend line (using Add Trendline in Excel ) ŷ= 1078.2 + 39.338 x x= 1 in Q1 2007 2 in Q2 2007 3 in Q3 2007 Periods y Adjusted Seasonal Indexes 2007 Q1 2007 Q2 2007 Q3 2007 Q4 2008 Q1 2008 Q2 2008 Q3 2008 Q4 2009 Q1 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2 2010 Q3 2010 Q4 2011 Q1 2011 Q2 2011 Q3 2011 Q4 1005 1222 1298 1199 1173 1371 1456 1376 1314 1531 1605 1530 1459 1671 1762 1677 1604 1837 1916 1819 91.72 103.46 106.41 98.41 91.72 103.46 106.41 98.41 91.72 103.46 106.41 98.41 91.72 103.46 106.41 98.41 91.72 103.46 106.41 98.41 Trend Estimate ŷ ŷ (adj) 1117.5 1156.9 1196.2 1235.6 1274.9 1314.2 1353.6 1392.9 1432.2 1471.6 1510.9 1550.3 1589.6 1628.9 1668.3 1707.6 1746.9 1786.3 1825.6 1865.0 1025.0 1196.9 1272.9 1215.9 1169.3 1359.7 1440.3 1370.8 1313.7 1522.5 1607.8 1525.6 1458.0 1685.3 1775.2 1680.5 1602.3 1848.1 1942.6 1835.3 Plot of Actual vs Seasonally adjusted Trend estimates (New Business Registrations) New Business Registrations (Actual vs Trend Adjusted) 2000 1800 1600 1400 1200 quarters (2007 - 2011) Comment The seasonally adjusted trend estimates track the actual number of new business registrations very closely. It is a good fitting graph. 2011 Q4 2011 Q3 2011 Q2 2011 Q1 ŷ (adj) 2010 Q4 2010 Q2 2010 Q1 2009 Q4 2009 Q3 2009 Q2 2009 Q1 2008 Q4 2008 Q3 2008 Q2 2008 Q1 2007 Q4 2007 Q3 2007 Q2 800 2010 Q3 y 1000 2007 Q1 (g) Seasonally-adjusted Trend estimates no. 
of new registrations (f) (h) Seasonally-adjusted trend estimate of New Business Registrations (2012) Period x Trend ŷ Q1 2012 Q2 2012 Q3 2012 Q4 2012 21 22 23 24 1904.3 1943.6 1983.0 2022.3 Seasonal Index 91.72 103.46 106.41 98.41 Seasonally adjusted Trend 1746.6 2010.9 2110.1 1990.2 Exercise 15.18 File: Total Sales (Estimate for next year) Quarter 1 2 3 4 Seasonal Index 95 115 110 80 Seasonally Trend ŷ Adjusted Trend Estimate ŷ(adj) 12 11.4 12 13.8 12 13.2 12 9.6 48 48 X15.18 - engineering sales.xlsx Exercise 15.19 (a) and (b) File: Period Wi 2009 Sp 2009 Su 2009 Au 2009 Wi 2010 Sp 2010 Su 2010 Au 2010 Visitors 18.1 26.4 41.2 31.6 22.4 33.2 44.8 32.5 S Index 62 89 162 87 62 89 162 87 X15.19 - Table Mountain.xlsx De-Seasonlised 29.19 29.66 25.43 36.32 36.13 37.30 27.65 37.36 no. of visitors (1000's) Table Mountain Visitors (Actual vs De-Seasonalised) 50 45 40 35 30 25 20 15 10 5 0 Visitors De-Seasonlised Wi 2009 Sp 2009 Su 2009 Au 2009 Wi 2010 Sp 2010 Su 2010 Au 2010 Quarters (2009 - 2010) Comment The trend (after de-seasonalising the quarterly data) is only marginally upwards. 
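De-seasonalising, as done for the Table Mountain visitor data in Exercise 15.19, divides each actual value by its seasonal index and multiplies by 100. The short Python sketch below reproduces that column as a check; the variable names are illustrative assumptions and the data are those listed in the exercise.

```python
# Quarterly visitors (1000's), Winter 2009 to Autumn 2010
visitors = [18.1, 26.4, 41.2, 31.6, 22.4, 33.2, 44.8, 32.5]

# Seasonal indexes in the order Winter, Spring, Summer, Autumn, repeated for two years
seasonal_index = [62, 89, 162, 87] * 2

# De-seasonalised value = actual / seasonal index * 100
deseasonalised = [round(v / s * 100, 2) for v, s in zip(visitors, seasonal_index)]
print(deseasonalised)
```

Removing the seasonal effect lifts the Winter figures and lowers the Summer figures, which makes the underlying movement from 2009 to 2010 easier to compare quarter by quarter.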
(c) Table Mountain Visitors (per quarter for 2011) Quarter Seasonal Index Winter Spring Summer Autumn 62 89 162 87 Seasonally Trend ŷ Adjusted Trend Estimate ŷ(adj) 37.5 37.5 37.5 37.5 150 23.25 33.375 60.75 32.625 150 Exercise 15.20 (a) File: Year 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 X15.20 - gross domestic product.xlsx GDP (Millions) 45 47 61 64 72 74 84 81 93 90 98 African Country Domestic Product Gross y = 5.2636x + 41.964 GDP (in R100 millions) 120 100 80 60 40 GDP (Millions) 20 0 Linear (GDP (Millions)) 1 2 3 4 5 6 7 8 9 10 11 years (2001 - 2011) ŷ = 41.964 + 5.2636 x (b) Trend line (see graph in (i)) (c) Expected GDP for 2012 and 2013 (in R100 million) Year 2012 2013 Year (x ) 12 13 Trend ŷ 41.964 + 5.2636 (12) 41.964 + 5.2636 (13) x = Estimate (ŷ) 105.1 110.4 1 in 2001 2 in 2002 3 in 2003 Exercise 15.21 File: (a) and (c) (a) Year Months 2007 Jan - Apr May - Aug Sept - Dec Jan - Apr May - Aug Sept - Dec Jan - Apr May - Aug Sept - Dec Jan - Apr May - Aug Sept - Dec 2008 2009 2010 (c) X15.21 - pelagic fish.xlsx Pelagic fish 3 Period Moving Ave 44 36 34 45 42 34 38 32 27 40 31 28 Seasonal ratio 38.00 38.33 40.33 40.33 38.00 34.67 32.33 33.00 32.67 33.00 (c) Unadj Seasonal Index Seasonal Index 96.85 88.70 111.57 97.79 89.56 112.65 297.12 300 94.74 88.70 111.57 104.13 89.47 109.62 98.97 81.82 122.45 93.94 Totals Seasonal Indexes 112.65 97.79 89.56 Periods Jan - Apr May - Aug Sept - Dec Pelagic fish catches are: - significantly higher (12.65%) than the trend /cyclical volumes in the Jan - April period. - only marginally lower (2.21%) than the trend /cyclical volumes in the May - Aug period. - more than 10% lower than the trend /cyclical volumes in the Sept - Dec period. (b) no. of tonnes Pelagic Fish Catches (Actual vs 3-Period Moving Average) 50 45 40 35 30 25 20 15 10 5 0 Pelagic fish 3 Period Moving Ave 1 2 3 4 5 6 7 8 9 10 11 2003 - 2006 (4-monthly) Comment The trend in pelagic fish catches over the past 4 years is decidedly downwards. 
(d) n = 12

Period (x)   Catch (y)   x²     xy
  1          44            1     44
  2          36            4     72
  3          34            9    102
  4          45           16    180
  5          42           25    210
  6          34           36    204
  7          38           49    266
  8          32           64    256
  9          27           81    243
 10          40          100    400
 11          31          121    341
 12          28          144    336
Σ 78         431         650   2654

b1 = (12*2654 - 78*431)/(12*650 - 78²) = -1.0315
b0 = (431 - (-1.031469)*78)/12 = 42.621
ŷ = 42.621 - 1.0315 x      x = 1 in Jan - Apr 2007, 2 in May - Aug 2007, 3 in Sept - Dec 2007, ...

(e) Seasonally-adjusted trend estimates of Pelagic Fish Catches for 2011

Period 2011   x     Trend ŷ   Seasonal Index   Seasonally adjusted Trend
Jan - Apr     13    29.21     112.65           32.91
May - Aug     14    28.18      97.79           27.56
Sept - Dec    15    27.15      89.56           24.32

Exercise 15.22
File: X15.22 - share price.xlsx

Trend equation (derived from Excel's Add Trendline)
ŷ = 89.429 - 2.7143 x      x = 1 in January, 2 in February, 3 in March, ...

Month      Period   Share price   Trend Price calc          Trend Price
January     1       90            = 89.429 - 2.7143(1)      86.7
February    2       82            = 89.429 - 2.7143(2)      84.0
March       3       78            = 89.429 - 2.7143(3)      81.3
April       4       80            = 89.429 - 2.7143(4)      78.6
May         5       74            = 89.429 - 2.7143(5)      75.9
June        6       76            = 89.429 - 2.7143(6)      73.1
July        7       70            = 89.429 - 2.7143(7)      70.4
Aug         8                     = 89.429 - 2.7143(8)      67.7
Sept        9                     = 89.429 - 2.7143(9)      65.0
Oct        10                     = 89.429 - 2.7143(10)     62.3
Nov        11                     = 89.429 - 2.7143(11)     59.6

Threshold selling price = 90 - 1/3 (90) = 60c

[Line graph: Share Price Trend, price (cents) against months 1 to 11 (Jan - Nov), showing share price and trend price, y = 89.429 - 2.7143 x]

Conclusion
The trend estimate of the share price is likely to fall below 60c (the selling level) by November of the same year if the downward trend continues uninterrupted.
Exercise 15.23 (a) File: Season Actual Su 2008 Au 2008 Wi 2008 Sp 2008 Su 2009 Au 2009 Wi 2009 Sp 2009 Su 2010 Au 2010 Wi 2010 Sp 2010 196 147 124 177 199 152 132 190 214 163 145 198 Seaonal Index X15.23 - Addo park.xlsx Deseasonalised 112 94 88 106 112 94 88 106 112 94 88 106 175.0 156.4 140.9 167.0 177.7 161.7 150.0 179.2 191.1 173.4 164.8 186.8 (b) and (d) Line Plots of Actual and De-Seasonalised Visitor Numbers (by Season) Addo National Park Visitors (Actual vs De-Seasonalised) no. of visitors (in 1000s) 250 200 150 100 Actual 50 0 De-seasonalised Su 2008 Au 2008 Wi 2008 Sp 2008 Su 2009 Au 2009 Wi 2009 Sp 2009 Su 2010 Au 2010 Wi 2010 Sp 2010 quarters (2004 - 2006) (c) Conclusion There is a very slight upward trend in visitors to the Addo National Park. There is almost no growth in visitors over the past 3 years. Exercise 15.24 File: X15.24 - healthcare claims.xlsx Value of Healthcare Claims (in R millions) (a) Seasonal Indexes for Healthcare Claims Periods Claims 2007 Q1 2007 Q2 2007 Q3 2007 Q4 2008 Q1 2008 Q2 2008 Q3 2008 Q4 2009 Q1 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2 2010 Q3 2010 Q4 11.8 13.2 19.1 16.4 10.9 12.4 22.4 17.8 12.2 16.2 24.1 14.6 12.8 14.5 20.8 16.1 Uncentred Centred 4 2x4 period 4 period period Moving Moving Moving Total Total Average 60.5 59.6 58.8 62.1 63.5 64.8 68.6 70.3 67.1 67.7 66 62.7 64.2 120.1 118.4 120.9 125.6 128.3 133.4 138.9 137.4 134.8 133.7 128.7 126.9 Seasonal Indexes (b) n = 16 Σ Period (x) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 136 Claims (y) 11.8 13.2 19.1 16.4 10.9 12.4 22.4 17.8 12.2 16.2 24.1 14.6 12.8 14.5 20.8 16.1 255.3 15.0125 14.8 15.1125 15.7 16.0375 16.675 17.3625 17.175 16.85 16.7125 16.0875 15.8625 Q1 Q2 Q3 Q4 x² 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 1496 Seasonal Ratios 127.23 110.81 72.13 78.98 139.67 106.75 70.27 94.32 143.03 87.36 79.56 91.41 70.37 89.19 136.28 104.15 xy 11.8 26.4 57.3 65.6 54.5 74.4 156.8 142.4 109.8 162 265.1 175.2 166.4 203 312 257.6 2240.3 Unadjusted Seasonal Indexes 
Adjusted Seasonal Indexes 139.67 106.75 72.13 91.41 136.28 104.15 70.37 89.19 409.96 400 (c) b1 = (16*2240.3 - 136*255.3)/(16*1496 - 136²) = b0 = (255.3 -(0.206618)*136)/16 = ŷ= 14.2 + 0.2066 x 0.2066 14.2 x= 1 in Q1 2007 2 in Q2 2007 3 in Q3 2007 Seasonally-adjusted trend estimate of Healthcare Claims for 2011 Period x Q1 2011 Q2 2011 Q3 2011 Q4 2011 17 18 19 20 Trend ŷ Seasonal Index 17.71 17.92 18.13 18.33 70.37 89.19 136.28 104.15 Seasonally adjusted Trend 12.46 15.98 24.71 19.09 Interpretation Healthcare claims are expected to rise during 2011 from a low of R12,46 (mill) in Q1 to R24,71 (mill) in Q3. (d) Line Plots of Healthcare Claims (Actual vs Seasonally-adjusted trend estimates) Periods 2007 Q1 2007 Q2 2007 Q3 2007 Q4 2008 Q1 2008 Q2 2008 Q3 2008 Q4 2009 Q1 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2 2010 Q3 2010 Q4 2011 Q1 2011 Q2 2011 Q3 2011 Q4 Claims 11.8 13.2 19.1 16.4 10.9 12.4 22.4 17.8 12.2 16.2 24.1 14.6 12.8 14.5 20.8 16.1 S Index 70.37 89.19 136.28 104.15 70.37 89.19 136.28 104.15 70.37 89.19 136.28 104.15 70.37 89.19 136.28 104.15 70.37 89.19 136.28 104.15 Time 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Trend (ŷ) 14.41 14.61 14.82 15.03 15.23 15.44 15.65 15.85 16.06 16.27 16.47 16.68 16.89 17.09 17.30 17.51 17.71 17.92 18.13 18.33 Trend (ŷ-adj) 10.14 13.03 20.20 15.65 10.72 13.77 21.32 16.51 11.30 14.51 22.45 17.37 11.88 15.24 23.58 18.23 12.46 15.98 24.70 19.09 Healthcare Claims (Actual vs Seasonally-adjusted Trend Estimates) Claims (in R millions) 30 25 20 15 10 5 Claims 0 quarters (2007 - 2010) Trend (ŷ-adj) Exercise 15.25 (a) File: Year 2005 2006 2007 2008 2009 2010 2011 Sum n =7 Time (x) 1 2 3 4 5 6 7 28 Exp (y) 9.6 11.8 12 13.6 14.1 15 17.8 93.9 X15.25 - financial advertising.xlsx x² 1 4 9 16 25 36 49 140 xy 9.6 23.6 36 54.4 70.5 90 124.6 408.7 b1 = (7*408.7-28*93.9)/(7*140-282) = 1.1821 b0 = (93.9-1.182143*28)/7 8.6857 ŷ = 8.6857 + 1.1821 x (b) ŷ (2012) = 8.6857 + 1.1821(8) = (c) 1 in 2005 2 in 2006 3 in 2007 x= 18.143 Financial 
Services Sector Advertising Expenditure y = 1.1821x + 8.6857 Adspend (in R10 millions) 20 18 16 14 12 10 8 1 2 3 4 5 years (2005 - 2011) (d) Trend line equation (using Excel 's Add Trendline function) ŷ = 8.6857 + 1.1821 x x= 1 in 2005 2 in 2006 3 in 2007 6 7 Year 2005 2006 2007 2008 2009 2010 2011 2012 Time (x) 1 2 3 4 5 6 7 8 Actual (y) Trend (ŷ) 9.6 9.87 11.8 11.05 12 12.23 13.6 13.41 14.1 14.60 15 15.78 17.8 16.96 18.14 Financial Services Adspend 20 adspend (in R10 million) (e) 18 16 14 12 Actual (y) 10 8 Trend (ŷ) 1 2 3 4 5 years (2005 - 2011) 6 7 8 Exercise 15.26 (a) File: X15.26 - policy surrenders.xlsx Line Graph of Surrendered Endowment Policies (2008 - 2010) Surrendered Endowment Policies 240 no. of policies 220 200 180 160 140 Policies 120 100 2008 Q1 2008 Q2 2008 Q3 2008 Q4 2009 Q1 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2 2010 Q3 2010 Q4 quarters (2008 - 2010) Interpretation There is a distinct moderate downward trend in the number of surrendered policies over the period 2008 - 2010. This reduction could be due to the improved client communication strategy adopted by the company in recent years. (b) Time Periods Policies 2008 Q1 2008 Q2 2008 Q3 2008 Q4 2009 Q1 2009 Q2 2009 Q3 2009 Q4 2010 Q1 2010 Q2 2010 Q3 2010 Q4 212 186 192 205 186 165 169 182 169 158 162 178 Uncentred Centred 4 2x4 period 4 period period Moving Moving Moving Total Total Average 795 769 748 725 702 685 678 671 667 1564 1517 1473 1427 1387 1363 1349 1338 195.5 189.625 184.125 178.375 173.375 170.375 168.625 167.25 Seasonal Indexes Seasonal ratios 98.21 108.11 101.02 92.50 97.48 106.82 100.22 94.47 Unadjusted Seasonal Indexes Adjusted Seasonal Indexes 97.84 107.47 100.62 93.49 97.99 107.62 100.77 93.62 Totals 399.41 400 Q1 Q2 Q3 Q4 100.77 93.62 97.99 107.62 Interpretation Endowment policy surrenders are highest in Q4 by 7,62% over the quarterly average; and lowest in Q2 by 6,38% below the quarterly average. 
There is little to no seasonal impact on policy surrenders in Q1, and surrenders are only 2,01% below the quarterly average in Q3.

(c) n = 12

Period (x)   Policies (y)   x²     xy
  1          212              1     212
  2          186              4     372
  3          192              9     576
  4          205             16     820
  5          186             25     930
  6          165             36     990
  7          169             49    1183
  8          182             64    1456
  9          169             81    1521
 10          158            100    1580
 11          162            121    1782
 12          178            144    2136
Σ 78         2164           650   13558

b1 = (12*13558 - 78*2164)/(12*650-78²) = -3.55245
b0 = (2164 -(-3.552448)*78)/12 = 203.42
ŷ = 203.42 - 3.5524 x      x = 1 in Q1 2008, 2 in Q2 2008, 3 in Q3 2008, ...

(d) Seasonally-adjusted trend estimate of no. of Surrendered Endowment Policies in 2011

Period    x     Trend ŷ   Seasonal Index   Seasonally adjusted Trend   (Rounded)
Q1 2011   13    157.24    100.77           158.45                      158
Q2 2011   14    153.69     93.62           143.88                      144
Q3 2011   15    150.13     97.99           147.12                      147
Q4 2011   16    146.58    107.62           157.75                      158

Exercise 15.27
File: X15.27 - company liquidations.xlsx

(a) 3-period and 5-period Moving Averages

Period   Liquidations   3-per M A   5-per M A
  1      246
  2      243            252.7
  3      269            289.7       255.6
  4      357            263.0       237.2
  5      163            224.7       210.4
  6      154            142.0       189.0
  7      109            141.7       162.0
  8      162            164.3       184.0
  9      222            219.0       210.0
 10      273            259.7       249.2
 11      284            287.3       275.4
 12      305            294.0       300.6
 13      293            315.3       330.6
 14      348            354.7       332.0
 15      423            354.0       335.0
 16      291            344.7       327.0
 17      320            288.0       304.2
 18      253            269.0       252.0
 19      234            216.3       241.8
 20      162            212.0       237.4
 21      240            233.3       239.6
 22      298            267.3       243.4
 23      264            271.7       269.6
 24      253            270.0       282.0
 25      293            282.7       260.0
 26      302            261.0
 27      188

(b) Company Liquidations (Actual vs 3 and 5 period Moving Averages)
[Line graph: no. of liquidations against periods 1 to 27, showing Liquidations, 3-per M A and 5-per M A]

(c) Interpretation
The level of business liquidations shows no actual upward / downward trend over the past 27 periods. There is a distinct cyclical pattern with the longest cycle lasting from period 7 to period 20.
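The 3-period and 5-period moving averages in Exercise 15.27 can be recomputed with a short Python sketch such as the one below. The function name moving_average is an assumption made for illustration; it handles odd-length windows only, where the average is already centred on the middle period.

```python
def moving_average(y, k):
    """Centred k-period moving average for odd k, keyed by period number (1-based)."""
    half = k // 2
    return {
        t + 1: sum(y[t - half:t + half + 1]) / k
        for t in range(half, len(y) - half)
    }

liquidations = [246, 243, 269, 357, 163, 154, 109, 162, 222, 273, 284, 305, 293, 348,
                423, 291, 320, 253, 234, 162, 240, 298, 264, 253, 293, 302, 188]

ma3 = moving_average(liquidations, 3)   # periods 2 to 26
ma5 = moving_average(liquidations, 5)   # periods 3 to 25

for period in (2, 3, 25, 26):
    print(period, round(ma3[period], 1), round(ma5[period], 1) if period in ma5 else "")
```

The longer window drops one more period at each end of the series and produces the smoother of the two curves, consistent with the answer to Exercise 15.6.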
Exercise 15.28
File: X15.28 - passenger tyres.xlsx
Passenger Tyre Sales (Y)

(a) Quarterly Seasonal Ratios and Trend Line Equation

    Time       Tyre      Uncentred      2x4 period     Centred 4 period   Seasonal
    Period     Sales     4 period       Moving Total   Moving Average     ratios
                         Moving Total
    2005 Q1    64876
    2005 Q2    58987      240829
    2005 Q3    54621      244699         485528          60691.0            90.00
    2005 Q4    62345      252285         496984          62123.0           100.36
    2006 Q1    68746      258591         510876          63859.5           107.65
    2006 Q2    66573      267480         526071          65758.9           101.24
    2006 Q3    60927      277522         545002          68125.3            89.43
    2006 Q4    71234      282186         559708          69963.5           101.82
    2007 Q1    78788      289357         571543          71442.9           110.28
    2007 Q2    71237      292567         581924          72740.5            97.93
    2007 Q3    68098      291438         584005          73000.6            93.28
    2007 Q4    74444      296653         588091          73511.4           101.27
    2008 Q1    77659      302011         598664          74833.0           103.78
    2008 Q2    76452      306475         608486          76060.8           100.51
    2008 Q3    73456      313379         619854          77481.8            94.80
    2008 Q4    78908      318170         631549          78943.6            99.95
    2009 Q1    84563      319592         637762          79720.3           106.07
    2009 Q2    81243      327440         647032          80879.0           100.45
    2009 Q3    74878      334433         661873          82734.1            90.50
    2009 Q4    86756      338248         672681          84085.1           103.18
    2010 Q1    91556      340405         678653          84831.6           107.93
    2010 Q2    85058      333794         674199          84274.9           100.93
    2010 Q3    77035      345161         678955          84869.4            90.77
    2010 Q4    80145      356559         701720          87715.0            91.37
    2011 Q1   102923
    2011 Q2    96456

    Seasonal Indexes
                                   Q1        Q2        Q3        Q4       Totals
    Unadjusted Seasonal Indexes   107.65    100.51     90.64    100.81    399.62
    Adjusted Seasonal Indexes     107.76    100.61     90.72    100.91    400.00

    Trend line: use Excel's Add Trendline function.

    [Figure: Passenger Car Tyre Sales. Tyre sales by quarter with linear trend line y = 1302x + 58114]

(b) ŷ = 58114 + 1302 x     where x = 1 in Q1 2005; 2 in Q2 2005; 3 in Q3 2005; ...

(c) Seasonally-adjusted trend estimates of passenger car tyre sales (in units) for 2012 / 2013.
    Time       Time   Trend (ŷ)   Seasonal   Seasonally Adjusted
    Period                        Index      Trend Estimate
    2012 Q1     29      95872      107.76        103312
    2012 Q2     30      97174      100.61         97767
    2012 Q3     31      98476       90.72         89337
    2012 Q4     32      99778      100.91        100686
    2013 Q1     33     101080      107.76        108924
    2013 Q2     34     102382      100.61        103007
    2013 Q3     35     103684       90.72         94062
    2013 Q4     36     104986      100.91        105941

    Interpretation
    The pattern of passenger car tyre sales is very stable. The trend is linear and upward, and seasonal variations are consistent over time. Hence Hillstone management could have high confidence in the estimates.

Exercise 15.29
File: X15.29 - outpatient attendances.xlsx
Outpatients Attendances - Butterworth Clinic

(a) Quarterly Seasonal Ratios and Trend Line Equation

    Time       Visits    Uncentred      2x4 period     Centred 4 period   Seasonal
    Period               4 period       Moving Total   Moving Average     ratios
                         Moving Total
    2006 Q1    12767
    2006 Q2    16389       64041
    2006 Q3    19105       65472         129513          16189.1           118.01
    2006 Q4    15780       68951         134423          16802.9            93.91
    2007 Q1    14198       70745         139696          17462.0            81.31
    2007 Q2    19868       73269         144014          18001.8           110.37
    2007 Q3    20899       73712         146981          18372.6           113.75
    2007 Q4    18304       74048         147760          18470.0            99.10
    2008 Q1    14641       74227         148275          18534.4            78.99
    2008 Q2    20204       72000         146227          18278.4           110.53
    2008 Q3    21078       71434         143434          17929.3           117.56
    2008 Q4    16077       72489         143923          17990.4            89.36
    2009 Q1    14075       72378         144867          18108.4            77.73
    2009 Q2    21259       72484         144862          18107.8           117.40
    2009 Q3    20967       70821         143305          17913.1           117.05
    2009 Q4    16183       71386         142207          17775.9            91.04
    2010 Q1    12412       72569         143955          17994.4            68.98
    2010 Q2    21824       74365         146934          18366.8           118.82
    2010 Q3    22150       79370         153735          19216.9           115.26
    2010 Q4    17979       78114         157484          19685.5            91.33
    2011 Q1    17417       80274         158388          19798.5            87.97
    2011 Q2    20568       85413         165687          20710.9            99.31
    2011 Q3    24310       67996         153409          19176.1           126.77
    2011 Q4    23118       47428         115424          14428.0           160.23

    Seasonal Indexes
                                   Q1        Q2        Q3        Q4       Totals
    Unadjusted Seasonal Indexes    78.99    110.53    117.31     92.62    399.46
    Adjusted Seasonal Indexes      79.10    110.69    117.46     92.75    400.00

    Trend line: use Excel's Add Trendline function.

    [Figure: Outpatient Visits. No. of patients by quarter (2006 - 2011), with linear trend line y = 224.66x + 15591]

(b) ŷ = 15591 + 224.66 x     where x = 1 in Q1 2006; 2 in Q2 2006; 3 in Q3 2006; ...

(c) Seasonally-adjusted trend estimates of Outpatient Visits (Butterworth) for 2012 / 2013 (first half).

    Time       Time   Trend (ŷ)   Seasonal   Seasonally Adjusted
    Period                        Index      Trend Estimate
    2012 Q1     25     21207.5      79.10         16775
    2012 Q2     26     21432.2     110.69         23723
    2012 Q3     27     21656.8     117.46         25438
    2012 Q4     28     21881.5      92.75         20295
    2013 Q1     29     22106.1      79.10         17486
    2013 Q2     30     22330.8     110.69         24718

    Interpretation
    The pattern of outpatient attendances at the Butterworth Clinic is very stable. It shows a steady upward trend with highly consistent seasonal variations. Demand is highest in the winter months (Q3) and lowest in the summer months (Q1).

Exercise 15.30
File: X15.30 - construction absenteeism.xlsx
Construction Absenteeism

(a) Quarterly Seasonal Ratios and Trend Line Equation

    Time       Days_lost   Uncentred      2x4 period     Centred 4 period   Seasonal
    Period                 4 period       Moving Total   Moving Average     ratios
                           Moving Total
    2006 Q1      933
    2006 Q2      865         3584
    2006 Q3      922         3618           7202            900.3            102.42
    2006 Q4      864         3689           7307            913.4             94.59
    2007 Q1      967         3698           7387            923.4            104.72
    2007 Q2      936         3736           7434            929.3            100.73
    2007 Q3      931         3661           7397            924.6            100.69
    2007 Q4      902         3570           7231            903.9             99.79
    2008 Q1      892         3546           7116            889.5            100.28
    2008 Q2      845         3445           6991            873.9             96.70
    2008 Q3      907         3368           6813            851.6            106.50
    2008 Q4      801         3238           6606            825.8             97.00
    2009 Q1      815         3110           6348            793.5            102.71
    2009 Q2      715         3020           6130            766.3             93.31
    2009 Q3      779         3027           6047            755.9            103.06
    2009 Q4      711         3168           6195            774.4             91.82
    2010 Q1      822         3151           6319            789.9            104.07
    2010 Q2      856         3162           6313            789.1            108.47
    2010 Q3      762         3125           6287            785.9             96.96
    2010 Q4      722         2984           6109            763.6             94.55
    2011 Q1      785         2962           5946            743.3            105.62
    2011 Q2      715         2944           5906            738.3             96.85
    2011 Q3      740         2159           5103            637.9            116.01
    2011 Q4      704         1444           3603            450.4            156.31

    Seasonal Indexes
                                   Q1        Q2        Q3        Q4       Totals
    Unadjusted Seasonal Indexes   104.07     96.85    102.74     95.80    399.45
    Adjusted Seasonal Indexes     104.21     96.98    102.88     95.93    400.00

    Trend line: use Excel's Add Trendline function.

    [Figure: Construction Industry Days Lost due to Absenteeism. No. of days lost by quarter (2006 - 2011), with linear trend line y = -9.917x + 952.75]

(b) ŷ = 952.75 - 9.917 x     where x = 1 in Q1 2006; 2 in Q2 2006; 3 in Q3 2006; ...

(c) Seasonally-adjusted trend estimates of Days Lost in Construction Industry for 2012.

    Time       Time   Trend (ŷ)   Seasonal   Seasonally Adjusted
    Period                        Index      Trend Estimate
    2012 Q1     25      704.8      104.21          734
    2012 Q2     26      694.9       96.98          674
    2012 Q3     27      685.0      102.88          705
    2012 Q4     28      675.1       95.93          648

    Interpretation
    The pattern of days lost due to absenteeism in the Construction industry shows a distinct downward trend but with inconsistent seasonal variations.

CHAPTER 16 FINANCIAL CALCULATIONS: INTEREST, ANNUITIES and NPV

Exercise 16.1
Simple interest: Interest is computed on the original lump sum for each period.
Compound interest: For each period, interest is computed on the original lump sum plus all accumulated interest of the preceding periods.

Exercise 16.2
No - a compounded amount will earn more interest than a simple interest investment.

Exercise 16.3
Yes - compounding quarterly will result in interest being capitalised sooner, and therefore earning more interest than an annually compounded investment.

Exercise 16.4
Nominal interest rate is the quoted per annum interest rate.
Effective interest rate is the actual interest rate achieved when interest is compounded more than once per year.

Exercise 16.5
Annuity - an annuity is when a constant sum of money is paid (or received) at regular intervals over a period of time.

Exercise 16.6
Ordinary annuity - regular payments begin in the first period of the annuity term.
Deferred annuity - regular payments begin only at some future period into the term of the annuity.
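The definitions in Exercises 16.1 to 16.4 can be illustrated numerically. The following is a minimal Python sketch with illustrative values (R1 000 at 12% p.a. for 3 years) that are not taken from the exercises; the function names are hypothetical.

```python
def simple_fv(pv, rate, years):
    """Future value under simple interest: interest on the original sum only."""
    return pv * (1 + rate * years)


def compound_fv(pv, rate, years, periods_per_year=1):
    """Future value under compound interest with a nominal annual rate."""
    i = rate / periods_per_year          # rate per compounding period
    n = years * periods_per_year         # number of compounding periods
    return pv * (1 + i) ** n


def effective_rate(nominal_rate, periods_per_year):
    """Effective annual rate implied by a nominal rate compounded m times a year."""
    return (1 + nominal_rate / periods_per_year) ** periods_per_year - 1


# Illustrative values only (assumed, not from the text)
pv, rate, years = 1000, 0.12, 3

print(simple_fv(pv, rate, years))                 # simple interest
print(compound_fv(pv, rate, years))               # compounded annually
print(compound_fv(pv, rate, years, 4))            # compounded quarterly
print(effective_rate(rate, 4))                    # effective rate, quarterly compounding
```

For the same nominal rate and term, the compounded amounts exceed the simple interest amount, and quarterly compounding exceeds annual compounding, which is the ordering described in Exercises 16.2 and 16.3.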
Exercise 16.7
Ordinary annuity certain - the series of regular payments takes place at the end of each payment period.
Ordinary annuity due - the series of regular payments takes place at the beginning of each payment period.

Exercise 16.8
NPV (net present value) converts all cash inflows (and outflows) over time into present value terms by discounting each cash flow at the annual rate of interest (the cost of capital). It represents future cash flows in current terms.

Exercise 16.9
(a) Fv = 15000*(1+0.08*5) = R21 000.00
(b) Fv = 15000*(1+0.08)^5 = R22 039.92
(c) Fv = 15000*(1+0.04)^10 = R22 203.66

Exercise 16.10
(a) Pv = 150000/(1+0.12)^2 = R119 579.10
(b) Pv = 150000/(1+0.06)^4 = R118 814
(c) Pv = 150000/(1+0.01)^24 = R118 134.90

Exercise 16.11
(a) n = (3/1-1)/0.16 = 12.5 years
(b) n = log(3/1)/log(1+0.16) = 7.402 years
(c) n = log(3/1)/log(1+0.04) = 28.011 quarters (or 7.003 years)

Exercise 16.12
(a) Pv = 10525/(1+0.14/12*30) = R7 796.30
(b) Pv = 10525/(1+0.14)^2.5 = R7 585.08

Exercise 16.13
(a) Effective rate = (1+0.15/12)^12 - 1 = 16.0755%
(b) =EFFECT(0.15,12) gives 16.0755%

Exercise 16.14
(a) Effective rate = (1+0.09/4)^4 - 1 = 9.3083%
(b) =EFFECT(0.09,4) gives 9.3083%

Exercise 16.15
n = log(58890/25000)/log(1+0.11/2) = 16.00267 half years
n = 8.0013 years

Exercise 16.16
Part 1 - First 3 months
Fv = 2000*(1+0.10/2)^0.5 = R2 049.39
Part 2 - Remaining 21 months
Fv = 2049.39*(1+0.12/12)^21 = R2 525.65
The value of the investment after 2 years is R2 525.65.

Exercise 16.17
Quarterly rate (i) = (10200/7500)^(1/(3*4)) - 1 = 0.02595 (i.e. 2.595% per quarter)
Annual rate = (0.02595*4) = 10.3819% p.a.

Exercise 16.18
Monthly rate (i) = (8000/5000)^(1/(4*12)) - 1 = 0.00984 (i.e. 0.984% per month)
Annual rate = (0.00984*12) = 11.8078% p.a.

Exercise 16.19
Pv = 25000/(1+0.09/4)^11 = R19 572.37
The investor must deposit R19 572.37 today.

Exercise 16.20
n = log(30000/21353.4)/log(1+0.12) = 3 years

Exercise 16.21
Let Pv = R1 and Fv = R2
Quarterly rate (i) = (2/1)^(1/(7*4)) - 1 = 0.025064
Annual rate (%) = 0.025064*4 = 10.0257% p.a.
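The rearrangements of the compound interest formula used in Exercises 16.11 to 16.21 (solving for Pv, n or i) can be sketched in Python as follows. This is an illustrative cross-check, not part of the original solutions; the function names are hypothetical.

```python
from math import log


def present_value(fv, i, n):
    """Pv = Fv / (1 + i)^n, with i the rate per period and n the number of periods."""
    return fv / (1 + i) ** n


def periods_to_reach(pv, fv, i):
    """n = log(Fv / Pv) / log(1 + i): periods needed for Pv to grow to Fv."""
    return log(fv / pv) / log(1 + i)


def rate_per_period(pv, fv, n):
    """i = (Fv / Pv)^(1 / n) - 1: rate per period that grows Pv to Fv in n periods."""
    return (fv / pv) ** (1 / n) - 1


# Exercise 16.11 (b): years for a sum to treble at 16% p.a. compounded annually
print(round(periods_to_reach(1, 3, 0.16), 3))

# Exercise 16.15: half-year periods at 11% p.a. compounded half-yearly
print(round(periods_to_reach(25000, 58890, 0.11 / 2), 4))

# Exercise 16.17: quarterly rate over 3 years, then the nominal annual rate
i_quarter = rate_per_period(7500, 10200, 3 * 4)
print(round(i_quarter, 5), round(4 * i_quarter * 100, 4))

# Exercise 16.19: deposit needed today
print(round(present_value(25000, 0.09 / 4, 11), 2))
```

Note that `rate_per_period` returns a decimal fraction, so the quarterly rate in Exercise 16.17 is roughly 0.026, or about 2.6% per quarter, before it is multiplied by 4 to give the nominal annual rate.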
Exercise 16.22
Ordinary Annuity Certain
(a) Fv = 1600*((1+0.12/12)^15 - 1)/(0.12/12) = R25 755.03
(b) =FV(0.12/12,15,1600,,0) gives R25 755.03
Ordinary Annuity Due
(c) Fv = 1600*((1+0.12/12)^15 - 1)*(1+0.12/12)/(0.12/12) = R26 012.58
(d) =FV(0.12/12,15,1600,,1) gives R26 012.58

Exercise 16.23
Compound Interest: Fv1 = Pv*(1+0.07)^9
Simple Interest: Fv2 = Pv*(1+0.07*9)
Difference: Fv1 - Fv2 = 334.16
Pv*(1+0.07)^9 - Pv*(1+0.07*9) = 334.16
Pv = 334.16/((1+0.07)^9 - (1+0.07*9)) = 1602.999
Capital Sum (Pv) = R1 603.00

Exercise 16.24
(a) Car price in 3 years
    Fv (Compound Interest) = 80000*(1+0.04)^3 = R89 989.12
    Invest at end of month
    R = 89989.12/(((1+0.09/12)^(3*12) - 1)/(0.09/12)) = R2 186.71
(b) Invest at beginning of month
    R = 89989.12/(((1+0.09/12)^(3*12) - 1)*(1+0.09/12)/(0.09/12)) = R2 170.43

Exercise 16.25
(a) R = 8500/((1-(1+0.18/12)^(-3*12))/(0.18/12)) = R307.30
(b) Total paid = 36*307.30 = R11 062.80
    Interest amt = 11062.8 - 8500 = R2 562.80
    Interest as % of debt = 2562.8/8500 = 30.15%

Exercise 16.26
Present value of an Ordinary Annuity Certain.
Pv = 8750*((1-(1+0.1/12)^(-(5*12)))/(0.1/12)) = R411 821.98
The employee will receive a gratuity of R411 821.98.

Exercise 16.27
Ordinary Annuity Due
(a) Fv = 750*((1+0.145/4)^(4*15) - 1)*(1+0.145/4)/(0.145/4) = R160 149.71
(b) =FV(0.145/4,60,750,,1) (using Excel function) gives R160 149.71

Exercise 16.28
Ordinary annuity certain (OAC) for 2 years with R = 540.
Fv1 (OAC) = 540*((1+0.12/12)^(2*12) - 1)/(0.12/12) = R14 565.67
Then compute Fv on the capital sum after 2 years until maturity (for 7 years).
Fv1 (CI) = 14565.67*(1+0.12/12)^(7*12) = R33 598.96
Ordinary annuity certain (OAC) for 7 years with R = 750.
Fv2 (OAC) = 750*((1+0.12/12)^(7*12) - 1)/(0.12/12) = R98 004.21
Total Funds Available = Fv1 (CI) + Fv2 (OAC) = R131 603.17

Exercise 16.29
(a) Ordinary Annuity Certain
    Fv (monthly) = 1000*((1+0.085/12)^(1*12) - 1)/(0.085/12) = R12 478.72
    Fv (quarterly) = 3000*((1+0.10/4)^(1*4) - 1)/(0.10/4) = R12 457.55
    Conclusion: It is better to invest monthly.
(b) Ordinary Annuity Certain - Using Excel's function, FV.
    Fv (monthly): =FV(0.085/12,12,1000,,0) gives R12 478.72
    Fv (quarterly): =FV(0.10/4,4,3000,,0) gives R12 457.55
(c) Ordinary Annuity Due - Using Excel's function, FV.
    Fv (monthly): =FV(0.085/12,12,1000,,1) gives R12 567.11
    Fv (quarterly): =FV(0.10/4,4,3000,,1) gives R12 768.99
    Conclusion: It is now better to invest quarterly.

Exercise 16.30
(a) PV of an Ordinary Annuity Certain
    Pv = 2200*(1-(1+0.09/12)^(-(4*12)))/(0.09/12) = R88 406.52
    Deposit = R20 000.00
    Total Purchase Price of Motor Vehicle = R108 406.52
(b) Pv of an Ordinary Annuity Certain - Using Excel's function, PV
    Purchase Price = PV(0.09/12,48,2200,,0) + deposit = R88 406.52 + R20 000 = R108 406.52

Exercise 16.31
(a) Using Ordinary Annuity Certain for 2 years
    Fv1 (2 years) = 1000*((1+0.08/12)^(2*12) - 1)/(0.08/12) = R25 933.19
    Using Compound Interest on Capital Sum for 1 year
    Fv1 (1 year) = 25933.19*(1+0.1/12)^(1*12) = R28 648.73
    Using Ordinary Annuity Certain for 1 year
    Fv2 (1 year) = 1000*((1+0.10/12)^(1*12) - 1)/(0.10/12) = R12 565.57
    Maturity value after 3 years = R28 648.73 + R12 565.57 = R41 214.30
(b) Using Excel's function FV with a compound interest calculation
    =FV(0.08/12,24,1000,,0)*(1+0.1/12)^12 + FV(0.1/12,12,1000,,0) gives R41 214.30

Exercise 16.32
(a) Months 1 - 5
    FV1 = 200*((1+0.12/12)^5 - 1)/(0.12/12) = R1 020.20
    Withdrawal = R300.00
    Balance = R1 020.20 - R300.00 = R720.20
    Compound Interest (7 months) = 720.2*(1+0.12/12)^7 = R772.15
    Months 6 - 10
    FV2 = 200*((1+0.12/12)^5 - 1)/(0.12/12) = R1 020.20
    Withdrawal = R300.00
    Balance = R1 020.20 - R300.00 = R720.20
    Compound Interest (2 months) = 720.2*(1+0.12/12)^2 = R734.68
    Months 11, 12
    FV3 = 200*((1+0.12/12)^2 - 1)/(0.12/12) = R402.00
    Total Amount Available after 12 months = R772.15 + R734.68 + R402.00 = R1 908.83
(b) Using Excel's Function, FV
    Months 1 - 5 (Ordinary Annuity Certain - R300 + Compound Interest for 7 months)
    =(-(FV(0.12/12,5,200,,0))-300)*(1+0.12/12)^7 gives R772.15
    Months 6 - 10 (Ordinary Annuity Certain - R300 + Compound Interest for 2 months)
    =(-(FV(0.12/12,5,200,,0))-300)*(1+0.12/12)^2 gives R734.68
    Months 11, 12 (Ordinary Annuity Certain)
    =(-FV(0.01,2,200,,0)) gives R402.00
    Total Amount Available after 12 months = R772.15 + R734.68 + R402.00 = R1 908.83

Exercise 16.33
(a) Option 1: Repayment at the end of every quarter
    R = 26000/((1-(1+0.14/4)^(-3*4))/(0.14/4)) = R2 690.58
(b) Option 2: Repayment at the beginning of every quarter
    R = 26000/(((1-(1+0.14/4)^(-3*4))*(1+0.14/4))/(0.14/4)) = R2 599.60
(c) The student should select to repay at the beginning of every quarter. Total repayment will be less than when repaying at the end of every quarter.

Exercise 16.34
Present Value (Pv) of a Deferred Annuity
Factor 1 = (1-(1+0.16/12)^(-(24+36)))/(0.16/12) = 41.12171
Factor 2 = (1-(1+0.16/12)^(-(36)))/(0.16/12) = 28.44381
Difference = 41.12171 - 28.44381 = 12.6779
R = 30000/12.6779 = R2 366.32
The repayment per month is R2 366.32.

Exercise 16.35
Rate (i) (half yearly) = (40697.7/18000)^(1/10) - 1 = 0.085
Rate (i) (nominal % p.a.) = 0.085*2 = 17% p.a.

Exercise 16.36
Re-arrange the Fv formula for an Ordinary Annuity Certain
Factor 1 = (103757.98/2000)*(0.12/12) + 1 = 1.51879
n = LOG(1.51879)/LOG(1+0.12/12) = 42 months
The house owner will take 3 years and 6 months to save R103 757.98.

Exercise 16.37
                                     INVESTMENT OPTIONS
                                     Trucking     Laundry
    Initial investment (R)             60000        60000
    Annual Cash Flow (R)
      year 1                           32000            0
      year 2                           38500         7500
      year 3                           26000        45000
      year 4                           13000        37500
      year 5                            9500        55500
    Present values at 12% p.a. cost of capital
      year 1                           28571            0
      year 2                           30692         5979
      year 3                           18506        32030
      year 4                            8262        23832
      year 5                            5391        31492
      Total                            91422        93333
    NPV                              R31 422      R33 333

Recommend the purchase of the Laundry business.
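The annuity and NPV formulas used in Exercises 16.22 to 16.37 can be cross-checked with a short Python sketch. It is illustrative only and not part of the original solutions. The function names are hypothetical, and the `due` flag plays the role of the last argument (0 or 1) of Excel's FV and PV functions. Unlike Excel, the sketch returns positive amounts for positive payments.

```python
def annuity_fv(payment, i, n, due=False):
    """Future value of n equal payments at rate i per period.

    due=False: payments at the end of each period (ordinary annuity certain).
    due=True:  payments at the beginning of each period (ordinary annuity due).
    """
    fv = payment * ((1 + i) ** n - 1) / i
    return fv * (1 + i) if due else fv


def annuity_pv(payment, i, n, due=False):
    """Present value of n equal payments at rate i per period."""
    pv = payment * (1 - (1 + i) ** -n) / i
    return pv * (1 + i) if due else pv


def npv(rate, initial_outlay, cash_flows):
    """Net present value: year-end cash flows discounted at `rate`, less the outlay."""
    discounted = sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))
    return discounted - initial_outlay


# Exercise 16.22: R1 600 per month for 15 months at 12% p.a.
print(round(annuity_fv(1600, 0.12 / 12, 15), 2))
print(round(annuity_fv(1600, 0.12 / 12, 15, due=True), 2))

# Exercise 16.30: present value of R2 200 per month for 4 years at 9% p.a.
print(round(annuity_pv(2200, 0.09 / 12, 4 * 12), 2))

# Exercise 16.37: NPV of each option at a 12% p.a. cost of capital
trucking = npv(0.12, 60000, [32000, 38500, 26000, 13000, 9500])
laundry = npv(0.12, 60000, [0, 7500, 45000, 37500, 55500])
print(round(trucking), round(laundry))
```

Because the payment and discounting conventions are passed in explicitly, the same three functions cover the end-of-period and beginning-of-period cases without separate formulas.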