MSc 9002 – Business Statistics Fall 2022

Multiple-choice Questions (6 points). Choose one correct answer.

1) The mode of a standard normal distribution is
A. 2
B. 1
C. 0
D. Not fixed

2) The standard deviation of SAT scores is 100 points. A researcher decides to take a sample of 500 students' scores to estimate the mean score of students in your state. What is the standard error of the sample mean?
A. 0.2
B. 5
C. 4.47
D. 100

3) A single number determined from a sample that is used to estimate the corresponding population parameter is called a
A. Point estimate
B. Sample point
C. Sample space
D. Statistic

4) What z-value is used to construct a two-sided 95% confidence interval?
A. 1.28
B. 1.65
C. 1.96
D. 2.58

5) Which of the following is not a way to reduce the margin of error?
A. Decrease the sample size
B. Increase the sample size
C. Reduce the confidence level
D. Reduce the standard deviation

6) The general format to construct any confidence interval is
A. Parameter ± (Critical Value)(Standard Deviation)
B. Parameter ± (Confidence Level)(Standard Error)
C. Point Estimate ± (Confidence Level)(Standard Error)
D. Point Estimate ± (Critical Value)(Standard Error)

True or False (7 points). State True or False.

1) The confidence interval is centered around the point estimate of the sample and indicates the probability that the sample statistic is located within the interval.
2) As the alpha value increases, the confidence interval's width decreases.
3) The confidence interval of a sample statistic always contains the true population parameter.
4) The mean, the median, and the mode of a normal distribution are always equal.
5) The formula for determining the confidence interval estimation for the population mean where the population standard deviation is known is x̄ + se, where se is the sampling error.
6) Hypothesis testing starts with a researcher's assumption about a sample statistic. This assumption is called a hypothesis.
7) As the sample size increases, the width of the confidence interval increases.

Name: November 23, 2022

Data Questions (52 points). Use data sheets to find answers. Summarize your findings here.

1) (4 points) We have a dataset of daily working hours (WorkHours) for 300 employees. We want to derive the percentage of employees in Istanbul who work more than 6 hours a day. Suppose the confidence level is 95%. Find the lower and upper limits of the confidence interval.

2) (12 points) What percentage of the population lives in their state of birth? According to the U.S. Census Bureau's American Community Survey, the figure ranges from 25% in Nevada to 78.7% in Louisiana. The average percentage across all states and the District of Columbia is 57.7%. The data (Residents) are consistent with the findings in the American Community Survey. The data are for a random sample of 120 Arkansas residents and 180 Virginia residents.
a. Formulate hypotheses that can be used to determine whether the percentage of stay-at-home residents in the two states differs from the overall average of 57.7%.
b. Estimate the proportion of stay-at-home residents in Arkansas. Does this proportion differ significantly from the mean proportion for all states? Use α = 0.05.
c. Estimate the proportion of stay-at-home residents in Virginia. Does this proportion differ significantly from the mean proportion for all states? Use α = 0.05.

3) (12 points) A company provides a random sample of 50 full-time workers. The data (Earnings) includes Usual Hours Worked, Education (years), Yearly Earnings, and Gender of the workers. It is estimated that 25% of the workers in this company are female.
a. The company is trying to make it more of an equal-opportunity workplace and is now sensitive about its low number of female workers. Let's say X represents the random variable equal to the number of female workers in a random sample of workers.
Count the female workers in the company's sample (x) and calculate their percentage in the sample. You should calculate the probability of X being less than or equal to the number you have observed in the given dataset, P(X ≤ x). Hint: Use the estimate for the female percentage to calculate the probability.
b. Calculate the expected value and the standard deviation of X for this sample. Calculate how many standard deviations lie between the observed value, x (from part a), and the expected value of X.
c. Tabulate the sample by 'Gender' and 'yearly earnings' to find the number of workers in each category. Group the yearly earnings in intervals of 10,000.

4) (16 points) A company that produces cell phones claims its standard phone battery lasts longer on average than other batteries in the market. To support this claim, the company took a sample of cell phones and recorded the hours their batteries lasted. In the data file (CellPhone), you can see a sample of cell phones with the number of hours their batteries lasted.
a. Calculate the sample size (using an appropriate formula), the sample average of the hours batteries last, and the alpha value given that the required confidence level is 95% (all the remaining parts will use this same confidence level).
b. Set up a 95% confidence interval assuming the population standard deviation is 3 hours.
c. Perform a two-sided hypothesis test for the population mean at a significance level of 5%, in which the null hypothesis is µ = 31.
d. Comment on what these confidence intervals indicate in a few sentences.

5) (8 points) A university provides a shuttle service between its campus and downtown locations. A survey asked 276 students whether they used the service the previous month and their overall satisfaction with it. Each respondent's gender was also recorded.
a. Determine the proportion of students who used the shuttle in the previous month, broken down by gender.
b. Determine a 95% confidence interval for the proportion of students who used the shuttle service in the past month (for each gender).

Long-Answer Questions (35 points). Provide an answer. In addition, show your work.

1) (5 points) Installation of certain hardware takes a random amount of time with a standard deviation of 5 minutes. A computer technician installs this hardware on 64 different computers, with an average installation time of 42 minutes. Compute a 95% confidence interval for the mean installation time.

2) (9 points) A car leasing company determined that the distance traveled per car per year is normally distributed with a mean of 30 thousand miles and a standard deviation of 10 thousand miles.
a. What proportion of cars is expected to travel less than 25 thousand miles?
b. What proportion of cars is expected to travel more than 50 thousand miles?
c. What is the distance that 90% of the cars do not exceed?

3) (12 points) The H.R. director for a large company is interested in using regression analysis to help make recruitment decisions for sales managers. Due to the highly technical nature of their business, several company officials believe that only electrical engineers should be considered for the job. The H.R. manager considers the following variables to be important indicators of sales performance:
Experience – Number of years of selling experience.
Wonder – Score from Wonderlic Personnel Test at the time of employment (a popular group intelligence test to measure learning and problem-solving abilities).
S.C. – Score on the Strong-Campbell Interest Inventory Test at the time of employment (a test to measure the applicant's perceived interest in sales).
Engineer – A binary variable that takes the value of 1 if the sales manager has a degree in electrical engineering and 0 otherwise.
The sales performance (Sales) of the managers is measured by the ratio of yearly sales divided by the target sales value in that region, which was mutually agreed upon as "realistic expectations." The following is a portion of the data on 45 existing sales managers:

A regression including all variables is run, and the following output is obtained:

a. Write the regression equation.
b. Estimate the sales performance of a typical manager with five years of experience, a degree in history, and scores of 25 and 70 from Wonder and S.C., respectively.
c. What can you say about the significance of the relationship between sales performance and an electrical engineering degree?
d. The correlations between all variables are given below. Based on these results, is there a risk of multicollinearity? Based on the overall results, do you see any problems with the regression result? Would you change the regression model? How?

4) (9 points) The time needed for college students to complete a certain maze follows a normal distribution with a mean of 45 seconds. To see if the mean time µ (in seconds) is changed by vigorous exercise, we have a group of nine college students exercise vigorously for 30 minutes and then complete the maze. The sample mean and standard deviation of the collected data are 49.2 seconds and 3.5 seconds, respectively. Use these data to perform an appropriate hypothesis test at a 5% level of significance.
a. Write the appropriate null and alternative hypotheses.
b. Use the critical value method to perform the test for the sample (α = 0.05).
c. Use the p-value method to perform the test for the sample (α = 0.05).
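
For Long-Answer question 4, the two test procedures could be sketched as follows. This is a minimal illustration, assuming a two-sided one-sample t test with n − 1 = 8 degrees of freedom. The function names (`t_pdf`, `t_two_sided_p`, `one_sample_t`) are hypothetical helpers, the critical value 2.306 is taken from a standard t table, and the p-value is approximated by numerical integration rather than computed with a statistics library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_sided_p(t_stat, df, steps=20000):
    """Approximate two-sided p-value using Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # area under the density between 0 and |t|
    return 2 * (0.5 - central_area)


def one_sample_t(sample_mean, mu0, sample_sd, n):
    """Return the t statistic and degrees of freedom for H0: mu = mu0."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error, n - 1


# Values stated in the question: mean 49.2 s, sd 3.5 s, n = 9, H0: mu = 45.
t_stat, df = one_sample_t(49.2, 45.0, 3.5, 9)

# Assumption: tabulated critical value t_{0.025, 8} for a two-sided 5% test.
T_CRIT = 2.306

p_value = t_two_sided_p(t_stat, df)
reject_by_critical_value = abs(t_stat) > T_CRIT
reject_by_p_value = p_value < 0.05

print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}")
print("Reject H0 (critical value method):", reject_by_critical_value)
print("Reject H0 (p-value method):", reject_by_p_value)
```

Both methods apply the same decision rule from different directions, so they should agree: the critical value method compares |t| with the tabulated cutoff, while the p-value method compares the tail probability with α.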