A Managerial Report on Par Inc., Using Hypothesis Testing
By J.CH.T.NAGA SAI PRADEEP

Table of Contents
1. Project Objective
2. Assumptions
3. Procedure
3.1. Environment set up and Data import
3.2. Comparison of the driving distances of the golf balls
3.3. Descriptive Statistical Summary
3.4. Calculation of p value
3.5. Confidence Interval
4. Conclusion
5. Source Code

1 Project Objective:
The objective of this report is to formulate a solution for the Par Inc. data set (“Golf.xls”) in R Studio and generate insights about the data set. This report consists of the following:
1. Importing the data set in R
2. Comparison of driving distances of the golf balls
3. Descriptive statistical summary
4. Calculation of p-value
5. Confidence interval
6. Conclusion

2 Assumptions:
For the two-sample t-test, we assume that the variances are equal. For the Welch two-sample t-test, we assume that the variances are not equal. Either method can be used depending on the requirement; this report contains both.

3 Procedure:
A typical hypothesis testing activity consists of the following steps:
1. Environment Set up and Data Import
2. Comparison of the driving distances of the golf balls
3. Descriptive Statistical Summary
4. Calculation of p-value
5. Confidence Interval

3.1 Environment Set up and Data Import
Our first step is to check the format of the data set we want to read. It can be in Excel, .csv or any other format. Here the given file is in the .xls format. We converted this file into a .csv file using Excel's Save As option. This .csv file contains the driving distances of the new and current golf balls. It has 40 records. We begin our analysis by setting up the working directory, which is shown in the source code section. Our next step is to read the file using the ‘read.csv’ command, which is also shown in the source code section. With that, the data set has been imported into R.
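The import step can be sketched with a few sanity checks added. This is a minimal illustration, not the report's own code: the rows written to the temporary file are made up so that the example runs without Golf.csv, and the names `tmp` and `golf` are hypothetical. Only the column names Current and New are taken from the report.

```r
# Made-up rows standing in for Golf.csv; these are NOT the report's data
tmp <- tempfile(fileext = ".csv")
writeLines(c("Current,New", "260,270", "265,262", "271,268"), tmp)

# header = TRUE takes the column names from the first line of the file
golf <- read.csv(tmp, header = TRUE)

# Basic checks before any analysis
str(golf)                 # column names and types
nrow(golf)                # number of records read
colSums(is.na(golf))      # missing values per column

# Stop early if the expected columns are not present
stopifnot(all(c("Current", "New") %in% names(golf)))
```

With the real file, the same checks would presumably be run on the result of read.csv("Golf.csv", header = TRUE), where nrow() should report the 40 records mentioned above.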
We are now ready to perform some calculations and try to understand the behavior of the data.

3.2 Comparison of the driving distances of the golf balls:
Here we use a two-tailed test on the difference between the mean driving distances. ‘X’ refers to the mean driving distance of the new golf ball, whilst ‘Y’ refers to the mean driving distance of the current golf ball. The null hypothesis is that the difference between the mean distances is equal to zero; the alternative hypothesis is that the difference is not equal to zero. The null hypothesis is rejected only if the data show a significant difference. We calculated the mean, standard deviation and variance of the distances for the new and current golf balls, using the commands mean(), sd() and var() respectively. The results are shown in the source code section.

3.3 Descriptive Statistical summary:
This short summary gives an overview of the entire data set. It is obtained by using the summary() command. From the boxplots shown in the source code section, we observe that the Current distances look roughly symmetric, consistent with a normal distribution, whereas the New distances look right-skewed. The boxplots are obtained using the boxplot() command.

3.4 Calculation of p-value:
Using the t.test() command, we calculated the p-value and the confidence interval. The results are shown in the source code section. If the p-value is less than alpha (0.05 here), the null hypothesis is rejected.

3.5 Confidence Interval:
The confidence interval is also obtained from the t.test() command. The results are shown in the source code section.

4. Conclusion:
From the above calculations, the Welch test has approximately 76.85 degrees of freedom and a p-value of 0.188. The p-value is greater than the chosen level of significance of 0.05, so the null hypothesis is not rejected. The interval estimate was calculated using the t.test() command: with 95% confidence, the difference between the mean distances lies between -1.384937 and 6.934937 yards.
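As a cross-check on the figures quoted above, the Welch statistic, degrees of freedom, p-value and interval can be recomputed by hand from the summary statistics printed in the source code section. The sketch below assumes 40 balls of each type (the N = 40 noted in the code) and uses the rounded means and standard deviations as printed, so small rounding differences are possible. The variable names are illustrative only.

```r
# Summary statistics copied from the report's source code section
n_current <- 40
n_new     <- 40
mean_current <- 270.275
mean_new     <- 267.500
sd_current   <- 8.752985
sd_new       <- 9.896904

# Standard error of the difference in means (variances not pooled)
v_current <- sd_current^2 / n_current
v_new     <- sd_new^2 / n_new
se_diff   <- sqrt(v_current + v_new)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat   <- (mean_current - mean_new) / se_diff
df_welch <- (v_current + v_new)^2 /
  (v_current^2 / (n_current - 1) + v_new^2 / (n_new - 1))

# Two-tailed p-value and 95% confidence interval for the difference
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci <- (mean_current - mean_new) + c(-1, 1) * qt(0.975, df_welch) * se_diff

print(list(t = t_stat, df = df_welch, p = p_value, ci = ci))
```

The printed values should agree, up to rounding, with the Welch output in the source code section (t = 1.3284, df = 76.852, p-value = 0.188). Because the interval contains zero, it leads to the same decision as the p-value: the null hypothesis is not rejected at the 0.05 level.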
The standard error has an inverse relationship with the sample size: it shrinks in proportion to the square root of the sample size. Increasing the sample size therefore gives diminishing returns, since each additional observation improves accuracy by less. Even so, a larger sample would narrow the confidence interval, so it is recommended that Par Inc. use a larger sample size to obtain a more accurate estimate.

5. Source Code:

############################################################################
#
# MANAGERIAL REPORT ON PAR INC., USING HYPOTHESIS TESTING
#
############################################################################

#Setting up Working Directory
setwd("C:/Users/lalitha/Desktop/R programming")
getwd()
## [1] "C:/Users/lalitha/Desktop/R programming"

#Importing the data set
mydata = read.csv("Golf.csv", header = TRUE)
attach(mydata)

#Descriptive Statistical Summary
summary(mydata)
##     Current           New
##  Min.   :255.0   Min.   :250.0
##  1st Qu.:263.0   1st Qu.:262.0
##  Median :270.0   Median :265.0
##  Mean   :270.3   Mean   :267.5
##  3rd Qu.:275.2   3rd Qu.:274.5
##  Max.   :289.0   Max.   :289.0

sd(Current)
## [1] 8.752985
sd(New)
## [1] 9.896904
var(Current)
## [1] 76.61474
var(New)
## [1] 97.94872
sd(New - Current)
## [1] 13.74397

#Graphical Exploration
boxplot(Current, horizontal = TRUE, col = "Blue", main = "Boxplot of Current")
boxplot(New, horizontal = TRUE, col = "Red", main = "Boxplot of New")

#alpha = 0.05
#N = 40

#Formulating hypothesis using the two tailed t-test
#Assuming the variances are not equal
t.test(Current, New)
##
##  Welch Two Sample t-test
##
## data:  Current and New
## t = 1.3284, df = 76.852, p-value = 0.188
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.384937  6.934937
## sample estimates:
## mean of x mean of y
##   270.275   267.500

#Assuming the variances are equal
t.test(Current, New, var.equal = TRUE)
##
##  Two Sample t-test
##
## data:  Current and New
## t = 1.3284, df = 78, p-value = 0.1879
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.383958  6.933958
## sample estimates:
## mean of x mean of y
##   270.275   267.500

#############################################################################
#
# The End
#
#############################################################################