Uploaded by Jayanth Krishna

Golf balls PAR inc

A Managerial Report
ON
Par Inc., using hypothesis testing
By
J.CH.T.NAGA SAI PRADEEP
Table of Contents
1. Project Objective
2. Assumptions
3. Procedure
3.1. Environment set up and Data import
3.2. Comparison of the driving distances of the golf balls
3.3. Descriptive Statistical Summary
3.4. Calculation of p value
3.5. Confidence Interval
4. Conclusion
5. Source Code
1 Project Objective:
The objective of this report is to formulate a solution for the Par Inc. data set (“Golf.xls”) in
RStudio and to generate insights about the data set. This report consists of the following:
Importing the dataset in R
Comparison of driving distances of the golf balls
Descriptive statistical Summary
Calculation of p-value
Confidence interval
Conclusion
2 Assumptions:
For the two-sample t-test, we assume that the two variances are equal. For the Welch two-sample
t-test, we assume that the variances are not equal. Either method can be used, depending on which
assumption is appropriate. This report contains both of them.
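As a rough check on which assumption is more plausible, the ratio of the two sample variances can be compared with an F distribution. The sketch below uses the variances and sample size reported in the source code section. The F test itself is an addition for illustration, not part of the original analysis, and it assumes that both samples are approximately normal.

```r
# Sample variances and sizes as reported in the source code section
var_current <- 76.61474
var_new     <- 97.94872
n_current   <- 40
n_new       <- 40

# F statistic: the larger sample variance over the smaller one
f_stat <- var_new / var_current

# Two-sided p-value from the F distribution with (n_new - 1, n_current - 1) df
p_upper   <- pf(f_stat, df1 = n_new - 1, df2 = n_current - 1, lower.tail = FALSE)
f_p_value <- min(1, 2 * p_upper)

cat("F =", round(f_stat, 4), " p-value =", round(f_p_value, 4), "\n")
```

With the raw data loaded, var.test(New, Current) performs the same comparison directly.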
3 Procedure:
A typical hypothesis testing activity consists of the following steps:
1. Environment Set up and Data Import
2. Comparison of the driving distances of the golf balls
3. Descriptive Statistical Summary
4. Calculation of p-value
5. Confidence Interval
3.1 Environment Set up and Data Import
Our first step is to check the format of the data set we want to read. It can be an Excel file, a .csv
file or any other format. Here the given file is in the .xls format. We have converted this file into a
.csv file using Excel's Save As option. This .csv file contains the driving distances of the new and
current golf balls. It has 40 records.
Now, we begin our analysis by setting up the working directory, as shown in the source code
section.
Our next step is to read the file using the ‘read.csv’ command, as shown in the source code
section. We have successfully imported the data set into R.
We are now ready to perform some calculations and try to understand the behavior of the data.
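The import step can be sketched as follows. To keep the example self-contained, it first writes a small made-up CSV to a temporary file. The distances in it are hypothetical placeholders, not the Par Inc. data. Only the column names Current and New are taken from the report, and the object name golf is chosen for illustration.

```r
# Hypothetical stand-in for Golf.csv: three made-up rows, same column names
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Current,New",
             "264,277",
             "261,269",
             "267,263"), csv_path)

# Read the file; header = TRUE takes the column names from the first line
golf <- read.csv(csv_path, header = TRUE)

# Basic checks before any analysis: structure, size and missing values
str(golf)
n_rows    <- nrow(golf)
n_missing <- sum(is.na(golf))
cat("rows:", n_rows, " missing values:", n_missing, "\n")

# Columns can be referenced as golf$Current and golf$New without attach()
mean_current <- mean(golf$Current)
```

For the real file, csv_path would be replaced by "Golf.csv", and nrow() should then report 40.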
3.2 Comparison of the driving distances of the golf balls:
Here we are using a two-tailed test on the difference between the mean distances. ‘X’ refers to
the mean driving distance of the new golf ball, while ‘Y’ refers to the mean driving distance of the
current golf ball. The null hypothesis is that the difference between the mean distances is equal to
zero. The alternative hypothesis is that the difference is not equal to zero.
We have calculated the mean, standard deviation and variance for the distances of the new and
current golf balls. The commands used are mean() for the mean, sd() for the standard deviation and
var() for the variance. The results are shown in the source code section.
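The three statistics are related: the variance is the square of the standard deviation. A quick consistency check on the values reported in the source code section is sketched below. The variable names are chosen for illustration.

```r
# Standard deviations and means as reported in the source code section
sd_current   <- 8.752985
sd_new       <- 9.896904
mean_current <- 270.275
mean_new     <- 267.500

# Squaring each standard deviation should reproduce the var() output
var_current_from_sd <- sd_current^2
var_new_from_sd     <- sd_new^2

# Observed difference between the sample means (Current minus New), in yards
mean_diff <- mean_current - mean_new

cat("variance (Current):", round(var_current_from_sd, 4), "\n")
cat("variance (New):    ", round(var_new_from_sd, 4), "\n")
cat("mean difference:   ", mean_diff, "\n")
```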
3.3 Descriptive Statistical summary:
This short summary gives us an overview of the entire data set. It is obtained by using the
summary() command. From the boxplots produced in the source code section we observe that the
Current distances appear roughly symmetric, which is consistent with a normal distribution, whereas
the New distances appear right-skewed. The boxplots are obtained using the boxplot() command.
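The same impression can be checked informally from the summary() output by comparing each mean with its median and the spacing of the quartiles around the median. A mean well above the median, or an upper quartile much further from the median than the lower one, is a common informal sign of right skew. The figures below are copied from the summary in the source code section.

```r
# Values copied from the summary() output
current <- c(q1 = 263.0, median = 270.0, mean = 270.3, q3 = 275.2)
new     <- c(q1 = 262.0, median = 265.0, mean = 267.5, q3 = 274.5)

# Gap between mean and median (positive suggests right skew)
gap_current <- unname(current["mean"] - current["median"])
gap_new     <- unname(new["mean"] - new["median"])

# Distance from the median to each quartile
lower_new <- unname(new["median"] - new["q1"])
upper_new <- unname(new["q3"] - new["median"])

cat("mean - median, Current:", gap_current, "\n")
cat("mean - median, New:    ", gap_new, "\n")
cat("New: lower spread =", lower_new, " upper spread =", upper_new, "\n")
```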
3.4 Calculation of p-value:
By using the t.test() command we have calculated the p-value and the confidence interval. The
results are shown in the source code section. If the p-value is less than alpha (the level of
significance), the null hypothesis is rejected. Otherwise it is not rejected.
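To show what t.test() does in the equal-variance case, the calculation can be reproduced by hand from the summary statistics reported in the source code section. This is a sketch for illustration and the variable names are assumptions. It should agree with the t.test() output up to rounding.

```r
# Summary statistics as reported in the source code section
n            <- 40          # observations per ball
mean_current <- 270.275
mean_new     <- 267.500
var_current  <- 76.61474
var_new      <- 97.94872
alpha        <- 0.05

# Pooled variance and standard error of the difference in means
pooled_var <- ((n - 1) * var_current + (n - 1) * var_new) / (2 * n - 2)
se_pooled  <- sqrt(pooled_var * (1 / n + 1 / n))

# t statistic, degrees of freedom and two-tailed p-value
t_stat    <- (mean_current - mean_new) / se_pooled
df_pooled <- 2 * n - 2
p_value   <- 2 * pt(abs(t_stat), df = df_pooled, lower.tail = FALSE)

# Decision rule: reject the null hypothesis only if p_value < alpha
reject_null <- p_value < alpha
cat("t =", round(t_stat, 4), " df =", df_pooled,
    " p-value =", round(p_value, 4), " reject H0:", reject_null, "\n")
```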
3.5 Confidence Interval:
The confidence interval is also obtained from the t.test() command. The results are shown in the
source code section.
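The interval reported by the Welch test can likewise be reconstructed from the summary statistics: the difference in means plus or minus a t critical value times the standard error, with the degrees of freedom given by the Welch-Satterthwaite approximation. The sketch below uses the reported statistics and illustrative variable names.

```r
# Summary statistics as reported in the source code section
n_current    <- 40
n_new        <- 40
mean_current <- 270.275
mean_new     <- 267.500
var_current  <- 76.61474
var_new      <- 97.94872

# Standard error of the difference without pooling the variances
se_welch <- sqrt(var_current / n_current + var_new / n_new)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- se_welch^4 /
  ((var_current / n_current)^2 / (n_current - 1) +
   (var_new / n_new)^2 / (n_new - 1))

# 95 percent confidence interval for the difference (Current minus New)
t_crit <- qt(0.975, df = df_welch)
ci     <- (mean_current - mean_new) + c(-1, 1) * t_crit * se_welch

cat("df =", round(df_welch, 3), " 95% CI:", round(ci, 4), "\n")
```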
4 Conclusion:
Finally, from the above calculations we took the degrees of freedom to be 76 (t.test() reports
76.852 for the Welch test and 78 for the equal-variance test). The p-value of about 0.188 is greater
than the level of significance, which was chosen to be 0.05, so the null hypothesis is not rejected.
Using the t.test() command, the interval estimate was calculated.
With 95% confidence, the difference between the mean distances (Current minus New) lies between
-1.385740214 and 6.935740214 yards (t.test() reports the closely matching limits -1.384937 and
6.934937 for the Welch test). Because this interval contains zero, it agrees with the test result.
The standard error has an inverse relationship with the sample size: it falls in proportion to the
square root of the sample size, so each further increase in sample size gives a smaller gain in
accuracy. Even so, it is recommended that Par Inc. use a larger sample size in order to obtain a
more accurate estimate.
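The relationship between sample size and standard error can be made concrete. The sketch below assumes that the two sample variances stay at their reported values, which is an assumption for illustration only, since a new sample would give different variances. The sample sizes beyond 40 are hypothetical.

```r
# Sample variances as reported in the source code section
var_current <- 76.61474
var_new     <- 97.94872

# Reported size (40) and hypothetical larger sizes, per ball
sample_sizes <- c(40, 80, 160, 320)

# Standard error of the difference in means for equal group sizes
se_diff <- sqrt((var_current + var_new) / sample_sizes)

# Half-width of the 95 percent interval, using 2n - 2 degrees of freedom
half_width <- qt(0.975, df = 2 * sample_sizes - 2) * se_diff

result <- data.frame(n          = sample_sizes,
                     se         = round(se_diff, 3),
                     half_width = round(half_width, 3))
print(result)
```

Under this assumption, quadrupling the sample size halves the standard error, which is why the gain from each additional observation becomes smaller.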
5 Source Code:
############################################################################
#
#        MANAGERIAL REPORT ON PAR INC., USING HYPOTHESIS TESTING
#
############################################################################
#Setting up Working Directory
setwd("C:/Users/lalitha/Desktop/R programming")
getwd()
## [1] "C:/Users/lalitha/Desktop/R programming"
#Importing the data set
mydata = read.csv("Golf.csv", header = TRUE)
attach(mydata)
#Descriptive Statistical Summary
summary(mydata)
##     Current           New
##  Min.   :255.0   Min.   :250.0
##  1st Qu.:263.0   1st Qu.:262.0
##  Median :270.0   Median :265.0
##  Mean   :270.3   Mean   :267.5
##  3rd Qu.:275.2   3rd Qu.:274.5
##  Max.   :289.0   Max.   :289.0
sd(Current)
## [1] 8.752985
sd(New)
## [1] 9.896904
var(Current)
## [1] 76.61474
var(New)
## [1] 97.94872
sd(New - Current)
## [1] 13.74397
#Graphical Exploration
boxplot(Current, horizontal = TRUE,col = "Blue", main = "Boxplot of Current")
boxplot(New, horizontal = TRUE, col = "Red", main = "Boxplot of New")
#alpha =0.05
#N= 40
#Formulating hypothesis using the two tailed t-test
#Assuming the variances are not equal
t.test(Current,New)
##
##  Welch Two Sample t-test
##
## data:  Current and New
## t = 1.3284, df = 76.852, p-value = 0.188
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.384937  6.934937
## sample estimates:
## mean of x mean of y
##   270.275   267.500
#Assuming the variances are equal
t.test(Current,New, var.equal = TRUE)
##
##  Two Sample t-test
##
## data:  Current and New
## t = 1.3284, df = 78, p-value = 0.1879
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.383958  6.933958
## sample estimates:
## mean of x mean of y
##   270.275   267.500
#############################################################################
#
#                                  The End
#
#############################################################################