# EE318 ```1
Final Exam
EE318 Engineering Data Analysis
(December 2020)
Christopher R. Brumley
Department of Electrical Engineering
University of North Dakota

Abstract—The goal of this paper is to explore the
statistical nature of empirical models, specifically the
straight-line regression model. The regression model is a
useful tool for engineers analyzing statistical data in
many areas of engineering. Given a sample data set, we
can determine whether the sample follows a regression
model. If so, we can make estimates and
predictions for future data samples.
I. INTRODUCTION
This final exam project is designed to demonstrate
the statistical analysis of a regression model. The
experimental data come from a rocket motor
manufacturer interested in the correlation between the
shear strength and the age of the propellant used. A
sample of 20 observations was gathered for this experiment. The
data will be used to determine whether they
follow a simple straight-line regression model, so that we can
make estimates and predictions for the experiment.
II. SCATTER DIAGRAM
A scatter diagram is an informative way to judge
whether two variables are related based on
observed data. It is especially useful when there is
no obvious physical mechanism that relates the variables
of interest. We will use a scatter diagram of the
collected sample to examine the relationship between
the age of the propellant used and the shear strength of
the bond.
A. Observation
The following data set was gathered and analyzed.
Strength, Y (psi)    Age, X (weeks)
2158.70              15.50
1678.15              23.75
2316.00               8.00
2061.30              17.00
2207.50               5.00
1708.30              19.00
1784.70              24.00
2575.00               2.50
2357.90               7.50
2277.70              11.00
2165.20              13.00
2399.55               3.75
1779.80              25.00
2336.75               9.75
1765.30              22.00
2053.50              18.00
2414.40               6.00
2200.50              12.50
2654.20               2.00
1753.70              21.50
Table 1: Sample Data Set
The data from Table 1 are plotted in Figure 1 below, a scatter
plot built in the Minitab software.
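As a numerical companion to the scatter plot, the strength of the linear relationship can be summarized with the sample correlation coefficient. The following is a minimal sketch in plain Python, not part of the original Minitab analysis. The function name `sample_correlation` is hypothetical, and the pairing of the two columns of Table 1 in listed order is an assumption.

```python
import math

# Table 1 data, assumed paired in the order listed
# (age in weeks, shear strength in psi).
age = [15.50, 23.75, 8.00, 17.00, 5.00, 19.00, 24.00, 2.50, 7.50, 11.00,
       13.00, 3.75, 25.00, 9.75, 22.00, 18.00, 6.00, 12.50, 2.00, 21.50]
strength = [2158.70, 1678.15, 2316.00, 2061.30, 2207.50, 1708.30, 1784.70,
            2575.00, 2357.90, 2277.70, 2165.20, 2399.55, 1779.80, 2336.75,
            1765.30, 2053.50, 2414.40, 2200.50, 2654.20, 1753.70]


def sample_correlation(x, y):
    """Return the Pearson sample correlation coefficient of x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and cross-products.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = sample_correlation(age, strength)
print(f"sample correlation r = {r:.3f}")
```

A value of r close to -1 would agree with a strong negative linear trend in the scatter diagram.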
Figure 1: Scatter diagram with straight-line regression
B. Straight-line Regression Model
As Figure 1 shows, the straight-line regression
model does seem plausible. The fitted line does not
pass through every point, but a
negative linear trend is noticeable. Therefore, it is
reasonable to assume that shear strength decreases as
propellant age increases.
Because we have a single output and a single input, our
linear regression model follows this equation:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
In this equation, y represents the output, x is the input,
$\beta_0$ and $\beta_1$ are the unknown intercept and slope
coefficients, and $\varepsilon$ is the random error. The
y component in this equation is dependent on the x
component. With a larger sample, we could
estimate the regression line for our data more
precisely.
The least-squares estimates of the coefficients are
$$\hat{\beta}_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} \approx -36.96$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \approx 2625.4$$
Our regression equation is checked utilizing the
Minitab software depicted below.
III. TESTING FOR SIGNIFICANCE
We will now test for the significance of the regression model.
This tests a hypothesis about the slope of the linear
regression model.
$$H_0 : \beta_1 = 0$$
$$H_1 : \beta_1 \neq 0$$
$$\text{Test statistic: } F_0 = \frac{MSR}{MSE}$$
$$MSR = \frac{SSR}{1} \quad \text{and} \quad MSE = \frac{SSE}{n - p}$$
Here n = 20 is the sample size and p is the number of regression
coefficients (p = 2 for the straight-line model).
$$\text{Regression sum of squares} = SSR = \sum (\hat{y}_i - \bar{y})^2$$
$$\text{Error sum of squares} = SSE = \sum (y_i - \hat{y}_i)^2$$
As the Minitab output shows, the P-value is reported as zero
to the displayed precision, so the evidence supports rejecting
the null hypothesis.
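The significance test can also be checked by hand. The sketch below, in plain Python, computes the regression and error sums of squares and the statistic F0 = MSR/MSE from the Table 1 data. It makes several assumptions that are not in the original analysis: the function name `anova_f` is hypothetical, a significance level of 0.05 is chosen for illustration, and the critical value 4.41 is the tabulated F value for 1 and 18 degrees of freedom at that level. It compares against a critical value and does not reproduce Minitab's P-value.

```python
# Table 1 data, assumed paired in the order listed
# (age in weeks, shear strength in psi).
age = [15.50, 23.75, 8.00, 17.00, 5.00, 19.00, 24.00, 2.50, 7.50, 11.00,
       13.00, 3.75, 25.00, 9.75, 22.00, 18.00, 6.00, 12.50, 2.00, 21.50]
strength = [2158.70, 1678.15, 2316.00, 2061.30, 2207.50, 1708.30, 1784.70,
            2575.00, 2357.90, 2277.70, 2165.20, 2399.55, 1779.80, 2336.75,
            1765.30, 2053.50, 2414.40, 2200.50, 2654.20, 1753.70]

# Assumed for illustration: tabulated F critical value for
# alpha = 0.05 with 1 and n - p = 18 degrees of freedom.
F_CRIT = 4.41


def anova_f(x, y, p=2):
    """Return (SSR, SSE, F0) for a straight-line least-squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx              # slope estimate
    b0 = y_bar - b1 * x_bar       # intercept estimate
    fitted = [b0 + b1 * xi for xi in x]
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    msr = ssr / 1                 # one regressor
    mse = sse / (n - p)           # n - p error degrees of freedom
    return ssr, sse, msr / mse


ssr, sse, f0 = anova_f(age, strength)
print(f"SSR = {ssr:.1f}, SSE = {sse:.1f}, F0 = {f0:.1f}")
print("reject H0" if f0 > F_CRIT else "fail to reject H0")
```

An F0 far above the critical value leads to the same decision as a very small P-value: reject the null hypothesis that the slope is zero.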
IV. ESTIMATION
With our simple regression equation, we can estimate the
mean response for new inputs to our system. For
example, to estimate the mean shear
strength of a motor made from twenty-week-old
propellant, we substitute x = 20 weeks
into our equation.
$$\hat{y} = 2625.4 - 36.96x$$
$$\hat{y} = 2625.4 - 36.96(20)$$
$$\hat{y} = 1886.2$$
The estimated mean shear strength is therefore about
1886.2 psi. Because twenty weeks lies within the observed age
range of 2 to 25 weeks, this estimate is an interpolation.
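The estimate can be reproduced directly from the sample data. The following is a minimal sketch in plain Python that applies the closed-form least-squares formulas for the slope and intercept and then evaluates the fitted line at twenty weeks. The function name `fit_line` is hypothetical, and the two columns of Table 1 are assumed to be paired in listed order. Because the code keeps the coefficients unrounded, its result may differ slightly in the last digit from the hand calculation with rounded coefficients.

```python
# Table 1 data, assumed paired in the order listed
# (age in weeks, shear strength in psi).
age = [15.50, 23.75, 8.00, 17.00, 5.00, 19.00, 24.00, 2.50, 7.50, 11.00,
       13.00, 3.75, 25.00, 9.75, 22.00, 18.00, 6.00, 12.50, 2.00, 21.50]
strength = [2158.70, 1678.15, 2316.00, 2061.30, 2207.50, 1708.30, 1784.70,
            2575.00, 2357.90, 2277.70, 2165.20, 2399.55, 1779.80, 2336.75,
            1765.30, 2053.50, 2414.40, 2200.50, 2654.20, 1753.70]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares straight line."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


b0, b1 = fit_line(age, strength)
estimate = b0 + b1 * 20.0  # mean shear strength at 20 weeks

print(f"fitted line: y = {b0:.1f} + ({b1:.2f}) x")
print(f"estimated mean strength at 20 weeks: {estimate:.1f} psi")
```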
V. PREDICTIONS
VI. RESIDUAL ANALYSIS
In a straight-line regression model, a residual
is the vertical distance between an observed output y and the regression
line for the data. A residual analysis is used to examine
how well the chosen regression model fits the data. The
residuals should be unbiased and random with
minimum variance.
The figure above is the residual plot. The dotted line
at zero corresponds to the red regression line in the
figure below. For example, the first data point above is
approximately seventy-five units above the dotted line.
This means that the first data point in the figure below
lies the same distance above the red line.
REFERENCES
[1] D. C. Montgomery, G. C. Runger, and N. F. Hubele, Engineering
statistics. Hoboken, NJ: John Wiley & Sons, Inc., 2011.
[2] “Minitab 18 Support,” Minitab. [Online]. Available:
https://support.minitab.com/en-us/minitab/18/. [Accessed: 18-Dec-2020].
```