Simple Linear Regression Model
MA208
Probability Theory and Applications
Presented by: U N Shyam Pramod, 2nd Year BTech,
Mechanical Engineering, NITK Surathkal
What is Simple Linear Regression?
Model Relationship: Between two variables
Prediction Tool: Predict the dependent variable from the independent variable
Equation: Y = β₀ + β₁X + ε
Example: Salary prediction from experience
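A minimal sketch of what the equation Y = β₀ + β₁X + ε means in practice: each Y is a point on a straight line plus a random error. The function name `simulate_slr`, the parameter values, and the experience/salary numbers are all made up for illustration, and the error is assumed to be normal with mean 0.

```python
import random

def simulate_slr(x_values, beta0, beta1, sigma, seed=0):
    """Draw one Y per X from Y = beta0 + beta1*X + eps, with eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]

# Hypothetical inputs: years of experience and made-up parameters
experience = [1, 2, 3, 4, 5, 6, 7, 8]
salary = simulate_slr(experience, beta0=30.0, beta1=5.0, sigma=2.0)

for x, y in zip(experience, salary):
    print(f"experience={x}  salary={y:.1f}")
```

With sigma set to 0 the points fall exactly on the line; a larger sigma scatters them further from it.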
Example – House Prices
Dependent Variable (Y): House price
Independent Variable (X): Size of house (sq. ft.)
PDF and CDF
Error Distribution: ε ~ N(0, σ²), so for a fixed x, Y ~ N(β₀ + β₁x, σ²)
PDF Formula: f_Y(y) = (1/√(2πσ²)) exp(-(y - (β₀ + β₁x))² / (2σ²))
CDF Formula: F_Y(y) = Φ((y - (β₀ + β₁x))/σ)
MGF Formula: M_Y(t) = exp((β₀ + β₁x)t + σ²t²/2)
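The three formulas can be evaluated directly with the standard library. This is a sketch only: the function names `pdf_y`, `cdf_y`, `mgf_y` and the parameter values are illustrative assumptions, and Φ is computed from `math.erf` using Φ(z) = (1 + erf(z/√2))/2.

```python
import math

def pdf_y(y, x, beta0, beta1, sigma):
    """Density of Y at y for a fixed x, where Y ~ N(beta0 + beta1*x, sigma^2)."""
    mu = beta0 + beta1 * x
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def cdf_y(y, x, beta0, beta1, sigma):
    """P(Y <= y) for a fixed x, using the standard normal CDF."""
    z = (y - (beta0 + beta1 * x)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mgf_y(t, x, beta0, beta1, sigma):
    """Moment generating function of Y at t for a fixed x."""
    mu = beta0 + beta1 * x
    return math.exp(mu * t + sigma ** 2 * t ** 2 / 2)

# Hypothetical parameter values, chosen only for illustration
b0, b1, s, x0 = 2.2, 0.6, 1.0, 3
mean_y = b0 + b1 * x0
print(cdf_y(mean_y, x0, b0, b1, s))  # half the probability lies below the mean
print(pdf_y(mean_y, x0, b0, b1, s))  # peak height of the density
print(mgf_y(0.0, x0, b0, b1, s))     # any MGF equals 1 at t = 0
```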
Derivation – Least Squares
Goal: Minimize the total squared error ∑(yᵢ - β₀ - β₁xᵢ)²
Slope Formula: β̂₁ = ∑(xᵢ - x̄)(yᵢ - ȳ) / ∑(xᵢ - x̄)²
Intercept Formula: β̂₀ = ȳ - β̂₁x̄
Python Code Illustration
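import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
# Sample means x̄ and ȳ
x_mean = sum(x)/len(x)
y_mean = sum(y)/len(y)
# a is the least-squares slope (β̂₁), b is the intercept (β̂₀)
a = sum((xi - x_mean)*(yi - y_mean) for xi, yi in zip(x, y)) / sum((xi - x_mean)**2 for xi in x)
b = y_mean - a*x_mean
# Fitted values on the regression line
y_pred = [a*xi + b for xi in x]
# Scatter plot of the data with the fitted line in red
plt.scatter(x, y)
plt.plot(x, y_pred, color='red')
plt.show()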
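As a hand check on the same five data points, the intermediate sums can be printed without any plotting. The names `s_xy`, `s_xx`, `slope`, and `intercept` are introduced here for readability and are not part of the original code.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_mean = sum(x) / len(x)   # 15 / 5 = 3.0
y_mean = sum(y) / len(y)   # 20 / 5 = 4.0

# Numerator and denominator of the slope formula
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))  # 4 + 0 + 0 + 0 + 2 = 6
s_xx = sum((xi - x_mean) ** 2 for xi in x)                          # 4 + 1 + 0 + 1 + 4 = 10

slope = s_xy / s_xx                   # 6 / 10
intercept = y_mean - slope * x_mean   # 4 - 0.6 * 3

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")  # slope = 0.60, intercept = 2.20
```

So the fitted line for this data is ŷ = 2.2 + 0.6x.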
Theorem: The Least Squares Estimation
We aim to minimize the sum of squared residuals between observed and predicted values.
Sum of Squares: S(β₀,β₁) = ∑(yᵢ - β₀ - β₁xᵢ)²
Parameters: β₀ (intercept) and β₁ (slope)
Goal: Set partial derivatives to zero
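To see what "minimize S" means numerically, the sketch below evaluates S(β₀, β₁) on the five sample points from the Python illustration, at their least-squares estimates (2.2, 0.6) and at a few nearby lines. The function name `sum_of_squares` and the perturbation sizes are illustrative choices.

```python
def sum_of_squares(beta0, beta1, x, y):
    """S(beta0, beta1): sum of squared vertical distances from the points to the line."""
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

best = sum_of_squares(2.2, 0.6, x, y)
print(f"S at the least-squares line: {best:.3f}")

# Nudging either parameter in either direction should only increase S
nearby = []
for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    s = sum_of_squares(2.2 + d0, 0.6 + d1, x, y)
    nearby.append(s)
    print(f"S at ({2.2 + d0:.2f}, {0.6 + d1:.2f}): {s:.3f}")
```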
Taking Partial Derivatives
For β₀: ∂S/∂β₀ = -2∑(yᵢ - β₀ - β₁xᵢ)
For β₁: ∂S/∂β₁ = -2∑xᵢ(yᵢ - β₀ - β₁xᵢ)
Setting these derivatives to zero gives us the conditions for minimizing the sum of squared errors.
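A small numerical check of the two derivative formulas, again on the five sample points from the Python illustration. The function name `gradient` is an assumption made for this sketch; the point (2.2, 0.6) is the least-squares estimate for that data.

```python
def gradient(beta0, beta1, x, y):
    """Return (dS/dbeta0, dS/dbeta1) computed from the two derivative formulas."""
    residuals = [yi - beta0 - beta1 * xi for xi, yi in zip(x, y)]
    d_beta0 = -2 * sum(residuals)
    d_beta1 = -2 * sum(xi * ri for xi, ri in zip(x, residuals))
    return d_beta0, d_beta1

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# At an arbitrary point the derivatives are far from zero
print(gradient(0, 0, x, y))  # (-40, -132)

# At the least-squares estimates both derivatives vanish, up to rounding
g0, g1 = gradient(2.2, 0.6, x, y)
print(abs(g0) < 1e-9 and abs(g1) < 1e-9)  # True
```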
Solving the Normal Equations
Equation 1
∑(yᵢ - β₀ - β₁xᵢ) = 0
Simplifies to: β₀ = ȳ - β₁x̄
Equation 2
∑xᵢ(yᵢ - β₀ - β₁xᵢ) = 0
Substitution
Substituting β₀ = ȳ - β₁x̄ into Equation 2 and using ∑(xᵢ - x̄) = 0 and ∑(yᵢ - ȳ) = 0 gives:
∑(xᵢ - x̄)(yᵢ - ȳ) = β₁∑(xᵢ - x̄)²
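Expanding the sums, Equations 1 and 2 form a 2x2 linear system in β₀ and β₁, which can also be solved directly. The sketch below does this with Cramer's rule on the five sample points from the Python illustration; the variable names are illustrative, and the approach is an alternative route to the same answer, not the substitution used above.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
sum_x = sum(x)                                 # 15
sum_y = sum(y)                                 # 20
sum_xx = sum(xi * xi for xi in x)              # 55
sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # 66

# Normal equations written as a linear system:
#   n     * beta0 + sum_x  * beta1 = sum_y    (Equation 1)
#   sum_x * beta0 + sum_xx * beta1 = sum_xy   (Equation 2)
det = n * sum_xx - sum_x ** 2                  # 50; zero only if every x is identical
beta0 = (sum_y * sum_xx - sum_x * sum_xy) / det
beta1 = (n * sum_xy - sum_x * sum_y) / det

print(f"beta0 = {beta0:.2f}, beta1 = {beta1:.2f}")  # beta0 = 2.20, beta1 = 0.60
```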
The Least Squares Regression Formula
1. Final Regression Line: ŷ = β̂₀ + β̂₁x
2. Intercept Formula: β̂₀ = ȳ - β̂₁x̄
3. Slope Formula: β̂₁ = ∑(xᵢ - x̄)(yᵢ - ȳ) / ∑(xᵢ - x̄)²
These formulas give the line that minimizes the sum of squared errors between the data points and the regression line.
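The three formulas can be wrapped into a pair of small reusable functions. This is a sketch under assumptions: the names `fit_line` and `predict` are hypothetical, and the input checks are an addition not discussed in the slides.

```python
def fit_line(x, y):
    """Return (beta0_hat, beta1_hat) from the intercept and slope formulas."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    beta1_hat = s_xy / s_xx
    beta0_hat = y_mean - beta1_hat * x_mean
    return beta0_hat, beta1_hat

def predict(x_new, beta0_hat, beta1_hat):
    """Evaluate the fitted line y_hat = beta0_hat + beta1_hat * x at a new x."""
    return beta0_hat + beta1_hat * x_new

# Sample data from the Python illustration
b0, b1 = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y_hat at x = 6: {predict(6, b0, b1):.2f}")  # y_hat at x = 6: 5.80
```

Predicting at x = 6 extrapolates one step beyond the observed range, which is only reasonable if the linear relationship continues to hold there.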
Real-World Applications
Predict salary, house prices, crop yield
Used in business, economics, science
Assumptions in SLR
Independence: Residuals not correlated
Homoscedasticity: Constant variance of errors
Linearity: X and Y relationship is linear
Normality: Residuals normally distributed
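Most of these assumptions are checked by looking at the residuals. The sketch below computes them for the five sample points from the Python illustration and reports two rough numbers: the mean residual, and the Durbin-Watson statistic as an informal check of independence. The function names are illustrative, and these are quick diagnostics rather than formal tests; homoscedasticity and normality are usually judged from residual plots, which are not shown here.

```python
def residuals_of_fit(x, y):
    """Fit the least-squares line and return the residuals y_i - y_hat_i."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
             / sum((xi - x_mean) ** 2 for xi in x))
    intercept = y_mean - slope * x_mean
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(r ** 2 for r in residuals)
    return num / den

res = residuals_of_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print("residuals:", [round(r, 2) for r in res])
print("mean residual:", round(sum(res) / len(res), 6))
print("Durbin-Watson:", round(durbin_watson(res), 3))
```

A Durbin-Watson value near 2 suggests little first-order correlation between successive residuals; with only five points the number is illustrative at best.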
THANK YOU