Uploaded by Sae Royi

Logistic Regression Implementation in Python Lab Report

DATE: 14-06-2021
Logistic Regression
Implementation of a logistic regression model using the Python language
Software Required: Google Colab
Logistic Regression:
• It is a supervised learning technique mainly used to predict a categorical dependent variable from a given set of independent variables.
• In logistic regression, instead of fitting a regression line, an "S"-shaped logistic (sigmoid) function is fitted. Its output always lies between 0 and 1 and is read as the probability of class 1.
• Logistic regression uses a threshold value to turn that probability into a class label: values above the threshold are assigned to class 1, and values below the threshold are assigned to class 0.
• It can provide probabilities and classify new data using both continuous and discrete data.
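The S-shaped function and the threshold rule described above can be illustrated with a minimal sketch. The function names `sigmoid` and `classify`, the sample scores, and the 0.5 threshold are assumptions chosen for illustration; they are not taken from the lab code.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real score to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(z, dtype=float)))

def classify(z, threshold=0.5):
    """Return 1 where the probability is at or above the threshold, else 0."""
    return (sigmoid(z) >= threshold).astype(int)

scores = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])  # illustrative linear scores
print(sigmoid(scores))   # probabilities, all strictly between 0 and 1
print(classify(scores))  # class labels after applying the threshold
```

A score of 0 maps to a probability of exactly 0.5, which is why 0.5 is the usual default threshold.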
Assumptions for Logistic Regression:
• The dependent variable must be categorical in nature.
• The independent variables should not exhibit multicollinearity, that is, they should not be highly correlated with one another.
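One simple way to screen for multicollinearity is to inspect the pairwise correlations between features. The sketch below is only a first check under stated assumptions: the synthetic columns `x1`, `x2`, `x3` and the 0.9 cut-off are hypothetical and are not part of the lab data.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n = 200
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(scale=0.05, size=n)  # nearly a multiple of x1
x3 = rng.normal(size=n)                          # generated independently

features = np.column_stack([x1, x2, x3])
names = ["x1", "x2", "x3"]

# pairwise Pearson correlations between the feature columns
corr = np.corrcoef(features, rowvar=False)

# collect every pair whose absolute correlation exceeds the cut-off
cutoff = 0.9
flagged = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > cutoff
]
print(flagged)
```

A pairwise check cannot detect a feature that is a combination of several others, so it should be treated as a quick screen rather than a complete test.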
Steps in Logistic Regression:
Data Pre-processing
Fitting Logistic Regression to the Training set
Predicting the test result
Testing the accuracy of the result (creation of a confusion matrix)
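The four steps above can be sketched end to end as follows. This is a hedged illustration on synthetic data from `make_classification`, not the diabetes dataset, and the scaling step with `StandardScaler` is an assumed example of pre-processing that the lab code itself does not perform.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# synthetic stand-in data (assumption: not the diabetes dataset)
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# 1. data pre-processing: split, then scale using training statistics only
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# 2. fit logistic regression to the training set
model = LogisticRegression(max_iter=1000).fit(X_train_s, y_train)

# 3. predict the test result
y_pred = model.predict(X_test_s)

# 4. test the accuracy and build the confusion matrix
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
print(cm)
print(acc)
```

Fitting the scaler on the training set only keeps information from the test set out of the model.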
from google.colab import files
uploaded = files.upload()
Choose Files diabetes.csv
diabetes.csv(application/vnd.ms-excel) - 23873 bytes, last modified: 6/11/2021 - 100% done
Saving diabetes.csv to diabetes.csv
Import Data and Train the model
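import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
# load the uploaded dataset
diabete = pd.read_csv("diabetes.csv")
X = diabete.drop(['Outcome'], axis=1)  # Features
y = diabete.Outcome  # Target variable
# split X and y into training and testing sets
# (the original call is missing from the report; test_size=0.4 is an assumption
# inferred from the 308 test samples in the confusion matrix, and the original
# random_state, if any, is unknown)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
# instantiate the model (default parameters, with a higher iteration limit)
logreg = LogisticRegression(max_iter=1000)
# fit the model with the training data
logreg.fit(X_train, y_train)
# predict the outcome for the test set
y_pred = logreg.predict(X_test)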
Confusion Matrix
# import the metrics module
from sklearn import metrics
cnf_matrix = metrics.confusion_matrix(y_test, y_pred)
print("Confusion matrix", cnf_matrix)
# apply the %d format so the count is substituted into the string
print("misclassified: %d" % (y_test != y_pred).sum())
print("classified: %d" % (y_test == y_pred).sum())
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Precision:", metrics.precision_score(y_test, y_pred))
print("Recall:", metrics.recall_score(y_test, y_pred))
print("F1 score", metrics.f1_score(y_test, y_pred))
Confusion matrix [[173  23]
 [ 41  71]]
misclassified: 64
classified: 244
Accuracy: 0.7922077922077922
Precision: 0.7553191489361702
Recall: 0.6339285714285714
F1 score 0.6893203883495145
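The reported scores can be checked by hand from the confusion matrix above. The sketch below assumes scikit-learn's layout, in which rows are actual classes and columns are predicted classes, so the matrix reads [[TN, FP], [FN, TP]]; the variable names are chosen for illustration.

```python
# counts taken from the confusion matrix reported above
tn, fp = 173, 23   # actual class 0: correctly rejected, falsely flagged
fn, tp = 41, 71    # actual class 1: missed, correctly detected

total = tn + fp + fn + tp
accuracy = (tp + tn) / total            # share of all predictions that are right
precision = tp / (tp + fp)              # share of predicted positives that are right
recall = tp / (tp + fn)                 # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
# prints: accuracy=0.7922 precision=0.7553 recall=0.6339 f1=0.6893
```

The recall of about 0.63 means that 41 of the 112 positive cases in the test set were predicted as negative.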
Predict Output
# feature values in the same column order as the training data
new1 = logreg.predict([[6, 148, 72, 35, 0, 33.6, 0.627, 50]])
if new1 == 1:
    print('Diabetes Positive')
else:
    print('Diabetes negative')
Diabetes Positive
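`predict` returns only the class label. When the probability behind the decision is also of interest, scikit-learn's `predict_proba` exposes it. The sketch below uses a tiny made-up one-feature dataset (an assumption, not the diabetes data), and the `threshold` variable is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# made-up one-feature data: larger values belong to class 1
X = np.array([[1.0], [2.0], [3.0], [4.0], [6.0], [7.0], [8.0], [9.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression(max_iter=1000).fit(X, y)

new_sample = np.array([[8.5]])
# column 1 of predict_proba holds the estimated probability of class 1
p_positive = model.predict_proba(new_sample)[0, 1]

threshold = 0.5  # predict() on a binary model corresponds to this cut-off
label = int(p_positive >= threshold)
print(p_positive, label)
```

Raising or lowering `threshold` trades precision against recall, which can matter in a screening setting where missed positive cases are costly.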
Result: Logistic regression was implemented successfully using the Python language.
Bala Murugan R