DATE: 14-06-2021
EXP NO: 6

Logistic Regression

Aim:
Implementation of a logistic regression model using the Python language.

Software Required:
Google Colab

Theory:
Logistic Regression:
Logistic regression is a supervised learning technique used mainly to predict a categorical dependent variable from a given set of independent variables.
Instead of fitting a regression line, logistic regression fits an "S"-shaped logistic (sigmoid) function whose output lies between 0 and 1 and is interpreted as the probability of the positive class.
A threshold value (commonly 0.5) converts this probability into a class label: probabilities above the threshold are assigned to class 1, and probabilities below it are assigned to class 0.
It can provide probabilities and classify new data using both continuous and discrete features.

Assumptions for Logistic Regression:
- The dependent variable must be categorical in nature.
- The independent variables should not exhibit multicollinearity.

Steps in Logistic Regression:
1. Data pre-processing
2. Fitting logistic regression to the training set
3. Predicting the test result
4. Testing the accuracy of the result (creation of the confusion matrix)

    from google.colab import files
    uploaded = files.upload()

diabetes.csv(application/vnd.ms-excel) - 23873 bytes, last modified: 6/11/2021 - 100% done
Saving diabetes.csv to diabetes.csv

Import Data and Train the model

    import pandas as pd

    diabete = pd.read_csv("diabetes.csv")
    X = diabete.drop(['Outcome'], axis=1)  # Features
    y = diabete.Outcome                    # Target variable

    # split X and y into training (60%) and testing (40%) sets
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.40, random_state=102)

    # import the class
    from sklearn.linear_model import LogisticRegression

    # instantiate the model (default parameters, with max_iter raised to 1000)
    logreg = LogisticRegression(max_iter=1000)

    # fit the model with data
    logreg.fit(X_train, y_train)

    # Prediction
    y_pred = logreg.predict(X_test)

Confusion Matrix

    # import the metrics module
    from sklearn import metrics

    cnf_matrix = metrics.confusion_matrix(y_test, y_pred)
    print("Confusion matrix", cnf_matrix)
    # use the % operator so the count is substituted into the format string
    print("Misclassified: %d" % (y_test != y_pred).sum())
    print("Correctly classified: %d" % (y_test == y_pred).sum())
    print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
    print("Precision:", metrics.precision_score(y_test, y_pred))
    print("Recall:", metrics.recall_score(y_test, y_pred))
    print("F1 score", metrics.f1_score(y_test, y_pred))

Confusion matrix [[173  23]
 [ 41  71]]
Misclassified: 64
Correctly classified: 244
Accuracy: 0.7922077922077922
Precision: 0.7553191489361702
Recall: 0.6339285714285714
F1 score 0.6893203883495145

Predict Output

    # feature values must be given in the same column order as X
    new1 = logreg.predict([[6, 148, 72, 35, 0, 33.6, 0.627, 50]])
    print(new1)
    if new1[0] == 1:
        print('Diabetes Positive')
    else:
        print('Diabetes negative')

[1]
Diabetes Positive

Result:
Logistic regression using the Python language is implemented successfully.

Bala Murugan R
180051601046
ECE-A
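
The metrics printed in the Confusion Matrix section can be recomputed by hand from the four cells of the matrix. The following is a minimal sketch, assuming scikit-learn's layout (rows are actual classes, columns are predicted classes, so the matrix reads [[TN, FP], [FN, TP]]). The variable names are illustrative and are not part of the original notebook.

```python
# Cells of the reported confusion matrix [[173, 23], [41, 71]],
# read as [[TN, FP], [FN, TP]] (assumed scikit-learn layout).
tn, fp = 173, 23
fn, tp = 41, 71

total = tn + fp + fn + tp              # number of test samples
correct = tn + tp                      # diagonal of the matrix
misclassified = fp + fn                # off-diagonal of the matrix

accuracy = correct / total             # share of all predictions that are right
precision = tp / (tp + fp)             # share of predicted positives that are real
recall = tp / (tp + fn)                # share of real positives that are found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print("Misclassified:", misclassified)
print("Correctly classified:", correct)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1 score:", f1)
```

The values agree with the scikit-learn output in the report: 244 of 308 test samples are classified correctly, and the recall (about 0.63) is lower than the precision (about 0.76), meaning the model misses more diabetic cases than it raises false alarms.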