Linear Classification (Logistic Regression) Using TensorFlow 2+

In this article, we implement linear classification using a recent version of TensorFlow (2.2+).

What Is Linear Classification?

When the data points to be classified can be separated by a line (in simpler terms, when we can draw a line between the success records and the failure records), the problem is linearly separable, and a model that draws such a line is called a linear classifier.
In data science and machine learning, a linear classifier performs statistical classification: it identifies which output class an input belongs to.
It does so by computing a linear combination of the feature values and basing the classification decision on that value.
Examples of linear classifiers include logistic regression and the Naive Bayes (NB) classifier.
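
To make the "linear combination plus decision" idea concrete, here is a minimal NumPy sketch. The weights, bias and input points are made-up values chosen only for illustration; they are not taken from the model trained later in this article.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias for a 2-feature problem (illustrative values only)
w = np.array([2.0, -1.0])
b = 0.5

# Three made-up input points, one per row
X = np.array([[1.0, 1.0],
              [-1.0, 2.0],
              [0.0, 0.0]])

scores = X @ w + b                   # linear combination of the features
probs = sigmoid(scores)              # logistic regression turns the score into a probability
labels = (probs >= 0.5).astype(int)  # equivalent to checking whether the score is >= 0

print(scores)
print(labels)
```

The decision boundary is the set of points where the score is exactly zero, which is a straight line in 2D.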

What is the difference between linear and nonlinear classification techniques?

Consider a 2D feature space, i.e. data with only an x-axis and a y-axis. If one can draw a straight line that leaves all points of one class on one side and all points of the other class on the other side, the classes are linearly separable.

Now, look at the image below (on your right). Cases where the data cannot be separated by a straight line fall under the non-linear category. Here we needed a circle to separate the points.

The same goes for clusters in 3D data. The only difference is that a plane is used there instead of a line.

A linear classifier works best when the problem is linearly separable.
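
The circle case can be sketched in NumPy. The data below is randomly generated for illustration, and the weights are hand-picked rather than learned. The point of the sketch: no straight line in the original (x, y) space separates the classes, but after adding squared features the same boundary becomes linear.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up 2D points: class 1 lies inside a circle of radius 1, class 0 outside it
X = rng.uniform(-2.0, 2.0, size=(500, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.0).astype(int)

# Squared features: in this space the boundary x^2 + y^2 = 1 is a straight line
X_sq = X ** 2

# Hand-picked linear rule in the squared space: predict 1 when 1 - x^2 - y^2 > 0
w = np.array([-1.0, -1.0])
b = 1.0
pred = (X_sq @ w + b > 0).astype(int)

accuracy = np.mean(pred == y)
print(accuracy)
```

So "non-linear" depends on the features used: a linear classifier on transformed features can still draw a curved boundary in the original space.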

What are the different types of linear classifiers?

Broadly, they are divided into:

A : Generative models (assume conditional density functions):

A1 : Linear Discriminant Analysis (LDA), which assumes Gaussian conditional density models

A2 : Naive Bayes classifier with multinomial or multivariate Bernoulli event models

B : Discriminative models (attempt to maximise the quality of the output on a training set):

B1 : Logistic regression

B2 : Perceptrons

B3 : Support Vector Machines (SVM)
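
As an illustration of a discriminative linear classifier from the list above, here is a small sketch of the classic perceptron update rule in NumPy. The function name `train_perceptron`, the toy data and the hyperparameters are assumptions made for this example only.

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Classic perceptron rule; y must contain 0/1 labels."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            # (yi - pred) is 0 for a correct prediction, so only mistakes move the line
            w += lr * (yi - pred) * xi
            b += lr * (yi - pred)
    return w, b

# Toy, linearly separable data (made up for illustration)
X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, 0, 0])

w, b = train_perceptron(X, y)
pred = (X @ w + b > 0).astype(int)
print(pred)
```

The perceptron adjusts the line directly from its mistakes on the training set, without modelling how each class is distributed, which is what makes it discriminative rather than generative.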

The following walkthrough shows how we can implement a linear classifier using the popular TensorFlow library.
This code also demonstrates how we can save the model and use it in production.

Python Version	Difficulty Level	Pre-Requisites
3.5+	Easy	Basic Python

'''
Import Modules
'''
import tensorflow as tf
import numpy as np
import pandas as pd
 
import matplotlib.pyplot as plt
%matplotlib inline
 
# Load in the data
from sklearn.datasets import load_breast_cancer
 
#Used to split dataset to train and test
from sklearn.model_selection import train_test_split
 
#Used to scale the data
from sklearn.preprocessing import StandardScaler
 
 
'''
Built-in loader for the data
The dataset ships with scikit-learn, so no download is needed
'''
data = load_breast_cancer()
 
# check the type of 'data'
type(data)
'''
Inspect the target names and the feature (column) names
'''
data.target_names , data.feature_names

(array(['malignant', 'benign'], dtype='<U9'), array(['mean radius', 'mean texture', 'mean perimeter', 'mean area', 'mean smoothness', 'mean compactness', 'mean concavity', 'mean concave points', 'mean symmetry', 'mean fractal dimension', 'radius error', 'texture error', 'perimeter error', 'area error', 'smoothness error', 'compactness error', 'concavity error', 'concave points error', 'symmetry error', 'fractal dimension error', 'worst radius', 'worst texture', 'worst perimeter', 'worst area', 'worst smoothness', 'worst compactness', 'worst concavity', 'worst concave points', 'worst symmetry', 'worst fractal dimension'], dtype='<U23'))

'''
We split the input data into train and test sets using the sklearn library (check the parameter syntax for train_test_split)
'''
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.33)
N, D = X_train.shape
 
 
'''
Scale the features: standardized inputs help gradient-based training converge
'''
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
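
As a rough sketch of what the scaling step does, the same standardization can be written by hand in NumPy. The arrays below are made-up stand-ins, not the breast cancer data. Note that the test set is scaled with the mean and standard deviation of the training set, which mirrors calling `fit_transform` on the training data and only `transform` on the test data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up features on very different scales (stand-ins for real columns)
train = rng.normal(loc=[10.0, 1000.0], scale=[2.0, 300.0], size=(100, 2))
test = rng.normal(loc=[10.0, 1000.0], scale=[2.0, 300.0], size=(20, 2))

# Statistics are computed on the training data only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics, never refit on test data

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
```

After scaling, every training column has mean close to 0 and standard deviation close to 1, so no single feature dominates the weighted sum.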

'''
Model building section
'''
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(D,)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# compile the model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
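
A single dense unit with a sigmoid activation is logistic regression, and binary cross-entropy is the loss it minimises. The NumPy sketch below shows how that loss behaves on made-up labels and probabilities. The helper name `binary_crossentropy` and the clipping constant `eps` are assumptions for this example, not a copy of the Keras internals.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps is an assumed clipping constant to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Made-up labels and predicted probabilities
y_true = np.array([1, 0, 1, 0])
mostly_right = np.array([0.9, 0.1, 0.8, 0.2])
mostly_wrong = np.array([0.1, 0.9, 0.2, 0.8])

loss_good = binary_crossentropy(y_true, mostly_right)
loss_bad = binary_crossentropy(y_true, mostly_wrong)

print(loss_good, loss_bad)
```

Confident correct predictions give a small loss, while confident wrong predictions are penalised heavily, which is what pushes the weights toward a good separating line during training.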


# Train the model
r = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=100)


# Evaluate the model - the evaluate() function returns loss and accuracy
print("Train score:", model.evaluate(X_train, y_train))
print("Test score:", model.evaluate(X_test, y_test))

Epoch 100/100
12/12 [==============================] - 0s 4ms/step - loss: 0.0855 - accuracy: 0.9816 - val_loss: 0.1101 - val_accuracy: 0.9734

Test score: [0.10941322147846222, 0.9734042286872864]

'''
Let's plot the loss
'''
plt.plot(r.history['loss'], label='loss')
plt.plot(r.history['val_loss'], label='val_loss')
plt.legend()

'''
Let's plot the accuracy
'''
plt.plot(r.history['accuracy'], label='acc')
plt.plot(r.history['val_accuracy'], label='val_acc')
plt.legend()

Let's use this model to make predictions

'''
We use X_test to make predictions (scroll back to the train/test split if you do not remember what X_test is)
'''
pred = model.predict(X_test)  # returns the predicted probabilities, shape (n_samples, 1)
pred = np.round(pred).flatten()  # round to 0/1 and flatten to compare with the actual labels in y_test
print(pred)
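
One detail about the rounding step: `np.round` rounds exact halves to the nearest even number, so a probability of exactly 0.5 becomes 0. The sketch below, using made-up probabilities in the same column shape that `predict` returns, compares rounding with an explicit threshold. Which behaviour you want at exactly 0.5 is a design choice; the explicit comparison simply makes it visible.

```python
import numpy as np

# Made-up sigmoid outputs, one row per sample
probs = np.array([[0.92], [0.08], [0.5], [0.61]])

# Rounding: halves go to the nearest even integer, so 0.5 -> 0
rounded = np.round(probs).flatten()

# Explicit threshold: 0.5 and above is mapped to class 1
thresholded = (probs.flatten() >= 0.5).astype(int)

print(rounded)
print(thresholded)
```

An explicit threshold also makes it easy to move the cut-off away from 0.5 if one kind of error is more costly than the other.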

'''
We calculate the accuracy of the model
evaluate() returns the loss and the accuracy
'''
print("Manually calculated accuracy:", np.mean(pred == y_test))
print("Evaluate output:", model.evaluate(X_test, y_test))

Manually calculated accuracy: 0.973404255319149
6/6 [==============================] - 0s 2ms/step - loss: 0.1094 - accuracy: 0.9734
Evaluate output: [0.10941322147846222, 0.9734042286872864]

Let's save this model to deploy it in production

'''
Syntax to save the model to your local machine
'''
model.save('linearclassifier.h5')
'''
Let's check if the model is saved on the local system
'''
!ls -lrt

-rw-r--r-- 1 root root 18480 Jun 5 07:41 linearclassifier.h5

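'''
Note :
We can tune this model to get better accuracy by changing values like the number of epochs,
the number of hidden layers, etc.
(Hidden layers with non-linear activations make the model a non-linear classifier.)
It's all a game of trial and error.
'''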

'''
This is how we load the saved model and use it for predictions
'''
model = tf.keras.models.load_model('linearclassifier.h5')
print(model.layers)
model.evaluate(X_test, y_test)

[<tensorflow.python.keras.layers.core.Dense object at 0x7fd15925eb38>]
6/6 [==============================] - 0s 2ms/step - loss: 0.1094 - accuracy: 0.9734

[0.10941322147846222, 0.9734042286872864]
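
One thing to keep in mind for production: the saved model expects scaled inputs, so the scaling statistics have to be stored next to it. Below is a minimal sketch of one way to do that with plain NumPy. The file name `scaler_stats.npz` and the random arrays are hypothetical stand-ins; persisting the fitted scaler object itself would be another option.

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the training features (made-up data)
X_fit = rng.normal(5.0, 2.0, size=(50, 3))

# Statistics learned at training time
mean = X_fit.mean(axis=0)
scale = X_fit.std(axis=0)

# Hypothetical file name, written to a temporary directory for this sketch
path = os.path.join(tempfile.mkdtemp(), "scaler_stats.npz")
np.savez(path, mean=mean, scale=scale)

# Later, in the serving process: load the statistics and scale new rows the same way
stats = np.load(path)
new_rows = rng.normal(5.0, 2.0, size=(4, 3))
new_scaled = (new_rows - stats["mean"]) / stats["scale"]

print(new_scaled.shape)
```

The scaled rows can then be passed to the loaded model's `predict` method, so that serving-time inputs go through the same preprocessing as the training data.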
