How Do You Start Machine Learning in Python?
- Python is a popular and powerful interpreted language. Unlike R, it is a general-purpose language and platform that you can use for research and development as well as for data analytics and machine learning.
- There are also many modules and libraries to choose from, providing multiple ways to accomplish each task. Python's readability makes it approachable even for people from non-technical backgrounds who want to learn programming and machine learning.
- The best way to learn machine learning with Python is to get started with small, simple projects. The program below is your step 0 in machine learning.
We Need
- Python version 2.7+ (the code also runs on Python 3)
- Prerequisites: basic Python skills
- Difficulty level: Beginner
Problem Statement:
Your company has asked you to build a machine learning regression model to predict its share value, which follows this equation:
f(x0, z0) = 3*x0 - 4*z0 + 5
where x0 is company investment, z0 is company performance, and 5 is the bias.
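For illustration (these numbers are made up, not real data): with an investment value x0 = 2 and a performance value z0 = 1, the predicted share value would be f(2, 1) = 3*2 - 4*1 + 5 = 7.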
Let's Begin!
Let's import a few basic analytics libraries
In [7]:
'''
numpy : a numerical computation library
matplotlib : a plotting library
mpl_toolkits.mplot3d : adds 3D plotting support to matplotlib
'''
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
We use the numpy library to generate input data
In [8]:
# First, size of the training set we want to generate.
observations = 2000
# We generate them randomly, drawing from a uniform distribution. This method takes 3 arguments (low, high, size).
# The size of x0 and z0 is observations by 1. In this case: 2000 x 1.
x0 = np.random.uniform(low=-10, high=10, size=(observations,1))
z0 = np.random.uniform(low=-10, high=10, size =(observations,1))
# Combine two dimensions of x0 , z0 into one input matrix.
# This is the X matrix from the linear model y = x*w + b.
# column_stack is a Numpy method, which combines two vectors into a matrix.
inputs = np.column_stack((x0,z0))
In [9]:
# display the dimensions of the input matrix
inputs.shape
Out[9]:
(2000, 2)
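If column_stack is new to you, here is a tiny standalone illustration (the arrays a and b below are made-up toy data, not part of the exercise):
In [ ]:
# Toy example of np.column_stack: two vectors become the columns of one matrix.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.column_stack((a, b)))
# [[1 4]
#  [2 5]
#  [3 6]]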
We generate the targets using the numpy library again
In [10]:
# Noise is the part of the data that isn't in our hands. Again, the method takes 3 arguments (low, high, size).
noise = np.random.uniform(-1, 1, (observations,1))
# Produce the targets according to the f(x,z) = 3x - 4z + 5 + noise definition.
# In this way, we are basically saying: the weights should be 3 and -4, while the bias is 5.
targets = 3*x0 - 4*z0 + 5 + noise
In [11]:
# display the dimensions of the targets matrix
targets.shape
Out[11]:
(2000, 1)
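As an optional sanity check (not part of the original notebook), you can confirm that the targets really follow the formula up to the noise term, which lies between -1 and 1:
In [ ]:
# Optional sanity check: every target should equal 3*x0 - 4*z0 + 5 up to the noise.
print(np.allclose(targets, 3*x0 - 4*z0 + 5, atol=1))   # expected: True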
So now we have the data ready!
Plotting the data
In [12]:
'''
To use the 3D plot, the objects should have a certain shape, so we reshape the targets.
The method to use is reshape; it takes as arguments the dimensions we want the object to have.
'''
targets = targets.reshape(observations,)
# Plotting according to the conventional matplotlib.pyplot syntax
# Declare the figure
fig = plt.figure()
# A method allowing us to create the 3D plot
ax = fig.add_subplot(111, projection='3d')
# Plot the data: x0 and z0 on the horizontal axes, targets on the vertical axis.
ax.plot(x0, z0, targets)
# Set labels
ax.set_xlabel('x0')
ax.set_ylabel('z0')
ax.set_zlabel('Our Targets')
# You can fiddle with the azim parameter to plot the data from different angles. Just change the value of azim=100
# to azim = 0 ; azim = 200, or whatever. Check and see what happens.
ax.view_init(azim=100)
# So far we were just describing the plot. This method actually shows the plot.
plt.show()
In [13]:
'''
We reshape the targets back to the shape that they were in before plotting.
This reshaping is a side-effect of the 3D plot. Sorry for that.
'''
targets = targets.reshape(observations,1)
In [14]:
'''
We will initialize the weights and biases randomly in some small initial range.
init_range is the variable that will measure that.
You can play around with the initial range, but we don't really encourage you to do so.
High initial ranges may prevent the machine learning algorithm from learning.
'''
init_range = 0.1
# Weights are of size k x m, where k is the number of input variables and m is the number of output variables
# In our case, the weights matrix is 2x1 since there are 2 inputs (x and z) and one output (y)
weights = np.random.uniform(low=-init_range, high=init_range, size=(2, 1))
# Biases are of size 1 since there is only 1 output. The bias is a scalar.
biases = np.random.uniform(low=-init_range, high=init_range, size=1)
#Print the weights to get a sense of how they were initialized.
print (weights)
print (biases)
[[-0.07174342]
 [-0.03948961]]
[0.04328575]
In [15]:
'''
Set a small learning rate (commonly denoted eta).
0.02 works quite well for our example, but it is highly recommended
that you play around with this value and observe the effect.
'''
learning_rate = 0.02
The real machine learning logic!
In [16]:
'''
We iterate over our training dataset 100 times. That works well with a learning rate of 0.02.
The proper number of iterations is something we will discuss later on, but generally
a lower learning rate needs more iterations, while a higher learning rate needs fewer.
Keep in mind that a high learning rate may cause the loss to diverge to infinity instead of converging to 0.
'''
for i in range(100):
    # This is the linear model: y = xw + b
    outputs = np.dot(inputs, weights) + biases
    # The deltas are the differences between the outputs and the targets.
    # Note that deltas here is a vector of size 2000 x 1.
    deltas = outputs - targets
    # We use the L2-norm loss, divided by 2 (a common convention).
    # Moreover, we further divide it by the number of observations.
    # This is simple rescaling by a constant; it doesn't change the optimization logic,
    # as any function holding the basic property of being lower for better results, and higher for worse results,
    # can be a loss function.
    loss = np.sum(deltas ** 2) / 2 / observations
    # We print the loss at each step so we can observe whether it is decreasing as desired.
    print("Loss :", loss)
    # Another small trick is to scale the deltas the same way as the loss function.
    # In this way our learning rate is independent of the number of samples (observations).
    # Again, this doesn't change anything in principle; it simply makes it easier to pick a single learning rate
    # that can remain the same if we change the number of training samples (observations).
    # You can try solving the problem without rescaling to see how that works for you.
    deltas_scaled = deltas / observations
    # Finally, we apply the gradient descent update rules.
    # The weights are 2x1, the learning rate is a scalar, inputs are 2000x2, and deltas_scaled is 2000x1.
    # We must transpose the inputs so the matrix multiplication is a valid operation.
    weights = weights - learning_rate * np.dot(inputs.T, deltas_scaled)
    biases = biases - learning_rate * np.sum(deltas_scaled)
    # The weights are updated in a linear algebraic way (a matrix minus another matrix).
    # The biases, however, are just a single number here, so we sum the deltas into a scalar.
    # Both lines are consistent with the gradient descent methodology.
Loss : 423.2945063242378
Loss : 60.653335178432854
Loss : 17.27979574342842
Loss : 11.745259809852493
Loss : 10.710976877352401
...
Loss : 0.4126187175395678
Loss : 0.4029465262718161
Loss : 0.3936572958891957
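Printing 100 loss values quickly becomes hard to read. As an optional extension (not part of the original notebook), you could collect the losses in a list and plot the learning curve instead; a minimal sketch, assuming the same inputs, targets and hyperparameters defined above:
In [ ]:
# Optional sketch: re-run training while recording the loss at each iteration,
# then plot the learning curve. Variable names w, b are local to this sketch.
w = np.random.uniform(-init_range, init_range, size=(2, 1))
b = np.random.uniform(-init_range, init_range, size=1)
losses = []
for i in range(100):
    out = np.dot(inputs, w) + b
    d = out - targets
    losses.append(np.sum(d ** 2) / 2 / observations)
    d_scaled = d / observations
    w = w - learning_rate * np.dot(inputs.T, d_scaled)
    b = b - learning_rate * np.sum(d_scaled)
plt.plot(losses)
plt.xlabel('iteration')
plt.ylabel('loss')
plt.show()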
In [17]:
'''
We print the weights and the biases, so we can see whether they have converged to what we wanted.
When we declared the targets following f(x,z) = 3x - 4z + 5, we set the weights to 3 and -4, and the bias to 5.
'''
print (weights, biases)
[[ 3.00189141]
 [-3.99775609]] [4.33826281]
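As an optional cross-check (not part of the original notebook), you could fit the same data with scikit-learn's LinearRegression and compare its coefficients with the ones gradient descent found; a minimal sketch, assuming scikit-learn is installed:
In [ ]:
# Optional cross-check with scikit-learn's closed-form linear regression.
from sklearn.linear_model import LinearRegression
reg = LinearRegression().fit(inputs, targets)
print(reg.coef_, reg.intercept_)   # should be close to [3, -4] and 5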
In [18]:
'''
We plot the outputs against the targets to see whether they have a linear relationship.
This step isn't strictly required; with more complex models, such a direct comparison would not even be possible.
'''
plt.plot(outputs,targets)
plt.xlabel('outputs')
plt.ylabel('targets')
plt.show()
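The closer the plotted points lie to a straight 45-degree line, the better the fit. As an optional quantitative complement (not part of the original notebook), you could also print a simple error metric:
In [ ]:
# Optional check: mean absolute error between the final predictions and the targets.
print(np.mean(np.abs(outputs - targets)))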
PHEW, WE DID IT!!!
Ever tried audio analytics in Python?