Machine Learning
Linear Regression
Agenda
• Single Dimension Linear Regression

• Multi Dimension Linear Regression

• Gradient Descent

• Generalisation, Over-fitting & Regularisation

• Categorical Inputs
What is Linear Regression?
• Learning

• A supervised algorithm that learns from a set of training samples.

• Each training sample has one or more input values and a single output value.

• The algorithm learns the line, plane or hyper-plane that best fits the training
samples.

• Prediction

• Use the learned line, plane or hyper-plane to predict the output value for any
input sample.
Single Dimension Linear
Regression
Single Dimension Linear Regression
• Single dimension linear regression
has pairs of x and y values as input
training samples. 

• It uses these training samples to
derive a line that predicts values of y.

• The training samples are used to
derive the values of a and b that
minimise the error between actual
and predicted values of y.

Single Dimension Linear Regression
• We want a line that minimises the
error between the y values in
training samples and the y values
that the line passes through.

• Or put another way, we want the
line that “best fits” the training
samples.

• So we define the error function for
our algorithm so we can minimise
that error.
Single Dimension Linear Regression
• To determine the value of a that
minimises the error E, we look for
where the partial derivative of E
with respect to a is zero.
Single Dimension Linear Regression
• To determine the value of b that
minimises the error E, we look for
where the partial derivative of E
with respect to b is zero.
Single Dimension Linear Regression
• By substituting the final equations
from the previous two slides we
derive equations for a and b that
minimise the error
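
The equations from these slides are not reproduced in this transcript. As a minimal sketch, assuming the line is written y = ax + b (a as the slope, b as the intercept) and E is the sum of squared differences between actual and predicted y, the standard closed-form least-squares solution could be computed as below. The function name `fit_line` and the sample data are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y ≈ a*x + b (a = slope, b = intercept; assumed naming)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # The denominator is the variance of x; it is zero if all x values are equal.
    denom = np.mean(x * x) - np.mean(x) ** 2
    a = (np.mean(x * y) - np.mean(x) * np.mean(y)) / denom
    b = np.mean(y) - a * np.mean(x)
    return a, b

# Hypothetical training samples that lie exactly on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
a, b = fit_line(x, y)
```

Because the samples lie exactly on a line, the fit should recover the slope and intercept used to generate them.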
Single Dimension Linear Regression
• We also define a function which we can
use to score how well the derived line fits.

• A value of 1 indicates a perfect fit. 

• A value of 0 indicates a fit that is no
better than simply predicting the mean
of the input y values. 

• A negative value indicates a fit that is
even worse than just predicting the
mean of the input y values.
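
The scoring formula itself is not in this transcript. The properties listed above match the usual coefficient of determination (R-squared), so the sketch below assumes that definition: one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are hypothetical.

```python
import numpy as np

def r_squared(y, y_hat):
    """Assumed score: 1 - (residual sum of squares / total sum of squares)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)       # error of our predictions
    ss_tot = np.sum((y - np.mean(y)) ** 2)  # error of always predicting the mean
    return 1.0 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r_squared(y, y)                            # predictions equal the targets
mean_only = r_squared(y, np.full_like(y, y.mean()))  # always predict the mean
worse = r_squared(y, y[::-1])                        # predictions in reversed order
```

The three cases correspond to the three bullets: a perfect fit, a fit no better than the mean, and a fit worse than the mean.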
Multi Dimension Linear
Regression
Multi Dimension Linear Regression
• Each training sample has an x made
up of multiple input values and a
corresponding y with a single value. 

• The inputs can be represented as
an X matrix in which each row is
a sample and each column is a
dimension. 

• The outputs can be represented as
a y matrix in which each row is a
sample.
Multi Dimension Linear Regression
• Our predicted y values are
calculated by multiplying the X matrix
by a matrix of weights, w.

• If there are 2 dimensions, then this
equation defines a plane. If there are
more dimensions then it defines a
hyper-plane.
Multi Dimension Linear Regression
• We want a plane or hyper-plane
that minimises the error between
the y values in training samples
and the y values that the plane or
hyper-plane passes through.

• Or put another way, we want the
plane/hyper-plane that “best fits”
the training samples.

• So we define the error function for
our algorithm so we can minimise
that error.
Multi Dimension Linear Regression
• To determine the value of w that
minimises the error E, we look for
where the derivative of E with
respect to w is zero.

• We use the Matrix Cookbook to
help with the differentiation!
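
The derived equation is not in this transcript. A minimal sketch, assuming E is the sum of squared errors between y and Xw: setting the derivative to zero gives the normal equations (XᵀX)w = Xᵀy, which NumPy's `np.linalg.solve` can solve. The data, the leading column of ones (used here as a bias term) and `true_w` are hypothetical.

```python
import numpy as np

# Hypothetical training data: 5 samples, a column of ones plus 2 features.
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 2.0, 3.0],
              [1.0, 3.0, 1.0],
              [1.0, 4.0, 2.0]])
true_w = np.array([0.5, 2.0, -1.0])  # weights used to generate y
y = X @ true_w

# Solve (X^T X) w = X^T y for w, without forming an explicit inverse.
w = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ w  # predicted y values
```

Solving the linear system directly is generally preferred over computing the inverse of XᵀX and multiplying.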
Multi Dimension Linear Regression
• We also define a function which we can
use to score how well the derived plane or hyper-plane fits.

• A value of 1 indicates a perfect fit. 

• A value of 0 indicates a fit that is no
better than simply predicting the mean
of the input y values. 

• A negative value indicates a fit that is
even worse than just predicting the
mean of the input y values.
Multi Dimension Linear Regression
• In addition to using the X matrix to represent the basic features of our training
data, we can also introduce additional dimensions (i.e. columns in
our X matrix) that are derived from those basic feature values.

• If we introduce derived features whose values are powers of basic
features, our multi-dimensional linear regression can then derive
polynomial curves, planes and hyper-planes.
Multi Dimension Linear Regression
• For example, if we have just one
basic feature in each sample of X, we
can include a range of powers of that
value into our X matrix like this:

• In non-matrix form our multi-dimensional
linear equation is: 

• Inserting the powers of the basic
feature that we have introduced, this
becomes a polynomial:
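
The matrices and equations from this slide are not in the transcript. As a sketch under that caveat, the code below builds an X matrix whose columns are powers of a single basic feature (x⁰, x¹, x²) and fits it with the same linear machinery, assuming a sum-of-squares error. The quadratic used to generate y is hypothetical.

```python
import numpy as np

# One basic feature per sample.
x = np.linspace(-2.0, 2.0, 9)

# Hypothetical target: an exact quadratic in x.
y = 1.0 + 2.0 * x - 3.0 * x ** 2

# Derived features: each column of X is a power of the basic feature.
X = np.column_stack([np.ones_like(x), x, x ** 2])

# The model is still linear in the weights, so the normal equations apply.
w = np.linalg.solve(X.T @ X, X.T @ y)
```

The fitted weights are the polynomial coefficients, so the regression is linear in w even though the curve is not linear in x.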
Gradient Descent
Singular Matrices
• As we have seen, we can use
numpy’s linalg.solve() function to
determine the value of the weights
that result in the lowest possible error.

• But this doesn’t work if np.dot(X.T, X)
is a singular matrix.

• It results in the matrix equivalent of a
divide by zero.

• Gradient descent is an alternative
approach to determining the optimal
weights that still works in cases like
this singular matrix case.
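
A small illustration of the singular case, using a hypothetical X whose two columns are identical. Here `np.dot(X.T, X)` has rank 1, and `np.linalg.solve` is expected to raise `numpy.linalg.LinAlgError` rather than return weights.

```python
import numpy as np

# Hypothetical data: the second column duplicates the first.
X = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

gram = np.dot(X.T, X)  # [[14, 14], [14, 14]], which has no inverse

try:
    w = np.linalg.solve(gram, np.dot(X.T, y))
    solved = True
except np.linalg.LinAlgError:
    # The matrix equivalent of a divide by zero.
    solved = False
```

With noisy real data the matrix is more often nearly singular than exactly singular, in which case the solve may succeed but return unstable weights.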
Gradient Descent
• Gradient descent is a technique we can use to find a minimum of
complex, differentiable error functions.

• In gradient descent we pick a random set of weights for our algorithm and
iteratively adjust those weights in the direction opposite to the gradient of the
error with respect to each weight.

• As we iterate, the gradient approaches zero and we approach the
minimum error.

• In machine learning we often use gradient descent with our error function
to find the weights that give the lowest errors.
Gradient Descent
• Here is an example with a very
simple function:

• The gradient of this function is
given by:

• We choose a random initial
value for x and a learning rate of
0.1 and then start descent.

• On each iteration the magnitude of x
decreases and the gradient (2x)
converges towards 0.
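
The function itself is not in this transcript. Since the slide gives the gradient as 2x, the sketch below assumes f(x) = x². It uses a fixed starting value rather than a random one so the run is reproducible; that choice and the 50 iterations are assumptions.

```python
def gradient(x):
    """Gradient of the assumed function f(x) = x**2."""
    return 2.0 * x

x = 5.0              # fixed start (the slide picks a random value)
learning_rate = 0.1  # as stated on the slide
history = [x]

for _ in range(50):
    # Step against the gradient to move downhill.
    x = x - learning_rate * gradient(x)
    history.append(x)
```

Each step multiplies x by 0.8, so both x and the gradient 2x shrink towards 0.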
Gradient Descent
• The learning rate is what is known as a hyper-parameter.

• If the learning rate is too small then convergence may take a very long
time.

• If the learning rate is too large then convergence may never happen
because our iterations bounce from one side of the minimum to the other.

• Choosing a suitable value for hyper-parameters is an art so try different
values and plot the results until you find suitable values.
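
To make the effect of the learning rate concrete, here is a sketch that runs the same descent on an assumed f(x) = x² (gradient 2x) with several learning rates. The helper `descend`, the starting value and the step count are hypothetical.

```python
def descend(learning_rate, x0=5.0, steps=20):
    """Run gradient descent on the assumed f(x) = x**2 and return the final x."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * 2.0 * x
    return x

too_small = descend(0.001)  # barely moves away from the start
suitable = descend(0.1)     # gets close to the minimum at 0
bouncing = descend(1.0)     # jumps between +5 and -5 without converging
diverging = descend(1.1)    # overshoots further on every step
```

Plotting the value of x per iteration for each rate is one way to see these behaviours side by side.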
Multi Dimension Linear Regression
with Gradient Descent
• For multi dimension linear
regression our error function
is:

• Differentiating this with
respect to the weights vector
gives:

• We can iteratively reduce the
error by adjusting the weights
in the direction opposite to these
gradients.
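
The error function and gradient from this slide are not in the transcript. As a sketch, assuming E = (y − Xw)ᵀ(y − Xw), the gradient with respect to w is 2Xᵀ(Xw − y). The data, seed, learning rate and iteration count below are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 samples, a column of ones plus 2 random features.
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w

w = rng.normal(size=3)  # random initial weights
learning_rate = 0.001

for _ in range(5000):
    # Gradient of the assumed sum-of-squares error with respect to w.
    gradient = 2.0 * X.T @ (X @ w - y)
    # Step against the gradient to reduce the error.
    w = w - learning_rate * gradient
```

On this noise-free data the weights should approach the values used to generate y.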
Generalisation, Over-fitting &
Regularisation
Generalisation & Over-fitting
• As we train our model, it may start to fit the training data more and
more accurately, but become worse at handling test data that we feed to it later. 

• This is known as “over-fitting” and results in an increased generalisation error.

• To minimise the generalisation error we should:

• Collect as much sample data as possible. 

• Use a random subset of our sample data for training.

• Use the remaining sample data to test how well our model copes with data it was not trained
with.

• Also, experiment with the degree of polynomial features (X², X³, etc.): higher degrees fit the training
data more closely but tend to increase over-fitting, so compare the error on the test data.
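
A minimal sketch of the train/test procedure described above, on hypothetical noisy samples of a line. The 80/20 split, the seed and the use of R-squared as the score are assumptions rather than details from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample data: y = 2x + 1 plus noise.
n = 100
x = rng.uniform(-3.0, 3.0, size=n)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x])

# Use a random subset for training and the remainder for testing.
indices = rng.permutation(n)
train_idx, test_idx = indices[:80], indices[80:]

# Fit on the training subset only.
X_train, y_train = X[train_idx], y[train_idx]
w = np.linalg.solve(X_train.T @ X_train, X_train.T @ y_train)

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

train_score = r_squared(y_train, X_train @ w)
test_score = r_squared(y[test_idx], X[test_idx] @ w)  # data the model never saw
```

A test score far below the training score is the usual sign of over-fitting.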
L1 Regularisation (Lasso)
• Having a large number of samples (n) relative to the number of
dimensions (d) tends to improve the quality of our model. 

• One way to reduce the effective number of dimensions is to use those that
most contribute to the signal and ignore those that mostly act as noise.

• L1 regularisation achieves this by adding a penalty that results in the
weight for the dimensions that act as noise becoming 0. 

• L1 regularisation encourages a sparse vector of weights in which few are
non-zero and many are zero.
L1 Regularisation (Lasso)
• In L1 regularisation we add a penalty to
the error function: 

• Expanding this we get: 

• Take the derivative with respect to w to
find our gradient:

• Where sign(w) is -1 if w < 0, 0 if w = 0
and +1 if w > 0

• Note that because sign(w) has no
inverse function we cannot solve for w
and so must use gradient descent.
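
The penalised error function and its gradient are not in this transcript. The sketch below assumes E = (y − Xw)ᵀ(y − Xw) + λ·Σ|wᵢ|, whose gradient is 2Xᵀ(Xw − y) + λ·sign(w). The data, seed, penalty λ (`l1_penalty`) and learning rate are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 dimensions, only the first two carry signal.
n, d = 100, 5
X = rng.normal(size=(n, d))
true_w = np.array([3.0, -2.0, 0.0, 0.0, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=n)

l1_penalty = 10.0      # assumed value of the penalty weight (lambda)
learning_rate = 0.001
w = rng.normal(size=d)

for _ in range(5000):
    # Gradient of the squared error plus the L1 penalty term.
    gradient = 2.0 * X.T @ (X @ w - y) + l1_penalty * np.sign(w)
    w = w - learning_rate * gradient
```

With plain gradient descent the weights for the noise dimensions end up very small rather than exactly zero, because each step can overshoot zero slightly.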
L2 Regularisation (Ridge)
• Another way to reduce the complexity of our model and prevent overfitting
to outliers is L2 regularisation, which is also known as ridge regression.

• In L2 Regularisation we introduce an additional term to the cost function
that has the effect of penalising large weights and thereby minimising the
skew that outliers cause.
L2 Regularisation (Ridge)
• In L2 regularisation we add the sum of
the squares of the weights to the
error function.

• Expanding this we get: 

• Take the derivative with respect to
w to find our gradient:
L2 Regularisation (Ridge)
• Solving for the values of w that give
minimal error:
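
The solved equation is not in this transcript. A sketch, assuming E = (y − Xw)ᵀ(y − Xw) + λ·wᵀw: setting the gradient −2Xᵀ(y − Xw) + 2λw to zero gives w = (λI + XᵀX)⁻¹Xᵀy. The function name `fit_ridge`, the data and the value of λ are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, l2_penalty):
    """Solve (lambda*I + X^T X) w = X^T y under the assumed error function."""
    d = X.shape[1]
    return np.linalg.solve(l2_penalty * np.eye(d) + X.T @ X, X.T @ y)

# Hypothetical data whose two columns are identical, so X^T X alone is singular.
X = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

w = fit_ridge(X, y, l2_penalty=1.0)
```

Adding λI makes the matrix invertible for any λ > 0, and the penalty splits the weight evenly between the two identical columns.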
L1 & L2 Regularisation (Elastic Net)
• L1 Regularisation minimises the impact of dimensions that have low
weights and are thus largely “noise”.

• L2 Regularisation minimises the impact of outliers in our training data.

• L1 & L2 Regularisation can be used together and the combination is
referred to as Elastic Net regularisation.

• Because the derivative of the error function contains the sign function, which
has no inverse, we cannot solve for w and must use gradient descent.
Categorical Inputs
One-hot Encoding
• When some inputs are categories (e.g. gender) rather than numbers (e.g.
age) we need to represent the category values as numbers so they can be
used in our linear regression equations.

• In one-hot encoding we allocate each category value its own dimension in
the inputs. So, for example, we allocate X1 to Audi, X2 to BMW & X3 to
Mercedes.

• For Audi X = [1,0,0]

• For BMW X = [0,1,0]

• For Mercedes X = [0,0,1]
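
The encoding above can be sketched in a few lines of NumPy. The helper `one_hot` and the sample list of cars are hypothetical, and the category order follows the slide (Audi, BMW, Mercedes).

```python
import numpy as np

categories = ["Audi", "BMW", "Mercedes"]
index = {name: i for i, name in enumerate(categories)}

def one_hot(values):
    """Return one row per value, with a 1 in that category's column."""
    encoded = np.zeros((len(values), len(categories)))
    for row, value in enumerate(values):
        encoded[row, index[value]] = 1.0
    return encoded

X = one_hot(["Audi", "BMW", "Mercedes", "BMW"])
```

The resulting columns can be placed alongside numeric inputs such as age in the X matrix.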
Summary
• Single Dimension Linear Regression

• Multi Dimension Linear Regression

• Gradient Descent

• Generalisation, Over-fitting & Regularisation

• Categorical Inputs

Linear regression

  • 2. Agenda • Single Dimension Linear Regression • Multi Dimension Linear Regression • Gradient Descent • Generalisation, Over-fitting & Regularisation • Categorical Inputs
  • 3. What is Linear Regression? • Learning • A supervised algorithm that learns from a set of training samples. • Each training sample has one or more input values and a single output value. • The algorithm learns the line, plane or hyper-plane that best fits the training samples. • Prediction • Use the learned line, plane or hyper-plane to predict the output value for any input sample.
  • 5. Single Dimension Linear Regression • Single dimension linear regression has pairs of x and y values as input training samples. • It uses these training samples to derive a line that predicts values of y. • The training samples are used to derive the values of a and b that minimise the error between actual and predicted values of y.

  • 6. Single Dimension Linear Regression • We want a line that minimises the error between the y values in the training samples and the y values that the line passes through. • Or put another way, we want the line that “best fits” the training samples. • So we define the error function for our algorithm so we can minimise that error.
  • 7. Single Dimension Linear Regression • To determine the value of a that minimises the error E, we look for where the partial derivative of E with respect to a is zero.
  • 8. Single Dimension Linear Regression • To determine the value of b that minimises the error E, we look for where the partial derivative of E with respect to b is zero.
  • 9. Single Dimension Linear Regression • By substituting the final equations from the previous two slides we derive equations for a and b that minimise the error.
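The slide equations did not survive in this transcript, so the following is only a minimal sketch of the standard least-squares result, assuming the line is written as y = ax + b and the error E is the sum of squared differences between actual and predicted y. The function name `fit_line` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    mean_xx = sum(x * x for x in xs) / n

    # The denominator is the variance of x; it is zero when all x are equal.
    denom = mean_xx - mean_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical, so no unique line exists")

    a = (mean_xy - mean_x * mean_y) / denom  # slope
    b = mean_y - a * mean_x                  # intercept
    return a, b


# Hypothetical training samples that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
```

Setting the two partial derivatives of E to zero gives two linear equations in a and b; the expressions above are their solution written in terms of sample means.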
  • 10. Single Dimension Linear Regression • We also define a function which we can use to score how well the derived line fits. • A value of 1 indicates a perfect fit. • A value of 0 indicates a fit that is no better than simply predicting the mean of the training y values. • A negative value indicates a fit that is even worse than just predicting the mean of the training y values.
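The scoring function itself is missing from the transcript. The behaviour described (1 for a perfect fit, 0 for predicting the mean, negative for worse) matches the usual R² score, so this sketch assumes that definition. The name `r_squared` is illustrative.

```python
def r_squared(ys, predictions):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


ys = [1.0, 2.0, 3.0]
perfect = r_squared(ys, [1.0, 2.0, 3.0])      # predictions equal the targets
mean_only = r_squared(ys, [2.0, 2.0, 2.0])    # always predict the mean of ys
worse = r_squared(ys, [3.0, 2.0, 1.0])        # predictions run the wrong way
```

The three calls correspond to the three cases listed on the slide.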
  • 15. Multi Dimension Linear Regression • Each training sample has an x made up of multiple input values and a corresponding y with a single value. • The inputs can be represented as an X matrix in which each row is a sample and each column is a dimension. • The outputs can be represented as a y matrix in which each row is a sample.
  • 16. Multi Dimension Linear Regression • Our predicted y values are calculated by multiplying the X matrix by a matrix of weights, w. • If there are 2 dimensions, then this equation defines a plane. If there are more dimensions then it defines a hyper-plane.
  • 17. Multi Dimension Linear Regression • We want a plane or hyper-plane that minimises the error between the y values in the training samples and the y values that the plane or hyper-plane passes through. • Or put another way, we want the plane/hyper-plane that “best fits” the training samples. • So we define the error function for our algorithm so we can minimise that error.
  • 18. Multi Dimension Linear Regression • To determine the value of w that minimises the error E, we look for where the derivative of E with respect to w is zero. • We use the Matrix Cookbook to help with the differentiation!
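As a rough sketch of where that derivation leads, assuming the error is the sum of squared residuals (y − Xw)ᵀ(y − Xw): setting its derivative to zero gives the linear system (XᵀX)w = Xᵀy, which NumPy can solve directly. The data and the leading column of ones (used here to provide an intercept) are assumptions made for illustration.

```python
import numpy as np

# Hypothetical training data: 4 samples, 2 input dimensions.
# The first column of ones is an assumption that gives the model an intercept.
X = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 3.0],
])
true_w = np.array([1.0, 2.0, -1.0])
y = X @ true_w  # noise-free outputs, so the fit can be checked exactly

# Solve (X^T X) w = X^T y for w rather than forming an explicit inverse.
w = np.linalg.solve(X.T @ X, X.T @ y)

# Predicted y values are the X matrix multiplied by the weights.
predictions = X @ w
```

Solving the system with `np.linalg.solve` is generally preferred over computing an inverse and multiplying, since it is cheaper and numerically better behaved.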
  • 19. Multi Dimension Linear Regression • We also define a function which we can use to score how well the derived plane or hyper-plane fits. • A value of 1 indicates a perfect fit. • A value of 0 indicates a fit that is no better than simply predicting the mean of the training y values. • A negative value indicates a fit that is even worse than just predicting the mean of the training y values.
  • 22. Multi Dimension Linear Regression • In addition to using the X matrix to represent the basic features of our training data, we can also introduce additional dimensions (i.e. columns in our X matrix) that are derived from those basic feature values. • If we introduce derived features whose values are powers of basic features, our multi-dimensional linear regression can then derive polynomial curves, planes and hyper-planes.
  • 23. Multi Dimension Linear Regression • For example, if we have just one basic feature in each sample of X, we can include a range of powers of that value in our X matrix like this: • In non-matrix form our multi-dimensional linear equation is: • Inserting the powers of the basic feature that we have introduced, this becomes a polynomial:
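The matrix shown on the slide is not in this transcript. One plausible layout, assumed here, puts the powers x⁰, x¹, ..., xᵈ of the single basic feature in successive columns. The helper name `polynomial_features` and the data are hypothetical.

```python
import numpy as np


def polynomial_features(x, degree):
    """Return a matrix whose columns are x**0, x**1, ..., x**degree."""
    return np.column_stack([x ** p for p in range(degree + 1)])


# One basic feature per sample; outputs follow y = 1 + 2x + 3x^2 exactly.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x + 3.0 * x ** 2

X = polynomial_features(x, degree=2)

# The model is still linear in the weights, so the same solve applies.
w = np.linalg.solve(X.T @ X, X.T @ y)
```

The regression stays linear in w even though the fitted curve is a polynomial in x, which is why no new algorithm is needed.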
  • 27. Singular Matrices • As we have seen, we can use numpy’s linalg.solve() function to determine the value of the weights that result in the lowest possible error. • But this doesn’t work if np.dot(X.T, X) is a singular matrix. • It results in the matrix equivalent of a divide by zero. • Gradient descent is an alternative approach to determining the optimal weights that works in all cases, including this singular matrix case.
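One common way to end up with a singular np.dot(X.T, X) is to have two columns of X that carry the same information. The sketch below uses made-up data with a duplicated column and checks the rank rather than calling the solver, since `np.linalg.solve` typically raises `numpy.linalg.LinAlgError` on an exactly singular matrix.

```python
import numpy as np

# Hypothetical inputs in which the third column duplicates the second.
X = np.array([
    [1.0, 2.0, 2.0],
    [1.0, 3.0, 3.0],
    [1.0, 4.0, 4.0],
    [1.0, 5.0, 5.0],
])

gram = np.dot(X.T, X)

# A square matrix is singular when its rank is below its size.
rank = np.linalg.matrix_rank(gram)
is_singular = rank < gram.shape[0]
```

Here the rank is 2 for a 3 x 3 matrix, so no unique set of weights exists: any split of weight between the two identical columns gives the same predictions.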
  • 28. Gradient Descent • Gradient descent is a technique we can use to find the minimum of arbitrarily complex error functions. • In gradient descent we pick a random set of weights for our algorithm and iteratively adjust those weights in the direction opposite to the gradient of the error with respect to each weight. • As we iterate, the gradient approaches zero and we approach the minimum error. • In machine learning we often use gradient descent with our error function to find the weights that give the lowest errors.
  • 29. Gradient Descent • Here is an example with a very simple function: • The gradient of this function is given by: • We choose a random initial value for x and a learning rate of 0.1 and then start the descent. • On each iteration our x value moves closer to the minimum and the gradient (2x) converges towards 0.
  • 30. Gradient Descent • The learning rate is what is known as a hyper-parameter. • If the learning rate is too small then convergence may take a very long time. • If the learning rate is too large then convergence may never happen because our iterations bounce from one side of the minimum to the other. • Choosing a suitable value for hyper-parameters is an art, so try different values and plot the results until you find suitable values.
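The function on the slide is missing from the transcript. Since its gradient is stated as 2x, this sketch assumes f(x) = x². The starting value of 5.0 is fixed rather than random so the runs are repeatable, and the three learning rates are illustrative choices.

```python
def descend(x0, learning_rate, iterations):
    """Run gradient descent on f(x) = x**2 and return every x visited."""
    x = x0
    history = [x]
    for _ in range(iterations):
        gradient = 2.0 * x                 # derivative of x**2
        x = x - learning_rate * gradient   # step against the gradient
        history.append(x)
    return history


# Learning rate 0.1 (as on the slide): each step multiplies x by 0.8.
good = descend(5.0, 0.1, 50)

# Much smaller rate: x barely moves in the same number of steps.
slow = descend(5.0, 0.001, 50)

# Too large: each step overshoots the minimum and lands further away.
unstable = descend(5.0, 1.1, 50)
```

Plotting the three histories against the iteration number is a quick way to see convergence, slow progress and divergence side by side.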
  • 31. Multi Dimension Linear Regression with Gradient Descent • For multi dimension linear regression our error function is: • Differentiating this with respect to the weights vector gives: • We can iteratively reduce the error by adjusting the weights in the direction opposite to these gradients.
  • 32. Multi Dimension Linear Regression with Gradient Descent
  • 33. Multi Dimension Linear Regression with Gradient Descent
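The error function and gradient from these slides are not in the transcript. A minimal sketch, assuming the mean squared error with gradient (2/n)·Xᵀ(Xw − y): the function name, the learning rate, the iteration count and the zero starting weights (used instead of random ones for repeatability) are all assumptions.

```python
import numpy as np


def fit_gradient_descent(X, y, learning_rate=0.05, iterations=2000):
    """Fit linear-regression weights by gradient descent on the mean squared error."""
    n_samples, n_dims = X.shape
    w = np.zeros(n_dims)  # assumption: start at zero rather than at random weights
    for _ in range(iterations):
        # Gradient of mean((Xw - y)^2) with respect to w.
        gradient = (2.0 / n_samples) * X.T @ (X @ w - y)
        w -= learning_rate * gradient  # step against the gradient
    return w


# Hypothetical data on the line y = 1 + 2x, with a column of ones for the intercept.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w = fit_gradient_descent(X, y)
```

Dividing the gradient by the number of samples keeps a workable learning rate roughly independent of the size of the training set.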
  • 35. Generalisation & Over-fitting • As we fit our model more and more closely to the training data, it may become more accurate on that data but worse at handling test data that we feed to it later. • This is known as “over-fitting” and results in an increased generalisation error. • To minimise the generalisation error we should • Collect as much sample data as possible. • Use a random subset of our sample data for training. • Use the remaining sample data to test how well our model copes with data it was not trained with. • Also, experiment with different polynomial degrees (X², X³, etc.): higher degrees fit the training data more closely but tend to increase over-fitting, so compare them on the test data.
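A random train/test split can be sketched as follows. The synthetic data, the 80/20 ratio and the fixed seed are all assumptions chosen for illustration, and the score is the usual R² definition.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the split is repeatable

# Hypothetical sample data: y = 3x + 2 plus a little noise.
n_samples = 100
x = rng.uniform(0.0, 10.0, size=n_samples)
y = 3.0 * x + 2.0 + rng.normal(0.0, 0.1, size=n_samples)
X = np.column_stack([np.ones(n_samples), x])

# Shuffle the sample indices, then use 80% for training and 20% for testing.
indices = rng.permutation(n_samples)
n_train = int(0.8 * n_samples)
train_idx, test_idx = indices[:n_train], indices[n_train:]

# Fit on the training subset only.
w = np.linalg.solve(X[train_idx].T @ X[train_idx], X[train_idx].T @ y[train_idx])


def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


train_score = r_squared(y[train_idx], X[train_idx] @ w)
test_score = r_squared(y[test_idx], X[test_idx] @ w)
```

A training score that is much higher than the test score is the usual sign of over-fitting.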
  • 36. L1 Regularisation (Lasso) • Having a large number of samples (n) relative to the number of dimensions (d) improves the quality of our model. • One way to reduce the effective number of dimensions is to use those that contribute most to the signal and ignore those that mostly act as noise. • L1 regularisation achieves this by adding a penalty that drives the weights of the dimensions that act as noise to 0. • L1 regularisation encourages a sparse vector of weights in which few are non-zero and many are zero.
  • 37. L1 Regularisation (Lasso) • In L1 regularisation we add a penalty to the error function: • Expanding this we get: • Take the derivative with respect to w to find our gradient: • Where sign(w) is -1 if w < 0, 0 if w = 0 and +1 if w > 0 • Note that because sign(w) has no inverse function we cannot solve for w and so must use gradient descent.
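As a hedged illustration of the gradient described here, the sketch below assumes a mean-squared-error term plus a penalty of λ times the sum of absolute weights, giving the gradient (2/n)·Xᵀ(Xw − y) + λ·sign(w). The data, λ = 0.5 and the learning rate are made-up values. Plain gradient steps of this kind leave the noise weights hovering very close to zero rather than exactly at zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: only the first of three dimensions carries signal.
n_samples = 100
X = rng.normal(size=(n_samples, 3))
y = 4.0 * X[:, 0] + rng.normal(0.0, 0.1, size=n_samples)

lam = 0.5             # assumed strength of the L1 penalty
learning_rate = 0.01  # assumed step size
w = np.zeros(3)

for _ in range(2000):
    # Squared-error gradient plus the L1 term lam * sign(w).
    gradient = (2.0 / n_samples) * X.T @ (X @ w - y) + lam * np.sign(w)
    w -= learning_rate * gradient
```

After the loop the first weight stays large while the other two are shrunk towards zero, which is the sparsity effect described on the previous slide.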
  • 40. L2 Regularisation (Ridge) • Another way to reduce the complexity of our model and prevent overfitting to outliers is L2 regularisation, which is also known as ridge regression. • In L2 regularisation we introduce an additional term to the cost function that has the effect of penalising large weights and thereby reducing the skew that outliers cause.
  • 41. L2 Regularisation (Ridge) • In L2 regularisation we add the sum of the squares of the weights to the error function. • Expanding this we get: • Take the derivative with respect to w to find our gradient:
  • 42. L2 Regularisation (Ridge) • Solving for the values of w that give minimal error:
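The solved equation is not in the transcript. Assuming the error is (y − Xw)ᵀ(y − Xw) + λ·wᵀw, setting the gradient to zero gives (λI + XᵀX)w = Xᵀy, which this sketch solves with NumPy. The data and λ = 1 are illustrative.

```python
import numpy as np

# Hypothetical inputs with two identical columns, so X^T X alone is singular.
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

lam = 1.0  # assumed strength of the L2 penalty
n_dims = X.shape[1]

# Solve (lam * I + X^T X) w = X^T y.
w = np.linalg.solve(lam * np.eye(n_dims) + X.T @ X, X.T @ y)
```

Adding λI makes the system solvable even though XᵀX on its own is singular here, and the penalty splits the weight evenly between the two identical columns.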
  • 45. L1 & L2 Regularisation (Elastic Net) • L1 Regularisation minimises the impact of dimensions that have low weights and are thus largely “noise”. • L2 Regularisation reduces the impact of outliers in our training data. • L1 & L2 Regularisation can be used together and the combination is referred to as Elastic Net regularisation. • Because the derivative of the error function contains the sign function (from the L1 term), which has no inverse, we cannot solve for w and must use gradient descent.
  • 47. One-hot Encoding • When some inputs are categories (e.g. gender) rather than numbers (e.g. age) we need to represent the category values as numbers so they can be used in our linear regression equations. • In one-hot encoding we allocate each category value its own dimension in the inputs. So, for example, we allocate X1 to Audi, X2 to BMW & X3 to Mercedes. • For Audi X = [1,0,0] • For BMW X = [0,1,0] • For Mercedes X = [0,0,1]
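The car example can be sketched in a few lines. The helper name `one_hot` is hypothetical, and the order of the category list decides which dimension each value gets.

```python
def one_hot(values, categories):
    """Encode each value as a list with a 1 in its category's position."""
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # raises KeyError for an unknown category
        rows.append(row)
    return rows


categories = ["Audi", "BMW", "Mercedes"]
encoded = one_hot(["Audi", "BMW", "Mercedes", "BMW"], categories)
```

Each encoded row can then be placed alongside the numeric columns of the X matrix.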
  • 48. Summary • Single Dimension Linear Regression • Multi Dimension Linear Regression • Gradient Descent • Generalisation, Over-fitting & Regularisation • Categorical Inputs