Presentation on Machine Learning with
Scikit-Learn
Sanjay Nayak
IKST-Bangalore
Traditional Programming Vs. Machine Learning
Courtesy: Internet
Workflow of Machine Learning
Courtesy: Internet
Machine Learning Algorithms
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning
Regression/Classification
I. Linear Regression
II. Logistic Regression
III. Decision Tree
IV. SVM
V. Naive Bayes
VI. kNN
VII. K-Means
VIII. Random Forest
IX. Dimensionality Reduction Algorithms
X. Gradient Boosting algorithms
XI. Artificial Neural Network
Linear Regression
• Regression: a statistical technique for estimating the relationships among variables
$y = X\beta + \varepsilon$
• X is the feature matrix (a tensor in ML; in our work mostly a multidimensional matrix); each row is a feature vector
• y is the target, i.e. what we want to predict (e.g. adsorption energy, barrier height, band gap,
dielectric loss, etc.)
• β holds the coefficient(s)
• ε is the error in prediction
• The goal is to find the β for which ε is minimal
• X and y are multidimensional: the solution is least squares
(Ordinary) Least Squares Solution
• When the number of unknowns is not equal to the number of equations, there is in general no unique exact solution
• Approximation in solution: least squares
• Least squares: the overall solution minimizes the sum of the squares of the residuals made
in the results of every single equation (Source: Wikipedia)
• If the number of equations is larger than the number of unknowns, the solution for β is
$\beta = (X^T X)^{-1} X^T y$
• If the number of equations is smaller than or equal to the number of unknowns (minimum-norm solution):
$\beta = X^T (X X^T)^{-1} y$
• The solution is valid only if the inverse matrix exists (collinearity?)
Minimization of the residual sum of squares: $\mathrm{RSS} = \|y - X\beta\|^2$
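The two closed-form expressions can be checked numerically. The sketch below is only illustrative: the synthetic data, the variable names, and the use of NumPy are assumptions, not part of the original slides. It solves the normal equations for a tall system and compares the result with `np.linalg.lstsq`, then evaluates the minimum-norm formula for a wide system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall system: more equations (rows) than unknowns (columns).
n_samples, n_features = 50, 3
X = rng.normal(size=(n_samples, n_features))
beta_true = np.array([1.5, -2.0, 0.5])          # hypothetical coefficients
y = X @ beta_true + 0.01 * rng.normal(size=n_samples)

# beta = (X^T X)^{-1} X^T y, computed with solve() instead of an explicit inverse.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Reference solution from NumPy's least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residual sum of squares ||y - X beta||^2 at the solution.
rss = np.sum((y - X @ beta_normal) ** 2)

# Wide system: fewer equations than unknowns, minimum-norm solution
# beta = X^T (X X^T)^{-1} y.
X_wide = rng.normal(size=(2, 3))
y_wide = rng.normal(size=2)
beta_min_norm = X_wide.T @ np.linalg.solve(X_wide @ X_wide.T, y_wide)

print(beta_normal, rss)
print(beta_min_norm)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a common numerical choice; it does not change the formula being evaluated.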
Collinearity in Matrix
1. Least squares requires the inversion of a matrix
2. If the columns of the matrix are collinear (linearly dependent), the determinant is zero and the inverse does not exist
Solutions?
Remove the collinearity
1. Look at the correlation coefficient between features
and remove the features which are highly correlated
2. Add a penalty term to the least-squares problem (Lasso, Ridge, etc.)
Pearson's correlation coefficient
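Pearson's correlation coefficient between two features x and y is $r = \mathrm{cov}(x, y) / (\sigma_x \sigma_y)$. A minimal sketch of the feature-screening step follows; the synthetic features, the 0.95 threshold, and the simple "drop the later feature of each correlated pair" rule are assumptions chosen for illustration, not a rule from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Three hypothetical features; f1 is almost a multiple of f0.
f0 = rng.normal(size=n)
f1 = 2.0 * f0 + 0.01 * rng.normal(size=n)
f2 = rng.normal(size=n)
X = np.column_stack([f0, f1, f2])

# Pearson correlation matrix between the columns (features) of X.
corr = np.corrcoef(X, rowvar=False)

threshold = 0.95  # assumed cut-off for "highly correlated"
to_drop = set()
for i in range(corr.shape[0]):
    for j in range(i + 1, corr.shape[1]):
        # Keep the earlier feature of a correlated pair, drop the later one.
        if abs(corr[i, j]) > threshold and i not in to_drop:
            to_drop.add(j)

keep = [k for k in range(X.shape[1]) if k not in to_drop]
X_reduced = X[:, keep]

print(np.round(corr, 3))
print("kept feature indices:", keep)
```

Which member of a correlated pair to keep is a judgment call; in practice the physically more interpretable descriptor is usually the better choice.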
Python Code
Scikit-Learn Library
from sklearn.linear_model import LinearRegression
model = LinearRegression(fit_intercept=True)
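For context, a fitted model is used through `fit`, `predict`, and `score`. The sketch below assumes small synthetic data with made-up coefficients; only `LinearRegression(fit_intercept=True)` comes from the slide.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)

# Hypothetical data: y = 3 + 1.0*x0 - 0.5*x1 + small noise.
X = rng.normal(size=(40, 2))
y = 3.0 + X @ np.array([1.0, -0.5]) + 0.01 * rng.normal(size=40)

model = LinearRegression(fit_intercept=True)
model.fit(X, y)                 # ordinary least-squares fit

y_pred = model.predict(X)       # predictions on the training features
r2 = model.score(X, y)          # coefficient of determination R^2

print(model.intercept_, model.coef_, r2)
```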
What if the features available to us are highly collinear,
i.e. there are not enough features to eliminate some of them?
Partial Least Squares (PLS) Solution
• Find new latent variables from the old features by a procedure related to principal component analysis (PCA)
• PCA: find an orthonormal matrix P with $U = PX$ such that $\frac{1}{n-1} U U^T$ is diagonal
• Rows of P are the principal components of X
• The new variables are chosen to simultaneously satisfy three conditions:
1. They are highly correlated with the dependent variables
2. They model as much of the variance among the independent
variables as possible (maximum signal-to-noise ratio)
3. They are uncorrelated with each other (minimizes the number of variables)
Disadvantages: latent variables are abstract and difficult to interpret
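The PCA condition can be illustrated with a few lines of NumPy. This is a sketch under stated assumptions: X is stored with features as rows and the n samples as columns (so that $\frac{1}{n-1} X X^T$ is the feature covariance), the data are synthetic, and P is built from the eigenvectors of that covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # number of samples (columns of X)

# Three hypothetical features (rows); the second is nearly collinear with the first.
base = rng.normal(size=n)
X = np.vstack([base,
               2.0 * base + 0.1 * rng.normal(size=n),
               rng.normal(size=n)])
X = X - X.mean(axis=1, keepdims=True)   # center each feature

cov = X @ X.T / (n - 1)                 # feature covariance matrix

# Eigen-decomposition of the symmetric covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]       # sort by decreasing variance
P = eigvecs[:, order].T                 # rows of P are the principal components

U = P @ X                               # data expressed in the new variables
cov_U = U @ U.T / (n - 1)               # diagonal up to round-off

print(np.round(cov_U, 6))
```

The diagonal entries of `cov_U` are the variances captured by each component; the near-zero last entry reflects the collinear pair of features.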
Scikit-Learn Library
from sklearn.cross_decomposition import PLSRegression
model = PLSRegression(n_components=5)
Optimization of n_components is required!
Ridge Regression (L2 regularization)
1. Developed to overcome the collinearity problem
2. Adds a penalty term to the least-squares loss function
3. Ridge function:
$L_{\mathrm{ridge}}(\beta, \lambda) = \|y - X\beta\|^2 + \lambda \|\beta\|^2$
4. The solution for β is
$\beta = (X^T X + \lambda I_{pp})^{-1} X^T y$
5. In practice we have to optimize λ (a hyperparameter)
Scikit-Learn Library
from sklearn.linear_model import Ridge
model = Ridge(alpha=0.0000001, max_iter=10000, tol=0.001)
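The closed-form ridge solution can be compared with scikit-learn, whose `alpha` parameter plays the role of λ. The sketch below assumes synthetic, nearly collinear features, no intercept (`fit_intercept=False`) so that the two expressions match exactly, and an arbitrary λ = 1.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)
n = 60

# Hypothetical features; the first two columns are almost identical.
f0 = rng.normal(size=n)
X = np.column_stack([f0, f0 + 1e-3 * rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 1.0, -1.0]) + 0.1 * rng.normal(size=n)

lam = 1.0                      # assumed regularization strength
p = X.shape[1]

# beta = (X^T X + lambda I)^{-1} X^T y
beta_closed = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Same problem through scikit-learn (alpha corresponds to lambda).
model = Ridge(alpha=lam, fit_intercept=False, solver="cholesky")
model.fit(X, y)

print(beta_closed)
print(model.coef_)
```

Adding λ to the diagonal keeps the matrix invertible even when two columns of X are nearly identical, which is why ridge tolerates collinearity.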
Lasso Regression (L1 regularization)
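• The difference between Lasso and Ridge is the nature of the penalty term
• Lasso function:
$L_{\mathrm{Lasso}}(\beta, \lambda) = \|y - X\beta\|^2 + \lambda \|\beta\|_1$
• For an orthonormal design ($X^T X = I$), the solution for β is the soft-thresholded least-squares estimate
(in general there is no closed form and β is found iteratively):
$\beta_i = \mathrm{sgn}(\beta_i^{LS}) \left(|\beta_i^{LS}| - \lambda/2\right)_+$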
Scikit-Learn Library
from sklearn.linear_model import Lasso
model = Lasso(alpha=0.00001,max_iter=100000)
Signum function (sgn) for a real number x: $\mathrm{sgn}(x) = -1$ if $x < 0$, $0$ if $x = 0$, $+1$ if $x > 0$
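The soft-thresholding form of the Lasso solution can be checked on a design with orthonormal columns. The sketch below rests on several assumptions: synthetic data, no intercept, and the loss written without a factor 1/2, for which the threshold works out to λ/2. scikit-learn's `Lasso` minimizes $\frac{1}{2n}\|y - Xw\|^2 + \alpha\|w\|_1$, so the matching value is `alpha = lam / (2 * n)`.

```python
import numpy as np
from sklearn.linear_model import Lasso


def soft_threshold(b, t):
    """Shrink each entry of b toward zero by t; entries smaller than t become 0."""
    return np.sign(b) * np.maximum(np.abs(b) - t, 0.0)


rng = np.random.default_rng(0)

# Design matrix with orthonormal columns (X^T X = I), obtained from a QR factorization.
X, _ = np.linalg.qr(rng.normal(size=(50, 4)))
beta_true = np.array([3.0, 0.0, -2.0, 0.05])    # hypothetical coefficients
y = X @ beta_true + 0.01 * rng.normal(size=50)

beta_ls = X.T @ y            # least-squares estimate when X^T X = I
lam = 0.5                    # assumed penalty strength
beta_lasso = soft_threshold(beta_ls, lam / 2)

# Same problem through scikit-learn, with alpha rescaled to its objective.
n = X.shape[0]
model = Lasso(alpha=lam / (2 * n), fit_intercept=False, max_iter=100000, tol=1e-8)
model.fit(X, y)

print(beta_lasso)
print(model.coef_)
```

Coefficients whose least-squares estimate is smaller in magnitude than the threshold are set exactly to zero, which is how Lasso performs feature selection.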
Preprocessing of Data
• Prior to constructing any ML model, the data need to be preprocessed (most of the time goes here)
• All NaN data should be removed
• Normalize the features (not the target)
• Creating the feature vector is essentially our job (the difference between ML in other fields and in materials
science)
• Domain expertise is extremely important (using elemental properties does not always work)
• Structural and chemical descriptors are needed for better precision
• We should look for a minimal but effective set of features
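The NaN removal and feature normalization steps might look as follows. This is a minimal sketch with a made-up four-sample data set; `StandardScaler` (zero mean, unit variance per feature) is one possible choice of normalization, and the target is left unscaled as the slide recommends.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical raw data: two features per sample, one missing value.
X = np.array([[1.0, 200.0],
              [2.0, np.nan],
              [3.0, 180.0],
              [4.0, 220.0]])
y = np.array([0.1, 0.2, 0.3, 0.4])

# Drop every sample that has a NaN in any feature or in the target.
mask = ~np.isnan(X).any(axis=1) & ~np.isnan(y)
X_clean, y_clean = X[mask], y[mask]

# Normalize the features only; y_clean is left as it is.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_clean)

print(X_scaled)
print(y_clean)
```

When a separate test set is used, the scaler would normally be fitted on the training features only and then applied to the test features with `scaler.transform`.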
Concept of Overfitting
• Overfitting is a modeling error which occurs when a function is fit too closely to a
limited set of data points
• Limited amount of data: high probability of overfitting (our case; we should be very careful)
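Overfitting is usually detected by comparing the score on the training data with the score on held-out data. The sketch below assumes a small synthetic data set (a noisy sine curve, 20 points) and polynomial models of increasing degree; the specific degrees and the 60/40 split are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(5)

# Hypothetical small data set: noisy samples of a sine curve.
x = rng.uniform(0.0, 1.0, size=20)
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=20)
X = x.reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# Fit polynomials of increasing flexibility and record (train R^2, test R^2).
results = {}
for degree in (1, 3, 9):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    results[degree] = (model.score(X_train, y_train),
                       model.score(X_test, y_test))

for degree, (r2_train, r2_test) in results.items():
    print(f"degree {degree}: train R^2 = {r2_train:.3f}, test R^2 = {r2_test:.3f}")
```

The training score can only improve as the degree grows, so a training score that keeps rising while the test score stalls or drops is the signature of overfitting.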
Conclusions
• Basic overview of machine learning
• Briefly discussed least-squares regression
• Issues of collinearity
• Discussed PLS, Lasso, and Ridge regression
• A short discussion of data preprocessing
• Presented a discussion of overfitting of models


Presentation on machine learning

  • 1. Presentation on Machine Learning with Scikit-Learn Sanjay Nayak IKST-Bangalore
  • 2. Traditional Programming Vs. Machine Learning Courtesy: Internet
  • 3. Workflow of Machine Learning Courtesy: Internet
  • 4. Machine Learning Algorithms 1. Supervised Learning 2. Unsupervised Learning 3. Reinforcement Learning Regression/Classification I. Linear Regression II. Logistic Regression III. Decision Tree IV. SVM V. Naive Bayes VI. kNN VII. K-Means VIII. Random Forest IX. Dimensionality Reduction Algorithms X. Gradient Boosting algorithms XI. Artificial Neural Network
  • 5. Linear Regression • Regression: a statistical technique for estimating the relationships among variables y = X.β +ε • X is a tensor in ML (in our work mostly a multidimensional matrix) called feature vector • y is the target (what we want to predict? e.g. adsorption energy, barrier height, bandgap, dielectric loss etc.) • β is/are the coefficient(s) • ε is the error in prediction • Goal is to find β for which ε is minimum • X and y are multidimensional: Solution is Least square
  • 6. (Ordinary) Least Squares Solution • When the number of variables is not equal to the number of equations, there is in general no unique exact solution • Approximate solution: least squares • Least squares: the overall solution minimizes the sum of the squares of the residuals made in the results of every single equation (Source: Wikipedia) • If the number of equations is larger than the number of unknown variables, the solution for β is $\beta = (X^T X)^{-1} X^T y$ • If the number of equations is smaller than or equal to the number of unknown variables: $\beta = X^T (X X^T)^{-1} y$ • The solution is valid only if the inverse matrix exists (collinearity?) • Least squares minimizes the residual sum of squares, $\mathrm{RSS} = \lVert y - X\beta \rVert^2$
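The two closed-form expressions on the slide above can be checked numerically. The following is a minimal sketch with NumPy, assuming small synthetic data (the arrays and their sizes are made up for illustration and are not from the presentation); `np.linalg.lstsq` serves as the reference solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical overdetermined system: 6 equations (samples), 2 unknowns.
X = rng.normal(size=(6, 2))
y = X @ np.array([2.0, -1.0]) + 0.01 * rng.normal(size=6)

# More equations than unknowns: beta = (X^T X)^-1 X^T y.
# np.linalg.solve is used instead of forming the inverse explicitly.
beta_over = np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical underdetermined system: 2 equations, 5 unknowns.
X_wide = rng.normal(size=(2, 5))
y_wide = rng.normal(size=2)

# Fewer equations than unknowns: beta = X^T (X X^T)^-1 y (minimum-norm solution).
beta_under = X_wide.T @ np.linalg.solve(X_wide @ X_wide.T, y_wide)

# Reference solutions from NumPy's least squares routine.
beta_over_ref, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_under_ref, *_ = np.linalg.lstsq(X_wide, y_wide, rcond=None)
```

In the underdetermined case the formula picks, among the infinitely many exact solutions, the one with the smallest norm, which is also what `np.linalg.lstsq` returns.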
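  • 7. Collinearity in Matrix 1. The least squares solution requires the inversion of a matrix 2. If the matrix is collinear, its determinant is zero and the inverse does not exist. Solutions? Remove the collinearity: 1. Check the correlation coefficient between features and remove the features that are highly correlated 2. Add a penalty term to the least squares problem (Lasso, Ridge, etc.). Pearson's correlation coefficient: $r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$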
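One possible way to carry out the first remedy on the slide above (removing highly correlated features) is sketched below. The helper `drop_correlated_features`, the threshold of 0.95, and the synthetic data are assumptions made for illustration; the presentation does not specify a procedure or a cutoff.

```python
import numpy as np

def drop_correlated_features(X, threshold=0.95):
    """Return the indices of the columns kept after a greedy correlation filter."""
    # Absolute Pearson correlation between columns (rowvar=False: columns are features).
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        # Keep column j only if it is not strongly correlated with an already kept column.
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
# The third column is almost an exact multiple of the first one: collinear features.
X = np.column_stack([a, b, 2.0 * a + 1e-6 * rng.normal(size=100)])

# A huge condition number signals that X^T X is close to singular.
condition_before = np.linalg.cond(X.T @ X)

kept = drop_correlated_features(X)
X_reduced = X[:, kept]
condition_after = np.linalg.cond(X_reduced.T @ X_reduced)
```

The filter is greedy: it keeps the first feature of each correlated group, so the order of the columns decides which one survives.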
  • 8. Python Code (Scikit-Learn Library): from sklearn.linear_model import LinearRegression; model = LinearRegression(fit_intercept=True) What if the features available to us are highly collinear, i.e. there are not enough features to eliminate any of them?
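The slide above only constructs the model. A minimal usage sketch follows, assuming noise-free synthetic data generated from known coefficients (the data are hypothetical and chosen so that the fit can be checked by eye):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: y = 3*x0 - 2*x1 + 5, without noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0

model = LinearRegression(fit_intercept=True)
model.fit(X, y)

coefficients = model.coef_        # estimated beta, one entry per feature
intercept = model.intercept_      # estimated constant term
predictions = model.predict(X[:5])
```

With `fit_intercept=True` the constant term is estimated separately, so no column of ones has to be added to X.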
  • 9. Partial Least Squares (PLS) Solution • Find new latent variables from the old features, in a way similar to principal component analysis (PCA) • PCA: find an orthonormal matrix P with $U = PX$ such that $\frac{1}{n-1} U U^T$ is diagonal • The rows of P are the principal components of X • The new variables are chosen to simultaneously satisfy three conditions: 1. They are highly correlated with the dependent variables 2. They model as much of the variance among the independent variables as possible (maximum signal-to-noise ratio) 3. They are uncorrelated with each other (which minimizes the number of variables). Disadvantage: latent variables are abstract and difficult to interpret. Scikit-Learn Library: from sklearn.cross_decomposition import PLSRegression; model = PLSRegression(n_components=5) Optimization of n_components is required!
  • 10. Ridge Regression (L2 regularization) 1. Developed to overcome the collinearity problem 2. Adds a penalty term to the least squares loss function, which regularizes the matrix to be inverted 3. Ridge function: $L_{\mathrm{ridge}}(\beta, \lambda) = \lVert y - X\beta \rVert^2 + \lambda \lVert \beta \rVert^2$ 4. The solution for β is $\beta = (X^T X + \lambda I_{pp})^{-1} X^T y$, where $I_{pp}$ is the p × p identity matrix 5. In practice we have to optimize λ (a hyperparameter; alpha in scikit-learn). Scikit-Learn Library: from sklearn.linear_model import Ridge; model = Ridge(alpha=0.0000001, max_iter=10000, tol=0.001)
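The closed-form ridge solution on the slide above can be compared with scikit-learn's `Ridge`. This is a sketch under stated assumptions: the two nearly collinear features and the value λ = 1 are invented for illustration, and `fit_intercept=False` is used so that the library minimizes the same loss as the closed form.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical data with two nearly identical (collinear) features.
x0 = rng.normal(size=40)
X = np.column_stack([x0, x0 + 1e-3 * rng.normal(size=40)])
y = x0 + 0.1 * rng.normal(size=40)
lam = 1.0  # assumed value of the hyperparameter lambda

# Closed form: beta = (X^T X + lambda * I)^-1 X^T y.
p = X.shape[1]
beta_closed = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Same problem with scikit-learn; alpha plays the role of lambda.
model = Ridge(alpha=lam, fit_intercept=False)
model.fit(X, y)
beta_sklearn = model.coef_

# The penalty improves the conditioning of the matrix that has to be inverted.
cond_plain = np.linalg.cond(X.T @ X)
cond_ridge = np.linalg.cond(X.T @ X + lam * np.eye(p))
```

Adding λ to the diagonal shifts every eigenvalue of XᵀX up by λ, which is why the regularized matrix stays invertible even for collinear features.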
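  • 11. Lasso Regression (L1 regularization) • The difference between Lasso and Ridge is the nature of the penalty term in the loss function • Lasso function: $L_{\mathrm{Lasso}}(\beta, \lambda) = \lVert y - X\beta \rVert^2 + \lambda \lVert \beta \rVert_1$ • For orthonormal features, the solution for β is $\beta_i = \mathrm{sgn}(\beta_i^{LS})\,(\lvert \beta_i^{LS} \rvert - \lambda/2)_+$, where $\beta^{LS}$ is the least squares solution, sgn is the signum function for real numbers, and $(x)_+ = \max(x, 0)$. Scikit-Learn Library: from sklearn.linear_model import Lasso; model = Lasso(alpha=0.00001, max_iter=100000)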
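The soft-thresholding solution on the slide above holds when the columns of X are orthonormal. The sketch below assumes such a design (built with a QR decomposition from random numbers) and a made-up λ. With the loss written as ‖y − Xβ‖² + λ‖β‖₁ the threshold works out to λ/2; a threshold of λ corresponds to a loss with a factor 1/2 in front of the squared term.

```python
import numpy as np

def soft_threshold(beta_ls, threshold):
    """Elementwise sgn(b) * max(|b| - threshold, 0)."""
    return np.sign(beta_ls) * np.maximum(np.abs(beta_ls) - threshold, 0.0)

def lasso_loss(X, y, beta, lam):
    """Lasso loss ||y - X beta||^2 + lam * ||beta||_1."""
    return np.sum((y - X @ beta) ** 2) + lam * np.sum(np.abs(beta))

rng = np.random.default_rng(0)

# Hypothetical design matrix with orthonormal columns (Q factor of a QR decomposition).
X, _ = np.linalg.qr(rng.normal(size=(30, 4)))
y = X @ np.array([3.0, 0.2, -1.5, 0.0]) + 0.05 * rng.normal(size=30)
lam = 1.0  # assumed value of the hyperparameter lambda

# For orthonormal columns the least squares solution is simply X^T y.
beta_ls = X.T @ y

# Lasso solution: shrink every coefficient towards zero by lam / 2.
beta_lasso = soft_threshold(beta_ls, lam / 2.0)
```

Coefficients whose least squares value is smaller in magnitude than the threshold are set exactly to zero, which is how Lasso performs feature selection. Note that scikit-learn's `Lasso` scales the squared term by 1/(2·n_samples), so its `alpha` is not numerically identical to the λ used here.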
  • 12. Preprocessing of Data • Before constructing any ML model, the data need to be preprocessed (most of the time goes here) • All NaN data should be removed • Normalize the features (not the target) • Creating the feature vector is essentially our job (the difference between ML in other fields and in materials science) • Domain expertise is extremely important (using elemental properties does not always work) • Structural and chemical descriptors are needed for better precision • We should look for a minimal number of effective features
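Two of the preprocessing steps on the slide above, removing NaN entries and normalizing the features but not the target, can be sketched with NumPy as follows. The small arrays are invented for illustration, and standardization to zero mean and unit variance is assumed as the normalization scheme, since the slide does not name one.

```python
import numpy as np

# Hypothetical raw data: 4 samples, 2 features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, np.nan],
              [3.0, 400.0],
              [4.0, 600.0]])
y = np.array([0.5, 0.7, 0.9, 1.3])

# Remove every sample that has a NaN in its features or in its target.
mask = ~np.isnan(X).any(axis=1) & ~np.isnan(y)
X_clean, y_clean = X[mask], y[mask]

# Standardize each feature column; the target is left untouched.
mean = X_clean.mean(axis=0)
std = X_clean.std(axis=0)
X_scaled = (X_clean - mean) / std
```

When the data are split into training and test sets, `mean` and `std` should be computed on the training set only and then reused for the test set.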
  • 13. Concept of Overfitting • Overfitting is a modeling error that occurs when a function is fit too closely to a limited set of data points • A limited amount of data means a high probability of overfitting (this is our case, so we should be very careful)
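Overfitting as described on the slide above can be made visible by comparing the error on the training points with the error on unseen points. The sketch below assumes a made-up noisy sine curve, only 10 training points, and polynomial models of degree 1, 3, and 9; these choices are illustrative and not taken from the presentation.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Hypothetical noisy samples of sin(3x) on [-1, 1]."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(3.0 * x) + 0.2 * rng.normal(size=n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

x_train, y_train = make_data(10)    # limited data set
x_test, y_test = make_data(200)     # unseen data from the same source

# (training error, test error) for each polynomial degree.
errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
```

A degree-9 polynomial has as many coefficients as there are training points, so it can pass through all of them; its training error is then close to zero while its test error is typically much larger, which is the signature of overfitting.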
  • 14. Conclusions • Basic overview of machine learning • Briefly discussed least squares regression • Issues of collinearity • Discussed PLS, Lasso, and Ridge regression • A short discussion of data preprocessing • Presented a discussion of model overfitting