Unit-1: Introduction and Mathematical Preliminaries
Pattern Recognition/Classification
• Assign an object or an event (pattern) to one of several known
categories (or classes).
Example annotations: given 100 objects with properties such as height, width, and the purpose of the object, assign each one to Category “A” or Category “B”.
Another example: will it rain on 15 Jan 2022? 1. Take previous years’ data; 2. build a model from it; 3. predict the answer.
Classification vs Clustering
Classification (known categories): supervised classification, e.g., separating Category “A” from Category “B” using labelled examples.
Clustering (unknown categories): unsupervised classification.
A labelled dataset can be viewed as a table with feature columns X1, X2, ..., Xn and a Label column.
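A minimal Python sketch of the contrast, assuming scikit-learn is available; the two-dimensional data, the class means, and the query point are all made up for illustration:

```python
# Same data, two views: classification uses the labels, clustering does not.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic categories "A" and "B" in a 2-D feature space.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)            # known labels -> classification

clf = LogisticRegression().fit(X, y)          # supervised: learns from labels
print("classifier prediction:", clf.predict([[3.5, 3.5]]))

km = KMeans(n_clusters=2, n_init=10).fit(X)   # unsupervised: labels never used
print("cluster assignment:  ", km.predict([[3.5, 3.5]]))
```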
What is a Pattern?
• A pattern could be an object or
event.
• Typically, represented by a vector x
of numbers.
(Examples: biometric patterns, hand gesture patterns.)

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
Pattern recognition means automatically discovering regularities in data:
1. Collect the data (images, voice, transactional data).
2. Store it in some data structure (vectors, matrices).
3. Apply some algorithms, and so on.
What is a Pattern? (cont’d)
• Loan/credit card applications
• Income, # of dependents, mortgage amount → creditworthiness classification
• Dating services
• Age, hobbies, income → “desirability” classification
• Web documents
• Keyword-based descriptions (e.g., documents containing “football”, “NFL”) → document classification
What is a Class ?
• A collection of “similar” objects.
On the basis of labels there are two settings:
1. Classification: labels are available, so we exploit the relation between the data and what happened earlier with that data.
2. Clustering: we don’t have labels, so initially we group purely on the basis of similarity/dissimilarity.
(Figure: female vs male example.)
Main Objectives
• Separate the data belonging to different classes.
• Given new data, assign them to the correct
category.
Example: gender classification.
Features (attributes, properties): F = {length of hair (lh), glasses (G), facial structure (fs)}, i.e., F = {lh, G, fs}.
Labels: L = {“male”, “female”}.
A mapping function φ maps feature vectors to labels; learning it is an optimization problem, e.g., fitting a score of the form $Y = m_1 x_1 + (m_2 x_2)^2 + m_3 x_3 + \cdots$.
(Figure: male (M) and female (F) samples form two groups in the height–weight plane, plotted from the origin (0, 0).)
Fundamental mathematics
• Equation of a line: y = mx + c; decision boundaries can be linear as well as non-linear curves.
• Samples are feature vectors over (height (x1), weight (x2), age (x3)), e.g., S1: {5.7, 50, 25}, S2: {5.4, 52, 21} (similarly [5.7, 64, 23] and [5.5, 61, 25]).
• A linear score weights the features with coefficients: $Y_1 = m_1 x_1 + m_2 x_2 + m_3 x_3$.
• A single sample is a vector; a stack of samples forms a matrix and, more generally, a tensor.
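To make the “coefficients times features” idea concrete, here is a small Python sketch using the S1/S2 samples above; the coefficient values m1–m3 are arbitrary placeholders, not fitted:

```python
import numpy as np

# Feature vectors: (height, weight, age) as on the slide.
S1 = np.array([5.7, 50, 25])
S2 = np.array([5.4, 52, 21])

m = np.array([0.5, 0.1, 0.02])   # placeholder coefficients m1, m2, m3

# Y = m1*x1 + m2*x2 + m3*x3, i.e., a dot product of coefficients and features.
for name, s in [("S1", S1), ("S2", S2)]:
    print(name, "score:", m @ s)
```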
Main Approaches
x: input vector (pattern)
ω: class label (class)
• Generative
– Model the joint probability, p(x, ω).
– Make predictions by using Bayes’ rule to calculate p(ω|x).
– Pick the most likely class label ω.
• Discriminative
– No need to model p(x, ω).
– Estimate p(ω|x) by “learning” a direct mapping from x to ω (i.e.,
estimate the decision boundary).
– Pick the most likely class label ω.
How can we define the relationship between labelled data and class labels using probability?
Training pairs map samples to classes: x1 → ω1, x2 → ω2, ..., xn → ωn; for an unknown sample x_unk the label must be predicted.
Example: suppose we have a total of 60k samples (say, images of tables and chairs), out of which only 4k are labelled.
Examples
Generative classifiers
• Naïve Bayes
• Bayesian networks
• Markov random fields
• Hidden Markov Models (HMM)
Discriminative Classifiers
• Logistic regression
• Support Vector Machine
• Traditional neural networks
• Nearest neighbour classifiers
• Conditional Random Fields (CRFs)
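A hedged side-by-side Python sketch of one classifier from each list, assuming scikit-learn; the synthetic two-class data are for illustration only:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB            # generative
from sklearn.linear_model import LogisticRegression   # discriminative

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
w = np.array([0] * 100 + [1] * 100)

gen = GaussianNB().fit(X, w)             # models p(x, w) = p(x|w) p(w)
disc = LogisticRegression().fit(X, w)    # models p(w|x) directly

x_new = np.array([[1.5, 1.5]])
print("generative     p(w|x):", gen.predict_proba(x_new))
print("discriminative p(w|x):", disc.predict_proba(x_new))
```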
How do we model p(x, ω)?
• Typically, using a statistical model.
• probability density function (e.g., Gaussian)
Example: gender classification (male vs female).
P(x, ω) is the joint probability of sample x and class ω. For a new sample S5:
1. Calculate the probability of sample S5 falling in class ω = 1.
2. Calculate the probability of sample S5 falling in class ω = 0.
The labelled samples give the joint values P(S1, ω0), P(S1, ω1), P(S2, ω0), P(S2, ω1), and so on.
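A minimal sketch of this with a Gaussian density model, assuming SciPy is available; the joint is factored as p(x, ω) = p(x|ω)p(ω), and all sample values and priors below are invented:

```python
import numpy as np
from scipy.stats import norm

# 1-D training samples for each class (illustrative values only).
x_w0 = np.array([5.4, 5.5, 5.6, 5.5])   # class w = 0
x_w1 = np.array([5.8, 5.9, 6.0, 5.9])   # class w = 1
prior = {0: 0.5, 1: 0.5}                # p(w), equal priors assumed

def joint(x, w, samples):
    """p(x, w) = p(x | w) * p(w), with a Gaussian fit for p(x | w)."""
    mu, sigma = samples.mean(), samples.std(ddof=1)
    return norm.pdf(x, mu, sigma) * prior[w]

S5 = 5.85
print("P(S5, w=1):", joint(S5, 1, x_w1))
print("P(S5, w=0):", joint(S5, 0, x_w0))
```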
Key Challenges
• Intra-class variability
• Inter-class variability
(Examples: the letter “T” in different typefaces illustrates intra-class variability; letters/numbers that look similar illustrate low inter-class variability.)
Pattern Recognition Applications
• Handwriting Recognition
• License Plate Recognition
• Biometric Recognition
• Fingerprint Classification
• Face Detection
• Autonomous Systems
• Medical Applications: Skin Cancer Detection, Breast Cancer Detection
• Land Cover Classification (from aerial or satellite images)
The Design Cycle
• Data collection
• Feature Choice
• Model Choice
• Training
• Evaluation
• Computational Complexity
An agent senses its environment; the sensed data may be 1. digital or 2. continuous.
When developing an ML model, the quality of the data is the most important thing to consider first.
Understanding/interpreting the collected data leads to two options: 1. feature selection and 2. feature extraction.
If I have 50 features, then in feature selection we pick a subset of them; the actual values of the chosen features won’t change.
In feature extraction, the actual values change.
Example samples over (weight, height): Sample 1 = [59, 5.6], Sample 2 = [68, 5.9], with the dataset X = [Sample1, Sample2, ..., and so on].
Feature 1 (f1) | Feature 2 (f2) | Feature 3 (f3) | ... | Feature n (f784) | Label
27             | 143            | 54             | ... | 108              | Car
10             | 59             | 20             | ... | 30               | Car
...            | ...            | ...            | ... | ...              | House
A 28 × 28 image has 28 * 28 = 784 pixel features.
1. Read your data from the database.
2. Store it in the form of a matrix where each row is one image, e.g., Img1: [27, 143, 2*27, ..., 3*27] and Img2: [10, 59, 2*10, ..., 3*10].
Such data is redundant and correlated. If F is the feature set {f1, f2, ..., f784} and f1, f3 and f784 are correlated, we can discard the correlated features with selection procedures such as SFS/SBS or SFFS.
1. Feature selection
In the case of feature selection the values of the features will not change, e.g., if you have chosen f1 and f3 for further processing, then
Img1: {27, 54}
Img2: {10, 20}
2. Feature extraction: the values of the features will be different, produced by transformations such as PCA, SVD, etc.
Benefits of reducing dimensionality: 1. less computation time; 2. higher accuracy (this is not guaranteed all the time; adding features can even hurt, which is known as the Hughes phenomenon).
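A Python sketch contrasting the two, assuming scikit-learn; sequential forward selection (SFS) keeps the raw values of the chosen columns, while PCA produces new, transformed values. The data and labels are synthetic:

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 6))               # 100 samples, 6 features
y = (X[:, 0] + X[:, 2] > 0).astype(int)     # label depends on f1 and f3

# Feature selection (SFS): picks a subset; kept columns are unchanged.
sfs = SequentialFeatureSelector(LogisticRegression(), n_features_to_select=2)
X_sel = sfs.fit_transform(X, y)

# Feature extraction (PCA): new features are combinations of old ones.
X_ext = PCA(n_components=2).fit_transform(X)

print("selected columns keep raw values:", np.allclose(X_sel, X[:, sfs.get_support()]))
print("extracted values differ from any raw column:\n", X_ext[:2])
```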
• Data Collection
• How do we know when we have collected an adequately large and
representative set of examples for training and testing the system?
Working area: Greater Noida (population, suppose, 1 million).
Objective: find the buyers of a particular product (e.g., a college bag).
In many cases we can’t collect data on the whole population, so we collect samples from it (different sampling methods are available for this); the samples should represent the actual population.
Statistical analysis of the collected data:
1. Univariate (mean, median, etc.)
2. Multivariate (scatter plots, etc.)
• Feature Choice
• Depends on the characteristics of the problem domain. Features should be simple to extract, invariant to irrelevant transformations, and insensitive to noise.
• Model Choice
• Unsatisfied with the performance of our fish classifier, we may want to jump to another class of model.
There are hundreds of available choices (e.g., ANN, SVM, decision trees); the selection should be based on the characteristics of your dataset.
• Training
• Use data to determine the classifier. Many different procedures for training
classifiers and choosing models
• Evaluation
• Measure the error rate (or performance and switch from one set of features
to another one
Confusion matrix based measures
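A quick Python sketch of a confusion-matrix-based evaluation, assuming scikit-learn; the label arrays are illustrative:

```python
from sklearn.metrics import confusion_matrix, accuracy_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]   # ground-truth classes
y_pred = [0, 1, 0, 1, 1, 0, 1, 0]   # classifier output

cm = confusion_matrix(y_true, y_pred)
print("confusion matrix:\n", cm)     # rows: true class, columns: predicted
print("error rate:", 1 - accuracy_score(y_true, y_pred))
```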
• Computational Complexity
• What is the trade-off between computational ease and performance?
• How does an algorithm scale as a function of the number of features, patterns, or categories?
In practice the number of computational steps can run into the millions.
Relevant basics of Linear Algebra
n-dimensional Vector
• An n-dimensional vector v is denoted as follows:
$$\mathbf{v} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
• The transpose $\mathbf{v}^T$ is denoted as follows: $\mathbf{v}^T = (x_1, x_2, \ldots, x_n)$
Inner (or dot) product
• Given $\mathbf{v}^T = (x_1, x_2, \ldots, x_n)$ and $\mathbf{w}^T = (y_1, y_2, \ldots, y_n)$, their dot product is defined as follows:
$$\mathbf{v} \cdot \mathbf{w} = \mathbf{v}^T \mathbf{w} = \sum_{i=1}^{n} x_i y_i \quad \text{(a scalar)}$$
Orthogonal / Orthonormal vectors
• A set of vectors x1, x2, . . . , xn is orthogonal if $\mathbf{x}_i^T \mathbf{x}_j = 0$ for all $i \neq j$.
• A set of vectors x1, x2, . . . , xn is orthonormal if, in addition, $\mathbf{x}_i^T \mathbf{x}_i = 1$ for every $i$ (i.e., $\mathbf{x}_i^T \mathbf{x}_j = \delta_{ij}$).
Linear combinations
• A vector v is a linear combination of the vectors v1, ..., vk if $\mathbf{v} = c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k$, where c1, ..., ck are constants.
• Example: vectors in R3 can be expressed as linear combinations of the unit vectors i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1).
Space spanning
• A set of vectors S=(v1, v2, . . . , vk ) span some space
W if every vector w in W can be written as a linear
combination of the vectors in S
- The unit vectors i, j, and k span R3
Linear dependence
• A set of vectors v1, ..., vk are linearly dependent if at least one of them is a linear combination of the others: $\mathbf{v}_j = \sum_{i \neq j} c_i \mathbf{v}_i$ (i.e., vj does not appear on the right side).
Linear independence
• A set of vectors v1, ..., vk is linearly independent if no vector vj can be represented as a linear combination of the remaining vectors, i.e., $c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k = \mathbf{0}$ only when $c_1 = c_2 = \cdots = c_k = 0$.
• Example: for two non-parallel vectors, $c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 = \mathbf{0}$ forces c1 = c2 = 0.
Vector basis
• A set of vectors v1, ..., vk forms a basis in some
vector space W if:
(1) (v1, ..., vk) span W
(2) (v1, ..., vk) are linearly independent
• Standard bases: in $R^2$, {(1, 0), (0, 1)}; in $R^3$, {i, j, k}; in $R^n$, the unit vectors $e_1, \ldots, e_n$.
Matrix Operations
• Matrix addition/subtraction
• Add/Subtract corresponding elements.
• Matrices must be of same size.
• Matrix multiplication
• An (m × n) matrix can multiply a (q × p) matrix only if n = q; the result is an (m × p) matrix.
Diagonal/Identity Matrix
A diagonal matrix has non-zero entries only on its main diagonal; the identity matrix I is the diagonal matrix with ones on the diagonal.
Matrix Transpose
The transpose $A^T$ satisfies $(A^T)_{ij} = A_{ji}$; e.g., $(AB)^T = B^T A^T$.
Symmetric Matrices
A matrix is symmetric if $A = A^T$. Example: any matrix of the form $A A^T$ is symmetric.
Determinants
• 2 × 2: $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$
• 3 × 3 and n × n: computed by cofactor expansion, e.g., expanded along the 1st column, $\det A = \sum_{i=1}^{n} (-1)^{i+1} a_{i1} M_{i1}$, or more generally along the kth column, $\det A = \sum_{i=1}^{n} (-1)^{i+k} a_{ik} M_{ik}$, where $M_{ik}$ is the minor of entry $a_{ik}$.
• Properties: $\det(AB) = \det(A)\det(B)$, $\det(A^T) = \det(A)$.
Matrix Inverse
• The inverse of a matrix A, denoted $A^{-1}$, has the property:
$A A^{-1} = A^{-1} A = I$
• $A^{-1}$ exists only if $\det(A) \neq 0$
• Definitions
• Singular matrix: $A^{-1}$ does not exist
• Ill-conditioned matrix: A is “close” to being singular
Matrix Inverse (cont’d)
• Properties of the inverse: $(AB)^{-1} = B^{-1} A^{-1}$, $(A^{-1})^T = (A^T)^{-1}$, $(A^{-1})^{-1} = A$.
Matrix trace
The trace is the sum of the diagonal entries: $\operatorname{tr}(A) = \sum_{i} a_{ii}$.
Properties: $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$, $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.
Rank of matrix
• Defined as the dimension of the largest square sub-matrix of A that
has a non-zero determinant.
Example: a matrix whose largest square sub-matrix with a non-zero determinant is 3 × 3 has rank 3.
Rank of matrix (cont’d)
• Alternatively, it is defined as the maximum
number of linearly independent columns (or rows)
of A.
i.e., if the columns of a 4-column matrix are linearly dependent, its rank is not 4!
Rank of matrix (cont’d)
• Useful properties: $\operatorname{rank}(A) = \operatorname{rank}(A^T)$; $\operatorname{rank}(A_{m \times n}) \le \min(m, n)$; $\operatorname{rank}(AB) \le \min(\operatorname{rank}(A), \operatorname{rank}(B))$.
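A small NumPy check of both rank definitions; the example matrix is mine, not the one from the slide:

```python
import numpy as np

# The third column is the sum of the first two, and row 2 is twice row 1,
# so both the columns and the rows are linearly dependent.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 1.0, 2.0]])

print("rank:", np.linalg.matrix_rank(A))   # 2, not 3
print("det: ", np.linalg.det(A))           # ~0: no 3x3 sub-matrix is non-singular
```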
Eigenvalues and Eigenvectors
• The vector v is an eigenvector of matrix A and λ is an eigenvalue of A if: $A\mathbf{v} = \lambda \mathbf{v}$ (assume v is non-zero).
Geometric interpretation: the linear transformation implied by A cannot change the direction of the eigenvectors v, only their magnitude.
Computing λ and v
• To compute the eigenvalues λ of a matrix A, find the roots of the characteristic polynomial: $\det(A - \lambda I) = 0$.
• The eigenvectors can then be computed by solving $(A - \lambda I)\mathbf{v} = \mathbf{0}$ for each eigenvalue λ.
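A NumPy sketch of computing λ and v and verifying Av = λv; the 2 × 2 matrix is an illustrative stand-in for the slide’s example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Roots of det(A - lambda*I) = 0, plus the corresponding eigenvectors.
eigvals, eigvecs = np.linalg.eig(A)
print("eigenvalues:", eigvals)           # 3 and 1 for this matrix (order may vary)

# Verify A v = lambda v for each eigenpair (columns of eigvecs).
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))   # True, True
```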
Properties of λ and v
• Eigenvalues and eigenvectors are only defined for
square matrices.
• Eigenvectors are not unique: if v is an eigenvector, so is kv for any k ≠ 0.
• Suppose λ1, λ2, ..., λn are the eigenvalues of A; then $\det(A) = \prod_{i=1}^{n} \lambda_i$ and $\operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_i$.
Matrix diagonalization
• Given an n x n matrix A, find P such that $P^{-1}AP = \Lambda$, where Λ is diagonal.
• Solution: set P = [v1 v2 . . . vn], where v1, v2, . . ., vn are the eigenvectors of A; then Λ holds the corresponding eigenvalues on its diagonal.
Matrix diagonalization (cont’d)
• If A is diagonalizable, then the corresponding
eigenvectors v1,v2 ,. . . vn form a basis in Rn
• If A is also symmetric, its eigenvalues are real and
the corresponding eigenvectors are orthogonal.
Matrix diagonalization (cont’d)
• An n x n matrix A is diagonalizable iff rank(P)=n, that
is, it has n linearly independent eigenvectors.
• Theorem: If the eigenvalues of A are all distinct, then
the corresponding eigenvectors are linearly
independent (i.e., A is diagonalizable).
Are all n x n matrices diagonalizable?
Matrix decomposition
• If A is diagonalizable, then A can be decomposed as follows: $A = P \Lambda P^{-1}$
Matrix decomposition (cont’d)
• Matrix decomposition can be simplified in the case of symmetric matrices (i.e., orthogonal eigenvectors): $P^{-1} = P^T$, so $A = P \Lambda P^T$.
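A NumPy sketch of the symmetric case: with numpy.linalg.eigh, P is orthogonal, so both $P^{-1} = P^T$ and $A = P\Lambda P^T$ can be verified directly. The matrix is an arbitrary symmetric example:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                 # symmetric -> orthogonal eigenvectors

lam, P = np.linalg.eigh(A)                 # eigh: for symmetric/Hermitian matrices
Lam = np.diag(lam)

print(np.allclose(P.T @ P, np.eye(2)))     # P^-1 = P^T (P is orthogonal)
print(np.allclose(P @ Lam @ P.T, A))       # A = P Lambda P^T
print(np.allclose(np.linalg.inv(P) @ A @ P, Lam))  # P^-1 A P = Lambda
```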
Clustering vs. Classification
Probability Theory
Estimation Theory
Extra
Main Phases
Training phase → test phase.
The output is either classification (thematic values) or regression (continuous values).
Example: out of 50,000 images, you might divide 35k for training and 15k for testing, optionally holding out some validation data as well.
Both phases share the same Steps 1–3; the test phase adds an additional step in which each test sample is scored against the model using any distance function, e.g., sample 1: 0.75.
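A hedged Python sketch of the 50k → 35k/15k split with scikit-learn; dummy arrays stand in for the images:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.zeros((50_000, 784))      # placeholder for 50,000 flattened images
y = np.zeros(50_000)             # placeholder labels

# 35k for training, 15k for testing (test_size = 15k / 50k = 0.3).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
print(X_train.shape, X_test.shape)   # (35000, 784) (15000, 784)

# Validation data: carve a further slice out of the training portion.
X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)
```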
Complexity of PR – An Example
Problem: sorting incoming fish on a conveyor belt.
Assumption: two kinds of fish:
(1) sea bass
(2) salmon
(Figure: fish on the conveyor belt observed by a camera.)
Sensors
• Sensing:
• Use a sensor (camera or microphone) for data capture.
• PR depends on bandwidth, resolution, sensitivity,
distortion of the sensor.
Preprocessing
(1) Noise removal
(2) Image enhancement
(3) Separate touching or occluding fish
(4) Extract the boundary of each fish
Training/Test data
• How do we know that we have collected an
adequately large and representative set of
examples for training/testing the system?
Training Set ?
Test Set ?
Feature Extraction
• How to choose a good set of features?
• Discriminative features
• Invariant features (e.g., invariant to geometric
transformations such as translation, rotation and scale)
• Are there ways to automatically learn which features
are best ?
Feature Extraction - Example
• Let’s consider the fish
classification example:
• Assume a fisherman told us that
a sea bass is generally longer
than a salmon.
• We can use length as a feature
and decide between sea bass
and salmon according to a
threshold on length.
• How should we choose the
threshold?
Feature Extraction - Length
• Even though sea bass are longer than salmon on average, there are many examples of fish for which this observation does not hold.
(Figure: histogram of “length” with decision threshold l*.)
Feature Extraction - Lightness
• Consider different features, e.g., “lightness”
• It seems easier to choose the threshold x* but we still
cannot make a perfect decision.
(Figure: histogram of “lightness” with decision threshold x*.)
Multiple Features
• To improve recognition accuracy, we might need to use more than one feature.
• Single features might not yield the best performance.
• Using combinations of features might yield better
performance.
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \text{lightness} \\ \text{width} \end{bmatrix}$$
How Many Features?
• Does adding more features always improve
performance?
• It might be difficult and computationally expensive to
extract certain features.
• Correlated features might not improve performance (i.e.
redundancy).
• “Curse” of dimensionality.
Curse of Dimensionality
• Adding too many features can, paradoxically, lead to a
worsening of performance.
• Divide each of the input features into a number of intervals, so
that the value of a feature can be specified approximately by
saying in which interval it lies.
• If each input feature is divided into M divisions, then the total number of cells is $M^d$ (d: # of features).
• Since each cell must contain at least one point, the amount of training data needed grows exponentially with d.
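The exponential growth is easy to see numerically; a one-loop Python sketch of the $M^d$ cell count:

```python
# Number of cells when each of d features is split into M intervals.
M = 10
for d in (1, 2, 3, 5, 10):
    print(f"d={d:2d}: M^d = {M**d:,} cells (>= that many training points needed)")
```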
Missing Features
• Certain features might be missing (e.g., due to
occlusion).
• How should we train the classifier with missing
features ?
• How should the classifier make the best decision
with missing features ?
Classification
• Partition the feature space into two regions by finding
the decision boundary that minimizes the error.
• How should we find the optimal decision boundary?
Complexity of Classification Model
• We can get perfect classification performance on the
training data by choosing a more complex model.
• Complex models are tuned to the particular training samples, rather than to the characteristics of the true model.
How well can the model generalize to unknown samples? Fitting the training data at the expense of generalization is called overfitting.
Generalization
• Generalization is defined as the ability of a classifier to
produce correct results on novel patterns.
• How can we improve generalization performance ?
• More training examples (i.e., better model estimates).
• Simpler models usually yield better performance.
(Figure: a complex model vs a simpler model fitted to the same data.)
Understanding model complexity:
function approximation
• Approximate a function from a set of samples:
o The green curve is the true function.
o Ten sample points, assumed noisy, are shown by the blue circles.
Understanding model complexity:
function approximation (cont’d)
Polynomial curve fitting: polynomials having various
orders, shown as red curves, fitted to the set of 10
sample points.
Understanding model complexity:
function approximation (cont’d)
• More data can improve model estimation.
• Polynomial curve fitting: 9th-order polynomials fitted to 15 and 100 sample points.
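A Python sketch of this experiment, assuming (as in the classic version of the example) that the true function is sin(2πx) with added Gaussian noise; the 9th-order fit improves markedly as the sample count grows:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample(n):
    """n noisy samples of the assumed true function sin(2*pi*x)."""
    x = np.linspace(0, 1, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)

x_test = np.linspace(0, 1, 200)
truth = np.sin(2 * np.pi * x_test)

for n in (10, 100):
    x, t = sample(n)
    coeffs = np.polyfit(x, t, deg=9)          # 9th-order polynomial fit
    err = np.sqrt(np.mean((np.polyval(coeffs, x_test) - truth) ** 2))
    print(f"n={n:3d} samples -> test RMSE {err:.3f}")   # smaller with more data
```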
Improve Classification Performance through Post-processing
• Consider the problem of character recognition.
• Exploit context to improve performance.
How m ch info mation are y u
mi sing?
Improve Classification Performance through
Ensembles of Classifiers
• Performance can be
improved using a "pool" of
classifiers.
• How should we build and
combine different
classifiers ?
Cost of misclassifications
• Consider the fish classification example.
• There are two possible classification errors:
(1) Deciding the fish was a sea bass when it was a
salmon.
(2) Deciding the fish was a salmon when it was a sea
bass.
• Are both errors equally important ?
Cost of misclassifications (cont’d)
• Suppose that:
• Customers who buy salmon will object vigorously if
they see sea bass in their cans.
• Customers who buy sea bass will not be unhappy if
they occasionally see some expensive salmon in their
cans.
• How does this knowledge affect our decision?
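A minimal Python sketch of folding such asymmetric costs into the decision; the posterior values and the loss matrix are made up, and the point is that the decision minimizes expected loss rather than simply picking the most probable class:

```python
import numpy as np

classes = ["sea bass", "salmon"]

# loss[decision, true_class]: canning a sea bass as "salmon" upsets
# salmon buyers, so it costs far more than the reverse error.
loss = np.array([[0.0, 1.0],     # decide sea bass: small cost if truly salmon
                 [10.0, 0.0]])   # decide salmon: large cost if truly sea bass

posterior = np.array([0.4, 0.6])        # made-up p(w|x) for one fish

risk = loss @ posterior                 # expected loss of each decision
print(dict(zip(classes, risk)))         # {'sea bass': 0.6, 'salmon': 4.0}
print("decide:", classes[int(np.argmin(risk))])  # sea bass, despite p(salmon)=0.6
```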
Computational Complexity
• How does an algorithm scale with the number of:
• features
• patterns
• categories
• Need to consider tradeoffs between computational
complexity and performance.
Would it be possible to build a “general purpose”
PR system?
• It would be very difficult to design a system that is
capable of performing a variety of classification
tasks.
• Different problems require different features.
• Different features might yield different solutions.
• Different tradeoffs exist for different problems.
Introduction to Pattern Recognition
Application areas:
• Pattern recognition in medicine
• Pattern recognition in defence
• Pattern recognition in e-commerce