K-MEANS CLUSTERING
INTRODUCTION
What is clustering?
• Clustering is the classification of objects into
different groups, or more precisely, the
partitioning of a data set into subsets
(clusters), so that the data in each subset
(ideally) share some common trait, often
closeness according to some defined distance measure.
Types of clustering:
1. Hierarchical algorithms: these find successive clusters
   using previously established clusters.
   a. Agglomerative ("bottom-up"): agglomerative algorithms
      begin with each element as a separate cluster and
      merge them into successively larger clusters.
   b. Divisive ("top-down"): divisive algorithms begin with
      the whole set and proceed to divide it into successively
      smaller clusters.
2. Partitional clustering: partitional algorithms determine all
   clusters at once. They include:
   • K-means and derivatives
   • Fuzzy c-means clustering
   • QT clustering algorithm
Common distance measures:
• The distance measure determines how the similarity of two
elements is calculated, and it influences the shape of the
clusters.
They include (for two points a = (a_1, ..., a_d) and b = (b_1, ..., b_d)):
1. The Euclidean distance (also called the 2-norm distance),
given by:
$d(a, b) = \sqrt{\sum_{i=1}^{d} (a_i - b_i)^2}$
2. The Manhattan distance (also called the taxicab norm or
1-norm), given by:
$d(a, b) = \sum_{i=1}^{d} |a_i - b_i|$
3. The maximum norm, given by:
$d(a, b) = \max_{1 \le i \le d} |a_i - b_i|$
4. The Mahalanobis distance, which corrects data for
different scales and correlations in the variables.
5. Inner product space: the angle between two
vectors can be used as a distance measure when
clustering high-dimensional data.
6. The Hamming distance, which measures the minimum number
of substitutions required to change one member into another.
It is sometimes loosely called edit distance, although edit
distance in general also allows insertions and deletions.
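As a minimal sketch of the measures listed above, the following Python functions compute the Euclidean, Manhattan, maximum-norm and Hamming distances. The function names and the sample points are illustrative assumptions, not part of the original slides.

```python
import math

def euclidean(a, b):
    """2-norm distance: square root of the summed squared differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    """1-norm (taxicab) distance: sum of absolute differences."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def maximum_norm(a, b):
    """Maximum norm: largest absolute difference over all coordinates."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(ai != bi for ai, bi in zip(a, b))

# Hypothetical sample points, chosen so the results are easy to check by hand.
p, q = (1, 1), (4, 5)
print(euclidean(p, q))     # 5.0
print(manhattan(p, q))     # 7
print(maximum_norm(p, q))  # 4
print(hamming("karolin", "kathrin"))  # 3
```

The same pair of points gives three different distances, which is why the choice of measure changes which points end up grouped together.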
K-MEANS CLUSTERING
• The k-means algorithm clusters n objects, based on their
attributes, into k partitions,
where k < n.
• It is similar to the
expectation-maximization algorithm for mixtures of
Gaussians in that they both attempt to find the
centers of natural clusters in the data.
• It assumes that the object attributes form a vector
space.
• It partitions (or clusters) N
data points into K disjoint subsets S_j
so as to minimize the
sum-of-squares criterion
$J = \sum_{j=1}^{K} \sum_{n \in S_j} \lVert x_n - \mu_j \rVert^2$
where x_n is a vector representing the nth
data point and \mu_j is the geometric centroid of
the data points in S_j.
• Simply speaking, k-means clustering is an
algorithm that classifies or groups objects,
based on their attributes/features, into K
groups.
• K is a positive integer.
• The grouping is done by minimizing the sum
of squared distances between the data points and the
corresponding cluster centroids.
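To make the sum-of-squares criterion concrete, here is a small Python sketch that evaluates it for a given partition. The function names are assumptions made for illustration; the four points are the ones used in the medicine example later in this deck.

```python
def centroid(points):
    """Geometric centroid (coordinate-wise mean) of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def sum_of_squares(clusters):
    """Sum over clusters of squared distances from each point to its centroid."""
    total = 0.0
    for members in clusters:
        mu = centroid(members)
        for x in members:
            total += sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return total

# Two candidate partitions of the same four points.
good = [[(1, 1), (2, 1)], [(4, 3), (5, 4)]]
poor = [[(1, 1)], [(2, 1), (4, 3), (5, 4)]]
print(sum_of_squares(good))  # 1.5
print(sum_of_squares(poor))  # about 9.33
```

The partition with the lower value is the better one under this criterion, and k-means searches for such a partition by repeatedly moving points and recomputing centroids.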
How does the k-means clustering
algorithm work?
• Step 1: Decide on the value of k, the
number of clusters.
• Step 2: Choose any initial partition that classifies the
data into k clusters. You may assign the
training samples randomly, or systematically
as follows:
1. Take the first k training samples as
single-element clusters.
2. Assign each of the remaining (N-k) training
samples to the cluster with the nearest
centroid. After each assignment, recompute the
centroid of the gaining cluster.
• Step 3: Take each sample in sequence and
compute its distance from the centroid of
each of the clusters. If a sample is not
currently in the cluster with the closest
centroid, switch this sample to that cluster
and update the centroids of both the cluster
gaining the sample and the cluster
losing it.
• Step 4: Repeat step 3 until convergence is
achieved, that is, until a pass through the
training samples causes no new assignments.
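The four steps above can be sketched in Python roughly as follows. This is one possible reading of the procedure, not a reference implementation: the function names are assumptions, and two safeguards are added that the steps do not mention (a sample only switches when the other centroid is strictly closer, and a cluster is never emptied). The sample data are the four points of the medicine example later in this deck.

```python
import math

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def members(samples, labels, j):
    """All samples currently labelled as belonging to cluster j."""
    return [s for s, label in zip(samples, labels) if label == j]

def sequential_kmeans(samples, k):
    # Step 2.1: the first k samples start as single-element clusters.
    labels = list(range(k)) + [None] * (len(samples) - k)
    centroids = [mean([samples[j]]) for j in range(k)]
    # Step 2.2: assign each remaining sample, updating the gaining centroid.
    for i in range(k, len(samples)):
        j = min(range(k), key=lambda c: math.dist(samples[i], centroids[c]))
        labels[i] = j
        centroids[j] = mean(members(samples, labels, j))
    # Steps 3 and 4: pass over the samples until nothing changes.
    changed = True
    while changed:
        changed = False
        for i, x in enumerate(samples):
            old = labels[i]
            new = min(range(k), key=lambda c: math.dist(x, centroids[c]))
            closer = math.dist(x, centroids[new]) < math.dist(x, centroids[old])
            # Added safeguards (assumptions): strict improvement, no empty cluster.
            if closer and labels.count(old) > 1:
                labels[i] = new
                centroids[new] = mean(members(samples, labels, new))
                centroids[old] = mean(members(samples, labels, old))
                changed = True
    return labels, centroids

labels, centroids = sequential_kmeans([(1, 1), (2, 1), (4, 3), (5, 4)], k=2)
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [(1.5, 1.0), (4.5, 3.5)]
```

Because centroids are updated after every single move, the result can depend on the order of the samples, which is worth keeping in mind when comparing runs.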
A simple example showing the
implementation of the k-means algorithm
(using K=2)
Step 1:
Initialization: we randomly choose the following two centroids
(k=2) for the two clusters.
In this case the two centroids are m1=(1.0,1.0) and
m2=(5.0,7.0).
Step 2:
• Thus, we obtain two clusters
containing:
{1,2,3} and {4,5,6,7}.
• Their centroids are then recomputed.
Step 3:
• Now, using these centroids,
we compute the Euclidean
distance of each object to each centroid, as
shown in the table.
• Therefore, the new
clusters are:
{1,2} and {3,4,5,6,7}
• The next centroids are:
m1=(1.25,1.5) and m2=(3.9,5.1)
Step 4:
The clusters obtained are:
{1,2} and {3,4,5,6,7}
• Therefore, there is no
change in the clusters.
• Thus, the algorithm comes
to a halt here and the final
result consists of 2 clusters,
{1,2} and {3,4,5,6,7}.
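Steps 3 and 4 of this example can be replayed with a short Python sketch. The data table did not survive in the text, so the seven points below are an assumption: they are the values commonly paired with this worked example, and they do reproduce the stated centroids m1=(1.25,1.5) and m2=(3.9,5.1). The helper names are also illustrative.

```python
import math

# ASSUMPTION: these seven points are not given in the text above.
points = {1: (1.0, 1.0), 2: (1.5, 2.0), 3: (3.0, 4.0), 4: (5.0, 7.0),
          5: (3.5, 5.0), 6: (4.5, 5.0), 7: (3.5, 4.5)}

def centroid(ids):
    """Mean position of the points whose ids are given."""
    return tuple(sum(points[i][d] for i in ids) / len(ids) for d in range(2))

def reassign(centroids):
    """Put every point into the cluster of its nearest centroid."""
    clusters = [set() for _ in centroids]
    for i, p in points.items():
        nearest = min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
        clusters[nearest].add(i)
    return clusters

clusters = [{1, 2, 3}, {4, 5, 6, 7}]  # partition stated at the end of Step 2
while True:
    centroids = [centroid(c) for c in clusters]
    new_clusters = reassign(centroids)
    if new_clusters == clusters:      # Step 4: no change, so stop
        break
    clusters = new_clusters

print(clusters)   # [{1, 2}, {3, 4, 5, 6, 7}]
print(centroids)  # approximately [(1.25, 1.5), (3.9, 5.1)]
```

Under these assumed points, object 3 is the only one that changes cluster, which matches the move from {1,2,3} to {1,2} described in Step 3.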
[Plot: clusters for the K=2 example]
[Figure (with K=3): Step 1 and Step 2]
[Plot: clusters for the K=3 example]
Real-Life Numerical Example
of K-Means Clustering
We have 4 medicines as our training data points,
and each medicine has 2 attributes. Each attribute
represents a coordinate of the object. We have to
determine which medicines belong to cluster 1 and
which medicines belong to the other cluster.

Object      Attribute 1 (X): weight index   Attribute 2 (Y): pH
Medicine A  1                               1
Medicine B  2                               1
Medicine C  4                               3
Medicine D  5                               4
Step 1:
• Initial value of
centroids: Suppose
we use medicine A and
medicine B as the first
centroids.
• Let c1 and c2 denote
the coordinates of the
centroids; then c1=(1,1)
and c2=(2,1).
• Objects-centroids distance: we calculate the
distance from each cluster centroid to each object.
Using the Euclidean distance, the
distance matrix at iteration 0 is
$D^0 = \begin{pmatrix} 0 & 1 & 3.61 & 5 \\ 1 & 0 & 2.83 & 4.24 \end{pmatrix}$
(columns: A, B, C, D; rows: c1=(1,1), c2=(2,1)).
• Each column in the distance matrix corresponds to one
object.
• The first row of the distance matrix corresponds to the
distance of each object to the first centroid and the
second row is the distance of each object to the second
centroid.
• For example, the distance from medicine C = (4, 3) to the
first centroid is $\sqrt{(4-1)^2 + (3-1)^2} = 3.61$, and its distance to the
second centroid is $\sqrt{(4-2)^2 + (3-1)^2} = 2.83$, etc.
Step 2:
• Objects clustering: We
assign each object based
on the minimum distance.
• Medicine A is assigned to
group 1, medicine B to
group 2, medicine C to
group 2 and medicine D to
group 2.
• An element of the group
matrix below is 1 if and
only if the object is
assigned to that group.
$G^0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{pmatrix}$
• Iteration 1, determine centroids: Group 1 has only one
member, so c1=(1,1) is unchanged. Group 2 has three
members, so c2 = ((2+4+5)/3, (1+3+4)/3) = (3.67, 2.67).
• Iteration 1, objects-centroids distances: The
next step is to compute the distance of all
objects to the new centroids.
• As before, the distance matrix at
iteration 1 is
$D^1 = \begin{pmatrix} 0 & 1 & 3.61 & 5 \\ 3.14 & 2.36 & 0.47 & 1.89 \end{pmatrix}$
• Iteration 1, objects
clustering: Based on the new
distance matrix, we move
medicine B to group 1 while
all the other objects remain.
The group matrix is shown
below.
$G^1 = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}$
• Iteration 2, determine
centroids: Now we repeat the centroid
computation to calculate the new centroid
coordinates based on the
clustering of the previous iteration.
Group 1 and group 2 both have
two members, thus the new
centroids are
$c_1 = \left(\frac{1+2}{2}, \frac{1+1}{2}\right) = (1.5, 1)$
and
$c_2 = \left(\frac{4+5}{2}, \frac{3+4}{2}\right) = (4.5, 3.5)$.
• Iteration 2, objects-centroids distances:
Repeating the distance computation, we have the new distance
matrix at iteration 2:
$D^2 = \begin{pmatrix} 0.5 & 0.5 & 3.20 & 4.61 \\ 4.30 & 3.54 & 0.71 & 0.71 \end{pmatrix}$
• Iteration 2, objects clustering: Again, we
assign each object based on the minimum
distance.
• We obtain the result that $G^2 = G^1$. Comparing the
grouping of the last iteration and this iteration reveals
that the objects no longer move between groups.
• Thus, the computation of the k-means clustering
has reached stability and no more iterations are
needed.
We get the final grouping as the result:

Object      Feature 1 (X): weight index   Feature 2 (Y): pH   Group (result)
Medicine A  1                             1                   1
Medicine B  2                             1                   1
Medicine C  4                             3                   2
Medicine D  5                             4                   2
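The medicine example can be reproduced with a short Python sketch that mirrors its distance-matrix and group-matrix formulation. The function names are assumptions made for illustration, and the sketch does not handle the case of a group becoming empty.

```python
import math

medicines = [(1, 1), (2, 1), (4, 3), (5, 4)]  # A, B, C, D as (weight index, pH)

def distance_matrix(centroids, objects):
    """One row per centroid, one column per object."""
    return [[math.dist(c, p) for p in objects] for c in centroids]

def group_matrix(dist):
    """Entry is 1 if and only if that object is closest to that centroid."""
    k, n = len(dist), len(dist[0])
    groups = [[0] * n for _ in range(k)]
    for col in range(n):
        best = min(range(k), key=lambda row: dist[row][col])
        groups[best][col] = 1
    return groups

def update_centroids(groups, objects):
    """Mean of the members of each group (assumes no group is empty)."""
    centroids = []
    for row in groups:
        chosen = [p for flag, p in zip(row, objects) if flag]
        centroids.append(tuple(sum(c) / len(chosen) for c in zip(*chosen)))
    return centroids

def kmeans(objects, centroids):
    previous = None
    while True:
        groups = group_matrix(distance_matrix(centroids, objects))
        if groups == previous:  # grouping unchanged: stability reached
            return groups, centroids
        centroids = update_centroids(groups, objects)
        previous = groups

groups, centroids = kmeans(medicines, [(1, 1), (2, 1)])
print(groups)     # [[1, 1, 0, 0], [0, 0, 1, 1]]
print(centroids)  # [(1.5, 1.0), (4.5, 3.5)]
```

The returned group matrix places medicines A and B in group 1 and medicines C and D in group 2, in agreement with the table above.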
K-Means Clustering Visual Basic Code
Sub kMeanCluster(Data() As Variant, numCluster As Integer)
    ' Main routine to cluster data into k clusters
    ' Input:
    '   + Data matrix (0 To 2, 1 To totalData);
    '     row 0 = cluster, 1 = X, 2 = Y; data in columns
    '   + numCluster: number of clusters the user wants
    '   + private (module-level) variables: Centroid, totalData
    ' Output:
    '   o) updated centroids
    '   o) cluster number assigned to the data (= row 0 of Data)
    Dim i As Integer
    Dim j As Integer
    Dim X As Single
    Dim Y As Single
    Dim min As Single
    Dim cluster As Integer
    Dim d As Single
    Dim sumXY()
    Dim isStillMoving As Boolean
    isStillMoving = True
    If totalData <= numCluster Then
        ' Only the last data point is placed here because the routine is designed to be interactive
        Data(0, totalData) = totalData ' cluster No = total data
        Centroid(1, totalData) = Data(1, totalData) ' X
        Centroid(2, totalData) = Data(2, totalData) ' Y
    Else
        ' Calculate the minimum distance to assign the new data point
        min = 10 ^ 10 ' big number
        X = Data(1, totalData)
        Y = Data(2, totalData)
        For i = 1 To numCluster
            ' The body of this loop is missing in the source; it is reconstructed
            ' here (an assumption) to mirror the nearest-centroid search below
            d = dist(X, Y, Centroid(1, i), Centroid(2, i))
            If d < min Then
                min = d
                cluster = i
            End If
        Next i
        Data(0, totalData) = cluster

        Do While isStillMoving
            ' This loop is expected to converge
            ' Calculate new centroids
            ' 1 = X, 2 = Y, 3 = count of data points
            ReDim sumXY(1 To 3, 1 To numCluster)
            For i = 1 To totalData
                sumXY(1, Data(0, i)) = Data(1, i) + sumXY(1, Data(0, i))
                sumXY(2, Data(0, i)) = Data(2, i) + sumXY(2, Data(0, i))
                sumXY(3, Data(0, i)) = 1 + sumXY(3, Data(0, i))
            Next i
            For i = 1 To numCluster
                Centroid(1, i) = sumXY(1, i) / sumXY(3, i)
                Centroid(2, i) = sumXY(2, i) / sumXY(3, i)
            Next i
            ' Assign all data points to the new centroids
            isStillMoving = False
            For i = 1 To totalData
                min = 10 ^ 10 ' big number
                X = Data(1, i)
                Y = Data(2, i)
                For j = 1 To numCluster
                    d = dist(X, Y, Centroid(1, j), Centroid(2, j))
                    If d < min Then
                        min = d
                        cluster = j
                    End If
                Next j
                If Data(0, i) <> cluster Then
                    Data(0, i) = cluster
                    isStillMoving = True
                End If
            Next i
        Loop
    End If
End Sub

Function dist(ByVal X1 As Single, ByVal Y1 As Single, ByVal X2 As Single, ByVal Y2 As Single) As Single
    ' Euclidean distance. The source calls dist without defining it,
    ' so this helper is an assumption.
    dist = Sqr((X1 - X2) ^ 2 + (Y1 - Y2) ^ 2)
End Function
Weaknesses of K-Means Clustering
1. When there are few data points, the initial
grouping strongly determines the clusters.
2. The number of clusters, K, must be determined
beforehand. In addition, the algorithm does not yield the same
result with each run, since the resulting clusters depend
on the initial random assignments.
3. We never know the real clusters: using the same data,
a different input order may
produce different clusters if the number of data points is small.
4. It is sensitive to the initial conditions. Different initial conditions
may produce different clusterings, and the algorithm
may be trapped in a local optimum.
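The sensitivity to initial conditions can be seen on a tiny example. The Python sketch below runs a plain batch k-means from two different sets of starting centroids. The four points (corners of a 4 x 1 rectangle) and the function name are hypothetical, chosen only so that the arithmetic is easy to check by hand.

```python
import math

def kmeans(points, initial_centroids):
    """Batch k-means from given starting centroids.

    Returns (labels, centroids, sum of squared distances).
    """
    centroids = list(initial_centroids)
    labels = None
    while True:
        new_labels = [min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centroids)):
            chosen = [p for p, label in zip(points, labels) if label == j]
            if chosen:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(chosen) for c in zip(*chosen))
    sse = sum(math.dist(p, centroids[label]) ** 2
              for p, label in zip(points, labels))
    return labels, centroids, sse

corners = [(0, 0), (0, 1), (4, 0), (4, 1)]  # hypothetical toy data

labels_a, _, sse_a = kmeans(corners, [(0, 0), (4, 0)])
labels_b, _, sse_b = kmeans(corners, [(0, 0), (0, 1)])
print(labels_a, sse_a)  # [0, 0, 1, 1] 1.0
print(labels_b, sse_b)  # [0, 1, 0, 1] 16.0
```

Both runs stop because no point changes cluster, yet the second one settles on a partition whose sum of squares is sixteen times worse. A common remedy is to run the algorithm from several starting points and keep the result with the lowest sum of squares.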
Applications of K-Means
Clustering
• It is relatively efficient and fast. It runs
in O(tkn) time, where n is the number of objects or points, k
is the number of clusters and t is the number of iterations.
• K-means clustering can be applied to machine
learning or data mining.
• It is used on acoustic data in speech understanding to
convert waveforms into one of k categories (known
as vector quantization), and for image segmentation.
• It is also used for choosing color palettes on
old-fashioned graphical display devices and for image
quantization.
CONCLUSION
• The k-means algorithm is useful for undirected
knowledge discovery and is relatively simple.
K-means has found widespread usage in many
fields, ranging from unsupervised learning
of neural networks, pattern recognition,
classification analysis and artificial intelligence to
image processing, machine vision, and many
others.
Call Girls Nagpur 8824825030 Escort In Nagpur service 24X7
sexytaniya455
 
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
nonods
 
MODULE 5 BIOLOGY FOR ENGINEERS TRENDS IN BIO ENGINEERING.pptx
MODULE 5 BIOLOGY FOR ENGINEERS TRENDS IN BIO ENGINEERING.pptxMODULE 5 BIOLOGY FOR ENGINEERS TRENDS IN BIO ENGINEERING.pptx
MODULE 5 BIOLOGY FOR ENGINEERS TRENDS IN BIO ENGINEERING.pptx
NaveenNaveen726446
 
Sri Guru Hargobind Ji - Bandi Chor Guru.pdf
Sri Guru Hargobind Ji - Bandi Chor Guru.pdfSri Guru Hargobind Ji - Bandi Chor Guru.pdf
Sri Guru Hargobind Ji - Bandi Chor Guru.pdf
Balvir Singh
 
My Airframe Metallic Design Capability Studies..pdf
My Airframe Metallic Design Capability Studies..pdfMy Airframe Metallic Design Capability Studies..pdf
My Airframe Metallic Design Capability Studies..pdf
Geoffrey Wardle. MSc. MSc. Snr.MAIAA
 
Call Girls Chennai +91-8824825030 Vip Call Girls Chennai
Call Girls Chennai +91-8824825030 Vip Call Girls ChennaiCall Girls Chennai +91-8824825030 Vip Call Girls Chennai
Call Girls Chennai +91-8824825030 Vip Call Girls Chennai
paraasingh12 #V08
 
🔥Young College Call Girls Chandigarh 💯Call Us 🔝 7737669865 🔝💃Independent Chan...
🔥Young College Call Girls Chandigarh 💯Call Us 🔝 7737669865 🔝💃Independent Chan...🔥Young College Call Girls Chandigarh 💯Call Us 🔝 7737669865 🔝💃Independent Chan...
🔥Young College Call Girls Chandigarh 💯Call Us 🔝 7737669865 🔝💃Independent Chan...
sonamrawat5631
 
Lateral load-resisting systems in buildings.pptx
Lateral load-resisting systems in buildings.pptxLateral load-resisting systems in buildings.pptx
Lateral load-resisting systems in buildings.pptx
DebendraDevKhanal1
 
Call Girls Chandigarh 🔥 7014168258 🔥 Real Fun With Sexual Girl Available 24/7...
Call Girls Chandigarh 🔥 7014168258 🔥 Real Fun With Sexual Girl Available 24/7...Call Girls Chandigarh 🔥 7014168258 🔥 Real Fun With Sexual Girl Available 24/7...
Call Girls Chandigarh 🔥 7014168258 🔥 Real Fun With Sexual Girl Available 24/7...
shourabjaat424
 
Intuit CRAFT demonstration presentation for sde
Intuit CRAFT demonstration presentation for sdeIntuit CRAFT demonstration presentation for sde
Intuit CRAFT demonstration presentation for sde
ShivangMishra54
 
🔥 Hyderabad Call Girls  👉 9352988975 👫 High Profile Call Girls Whatsapp Numbe...
🔥 Hyderabad Call Girls  👉 9352988975 👫 High Profile Call Girls Whatsapp Numbe...🔥 Hyderabad Call Girls  👉 9352988975 👫 High Profile Call Girls Whatsapp Numbe...
🔥 Hyderabad Call Girls  👉 9352988975 👫 High Profile Call Girls Whatsapp Numbe...
aarusi sexy model
 
Cricket management system ptoject report.pdf
Cricket management system ptoject report.pdfCricket management system ptoject report.pdf
Cricket management system ptoject report.pdf
Kamal Acharya
 
Technological Innovation Management And Entrepreneurship-1.pdf
Technological Innovation Management And Entrepreneurship-1.pdfTechnological Innovation Management And Entrepreneurship-1.pdf
Technological Innovation Management And Entrepreneurship-1.pdf
tanujaharish2
 
Covid Management System Project Report.pdf
Covid Management System Project Report.pdfCovid Management System Project Report.pdf
Covid Management System Project Report.pdf
Kamal Acharya
 
❣Independent Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai E...
❣Independent Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai E...❣Independent Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai E...
❣Independent Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai E...
nainakaoornoida
 
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
dulbh kashyap
 
SPICE PARK JUL2024 ( 6,866 SPICE Models )
SPICE PARK JUL2024 ( 6,866 SPICE Models )SPICE PARK JUL2024 ( 6,866 SPICE Models )
SPICE PARK JUL2024 ( 6,866 SPICE Models )
Tsuyoshi Horigome
 
Call Girls In Lucknow 🔥 +91-7014168258🔥High Profile Call Girl Lucknow
Call Girls In Lucknow 🔥 +91-7014168258🔥High Profile Call Girl LucknowCall Girls In Lucknow 🔥 +91-7014168258🔥High Profile Call Girl Lucknow
Call Girls In Lucknow 🔥 +91-7014168258🔥High Profile Call Girl Lucknow
yogita singh$A17
 
Microsoft Azure AD architecture and features
Microsoft Azure AD architecture and featuresMicrosoft Azure AD architecture and features
Microsoft Azure AD architecture and features
ssuser381403
 
High Profile Call Girls Ahmedabad 🔥 7737669865 🔥 Real Fun With Sexual Girl Av...
High Profile Call Girls Ahmedabad 🔥 7737669865 🔥 Real Fun With Sexual Girl Av...High Profile Call Girls Ahmedabad 🔥 7737669865 🔥 Real Fun With Sexual Girl Av...
High Profile Call Girls Ahmedabad 🔥 7737669865 🔥 Real Fun With Sexual Girl Av...
dABGO KI CITy kUSHINAGAR Ak47
 

Recently uploaded (20)

Call Girls Nagpur 8824825030 Escort In Nagpur service 24X7
Call Girls Nagpur 8824825030 Escort In Nagpur service 24X7Call Girls Nagpur 8824825030 Escort In Nagpur service 24X7
Call Girls Nagpur 8824825030 Escort In Nagpur service 24X7
 
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
 
MODULE 5 BIOLOGY FOR ENGINEERS TRENDS IN BIO ENGINEERING.pptx
MODULE 5 BIOLOGY FOR ENGINEERS TRENDS IN BIO ENGINEERING.pptxMODULE 5 BIOLOGY FOR ENGINEERS TRENDS IN BIO ENGINEERING.pptx
MODULE 5 BIOLOGY FOR ENGINEERS TRENDS IN BIO ENGINEERING.pptx
 
Sri Guru Hargobind Ji - Bandi Chor Guru.pdf
Sri Guru Hargobind Ji - Bandi Chor Guru.pdfSri Guru Hargobind Ji - Bandi Chor Guru.pdf
Sri Guru Hargobind Ji - Bandi Chor Guru.pdf
 
My Airframe Metallic Design Capability Studies..pdf
My Airframe Metallic Design Capability Studies..pdfMy Airframe Metallic Design Capability Studies..pdf
My Airframe Metallic Design Capability Studies..pdf
 
Call Girls Chennai +91-8824825030 Vip Call Girls Chennai
Call Girls Chennai +91-8824825030 Vip Call Girls ChennaiCall Girls Chennai +91-8824825030 Vip Call Girls Chennai
Call Girls Chennai +91-8824825030 Vip Call Girls Chennai
 
🔥Young College Call Girls Chandigarh 💯Call Us 🔝 7737669865 🔝💃Independent Chan...
🔥Young College Call Girls Chandigarh 💯Call Us 🔝 7737669865 🔝💃Independent Chan...🔥Young College Call Girls Chandigarh 💯Call Us 🔝 7737669865 🔝💃Independent Chan...
🔥Young College Call Girls Chandigarh 💯Call Us 🔝 7737669865 🔝💃Independent Chan...
 
Lateral load-resisting systems in buildings.pptx
Lateral load-resisting systems in buildings.pptxLateral load-resisting systems in buildings.pptx
Lateral load-resisting systems in buildings.pptx
 
Call Girls Chandigarh 🔥 7014168258 🔥 Real Fun With Sexual Girl Available 24/7...
Call Girls Chandigarh 🔥 7014168258 🔥 Real Fun With Sexual Girl Available 24/7...Call Girls Chandigarh 🔥 7014168258 🔥 Real Fun With Sexual Girl Available 24/7...
Call Girls Chandigarh 🔥 7014168258 🔥 Real Fun With Sexual Girl Available 24/7...
 
Intuit CRAFT demonstration presentation for sde
Intuit CRAFT demonstration presentation for sdeIntuit CRAFT demonstration presentation for sde
Intuit CRAFT demonstration presentation for sde
 
🔥 Hyderabad Call Girls  👉 9352988975 👫 High Profile Call Girls Whatsapp Numbe...
🔥 Hyderabad Call Girls  👉 9352988975 👫 High Profile Call Girls Whatsapp Numbe...🔥 Hyderabad Call Girls  👉 9352988975 👫 High Profile Call Girls Whatsapp Numbe...
🔥 Hyderabad Call Girls  👉 9352988975 👫 High Profile Call Girls Whatsapp Numbe...
 
Cricket management system ptoject report.pdf
Cricket management system ptoject report.pdfCricket management system ptoject report.pdf
Cricket management system ptoject report.pdf
 
Technological Innovation Management And Entrepreneurship-1.pdf
Technological Innovation Management And Entrepreneurship-1.pdfTechnological Innovation Management And Entrepreneurship-1.pdf
Technological Innovation Management And Entrepreneurship-1.pdf
 
Covid Management System Project Report.pdf
Covid Management System Project Report.pdfCovid Management System Project Report.pdf
Covid Management System Project Report.pdf
 
❣Independent Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai E...
❣Independent Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai E...❣Independent Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai E...
❣Independent Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai E...
 
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
 
SPICE PARK JUL2024 ( 6,866 SPICE Models )
SPICE PARK JUL2024 ( 6,866 SPICE Models )SPICE PARK JUL2024 ( 6,866 SPICE Models )
SPICE PARK JUL2024 ( 6,866 SPICE Models )
 
Call Girls In Lucknow 🔥 +91-7014168258🔥High Profile Call Girl Lucknow
Call Girls In Lucknow 🔥 +91-7014168258🔥High Profile Call Girl LucknowCall Girls In Lucknow 🔥 +91-7014168258🔥High Profile Call Girl Lucknow
Call Girls In Lucknow 🔥 +91-7014168258🔥High Profile Call Girl Lucknow
 
Microsoft Azure AD architecture and features
Microsoft Azure AD architecture and featuresMicrosoft Azure AD architecture and features
Microsoft Azure AD architecture and features
 
High Profile Call Girls Ahmedabad 🔥 7737669865 🔥 Real Fun With Sexual Girl Av...
High Profile Call Girls Ahmedabad 🔥 7737669865 🔥 Real Fun With Sexual Girl Av...High Profile Call Girls Ahmedabad 🔥 7737669865 🔥 Real Fun With Sexual Girl Av...
High Profile Call Girls Ahmedabad 🔥 7737669865 🔥 Real Fun With Sexual Girl Av...
 

K mean-clustering algorithm

  • 2. INTRODUCTION: What is clustering? Clustering is the classification of objects into different groups or, more precisely, the partitioning of a data set into subsets (clusters) so that the data in each subset (ideally) share some common trait, often proximity according to some defined distance measure.
  • 3. Types of clustering: (1) Hierarchical algorithms find successive clusters using previously established clusters. (a) Agglomerative ("bottom-up") algorithms begin with each element as a separate cluster and merge them into successively larger clusters. (b) Divisive ("top-down") algorithms begin with the whole set and proceed to divide it into successively smaller clusters. (2) Partitional algorithms determine all clusters at once. They include k-means and its derivatives, fuzzy c-means clustering, and the QT clustering algorithm.
  • 4. Common distance measures: The distance measure determines how the similarity of two elements is calculated, and it influences the shape of the clusters. For points $x = (x_1, \dots, x_d)$ and $y = (y_1, \dots, y_d)$, common choices include: 1. The Euclidean distance (also called the 2-norm distance), given by $d(x, y) = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2}$. 2. The Manhattan distance (also called the taxicab norm or 1-norm), given by $d(x, y) = \sum_{i=1}^{d} |x_i - y_i|$.
  • 5. 3. The maximum norm, given by $d(x, y) = \max_{1 \le i \le d} |x_i - y_i|$. 4. The Mahalanobis distance, which corrects data for different scales and correlations in the variables. 5. Inner product space: the angle between two vectors can be used as a distance measure when clustering high-dimensional data. 6. The Hamming distance, which measures the minimum number of substitutions required to change one member into another of the same length (a special case of edit distance).
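The coordinate-wise measures listed on slides 4 and 5 can be sketched in a few lines of Python. This is only an illustration: the function names and the sample points are assumptions made for the example, not part of the original deck.

```python
import math

def euclidean(x, y):
    # 2-norm: square root of the summed squared coordinate differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # 1-norm: sum of absolute coordinate differences
    return sum(abs(a - b) for a, b in zip(x, y))

def maximum_norm(x, y):
    # largest absolute coordinate difference
    return max(abs(a - b) for a, b in zip(x, y))

def hamming(x, y):
    # number of positions at which two equal-length sequences differ
    if len(x) != len(y):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(a != b for a, b in zip(x, y))

p, q = (1.0, 1.0), (4.0, 5.0)          # hypothetical sample points
print(euclidean(p, q))                  # 5.0
print(manhattan(p, q))                  # 7.0
print(maximum_norm(p, q))               # 4.0
print(hamming("karolin", "kathrin"))    # 3
```

The same two points give three different distances, which is why the choice of measure changes the shape of the resulting clusters.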
  • 6. K-MEANS CLUSTERING: The k-means algorithm clusters n objects, based on their attributes, into k partitions, where k < n. It is similar to the expectation-maximization algorithm for mixtures of Gaussians in that both attempt to find the centers of natural clusters in the data. It assumes that the object attributes form a vector space.
  • 7. It is an algorithm for partitioning (or clustering) N data points into K disjoint subsets $S_j$ so as to minimize the sum-of-squares criterion $J = \sum_{j=1}^{K} \sum_{n \in S_j} \lVert x_n - \mu_j \rVert^2$, where $x_n$ is a vector representing the nth data point and $\mu_j$ is the geometric centroid of the data points in $S_j$.
  • 8. Simply put, k-means clustering is an algorithm that classifies or groups objects, based on their attributes or features, into K groups, where K is a positive integer. The grouping is done by minimizing the sum of squared distances between the data points and their corresponding cluster centroids.
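To make the sum-of-squares criterion concrete, here is a minimal Python sketch that evaluates it for a given partition. The helper names and the four sample points are hypothetical and chosen only for illustration.

```python
def centroid(points):
    # coordinate-wise mean of a non-empty list of points
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))

def sum_of_squares(clusters):
    # total squared distance of every point to its own cluster centroid
    total = 0.0
    for pts in clusters:
        mu = centroid(pts)
        total += sum(sum((a - b) ** 2 for a, b in zip(p, mu)) for p in pts)
    return total

# Two candidate partitions of the same four hypothetical points.
tight = [[(0, 0), (0, 2)], [(5, 5), (7, 5)]]
loose = [[(0, 0), (5, 5)], [(0, 2), (7, 5)]]
print(sum_of_squares(tight))  # 4.0
print(sum_of_squares(loose))  # 54.0
```

K-means prefers the first partition because it has the smaller criterion value.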
  • 9. How does the k-means clustering algorithm work?
  • 10. Step 1: Begin with a decision on the value of k, the number of clusters. Step 2: Choose any initial partition that classifies the data into k clusters. You may assign the training samples randomly, or systematically as follows: 1. Take the first k training samples as single-element clusters. 2. Assign each of the remaining (N-k) training samples to the cluster with the nearest centroid. After each assignment, recompute the centroid of the gaining cluster.
  • 11. Step 3: Take each sample in sequence and compute its distance from the centroid of each of the clusters. If a sample is not currently in the cluster with the closest centroid, switch it to that cluster and update the centroids of both the cluster gaining the sample and the cluster losing it. Step 4: Repeat step 3 until convergence is achieved, that is, until a pass through the training samples causes no new assignments.
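Steps 1 to 4 can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the deck's own implementation: the function names, the `max_passes` safety limit, the rule that a sample never leaves a single-element cluster (to avoid empty clusters), and the sample data are all hypothetical.

```python
import math

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean(points):
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def nearest(p, centroids):
    # index of the closest centroid (ties go to the lower index)
    return min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))

def sequential_kmeans(samples, k, max_passes=100):
    # Step 2: the first k samples start as single-element clusters ...
    members = [[i] for i in range(k)]
    centroids = [mean([samples[i]]) for i in range(k)]
    labels = list(range(k)) + [None] * (len(samples) - k)
    # ... and each remaining sample joins the nearest cluster,
    # whose centroid is recomputed immediately.
    for i in range(k, len(samples)):
        j = nearest(samples[i], centroids)
        members[j].append(i)
        labels[i] = j
        centroids[j] = mean([samples[m] for m in members[j]])
    # Steps 3 and 4: reassign samples one at a time until a full pass
    # causes no change.
    for _ in range(max_passes):
        moved = False
        for i, p in enumerate(samples):
            old, new = labels[i], nearest(p, centroids)
            # Assumption: never empty a cluster by moving its last member.
            if new != old and len(members[old]) > 1:
                members[old].remove(i)
                members[new].append(i)
                labels[i] = new
                for j in (old, new):
                    centroids[j] = mean([samples[m] for m in members[j]])
                moved = True
        if not moved:
            break
    return labels, centroids

samples = [(0, 0), (1, 0), (9, 9), (10, 9), (0, 1)]  # hypothetical data
labels, centroids = sequential_kmeans(samples, k=2)
print(labels)  # [0, 0, 1, 1, 0]
```

Note that sample 1 starts as its own cluster and is later moved to cluster 0, which shows the switch-and-update behaviour of step 3.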
  • 12. A simple example showing the implementation of the k-means algorithm (using K=2)
  • 13. Step 1: Initialization: We randomly choose the following two centroids (k=2) for the two clusters. In this case the two centroids are m1=(1.0,1.0) and m2=(5.0,7.0).
  • 14. Step 2: Thus, we obtain two clusters containing {1,2,3} and {4,5,6,7}. Their new centroids are:
  • 15. Step 3: Now, using these centroids, we compute the Euclidean distance of each object, as shown in the table. Therefore, the new clusters are {1,2} and {3,4,5,6,7}. The next centroids are m1=(1.25,1.5) and m2=(3.9,5.1).
  • 16. Step 4: The clusters obtained are {1,2} and {3,4,5,6,7}. Therefore, there is no change in the clusters. Thus, the algorithm comes to a halt here, and the final result consists of 2 clusters: {1,2} and {3,4,5,6,7}.
  • 17. PLOT
  • 19. PLOT
  • 20. Real-Life Numerical Example of K-Means Clustering: We have 4 medicines as our training data points, and each medicine has 2 attributes. Each attribute represents a coordinate of the object. We have to determine which medicines belong to cluster 1 and which belong to the other cluster.
      Object     | Attribute 1 (X): weight index | Attribute 2 (Y): pH
      Medicine A | 1                             | 1
      Medicine B | 2                             | 1
      Medicine C | 4                             | 3
      Medicine D | 5                             | 4
  • 21. Step 1: Initial value of centroids: Suppose we use medicine A and medicine B as the first centroids. Let c1 and c2 denote the coordinates of the centroids; then c1=(1,1) and c2=(2,1).
  • 22. Objects-centroids distance: We calculate the distance from each cluster centroid to each object. Using the Euclidean distance, the distance matrix at iteration 0 is $D^0 = \begin{pmatrix} 0 & 1 & 3.61 & 5 \\ 1 & 0 & 2.83 & 4.24 \end{pmatrix}$. Each column in the distance matrix corresponds to an object (A, B, C, D). The first row holds the distance of each object to the first centroid, and the second row holds the distance of each object to the second centroid. For example, the distance from medicine C = (4, 3) to the first centroid is $\sqrt{(4-1)^2 + (3-1)^2} = 3.61$, and its distance to the second centroid is $\sqrt{(4-2)^2 + (3-1)^2} = 2.83$.
  • 23. Step 2: Objects clustering: We assign each object to the group with the minimum distance. Medicine A is assigned to group 1, and medicines B, C, and D are assigned to group 2. An element of the group matrix is 1 if and only if the object is assigned to that group: $G^0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{pmatrix}$.
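  • 24. Iteration 1, objects-centroids distances: The next step is to compute the distance of all objects to the new centroids. Group 1 contains only medicine A, so c1=(1,1); group 2 contains medicines B, C, and D, so c2=((2+4+5)/3, (1+3+4)/3)=(3.67, 2.67). As before, the distance matrix at iteration 1 is $D^1 = \begin{pmatrix} 0 & 1 & 3.61 & 5 \\ 3.14 & 2.36 & 0.47 & 1.89 \end{pmatrix}$.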
  • 25. Iteration 1, objects clustering: Based on the new distance matrix, we move medicine B to group 1 while all the other objects remain where they are. The group matrix is $G^1 = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}$. Iteration 2, determine centroids: Now we recalculate the centroid coordinates based on the clustering of the previous iteration. Group 1 and group 2 both have two members, thus the new centroids are c1=((1+2)/2, (1+1)/2)=(1.5, 1) and c2=((4+5)/2, (3+4)/2)=(4.5, 3.5).
  • 26. Iteration 2, objects-centroids distances: Repeating the distance calculation, the distance matrix at iteration 2 is $D^2 = \begin{pmatrix} 0.5 & 0.5 & 3.20 & 4.61 \\ 4.30 & 3.54 & 0.71 & 0.71 \end{pmatrix}$.
  • 27. Iteration 2, objects clustering: Again, we assign each object based on the minimum distance. We obtain the same group matrix as in iteration 1. Comparing the grouping of the last iteration with this one reveals that the objects no longer change groups. Thus, the k-means computation has reached stability and no more iterations are needed.
  • 28. We get the final grouping as the result:
      Object     | Feature 1 (X): weight index | Feature 2 (Y): pH | Group (result)
      Medicine A | 1                           | 1                 | 1
      Medicine B | 2                           | 1                 | 1
      Medicine C | 4                           | 3                 | 2
      Medicine D | 5                           | 4                 | 2
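The medicine example can be replayed with a short Python sketch. It uses the four data points and the two initial centroids from the slides; the function and variable names are assumptions, groups are numbered from 0 rather than 1, and the code does not handle a group that becomes empty.

```python
import math

# Data from the example: (weight index, pH) for medicines A, B, C, D.
points = [(1, 1), (2, 1), (4, 3), (5, 4)]

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def kmeans(points, centroids):
    groups, history = None, []
    while True:
        # One row per centroid, one column per object, rounded for display.
        D = [[round(dist(p, c), 2) for p in points] for c in centroids]
        new_groups = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        history.append((D, new_groups))
        if new_groups == groups:      # no object changed group: stable
            return groups, centroids, history
        groups = new_groups
        # Recompute each centroid as the mean of its members.
        centroids = [
            tuple(
                sum(p[d] for p, g in zip(points, groups) if g == j) / groups.count(j)
                for d in range(2)
            )
            for j in range(len(centroids))
        ]

groups, centroids, history = kmeans(points, [(1, 1), (2, 1)])
for step, (D, g) in enumerate(history):
    print("iteration", step, D, g)
print(groups)     # [0, 0, 1, 1]
print(centroids)  # [(1.5, 1.0), (4.5, 3.5)]
```

The three printed iterations correspond to the distance matrices at iterations 0, 1, and 2, and the final grouping puts medicines A and B together and medicines C and D together, as in the table.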
  • 29. K-Means Clustering Visual Basic Code
      Sub kMeanCluster(Data() As Variant, numCluster As Integer)
          ' Main function to cluster data into k clusters.
          ' Input:
          '   + Data matrix (0 To 2, 1 To totalData);
          '     row 0 = cluster, 1 = X, 2 = Y; data in columns
          '   + numCluster: number of clusters the user wants
          '   + private (module-level) variables: Centroid, totalData
          ' Output:
          '   o) updated centroids
          '   o) cluster number assigned to the data (= row 0 of Data)
          Dim i As Integer
          Dim j As Integer
          Dim X As Single
          Dim Y As Single
          Dim min As Single
          Dim cluster As Integer
          Dim d As Single
          Dim sumXY()
          Dim isStillMoving As Boolean

          isStillMoving = True
          If totalData <= numCluster Then
              ' Only the last data point is handled here because the
              ' routine is designed to be interactive.
              Data(0, totalData) = totalData            ' cluster number = total data
              Centroid(1, totalData) = Data(1, totalData)   ' X
              Centroid(2, totalData) = Data(2, totalData)   ' Y
          Else
              ' Calculate the minimum distance to assign the new data point.
              min = 10 ^ 10   ' big number
              X = Data(1, totalData)
              Y = Data(2, totalData)
              For i = 1 To numCluster
                  ' The slide breaks off here; this loop body is assumed to
                  ' mirror the assignment loop on slide 30.
                  d = dist(X, Y, Centroid(1, i), Centroid(2, i))
                  If d < min Then
                      min = d
                      cluster = i
                  End If
              Next i
              Data(0, totalData) = cluster
  • 30.
              Do While isStillMoving
                  ' Repeat until no data point changes cluster.
                  ' Calculate new centroids: 1 = X, 2 = Y, 3 = count of data points.
                  ReDim sumXY(1 To 3, 1 To numCluster)
                  For i = 1 To totalData
                      sumXY(1, Data(0, i)) = Data(1, i) + sumXY(1, Data(0, i))
                      sumXY(2, Data(0, i)) = Data(2, i) + sumXY(2, Data(0, i))
                      sumXY(3, Data(0, i)) = 1 + sumXY(3, Data(0, i))
                  Next i
                  For i = 1 To numCluster
                      ' Skip empty clusters to avoid division by zero.
                      If sumXY(3, i) > 0 Then
                          Centroid(1, i) = sumXY(1, i) / sumXY(3, i)
                          Centroid(2, i) = sumXY(2, i) / sumXY(3, i)
                      End If
                  Next i
                  ' Assign all data points to the new centroids.
                  isStillMoving = False
                  For i = 1 To totalData
                      min = 10 ^ 10   ' big number
                      X = Data(1, i)
                      Y = Data(2, i)
                      For j = 1 To numCluster
                          d = dist(X, Y, Centroid(1, j), Centroid(2, j))
                          If d < min Then
                              min = d
                              cluster = j
                          End If
                      Next j
                      If Data(0, i) <> cluster Then
                          Data(0, i) = cluster
                          isStillMoving = True
                      End If
                  Next i
              Loop
          End If
      End Sub

      ' dist is not shown on the slides; Euclidean distance is assumed here.
      Function dist(ByVal X1 As Single, ByVal Y1 As Single, ByVal X2 As Single, ByVal Y2 As Single) As Single
          dist = Sqr((X1 - X2) ^ 2 + (Y1 - Y2) ^ 2)
      End Function
  • 31. Weaknesses of K-Means Clustering: 1. When the number of data points is small, the initial grouping will determine the clusters significantly. 2. The number of clusters, K, must be determined beforehand. The algorithm also does not yield the same result with each run, since the resulting clusters depend on the initial random assignments. 3. We never know the real clusters: with the same data, a different input order may produce different clusters if the number of data points is small. 4. It is sensitive to the initial condition. Different initial conditions may produce different clusterings, and the algorithm may be trapped in a local optimum.
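The sensitivity to the initial condition can be seen on a tiny example. The Python sketch below runs a plain batch k-means from two different starting centroids on four hypothetical points at the corners of a 4-by-1 rectangle; the data, the function name, and both sets of starting centroids are assumptions made for illustration. `math.dist` requires Python 3.8 or later.

```python
import math

def kmeans(points, centroids, max_iter=100):
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:               # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    sse = sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return labels, centroids, sse

corners = [(0, 0), (0, 1), (4, 0), (4, 1)]   # hypothetical data

good = kmeans(corners, [(0, 0.5), (4, 0.5)])  # start near the left and right pairs
bad = kmeans(corners, [(2, 0), (2, 1)])       # start between the pairs

print(good[0], good[2])  # [0, 0, 1, 1] 1.0
print(bad[0], bad[2])    # [0, 1, 0, 1] 16.0
```

Both runs converge, but the second one stops at a grouping whose sum of squares is sixteen times larger, which is an instance of being trapped in a local optimum. A common mitigation is to run the algorithm from several starting points and keep the result with the smallest sum of squares.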
  • 32. Applications of K-Means Clustering: It is relatively efficient and fast: its running time is O(tkn), where n is the number of objects or points, k is the number of clusters, and t is the number of iterations. K-means clustering can be applied to machine learning or data mining. It is used on acoustic data in speech understanding to convert waveforms into one of k categories (known as vector quantization), and it is also used for image segmentation. It is further used for choosing color palettes on old-fashioned graphical display devices (image quantization).
  • 33. CONCLUSION: The k-means algorithm is useful for undirected knowledge discovery and is relatively simple. K-means has found widespread use in many fields, including unsupervised learning of neural networks, pattern recognition, classification analysis, artificial intelligence, image processing, and machine vision.
  • 34. References
      - Tutorial with introduction of clustering algorithms (k-means, fuzzy c-means, hierarchical, mixture of Gaussians) and some interactive demos (Java applets).
      - B. Chanda and D. Dutta Majumdar. "Digital Image Processing and Analysis".
      - H. Zha, C. Ding, M. Gu, X. He and H. D. Simon. "Spectral Relaxation for K-means Clustering", Neural Information Processing Systems vol. 14 (NIPS 2001), pp. 1057-1064, Vancouver, Canada, Dec. 2001.
      - J. A. Hartigan (1975). "Clustering Algorithms". Wiley.
      - J. A. Hartigan and M. A. Wong (1979). "A K-Means Clustering Algorithm", Applied Statistics, Vol. 28, No. 1, pp. 100-108.
      - D. Arthur and S. Vassilvitskii (2006). "How Slow is the k-means Method?"
      - D. Arthur and S. Vassilvitskii (2007). "k-means++: The Advantages of Careful Seeding", Symposium on Discrete Algorithms (SODA).
      - www.wikipedia.com