Association Analysis
Compiled by: Kamal Acharya
Association Analysis
• Association rule analysis is a technique for uncovering (mining) how
items are associated with each other.
• Such uncovered associations between items are called association
rules.
• When to mine association rules?
– Scenario:
• You are a sales manager.
• A customer bought a PC and a digital camera recently.
• What should you recommend to her next?
• Association rules are helpful in making your recommendation.
Contd…
• Frequent patterns(item sets):
– Frequent patterns are patterns that appear frequently in a data
set.
– E.g.,
• In a transaction data set, {milk, bread} is a frequent itemset;
• In a shopping-history database, "first buy a PC, then a digital camera,
and then a memory card" is another example of a frequent pattern.
– Finding frequent patterns plays an essential role in mining
association rules.
Frequent pattern mining
• Frequent pattern mining searches for recurring relationships in a given
data set.
• Frequent pattern mining is used to discover interesting associations
between itemsets.
• Such associations can be applicable in many business decision making
processes such as:
– Catalog design
– Basket data analysis
– cross-marketing,
– sale campaign analysis,
– Web log (click stream) analysis, etc
Market Basket analysis
A typical example of frequent pattern(item set) mining for association rules.
• Market basket analysis analyzes customer buying habits by finding
associations between the different items that customers place in their shopping
baskets.
Applications: designing marketing strategies
Example of an association rule:
milk ⇒ bread
Definition: Frequent Itemset
• Itemset
– A collection of one or more items
• Example: {Milk, Bread, Diaper}
– k-itemset
• An itemset that contains k items
• Support count (σ)
– Frequency of occurrence of an itemset
– E.g. σ({Milk, Bread, Diaper}) = 2
• Support
– Fraction of transactions that contain an
itemset
– E.g. s({Milk, Bread, Diaper}) = 2/5
• Frequent Itemset
– An itemset whose support is greater than or
equal to a minsup threshold
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
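The two definitions above can be checked directly against the table. A minimal Python sketch (the transaction list is transcribed from the table; function names are illustrative):

```python
# The five transactions from the table above.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset, transactions):
    """sigma(X): number of transactions containing every item of X."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """s(X): fraction of transactions containing X."""
    return support_count(itemset, transactions) / len(transactions)

print(support_count({"Milk", "Bread", "Diaper"}, transactions))  # 2
print(support({"Milk", "Bread", "Diaper"}, transactions))        # 0.4
```

With a minsup threshold of, say, 0.4, {Milk, Bread, Diaper} is therefore a frequent itemset.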
Definition: Association Rule
Association Rule:
An implication expression of the form X ⇒ Y, where X and Y are itemsets,
e.g., {Milk, Diaper} ⇒ {Beer}
• It is not very difficult to develop algorithms that find such
associations in a large database.
• The problem is that such an algorithm will also uncover many other
associations that are of very little value.
It is therefore necessary to introduce measures that distinguish
interesting associations from uninteresting ones.
Definition: Association Rule
Example: {Milk, Diaper} ⇒ {Beer}

s = σ(Milk, Diaper, Beer) / |T| = 2/5 = 0.4

c = σ(Milk, Diaper, Beer) / σ(Milk, Diaper) = 2/3 ≈ 0.67

Rule Evaluation Metrics
• Support (s)
– Fraction of transactions that contain both X and Y
– Alternatively, support s is the probability that a transaction
contains X ∪ Y
• Confidence (c)
– Measures how often items in Y appear in transactions that contain X
– Alternatively, confidence c is the conditional probability that a
transaction containing X also contains Y
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
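Both metrics for the rule above can be reproduced in a few lines (a sketch reusing the same five transactions; `sigma` is an illustrative name for the support count):

```python
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def sigma(itemset):
    """Support count: transactions containing every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)

X, Y = {"Milk", "Diaper"}, {"Beer"}
s = sigma(X | Y) / len(transactions)   # support of X => Y
c = sigma(X | Y) / sigma(X)            # confidence of X => Y
print(round(s, 2), round(c, 2))        # 0.4 0.67
```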
TID date items_bought
100 10/10/99 {F,A,D,B}
200 15/10/99 {D,A,C,E,B}
300 19/10/99 {C,A,B,E}
400 20/10/99 {B,A,D}
Example1
• What is the support and confidence of the rule {B, D} ⇒ {A}?
• Support:
– percentage of tuples that contain {A, B, D} = 3/4 = 75%
• Confidence:
– (number of tuples that contain {A, B, D}) / (number of tuples that
contain {B, D}) = 3/3 = 100%
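The same computation in code (a sketch; the four transactions are transcribed from the table above, with the date column dropped since it plays no role here):

```python
transactions = [
    {"F", "A", "D", "B"},
    {"D", "A", "C", "E", "B"},
    {"C", "A", "B", "E"},
    {"B", "A", "D"},
]

def sigma(itemset):
    """Number of tuples containing every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)

support = sigma({"A", "B", "D"}) / len(transactions)      # 3/4
confidence = sigma({"A", "B", "D"}) / sigma({"B", "D"})   # 3/3
print(support, confidence)  # 0.75 1.0
```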
Example2
• Given data set D is:
• What is the support and confidence of the rule:
– A ⇒ C
– C ⇒ A
• For rule A ⇒ C:
– Support = 50% and
– Confidence = 66.6%
• For rule C ⇒ A:
– Support = 50% and
– Confidence = 100%
Transaction ID Items Bought
2000 A,B,C
1000 A,C
4000 A,D
5000 B,E,F
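These four numbers can be verified in code (a sketch over the four transactions in the table; note that both rules share the same support, σ(A, C)/4, but differ in confidence):

```python
transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]

def sigma(itemset):
    """Support count over the data set D."""
    return sum(1 for t in transactions if itemset <= t)

n = len(transactions)
# A => C: support 2/4 = 50%, confidence 2/3 = 66.6%
print(sigma({"A", "C"}) / n, sigma({"A", "C"}) / sigma({"A"}))
# C => A: support 2/4 = 50%, confidence 2/2 = 100%
print(sigma({"A", "C"}) / n, sigma({"A", "C"}) / sigma({"C"}))
```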
Association Rule Mining Task
• Given a set of transactions T, the goal of association rule mining
is to find all rules having
– support ≥ minsup threshold
– confidence ≥ minconf threshold
• Brute-force approach:
– List all possible association rules
– Compute the support and confidence for each rule
– Prune rules that fail the minsup and minconf thresholds
⇒ Computationally prohibitive!
Mining Association Rules
Example of Rules:
{Milk, Diaper} ⇒ {Beer} (s=0.4, c=0.67)
{Milk, Beer} ⇒ {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} ⇒ {Milk} (s=0.4, c=0.67)
{Beer} ⇒ {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} ⇒ {Milk, Beer} (s=0.4, c=0.5)
{Milk} ⇒ {Diaper, Beer} (s=0.4, c=0.5)
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
Observations:
• All the above rules are binary partitions of the same itemset:
{Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support but
can have different confidence
• Thus, we may decouple the support and confidence requirements
Mining Association Rules
• Two-step approach:
1. Frequent Itemset Generation
– Generate all itemsets whose support ≥ minsup
2. Rule Generation
– Generate high confidence rules from each frequent itemset,
where each rule is a binary partitioning of a frequent
itemset
• Frequent itemset generation is still computationally
expensive!
Frequent Itemset Generation
[Itemset lattice for five items A–E: from the null set at the top,
through all 1-itemsets (A … E), 2-itemsets (AB … DE), 3-itemsets, and
4-itemsets, down to ABCDE at the bottom.]
Given d items, there are 2^d possible candidate itemsets.
Reducing Number of Candidates
• Apriori principle:
– If an itemset is frequent, then all of its subsets must also be
frequent
– E.g., if {beer, diaper, nuts} is frequent, so is {beer, diaper}
• The Apriori principle holds due to the following anti-monotone property
of the support measure:
– The support of an itemset never exceeds the support of its subsets:
∀X, Y: (X ⊆ Y) ⇒ s(X) ≥ s(Y)
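The anti-monotone property can be checked exhaustively for a small itemset (a sketch over the five example transactions; the choice of Y is illustrative):

```python
from itertools import combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def s(itemset):
    """Support: fraction of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

# Every proper nonempty subset X of Y must have s(X) >= s(Y).
Y = frozenset({"Diaper", "Beer", "Coke"})
for r in range(1, len(Y)):
    for X in map(frozenset, combinations(Y, r)):
        assert s(X) >= s(Y), (X, Y)
print("anti-monotone property holds for all subsets of", set(Y))
```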
Example
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
s(Bread) > s(Bread, Beer)
s(Milk) > s(Bread, Milk)
s(Diaper, Beer) > s(Diaper, Beer, Coke)
Contd…
• How is the Apriori property used in the algorithm?
– Apriori pruning principle: if any itemset is infrequent, its supersets
need not be generated/tested!
[Itemset lattice for items A–E: once an itemset is found to be
infrequent, all of its supersets are pruned from the search space.]
Illustrating the Apriori Pruning Principle
Item Count
Bread 4
Coke 2
Milk 4
Beer 3
Diaper 4
Eggs 1
Itemset Count
{Bread,Milk} 3
{Bread,Beer} 2
{Bread,Diaper} 3
{Milk,Beer} 2
{Milk,Diaper} 3
{Beer,Diaper} 3
Items (1-itemsets)
Pairs (2-itemsets)
(No need to generate
candidates involving Coke
or Eggs)
Minimum Support = 3
The Apriori Algorithm (the general idea)
1. Initially, scan the DB once to get the candidate 1-itemsets C1, find the
frequent 1-itemsets from C1, and put them into L1 (k = 1)
2. Use Lk to generate a collection of candidate itemsets Ck+1 of size
(k + 1)
3. Scan the database to find which itemsets in Ck+1 are frequent and put
them into Lk+1
4. If Lk+1 is not empty (i.e., terminate when no frequent or candidate set
can be generated):
k = k + 1
GOTO 2
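The four steps above can be sketched as a short level-wise implementation (a minimal Python sketch; function and variable names are my own):

```python
from itertools import combinations

def apriori(transactions, min_sup):
    """Return {frozenset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def frequent_of(candidates):
        # Step 3: one scan of the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in counts:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_sup}

    # Step 1: C1 and L1.
    L = frequent_of({frozenset([i]) for t in transactions for i in t})
    result = dict(L)
    k = 1
    while L:
        # Step 2: join Lk with itself; keep only candidates all of whose
        # k-subsets are frequent (Apriori pruning).
        prev = list(L)
        candidates = set()
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                if len(union) == k + 1 and all(
                    frozenset(sub) in L for sub in combinations(union, k)
                ):
                    candidates.add(union)
        L = frequent_of(candidates)
        result.update(L)
        k += 1                       # Step 4: repeat until Lk+1 is empty.
    return result

# The 4-transaction example used later in the slides, with min_sup = 2:
db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
freq = apriori(db, 2)
print(freq[frozenset({"B", "C", "E"})])  # 2
```

Note how the candidate {A, B} is generated but discarded at the counting step, and {A, B, C} is never even counted because its subset {A, B} is not in L2.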
Generating association rules from frequent itemsets
• Strong association rules (rules that satisfy both minimum support and
minimum confidence) are generated from the frequent itemsets as follows:
• Because the rules are generated from frequent itemsets, each one
automatically satisfies minimum support.
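Rule generation can be sketched as below (names are illustrative; confidence is computed from the stored support counts, so no further database scan is needed):

```python
from itertools import combinations

def generate_rules(freq, min_conf):
    """freq maps frozenset -> support count, including all frequent subsets."""
    rules = []
    for itemset, sup in freq.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = sup / freq[lhs]          # sigma(X u Y) / sigma(X)
                if conf >= min_conf:
                    rules.append((lhs, itemset - lhs, conf))
    return rules

# Support counts of the frequent itemsets from a small 4-transaction example.
freq = {
    frozenset({"B"}): 3, frozenset({"C"}): 3, frozenset({"E"}): 3,
    frozenset({"B", "C"}): 2, frozenset({"B", "E"}): 3,
    frozenset({"C", "E"}): 2, frozenset({"B", "C", "E"}): 2,
}
for lhs, rhs, conf in generate_rules(freq, 0.7):
    print(set(lhs), "=>", set(rhs), round(conf, 2))
```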
The Apriori Algorithm — Example1
• Consider the following transactions for association rule analysis:
• Use minimum support (min_sup) = 2 (2/9 ≈ 22%) and
• minimum confidence = 70%
Contd…
• Step1: Frequent Itemset Generation:
Contd…
• Step 2: Generating association rules: The data contain the frequent
itemset X = {I1, I2, I5}. What association rules can be generated
from X?
• The nonempty proper subsets of X are {I1, I2}, {I1, I5}, {I2, I5},
{I1}, {I2}, and {I5}. The resulting association rules are as shown below,
each listed with its confidence:
• Here, the minimum confidence threshold is 70%, so only the second, third,
and last rules are output, because these are the only ones that are strong.
The Apriori Algorithm — Example2
Database TDB (min_sup = 2):
Tid Items
10 A, C, D
20 B, C, E
30 A, B, C, E
40 B, E

1st scan → C1: {A}:2, {B}:3, {C}:3, {D}:1, {E}:3
L1: {A}:2, {B}:3, {C}:3, {E}:3

C2: {A,B}, {A,C}, {A,E}, {B,C}, {B,E}, {C,E}
2nd scan → C2 counts: {A,B}:1, {A,C}:2, {A,E}:1, {B,C}:2, {B,E}:3, {C,E}:2
L2: {A,C}:2, {B,C}:2, {B,E}:3, {C,E}:2

C3: {B,C,E}
3rd scan → L3: {B,C,E}:2
The Apriori Algorithm — Example3
Database D (min_sup = 2 = 50%):
TID Items
100 1, 3, 4
200 2, 3, 5
300 1, 2, 3, 5
400 2, 5

Scan D → C1: {1}:2, {2}:3, {3}:3, {4}:1, {5}:3
L1: {1}:2, {2}:3, {3}:3, {5}:3

C2: {1,2}, {1,3}, {1,5}, {2,3}, {2,5}, {3,5}
Scan D → C2 counts: {1,2}:1, {1,3}:2, {1,5}:1, {2,3}:2, {2,5}:3, {3,5}:2
L2: {1,3}:2, {2,3}:2, {2,5}:3, {3,5}:2

C3: {2,3,5}
Scan D → L3: {2,3,5}:2
Homework
• A database has five transactions. Let min sup = 60% and min
conf = 80%.
• Find all frequent itemsets using Apriori algorithm.
• List all the strong association rules.
Problems with the Apriori Algorithm
• It is costly to handle a huge number of candidate sets.
– For example, if there are 10^4 frequent 1-itemsets, the Apriori algorithm
will need to generate more than 10^7 candidate 2-itemsets. Moreover, to
discover a frequent 100-itemset, it must generate more than
2^100 ≈ 10^30 candidates in total.
• Candidate generation is an inherent cost of the Apriori algorithm, no
matter what implementation technique is applied.
• To mine large data sets for long patterns, this algorithm is
NOT a good idea.
• When the database is scanned to check Ck for creating Lk, a large
number of transactions are scanned even if they do not contain
any candidate k-itemset.
Frequent pattern growth (FP-growth):
• A frequent pattern growth approach mines Frequent Patterns
Without Candidate Generation.
• FP-Growth mainly involves two steps:
– Build a compact data structure called the FP-tree, and
– Then extract frequent itemsets directly from the FP-tree.
FP-tree Construction from a Transactional DB
• The FP-tree is constructed using two passes over the data set:
• Pass 1:
1. Scan the DB once and find the frequent 1-itemsets (single-item
patterns):
– Scan the DB and find the support of each item.
– Discard infrequent items.
2. Sort the frequent items in descending order of their frequency (support
count).
3. Sort the items in each transaction in descending order of their
frequency.
Use this order when building the FP-tree, so common prefixes can be
shared.
• Pass 2: Scan the DB again and construct the FP-tree:
1. FP-growth reads one transaction at a time and maps it to a path.
2. Because a fixed order is used, paths can overlap when transactions
share items.
3. Pointers are maintained between nodes containing the same item (dotted
lines).
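The two passes can be sketched as a minimal FP-tree builder (class and function names are my own; the tie order among equally frequent items is arbitrary here, so it may differ from the slides without affecting correctness):

```python
from collections import defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_fptree(transactions, min_support):
    # Pass 1: find frequent items and their support counts.
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[item] += 1
    freq = {i: n for i, n in counts.items() if n >= min_support}
    # Global rank: descending frequency (ties broken arbitrarily).
    rank = {i: r for r, i in enumerate(sorted(freq, key=lambda i: -freq[i]))}

    root = Node(None, None)
    header = defaultdict(list)  # item -> all its nodes (the dotted links)
    # Pass 2: insert each transaction as a path; shared prefixes overlap.
    for t in transactions:
        node = root
        for item in sorted((i for i in t if i in rank), key=rank.get):
            if item not in node.children:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            node = node.children[item]
            node.count += 1
    return root, header

# The transaction database used in the construction example below.
db = [list("facdgimp"), list("abcflmo"), list("bfhjo"),
      list("bcksp"), list("afcelpmn")]
root, header = build_fptree(db, 3)
print(root.children["f"].count)  # 4: f lies on four transaction paths
```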
FP-tree Construction
• Let transaction database D:
• Let min_support = 3
FP-tree Construction
• Pass 1: Find the frequent 1-itemsets and sort them in descending order of
support count (frequency):
• Then, rewrite each transaction in the data set D as a sorted list of its
frequent items:
Item frequency
f 4
c 4
a 3
b 3
m 3
p 3
TID Items bought (ordered) frequent items
100 {f, a, c, d, g, i, m, p} {f, c, a, m, p}
200 {a, b, c, f, l, m, o} {f, c, a, b, m}
300 {b, f, h, j, o} {f, b}
400 {b, c, k, s, p} {c, b, p}
500 {a, f, c, e, l, p, m, n} {f, c, a, m, p}
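Pass 1 for this data set takes only a few lines (a sketch; `Counter.most_common` orders equal counts by first appearance, so ties such as b vs. m may be broken differently than on the slide, which does not affect correctness):

```python
from collections import Counter

db = {
    100: ["f", "a", "c", "d", "g", "i", "m", "p"],
    200: ["a", "b", "c", "f", "l", "m", "o"],
    300: ["b", "f", "h", "j", "o"],
    400: ["b", "c", "k", "s", "p"],
    500: ["a", "f", "c", "e", "l", "p", "m", "n"],
}
min_sup = 3

counts = Counter(item for t in db.values() for item in t)
# Frequent items in descending order of support count.
freq = [i for i, n in counts.most_common() if n >= min_sup]
print(freq)

# Rewrite each transaction keeping only frequent items, in that order.
ordered = {tid: [i for i in freq if i in t] for tid, t in db.items()}
print(ordered[100])  # ['f', 'c', 'a', 'm', 'p']
```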
FP-tree Construction (min_support = 3)
Inserting the ordered transactions one at a time (shared prefixes are
merged into a single path, and node counts are incremented):
• After TID 100 {f, c, a, m, p}: a single path root → f:1 → c:1 → a:1 →
m:1 → p:1
• After TID 200 {f, c, a, b, m}: the prefix f, c, a is shared, giving
f:2 → c:2 → a:2 with two branches: m:1 → p:1 and b:1 → m:1
• After TID 300 {f, b}: f:3, with a new branch b:1 directly under f
• After TID 400 {c, b, p}: a new path under the root: c:1 → b:1 → p:1
• After TID 500 {f, c, a, m, p}: the whole path is shared again, giving the
final tree: root → f:4 → c:3 → a:3 → m:2 → p:2, plus the branches
b:1 → m:1 (under a), b:1 (under f), and c:1 → b:1 → p:1 (under the root)

Header Table (each entry heads a linked list of that item's nodes):
Item frequency
f 4
c 4
a 3
b 3
m 3
p 3
Benefits of the FP-tree Structure
• Completeness:
– never breaks a long pattern of any transaction
– preserves complete information for frequent pattern mining
• Compactness:
– reduces irrelevant information: infrequent items are gone
– frequency-descending ordering: more frequent items are more likely to be
shared
– the tree is never larger than the original database.
Mining Frequent Patterns Using FP-tree
• The FP-tree is mined as follows:
– Start from each frequent length-1 pattern (as an initial suffix
pattern)
– construct its conditional pattern base (a “sub-database,” which
consists of the set of prefix paths in the FP-tree co-occurring with
the suffix pattern)
– then construct its (conditional) FP-tree, and perform mining
recursively on such a tree.
– The pattern growth is achieved by the concatenation of the suffix
pattern with the frequent patterns generated from a conditional FP-
tree.
Mining Frequent Patterns Using the FP-tree (cont'd)
• Start with the last item in the order (i.e., p).
• Follow the node pointers and traverse only the paths containing p.
• Accumulate all transformed prefix paths of that item to form its
conditional pattern base.

Conditional pattern base for p: fcam:2, cb:1
(from the paths f:4 → c:3 → a:3 → m:2 → p:2 and c:1 → b:1 → p:1)

Construct a new FP-tree from this pattern base by merging all paths and
keeping only nodes that appear at least min_support times.
This leads to only one branch, c:3.
Thus we derive only one frequent pattern containing p: cp.
Mining Frequent Patterns Using the FP-tree (cont'd)
• Move to the next least frequent item in the order, i.e., m.
• Follow the node pointers and traverse only the paths containing m.
• Accumulate all transformed prefix paths of that item to form its
conditional pattern base.

m-conditional pattern base: fca:2, fcab:1
(from the paths f:4 → c:3 → a:3 → m:2 and f → c → a → b:1 → m:1)

m-conditional FP-tree: the single path {} → f:3 → c:3 → a:3
All frequent patterns that include m:
m,
fm, cm, am,
fcm, fam, cam,
fcam
Conditional Pattern-Bases for the example
EmptyEmptyf
{(f:3)}|c{(f:3)}c
{(f:3, c:3)}|a{(fc:3)}a
Empty{(fca:1), (f:1), (c:1)}b
{(f:3, c:3, a:3)}|m{(fca:2), (fcab:1)}m
{(c:3)}|p{(fcam:2), (cb:1)}p
Conditional FP-treeConditional pattern-baseItem
Why Is Frequent Pattern Growth Fast?
• Performance studies show that:
– FP-growth is an order of magnitude faster than Apriori
• Reasoning:
– No candidate generation, no candidate tests
– Uses a compact data structure
– Eliminates repeated database scans
– The basic operations are counting and FP-tree building
Handling Categorical Attributes
• So far, we have used only transaction data for mining association rules.
• The data can be in transaction form or table form
Transaction form: t1: a, b
t2: a, c, d, e
t3: a, d, f
Table form:
• Table data need to be converted to transaction form for association rule
mining
Attr1 Attr2 Attr3
a b d
b c e
Contd…
• To convert a table data set to a transaction data set simply change
each value to an attribute–value pair.
• For example:
Contd…
• Each attribute–value pair is considered an item.
• Using only the values is not sufficient in transaction form,
because different attributes may have the same values.
• For example, without including attribute names, a value a in
Attribute1 and a value a in Attribute2 would not be distinguishable.
• After the conversion, Figure (B) can be used in mining.
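The conversion itself is mechanical, as this sketch over the two-row table above shows (the `Attr=value` item naming is illustrative):

```python
rows = [
    {"Attr1": "a", "Attr2": "b", "Attr3": "d"},
    {"Attr1": "b", "Attr2": "c", "Attr3": "e"},
]
# Each attribute-value pair becomes one item, so the value b in Attr1
# and the value b in Attr2 remain distinguishable.
transactions = [
    sorted(f"{attr}={val}" for attr, val in row.items()) for row in rows
]
print(transactions[0])  # ['Attr1=a', 'Attr2=b', 'Attr3=d']
```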
Home Work
• Convert the following categorical attribute data into transaction
data set
Homework
• What is the aim of association rule mining? Why is this aim important in some
application?
• Define the concepts of support and confidence for an association rule.
• Show how the apriori algorithm works on an example dataset.
• What is the basis of the apriori algorithm? Describe the algorithm briefly.
Which step of the algorithm can become a bottleneck?
Contd…
• Show using an example how FP-tree algorithm solves the association rule
mining (ARM) problem.
• Perform ARM using FP-growth on the following data set with minimum
support = 50% and confidence = 75%
Transaction ID Items
1 Bread, Cheese, Eggs, Juice
2 Bread, Cheese, Juice
3 Bread, Milk, Yogurt
4 Bread, Juice, Milk
5 Cheese, Juice, Milk
Thank You !
Cluster Analysis
 
Classification techniques in data mining
Classification techniques in data miningClassification techniques in data mining
Classification techniques in data mining
 
Data Preprocessing
Data PreprocessingData Preprocessing
Data Preprocessing
 
Introduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data WarehousingIntroduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data Warehousing
 
Functions in Python
Functions in PythonFunctions in Python
Functions in Python
 

Recently uploaded

managing Behaviour in early childhood education.pptx
managing Behaviour in early childhood education.pptxmanaging Behaviour in early childhood education.pptx
managing Behaviour in early childhood education.pptx
nabaegha
 
How to Download & Install Module From the Odoo App Store in Odoo 17
How to Download & Install Module From the Odoo App Store in Odoo 17How to Download & Install Module From the Odoo App Store in Odoo 17
How to Download & Install Module From the Odoo App Store in Odoo 17
Celine George
 
Observational Learning
Observational Learning Observational Learning
Observational Learning
sanamushtaq922
 
Interprofessional Education Platform Introduction.pdf
Interprofessional Education Platform Introduction.pdfInterprofessional Education Platform Introduction.pdf
Interprofessional Education Platform Introduction.pdf
Ben Aldrich
 
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
Nguyen Thanh Tu Collection
 
220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science
Kalna College
 
What are the new features in the Fleet Odoo 17
What are the new features in the Fleet Odoo 17What are the new features in the Fleet Odoo 17
What are the new features in the Fleet Odoo 17
Celine George
 
Decolonizing Universal Design for Learning
Decolonizing Universal Design for LearningDecolonizing Universal Design for Learning
Decolonizing Universal Design for Learning
Frederic Fovet
 
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
ShwetaGawande8
 
The Science of Learning: implications for modern teaching
The Science of Learning: implications for modern teachingThe Science of Learning: implications for modern teaching
The Science of Learning: implications for modern teaching
Derek Wenmoth
 
Contiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptxContiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptx
Kalna College
 
Diversity Quiz Finals by Quiz Club, IIT Kanpur
Diversity Quiz Finals by Quiz Club, IIT KanpurDiversity Quiz Finals by Quiz Club, IIT Kanpur
Diversity Quiz Finals by Quiz Club, IIT Kanpur
Quiz Club IIT Kanpur
 
The basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptxThe basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptx
heathfieldcps1
 
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptxScience-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Catherine Dela Cruz
 
(T.L.E.) Agriculture: "Ornamental Plants"
(T.L.E.) Agriculture: "Ornamental Plants"(T.L.E.) Agriculture: "Ornamental Plants"
(T.L.E.) Agriculture: "Ornamental Plants"
MJDuyan
 
Get Success with the Latest UiPath UIPATH-ADPV1 Exam Dumps (V11.02) 2024
Get Success with the Latest UiPath UIPATH-ADPV1 Exam Dumps (V11.02) 2024Get Success with the Latest UiPath UIPATH-ADPV1 Exam Dumps (V11.02) 2024
Get Success with the Latest UiPath UIPATH-ADPV1 Exam Dumps (V11.02) 2024
yarusun
 
220711130100 udita Chakraborty Aims and objectives of national policy on inf...
220711130100 udita Chakraborty  Aims and objectives of national policy on inf...220711130100 udita Chakraborty  Aims and objectives of national policy on inf...
220711130100 udita Chakraborty Aims and objectives of national policy on inf...
Kalna College
 
nutrition in plants chapter 1 class 7...
nutrition in plants chapter 1 class 7...nutrition in plants chapter 1 class 7...
nutrition in plants chapter 1 class 7...
chaudharyreet2244
 
Keynote given on June 24 for MASSP at Grand Traverse City
Keynote given on June 24 for MASSP at Grand Traverse CityKeynote given on June 24 for MASSP at Grand Traverse City
Keynote given on June 24 for MASSP at Grand Traverse City
PJ Caposey
 
Diversity Quiz Prelims by Quiz Club, IIT Kanpur
Diversity Quiz Prelims by Quiz Club, IIT KanpurDiversity Quiz Prelims by Quiz Club, IIT Kanpur
Diversity Quiz Prelims by Quiz Club, IIT Kanpur
Quiz Club IIT Kanpur
 

Recently uploaded (20)

managing Behaviour in early childhood education.pptx
managing Behaviour in early childhood education.pptxmanaging Behaviour in early childhood education.pptx
managing Behaviour in early childhood education.pptx
 
How to Download & Install Module From the Odoo App Store in Odoo 17
How to Download & Install Module From the Odoo App Store in Odoo 17How to Download & Install Module From the Odoo App Store in Odoo 17
How to Download & Install Module From the Odoo App Store in Odoo 17
 
Observational Learning
Observational Learning Observational Learning
Observational Learning
 
Interprofessional Education Platform Introduction.pdf
Interprofessional Education Platform Introduction.pdfInterprofessional Education Platform Introduction.pdf
Interprofessional Education Platform Introduction.pdf
 
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
 
220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science
 
What are the new features in the Fleet Odoo 17
What are the new features in the Fleet Odoo 17What are the new features in the Fleet Odoo 17
What are the new features in the Fleet Odoo 17
 
Decolonizing Universal Design for Learning
Decolonizing Universal Design for LearningDecolonizing Universal Design for Learning
Decolonizing Universal Design for Learning
 
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
INTRODUCTION TO HOSPITALS & AND ITS ORGANIZATION
 
The Science of Learning: implications for modern teaching
The Science of Learning: implications for modern teachingThe Science of Learning: implications for modern teaching
The Science of Learning: implications for modern teaching
 
Contiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptxContiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptx
 
Diversity Quiz Finals by Quiz Club, IIT Kanpur
Diversity Quiz Finals by Quiz Club, IIT KanpurDiversity Quiz Finals by Quiz Club, IIT Kanpur
Diversity Quiz Finals by Quiz Club, IIT Kanpur
 
The basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptxThe basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptx
 
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptxScience-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
 
(T.L.E.) Agriculture: "Ornamental Plants"
(T.L.E.) Agriculture: "Ornamental Plants"(T.L.E.) Agriculture: "Ornamental Plants"
(T.L.E.) Agriculture: "Ornamental Plants"
 
Get Success with the Latest UiPath UIPATH-ADPV1 Exam Dumps (V11.02) 2024
Get Success with the Latest UiPath UIPATH-ADPV1 Exam Dumps (V11.02) 2024Get Success with the Latest UiPath UIPATH-ADPV1 Exam Dumps (V11.02) 2024
Get Success with the Latest UiPath UIPATH-ADPV1 Exam Dumps (V11.02) 2024
 
220711130100 udita Chakraborty Aims and objectives of national policy on inf...
220711130100 udita Chakraborty  Aims and objectives of national policy on inf...220711130100 udita Chakraborty  Aims and objectives of national policy on inf...
220711130100 udita Chakraborty Aims and objectives of national policy on inf...
 
nutrition in plants chapter 1 class 7...
nutrition in plants chapter 1 class 7...nutrition in plants chapter 1 class 7...
nutrition in plants chapter 1 class 7...
 
Keynote given on June 24 for MASSP at Grand Traverse City
Keynote given on June 24 for MASSP at Grand Traverse CityKeynote given on June 24 for MASSP at Grand Traverse City
Keynote given on June 24 for MASSP at Grand Traverse City
 
Diversity Quiz Prelims by Quiz Club, IIT Kanpur
Diversity Quiz Prelims by Quiz Club, IIT KanpurDiversity Quiz Prelims by Quiz Club, IIT Kanpur
Diversity Quiz Prelims by Quiz Club, IIT Kanpur
 

Association Analysis in Data Mining

Definition: Frequent Itemset

• Itemset: a collection of one or more items, e.g., {Milk, Bread, Diaper}.
  – k-itemset: an itemset that contains k items.
• Support count (σ): frequency of occurrence of an itemset, e.g., σ({Milk, Bread, Diaper}) = 2.
• Support (s): fraction of transactions that contain an itemset, e.g., s({Milk, Bread, Diaper}) = 2/5.
• Frequent itemset: an itemset whose support is greater than or equal to a minsup threshold.

  TID  Items
  1    Bread, Milk
  2    Bread, Diaper, Beer, Eggs
  3    Milk, Diaper, Beer, Coke
  4    Bread, Milk, Diaper, Beer
  5    Bread, Milk, Diaper, Coke
Definition: Association Rule

• Association rule: an implication expression of the form X → Y, where X and Y are itemsets, e.g., {Milk, Diaper} → {Beer}.
• It is not very difficult to develop algorithms that find these associations in a large database.
• The problem is that such an algorithm will also uncover many other associations that are of very little value. It is necessary to introduce some measures to distinguish interesting associations from non-interesting ones.
Definition: Association Rule (contd.)

• Rule evaluation metrics for X → Y:
  – Support (s): fraction of transactions that contain both X and Y; alternatively, the probability that a transaction contains X ∪ Y.
  – Confidence (c): measures how often items in Y appear in transactions that contain X; alternatively, the conditional probability that a transaction having X also contains Y.
• Example: {Milk, Diaper} → {Beer}, over the five transactions above:
  – s = σ(Milk, Diaper, Beer) / |T| = 2/5 = 0.4
  – c = σ(Milk, Diaper, Beer) / σ(Milk, Diaper) = 2/3 ≈ 0.67
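Both metrics can be computed directly from the transaction table. The following is a minimal Python sketch (not from the slides; function names are my own) that reproduces the numbers for {Milk, Diaper} → {Beer}:

```python
# Transaction table from the slides (TID 1-5)
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset, transactions):
    """sigma(X): number of transactions that contain every item of X."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def support(itemset, transactions):
    """s(X): fraction of transactions that contain X."""
    return support_count(itemset, transactions) / len(transactions)

def confidence(X, Y, transactions):
    """c(X -> Y) = sigma(X union Y) / sigma(X)."""
    return support_count(set(X) | set(Y), transactions) / support_count(X, transactions)

print(support({"Milk", "Diaper", "Beer"}, transactions))       # 0.4
print(confidence({"Milk", "Diaper"}, {"Beer"}, transactions))  # 0.666...
```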
Example 1

  TID  Date      Items bought
  100  10/10/99  {F, A, D, B}
  200  15/10/99  {D, A, C, E, B}
  300  19/10/99  {C, A, B, E}
  400  20/10/99  {B, A, D}

• What are the support and confidence of the rule {B, D} → {A}?
  – Support: percentage of tuples that contain {A, B, D} = 3/4 = 75%
  – Confidence: (number of tuples that contain {A, B, D}) / (number of tuples that contain {B, D}) = 3/3 = 100%
Example 2

• Given data set D:

  Transaction ID  Items bought
  2000            A, B, C
  1000            A, C
  4000            A, D
  5000            B, E, F

• What are the support and confidence of the rules A → C and C → A?
• For rule A → C: support = 2/4 = 50%, confidence = 2/3 ≈ 66.6%
• For rule C → A: support = 2/4 = 50%, confidence = 2/2 = 100%
Association Rule Mining Task

• Given a set of transactions T, the goal of association rule mining is to find all rules having
  – support ≥ minsup threshold
  – confidence ≥ minconf threshold
• Brute-force approach:
  – List all possible association rules
  – Compute the support and confidence for each rule
  – Prune rules that fail the minsup and minconf thresholds
  ⇒ Computationally prohibitive!
Mining Association Rules

• Example rules (from the five-transaction data set above):
  – {Milk, Diaper} → {Beer}  (s = 0.4, c = 0.67)
  – {Milk, Beer} → {Diaper}  (s = 0.4, c = 1.0)
  – {Diaper, Beer} → {Milk}  (s = 0.4, c = 0.67)
  – {Beer} → {Milk, Diaper}  (s = 0.4, c = 0.67)
  – {Diaper} → {Milk, Beer}  (s = 0.4, c = 0.5)
  – {Milk} → {Diaper, Beer}  (s = 0.4, c = 0.5)
• Observations:
  – All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}.
  – Rules originating from the same itemset have identical support but can have different confidence.
  – Thus, we may decouple the support and confidence requirements.
Mining Association Rules (contd.)

• Two-step approach:
  1. Frequent itemset generation: generate all itemsets whose support ≥ minsup.
  2. Rule generation: generate high-confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset.
• Frequent itemset generation is still computationally expensive!
Frequent Itemset Generation

• The candidate itemsets form a lattice: the null set at the top; the 1-itemsets A, B, C, D, E; then all 2-itemsets AB, …, DE; and so on down to ABCDE.
• Given d items, there are 2^d possible candidate itemsets.
Reducing the Number of Candidates

• Apriori principle: if an itemset is frequent, then all of its subsets must also be frequent.
  – E.g., if {beer, diaper, nuts} is frequent, so is {beer, diaper}.
• The Apriori principle holds due to the following property of the support measure: the support of an itemset never exceeds the support of its subsets:
  ∀ X, Y: (X ⊆ Y) ⇒ s(X) ≥ s(Y)
Example

  TID  Items
  1    Bread, Milk
  2    Bread, Diaper, Beer, Eggs
  3    Milk, Diaper, Beer, Coke
  4    Bread, Milk, Diaper, Beer
  5    Bread, Milk, Diaper, Coke

• s(Bread) > s(Bread, Beer)
• s(Milk) > s(Bread, Milk)
• s(Diaper, Beer) > s(Diaper, Beer, Coke)
Contd…

• How is the Apriori property used in the algorithm?
  – Apriori pruning principle: if any itemset is infrequent, its supersets should not be generated/tested!
• In the itemset lattice, once AB is found to be infrequent, the entire subtree of its supersets (ABC, ABD, ABE, ABCD, …, ABCDE) is pruned.
Illustrating the Apriori Pruning Principle

Minimum support = 3

Items (1-itemsets):

  Item    Count
  Bread   4
  Coke    2
  Milk    4
  Beer    3
  Diaper  4
  Eggs    1

Pairs (2-itemsets) — no need to generate candidates involving Coke or Eggs:

  Itemset          Count
  {Bread, Milk}    3
  {Bread, Beer}    2
  {Bread, Diaper}  3
  {Milk, Beer}     2
  {Milk, Diaper}   3
  {Beer, Diaper}   3
The Apriori Algorithm (the general idea)

1. Initially, scan the DB once to get the candidate 1-itemset C1, find the frequent 1-itemsets from C1, and put them into L1 (k = 1).
2. Use Lk to generate a collection of candidate itemsets Ck+1 of size (k + 1).
3. Scan the database to find which itemsets in Ck+1 are frequent and put them into Lk+1.
4. If Lk+1 is not empty, set k = k + 1 and go to step 2 (i.e., terminate when no frequent or candidate set can be generated).
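The four steps above can be sketched as a short, self-contained Python function (the name `apriori` and the dict-based bookkeeping are my own choices, not from the slides; min_sup is an absolute count, as in the slides' examples):

```python
from collections import Counter
from itertools import combinations

def apriori(transactions, min_sup):
    """Return {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Step 1: one scan to count C1 and keep the frequent 1-itemsets L1
    c1 = Counter(frozenset([i]) for t in transactions for i in t)
    Lk = {s: c for s, c in c1.items() if c >= min_sup}
    frequent, k = dict(Lk), 1
    while Lk:
        # Step 2: join Lk with itself to build candidates C(k+1),
        # pruning any candidate that has an infrequent k-subset
        prev = list(Lk)
        Ck1 = set()
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                if len(union) == k + 1 and all(
                        frozenset(s) in Lk for s in combinations(union, k)):
                    Ck1.add(union)
        # Step 3: scan the DB once to count the candidates, keep L(k+1)
        counts = Counter()
        for t in transactions:
            for cand in Ck1:
                if cand <= t:
                    counts[cand] += 1
        Lk = {s: c for s, c in counts.items() if c >= min_sup}
        frequent.update(Lk)
        k += 1  # Step 4: repeat until no frequent or candidate set remains
    return frequent
```

For instance, on the four-transaction database of Example 3 further below with min_sup = 2, this yields L2 = {1,3}, {2,3}, {2,5}, {3,5} and L3 = {2,3,5}.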
Generating Association Rules from Frequent Itemsets

• Generate strong association rules from the frequent itemsets (where strong rules satisfy both minimum support and minimum confidence) as follows: for each frequent itemset l, generate every nonempty proper subset s of l, and output the rule s → (l − s) if its confidence meets the minimum confidence threshold.
• Because the rules are generated from frequent itemsets, each one automatically satisfies the minimum support.
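As a sketch of that procedure, the following hypothetical helper (not from the slides) enumerates the binary partitions of one frequent itemset and keeps the rules that meet the confidence threshold; `support_counts` is assumed to be a mapping from itemsets to support counts, e.g. the output of a prior Apriori pass:

```python
from itertools import combinations

def generate_rules(itemset, support_counts, min_conf):
    """Strong rules X -> (itemset - X) from one frequent itemset.

    confidence(X -> Y) = sigma(X union Y) / sigma(X).
    """
    itemset = frozenset(itemset)
    rules = []
    for r in range(1, len(itemset)):          # every nonempty proper subset
        for X in combinations(sorted(itemset), r):
            X = frozenset(X)
            conf = support_counts[itemset] / support_counts[X]
            if conf >= min_conf:
                rules.append((X, itemset - X, conf))
    return rules
```

For example, with assumed counts σ({I1,I2,I5}) = 2, σ({I5}) = σ({I1,I5}) = σ({I2,I5}) = 2, σ({I1}) = 6, σ({I2}) = 7, σ({I1,I2}) = 4 and min_conf = 0.7, only the three rules whose antecedent contains I5 survive, each with 100% confidence.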
The Apriori Algorithm — Example 1

• Consider the following transactions for association rule analysis (the nine-transaction table is shown on the slide).
• Use minimum support (min_sup) = 2 (2/9 ≈ 22%) and minimum confidence = 70%.
Contd…

• Step 1: Frequent itemset generation (the candidate and frequent itemset tables are shown on the slide).
Contd…

• Step 2: Generating association rules. The data contain the frequent itemset X = {I1, I2, I5}. What are the association rules that can be generated from X?
• The nonempty proper subsets of X are {I1}, {I2}, {I5}, {I1, I2}, {I1, I5}, and {I2, I5}. The resulting association rules are listed on the slide, each with its confidence.
• Here the minimum confidence threshold is 70%, so only the second, third, and last rules are output, because these are the only ones generated that are strong.
The Apriori Algorithm — An Example

min_sup = 2

Database TDB:

  TID  Items
  10   A, C, D
  20   B, C, E
  30   A, B, C, E
  40   B, E

1st scan → C1: {A}:2, {B}:3, {C}:3, {D}:1, {E}:3
L1: {A}:2, {B}:3, {C}:3, {E}:3
C2: {A,B}, {A,C}, {A,E}, {B,C}, {B,E}, {C,E}
2nd scan → C2 counts: {A,B}:1, {A,C}:2, {A,E}:1, {B,C}:2, {B,E}:3, {C,E}:2
L2: {A,C}:2, {B,C}:2, {B,E}:3, {C,E}:2
C3: {B,C,E}
3rd scan → L3: {B,C,E}:2
The Apriori Algorithm — Example 3

min_sup = 2 (50%)

Database D:

  TID  Items
  100  1, 3, 4
  200  2, 3, 5
  300  1, 2, 3, 5
  400  2, 5

Scan D → C1: {1}:2, {2}:3, {3}:3, {4}:1, {5}:3
L1: {1}:2, {2}:3, {3}:3, {5}:3
C2: {1,2}, {1,3}, {1,5}, {2,3}, {2,5}, {3,5}
Scan D → C2 counts: {1,2}:1, {1,3}:2, {1,5}:1, {2,3}:2, {2,5}:3, {3,5}:2
L2: {1,3}:2, {2,3}:2, {2,5}:3, {3,5}:2
C3: {2,3,5}
Scan D → L3: {2,3,5}:2
Homework

• A database has five transactions. Let min_sup = 60% and min_conf = 80%.
• Find all frequent itemsets using the Apriori algorithm.
• List all the strong association rules.
Problems with Apriori Algorithms

• It is costly to handle a huge number of candidate sets. For example, if there are 10^4 frequent 1-itemsets, the Apriori algorithm will need to generate more than 10^7 candidate 2-itemsets. Moreover, to discover a frequent 100-itemset it must generate more than 2^100 ≈ 10^30 candidates in total.
• Candidate generation is an inherent cost of the Apriori algorithms, no matter what implementation technique is applied.
• For mining large data sets for long patterns, this algorithm is NOT a good idea.
• When the database is scanned to check Ck for creating Lk, a large number of transactions will be scanned even if they do not contain any k-itemset.
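The candidate counts quoted above are easy to verify with a quick check (not part of the slides):

```python
from math import comb

# 10^4 frequent 1-itemsets -> about 5 x 10^7 candidate 2-itemsets,
# i.e. "more than 10^7"
n2 = comb(10**4, 2)
print(n2)               # 49995000

# a frequent 100-itemset forces ~2^100 ~ 1.27 x 10^30 candidates in total
print(2**100 > 10**30)  # True
```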
Frequent Pattern Growth (FP-growth)

• The frequent-pattern-growth approach mines frequent patterns without candidate generation.
• FP-growth mainly involves two steps:
  – Build a compact data structure called the FP-tree, and
  – then extract frequent itemsets directly from the FP-tree.
FP-tree Construction from a Transactional DB

• The FP-tree is constructed using two passes over the data set.
• Pass 1:
  1. Scan the DB once and find the support for each item; discard infrequent items.
  2. Sort the frequent items in descending order of their frequency (support count).
  3. Sort the items in each transaction in that descending order; using this fixed order when building the FP-tree lets common prefixes be shared.
• Pass 2: scan the DB again and construct the FP-tree:
  1. FP-growth reads one transaction at a time and maps it to a path.
  2. Because a fixed order is used, paths can overlap when transactions share items.
  3. Pointers (shown as dotted lines) are maintained between nodes containing the same item.
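The two passes can be sketched in Python as follows. This is one possible implementation, not the slides' own: the `Node`/`header` representation is my choice, and frequency ties are broken alphabetically here, so the exact tree shape may differ from the slides' figure (where f precedes c).

```python
from collections import Counter, defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_support):
    """Two-pass FP-tree build.

    Returns (root, header), where header maps each frequent item to the
    list of tree nodes holding it (the slides' dotted node-links).
    """
    # Pass 1: support of every item; discard infrequent items
    freq = Counter(item for t in transactions for item in t)
    freq = {i: c for i, c in freq.items() if c >= min_support}
    # Pass 2: insert each transaction with its frequent items sorted in
    # descending frequency (ties broken alphabetically -- an assumption)
    root = Node(None, None)
    header = defaultdict(list)
    for t in transactions:
        ordered = sorted((i for i in t if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)  # node-link for this item
            node = node.children[item]
            node.count += 1
    return root, header
```

On the slides' five-transaction database with min_support = 3, the node counts linked from the header table sum to f:4, c:4, a:3, b:3, m:3, p:3, matching Pass 1.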
Mining Frequent Patterns Using the FP-tree

The FP-tree is mined as follows:
• Start from each frequent length-1 pattern (as an initial suffix pattern).
• Construct its conditional pattern base (a "sub-database" consisting of the set of prefix paths in the FP-tree co-occurring with the suffix pattern).
• Then construct its (conditional) FP-tree, and perform mining recursively on that tree.
• The pattern growth is achieved by concatenating the suffix pattern with the frequent patterns generated from its conditional FP-tree.
FP-tree Construction — Example

• Let the transaction database D be as shown on the slide.
• Let min_support = 3.
FP-tree Construction — Pass 1

• Find the frequent 1-itemsets and sort this set in descending order of support count (frequency):
  f:4, c:4, a:3, b:3, m:3, p:3
• Then reorder each transaction in the transaction data set D, keeping only the frequent items:

  TID  Items bought                 (ordered) frequent items
  100  {f, a, c, d, g, i, m, p}     {f, c, a, m, p}
  200  {a, b, c, f, l, m, o}        {f, c, a, b, m}
  300  {b, f, h, j, o}              {f, b}
  400  {b, c, k, s, p}              {c, b, p}
  500  {a, f, c, e, l, p, m, n}     {f, c, a, m, p}
FP-tree Construction — Pass 2 (min_support = 3)

Insert the ordered transactions one at a time under the root, incrementing counts along shared prefixes:
• After {f, c, a, m, p}: a single path f:1 → c:1 → a:1 → m:1 → p:1.
• After {f, c, a, b, m}: the prefix f, c, a is shared (now f:2 → c:2 → a:2), then a new branch b:1 → m:1.
• After {f, b}: f becomes f:3 and gains a new child b:1.
• After {c, b, p}: a new branch from the root, c:1 → b:1 → p:1.
• After {f, c, a, m, p}: the first path's counts rise to f:4 → c:3 → a:3 → m:2 → p:2.
A header table (f:4, c:4, a:3, b:3, m:3, p:3) links all nodes carrying the same item via node-links.
Benefits of the FP-tree Structure

• Completeness:
  – Never breaks a long pattern of any transaction.
  – Preserves complete information for frequent pattern mining.
• Compactness:
  – Reduces irrelevant information: infrequent items are gone.
  – Frequency-descending ordering: more frequent items are more likely to be shared.
  – The tree is never larger than the original database.
Mining Frequent Patterns Using the FP-tree (cont'd)

• Start with the last item in the order (i.e., p).
• Follow its node-links and traverse only the paths containing p.
• Accumulate all transformed prefix paths of p to form its conditional pattern base:
  fcam:2, cb:1
• Construct a new FP-tree from this base by merging all paths and keeping only the nodes that appear min_support times. This leads to a single branch, c:3.
• Thus we derive only one frequent pattern containing p: the pattern cp.
Mining Frequent Patterns Using the FP-tree (cont'd)

• Move to the next least frequent item in the order, i.e., m.
• Follow its node-links and traverse only the paths containing m.
• Accumulate all transformed prefix paths of m to form its conditional pattern base:
  fca:2, fcab:1
• The m-conditional FP-tree contains only the single path f:3 → c:3 → a:3 (b is dropped as infrequent).
• All frequent patterns that include m:
  m, fm, cm, am, fcm, fam, cam, fcam
Conditional Pattern Bases for the Example

  Item  Conditional pattern base       Conditional FP-tree
  p     {(fcam:2), (cb:1)}             {(c:3)} | p
  m     {(fca:2), (fcab:1)}            {(f:3, c:3, a:3)} | m
  b     {(fca:1), (f:1), (c:1)}        Empty
  a     {(fc:3)}                       {(f:3, c:3)} | a
  c     {(f:3)}                        {(f:3)} | c
  f     Empty                          Empty
Why Is Frequent Pattern Growth Fast?

• Performance studies show that FP-growth is faster than Apriori.
• Reasoning:
  – No candidate generation, no candidate test.
  – Uses a compact data structure.
  – Eliminates repeated database scans.
  – The basic operations are counting and FP-tree building.
Handling Categorical Attributes

• So far, we have used only transaction data for mining association rules.
• The data can be in transaction form or table form.

Transaction form:
  t1: a, b
  t2: a, c, d, e
  t3: a, d, f

Table form:
  Attr1  Attr2  Attr3
  a      b      d
  b      c      e

• Table data need to be converted to transaction form for association rule mining.
Contd…

• To convert a table data set to a transaction data set, simply change each value to an attribute–value pair.
• For example, each row of the table above becomes a transaction of (attribute, value) items.
Contd…

• Each attribute–value pair is considered an item.
• Using only the values is not sufficient in the transaction form, because different attributes may have the same values.
• For example, without including attribute names, the value a for Attribute1 and for Attribute2 would be indistinguishable.
• After the conversion, the transaction form (Figure B) can be used in mining.
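A minimal sketch of the conversion (the `attribute=value` string encoding is an illustrative choice, not from the slides):

```python
def table_to_transactions(rows, attribute_names):
    """Turn each table row into a transaction of attribute=value items.

    Prefixing values with the attribute name keeps identical values in
    different columns distinguishable (e.g. Attr1=a vs Attr2=a).
    """
    return [
        {f"{attr}={val}" for attr, val in zip(attribute_names, row)}
        for row in rows
    ]

# The two-row table from the previous slide
rows = [("a", "b", "d"), ("b", "c", "e")]
tx = table_to_transactions(rows, ["Attr1", "Attr2", "Attr3"])
```

The resulting sets can be fed directly to any transaction-based miner such as Apriori.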
Home Work

• Convert the categorical attribute data given on the slide into a transaction data set.
Homework

• What is the aim of association rule mining? Why is this aim important in some applications?
• Define the concepts of support and confidence for an association rule.
• Show how the Apriori algorithm works on an example data set.
• What is the basis of the Apriori algorithm? Describe the algorithm briefly. Which step of the algorithm can become a bottleneck?
Contd…

• Show, using an example, how the FP-tree algorithm solves the association rule mining (ARM) problem.
• Perform ARM using FP-growth on the following data set with minimum support = 50% and confidence = 75%:

  Transaction ID  Items
  1               Bread, Cheese, Eggs, Juice
  2               Bread, Cheese, Juice
  3               Bread, Milk, Yogurt
  4               Bread, Juice, Milk
  5               Cheese, Juice, Milk
Thank You!