
A Survey on Various Algorithms of Associative Classification
Ms. Swati Khare 1, Dr. Anju Singh 2
1 Pursuing M-Tech, Computer Science and Engineering, Barkatullah University, Bhopal, Madhya Pradesh, India
Swati_soni12@rediffmail.com
2 Assistant Professor, Information Technology, Barkatullah University, Bhopal, Madhya Pradesh, India
Asingh0123@rediffmail.com
Abstract
Classification and association rule mining are two basic tasks of data mining. Classification rule mining is used to mine a small set of rules in the database that form an accurate classifier. Association rule mining is used to find all interesting relationships in a large database. By applying association rules to classification, one can improve accuracy and obtain valuable rules and information that cannot be captured by other classification approaches. However, the rule generation procedure is very time-consuming on large data sets. This paper discusses how associative classification improves on association rule mining and surveys various associative classification algorithms and their workings.
Keywords: Association, Classification, Associative Classification
1. Association Rule
The basic objective of finding association rules [2] is to find all co-occurrence relationships, called associations. Association rule mining was first introduced in 1993 by Agrawal et al. Its classic application is market-basket data analysis, which reveals how the various items purchased by a customer in a supermarket are associated; this associativity is the basis of association rule mining. Association rules are of the form X → Y, where X and Y are collections of items and X ∩ Y is empty.
The problem of mining association rules can be stated as follows. Let I = {i1, i2, …, im} be a set of items and T = {t1, t2, …, tn} be a set of transactions (the database), where each transaction ti is a set of items such that ti ⊆ I. An association rule is an implication of the form X → Y, where X ⊂ I, Y ⊂ I and X ∩ Y = ∅. Here X (or Y) is a set of items, called an itemset.
1.1 Frequent Items
Frequent items are patterns that occur frequently in data. Frequent patterns can be categorized into three types:
• Frequent itemsets
• Frequent subsequences
• Frequent substructures
A frequent itemset is a set of items that often appear together in a transactional data set, like milk and bread: it can be assumed that if a person buys milk, the probability of purchasing bread becomes higher.
Frequent subsequences are sequences of events that happen one after another, for example a person buying a laptop, followed by a digital camera and then a memory card.
Frequent substructures refer to different structural forms, such as graphs and trees, which may be combined with itemsets or subsequences.
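The frequent-itemset category above can be illustrated with a small brute-force sketch (this is not an algorithm from the paper; the basket contents and threshold are invented for illustration):

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force frequent-itemset search: count every candidate
    itemset and keep those whose support meets min_support."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for cand in combinations(items, size):
            count = sum(1 for t in transactions if set(cand) <= t)
            if count / n >= min_support:
                frequent[cand] = count / n
                found_any = True
        if not found_any:
            # Apriori property: if no k-itemset is frequent,
            # no (k+1)-itemset can be frequent either
            break
    return frequent

baskets = [{"milk", "bread"}, {"milk", "bread", "butter"},
           {"bread"}, {"milk"}]
print(frequent_itemsets(baskets, min_support=0.5))
```

With this data, {bread}, {milk} and {bread, milk} survive the 0.5 threshold while anything containing butter does not, matching the milk-and-bread intuition in the text.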
1.2 Support and Confidence
The strength of an association rule is measured by its support and confidence.
The support [1] of a rule is the fraction of transactions that contain both X and Y, i.e. P(X ∪ Y), and is given by
Support, s(X → Y) = σ(X ∪ Y) / N        (1)
The confidence [1] of a rule is the fraction of transactions containing X that also contain Y, i.e. P(Y|X), and is given by
Confidence, c(X → Y) = σ(X ∪ Y) / σ(X)        (2)
where σ(·) denotes the number of transactions containing an itemset and N is the total number of transactions.
2. Classification
Classification is a form of data analysis that extracts models describing important data classes. Such models, called classifiers, predict categorical (discrete, unordered) class labels.
Many classification methods have been proposed by researchers in machine learning, pattern recognition, and statistics. Most algorithms are memory resident, typically assuming a small data size. Recent data mining research has built on such work, developing scalable classification and prediction techniques capable of handling large amounts of disk-resident data. Classification has numerous applications, including fraud detection, target marketing, performance prediction, manufacturing, and medical diagnosis.
In this paper we present how associative classification works, along with various algorithms for associative classification.
3. Associative Classification
Associative classification [3] is a branch of the larger area of scientific study known as data mining. Associative classification (AC) integrates two known data mining tasks, association rule discovery and classification, so that a model (a classifier) can be formed for prediction purposes. Its aim [12] is to discover a set of association rules in the database that satisfy some minimum support and minimum confidence constraints and form an accurate classifier.
Associative classification builds a classifier from association rules. It usually involves two steps: first it finds all the class association rules (CARs), whose right-hand side is a class label, and then it selects strong rules from the CARs to build a classifier. In this fashion, associative classification can generate rules with higher confidence and better support than conventional approaches. AC [3] is a special case of association rule discovery in which only the class attribute is considered in the rule's right-hand side (consequent); for example, in a rule such as X → Y, Y must be a class attribute. One of the main advantages of classification based on association rules over classic classification approaches is that the output of an AC algorithm is represented as simple if-then rules, which makes it easy for the end-user to understand and interpret.
Moreover, unlike decision tree algorithms, one can update or tune a rule in AC without affecting the complete rule set, whereas the same task requires reshaping the whole tree in the decision tree approach.
Let us define the AC problem. A training data set T has m distinct attributes A1, A2, …, Am, and C is a list of classes; the number of rows in T is denoted |T|. Attributes can be categorical (taking a value from a finite set of possible values) or continuous (real- or integer-valued). In the case of categorical attributes, all possible values are mapped to a set of positive integers; for continuous attributes, a discretization method is used.
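As a rough illustration of this attribute mapping, the following hypothetical helper assigns positive integer codes to categorical values and equal-width bins to continuous ones (the function name, binning scheme, and sample data are assumptions for illustration, not part of the paper):

```python
def encode_training_data(rows, categorical, continuous, bins=3):
    """Map categorical values to positive integers and discretize
    continuous attributes into equal-width bins, as the AC problem
    definition requires."""
    code = {a: {} for a in categorical}
    # (min, max) range of each continuous attribute over the data
    ranges = {a: (min(r[a] for r in rows), max(r[a] for r in rows))
              for a in continuous}
    encoded = []
    for r in rows:
        e = {}
        for a in categorical:
            # first value seen gets code 1, the next code 2, and so on
            e[a] = code[a].setdefault(r[a], len(code[a]) + 1)
        for a in continuous:
            lo, hi = ranges[a]
            width = (hi - lo) / bins or 1  # guard against constant columns
            e[a] = min(int((r[a] - lo) / width), bins - 1) + 1
        encoded.append(e)
    return encoded

rows = [{"outlook": "sunny", "temp": 30},
        {"outlook": "rain", "temp": 10},
        {"outlook": "sunny", "temp": 20}]
print(encode_training_data(rows, ["outlook"], ["temp"]))
```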
Figure 1.1 Associative classification steps: from the training data, Step 1 discovers frequent rule items, Step 2 generates the set of class association rules (CARs), Step 3 ranks and prunes them to form the classifier, and Step 4 predicts the classes of the test data.
Reasons why associative classification is better than association rule mining:
Association rule discovery is an unsupervised approach, meaning no class attribute is involved, while associative classification involves classes and hence provides supervised learning.
1. The aim of association rule discovery is to find associations between items in a transactional database, whereas associative classification constructs a classifier that can forecast the classes of test data objects.
2. In association rule discovery there can be more than one attribute in the consequent of a rule, whereas in associative classification there is only one attribute (the class attribute) in the consequent of a rule.
3. In association rule mining overfitting is usually not an issue, whereas in associative classification overfitting is an important issue.
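The prediction step these classifiers share, firing the highest-ranked rule whose antecedent matches the test object, can be sketched as follows; the attribute-value strings and the default-class fallback are illustrative assumptions:

```python
def predict(ranked_cars, test_object, default_class):
    """Return the class of the first (highest-ranked) CAR whose
    antecedent is contained in the test object; fall back to a
    default class when no rule covers the object."""
    for antecedent, label in ranked_cars:
        if antecedent <= test_object:
            return label
    return default_class

cars = [({"outlook=sunny", "wind=weak"}, "play"),
        ({"outlook=rain"}, "no-play")]
print(predict(cars, {"outlook=sunny", "wind=weak", "humid=high"},
              "unknown"))                                # 'play'
print(predict(cars, {"outlook=overcast"}, "unknown"))    # 'unknown'
```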
4. Literature Survey
A lot of work has been done in the field of associative classification. To build a classifier with an AC algorithm, the complete set of class association rules (CARs) is first discovered from the training data set, and a subset is selected to form the classifier. This subset selection is accomplished [3] in many ways: for example, in the classification by association rule (CBA) [3][4] and classification based on multiple association rules (CMAR) [3][5] algorithms, the selection is made using the database coverage heuristic [4], which evaluates the complete set of CARs on the training data set and keeps rules that cover a certain number of training data objects. The live-and-let-live [3][6] algorithm, by contrast, uses a lazy pruning approach to build the classifier. Once the classifier is constructed, its predictive power is evaluated on test data objects to forecast their class labels. Different algorithms use different approaches to discover frequent itemsets, and likewise differ in how they perform classification using association rules.
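A minimal sketch of the database coverage heuristic mentioned above, under the assumptions that rules arrive already ranked and that each training object is discarded once it has been covered delta times:

```python
def database_coverage(ranked_cars, training_data, delta=1):
    """Keep a rule only if it correctly covers at least one
    not-yet-exhausted training object; an object is exhausted once
    delta kept rules have covered it."""
    cover_count = [0] * len(training_data)
    classifier = []
    for antecedent, label in ranked_cars:
        covers = [i for i, (obj, cls) in enumerate(training_data)
                  if cover_count[i] < delta
                  and antecedent <= obj and cls == label]
        if covers:                      # rule earns its place
            classifier.append((antecedent, label))
            for i in covers:
                cover_count[i] += 1
    return classifier
```

With delta = 1 this matches the CBA-style behaviour; CMAR relaxes it by letting several rules cover the same object before it is removed.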
The classification by association algorithm (CBA) [3][4] uses a horizontal data layout and the Apriori association algorithm for rule generation; ranking is done by confidence, support, and the order in which rules were generated. Pruning is done through pessimistic error and database coverage, and its prediction method is maximum likelihood.
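CBA's rule ranking amounts to a simple sort; the tuple layout of a rule below is an assumption made for illustration:

```python
def cba_rank(rules):
    """Sort rules by CBA's total order: higher confidence first,
    ties broken by higher support, then by earlier generation order.
    Each rule is (antecedent, cls, support, confidence, gen_order)."""
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[4]))

rules = [({"a"}, "c1", 0.2, 0.9, 2),
         ({"b"}, "c2", 0.3, 0.9, 1),
         ({"c"}, "c1", 0.5, 0.8, 0)]
print([r[4] for r in cba_rank(rules)])  # [1, 2, 0]
```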
Another variant, the CBA(2) [3][10] multiple-support algorithm (Liu et al., 2000), modifies the original CBA algorithm to employ multiple class supports by assigning a different support threshold to each class in the training data set based on the class frequencies. This assignment distributes the global support threshold to each class in proportion to its number of occurrences in the training data set, and thus enables the generation of rules for class labels with low frequencies in the training data set.
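The per-class threshold assignment of CBA(2) amounts to scaling the global threshold by class frequency, e.g.:

```python
def class_min_supports(class_counts, global_minsup):
    """Distribute a global support threshold across classes in
    proportion to class frequency, as in CBA(2)."""
    total = sum(class_counts.values())
    return {c: global_minsup * n / total for c, n in class_counts.items()}

# a rare class gets a proportionally lower threshold, so rules
# for it are not starved out by the frequent class
print(class_min_supports({"yes": 80, "no": 20}, 0.05))
```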
Classification based on multiple association rules (CMAR) [3][5] uses a horizontal data layout and an FP-growth approach for rule discovery. Its ranking is in terms of support, confidence, and cardinality; its pruning uses the chi-square test, database coverage, and redundant-rule removal; and its prediction method is CMAR multiple label. Classification based on predictive association rules (CPAR) [3][7] uses the greedy strategy presented in FOIL. Its ranking is by support, confidence, and cardinality, the same as CMAR; it uses the Laplace expected error estimate for pruning, and for prediction it uses CPAR multiple label.
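The Laplace estimate used by CPAR is commonly given as (n_correct + 1) / (n_covered + k) for k classes; a one-line sketch:

```python
def laplace_accuracy(n_covered, n_correct, n_classes):
    """Laplace expected accuracy of a rule: smooths the raw accuracy
    n_correct / n_covered toward 1/k, penalizing rules that cover
    very few training objects."""
    return (n_correct + 1) / (n_covered + n_classes)

# a rule that is 9/10 correct in a 2-class problem
print(laplace_accuracy(10, 9, 2))  # 10/12, about 0.833
```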
A new algorithm [3][8] called 'existential upward-closure' has been introduced in the AC approach based on a decision tree, known as the association-based decision tree (ADT) algorithm. The ADT uses pessimistic error pruning and constructs a decision-tree-like structure, known as an ADT-tree, from the generated CARs, placing general rules at the higher levels and specific rules at the lower levels of the tree. In the prediction step, the ADT selects the highest-ranked rule that matches a test object, a procedure that ensures each object has only one covering rule [3]. The L3 (live-and-let-live) [3][6] algorithm scans a horizontal data layout and uses the FP-growth tree for rule generation; ranking is done through support, confidence, rule cardinality, and the lexicographical order of items.
Most real-world applications [13], such as marketing surveys and medical records, contain structured data stored in multiple relations. This has led to the evolution of multi-relational data mining (MRDM), which learns interesting patterns directly from multiple interrelated tables with the support of primary and foreign keys. Multi-relational classification (MRC) is one of the rapidly rising subfields of MRDM; it constructs a classification model that utilizes information gathered from several relations, performing classification on a multi-relational database. MCAR (multi-class classification based on association rules) [3][9] uses a vertical data layout, with rule discovery done through TID-list intersections. Ranking is done through support, confidence, and cardinality; for pruning it uses database coverage, and its prediction method is exact minimum likelihood.
Classification based on atomic association rules (CAAR) [3][11] mines only atomic CARs from image-block data sets. An atomic rule takes the form I → C, where the antecedent contains a single item. CAAR was designed for image-block classification data sets, although its authors claim that it could be adapted to other classification data sets, which were not covered in the experimental tests. CAAR builds the classifier in multiple passes: in the first pass, it scans the data set to count the potential atomic rules (rule items of length 1), which are then hashed into a table. The algorithm generates all atomic rules that pass the initial support and confidence thresholds given by the end-user.
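The first counting pass of CAAR can be approximated as a single scan over (item, class) pairs; the data and thresholds below are invented for illustration:

```python
from collections import Counter

def atomic_rules(labeled_transactions, min_sup, min_conf):
    """One pass counting (item, class) pairs, then emit atomic rules
    item -> class that pass both thresholds (CAAR-style sketch)."""
    n = len(labeled_transactions)
    pair_count, item_count = Counter(), Counter()
    for items, cls in labeled_transactions:
        for i in items:
            pair_count[(i, cls)] += 1   # support counter for i -> cls
            item_count[i] += 1          # denominator for confidence
    return {(i, c): cnt / n
            for (i, c), cnt in pair_count.items()
            if cnt / n >= min_sup and cnt / item_count[i] >= min_conf}

data = [({"a"}, "x"), ({"a"}, "x"), ({"a"}, "y"), ({"b"}, "y")]
print(sorted(atomic_rules(data, min_sup=0.25, min_conf=0.6)))
# [('a', 'x'), ('b', 'y')]
```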
5. Conclusion
For classification of correlated data sets, there is a need for additional constraints besides support, confidence, and cardinality in the rule-ranking process, in order to break ties between similar rules and to minimize random selection. Pruning can also be used to cut down the number of rules produced and to avoid overfitting. Furthermore, most existing AC techniques use the horizontal layout presented in Apriori to represent the training data set. This approach suffers from drawbacks, including multiple database scans and the use of complex data structures to hold all potential rules during each level, requiring large CPU time and memory. The vertical data format, by contrast, may require only a single database scan, although the number of TID-list intersections may become large, consuming considerable CPU time. Efficient rule discovery methods that avoid going through the database multiple times and do not perform a large number of computations can avoid some of these problems.
References
[1] Prachitee B. Shekhawat, Sheetal S. Dhande, "A classification technique using associative classification", International Journal of Computer Applications (0975-8887), Volume 20, No. 5, April 2011.
[2] Association Rule mining, Winter school on “Data Mining
Techniques and Tools for knowledge Discovery in
Agricultural Datasets”
[3] Thabtah, Fadi Abdeljaber “A review of associative
classification mining” Knowledge Engineering Review, 22
(1). pp. 37-65. ISSN 0269-8889, University of Huddersfield
Repository, 2007.
[4] Liu, B., Hsu, W. & Ma, Y. “ Integrating classification and
association rule mining” In Proceedings of the International
conference on Knowledge Discovery and Data Mining. New
York, NY: AAAI Press, pp. 80–86, 1998.
[5] Liu, B., Ma, Y., & Wong, C.-K., “Classification using
association rules: Weakness and enhancements” In Vipin
Kumar, et al. (eds), Data Mining for Scientific Applications,
2001.
[6] Baralis, E. & Torino, P., “A lazy approach to pruning
classification rules” Proceedings of the 2002 IEEE
International Conference on Data Mining (ICDM’02),
Maebashi City, Japan, p. 35, 2002.
[7] Yin, X. & Han, J., “CPAR: Classification based on
predictive association rule” In Proceedings of the SIAM
International Conference on Data Mining. San Francisco,
CA: SIAM Press, pp. 369–376, 2003.
[8] Weka, "Data mining software in Java", http://www.cs.waikato.ac.nz/ml/weka, 2000.
[9] Thabtah, F., Cowling, P. & Peng, Y., "MCAR: Multi-class classification based on association rule approach", In Proceedings of the 3rd IEEE International Conference on Computer Systems and Applications, Cairo, Egypt, pp. 1-7, 2005.
[10] Liu et al (2003)
[11] Xu, X., Han, G. & Min, H. “A novel algorithm for
associative classification of images block”. In Proceedings of
the 4th IEEE International Conference on Computer and
Information Technology, Lian, Shiguo, China, pp. 46–51,
2004.
[12] Nitin Kumar Choudhary, Gaurav Shrivastava, Mahesh Malviya, "Multi-relational Bayesian Classification through Genetic Approach", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 1, Issue 7, September 2012.
[13] M. Thangaraj, C. R. Vijayalakshmi, "Performance Study on Rule-based Classification Techniques across Multiple Database Relations", International Journal of Applied Information Systems (IJAIS), ISSN 2249-0868, Foundation of Computer Science FCS, New York, USA, Volume 5, No. 4, March 2013.