APPLICATIONS OF DATA MINING
Deepanshi Rajput
E-mail: 1133310014@rkgitw.edu.in
Abstract:
As industrial systems grow more complex, classical control
methods must become more sophisticated to keep processes
running under appropriate conditions, from economic
(cost-effectiveness) to safety considerations. Both technology
development and requirements are crucial to modern industry.
The main aim of these methods is to advise process operators,
or even to replace them, in order to eliminate human error
and raise the level of both quality and security. This
approach is not new. It appears to continue the
implementation of computational intelligence ideas that
started in the early 1970s. The development of the scientific
principles of artificial neural networks and of predictive and
adaptive control has become a new challenge for scientists
and industry practitioners. Notably, pure optimization of
known process lines is only part of the interest. Innovative
technology offers a competitive advantage on the one hand,
but it also opens new possibilities for very complex,
nonlinear processes in which it is very hard or impossible to
gather precise information directly from measurement
equipment, because of high and unforeseen dynamics or
extremely harsh environmental conditions.
[1] Data Mining Techniques to Find Out Heart Diseases
Heart disease is a major cause of morbidity and mortality in
modern society. Medical diagnosis is an extremely important
but complicated task that should be performed accurately
and efficiently. Although significant progress has been
made in the diagnosis and treatment of heart disease,
further investigation is still needed. The availability of huge
amounts of medical data creates a need for powerful data
analysis tools to extract useful knowledge. A huge amount of
data is available within healthcare systems. However,
there is a lack of effective analysis tools to discover hidden
relationships and trends in the data. Knowledge discovery and
data mining have found numerous applications in business
and scientific domains. Researchers have long been
concerned with applying statistical and data mining tools to
improve data analysis on large data sets. Disease diagnosis
is one of the applications where data mining tools are
producing successful results. This research paper proposes to
detect heart disease through data mining techniques: Support
Vector Machine (SVM), Genetic Algorithm, rough set
theory, association rules and Neural Networks. In this
study, we briefly examined these techniques and found that
Decision tree and SVM are the most effective for heart
disease. It is therefore observed that data mining could help
in the identification or prediction of high- or low-risk heart
disease.
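Of the techniques listed, association rules are the simplest to illustrate. The sketch below computes the support and confidence of a rule such as "smoker => heart_disease". It is a minimal illustration, not the method of the surveyed paper: the patient records and the attribute names (smoker, high_bp, heart_disease) are invented for the example.

```python
from typing import FrozenSet, List

# Hypothetical patient records; each record is the set of attributes present.
records: List[FrozenSet[str]] = [
    frozenset({"smoker", "high_bp", "heart_disease"}),
    frozenset({"smoker", "heart_disease"}),
    frozenset({"high_bp"}),
    frozenset({"smoker", "high_bp", "heart_disease"}),
    frozenset({"smoker"}),
]


def count_containing(data: List[FrozenSet[str]], itemset: FrozenSet[str]) -> int:
    """Number of records that contain every item of `itemset`."""
    return sum(1 for record in data if itemset <= record)


def support(data: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of records that contain the itemset."""
    return count_containing(data, itemset) / len(data)


def confidence(data: List[FrozenSet[str]],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Fraction of records with the antecedent that also hold the consequent."""
    antecedent_count = count_containing(data, antecedent)
    if antecedent_count == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return count_containing(data, antecedent | consequent) / antecedent_count


rule_support = support(records, frozenset({"smoker", "heart_disease"}))
rule_confidence = confidence(records, frozenset({"smoker"}),
                             frozenset({"heart_disease"}))
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

A rule with high support and high confidence is a candidate risk indicator, but it only describes co-occurrence in the data and says nothing about causation.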
[2] Data Mining QFD for the Dynamic Forecasting of Life Cycle
under Green Supply Chain
The satisfaction of customer requirements is a critical issue
for computer designers and manufacturers, because computer
design is a high-risk, value-added technology. When
considering green design, designers should incorporate the
voice of the customers, because they are the driving force. On
the other hand, data mining of large marketing databases
has been successfully applied in a number of advanced fields.
However, little research has been done on using data mining
within quality function deployment (QFD) to identify future
customer requirements for computer design and manufacture.
This study uses a data mining cycle in QFD to forecast future
customer requirements for green design of the life cycle. The
use of a time series-based data mining cycle to predict the
weights is advantageous because it can (1) find the future
trend of customer requirements and (2) provide computer
designers and manufacturers with reference points to satisfy
customer requirements in advance. The results of this study
can provide an effective procedure for identifying the trends
of customer requirements and can enhance dynamic
forecasting of the life cycle under a green supply chain in
the computer marketplace.
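The abstract does not say which time series method is used to predict the requirement weights. As one plausible illustration, the sketch below applies simple exponential smoothing to the past weights of each requirement and renormalizes the forecasts so that they sum to 1. The requirement names, the historical weights and the smoothing factor alpha are all assumptions made for the example.

```python
from typing import Dict, List


def smooth_forecast(history: List[float], alpha: float = 0.5) -> float:
    """One-step-ahead forecast by simple exponential smoothing."""
    level = history[0]
    for value in history[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level


def forecast_weights(histories: Dict[str, List[float]],
                     alpha: float = 0.5) -> Dict[str, float]:
    """Forecast every requirement weight, then rescale so they sum to 1."""
    raw = {name: smooth_forecast(series, alpha)
           for name, series in histories.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Hypothetical importance weights from three past customer surveys.
past_weights = {
    "energy_saving": [0.20, 0.25, 0.30],
    "recyclability": [0.30, 0.30, 0.30],
    "price": [0.50, 0.45, 0.40],
}

next_weights = forecast_weights(past_weights)
for name, weight in next_weights.items():
    print(f"{name}: {weight:.3f}")
```

Simple exponential smoothing lags behind a steady trend, so a trend-aware method would be a natural replacement when requirement weights drift consistently in one direction.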
[3] Complex event processing and data mining for smart cities
The avalanche of data that information systems have had to
face in recent years has influenced their evolution and
characteristics. Continuous, on-time processing of
incoming data streams imposed particular requirements
which traditional Database Management Systems (DBMS)
were not able to fulfil. Consequently, driven by market
needs, new tools have been developed that can process
multiple data sources, often streams, in a timely fashion in
order to extract relevant information. Grouped under the
domain of event processing (also referred to as the
information flow processing domain), two main types of such
systems have emerged: Data Stream Management Systems (DSMS)
and Complex Event Processing (CEP) systems.
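To make the idea of a CEP rule concrete, the sketch below raises a complex event when a given number of simple events of one type arrive within a sliding time window. It is a toy illustration and not the API of any real CEP engine: the class name ThresholdPattern, the event type names and the sensor stream are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from typing import Deque, List, Tuple


class ThresholdPattern:
    """Fires when `count` events of `event_type` fall within `window_seconds`."""

    def __init__(self, event_type: str, count: int, window_seconds: float) -> None:
        self.event_type = event_type
        self.count = count
        self.window_seconds = window_seconds
        self._times: Deque[float] = deque()

    def push(self, timestamp: float, event_type: str) -> bool:
        """Feed one event; return True if the pattern matches at this event."""
        if event_type != self.event_type:
            return False
        self._times.append(timestamp)
        # Drop events that have slid out of the time window.
        while timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) >= self.count


# Hypothetical city sensor stream: (seconds since start, event type).
stream: List[Tuple[float, str]] = [
    (0, "slow_traffic"), (5, "parking_full"), (20, "slow_traffic"),
    (90, "slow_traffic"), (100, "slow_traffic"), (110, "slow_traffic"),
]

pattern = ThresholdPattern("slow_traffic", count=3, window_seconds=60)
alerts = [t for t, kind in stream if pattern.push(t, kind)]
print("congestion detected at:", alerts)
```

Each event is processed once as it arrives and only the events inside the window are kept in memory, which is the property that distinguishes stream and event processing from querying a stored table.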
[4] Data Mining For Security Purpose & Its Solitude Suggestions
Data mining is the process of posing queries and
extracting patterns, often previously unknown, from large
volumes of data using pattern matching or other reasoning
techniques. Data mining has several applications in
security, including national security as well as cyber
security. Threats to national security include attacks on
buildings and the destruction of critical infrastructures such
as power grids and telecommunication systems. Data mining
techniques are being examined to identify suspicious
individuals and those capable of carrying out terrorist
activities. Cyber security is concerned with protecting
computer and network systems against fraud caused by Trojan
horses, worms and viruses. Data mining is also being applied
to provide solutions for intrusion detection and auditing.
While data mining has several applications in security, there
are also serious privacy concerns. Because of data mining,
even inexperienced users can link data and make sensitive
associations. Therefore we must protect the privacy of
individuals while carrying out practical data mining. In this
paper we discuss developments and directions in privacy and
data mining. In particular, we give an overview of data
mining and the different types of threats, and then discuss
the consequences for privacy. This paper is organized as
follows. Section 2 discusses data mining for security
applications. Section 3 gives an overview of privacy.
Section 4 discusses different aspects of data mining.
Directions are provided in Section 5, and Section 6
concludes the paper.
[5] Anomaly Detection in Network using Data mining Techniques
As networks have expanded dramatically, security has come to
be considered a major issue. Many methods currently exist to
increase network security, such as encryption, VPNs and
firewalls, but all of these are too static to give effective
protection against attack and counter-attack. We take a data
mining algorithm and apply it to the anomaly detection
problem. In this work our aim is to use data mining
techniques, including classification trees and support vector
machines, for anomaly detection. The experimental results on
the 1999 KDD Cup data show that the C4.5 algorithm
outperforms SVM in both network anomaly detection and false
alarm rate.
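The two measures on which the classifiers are compared, detection rate and false alarm rate, can be computed directly from labelled predictions. The sketch below assumes binary labels (1 = attack, 0 = normal). The label vectors and the names classifier_a and classifier_b are invented for illustration and are not results from the 1999 KDD Cup data.

```python
from typing import Sequence, Tuple


def detection_and_false_alarm(y_true: Sequence[int],
                              y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (detection rate, false alarm rate) for binary labels.

    Detection rate   = attacks flagged / all attacks          (TP / (TP + FN))
    False alarm rate = normal traffic flagged / all normal    (FP / (FP + TN))
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical ground truth and predictions from two classifiers.
truth        = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
classifier_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
classifier_b = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]

rates_a = detection_and_false_alarm(truth, classifier_a)
rates_b = detection_and_false_alarm(truth, classifier_b)
print("A: detection=%.2f false_alarm=%.2f" % rates_a)
print("B: detection=%.2f false_alarm=%.2f" % rates_b)
```

A classifier is preferable when it has both the higher detection rate and the lower false alarm rate, as classifier_a does in this made-up data.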
[6] Mining Big Data in Real Time
Streaming data analysis in real time is becoming the fastest
and most efficient way to obtain useful knowledge from what
is happening now, allowing organizations to react
quickly when problems appear or to detect new trends
that help improve their performance. Evolving data
streams are contributing to the growth of data created over
the last few years. We now create the same quantity of
data every two days as we created from the dawn of time
up until 2003. Evolving data stream methods are becoming
a low-cost, green methodology for real-time online
prediction and analysis. We discuss the current and future
trends of mining evolving data streams, and the challenges
that the field will have to overcome in the coming years.
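Stream methods process each item once and keep only a small summary in memory. As a generic illustration of that one-pass style (not a technique taken from the surveyed paper), the sketch below maintains a running mean and variance with Welford's online algorithm. The class name RunningStats and the sample readings are made up for the example.

```python
class RunningStats:
    """One-pass mean and population variance (Welford's online algorithm)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new observation into the summary in O(1) time and memory."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of everything seen so far (0.0 if empty)."""
        return self._m2 / self.n if self.n else 0.0


stats = RunningStats()
for reading in [2, 4, 4, 4, 5, 5, 7, 9]:  # hypothetical sensor readings
    stats.update(reading)

print(f"n={stats.n} mean={stats.mean:.2f} variance={stats.variance:.2f}")
```

Because nothing but three numbers is stored, the summary stays the same size however long the stream runs. For evolving streams, a windowed or exponentially weighted variant would let old data fade out.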
[6] Data Mining Approaches For Network Intrusion Detection
System
Data mining has been gaining popularity in the knowledge
discovery field, particularly with the increasing availability
of digital documents in various languages from all around
the world. Network intrusion detection is the process of
monitoring the events occurring in a computing system or
network and analyzing them for signs of intrusions. In this
paper, intrusion detection and several areas of intrusion
detection in which data mining technology is applied are
discussed. Data mining techniques are used to discover
consistent and useful patterns of system features that
describe program and user behavior. Data mining can
improve the variant detection rate, control the false alarm
rate and reduce false dismissals, by using this set of
relevant system features to compute classifiers that
recognize anomalies and known intrusions.
[7] Data Mining Tools in Knowledge Discovery Process
Data mining, the extraction of hidden predictive
information from large databases, is a powerful new
technology with great potential to help companies focus on
the most important information in their data warehouses. It
uses machine learning, statistical and visualization
techniques to discover and present knowledge in a form
that is easily comprehensible to humans. Various popular
data mining tools are available today. Data mining tools
predict future trends and behaviors, allowing businesses to
make proactive, knowledge-driven decisions. Data mining
tools can answer business questions that traditionally were
too time-consuming to resolve. They scour databases for
hidden patterns, finding predictive information that experts
may miss because it lies outside their expectations. This
paper presents an overview of data mining tools such as
Weka.
[8] Performance Analysis of Healthy Diet Recommendation System
using Web Data Mining
Medical studies have revealed that people stand a better
chance of countering free radicals and warding off
illness by consuming healthy foods and strengthening
their immune system. Due to poor eating habits, people
suffer from many diseases. Nowadays fast food has become an
important part of the daily routine because it is
easily available, but eating fast food regularly may
cause diseases such as heart attack and diabetes. Healthier
diets help us maintain our health and keep us away from
many diseases. For better recovery from disease or surgery,
individuals have special needs according to their medical
profile, cultural background and nutrient requirements.
The design and implementation of the healthy diet
recommendation system is based on web data mining,
which is the application of data mining techniques to
discover patterns from the web. The recommendation system is
analyzed in terms of accuracy and time performance using two
decision tree learning algorithms, ID3 and C4.5, applied to
a healthy diet application.
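ID3 and C4.5 differ mainly in how they score a candidate split: ID3 picks the attribute with the highest information gain, while C4.5 uses the gain ratio, which divides the gain by the entropy of the split itself. The sketch below computes both scores. The tiny diet data set and its attribute names (diabetic, activity) are invented for illustration and do not come from the surveyed system.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(values: Sequence[str]) -> float:
    """Shannon entropy, in bits, of a sequence of categorical values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def information_gain(rows: List[Dict[str, str]], labels: List[str],
                     attr: str) -> float:
    """ID3 criterion: label entropy minus weighted entropy after the split."""
    partitions: Dict[str, List[str]] = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def gain_ratio(rows: List[Dict[str, str]], labels: List[str],
               attr: str) -> float:
    """C4.5 criterion: information gain divided by the split entropy."""
    split_info = entropy([row[attr] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return information_gain(rows, labels, attr) / split_info


# Hypothetical user profiles and the diet recommended for each.
profiles = [
    {"diabetic": "yes", "activity": "low"},
    {"diabetic": "yes", "activity": "high"},
    {"diabetic": "no", "activity": "low"},
    {"diabetic": "no", "activity": "high"},
]
diets = ["low_sugar", "low_sugar", "regular", "regular"]

for attribute in ("diabetic", "activity"):
    print(attribute,
          round(information_gain(profiles, diets, attribute), 3),
          round(gain_ratio(profiles, diets, attribute), 3))
```

In this made-up data the diabetic attribute separates the two diets perfectly, so both criteria would choose it as the root of the tree. The gain ratio matters most when an attribute has many distinct values, which plain information gain tends to favor.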
Conclusion:
Data mining is a blend of concepts and algorithms from
machine learning, statistics, artificial intelligence, and data
management. With the emergence of data mining,
researchers and practitioners began applying this
technology to data from different areas such as
banking, finance, retail, marketing, insurance, fraud
detection, science and engineering, to discover hidden
relationships or patterns. Data mining is therefore a rapidly
expanding field of growing interest and importance, and
manufacturing is an application area where it can provide
significant competitive advantage (Harding, J. et al.,
2006). The use of data mining techniques in manufacturing
began in the 1990s and has gradually progressed, receiving
attention from the production community. These
techniques are now used in many different areas of
manufacturing engineering to extract knowledge for use in
predictive maintenance, fault detection, design, production,
quality assurance, scheduling, and decision support systems.
Data can be analyzed to identify hidden patterns in the
parameters that control manufacturing processes or to
determine and improve the quality of products. A major
advantage of data mining is that the data required for
analysis can be collected during the normal operation of
the manufacturing process being studied, so it is
generally not necessary to introduce dedicated processes
for data collection. Since the importance of data mining in
manufacturing has clearly increased over the last 20 years,
it is now appropriate to critically review its history and
application.
Data mining techniques have become a basic element of
modern business. Although the idea is not new, new
technologies and implemented standards contribute to
their growing popularity. With regard to mining model
usage, SQL Server 2005 was a breakthrough in this area.
Thanks to the DMX language, both programmers and
database administrators are able to create data mining
systems in a simple way. Although economic and business
publications are rich in data mining approaches, the
problem described here is covered rather weakly in
international publications. Nevertheless, some industrial
applications of data mining technology were considered in
(Duebel, C., 2003). Industrial use of data mining
techniques opens new possibilities in decision making, not
only for top-level management but also for advisory or
control systems. Implementations of several prediction,
classification or even anomaly detection algorithms may
become a lucrative tool for optimizing the appropriate
stages of an industrial process, combining diagnosis and
control functions. The reviewed literature shows rapid
growth in the application of data mining in industry and
manufacturing. However, adoption of this technology is
still slow in some industries for several reasons, including
both the difficulty of determining the type of data mining
function to perform in a particular knowledge area and the
question of choosing the most appropriate data mining
technique among the many possibilities.
References:
[1] Bartok J., Habala O., Bednar P., Gazak M. & Hluchy L.
(2010). Data mining and integration for predicting
significant meteorological phenomena. International
Conference on Computational Science (ICCS 2010),
Procedia Computer Science 1, Elsevier, 37-46.
[2] A. Bifet, G. Holmes, R. Kirkby, and B. Pfahringer.
MOA: Massive Online Analysis,
http://moa.cms.waikato.ac.nz/. Journal of Machine
Learning Research (JMLR), 2010.
[3] Bala Sundar V, Bharathiar, "Development of a Data
Clustering Algorithm for Predicting Heart", International
Journal of Computer Applications (0975 – 888), Volume
48, No. 7, June 2012.
[4] C. F. Chien, A. Hsiao, I. Wang, Constructing
semiconductor manufacturing performance indexes and
applying data mining for manufacturing data analysis,
Journal of the Chinese Institute of Industrial Engineers,
Vol. 21, 2004, pp. 313-327.
[5] Giffinger, R., Fertner, C., Kramar, H., Kalasek, R.,
Pichler-Milanovic, N., Meijers, E.: Smart cities: Ranking
of European medium-sized cities. Vienna, Austria
(2007).
[6] Liu, L., Kantarcioglu, M., Thuraisingham, B.M.: A
Novel Privacy Preserving Decision Tree. In: Proceedings
Hawaii International Conf. on Systems Sciences (2009).
[7] E. Bloedorn et al., "Data Mining for Network Intrusion
Detection: How to Get Started," Technical paper, 2001.
[8] An Extended ID3 Decision Tree Algorithm for Spatial
Data, Sitanggang, I.S.; Yaakob, R.; Mustapha, N.;
Nuruddin, A.A.B.; [IEEE 2011].
[9] The WEKA data mining software: An update, Mark
Hall, Eibe Frank, G. Holmes, B. Pfahringer, P. Reutemann,
I.H. Witten, ACM SIGKDD Explorations Newsletter, pages
10-18, volume 11, issue 1, June 2009.