Data Mining and Information Security

Reham Jarman#1, Barea Alsa’awi*2, Maha Alazizy#3
# Computer Science Dept., Prince Sultan University
Riyadh, Saudi Arabia
1reham.jarman@hotmail.com
2e4_bebo0o@hotmail.com
3maha_alazizy@yahoo.com
Abstract- According to MIT Technology Review magazine, data
mining is one of the ten technologies that will change the world
in the future. Many large companies, such as Oracle and IBM, have
recently entered this sector by supplying software or models that
support data mining. Other companies, such as Cisco, are
interested in the security of data mining. But what makes all
these companies interested in data mining? What is behind the
large profits gained by data mining companies? Many standards and
rules have been added recently to help improve information
security. These standards are defined and controlled by strong
organizations, and sometimes governments, such as the
International Organization for Standardization (ISO). Take
ISO 27001, for managing information security, as an example.
In this paper, we try to link two important and new aspects of
data: its security and its extraction, known as data mining. Data
mining has emerged alongside the huge databases now in use, and
the size of these data warehouses increases the risk of losing or
damaging them. This creates a need for stronger security
management to guarantee data reliability, privacy, integrity, and
so on. Information security is needed by all organizations and
businesses, and by individuals as well. We try to clarify, as far
as possible, the relation between data mining and information
security. In this research we focus on the security side of data
mining.
I. Introduction
We are going to talk about a powerful new technology that helps
firms and companies focus on the important information in their
warehouses. This technology is data mining, which extracts
information from large data sets. The future of data mining is
bright and promising, and the field is growing very fast to reach
web and text mining. Much research has been done recently to serve
the future knowledge of data mining. Data mining allows businesses
to make proactive, knowledge-driven decisions through tools that
predict future trends and behaviors. Data mining tools help find
predictive information that experts may miss because it lies
outside their expectations. Data mining techniques can be
incorporated into new products and systems as they are brought
online, and they can be implemented quickly on existing software
and hardware platforms to increase the value of existing
information resources.
Information security is an old concept, in use since the Second
World War, but it has become a large sector because of the
technological revolution. The security of information avoids risks
not only for individuals but also for organizations, businesses
and, most importantly, governments. When we talk about information
security, we are talking about the most important matter in data
mining. It is a hard, complicated and long-running undertaking.
Information security can never be complete; there is always a
risk, but the goal is to reduce it as much as possible.
We will explain data mining and mention its most common
techniques, and we will talk about the data warehouse. We will
then discuss data security and move on to the relation between
data mining and information security.
II. Data Mining
a. What is data mining?
Data mining is known as the science of extracting useful
information from large data sets or databases. It is a new
discipline that lies at the intersection of machine learning,
statistics, databases and data management, artificial
intelligence, pattern recognition, and other areas. [1]
b. Data warehouse
"A data warehouse is a subject-oriented, integrated, time-variant
and non-volatile collection of data in support of management's
decision making process."[2]
"Subject-Oriented: A data warehouse can be used to analyze a
particular subject area. For example, "sales" can be a particular
subject." [2]
A data warehouse integrates data from multiple data sources. For
example, source A and source B may have different ways of
identifying a product, but in a data warehouse, there will be only
a single way of identifying a product.
Fig 1: A data warehouse example
Historical data is kept in a data warehouse. For example, one can
retrieve data from 3 months, 6 months, 12 months, or even older
data from a data warehouse. This contrasts with a transaction
system, where often only the most recent data is kept. For
example, a transaction system may hold the most recent address of
a customer, whereas a data warehouse can hold all addresses
associated with a customer.
Once data is in the data warehouse, it will not change. So,
historical data in a data warehouse should never be altered. [2]
c. Data Mining Techniques
We will describe some of the most common data mining algorithms in
use today. We have divided the techniques into two sections:
Classical Techniques:
o Statistics
o Neighborhoods
o Clustering
Next Generation Techniques:
o Decision Trees
o Neural Networks
o Rules [3]
First: Classical techniques.
This part describes techniques that have been used for decades. It
should help the user understand the rough differences between the
techniques and provide at least enough information to be
dangerous, and well armed enough not to be baffled by the vendors
of different data mining tools.
1. Statistics
By strict definition "statistics" or statistical techniques are
not data mining. They were being used long before the term data
mining was coined to apply to business applications. However,
statistical techniques are driven by the data and are used to
discover patterns and build predictive models. From the user's
perspective, you will be faced with a conscious choice when
solving a "data mining" problem as to whether you wish to attack
it with statistical methods or other data mining techniques. For
this reason it is important to have some idea of how statistical
techniques work and how they can be applied. [3]
Regression is one of the oldest and best-known statistical
techniques used in data mining. It expresses the relation between
variables as a function. Some forms are simple, such as linear
regression, which fits a line used to predict a value from a
predictor value. More advanced forms, such as multiple regression,
handle more complex relations. Successful data mining still
requires skilled technical and analytical specialists who can
structure the analysis and interpret the output. [4]
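As a minimal sketch of the simple linear regression mentioned above, the following Python fragment fits a line by ordinary least squares. The function name fit_line and the spend and sales figures are hypothetical and serve only as an illustration; they do not come from the cited sources.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: advertising spend versus sales.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(spend, sales)
# Use the fitted line to predict sales for an unseen spend value.
predicted = intercept + slope * 5.0
```

Multiple regression follows the same idea with several predictor variables, which is usually solved with a linear algebra library rather than by hand.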
2. Neighborhoods
Clustering and the nearest neighbor prediction technique are among
the oldest techniques used in data mining. Most people have an
intuition that they understand what clustering is, namely that
like records are grouped or clustered together. Nearest neighbor
is a prediction technique that is quite similar to clustering. Its
essence is that, in order to predict the prediction value for one
record, one looks for records with similar predictor values in the
historical database and uses the prediction value from the record
that is “nearest” to the unclassified record. [3]
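The nearest neighbor idea can be sketched in a few lines of Python. The records, feature scaling and labels below are hypothetical; Euclidean distance is assumed as the measure of "nearness", which is a common choice but not the only one.

```python
import math


def nearest_neighbor(history, query):
    """Return the label of the historical record closest to the query.

    history is a list of (features, label) pairs; features and query
    are tuples of numbers of the same length.
    """
    closest = min(history, key=lambda record: math.dist(record[0], query))
    return closest[1]


# Hypothetical historical database: (age in decades, income in 10,000s).
history = [
    ((2.5, 3.0), "default"),
    ((4.5, 8.0), "repaid"),
    ((3.5, 6.0), "repaid"),
]

# Predict the outcome for an unclassified record.
prediction = nearest_neighbor(history, (2.8, 3.5))
```

In practice the predictor values are scaled to comparable ranges first, otherwise the attribute with the largest numbers dominates the distance.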
3. Clustering
"Clustering is a data mining (machine learning) technique used to
place data elements into related groups without advance knowledge
of the group definitions. Popular clustering techniques include
k-means clustering and expectation maximization (EM)
clustering."[5]
In everyday language, a cluster is a grouping of a number of
similar things, such as a bunch of trees or a cluster of admirers.
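To make the k-means technique named above more concrete, here is a minimal sketch in Python. The points and the starting centroids are hypothetical, and the starting centroids are fixed by hand so the run is repeatable; real implementations usually choose them at random and stop when the assignments no longer change.

```python
import math


def k_means(points, centroids, iterations=10):
    """Group points around centroids by repeated assign-and-update steps."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical two-dimensional data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

No group definitions are supplied in advance; the two groups emerge from the distances alone, which is the point made in the quoted definition.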
Second: Next Generation Techniques.
The next generation techniques are the most frequently used
techniques developed over the last two decades of research. They
can be used either for building predictive models or for
discovering new information within large databases.
1. Decision Trees
"Decision tree structure and nodes vary depending on the object of
data mining and on the structure of information you possess." [5]
An example is shown in Fig 2.
Specific decision tree methods include Classification and
Regression Trees (CART) and Chi Square Automatic Interaction
Detection (CHAID).
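Tree methods such as CART grow a tree by repeatedly choosing the split that makes the resulting groups as pure as possible. As a minimal sketch of that single step, the fragment below picks the best threshold on one numeric attribute using Gini impurity. The income values, outcome labels and function names are hypothetical, and a full tree builder would apply this step recursively.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted impurity) for the rule value <= threshold."""
    best = (None, float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical loan records: income (in thousands) and observed outcome.
incomes = [20, 25, 30, 60, 70, 80]
outcomes = ["default", "default", "default", "repaid", "repaid", "repaid"]
threshold, impurity = best_split(incomes, outcomes)
```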
2. Neural Networks
"To be more precise with the term “neural network” one might
better speak of an “artificial neural network”. True neural
networks are biological systems (a.k.a. brains) that detect
patterns, make predictions and learn. The artificial ones are
computer programs implementing sophisticated pattern detection and
machine learning algorithms on a computer to build predictive
models from large historical databases. Artificial neural networks
derive their name from their historical development which started
off with the premise that machines could be made to “think” if
scientists found ways to mimic the structure and functioning of
the human brain on the computer. Thus historically neural networks
grew out of the community of Artificial Intelligence rather than
from the discipline of statistics. Despite the fact that
scientists are still far from understanding the human brain let
alone mimicking it, neural networks that run on computers can do
some of the things that people can do." [3] Fig 3 shows an example
of a simplified view of a neural network.
Fig 3: A simplified view of a neural network for prediction of loan default.
3. Rules
Rule mining means finding frequent patterns, associations,
correlations, or causal structures among sets of items or objects
in transactional databases, relational databases, and other
information repositories. [6]
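Two standard measures for an association rule are its support (how often the items occur together) and its confidence (how often the consequent appears when the antecedent does). The sketch below computes both over a small set of hypothetical shopping baskets; the function names are our own and are not taken from [6].

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical transactional database: each basket is a set of items.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```

A rule such as "bread implies milk" is usually reported only when both measures exceed thresholds chosen by the analyst.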
Fig 2: An example for a Decision Tree.
http://www.cs.odu.edu/~toida/nerzic/390teched/computability/complexity.htm
d. Data mining process
Data processing comes before the algorithms, because the data must
be brought into a form suitable for pattern identification. The
process consists of six phases, as shown in Fig 4:
o Define the problem: define the variables, objectives, and
  requirements, then translate them into a problem definition.
o Prepare the data: construct the final data set, which should be
  clean (error free) and formatted. The major tasks in this phase
  are selecting tables, records, and attributes, and transforming
  the data for the next phase.
o Explore the data: collect and describe the data. Statistics are
  used in this phase.
o Build models: select a model and apply functions such as
  association, classification, and clustering. Different functions
  can be used for the same data type; some functions can only be
  used for specific data types.
o Evaluate the model: if it does not satisfy the expectations, the
  model is rebuilt until it achieves the objectives.
o Deploy the result: present it as a simple report or as a complex
  database. [7]
Fig 4: Data Mining Process
http://msdn.microsoft.com/en-us/library/ms174949.aspx
e. What can data mining do?
A retailer can use point-of-sale records of customer purchases to
send targeted promotions based on an individual's purchase
history, and this can be done by data mining. By mining
demographic data from comment or warranty cards, the retailer can
develop products and promotions that appeal to specific customer
segments.
These days, companies with powerful retail, communication,
financial, and marketing organizations use data mining. Data
mining enables these companies to determine the impact on sales,
customer satisfaction, and profits. It also makes it easier to
determine relationships among factors such as product, price,
staff skills, customer demographics, economic indicators, and
positioning. Finally, data mining makes it easy to drill down from
summary information to detailed transactional data. [8]
These are some examples of companies that use data mining. First,
American Express can suggest products to its cardholders based on
analysis of their monthly expenditures. Second, Blockbuster
Entertainment mines its video rental history database to recommend
rentals to individual customers. Third, Wal-Mart has over 2,900
stores in 6 different countries and transmits their data to its
7.5 terabyte data warehouse. It allows more than 3,500 suppliers
to access the data and perform data analyses. The suppliers use
this information to manage local store inventory and identify new
opportunities. [8]
III. Information Security
In the past, people used to carry their money, gold and silver
with them, with a big chance of losing them. Then they realized
that they needed a safe place and should avoid carrying expensive
things. Banks then started operating by guaranteeing the security
of customers' savings. We are not straying far from our topic; we
are trying to show its importance. Today, information in a
warehouse can be much more important than savings in banks.
Transferring information needs to be as secure as transferring
savings. Companies pay a lot of money to make their data as
secure, confidential and accessible as possible.
Fig 5: Governments Security Classification Cost 2009
http://www.govinfosecurity.com/articles
Fig 5 shows that the US government spends more on information
security than on other security matters. Everyone is concerned
about the security of his or her information. Indeed, we need it
most of the time to minimize breaches, even though it cannot end
them.
a. History
During World War II, armies and governments needed to avoid
information leaks. They focused on developing new technologies to
help hide highly secret information. Cryptography, the study of
hiding information, is one of the most popular and powerful of
these techniques and is still used today. "The US department of
Defense and the Department of State improve this technique since
the 1970s with expertise in cryptography." [9]
Encryption was once used only by governments, but now it is used
by organizations and individuals as well. It is easy to encrypt
your email so that no one other than the receiver can read it in
transit. Information security has become an ongoing learning
process in a large field that includes techniques, algorithms,
issues, and so on. Cloud computing, for instance, is a technology
for sharing and saving information easily and safely on servers.
Information security is taken seriously in many sectors, such as
business and healthcare. As the world becomes more concerned about
data security, governments and organizations add new principles
and strict laws to guarantee information security. The ISO27K
standards were established by ISO (International Organization for
Standardization) to protect the information on which we all
depend. Although laws exist, computer crimes are increasing;
making people aware of how to avoid information security problems
may increase the security of their information.
b. Definition
There is no universal definition of information security, but we
can say it is the process of protecting data by granting
authorization to see and use certain data. To understand
information security we need to understand its three aspects:
confidentiality, integrity and availability.
First, the data must be confidential, so that every user's
information in a system is kept at a very high level of privacy
and no one can reach it without his or her permission. Providing
passwords and IDs can serve this purpose. But confidentiality is
not achieved by the system alone, in other words by the DBMS
(database management system).
Take the example of a person who saves sensitive information
related to his company on a USB drive with no access restriction
(anyone who holds the drive can see the file), and one bad day the
USB drive is stolen. Another example is someone who owns a credit
card and sets his password to all zeros or to his birth date. In
both cases the system provided a privacy option to the person, but
it was not used properly. Let us move to a more complex situation:
a company with a very large database of customer information.
Hiding all the data is not a good idea, because users want to
access as much data as possible with few constraints. It is
difficult for the security system to know which data is sensitive
and which is not. Precision is an approach whose goal is to reveal
as much non-sensitive data as possible while protecting the rest
(the sensitive data).
We move to the integrity aspect: the data must be consistent and
reliable, matching the intended data, to minimize data loss and
inconsistencies; information should not be changed or removed
arbitrarily. "A successful attack can happen when integrity is
violated first then the system availability or
confidentiality"[10]. The DBMS supports this aspect by analyzing
and reducing the failures that could happen. Because such failures
are common and reconstruction is costly, integrity is very
important for organizations.
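One common way to detect that stored data has been changed is to keep a cryptographic digest of each record and compare it later. The sketch below uses SHA-256 from Python's standard hashlib module. The record contents and the name fingerprint are hypothetical, and this illustrates the integrity idea rather than describing how any particular DBMS works.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical record; its digest is stored when the record is written.
record = b"customer=1042;balance=250.00"
stored_digest = fingerprint(record)

# Later, recompute the digest and compare it with the stored one.
unchanged = fingerprint(record) == stored_digest

# A modified copy of the record no longer matches the stored digest.
tampered = b"customer=1042;balance=950.00"
modification_detected = fingerprint(tampered) != stored_digest
```

A plain digest only helps if the stored digest itself is protected. When an attacker could also rewrite the digest, a keyed construction such as an HMAC is normally used instead.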
Last but not least is the availability aspect, which serves the
sharing of information. A system with correct control, storage and
communication processes serves the availability aspect.
c. Risk Management
Risk management in data refers to the guidelines used to reduce
security risks in data to an acceptable level. This is done by
identifying the weaknesses in the security system that give rise
to threats. A security system needs risk management to deliver the
value of security well. In other words, it provides a backup plan
for when something bad happens. This is not limited to security
issues. It extends to managing the operational and economic costs
of establishing a high level of protection for the IT systems and
data that support an organization. Some impacts cannot be measured
in specific units, but they can be described in terms of high,
medium, and low impact, for instance loss of public confidence or
loss of credibility. In this research, we are concerned only with
information security risk management rather than business risk
management.
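The qualitative scale of high, medium, and low can be turned into a simple lookup. The following sketch combines a likelihood rating and an impact rating into an overall risk level. The numeric weights and the thresholds are assumptions chosen for illustration; they are not values taken from a standard.

```python
# Hypothetical numeric weights for the qualitative ratings.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_level(likelihood, impact):
    """Combine two qualitative ratings into an overall qualitative risk."""
    score = LEVELS[likelihood] * LEVELS[impact]
    # Thresholds are illustrative assumptions, not part of any standard.
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


# Example: an unlikely event with a severe impact, such as loss of credibility.
rating = risk_level("low", "high")
```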
To manage risk in information security, we must first collect the
factors that could affect it, which are:
o Hardware
o Software
o People who are using the system
o Sensitive data
o System interfaces
o Critical
"A threat is a circumstance or event with a harm effect to an
information system". Threat sources are common. They can be human
threats, caused by people such as hackers, or environmental
(physical) threats, such as a power failure. Also, a threat can
cause direct damage (primary threat) or long-term damage
(secondary threat).
d. RFID security
RFID refers to Radio Frequency Identification systems, a leading
technology for identifying items and people that can bring
security benefits. It works automatically over private networks,
using advanced technologies to minimize failures and attacks.
RFID is widely used now because, in almost all industries, many
things must be easily tracked, recorded and identified in a very
short time. But can this technology be the savior from hacking and
leaking? Can people stop worrying about the security of their
credit cards when they use this technology? As mentioned before,
information security is an ongoing process, because there are
always two groups of people working against each other: malicious
people and good people. A thief could steal your credit card from
your wallet, but an electronic pickpocket using RFID can steal
your credit card information while the card is in your wallet,
without you even knowing. Unfortunately, this can put millions of
people at risk. An electronic pickpocket uses an RFID reader to
scan a wallet or bag, and the credit card information, such as the
expiration date, number and name, is immediately known. The risk
is not limited to credit cards; it could happen with anything that
uses the technology, such as passports that contain RFID.
IV. Security Matter in Data Mining
Both data mining and information security have been the subject of
much research during the last few years, and researchers suggest
that raising security must be at the top of the data mining
issues. Data mining techniques can be applied to handle security
problems, just as they can cause other security problems. Data
mining has become common in both the private and public sectors.
In fact, data mining is a set of smart techniques for gathering
and analyzing statistical information to help in decision making.
Many of these sectors sell the data to other sectors, which use
the data for their own purposes. As a result, the privacy of
individuals is affected without their consent.
a. Privacy Preserving Data Mining (PPDM)
Many institutions are spending more resources on developing their
data mining skills and on looking for new research on data mining.
Privacy Preserving Data Mining (PPDM) is a new research area that
helps researchers and practitioners identify problems and
solutions for data mining with respect to security concerns. Its
aim is to secure the information using different kinds of
algorithms and techniques. Ignoring or limiting the need for
information security can threaten to derail data mining projects.
Privacy concerns have increased because of the misuse of
information; PPDM aims to prevent this misuse and to ensure that
no private data is revealed.
Privacy preservation ensures unconditionally safe access to the
data and does not require any privacy expertise from the data
miner. Most research on privacy has focused on the theoretical
properties of data mining. Recent studies have focused on the use
of privacy in practical applications such as banking, healthcare,
and airlines.
PPDM deals with the problem of learning accurate models over
aggregate data while protecting privacy at the level of individual
records [9]. PPDM recognizes that individuals want more
information security, which works against the knowledge discovery
used for decision making. In short, there is a conflict between
the privacy that individuals need and the analysis that
organizations need. The question is: can we achieve accurate
analysis without access to individuals' information?
Secure multiparty computation techniques allow servers to compute
functions over their local data while ensuring that no server
learns anything about the data of the other servers except the
output of the function. The computation is secure if each party
learns nothing beyond what can be derived from its own input and
the output, which guarantees strong privacy.
PPDM is not the only field that uses data mining to enhance
information security. Many articles, workshops and studies have
been produced and used by sectors such as business, government and
healthcare. In short, PPDM is one field among many that share the
same concern: security in data mining.
These are some of the newest research topics across all sectors:
 Privacy and security when mining outsourced private data
 Privacy threats induced by data mining
 Data mining for anomaly detection
 Using data mining for intrusion detection and prevention
 Privacy-preserving link and social network analysis
 Security and privacy in spatio-temporal data mining.
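A classic illustration of the secure multiparty computation idea discussed in this section is a secure sum based on additive secret sharing: each party splits its private value into random shares, and only sums of shares are ever revealed. The sketch below simulates three parties inside one process, so it shows the arithmetic only and not a network protocol. The names make_shares, secure_sum and MODULUS are hypothetical.

```python
import secrets

# All arithmetic is done modulo a large number; inputs and their total
# are assumed to be non-negative and smaller than MODULUS.
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split value into n random shares that add up to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Simulate parties summing their values without revealing them."""
    n = len(private_values)
    # Row i holds the shares created by party i; share j is sent to party j.
    all_shares = [make_shares(v, n) for v in private_values]
    # Each party j publishes only the sum of the shares it received.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS
                    for j in range(n)]
    return sum(partial_sums) % MODULUS


# Hypothetical private counts held by three servers.
counts = [120, 45, 300]
total = secure_sum(counts)
```

Each individual share is uniformly random, so a single party's view reveals nothing about another party's input beyond what the final total implies.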
Unfortunately, individuals are the victims, because they do not
know what is happening behind their backs. Take social network
databases as an example. Individuals share valuable information
with each other, sometimes without meaning to. Some analysts then
start mining and analyzing that information and sell it to other
companies. The concern for the future is that if these companies
keep tracing this data, privacy will become unattainable, because
someone's data could be found in documents on other websites
without his or her permission or knowledge. Spokeo is a website
that aggregates and organizes people-related information from
internet sources. It gives a comprehensive snapshot of
people-related public data from the internet. A person can be
found by name, phone, username, email and even friends. Two points
must be noted about this website. First, the website is mining
information. Even though the data comes from public sources,
gathering it makes this sensitive data less secure, which is
troubling. Second, the information may not be accurate.
b. Security Classification for Information
It is important to know that not all information in a set has the
same level of protection. For instance, old information that has
not been updated for a long time is usually no longer needed, or
no longer as private as it was. Data can be divided into classes
depending on the security level assigned to each class, as shown
in Figure 6.
Figure 6: The hierarchy of security classification for information
http://www.centos.org/docs/5/html/Deployment_Guide-en-US/sec-mls-ov.html
Classifying data according to security level can help shape the
data mining process, because it shows which data can be gathered
and which cannot, and helps avoid using unneeded data such as old
data. The company will also know which data may be sold and which
may not. Handling noisy or incompatible data is an issue in data
mining; classifying information according to security level can
help reduce the problem. The information requiring protection
should be described clearly according to the classification.
One of the aims of classifying data by security level is to save
resources: assigning all the data to a very high secrecy level
would waste many resources.
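As a minimal sketch of how a classification can shape the mining process, the fragment below filters records by their assigned security level before they are handed to any mining step. The level names, the record layout and the function name are hypothetical and are not taken from Figure 6.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical security levels, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3


def select_for_mining(records, max_level):
    """Keep only records whose classification does not exceed max_level."""
    return [r for r in records if r["level"] <= max_level]


# Hypothetical records, each tagged with a security level.
records = [
    {"id": 1, "level": Level.PUBLIC},
    {"id": 2, "level": Level.CONFIDENTIAL},
    {"id": 3, "level": Level.INTERNAL},
    {"id": 4, "level": Level.SECRET},
]

# Only data up to INTERNAL may be gathered for this analysis.
allowed = select_for_mining(records, Level.INTERNAL)
allowed_ids = [r["id"] for r in allowed]
```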
c. Information Security in Data Mining
There is clearly a great need for learning and mining methods with
sufficient privacy and security guarantees in fields that require
decision making [11]. It is also important to develop mechanisms
for processing data without compromising data privacy.
Differential privacy is a theory that serves both aspects at the
same time: information privacy and data mining. Its aim is to give
accurate answers to queries on statistical databases while
minimizing the chances of identifying individual records. Data
cleansing is another technique; it identifies and removes
suspicious data so that the most effective and reliable data is
used during data mining. The result is more secure information and
more accurate analysis. Existing research efforts (Maletic and
Marcus 2000; Orr 1998) suggest that the average error rate of a
dataset in a data mining application is typically around 5%-10%
[12].
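A standard way to realize differential privacy for a counting query is the Laplace mechanism: noise drawn from a Laplace distribution with scale 1/epsilon is added to the true count, since adding or removing one record changes a count by at most 1. The sketch below is an idealized illustration only. The data, the name private_count and the choice of epsilon are hypothetical, and Python's random module is not suitable for production privacy systems.

```python
import random


def private_count(records, predicate, epsilon, rng):
    """Return the number of matching records plus Laplace noise."""
    true_count = sum(1 for r in records if predicate(r))
    # A count has sensitivity 1, so the Laplace scale is 1 / epsilon.
    # The difference of two exponential variables with rate epsilon
    # follows a Laplace distribution with that scale.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Hypothetical statistical database of ages.
ages = [23, 35, 41, 52, 67, 29, 38]
rng = random.Random(0)  # seeded only to make the illustration repeatable

# Noisy answer to the query "how many people are 40 or older?"
noisy_answer = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
```

A smaller epsilon adds more noise and gives stronger privacy; a larger epsilon gives more accurate answers. This is the same conflict between privacy and analysis that the section describes.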
Clickstream is a technique used to record what computer users
click on while they browse the web. When someone browses a page,
the URL of the page and the IP address of the user are saved on
the web server. Clickstream analysis can reveal the behavior of
users or customers and how they interact with a certain website.
Using clickstream data in marketing can help companies choose the
best website on which to publish their advertisements. They can
also send advertisements by email to those who use the website
most often. This is ideal for knowledge discovery, but less so for
privacy. Through clickstream data, they can know all the pages a
user browsed and the exact time of each visit. The user can also
be identified easily if he or she publishes some personal
information. Some web providers have started to market these
analyses and statistics. This process is considered legal because
providers distribute only users' behavior, in a way that helps
many businesses make decisions, and they are not allowed to give
out private information about users, such as names or IP
addresses. But sometimes this information is easy to obtain,
because some people do not know what could happen if their
information is published. Not all internet providers give their
customers a description, or even a hint, of exactly what they do,
especially when it comes to privacy. The Google search engine has
another point of view on customer privacy related to clickstream.
By clearing cookies and turning the cable modem off for a few
minutes, the customer's IP address will be recognized as a new IP
address.
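The distinction between sharing behavior and sharing identity can be sketched as follows: page visits are counted for analysis, while raw IP addresses are replaced by salted hashes before the summary leaves the server. The log entries, the salt and the function names are hypothetical. Salted hashing is only pseudonymization; anyone who learns the salt could recover the addresses by trying all possible values.

```python
import hashlib
from collections import Counter


def pseudonymize(ip, salt):
    """Replace an IP address with a short salted SHA-256 digest."""
    digest = hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()
    return digest[:12]


def summarize(log, salt):
    """Return page hit counts and the set of pseudonymous visitor IDs."""
    page_hits = Counter(url for _ip, url in log)
    visitors = {pseudonymize(ip, salt) for ip, _url in log}
    return page_hits, visitors


# Hypothetical web server log: (IP address, requested URL).
log = [
    ("203.0.113.5", "/home"),
    ("203.0.113.5", "/offers"),
    ("198.51.100.7", "/home"),
]

page_hits, visitors = summarize(log, salt="example-salt")
```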
Information security in health care is a good example of managing
information security. Patients' information must remain private
and secure, because misuse, exposure, or loss of data may harm
both individuals and organizations. To understand the security
system, data miners should first understand the Generally Accepted
System Security Principles (GASSP) published by the International
Information Security Foundation, which were updated in 1997 [13].
Owners should provide a responsible and accountable system, and
the security of information systems should be explicit. The
security of information in a system should be provided at a high
level to all users without discrimination, respecting the rights
and interests of others. Systems should respond to breaches of,
and threats to, the security of information and information
systems. “Measures for the security of information systems should
be coordinated and integrated with each other and with other
measures, practices and procedures of the organization so as to
create a coherent system of security”[14].
Dynamic Data Web technology was developed by the company Quiterian
to enable multiple solutions to be developed in the business
sector. Using Dynamic Data Web, companies can study their
customers' behavior, identify the key factors of business success,
and identify risks in order to make the best decisions, in a
continuous process. Dynamic Data Web is described as the fastest
and most powerful analytical business intelligence platform on the
market. What makes it different is that it includes easy and
powerful analytical techniques for big data. "It has very good
security rules and personal data protection control (used in
Police, Health or Banking)"[15]. Knowing that a company uses this
kind of technology would make it more trustworthy. As a result,
big companies such as Vodafone and TMB have started to use it.
V. Conclusion
In conclusion, data mining is the science of extracting helpful
information from large data sets or databases. Technologies evolve
every day, and more individuals, companies and organizations are
starting to use them for convenience and to stay ahead of the
competition. On the other hand, these technologies must reach a
good security level to guarantee the safety and reliability of
information so that they can serve their goals. Information
security is an old concept, first used for military needs and
later needed by individuals and groups. Information security
professionals constantly face new challenges, which push them to
find the best (but never final) security for particular
information and to make backup plans. Information security has
three aspects: confidentiality, integrity and availability.
Much research has been carried out and adopted by big companies
and universities on the security of information in data mining.
Protecting the privacy of sensitive information used for data
mining is a major issue discussed by researchers these days.
Classifying information by security level can provide more
security for the information. Some organizations mine individuals'
information and sell it to other companies. This has become an
ethical issue: companies gain more profit and individuals become
the victims. This might end the era of private information. Data
mining can bring risks to information security and privacy, but
researchers are developing new technologies and algorithms to
strike a balance between privacy on the individual's side and data
analysis on the organization's side.
References
[1] Hand, David, Heikki Mannila, and Padhraic Smyth. Principles of Data Mining. Library of Congress Cataloging-in-Publication Data, 2001. Print. QA76.9.D343 H38 2001.
[2] "Data Warehouse Definition - What Is a Data Warehouse." 1Keydata Home of Free Online Tutorials. Web. 04 Jan. 2011. <http://www.1keydata.com/datawarehousing/data-warehousedefinition.html>.
[3] Berson, Alex, Stephen Smith, and Kurt Thearling. Building Data Mining Applications for CRM. McGraw-Hill Companies, December 22, 1999. Print.
[4] Chapple, Mike. "Regression." About.com. About.com, 2007. Web. Accessed 3 Dec. 2010. <http://databases.about.com/od/datamining/g/regression.htm>.
[5] Chapple, Mike. "Clustering (Data Mining) Definition." About Databases: Microsoft Access, SQL Server, Oracle and More! Web. 01 Jan. 2011. <http://databases.about.com/od/datamining/g/clustering.htm>.
[6] Kulkarni, Sushil. "Association Rules in Data Mining Ppt Presentation." AuthorSTREAM Online PowerPoint Presentations and Slideshow Sharing. Web. 04 Jan. 2011. <http://www.authorstream.com/Presentation/sushiltry-108428association-rules-data-mining-science-technology-ppt-powerpoint/>.
[7] Andreescu, Andrea. "Forecasting Corporate Earnings: A Data Mining Approach." The Swedish School of Economics and Business Administration, 2004. <http://www.pafis.shh.fi/graduates/andand02.pdf>.
[8] Palace, Bill. "Data Mining." Anderson. June 1996. Web. 14 Feb. 2011. <http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/index.htm>.
[9] Pfleeger, Charles P., and Shari Lawrence Pfleeger. "Elementary Cryptography." Security in Computing. Third ed. New Jersey: Prentice Hall, 2003. 35-91. Print.
[10] Fraquad. "Privacy Preserving Data Mining." All About Education. Inspire and Ignite, 20 Dec. 2009. Web. 6 Dec. 2010. <http://www.inspirenignite.com/privcy-preserving-data-mining/>.
[11] "Workshop on Privacy and Security Issues in Data Mining and Machine Learning." ECML PKDD2010. ECML PKDD 2010. Web. <http://fias.uni-frankfurt.de/~dimitrakakis/workshops/psdml-2010/>.
[12] Marcus, Andrian, and Jonathan Maletic. "Data Cleansing." Data Mining and Knowledge Discovery Handbook. New York: Springer, 2005. 50-55. Print.
[13] Poore, Ralph Spencer. International Information Security Foundation. "Generally Accepted System Security Principles." 1999. Web. <http://www.infosectoday.com/Articles/gassp.pdf>.
[14] "Quiterian Data Mining Y Análisis Predictivo Para Usuarios De Negocio." Quiterian - Dynamic Data Web - Análisis Dinámico De Datos HOME. 10 Jan. 2011. Web. Accessed 14 Jan. 2011. <http://www.quiterian.com/site/index.php>.
[15] Cooper, Ted, and Jeff Collman. Managing Information Security and Privacy in Healthcare. Department of Ophthalmology, Stanford University Medical School, Palo Alto, California; ISIS Center, Georgetown University School of Medicine; Department of Radiology, Georgetown University Medical Center, Washington D.C., 2005. Web. <http://ai.arizona.edu/mis596a/book_chapters/medinfo/Chapter_04.pdf>.
[16] "Confidentiality, Integrity, Availability (CIA) - Privacy / Data Protection Project (c)2002-2005." Privacy / Data Protection Project. University of Miami, 24 Apr. 2006. Web. Accessed 10 Dec. 2010. <http://privacy.med.miami.edu/glossary/xd_confidentiality_integrity_availability.htm>.
[17] Sieglein, William. "Assessments/Risk Assessments." Security Planning & Disaster Recovery. By Eric Maiwald. California: Brandon A. Nordin, 2002. Print.
[18] Montgomery, David. "Electronic Pickpocket Stoppers." The Washington Post 2 Apr. 2008. Print. Accessed 14 Dec. 2010.
[19] Thearling, Kurt. "Data Mining and Privacy: A Conflict in the Making?" Data Mining and Analytic Technologies (Kurt Thearling). Web. Accessed 14 Dec. 2010. <http://www.thearling.com/text/dsstar/privacy.htm>.
[20] Under, Filed. "Principles of Information Security." Www.informationintegrity.org. Www.informationintegrity.org, 20 Oct. 2010. Web. Accessed 11 Dec. 2010. <http://www.informationintegrity.org/principles-of-informationsecurity/>.
[21] Kimball, Ralph, and Margy Ross. The Data Warehouse Toolkit. 2nd ed. Wiley, April 26, 2002. Print.
[22] "ESTARD Software :: Data Mining Software :: ESTARD Data Miner." ESTARD Software. Data Mining Software for Business & Science. Web. Accessed 01 Jan. 2011. <http://www.estard.com/products/>.