B12
6005
Disclaimer — This paper partially fulfills a writing requirement for first year (freshman) engineering students at the University
of Pittsburgh Swanson School of Engineering. This paper is a student, not a professional, paper. This paper is based on
publicly available information and may not provide complete analyses of all relevant data. If this paper is used for any
purpose other than these authors’ partial fulfillment of a writing requirement for first year (freshman) engineering students at
the University of Pittsburgh Swanson School of Engineering, the user does so at his or her own risk.
DATA MINING: A SECURITY RISK OR A SECURITY ADVANTAGE?
Ben Birkett, beb96@pitt.edu, Sanchez 10:00, Laura Friedland, lnf14@pitt.edu, 6:00
Revised Proposal — Within the past 5 years, data mining has
become increasingly prominent in the United States due to
recent scandals and exposés on the topic. As defined by
Professor Jason Frand of UCLA, data mining “is the process
of analyzing data from different perspectives and summarizing
it into useful information” [1]. Due to major advances in
technology and the average person’s growing dependence on
technology, data mining now affects every citizen in the United
States, whether he or she is aware of it or not. Within the past
ten years, data mining has become an essential tool that
government organizations, such as the National Security
Agency and the Central Intelligence Agency, use to protect the
country and gain intelligence on potential threats.
Within data mining, there are several techniques, both old
and new, referred to as neighborhoods, clustering, trees,
networks and rules. Each of these techniques utilizes
algorithms to find connections and trends in people’s everyday
computer activity [2]. This allows the government to identify
potential threats both inside and outside the country more
easily. However, it also raises the fear that if the U.S.
government has access to every citizen’s computer activity, it
could also have access to personal files and
documents, which many argue violates the rights of
citizens [3]. Data mining is a continuous innovation [1]. It is
constantly growing and changing as the technology the world
uses grows and changes. New techniques and uses are
frequently discovered; however, its most useful application is
security, despite the controversy it generates.
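As a rough illustration of one of the techniques named above, the sketch below shows how clustering might group activity records that resemble one another. It is a minimal k-means example in Python and is not taken from the cited sources; the function name kmeans, the toy activity data, and the starting centroids are all assumptions made for illustration.

```python
from math import dist


def kmeans(points, centroids, iterations=10):
    """Group points around the nearest centroid, then re-center; repeat."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(c)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids, clusters


# Hypothetical toy data: (logins per day, megabytes uploaded per day).
activity = [(1, 2), (2, 1), (1, 1), (9, 10), (10, 9), (10, 10)]
centers, groups = kmeans(activity, centroids=[(0, 0), (5, 5)])
print(groups)
```

On this toy data the low-activity and high-activity records fall into separate groups. Real systems would work with far more records and features, and the choice of starting centroids can change the result.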
This paper will discuss the technology and methods behind
data mining, how data mining works, and how it helps to
improve national security. The ethics and the drawbacks
regarding privacy will also be discussed in depth with reference
to the NSA scandal that came to light when Edward
Snowden revealed to the public the extent of the NSA’s access to
citizens’ private computer activity. Both technical and ethical
articles will be used to highlight and discuss the potential,
good and bad, and the controversy of data mining.
Data mining methods are expanding rapidly, allowing for the
mass collection of information. This information is then
used by many government agencies to
identify threats, gain intelligence, and obtain a better
understanding of enemy networks. However, the ability to
collect this information from any computer calls into question
whether data mining violates the average
citizen’s privacy, and it has created a debate over whether data mining
is ethically justifiable.
University of Pittsburgh Swanson School of Engineering 1
01/28/16
REFERENCES
[1] J. Frand. (2010). “Data Mining: What is Data Mining?” (online article).
http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm
[2] A. Berson. (2000). Building Data Mining Applications for
CRM (Enterprise). McGraw-Hill Education. (Print book).
[3] J. Pappalardo. (2013, Oct.). “NSA Data Mining: How It
Works.” Popular Mechanics. (online article). DOI: 00324558
ANNOTATED BIBLIOGRAPHY
A. Berson. (2000). Building Data Mining Applications for
CRM (Enterprise). McGraw-Hill Education. (print book).
From an educational book about data mining, we are using
an excerpt which gives an overview of data mining techniques.
Within this overview, the book explains various specific
methods for data mining and discusses how to use and apply
them. This technical source will help us explain how data
mining works and will let us delve into specific techniques for
data mining.
E. Svoboda. (2009). “Digital Exposure.” Discover. (print
article). Vol. 30, Issue 10
This article touches on concerns about data mining and
how it can be used against us. It also discusses the way
businesses use data mining. The article relates the normal,
unsuspecting person to the world of data mining and shows
how his or her data can be, and probably is being, obtained and
used. In our paper, this article will help relate data mining to
people who do not yet know much about it and provide
reasons why they should learn about it.
G. Tsiafoulis, C. Zorkadis. (2010). “A neural-network
clustering-based algorithm for privacy preserving data
mining.” Computational Intelligence and Security (CIS).
(online article). ISBN: 978-1-4244-9114-8. pp. 401-405
This article from a conference on computational
intelligence proposes methods for preserving privacy in the use
of data mining. Specifically, it proposes providing various levels of
anonymity for certain data, depending on what the data is. This article
applies technical aspects of data mining to the ethics
surrounding it. It will help us connect the
technical methods of data mining to the issue of privacy
related to it.
J. Bamford. (2015). “The Black-and-White Security
Question.” Foreign Policy. (print article). pp. 70-75
This article, from a magazine which focuses on American
foreign policy, puts forth the idea of using government
intelligence, such as that obtained through data mining, to help
people by making it public. In presenting this idea, the article
discusses the ethical issues relating to the intelligence that the U.S.
government collects through data mining. In our paper, this
article can provide discussion relating to the ethics of data
mining.
S. Nath. (2006, Dec.). “Crime Pattern Detection Using Data
Mining.” Web Intelligence and Intelligent Agent Technology.
(online article). DOI: 10.1109/WI-IATW.2006.55
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4053200&tag=1
This article, written for a computer security and intelligence
conference, discusses a data mining clustering model that can
be used to assist in finding evidence, solving crimes, and
flagging potential future crimes. It provides statistics and
efficiency reports as well as case studies in which the model was used. This
article will be useful when we discuss the wide range of
applications and effects data mining has in security issues,
both domestic and foreign.
J. Frand. (2010). “Data Mining: What is Data Mining?”
(online article).
http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm
This article, published by the University of California, Los
Angeles, and written by professor of mathematics Jason Frand,
details what data mining is, how it works, and the possibilities
it presents. This article defines the basic structures of data
mining, such as classes and clusters, and discusses in depth how
decision trees work and how they can be applied to security.
Information from this article will help us clarify what the
technology is and define in simple terms how it works.
J. Pappalardo. (2013, Oct.). “NSA Data Mining: How It
Works.” Popular Mechanics. (online article). DOI: 00324558
http://web.a.ebscohost.com/ehost/detail/detail?sid=7dcfa7aeb20b-4aad-98eebb7853923ef6%40sessionmgr4001&vid=5&hid=4104&bdata=JnNpdGU9ZWhvc3QtbGl2ZQ%3d%3d#AN=90650431&db=aph
This article, published in the respected “Popular
Mechanics” magazine, discusses the ethics of data mining in
reference to personal internet security and the NSA Snowden
scandal. It describes information landscapes, exabytes,
metadata tracking, and worldwide data, and the possible
security threats these concepts pose. This article also goes
into depth about data leaks and security issues. This article will
be useful when we discuss the ethics of data mining
and the threat to privacy it could pose.
M. Shree, J. Visumathi, P. Jayarin. (2016). “Identification of
attacks using proficient data interested decision tree algorithm
in data mining.” Advances in Intelligent Systems and
Computing. (online article). DOI: 10.1007/978-81-322-2674-1_60
https://www.engineeringvillage.com/search/doc/abstract.url?pageType=expertSearch&searchtype=Expert&SEARCHID=bd4c0c72Md34bM4967Maf04M824cacb492d0&DOCINDEX=3&database=3&format=expertSearchAbstractFormat&dedupResultCount=&SEARCHID=bd4c0c72Md34bM4967Maf04M824cacb492d0
This article, published in a professional and respected journal
specializing in computer technology, discusses how data
mining can use decision tree algorithms to predict and identify
possible attacks. It discusses in depth how decision trees can
search private networks for keywords and identify possible
threats. It also discusses the effectiveness of the program and
how changes could be made. This article will assist us in
showing real-life applications of data mining and how it works
to identify security issues.