Data Mining as Pre-EDD Investigatory Tool

Team 9
Data Mining Overview
• Use of sophisticated data analysis tools to
discover previously unknown, valid
patterns and relationships in large data
sets
– E.g., Statistical models, mathematical
algorithms, machine learning methods
• Can be performed on many types of
data including those in structured,
textual, spatial, Web, or multimedia forms
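As a rough illustration of what "discovering patterns and relationships" can mean in practice, the sketch below counts which pairs of attributes co-occur across a set of records and keeps those that meet a minimum support threshold. The function name, the threshold, and the sample records are hypothetical and made up for this example; real tools use far more sophisticated statistical and machine learning methods.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(records, min_support):
    """Return attribute pairs that co-occur in at least min_support records."""
    counts = Counter()
    for record in records:
        # Deduplicate and sort so each pair is counted once per record
        # and always appears in the same order.
        for pair in combinations(sorted(set(record)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical records (invented for illustration only).
records = [
    ["wire_transfer", "new_account", "overseas"],
    ["wire_transfer", "overseas"],
    ["new_account", "cash_deposit"],
    ["wire_transfer", "new_account", "overseas"],
]

pairs = frequent_pairs(records, min_support=3)
print(pairs)
```

A pair that passes the threshold is only a candidate pattern; whether it is valid or meaningful still has to be judged by an analyst.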
Data Mining Overview cont.
• Used in both government and industry
• Most Common Purposes
– Improving service or performance
– Detecting fraud, waste, and abuse
– Analyzing scientific and research information
– Managing human resources
– Detecting criminal activities or patterns
– Analyzing intelligence and detecting terrorist
activities
Advantages as Pre-EDD Tool
• Assists researchers by speeding up their
data analysis, allowing them
more time to work on other projects.
• Improves effectiveness by identifying
patterns and relationships that may
otherwise go unnoticed.
• Advances in technology are allowing for
more efficient techniques.
Data Mining Initiatives
• Able Danger
– The Department of Defense characterized
Able Danger as a demonstration project to
test analytical methods and technology on
very large amounts of data.
• National Security Agency (NSA)
– Speculation on NSA terrorist surveillance
dating back to at least 2002, involving the
domestic collection, analysis, and sharing of
information.
Data Mining Initiatives cont.
• The Novel Intelligence from Massive
Data (NIMD) Program
– The NIMD program focuses on
developing data mining and
analysis tools for working with
massive data.
Factors Affecting Use of Data
Mining as Pre-EDD Tool
• Data Quality
– In the wake of ChoicePoint, LexisNexis, etc., there is full awareness of
the commercial risks of privacy breaches, poor data quality,
inaccuracy, etc. Access vs. Accuracy
• Interoperability
– What use is data without proper context and resources?
Data mining alone, without collaboration and data sharing, might
not be enough to stop criminals and terrorists or to be
meaningful. Quantity vs. Meaningfulness
• Mission/Purpose
– Limiting privacy laws may be useful, but abuse or other uses
of the data may occur outside the original intention.
Authenticity vs. Illegitimacy
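The data quality factor above can be made concrete with a simple pre-mining check. The following is a minimal sketch, assuming records arrive as dictionaries; the function name, the field names, and the sample rows are hypothetical. It counts records with missing required fields and exact duplicates, two of the problems that undermine accuracy before any mining begins.

```python
def quality_report(rows, required_fields):
    """Count rows with missing required fields and exact duplicate rows."""
    missing = 0
    duplicates = 0
    seen = set()
    for row in rows:
        # A field counts as missing if it is absent or empty.
        if any(not row.get(field) for field in required_fields):
            missing += 1
        # Two rows are duplicates if all required fields match exactly.
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"total": len(rows), "missing": missing, "duplicates": duplicates}


# Hypothetical rows (invented for illustration only).
rows = [
    {"name": "A. Smith", "dob": "1970-01-01"},
    {"name": "A. Smith", "dob": "1970-01-01"},
    {"name": "B. Jones", "dob": ""},
    {"name": "C. Lee", "dob": "1982-05-17"},
]

report = quality_report(rows, required_fields=["name", "dob"])
print(report)
```

A check like this only catches mechanical problems. It says nothing about whether a well-formed record is actually correct, which is the harder side of the access versus accuracy trade-off.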
Limiting Privacy Laws Equals
Increasing Oversight?
• H.R. 1502, the Civil Liberties Restoration Act of 2005
– Any department or agency engaged in any activity to use or
develop data-mining technology would be required to submit a public report to
Congress
– A list and analysis of the laws and regulations that would
apply to the data mining activity
– Laws and regulations that would need to be modified to
allow the data mining activity to be implemented
– Information on how individuals whose information is being
used in the data mining activity will be notified of the use of
their information
– These reports would be due to Congress no later than 90 days
after the enactment of H.R. 1502, and would be required to
be updated annually to include “any new data-mining
technologies.”
Sources
• http://www.gao.gov/new.items/d04548.pdf
• http://www.gao.gov/new.items/d05866.pdf
• http://www.fas.org/sgp/crs/intel/RL31789.pdf
• http://www.contentanalyst.com
• http://en.wikipedia.org/wiki/Data_mining