Data, Privacy, Security, and The Courts:
Where Are We?
And, How Do We Get Out Of Here?
Richard Warner
Chicago-Kent College of Law
rwarner@kenlaw.iit.edu
We Live in the Age of Big Data
• “Big Data” refers to the acquisition and analysis of
massive collections of information, collections so large
that until recently the technology needed to analyze them
did not exist.
• And more:
• Datafication
• Massive amounts of unstructured “messy” data
• Otherwise unnoticed patterns
• Indiscriminate collection
• Indefinite retention for unpredictable future uses
Datafication
Massive Messy Data
• Big Data analysis requires collecting
• massive amounts of
• messy data.
• Messy data: data that is not in a uniform format, as it would be in a traditional database, and that is not annotated (semantically tagged).
• A key technological breakthrough was finding ways to manipulate and analyze such data (see the sketch below).
• Massive amounts: think of every tweet ever tweeted. They are all in the Library of Congress.
• 400 million tweets a day in 2013.
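To make “messy” concrete, here is a minimal sketch of coercing non-uniform, untagged records into a single schema. The record layouts, field names, and the `normalize` helper are hypothetical illustrations, not any actual pipeline; real systems do this over billions of records, but the problem is the same.

```python
from datetime import datetime, timezone

# Hypothetical raw records in three inconsistent formats -- the kind of
# "messy," untagged input Big Data analysis must tolerate.
raw = [
    {"user": "a1", "text": "Fatigue and headache again", "ts": 1371081600},
    {"screen_name": "b2", "body": "feeling fine", "date": "2013-06-13"},
    "c3|2013-06-13T09:30:00|thirsty all day",  # a pipe-delimited log line
]

def normalize(record):
    """Coerce one messy record into a uniform (user, time, text) tuple."""
    if isinstance(record, str):  # delimited text line
        user, stamp, text = record.split("|", 2)
        when = datetime.fromisoformat(stamp)
    else:  # dict with varying key names and date formats
        user = record.get("user") or record.get("screen_name")
        text = record.get("text") or record.get("body", "")
        stamp = record.get("ts") or record.get("date")
        when = (datetime.fromtimestamp(stamp, tz=timezone.utc)
                if isinstance(stamp, (int, float))
                else datetime.fromisoformat(stamp))
    return user, when, text.lower()

for row in (normalize(r) for r in raw):
    print(row)
```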
Patterns We Would Not Notice
• Big Data analytics can reveal important patterns that
would otherwise go unnoticed.
• Taking the antidepressant Paxil together with the anti-cholesterol drug Pravachol could result in diabetic blood sugar levels. This was discovered by
• (1) using a symptomatic footprint characteristic of very high blood sugar levels, obtained by analyzing thirty years of reports in an FDA database, and
• (2) then finding that footprint in Bing searches using an algorithm that detected statistically significant correlations (sketched below). People taking both drugs also tended to enter search terms (“fatigue” and “headache,” for example) that constitute the symptomatic footprint.
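The correlation detection in step (2) can be illustrated with a short sketch. Everything here is a stand-in: the footprint terms, the counts, and the `shows_footprint` and `two_proportion_z` helpers are hypothetical, not the actual analysis run over anonymized Bing logs.

```python
import math

# Hypothetical symptomatic footprint of very high blood sugar
# (illustrative terms, not the study's actual list).
FOOTPRINT = {"fatigue", "headache", "frequent urination", "thirst"}

def shows_footprint(query_terms, k=2):
    """A user 'shows the footprint' if at least k footprint terms
    appear anywhere in their search history."""
    return len(set(query_terms) & FOOTPRINT) >= k

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """One-sided two-proportion z-test: is the footprint rate among
    users searching for both drugs (group A) higher than baseline (B)?"""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return z, 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))  # z and 1 - Phi(z)

# Toy counts: shows_footprint(["paxil", "fatigue", "headache"]) is True;
# suppose 120 of 400 Paxil+Pravachol searchers show the footprint,
# versus 900 of 20,000 baseline users.
z, p = two_proportion_z(120, 400, 900, 20_000)
print(f"z = {z:.1f}, one-sided p = {p:.2g}")  # large z: a significant signal
```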
Obama and Big Data
• “Aiming to make the most of the fast-growing volume of digital data, the Obama Administration today announced a Big Data Research and Development Initiative. By improving our ability to extract knowledge and insights from large and complex collections of digital data, the initiative promises to help solve some of the Nation’s most pressing challenges.”
• Office of Science and Technology Policy, Obama Administration Unveils “Big Data” Initiative: Announces $200 Million in New R&D Investments (Executive Office of the President, March 29, 2012), http://www.whitehouse.gov/sites/default/files/microsites/ostp/big_data_press_release.pdf.
[Diagram: information flows among foreign governments, the government, institutions, and consumers and others seeking services: voluntary information sharing and reporting requirements from foreign governments; internal sharing, reporting, external sharing, investigation, and surveillance by the government; transparency requirements and privacy and security requirements on institutions.]
Indiscriminate Collection
• Big Data typically involves collecting diverse
types of data.
• “In an intelligence driven security model, the definition
of ‘security data’ expands considerably. In this new
model, security data encompasses any type of
information that could contribute to a 360-degree view
of the organization and its possible business risks.”
• Sam Curry et al., “Big Data Fuels Intelligence-Driven Security” (RSA,
January 2013), 4, http://www.emc.com/collateral/industry-overview/bigdata-fuels-intelligence-driven-security-io.pdf.
Indefinite Retention, Unpredictable Uses
• The information is typically retained for a long time
• to use in unpredictable ways,
• as the Pravachol/Paxil example illustrates.
• The example also illustrates the rationale: the discovery of patterns we might not otherwise notice.
Loss of Informational Privacy
• Informational privacy is the ability to determine for
ourselves what information about us others
collect and what they do with it.
• None of the developments just outlined can
happen without a loss of control over our data.
We Lose Control, They Gain It
[Diagram: our data flows to information aggregators, and from them to businesses and government; knowing our location data increases their power to control us.]
• “We can determine where you work, how you spend your time, and with whom, and with 87% certainty where you'll be next Thursday at 5:35 p.m.” (A toy version of such a prediction appears below.)
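The aggregator's boast is, at bottom, frequency counting over time-stamped location data. Here is a toy version; the pings, place names, and `predict` helper are hypothetical, a sketch of the idea rather than any vendor's actual model.

```python
from collections import Counter, defaultdict

# Hypothetical location pings: (weekday, hour, place). Real aggregators
# draw on phone GPS, Wi-Fi, and purchase records at enormous scale.
pings = [
    (3, 17, "gym"), (3, 17, "gym"), (3, 17, "gym"), (3, 17, "office"),
    (3, 17, "gym"), (3, 17, "gym"), (3, 17, "gym"), (3, 17, "gym"),
]

# Histogram of observed places for each (weekday, hour) slot.
slots = defaultdict(Counter)
for weekday, hour, place in pings:
    slots[(weekday, hour)][place] += 1

def predict(weekday, hour):
    """Most frequent past location for a time slot, with its empirical
    probability -- the source of claims like '87% certainty.'"""
    hist = slots[(weekday, hour)]
    place, count = hist.most_common(1)[0]
    return place, count / sum(hist.values())

place, conf = predict(3, 17)  # Thursday (weekday 3), the 5 p.m. hour
print(f"Predicted: {place}, seen in {conf:.0%} of past observations")
```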
But James Rule, pre-Big Data
• Information processing practices now “share a
distinctive and sociologically crucial quality: they
not only collect and record details of personal
information; they are also organized to provide
bases for action toward the people concerned.
Systematically harvested personal information, in
other words, furnishes bases for institutions to
determine what treatment to mete out to each
individual . . . Mass surveillance is the distinctive
and consequential feature of our times.”
• James Rule, Privacy in Peril, © 2007 (completed in 2006).
New Privacy Problems?
• Changed privacy problems.
• A particularly complex and difficult tradeoff
problem takes center stage.
• Big Data presents a much wider range of both
risks and benefits—from detecting drug
interactions to reducing emergency room costs to
improving police response times.
Privacy Advocates and Courts
• Privacy advocates insist
• that we adopt severe restrictions on data collection, use, and retention, and
• that courts see the invasion of privacy as a compensable harm.
• Courts
• refuse to see a mere invasion of privacy as a compensable harm,
• do not curtail massive data collection, and
• rarely hold businesses liable for data breaches.
And the Rest of Us: What We Want
• More control over our information, but without giving up the advantages information processing secures:
• We are willing to trade.
• Humphrey Taylor, Most People Are “Privacy Pragmatists” Who, While Concerned about Privacy, Will Sometimes Trade It Off for Other Benefits, THE HARRIS POLL (2003).
• What is the current mechanism for making privacy tradeoffs?
• Government: constitutional and statutory constraints on government surveillance.
• Dana Priest and William M. Arkin, Top Secret America: The Rise of the New American Security State (2011).
• Private business: Notice and Choice.
Notice and Choice
• The “notice” is the presentation of information
• Typically in a privacy policy.
• The “choice” is some action by the consumer
• Typically using the site, or clicking on an “I agree”
button.
• Claims:
1. Notice and Choice ensures free and informed consent.
2. The pattern of free and informed consent defines an acceptable tradeoff between privacy and the benefits of information processing.
What We Have—Contractually Realized
Notice and Choice
[Diagram: the consumer's information flows to the business and onward to the advertising ecosystem, the payment system, aggregators, and government.]
The Dominant Paradigm
• It is well known that these claims are false.
• Even so, Notice and Choice dominates public
policy in both the US and the EU.
• An unsympathetic but not entirely inapt analogy:
The old joke about the drunk and the streetlight.
• Why do policy makers and privacy advocates
continue to look under the streetlight of Notice
and Choice when it is clear that consent is not
there?
The Failure of Notice and Choice
• Notice and Choice fails
• to ensure free and informed consent, and
• to define an acceptable tradeoff between privacy and the benefits of information processing.
• I focus on the problems with informed consent.
Informed Consent Impossible
• Two features of the advertising system
make it impossible for a Notice to contain
enough information:
• Complexity, and
• Long-term data retention.
Complexity
• The specificity assumption: informed consent requires knowing specific detail about what happens with one's information.
• The advertising system is too complex for a Notice to provide the required detail.
Long-Term Data Retention
• Current practice is to store information for a long
time, to be used in ways we cannot now predict.
• What we cannot predict now we cannot now write
down in a privacy policy, so
• it is not possible for the policy to be informative
enough.
The Wrong Tradeoff
• Why would individual decisions based on information available at the time somehow add up to an acceptable tradeoff?
• Even if Notices could, per impossibile, contain all relevant information, and even if all visitors read and understood Notices, they would not have the information they need.
• The information required to adequately balance the benefits and risks concerns complex society-wide consequences that unfold over a long period of time.
Data Restrictions
• Proponents of Notice and Choice insist on
restrictions on data collection and use:
• The Federal Trade Commission: companies should
• limit data collection to that which is consistent with the context of the transaction or the consumer's relationship with the business, and
• implement reasonable restrictions on the retention of data and dispose of it once the data has outlived the legitimate purpose for which it was collected.
How Do We Get Out of Here?
• Notice and Choice first.
• We—my co-author Robert Sloan and I—think what policy makers have missed is the role of informational norms,
• norms that govern the collection, use, and distribution of information.
• What follows is a bare-bones outline of the idea. For more, see
• Robert Sloan and Richard Warner, Unauthorized Access: The Crisis in Online Privacy and Security (July 2013), http://www.crcpress.com/product/isbn/9781439830130.
What Norms Can Do
• When informational norms govern online businesses' data collection and use practices, website visitors
• give free and informed consent
• to acceptable tradeoffs,
• as long as the norms are consistent with our values. Call such norms “value-optimal.”
Informed Consent
• A visitor's consent is informed if the visitor can make a reasonable evaluation of the risks and benefits of disclosing information.
• Suppose visitors know transactions are governed by value-optimal norms. Then
• they know that uses of the visitor's information—both uses now and uses in the unpredictable future—will implement tradeoffs between privacy and competing goals that are entirely consistent with their values.
Tradeoffs
• All informational norms—value-optimal and non-value-optimal alike—implement a tradeoff between privacy and competing concerns.
• They permit some information processing, and thus secure some of its benefits, but they protect privacy by allowing only certain processing.
• When the norm is value-optimal, the tradeoff it implements is justified by visitors' values. The tradeoff is acceptable in this sense.
The Lack of Norms Problem
• Rapid advances in technology have created many situations for which we lack relevant value-optimal informational norms. Two cases:
• (1) relevant norms exist, but they are not value-optimal;
• (2) relevant norms do not exist at all.
Now What About Privacy Harms?
• The norms approach works—if indirectly.
• We can reduce the risk of harm problem by
reducing unauthorized access.
• Can we reduce it to the point that we can
adequately address the remaining increased risk
of harm through existing means—insurance and
recovery from identity theft?
• Whether we can is a matter of norms:
• appropriate product-risk norms for software, and
• appropriate service-risk norms for malware.