You Are Being Tracked
By MW. Renald
Table of Contents

Chapter 1. Tracking You and Your Data: Geolocation
1.1 Introduction
1.2 User-Volunteered Geolocation Data Collection
1.3 Collateral Geolocation Data Collection
1.4 Surreptitious Geolocation Data Collection
Chapter 2. Biometric Tracking
2.1 Introduction
2.2 Biometric Identification and Verification
2.3 Forms of Biometrics
Chapter 3. Internet Tracking: You Are Being Followed
3.1 Introduction
3.2 How the Web and the Internet Work
3.3 Tracking with Cookies and JavaScript
3.4 Tracking with Browser Information
3.5 Tracking by Your ISP
3.6 What Can You Do?
Chapter 4. Data Collection Methods
4.1 Introduction
4.2 How Much Information Is Collected, and How?
4.3 Predictive Algorithms
4.4 Shadow Profiling
4.5 What Can I Do?
Chapter 5. Nowhere to Hide?
5.1 Health-Related Data Collection
5.2 Facial Recognition Technology
5.3 Tattoo Recognition
5.4 Advertising Kiosks
5.5 What Can You Do?
Chapter 6. The Dark Web
6.1 Introduction
6.2 The Tor Browser
6.3 The Anatomy of the Dark Web
6.4 Dark Web Activities
6.5 What Can You Do?
Chapter 7. The Future of Personal Data
7.1 Introduction
7.2 DNA as Personal Data
7.3 DNA Profiles
7.4 The Future of Privacy Regulations
7.5 The Legislative Future of Personal Data
Chapter 1. Tracking You and Your Data: Geolocation
1.1 Introduction
What if your local police department had the technology to individually track and follow you and any of your neighbours as you walked around town? What if advertisers could do the same? What if your spouse's divorce lawyer could? You are where you go, and for that reason there is a tracking phenomenon known as geolocation, which identifies where a person is physically and derives certain inferences from that information. The Supreme Court, in a case captioned United States v. Jones, addressed some of the law and privacy issues of geolocation in a matter involving the decision of FBI agents to put a global positioning system (GPS) tracking device on a vehicle belonging to a man named Jones.
A GPS system works by interacting with satellites that orbit the Earth. It was developed in the 1970s for military use and was opened for commercial development only in 1995. It's the basis for the electronic navigator in your car and on your smartphone. Without it, Uber and Google Maps just wouldn't work. Interestingly, the key to geolocation isn't geography. It's time. That's why GPS satellites carry atomic clocks, which keep time by counting the extraordinarily regular oscillations of atoms such as caesium. The clocks are synchronised to each other and to clocks on the ground. To the extent that small errors creep into the clocks, they are corrected every day; GPS depends on accurately knowing what time it is.
GPS satellites broadcast their time and their location continuously, and a GPS receiver listens for these signals. With signals from four different satellites, it can calculate exactly where it is, and exactly what time it is, based on small differences in how long each signal takes to reach the receiver. The general accuracy of a commercial GPS receiver falls within about 3 metres, or 10 feet. In practice, it can be even more precise, since the receiver is continuously recalculating its location and averaging its measurements. GPS is used for a wide variety of commercial and military applications. It's the main way we navigate; almost nobody uses paper maps anymore. It's at the heart of futuristic concepts like driverless cars and, of course, GPS is the core idea behind precision-guided munitions that hit military targets with great accuracy and specificity.

GPS also has surveillance applications, which should come as no surprise, since this is a book on tracking, and which brings us back to the criminal suspect Mr. Jones. His name is Antoine Jones, and he was the owner and operator of a Washington, DC nightclub. FBI agents attached a GPS tracker to a Jeep registered in the name of his wife. The vehicle was parked in a public lot at the time.
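The arithmetic behind the claim that GPS is really about time is simple enough to sketch. The snippet below is a toy calculation, not a real GNSS solver, and the numbers are purely illustrative; it shows how a receiver turns a signal's travel time into a distance, and why even a microsecond of clock error matters.

```python
# Toy sketch: a GPS receiver converts signal travel time into distance,
# so tiny clock errors become large position errors.

C = 299_792_458.0  # speed of light, metres per second

def pseudorange(t_transmit_s: float, t_receive_s: float) -> float:
    """Distance implied by a signal's travel time."""
    return C * (t_receive_s - t_transmit_s)

# A satellite roughly 20,200 km away: travel time is about 67 ms.
d = pseudorange(0.0, 0.0674)

# One microsecond of clock error shifts the measured distance by ~300 m,
# which is why the satellites carry synchronised atomic clocks.
error_m = pseudorange(0.0, 0.0674 + 1e-6) - d
```

With four such pseudoranges, the receiver can solve for its three position coordinates plus its own clock offset, which is why four satellites are needed rather than three.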
Authorities suspected Jones was a drug dealer, so they tracked him to a location that they came to believe was a drug stash house, a place outside the home where dealers often hide their narcotics. The house was raided, drugs were found, Jones was arrested, and he was convicted.
But the FBI had made a mistake: they hadn't gotten a warrant to put the tracker on the car. The government's argument was simple: cars travel on public roads, public roads are in public view, so Jones had no expectation of privacy in his travels and, therefore, no warrant was needed. The police lost and Jones won; the Supreme Court decided the case nine to nothing.
Five justices thought that Jones should prevail for a very narrow and limited reason: the federal agents had trespassed on his property in placing the GPS tracker on his car, and they required a warrant to legally do so. Four other justices would have decided the case on a broader ground. They said that the collection of a large volume of data, what we've called the big data problem, raises constitutional issues, because it allows for the creation of a so-called mosaic picture of Jones. In other words, by combining many snippets of information, authorities could piece together a much more comprehensive picture of Jones's life than was revealed by any individual piece. In this case, the picture suggested Jones was a drug dealer, but in other settings, it might have been used to determine whether he was a Democrat or a Republican. Thus, Jones illustrates two points.
First, the case demonstrates something about the revealing power of geography. Law enforcement authorities concluded that they knew what Jones was because of where he went. That's very useful analytics, and it's also very spooky.
Second, we can infer that the mosaic distinction requires some line drawing, but that nobody knows exactly where the line is. How many snippets of information are enough to create a mosaic? Nobody knows, and the majority of the court didn't answer the question at all. But we can recognise that the Supreme Court is thinking hard about the Fourth Amendment protections against unreasonable searches and seizures, and applying them in an era of big data where the search might consist of digital scraps of information.
After the Supreme Court vacated Mr. Jones's conviction, the government offered him a deal. He refused and went to trial. The jury did not reach a verdict, maybe because there was no GPS evidence. But rather than face yet another trial, Jones reached a plea agreement and was sentenced to 15 years in prison.
As we survey the field of geolocation, I want to identify three separate concepts that form a useful framework for our discussion. They relate, broadly speaking, to the manner in which geolocation information is collected. First, in some instances, we volunteer that information to the world around us. In others, geolocation data are collateral information, necessarily collected as part of some other process, like making a phone call. And third, we can talk about geolocation data collected through surreptitious means.
1.2 User-Volunteered Geolocation Data Collection
Aaron Schock is a former Republican congressman from Illinois who seemed to have it all: a safe district, good ideas, solid political prospects, and a great social media footprint. He was much better known than many of his peers on Capitol Hill and had been marked with the label of rising star, once dubbed the "ripped representative" and the fittest man in Congress by Men's Health magazine. Schock also learned about the dangers of geolocation in an era of selfies and Instagram. Schock spent taxpayer and campaign funds on flights aboard private planes owned by some of his key donors.
The Associated Press identified at least a dozen flights, valued at more than $40,000, on donors' planes. He also reportedly enjoyed other expensive travel and rang up significant personal entertainment charges, including massages and tickets to music concerts. How did the AP know all this? They tracked Schock's reliance on the aircraft partially through the congressman's penchant for uploading pictures and videos of himself to his Instagram account.
They extracted location data associated with each image and correlated it with flight records showing airport stopovers, and with expenses billed for air travel against Schock's office and campaign records. If you take pictures with your smartphone, you are creating location data about yourself. Your camera stores a bunch of data about every picture you take. It records the aperture, shutter speed, ISO speed, camera mode, focal distance, and sometimes even more than that.
All of this is stored in the EXIF data, an extra piece of information attached to every picture file your camera creates; EXIF stands for exchangeable image file format. EXIF data has been around since the early days of digital photography. Today, one of the things your phone puts in the EXIF data is your geolocation.
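The mechanics are easy to demonstrate. EXIF stores latitude and longitude as degrees/minutes/seconds rational numbers plus a hemisphere reference, and any EXIF reader simply converts them to decimal coordinates that can be plotted on a map. The coordinate values in the sketch below are invented for illustration.

```python
# Sketch: converting EXIF-style GPS values (degrees, minutes, seconds
# as rationals, plus an 'N'/'S'/'E'/'W' reference) to decimal degrees.
# The sample values are made up for illustration.

from fractions import Fraction

def dms_to_decimal(dms, ref: str) -> float:
    """Convert (deg, min, sec) plus hemisphere letter to decimal degrees."""
    degrees, minutes, seconds = (float(x) for x in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South and West hemispheres are negative in decimal notation.
    return -value if ref in ("S", "W") else value

# EXIF stores each component as a rational, e.g. 30.5 seconds as 61/2.
lat = dms_to_decimal((Fraction(38), Fraction(53), Fraction(61, 2)), "N")
lon = dms_to_decimal((Fraction(77), Fraction(2), Fraction(0)), "W")
```

Given those decimal coordinates for each photo, correlating them with flight records is straightforward bookkeeping.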
Almost all smartphone cameras geotag the photos they take, and once you put the picture up on Instagram, or most any other photo collection programme, it's a simple process to download the photo, select its properties, and find the EXIF data for the picture. That's how The Associated Press knew where Congressman Schock was when he took all those pictures. As shocked as the former congressman might have been after the AP revealed that it had been tracking his whereabouts, some of us advertise the same information. We run around deliberately tagging our location and checking in at various places. If you use an app like Foursquare Swarm, you are purposefully broadcasting where you are, and it's pretty easy to accumulate that data and use it to draw a picture of an individual's activities.
For example, Raytheon has developed something it calls Rapid Information Overlay Technology (RIOT). RIOT uses only publicly available data from social media programmes like Instagram, Facebook, Foursquare, and TikTok. With that information, you can draw a detailed picture of a person based on where he goes. Raytheon understands both the power of this sort of analytic tool and the peril. That's why it describes the RIOT tool as privacy protective.
Jared Adams, a spokesman for Raytheon's intelligence and information systems department, says this: "RIOT is a big data analytics system design we are working on with industry, national labs, and commercial partners to help turn massive amounts of data into usable information to help meet our nation's rapidly changing security needs. Its innovative privacy features are the most robust that we're aware of, enabling the sharing and analysis of data without personally identifiable information (such as social security numbers, bank or other financial account information) being disclosed."
Here we see one of the common tools that systems integrators often use as a means of ameliorating privacy and civil liberties concerns: partial masking, or pseudonymity. By scrubbing the data of personally identifiable information, but still making it capable of being correlated and analysed, you can create a two-step process that is thought to be more robust in protecting privacy.
At the first step, data that has been scrubbed of identity markers is linked together in patterns. Only when those patterns meet some threshold of concern, and typically when some third party or supervisor verifies that the threshold has been exceeded, is the anonymity of the data removed and identifying information added back in.
In this way, large volumes of innocent collateral data can be collected and sifted in an automated fashion without, it is said, threats to privacy. Of course, to rely on that system, you have to trust the process.
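The two-step process can be sketched in a few lines. This is a toy illustration of the general pseudonymity idea, not Raytheon's actual design; the salt, threshold, names, and events are all invented.

```python
# Toy sketch of two-step pseudonymity: analysts work on opaque tokens,
# and identity is restored only for patterns that cross a threshold.

import hashlib

SALT = b"secret-salt"   # held by the data custodian, not the analyst
THRESHOLD = 3           # hypothetical "pattern of concern" level

def pseudonym(name: str) -> str:
    """Replace an identity with an opaque but consistent token."""
    return hashlib.sha256(SALT + name.encode()).hexdigest()[:12]

# Step 1: analysts see only tokens and event counts, never names.
events = ["alice", "bob", "alice", "alice", "carol"]
counts: dict[str, int] = {}
for name in events:
    counts[pseudonym(name)] = counts.get(pseudonym(name), 0) + 1

flagged = [tok for tok, n in counts.items() if n >= THRESHOLD]

# Step 2: only flagged tokens go back for re-identification, which in
# the described process requires third-party or supervisor sign-off.
lookup = {pseudonym(n): n for n in set(events)}
revealed = [lookup[tok] for tok in flagged]
```

Note that the custodian holding the salt can always reverse the masking; that is exactly why, as the text says, you have to trust the process.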
Former Congressman Schock's Instagram account revealed what we might think of as deliberate but inadvertent geolocation sharing. He had probably never heard of the EXIF file. Now that he has, he can turn off the identifying information and still use his camera and his Instagram account just like he used to. The geotag is not an essential function. But what about when it is essential?
1.3 Collateral Geolocation Data Collection
Geolocation is essential to navigation functions such as Google Maps, for example. That's the type of functionality you can't really turn off and still navigate, and so the only way to avoid exposing your location data is not to use the function at all.
For years, some people refused to get an E-ZPass to travel on toll roads. Their problem with this electronic system of capturing your payment and path as you pass by was that, as you rack up tolls, the E-ZPass creates a record of where you are and where you've been. That's a geolocation record that law enforcement and others can collect and analyse just like any other geo-record. We have seen E-ZPass-type toll records used in everything from criminal investigations to divorce proceedings. Those who refused accepted the inconvenience of longer wait times in exchange for a little personal obscurity. But some other geolocation functions are, for all intents and purposes, an essential component of modern-day life. When that happens, the sort of surveillance that in other contexts might seem only a bit creepy can begin to become pretty scary, and even downright authoritarian.
Think, for example, about your cell phone. Not the super-sophisticated location apps that you could do without; rather, think of the phone itself and the voice and text communications that are probably at the core of your personal mobility and your personal connectivity. These features also allow others to know exactly where you are, all the time. Your cell phone is constantly reporting your location to the nearest cell towers. That's how the telephone system knows where you are so it can connect a call to you; otherwise, cell phone service just wouldn't work. The phone company keeps records of where your cell phone is, or was. That means it knows where you are right now and also where you've been.
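The carrier-side bookkeeping can be sketched as follows. The tower identifiers and coordinates here are invented, and real cell selection depends on signal strength rather than pure distance; the point is only that a log of serving towers is itself a coarse travel record.

```python
# Toy sketch: a phone registers with its nearest tower, so the
# carrier's log of serving towers doubles as a location history.
# Tower data are invented for illustration.

import math

TOWERS = {                      # hypothetical tower id -> (lat, lon)
    "tower_a": (38.90, -77.04),
    "tower_b": (38.88, -77.10),
    "tower_c": (38.95, -77.00),
}

def serving_tower(lat: float, lon: float) -> str:
    """Return the id of the closest tower (flat-earth approximation)."""
    return min(TOWERS, key=lambda t: math.dist(TOWERS[t], (lat, lon)))

# Two position reports over time already sketch a route.
log = [serving_tower(38.901, -77.041), serving_tower(38.949, -77.002)]
```

The resolution is coarse, a tower's coverage area rather than a street address, but as the Jones case showed, many coarse points over time still draw a revealing mosaic.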
The German politician Malte Spitz used his cell phone records and Google Maps to create a video log of all his movements over a six-month period. He did this to make a point, because most of where we go is innocuous.
But if I have six months of your travel logs, I also know a great deal about you. Maybe you are not worried about what your phone company knows. But what if it sells that data to some commercial advertiser? Or what if the government issues a subpoena and collects all those records about you? The issue is highly contentious, but the law does not protect information you share with a third party. When you voluntarily broadcast your location to the cell phone company or Facebook, there is no constitutional rule that prevents them, in turn, from giving the information to the US government.
That is a pretty loose definition of voluntary; that kind of implied consent has a very forced feel to it. We can't turn the geolocation part of the cell phone off, at least not if we want our cell phones to work, and we can't really quit society. Our consent is, in effect, coerced. That's why a few courts in the US are starting to take a different view and extending the law governing warrants to cover cell tower records. They're saying that, in the absence of a warrant based on probable cause, the government can't secure these historical records.
And that extension, which limits police methods, naturally brings with it problems of a different sort. Sometimes geolocation cell tower data can be powerful evidence of criminality. If you are a fan of Forensic Files, these stories will sound familiar. In one case, for example, cell tower data located the defendant's mobile phone in close proximity to six different armed robberies. Federal investigators also used cell tower location records to establish that a State Department whistleblower was in the same place as a TV reporter who later published leaked classified information.
We need, as a society, to choose how much or how little access we want to give the government to geolocation data. The basis for this choice comes down to a rough form of cost-benefit analysis. If we think the value of the positive uses of a technology is great enough, then we will deploy it while trying to manage its use through warrant requirements, data retention rules, and the like. But if we are concerned that a technology provides too much surveillance power to the government, we often consider banning the technology altogether. That step isn't always possible, as we are too dependent on the technology. And that, I think, explains why some groups always fight so hard to prevent a new technology from coming online in the first instance: not because initial uses are so abusive, but because a step down that technological path can't easily be untaken. Think of Social Security: once, your identifying number had a sole purpose and no other; now it's a ubiquitous personal identifier.
Knowing what we know now, what might be the right way to structure government access to cell tower geolocation data? Would you require it to be deleted upon collection, or stored only for a few hours, so that it could be used in emergencies like an ongoing kidnapping? Held for longer periods, but available to the government only with a warrant? Never available to the government, but usable by the commercial sector? You don't need to answer that right now. But you do need to be thinking about it.
1.4 Surreptitious Geolocation Data Collection
Let's turn to the collection of geolocation data by the government without your knowledge or consent. At least in the cell phone example, we could point to your implicit agreement through your use of the company's geolocation technology. But what happens when the government starts tracking you without your knowledge, as in the case of Mr. Jones?
One important example is a tracking system known as Stingray. When a Stingray tracking device is turned on, it pretends to be a cell phone tower. It simulates the call-out from the tower to nearby phones, even when they're not in use. Those phones, in turn, respond to the Stingray by reporting their phone number and a unique electronic serial ID number.
According to the Electronic Privacy Information Center (EPIC), a non-profit civil liberties organisation in Washington, government investigators and private individuals alike can use Stingrays and other cell site simulators to locate, interfere with, and even intercept communications from cell phones and other wireless devices. EPIC tells us that the FBI has used these simulators to track and locate phones and users since at least 1995.
How powerful is this technology? According to The Wall Street Journal, the US Marshals Service flies planes carrying devices that mimic cell phone towers in order to scan the identifying information of Americans' phones as it searches for criminal suspects and fugitives. Under this programme, the government collects data from thousands of mobile phones. Along the way, it also collects, and then, it says, discards data on a large number of innocent Americans.
The Justice Department justifies the phone records collection programme by arguing that it is minimally invasive and an essential way to attempt to track terrorists and criminals. Its main virtue, from the government's perspective, is that the programme eliminates the need to go to the phone companies as intermediaries when searching for suspects. Rather than asking a company for cell tower information to help locate a suspect, a process that law enforcement has criticised as slow and inaccurate, the government can get that information itself.
Naturally, others see that as problematic. Christopher Soghoian, chief technologist of the American Civil Liberties Union, characterises it as dragnet surveillance. He says Stingray in the air is "... inexcusable and it's likely, to the extent that judges are authorising it, that they have no idea of the scale of it."
Now, one recurring theme in contemporary issues of law and technology is the balance to be sought between secrecy and transparency. It's important for citizens to know what their government is doing. On the other hand, it's clear that the disclosure of certain surveillance techniques can significantly diminish their utility. Most of what you now know about Stingray is a matter of public record, derived from enterprising journalists and from litigants who have run into the programme as part of ongoing criminal proceedings, for example.
But the US government is anxious, some might say desperate, to keep the technology out of the hands of non-government actors. To that end, whenever prosecutors are asked to produce the equipment in court, they dismiss the case rather than disclose how the Stingray actually works.
The Washington Post brought us the story of Tadrae McKenzie, a small-time criminal living in Florida. He and two of his friends robbed another small-time crook, a marijuana dealer, armed only with BB guns.
At first, the police had no leads on this crime. But the robbers had stolen the victim's cell phone as well, so the Tampa police got a court order directing the local telephone company to give them the cell tower location data, which would allow them to track the device. The cell tower records helped, but they were not specific enough. They gave the police a general location for the stolen cell phone, which helped them narrow their search to a general neighbourhood, but no more. Then the police used a Stingray device to focus their investigation more narrowly. Eventually, using the Stingray, the police were able to narrow down the location of the stolen cell phone to a single specific house, which they put under surveillance, and when McKenzie left one morning, he was arrested.
Fast forward to the preliminary hearing in McKenzie's case. The lead police officer, upon being asked how he had been able to identify the specific house where the stolen phone was, declined to answer. He said that he had used a device that was subject to a nondisclosure agreement with the FBI and that, therefore, he wasn't allowed to tell defence counsel how the machine operated. The Florida judge handling the matter was not amused. He ordered the government to produce details of the Stingray and how it operated. Rather than do that, the state prosecutor offered McKenzie a deal. McKenzie agreed to plead guilty to a second-degree misdemeanour, and he received six months' probation.
Because of the secrecy surrounding how the Stingray works, McKenzie got off easy. It's almost as if he won the lottery.
And that should give all of us pause. If you are a law and order advocate, you should be concerned that a guilty man went essentially free. If you're a champion of civil liberties, you should be concerned that the Stingray and its technical details remain secret. We are, I'm afraid, in a very unsteady and unstable place where neither answer satisfies. More to the point, such an unreconciled conflict is simply untenable in the long run. Either the government's use of new technologies in the public sphere will have to be fully disclosed and made subject to adequate oversight, or the police are going to have to give up such surveillance and tracking tools if they can't withstand the scrutiny that comes with their use in a free society.
One small step in the direction of reconciling the conflict: the Department of Justice has now said that, as a matter of policy but not legal obligation, federal law enforcement officers will seek warrants before using a Stingray. That's probably a sound result, but note that the Department of Justice policy doesn't bind state and local law enforcement, who also use Stingrays.
Here's a final thought from Christopher Soghoian, chief technologist of the American Civil Liberties Union. He points out that there is yet one more reason why the secrecy surrounding Stingrays is problematic: if the FBI can use Stingrays, then so can our enemies.
Soghoian says: "Our government is sitting on a security flaw that impacts every phone in the country. If we don't talk about Stingray-style tools and the flaws that they exploit, we can't defend ourselves against foreign governments and criminals using this equipment too."
Chapter 2. Biometric Tracking
2.1 Introduction
At one point in the movie Minority Report, the lead character, Chief John Anderton, is on the run. He needs to change his identity. But he can't, at least not easily, because the central government has everybody registered by a unique identifying characteristic: the pattern of their eyes. This technique, known as iris recognition, is actually in its growth stage today. The future imagined in Minority Report marries the capability of uniquely identifying people through their eye patterns to small universal scanners that scuttle around like spiders, taking a picture of every individual's eye in order to identify them.
Because your eye pattern is unique and immutable, the government sees this as a way of conclusively identifying malefactors. Citizens see it as a way of exercising control. This is not the stuff of science fiction. During the wars in Iraq and Afghanistan, US forces used iris scanning technology as a sort of digital filing system on civilians and others they encountered. Anderton's fiction is today's reality. The only way for John Anderton (played by the actor Tom Cruise) to avoid this surveillance is to change his eyes. In a rather gruesome scene, he goes through an operation in which eyeballs harvested from a cadaver are transplanted into his eye sockets. That seems pretty extreme, and it's beyond the realm of the possible today. But it gives you a sense of both the power and the peril of biometric identification.
In this chapter, I want to ask: why do we care? What is it about biometrics that makes them useful? Then I want to tell you about the technology itself, what it is and how it works. Finally, we'll close with some thoughts about the dark side of the technology: how it might threaten civil liberties.
2.2 Biometric Identification and Verification
Why biometrics are interesting comes down to the problem of establishing one's identity. Suppose I say to you, "My name is Eric Trump." How do you know that I am not someone else pretending to be him? How do we verify my identity?
The problem came into stark relief after the September 11 attacks. The government's comprehensive review identified a number of gaps in America's security infrastructure, including an inability to know who was who.
The Florida driver's licence picture of Mohamed Atta has become an iconic symbol of the insecurity of the identification apparatus. Misidentification is a critical and endemic problem, and that's why biometrics are increasingly important. In a post-9/11 world, the US wants to link the biographic information it has about the risks associated with an individual, be it the risk of financial fraud, abuse of eligibility for benefits, or potential terrorism, to a verifiable biometric characteristic: a physical characteristic that is impossible to change, unless you are Tom Cruise in Minority Report.
In every walk of life, as a basic building block of risk assessment, we think it imperative to have confidence that people are who they say they are. Consider some of the uses to which a verified biometric identity can be put: getting through the airport more easily for trusted travellers; establishing access control checkpoints to let people into buildings and computer systems, or to keep them out; verifying credit and other consumer behaviour, thereby pinpointing or streamlining retail transactions, reducing fraud, and resulting in lower fees; eliminating voter fraud and ending the voter ID debate; and verifying age and legal authorisation to drive or vote or drink alcohol.
Biometrics are actually among the oldest of new technologies. They began with fingerprints early in the 20th century and today include more novel ideas like gait recognition, the ability to identify individuals by their physical movement: how they walk. Biometrics can be used in two distinct ways: for verification or for identification.
When a biometric system is used to verify whether a person is who he or she claims to be, that verification is frequently referred to as one-to-one matching. Almost all systems can determine whether there is a match between a person's presented biometric and a biometric template in a database in less than a second.
Identification, by contrast, is known as one-to-many matching. In a one-to-many matching framework, a person's biometric signal, whether it's an iris or a fingerprint, is compared with all the biometric templates within a database. There are also two different types of identification systems in this framework: positive and negative.
Positive systems expect there to be a match between the biometric presented and a template; these systems are designed to make sure that a person is in the database.
Negative systems are set up the opposite way, to make sure that a person is not in the system. Negative identification can also take the form of a watch list, where a match triggers a notice to the appropriate authority for exclusionary action.
Neither system generates perfect matches or exclusionary filters. Instead, each comparison generates a score of how close the presented biometric is to the stored template. The system then compares that score with a predefined threshold, or with an algorithm's decision rule, to determine whether the presented biometric and the template are sufficiently close to be considered a match.
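The score-against-threshold logic can be sketched in a few lines. This is a toy illustration only: the templates, similarity measure, names, and threshold below are all invented, and real systems use far richer encodings than three numbers.

```python
# Toy sketch of one-to-one verification vs one-to-many identification.
# A "template" here is just a short list of numbers, and similarity is
# an invented closeness score in [0, 1].

def similarity(a: list, b: list) -> float:
    """Closeness score; 1.0 means identical templates."""
    dist = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return max(0.0, 1.0 - dist)

THRESHOLD = 0.9  # hypothetical operating point

database = {
    "alice": [0.2, 0.8, 0.5],
    "bob":   [0.9, 0.1, 0.4],
}

def verify(claimed: str, sample: list) -> bool:
    """One-to-one: does the sample match the claimed identity's template?"""
    return similarity(database[claimed], sample) >= THRESHOLD

def identify(sample: list):
    """One-to-many: best match over the whole database, if above threshold."""
    best = max(database, key=lambda k: similarity(database[k], sample))
    return best if similarity(database[best], sample) >= THRESHOLD else None

sample = [0.21, 0.79, 0.52]   # a slightly noisy reading of alice's template
```

Moving the threshold trades false accepts against false rejects, which is the core tuning decision in any deployed biometric system.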
Most biometric systems therefore require an enrollment process in which a
sample biometric is captured, extracted and encoded as a biometric template.
This template is typically then stored in a database against which future
comparisons will be made. When the biometric is used for verification, for
example, access control, the biometric system confirms the validity of the
claimed identity. When used for identification, a biometric technology compares
a specific person's biometric with all the stored biometric records to see if there's
a match. For biometric technology to be effective, the database has to be accurate
and reasonably comprehensive. The process of enrollment, creation of a database
and comparison between the template and the sample is common to all
biometrics.
2.3 Forms of Biometrics
There are many different forms of biometrics. We are going to talk about four of
the most common fingerprints, iris recognition, facial recognition and voice
recognition. Then we will mention two other forms of biometrics, hand
geometry and gait recognition and we will end our description of biometrics with
DNA analysis.
Fingerprint recognition is probably the most widely used and well-known biometric. It relies on features found in the impressions made by the distinct ridges on the fingertips. There are two types of fingerprint images: flat and rolled.
It used to be that fingerprint comparisons were made by hand, with experienced examiners making judgments about matches. Today, fingerprint images are captured using optical, silicon, or ultrasound scanners, enhanced, and then converted into templates, which are saved in the database for future comparisons. In Pakistan, the government requires everyone with a cell phone and SIM card to register with their fingerprints, calling it an anti-terror initiative, since untraceable unregistered SIM cards were proliferating as a means of terrorist communication. Another area where fingerprint biometrics have been used is identity and access management in healthcare, for example in VA or teaching hospitals.
Biometric technology is used to solve the challenge of how hospitals can give access to users and yet maintain security levels that provide confidence and comfort to their patients. This is a critical challenge, since greater security usually decreases access. Using fingerprints seems to work as a way of protecting patient privacy without too much inconvenience for the doctors.
Iris recognition technology relies on the distinctly coloured ring that surrounds the pupil of the eye. Irises have approximately 266 distinctive characteristics, including things like the trabecular meshwork, striations, rings, furrows, a corona and freckles. Retinal scanning, by contrast, looks at the blood vessel patterns in the retina; it's the same idea implemented in a slightly different form. For iris recognition, typically more than 170 of the distinctive characteristics are used in creating a template.
Irises form during the eighth month of pregnancy and are thought to remain stable throughout an individual's life, barring injury. Iris recognition systems usually start with a small camera that takes a picture of the iris. The picture is then analysed to identify the boundaries of the iris and create a coordinate grid over the image. Then the 170 or so characteristics found in each zone are identified and stored in a database as the individual's biometric template. Iris recognition technology is relatively easy to use and can process a large number of people quickly.
It's also only minimally intrusive in a physical sense. However, coloured or bifocal contact lenses might hinder the effectiveness of an iris recognition system, as can strong eyeglasses; glare or reflection can also be problematic for the cameras. In addition, people with poor eyesight occasionally have difficulty aligning their eyes correctly with the camera, and people who have glaucoma or cataracts might not be suitable for screening using iris recognition technology. But it is useful.
The United Arab Emirates has found iris recognition to be an effective border security means of preventing expelled foreigners from re-entering the country. The UAE faced a situation in which an expelled foreigner would return to his or her home country and then legally change his or her name, date of birth and address, all descriptors traditionally used to screen individuals entering the country. Since the new identity would not be in any of the traditionally maintained, name-dependent lists, government agents would then admit the banned individual to the UAE when he returned. To counter this problem, the small Arab
country began developing a biometric system that could be used to scan all
individuals arriving in the country and determine whether the person was
banned from entering. The UAE specifications for the system included using a biometric that didn't change over time, could be quickly acquired, was easy to use, could be used in real time, was safe and non-invasive, and could be scaled into the millions. The Emirates determined that iris recognition technology was the only technology that produced a single-person match in a sufficiently short period of time to meet its needs. According to the country's self-reporting, the system is remarkably effective. After the first 10 years, the use of iris scans has, they say, prevented the re-entry of 347,019 deportees. A statistical analysis of the
programme suggests that the likelihood of a false positive match, that is, the system misidentifying someone as registered when they are not, is less than one in 80 billion. Face recognition technology identifies individuals by analysing certain features of their face; it may look at the nose width, the eye sockets or the mouth. Typically, facial recognition compares a live person with a stored
template. But it's also been used for comparison between photographs and
templates.
This technology works for verification, and also for identification. MasterCard is
now in the process of trialling a new facial recognition app for your smartphone
that will let you use your face as a way of verifying your identity and approving a
credit card transaction. Amusingly, in order to prove that it's a real face taken as a selfie and not a stored picture, you actually have to blink while the picture is being processed, to prove you are alive.
In addition, facial recognition is the biometric system that can best be routinely
used covertly, since a person's face can often be captured by video technology. In
other words, you may never know if a photo is being taken of you and compared to some database. DeepFace, the facial recognition technology developed by Facebook, is said to be 97% accurate, making it competitive with human distinguishing capabilities.
Voice recognition technology identifies people based on vocal differences that
are caused either by differences in their physical characteristics like the shape of
their mouth or from speaking habits, like an accent. Such systems capture
samples of a person's speech as scripted information is recorded multiple times
into a host record keeping system. That speech is known as the passphrase. This
passphrase is then converted to a digital form and distinctive characteristics like
the pitch, cadence and tone are extracted to create a template for the speaker.
Voice recognition technology can be used for both identification and verification.
The use of the technology requires minimal training for those involved. It's also
fairly inexpensive and very non-intrusive. The biggest disadvantage with the
technology is that it can be unreliable. For instance, it doesn't work well in noisy
environments like airports or border entry points.
Another form of physical recognition is a measurement based on the human hand: the width, height and length of the fingers, the distances between the joints and the shape of the knuckles. It's called hand geometry. Using optical cameras and light-emitting diodes with mirrors and reflectors, two orthogonal two-dimensional images of the back and the sides of the hand are taken. Based on these images, 96 measurements are calculated and a template is created. Most hand readers have pins to help position the hand properly. These pins help with consistent hand placement and template repeatability, so there is a low false positive rate and a low failure-to-match rate as well. Hand geometry is
actually a mature technology, primarily used for high volume, time and
attendance and access controls.
Hand geometry works particularly well when many people need to be processed in a short period of time, so long as it's one-to-one matching. Although people's hands differ, they're not really individually distinct. As a result, hand geometry technology cannot be used for the one-to-many matching procedure we discussed a while ago.
Hand geometry is perceived as very accurate and has been used in a variety of
industries to regulate access controls for more than 30 years. It is useful in
identifying who's permitted somewhere or to do something and who is not.
It's really very difficult to spoof someone's hand shape without the person's cooperation. The main advance in the technology over the years has been cost reduction. Today, a wide variety of places rely on hand geometry for access. The
San Francisco Airport uses it for access to the tarmac; the Port of Rotterdam, Scott Air Force Base and a sorority at the University of Oklahoma all rely on it.
By contrast, gait recognition, which I mentioned earlier, is an emerging biometric
technology. It's one that involves people being identified purely through the
analysis of the way they walk. According to the Homeland Security news,
scientists in Japan have developed a system measuring how the foot hits and
leaves the ground during walking. They then use 3D image processing and a
technique called image extraction to analyse the heel strike, the roll to the forefoot and the push-off by the toes. Some say that accuracy in recognition is up to 90%. With the
caveat, of course, that if you know you're being watched, you can change your
gait. The idea, however, has attracted interest because it's non-invasive and doesn't require the subject's cooperation. Gait recognition can be used from a distance, making it well suited to identifying perpetrators at a crime scene. Or imagine if the US Army had been able to see inside bin Laden's hideout; perhaps they could have identified him pacing on the rooftop just by his gait. Researchers also envision medical applications for the technology. For
example, recognising changes in walking patterns early on, can help identify
conditions such as Parkinson's disease and multiple sclerosis in their earliest
stages.
DNA analysis is perhaps the most accurate biometric method of one to one
identity verification. You will likely recall what happened to Bill Clinton after
Monica Lewinsky turned over a navy blue dress that she said she had worn
during a romantic encounter with the President. Investigators compared the DNA
in a stain on that dress to a blood sample from the president. By conducting the
two standard DNA comparisons, the FBI laboratory concluded that Bill Clinton
was the source of the DNA obtained from Monica Lewinsky's dress.
According to the more sensitive RFLP test (restriction fragment length polymorphism, a technique molecular biologists use to follow a particular sequence of DNA as it's passed on to other cells), the genetic markers contained in Mr. Clinton's DNA are characteristic of one out of 7.87 trillion Caucasians. On the
the flip side, DNA evidence has increasingly come to be used to exonerate the
wrongfully accused and convicted. Hundreds of such cases have been overturned, at least 20 of which involved people who had served time on death row.
Biometrics are great. What could possibly go wrong? The answer rests on
whether or not we're comfortable with the government having an immutable
record of who we are and what we do. One development in recent years that
troubles some civil liberties advocates was the case Maryland versus King, which
was decided by the Supreme Court in 2013. The case asked the question of
whether and when the government could forcibly collect your DNA from you. In
general, authorities can collect DNA from people convicted of crimes. But what if
you are merely arrested and not yet convicted? The Supreme Court by a narrow
five to four majority said that the administrative collection of DNA from all
arrestees was permissible even in the absence of a warrant or probable cause.
What happened to the rule of innocent until proven guilty?
Of course, your DNA is everywhere you are and remains through shedding after
you go. With the result in the King case, the government is now free to assemble a
template DNA national database of anyone who's ever been arrested for a crime.
The best estimate I've seen suggests that the database may, in the end, contain the DNA of one in four Americans, with a significantly higher rate for African Americans. Not all the samples were collected for criminal reasons, of
course, but many were. And all of this suggests that the use of biometric
technologies poses a host of interrelated policy questions. Some of the questions
one might ask are:
Can the biometric system be narrowly tailored to its task? Who oversees the programme? What alternatives are there to biometric technologies? What information will be stored, and in what form? To what facility or location will the biometric give access? Will the original biometric material be retained? Will biometric data be kept separately from other identifying personal information? Who will have access to the information? How will access to the information be controlled? How will the system ensure accuracy? Will data be aggregated across databases? If data is stored in a database, how will it be protected? Who makes sure that the programme administrators are responsive to privacy concerns? Can people remove themselves from a database voluntarily? In effect, can they unenroll? How will consistency between data collected at multiple sites be maintained? If there's a choice, will people be informed of optional versus mandatory enrollment alternatives?
Some of the fears surrounding biometric information are that it will be gathered without permission, knowledge or clearly defined reasons; used for a multitude of purposes other than the one for which it was initially gathered (so-called function creep); disseminated to others without explicit permission; or used to help create a complete picture of people for surveillance or social control purposes. There are also concerns about tracking, which is real-time or near-real-time surveillance of an individual, and profiling, where a person's past activities are reconstructed. Both of these would effectively destroy a person's anonymity.
Here are some ideas about biometrics to consider:
Enrollment in biometric systems should generally be overt instead of covert: before one is enrolled in a biometric programme, one should be made aware of that enrollment. Thus we should be more sceptical of government-run biometric programmes, such as public facial recognition, that permit the surreptitious capture of biometric data.
Biometric systems are better used for verification than identification. In general, they are better suited for a one-to-one match, assuring that the individual in question is who he says he is and has the requisite authorization to engage in the activity in question. Biometrics are both less practically useful and more problematic as a matter of policy when they're used in a one-to-many fashion to pierce an individual's anonymity without the justification inherent in, for example, seeking access to a particular location. We should prefer biometric systems that are opt-in and require a person's consent rather than those that are mandatory. By this, we do not mean that requiring one to opt in cannot be made a condition of participation. For example, if you want to enter the United States, you must provide a biometric, since participation is ultimately voluntary in some way.
We also recognise that certain biometric applications like DNA for convicted
criminals may need to be mandatory. However, this should be an exception to the
general rule of voluntariness. Any biometric system we build should have a
strong audit and oversight programme to prevent misuse. Someone must, as
we've said before, watch the watchers and finally, we need to be concerned about
the security of a biometric database.
After all, if your password or credit card number gets hacked, you can change it.
It's inconvenient and costly, to be sure, but it can be done. If your biometric data gets hacked, as happened to many government employees in the breach of the Office of Personnel Management security database, there is much more trouble afoot. You can't, after all, change your fingerprint. Centralised storage of
biometric data also raises privacy concerns by tending to enable easier mission
creep. Clearly, for some technologies and applications, local storage won't be
feasible, but to the extent practicable, local storage should be preferred.
But all this pales next to the larger question of who gets to decide? Should
citizens have a right to control their extremely sensitive biometric data? Should
for example, the collection of facial biometrics on a public way be impermissible?
In one sense, the answer seems like it should be obvious. If I can take a picture of you on the street without your permission, which I can, why can't the government? On the other hand, it's the government.
Today however, the decision to move forward with biometrics is not really the
subject of wide public debate. In 2014, the FBI started to use its Next Generation Identification biometric database, with 14 million face images. Current plans were to increase that number to 52 million images by 2015, with more images to be collected in the future. Some communities are even issuing mobile biometric
readers to their governmental staff. The staff, usually police officers but sometimes other regulatory agents, can take pictures of people on the street or in their homes and immediately identify them and enrol them in face recognition
databases.
Biometric technologies are likely to be of great value in creating secure
identification, but to be useful and acceptable, they need to be privacy and civil
liberties neutral. They can and should be designed with appropriate protocols to
ensure privacy before they are implemented, on that perhaps we all can agree.
Chapter 3. Internet tracking, you are being followed
3.1 Introduction
In our previous chapter, we've discovered some of the many ways your personal
data is harvested through different websites and across digital platforms. We've
seen how a lot of this happens in expected ways, but sometimes with unexpected consequences. By now you're getting an idea of the scope of data collection and
the wide variety of ways it can be used, both good and bad. And you're
developing a sense of where you fall on the privacy spectrum. Now it's time to go
deeper into how this works from a technology standpoint, because the more
fluent you are in how this happens, the more nuanced and effective you can be in your decision making.
In this chapter, we'll take a look under the hood of web tracking: how websites keep track of you and your actions and the data you produce. To do this, we need
to begin with a primer on how the web and the internet work, because this will
help us better understand the tracking technologies built on top of that.
3.2 How the Web and the Internet Work
The web, the internet, these terms are often used interchangeably, but they're
actually different things.
The internet is the global network of computers. The web is a network of
websites that operate on the internet. To use an old metaphor, you can think of the internet like a highway system: the roads, interchanges and intersections are all part of the internet. The web would be something like a national and
local public bus network. It uses the highway system, but it's not the same as the
highway system. It's a way to transport things over that network.
Information is transmitted over the internet and the web using protocols, which are basically rules about how to format, send and receive information. The internet uses the Internet Protocol (IP), and the web uses the Hypertext Transfer Protocol (HTTP). You're probably familiar with HTTP because that's what goes at the beginning of a web address: addresses start with http://, which tells your browser that you're using the HTTP protocol. Even if you don't type that part, your browser fills it in, because it's a necessary part of requesting webpages. You may also be familiar with the IP acronym from the term IP address.
An IP address is a unique address assigned to every computer and device that's connected to the internet. Anytime you send a request for something online, like a web page, your IP address is included so the recipient knows where to send the information back to. When you make a request for information online, there's a series of steps that get followed. Say you're requesting a video so you can watch it on your device.
Your computer puts together a request for this information, and it goes from your computer to your internet service provider, your ISP, like the cable company that brings internet to your house. From there, it may go to a couple of other servers owned by your ISP; servers are just computers connected to the internet. Then it gets passed out to the internet backbone. This is like the interstate system: a series of very fast core connections and interchanges that get traffic close to its destination. From there it connects to the server you're trying to reach. That server processes the request and sends a response back to your IP address along a reverse route. The exact path your request takes may vary each time.
The internet was intentionally set up to have many possible paths between
computers. This is because it was originally a project of the US Department of
Defence, and they wanted to ensure the network would continue to work, even if
some sites were taken out by bombs, attacks or other failures.
Servers keep a log of the requests they get and the responses they send. This
allows them to analyse their own performance. Your IP address is included in the log. This is the first way you can be tracked online. All requests are routed with IP
addresses, but you probably don't interact with IP addresses on a daily basis.
When you try to go somewhere online, you usually use a domain name like
google.com. When you enter that, the first step is that it has to get turned into an IP address. There's a large, distributed public database that maps domain names to IP addresses, called the DNS (domain name system). The first step your computer takes is to look the domain you entered up in the DNS and get the IP address to route the request to. These lookups are another way you can be tracked, which we'll discuss more in a bit.
The web operates on top of the Internet Protocol. It uses the same processes that
run everything on the internet, and then adds a layer to make webpages, images
and other data appear in your web browser. Think of it like this: the Internet Protocol is a truck moving data around, and webpages are the cargo. There's a lot more
tracking that can happen on webpages, but we'll focus on that later, since it's not part of the internet and web's core functionality. But just from the core way the web works, there are logs of your IP address and every web page you visit. This is nothing new, and it isn't insidious: since the earliest days of the web, this information has been recorded. But it wasn't used to track people across the web or to do much to personalise their experience.
3.3 Tracking with Cookies and JavaScript
On the web, there are a few technologies that have made their way into tracking infrastructure, though they were not originally intended to be used that way. Cookies are one of these technologies. You've probably seen this term because a
lot of websites now show you an alert that they use cookies and ask you to agree.
Cookies are little pieces of code or identifiers that a website places on your
computer. There are lots of benign uses of cookies. For example, a website might
use cookies to keep track of the fact that you're logged in, or to remember your
username for a login screen. It might keep track of what products you viewed
with a cookie. In fact, you can disable cookies in your browser, but most modern
websites will not work without them.
Based on their original use, a cookie could tell a website that you had visited it before. However, there are now more modern cookies that can track you across many websites and aggregate that information to follow your visits more broadly.
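As a concrete illustration, here is what a cookie looks like at the HTTP level, using Python's standard http.cookies module. The cookie name and value here are invented for the example.

```python
from http.cookies import SimpleCookie

# What a site sends the browser: a Set-Cookie header with an identifier.
server_cookie = SimpleCookie()
server_cookie["visitor_id"] = "abc123"           # hypothetical identifier
server_cookie["visitor_id"]["max-age"] = 86400   # persist for one day
print(server_cookie.output())
# Set-Cookie: visitor_id=abc123; Max-Age=86400

# What the browser sends back on every later request to that site,
# which is how the site recognises a returning visitor.
returned = SimpleCookie("visitor_id=abc123")
print(returned["visitor_id"].value)   # abc123
```

The tracking power comes from that round trip: once the identifier is stored, every subsequent request to any page carrying that site's code quietly includes it.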
JavaScript is another technology used in tracking. This is a programming language that's used for all modern interactive functionality on the web. But it's so powerful that it can be used to monitor everything you do, down to the individual keystrokes you type in forms, even if you don't submit the data. With those
terms in mind, let's look at some specific tracking examples to see how these
technologies work.
You have probably had the experience where you've been looking at a product
and then ads for that product show up on other websites, even when you know
there's no partnership between the two sites. So how does that happen? This is something called ad retargeting. I like to call this the phantom toilet phenomenon. I first encountered this personally when I was remodelling my bathroom. I was on a home improvement website looking at a new tub, sink and toilet. Then that exact same toilet showed up on my Facebook page and on a
cooking blog and on another website. When I've talked to people about this
phenomenon, some of them are very upset and have said they stopped shopping
with companies whose products follow them around because they're upset that
those companies are tracking their browsing behaviour.
However, that's not exactly what's happening. If Emily's Home Improvement
uses ad retargeting, they partner with a web advertising company. The Home
Improvement site puts a little code in their website. And the ad company uses
that to track what products you've looked at. Then, other websites that partner
with that ad company have a little code that tells the advertiser: find products that this person has looked at and show them. This uses a combination of JavaScript and cookies.
Only the ad company tracks you across the web, not the individual sites. That
said, it can be really upsetting that any company's tracking you. In this case, the
cookies that you agree to, when you engage with a website, keep a note of
products you viewed. The ad company then retrieves those cookies to decide
what ads to show you on other websites. This is a simplified version of what's
happening, but it gives you the general gist of how it works.
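The retargeting flow described above can be sketched as a toy simulation from the ad network's point of view. The visitor IDs and product names are invented, and a real ad network is vastly more complex, but the shape of the logic is the same: one shared store, written to by the shop's pages and read from everywhere else.

```python
# Sketch of ad retargeting: the ad network keeps its own record per visitor,
# simulating the third-party cookie shared across partner sites.
cookie_store = {}   # visitor_id -> list of products viewed

def on_product_view(visitor_id, product):
    """Code embedded on the shop's pages records what was viewed."""
    cookie_store.setdefault(visitor_id, []).append(product)

def pick_ad(visitor_id):
    """Code embedded on any partner site asks: what should we show?"""
    viewed = cookie_store.get(visitor_id, [])
    return viewed[-1] if viewed else "generic ad"

on_product_view("visitor_42", "bathtub")
on_product_view("visitor_42", "toilet")
print(pick_ad("visitor_42"))    # toilet - the product follows you around
print(pick_ad("visitor_99"))    # generic ad - no browsing history stored
```

Note that neither the shop nor the partner site sees the other's data; only the ad network, which controls cookie_store, has the cross-site picture.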
3.4 Tracking with Browser Information
One thing you may not have noticed, though, is that these ads appear across
devices. You may shop for that toilet on your home computer, but the retargeted
ads may appear on your phone or tablet. How do companies know it's you when
you're using different devices? They do it through browser fingerprinting.
A browser is the application you use to access the web. Internet Explorer, Firefox,
Google Chrome and Safari are all browsers. They know how to send requests for
webpages, like when you type in a search or a web address, and how to display
the web pages code in a nice way.
Browser fingerprinting is a technology that can uniquely identify your browser
by its characteristics, just like a fingerprint can uniquely identify you by its
characteristics. That's your browser, the app on your device that lets you access
the internet, not your device itself. So how does this work?
Imagine that there were only two people in the world and one of us had an
iPhone and one of us had an Android phone. If an advertiser wanted to tell us
apart, they could just look at what type of phone we have. This is information
that gets transmitted anytime you come to a web page.
You've actually seen this in practice before, because some web pages format themselves differently depending on whether you're looking at them on your phone or on a computer. This is because information about the system you're using is sent along with your request for a web page.
Now, if there were four people, and some had iPhones and some had Androids, the phone type would not be enough to uniquely identify someone. So the advertiser could look at other system information. For example, if two people have iPhones and two people have Androids, and one of each uses Google Chrome as their browser while the other uses a different browser, then the combination of what type of phone you have and what type of browser you use identifies you uniquely. The Android Chrome user would look different than the Android Opera user, and the iPhone Safari user would look different than the iPhone Chrome user. But if we have eight people and some of them have the same combinations, again, this won't work. What else could be used to uniquely identify people?
In fact, hundreds of pieces of information about your system setup are available
anytime you visit a webpage. That includes things like what version of the
operating system you're running, what fonts you have installed on your system,
the dimensions of the window you have open, what extensions are installed in
your browser, and the list goes on. If an advertiser collects all of that information
for every person, that information essentially becomes a fingerprint. It's not
necessarily unique for every person, because two people could have the exact
same system configuration. But it's unique in the vast majority of cases. About
80% of people can be uniquely identified by the configuration of their system
with no personal or otherwise identifying information.
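The attribute-combining idea can be sketched as follows. The attribute names and values below are invented for illustration, and real fingerprinting scripts collect dozens more signals, but the principle is the same: enough individually mundane facts, combined, become an identifier.

```python
import hashlib
import json

def browser_fingerprint(attributes):
    """Combine system attributes into a single stable identifier string."""
    canonical = json.dumps(attributes, sort_keys=True)   # stable ordering
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

# Two visitors with almost identical setups: one extra installed font
# is enough to tell them apart.
visitor_a = {"os": "iOS 17", "browser": "Safari", "screen": "390x844",
             "fonts": ["Helvetica", "Courier"], "timezone": "UTC-5"}
visitor_b = dict(visitor_a, fonts=["Helvetica", "Courier", "Zapfino"])

print(browser_fingerprint(visitor_a) == browser_fingerprint(visitor_b))  # False
print(browser_fingerprint(visitor_a) == browser_fingerprint(dict(visitor_a)))  # True
```

Because the same configuration always hashes to the same value, a tracker can recognise a returning browser without storing anything on your device at all.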
This allows an advertiser to know that you're the same person visiting different websites. Even if you don't have any cookies or other stored information, they can see that you're the person with that browser fingerprint, and so they know what other webpages you visited.
Essentially, they have a fingerprint for one of your fingers, and they can detect it
in a bunch of places. But how is this used to track you across devices? If you have
an identifiable configuration on your phone, and your computer, how does an
advertiser know to link those two profiles together?
They can do this with account information. If you're logged into an account on a website you visited on your desktop computer, the advertiser doesn't need to know any personal information about your account. They just need to know something like a username or user ID number that was logged in on that computer. That's often easy to identify. Then if you log into the same account on your phone, the advertiser can say: well, this account was logged in from a desktop computer with this fingerprint, and it was also logged in on a mobile device with this other fingerprint, so the identifier tells us it's the same person who owns these two fingerprints. Essentially, they now have prints for two fingers instead of just one. They store all this information in a database.
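The linking step just described amounts to a simple join on the login identifier. This sketch uses invented fingerprints and account IDs to show how two device profiles collapse into one person.

```python
# Sketch of linking device fingerprints through a shared login.
# logins: (fingerprint, account_id) observations collected by the tracker.
logins = [
    ("fp_desktop_91ac", "user_4127"),   # account seen in a desktop browser
    ("fp_phone_33be",   "user_4127"),   # same account, mobile browser
    ("fp_tablet_77cd",  "user_9981"),
]

profiles = {}   # account_id -> set of fingerprints believed to be one person
for fingerprint, account in logins:
    profiles.setdefault(account, set()).add(fingerprint)

# Any future visit from either fingerprint now maps back to the same profile.
print(sorted(profiles["user_4127"]))   # ['fp_desktop_91ac', 'fp_phone_33be']
```

Note that the tracker never needs your name or email: any stable identifier observed on both devices is enough to merge the two histories.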
So when you come to a website, they grab the information provided about your
system configuration, and reference that against their database.
This lets them know exactly who's visiting. You can't block this system
configuration information from being transmitted. So this is a technique that can
be used on you regardless of what privacy settings or other privacy techniques
you might implement. It's a very powerful way that your behaviour can be
tracked across the web, especially when these advertisers are large. Since lots of
commercial websites, even personal blogs have advertising, large advertisers
may have information embedded on the majority of web pages that you visit.
That means that not only do they know that you've come to a particular webpage
at a particular time, they may actually be able to track the series of websites that
you visit, because every one of those pages has some of their code embedded in
it. Even if they miss a few of those websites, because their advertising is not used
there, they get a pretty thorough picture of your behaviour on the web.
If you want to know if your browser fingerprint is unique, there's a tool that can
help you with that.
The Electronic Frontier Foundation runs a tool that will analyse your
configuration information, and let you know if it's likely to identify you.
And how prevalent are these kinds of trackers? Let's take a look through an
extension that blocks them. This is an extension called Ghostery that can be used
to block trackers. We'll talk a little bit later in this chapter about how you might
use this yourself.
One of the interesting things that it does show you is exactly which trackers are
installed on the web page that you visit. If we visit a major news website,
Ghostery blocks 26 different trackers; it shows how many trackers were blocked.
And if we look a little further, it shows a list of where each is from and what
category it belongs in. So that's tracking with browser information and cookies.
3.5 Tracking by Your ISP
Beyond all the tracking we've just discussed. You can also be tracked by your
internet service provider (ISP). They know all the websites you're going to
because they provide the internet connection to your house.
Thus, any request that you make has to go through them. They see where that
request is going to and they can keep a log of it.
Until fairly recently, there was not much they could do with that data. They were not supposed to use it to target ads at you, and in the United States the Obama administration introduced rules in 2015 to make this illegal. However, one of the first pieces of legislation passed under the Trump administration allowed this kind of tracking; it allowed ISPs to use this data to target you with ads.
These regulations govern what internet service providers can collect and what
they can share. Since your ISP can see what you do online with their service,
they may decide to use that information only internally, to target you with
advertising, or they could sell it to other people. The law I just mentioned
allows them to share it with outside advertisers. But internally, internet
service providers have been using that information for even longer. Lots of
people get their internet service through their cable providers. And for a long
time, cable providers have been telling advertisers that they can quite specifically
target people based on their interests. This is not just with web advertising. This
is with the actual commercials you see on your television. By combining some of
your web activity with your viewing activity, your internet service provider can
create a profile of your interests. Then they can use this to show you different
television ads than they show your neighbour, who may be watching the same
programme.
It's like the kind of personalised advertising we've grown used to on the web. But
many people don't know that it's also happening with the content they'll see on
TV. If you're watching TV at night, your cable company may show you different
commercials than your neighbours, based on the demographic profile they have
of you. If you're a retiree, you might see a different commercial than the family
with kids next door, even if you're watching the same show.
They can combine data from your account with third party data about you and
your interests obtained from companies who collect this kind of data about
people.
Now they can also legally analyse your web traffic to enhance those profiles.
Beyond your cable company using your internet data to show you ads, they now
have permission to use this information with advertisers online.
It's still playing out exactly how this will be used. But certainly, this opens up a
whole new source of personal data for advertisers to access and use. So that's a
rough guide to how tracking works across websites, across devices, and even
across technologies.
Tracking is a complicated and evolving phenomenon that can surprise us in its
reach and its power. But it's not inescapable. And that's good news for those who
are uncomfortable with it.
3.6 What Can You Do?
What if you want to block this kind of monitoring? There are several options that
you have:
The first is to install some browser extensions that block many of this kind of
cookie and tracking activity. There are many options available, one of which is
Ghostery, or other tracker blockers. You can install these in your browser with
one click. Then, when you go to a page, they block all the trackers that they
know about. Many will also create a report for you so you know what was
blocked. If there's a website that needs these kinds of trackers, and that you trust,
you can create an exception so that site can use them. Not only does this protect
you from a lot of tracking and improve your privacy, but it can also increase the
speed at which you browse pages online. That's because it prevents all kinds of
code from running in your browser. And it stops a lot of places from putting code
onto the webpage, so there's less data to load and less processing that's
happening in the background that's not necessary for your web experience.
If you do a quick web search for your preferred browser's name, Firefox or
Chrome or whatever you use, along with the word 'extensions', you'll find lots
of extensions that block tracking. One thing to keep in mind, though, is that
some pages really rely heavily on these kinds of trackers and they simply will not
function if they're blocked. To deal with that you can either add exceptions in the
blockers to allow the trackers to be used on those pages, or install a second
browser and use that one on the rare occasion that you need to go to a webpage
that requires the trackers.
This is my personal strategy. I have one browser that I basically only use to order
things because the ecommerce website will not work with all the tracker
blocking that I have installed.
What about your internet service provider tracking you? There are two main
ways that your ISP can see what you're doing online:
The first is that they can see the actual web pages you go to, because they
bring those pages into your home.
The second is that they know what pages you want to visit, because they turn the
domain name (the thing that ends in .com or .net) into the IP address, the
number that the internet actually uses to send you to a web page.
As we discussed earlier, looking up a domain on a domain name server to get an
IP address is a necessary step to get information online. Usually, your ISP has
their own domain name servers. This means they can log all the lookups done for
you, and then they know what pages you're visiting.
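As a minimal sketch, assuming a machine with a standard resolver configured, the same kind of query can be made in a couple of lines of Python; whoever answers it learns the domain you asked about:

```python
import socket

def resolve(domain: str) -> str:
    """Ask the configured DNS resolver for a domain's IPv4 address.

    Whoever operates that resolver (by default, often your ISP)
    sees this query and learns which site you are about to visit.
    """
    return socket.gethostbyname(domain)

# "localhost" resolves without leaving the machine; a name like
# "example.com" would be sent out to the resolver in the same way.
print(resolve("localhost"))  # 127.0.0.1
```

Pointing your system at an independent resolver changes who answers this question, not whether it gets asked.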
Those are two separate steps, and both are ways that you can be tracked. If you
want to stop your internet service provider from tracking you, you need to stop
them from seeing information in both steps. To stop them from seeing your
domain name lookups, you can have someone else do that for you.
In the network settings on your computer, you have the option to specify the IP
address for the domain name server you'll use. There are lots of free open
domain name servers out there provided by independent entities.
You can easily find these with a web search. When you put those in as the domain
name servers to use, your computer sends the domain names to those servers
instead of your internet service provider to get the IP address, that then blocks
your internet service provider from seeing where you're going based on that
lookup.
The next step is to stop your internet service provider from bringing those actual
web pages to your device. That may seem impossible since they're providing
your internet service, and you need to get those pages to your device. But there's
a clever way around this. You may remember me mentioning VPNs in the first
chapter.
VPN stands for virtual private network. The way that this works is that it
essentially hides your web traffic from anyone who might be looking. Most often,
it's used to prevent hackers from seeing where you're going on the web, and to
hide your content from them. But it also can be used to stop your internet service
provider from seeing the pages that you're browsing. Say you want to visit
Google: you type in google.com. Using the domain name server you specified,
this gets turned into an IP address. Next, your computer sends a request to go to that
IP address. Without a VPN, your internet service provider will send that request
and return the page to your computer.
That allows them to see the content and know the page that you visited. With a
VPN, instead of your internet service provider fetching the page, your VPN host
does it. Your computer establishes a secure connection with your VPN server.
Your requests always go from your computer to the VPN service provider.
They're encrypted, as is the information that comes back. So your home internet
service provider is unable to see any information about what page you visited, or
the content that was on that page.
It's essentially like a tunnel that goes from your house to someone else's house
and anything that goes out to the web goes through that tunnel, and then out
through your friend's house.
The encryption that happens here prevents anyone except your VPN service
provider from knowing what pages you visited. Your home ISP just connects you
to the tunnel and they can't see anything else.
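As a toy illustration only (real VPNs use vetted protocols such as WireGuard or OpenVPN, not this throwaway XOR cipher), the following sketch shows the key property of the tunnel: the bytes your ISP carries reveal nothing about the page you asked for.

```python
import os

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher used only to illustrate the idea of a tunnel."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# A secret key shared between your computer and the VPN server.
key = os.urandom(32)

request = b"GET https://example.com/search?q=a+private+question"

# What actually leaves your house: ciphertext addressed to the VPN server.
on_the_wire = xor_cipher(request, key)

# The ISP forwards opaque bytes; the real destination is not visible in them.
assert b"example.com" not in on_the_wire

# The VPN server holds the key, recovers the request, and fetches the page.
assert xor_cipher(on_the_wire, key) == request
print("ISP sees only ciphertext; the VPN endpoint sees the real request")
```

This is also why the VPN provider itself becomes the party you must trust: it is the one endpoint that can decrypt the traffic.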
A VPN is a great way to increase your overall security online and to protect the
privacy of your data, and the resources section provides some reviews of VPN
companies. There are free VPN service providers, but those are not a good
choice for privacy: they can see all the pages you're visiting too, and they
make money by selling information about you. Paid VPNs are very affordable,
just a few dollars a month, and give you a lot of security and additional privacy.
By using an alternative domain name server and a VPN, your browsing activity
will be totally hidden from your internet service provider, and you get the added
benefit of keeping things much more secure.
Right now we're essentially at a high point for surveillance capitalism. Lots of
companies base their business on monitoring your every move and monetising it.
The privacy landscape this is creating is troubling. But fortunately, there's
something you can do about it.
Blocking trackers, using VPNs and hiding your web activity are all pretty easy
after initial setup, and let you maintain some privacy in a world that's working
hard to know everything you're doing.
Chapter 4. Data collection methods
4.1 Introduction
When we talk about being careful with our personal data, especially online, we
tend to think about what we explicitly choose to share. For example, don't post
your address, phone number or other sensitive personal information.
Think twice about what photos you share. Don't tell all your intimate personal
details to people on Facebook. And while this is all good advice, data is collected
about us in a lot of other ways.
Some of these we know about: our phone companies keep track of who we call,
when, and for how long. They know who we text on our mobile devices. Our
credit card companies keep track of where we shop and what we spend. If we
use loyalty cards or numbers, stores know what we've bought. There's a
reasonable discussion to be had about how much of that information should be
collected and stored. But in this chapter, we're going to focus on the vast
amounts of data being collected that you might not know about.
4.2 How Much Information Is Collected and How?
Websites, your devices, and the apps you have installed are regularly
collecting huge amounts of information about you, including sensitive personal
information that they probably don't need to operate.
Let's start by taking a look at how this information is collected and how much
of it there is, and then talk about some of the ways you can prevent its collection.
Let's start with an older but quite simple example. Facebook wants to know
details of how people use their platform. Their goal is to keep people engaged
with Facebook: they want us to use it as much as possible, and to know what
kinds of activities keep us engaged. For example, if you're commenting on your
friends' posts, that's good from Facebook's perspective, because it keeps you on
Facebook and encourages your friends to engage. And of course, when we
comment, Facebook knows that we've done it, and it encourages our friends to
respond. But what if you start to post something, and then reconsider before
sending it? We're not talking about posting and deleting; we're talking about
just typing text in a box on Facebook and never posting it at all. Facebook wants
to know when this happens, so they have code that collects information about
when you do this. We don't know if they're collecting the text you type, or just
data like how much you typed, but the technology exists that would let them
grab your comment, store it and analyse it, even if you don't actively post it.
When this was first reported, Facebook claimed they were not collecting what
people typed in the box, just how much they typed there. But if their policies
changed, or if other services use this technology, they could store that
information and use it in a variety of ways. The technology to collect text like this
is very straightforward to use. Thus, it's safe to assume that a lot of websites,
whether they're social media, or ecommerce or communications based, are
harvesting information that you type on their platforms, even if you choose not to
send it.
Your phone is also a goldmine for companies who want to collect information
about you. Geoffrey Fowler at the Washington Post wondered about his iPhone.
Just how much data was it sending out that he didn't know about?
He worked with a company called Disconnect to monitor the data that was sent
from his phone and was shocked by the results: in one week, 5,400 hidden
trackers received data from his phone.
Here's a quick quote from his story: 'On a recent Monday night, a dozen
marketing companies, research firms and other personal data guzzlers got
reports from my iPhone. At 11:43 p.m., a company called Amplitude learned my
phone number, email and exact location. At 3:58 a.m., another called Appboy got
a digital fingerprint of my phone. At 6:25 a.m., a tracker called Demdex received
a way to identify my phone and sent back a list of other trackers to pair up
with. And all night long, there was some startling behaviour by a household
name: Yelp. It was receiving a message that included my IP address once every
five minutes.'
Apps, trackers and your phone itself can track your location, your pattern of
movements, who you call and how long you talk, who you text with and how
often, what other apps you have installed, your phone number, your contacts,
and sometimes your photos. And in most cases, we do not know who those
companies are or what they're doing with our data.
Some of them are tracking us in especially surprising ways. Have you had this
experience? You're out to dinner, your phone on the table, untouched. In the
course of the conversation, you say something like, 'You know, next year for
spring break, maybe we'll go to Costa Rica.' You don't search for it, don't note
it down, don't touch your phone at all. But the next day, you start seeing ads
for Costa Rican tourism all across the web. This happens because some apps turn
on the microphone and passively listen to what's happening in the background.
They can use this to pull out keywords you say, or to identify TV shows, songs
or commercials you're hearing. All of this feeds into a profile about you, your
interests and your activities.
One widely publicised scandal involving this behaviour happened during the
Women's World Cup. La Liga, a national professional football league in Spain,
used their app to spy on users: they turned on the microphone to listen in and
hear whether the user was in a bar or other establishment that had the game on
TV. Then, using location services, they could identify exactly which
establishment that was. If the bar didn't have a licence to show the game, La
Liga could come after them.
Essentially, users' phones were hijacked to catch bars showing the game without
having paid for the rights. They were able to do this because users allowed
microphone and location access to the app when they installed it, but the app
did not say what it planned to do with that access. Even without access to the
microphone, though, it's possible to listen in on you. Phones have
accelerometers, which can tell how fast you're moving in three dimensions.
These are used for things like the compass, fitness apps that count steps, and
games you control by tilting your phone. There's no special privacy permission
that controls access to the accelerometer, and researchers have shown that
talking causes vibrations that the accelerometer picks up; an app could analyse
those and convert them into speech. Even when we do know our devices are
listening, they may capture more than we expect.
Consider the case of Timothy Verrill, who is accused of murdering Christine
Sullivan and Jenna Marie Pellegrini in January 2017. Verrill allegedly believed
that Pellegrini was a drug informant; on the night of the murders, he broke into
Sullivan's home and brutally beat and stabbed the two women to death. There was
an Amazon Echo personal assistant device in the home, and prosecutors took the
device, believing there may be recordings of the actual murders on it. A judge
in the case also ruled that Amazon had to turn over any recordings they had.
That's not the only instance where an Echo has been involved in a murder trial.
In 2015, Victor Collins was found floating face down in the pool of his friend
James Bates. Bates was initially arrested for murder, but the charges were later
dropped because the evidence did not support them. However, in the midst of the
investigation, Amazon turned over recordings after Bates said he would
voluntarily supply them. These types of personal assistants work by listening
for a wake word like 'Hey Siri', 'Hey Google' or 'Alexa'. They record what
follows and upload that audio to the host, like Apple or Amazon or Google, where
it's processed and analysed, and a response is generated and sent back. If
someone were to activate an Echo while something criminal was happening, Amazon
may well have a recording of it. Now, perhaps you're not criminally inclined and
therefore are not particularly worried about your personal assistant device
incriminating you at trial. But the fact that these devices can and do make
recordings in your home without your knowledge should set off alarm bells. Just
how much does Amazon collect from devices like this? A lot.
4.3 Predictive algorithms
In late 2018, a German Amazon user requested an archive of all the data that the
company held about him, a right he has under the European General Data
Protection Regulation (GDPR). Amazon sent him 1,700 recordings from someone
else, including some recordings the person had made while he was in the shower.
The recipient contacted Amazon but never heard back, so eventually he went to
the magazine publisher Heise with what he had received. The journalists found
that by listening to recordings that asked about the weather, mentioned people's
first and last names, and included friends' information, they were easily able
to identify the voice on the recordings, and even the man's girlfriend.
Amazon says releasing this data to the wrong person was a mistake. But the fact
that they save thousands of recordings from people is an interesting fact on its
own. These contain deeply personal data that when aggregated, can reveal a
tremendous amount about a user.
Why collect all that data? Because it can be used to profile you. The realm of
things that can be understood from simple data is vast and growing. In the first
chapter, we talked about how liking the page for curly fries was a strong
indicator of high intelligence, and how analysing likes could reveal your race,
religion, gender, sexual orientation, drinking and drug habits, intelligence,
and much more. But that technology is relatively dated; the range of things we
can discover about people from their personal data has expanded. Researchers
were able to find out what your political leanings were based on who you follow
on Twitter. That might not be surprising: conservatives are more likely to
follow other conservatives, and liberals are likely to follow other liberals.
However, they were able to detect this even looking at neutral data, like which
national parks you follow. That is surprising, but not as surprising as this:
more recent work has been able to predict things that will be true about you in
the future, before you even know them. In the first chapter, I described a study from
Cornell that set out to identify someone's spouse or significant other on Facebook
by looking only at which of their friends knew one another. As they analysed
their data, they accidentally discovered that they could predict whether
someone's current relationship was likely to last or fall apart in the near
term. And a lot of researchers have been building algorithms to predict future
behaviour and attributes. One interesting piece of work investigated whether an algorithm
could identify women who were at risk for developing postpartum depression by
analysing their social media feeds. The research followed women on Twitter over
the course of their pregnancies. The researchers collected data on their
interactions, frequency of posts and language they use. They then followed up to
see which women developed postpartum depression and which didn't.
They used this to train an algorithm to identify women at risk. All the women
changed the way they used Twitter during their pregnancies, but the women who
developed postpartum depression changed in opposite ways from those who
didn't. For example, those who did become depressed increased the number of
questions they asked over the course of their pregnancies, while it decreased for
women who did not become depressed. The research didn't look at what those
questions were. Maybe they were pregnancy related, but they could have been
questions about anything, TV shows, sports or just life. The use of verbs, adverbs
and pronouns also increased in one group, but decreased in the other. Overall,
the algorithm used these clues very effectively. On the day a woman gives birth, it
can predict with high accuracy if she'll develop postpartum depression. And if the
data extends a few days after giving birth, it's almost 85% accurate.
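The researchers' exact features and model aren't reproduced here; as a hedged sketch, signals of the kind described, question frequency and pronoun use, can be computed from a set of posts like this. The word list and sample posts are illustrative inventions, not the study's:

```python
PRONOUNS = {"i", "me", "my", "we", "our", "you", "he", "she", "they", "them"}

def linguistic_signals(posts: list) -> dict:
    """Per-post rates of the kind a predictive model could be trained on."""
    words = [w.strip(".,!?").lower() for post in posts for w in post.split()]
    return {
        # Fraction of posts phrased as questions.
        "question_rate": sum(p.strip().endswith("?") for p in posts) / len(posts),
        # Fraction of words that are personal pronouns.
        "pronoun_rate": sum(w in PRONOUNS for w in words) / len(words),
    }

early = ["Beautiful day out!", "Loving this weather."]
later = ["Is this normal?", "Why am I so tired all the time?", "Feeling okay today."]

# A model would track how these rates shift for one person over time.
print(linguistic_signals(early))
print(linguistic_signals(later))
```

The content of the posts never needs to be read by a human: only the direction of change in such rates feeds the prediction.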
On one hand, this is incredibly promising work. Postpartum depression is an
insidious condition, women often don't report it because they believe they're
expected to be happy and joyful just having given birth. A tool that can accurately
predict it would allow a woman's doctor to push a button when she comes in to
deliver her baby and know if she should be monitored more closely. However,
this could also be misused: someone's boss or insurer could run it on them and
deny them coverage or opportunities as a result. It highlights a lot of the
concerns that come with this technology, especially when it operates on public
data. It can really help people and the organisations that work with them. At
the same time, it can be incredibly intrusive and used in unfair ways. We don't
yet know how to strike the balance.
I mentioned earlier a study from the University of Maryland that looked at
alcoholism recovery. Let's talk about that one a little bit more. Researchers from
the University of Maryland went onto Twitter and found everyone who had
announced that they were going to their first Alcoholics Anonymous meeting.
Now, of course, this takes the 'anonymous' part out of it; the researchers
aren't sure how that impacted their results. But they made sure to filter out
jokes and people who were going to support someone else, and they were left with
a dataset of hundreds of people who clearly were drinking too much and felt like
they needed to get it under control.
They then followed what the people tweeted after that to determine whether they
were sober 90 days later, which is a good indicator of early addiction recovery.
They made sure people said explicitly what their status was. It could be that
six months later they were celebrating six months of sobriety, in which case the
researchers knew they were also sober at 90 days. Or it could be that a week
later they were complaining about being hungover at work, so the researchers
knew they hadn't made it 90 days without drinking again. So they had this
explicit data for hundreds of people. They then gathered everything those people
had done or posted on Twitter up until announcing they went to that first
Alcoholics Anonymous meeting. And with that data, they built a model that would
predict, based on everything someone did up until announcing that they were
going to AA, whether or not they would be sober 90 days later.
So essentially, once you decide to go, you can push a button, analyse your
tweets, and know if the programme will work. The model works exceptionally well: it's
right about 85% of the time. So is this good or bad? A lot of artificial intelligence
technology works in a black-box manner: you pour the data in, you get a good
answer out. But you don't get any insight into why it's true. They wanted to build
this algorithm to give insight, so they looked at things that addiction researchers
might consider like whether you have a social circle full of people who drink a lot,
or if you have poor mechanisms for coping with stress, which is common among
people with addictions. And they modelled those characteristics on Twitter for
their algorithm. So if the algorithm says it looks like you won't make it 90
days, it could tell you that you might need to change up your social circle,
because your friends talk about drinking a lot, or that you might want to get
some cognitive behavioural therapy to help you deal with stress in a more
productive way. So in that sense, the research is good and could help people.
On the other hand, there are lots of ways that these algorithms could be misused,
whether it's employers deciding to fire you from your job, or the justice system
using the results to determine whether you go to jail for a DUI. But for all of
these kinds of algorithms, it's worth considering what's the worst thing a
person could do with them, because we've seen lots of data scandals, such as
Cambridge Analytica, arise out of the misuse of these algorithms. We'll also
talk about those specifically in a later chapter. The point of these stories,
though, is that there's a lot of information that can be uncovered about you
from background data that you may be unaware is being collected. On top of that,
you're even less likely to be aware of the algorithms being applied to that
data, and the power they may have to understand things about you. And basically,
you can't hide from them.
4.4 Shadow Profiling
Let's use Facebook as an example here. Whenever I talk to groups about this, I
ask people to raise their hand if they don't have a Facebook account. In a
large audience, there may be five or ten people who don't have Facebook. But the
correct answer is really that everyone has a Facebook profile. Some people have
made it themselves, and for other people, Facebook has made it for them. If you
have not created a Facebook profile, Facebook still knows a lot about you. For
many people who are not on the site, Facebook creates something they call a
shadow profile. Basically, it's very easy to know when a person is missing from
a social network: there's a hole where a person should be, and Facebook can
easily figure out who the hole represents. So if you don't have a Facebook
account, and you've never shared any information with Facebook, how did they
build a profile about you? And what could they come up with?
I want to mention here that Facebook admits that they have these shadow
profiles, but we don't know a lot about how much data they have in each one, or
how they compute that data. So what I'm going to describe is straightforward
technology that could be used to build one of these shadow profiles. We don't
know if it's exactly how Facebook does it, so this is an educated guess: if I
were running a social network like Facebook and wanted to build profiles of
people who hadn't signed up, this is how I would do it. Also, we're focusing on
Facebook, but most other big social media platforms can and may do this too.
The most obvious information to use is other people's contact lists. If you have a
friend who has a social media account, and they use the app, they likely have
given access to their contact list. Many platforms ask for this because it allows
them to pair you up with other people who are using the app. It does that by
downloading a list of your contacts along with their data, their phone number,
their email address, their street address, maybe a photo. And if they have another
user in the system with that same email address or phone number, they can
suggest that you become friends with that person. Essentially, a phone number or
an email address is a unique identifier. If you're not on Facebook, but you are
in someone's contact list, then when Facebook downloads your friend's contact
list, they now know that you exist, what your name is, what your phone number
is, what your email address is, and maybe things like your street address or
your website or a picture of you. Getting that from one person is useful. But
the vast majority of Americans who have internet access also have a Facebook
account. So if you're not on Facebook, most of your friends still are, and
you're likely in many of their contact lists. So when they give permission for
Facebook to access their contacts, Facebook retrieves your information from a
lot of different people. Now Facebook doesn't just know that you exist; they
also know who a bunch of your friends are. And since these friends have
profiles, some of them are likely friends with each other, while others are from
different social circles.
This can reveal your interests. For example, several people may be in your
neighbourhood's Facebook group. So if three or four people who you know in real
life have you in their contact lists, are on Facebook, and are also part of the
neighbourhood group, Facebook may infer that you're likely to live in that same
neighbourhood, especially if they have your street address, which can show that
you live nearby. Similarly, if you go to a church or temple or mosque, and you
have friends on Facebook who have you in their contact lists, who go to that
same religious institution, and who are friends with each other, Facebook may be
able to infer your religion from that. The same thing applies if a lot of your
friends share an interest in a sports team, or list the same employer. From this
information that your friends have provided, Facebook knows your name, your
location, your contact information, a bunch of your friends and many of your
interests. This information can also reveal other traits. For example, there's
research showing that, especially for men, it's quite easy to determine their
sexual orientation based solely on information about their friends. So even if
you keep your sexual orientation private online, if you have friends who do not,
Facebook may be able to tell your sexual orientation just from information that
other people provide. As you can imagine, these details about people who have
opted out of the system can be used in a variety of ways, with both good and
potentially dangerous consequences.
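Since Facebook hasn't published its method, here is the educated guess above sketched in code: given only the contact lists uploaded by real users, a platform could group entries by phone number (a unique identifier) to learn that a non-user exists, the names they go by, and who knows them. All names and numbers here are invented.

```python
from collections import defaultdict

# Contact lists uploaded by three real users of the platform.
uploaded_contacts = {
    "alice": [("Sam Doe", "+1-555-0101"), ("Pat Lee", "+1-555-0202")],
    "bob":   [("Sam Doe", "+1-555-0101")],
    "carol": [("S. Doe",  "+1-555-0101"), ("Pat Lee", "+1-555-0202")],
}

registered = {"alice", "bob", "carol"}  # people with real accounts

def shadow_profiles(contacts, registered_names):
    """Group contact entries by phone number to profile non-users."""
    profiles = defaultdict(lambda: {"names": set(), "known_by": set()})
    for uploader, entries in contacts.items():
        for name, phone in entries:
            if name.lower() not in registered_names:
                # The phone number merges entries even when the
                # name is written differently in each contact list.
                profiles[phone]["names"].add(name)
                profiles[phone]["known_by"].add(uploader)
    return dict(profiles)

for phone, info in shadow_profiles(uploaded_contacts, registered).items():
    print(phone, sorted(info["names"]), "known by", sorted(info["known_by"]))
```

Notice that "Sam Doe" never signed up or shared anything, yet the platform now knows his number, two spellings of his name, and that Alice, Bob and Carol all know him, which is exactly the friend-circle data the inferences above are built on.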
4.5 What can I do?
The first thing to do is to check out how much of your data is being shared. There
are apps that will allow you to do this.
The Privacy Pro app, created by the Disconnect team mentioned earlier that
helped with the Washington Post investigation, has a free option that will let
you monitor trackers and information being shared in the background on your
phone. It will also block them, which speeds up the performance of your phone
and protects data that would otherwise be shared.
On social media, delete old posts. When less information is available about you,
algorithms can discover less about you as well.
Check your privacy preferences: on your phone, turn off apps' permission to
access your location, contacts and other information unless it's really critical.
To stop background data collection, select the options that prevent apps from
running in the background, and if they do not need to contact the internet, as
is the case with a lot of games, turn off their rights to use cellular data.
For example, I have a couple of games that I play on my phone. I'm not playing
them with anyone else, so I don't need internet connectivity to play; the games
work fine even when I have the phone in aeroplane mode and I'm not connected to
the internet. Thus, there's no real reason those apps should be using any data
at all. Occasionally, one may need to be updated, but that's a standard app
update that you do from your phone's app store, not something that the app would
do on its own. If you block apps from using data, it prevents them from being
able to send any information about you out to the world.
It may also stop them from downloading and showing ads to you. Now, this can
disable some features. For example, the game I like to play most has an option
where you can get a bonus feature in the app by watching a 30-second video.
Downloading that video requires internet access, so if I block the app from
using data, it's unable to get those video ads, and thus I can't use the feature
to get the bonuses for watching the video. I'm fine with this trade-off. I would
rather forgo bonus points in a game and protect my data than give away a lot of
information for a few in-game trinkets. But ultimately, you need to decide which
option feels best for you. For each individual app, you can go into the controls
for using data and turn it on or off. You'll need to leave this on for apps that
require the internet, like social media, maps, web browsers and other apps that
use online information. Technical steps alone cannot stop the mass collection of
data about us, but we can make it a lot harder. To get true protection, though,
we will need better legal protections.
Chapter 5. Nowhere to Hide?
5.1 Health-Related Data collection
We have talked a lot about surveillance through digital channels that we
probably expect. We know that social media companies are recording everything
we post and sharing it. We also know that companies we buy things from
probably keep track of the patterns in what we buy to try to offer us other
products. But surveillance is out in the world in a lot of ways that we may not
suspect. We talk in other chapters about how our devices collect data about us.
But what about our interactions when we are off our devices and out in the
world? Just how pervasive is that sort of surveillance? One story that highlights
how difficult it is to hide from this kind of mass surveillance was shared in 2014
on salon.com. Sarah Gray wrote an article about one woman, Janet Vertesi, who
was an associate professor of sociology at Princeton at the time. She was
pregnant and wanted to keep that private. She decided that for the full nine
months, she would not share any information about her pregnancy in any digital
form. Obviously, that means she was not posting about it on social media, but
she also avoided any other real-world digital interactions that would reveal she
was pregnant. That meant she was only calling people to tell them about her
pregnancy. She asked her family members and friends not to post about it on
Facebook. One article about her efforts reported that she had an uncle who sent
her a congratulatory message on Facebook, and she unfriended him in response.
She wanted to do some research about her baby and baby products on the
internet, but she didn't want that to be tied back to her. As we will see, browser
fingerprinting and other technologies can be used to uniquely identify you when
you're on the web. Your internet service provider can also track what websites
you're visiting and use that to advertise to you. To prevent that kind of tracking,
she used the Tor Browser. We'll discuss it more in our chapter on the dark web;
traditionally, it's associated with people who are investigating or carrying out
nefarious deeds online. In this case, it was just used to keep news about an
impending baby from being digitally tracked. Her efforts also meant that she
would not register for baby gifts at stores that had online registries. When she
was buying things for her baby, she wouldn't use a credit card. We know
retailers keep track of what we buy and analyse it to offer us new products.
Even brick-and-mortar stores track our purchases through our credit card use,
and any of these companies may be selling information about our purchase
histories in ways that we don't know about. If you buy items with a credit card,
those purchases are easily linked to your identity and can be tied into large
marketing databases. This means a single purchase of a baby item could affect
your profile, so you're marked as pregnant and start receiving marketing
material about parenting. Avoiding that means paying in cash in offline stores
and getting creative to shop online. She created a new Amazon account with a
new email address and had packages delivered to a locker, not to her home
address. This made it very difficult to associate that Amazon account with her
specifically. However, as we mentioned, she was not using credit cards to buy
anything, so how do you shop online like that?
Her solution was to use cash to buy prepaid Amazon gift cards, which she would
then load into her profile. One really interesting story from her efforts highlights
the problems that can arise here. She and her husband wanted to buy a stroller
on Amazon. The stroller was expensive, so they needed a lot of gift cards. Her
husband took $500 in cash to a local pharmacy where the gift cards were sold
and tried to buy enough to cover the price of the stroller. When he went to check
out, the pharmacy told him that they had to report the transaction, because
excessive cash spending on gift cards is suspicious.
That's because this is how terrorists do a lot of their business. For exactly the
same reason, they don't want to be tracked and analysed in their digital activity,
so they don't use credit cards. Still, they may want to do a lot of things online,
and so they use cash to buy prepaid gift cards. Essentially, if you don't want to
be tracked, you look like a terrorist.
With normal behaviour, we're tracked, constantly monitored and marketed to,
and if one were to opt out of that type of pervasive tracking, it looks suspicious
and possibly even illegal. Janet's story serves as a cautionary tale of just how
difficult it can be to keep very personal information about yourself to yourself.
Companies are also trying to collect more information about us with a veneer of
consent, even though we may not know exactly what's going on behind the
scenes. Health insurance companies, for example, offer some people discounts or
gift cards if they link their fitness tracker with their insurance account. Of course,
if you take a lot of steps, it makes sense that you might get rewarded for that.
Auto insurance companies are taking similar steps by giving people tracking
devices that can monitor their speed and driving habits.
In exchange for the discount, people may give away their privacy in that domain.
But what about when we don't know?
The Houston Chronicle shared a story in 2018 about a man who had sleep apnea
and used a CPAP machine to help him breathe at night. These machines need
replacement parts like filters and hoses that insurance will pay for. When the
man got a new machine, he registered it and opted out of receiving
communication. However, after the first night, he woke up to an email
congratulating him on his use the night before. Later, he talked to someone at
the company, who mentioned that the device was working well at keeping his
airway open. She knew that because she had a report of his usage. This was
something his old machine did too, but its data was recorded on a removable
card that he would bring to his doctor's office. This machine, without his
knowledge, was transmitting data about his usage. Not only that, it was sending
the data much more widely. It wasn't just going to his doctor; it was going to the
company that made the machine and, to his shock, to his insurance company.
And insurers use this data to deny coverage to patients who aren't using the
machine enough. Even with strong federal protections for health-related data,
this type of monitoring seems to be legal when patients agree to the terms that
come with their devices.
5.2 Facial Recognition Technology
Outside the home, facial recognition technology is another space where real-world
surveillance is becoming more sophisticated. We all know that surveillance
cameras are everywhere when we are moving around in public. Private
businesses have them, some municipalities have them, and devices like ATMs
also have built-in cameras. As a result, our movements can often be tracked. But
privacy is preserved in a way, because so many people walk past these cameras
and the images from the cameras are owned and controlled by lots of different
people. As a result, it's difficult to aggregate all of this to follow a single
person's movements. But that may change in the future as technology and
integration develop. You only need to look at examples of police trying to track
the movements of a victim or a suspect on cameras to see just how difficult this
can be. It requires going into businesses and asking for copies of their video
footage, which sometimes isn't working or is grainy and blurry. It requires
watching hours of footage to try to identify exactly the right person and the time
they walked past, and to reconcile that with what other cameras show. This, of
course, makes things difficult for police, but for the average person who's just
moving about, it also means it's very difficult for any large organisation to keep
track of our movements.
China is a counter-example, where massive state surveillance can indeed be
used to track the movements of people on a large scale. Part of the way China is
able to do that is with facial recognition technology. Facial recognition
algorithms can identify an individual person by analysing the pattern of their
facial features. It's a technology that many large corporations are working on.
Facebook has a good facial recognition algorithm. You may have noticed this
working if you upload a picture and it automatically identifies the people who are
in that photograph. However, not everyone has access to such a huge database of
people's photos, and so there are only a handful of companies with large and
accurate facial recognition systems. Some of these companies, like Amazon, are
selling that technology to third parties. There's been a lot of controversy around
this.
First, the technology is not extremely accurate. As we'll see, it works much
better for white men than it does for women and people of colour. That means
that when errors are made, they're more likely to be made for those groups. This
was highlighted in August of 2018, when the American Civil Liberties Union did
an experiment using Amazon's facial recognition software. They compared 120
California lawmakers' images to a database of 25,000 mug shots. The algorithm
incorrectly identified 28 state legislators as criminals (a false-match rate of
nearly one in four), even though none of them had ever been in jail and they
were not the people matched in the mug shots. That's a pretty high error rate
for an algorithm that's being deployed and used by police forces and other
organisations.
The way this technology might be used makes that even more troubling. For
example, there was a plan, since rolled back, to link facial recognition
technology and criminal databases with video doorbells. When someone comes
to your door, the video doorbell picks them up, runs their face against the set of
databases, and can identify if a person with a criminal record is at your door.
However, we know that there's a lot of inaccuracy in these algorithms, and they
tend to be more inaccurate and make more mistakes on people with darker skin.
This means it's likely to reinforce existing social biases.
Furthermore, in neighbourhoods where there are higher densities of people who
have been in jail, it means that people's criminal records will be constantly at the
forefront of everyone's mind. Friends and family members will be reminded that
the people they spend time with have been in jail. There are real social
implications to doing this kind of thing, even if the algorithms are right all the
time.
There's a lot of debate over the right way to use these algorithms. Their
inaccuracy and their potential to create a variety of social problems have led to
bans on the use of facial recognition technology by government departments,
including police agencies, in some cities. However, we're in the early days of
this technology, and it's possible that facial recognition will become more
integrated into applications going forward. It will require close monitoring if it's
to be used in a fair way.
In fact, even if the accuracy problems are solved, it's hard to say if there even is
a fair way for this to be used; it would constitute a dramatic escalation in the
way people are monitored through their everyday movements. This is an area
where I personally have a lot of concern. I think this technology should drive the
development of privacy legislation globally, as it's one of the greatest threats to
personal and civil liberties that we face.
5.3 Tattoo Recognition
Beyond facial recognition, technology exists to individually monitor people and
their associations in other ways. Consider tattoo recognition. Facial recognition
looks at the biometrics of your face to uniquely identify you; tattoo recognition
does a similar thing, scanning an image of a tattoo to distinguish it from any
other. However, tattoos may be nearly identical between two people. If we both
have a star or a flag or a logo tattooed on our forearms, an algorithm may have a
hard time telling them apart. But the fact that we have the same tattoo is still
interesting. It may reveal that we're part of the same group. Maybe that's the
same branch of the military. Maybe it's that we're in the same gang. Data
collection about tattoos is already quite advanced. NIST, the US National
Institute of Standards and Technology, provides government and law
enforcement with a list of characteristics to note about tattoos, including type,
location, colour and imagery. Law enforcement has long used tattoo imagery to
identify gang members and members of hate groups, but tattoo recognition
technology allows this to be carried to a new level. People on streets monitored
with cameras, even by existing surveillance systems, could have their tattoos
automatically scanned, cross-referenced and flagged as potentially gang related.
Essentially, an otherwise anonymous person can be labelled as a gang member
without any other action. We have seen this kind of analysis go wrong before.
Daniel Ramirez Medina, a 25-year-old immigrant who had been granted Dreamer
status, was arrested in 2017. The government tried to strip him of his protected
status, alleging he was a gang member because of a tattoo he had. The tattoo
was actually the name of the place his family was from in Mexico. Eventually, he
was released, and a federal judge restored his status and barred the government
from asserting he was a gang member. However, the process took a year to sort
out and could still drag on. Now, this wasn't a case of automated tattoo
identification, but it shows the consequences of mistaken tattoo association.
Imagine this scaled up and automated, and the potential for tremendously
impactful mistakes is clear.
5.4 Advertising Kiosks
Taking pictures of us and monitoring us is not just limited to identification for
law enforcement purposes. There are now advertising kiosks that will analyse
your face as well. The Wall Street Journal reported that some shopping malls in
South Korea had installed kiosks that have maps of the mall with lists of the
stores. I'm sure you've seen these before. But in this case, each kiosk had a set
of cameras and a motion detector. When someone came up to look at the map or
browse the stores on the screen, those cameras and detectors used facial-recognition-style
systems to analyse the face of the person using the map.
Why would they do this? They weren't trying to uniquely identify that person,
but rather to estimate their gender and age. From there, the kiosk could direct
them to different stores or show them ads for other products. A young woman
may see ads for something different than an older man. This is not just a science
fiction technology. This is something that has actually been used in shopping
malls already.
Do we want this kind of processing to happen? It respects privacy more than
facial recognition, but it's still invasive. It blurs the line between surveillance
cameras that we've become somewhat used to, which monitor us in stores
presumably for public safety and to deter theft, and facial recognition technology
that monitors and records our movements as unique, identifiable people.
When we are in public spaces, we know that we can be seen by other people who
are there and we know we can be monitored in different ways. But we may not
expect that the way we look, act or move through those spaces will result in
personalised advertising directed specifically towards us. Our reactions to this
kind of technology should also consider how our data is handled.
For example, in the kiosk situation, what's being stored? Is a person's age and
gender being recorded, or just used in the moment? Could the kiosk owner analyse
the demographics of people who used it? Are copies of people's photos being
stored, or being shared with third parties? The fact is, when we walk up to a
kiosk like this, we generally have no idea that a camera is present or that it's
finding out information about us. Because of their privacy laws, a system like this
is unlikely to be able to operate in Europe.
Collecting this kind of personal data about a person would require explicit
consent and obvious transparency. And because European laws require an
explicit opt-in, people would have to essentially push a button that says yes,
they're willing to have their picture taken and personal information analysed in
order to be shown ads. That really defeats the purpose of passively analysing
people with a system like this. Strong privacy rights mean you're unlikely to see
these kinds of kiosks in Europe. This kind of surveillance in the world highlights
the need for legislation that will clarify what kind of privacy we should be able
to expect and what kind of monitoring we can avoid.
Digital surveillance traces can also make their way into the offline world. In the
summer of 2017, the news outlet The Intercept published an article about
Russian attacks on US voting systems. Russian military intelligence launched a
cyber-attack against one manufacturer of US voting hardware. They also
executed a spear-phishing campaign, which sent targeted emails to over 100
election officials, trying to get them to download a Microsoft Word document
infected with malware that would give the Russians full control over the
officials' computers. This scoop was the result of a top-secret NSA report that
had been shared anonymously with The Intercept. The person who shared it
knew better than to email it: NSA systems are closely monitored, and personal
electronics are not even allowed in the building. Instead, the source printed the
report, carried it out and mailed it to The Intercept.
The NSA tracks who prints every document and when. When the FBI
investigated the leak, they claimed only one person who printed the document
had email contact with The Intercept. However, even if there had been no email
contact, the person who shared the document could have been easily caught.
Many colour printers include an almost invisible series of dots on each page
they print, encoding the date and the serial number of the printer. The images of
the NSA report that The Intercept included in their article included these dots,
though the news outlet certainly didn't know it at the time.
And the source was indeed caught. 25-year-old NSA contractor Reality Winner
was arrested and eventually sentenced to five years and three months in federal
prison for violating the Espionage Act. While there are many interesting aspects
to this story, the fact that printers are including surveillance material without
permission or disclosure is surprising.
5.5 What Can You Do?
We have already talked about ways to avoid being tracked digitally. But as the
story of Janet Vertesi shows, actually doing so can be almost impossibly difficult.
In terms of digital surveillance, it's really important to think about your comfort
zone and what sort of effort you want to expend. Offline surveillance is even
more difficult to control, since we often don't know when we are being watched,
and to what end. Some surveillance in public is inevitable and has benefits for
public safety and security. But too much can threaten individual liberties and
freedoms. The difficulty of analysing that data has protected most of us from the
most troubling consequences so far, but the technology and the algorithms are
improving every day.
What can we do? This is a case where individual efforts at control might not be
very effective. Surveillance and its consequences can only really be controlled
through regulation and policy. If you feel strongly about surveillance and its
impact on you, I would encourage you to get to know the privacy laws in place in
your country or community and become active in trying to improve those laws.
Guidelines that bestow rights on each of us to decide how much we share about
ourselves, especially with profit-driven surveillance systems, are likely our best
hope for a future with less monitoring. But we have a ways to go before these
structures are in place.
Chapter 6. The Dark Web
6.1 Introduction
You've probably heard of the dark web, and likely in a rather nefarious context.
What is it? The dark web is a place where illegal activity happens on the
internet, from drug and weapons dealing to trading software viruses. But it's
also a place where plenty of people go for legitimate reasons, including because
they want privacy. This is especially true if they're living in an environment
where free speech is suppressed and speaking against a government or religion
is punished. In this chapter, we are going to talk about how the dark web works,
what you can find there and the ways it connects to issues around your personal
data online.
The dark web is called dark because it's not accessible from regular browsers
and it's not indexed by search engines. It uses the same technology as the web
and operates with browsers and all the things that you're used to online. But to
get there, you need to be able to access that part of the web network. This is
done using the Tor Browser.
6.2 Tor Browser
Tor stands for The Onion Router, and it was originally developed by the US
Navy. The Tor Browser is built on top of Firefox, so it will look very familiar.
You can download it for free and use it just like you would use any web browser.
The key differences are:
1. It can access sites on the dark web, and
2. It protects your web traffic from snooping.
In a previous chapter, we talked about how your web usage can be monitored as
a way of tracking you online: your internet service provider, advertisers or
entities looking in from the outside can see all the sites that you go to and build
an understanding of what you're interested in and what you're doing. If that's
something you want to keep private, Tor protects that as well.
Let's start with the technical fundamentals of how the Tor Browser works. As I
mentioned, Tor is built on top of Firefox, so it mostly works like the Firefox
browser. However, it's designed to protect your web browsing by routing it
differently. Instead of connecting you directly to the webpage you want to
access, Tor routes your traffic through a series of intermediate servers. For
example, if you were sitting at home and wanted to access Google to do a web
search, any standard browser would just connect you directly to Google. Your
request would go from your computer to your internet service provider, then
hop across the internet backbone until it reached Google. Google would then
pass the webpage back to you along that path. Every website works this way,
and it will have a log that a request was received from the IP address of your
home computer.
Instead of finding a quick path from you to the page you want to visit, Tor
passes your request through several intermediate servers. It may take your
initial request and route it to an intermediate server, say in Belarus. That server
knows how to get the response back to you at home, and it routes your traffic to
another location, say in South Korea. The Korean server does not know your
home location, but it knows that it needs to send information back to Belarus.
Then it will forward the request on to yet another location, say Brazil. Again,
Brazil only knows that it has to pass information back to the last stop, South
Korea; it doesn't know where you're actually connecting from. There are a series
of these steps, and any server in the series only knows the server that came
directly before it in the chain. Thus, even if your traffic were intercepted at one
of those servers, no one would be able to track it back to your computer.
Eventually, your request will reach Google. Google will then return the page to
the last server that requested it. That server will pass it back along the chain,
and this repeats until the page finally reaches you. This gives you a great deal of
privacy with respect to your web searching habits, since no one can trace a
request back to you; your home IP address is known only to the first server in
the chain.
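The layered routing just described can be sketched in a few lines of code. This is a conceptual illustration only: real Tor wraps each layer in encryption so a relay can read just its own layer, whereas this toy version uses plain nesting, and the relay names and the `wrap`/`route` helpers are invented for the example.

```python
# Conceptual sketch of onion routing. Each relay peels one layer and
# learns only the next hop, never the original sender or final target.
# (Real Tor encrypts each layer; here we only nest the packets.)

def wrap(message, destination, relays):
    """Build the onion: the innermost layer holds the real request."""
    packet = {"next": destination, "payload": message}
    for relay in reversed(relays):          # outermost layer = first hop
        packet = {"next": relay, "payload": packet}
    return packet

def route(packet, log):
    """Each hop records only the one address it can see, then forwards."""
    while isinstance(packet["payload"], dict):
        log.append(packet["next"])          # this relay's address
        packet = packet["payload"]          # peel one layer
    log.append(packet["next"])              # the final destination
    return packet["payload"]                # the original request arrives

log = []
msg = route(wrap("GET google.com", "google.com",
                 ["belarus-relay", "korea-relay", "brazil-relay"]), log)
print(log)   # each hop saw only one step of the chain
```

Notice that `route` exposes only one "next" address at each step; that locality is exactly why no single intercepted relay can reconstruct the whole path back to you.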
Servers on the Tor network do not log this information, so no one can piece
together the chain to track a request back to you. If you are concerned about
privacy, Tor is an additional technique you could use to hide your web browsing
from your home internet service provider, or from anyone else who might be
snooping. It works extremely well. If that's the case, why didn't I mention it
when we talked about ways to protect the privacy of your web browsing? Well,
routing your traffic around like this dramatically slows down your web
experience. If you go to Google from your home computer in a regular browser,
you hardly notice a delay before the page appears on your screen. On a slow
day, it may take one second to appear. With the Tor Browser, accessing that
same website in the same way may take five or ten seconds. Sometimes it even
fails and you have to try to connect to the website again. That's because when
you route your request around the world a few times, every server has to wait
and respond, and that really slows down what's happening. Ultimately, whether
you think that kind of delay is acceptable comes down to personal preference.
There are other inconveniences that come with this traffic routing. The website
at the end may be using basic information about where you appear to be coming
from to determine what to show you. If it looks like you're coming from another
country, some websites may not work.
For example, when I was trying out the Tor Browser, I tried to order a pizza
with it. But when I went to the pizza website, it told me I couldn't order because
they didn't currently deliver to Belarus, even though I was not there. It didn't
matter that I told them I was not in Belarus; they looked at where my traffic was
coming from and used that to determine my location. Now, I may use the Tor
Browser, but I can't use it for pizza.
Like many things, there are trade-offs between privacy and convenience. How
much you want to protect your privacy, and how much inconvenience you're
willing to put up with for that protection, is a personal decision. People may
want this kind of protection simply because they're privacy conscious. But it
becomes more important if you live in a place where you know your web traffic
is being monitored. In many countries that do this kind of monitoring, VPNs
(virtual private networks, which encrypt data coming from your computer) are
banned, so traffic can't be hidden that way. Tor provides a way around this. Of
course, people who are engaging in illegal activities also want this kind of
privacy. But don't worry: it's perfectly legal to use the Tor Browser, and lots of
people use it for legitimate purposes. So there's nothing wrong with
downloading it and giving it a try.
Protecting the privacy of your web activity is one of the main features of the Tor
Browser. The other is that it's able to access the dark web, which is what we're
here for. Let's talk a little bit about the dark web and its anatomy.
6.3 The Dark web anatomy
The dark web is not a different technology from the regular web. The main way
you can tell the difference between a dark website and a regular website is that
dark websites all end with the .onion top-level domain instead of ones you're
familiar with like .com or .net. If you try to access a .onion website with your
regular browser, the browser will just think you've put in an incorrect address
and will not be able to get to it.
The Tor Browser, on the other hand, can access these sites. The domain names
of .onion sites look different from what you would expect on the regular web.
Instead of being able to choose your own domain name like cnn.com,
google.com or ilovepizza.net, every dark web domain name is 16 characters
followed by .onion. Just like with the regular web, anyone who's connected to
the dark web can set up a server and host a website if they have the technical
skills to do so. On the regular web, if you want a domain name like Obama.org,
you need to register it with a domain name registration service. That maps your
domain name to the IP address of the computer that hosts your website. A
distributed database of these mappings is kept on lots of computers, the domain
name servers that we talked about in an earlier chapter. On the dark web,
there's a similar addressing scheme, but instead of just choosing the word you
want for your domain name, you essentially end up with a 16-character string
derived from your site's cryptographic key. This means that almost every
website on the dark web has a meaningless domain name (e.g.
5fghkl54752dmnd4.onion) that's just a bunch of letters and numbers followed
by .onion. That means it's pretty much impractical to memorise domain names
on the dark web the way you do on the regular web. Some websites you're
familiar with exist on the dark web.
For example, Facebook is on the dark web. While it is facebook.com on the
regular web, it has to follow the same rules and have a 16-character domain
name ending in .onion to be on the dark web. That means it can't be
facebook.onion; instead, it is facebookcorewwwi.onion on the dark web.
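Where do those 16 characters come from? They follow the legacy (version 2) onion address scheme: the address is the base32 encoding of the first 80 bits of the SHA-1 hash of the service's public key. A minimal sketch, using a placeholder byte string in place of real RSA key material:

```python
import base64
import hashlib

def v2_onion_address(public_key_der: bytes) -> str:
    """Legacy v2 onion address: base32-encode the first 10 bytes
    (80 bits) of the SHA-1 hash of the service's public key."""
    digest = hashlib.sha1(public_key_der).digest()
    # 10 bytes encode to exactly 16 base32 characters, no padding
    return base64.b32encode(digest[:10]).decode("ascii").lower() + ".onion"

# Placeholder bytes stand in for real DER-encoded key material.
addr = v2_onion_address(b"not a real RSA key")
print(addr)  # a 16-character base32 string followed by .onion
```

Vanity prefixes like facebookcorewwwi were found by generating huge numbers of candidate keys until one hashed to a readable string; the newer v3 scheme uses a different construction and produces 56-character addresses.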
Since domain names essentially can't be memorised, it would be useful to have
good search engines for the dark web. However, that's not the case. There are
dark web search engines, but using them is more like using a search engine
from back in the 1990s on the regular web. The results are often irrelevant, lead
to broken webpages and are missing a lot of relevant information. This is not
because there are no professionals building the search engines, but rather that
things change at a much faster rate on the dark web. There's a lot of nefarious
activity going on there. Popular sites will get attacked by hackers who bring the
service down, or they're targeted by law enforcement because illegal activities
are taking place there, and then they shut down. Once they're gone, it's very
easy for them to simply reopen with a new dark web domain. And since those
domain names are not meaningful or memorised, it's not like they're losing
important branding. But frequently changing domain names means that search
engines can't rely on sites being in the same place for very long. Thus, a lot of
the way you find things out on the dark web is by word of mouth.
This is not the only way that the dark web feels like a throwback to the regular
web of the 1990s. Looking at websites on the dark web also feels very old
school. They tend to have very simple interfaces, no fancy scripts or graphics,
and little tracking code, because no one on the dark web wants to be tracked
and the Tor Browser prevents meaningful tracking anyway. Thus they are
simpler and less professional looking, but they load very quickly.
6.4 Dark Web Activities
If you want to get on the dark web, you need to get the Tor Browser and then
find the domain of a site you want to go to. As an example of just how to get
on, we can look at Duckduckgo.com, a reputable search engine that's available
on the regular web and also operates on the dark web. If we go there, we see
something that looks pretty much like a regular website and works like a
regular website, because it is a regular website. We're just accessing it in a
slightly unusual way.
First, launch the Tor Browser (torproject.org). You can access any web page
from Tor because it's a regular browser, but you can also access pages that end
with .onion. Having a dark web presence allows people in countries with
oppressive governments and restrictive internet access to use a site while
covering their tracks. There are also online marketplaces on the dark web that
sell legal and illegal things.
So far, I haven't made a very compelling case for why you would want to use the
dark web: you have to use a browser that's much slower than a regular browser,
search engines don't exist in the same way, so it's very hard to find what you
want, and there's a lot of sketchy activity going on there.
Why do people use it? In the context of your personal data, there are two things
that are relevant. First, the dark web is where personal data that's been stolen
can be found. The other is that it's a way to keep your activities more private.
Let's start with the good one. How do you keep things more private by using the
dark web? Remember that you can do lots of normal things on the dark web, so
we're not necessarily talking about keeping criminal activities private. There
are plenty of perfectly legitimate activities going on on the dark web,
including people playing games, having political debates, or sharing news. You
can find the full text of popular books, along with pirated content that, while
illegal, is of interest to a lot of people. If you're discussing sensitive
topics, being able to do that anonymously, in a way that can't be tracked, is
attractive. Certainly, if you live in a country where you know web use is
closely monitored, the dark web is very
attractive. It's a place where you cannot be tracked, your information is
encrypted, you can speak anonymously and discuss important civic issues
without fear of governmental retribution.
In this way, the dark web embodies a lot of the ethos of the early web, which
focused on freedom of expression and on freedom as a tool for improving
people's lives. Even if you aren't engaging in something sensitive, the ability to
discuss and interact without being monitored or monetized is very attractive to a
lot of people. That said, the dark web is also a place where a lot of illicit things
happen. A study of over 2500 dark websites found that well over half of them
included some kind of illicit or illegal content. One of the major activities that you
can do on the dark web is to buy things and you can buy pretty much anything.
You can buy stolen credit card numbers, stolen login information, drugs, guns,
pornography, computer viruses, and the services of people who will help you do
more of those illegal things. You can hire hackers, currency traders, and
hitmen. The marketplaces where this happens look a lot like a low-rent version
of eBay (grymktgwyxu3sikl.onion/market). People can create listings with
photos, other people can bid on or directly buy the products, and money is held
in escrow, as often happened with online auction websites before PayPal was
popular. Once
the products are delivered, the money goes to the seller. How is it that people can
buy a kilo of cocaine online and not get caught? In addition to the anonymity
offered by the dark web, the rise of Bitcoin and other cryptocurrencies has
enabled these sorts of transactions to take place anonymously and securely.
Bitcoin and cryptocurrency are terms you've probably heard, but you may not
know what they are. Essentially, they're invented currencies, not tied to any
government or company. Transactions with cryptocurrencies are recorded in
public ledgers maintained by volunteers. The transactions are anonymous, with
each person identified only by a string of random-looking letters and numbers,
and the data within them is encrypted, so it remains secret to everyone except
the two people in the transaction. Ideally, you can convert cryptocurrency into
any other currency, but it's highly volatile, and whether or not conversion
works reliably and safely is still up for debate. It often involves meeting
strangers in fast-food parking lots to do the exchange. You can easily buy
bitcoin or other cryptocurrencies with regular money, you can trade it on
exchanges, and you can use it to buy things on the dark web. The deeper details
of how cryptocurrencies work are relatively complicated, and we won't get into
them here. But the important feature is that two people can exchange money
securely without knowing any personal information about the other person.
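The two properties just described, a public append-only record and parties identified only by opaque strings, can be sketched with a toy ledger. This is a simplification I'm adding for illustration (a real blockchain adds digital signatures, consensus among the volunteers, and proof-of-work), but it shows how each entry commits to the one before it, so history can't be quietly rewritten:

```python
import hashlib
import json

def add_entry(ledger, sender, recipient, amount):
    """Append a transaction to a toy public ledger.  Each entry records the
    hash of the previous entry, chaining them together; the parties appear
    only as opaque pseudonymous identifiers."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    entry = {"from": sender, "to": recipient, "amount": amount, "prev": prev_hash}
    # Hash the entry's contents (sorted keys give a canonical form) to fix it in place.
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    ledger.append(entry)

ledger = []
add_entry(ledger, "1A2bX9", "9Zy8C4", 0.5)   # hypothetical pseudonymous IDs
add_entry(ledger, "9Zy8C4", "3Qw1F7", 0.2)
assert ledger[1]["prev"] == ledger[0]["hash"]  # the chain links the entries
```

Because every later entry depends on every earlier hash, quietly editing an old transaction would break the chain for all to see.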
Cryptocurrency and the dark web marketplaces have really evolved together,
because the marketplaces make cryptocurrencies like Bitcoin more useful, and
Bitcoin enables those kinds of transactions to take place securely and
privately. I mentioned that personal information can be bought on the dark web,
but what does that involve? It's basically limitless. You can buy the account
number, login, and password for a United States bank account with a $50,000
balance for around $500 on the dark web. That's very sensitive personal
information, and using it comes with a huge risk, but it can be had for a price.
When you hear about large data breaches of username and password information
from big websites, that data tends to end up on the dark web as well. It's not
as valuable as bank account information, but it can be used in a lot of ways.
For example, there was a large breach of Yahoo login and password information.
Even though users may have changed their passwords on the Yahoo website after
that, anyone who obtained the hacked information would know your username, your
Yahoo email address, and a password you were known to have used. If you use
that same username and password combination anywhere else on the web, they
could try it out and possibly get access to different websites. This is why
people are often encouraged to use different passwords on different websites.
Though we know it can be impractical given the number of places where we have
passwords, the suggestion is designed to protect us against attacks like this.
Hacked personal information can also be aggregated, so there are places on the
dark web where you can find a person's collected email addresses and login
names, along with other information that may have been obtained illegally:
hacked passwords, credit card numbers, social security numbers, and other
really sensitive information.
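The attack that unique passwords guard against, often called credential stuffing, is mechanically simple: take a leaked username-and-password pair and replay it at other sites. A toy sketch (every name and password here is invented):

```python
# A leaked credential pair from a breached site (all values invented).
leaked = {("dawn@example.com", "hunter2")}

# Hypothetical credential stores at three unrelated services.
other_sites = {
    "shop.example": {("dawn@example.com", "hunter2")},        # reused password
    "bank.example": {("dawn@example.com", "Xk9-unique-pw")},  # unique password
    "mail.example": {("dawn@example.com", "hunter2")},        # reused password
}

# The attacker simply replays the leaked pair everywhere and sees what opens.
at_risk = [site for site, creds in other_sites.items() if creds & leaked]
print(at_risk)  # ['shop.example', 'mail.example'] -- only the unique password holds
```

Using a different password at each site means a breach at one compromises only that one account.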
This is all available, at relatively low prices, to anyone who wants to buy it.
In 2017, Experian did a study of personal information that could be bought on
the dark web and how much it cost. A social security number went for just $1. A
credit card number with the code on the back was $5. A debit card number with
the associated bank information was $15. A driver's licence was $20. If you
have any kind of normal online presence, there's probably information about you
for sale on the dark web. Unfortunately, there's not a whole lot you can do
about that. Because of all the privacy and security elements of the dark web
that we've already discussed, it's quite difficult to shut these repositories
down. They just pop up someplace else, and they're not traceable to the
individuals who are running them. That said, you still may want to know what's
there.
6.5 What Can You Do?
You could get on the dark web yourself and start searching. But plenty of
places like credit bureaus and credit card companies now offer dark web
monitoring that looks for your personal information on the dark web. If they
find your credit card number, or a password that you're still using, they can
alert you that this is something to change in order to keep your other accounts
secure. However, these services come at a cost. Sometimes that cost is fees
you're charged, and often it means you give up your right to sue the people
monitoring you, even if they are the reason your information ended up on the
dark web in the first place. For example, if a credit bureau is hacked and you
take them up on an offer to monitor the dark web for your information, you may
be letting them off the hook for being hacked and letting your information out
there.
The only real steps you can take to protect yourself are good security
practices. Using things like two-factor authentication will alert you if
someone tries to get into your accounts, and it makes it harder for them to
get access.
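As a concrete example of what two-factor authentication adds: the rotating six-digit codes from authenticator apps are typically time-based one-time passwords (TOTP, standardised in RFC 6238), computed from a shared secret and the clock. A minimal sketch using only Python's standard library:

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32, for_time=None, digits=6, step=30):
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1).
    A thief who has only your password still can't log in without the
    shared secret that generates these short-lived codes."""
    key = base64.b32decode(secret_b32)
    counter = int((time.time() if for_time is None else for_time) // step)
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                      # "dynamic truncation"
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test vector: ASCII secret "12345678901234567890", time T=59.
secret = base64.b32encode(b"12345678901234567890").decode()
print(totp(secret, for_time=59, digits=8))  # 94287082
```

Because the code changes every 30 seconds, a password stolen in a breach and sold on the dark web is not enough on its own to open the account.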
You can set up a credit freeze or monitoring with a credit bureau, but be sure
to read the fine print. Beyond that, the dark web and the dark marketplaces of
information that exist there are just an unfortunate reality of our modern
digital life right now. Hackers are going to hack, and until there are stiffer
penalties that encourage companies to do much more to protect our information,
that information will fall into the hands of criminals and end up on a low-rent
version of eBay. The freedom of the dark web, useful as it is, comes at a cost
to us all.
Chapter 7. The Future of Personal Data
7.1 Introduction
What's the future of personal data and privacy? The answer is hard to come by,
not only because there are so many paths, but because technology drives this,
and it's evolving quickly. Nonetheless, in this last chapter, we'll ask this question
and look at a variety of ways that it might be answered in the coming decades.
We'll start with DNA as an example of how technologies change and privacy
assurances and rights shift with them. Then we'll look at the legal landscape and
what kinds of privacy regulations may be coming in the future.
7.2 DNA as Personal Data
DNA represents an interesting crossroads of technological, legal, and ethical
issues with personal data. We know that DNA is now in the toolbox of every law
enforcement organisation to catch criminals. If you bleed or sweat or cry at a
crime scene, you leave a piece of yourself behind that can be uniquely matched
to you. But in the 30 years since we started hearing about DNA in the
courtroom, we've come a long way in being able to identify and link people with
their genetic profiles, and this serves as an example of how technological
advances can undo privacy guarantees that were made in the past and create new
challenges going forward. If you're interested in true crime, you might know
the stories that I'm about to tell you.
The first begins with a woman, Lisa, who did not know where she came from. As a
child she had been taken away from her family, kidnapped by a man who would
eventually turn out to be a serial killer. He had taken her and moved around
the country, but eventually tired of her when she was only five years old. At
the RV park where they were staying, he gave her away to a couple who were his
neighbours. They took her in, but eventually went to the police, knowing
something wasn't right. From there, she went into protective custody, was
adopted, and grew up not knowing who her birth family was, or even what her
birth name was. The man who claimed to be her father was eventually caught,
tried, and convicted of abandonment. But for some reason, they never finished
the paternity test. If they had, they would have discovered he was not her
biological father.
Fourteen years later, when Lisa was an adult, a detective finished that test on
a hunch. When she discovered that the man was not Lisa's father, the case
became more complicated. Who was Lisa, and how did he come to have her with
him? And where did she come from? For the next 10 years, there were no answers.
They tried searching for a parent or sibling using Lisa's DNA, but had no luck.
Finally, Lisa herself suggested that they start looking into genealogical
databases, on sites like ancestry.com, 23andme, and other open DNA databases
where people can upload their profiles to try to find distant relatives.
She thought her identity might lie somewhere in those databases. Detectives
uploaded the profile and started finding distant cousins. We all have a lot of
distant cousins: for Lisa, there were 25,000 relatives to sift through, and
tracking down an immediate family member would require more work.
Barbara Rae-Venter, a genetic genealogist, joined the search. After a year of
work, Barbara and her team narrowed Lisa's mother down to one person. The
detective handling her case reached out to the family, and Lisa learned her
real name was Dawn. This was the first use of familial DNA in this way in a
criminal case, and it got detectives thinking about other ways they could use
this strategy. Essentially, they take the profile of a person and look for
whatever distant relatives they can find. Working with other public records and
genealogical data brings them increasingly closer to the person's near
relatives. In Lisa's case, we had a known person with an unknown identity.
But what if you had an unknown person, like a murderer or a rapist, and you
wanted to identify them using the DNA of their family members? It turns out
that this works, and it helped catch an infamous serial rapist and serial
killer who had eluded police for decades. The Golden State Killer, profiled
most prominently in Michelle McNamara's book "I'll Be Gone in the Dark",
committed dozens of rapes in the Bay Area before escalating to home invasions
and murders in the Los Angeles area in the 1970s and 80s.
The police had DNA but had been unable to match it to a known person in all
that time. Eventually, they turned to familial DNA, using public open databases
where people voluntarily upload their DNA profiles to try to find relatives.
They were searching for relatives of a murderer. Using a similar process to
Lisa's, they were eventually able to narrow it down to one man who they thought
was the killer they were looking for. Of course, they couldn't rely just on
these familial records for an arrest. But once those records gave them a
suspect, they were able to surreptitiously collect a DNA sample from the door
handle of the suspect's car, and it matched. The man known as the Golden State
Killer, the East Area Rapist, and the Original Night Stalker had been caught.
It was 72-year-old Joseph DeAngelo, and he was arrested in 2018. These stories
show the power that lies in DNA. The Golden State Killer was just the first
prominent example in what has become a string of cold cases that have been
solved with this technology.
Catching killers and rapists is good. But there are a number of questions that
arise about this use of personal data that we have to consider. In these cases,
people had voluntarily uploaded their profiles into public databases, where one
could reasonably presume law enforcement would also have access. Still, they
may be surprised to know their profiles could be used to catch their distant
relatives for committing crimes.
When we're talking about catching murderers and rapists, it feels like we're
firmly on the good side in using this information. But what if it starts being
used for pettier crimes: simple assault, breaking and entering, a drug crime
where DNA is left behind? What if it eventually becomes so cheap that it's used
to catch people for committing relatively minor crimes?
Do people want their DNA profiles used to catch distant cousins for littering? And
there are other applications where the territory becomes much more troubling.
For example, consider the case of anonymous sperm donation or egg donation.
Most of the time when men and women choose to anonymously donate in this
way, they sign legal contracts that protect their identity. They surrender parental
rights and the contracts keep their identities private and unknown to the families
that receive donated sperm and eggs. If donors knew they could be identified, it's
likely that many would refuse to participate. Some simply want to keep their
donations to themselves. A woman who donates her eggs in her 20s because she
wants to help an infertile couple and make some extra money may be very
reluctant to donate if the resulting children could track her down 20 years
later. Maybe she just wants to keep her reproductive choices private and be
left alone.
A reasonable precaution donors might take is to keep their DNA records out of
public databases. They may even choose to never get a DNA test at all. However,
even with basic ancestral DNA searches, if that donor has a brother or sister or
parents who upload their DNA, the child that resulted from the donation would
find that immediate relative quite quickly in their search. Then, if they're working
in a system that allows contact with genetic matches, the child could reach out to
the donor's family. If the donor never told anyone about their donation, the child
has revealed a deeply personal and private piece of reproductive health
information to family members.
This is especially troublesome if the donation violates the family's ethical or
religious norms. This kind of revelation is severe enough that, for some
people, it could destroy family relationships, and the donor has a right to keep that
information private. Furthermore, a donor likely does not want a relationship
with the child that they anonymously donated to produce. They chose to donate
anonymously in the first place for a reason. Yet a donor’s family could decide to
pursue a relationship after finding a DNA match. The donor’s right to control the
outcome of their donation is taken away from them.
Sperm banks and reproductive health facilities are now considering how to
discuss with potential donors the way that their anonymity will be preserved.
They may simply be unable to guarantee anonymity in the current world of DNA
testing. But for people who donated in the 90s and 2000s, who were assured that
their identity would be kept private, familial DNA search is now taking that
away, not for any real social good, but potentially for the whims and
curiosities of other people. And of course, the problems go deeper than this.
7.3 DNA profiles
There's only a small amount of protection with respect to DNA profiles at this
point, when George W. Bush was president, he signed legislation that prohibited
health insurance companies from discriminating based on DNA. That's an
important law. However, it's very narrow.
There is a case of a child being barred from enrolling in his local school
because he was a genetic carrier for cystic fibrosis. The school had a rule
against two students with cystic fibrosis attending at the same time, to reduce
the risk of cross-infection. There was already a student with cystic fibrosis
at the school, so the boy was barred from enrolling, even though he did not
actually have cystic fibrosis; he was merely a genetic carrier. The school's
ignorance of what these genetic tests meant led them to take away the rights of
this student. This speaks to a long line of discrimination based on medical
misunderstandings by lay people.
You may remember the case of Ryan White, a student barred from attending
public school in the 1980s because he was HIV positive. If genetic testing
becomes cheap and easy, there are no current laws that prevent employers,
schools and other organisations from discriminating against people based solely
on their genetic profiles. If you're genetically predisposed towards heart disease,
or Alzheimer's or schizophrenia, even if you're taking all the behavioural steps
that help prevent that, you could still be barred from getting a job based on
discrimination against that factor.
The lack of scientific sophistication among a general public that is not
trained in interpreting and understanding genetic testing means that the
opportunities for unfairness are rampant, and we're likely to continue to see
this sort of discrimination based on DNA. Genetic privacy is a complex topic,
and it's unlikely that a single law could be put in place that protects people
from unfairness, discrimination, and having their privacy compromised. We want
to allow reasonable law enforcement and healthcare uses of DNA but protect
people as we start moving towards more regulation in this space. Just how to do
that is uncertain. Thinking about DNA specifically, it's worth asking whether
you even want to have your DNA tested by one of these companies. If you do want
it for your own genetic insights, or if you have already had it tested, you can
strictly control the privacy settings on your DNA in the system. Having your
DNA profile deleted after you've obtained the information you want may also
help.
7.4 The Future of Privacy Regulations
DNA is only one example of many technologies that are evolving to provide more
insight into, and invasion of, our lives. Artificial intelligence, data
integration, and massive data collection all promise to lead to new tech that
can uncover identities, attributes, and connections we never expected.
As a result, we likely need to think about fundamental privacy rights that we
want to establish, rather than piecing together domain-specific regulations.
That may sound familiar, as it's the basis of European privacy protections. But
are we anywhere close to getting that in the US, where the big tech companies
are based? Federal regulations are still largely up in the air, but there are
interesting developments happening at the state level, especially in
California, that might offer some insight. The California Consumer Privacy Act,
a state law that some people refer to as "GDPR lite", went into effect in
January 2020. It gives citizens of California many of the rights that Europeans
have. It governs large businesses and businesses that make most of their money
by sharing or processing personal data. The law offers a number of protections.
It requires that companies be transparent about what data is collected and how
they use it. Citizens have a right to control the data about themselves; they
have a right to see the data that companies hold, and to request that it be
deleted. While this will be very beneficial for citizens of California, it also
raises the prospect of a GDPR-like federal law in the United States. That's
because California is likely not going to be alone in passing a consumer
privacy act.
Many other states have their own privacy laws, and several are considering
bills that would grant similar protections within their borders. Having a
patchwork of different state laws that regulate consumer privacy, especially
when those regulations are not the same across states, can make it very
difficult for a company working with personal data to operate in the United
States. A company potentially has to handle data differently and offer
different features across 50 states. Depending on how these laws are written,
it may even have to offer different protections if people are simply visiting a
state versus living there. This scenario makes it more likely that we'll see a
federal law come into place in the next few years that offers similar
protections. That would allow companies to operate under a single United States
privacy law, as opposed to operating differently in each state. If this
happens, the US will be following Europe's lead.
The current situation is very similar to what the EU faced: Europe had a
privacy directive that was implemented differently in each country, which led
to some of the same difficulties we see in the United States. When GDPR came
into effect in May 2018, it harmonised those laws, making it much easier for
companies to comply. And there are already precedents for federal laws that
protect privacy in the US. The Children's Online Privacy Protection Act, or
COPPA, is one example of a federal privacy law; it governs the data of children
under 13 and has very strong consent protections in place. You're likely
familiar with the Health Insurance Portability and Accountability Act, HIPAA,
which governs the privacy of health information; you've probably encountered
its consent requirements through forms in your doctor's office and pharmacy.
And there's also FERPA, the Family Educational Rights and Privacy Act, which
grants privacy protections over education records.
These are all federal laws that allow for consistent implementation of privacy
policies around the country. There's not currently a GDPR-like privacy law in
the United States, but if Congress were to move this way, it would likely be in
the interest of offering consumers more control over their data and bringing a
more European-style privacy legal standard to the United States. Indeed, the US
is already benefiting from Europe's leadership on privacy: American lawmakers
are able to see what has and hasn't worked with GDPR so far, and to make
adjustments as they develop a law of their own. Without the European law
leading the way, it would likely have been much harder even to get state
privacy laws passed within the US. But now that California has the privacy ball
rolling, so to speak, there's hope that Americans will all eventually gain much
more appropriate control over their own data.
7.5 The Legislative Future of Personal Data
The legislative future is not just limited to privacy protection laws, they need a
robust set of laws that cover many different aspects of the personal data
problem. As we discussed in the chapter on data scandals, cybersecurity is a real
issue that connects with data privacy. We've all been the victims of multiple data
hacks, whether it's our credit cards being stolen from major retailers, or social
media companies and email providers being hacked. Sometimes these breaches
don't yield much useful information. It could just be our email addresses released
which isn't especially valuable or private. However, as evidenced by the hack of
the Office of Personnel Management, which revealed background check records
for federal employees and contractors, deeply sensitive information can also be
released. How do we prevent people from hacking this kind of data?
The bad guys are always going to try to get it, and that means we need a strong
defence against their attacks. That requires comprehensive cybersecurity and
strong incentives for companies to follow best practices and the latest
guidelines. Right now, the penalties for poor security practices are relatively
weak, even when these companies are gathering tremendously sensitive
information about people that can cause major disruptions in their lives. The
penalties for weak security tend to be small enough that their business is not
disrupted. Contrast this with laws in Europe that can fine companies a
significant percentage of their global revenues for irresponsible data
security. We need better cybersecurity laws that create very harsh penalties
for companies that do not protect our data.
We also need to understand and discourage massive data collection without
purpose. One interesting proposal that's been floated in the US Senate is to
require publicly traded companies to disclose the value of, and liability
associated with, the personal data that they hold about people. We don't really
know how to put a value on that data at this point, so if legislation like that
were to become law, it would require techniques for valuing the data. However,
if that problem were solved, and companies had to report that they potentially
hold billions of dollars' worth of personal data, along with the liability
associated with it being stolen, that would serve as a financial deterrent
against saving unnecessary information. For example, if Amazon had to disclose
that it holds, say, $10 billion worth of personal data in all the recordings
stored from people who have an Amazon Echo, it might decide to keep fewer of
those recordings. That improves privacy, because there's less data to analyse,
and it improves security, because if that data is leaked, there's less
information there that could be exploited.
There's not going to be any omnibus bill that addresses all of the issues
surrounding personal data. Instead, we're going to have to look at the different
parts of this very complicated problem and come up with steps that improve
each.
The focus on corporations also carries into the discussion of privacy and
surveillance. We've talked a lot about the rights of consumers and what
legislation may do to protect them from companies who have their data. But what
about the relationship between people and companies when the people are
employees? Companies have very few limitations when it comes to monitoring
their employees. We likely expect that companies can monitor our work emails,
even though we hope they're not reading them regularly just to monitor us. But
monitoring technology often extends out of the office and beyond the bounds of
work-related activities.
Consider the story of Myrna Arias, who worked for a money transfer firm. Her
company had an app that tracked employees when they were working out of the
office. Many companies have such apps, and in most cases they're a reasonable
way to monitor work: if workers are making deliveries, the app lets employers
know where they are in the process. It can track whether they're actually
working and help provide better service to customers. But should employees keep
these apps on when they're not working?
Myrna's company said yes. When she objected, she was told she had to have her
phone on and the app running at all times. Her boss used it to monitor her when
she was off duty, and bragged that he knew how fast she was driving at certain
times when she was not working, because he was monitoring her in the app. When
she uninstalled the app, she was fired. She sued the company and settled out of
court. Would she have won a suit for wrongful termination? It's not clear, and
the waters get even murkier when workplace programmes gather information that's
protected.
For example, in April 2019, The Washington Post reported that the
pregnancy-tracking app Ovia was sharing data with employers. Women use this app
to track their periods, bodily functions, sex drive, and more. The app also
partners with workplace wellness programmes; as part of that, for a fee, it
provides employers with aggregated information about the women using the app.
This includes whether they're pregnant, when they might return to work, and
whether they lost a pregnancy. It's illegal to use pregnancy status in hiring
decisions, and while Ovia isn't sharing names directly with employers, it can
be easy to identify individual women. Consider a small company in a
male-dominated field with relatively few female employees. Some may be old
enough that they aren't likely to consider having children; others may be
younger; it could be a very small number who may be considering pregnancy, and
they could become identifiable quite easily. If an employer finds out this kind
of information by partnering with the app, it may make employment decisions
based on that. And if it does, how can a woman prove that she was discriminated
against? More importantly, why should companies get this information at all?
We've become so used to surveillance in workplace situations that we may need
to step back and really ask why an employer has any legitimate right to know
about its employees' fertility. There's really no need for it. And yet
companies continue to push employee tracking with a variety of apps and
devices. They may require it for insurance coverage or discounts. They end up
with a lot of very personal data for no legitimate reason.
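The re-identification risk in a small workforce is easy to demonstrate. In this invented example, an "aggregate" wellness report that names no one still singles out a person once it's combined with an ordinary staff roster:

```python
# An invented roster for a small, male-dominated team.
roster = [
    {"name": "Alan",  "sex": "M", "age": 44},
    {"name": "Bram",  "sex": "M", "age": 29},
    {"name": "Carol", "sex": "F", "age": 58},
    {"name": "Dana",  "sex": "F", "age": 31},
    {"name": "Emil",  "sex": "M", "age": 36},
]

# A hypothetical "anonymous" aggregate report: one pregnancy-app user,
# female, aged 25 to 40.  Who fits that description?
matches = [p["name"] for p in roster
           if p["sex"] == "F" and 25 <= p["age"] <= 40]
print(matches)  # ['Dana'] -- the aggregate points to exactly one employee
```

This is why privacy researchers insist that data is only meaningfully anonymous when every released group is large enough that no combination of ordinary attributes can narrow it to one person.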
A lot of the discussion around privacy rights is centred on companies and their
users' data. This is an important topic, but there should also be a serious and
critical analysis of companies monitoring employees, and legislation that
enshrines protections for workers as well. Of course, monitoring within the
workplace may be important and legitimate, and it should be allowed. But as
employers cross into monitoring the private lives of workers, we shift from a
space of desired productivity to one of desired power.
The technology that can collect, analyse, and derive insights from our data is
growing very fast. As a society, we've not figured out how to apply our ethics,
values, and protections in this domain; whenever we think we've caught up, the
technology has sped ahead. DNA profiling shows just how disruptive new
technologies can be in the face of old privacy guarantees: it can help catch
criminals, but also identify anonymous donors. To address this legally, we need
to think about the fundamental rights people have over their data. There's
change happening here as well. The future is uncertain, but trends suggest we
may see more protections coming.
This brings us to the end of our quick but critical look at the world of
personal data and what you can do to control how you operate in it. What have
we learned? Well, we've learned that you can't control everything. Part of
living in a data-driven world is coming to terms with the fact that you, and
the data you create intentionally or unintentionally, are valuable commodities,
and that people, corporations, and governments can and do go to great lengths
to access that data. Sometimes this can make your life better in meaningful
ways, but sometimes it can result in really alarming outcomes. We've also
learned that you have some power over the situation: you have the power to know
where you stand when it comes to your privacy preferences, the power to take
targeted steps to create and maintain a level of data privacy that works for
you, and the power to speak up intelligently and demand change when you feel
more protections are needed.