Example: privacy-preserving surveillance

The Social Security Number Crisis
Latanya Sweeney
privacy.cs.cmu.edu
Questions Addressed in this Lecture
How are Social Security numbers assigned?
What predictions can we make about a person
and his SSN?
If we have a person’s Social Security number,
can we get a credit card in her name?
Show me someone who gives his Social
Security number away for free.
Give me a solution to consider.
Thanks to
Harry Lewis
Henry Leitner
Harvard Center for Research on Computation and Society
Gratitude to
Harvard Extension School
Harvard Summer School
Harvard GSAS
Harvard College
for exposing me to other disciplines and other ways of thinking.
Privacy Technology
1. Example: linking data
2. Example: anonymizing data
3. Example: distributed surveillance
4. Example: trails of dots
5. Example: learning who you know
6. Example: identity theft
7. Example: fingerprint capture
8. Example: bio-terrorism surveillance
9. Example: privacy-preserving surveillance
10. Example: DNA privacy
11. Example: SSN failures and biometrics
12. Example: k-Anonymity
13. Example: webcam surveillance
14. Example: text de-identification
15. Example: face de-identification
16. Example: fraudulent Spam
privacy.cs.cmu.edu
Data Detective
How do we learn sensitive or strategic
information from seemingly innocent
information?
Data Protector
How do we provably prevent sensitive or
strategic information from being learned?
[Figure labels: Privacy And Technology; Technology Or Privacy; Traditional Belief System; This Work; Usefulness]
[Figure shown beside 13. Example: webcam surveillance. Labels: Original, Tracked, De-Identified]
[Slide annotation beside 9. Example: privacy-preserving surveillance. Label: HIPAA CERTIFIED!]
[Figure: Cumulative Percentage of Patients (vertical axis, -20.0% to 120.0%) versus Binsize (horizontal axis, 0 to 35); legend: Unaltered, Safe, Altered]
[Figure, gross overview: identifiability level paired with detection status]
Identifiability 0..1          Detection Status 0..1
Sufficiently anonymous        Normal operation
Sufficiently de-identified    Unusual activity
Identifiable                  Suspicious activity
Readily identifiable          Outbreak suspected
Explicitly identified         Outbreak detected
Team Members
• Computer scientists
(AI, database, security, theory, NLP,
HCI, data mining, vision, biometrics,
link analysis)
• Lawyers
• Medical doctors
• Social scientists
• Policy analysts
• Geneticists
• Forensic scientists
• Ethicists
• Economists
SSN Numbering Scheme
• Social Security number allocations
•Historical highlights and uses
•Inferences from SSNs
Historical Highlights of the SSN
• 1935 Social Security Act
SSNs only to be used for the social security program.
• 1943 Executive Order 9397
Required federal agencies to use SSNs in new record systems
• 1961 IRS began using SSN
As taxpayer identification number
• 1974 Privacy Act
Government agencies’ use of the SSN required authorization
and disclosures (agencies already using the SSN were exempt)
• 1976 Tax Reform Act
Granted authority to State and local governments to use
SSNs: state and local taxes, motor vehicle agencies
•Over 400 million different numbers have been issued.
Source: Social Security Administration, http://www.ssa.gov/history/hfaq.html
Non-Government Uses of SSN
•Corporate use of the SSN is not bound by the laws
and regulations mentioned earlier.
You can request an alternative number from companies.
You can refuse to provide it; they can refuse service.
• Most common non-government use relates to credit bureaus
and credit granting companies who rely on the number for:
Recognition – to locate your credit history for sharing it
with you or with others from whom you requested credit.
Authentication – to make sure new entries are added to
the credit report that relates to you. Primary means is
SSN along with mother’s maiden name, which serves as
a kind of password.
•Common uses are as corporate identification numbers:
Example: medical and school identification cards
Quality of the SSN Assignment
The ability to acquire the number and use it falsely grows
as more copies of the number are stored for different
purposes, while misuse, even if illegal, offers
rewards.
A Social Security number is almost always specific to
one person and one person typically has a unique
SSN. There are exceptions.
Unusual case of SSN 078-05-1120
Used by thousands of People!
In 1938, a wallet manufacturer provided a sample SSN
card, inserted in each new wallet.
The company’s Vice President used the actual SSN of his
secretary, Mrs. Hilda Schrader Whitcher.
The wallet was sold by Woolworth and other stores. Even
though it had the word "specimen" written across the face,
many purchasers of the wallet adopted the SSN as their
own. In the peak year of 1943, 5,755 people were using it.
SSA voided the number. (Mrs. Whitcher was given a new
number.) In total, over 40,000 people reported this as their
SSN. As late as 1977, 12 people were still using it.
Source: Social Security Administration, http://www.ssa.gov/history/ssn/misused.html
SSNs are Encoded Numbers
The encoding is based on how the numbers are
issued. They typically situate the recipient in a
geographical area within a time range. They
may also reveal whether the person is an
immigrant, an alien, or a worker on the railroad.
Format: AAA-GG-NNNN
AAA is area code
GG is group code
NNNN is serially assigned number
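The AAA-GG-NNNN format can be illustrated with a short Python sketch. The names `SSNParts` and `split_ssn` are made up for this example, and the input is the voided specimen number 078-05-1120 discussed earlier.

```python
import re
from typing import NamedTuple


class SSNParts(NamedTuple):
    area: str    # AAA, the first 3 digits
    group: str   # GG, digits 4 and 5
    serial: str  # NNNN, the last 4 digits


# Accepts only the dashed AAA-GG-NNNN form used on the slide.
_SSN_PATTERN = re.compile(r"(\d{3})-(\d{2})-(\d{4})")


def split_ssn(ssn: str) -> SSNParts:
    """Split a string in AAA-GG-NNNN form into its three fields."""
    match = _SSN_PATTERN.fullmatch(ssn.strip())
    if match is None:
        raise ValueError(f"not in AAA-GG-NNNN form: {ssn!r}")
    return SSNParts(*match.groups())


parts = split_ssn("078-05-1120")
print(parts.area, parts.group, parts.serial)
```

The fields are kept as strings so that leading zeros (as in area 078) are preserved.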
First 3 Digits Provide the State of Issuance, 1
001-003 New Hampshire
004-007 Maine
008-009 Vermont
010-034 Massachusetts
035-039 Rhode Island
040-049 Connecticut
050-134 New York
135-158 New Jersey
159-211 Pennsylvania
212-220 Maryland
221-222 Delaware
223-231, 691-699* Virginia
232-236 West Virginia
232, 237-246, 681-690 North Carolina
247-251, 654-658 South Carolina
252-260, 667-675 Georgia
261-267, 589-595, 766-772 Florida
268-302 Ohio
303-317 Indiana
Source: Social Security Administration, http://www.ssa.gov/foia/stateweb.html
First 3 Digits Provide the State of Issuance, 2
318-361 Illinois
362-386 Michigan
387-399 Wisconsin
400-407 Kentucky
408-415, 756-763* Tennessee
416-424 Alabama
425-428, 587-588, 752-755* Mississippi
429-432, 676-679 Arkansas
433-439, 659-665 Louisiana
440-448 Oklahoma
449-467, 627-645 Texas
468-477 Minnesota
478-485 Iowa
486-500 Missouri
501-502 North Dakota
503-504 South Dakota
505-508 Nebraska
509-515 Kansas
Source: Social Security Administration, http://www.ssa.gov/foia/stateweb.html
First 3 Digits Provide the State of Issuance, 3
516-517 Montana
518-519 Idaho
520 Wyoming
521-524, 650-653 Colorado
525, 585, 648-649 New Mexico
526-527, 600-601, 764-765 Arizona
528-529, 646-647 Utah
530, 680 Nevada
531-539 Washington
540-544 Oregon
545-573, 602-626 California
574 Alaska
575-576, 750-751* Hawaii
577-579 District of Columbia
580 Virgin Islands
Source: Social Security Administration, http://www.ssa.gov/foia/stateweb.html
First 3 Digits Provide the State of Issuance, 4
580-584, 596-599 Puerto Rico
586 Guam
586 American Samoa
586 Philippine Islands
700-728 Railroad Board**
* Some states may share the same area by transfer or split.
** Railroad employees, discontinued July 1, 1963.
000 will NEVER start a valid SSN.
Source: Social Security Administration, http://www.ssa.gov/foia/stateweb.html
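As a rough illustration of how the area tables can be used, the Python sketch below looks up the place of issuance for an area number. `AREA_RANGES` holds only a small subset of the rows above, and `places_of_issuance` is a made-up helper name. A full lookup would need every row.

```python
from typing import List

# Subset of the area tables above: (low, high, place of issuance).
# Illustrative only; most rows are omitted.
AREA_RANGES = [
    (1, 3, "New Hampshire"),
    (4, 7, "Maine"),
    (8, 9, "Vermont"),
    (10, 34, "Massachusetts"),
    (50, 134, "New York"),
    (232, 236, "West Virginia"),
    (232, 232, "North Carolina"),
    (237, 246, "North Carolina"),
    (545, 573, "California"),
    (602, 626, "California"),
    (700, 728, "Railroad Board"),
]


def places_of_issuance(area: str) -> List[str]:
    """Return every place in AREA_RANGES whose range contains the area."""
    number = int(area)
    if number == 0:
        # 000 never starts a valid SSN.
        return []
    return [place for low, high, place in AREA_RANGES if low <= number <= high]


print(places_of_issuance("078"))
print(places_of_issuance("232"))
```

The function returns a list because some areas, such as 232, appear under more than one place in the tables.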
Digits 4 and 5, Order of Issuance
Called the Group numbers. Not assigned
sequentially, but in the following order:
ODD - 01, 03, 05, 07, 09
EVEN - 10 to 98
After all in 98 are assigned, then
EVEN - 02, 04, 06, 08
ODD - 11 to 99
Source: Social Security Administration, http://www.ssa.gov/foia/ssnweb.html
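The order of issuance described above can be written out as a short Python sketch. The function name `group_issuance_order` is made up for this example.

```python
from typing import List


def group_issuance_order() -> List[str]:
    """List the group numbers (digits 4 and 5) in the order they are issued."""
    first_odd = range(1, 10, 2)    # ODD  - 01, 03, 05, 07, 09
    first_even = range(10, 99, 2)  # EVEN - 10 to 98
    late_even = range(2, 9, 2)     # EVEN - 02, 04, 06, 08
    late_odd = range(11, 100, 2)   # ODD  - 11 to 99
    phases = (first_odd, first_even, late_even, late_odd)
    return [f"{group:02d}" for phase in phases for group in phase]


order = group_issuance_order()
print(order[:7])   # the first groups issued
print(len(order))  # every group from 01 to 99 appears once
```

A group's position in this list gives a rough ordering in time, which is why digits 4 and 5 support inferences about when an SSN was issued.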
High Group Listing
On a regular basis, the Social Security Administration (SSA)
publishes the highest group number that has been assigned
for each area. Below is a sample of the first few entries for
9/2/2003.
Area  Group    Area  Group    Area  Group    Area
001   98       002   98       003   96       004
007   02       008   86       009   86       010
013   86       014   86       015   86       016
019   86       020   86       021   86       022
025   86       026   86       027   86       028
031   84       032   84       033   84       034
037   68       038   68       039   68       040
Source: Social Security Administration, http://www.ssa.gov/foia/highgroup.htm
High Group Listing, How to Read
For area 003 (the first 3 digits of an SSN), the highest number
used in the 4th and 5th digits is 96.
High Group Listing, Interpretation
Recall the assignment of group numbers:
ODD - 01, 03, 05, 07, 09 then EVEN - 10 to 98
After all in 98 are assigned, then
EVEN - 02, 04, 06, 08 then ODD - 11 to 99
For area 003, where the highest group assigned is 96:
003-09-1234 would be a valid SSN.
003-02-1234 would NOT be valid.
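The interpretation above can be checked with a small Python sketch. The function names are made up for this example. It compares a group's position in the issuance order with the position of the published high group for the area.

```python
def issuance_rank(group: int) -> int:
    """Position of a group number in the issuance order (0 is issued first)."""
    if not 1 <= group <= 99:
        raise ValueError("group must be between 01 and 99")
    if group % 2 == 1 and group <= 9:   # ODD 01 to 09
        return (group - 1) // 2
    if group % 2 == 0 and group >= 10:  # EVEN 10 to 98
        return 5 + (group - 10) // 2
    if group % 2 == 0:                  # EVEN 02 to 08
        return 50 + (group - 2) // 2
    return 54 + (group - 11) // 2       # ODD 11 to 99


def group_already_issued(group: int, high_group: int) -> bool:
    """True if the group comes no later than the area's high group."""
    return issuance_rank(group) <= issuance_rank(high_group)


# Area 003 has high group 96 in the sample listing.
print(group_already_issued(9, 96))  # group 09, as in 003-09-1234
print(group_already_issued(2, 96))  # group 02, as in 003-02-1234
```

With a high group of 96, group 09 has already been issued and group 02 has not, matching the two examples on the slide.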
What Can be Learned
from the First 5 Digits of an SSN
In “semantic learning” terms,
•The first 3 digits provide reliable inferences
about place of issuance.
•Digits 4 and 5 provide inferences
on time of issuance.
Social Security Death Index
The Social Security Administration releases the Social
Security Death Index for public use. Perceived benefits:
•genealogical research (constructing family trees)
•attempt to defeat illegal re-use of SSNs.
Released information for each death:
Name
SSN
date of birth
date of death
place where SSN was issued
place where SSN benefit was paid upon death
Social Security Death Index
Search by name
or SSN, in part or
whole.
Advanced search
includes options
for date of birth,
date of death, and
geographical
location, in part
or whole.
http://ssdi.genealogy.rootsweb.com/
Sample Result for Herb Simon
Search on Herbert
Simon, Last residence
was Pennsylvania.
SSNwatch
On-line SSN validation
system. Given the first 3
or 5 digits of an SSN,
returns the state in which
the SSN was issued along
with an estimated age
range of the person.
Sample uses:
Job Applications
Apartment Rentals
Insurance Claims
Student Applications
http://privacy.cs.cmu.edu/dataprivacy/projects/ssnwatch/index.html
SSNwatch Results for SSN 078-05
Geography: New York
Date of issuance: Issued before 1993
Year of Birth (5-digit prefix): 64% born 1889 to 1910,
98% born 1879 to 1921
If the person presenting the SSN is
about age 20, then it is extremely
unlikely that the provided SSN was
issued to that person.
SSNwatch Results for SSN 078-05
Geography: New York
Date of issuance: Issued before 1993
Year of Birth (5-digit prefix): 64% born 1889 to 1910,
98% born 1879 to 1921
If the person presenting the SSN
fails to list or acknowledge New York
as a prior residence, then it is
extremely unlikely that the provided
SSN was issued to that person.
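The two checks described for 078-05 can be sketched in Python as follows. This is not the SSNwatch implementation. `PrefixProfile` and `warnings_for` are made-up names, and the profile values are copied from the 078-05 result above (using the 98% birth-year range).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PrefixProfile:
    state: str                # state of issuance for the prefix
    earliest_birth_year: int  # start of the likely birth-year range
    latest_birth_year: int    # end of the likely birth-year range


# Copied from the slide: New York, 98% born 1879 to 1921.
PROFILE_078_05 = PrefixProfile("New York", 1879, 1921)


def warnings_for(profile: PrefixProfile, birth_year: int,
                 prior_states: List[str]) -> List[str]:
    """Return reasons to doubt that the SSN was issued to this person."""
    warnings = []
    if not profile.earliest_birth_year <= birth_year <= profile.latest_birth_year:
        warnings.append("birth year outside the range typical for this prefix")
    if profile.state not in prior_states:
        warnings.append(f"{profile.state} not listed as a prior residence")
    return warnings


# A hypothetical applicant born in 1985 who lists only Ohio.
for warning in warnings_for(PROFILE_078_05, 1985, ["Ohio"]):
    print(warning)
```

An empty list means neither check raised a concern. It does not prove that the SSN belongs to the person.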
Lab Activity:
Predicting an SSN from Facebook Profiles
Take a moment and write down the steps
(“algorithm”) needed to predict an SSN.
Assume SSN is issued at birth.
Your algorithm should predict the first 6
to 9 digits for Alice, who is born today in
Cambridge, MA.
(You don’t have to give me the answer, but
tell me how to figure it out.)
Lab Activity:
Predicting an SSN from Facebook Profiles
Recent finding:
We can accurately predict 6 to 9 digits of a
young person’s SSN.
Federal Trade Commission Report:
Victim Complaint Data
The next group of slides are excerpts from
the Federal Trade Commission Report on
Identity Theft, Victim Complaint Data.
Figures and Trends January-December 2001.
Other Statistics
•Of the credit card fraud, more than half (or 26% of all thefts)
involved new accounts. [Federal Trade Commission Report on
Identity Theft, Victim Complaint Data. Figures and Trends
January-December 2001.]
•Number of months between date of identity theft first
occurring and date first discovered by victim:
Less than 1 month 45%, 1-6 months 25%
[Federal Trade Commission Report on Identity Theft, Victim
Complaint Data. Figures and Trends January-December
2001.]
•50% of the credit card reports checked contained errors.
Two reasons for errors: (1) mistaken for another person with
similar name; and, (2) fraud. [Consumer Reports, July 2000]
Federal Trade Commission Report: Overview of
the Identity Theft Program, Oct 1998 – Sep 2003
Data Privacy Lab Finding
Fraudulent New Credit Cards
Can we describe an algorithm that shows how
thousands of fraudulent credit cards could
be issued to malicious parties using only FREE
on-line information?
• If it works, thousands of Americans are at
immediate risk of identity theft!
• If it is to work, we need:
• Credit card application requirements
• Finding Social Security numbers on-line
• Finding dates of birth on-line
• Finding mother’s maiden name on-line
Basic Information Necessary
For a Credit Card Application
• Name
• Social Security number
• Address
• Date of birth
• Mother’s maiden name
Strategy: if one can identify these fields for a person,
they have the basic information needed to acquire a
credit card in that person’s name. Therefore, we need
only demonstrate how this information can be obtained
on-line.
[Figure: student application, with labels “Basic information” and “School Information”]
Basic Information Necessary
For a Credit Card Application
Do these first:
• Name
• Social Security number
• Address
Remaining:
• Date of birth
• Mother’s maiden name
One Approach is to Buy an SSN
There are websites that advertise SSNs for sale.
The California-based Foundation for Taxpayer and Consumer Rights
said for $26 each it was able to purchase the Social Security
numbers and home addresses for Tenet, Ashcroft and other top Bush
administration officials, including Karl Rove, the president's chief
political adviser. [Associated Press, “Social Security numbers sold
on Web” 8/28/2003]
One Approach is to Buy an SSN
http://socialsecuritypeoplesearch.com/index.asp
Reportedly Permissible Purposes for
Purchasing an SSN On-line, 1
Locating Missing Persons
Child Support Enforcement
Skip Tracing
Collections
People Locator Service
Locating Alumni
Other Legal, Normal Business Use
Judgement on Subject
Apprehending Criminals
Law Firm -Fiduciary Interest
http://socialsecuritypeoplesearch.com/index.asp
Reportedly Permissible Purposes for
Purchasing an SSN On-line, 2
Legal Process Service
Legal Research
Finding Owners of Unclaimed Goods
Fraud and Loss Prevention
Government Agency
Insurance Claims Investigations
Investigation of Civil Litigation
Journalistic Endeavors
Law Enforcement
Licensed PI
http://socialsecuritypeoplesearch.com/index.asp
Reportedly Permissible Purposes for
Purchasing an SSN On-line, 3
Locate Former Patients (Medical Industry Only)
Locating Beneficiaries and Heirs
Locating Existing Customers
Locating Former Customers
Locating Former Employees
Locating Fraud Victims
Locating Pension Fund Beneficiaries
Necessary to Complete Transaction
Permission from Subject
Product Recalls
http://socialsecuritypeoplesearch.com/index.asp
Reportedly Permissible Purposes for
Purchasing an SSN On-line, 4
Resolve Customer Disputes
Search on Myself
To give to a Court of Law
Witness and Victim Locating
Asset Identification
Court Related
http://socialsecuritypeoplesearch.com/index.asp
Related Approaches in the Past
Projects related to locating Social Security numbers
on-line:
In 2001, the approach relied on student ID
numbers being SSNs.
In 2002, the approach relied on
student-provided information.
Seth Mandel’s Approach
in this Course in 2001
Strategy: Recognizing that the student ID number at the
University is the SSN, Seth mined course web sites in
which student grades were posted using part of the
students’ SSNs (the last 6 digits).
He then cross-referenced the students listed as being in the
course with their web pages to get each hometown, thereby
inferring the first 3 digits!
Example from CMU in 2003, using last 4 digits
Maksim Tsvetovat’s Approach
in this Course in 2002
Strategy: On-line resumes often include a Social
Security number. So, go to an informal job discussion
site in which resumes are exchanged or a repository is
found, and locate all the SSNs, along with the name and
address, which are also typically included.
Results: he found one job bank repository that had
hundreds of resumes containing SSNs along with
names and addresses! Very few included date of
birth. None included mother’s maiden name.
Job Banks are On-line with Resumes
Listing {SSN, name, address}
... Welcome to Maryland's Job Bank! ... Are You Looking For
Dream Job. ... Search for jobs nationwide,
and by creating a resume, thousands of employers across the
nation ...
www.ajb.dni.us/md/ - 29k
NationalJobBank.com - Post your jobs or resume for FREE!
... The National Job Bank is a web-site developed
specifically for job seekers, employers ... We
encourage you to post your resume, post a job listing or
contact ...
www.nationaljobbank.com/ - 16k - Sep 9, 2003
Google: resume ssn site:.edu 1
[DOC]RESUME
File Format: Microsoft Word 2000 - View as HTML
RESUME. RICHARD ALLEN BROWN. Richard Allen
Brown. PO Box 782. Kayenta, AZ 86033.
Home Telephone-520-697-3513. NAU Telephone-520-5234099. DOB: 03-10-77. SSN: 527-71 ...
dana.ucc.nau.edu/~rab39/RAB%20Resume.doc
Many found. One is shown above.
But the actual resumes are amidst lots of
non-resume pages!
Google: resume ssn site:.edu 2
resume
... 2843. DOB: 10-10-48 New Britain, CT 06050-4010. F:
(860) 832-3753.
SSN: 461-84-… H: (203) 740-7255 C: (203) 561-8674.
Education. Ph. ...
www.math.ccsu.edu/vaden-goad/resume.htm
A second example.
Google: resume ssn site:.edu 3
Scot Lytle's Resume
Scot Patrick Lytle. Home: (301)-249-5330 2116 Blaz Court
School: (410)-455-1662
Upper Marlboro, MD 20772 SSN: 578-90-…. OBJECTIVE.
... userpages.umbc.edu/~slytle1/resume.html
We emailed warnings to these people that
this is not a good practice!
One claimed to have been the victim of
identity theft recently.
Basic Information Necessary
For a Credit Card Application
Done:
• Name
• Social Security number
• Address
Next:
• Date of birth
Remaining:
• Mother’s maiden name
Google: resume ssn site:.edu 1
This on-line resume, located earlier,
actually listed date of birth too!
Google: resume ssn site:.edu 2
This on-line resume, found earlier, also
listed date of birth!
Google: resume ssn site:.edu 3
The third resume did not have his DOB
listed.
anybirthday.com: given a name, provides a birthday.
The search had several hits matching the name, but
only one in his ZIP.
Finding Dates of Birth
Anybirthday.com tends to have information on
people over the age of 30. Younger people are
often not included.
Many other population registers can be used,
such as voter lists. Anybirthday.com is not the
only source!
Basic Information Necessary
For a Credit Card Application
Done:
• Name
• Social Security number
• Address
• Date of birth
Next:
• Mother’s maiden name
Publicly Available Birth Records
Many states, though not all, consider birth records
(the kind of information included on a person’s
birth certificate in the United States) to be publicly
available information.
A few states have gone further to provide this
information on-line.
In the United States, birth certificate information
tends to include the mother’s maiden name!
California on-line Birth Records
Results of search on ‘Jones’
Source: http://www.vitalsearch-ca.com/gen/_nonmembers/ca/_vitals/cabirths-nopsm.htm
Basic Information Necessary
For a Credit Card Application
Done:
• Name
• Social Security number
• Address
• Date of birth
• Mother’s maiden name
Resulting Concern
Done:
• Name
• Social Security number
• Address
• Date of birth
• Mother’s maiden name
Thousands of people are at risk!
Even if this is not the current means accounting
for the bulk of fraud related to new credit card
accounts, this is clearly a very serious and
growing threat!
Identity Angel –resumes
1. Locate on-line resumes
(using Filtered Searching)
2. Extract sensitive values
(using regular expressions)
3. Email subjects about their risks
L. Sweeney. AI Technologies to Defeat Identity Theft Vulnerabilities. AAAI Spring
Symposium on AI Technologies for Homeland Security, 2005.
http://privacy.cs.cmu.edu/dataprivacy/projects/idangel/index.html
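Step 2 (extracting sensitive values with regular expressions) might look like the Python sketch below. These patterns are assumptions made for illustration, not the expressions used by Identity Angel. The sample text is fabricated and uses an SSN starting with 000, which is never valid.

```python
import re
from typing import Dict, List

# Illustrative patterns only; real resumes vary far more than this.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
DOB_RE = re.compile(r"\bDOB:\s*(\d{1,2}-\d{1,2}-\d{2,4})")


def extract_sensitive_values(text: str) -> Dict[str, List[str]]:
    """Collect SSN-like, email-like, and labeled date-of-birth strings."""
    return {
        "ssn": SSN_RE.findall(text),
        "email": EMAIL_RE.findall(text),
        "dob": DOB_RE.findall(text),
    }


# Fabricated resume text for the example.
sample = "Jane Doe. DOB: 01-02-70. SSN: 000-00-0000. Email: jane.doe@example.edu"
print(extract_sensitive_values(sample))
```

Matching only dates that follow a "DOB:" label is a deliberate choice in this sketch. It avoids reporting other dates in a resume as dates of birth, at the cost of missing unlabeled ones.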
ID Angel, Sample Resume
[DOC]RESUME
File Format: Microsoft Word 2000 - View as HTML
RESUME. RICHARD ALLEN BROWN. Richard Allen
Brown. PO Box 782. Kayenta, AZ 86033.
Home Telephone-520-697-3513. NAU Telephone-520-5234099. DOB: 03-10-77. SSN: 527-71 ...
dana.ucc.nau.edu/~rab39/RAB%20Resume.doc
100’s found. One is shown above.
But the actual resumes are amidst lots of
non-resume pages!
Identity Angel –resume findings
1. 1000 resume hits on Google, found using
filteredSearch, revealed 150 resumes,
of which 140 (or 93%) had complete 9-digit SSNs.
10 resumes had a partial or invalid SSN, or
some other country’s SSN.
L. Sweeney. AI Technologies to Defeat Identity Theft Vulnerabilities. AAAI Spring
Symposium on AI Technologies for Homeland Security, 2005.
http://privacy.cs.cmu.edu/dataprivacy/projects/idangel/index.html
Identity Angel –resume findings
2. All email addresses (113 of 113 or
100%) were found. The ‘@’ and dot (.)
notation worked well. All dates of
birth (110 of 110 or 100%) were found,
but some dates that were not dates
of birth were incorrectly reported as
such. This happened in 20 cases (but
in only 7 of them was the proper DOB not
also found).
L. Sweeney. AI Technologies to Defeat Identity Theft Vulnerabilities. AAAI Spring
Symposium on AI Technologies for Homeland Security, 2005.
http://privacy.cs.cmu.edu/dataprivacy/projects/idangel/index.html
Identity Angel –resume findings
3. In terms of combinations:
104 (or 69%) resumes
had {SSN, DOB};
105 (or 70%) had {SSN, email},
76 (or 51%) had {SSN, DOB, email}.
L. Sweeney. AI Technologies to Defeat Identity Theft Vulnerabilities. AAAI Spring
Symposium on AI Technologies for Homeland Security, 2005.
http://privacy.cs.cmu.edu/dataprivacy/projects/idangel/index.html
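The percentages for the combinations can be reproduced with a few lines of Python, assuming each is taken over the 150 resumes reported in the first finding.

```python
# Assumption: every percentage is relative to the 150 resumes found.
TOTAL_RESUMES = 150

counts = {
    "{SSN, DOB}": 104,
    "{SSN, email}": 105,
    "{SSN, DOB, email}": 76,
}


def percent_of_total(count: int, total: int = TOTAL_RESUMES) -> int:
    """Percentage of the total, rounded to the nearest whole number."""
    return round(100 * count / total)


for combination, count in counts.items():
    print(f"{combination}: {count} of {TOTAL_RESUMES} = {percent_of_total(count)}%")
```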
Identity Angel –resume findings
4. A single email message was sent to
each of the 105 people having {SSN,
email} alerting them to the risk.
Within a month, 42 (or 55% of all of
DBB) no longer had the information
publicly available.
A year later, 102 (or 68% of all of
DBA) no longer had the information
available.
L. Sweeney. AI Technologies to Defeat Identity Theft Vulnerabilities. AAAI Spring
Symposium on AI Technologies for Homeland Security, 2005.
http://privacy.cs.cmu.edu/dataprivacy/projects/idangel/index.html
Lab Activity:
Locating an SSN at Harvard.edu
Using Google, search for on-line resumes
containing SSNs and dates of birth.
The first one I found was for a Harvard
Professor.
Let’s find his email and send him a message,
advising him to remove his SSN from his
on-line resume.
Lab Activity:
Solving the Problem (?)
Here is a proposed quick fix.
Please review this proposal and tell me what
problems, if any, you think it may fix.
Proposal:
Instead of assigning SSNs using the
structured numbering scheme, have a
central repository that randomly assigns
numbers.
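To make the proposal concrete for discussion, here is a minimal Python sketch of a repository that hands out unused numbers at random. The class name `RandomNumberRepository` is made up for this example. The sketch ignores reserved ranges and record keeping, and it is not a description of any real system.

```python
import secrets


class RandomNumberRepository:
    """Hypothetical central repository that issues unused random numbers."""

    def __init__(self, digits: int = 9) -> None:
        self._digits = digits
        self._space = 10 ** digits  # how many distinct numbers exist
        self._issued = set()        # numbers already handed out

    def issue(self) -> str:
        """Return a random number that has not been issued before."""
        if len(self._issued) == self._space:
            raise RuntimeError("all numbers have been issued")
        while True:
            candidate = secrets.randbelow(self._space)
            if candidate not in self._issued:
                self._issued.add(candidate)
                return f"{candidate:0{self._digits}d}"


repository = RandomNumberRepository()
number = repository.issue()
print(len(number))  # always 9 digits, zero-padded
```

Numbers issued this way carry no information about place or time of issuance. Whether that fixes the other problems raised in this lecture is the question for the lab.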