Public ppt - International Institute of Information Technology

Efficient Privacy Preserving Protocols for Visual Computation
Maneesh Upmanyu
IIIT Hyderabad
Advisors: C. V. Jawahar, Anoop M. Namboodiri, Kannan Srinathan
Center for Visual Information Technology
Center for Security, Theory & Algorithmic Research
IIIT- Hyderabad
Security and Privacy of Visual Data
Broad Objective
• Development of secure computational algorithms in computer vision
and related areas.
– To develop “highly-secure” solutions
– To develop “computationally efficient” solutions
– To develop solutions to problems with immediate impact
Project Web-Page: http://cvit.iiit.ac.in/projects/SecureVision
Research Directions
• Private Content Based Image Retrieval (PCBIR)
• Blind Authentication: A Secure Crypto-Biometric Verification Protocol
• Efficient Privacy Preserving Video Surveillance
[Figure: PCBIR protocol with feature vector fquery, root info, and query/answer rounds (Q1, A1), (Q2, A2), ...]
Publication: Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C.V. Jawahar;
Efficient Privacy Preserving Video Surveillance: Proceedings of the 12th International
Conference on Computer Vision (ICCV 2009)
Publication: Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C.V. Jawahar;
Blind Authentication - A Secure Crypto-Biometric Verification Protocol: Appears in IEEE Transactions on Information Forensics and Security (IEEE-TIFS), June 2010
Publication: Shashank J, Kowshik P, Kannan Srinathan and C.V. Jawahar; Private
Content Based Image Retrieval; In Proceedings of Computer Vision and Pattern
Recognition (CVPR 2008)
Our Security Goal
• What is meant by ‘Privacy’?
– Design protocols to limit the information leakage through what is
learned in addition to the designated output.
• What is the ‘Adversary Model’?
– Semi-honest vs. Malicious adversary
• Analysis outline:
– Correctness
– Security
– Complexity
Assumptions
• Reliable and secure communication channel
• Players are passively corrupt, that is, honest but curious.
• Players are computationally bounded.
• Players do not collude.
Thesis Objective
• Traditional approaches use highly interactive protocols.
– Limitation: massive datasets
– Example: Blind Vision
• Paradigm Shift
– Compute directly in encrypted domain.
• Encrypt -> Communicate -> Compute -> Decrypt
– Domain specific encryption schemes.
• PKC is data independent and generic.
– Can the paradigm be generic yet efficient?
Contribution of Thesis
A method that provides provable security while allowing
efficient computation for generic vision algorithms has
remained elusive.
We show that one can exploit certain properties inherent to
visual data to break this seemingly impenetrable barrier.
Dilemma of Privacy vs. Accuracy
What is Blind Authentication?
A biometric authentication protocol that does not
reveal any:
– information about the biometric samples to the
authenticating server.
– information regarding the classifier, employed by the
server, to the user or client
Biometric Authentication System
Primary Concerns in a Biometric System
• Template Protection
• Non-Repudiable
• Network and Client-side Security
• Revocability
Previous Work
“A template protection scheme with provable security and acceptable
recognition performance has thus far remained elusive.”
– A.K. Jain, Eurasip 2008
Homomorphic Encryption
• An encryption scheme in which some algebraic
operation, like addition or multiplication, can be performed
directly on the ciphertext.
Let x1 = 20 and x2 = 22; the goal is to compute x1 + x2 = 42.
Use an encryption scheme, for example E(x) = e^x.
Server stores E(x1) = e^20 and E(x2) = e^22.
Compute using encrypted data:
y = E(x1) · E(x2) = e^20 · e^22 = e^42
Decrypt: z = D(y) = ln(y) = ln(e^42) = 42
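The toy scheme on this slide can be checked numerically. The sketch below is only an illustration of the homomorphic property, using floating-point exponentials; the function names are hypothetical, and E(x) = e^x is of course not a secure encryption.

```python
import math

def encrypt(x):
    # Toy scheme from the slide: E(x) = e^x. Illustrative only, not secure.
    return math.exp(x)

def decrypt(y):
    # D(y) = ln(y) inverts E.
    return math.log(y)

def add_on_ciphertexts(c1, c2):
    # Multiplying ciphertexts adds the plaintexts: e^x1 * e^x2 = e^(x1 + x2).
    return c1 * c2

stored_1, stored_2 = encrypt(20), encrypt(22)   # what the server holds
y = add_on_ciphertexts(stored_1, stored_2)      # computed without decrypting
z = decrypt(y)                                  # approximately 42, up to float error
print(round(z, 6))
```

The server never sees 20 or 22 in this flow; it only multiplies the two stored values.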
User Enrollment
Enrollment based on a trusted third party.
Authentication using a Linear Kernel
Extensions to Kernels & Neural Networks
• A kernel-based classifier uses a kernel discriminating function.
• Similarly, the basic units of a neural network are, for
example, perceptrons or sigmoids.
• Model the above functions as arithmetic circuits consisting of
addition and multiplication gates over a finite domain.
• Consider two encryptions, E+ and E* (homomorphic over addition and multiplication, respectively).
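To make the "arithmetic circuit over a finite domain" idea concrete, here is a minimal sketch of a perceptron-style linear score built only from add and multiply gates modulo a prime. The modulus, the signed encoding, and the weights are assumptions chosen for illustration; the gates act on plain residues here, and the encrypted evaluation itself is not shown.

```python
P = 2_147_483_647  # assumed prime modulus (2^31 - 1) defining the finite domain

def add_gate(a, b):
    return (a + b) % P

def mul_gate(a, b):
    return (a * b) % P

def encode(v):
    # Map a signed integer into the domain [0, P).
    return v % P

def decode(v):
    # Read the upper half of the domain as negative numbers.
    return v - P if v > P // 2 else v

def linear_score(weights, features, bias):
    # w . x + b expressed purely with add and multiply gates.
    acc = encode(bias)
    for w, x in zip(weights, features):
        acc = add_gate(acc, mul_gate(encode(w), encode(x)))
    return decode(acc)

score = linear_score([3, -2, 5], [10, 4, -1], 7)  # 30 - 8 - 5 + 7
```

The result is correct as long as the true score stays well inside (-P/2, P/2), which mirrors the overflow condition discussed later in the talk.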
Implementation and Analysis
• Experiments designed to evaluate the efficiency and
accuracy of proposed approach.
• For evaluation, an SVM-based verifier with a client-server architecture was implemented.
– Accuracy: as no assumptions are made, accuracy remains same.
• Verified this on various public domain (UCI, Statlog) datasets.
Case study shows that matching using fixed length feature representation
is comparable to variable length methods such as dynamic warping.
Security, Privacy and Trust
• Server Security
– Template database security
– Hacker sitting in server
• Client Security
– Hacker has user’s key or biometric
– Passive attacks at client end
• Network Security
– Network is susceptible to snooping attacks
Advantages of Blind Authentication
• Fast and Provably Secure authentication without trading off
accuracy.
• Supports generic classifiers such as Neural Network and
SVMs.
• Useful with wide variety of fixed-length biometric-traits.
• Ideal for applications such as biometric ATMs, login from
public terminals.
Proposed Surveillance System
Plain Video: captured by camera
Encrypted Video: as seen by one of the computational servers
Processed Video: as seen by the computational server
Result Video: received by observer
How do we carry out surveillance
on ‘Randomized’ images ?
Motivation
Can we do surveillance without
‘seeing’ the original video ?
Ability to run video surveillance algorithms,
completely in encrypted domain can address most
privacy concerns.
Existing methods are either too slow for surveillance
applications or do not provide provable privacy.
Paradigm Shift
Traditionally explored paradigms:
• Trusted Third Party (TTP): in practice, we do not have the luxury of a trusted entity.
• Selective Encryption (Smart Camera): no provable privacy; costly and tedious to upgrade.
• Homomorphic Encryption (Doubly): computationally expensive.
• Secure Multiparty Computation (SMC): high level of privacy but highly inefficient; an overkill in practice.
We use the paradigm of secret sharing to achieve private and efficient
surveillance.
Protocol in a nutshell
Propose a ‘Cloud-Computing’ based solution using k>2 non-colluding
servers
Image -> Shatter -> Compute -> Merge -> Result
• The camera splits each captured frame F into k (> 2) shares using a
pixel-level shatter function.
• Each share is then sent to an independent server for processing.
• To carry out a basic operation f on the input image, each server
blindly carries out the equivalent basic operation f’ on its share.
• The results of the operations on the shares are integrated by the observer
using a merge function (CRT) to obtain the final result.
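The shatter, compute, merge flow can be sketched on a tiny frame. This is a simplified illustration, not the protocol itself: the moduli, the frame, and the affine operation are assumptions, and the scaling and randomization that provide privacy are left out so that the data flow stays easy to follow.

```python
from math import prod

MODULI = (251, 257, 263)  # assumed pairwise-coprime moduli, one per server (k = 3)

def shatter(frame):
    # Split a frame (rows of pixel ints) into one residue image per server.
    return [[[p % m for p in row] for row in frame] for m in MODULI]

def server_op(share, m, gain, offset):
    # Equivalent operation f' on one share: gain * pixel + offset, modulo m.
    return [[(gain * p + offset) % m for p in row] for row in share]

def crt(residues):
    # Chinese Remainder Theorem reconstruction for MODULI.
    M = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse via pow (Python 3.8+)
    return total % M

def merge(shares):
    # Observer side: recombine the per-server results pixel by pixel.
    rows, cols = len(shares[0]), len(shares[0][0])
    return [[crt([s[i][j] for s in shares]) for j in range(cols)]
            for i in range(rows)]

frame = [[12, 200], [255, 0]]
shares = shatter(frame)
processed = [server_op(s, m, gain=3, offset=10) for s, m in zip(shares, MODULI)]
result = merge(processed)  # equals 3 * pixel + 10 computed on the plain frame
```

Each server only ever touches its own residue image, and the merged result matches the plain computation as long as every output value stays below the product of the moduli.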
Secret Sharing
• A method of distributing a secret among a group of servers,
such that:
– Each server on its own has no meaningful information
– Secret is reconstructed only when all shares combine together
• Existing methods are highly inefficient
• Asmuth-Bloom overcomes this limitation by working in
Residue Number System (RNS).
Example to do Addition in RNS
RNS (m1 = 37, m2 = 49; M = m1 x m2 = 1813)
Shatter: f(x) = (x.S + η) mod mi
Merge: m(xi, mi) = CRT(xi, mi) / S
X = 973 % (m1, m2), so (x1, x2) = (11, 42)
Y = 678 % (m1, m2), so (y1, y2) = (12, 41)
z1 = (x1 + y1) % m1 = (11 + 12) % 37 = 23
z2 = (x2 + y2) % m2 = (42 + 41) % 49 = 34
Z = CRT(z1, z2) = 1651
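The numbers in this example can be reproduced with a few lines of Python. The helper names are hypothetical; the moduli and operands are the ones on the slide, with no scale factor or randomness applied.

```python
m1, m2 = 37, 49
M = m1 * m2  # 1813

def to_rns(v):
    # Residues of v with respect to (m1, m2).
    return (v % m1, v % m2)

def from_rns(r1, r2):
    # CRT for two coprime moduli.
    M1, M2 = M // m1, M // m2
    return (r1 * M1 * pow(M1, -1, m1) + r2 * M2 * pow(M2, -1, m2)) % M

x = to_rns(973)                                   # (11, 42)
y = to_rns(678)                                   # (12, 41)
z = ((x[0] + y[0]) % m1, (x[1] + y[1]) % m2)      # (23, 34)
Z = from_rns(*z)                                  # 1651 = 973 + 678
```

The addition is carried out independently on each residue, and the sum is recovered correctly because 1651 is smaller than M.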
Data Properties
• General-purpose secure computation appears
inherently complex and is often impractical.
– We show that certain properties of the data can be used to achieve
efficiency while ensuring privacy.
• Following properties are of interest to us.
– Limited and Fixed Range
– Scale Invariant
– Approximate Nature
– Non-General Operands
Characteristics of the System
• Preserve privacy: carry out surveillance on random-looking images.
• Lightweight: the encrypted-domain representation should allow efficient computations.
• Limited data expansion: the obfuscation process should not blow up the video data.
• Secure storage: obfuscation should be provably secure to ensure security at untrusted servers.
• Reconstruction of data: only authorized people should be able to recover the original plain video.
Implementation Challenges
• Representation of negative numbers: Use an Implicit sign
representation.
– Use (0, M/2) as positive and rest as negative.
– Sign conversion is carried out using additive inversion of Z.
• Overflow and Underflow: Operations are valid and correct as
long as range of data is (-M/2, M/2).
• Integer Division and Thresholding: RNS domain is finite and
hence not all divisions are defined.
– Dividing integer A by B is defined as A/B = (ai · bi^-1) mod mi
• Defining Equivalent operations: For every f(x), we need to
define f’(x) such that merging f’(xi) would give f(x).
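The implicit sign representation and the division rule can be sketched as follows. This is an illustrative reading of the bullets above, using the moduli from the addition example; the helper names are hypothetical, and division is only meaningful here when B divides A exactly and each residue of B is invertible.

```python
from math import prod

MODULI = (37, 49)   # coprime moduli taken from the addition example
M = prod(MODULI)    # 1813

def to_rns(value):
    # Python's % returns a non-negative residue, so negative inputs wrap correctly.
    return tuple(value % m for m in MODULI)

def crt(residues):
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M

def from_rns_signed(residues):
    # Implicit sign: merged values above M/2 are read as negative numbers.
    z = crt(residues)
    return z - M if z > M // 2 else z

def negate(residues):
    # Sign conversion through the additive inverse in each residue.
    return tuple((-r) % m for r, m in zip(residues, MODULI))

def add(a, b):
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))

def divide(a, b):
    # A/B as a_i * b_i^-1 mod m_i; pow raises ValueError if b_i is not invertible.
    return tuple((x * pow(y, -1, m)) % m for x, y, m in zip(a, b, MODULI))

difference = from_rns_signed(add(to_rns(3), negate(to_rns(10))))  # 3 - 10
quotient = from_rns_signed(divide(to_rns(120), to_rns(8)))        # 120 / 8
```

Results stay correct only while the true value lies inside (-M/2, M/2), which is the overflow and underflow condition stated above.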
Experimental Results
Properties of the Protocol
• Servers are un-trusted and the
network may be insecure.
• Near loss-less data encoding
(PSNR ~51).
• No compromise in accuracy.
• Inexpensive capture device, and
a unidirectional data flow.
• Negligible overheads to make
private computation practical.
Circumvent theoretical
bounds. Extremely
efficient over SMC
Not only efficient, but
also provably secure
Scalable, inexpensive
and generic, thus
practical
• Secure as long as servers do not
collude.
Our approach shows that privacy and efficiency can co-exist
in the domain of visual data.
K-Means Clustering
• Data clustering is one of the most important techniques for discovery
of patterns in a dataset.
• K-Means clustering is a simple and extensively used technique that
automatically partitions a dataset into k clusters.
• The technique becomes more effective with larger amounts of data, such
as when multiple businesses share their data to carry out the clustering
together.
• However, the data may contain sensitive information.
Secure K-Means Algorithms
• Trusted Third Party (TTP) based solutions
– Dwork et al. (Crypto 2004)
✓ Very efficient
✗ No TTP in the real world; possible security compromise
• Data perturbation techniques
– Stanley et al. (BSD 03), Kargupta et al. (ICDM 03)
✓ Negligible communication overhead
✗ Partial security; non-invertible transformations used
• Those employing multiparty computations
– Vaidya et al. (KDD 03), Jha et al. (ESORICS 05),
Wright et al. (KDD 05), Inan et al. (DKE 07)
✓ Complete privacy
✗ Highly inefficient
Our Distributed Solution
• We simulate TTP on a set of un-trusted servers over an insecure network.
• Secret Sharing is a method of distributing a secret among a
group of servers.
Proposed Protocol
• Protocol consists of two phases
– Phase One: Secure Data Distribution
– Phase Two: Secure K-Means
• Phase One: Secure Storage of data at servers
– Selection of an optimal RNS.
– Shattering of the user’s private data.
Privacy: Server stores only the shattered shares of data.
• Phase Two: Secure K-Means
– Initialization
– Lloyd Step
– Knowledge Revelation
Phase Two: Secure K-Means
• Clusters are initialized using the shattered shares
• Lloyd Step involves iteratively computing the closest
centers in a Euclidean space
– Secure protocols for division and comparison
• Securely evaluate the termination criteria
– Send the shattered cluster centers to the users, who apply the Merge
function to them
• Privacy: No information is leaked to the servers
– Data for operations such as division secured using randomization
– Randomization done so as to secure against possible GCD and
factorization based attacks
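One part of the Lloyd step, the squared Euclidean distance between a point and a center, needs only additions and multiplications, so it can be evaluated share by share. The sketch below illustrates that piece only, under assumed moduli and sample data; scaling and randomization are omitted, and the secure division and comparison protocols mentioned above are not shown.

```python
from math import prod

MODULI = (1009, 1013, 1019)   # assumed pairwise-coprime moduli, one per server
M = prod(MODULI)

def shatter_vector(vec):
    # One residue vector per server.
    return [[v % m for v in vec] for m in MODULI]

def crt(residues):
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M

def squared_distance_share(point_share, center_share, m):
    # Additions and multiplications only, so a server can work on its own share.
    return sum((p - c) * (p - c) for p, c in zip(point_share, center_share)) % m

def squared_distance(point, center):
    p_shares, c_shares = shatter_vector(point), shatter_vector(center)
    d_shares = [squared_distance_share(p, c, m)
                for p, c, m in zip(p_shares, c_shares, MODULI)]
    return crt(d_shares)  # merged here only to check the share-wise result

point = [12, 40, 7]
centers = [[10, 35, 9], [0, 0, 0]]
distances = [squared_distance(point, c) for c in centers]
```

In the protocol the distances would stay in shared form and be compared securely; they are merged here only so the arithmetic can be verified against the plain computation.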
Overview of the Protocol
[Figure: protocol overview showing User 1 and User 2]
Analysis
• Overheads calculated over the naïve TTP based protocol.
• Division and Comparison operations introduce
communication overhead.
– Limited to one round per operation
• Traditional approaches use SMC for this.
– Based on OT, a communication-intensive protocol.
– O(n^2) communication overhead to multiply two vectors (length n)
• Limited data expansion
– E.g., 32-bit data shattered into 5 shares requires 54 bits, while
traditional secret sharing requires 160 bits.
Algorithm Properties
• We have proposed a highly secure framework using
paradigm of secret sharing.
• Negligible overheads in simulating algebraic operations.
• Achieve efficiency by exploiting the data properties.
• Solution does not demand any trust and the clustering is
carried out directly on the encrypted data.
Conclusion
Broad Objective
• Development of secure computational algorithms in computer vision
and related areas.
– To develop “highly-secure” solutions
– To develop “computationally efficient” solutions
– To develop solutions to problems with immediate impact
• The traditional methods of ensuring privacy are
expensive in both communication and computation.
• We show that domain specific knowledge can be
incorporated to ensure efficiency while retaining privacy.
• Moreover, our methods do not trade off accuracy.
Related Publications
Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C.V. Jawahar;
“Blind Authentication - A Secure Crypto-Biometric Verification Protocol”
In IEEE-Transactions on Information Forensics and Security
(IEEE-TIFS, June 2010)
“Efficient Biometric Verification in Encrypted Domain”
In Proceedings of 3rd International Conference on Biometrics
(ICB 2009)
“Efficient Privacy Preserving Video Surveillance”
Proceedings of the 12th International Conference on Computer Vision
(ICCV 2009)
“Efficient Privacy Preserving K-Means Clustering”
Proceedings of the Pacific Asia Workshop on Intelligence and Security Informatics
(PAISI 2010)
Thank you
for your attention
RNS & CRT
• A Residue Number System (RNS) represents an integer using a set of
smaller integers.
– RNS is defined by a set of k integer constants. {m1, m2, m3, …, mk}
– Secret A is represented by k smaller integers. {a1, a2, a3, …, ak} where
ai = A modulo mi
– This representation is valid as long as 0 ≤ A < M, where M is the LCM of the mi’s
• Chinese Remainder Theorem (CRT) is the method of
recovering the integer value from a given set of smaller
integers.
– Define Mi = M / mi
– Compute ci = Mi x (Mi^-1 mod mi)
– The above equation is always valid in our system, therefore unique solution
exists
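The CRT recovery can be sketched directly from the coefficients ci defined above, using the standard reconstruction A = (a1 c1 + ... + ak ck) mod M. The moduli below are an assumed example and must be pairwise coprime, so that M is simply their product; the function names are hypothetical.

```python
from math import prod

def crt_coefficients(moduli):
    # c_i = M_i * (M_i^-1 mod m_i), with M_i = M / m_i.
    M = prod(moduli)
    coeffs = [(M // m) * pow(M // m, -1, m) for m in moduli]
    return coeffs, M

def reconstruct(residues, moduli):
    # A = (sum of a_i * c_i) mod M recovers the original integer.
    coeffs, M = crt_coefficients(moduli)
    return sum(a * c for a, c in zip(residues, coeffs)) % M

moduli = [5, 7, 9]                    # assumed pairwise-coprime example, M = 315
secret = 200
residues = [secret % m for m in moduli]
recovered = reconstruct(residues, moduli)
```

Each coefficient is 1 modulo its own modulus and 0 modulo the others, which is why the weighted sum reproduces every residue at once.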
Shatter & Merge Functions
• Shatter function: Compute and store the secret shares
of the private data.
– xi = (x.S + η) mod Pi, where xi is the ith secret share and η is uniform randomness
• Merge function: Reconstruct the secret.
– Given the shares xi for different primes Pi’s, the secret is
recovered using CRT: x = CRT(xi, Pi) / S
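A minimal sketch of scaled, randomized shatter and merge functions follows. It rests on several assumptions that the slides do not spell out: the primes and scale factor S are arbitrary choices, the same random η is used for every share of a value, and η is kept below S/4 so that integer division by S removes the noise even after a few additions.

```python
import random
from math import prod

PRIMES = (251, 257, 263)   # assumed primes, one per server
S = 1000                   # assumed scale factor
M = prod(PRIMES)

def shatter(x):
    # x_i = (x * S + eta) mod P_i, with one eta shared by all shares of x (assumption).
    eta = random.randrange(S // 4)
    return [(x * S + eta) % p for p in PRIMES]

def crt(residues):
    total = 0
    for r, p in zip(residues, PRIMES):
        Mi = M // p
        total += r * Mi * pow(Mi, -1, p)
    return total % M

def merge(shares):
    # CRT recovers x * S + noise; integer division by S drops the noise.
    return crt(shares) // S

def add_shares(a, b):
    # Share-wise addition, as a server would perform on its own share.
    return [(u + v) % p for u, v, p in zip(a, b, PRIMES)]

shares_a, shares_b = shatter(100), shatter(55)
total = merge(add_shares(shares_a, shares_b))   # 155 regardless of the random eta
```

The noise bound is what keeps the result exact here: two additions of noise stay below S, so the floor division still returns the true sum.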