Some open questions: aggregation and privacy protection
Coding, information theory and signal processing
Du, Kargupta, Vora
Allowed queries of the form Closure({h_i(x)}), each h_i having a fixed minimum support
Require responses that enable accurate determination of f(x), whose minimum support is typically much larger (>>) than that of the h_i
What is minimal
While preventing determination of g(x)
x or information required for g(x) if g(x) an outlier, i.e. x an outlier
Privacy should be representation-invariant
the “legitimate” activity?
4/11/2020 Poorvi Vora/CS/GWU 2
• Modelling of data relationships
– Entropy/mutual information (often used for discrete-valued data)
– Number of principal components (often used for continuous-valued data)
– Clustering into separate regimes in appropriate representations/codes: Fourier, Walsh, etc.
• Consequent question: optimal source code
– What is a good source code?
– For example, JPEG is a decent, though lossy, compression of images
– Walsh transform: similar band-limitedness notion in terms of decision tree lengths, data points used in a single query
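To make the entropy/mutual-information measure of data relationships concrete, here is a minimal sketch that estimates I(X;Y) from empirical frequencies of discrete-valued columns. The function names and the toy columns x, y, z are hypothetical and chosen only for illustration; they are not from the slides.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Empirical Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Hypothetical toy columns: y is a copy of x, z is independent of x.
x = [0, 0, 1, 1, 0, 0, 1, 1]
y = list(x)
z = [0, 1, 0, 1, 0, 1, 0, 1]

mi_xy = mutual_information(x, y)  # x determines y completely
mi_xz = mutual_information(x, z)  # x says nothing about z
```

In this toy data a column that copies x shares all of its one bit with x, while the independent column shares none, which is the sense in which mutual information models a relationship between attributes.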
What we know: if
– the adversary is allowed any kind of adaptive query,
– the adversary wants error to go to zero, and
– both adversary and database are willing to participate in an unbounded number of queries,
then the minimum cost per bit is bounded below, tightly.
So
– make it expensive to ask a question,
– stop questioning (bound the number of queries),
– make the capacity for allowed patterns high and the capacity for private information low. How?
– Can one limit the adversary in any other ways to obtain anything more interesting? (polynomial time is not enough of a limitation)
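As an illustration of a tight lower bound on cost per bit, the sketch below assumes a specific model that the slide does not name: each answer is flipped independently with probability p, so the database behaves like a binary symmetric channel. Under that assumption the channel coding theorem gives capacity C = 1 - H(p) bits per query, hence at least 1/C queries per private bit if error is to go to zero. The function names are hypothetical.

```python
from math import log2

def binary_entropy(p):
    """H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity, in bits per query, of a binary symmetric channel (assumed model)."""
    return 1.0 - binary_entropy(p)

def min_queries_per_bit(p):
    """Lower bound on queries needed per private bit for vanishing error."""
    c = bsc_capacity(p)
    return float("inf") if c == 0.0 else 1.0 / c

# The cost per bit grows without bound as p approaches 1/2.
costs = {p: min_queries_per_bit(p) for p in (0.0, 0.1, 0.25, 0.4, 0.5)}
```

Raising p makes each question less informative and so more expensive per bit, which is one reading of "make it expensive to ask a question" under this assumed model.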
• Instead of uniform perturbation
• Allow a total of np lies to n binary questions
• What kinds of shapes can the ellipse be made to look like by computationally bounded Alice and Bob?
• What would be preserved? The “average” shape?
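The contrast between uniform perturbation and a budget of np lies can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the budgeted version places its lies at random positions, whereas the model on the slide would let the responder place them anywhere.

```python
import random

def uniform_perturbation(answers, p, rng):
    """Flip each binary answer independently with probability p."""
    return [a ^ 1 if rng.random() < p else a for a in answers]

def budgeted_lies(answers, p, rng):
    """Flip exactly floor(n*p) of the n answers.

    Positions are sampled at random here (an assumption for illustration);
    a strategic responder could choose them freely.
    """
    n = len(answers)
    lie_positions = set(rng.sample(range(n), int(n * p)))
    return [a ^ 1 if i in lie_positions else a for i, a in enumerate(answers)]

def count_lies(truth, reported):
    """Number of positions where the reported answer differs from the truth."""
    return sum(t != r for t, r in zip(truth, reported))

rng = random.Random(0)
truth = [rng.randint(0, 1) for _ in range(100)]
noisy = uniform_perturbation(truth, 0.1, rng)
budgeted = budgeted_lies(truth, 0.1, rng)
```

With uniform perturbation the number of lies is random with mean np; with a budget it is fixed at floor(np), and only the placement of the lies is left open.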