Big Data Promise - Cloud Security Alliance

Big Data Working Group Session
Praveen Murthy, Fujitsu Labs of America
Copyright © 2011 Cloud Security Alliance
www.cloudsecurityalliance.org
The ‘freshman’ of the CSA working groups 
Lots of press & attention
Leadership team:
Chair - Sree Rajan, Fujitsu
Co-chair - Neel Sundaresan, eBay
Co-chair - Wilco van Ginkel, Verizon
Big Data Working Group (70+ members)
1: Data analytics for security
2: Privacy preserving/enhancing technologies
3: Big data-scale crypto
4: Big data infrastructures' attack surface analysis and reduction
5: Policy and governance
6: Legal issues
7: Top 10
8: Framework and taxonomy
https://basecamp.com/1825565/projects/511355-big-data-working
Lead to crystallization of best practices for security and privacy in big data
Support industry and government on adoption of best practices
Establish liaisons with other organizations in order to coordinate the development of big data security and privacy standards
Accelerate the adoption of novel research aimed at addressing security and privacy issues
Identify scalable techniques for data-centric security and privacy problems
Top 10 Big Data Security & Privacy Challenges developed for CSA Congress
Big Data Top-10
1) Secure computations in distributed programming frameworks
2) Security best practices for non-relational data stores
3) Secure data storage and transactions logs
4) End-point input validation/filtering
5) Real-time security/compliance monitoring
6) Scalable and composable privacy-preserving analytics
7) Crypto-enforced access control and secure communication
8) Granular access control
9) Granular audits
10) Data provenance
CSA Big Data Working Group Site
https://cloudsecurityalliance.org/research/big-data/
CSA, Big Data LinkedIn
http://www.linkedin.com/groups?home=&gid=4458215&trk=anet_ug_hm
Basecamp Project Collaboration Site Request Form
https://cloudsecurityalliance.org/research/basecamp/
For any questions, remarks, or feedback, please contact any of:
Who | How
Sreeranga (Sree) Rajan (Fujitsu) | sree@us.fujitsu.com
Neel Sundaresan (eBay) | nsundaresan@ebay.com
Wilco van Ginkel (Verizon) | wilco.vanginkel@verizon.com
Help Us Secure Cloud Computing
www.cloudsecurityalliance.org
info@cloudsecurityalliance.org
LinkedIn: www.linkedin.com/groups?gid=1864210
Twitter: @cloudsa, @CSAResearchGuy
Alvaro Cardenas, UT Dallas
Pratyusa Manadhata, HP Labs
Create a reference architecture that teaches new users how big data analytics can be used for security (might include tutorials?)
Explain what is new compared to traditional continuous monitoring approaches
Crystallize best practices on big data analytics
Identify big data analytics problems and technologies that can be standardized
Identify gaps where new research and best practices are needed
Intrusion Detection Systems (1990)
Network flows, Host Intrusion Detection logs, etc.
Security Information and Event Management (SIEM) (mid-2000)
Alarm Correlation
Big Data Security/Analytics (now)
Variety of Data, Security Intelligence
Big Data Promise
Traditional systems | Big data
More rigid, predefined schemas | Structured and unstructured data treated seamlessly
Data gets deleted | Keep data for historical correlation (e.g., 10 years)
Complex analyst queries take long to complete | Faster query response times
Hadoop is the de facto open standard for big data at rest
Big Data
• Cyber-data
• Logs, events, network flows, user id. & activity, etc.
Analytics
• Models, baselining
• Feature extraction
• Anomaly detection
• Context (external sources of information)
Dashboard
• Security analyst (human) looks at indicators
• Correlates with external sources of info to detect attacks
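The baselining and anomaly-detection stage of this pipeline can be illustrated with a minimal sketch. It assumes a single numeric feature (a hypothetical "failed logins per hour" count for one user id) and a simple z-score rule; the function names, the sample data and the threshold of 3 standard deviations are illustrative assumptions, not something the working group prescribes.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize historical feature values as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        # A flat baseline: anything different from it is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical feature: failed logins per hour for one user id.
history = [2, 3, 1, 4, 2, 3, 2, 3]
baseline = build_baseline(history)
flags = [is_anomalous(v, baseline) for v in (3, 40)]
```

In practice the flagged indicators would go to the dashboard, where the analyst correlates them with external context before calling something an attack.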
In 2011 >60% of respondents installed tools to gain a better
view of what is on their network
McAfee Risk & Compliance Outlook 2012
Examples:
Database Activity Monitoring (DAM)
Monitors administrator activity, unusual database reads/updates, event
aggregation, correlation and reporting
Identity Access Management
Risk-Management control room
Security Information and Event Management (SIEM)
Vulnerability Assessment Tools
Currently on internal review
Target: Open to external review Q1 2013
Target: Final report published H1 2013
Main thrust: the centrality of data analytics for
combating APTs
Contributions to the report (so far) by
Symantec, AT&T, EMC, RSA, HP, IBM, Fujitsu,
University of Luxembourg, University of Texas at
Dallas.
Henry St. Andre, inContact
Privacy Enhancing Technologies:
PET is a term for a set of computer tools
and applications which when integrated
with online services allow online users to
protect the privacy of their personally
identifiable information.
The PET Team or PETT seeks to address
significant problems around PET
Need volunteers!
Reach out on basecamp, or send email to
sree.rajan@us.fujitsu.com
Basecamp Project Collaboration Site
Request Form
https://cloudsecurityalliance.org/research/basecamp/
Arnab Roy, Fujitsu Labs
1. Communication protocols
2. Data-centric security
3. Big data privacy
4. Key management
5. Data integrity and poisoning concerns
6. Searching / filtering encrypted data
7. Secure data collection/aggregation
8. Secure collaboration
9. Proof of data storage
10. Secure outsourcing of computation
[Diagram: Encrypter (PK), Filtering Token, Decrypter (SK)]
“Conjunctive, subset, and range queries on encrypted
data” by Dan Boneh and Brent Waters, 2007
How to make collection of data private as well as authenticated?
Can verify signature came from a group member
Cannot infer which member
In case of dispute, a trusted third party can trace the signature to an individual
The technical problem is to make group
signatures efficient and short
“Short Group Signatures” by Boneh, Boyen
and Shacham, 2004
Private Searching on Streaming Data
Ostrovsky and Skeith, CRYPTO 2005
Problem Scenario:
The intelligence gathering community needs to collect a useful subset of huge
streaming sources of data
The criteria for being useful may be classified – private criteria
Most of the streaming data is useless and storing it all may be impractical –
filter at source
How do we keep the filtering criteria secret even if it is executing at the
source?
Solution: Obfuscate the filtration code
Even if the source falls into enemy hands, it cannot figure out the criteria
[Diagram: secret criteria are obfuscated into a garbled filter; the garbled filter runs in the cloud over blogs, net traffic and news feeds and outputs encrypted filtered data, which is decrypted to recover the filtered data]
Computing on Authenticated Data
A signature scheme such that it is possible to derive signatures on
“related” data from a signature on the original document
For example, deriving signatures on a redacted version of a
document, without knowing the signing key
“Computing on Authenticated Data” by Jae Hyun Ahn, Dan Boneh,
Jan Camenisch, Susan Hohenberger, abhi shelat and Brent Waters.
Proof of data storage:
Client: N = pq, f = F mod φ(N)
Client to server: file F; N
Challenge: random g
Server response: g^F mod N
Client check: g^f = g^F mod N
“PORs: Proofs of Retrievability for Large Files” by Juels and Kaliski
“Compact Proofs of Retrievability” by Shacham and Waters
Problem Scenario:
A “weak client” wants to outsource a computation
The provider returns the result along with a “proof” that the computation
was carried out correctly
Catch: verification of the proof should require substantially less
computational effort than computing the result from scratch
References:
“Non-Interactive Verifiable Computing: Outsourcing Computation to
Untrusted Workers” by Rosario Gennaro, Craig Gentry and Bryan Parno.
“Fully Homomorphic Message Authenticators” by Rosario Gennaro and
Daniel Wichs.
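The "catch" above, that checking a result must be much cheaper than recomputing it, can be illustrated with a far simpler classical technique than the cited schemes: Freivalds' randomized check for an outsourced matrix product. This is a swapped-in illustration, not the Gennaro-Gentry-Parno or Gennaro-Wichs construction; it is probabilistic, needs no proof from the provider, and covers only matrix multiplication. The function names and the default of 20 rounds are assumptions.

```python
import random


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def freivalds_check(a, b, c, rounds=20, rng=random):
    """Probabilistically check the claim A x B == C.

    Each round costs three matrix-vector products instead of a full
    matrix-matrix product. A wrong C survives one round with probability
    at most 1/2, so it survives all rounds with probability at most 2**-rounds.
    """
    width = len(c[0])
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(width)]
        if mat_vec(a, mat_vec(b, r)) != mat_vec(c, r):
            return False
    return True


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
claimed = [[19, 22], [43, 50]]  # result returned by the untrusted provider
accepted = freivalds_check(A, B, claimed)
```

A correct product is always accepted; an incorrect one is rejected except with small probability, which the weak client can drive down by adding rounds.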
[Diagram: Functional encryption: identity-based encryption, attribute-based encryption, and richer policies (disjunction, conjunction, polynomials, threshold, predicates)]
“Predicate Encryption Supporting Disjunctions,
Polynomial Equations, and Inner Products” - Jonathan
Katz, Amit Sahai and Brent Waters.
We invite you to participate in the WG.
Contact:
Initiative Lead: Arnab Roy, Fujitsu Labs of America
Email: aroy@us.fujitsu.com
Many thanks to Dan Boneh, Mihai
Christodorescu and Roy P. D’Souza for
discussion on this topic
Praveen Murthy, Fujitsu Labs
Bryan Payne, Nebula
Jesus Molina, Molina Consulting
Provide analytical security metrics for Big Data
infrastructure
Hadoop, OpenStack, …
Analyze attack surface
Idea is to be able to do differential analysis to determine
how attack surface changes with various configurations
Seed ideas and prototypes in this BDWG initiative in an
open, transparent, architecture/brand neutral/agnostic
manner
Crowd-source for improving and standardizing metrics
Big Data and Virtualization
infrastructure are being hit
from inside due to
Advanced Targeted Attacks
(Spear Phishing)
Explore attack surface for
these infrastructures for
different configurations
using open source
implementations:
OpenStack - Hadoop
Create a highly reconfigurable
implementation of the distributed
infrastructure in a public cloud
(OpenStack cloud in the cloud, Hadoop
in the cloud)
Evaluate attack surface for each
configuration, evaluate open attack
vectors
Usage: testbed [command] [options] [testbed-name]
command | Description
start | Start instances in a testbed
stop | Stop instances in a testbed
destroy | Destroy instances in a testbed
list | List instances in a testbed
describe | Describe testbed configuration (IP, VPC, etc.)
create | Create a new testbed
configure | Configure an existing testbed
ssh | SSH to the controller nodes
surface | Creates surface attack node
E.g.: testbed create 5 –config openstack.conf openstacktestbed
Creates a VPC for the testbed in AWS cloud
[Diagram: AWS Cloud containing a Virtual Private Cloud (10.0.0.0/24) with an Elastic IP]
The attack surface is an enumeration of elements
that could be utilized by an adversary to infiltrate
the system
Attack surface can be utilized as a security metric,
and also to understand the possible attack vectors
and reduce their risk.
Currently evaluating three dimensions of the
surface
1. Enablers (open processes, files, …)
2. Communication within the distributed system (exposed ports, protocols, …)
3. Access rights
Howard, Michael, Jon Pincus, and Jeannette Wing. "Measuring
relative attack surfaces." Computer Security in the 21st
Century (2005): 109-137.
Create “snapshots” of system state – Windows Attack Surface Tool
[Diagram: nodes in the baseline, configuration 1 and configuration 2 each produce an attack surface report; the reports are compared to produce a difference report]
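The differential analysis between a baseline and a configuration can be sketched as a set difference per dimension. The sketch assumes each snapshot is reduced to three sets of strings, one for each dimension named earlier (enablers, communication channels, access rights); the `Snapshot` type, the field names and the sample entries are hypothetical and are not the output format of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One attack surface report, reduced to three sets of element names."""
    enablers: frozenset = frozenset()
    channels: frozenset = frozenset()
    access_rights: frozenset = frozenset()


def diff(baseline, config):
    """Return, per dimension, what the configuration adds to or removes from the baseline."""
    report = {}
    for dim in ("enablers", "channels", "access_rights"):
        before, after = getattr(baseline, dim), getattr(config, dim)
        report[dim] = {
            "added": sorted(after - before),
            "removed": sorted(before - after),
        }
    return report


# Hypothetical snapshots of one node before and after enabling a Hadoop service.
baseline = Snapshot(
    enablers=frozenset({"sshd"}),
    channels=frozenset({"tcp/22"}),
    access_rights=frozenset({"root:/etc"}),
)
config_1 = Snapshot(
    enablers=frozenset({"sshd", "namenode"}),
    channels=frozenset({"tcp/22", "tcp/50070"}),
    access_rights=frozenset({"root:/etc"}),
)
difference_report = diff(baseline, config_1)
```

Counting the "added" entries per dimension gives a crude relative metric; weighting the entries is a separate design question.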
• Initial kickoff in Fall 2012 (OpenStack Summit in Oct 2012)
• Broad industry support and collaboration
• High interest expressed from participants at OpenStack summit in Fall 2012
• Initial work underway
• Aiming for v1 in Spring / Summer 2013
• Start small, grow with community involvement for future versions
• Attack Surface Modeling can help direct this security guide, providing a scientific basis for specific security recommendations.
We invite you to participate in the WG.
Contact:
Initiative Leads: Praveen Murthy, Fujitsu Labs of
America, Bryan Payne, Nebula
Email: praveen.murthy@us.fujitsu.com
Srinivas Jaini, Kinetic Networks Inc.,
Pratyusa K. Manadhata, HP
Sarah Hendrickson, Dell
Data analytics for security
Privacy preserving/enhancing technologies
Big data-scale crypto
Cloud Attack Surface Reduction
Policy and Governance: To address data governance challenges and contribute to development of standards in the areas of security and governance in big data technologies.
Framework and Taxonomy: Define Big Data Framework & Taxonomy to (i) get a common understanding of Big Data terms & definitions and (ii) act as a structure to which all the Big Data Initiatives can be linked.
(Policy and Governance and Framework and Taxonomy are two separate initiatives now, but may become one.)
Top 10
Legal Issues
Initiative 5: Policy and Governance is looking
for volunteers to join.
Kickoff meeting planned for the second week of March.
Interested volunteers are encouraged to sign up on
basecamp.
https://cloudsecurityalliance.org/research/basecamp/
Send email to Srinivas Jaini [srinivasjaini@gmail.com]
Sree Rajan, Fujitsu Labs
Arnab Roy, Fujitsu Labs
Alvaro Cardenas, UT Dallas
Jesus Molina, Molina Consulting
Praveen Murthy, Fujitsu Labs
Wilco van Ginkel, Verizon
Neel Sundaresan, eBay
Pratyusa Manadhata, HP Labs
Shiju Sathyadevan, Amrita University
Rongxing Lu, University of Waterloo
Adam Fuchs, Sqrrl
Yu Chen, SUNY Binghamton
Alan Lane, Securosis
Top 10 Challenges Identified by CSA BDWG
1) Secure computations in distributed programming frameworks
2) Security best practices for non-relational datastores
3) Secure data storage and transactions logs
4) End-point input validation/filtering
5) Real time security monitoring
6) Scalable and composable privacy-preserving data mining and analytics
7) Cryptographically enforced access control and secure communication
8) Granular access control
9) Granular audits
10) Data provenance
[Figure: the challenge numbers are mapped onto parts of the big data ecosystem, including data storage and public/private/hybrid cloud; the groupings that survive extraction are 4, 8, 9; 1, 3, 5, 6, 7, 8, 9, 10; 10; 4, 10; 2, 3, 5, 8, 9; and 5, 7, 8, 9]
Copyright 2013 FUJITSU LIMITED
A security and privacy challenge typically has three dimensions of
difficulty:
Modeling: formalizing a threat model that covers most of the cyber-attack or data-leakage
scenarios
Analysis: finding tractable solutions based on the threat model
Implementation: implementing the solution in existing infrastructures.
Followed a three-step process to arrive at top challenges in big data:
Interviewed Cloud Security Alliance members and surveyed security-practitioner oriented
trade journals to draft an initial list of high priority security and privacy problems
Studied published solutions
Characterized a problem as a challenge if the proposed solution does not cover the problem
scenarios.
Secure Computation in Distributed Programming Frameworks
How do we secure distributed frameworks which exploit parallelism in computation and storage?
Threats/Challenges | Current Mitigations
Malfunctioning compute worker nodes | Trust establishment: initiation, periodic trust update
Access to sensitive data | Mandatory access control
Privacy of output information | Privacy preserving transformations
Security Best Practices for Non Relational Data Stores
How do we secure non-relational data stores which were not built with security in mind?
Threats/Challenges | Current Mitigations
Lack of stringent authentication and authorization mechanisms | Enforcement through middleware layer; passwords should never be held in clear; encrypted data at rest
Lack of secure communication between compute nodes | Protect communication using SSL/TLS
Secure data storage and transaction logs
How do we secure infrastructure for big data storage management?
Threats/Challenges | Current Mitigations
Data Confidentiality and Integrity | Encryption and Signatures
Availability | Proof of data possession
Consistency | Periodic audit and hash chains
Collusion | Policy based encryption
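The "periodic audit and hash chains" mitigation for transaction logs can be sketched as follows. Each link is the SHA-256 digest of the previous link concatenated with the current log entry, so changing, dropping or reordering an entry changes every later link. The all-zero genesis value, the function names and the sample entries are assumptions made for illustration.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed starting value for the chain


def chain(entries):
    """Return the list of chained SHA-256 digests, one per log entry."""
    links, prev = [], GENESIS
    for entry in entries:
        prev = hashlib.sha256(prev + entry.encode("utf-8")).digest()
        links.append(prev)
    return links


def audit(entries, links):
    """Periodic audit: recompute the chain and compare with the stored links."""
    return chain(entries) == links


log = ["put k1 v1", "put k2 v2", "delete k1"]
stored_links = chain(log)          # kept by the auditor, apart from the log
intact = audit(log, stored_links)

tampered = ["put k1 v1", "put k2 EVIL", "delete k1"]
detected = not audit(tampered, stored_links)
```

Storing only the latest link with an independent party is enough to detect later tampering with any earlier entry, at the cost of not knowing which entry changed.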
End-point Input Validation / Filtering
How can we trust data that is coming in from diverse endpoints like sensors, devices and applications?
Threats/Challenges | Current Mitigations
Adversary may tamper with device or software | Tamper-proof Software
Adversary may clone fake devices | Trust Certificate and Trusted Devices
Adversary may directly control source of data | Analytics to detect outliers
Adversary may compromise data in transmission | Cryptographic Protocols
Real-time Security Monitoring
How do we leverage big data analytics to help improve the security of systems?
Threats/Challenges | Current Mitigations
Security of the infrastructure | Discussed before
Security of the monitoring code itself | Secure coding practices
Security of the input sources | Discussed before
Adversary may cause data poisoning | Analytics to detect outliers
Scalable and Composable Privacy-Preserving Data Mining and Analytics
How do we leverage big data analytics to help improve the security of systems?
Threats/Challenges | Current Mitigations
Exploiting vulnerability at host | Encryption of data at rest, access control and authorization mechanisms
Insider threat | Separation of duty principles, clear policy for logging access to datasets
Outsourcing analytics to untrusted partners; unintended leakage through sharing of data | Awareness of re-identification issues, differential privacy
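Differential privacy, listed among the mitigations above, can be illustrated with the standard Laplace mechanism applied to a counting query. The sketch assumes a count has sensitivity 1 (adding or removing one record changes it by at most 1), so noise drawn from Laplace(0, 1/ε) is added; the function names and the sample records are hypothetical.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=random):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # A counting query has sensitivity 1, so the noise scale is 1/epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical dataset shared with an analytics partner.
records = [{"age": 30}, {"age": 70}, {"age": 65}]
noisy = private_count(records, lambda r: r["age"] >= 65, epsilon=0.5)
```

Smaller values of ε add more noise and give stronger privacy; repeated queries consume the privacy budget, which is where the composability concern in the slide title comes in.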
Cryptographically Enforced Data Centric Security
How do we enforce the protection of data end to end?
Threats/Challenges | Current Mitigations
Enforcing access control | Identity and Attribute-based encryptions
Search and filter | Encryption techniques supporting search and filter
Outsourcing of computation | Fully Homomorphic Encryption
Integrity of data and preservation of anonymity | Group signatures with trusted third parties
Granular Access Control
How do we control access to diverse datasets?
Threats/Challenges | Current Mitigations
Keeping track of secrecy requirements of individual data elements | Pick right level of granularity: row level, column level, cell level
Maintaining access labels across analytical transformations | At the minimum, conform to lattice of access restrictions. More sophisticated data transforms are being considered in active research
Keeping track of roles and authorities of users | Authentication, authorization, mandatory access control
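Maintaining access labels across a transformation "at the minimum" by conforming to a lattice can be sketched with cell-level labels. The sketch assumes the simplest possible lattice, a totally ordered chain of three levels, where the label of a derived value is the join (least upper bound) of the labels of its inputs; the level names and function names are hypothetical.

```python
# Assumed lattice: a simple chain public < internal < secret.
LEVELS = {"public": 0, "internal": 1, "secret": 2}


def join(*labels):
    """Least upper bound of the given labels in the chain."""
    return max(labels, key=LEVELS.__getitem__)


def can_read(clearance, label):
    """A user may read a cell only if their clearance dominates its label."""
    return LEVELS[clearance] >= LEVELS[label]


def labeled_sum(cells):
    """Aggregate (value, label) cells; the result carries the join of all input labels."""
    total = sum(value for value, _ in cells)
    return total, join(*(label for _, label in cells))


cells = [(120, "public"), (80, "internal"), (45, "secret")]
total, total_label = labeled_sum(cells)
analyst_allowed = can_read("internal", total_label)
```

This conservative rule never under-labels a derived value, but it can over-restrict aggregates, which is the gap the more sophisticated transforms mentioned above try to close.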
Granular Audits
How do we audit diverse and distributed systems?
Threats/Challenges:
Completeness of audit information
Timely access to audit information
Integrity of audit information
Authorized access to audit information
Current Mitigations:
Infrastructure solutions as discussed before.
Scaling of SIEM tools.
Data Provenance
How do we keep track of complex metadata?
Threats/Challenges | Current Mitigations
Secure collection of data | Authentication techniques
Consistency of data and metadata | Message digests
Insider threats | Access Control through systems and cryptography
Vivian Tero, Governance Risk & Compliance (GRC)
Progress in the regulatory and legal track
has been slow
Difficult to find legal counsels conversant and willing
to discuss corporate compliance/risk management
activities for their big data activities.
Reached out to a couple of regulators (FTC
and CFRB)
Meetings planned within the next 3-4 weeks.
For more info about
CSA CloudBytes: Top Challenges for Big Data
https://cloudsecurityalliance.org/research/big-data/
Most of our Research Projects
are ideas from professionals like
you
Do you have an idea for a
research project on a cloud
security topic?
If so, please take the time to
describe your concept by filling
out our online form. This
form is monitored by the CSA
research team, who will review
your proposal and respond to you
with feedback.
Learn how you can participate in Cloud
Security Alliance's goals to promote the
use of best practices for providing security
assurance within Cloud Computing
http://www.linkedin.com/groups?gid=1864210
https://cloudsecurityalliance.org/get-involved/