Workshop in Information Security – Distributed Databases Project
By: Ilia Oshmiansky, Ainat Chervin and Yosi Barad
Project Plan
Background:
The past few years introduced us to large scale databases that are distributed throughout multiple
machines. Our project discusses the security issues that arise with this new database mechanism,
specifically how additional security comes at a price in performance.
The two databases we will run our tests on are:
- Cassandra:
Apache Cassandra is an open source distributed database management system.
It is an Apache Software Foundation top-level project designed to handle very large amounts of data
spread out across many commodity servers while providing a highly available service with no single
point of failure.
Cassandra is distributed, which means that it is capable of running on multiple machines while
appearing to users as a unified whole. Moreover, since Cassandra is decentralized, every node is
identical: no node performs organizing operations distinct from those of any other node.
Instead, Cassandra features a peer-to-peer protocol and uses a gossip protocol to maintain and keep in
sync a list of nodes that are alive or dead.
Cassandra is being used in production by some of the biggest properties on the Web, including
Facebook, Twitter, Cisco, Rackspace, Digg, Cloudkick, Reddit, and more.
Cassandra has become popular because of its technical features: it is durable, seamlessly
scalable, and offers tunable consistency. It performs fast writes, can store hundreds of terabytes of
data, and is decentralized.
- Accumulo:
Apache Accumulo is a sorted, distributed key/value store based on Google's BigTable design.
Accumulo has cell-level access labels and server-side programming mechanisms.
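To make the idea of cell-level access labels concrete, here is a minimal Python sketch. It is not Accumulo's API: it assumes a simplified model in which each cell carries a set of labels and a reader may see the cell if it holds at least one of them (Accumulo's real visibility expressions allow richer boolean combinations). The names `Cell` and `visible_cells` and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    labels: frozenset  # labels allowed to read this cell; empty means public


def visible_cells(cells, authorizations):
    """Return only the cells the reader may see (hypothetical helper)."""
    auths = set(authorizations)
    # A cell is visible if it is public or shares a label with the reader.
    return [c for c in cells if not c.labels or c.labels & auths]


# Illustrative data only.
table = [
    Cell("user1", "name", "Alice", frozenset()),
    Cell("user1", "salary", "1000", frozenset({"hr"})),
    Cell("user1", "diagnosis", "flu", frozenset({"doctor", "nurse"})),
]

print([c.column for c in visible_cells(table, {"nurse"})])
# prints ['name', 'diagnosis']
```

The filtering happens per cell rather than per row or per table, which is what makes the per-operation cost depend on the size of each cell's label set.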
Goals:
The main goals of this project are to add support for cell-level ACLs (Access Control Lists) to Cassandra and
to compare the performance of the resulting system with that of Accumulo. We will then evaluate and measure
the security holes and attempt to improve the security of both systems by strengthening consistency, while
measuring the resulting performance penalty.
Success Criteria:
Our success criteria are divided according to our two different main goals in this project.
For the first goal, we will consider the project successful if, after adding cell-level ACLs to Cassandra,
its performance measurements are as good as those measured for Accumulo.
For the second goal, the success criterion is improving the security at the cost of only a reasonable
decrease in performance.
Incremental Phases:
1. System set-up and initial performance measurements:
First we will install the system on a single node. The system consists of the database (Apache
Cassandra, Accumulo) and the testing framework (YCSB++). Once the installation is complete we
will run a few tests to verify the installation was successful. We will extend the system and install it
on five more nodes in order to prepare it for the performance measurements. Next we will measure
the performance of Cassandra prior to the additional cell-level security and produce the first
performance report. The scenarios for testing the performance are detailed below.
Exact science Faculty - Tel Aviv University
2. Implementation of cell-level ACLs:
We will implement ACLs support in Cassandra by storing them as additional attributes.
3. Performance comparison:
At this stage we will compare the performance of Cassandra with the added implementation of ACLs
to the performance we measured in phase one, without the added security. Moreover, we will check
Cassandra's performance in comparison to Accumulo's performance on the same tests. (Further
detail on the tests below).
4. Analysis of the security holes:
Here we will measure the security holes that may exist due to inconsistency in the ACL
configuration. This may occur, for example, when a user changes the permissions to deny access
to a certain data item, but the restriction has not yet propagated to all the nodes, so other users can
still access it during the inconsistency window. YCSB++ allows us to measure this inconsistency as a
read-after-write latency.
5. Improving the security through stronger consistency:
We will attempt to improve the security of ACLs in Cassandra by providing a solution with higher
consistency guarantees and measure the performance penalty (e.g. as a decrease in throughput).
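The phases above can be illustrated with a toy Python model. It is a sketch under stated assumptions, not Cassandra code: the ACL is stored as an extra attribute next to each value (as in phase 2), updates reach the other replicas after a fixed number of logical ticks, and a read either consults a single replica or all of them (loosely analogous to weaker and stronger consistency levels). The classes `Replica` and `Cluster`, the key, and the delay are all hypothetical.

```python
class Replica:
    """One node's local copy: key -> (version, value, acl)."""

    def __init__(self):
        self.cells = {}

    def apply(self, key, version, value, acl):
        current = self.cells.get(key)
        if current is None or version > current[0]:
            self.cells[key] = (version, value, acl)


class Cluster:
    def __init__(self, n_replicas, propagation_delay):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.delay = propagation_delay  # logical ticks, not real time
        self.now = 0
        self.version = 0
        self.pending = []  # (due_time, replica_index, update)

    def write(self, key, value, acl):
        """Apply to replica 0 at once; other replicas receive it later."""
        self.version += 1
        update = (key, self.version, value, frozenset(acl))
        self.replicas[0].apply(*update)
        for index in range(1, len(self.replicas)):
            self.pending.append((self.now + self.delay, index, update))

    def tick(self):
        """Advance logical time and deliver the updates that are due."""
        self.now += 1
        due = [p for p in self.pending if p[0] <= self.now]
        self.pending = [p for p in self.pending if p[0] > self.now]
        for _, index, update in due:
            self.replicas[index].apply(*update)

    def read(self, key, user, replica_indexes):
        """Use the newest copy among the consulted replicas, then check its ACL."""
        copies = [self.replicas[i].cells[key] for i in replica_indexes
                  if key in self.replicas[i].cells]
        if not copies:
            return None
        _, value, acl = max(copies, key=lambda copy: copy[0])
        return value if user in acl else None


KEY = "row1:salary"
cluster = Cluster(n_replicas=3, propagation_delay=2)
cluster.write(KEY, "1000", {"alice", "bob"})
for _ in range(2):
    cluster.tick()  # let the first write reach every replica

cluster.write(KEY, "1000", {"alice"})  # revoke bob's access

stale_read = cluster.read(KEY, "bob", [2])          # one lagging replica
strict_read = cluster.read(KEY, "bob", [0, 1, 2])   # all replicas

window = 0  # ticks during which the lagging replica still serves bob
while cluster.read(KEY, "bob", [2]) is not None:
    cluster.tick()
    window += 1

print(stale_read, strict_read, window)  # prints: 1000 None 2
```

In this model the single-replica read leaks the revoked value for the whole propagation delay, while the read that consults every replica does not, at the cost of contacting more nodes per operation. That trade-off is what phases 4 and 5 set out to measure on the real systems.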
Testing scenarios:
To test the performance of the databases we will be using the YCSB++ framework. In this framework, in
order to perform a test you must define a workload. A workload is a combination of a Workload java class
and a Parameter file. The Parameter file defines the data that will be loaded into the database during
the loading phase, and the operations that will be executed against the data set during
the transaction phase.
Additionally, we must choose the appropriate runtime parameters (number of client threads, target
throughput, etc.)
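As an illustration of what such a Parameter file might contain, the following Python sketch renders one as plain `key=value` lines. The property names are those of the core YCSB workload (`CoreWorkload`); whether YCSB++ accepts exactly the same set is an assumption, and the values and the helper `render_properties` are hypothetical and chosen only for the example.

```python
def render_properties(params):
    """Render a dict as Java-style 'key=value' properties text."""
    return "\n".join(f"{key}={value}" for key, value in params.items()) + "\n"


# Illustrative values only; property names follow core YCSB.
example_workload = {
    "workload": "com.yahoo.ycsb.workloads.CoreWorkload",
    # Loading phase: how much data is inserted, and its shape.
    "recordcount": 1000,
    "fieldcount": 1,
    "fieldlength": 100,
    # Transaction phase: how many operations run, and their mix.
    "operationcount": 1000,
    "readproportion": 0.5,
    "updateproportion": 0.5,
    "scanproportion": 0,
    "insertproportion": 0,
    # Runtime parameters (assumed here to be given as properties).
    "threadcount": 16,
    "target": 500,
}

text = render_properties(example_workload)
print(text)
```

Keeping the loading-phase and transaction-phase properties in one file means that the same data set definition can be reused across runs while only the operation mix or the runtime parameters change.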
In the YCSB++ article we found several experiments that are closely related to the topic of our project.
In these experiments the authors ran the YCSB++ insert throughput benchmark on Accumulo with a varying
number of ACL entries (0 to 11) under two different client configurations: a single client with 100
threads, and six clients with 16 threads each.
Their results appear in Figure 14 of the article.
On these results they commented: "Figure 14 shows the insert throughput, measured as the number of
rows inserted per second, for different numbers of entries in each ACL (while the total size of the ACLs is
constant). A value of zero entries means that no security was used. When the workload uses a single client
with 100 threads, we observe that the throughput decreases with increasing number of entries in each
ACL: in comparison to not using any access control, throughput drops by 24% with 4 entries in the ACL
and by as much as 47% with an 11-entry ACL. This happens because the single YCSB++ client is running
at almost 100% CPU utilization (as shown in Figure 15) and increasing the number of entries in each ACL
leads to increased computation overhead. However, using six YCSB++ clients with 16 threads each
reduces the insert throughput only by about 10%, even when there are 11 entries in the ACL."
In other words, the limiting factor in these tests is not always the database server; it can also be
the client. The drop in performance was most significant in the single-client setup, which showed a
decrease of close to 50%.
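The percentages quoted above are relative throughput drops against the run with no ACL entries. A minimal Python sketch of that arithmetic follows; the absolute throughput numbers are hypothetical (the article's quotation gives only percentages), and `percent_drop` is an illustrative helper.

```python
def percent_drop(baseline, measured):
    """Relative throughput drop, in percent, against the baseline run."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return 100.0 * (baseline - measured) / baseline


# Hypothetical rows/second, chosen so the drops match the quoted percentages.
no_acl = 10000.0
with_4_entries = 7600.0
with_11_entries = 5300.0

print(f"{percent_drop(no_acl, with_4_entries):.1f}%")
print(f"{percent_drop(no_acl, with_11_entries):.1f}%")
```

We expect to report our own measurements in the same form, so that the Cassandra results can be compared directly with the figures from the article.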
The second benchmark measured the SCAN operation with the exact same setup.
In this experiment there is no significant difference between the two setups (one client compared to six),
but there is an immediate decrease of about 45% once the fine-grained ACL is invoked.
Our performance tests will mimic the tests described in the article; that is, they will consist of the following:
Workloads:
1) An insert workload that writes 48 million single-cell rows in an empty table.
2) A scan workload that scans 320 million rows.
Configurations:
1) A single client with 100 threads.
2) Six clients with 16 threads each.
Cycles:
We will run each test three times in order to reduce the effect of run-to-run variation on the results.
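The workloads, configurations and cycles above define a small test matrix. The Python sketch below enumerates it and shows one plausible way to summarize the three cycles of a test, using the mean and the sample standard deviation. The plan does not fix an aggregation method, so that choice is an assumption, as are the helper names and the sample throughput values.

```python
from itertools import product
from statistics import mean, stdev

WORKLOADS = ["insert", "scan"]
CONFIGURATIONS = [(1, 100), (6, 16)]  # (clients, threads per client)
CYCLES = 3


def build_test_matrix():
    """One entry per (workload, configuration, cycle) combination."""
    return [
        {"workload": w, "clients": c, "threads_per_client": t, "cycle": n}
        for w, (c, t) in product(WORKLOADS, CONFIGURATIONS)
        for n in range(1, CYCLES + 1)
    ]


def summarize(throughputs):
    """Mean and sample standard deviation of the repeated runs."""
    return mean(throughputs), stdev(throughputs)


runs = build_test_matrix()
print(len(runs))  # 2 workloads x 2 configurations x 3 cycles = 12

# Hypothetical throughputs (rows/second) for the three cycles of one test.
average, spread = summarize([9800.0, 10100.0, 10400.0])
print(average, spread)
```

A large spread relative to the mean would suggest that three cycles are not enough for that test and that more repetitions are needed.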
We will monitor the performance of these tests using YCSB++'s custom monitoring tool, Otus, which
allows us to process and analyze the collected data using a tailored web-based visualization system.
Milestones:
Milestone 1: Completing installations and running initial performance tests. This step includes the
following:
- Installing and running Cassandra.
- Installing and running YCSB++.
- Running some initial manual testing of Cassandra (creating accounts, basic inserts, scans etc.)
- Connecting YCSB++ to Cassandra and running benchmark tests with the following
configuration: YCSB++ with a single client with multiple threads, Cassandra running on 1
cluster (single PC or 2 PCs)
- Installing Accumulo.
Milestone 1.5: Start the implementation of the Cell-level ACL for Cassandra. This will include the
following:
- Examining the Cassandra source code and researching different implementation options.
- Writing some initial code.
- Testing our implementation using YCSB++ with a basic configuration (YCSB++ with a single client
with multiple threads, Cassandra running on 1 cluster) and reaching conclusions regarding the
feasibility of a better implementation.
- Comparing the test results to the results of our initial tests and to tests done by others.
- Possibly implementing several different solutions to be tested against each other in the next step.
Milestone 2: Finishing the implementation of the Cell-level ACL and evaluating the performance on a more
advanced configuration of Cassandra. In this step we will do the following:
- Setting up a more advanced configuration of Cassandra, perhaps with several clusters (up to 3)
with several PCs in each cluster.
- Testing the performance of the advanced configuration of Cassandra with and without the
added Cell-level Security.
- Evaluating the feasibility of further improving the implementation. Testing other
implementations.
- Running more advanced set-ups of YCSB++ such as:
a. Connecting more client nodes to it and running a test with more than one client with
multiple threads in order to eliminate the CPU factor
b. Configuring better ACLs, experimenting with different ACL header sizes relative to the cell
content, and setting up unique ACLs.
c. Connecting to Accumulo and trying to reproduce the results shown in the article (see
Testing scenarios).
d. Identifying the pitfalls in our testing procedure (finding limiting factors, for example) and
configuring custom tests that would more accurately evaluate the system performance.
- Creating a final version of the implementation based on the test results and consultation with
our project advisor (Alexandra Shulman from IBM).
Milestone 2.5: Begin analyzing the security holes and implementing a security improvement. This
includes the following steps:
- Brainstorming and coming up with scenarios that may expose security risks related to the
access control lists.
- Running the tests we came up with and checking for the security holes.
- Examining different approaches to overcome these risks while considering the performance
penalty they may inflict on the database.
- Starting to implement the approach we settled on.
Milestone 3: Finishing the implementation of the security improvement and measuring the performance
penalty. This includes the following steps:
- Finish implementation on both Cassandra and Accumulo.
- Run the tests we found to expose the security holes with the added implementation.
- Measure the performance penalty by comparing the results to those we measured without the
added improvement.
Literature, technology and related projects:
Details on a related project can be found here:
YCSB++: Benchmarking and Performance Debugging Advanced Features in Scalable Table Stores.
http://www.pdl.cmu.edu/PDL-FTP/Storage/socc2011.pdf
Other literature on technologies we will need to research can be found here:
1. Cassandra: http://cassandra.apache.org/
2. Accumulo: http://incubator.apache.org/accumulo/
3. YCSB++: http://www.pdl.cmu.edu/ycsb++/index.shtml
4. Eventual Consistency:
http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
5. Dynamo: Amazon's Highly Available Key-value Store
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
6. Bigtable: A Distributed Storage System for Structured Data
labs.google.com/papers/bigtable-osdi06.pdf
Requisite tools, resources and knowledge:
- Installing the necessary software infrastructure for working with Cassandra.
- Researching and reading material on the following subjects:
1. Security holes and threats.
2. Access control lists.
3. Consistency models.
4. Distributed database structure.
- Gaining a deeper understanding of the Cassandra implementation, specifically its security attributes.
- Getting familiar with the YCSB++ testing framework:
https://github.com/brianfrankcooper/YCSB/wiki/
- Getting familiar with the Otus monitoring tool: https://github.com/otus/otus
Scope and implementation choices:
The project scope will depend on the complexity of source code, initial installation phase and the testing
phase. Also, to complete the initial installation phase (installing everything on a single machine) and the
later installation phase (expanding to several clusters) we will have to rely on the assistance of the system
manager (in Schrieber).
Since Cassandra and the YCSB++ benchmark tool are written in Java, our choice for the implementation
language is naturally Java as well.
Risk factors and contingency plans:
1) Problems with the initial installation:
- The installation and operation of these systems (Cassandra, Accumulo and YCSB++) is highly
complicated and we do not really know what to expect in terms of possible technical difficulties.
- We depend on the assistance of the system team which might limit our work. If for example we
need something installed to continue working and they won't be able to help us for a week –
then we won't be able to do much during this week.
Contingency plans:
If we get stuck on the installation phase we can seek alternatives. For example:
- If we cannot install it in the lab we can try installing it at home on a VM.
- Try to get assistance from IBM
- Perhaps we can install it later and continue in other directions (focus on developing the
testing environment etc.)
2) Problems with extending the installations:
We may encounter difficulties expanding the installation to other stations and establishing
communication between them.
Contingency plan:
In the case where we cannot expand to the desired number of stations, we will consider reducing
the project scope and we will implement only the first test configuration (with the one client
running 100 threads)
3) The performance after adding the cell-level ACL implementation does not reach the
goals we set for Milestone 2
Contingency Plan:
Decrease the scope of the project. Focus on improving the performance rather than moving on to
Milestone 2.5.
4) We are unable to find any security holes due to an inadequate setup
Contingency Plan:
If we are unable to research the security holes due to an inadequate setup (for example, not being able
to set up several clusters) we will attempt to extend the system or alternatively try to test it at IBM labs.
5) We are unable to find any security holes which are possible for us to fix
Contingency Plan:
If we cannot find security holes which are possible for us to fix we might move back to milestone 2
and focus on expanding our testing phase to include more in-depth analysis of the performance and
focus on improving that part.