Apache Cassandra Arvind Dwarakanath

Apache Cassandra
Arvind Dwarakanath
adwaraka@indiana.edu
Department of Computer Science
Indiana University Bloomington
1. What are NoSQL and Distributed Hash Tables?
The concept of NoSQL differs from the standard relational database. The problems of relational databases included the inability to handle data-intensive applications and the indexing of large numbers of files/documents. Many NoSQL systems have been developed to cater to these requirements.
Many of the more popular NoSQL databases have of late been distributed in nature. This type of structure means redundant storage of data on many servers. The storing occurs using a distributed hash table.
In a distributed hash table, the key of each data item is mapped into a keyspace using a hash function. The hashing is commonly done using SHA-1. The item is then stored on the node that is responsible for that part of the keyspace. A keyspace partitioning scheme splits ownership of the keyspace among the participating nodes. An overlay network then connects the nodes, allowing them to find the owner of any given key in the keyspace.
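The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names are hypothetical, each node is given a single position on the ring derived from its name, and a key is owned by the first node at or after its hash position. Real systems may partition the keyspace differently.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key into the keyspace as a 160-bit integer using SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class Ring:
    """Toy keyspace partitioning: every node owns the arc ending at its token."""

    def __init__(self, nodes):
        # Assumption: a node's position is the hash of its (hypothetical) name.
        self._ring = sorted((token(name), name) for name in nodes)
        self._tokens = [t for t, _ in self._ring]

    def owner(self, key: str) -> str:
        # First node clockwise from the key's token, wrapping past the end.
        index = bisect.bisect_left(self._tokens, token(key)) % len(self._ring)
        return self._ring[index][1]


ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.owner("adwaraka")
print("key 'adwaraka' is owned by", owner)
```

Because the position of a key depends only on its hash, any participant that knows the ring can compute the owner without consulting a central directory.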
A very popular NoSQL database that uses the concept of a keyspace is Apache Cassandra, the topic of this survey.
2. What is Cassandra?
Cassandra is an open source distributed
database management system. It is an
Apache Software Foundation top-level
project designed to handle very large
amounts of data spread out across many
commodity servers while providing a
highly available service with no single
point of failure. It is a NoSQL system
that was initially developed by Facebook
and it powers their Inbox Search
feature. A standalone test version of Twitter called 'Twissandra' has also been created as a demonstration.
At its core, Cassandra is a columnar database, or more precisely a column-oriented distributed database. The data is stored in the form of columns and is identified by a key within a 'keyspace'. It can be classified as a 'Cloud DB'.
3. Features of the Cassandra Model
The data model
A table in Cassandra is a distributed multidimensional map indexed by a key. The value is an object that may be a single element or may be highly structured. The row key in a table is a string with no size restrictions, although it is typically 16 to 36 bytes long. Every operation under a single row key is atomic per replica, no matter how many columns are being read or written.
Columns are grouped together into sets called column families, much as in the Bigtable system. Cassandra exposes two kinds of column families: Simple and Super. Super column families can be visualized as a column family within a column family. The top dimension in Cassandra is called the Keyspace.
For instance, usrs['adwaraka'] indicates a row in a column family of users, identified by the row key 'adwaraka'. Within it, we can further add the columns usrs['adwaraka']['fname'], usrs['adwaraka']['lname'] and usrs['adwaraka']['gender'].
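The multidimensional map can be pictured with nested Python dictionaries. The sketch below is a mental model only, not Cassandra's storage format or client API; the helper name `insert` is hypothetical.

```python
# Mental model: keyspace -> column family -> row key -> column name -> value.
keyspace = {"usrs": {}}


def insert(keyspace, column_family, row_key, column, value):
    """Set one column of one row, creating the row on first use."""
    rows = keyspace.setdefault(column_family, {})
    rows.setdefault(row_key, {})[column] = value


insert(keyspace, "usrs", "adwaraka", "fname", "Arvind")
insert(keyspace, "usrs", "adwaraka", "lname", "Dwarakanath")
insert(keyspace, "usrs", "adwaraka", "gender", "Male")

# Same addressing as usrs['adwaraka']['fname'] in the text.
print(keyspace["usrs"]["adwaraka"]["fname"])
```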
Column and Column Family
As mentioned before, the data model is columnar in nature. The column is the base of the Cassandra data model and the smallest increment of data. It is a tuple (triplet) that contains a name, a value and a timestamp.
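The triplet can be written down as a small Python type. This is an illustrative sketch: the class and helper names are hypothetical, and the choice of microseconds since the epoch for the default timestamp is an assumption made here, not something the text specifies.

```python
import time
from typing import NamedTuple, Optional


class Column(NamedTuple):
    """A column as described above: name, value and timestamp."""
    name: str
    value: str
    timestamp: int  # assumption: microseconds since the epoch


def make_column(name: str, value: str, timestamp: Optional[int] = None) -> Column:
    """Build a column, stamping it with the current time if none is given."""
    if timestamp is None:
        timestamp = int(time.time() * 1_000_000)
    return Column(name, value, timestamp)


# A row is then a collection of columns addressed by column name.
row = {c.name: c for c in (make_column("fname", "Arvind", 1),
                           make_column("lname", "Dwarakanath", 2))}
print(row["fname"])
```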
Here are the columns for usrs['adwaraka'] represented in JSON notation:
{
  fname: "Arvind",
  lname: "Dwarakanath",
  gender: "Male"
}
A column family resembles a table in an RDBMS. Column families contain rows and columns. Each row is uniquely identified by a row key. Each row has multiple columns, each of which has a name, a value, and a timestamp.
Unlike a table in an RDBMS, different rows in the same column family do not have to share the same set of columns, and a column may be added to one or multiple rows at any time. It can be useful to distinguish between "static" column families that contain values such as user data or other object data, and "dynamic" column families that contain data such as precalculated query results.
Keyspaces
Keyspaces group column families together. Typically, there will be one keyspace for each application that uses a Cassandra cluster.
The most important settings that are defined at the keyspace level are the replication factor and the replica placement strategy. Thus, if you have sets of data that have different requirements for these settings (such as different levels of fault tolerance), these sets of data should reside in different keyspaces. A keyspace must be selected before any call is made through a client API such as Thrift.
On the Cassandra CLI, use 'use <keyspace name>' to select the required keyspace. The command looks like this:
use Keyspace1;
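To make the two keyspace-level settings named in this section more concrete, the sketch below shows how a replication factor and a placement rule could decide which nodes hold copies of a row. The rule used here (the owning node plus the next nodes in ring order) is a deliberately simple assumption for illustration; it is not claimed to be any of Cassandra's actual placement strategies, and the node names are hypothetical.

```python
def place_replicas(ring_nodes, owner_index, replication_factor):
    """Pick replicas: the owning node, then the following nodes in ring order.

    Assumed placement rule for illustration only.
    """
    count = min(replication_factor, len(ring_nodes))
    return [ring_nodes[(owner_index + i) % len(ring_nodes)] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical ring order

# A keyspace with replication factor 3: a row owned by node-c is also
# copied to the next two nodes, wrapping around the end of the ring.
replicas = place_replicas(nodes, owner_index=2, replication_factor=3)
print(replicas)
```

Two sets of data that need different replication factors cannot share this setting, which is why the text recommends placing them in different keyspaces.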
Super Columns
Super columns are a kind of superstructure over columns: they are a way to group multiple columns. Every super column must have a different name, just like with regular columns. Different super columns may hold sub-columns with the same name. Super columns are a way to add an extra map layer to the data model.
Super columns are frequently used to hold a single record where each field in the record is represented by a sub-column. For example, the name of a super column might be the ID of a transaction and each sub-column could hold some attribute of the transaction.
For example, if a transactions row like the one described had two entries, it might look like:
{
  'trans-A': {
    'date': '01/02/2010',
    'amount': 5000,
    'timespace': <value1>
  },
  'trans-B': {
    'date': '01/03/2010',
    'amount': 4500,
    'timespace': <value2>
  }
}
Decentralized
Every node in the cluster is identical. There are no hierarchies between the nodes, no network bottlenecks and no single points of failure.
Elasticity
New nodes can be added without any downtime or disruption to applications, so a user can keep adding new nodes without worrying about stopping applications.
Durability
Durability is the property that writes, once completed, will survive permanently, even if the server is killed, crashes or loses power. This requires calling fsync to tell the OS to flush its write-behind cache to disk.
Fault Tolerant
Data is automatically replicated to multiple nodes for fault tolerance. Replication across multiple data centres is supported. Failed nodes can be replaced with no downtime.
Changeable Consistency
The ability to tune consistency levels per query is a powerful feature of Cassandra because it gives the developer complete control over the trade-off between availability and consistency. This means that Cassandra queries can be configured to exhibit strongly consistent behaviour (although there is no row-level locking) if the developer is willing to sacrifice latency.
The consistency levels offered by Cassandra, which have different meanings for reads and writes, are detailed in the lists below. A 'quorum' of replicas is essentially a majority of replicas, or [(replication factor / 2) + 1] with any resulting fraction rounded down.
WRITE CONSISTENCY LEVELS
- ANY - Ensure that the write has been written to at least one node (can include hinted handoff recipients). Note that if all replica nodes are down at write time, an ANY write may not be readable until nodes have recovered.
- ONE - Ensure that the write has been written to at least one replica's commit log and memory table before responding to the client.
- QUORUM - Ensure that the write has been written to a quorum of replicas before responding to the client.
- LOCAL_QUORUM - Ensure that the write has been written to a quorum of replicas in the datacenter local to the coordinator before responding to the client. This setting avoids the latency of inter-data center communication.
- EACH_QUORUM - Ensure that the write has been written to a quorum of replicas in each datacenter in the cluster before responding to the client.
- ALL - All replicas must have received the write; otherwise the operation will fail.
READ CONSISTENCY LEVELS
- ONE - Returns the response from the closest replica, as determined by the snitch configured for the cluster. When read_repair is enabled, Cassandra may perform a consistency check in a background thread.
- QUORUM - Returns the record with the most recent timestamp once a quorum of replicas has reported.
- LOCAL_QUORUM - Returns the record with the most recent timestamp once a quorum of replicas in the datacenter local to the coordinator has reported. This setting avoids the latency of inter-data center communication.
- EACH_QUORUM - Returns the record with the most recent timestamp once a quorum of replicas in each datacenter in the cluster has reported.
- ALL - Returns the record with the most recent timestamp once all replicas have replied, failing the operation if any replicas are unresponsive.
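The "most recent timestamp" rule shared by the QUORUM and ALL read levels can be sketched as follows. This is a simplified model under stated assumptions: replies are plain in-memory tuples listed in arrival order, the number of replies required by the chosen level is passed in by the caller, and the names `Reply` and `read` are hypothetical rather than part of any Cassandra API.

```python
from typing import List, NamedTuple, Optional


class Reply(NamedTuple):
    replica: str
    value: str
    timestamp: int


def read(replies: List[Reply], required: int) -> Optional[str]:
    """Return the newest value among the first `required` replies.

    Returns None when fewer than `required` replicas replied, which
    models a read that fails at the requested consistency level.
    """
    if len(replies) < required:
        return None
    considered = replies[:required]
    return max(considered, key=lambda reply: reply.timestamp).value


# Hypothetical replies in arrival order; replica-2 holds the newer write.
replies = [Reply("replica-1", "v1", 10),
           Reply("replica-2", "v2", 20),
           Reply("replica-3", "v1", 10)]

quorum_value = read(replies, required=2)  # quorum of 3 replicas
all_value = read(replies[:2], required=3)  # one replica unresponsive
print(quorum_value, all_value)
```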
In choosing the consistency level for particular operations, developers should consider the relative importance of consistency, latency, and availability. Note that read operations are slower than writes in Cassandra. Cassandra is therefore well suited to write-intensive operations on a massive scale, such as a blog.
For instance, in a case where availability
is top priority it may make sense to
choose a level of ONE over QUORUM. If
the replication factor for the cluster is 3,
a QUORUM operation tolerates the loss
of only one node (or, one copy of the
data) while ONE allows the operation to
complete even if two nodes are
unavailable.
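The arithmetic behind this example is small enough to check directly. The sketch below assumes the quorum formula given earlier in this section, (replication factor / 2) + 1 rounded down; the function names are illustrative.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: (replication factor / 2) + 1, rounded down."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int, required_replicas: int) -> int:
    """How many replicas can be down while the operation still completes."""
    return replication_factor - required_replicas


rf = 3
quorum_tolerance = tolerated_failures(rf, quorum(rf))  # QUORUM needs 2 of 3
one_tolerance = tolerated_failures(rf, 1)              # ONE needs 1 of 3
print(quorum(rf), quorum_tolerance, one_tolerance)
```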
Clusters spanning multiple data centres
may present further questions regarding
local latency and durability.
4. Major Client Libraries for Cassandra
Thrift
Thrift has been mentioned earlier in the paper. What is Thrift? Thrift is a software framework for scalable cross-language services development. In this context, Thrift is the name of the RPC client used to communicate with the Cassandra server. It statically generates an interface for serialization in a variety of languages, including C++, Java, Python, PHP, Perl and C#, to name a few. It is this mechanism that allows you to interact with Cassandra from any of these client languages.
Some other clients that are used include Hector (Java), Pycassa (Python), phpcassa (PHP) and Cassandra (Ruby). The libraries are available on GitHub.
Running Cassandra and basic CLI Commands
The latest version of Apache Cassandra available for download is 0.7.2. The installation is simple enough. The minimal installation for this study was done on a single local machine running Ubuntu as a virtual OS. To run Cassandra in the foreground, we need to run the following command:
darkprince@ubuntu:~/Desktop/apache-cassandra-0.7.2$ bin/cassandra -f
After a flurry of messages, the final message looks somewhat like this:
INFO 10:09:01,633 Listening for thrift clients....
This indicates that Cassandra is listening for Thrift clients and that the clients can start their operations.
In the absence of any higher-level client, Cassandra has its own native client that can be run from the command line. To run it, open another terminal and run the following command:
darkprince@ubuntu:~/Desktop/apache-cassandra-0.7.2$ bin/cassandra-cli --host localhost
We can see the following on the screen:
Connected to: "Test Cluster" on localhost/9160
Welcome to cassandra CLI.
Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[default@unknown]
At this prompt, the user can type in the necessary commands. If in doubt, the user can type 'help;' to see the commands that the CLI supports.
5. References and Notes
1. 'Cassandra - A Decentralized Structured Storage System', by Avinash Lakshman and Prashant Malik. Published April 2010.
2. Apache Cassandra main project site. Contains the code base and information on the data model, as well as the client options a user has for accessing the contents of Cassandra; http://cassandra.apache.org/
3. Apache site that contains the Cassandra wiki; http://wiki.apache.org/cassandra/
4. DataStax Cassandra Documentation. Contains a succinct summary of the tunable consistency levels; http://www.datastax.com/docs/0.7/index
5. Software Solutions and Development. Discusses the installation of the Thrift clients and the data model in detail; http://www.sodeso.nl/