Homework1

Redis and Memcached
One example of an open-source database system is Redis. Its main distinguishing feature is
that it is an in-memory database, meaning that it relies primarily on main memory rather than
disk to store information. Redis is also a key-value store: each value the user wants to keep is
saved under a key, with no fixed schema. These are some of the reasons people and companies
have chosen Redis as their open-source database, as discussed later.
Redis is sponsored by VMware[1]. VMware was founded by Diane Greene, Mendel Rosenblum,
Scott Devine, Edward Wang and Edouard Bugnion and is currently located in Palo Alto,
California. On April 12, 2011 VMware released Cloud Foundry, a cloud computing platform as a
service (PaaS) that supports Redis.[2]
What separates Redis from most other key-value stores is the kinds of values it can hold. Most
key-value stores hold only strings. Redis stores strings as well, but it also supports lists of
strings, unordered sets of strings, sorted sets of strings, and hashes, which map field names to
values under a single key.
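The five value types can be pictured with ordinary Python containers. This is only a rough sketch: the `store` dictionary, the key names, and the `sorted_members` helper are made up for illustration, and nothing here talks to a real Redis server or uses a Redis client library.

```python
# Toy model of a schema-less key-value store: each key maps to a value of
# any supported type. Plain Python containers stand in for Redis types.
store = {}

# String: a single value under a key.
store["user:1:name"] = "alice"

# List: an ordered sequence of strings (duplicates allowed).
store["user:1:recent_photos"] = ["p3", "p2", "p1"]

# Set: an unordered collection of unique strings.
store["photo:p1:likers"] = {"bob", "carol"}

# Sorted set: unique members, each with a numeric score used for ordering.
store["leaderboard"] = {"alice": 120.0, "bob": 95.0, "carol": 130.0}

# Hash: field names mapped to values, all stored under one key.
store["user:1"] = {"name": "alice", "city": "Palo Alto"}


def sorted_members(scores):
    """Return sorted-set members ordered by ascending score."""
    return [member for member, _ in sorted(scores.items(), key=lambda kv: kv[1])]


print(sorted_members(store["leaderboard"]))  # ['bob', 'alice', 'carol']
```

No table or column definitions are needed before a value is stored, which is what "schema-less" means in practice.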
Redis also supports what is known as master-slave replication. Data stored on any Redis
server can be replicated to however many “slaves” are needed, and a slave can itself act as the
master of other slaves, so the servers form a single-rooted replication tree. Slaves can be
configured to accept writes, which lets users change keys on a slave directly at the cost of
inconsistency between that slave and its master. Redis also implements publish/subscribe: a
client connected to a slave can subscribe to a channel and receive the messages published to
the master. These features are prominent in applications such as Instagram.
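The replication tree can be sketched in a few lines of Python. The `Node` class and the server names below are hypothetical, and the model is deliberately simplified: writes are copied to slaves immediately, whereas real Redis replication is asynchronous.

```python
class Node:
    """Hypothetical stand-in for one Redis server in a replication tree."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.replicas = []

    def add_replica(self, replica):
        # A new replica starts from a full copy of its master's data.
        replica.data = dict(self.data)
        self.replicas.append(replica)

    def write(self, key, value):
        # Apply the write locally, then pass it down the tree.
        self.data[key] = value
        for replica in self.replicas:
            replica.write(key, value)


master = Node("master")
slave_a = Node("slave-a")
slave_b = Node("slave-b")  # a slave of slave_a, two levels below the master
master.add_replica(slave_a)
slave_a.add_replica(slave_b)

# A write to the master reaches every node below it.
master.write("photo:1", "sunset.jpg")

# A write made directly on a slave is never sent upward, so copies diverge.
slave_b.write("photo:1", "edited.jpg")
```

After the last line, `master` and `slave_a` still hold `sunset.jpg` while `slave_b` holds `edited.jpg`, which is the kind of inconsistency a writable slave allows.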
The company Garantia Data was featured in Computerworld’s Top 12 Hot Cloud Computing
Companies, primarily for its use of Redis[3]. The article praises Redis for its rich feature set, but
it also highlights its problem of poor scalability. Scalability is the ability of a system or service
to grow in order to accommodate rising demand.
Let's look at one specific company that relies heavily on Redis: Instagram. Instagram was
founded in October of 2010 by Kevin Systrom and Mike Krieger. In a tech talk on scaling
Instagram, Krieger even wrote “I <3 Redis”[4], and he goes on to explain how Redis helped with
the way the company stored and used its data. As stated before, Instagram uses the
master-slave feature, so a user following other users can be served from a slave copy of their
photo data rather than from the original each time it is needed. He praises Redis for its fast
inserts, its fast subset operations, and its fit for data structures that are relatively bounded.
Instagram also uses Redis to power the main feed of pictures, the activity feed of users, the
session system, and other related systems. Because Redis supports master-slave replication,
failing over to a new Redis machine is easy and causes no downtime for Instagram[5].
Another example of an open-source database is Memcached[6]. It is free and open source, and
it is known as a high-performance, distributed memory object caching system that is generic in
nature. It was developed in 2003 by Brad Fitzpatrick for the company LiveJournal. Memcached
keeps its approach to memory and caching deliberately generic and simple, which increases its
speed and also alleviates the load on the database behind it. Memcached handles memory by
doing two things. First, it spreads a key-value array of information across a number of nodes.
Second, it keeps every node independent while letting clients treat the memory of all the nodes
as one combined pool[7].
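Because the nodes do not talk to each other, it is the client that decides which node holds a given key, usually by hashing the key. The sketch below shows one simple way to do that with a hash and a modulo. The server addresses and the `pick_server` function are hypothetical, and real client libraries often use consistent hashing instead, so that adding a node moves fewer keys.

```python
import hashlib

# Hypothetical pool of independent cache nodes.
SERVERS = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]


def pick_server(key, servers=SERVERS):
    """Map a key to one server by hashing the key (simple modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# The same key always maps to the same node, so any client can find it again.
print(pick_server("user:42"))
```

Every client that shares the same server list and hash function reaches the same node for a key, without the nodes ever coordinating.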
The Memcached project follows a set of values that it calls its design philosophies. Some of
these philosophies, briefly explained, are[9]:
Simple Key Storage
o Data is stored and looked up by key only, which keeps search and storage simple
Smart Half in Client, Half in Server
o The implementation is split in two: part of the logic lives in the client and
part in the server
Servers are Disconnected From Each Other
o Each server is independent and has no interaction with other servers; adding
more servers adds more capacity
O(1) Everything
o Memcached tries to make every command O(1) by using simple storage operations
Forgetting Data is a Feature
o Data that has become old and “stale” is removed to free up space and keep
the service fast
Cache Invalidation is a Hard Problem
o “Given memcached's centralized-as-a-cluster nature, the job of invalidating a
cache entry is trivial. Instead of broadcasting data to all available hosts, clients
direct in on the exact location of data to be invalidated. You may further
complicate matters to your needs, and there are caveats, but you sit on a strong
baseline”
The Protocol
o “There are commercial entities, other OSS projects, and so on. Memcached
clients are fairly common, and there are many uses for a memcached-like cluster
layout. We will continue to see other projects "speak" memcached, which in turn
influences memcached as a culture and as software itself.”
Persistent Storage
o “In many cases memcached is not a great fit here; expensive flash memory is
needed to keep data access performant. In most common scenarios, disaster can
ensue if a server is unavailable and later comes up with old data.”
Storage Engines
o “Storage Engines in general are an important future to memcached as a service.
Aside from our venerable slabbing algorithm, there are other memory backends
to experiment with.”
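The "O(1) Everything" and "Forgetting Data is a Feature" ideas from the list above can be sketched with a small cache that expires old items and evicts the least recently used one when it is full. This is not Memcached's actual implementation (which uses its own slab allocator); `TinyCache` and its parameters are hypothetical names used only for illustration.

```python
import time
from collections import OrderedDict


class TinyCache:
    """Toy cache with a size limit (LRU eviction) and optional expiry."""

    def __init__(self, max_items, clock=time.monotonic):
        self.max_items = max_items
        self.clock = clock
        self.items = OrderedDict()  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self.clock() + ttl
        self.items[key] = (value, expires_at)
        self.items.move_to_end(key)  # mark as most recently used
        while len(self.items) > self.max_items:
            self.items.popitem(last=False)  # forget the least recently used

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self.clock() >= expires_at:
            del self.items[key]  # stale data is dropped on access
            return None
        self.items.move_to_end(key)
        return value


cache = TinyCache(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # "a" is now the most recently used item
cache.set("c", 3)    # cache is full, so "b" is evicted
```

Each `set` and `get` touches only one dictionary entry, so the cost per operation stays constant on average no matter how many items are stored.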
A lot of major companies are known to use Memcached with their databases, choosing it for
its speed and features. These companies include LiveJournal, Wikipedia, Flickr, Bebo, Twitter,
Typepad, Yellowbot, YouTube, Digg, WordPress.com, Craigslist, and Mixi. [8]
Sources
[1] http://en.wikipedia.org/wiki/Redis
[2] http://en.wikipedia.org/wiki/VMware
[3] http://news.idg.no/cw/art.cfm?id=E705422F-A7A5-61B5-E80D79D38FA22637
[4] http://www.scribd.com/doc/89025069/Mike-Krieger-Instagram-at-the-Airbnb-tech-talk-on-ScalingInstagram
[5] http://instagram-engineering.tumblr.com/
[6] http://www.webresourcesdepot.com/25-alternative-open-source-databases-engines/
[7] http://en.wikipedia.org/wiki/Memcached
[8] http://memcached.org/
[9] http://code.google.com/p/memcached/wiki/NewOverview