Redis and Memcached

One example of an open-source database system is Redis. A feature that sets Redis apart from many other databases is that it is an in-memory database: it relies primarily on main memory to store its data. Redis is also a key-value store, meaning that it saves whatever the user wants to store under a key, without requiring a schema. These are some of the reasons people and companies have chosen Redis as their open-source database, as discussed later.

At the time of writing, Redis is sponsored by VMware[1]. VMware was founded by Diane Greene, Mendel Rosenblum, Scott Devine, Edward Wang and Edouard Bugnion and is located in Palo Alto, California. On April 12, 2011, VMware released Cloud Foundry, a cloud computing platform as a service (PaaS) that supports Redis[2].

What separates Redis from most other key-value stores is the kind of values it can hold. Most key-value stores hold only strings. Redis stores strings as well, but it also supports lists of strings, unordered sets of strings, sorted sets of strings, and hashes whose fields and values are strings.

Redis also supports master-slave replication. Data on any Redis server can be replicated to any number of slaves, and a slave can in turn act as master to other slaves, so the servers form a single-rooted replication tree. Slaves are writable, which lets users change keys on a slave at the cost of some inconsistency between the slave and its master. Clients can also subscribe to channels on a slave and receive the messages published to the master. This feature is prominent in applications such as Instagram.

The company Garantia Data was featured in Computer World’s Top 12 Hot Cloud Computing Companies, primarily for its use of Redis[3].
The Computer World feature praises Redis for its rich feature set, but it also highlights the problem of poor scalability. Scalability is the ability of a system or service to grow in order to accommodate rising demand.

Let's look at one company that relies heavily on Redis: Instagram. Instagram was founded in October 2010 by Kevin Systrom and Mike Krieger. In a tech talk on scaling Instagram, Krieger even wrote “I <3 Redis”[4] and went on to explain how Redis helped with the way the company stores its data and the ways it uses that data. As noted above, Instagram uses the master-slave feature, so that data such as other users’ photos can be read from a slave copy rather than fetched from the master each time it is needed. Krieger praises Redis for its fast inserts, its fast subset operations, and its fit for data structures that are relatively bounded in size. Instagram also uses Redis to power the main photo feed, the users' activity feed, the session system, and other related systems. Because Redis supports master-slave replication, failing over to a new Redis machine is easy and causes no downtime for Instagram[5].

Another example of an open-source database is Memcached[6]. It is a free, open-source caching system, usually deployed in front of a database rather than used as the database itself. It is known for its high performance, for distributing cached memory objects across machines, and for being generic in nature. It was developed in 2003 for LiveJournal by Brad Fitzpatrick. Memcached takes a simple, generic approach to caching objects in memory, which keeps it fast and alleviates the load on the database behind it. Memcached handles memory in two ways. First, it spreads the cached data across a number of nodes. Second, it keeps every node independent while letting the spare memory of all the nodes be combined into one shared cache, so that memory left unused on one machine can hold data for the others[7].
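The idea of independent nodes pooled into one cache can be illustrated with a small sketch. This is not Memcached's own code or the API of any real client library: the class name ShardedCache, the node names, and the choice of an MD5 hash with a simple modulo are all assumptions made for illustration, and plain Python dictionaries stand in for the cache servers. Real clients often use consistent hashing instead of a modulo so that adding a node moves fewer keys.

```python
import hashlib


class ShardedCache:
    """Toy client that spreads keys across independent in-memory nodes.

    Hypothetical illustration only; each dict stands in for one cache server.
    """

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # The nodes never talk to each other; only the client knows them all.
        self.nodes = {name: {} for name in self.node_names}

    def _node_for(self, key):
        # Hash the key and map the result onto one node (assumed scheme).
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        index = int(digest, 16) % len(self.node_names)
        return self.node_names[index]

    def set(self, key, value):
        # Store the value on the single node that owns this key.
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        # Look only at the owning node; a missing key yields None.
        return self.nodes[self._node_for(key)].get(key)


cache = ShardedCache(["node-a", "node-b", "node-c"])
cache.set("user:42:name", "alice")
cache.set("photo:7:likes", 128)
print(cache.get("user:42:name"))  # prints "alice"
```

Because the same key always hashes to the same node, a client can find a value without asking every server, and adding more nodes adds more total memory to the pool.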
The Memcached developers emphasize certain values, which they call the design philosophies of the project. Some of these philosophies, briefly explained, are[9]:

- Simple Key Storage: to simplify data search and storage.
- Smarts Half in Client, Half in Server: the implementation lives partly in the client and partly in the server, splitting the logic into two halves.
- Servers are Disconnected From Each Other: each server is independent and does not interact with the other servers; more servers means more capacity.
- O(1) Everything: Memcached tries to make every command O(1), using simple storage operations.
- Forgetting Data is a Feature: once stored data becomes old and “stale”, it is removed to free up space and speed up the service.
- Cache Invalidation is a Hard Problem: “Given memcached's centralized-as-a-cluster nature, the job of invalidating a cache entry is trivial. Instead of broadcasting data to all available hosts, clients direct in on the exact location of data to be invalidated. You may further complicate matters to your needs, and there are caveats, but you sit on a strong baseline”
- The Protocol: “There are commercial entities, other OSS projects, and so on. Memcached clients are fairly common, and there are many uses for a memcached-like cluster layout. We will continue to see other projects "speak" memcached, which in turn influences memcached as a culture and as software itself.”
- Persistent Storage: “In many cases memcached is not a great fit here; expensive flash memory is needed to keep data access performant. In most common scenarios, disaster can ensue if a server is unavailable and later comes up with old data.”
- Storage Engines: “Storage Engines in general are an important future to memcached as a service. Aside from our venerable slabbing algorithm, there are other memory backends to experiment with.”

Many major companies are known to use Memcached alongside their databases.
They use this service for its speed and features. These companies include LiveJournal, Wikipedia, Flickr, Bebo, Twitter, Typepad, Yellowbot, YouTube, Digg, WordPress.com, Craigslist, and Mixi[8].

Sources
[1] http://en.wikipedia.org/wiki/Redis
[2] http://en.wikipedia.org/wiki/VMware
[3] http://news.idg.no/cw/art.cfm?id=E705422F-A7A5-61B5-E80D79D38FA22637
[4] http://www.scribd.com/doc/89025069/Mike-Krieger-Instagram-at-the-Airbnb-tech-talk-on-ScalingInstagram
[5] http://instagram-engineering.tumblr.com/
[6] http://www.webresourcesdepot.com/25-alternative-open-source-databases-engines/
[7] http://en.wikipedia.org/wiki/Memcached
[8] http://memcached.org/
[9] http://code.google.com/p/memcached/wiki/NewOverview