6. Files of (horizontal) Records

The concept of pages or blocks suffices when doing I/O, but the higher layers of a DBMS operate on records and files of records.
• FILE: A collection of pages, each containing a collection of records, which supports:
  – insert, delete, and modify (on a record)
  – read a particular record (specified using a Record ID, or RID)
  – scan of all records (possibly with some conditions on the records to be retrieved)

File Types
• The three basic file organizations supported by the File Manager of most DBMSs are:
  – HEAP FILES (files of un-ordered records)
  – SORTED or CLUSTERED FILES (records sorted or clustered on some field(s))
  – HASHED FILES (files in which records are positioned based on a hash function on some field(s))

Unordered (Heap) Files
• Simplest file structure; contains records in no particular order.
• As the file grows and shrinks, disk pages are allocated and deallocated.
• To support record-level operations, the DBMS must:
  – keep track of the pages in a file
  – keep track of the free space on pages
  – keep track of the records on a page
• There are many alternatives for keeping track of these.

Heap File Implemented as a Linked List
[Figure: a header page with two linked lists of data pages, one list of full pages and one list of pages with free space.]
• The header page id and the heap file name must be stored someplace.
• Each page contains 2 `pointers' plus data.

Heap File Using a Page Directory
[Figure: a directory (a linked list of header blocks containing page IDs) pointing to data pages 1 through N.]
• The directory entry for a page can include the number of free bytes on the page.
• The directory is a collection of pages; the linked-list implementation is just one alternative.

Heap File Facts
Record insert, Method-1: the system inserts new records at the end of the file (this needs a next-open-slot indicator); following a deletion, it moves the last record into the freed slot and updates the indicator.
  – This doesn't allow support of the RID or RRN concept.
Alternatively, a deleted record slot can remain empty until the file is reorganized.
  – If records are only moved into freed slots upon reorganization, then RIDs and RRNs can be supported.
[Figure: a page with six slots; slots 0-2 hold records, slots 3-5 are empty, and the next-open-slot indicator holds 3.]

Heap File Facts
Record insert, Method-2: insert in any open slot. The system must maintain a data structure indicating open slots, either as a list or as a bit filter (bit map) identifying the open slots.
[Figure: a page holding three records and the availability bit filter 101001, where 0 means available.]
(A small Python sketch of a page with an availability bit filter and a slot directory follows the sorted-file discussion just below.)

If we want all records with a given value in a particular field, we need an "index". Of course, index files must provide a fast way to find the particular value entries of interest (the heap file organization would make little sense for index files). Index files are usually sorted files. Indexes are examples of ACCESS PATHS.

Sorted File (Clustered File) Facts
The file is sorted on one attribute (e.g., using the unpacked record-pointer page format).
Advantages over a heap file include:
  – reading the records in that particular order is efficient
  – finding the next record in order is efficient
For efficient "value-based" ordering (clustering), a level of indirection is useful: the unpacked record-pointer page format, in which a slot directory of record pointers is kept in sort order while the records themselves stay where they were placed.
[Figure: page 3 in the unpacked record-pointer format; six records occupy slots 0-5 and the slot directory (5 2 0 3 4 1) records the clustering order.]
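To make the slot bookkeeping concrete, here is a small Python sketch of a page that keeps an availability bit filter (Method-2 above) together with a slot directory that can be re-ordered to give the unpacked record-pointer format. The class and method names (SlottedPage, sort_directory, and so on) are invented for illustration and do not come from any particular DBMS.

  # A toy slotted page: a fixed number of slots, an availability bit filter
  # (1 = occupied, 0 = available), and a slot directory that can be kept in
  # key order to give the "unpacked record-pointer" page format.
  class SlottedPage:
      def __init__(self, num_slots=6):
          self.records = [None] * num_slots   # record storage, one record per slot
          self.occupied = [0] * num_slots     # bit filter: 0 means the slot is free
          self.directory = []                 # slot numbers, optionally in sort order

      def insert(self, record):
          """Method-2 insert: place the record in any open slot (RID = slot number)."""
          for slot, bit in enumerate(self.occupied):
              if bit == 0:
                  self.records[slot] = record
                  self.occupied[slot] = 1
                  self.directory.append(slot)
                  return slot                 # the slot number serves as the RID on this page
          raise RuntimeError("page full: the caller must allocate an overflow page")

      def delete(self, slot):
          """Deleting only clears the availability bit, so surviving RIDs stay valid."""
          self.occupied[slot] = 0
          self.records[slot] = None
          self.directory.remove(slot)

      def sort_directory(self, key):
          """Re-order only the directory (not the records) to cluster the page by key."""
          self.directory.sort(key=lambda s: key(self.records[s]))

      def scan_in_order(self):
          """Yield records in directory order: the sort order after sort_directory."""
          for slot in self.directory:
              yield self.records[slot]

  # Example: fill a page, then cluster it on the key field without moving records.
  page = SlottedPage()
  for k in [33, 30, 34, 32, 31, 38]:
      page.insert({"key": k})
  page.sort_directory(key=lambda r: r["key"])
  print([r["key"] for r in page.scan_in_order()])   # [30, 31, 32, 33, 34, 38]

Only the directory moves during clustering, which is exactly why the unpacked record-pointer format makes value-based reorganization cheap.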
What happens when a page fills up? Use an overflow page for the next record?
When a page fills up and, e.g., a record must be inserted and clustered between (3,1) and (3,5), one solution is simply to place it on an overflow page in arrival order. The overflow page is then scanned like an unordered file page when necessary. Periodically the primary and overflow pages can be reorganized as an unpacked record-pointer extent to improve sequential access speed (see the next example).
[Figure: overflow page 9 holding the new record, RID (3,6), in its first slot.]

Sorted File (Clustered File) Facts
Reorganizing a sorted file with several overflow levels.
[Figure, BEFORE: primary page 3 plus overflow pages 9 and 2; later arrivals with RIDs (3,6), (3,9), (3,5), (3,11), (3,10), (3,15) and (3,7) sit on the overflow pages in arrival order. AFTER: the same records redistributed so that page 3 and overflow page 9 hold the clustering order, with only RID (3,15) left on overflow page 2.]
Here, the reorganization requires only 2 record swaps and 1 slot-directory rewrite.

Hash Files
A hash function is applied to the key of a record to determine which "file bucket" it goes to ("file buckets" are usually the pages of that file). Assume there are M pages, numbered 0 through M-1. Then the hash function can be any function that converts the key to a number between 0 and M-1 (for numeric keys, mod M is typically used; for non-numeric keys, first map the key value to a number and then apply mod M).
Collisions or overflows occur when a new record hashes to a bucket that is already full. The simplest overflow method is to use separate overflow pages: overflow pages are allocated as needed, either as a separate linked list for each bucket (page numbers are needed as pointers) or as a single shared linked list.
[Figure: h(key) mod M directs each key to one of the primary bucket pages 0 through M-1; overflow pages hang off the buckets as linked lists.]
Long overflow chains can develop and degrade performance.
  – Extendible and Linear Hashing are dynamic techniques that fix this problem.

Other Static Hashing Overflow Handling Methods
Overflow can also be handled by open addressing (more commonly used for internal hash tables, where a bucket is an allocation of main memory rather than a page). In open addressing, upon a collision, search forward in the bucket sequence for the next open record slot.
[Figure: h(rec_key) = 1 collides with an occupied bucket; bucket 2 is also full, bucket 3 is open, so the record goes there.]
To search, apply h; if the record is not found there, search sequentially ahead until it is found (wrapping around to the starting point).

Other Overflow Handling Methods
Overflow can also be handled by re-hashing: upon a collision, apply the next hash function from a sequence of hash functions h0, h1, h2, ....
To search, apply h0; if the record is not found, apply the next hash function, and so on, until it is found or the list of hash functions is exhausted.
These methods can also be combined.

Extendible Hashing
Idea: use a directory of pointers to buckets,
• split just the bucket that overflowed
• double the directory when needed
The directory is much smaller than the file, so doubling it is cheap. Only one page of data entries is split. No overflow pages! The trick lies in how the hash function is adjusted.
Example: blocking factor (bfr) = 4 (# entries per bucket), with a LOCAL DEPTH for each bucket and a GLOBAL DEPTH (gd) for the directory.
To find the bucket for a new key value r, take just the last global-depth bits of h(r), not all of them (the last 2 bits in this example).
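As a quick, illustrative sketch (not tied to any particular implementation), taking the last global-depth bits of h(r) is just a bit mask, or equivalently a modulus by a power of two:

  def directory_index(hashed_key, global_depth):
      # Keep only the last global_depth bits of the hash value;
      # equivalent to hashed_key % (2 ** global_depth).
      return hashed_key & ((1 << global_depth) - 1)

  # With h(r) = r and a global depth of 2, key 5 = 101 in binary
  # lands on directory entry 01:
  print(directory_index(5, 2))   # prints 1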
[Figure: a directory of global depth 2 with entries 00, 01, 10, 11. Entry 00 points to bucket A holding 4*, 12*, 32*, 16*; entry 01 points to bucket B (1*, 5*, 21*, 13*); entry 10 points to bucket C (10*); entry 11 points to bucket D (15*, 7*, 19*). Each bucket has local depth 2.]
(For simplicity we let h(r) = r here.) E.g., h(5) = 5 = 101 in binary, so it belongs to the bucket pointed to by directory entry 01.
• Local depth of a bucket: the number of bits used to determine whether an entry belongs to that bucket.
• Global depth of the directory: the maximum number of bits needed to tell which bucket an entry belongs to (= the maximum of the local depths).
Search: apply the hash function h to the key value r and follow the pointer selected by the last gd bits of h(r).
Insert: if the target bucket is full, split it (allocate 1 new page and redistribute the entries over those 2 pages).

Example: how did we get there?
• The first insert is 4: h(4) = 4 = 100 in binary, so it goes to the bucket pointed to by directory entry 0. (Global depth 1; buckets A and B, each with local depth 1.)
• Insert 12, 32, 16 and 1:
  – h(12) = 12 = 1100, h(32) = 32 = 10 0000 and h(16) = 16 = 1 0000 all go to the bucket pointed to by 0 (bucket A).
  – h(1) = 1 = 1 goes to the bucket pointed to by 1 (bucket B).
• Insert 5, 21 and 13: h(5) = 101, h(21) = 1 0101 and h(13) = 1101 all go to bucket B.
• 9th insert, 10: h(10) = 10 = 1010 goes to the bucket pointed to by 0. Collision! Split bucket A into A and C. Double the directory (by copying what is there and adding a bit on the left), reset one pointer, and redistribute the values among A and C if necessary. It is not necessary this time, since the 2's-position bits are already correct: 4 = 100, 12 = 1100, 32 = 10 0000 and 16 = 1 0000 all end in 00 and stay in A, while the new entry 10 = 1010 ends in 10 and goes to C.
• Insert 15, 7 and 19: h(15) = 1111, h(7) = 111 and h(19) = 1 0011 all go to bucket B. Collision! Split bucket B into B and D. There is no need to double the directory, because the local depth of B is less than the global depth. Reset one pointer and redistribute the values among B and D if necessary (not necessary this time: 1, 5, 21 and 13 end in 01 and stay in B, while 15, 7 and 19 end in 11 and go to D). Reset the local depths of B and D to 2.
• Insert 20: h(20) = 20 = 10100. The bucket pointed to by 00 (bucket A) is full! Split A into A and E (the `split image' of A), double the directory, reset one pointer, and redistribute the contents of A: 32 and 16 (last three bits 000) stay in A, while 4, 12 and the new entry 20 (last three bits 100) go to E. The global depth and the local depths of A and E become 3.

Points to Note
• 20 = 10100 in binary. Its last 2 bits (00) tell us that it belongs in either A or its split image E, but not which one; the last 3 bits are needed to tell which one.
  – Local depth of a bucket: the number of bits used to determine whether an entry belongs to this bucket.
  – Global depth of the directory: the maximum number of bits needed to tell which bucket an entry belongs to (= the maximum of the local depths).
• When does a bucket split cause directory doubling?
  – When, before the insert, the local depth of the bucket equals the global depth. The insert causes the local depth to become greater than the global depth, so the directory is doubled by copying it over and then `fixing' the pointer to the split-image page.
• Use of the least significant bits is what enables this efficient doubling of the directory via copying.
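The worked example condenses to a short program. The following Python sketch is illustrative only (the names ExtendibleHash and Bucket, the blocking factor of 4, and the simplification h(r) = r are assumptions carried over from the example, not a production design); it splits only the overflowing bucket and doubles the directory exactly when that bucket's local depth equals the global depth.

  class Bucket:
      def __init__(self, local_depth):
          self.local_depth = local_depth
          self.entries = []

  class ExtendibleHash:
      def __init__(self, bfr=4):
          self.bfr = bfr                    # blocking factor: entries per bucket
          self.global_depth = 1
          self.directory = [Bucket(1), Bucket(1)]   # indexed by the last gd bits of h(r)

      def _index(self, key):
          return key & ((1 << self.global_depth) - 1)   # h(r) = r for simplicity

      def insert(self, key):
          bucket = self.directory[self._index(key)]
          if len(bucket.entries) < self.bfr:
              bucket.entries.append(key)
              return
          # The bucket is full: split it, doubling the directory first if needed.
          if bucket.local_depth == self.global_depth:
              self.directory = self.directory + self.directory   # double by copying
              self.global_depth += 1
          bucket.local_depth += 1
          image = Bucket(bucket.local_depth)        # the bucket's `split image'
          high_bit = 1 << (bucket.local_depth - 1)  # the newly significant bit
          old_entries, bucket.entries = bucket.entries, []
          for k in old_entries:
              (image if k & high_bit else bucket).entries.append(k)
          # Re-aim the directory pointers that should now reference the split image.
          for i, b in enumerate(self.directory):
              if b is bucket and i & high_bit:
                  self.directory[i] = image
          self.insert(key)            # retry; may trigger another split on skewed data

  # The insert sequence from the example: the global depth ends up at 3 after inserting 20.
  eh = ExtendibleHash()
  for k in [4, 12, 32, 16, 1, 5, 21, 13, 10, 15, 7, 19, 20]:
      eh.insert(k)
  print(eh.global_depth)              # prints 3

Deletion and directory halving are omitted here; they mirror the merge rule described in the comments that follow.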
Comments on Extendible Hashing
• If the directory fits in memory, an equality search is answered with one disk access; otherwise two.
  – The directory grows in spurts and, if the distribution of hash values is skewed, the directory can grow large.
  – Multiple entries with the same hash value cause problems!
• Delete: if the removal of a data entry makes a bucket empty, the bucket can be merged with its `split image'.
  – As soon as each directory element points to the same bucket as its (merged) split image, the directory can be halved.

Linear Hash File
Start with M buckets (numbered 0, 1, ..., M-1) and an initial hash function h0 = mod M (or, more generally, h0(key) = h(key) mod M for any hash function h that maps into the integers). Use chaining to shared overflow pages to handle overflows.
At the first overflow, split bucket 0 into bucket 0 and bucket M and rehash bucket 0's records using h1 = mod 2M. From then on, if h0 yields the value 0, rehash using h1.
At the next overflow, split bucket 1 into bucket 1 and bucket M+1 and rehash bucket 1's records using h1. From then on, if h0 yields the value 1, use h1. And so on.
When all of the original M buckets have been split (after M overflows), rehash all overflow records using h1, relabel h1 as h0 (discarding the old h0 forever), and start a new "round" by repeating the process above for all future overflows (i.e., there are now buckets 0, ..., 2M-1 and h0 = mod 2M).
To search for a record: let n be the number of buckets split so far in the current round (so buckets 0 through n-1 have been split and bucket n is the next to be split). If h0(key) is less than n, use h1; otherwise use h0.

Linear Hash example, M = 5
[Figure: the initial file has buckets 0-4 on pages 45, 99, 23, 78 and 98, holding employee-style records such as 02|BAID, 22|ZHU, 25|CLAY, 33|GOOD, 11|BROWN, 21|BARBIE, 14|THAISZ and 24|CROWE; pages 21, 101, 104 and 105 are added as buckets 5-8 during the round, and a shared overflow page collects 27|JONES, 32|FARNS, 39|TULIP and 31|ROSE.]
The insert trace on the slides runs roughly as follows (h0 = mod 5, h1 = mod 10):
• Insert 27 (JONES): h0(27) = 2, and bucket 2 is full, a collision. Split bucket 0 into buckets 0 and 5, rehashing bucket 0's records with mod 10; n = 1. The record 27 itself goes to the shared overflow page.
• Insert 8 (SINGH): h0(8) = 3, which fits in bucket 3.
• Insert 15 (LOWE): h0(15) = 0 < n, so use h1; mod 10 gives 5, and the record goes to bucket 5.
• Insert 32 (FARNS): h0(32) = 2, another collision. Split bucket 1 into buckets 1 and 6, rehashing with mod 10; n = 2.
• Insert 39 (TULIP): h0(39) = 4, a collision. Split bucket 2 into buckets 2 and 7; n = 3.
• Insert 31 (ROSE): h0(31) = 1 < n, so use h1; mod 10 gives 1, and bucket 1 is full, a collision. Split bucket 3 into buckets 3 and 8; n = 4.
• Insert 36 (SCHOTZ): h0(36) = 1 < n, so use h1; mod 10 gives 6.
Second round, M = 10, h0 = mod 10: the overflow records are rehashed (h0(27) = 7; h0(32) = 2, a collision, rehash with mod 20; h0(39) = 9; h0(31) = 1, a collision, rehash with mod 20), and a new insert such as 10 (RADHA) goes to h0(10) = 0. Etc.
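The round-based splitting and the two-hash-function search rule can also be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name LinearHash is invented, a split is triggered by any overflow, overflow records live in per-bucket Python lists rather than shared overflow pages, and a rehash is allowed to overfill its destination bucket.

  class LinearHash:
      def __init__(self, M=5, capacity=4):
          self.M = M                  # number of buckets at the start of the round
          self.n = 0                  # number of buckets split so far in this round
          self.capacity = capacity    # records per primary bucket page
          self.buckets = [[] for _ in range(M)]
          self.overflow = [[] for _ in range(M)]   # stands in for overflow pages

      def _bucket_of(self, key):
          # Search rule: buckets 0 .. n-1 have already been split, so use h1 for them.
          b = key % self.M            # h0
          if b < self.n:
              b = key % (2 * self.M)  # h1
          return b

      def insert(self, key):
          b = self._bucket_of(key)
          if len(self.buckets[b]) < self.capacity:
              self.buckets[b].append(key)
              return
          # Overflow: chain the record, then split the next bucket in round-robin order.
          self.overflow[b].append(key)
          self._split()

      def _split(self):
          s = self.n                  # the bucket to split: 0, 1, ..., M-1 in turn
          self.buckets.append([])     # new bucket s + M
          self.overflow.append([])
          moved = self.buckets[s] + self.overflow[s]
          self.buckets[s], self.overflow[s] = [], []
          for k in moved:
              self.buckets[k % (2 * self.M)].append(k)   # rehash with h1
          self.n += 1
          if self.n == self.M:        # every original bucket has been split
              self.M *= 2             # h1 becomes the new h0; start a new round
              self.n = 0

  # A small run in the spirit of the M = 5 example above.
  lh = LinearHash(M=5)
  for k in list(range(25)) + [27, 8, 15, 32, 39, 31, 36]:
      lh.insert(k)
  print(lh.M, lh.n, len(lh.buckets))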
Summary
• Hash-based indexes are best for equality searches; they cannot support range searches.
• Static hashing can lead to performance degradation due to collision-handling problems.
• Extendible hashing avoids these performance problems by splitting a full bucket when a new data entry must be added to it. (Duplicates may still require overflow pages.)
  – A directory keeps track of the buckets and doubles periodically.
  – The directory can get large with skewed data, and there is an additional I/O per search if it does not fit in main memory.

Summary
• Linear hashing avoids the directory by splitting buckets round-robin and using overflow pages.
  – Overflow chains are not likely to be long.
  – Duplicates are handled easily.
  – Space utilization can be lower than with extendible hashing, since splits are not concentrated on `dense' data areas.
• Skew occurs when the hash values of the data entries are not uniform.
[Figure: three histograms over the values v1, v2, ..., vn illustrating distribution skew, count skew, and combined distribution-and-count skew.]

Map Reduce (from Wikipedia)
MapReduce is a framework for processing parallelizable problems across huge datasets using a large number of computers (nodes). Computational processing can occur on data stored either in a file system (unstructured) or in a database (structured).
• "Map" step: The master node takes the input, divides it into smaller sub-problems, and distributes them to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes the smaller problem and passes the answer back.
• "Reduce" step: The master node then collects the answers to all the sub-problems and combines them in some way to form the output, the answer to the problem it was originally trying to solve.
MapReduce allows for distributed processing of the map and reduce operations. Provided each mapping operation is independent of the others, all maps can be performed in parallel. Similarly, a set of 'reducers' can perform the reduce phase, provided that all outputs of the map operation sharing the same key are presented to the same reducer at the same time, or that the reduce function is associative.
Another way to look at MapReduce is as a 5-step parallel and distributed computation (a toy single-process version of these steps is sketched below):
• Prepare the Map() input: the "MapReduce system" designates Map processors, assigns each processor the K1 input key value it will work on, and provides that processor with all the input data associated with that key value.
• Run the user-provided Map() code: Map() is run exactly once for each K1 key value, generating output organized by key values K2.
• "Shuffle" the Map output to the Reduce processors: the MapReduce system designates Reduce processors, assigns each processor the K2 key value it will work on, and provides that processor with all the Map-generated data associated with that key value.
• Run the user-provided Reduce() code: Reduce() is run exactly once for each K2 key value produced by the Map step.
• Produce the final output: the MapReduce system collects all the Reduce output and sorts it by K2 to produce the final outcome.
Logically these 5 steps can be thought of as running in sequence (each step starts only after the previous step is completed), though in practice they can be interleaved, as long as the final result is not affected.
In many situations the input data are already distributed among many different servers, in which case step 1 can sometimes be greatly simplified by assigning Map servers that process the locally present input data. Similarly, step 3 can sometimes be sped up by assigning Reduce processors that are as close as possible to the Map-generated data they need to process.
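These five steps can be mimicked in a few lines of single-process Python. This is only a sketch of the dataflow: the function run_mapreduce and its arguments are invented for illustration, and a real MapReduce system distributes each step across many nodes.

  from collections import defaultdict

  def run_mapreduce(inputs, map_fn, reduce_fn):
      # Steps 1-2: run Map once per (k1, v1) input pair.
      intermediate = []
      for k1, v1 in inputs:
          intermediate.extend(map_fn(k1, v1))
      # Step 3: "shuffle", i.e. group all (k2, v2) pairs by k2.
      groups = defaultdict(list)
      for k2, v2 in intermediate:
          groups[k2].append(v2)
      # Step 4: run Reduce once per k2 key.
      output = []
      for k2, values in groups.items():
          output.extend(reduce_fn(k2, values))
      # Step 5: collect the final output, sorted by k2.
      return sorted(output)

The word-count and average-contacts functions discussed next plug directly into this driver.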
The Map and Reduce functions of MapReduce are both defined with respect to data structured in (key, value) pairs. Map takes one pair of data with a type in one data domain and returns a list of pairs in a different domain:

  Map(k1, v1) → list(k2, v2)

The Map function is applied in parallel to every pair in the input dataset. This produces a list of pairs for each call. After that, the MapReduce framework collects all pairs with the same key from all lists and groups them together, creating one group for each key.
The Reduce function is then applied in parallel to each group, producing a collection of values in the same domain:

  Reduce(k2, list(v2)) → list(v3)

Each Reduce call typically produces either one value v3 or an empty return, though one call is allowed to return more than one value. The returns of all calls are collected as the desired result list. Thus the MapReduce framework transforms a list of (key, value) pairs into a list of values.

The prototypical MapReduce example counts the appearances of each word in a set of documents:

  function map(String name, String document):
    // name: document name; document: document contents
    for each word w in document:
      emit (w, 1)

  function reduce(String word, Iterator partialCounts):
    // word: a word; partialCounts: a list of aggregated partial counts
    sum = 0
    for each pc in partialCounts:
      sum += ParseInt(pc)
    emit (word, sum)

Here, each document is split into words, and each word is counted by the map function, using the word as the result key. The framework puts together all the pairs with the same key and feeds them to the same call to reduce, so this function just needs to sum all of its input values to find the total appearances of that word.

As another example, imagine that for a database of 1.1 billion people one would like to compute the average number of social contacts a person has, broken down by age. In SQL such a query could be expressed as:

  SELECT age AS Y, AVG(contacts) AS A
  FROM social.person
  GROUP BY age
  ORDER BY age;

Using MapReduce, the K1 key values could be the integers 1 through 1,100, each representing a batch of 1 million records, and the K2 key value could be a person's age in years; the computation could be achieved using the following functions:

  function Map is
    input: integer K1 between 1 and 1100, representing a batch of 1 million social.person records
    for each social.person record in the K1 batch do
      let Y be the person's age
      let N be the number of contacts the person has
      produce one output record <Y,N>
    repeat
  end function

  function Reduce is
    input: age (in years) Y
    for each input record <Y,N> do
      accumulate in S the sum of N
      accumulate in C the count of records so far
    repeat
    let A be S/C
    produce one output record <Y,A>
  end function

Map Reduce-2
The MapReduce system would line up the 1,100 Map processors and provide each with its corresponding 1 million input records. The Map step would produce 1.1 billion <Y,N> records, with Y values ranging between, say, 8 and 103. The MapReduce system would then line up the 96 Reduce processors (by performing a shuffle of the key/value pairs, since we need one average per age) and provide each with its millions of corresponding input records. The Reduce step would result in the much-reduced set of only 96 output records <Y,A>, which would be put in the final result file, sorted by Y.
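For concreteness, here is the average-contacts computation written against the toy run_mapreduce driver sketched earlier. The record layout, the batching into two tiny K1 groups, and the sample numbers are all made up for illustration; they stand in for the 1,100 batches of one million records each.

  def map_contacts(k1, batch):
      # k1: batch number; batch: the list of person records in that batch.
      return [(person["age"], person["contacts"]) for person in batch]

  def reduce_contacts(age, contact_counts):
      # Called once per distinct age: average the contact counts for that age.
      return [(age, sum(contact_counts) / len(contact_counts))]

  batches = [
      (1, [{"age": 25, "contacts": 100}, {"age": 30, "contacts": 80}]),
      (2, [{"age": 25, "contacts": 140}, {"age": 40, "contacts": 60}]),
  ]
  print(run_mapreduce(batches, map_contacts, reduce_contacts))
  # [(25, 120.0), (30, 80.0), (40, 60.0)]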
Dataflow
The frozen part of the MapReduce framework is a large distributed sort. The hot spots, which the application defines, are:
• an input reader
• a Map function
• a partition function
• a compare function
• a Reduce function
• an output writer

Input reader: The input reader divides the input into appropriately sized 'splits' (in practice typically 16 MB to 128 MB), and the framework assigns one split to each Map function. The input reader reads data from stable storage (typically a distributed file system) and generates key/value pairs. A common example reads a directory full of text files and returns each line as a record.

Map function: The Map function takes a series of key/value pairs, processes each, and generates zero or more output key/value pairs. The input and output types of the map can be (and often are) different from each other. If the application is doing a word count, the map function breaks the line into words and outputs a key/value pair for each word. Each output pair contains the word as the key and the number of instances of that word in the line as the value.

Partition function: Each Map function output is allocated to a particular reducer by the application's partition function, for sharding purposes. The partition function is given the key and the number of reducers and returns the index of the desired reducer. A typical default is to hash the key and use the hash value modulo the number of reducers (a short sketch of such a default appears at the end of this subsection). It is important to pick a partition function that gives an approximately uniform distribution of data per shard for load-balancing purposes; otherwise the MapReduce operation can be held up waiting for slow reducers (reducers assigned more than their share of the data) to finish. Between the map and reduce stages, the data are shuffled (parallel-sorted / exchanged between nodes) in order to move each piece of data from the map node that produced it to the shard in which it will be reduced. The shuffle can sometimes take longer than the computation, depending on network bandwidth, CPU speeds, the amount of data produced, and the time taken by the map and reduce computations.

Comparison function: The input for each Reduce is pulled from the machine where the Map ran and sorted using the application's comparison function.

Reduce function: The framework calls the application's Reduce function once for each unique key, in the sorted order. The Reduce can iterate through the values associated with that key and produce zero or more outputs. In the word count example, the Reduce function takes the input values, sums them, and generates a single output of the word and the final sum.

Output writer: The Output Writer writes the output of the Reduce to stable storage, usually a distributed file system.

Distribution and reliability: MapReduce achieves reliability by parceling out a number of operations on the set of data to each node in the network. Each node is expected to report back periodically with completed work and status updates. If a node falls silent for longer than that interval, the master node (similar to the master server in the Google File System) records the node as dead and sends out the node's assigned work to other nodes. Individual operations use atomic operations for naming file outputs as a check to ensure that there are no parallel conflicting threads running. When files are renamed, it is possible to also copy them to another name in addition to the task name (allowing for side effects).
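The default partitioner described above (hash the key, then take the hash value modulo the number of reducers) is essentially a one-liner. The sketch below is illustrative only; it uses a stable hash from hashlib rather than Python's built-in hash(), which is randomized between runs.

  import hashlib

  def default_partition(key, num_reducers):
      # Hash the key and take the value modulo the number of reducers,
      # so that equal keys always land on the same reducer.
      digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
      return int(digest, 16) % num_reducers

  # Distinct words may or may not share a reducer; the same word always maps to the same one.
  print(default_partition("the", 4), default_partition("the", 4), default_partition("word", 4))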
Map Reduce-3
The reduce operations operate much the same way. Because of their inferior properties with regard to parallel operation, the master node attempts to schedule reduce operations on the same node, or at least in the same rack, as the node holding the data being operated on. This property is desirable because it conserves bandwidth across the backbone network of the datacenter.
Implementations are not necessarily highly reliable. For example, in older versions of Hadoop the NameNode was a single point of failure for the distributed filesystem. Later versions of Hadoop provide high availability with an active/passive failover for the NameNode.

Map Reduce-4
Uses: MapReduce is useful in a wide range of applications, including distributed pattern-based searching, distributed sorting, web link-graph reversal, term-vector per host, web access log statistics, inverted index construction, document clustering, machine learning,[5] and statistical machine translation. Moreover, the MapReduce model has been adapted to several computing environments, such as multi-core and many-core systems,[6][7] desktop grids,[8] volunteer computing environments,[9] dynamic cloud environments,[10] and mobile environments.[11]
At Google, MapReduce was used to completely regenerate Google's index of the World Wide Web. It replaced the old ad hoc programs that updated the index and ran the various analyses.[12]
MapReduce's stable inputs and outputs are usually stored in a distributed file system. The transient data are usually stored on local disk and fetched remotely by the reducers.

Criticism: David DeWitt and Michael Stonebraker, computer scientists specializing in parallel databases and shared-nothing architectures, have been critical of the breadth of problems that MapReduce can be used for.[13] They called its interface too low-level and questioned whether it really represents the paradigm shift its proponents have claimed it is.[14] They challenged the MapReduce proponents' claims of novelty, citing Teradata as an example of prior art that has existed for over two decades. They also compared MapReduce programmers to Codasyl programmers, noting that both are "writing in a low-level language performing low-level record manipulation."[14] MapReduce's use of input files and lack of schema support prevents the performance improvements enabled by common database system features such as B-trees and hash partitioning, though projects such as Pig (or PigLatin), Sawzall, Apache Hive,[15] YSmart,[16] HBase[17] and BigTable[17][18] are addressing these problems.
Greg Jorgensen wrote an article rejecting these views.[19] Jorgensen asserts that DeWitt and Stonebraker's entire analysis is groundless, because MapReduce was never designed nor intended to be used as a database.
DeWitt and Stonebraker published a detailed benchmark study in 2009 comparing the performance of Hadoop's MapReduce and RDBMS approaches on several specific problems.[20] They concluded that relational databases offer real advantages for many kinds of data use, especially for complex processing or where the data are used across an enterprise, but that MapReduce may be easier for users to adopt for simple or one-time processing tasks. They have published the data and code used in their study to allow other researchers to do comparable studies.
Google has been granted a patent on MapReduce.[21] However, there have been claims that this patent should not have been granted because MapReduce is too similar to existing products.
For example, map and reduce functionality can be very easily implemented in Oracle's database-oriented language PL/SQL.[22]

Conferences and user groups
The First International Workshop on MapReduce and its Applications (MAPREDUCE'10) was held with the HPDC conference and the OGF'29 meeting in Chicago, IL. There are MapReduce users groups around the world.

Map Reduce-5
See also:
• Hadoop, Apache's free and open-source implementation of MapReduce
• Pentaho: open-source data integration (Kettle), analytics, reporting, visualization and predictive analytics directly from Hadoop nodes
• Nutch: an effort to build an open-source search engine based on Lucene and Hadoop, also created by Doug Cutting
• Datameer Analytics Solution (DAS): data source integration, storage, analytics engine and visualization
• Apache Accumulo: secure BigTable
• HBase: BigTable-model database
• Hypertable: HBase alternative
• Apache Cassandra: column-oriented database that supports access from Hadoop
• HPCC: LexisNexis Risk Solutions High Performance Computing Cluster
• Sector/Sphere: open-source distributed storage and processing
• Cloud computing
• Big data
• Data-intensive computing
• Algorithmic skeleton: a high-level parallel programming model for parallel and distributed computing
• MongoDB: a scalable, high-performance, open-source NoSQL database
• MapReduce-MPI: the MapReduce-MPI Library

Specific references:
[1] "Google spotlights data center inner workings". Tech news blog, CNET News.com.
[2] "Our abstraction is inspired by the map and reduce primitives present in Lisp and many other functional languages." From "MapReduce: Simplified Data Processing on Large Clusters", by Jeffrey Dean and Sanjay Ghemawat; Google Research.
[3] "Google's MapReduce Programming Model -- Revisited". Paper by Ralf Lämmel; Microsoft.
[4] http://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0004.html
[5] Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary Bradski, Andrew Ng, and Kunle Olukotun. "Map-Reduce for Machine Learning on Multicore". NIPS 2006.
[6] Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, and Christos Kozyrakis. "Evaluating MapReduce for Multi-core and Multiprocessor Systems". HPCA 2007, Best Paper.
[7] Bingsheng He, et al. "Mars: a MapReduce framework on graphics processors". PACT'08.
[8] Bing Tang, M. Moca, S. Chevalier, Haiwu He and G. Fedak. "Towards MapReduce for Desktop Grid Computing". 3PGCIC'10.
[9] Heshan Lin, et al. "MOON: MapReduce On Opportunistic eNvironments". HPDC'10.
[10] Fabrizio Marozzo, Domenico Talia, Paolo Trunfio. "P2P-MapReduce: Parallel data processing in dynamic Cloud environments". Journal of Computer and System Sciences, vol. 78, no. 5, pp. 1382-1402, Elsevier Science, September 2012.
[11] Adam Dou, et al. "Misco: a MapReduce framework for mobile systems". HPDC'10.
[12] "How Google Works". baselinemag.com. "As of October, Google was running about 3,000 computing jobs per day through MapReduce, representing thousands of machine-days. Among others, these batch routines analyze the latest Web pages and update Google's indexes."
[13] "Database Experts Jump the MapReduce Shark".
[14] David DeWitt; Michael Stonebraker. "MapReduce: A major step backwards". craig-henderson.blogspot.com. Retrieved 2008-08-27.
[15] "Apache Hive - Index of - Apache Software Foundation".
[16] Rubao Lee, et al. "YSmart: Yet Another SQL-to-MapReduce Translator" (PDF).

Map Reduce-6
[17] "HBase - HBase Home - Apache Software Foundation".
[18] "Bigtable: A Distributed Storage System for Structured Data" (PDF).
[19] Greg Jorgensen. "Relational Database Experts Jump The MapReduce Shark". typicalprogrammer.com. Retrieved 2009-11-11.
[20] D. J. DeWitt, M. Stonebraker, et al. "A Comparison of Approaches to Large-Scale Data Analysis". Brown University. Retrieved 2010-01-11.
[21] US Patent 7,650,331: "System and method for efficient large-scale data processing".
[22] Curt Monash. "More patent nonsense — Google MapReduce". dbms2.com. Retrieved 2010-03-07.

General references:
Dean, Jeffrey & Ghemawat, Sanjay (2004). "MapReduce: Simplified Data Processing on Large Clusters". Retrieved Nov. 23, 2011.
Matt Williams (2009). "Understanding Map-Reduce". Retrieved Apr. 13, 2011.

External links: Papers
• "CloudSVM: Training an SVM Classifier in Cloud Computing Systems". Paper by F. Ozgur Catak and M. Erdal Balaban; Springer, LNCS.
• "A Hierarchical Framework for Cross-Domain MapReduce Execution". Paper by Yuan Luo, Zhenhua Guo, Yiming Sun, Beth Plale and Judy Qiu (Indiana University) and Wilfred Li (University of California, San Diego).
• "Interpreting the Data: Parallel Analysis with Sawzall". Paper by Rob Pike, Sean Dorward, Robert Griesemer and Sean Quinlan; Google Labs.
• "Evaluating MapReduce for Multi-core and Multiprocessor Systems". Paper by Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski and Christos Kozyrakis; Stanford University.
• "Why MapReduce Matters to SQL Data Warehousing". Introduction to MapReduce/SQL integration by Aster Data Systems and Greenplum.
• "MapReduce for the Cell B.E. Architecture". Paper by Marc de Kruijf and Karthikeyan Sankaralingam; University of Wisconsin-Madison.
• "Mars: A MapReduce Framework on Graphics Processors". Paper by Bingsheng He, et al.; Hong Kong University of Science and Technology; published in Proc. PACT 2008. It presents the design and implementation of MapReduce on graphics processors.
• "A Peer-to-Peer Framework for Supporting MapReduce Applications in Dynamic Cloud Environments". Fabrizio Marozzo, et al.; University of Calabria. In Cloud Computing: Principles, Systems and Applications, chapter 7, pp. 113-125, Springer, 2010, ISBN 978-1-84996-240-7.
• "Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters". Paper by Hung-Chih Yang, et al.; Yahoo and UCLA; published in Proc. of ACM SIGMOD, pp. 1029-1040, 2007. (This paper shows how to extend MapReduce for relational data processing.)
• FLuX: the Fault-tolerant, Load-balancing eXchange operator from UC Berkeley provides an integration of partitioned parallelism with process pairs. This results in a more pipelined approach than Google's MapReduce, with instantaneous failover, but with additional implementation cost.
• "A New Computation Model for Rack-Based Computing". Paper by Foto N. Afrati and Jeffrey D. Ullman; Stanford University; not published as of Nov 2009. This paper is an attempt to develop a general model in which one can compare algorithms for computing in an environment similar to what MapReduce expects.
• "FPMR: MapReduce framework on FPGA". Paper by Yi Shan, Bo Wang, Jing Yan, Yu Wang, Ningyi Xu and Huazhong Yang (2010), in FPGA '10, Proceedings of the 18th Annual ACM/SIGDA International Symposium on Field Programmable Gate Arrays.