BigTable: Distributed Storage System for Structured Data
Sergejs Melderis
Dennis Kafura – CS5204 – Operating Systems

Unstructured Data vs. Structured Data
- Unstructured data is computerized information that does not have a data model
  - plain text, audio
- Structured data can be described by a data model
  - Flat
  - Hierarchical
  - Network
  - Relational
  - Dimensional
  - Object-relational

Relational Model and RDBMS
- most popular model for organizing structured data
- model based on first-order predicate logic
- provides a declarative method for specifying data and queries via SQL
- data is organized in tables of fixed-length records
- variety of open source and commercial implementations
- provides ACID properties

NoSQL
- not a relational database
- no fixed table schemas
- no join operations
- no SQL
- flexible data model, or none at all
- usually does not provide ACID properties
- scales horizontally

BigTable
- distributed, high-performance, fault-tolerant NoSQL storage system
- built on top of the Google File System (GFS)
- designed to scale to a very large size on low-cost commodity hardware
- designed by Google and used in various projects (e.g. web indexing)
- the paper was published in 2006
- related implementations
  - HBase
  - Hypertable
  - Apache Cassandra
  - Neptune

BigTable Data Model
- sparse, distributed, persistent multi-dimensional sorted map
- the map is indexed by a row key, column family, column key, and a timestamp

    { row : { column_family : { column : { timestamp : value } } } }

Webtable
  Row key: "com.cnn.www"

  Column key           Timestamp   Value
  contents             t6          "<html>..."
  anchor:cnnsi.com     t9          "CNN"
  anchor:my.look.ca    t9          "CNN.com"

Relational Data Model

  Student             StudentCourse    Course
  student_id (PK)     student_id       crn (PK)
  first_name          crn              course
  last_name                            title
  birthday                             type
  major                                instructor_id
  academic_level                       seats

Student table
  Row key: student_id

  Column Family    Column Qualifier
  info             last_name, first_name, birthday, major, academic_level
  courses          <crn>

Course table
  Row key: crn

  Column Family    Column Qualifier
  info             course, title, type, instructor_id, seats
  students         <student_id>

Example
  Student row "905514"
    info:first_name       "Sergejs"
    info:last_name        "Melderis"
    info:major            "Computer Science"
    courses:96322         "YES"
    courses:96320         "NO"

  Course row "96322"
    info:course           "CS5204"
    info:title            "Operating Systems"
    info:instructor_id    "1983943"
    students:905514       "YES"
    students:905520       "YES"

Students data view in JSON

    {
      "905514": {
        "info": {
          "first_name": { "t1": "Sergejs" },
          "last_name": { "t1": "Melderis" },
          "major": { "t1": "Comp Science" }
        },
        "courses": {
          "96322": { "t1": "YES" },
          "96320": { "t2": "NO" }
        }
      }
    }

Rows
- row keys are arbitrary strings of up to 64 KB
- every read or write of data under a single row key is atomic
- rows are kept in lexicographic order by row key
- the row range is dynamically partitioned into blocks called tablets
- tablets are the units of distribution and load balancing

Columns
- column keys are grouped into column families
- a column family is the basic unit of access control
- all data stored in a column family is usually of the same type
- the number of column families should be small
- a column family can hold an unlimited number of columns
- a column key is named using family:qualifier

Timestamps
- a Bigtable cell can contain multiple versions of the same data
- timestamps are 64-bit integers assigned by Bigtable or by the client
- the client can specify that only the last n versions of the data be kept
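The data model from the Rows, Columns, and Timestamps slides can be sketched as a small in-memory map. This is only a toy illustration under stated assumptions, not Bigtable's implementation or API: the class `ToyTable` and its methods `put`, `get`, and `scan` are hypothetical names, and the timestamp-1 version of `courses:96320` is an invented earlier version used to show version trimming.

```python
from bisect import bisect_left, insort


class ToyTable:
    """Toy model of (row key, "family:qualifier", timestamp) -> value."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.rows = {}       # row key -> {column key -> [(timestamp, value), ...]}
        self.row_keys = []   # row keys kept in lexicographic order

    def put(self, row, column, timestamp, value):
        if row not in self.rows:
            self.rows[row] = {}
            insort(self.row_keys, row)
        versions = self.rows[row].setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)  # newest first
        del versions[self.max_versions:]                    # keep at most n versions

    def get(self, row, column):
        """Return the newest value of a cell, or None if the cell is absent."""
        versions = self.rows.get(row, {}).get(column)
        return versions[0][1] if versions else None

    def scan(self, start_row, end_row):
        """Yield (row, columns) for start_row <= row < end_row, in key order."""
        lo = bisect_left(self.row_keys, start_row)
        hi = bisect_left(self.row_keys, end_row)
        for row in self.row_keys[lo:hi]:
            yield row, self.rows[row]


table = ToyTable(max_versions=1)
table.put("905520", "courses:96322", 1, "YES")
table.put("905514", "info:first_name", 1, "Sergejs")
table.put("905514", "courses:96320", 1, "YES")  # hypothetical older version
table.put("905514", "courses:96320", 2, "NO")   # newer version replaces it

print(table.get("905514", "courses:96320"))
print([row for row, _ in table.scan("905514", "905521")])
```

Keeping the row keys sorted is what makes a range scan cheap, and it is the same ordering that lets a contiguous range of rows be cut into tablets.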
Implementation
- client library
- one master server
- distributed lock service called Chubby
- many tablet servers, each serving several tablets
- a tablet server
  - handles read and write requests
  - automatically splits tablets that have grown too large (100-200 MB)
- client data goes directly to the tablet servers, not through the master

Tablet Location
- a three-level hierarchy stores tablet locations
- the first level is a file in the lock service that holds the location of the root tablet
- the root tablet contains the locations of the METADATA tablets
- the METADATA tablets contain the locations of the user tablets
- lookup chain: Lock Service -> Root tablet -> METADATA tablets -> UserTable1, UserTable2

Distribution of data
- one master server
- Chubby distributed lock service
- hundreds or thousands of tablet servers
- each tablet contains a contiguous range of rows
- the master distributes tablets across the tablet servers
- each tablet server holds tablets with different row ranges

Tablet Representation
- in memory: memtable
- in GFS: tablet log and SSTable files
- write op: recorded in the tablet log, then applied to the memtable
- read op: served from a merged view of the memtable and the SSTables

Compactions
- a compaction moves memtable contents into SSTables or merges existing SSTables
- minor compaction
  - writes the memtable to a new SSTable
  - shrinks the memory usage of the tablet server
  - reduces the amount of commit log that must be read during recovery
- merging compaction
  - merges several SSTables
- major compaction
  - rewrites all SSTables into exactly one SSTable

API
- create and delete tables and column families
- write or delete values
- look up values from individual rows
- scan over a subset of the data in a table
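The write path, read path, and compactions described in the Tablet Representation and Compactions slides can be sketched as follows. This is a minimal sketch under stated assumptions, not Bigtable's code: `ToyTablet` and its methods are hypothetical names, plain Python dicts and lists stand in for SSTable files and the tablet log in GFS, and deletions are left out.

```python
class ToyTablet:
    """Toy tablet: a commit log, an in-memory memtable, and immutable SSTables."""

    def __init__(self):
        self.log = []        # stand-in for the tablet log stored in GFS
        self.memtable = {}   # recent writes, held in memory
        self.sstables = []   # stand-ins for SSTable files, oldest first

    def write(self, key, value):
        self.log.append((key, value))  # log first so the write can be replayed
        self.memtable[key] = value

    def read(self, key):
        # Merged view: check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

    def minor_compaction(self):
        # Write the memtable out as a new SSTable; its log entries can be dropped.
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}
            self.log.clear()

    def major_compaction(self):
        # Rewrite all SSTables into exactly one; newer values overwrite older ones.
        if self.sstables:
            merged = {}
            for sstable in self.sstables:  # oldest first
                merged.update(sstable)
            self.sstables = [dict(sorted(merged.items()))]


tablet = ToyTablet()
tablet.write("905514/info:major", "Comp Science")
tablet.minor_compaction()
tablet.write("905514/info:major", "Computer Science")
tablet.minor_compaction()
tablet.major_compaction()
print(len(tablet.sstables), tablet.read("905514/info:major"))
```

In this sketch a minor compaction empties both the memtable and the log, which mirrors the two benefits listed on the slide: less memory in use and less log to replay after a crash.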