Hadoop-HBase-Tutorial - CSE Labs User Home Pages

Gowtham Rajappan  HDFS – Hadoop Distributed File System modeled on Google GFS.  Hadoop MapReduce – Similar to Google MapReduce  Hbase – Similar to Google Bigtable  Master: hadoop01.cselabs.umn.edu  Slaves: hadoop02 – hadoop05.cselabs.umn.edu  You will require cselabs account to access this cluster. You can login to any of these machines from any cs/cselabs machine.   Data is divided into various tables Table is composed of columns, columns are grouped into columnfamilies   Partitioning  A table is horizontally partitioned into regions, each region is composed of sequential range of keys  Each region is managed by a RegionServer, a single RegionServer may hold multiple regions Persistence and data availability  HBase stores its data in HDFS, it doesn't replicate RegionServers and relies on HDFS replication for data availability.  Region data is cached in-memory  Updates and reads are served from in-memory cache (MemStore)  MemStore is flushed periodically to HDFS  Write Ahead Log (stored in HDFS) is used for durability of updates  HBase shell provides interactive commands for manipulating database  Create/delete tables  Insert/update/read from tables  Manage regions   Hbase provides single row atomic operations  CheckAndPut – Similar to test-and-set  CheckAndDelete  All row operations are atomic no matter how many columns are involved. Hbase also provides row level exclusive locks  You can use these locks to implement single row level transactions  HBase stores multiple versions of a column in a row. Each version is identified by a integer timestamp  By default system time is used as version timestamps. However user can specify a logical timestamp for versioning  Each update to a row creates a new version, for the specified column.  A version can be accessed or deleted using its timestamp. HBase allows to obtain list of all the versions. 
Hadoop home: http://hadoop.apache.org/
HBase: http://hbase.apache.org/
API:
  http://hbase.apache.org/apidocs/
  http://hadoop.apache.org/
