NoSQL Databases
Oracle - Berkeley DB
Rasanjalee DM, Smriti J
CSC 8711
Instructor: Dr. Raj Sunderraman

Content
• A brief intro to NoSQL
• About Berkeley DB
• About our application

What is NoSQL?
• Stands for Not Only SQL
• Class of non-relational data storage systems
• Usually do not require a fixed table schema, nor do they use the concepts of joins, group by, order by and so on
• Most NoSQL offerings relax one or more of the ACID properties

What is NoSQL? (continued)
• Next-generation databases
• Characteristics:
  – Large data volumes
  – Non-relational
  – Distributed
  – Open-source
  – Scalable replication and distribution

CAP Theorem

History of NoSQL
• The term NoSQL was introduced by Carlo Strozzi in 1998 to name his file-based database.
• It was reintroduced in 2009 by Eric Evans, when an event was organized to discuss open-source distributed databases.

Why NoSQL Databases?
• Bigness
• Massive write performance
• Fast key-value access
• Flexible schema and flexible data types
• No single point of failure
• Programming ease of use

Scaling to size vs. complexity

Berkeley DB - Introduction
• An open-source, embedded transactional data management system
• A key/value store
• Runs on everything from cell phones to large servers
• Distributed as a library that can be linked directly into an application
• High reliability and high performance

Berkeley DB Product Family Architecture

Berkeley DB: The Design Philosophy
• Provide mechanisms without specifying policies.
• For example, Berkeley DB is abstracted as a store of <key, value> pairs.
  – Both keys and values are opaque byte strings.
  – Berkeley DB has no schema.
  – The application that embeds Berkeley DB is responsible for imposing its own schema on the data.

Data Access Services
• Indexing methods
  – B-Tree
  – Hash
  – Queue
  – Recno (a record-number-based index)

Advantages of <key, value> pairs
• An application is free to store data in whatever form is most natural to it.
  – Objects (like structures in C)
  – Rows, as in Oracle or SQL Server
  – Columns, as in C-Store
• Different data formats can be stored in the same database.

Data Management Services
• Concurrency
• Transactions
• Recovery

Berkeley DB Applications
• Lightweight Directory Access Protocol (LDAP) servers
• Mail servers
• Managing access control lists
• Storing user keys in a public-key infrastructure
• Recording machine-to-network address mappings in address servers

Berkeley DB for Computationally Intensive Algorithms
• Algorithms that repeatedly execute a computationally intensive operation
  – E.g. factorial
• It is useful to create a cache containing the already computed results
  – Cache = set of <key, value> pairs of the form <n, factorial(n)>
• Advantages:
  – Avoids recomputing results for the same input (even across different executions)
  – After a process crash, the process can be restarted and quickly return to the point where it stopped

• In-memory map
  – Simple
  – Very efficient (held completely in memory)
  – Needs a considerable amount of memory
  – No fault tolerance (data must be saved to a file manually)
• Relational databases
  – ACID properties may not be necessary
  – Cannot handle big data
  – Slow
• NoSQL databases (Berkeley DB)
  – Fast key-value access
  – Flexible schema and flexible data types
  – Ease of use
  – Fault tolerance

Berkeleydb.java
• Open Environment:
  – The EnvironmentConfig class specifies the environment configuration parameters
• Open Class Catalog:
  – Class catalog: a specialized database store that contains the Java class descriptions of all serialized objects stored in the database
  – Create the Database and StoredClassCatalog objects
• Open Database:
• Close Environment, Class Catalog and Databases:

DBViews.java

Factorial.java

Factorial (Berkeley DB) – Memory Usage

Factorial (MySQL) – Memory Usage

Factorial (HashMap) – Memory Usage

References
• http://www.slideshare.net/thobe/nosql-for-dummies
• http://www.slideshare.net/gregburd/oracle-berkeley-db-java-editionsimple-java-object-persistence
• Michael A. Olson, Keith Bostic, and Margo Seltzer, "Berkeley DB", USENIX Technical Conference
• http://highscalability.com/blog/2010/12/6/what-the-heck-are-youactually-using-nosql-for.html
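
Appendix: factorial cache sketch

The cache of <n, factorial(n)> pairs described in the slides can be illustrated with a minimal Java sketch. This is not the authors' Factorial.java, which is not reproduced here. The class name FactorialCache, its methods, and the multiplication counter are hypothetical names chosen for illustration. A plain java.util.Map stands in for the key/value store; in the slides' application the map would presumably be backed by a Berkeley DB database, so that cached entries survive a process restart.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a <n, factorial(n)> cache.
// The Map is a stand-in for a persistent key/value store.
public class FactorialCache {
    private final Map<Integer, BigInteger> cache;
    private int multiplications = 0; // counts work actually performed

    public FactorialCache(Map<Integer, BigInteger> cache) {
        this.cache = cache;
    }

    public BigInteger factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        // Find the largest k <= n whose factorial is already cached.
        int k = n;
        while (k > 1 && !cache.containsKey(k)) {
            k--;
        }
        BigInteger result = (k <= 1) ? BigInteger.ONE : cache.get(k);
        // Compute only the missing values, storing each one in the cache.
        for (int i = Math.max(k, 1) + 1; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
            cache.put(i, result);
            multiplications++;
        }
        return result;
    }

    public int getMultiplications() {
        return multiplications;
    }

    public static void main(String[] args) {
        FactorialCache fc = new FactorialCache(new HashMap<>());
        System.out.println(fc.factorial(5));          // first call computes 2! to 5!
        System.out.println(fc.factorial(6));          // reuses cached 5!, one more multiplication
        System.out.println(fc.getMultiplications());  // total multiplications so far
    }
}
```

Because the cache is passed in as a Map, the same logic could run against an in-memory HashMap or against a map view over a persistent store, which is the comparison the memory-usage slides make.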