finalpres

NoSQL Databases
Oracle - Berkeley DB
Rasanjalee DM
Smriti J
CSC 8711
Instructor: Dr. Raj Sunderraman
Content
A brief intro to NoSQL
About Berkeley DB
About our application
???
What is NoSQL?
• Stands for Not Only SQL
• Class of non-relational data storage systems
• Usually do not require a fixed table schema and do not use
relational concepts such as joins, GROUP BY, or ORDER BY.
• Most NoSQL offerings relax one or more of the ACID
properties.
What is NoSQL?
• Next generation databases
• Characteristic:
– Large Data Volumes
– Non-relational
– Distributed
– Open-source
– Scalable replication and distribution
CAP Theorem
History of NoSQL
• The term NoSQL was introduced by Carlo Strozzi in 1998
to name his file-based database.
• It was reintroduced in 2009 by Eric Evans, when an event
was organized to discuss open-source distributed
databases.
Why NoSQL Databases?
• Bigness
• Massive write performance
• Fast key-value access
• Flexible schema and flexible data types
• No single point of failure
• Programming ease of use
Scaling to size vs complexity.
Berkeley DB - Introduction
• An open-source, embedded transactional data
management system.
• A key/value store.
• Runs on everything from cell phones to large servers.
• Distributed as a library that can be linked directly into an
application.
• Berkeley DB has high reliability and high performance.
Berkeley DB Product Family
Architecture
Berkeley DB: The Design Philosophy
• Provide mechanisms without specifying policies.
• For example, Berkeley DB is abstracted as a store of
<key, value> pairs.
– Both keys and values are opaque byte-strings.
– Berkeley DB has no schema.
– The application that embeds Berkeley DB is responsible
for imposing its own schema on the data.
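A minimal sketch of what "opaque byte strings" means in practice, using only the Java standard library. The `HashMap` here is merely a stand-in for a key/value store, and the record layout (4-byte id, 4-byte name length, UTF-8 name) and all class and method names are hypothetical choices made for illustration; none of this is Berkeley DB's API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the store sees only bytes; the application owns the schema.
class OpaqueStoreSketch {
    // Stand-in for a key/value store (NOT Berkeley DB): keys and values are raw bytes.
    static final Map<ByteBuffer, byte[]> store = new HashMap<>();

    // Application-defined layout: 4-byte id, 4-byte name length, UTF-8 name bytes.
    static byte[] encodeUser(int id, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(8 + nameBytes.length)
                .putInt(id)
                .putInt(nameBytes.length)
                .put(nameBytes)
                .array();
    }

    static int decodeUserId(byte[] value) {
        return ByteBuffer.wrap(value).getInt();
    }

    static String decodeUserName(byte[] value) {
        ByteBuffer buf = ByteBuffer.wrap(value);
        buf.getInt();                              // skip the id field
        byte[] nameBytes = new byte[buf.getInt()]; // read the length prefix
        buf.get(nameBytes);
        return new String(nameBytes, StandardCharsets.UTF_8);
    }

    // ByteBuffer compares by content, so it can serve as a byte-string map key.
    static ByteBuffer key(String k) {
        return ByteBuffer.wrap(k.getBytes(StandardCharsets.UTF_8));
    }

    static void put(String k, byte[] v) { store.put(key(k), v); }

    static byte[] get(String k) { return store.get(key(k)); }

    public static void main(String[] args) {
        put("user:7", encodeUser(7, "Ada"));
        byte[] raw = get("user:7");
        System.out.println(decodeUserId(raw) + " " + decodeUserName(raw));
    }
}
```

The store never interprets the bytes; only `encodeUser` and the two decode methods know the layout, which is the sense in which the application imposes its own schema.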
Data Access Services
• Indexing methods
– B-Tree
– Hash
– Queue
– Recno (a record-number-based index)
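As a rough analogy only, the practical difference between a tree-ordered index and a hash index can be seen with the Java standard collections: `TreeMap` keeps keys sorted and supports range scans, while `HashMap` offers point lookups with no key order. `TreeMap` is a red-black tree, not a B-tree, and neither class is part of Berkeley DB; the class and method names below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Analogy only: ordered (tree) index versus unordered (hash) index.
class AccessMethodAnalogy {
    // Range scan: all entries with from <= key < to, in key order.
    static SortedMap<String, String> range(TreeMap<String, String> ordered,
                                           String from, String to) {
        return ordered.subMap(from, to);
    }

    public static void main(String[] args) {
        TreeMap<String, String> ordered = new TreeMap<>();
        Map<String, String> hashed = new HashMap<>();
        for (String k : new String[] {"b1", "a2", "a1"}) {
            ordered.put(k, "value-" + k);
            hashed.put(k, "value-" + k);
        }
        // An ordered index answers "all keys starting with a" as one range scan.
        System.out.println(range(ordered, "a", "b").keySet());
        // A hash index answers exact-key lookups but cannot scan a key range in order.
        System.out.println(hashed.get("a2"));
    }
}
```

An ordered index tends to suit workloads with range queries or key locality, while a hash index tends to suit pure exact-key lookups.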
Advantages of <key, value> pairs
• An application is free to store data in whatever form is most
natural to it.
– Objects (like structures in C language)
– Rows in Oracle, SQL Server
– Columns in C-Store
• Different data formats can be stored in the same database.
Data Management Services
Concurrency
Transactions
Recovery
Berkeley DB Applications
• Lightweight Directory Access Protocol (LDAP) servers
• Mail Servers
• Manage access control lists
• Store user keys in a public-key infrastructure
• Record machine-to-network address mappings in
address servers
Berkeley DB for Computationally
Intensive Algorithms
• Algorithms that repeatedly execute a computationally
intensive operation
– E.g. Factorial
• Useful to create a cache containing the already computed
results
– Cache = Set of <key,value> pairs containing <n, factorial(n)>
• Advantages:
– Avoids recomputing results for the same input (even across
different executions)
– After a process crash, we can restart the process and
quickly return to the point where it stopped
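The caching idea above can be sketched as follows. This is a minimal illustration rather than the application's actual Factorial.java: the cache is a plain in-memory `HashMap` standing in for a persistent store, and the class name and the `multiplications` counter are hypothetical, added only to show how much work the cache saves.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a <n, factorial(n)> cache.
class FactorialCacheSketch {
    // Cache of already computed results; a persistent store could take its place.
    static final Map<Integer, BigInteger> cache = new HashMap<>();
    // Counts multiplications actually performed, to make cache hits visible.
    static int multiplications = 0;

    static BigInteger factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        // Find the largest k <= n whose factorial is already cached.
        int k = n;
        while (k > 0 && !cache.containsKey(k)) {
            k--;
        }
        BigInteger result = (k == 0) ? BigInteger.ONE : cache.get(k);
        // Resume from k + 1 and cache every intermediate result.
        for (int i = k + 1; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
            multiplications++;
            cache.put(i, result);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5));  // computed from scratch
        System.out.println(factorial(7));  // resumes from the cached 5!
        System.out.println("multiplications: " + multiplications);
    }
}
```

Because every intermediate result is stored, a later call for a larger `n` resumes from the largest cached value. With a persistent store in place of the `HashMap`, the same logic would let a restarted process pick up where it stopped.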
• In-memory map
– Simple
– Very efficient (because it is held completely in memory)
– Needs a considerable amount of memory
– No fault tolerance (data must be saved to a file manually)
• Relational databases
– ACID properties may not be necessary
– Harder to scale to big data
– Slower for simple key-value access
• NoSQL databases (Berkeley DB)
– Fast key-value access
– Flexible schema and flexible data types
– Ease of use
– Fault tolerance
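To illustrate the "save manually" burden of the in-memory map option, here is a minimal sketch that snapshots a `HashMap` to a file with standard Java serialization and reloads it. The class name, method names, and file name prefix are hypothetical; this is not how the application's code is known to work.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

// Hypothetical sketch: manual persistence for an in-memory cache.
class MapSnapshotSketch {
    static Path tempFile() {
        try {
            return Files.createTempFile("factorial-cache", ".ser");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the whole map in one go; anything added after this call is lost on a crash.
    static void save(HashMap<Integer, BigInteger> cache, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(cache);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @SuppressWarnings("unchecked")
    static HashMap<Integer, BigInteger> load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashMap<Integer, BigInteger>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<Integer, BigInteger> cache = new HashMap<>();
        cache.put(5, BigInteger.valueOf(120));
        Path file = tempFile();
        save(cache, file);
        System.out.println(load(file));
    }
}
```

The application has to decide when to call `save`, and a crash between snapshots loses every entry added since the last one. A transactional store with recovery takes that responsibility away from the application.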
Berkeleydb.java
• Open Environment:
– The EnvironmentConfig class specifies environment configuration parameters
• Open Class Catalog:
– Class catalog: a specialized database that stores the Java
class descriptions of all serialized objects stored in the
database
– Create the Database and StoredClassCatalog objects
• Open Database
• Close Environment, Class Catalog and Databases
DBViews.java
Factorial.java
Factorial (Berkeley DB) – Memory Usage
Factorial (MySQL) – Memory Usage
Factorial (HashMap) – Memory Usage
References
• http://www.slideshare.net/thobe/nosql-for-dummies
• http://www.slideshare.net/gregburd/oracle-berkeley-db-java-editionsimple-java-object-persistence
• Berkeley DB, Michael A. Olson, Keith Bostic, and Margo Seltzer,
USENIX Technical Conference
• http://highscalability.com/blog/2010/12/6/what-the-heck-are-youactually-using-nosql-for.html