Uploaded by Umesh P

AWS-Foundation-DynamoDB Part-1

AWS Foundation
Amazon DynamoDB
Copyright IntelliPaat, All rights reserved
Agenda
1. NoSQL Databases
2. DynamoDB
3. Components
4. API
5. Provisioned Throughput
6. Partitions
NoSQL Databases
• A NoSQL ("not only SQL") database is a non-relational, largely distributed database.
• Provides high scalability and availability, and supports structured, semi-structured and unstructured data.
• Types:
o Graph database: a network database that uses nodes and edges to represent and store data. Examples: Neo4j, Titan.
o Key-value store: schema-less; data is stored as key-value pairs. Examples: Riak, DynamoDB.
o Document-based: data is stored as JSON documents. Examples: MongoDB, CouchDB.
o Column-based (wide-column): data is organized into column families, and each row can have a different set of columns. Examples: Cassandra, HBase.
• Why NoSQL?
DynamoDB Concepts
• Fully managed NoSQL database.
• No database administration required.
• Scaling can be done at the individual table level.
• DynamoDB automatically spreads the data and traffic for tables over a sufficient number of servers to handle throughput and storage requirements.
• Data is stored on high-performance SSDs and replicated across multiple AZs in a Region.
Components
• Tables – Data is stored in tables.
• Items – Each table holds items, comparable to rows.
• Attributes – Each item is made up of attributes, comparable to columns.
Demo & Lab
• AWS Management Console
• AWS CLI (Command Line Interface)
• Python Boto3
JSON
• Introduction to JSON – JavaScript Object Notation.
• Example record:
{
  "EmpID" : 12345,
  "EmpName" : "xyz",
  "Address" : {
    "Building" : "Bldg-1",
    "Street" : "40/1 Blvd",
    "ZipCode" : 654321
  },
  "Skills" : [ "AWS", "Java", "Oracle" ],
  "cars" : [
    { "name" : "Toyota", "models" : [ "Prius", "Camry", "Corolla" ] },
    { "name" : "Honda", "models" : [ "Accord", "Civic" ] },
    { "name" : "Jeep" }
  ]
}
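The nested structure of the employee record can be explored with Python's standard json module. This is a minimal sketch: the record text is the slide's example rewritten with straight quotes and no trailing commas, and the variable names are chosen here for illustration.

```python
import json

# The employee record from the slide, written as valid JSON.
record_text = """
{
  "EmpID": 12345,
  "EmpName": "xyz",
  "Address": {"Building": "Bldg-1", "Street": "40/1 Blvd", "ZipCode": 654321},
  "Skills": ["AWS", "Java", "Oracle"],
  "cars": [
    {"name": "Toyota", "models": ["Prius", "Camry", "Corolla"]},
    {"name": "Honda", "models": ["Accord", "Civic"]},
    {"name": "Jeep"}
  ]
}
"""

record = json.loads(record_text)

# Nested JSON objects become dicts, JSON arrays become lists.
zip_code = record["Address"]["ZipCode"]
first_skill = record["Skills"][0]

# Not every element of "cars" has a "models" key, so use .get() with a default.
models_per_car = {car["name"]: car.get("models", []) for car in record["cars"]}

print(zip_code, first_skill, models_per_car["Jeep"])
```

The "Jeep" entry shows why schema-less data needs defensive access: items in the same collection do not have to carry the same attributes.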
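Attributes
• Primary Key – MUST be specified when creating a table. Uniquely identifies each item.
o Partition Key (Hash Attribute): its hash value determines the partition in which the item is stored.
o Sort Key (Range Attribute): optional; items with the same partition key are stored together, ordered by sort key.
[Figure: items with their key values placed in Partition-1, Partition-2 and Partition-3]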
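How a partition key decides where an item lands can be sketched in a few lines. DynamoDB's actual hash function is internal, so SHA-256 is used here purely as a stand-in, and the item keys and the fixed list of three partitions are hypothetical.

```python
import hashlib

PARTITIONS = ["Partition-1", "Partition-2", "Partition-3"]


def pick_partition(partition_key: str) -> str:
    """Map a partition key value to a partition using a stand-in hash (SHA-256)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(PARTITIONS)
    return PARTITIONS[index]


# Hypothetical items, each given as (partition key, sort key).
items = [("emp-1", "2021"), ("emp-1", "2020"), ("emp-2", "2020"), ("emp-3", "2020")]

placement = {}
for pk, sk in items:
    placement.setdefault(pick_partition(pk), []).append((pk, sk))

# Within a partition, items sharing a partition key are kept ordered by sort key.
for stored in placement.values():
    stored.sort()

for name, stored in sorted(placement.items()):
    print(name, stored)
```

The same partition key always hashes to the same partition, which is why all items for "emp-1" end up together regardless of their sort key.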
Secondary Indexes
• Defined in addition to the primary key, for faster queries on other attributes.
• Types:
o Global Secondary Index (GSI)
o Local Secondary Index (LSI)
o Max 5 local secondary indexes per table; global secondary indexes have a per-table quota (5 originally, 20 by default at present).
• Indexes are updated automatically when the base table is modified.
• When creating an index, the attributes to be copied (projected) into it should be specified. If none are specified, DynamoDB copies at least the key attributes.
• An index can be thought of as a separate table kept alongside the base table.
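A table with one index of each type could be described with a request like the one below. This is a sketch only: the table name, attribute names, index names and throughput values are hypothetical, and the request is built as a plain dict so it can be inspected without an AWS account. The create_table helper uses the Boto3 client call of the same name and is not invoked here.

```python
def build_create_table_request() -> dict:
    """Build CreateTable parameters for a hypothetical Employee table."""
    throughput = {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
    return {
        "TableName": "Employee",
        # Every attribute used in any key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "EmpID", "AttributeType": "N"},
            {"AttributeName": "JoinDate", "AttributeType": "S"},
            {"AttributeName": "EmpName", "AttributeType": "S"},
            {"AttributeName": "Dept", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "EmpID", "KeyType": "HASH"},      # partition key
            {"AttributeName": "JoinDate", "KeyType": "RANGE"},  # sort key
        ],
        # LSI: same partition key as the table, different sort key.
        "LocalSecondaryIndexes": [{
            "IndexName": "ByEmpName",
            "KeySchema": [
                {"AttributeName": "EmpID", "KeyType": "HASH"},
                {"AttributeName": "EmpName", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
        }],
        # GSI: its own partition key and its own provisioned throughput.
        "GlobalSecondaryIndexes": [{
            "IndexName": "ByDept",
            "KeySchema": [{"AttributeName": "Dept", "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "INCLUDE",
                           "NonKeyAttributes": ["JoinDate"]},
            "ProvisionedThroughput": throughput,
        }],
        "ProvisionedThroughput": throughput,
    }


def create_table(request: dict):
    """Send the request with Boto3 (needs AWS credentials; not called here)."""
    import boto3
    return boto3.client("dynamodb").create_table(**request)


request = build_create_table_request()
print(sorted(request))
```

The Projection setting is where the attributes to be copied into the index are named; KEYS_ONLY copies just the key attributes.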
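Consistency Model
• DynamoDB data is replicated across multiple AZs in a Region.
• Eventual Consistency: a read may not reflect the result of a recently completed write.
• Strong Consistency: a read returns the most up-to-date data, reflecting all prior successful writes.
[Figure: Partition-1 with two replicas, Replication-1 and Replication-2]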
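In the API, the choice between the two models is made per read request through the ConsistentRead parameter, which defaults to an eventually consistent read. The sketch below only builds GetItem parameters; the Employee table, keyed by EmpID alone, is an assumption made for illustration.

```python
def build_get_item_request(emp_id: int, strongly_consistent: bool) -> dict:
    """Build GetItem parameters for a hypothetical Employee table keyed by EmpID."""
    return {
        "TableName": "Employee",
        # Low-level API format: numbers are passed as strings under the "N" type.
        "Key": {"EmpID": {"N": str(emp_id)}},
        # True requests a strongly consistent read; False an eventually consistent one.
        "ConsistentRead": strongly_consistent,
    }


eventual = build_get_item_request(12345, strongly_consistent=False)
strong = build_get_item_request(12345, strongly_consistent=True)
print(eventual["ConsistentRead"], strong["ConsistentRead"])
```

With Boto3, either dict could be passed as `client.get_item(**request)` on a DynamoDB client.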
Capacity Units
• While creating a table, throughput capacity is provisioned in terms of READS and WRITES.
• Read Capacity Unit (RCU)
o Strongly consistent read: one read per second, for an item up to 4KB.
o Eventually consistent read: two reads per second, for an item up to 4KB, or one read per second for an item larger than 4KB and up to 8KB.
• Write Capacity Unit (WCU)
o One write per second, for an item up to 1KB.
• Item sizes are rounded up to the next 4KB for reads and the next 1KB for writes.
• Limits (soft) – initial limits on provisioned throughput
o N. Virginia: 40000 RCU and 40000 WCU per table; 80000 RCU and 80000 WCU per account.
o Other Regions: 10000 RCU and 10000 WCU per table; 20000 RCU and 20000 WCU per account.
• Requests that exceed the provisioned throughput are throttled and fail with "ProvisionedThroughputExceededException".
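The rounding rules for capacity units can be written as two small helper functions. The function names are chosen here for illustration; the 4KB and 1KB block sizes come from the definitions above.

```python
import math


def rcu_per_read(item_size_kb: float, strongly_consistent: bool = True) -> float:
    """RCUs consumed by one read per second of an item of the given size."""
    blocks = math.ceil(item_size_kb / 4)  # reads are billed in 4KB blocks
    # An eventually consistent read costs half as much as a strongly consistent one.
    return blocks if strongly_consistent else blocks / 2


def wcu_per_write(item_size_kb: float) -> int:
    """WCUs consumed by one write per second of an item of the given size."""
    return math.ceil(item_size_kb)  # writes are billed in 1KB blocks


print(rcu_per_read(4), rcu_per_read(4, strongly_consistent=False))
print(rcu_per_read(8, strongly_consistent=False), wcu_per_write(1.5))
```

An eventually consistent read of a 4KB item costs half an RCU, which is the same statement as "two reads per second" per RCU.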
Capacity Units - Example
- Application requirement:
  - 9338 strongly consistent reads per second
  - 5000 writes per second
  - Average item size: 10KB
- How many RCU/WCU should be provisioned while creating this table?
- RCU needed to read one item per second: 10KB/4KB = 2.5, rounded up to 3
- RCU needed to read 9338 items per second: 9338 * 3 = 28014 (only possible in the N. Virginia region)
- WCU needed to write one item per second: 10KB/1KB = 10
- WCU needed to write 5000 items per second: 5000 * 10 = 50000. This is not possible in any Region until the limit is raised for the account with the help of AWS Support.
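The arithmetic of this example, including the comparison against the per-table soft limits quoted in the slides, can be checked with a short script. The limit values are taken from the slides as given and the names are hypothetical.

```python
import math

# Per-table soft limits from the slides, as (RCU, WCU).
TABLE_LIMITS = {"N. Virginia": (40000, 40000), "Other": (10000, 10000)}


def required_capacity(reads_per_sec: int, writes_per_sec: int, item_size_kb: float):
    """Total RCU and WCU for strongly consistent reads and standard writes."""
    rcu = reads_per_sec * math.ceil(item_size_kb / 4)
    wcu = writes_per_sec * math.ceil(item_size_kb / 1)
    return rcu, wcu


rcu, wcu = required_capacity(9338, 5000, 10)

# Check reads and writes separately against each Region's per-table limit.
reads_fit = {region: rcu <= limit[0] for region, limit in TABLE_LIMITS.items()}
writes_fit = {region: wcu <= limit[1] for region, limit in TABLE_LIMITS.items()}

print(rcu, wcu)
print(reads_fit)
print(writes_fit)
```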
Partitions
• A partition is an allocation of storage for a table, backed by SSDs.
• Data is distributed by primary key only. With a composite primary key (partition key and sort key), items with the same partition key are stored physically close together, ordered by sort key.
• Initial partitions – a partition can support a maximum of 3000 RCU or 1000 WCU, and a data capacity of 10GB.
Partitions - Example
• Initial Partitions = (RCU/3000) + (WCU/1000), with each term rounded up.
• If the size per partition goes beyond 10GB or the provisioned throughput is increased, DynamoDB doubles the current number of partitions.
• Throughput and data are evenly distributed across partitions.
• Rebalancing is done automatically by DynamoDB in the background, without any application impact.
• Example: RCU = 8000, WCU = 1500
o 8000/3000 = 2.67, rounded up to 3
o 1500/1000 = 1.5, rounded up to 2
o Initial partitions = 3 + 2 = 5
o Each of the 5 partitions gets RCU = 8000/5 = 1600 and WCU = 1500/5 = 300.
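The partition calculation from this slide can be reproduced as follows. This is a sketch of the slide's formula (each term rounded up, then throughput split evenly), not a description of how DynamoDB allocates partitions internally; the function names are made up for illustration.

```python
import math

MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000


def initial_partitions(rcu: int, wcu: int) -> int:
    """Slide formula: round up each term, then add."""
    return (math.ceil(rcu / MAX_RCU_PER_PARTITION)
            + math.ceil(wcu / MAX_WCU_PER_PARTITION))


def per_partition_throughput(rcu: int, wcu: int):
    """Throughput is spread evenly across the initial partitions."""
    count = initial_partitions(rcu, wcu)
    return count, rcu / count, wcu / count


count, rcu_each, wcu_each = per_partition_throughput(8000, 1500)
print(count, rcu_each, wcu_each)
```

Because throughput is divided evenly, a single hot partition key can use at most its partition's share (here 1600 RCU and 300 WCU), not the table's full provisioned capacity.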
DynamoDB API
• DynamoDB operations has to be done by provided APIs
 CreateTable – Create a new table. Can be used to create indexes as well.
 DescribeTable – Returns information about tables.
 ListTables – Returns all tables.
 UpdateTable - Modifies the settings of a table or its indexes, creates or remove new indexes on a
table, or modifies DynamoDB Streams settings for a table.
 DeleteTable – Removes table and its dependent objects.
 PutItem – Writes a single item, primary key must be specified.
 BatchWriteItem - Writes up to 25 items to a table.
 GetItem – Retrieves a single item with Primary Key.
 BatchGetItem – Retrieves up to 100 items from a table.
 Query - Retrieves all of the items that have a specific partition key.
 Scan - Retrieves all of the items in the specified table or index.
 UpdateItem - Modifies one or more attributes in an item when Primary Key is provided.
 DeleteItem – Deletes a single item with a specific Primary Key.
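A few of the item-level operations can be sketched with the Boto3 low-level client. The Employee table keyed by EmpID and the attribute names are assumptions made for illustration. The request dicts are built separately so they can be inspected offline; run_against_aws is not called here because it needs AWS credentials and an existing table.

```python
TABLE = "Employee"  # hypothetical table with EmpID (number) as partition key


def key(emp_id: int) -> dict:
    """Primary key in the low-level format: numbers are strings under "N"."""
    return {"EmpID": {"N": str(emp_id)}}


def put_item_request(emp_id: int, name: str) -> dict:
    return {"TableName": TABLE, "Item": {**key(emp_id), "EmpName": {"S": name}}}


def query_request(emp_id: int) -> dict:
    return {
        "TableName": TABLE,
        "KeyConditionExpression": "EmpID = :id",
        "ExpressionAttributeValues": {":id": {"N": str(emp_id)}},
    }


def update_item_request(emp_id: int, new_name: str) -> dict:
    return {
        "TableName": TABLE,
        "Key": key(emp_id),
        "UpdateExpression": "SET EmpName = :n",
        "ExpressionAttributeValues": {":n": {"S": new_name}},
    }


def delete_item_request(emp_id: int) -> dict:
    return {"TableName": TABLE, "Key": key(emp_id)}


def run_against_aws() -> None:
    """Not called here: needs boto3, AWS credentials and an existing table."""
    import boto3
    client = boto3.client("dynamodb")
    client.put_item(**put_item_request(12345, "xyz"))
    print(client.query(**query_request(12345))["Items"])
    client.update_item(**update_item_request(12345, "abc"))
    client.delete_item(**delete_item_request(12345))


print(put_item_request(12345, "xyz"))
```

PutItem, UpdateItem and DeleteItem all identify the item by its full primary key, while Query needs only the partition key.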