AWS Solutions Architect Bible

AWS Solutions Architect – Exam Notes
AWS Solutions Architect – Exam Notes ...................................................................................................................................... 1
Chapter 3 – IAM .................................................................................................................................................................... 5
AWS STS........................................................................................................................................................................... 5
Chapter 4 – S3 ...................................................................................................................................................................... 6
Amazon S3 Event Notifications ........................................................................................................................................ 6
Chapter 5 - EC2 Section: ....................................................................................................................................................... 9
AWS Outposts.................................................................................................................................................................. 9
Chapter 6 – EBS .................................................................................................................................................................. 10
Volume Types: ............................................................................................................................................................... 10
EFS ................................................................................................................................................................................. 10
FSx vs EFS vs FSx Lustre ................................................................................................................................. 10
Storage Options (Big Topic!) .......................................................................................................................................... 11
AWS Backup .................................................................................................................................................................. 11
Chapter 7 - Database .......................................................................................................................................................... 13
RDS ................................................................................................................................................................................ 13
Redshift ......................................................................................................................................................................... 13
Aurora ........................................................................................................................................................................... 13
Aurora Serverless .......................................................................................................................................................... 13
DynamoDB ..................................................................................................................................... 13
Chapter 8 - Networking / virtual private cloud ................................................................................................................... 19
VPC ................................................................................................................................................................................ 19
NAT Gateway ................................................................................................................................................................. 19
Access Control Lists ....................................................................................................................................................... 19
NACL .............................................................................................................................................................................. 19
Security Groups ............................................................................................................................................................. 19
ENI – Elastic Network Interface ..................................................................................................................................... 20
Direct Connect ............................................................................................................................................................... 21
VPN vs Direct Connect ................................................................................................................................................... 21
VPC Endpoints ............................................................................................................................................................... 21
VPC peering ................................................................................................................................................................... 21
AWS Private Link ............................................................................................................................................................ 21
AWS Transit Gateway..................................................................................................................................................... 22
VPN Hub ........................................................................................................................................................................ 22
AWS Wavelength ........................................................................................................................................................... 22
Answer: ......................................................................................................................................................................... 22
Network ACLs are stateless, and security groups are stateful. ...................................................................................... 22
Chapter 9 – Route 53 .......................................................................................................................................................... 25
Chapter 10 – Elastic Load Balancer ..................................................................................................................................... 26
Application Load Balancer ............................................................................................................................................. 26
Chapter 11 - Cloudwatch .................................................................................................................................................... 27
CloudWatch Logs ........................................................................................................................................................... 27
Amazon Managed Grafana ............................................................................................................................................ 28
Amazon Managed service for Prometheus ................................................................................................................... 28
Chapter 12 – Scaling ........................................................................................................................................................... 29
How to Scale? ................................................................................................................................................................ 29
Scale Relational DBs ...................................................................................................................................................... 31
Scale Non Relational DBs (DynamoDB) ......................................................................................................................... 32
Chapter 13 – Decoupling Workflows .................................................................................................................................. 35
2, SNS: ........................................................................................................................................................................... 36
3, API Gateway .............................................................................................................................................................. 36
AWS Batch ..................................................................................................................................................................... 38
Fargate ........................................................................................................................................................................... 38
Amazon MQ................................................................................................................................................................... 41
Aws Step Function ......................................................................................................................................................... 42
Amazon Appflow ........................................................................................................................................................... 44
Chapter 14 Big Data ............................................................................................................................................................ 46
Amazon redshift ............................................................................................................................................................ 46
AWS Elastic Map Reduce ............................................................................................................................................... 46
AWS Kinesis ................................................................................................................................................................... 46
Kinesis Data Analytics .................................................................................................................................................... 47
AWS Athena ................................................................................................................................................................... 47
AWS Glue ....................................................................................................................................................................... 47
AWS Quicksight ............................................................................................................................................................. 50
Aws data pipeline .......................................................................................................................................................... 53
Amazon MSK ................................................................................................................................................................. 54
Amazon opensearch (Formerly elasticsearch) ............................................................................................................... 55
Chapter 15 - Serverless Architecture .................................................................................................................................. 57
Lambda .......................................................................................................................................................................... 57
AWS Serverless Application Repository ......................................................................................................................... 57
Container ....................................................................................................................................................................... 57
ECS/ EKS – running containers ....................................................................................................................................... 58
AWS Fargate .................................................................................................................................................................. 58
Amazon Eventbridge (AKA Cloudwatch Events) ............................................................................................................ 59
Amazon ECR .................................................................................................................................................................. 60
ECS Anywhere ............................................................................................................................................................... 60
Aurora Serverless .......................................................................................................................................................... 61
Amazon X-Ray ................................................................................................................................................................ 63
GraphQL - ...................................................................................................................................................................... 63
Chapter 16 - Security .......................................................................................................................................................... 65
DDOS ............................................................................................................................................................................. 65
CloudTrail – Logging API calls ........................................................................................................................................ 65
AWS Shield .................................................................................................................................................................... 65
AWS WAF (Web App Firewall) ....................................................................................................................................... 66
Centralising WAF Mgmt via AWS firewall manager ....................................................................................................... 66
AWS GuardDuty .............................................................................................................................. 67
Macie............................................................................................................................................................................. 67
AWS inspector ............................................................................................................................................................... 67
Key Management Service KMS ...................................................................................................................................... 69
HSM – ............................................................................................................................................................................ 69
AWS Secrets Manager ................................................................................................................................................... 71
Parameter store ............................................................................................................................................................. 72
IAM & KMS .................................................................................................................................................................... 72
Presigned URLs .............................................................................................................................................................. 72
IAM Policies ................................................................................................................................................................... 73
AWS Certificate Manager .............................................................................................................................................. 76
Audit Manager .............................................................................................................................................................. 76
AWS Artifact .................................................................................................................................................................. 77
Amazon Cognito (Comes up a lot) ................................................................................................................................. 77
Amazon Detective ......................................................................................................................................................... 79
AWS Network Firewall ................................................................................................................................................... 79
AWS Security Hub .......................................................................................................................................................... 80
AWS Network Firewall vs AWS Firewall Manager (easy slip up) .................................................................................... 80
Chapter 17 – Automation ................................................................................................................................................... 81
CloudFormation............................................................................................................................................................. 81
Elastic beanstalk ............................................................................................................................................................ 82
Systems Manager .......................................................................................................................................................... 83
Chapter 18 - Caching .......................................................................................................................................................... 84
Cloudfront ..................................................................................................................................................................... 84
Elasticache ..................................................................................................................................................................... 86
Managed version of open source Memcached and Redis ............................................................................. 86
Memcached ................................................................................................................................................................... 86
Redis .............................................................................................................................................................................. 86
Dynamo DB Accelerator (DAX) ...................................................................................................................................... 86
DAX vs Elasticache ......................................................................................................................................................... 86
Global Accelerator ......................................................................................................................................................... 87
Fixes IP caching .............................................................................................................................................................. 88
Traffic Routing – R53, Global Accelerator or Cloudfront ................................................................................................ 88
Chapter 19 Managing Accounts and Organisations ............................................................................. 90
AWS Organisations ........................................................................................................................................................ 90
AWS Resource Access Manager RAM – Sharing Resources ........................................................................................... 90
Cross Account Role Access ............................................................................................................................................ 91
AWS Config .................................................................................................................................................................... 91
AWS Directory Service ................................................................................................................................................... 92
Cost Explorer ................................................................................................................................................................. 92
AWS Budgets ................................................................................................................................................................. 92
AWS Cost and Usage Reports (AWS CUR) ...................................................................................................................... 92
AWS Compute Optimiser ............................................................................................................................................... 92
AWS Trusted Advisor ..................................................................................................................................................... 93
AWS Control Tower........................................................................................................................................................ 93
AWS Licence Manager ................................................................................................................................................... 93
AWS Health ................................................................................................................................................................... 93
AWS Service Catalog ...................................................................................................................................................... 94
AWS Proton ................................................................................................................................................................... 94
Well architected tool ..................................................................................................................................................... 94
Chapter 20 – Migration ....................................................................................................................................................... 95
AWS Snow ..................................................................................................................................................................... 95
Storage Gateway ........................................................................................................................................................... 97
File Gateway .................................................................................................................................................................. 97
Volume Gateway ........................................................................................................................................................... 97
Tape Gateway ................................................................................................................................................................ 97
SMB Summary of differences: NFS vs. SMB ................................................................................................................. 101
NFS .............................................................................................................................................................................. 102
SMB ............................................................................................................................................................................. 102
Datasync ...................................................................................................................................................................... 102
Migration Hub ............................................................................................................................................................. 104
Server migration service .............................................................................................................................................. 105
Database migration service ......................................................................................................................................... 105
AWS Application Discovery service.............................................................................................................................. 105
AWS Application Migration Service ............................................................................................................................. 105
Chapter 21 Front-End Web and Mobile ............................................................................................................................ 107
AWS Amplify ................................................................................................................................................................ 107
AWS Device Farm ........................................................................................................................................................ 107
Amazon Pinpoint ......................................................................................................................................................... 107
Chapter 22 Machine Learning Front-End Web and Mobile.............................................................................................. 108
Misc Qs ............................................................................................................................................................................. 109
Areas not best understood ............................................................................................................................................... 120
AWS Well-Architected Framework ................................................................................................................................... 122
Framework Overview .................................................................................................................................................. 122
6 Pillars ........................................................................................................................................................................ 122
AWS Shared Responsibility Model .................................................................................................................................... 123
Online tips ........................................................................................................................................................................ 124
Chapter 3 – IAM
IAM
Control IAM perms with policy documents.
IAM Federation to link IAM with Microsoft AD.
Groups
For permanent creds - Put users in a group and assign policy to group.
Roles
Roles are temporary
Can allow cross account access.
Can attach it to a resource.
AWS STS
AWS provides AWS Security Token Service (AWS STS) as a web service that enables you to request temporary, limited-privilege credentials for users.
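A minimal sketch of requesting temporary credentials with STS via boto3 (the role ARN, session name and follow-up S3 call are illustrative assumptions, not from these notes):

```python
import boto3

# Assume a role (e.g. in another account) and receive temporary, limited-privilege credentials.
# The role ARN and session name below are placeholders for illustration only.
sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/CrossAccountReadOnly",
    RoleSessionName="exam-notes-demo",
    DurationSeconds=3600,  # credentials expire after one hour
)

creds = resp["Credentials"]
# Use the temporary credentials for a scoped-down client in the target account.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_buckets()["Buckets"])
```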
Question:
A new employee has joined a company as a deployment engineer. The deployment engineer will be using AWS
CloudFormation templates to create multiple AWS resources. A solutions architect wants the deployment engineer to
perform job activities while following the principle of least privilege.
Which combination of actions should the solutions architect take to accomplish this goal? (Choose two.)
A. Have the deployment engineer use AWS account root user credentials for performing AWS CloudFormation stack
operations.
B. Create a new IAM user for the deployment engineer and add the IAM user to a group that has the PowerUsers IAM
policy attached.
C. Create a new IAM user for the deployment engineer and add the IAM user to a group that has the AdministratorAccess
IAM policy attached.
D. Create a new IAM user for the deployment engineer and add the IAM user to a group that has an IAM policy that allows
AWS CloudFormation actions only.
E. Create an IAM role for the deployment engineer to explicitly define the permissions specific to the AWS CloudFormation
stack and launch stacks using that IAM role.
Answer:
The two actions that should be taken to follow the principle of least privilege are:
D) Create a new IAM user for the deployment engineer and add the IAM user to a group that has an IAM policy that allows
AWS CloudFormation actions only.
E) Create an IAM role for the deployment engineer to explicitly define the permissions specific to the AWS CloudFormation
stack and launch stacks using that IAM role.
The principle of least privilege states that users should only be given the minimal permissions necessary to perform their
job function.
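As a rough illustration of option D, a policy that allows CloudFormation actions only could be created and attached with boto3. The policy name, group name and exact action scope are assumptions for the sketch; in practice the engineer (or a stack service role) would also need permissions for the resources the stacks create.

```python
import json
import boto3

iam = boto3.client("iam")

# Policy allowing CloudFormation actions only (least privilege for stack operations).
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "cloudformation:*", "Resource": "*"}
    ],
}

policy = iam.create_policy(
    PolicyName="CloudFormationOnly",  # hypothetical name
    PolicyDocument=json.dumps(policy_document),
)

# Attach the policy to the deployment engineers' group (group assumed to exist).
iam.attach_group_policy(
    GroupName="deployment-engineers",  # hypothetical group
    PolicyArn=policy["Policy"]["Arn"],
)
```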
Chapter 4 – S3
Amazon S3 Event Notifications
You can use the Amazon S3 Event Notifications feature to receive notifications when certain events happen in your S3
bucket. To enable notifications, add a notification configuration that identifies the events that you want Amazon S3 to
publish. Make sure that it also identifies the destinations where you want Amazon S3 to send the notifications. You store
this configuration in the notification subresource that's associated with a bucket. For more information, see Bucket
configuration options. Amazon S3 provides an API for you to manage this subresource.
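A sketch of adding a notification configuration with boto3; the bucket name, queue ARN and suffix filter are assumptions (the SQS queue's access policy must also allow S3 to send messages):

```python
import boto3

s3 = boto3.client("s3")

# Publish an event to an SQS queue whenever a matching object is created in the bucket.
s3.put_bucket_notification_configuration(
    Bucket="example-bucket",  # placeholder bucket
    NotificationConfiguration={
        "QueueConfigurations": [
            {
                "QueueArn": "arn:aws:sqs:us-east-1:123456789012:uploads-queue",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".csv"}]}
                },
            }
        ]
    },
)
```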
S3 Transfer Acceleration
Amazon S3 Transfer Acceleration is a bucket-level feature that enables fast, easy, and secure transfers of files over long
distances between your client and an S3 bucket. Transfer Acceleration is designed to optimize transfer speeds from across
the world into S3 buckets. Transfer Acceleration takes advantage of the globally distributed edge locations in Amazon
CloudFront.
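A sketch (bucket and file names assumed) of enabling Transfer Acceleration and then uploading through the accelerated endpoint:

```python
import boto3
from botocore.config import Config

# Enable Transfer Acceleration on the bucket (a one-time, bucket-level setting).
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket="example-bucket",  # placeholder bucket
    AccelerateConfiguration={"Status": "Enabled"},
)

# Upload via the accelerated endpoint, which routes through CloudFront edge locations.
accel_s3 = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
accel_s3.upload_file("report.pdf", "example-bucket", "reports/report.pdf")
```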
Question
An image-hosting company stores its objects in Amazon S3 buckets. The company wants to avoid accidental exposure of the
objects in the S3 buckets to the public. All S3 objects in the entire AWS account need to remain private.
Which solution will meet these requirements?
A. Use Amazon GuardDuty to monitor S3 bucket policies. Create an automatic remediation action rule that uses an AWS
Lambda function to remediate any change that makes the objects public.
B. Use AWS Trusted Advisor to find publicly accessible S3 buckets. Configure email notifications in Trusted Advisor when a
change is detected. Manually change the S3 bucket policy if it allows public access.
C. Use AWS Resource Access Manager to find publicly accessible S3 buckets. Use Amazon Simple Notification Service
(Amazon SNS) to invoke an AWS Lambda function when a change is detected. Deploy a Lambda function that
programmatically remediates the change.
D. Use the S3 Block Public Access feature on the account level. Use AWS Organizations to create a service control policy
(SCP) that prevents IAM users from changing the setting. Apply the SCP to the account.
Solution
Answer D is the correct solution that meets the requirements. The S3 Block Public Access feature allows you to restrict
public access to S3 buckets and objects within the account. You can enable this feature at the account level to prevent any
S3 bucket from being made public, regardless of the bucket policy settings. AWS Organizations can be used to apply a
Service Control Policy (SCP) to the account to prevent IAM users from changing this setting, ensuring that all S3 objects
remain private. This is a straightforward and effective solution that requires minimal operational overhead.
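A sketch of the account-level part of option D using boto3 (the account ID is a placeholder); the SCP that prevents IAM users from changing the setting would be applied separately through AWS Organizations:

```python
import boto3

s3control = boto3.client("s3control")

# Turn on all four Block Public Access settings for the whole account,
# overriding any bucket policies or ACLs that would otherwise allow public access.
s3control.put_public_access_block(
    AccountId="123456789012",  # placeholder account ID
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```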
Question
A company’s website provides users with downloadable historical performance reports. The website needs a solution that
will scale to meet the company’s website demands globally. The solution should be cost-effective, limit the provisioning of
infrastructure resources, and provide the fastest possible response time.
Which combination should a solutions architect recommend to meet these requirements?
A. Amazon CloudFront and Amazon S3
B. AWS Lambda and Amazon DynamoDB
C. Application Load Balancer with Amazon EC2 Auto Scaling
D. Amazon Route 53 with internal Application Load Balancers
Solution
Global, cost-effective, serverless, low latency = CloudFront with S3
Static content = S3
Question
A company has an application that collects data from IoT sensors on automobiles. The data is streamed and stored in
Amazon S3 through Amazon Kinesis Data Firehose. The data produces trillions of S3 objects each year. Each morning, the
company uses the data from the previous 30 days to retrain a suite of machine learning (ML) models.
Four times each year, the company uses the data from the previous 12 months to perform analysis and train other ML
models. The data must be available with minimal delay for up to 1 year. After 1 year, the data must be retained for archival
purposes.
Which storage solution meets these requirements MOST cost-effectively?
A. Use the S3 Intelligent-Tiering storage class. Create an S3 Lifecycle policy to transition objects to S3 Glacier Deep Archive
after 1 year.
B. Use the S3 Intelligent-Tiering storage class. Configure S3 Intelligent-Tiering to automatically move objects to S3 Glacier
Deep Archive after 1 year.
C. Use the S3 Standard-Infrequent Access (S3 Standard-IA) storage class. Create an S3 Lifecycle policy to transition objects
to S3 Glacier Deep Archive after 1 year.
D. Use the S3 Standard storage class. Create an S3 Lifecycle policy to transition objects to S3 Standard-Infrequent Access (S3
Standard-IA) after 30 days, and then to S3 Glacier Deep Archive after 1 year.
Solution
The access pattern is given, therefore D is the most logical answer.
Intelligent-Tiering is for random, unpredictable access patterns.
Question
An ecommerce company hosts its analytics application in the AWS Cloud. The application generates about 300 MB of data
each month. The data is stored in JSON format. The company is evaluating a disaster recovery solution to back up the data.
The data must be accessible in milliseconds if it is needed, and the data must be kept for 30 days.
Which solution meets these requirements MOST cost-effectively?
A. Amazon OpenSearch Service (Amazon Elasticsearch Service)
B. Amazon S3 Glacier
C. Amazon S3 Standard
D. Amazon RDS for PostgreSQL
Solution - C
S3 Standard provides high durability and availability for storage
It allows millisecond access to retrieve objects
Objects can be stored for any duration, meeting the 30 day retention need
Storage costs are low, around $0.023 per GB/month
OpenSearch and RDS require running and managing a cluster for DR storage
Glacier has lower cost but retrieval time is too high at 3-5 hours
S3 Standard's simplicity, high speed access, and low cost make it optimal for this small DR dataset that needs to be accessed
quickly
Question
A company is designing an application where users upload small files into Amazon S3. After a user uploads a file, the file
requires one-time simple processing to transform the data and save the data in JSON format for later analysis.
Each file must be processed as quickly as possible after it is uploaded. Demand will vary. On some days, users will upload a
high number of files. On other days, users will upload a few files or no files.
Which solution meets these requirements with the LEAST operational overhead?
A. Configure Amazon EMR to read text files from Amazon S3. Run processing scripts to transform the data. Store the
resulting JSON file in an Amazon Aurora DB cluster.
B. Configure Amazon S3 to send an event notification to an Amazon Simple Queue Service (Amazon SQS) queue. Use
Amazon EC2 instances to read from the queue and process the data. Store the resulting JSON file in Amazon DynamoDB.
C. Configure Amazon S3 to send an event notification to an Amazon Simple Queue Service (Amazon SQS) queue. Use an
AWS Lambda function to read from the queue and process the data. Store the resulting JSON file in Amazon DynamoDB.
D. Configure Amazon EventBridge (Amazon CloudWatch Events) to send an event to Amazon Kinesis Data Streams when a
new file is uploaded. Use an AWS Lambda function to consume the event from the stream and process the data. Store the
resulting JSON file in an Amazon Aurora DB cluster.
Solution
A. Configuring EMR and an Aurora DB cluster for this use case would introduce unnecessary complexity and operational
overhead. EMR is typically used for processing large datasets and running big data frameworks like Apache Spark or
Hadoop.
B. While using S3 event notifications and SQS for decoupling is a good approach, using EC2 to process the data would
introduce operational overhead in terms of managing and scaling the EC2.
D. Using EventBridge and Kinesis Data Streams for this use case would introduce additional complexity and operational
overhead compared to the other options. EventBridge and Kinesis are typically used for real-time streaming and processing
of large volumes of data.
In summary, option C is the recommended solution as it provides a serverless and scalable approach for processing
uploaded files using S3 event notifications, SQS, and Lambda. It offers low operational overhead, automatic scaling, and
efficient handling of varying demand. Storing the resulting JSON file in DynamoDB aligns with the requirement of saving the
data for later analysis.
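A minimal Lambda handler sketch for option C; the table name, the trivial transform, and the assumption that the SQS queue is configured as the Lambda event source are all illustrative:

```python
import json
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("processed-files")  # hypothetical table name


def handler(event, context):
    # Each SQS record body contains an S3 event notification.
    for record in event["Records"]:
        s3_event = json.loads(record["body"])
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]

            # "One-time simple processing": store a JSON item describing the
            # uploaded object for later analysis (stand-in for the real transform).
            table.put_item(Item={
                "object_key": key,
                "bucket": bucket,
                "status": "processed",
            })
    return {"processed": len(event["Records"])}
```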
Question
A hospital needs to store patient records in an Amazon S3 bucket. The hospital’s compliance team must ensure that all
protected health information (PHI) is encrypted in transit and at rest. The compliance team must administer the encryption
key for data at rest.
Which solution will meet these requirements?
A. Create a public SSL/TLS certificate in AWS Certificate Manager (ACM). Associate the certificate with Amazon S3.
Configure default encryption for each S3 bucket to use server-side encryption with AWS KMS keys (SSE-KMS). Assign the
compliance team to manage the KMS keys.
B. Use the aws:SecureTransport condition on S3 bucket policies to allow only encrypted connections over HTTPS (TLS).
Configure default encryption for each S3 bucket to use server-side encryption with S3 managed encryption keys (SSE-S3).
Assign the compliance team to manage the SSE-S3 keys.
C. Use the aws:SecureTransport condition on S3 bucket policies to allow only encrypted connections over HTTPS (TLS).
Configure default encryption for each S3 bucket to use server-side encryption with AWS KMS keys (SSE-KMS). Assign the
compliance team to manage the KMS keys.
D. Use the aws:SecureTransport condition on S3 bucket policies to allow only encrypted connections over HTTPS (TLS). Use
Amazon Macie to protect the sensitive data that is stored in Amazon S3. Assign the compliance team to manage Macie.
Solution
Option C is correct because it allows the compliance team to manage the KMS keys used for server-side encryption, thereby
providing the necessary control over the encryption keys. Additionally, the use of the "aws:SecureTransport" condition on
the bucket policy ensures that all connections to the S3 bucket are encrypted in transit.
Also, SSE-S3 encryption is fully managed by AWS so the Compliance Team can't administer this.
ACM cannot be integrated with Amazon S3 bucket directly.
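A sketch of option C's two pieces with boto3; the bucket name and KMS key ARN are placeholders:

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "phi-records-bucket"  # placeholder bucket

# 1) Default encryption at rest with a customer-managed KMS key (SSE-KMS),
#    which the compliance team can administer.
s3.put_bucket_encryption(
    Bucket=bucket,
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "arn:aws:kms:us-east-1:123456789012:key/placeholder",
            }
        }]
    },
)

# 2) Deny any request that is not made over HTTPS (encryption in transit).
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```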
Chapter 5 - EC2 Section:
Question
A solutions architect needs to help a company optimize the cost of running an application on AWS. The application will use
Amazon EC2 instances, AWS Fargate, and AWS Lambda for compute within the architecture.
The EC2 instances will run the data ingestion layer of the application. EC2 usage will be sporadic and unpredictable.
Workloads that run on EC2 instances can be interrupted at any time. The application front end will run on Fargate, and
Lambda will serve the API layer. The front-end utilization and API layer utilization will be predictable over the course of the
next year.
Which combination of purchasing options will provide the MOST cost-effective solution for hosting this application?
(Choose two.)
A. Use Spot Instances for the data ingestion layer
B. Use On-Demand Instances for the data ingestion layer
C. Purchase a 1-year Compute Savings Plan for the front end and API layer.
D. Purchase 1-year All Upfront Reserved instances for the data ingestion layer.
E. Purchase a 1-year EC2 instance Savings Plan for the front end and API layer.
Solution
The two most cost-effective purchasing options for this architecture are:
A) Use Spot Instances for the data ingestion layer
C) Purchase a 1-year Compute Savings Plan for the front end and API layer
Spot Instances provide the greatest savings for flexible, interruptible EC2 workloads like data ingestion.
Savings Plans offer significant discounts for predictable usage like the front end and API layer.
All Upfront and partial/no Upfront RI's don't align well with the sporadic EC2 usage.
On-Demand is more expensive than Spot for flexible EC2 workloads.
By matching purchasing options to the workload patterns, Spot for unpredictable EC2 and Savings Plans for
steady-state usage, the solutions architect optimizes cost efficiency.
AWS Outposts
Chapter 6 – EBS
Volume Types:
SSD
- gp2 – general purpose SSD, boot disks, up to 16,000 IOPS per volume
- gp3 – general purpose SSD, high performance, 3,000 IOPS and 125 MiB/s baseline
- io1 (Provisioned IOPS SSD) – suitable for transaction processing and latency-sensitive apps; 50 IOPS per GiB, up to 64,000 IOPS per volume (high performance and most expensive)
- io2 (Provisioned IOPS SSD) – latest version; 500 IOPS per GiB and higher durability
HDD
- st1 – throughput optimised HDD, suitable for big data and ETL; max throughput is 500 MiB/s; can't be a boot volume
- sc1 – cold HDD, max throughput of 250 MiB/s per volume; for infrequently accessed data; can't be a boot volume; lowest-cost option
(A gp3 provisioning sketch follows this list.)
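A sketch of provisioning a gp3 volume with IOPS and throughput above the baseline figures listed above (the AZ, size and numbers are just examples):

```python
import boto3

ec2 = boto3.client("ec2")

# gp3 decouples size from performance: IOPS and throughput can be raised
# above the 3,000 IOPS / 125 MiB/s baseline without changing the volume size.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",  # example AZ
    Size=200,                       # GiB
    VolumeType="gp3",
    Iops=6000,                      # above the 3,000 IOPS baseline
    Throughput=500,                 # MiB/s, above the 125 MiB/s baseline
    TagSpecifications=[{
        "ResourceType": "volume",
        "Tags": [{"Key": "Name", "Value": "gp3-demo"}],
    }],
)
print(volume["VolumeId"])
```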
Volumes & Snapshots
- Volumes exist on EBS; snapshots are stored in S3.
- Snapshots are point-in-time and incremental.
- For consistent snapshots, stop the instance and detach the volume (so all data is flushed to disk).
- Can share snapshots between Regions.
- Can resize volumes on the fly and change the volume type (e.g. gp2 -> gp3).
Instance Store Volumes
- Sometimes called ephemeral storage – instances backed by instance store can be rebooted but not stopped.
- Data that's stored in instance store volumes isn't persistent through instance stops, terminations, or hardware failures. For data that you want to retain longer, or if you want to encrypt the data, use Amazon EBS volumes instead.
Encrypted volumes
- Data at rest is encrypted inside the volume.
- All snapshots are encrypted.
- All data in flight is encrypted.
- All volumes created from the snapshot are encrypted.
(A default-encryption sketch follows this list.)
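A sketch of enforcing encryption by default for new EBS volumes in a Region (the KMS key ARN is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")

# Any new EBS volume created in this Region will now be encrypted, and
# snapshots and volumes derived from it stay encrypted.
ec2.enable_ebs_encryption_by_default()

# Optionally use a customer-managed KMS key instead of the AWS-managed key.
ec2.modify_ebs_default_kms_key_id(
    KmsKeyId="arn:aws:kms:us-east-1:123456789012:key/placeholder"
)

print(ec2.get_ebs_encryption_by_default()["EbsEncryptionByDefault"])
```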
EC2 Hibernation
- Preserves the in-memory state (RAM) on persistent storage (EBS).
- Faster to boot because the OS doesn't need to be reloaded.
- Available for On-Demand and Reserved Instances.
EFS
- Supports the NFS v4 protocol.
- 1000s of concurrent NFS connections.
- Data stored across AZs in one Region.
- Read-after-write consistency.
- Scenario – highly scalable, shared storage using NFS = EFS.
- Centralised storage for EC2 instances.
FSx vs EFS vs FSx Lustre
1. EFS – distributed, highly resilient storage for Linux instances
2. FSx for Windows – centralised Windows-based storage – SharePoint, SQL Server, Workspaces, IIS web server
3. FSx for Lustre – high-speed, high-capacity distributed storage; high-performance computing; data can be stored on S3
Storage Options (Big Topic!)
S3 – serverless object storage
Glacier – for archiving objects (learn the types)
EFS – NFS for Linux across AZs
FSx for Lustre – High performant computing
EBS Volumes – persistent storage for EC2s
Instance Store – ephemeral EC2 storage
FSx for Windows – SMB file shares for Windows across multiple AZs
AWS Backup
Consolidates backups across AWS services.
Can be used with AWS Organizations to back up different AWS services across multiple AWS accounts.
Backup gives you centralised control, letting you automate your backups and define lifecycle policies.
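A sketch of a daily backup plan with a cross-Region copy, along the lines described above; vault names, ARNs, schedule and retention are assumptions:

```python
import boto3

backup = boto3.client("backup")

# Daily backups kept for 35 days, with each recovery point copied to a vault
# in another Region for DR. Vault names and ARNs below are placeholders.
plan = backup.create_backup_plan(
    BackupPlan={
        "BackupPlanName": "daily-with-dr-copy",
        "Rules": [{
            "RuleName": "daily",
            "TargetBackupVaultName": "primary-vault",
            "ScheduleExpression": "cron(0 3 * * ? *)",  # 03:00 UTC every day
            "Lifecycle": {"DeleteAfterDays": 35},
            "CopyActions": [{
                "DestinationBackupVaultArn":
                    "arn:aws:backup:eu-west-1:123456789012:backup-vault:dr-vault",
                "Lifecycle": {"DeleteAfterDays": 35},
            }],
        }],
    }
)

# Assign resources (e.g. EC2 instances or EFS file systems) to the plan by tag.
backup.create_backup_selection(
    BackupPlanId=plan["BackupPlanId"],
    BackupSelection={
        "SelectionName": "tagged-resources",
        "IamRoleArn": "arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",
        "ListOfTags": [{
            "ConditionType": "STRINGEQUALS",
            "ConditionKey": "backup",
            "ConditionValue": "true",
        }],
    },
)
```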
Question
A company is designing a containerized application that will use Amazon Elastic Container Service (Amazon ECS). The
application needs to access a shared file system that is highly durable and can recover data to another AWS Region with a
recovery point objective (RPO) of 8 hours. The file system needs to provide a mount target in each Availability Zone within a
Region.
A solutions architect wants to use AWS Backup to manage the replication to another Region.
Which solution will meet these requirements?
A. Amazon FSx for Windows File Server with a Multi-AZ deployment
B. Amazon FSx for NetApp ONTAP with a Multi-AZ deployment
C. Amazon Elastic File System (Amazon EFS) with the Standard storage class
D. Amazon FSx for OpenZFS
Solution
C – Amazon EFS with the Standard storage class.
Q: What is Amazon EFS Replication?
EFS Replication can replicate your file system data to another Region or within the same Region without requiring
additional infrastructure or a custom process. Amazon EFS Replication automatically and transparently replicates your data
to a second file system in a Region or AZ of your choice. You can use the Amazon EFS console, AWS CLI, and APIs to activate
replication on an existing file system. EFS Replication is continual and provides a recovery point objective (RPO) and a
recovery time objective (RTO) of minutes, helping you meet your compliance and business continuity goals.
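For reference, EFS replication to another Region can also be turned on directly; a sketch with boto3 (the file system ID and destination Region are placeholders):

```python
import boto3

efs = boto3.client("efs")

# Replicate an existing EFS file system to a new, read-only replica in another Region.
efs.create_replication_configuration(
    SourceFileSystemId="fs-0123456789abcdef0",  # placeholder file system ID
    Destinations=[{"Region": "eu-west-1"}],
)

# Check the replication status of the destination.
cfg = efs.describe_replication_configurations(FileSystemId="fs-0123456789abcdef0")
print(cfg["Replications"][0]["Destinations"][0]["Status"])
```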
EBS Exam Stuff:
- Create the EBS volumes as encrypted volumes and attach the encrypted EBS volumes to the EC2 instances.
- When you create an EBS volume, you can specify whether to encrypt the volume. If you choose to encrypt the volume, all data written to the volume is automatically encrypted at rest using AWS-managed keys.
- You can also use customer-managed keys (CMKs) stored in AWS KMS to encrypt and protect your EBS volumes. You can create encrypted EBS volumes and attach them to EC2 instances to ensure that all data written to the volumes is encrypted at rest.
- Encrypt EBS for an EKS cluster with a customer key – these options allow EBS encryption with the customer-managed KMS key with minimal operational overhead:
  o C) Setting the KMS key as the regional EBS encryption default automatically encrypts new EKS node EBS volumes.
  o D) The IAM role grants the EKS nodes access to use the key for encryption/decryption operations.
- EC2 instances with multiple EBS volumes – efficient way to back up = use AWS Backup, copy to another Region, and add the app's EC2 instances as resources.
- Encrypt ALB, EC2 and EBS – use AWS Key Management Service (AWS KMS) to encrypt the EBS volumes and Aurora database storage at rest. Attach an AWS Certificate Manager (ACM) certificate to the ALB to encrypt data in transit.
- 100s of EC2s with EBS, efficient way to back them all up for easy DR – use AWS Backup and the Backup API for restore processes across multiple EC2s.
- RDS uses EBS – usually gp2/gp3 or io1.
- EBS Multi-Attach is supported exclusively on Provisioned IOPS SSD (io1 and io2) volumes.
- EC2 & Multi-AZ DB instance with images on EBS storage -> combine with CloudFront and mount EFS on every EC2.
- Can encrypt EBS volumes from the EC2 console, or more centrally from AWS Organizations.
- Can use an AWS Config rule to make sure encryption is enabled.
- Web app on EC2s with ephemeral EBS storage (nothing to retain) linked to an RDS DB – best solution for recovery, efficiency and an RPO of 2 hours? Don't need to take snapshots of the EBS volumes as the data is ephemeral; just keep a latest copy of the AMI, use automated backups for RDS and use PITR for the RPO.
- gp3 is more cost-effective than gp2.
- EC2s in different AZs? Use an EFS mount to have access to all data.
- EC2 with an RDS Multi-AZ instance on a gp3 SSD volume, but issues when it goes to 20,000 IOPS? The solution can't be io2, as RDS is not supported with io2 – therefore, split the 2,000 GB gp3 volume into two 1,000 GB gp3 volumes.
- Storage Gateway is not used for storing content – only to transfer it to the cloud.
- Which solution provides near-real-time data querying that is scalable with minimal data loss, Kinesis Data Firehose or Kinesis Data Streams? Answer: Kinesis Data Streams = near-real-time data ingestion with the ability to handle the 1 MB/s ingestion rate; data is stored redundantly across shards.
- Can turn on the EBS fast snapshot restore feature on the EBS snapshots.
- Q: Web server must access a file share (can't change code) – best solution? Use EFS and mount it on all web servers.
- Encrypt a copy of the latest DB snapshot. Replace the existing DB instance by restoring the encrypted snapshot (once the DB is encrypted, newer snapshots and read replicas will also be encrypted).
Chapter 7 - Database
RDS
SQL Server, Oracle, MySQL, PostgreSQL, MariaDB & Aurora.
RDS good for transactional orders/ booking systems.
Not good for analysis/ forecasting (Use Redshift).
RDS Multi AZ
Amazon RDS for MySQL automatically provisions and maintains a synchronous (happens at the same time) standby replica
in a different Availability Zone
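A sketch of creating a Multi-AZ RDS for MySQL instance with boto3 (identifiers, sizes and credentials are placeholders; the standby in another AZ is provisioned automatically):

```python
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="orders-db",     # placeholder identifier
    Engine="mysql",
    DBInstanceClass="db.m6g.large",       # example instance class
    AllocatedStorage=100,                 # GiB
    MasterUsername="admin",
    MasterUserPassword="REPLACE_ME",      # use Secrets Manager in practice
    MultiAZ=True,                         # synchronous standby in another AZ
    BackupRetentionPeriod=7,              # automated backups, needed for PITR
)
```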
Custom RDS
Amazon RDS Custom is a managed database service for legacy, custom, and packaged applications that require access to
the underlying operating system and database environment.
Read Replica vs Multi-AZ (watch out when the question asks for another Region)
- RRs are for scaling and improving read performance, NOT recovery.
- RRs require automated backups to be enabled.
- SQL Server, Oracle, MySQL, PostgreSQL and MariaDB allow 5 RRs.
- Multi-AZ = exact copy of the DB in another AZ.
Redshift
Best for data warehousing tasks – analysing / forecasting data
Scenario - There is a need to run complex analytic queries against terabytes of structured data, using sophisticated query
optimization, columnar storage on high-performance storage, and massively parallel query execution. Which service will
best meet this requirement?
Aurora
- Fully managed, MySQL- and PostgreSQL-compatible RDS engine; a serverless compute option is available.
- 2 copies of data in each of at least 3 AZs (6 copies in total).
- Can create Aurora, MySQL & PostgreSQL replicas – automated failover is only available with Aurora replicas.
- Automated backups are turned on by default; you can take snapshots with Aurora & share them with other accounts.
- Static (provisioned) capacity, unlike Aurora Serverless.
Aurora Serverless
- Serverless DB – cost-effective, simple option for intermittent workloads.
RDS vs Aurora
Amazon Aurora is a proprietary, fully-managed relational database engine that is MySQL and PostgreSQL-compatible.
Amazon RDS is a hosted database service that supports a variety of relational database engines, including MySQL,
PostgreSQL, MariaDB, Microsoft SQL Server, and Oracle.
DynamoDB
- Stored on SSD storage.
- Spread across 3 geographically distinct data centres.
- Eventually consistent reads (best read performance; usually consistent within 1 second) / strongly consistent reads (reflect all writes that received a successful response before the read).
DynamoDB table export allows you to export data from DynamoDB to an S3 bucket. This can be useful for running one-time
queries on historical data.
Amazon Athena is a serverless, interactive query service that makes it easy to analyze data in Amazon S3. Athena can be
used to run one-time queries on the data in the S3 bucket.
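A sketch of a one-off export to S3 that Athena can then query (the table ARN and bucket are placeholders; the table must have point-in-time recovery enabled for exports):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Export the table's data (as of a point in time) to S3 for later analysis.
export = dynamodb.export_table_to_point_in_time(
    TableArn="arn:aws:dynamodb:us-east-1:123456789012:table/customers",  # placeholder
    S3Bucket="analytics-exports",                                        # placeholder
    S3Prefix="dynamodb/customers/",
    ExportFormat="DYNAMODB_JSON",
)
print(export["ExportDescription"]["ExportStatus"])  # IN_PROGRESS, then COMPLETED
```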
DynamoDB Transactions
- Any question that mentions an ACID requirement (Atomicity, Consistency, Isolation & Durability) within an account or Region.
- All-or-nothing transactions.
DynamoDB On-demand Backup and restore.
- Full backups at any time.
- Zero impact on table performance or availability.
- Consistent within seconds and retained until deleted.
- Same Region as the source table.
DynamoDB Point in time recovery
- Restore to any point in the last 35 days.
- Incremental backups.
DynamoDB Streams
Time ordered sequence of item-level changes in a table
DynamoDb Global tables
Amazon DynamoDB global tables is a fully managed, serverless, multi-Region, and multi-active database. Global tables
provide you 99.999% availability, increased application resiliency, and improved business continuity. As global tables
replicate your Amazon DynamoDB tables automatically across your choice of AWS Regions, you can achieve fast, local read
and write performance.
- Based on DynamoDB Streams.
- No app rewrites required.
- Use if you want to add redundancy.
DocumentDB (MongoDb)
- For migrating on-premises MongoDB to AWS.
AWS Keyspaces
- For migrating a big Cassandra data cluster to AWS.
AWS Timestream
- Scenarios involving large amounts of time-series data.
Distractors:
Neptune is a distractor (graph database).
Quantum Ledger Database (QLDB) – an immutable ledger database, similar to a blockchain.
Question
A company has a web application for travel ticketing. The application is based on a database that runs in a single data
center in North America. The company wants to expand the application to serve a global user base. The company needs to
deploy the application to multiple AWS Regions. Average latency must be less than 1 second on updates to the reservation
database.
The company wants to have separate deployments of its web platform across multiple Regions. However, the company
must maintain a single primary reservation database that is globally consistent.
Which solution should a solutions architect recommend to meet these requirements?
A. Convert the application to use Amazon DynamoDB. Use a global table for the center reservation table. Use the correct
Regional endpoint in each Regional deployment.
B. Migrate the database to an Amazon Aurora MySQL database. Deploy Aurora Read Replicas in each Region. Use the
correct Regional endpoint in each Regional deployment for access to the database.
C. Migrate the database to an Amazon RDS for MySQL database. Deploy MySQL read replicas in each Region. Use the
correct Regional endpoint in each Regional deployment for access to the database.
D. Migrate the application to an Amazon Aurora Serverless database. Deploy instances of the database to each Region. Use
the correct Regional endpoint in each Regional deployment to access the database. Use AWS Lambda functions to process
event streams in each Region to synchronize the databases.
Solution
B
"the company must maintain a single primary reservation database that is globally consistent." --> Relational database,
because it only allows writes through one Regional endpoint.
A DynamoDB global table allows BOTH reads and writes in all Regions ("last writer wins"), so it is not a single point of entry. You could set up an IAM identity-based policy to restrict write access for global table replicas outside North America, but that is not mentioned.
Aurora Global Database provides managed cross-Region replication and fast failover for high availability across Regions.
Read replicas in each region ensure low read latency by promoting a local replica to handle reads.
A single Aurora primary region handles all writes to maintain data consistency.
Data replication and sync is managed automatically by Aurora Global DB.
Regional endpoints minimize cross-region latency.
Automatic failover promotes a replica to be the new primary if the current primary region goes down.
Q
A company runs a shopping application that uses Amazon DynamoDB to store customer information. In case of data
corruption, a solutions architect needs to design a solution that meets a recovery point objective (RPO) of 15 minutes and a
recovery time objective (RTO) of 1 hour.
What should the solutions architect recommend to meet these requirements?
A. Configure DynamoDB global tables. For RPO recovery, point the application to a different AWS Region.
B. Configure DynamoDB point-in-time recovery. For RPO recovery, restore to the desired point in time.
C. Export the DynamoDB data to Amazon S3 Glacier on a daily basis. For RPO recovery, import the data from S3 Glacier to
DynamoDB.
D. Schedule Amazon Elastic Block Store (Amazon EBS) snapshots for the DynamoDB table every 15 minutes. For RPO
recovery, restore the DynamoDB table by using the EBS snapshot.
Answer
The best solution to meet the RPO and RTO requirements is DynamoDB point-in-time recovery (PITR). This feature allows you to restore your DynamoDB table to any point in time within the last 35 days, with per-second granularity. To meet the 15-minute RPO, you simply restore the table to the desired point in time.
PITR is enabled on the table through the DynamoDB console, the AWS CLI, or the AWS SDKs. Once enabled, PITR continuously backs up your table data, and a restore to any point within the retention period can be completed well within the 1-hour RTO.
https://www.examtopics.com/discussions/amazon/view/109701-exam-aws-certified-solutions-architect-associate-saa-c03/
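As a rough illustration of the flow described above, here is a minimal boto3 (Python) sketch; the table name, target table name and timestamp are placeholders, not values from the question.

import boto3
from datetime import datetime, timezone

dynamodb = boto3.client("dynamodb")

# Enable point-in-time recovery on the table (placeholder table name)
dynamodb.update_continuous_backups(
    TableName="CustomerInfo",
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)

# Restore to a point in time just before the corruption (placeholder timestamp)
dynamodb.restore_table_to_point_in_time(
    SourceTableName="CustomerInfo",
    TargetTableName="CustomerInfo-restored",
    RestoreDateTime=datetime(2024, 1, 1, 11, 45, tzinfo=timezone.utc),
)

Note that the restore creates a new table; pointing the application at the restored table (or copying the data back) is what determines how much of the 1-hour RTO you use.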
Question
A company runs an Oracle database on premises. As part of the company’s migration to AWS, the company wants to
upgrade the database to the most recent available version. The company also wants to set up disaster recovery (DR) for the
database. The company needs to minimize the operational overhead for normal operations and DR setup. The company
also needs to maintain access to the database's underlying operating system.
Which solution will meet these requirements?
A. Migrate the Oracle database to an Amazon EC2 instance. Set up database replication to a different AWS Region.
B. Migrate the Oracle database to Amazon RDS for Oracle. Activate Cross-Region automated backups to replicate the
snapshots to another AWS Region.
C. Migrate the Oracle database to Amazon RDS Custom for Oracle. Create a read replica for the database in another AWS
Region.
D. Migrate the Oracle database to Amazon RDS for Oracle. Create a standby database in another Availability Zone.
Solution
Option C is the best solution to meet the requirements:
Migrate the Oracle database to Amazon RDS Custom for Oracle.
Create a read replica for the database in another AWS Region for disaster recovery.
The reasons are:
RDS Custom provides a managed Oracle database instance, which reduces operational overhead compared to running the database on EC2.
RDS Custom allows access to the underlying OS, which is required.
Creating a read replica in another Region provides a simple DR solution.
Standard RDS for Oracle (options B and D) is fully managed but does not give access to the underlying operating system.
A standby database in another Availability Zone (option D) stays within the same Region, so it does not provide geographic diversity for DR.
So RDS Custom meets the managed-database, OS-access, and simple-DR needs, and the cross-Region read replica provides geographic diversity for DR. This is the right fit for the requirements.
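A minimal boto3 (Python) sketch of the DR part of option C, run from the destination Region; the instance identifiers, account ID and instance class are placeholders, and RDS Custom has additional prerequisites (custom engine version, instance profile, etc.) that are not shown here.

import boto3

# Run in the DR Region; the source instance is referenced by its ARN (placeholder values)
rds = boto3.client("rds", region_name="us-west-2")

rds.create_db_instance_read_replica(
    DBInstanceIdentifier="orders-oracle-replica",
    SourceDBInstanceIdentifier="arn:aws:rds:us-east-1:111122223333:db:orders-oracle",
    DBInstanceClass="db.m5.xlarge",
)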
Question
A company has deployed a database in Amazon RDS for MySQL. Due to increased transactions, the database support team
is reporting slow reads against the DB instance and recommends adding a read replica.
Which combination of actions should a solutions architect take before implementing this change? (Choose two.)
A. Enable binlog replication on the RDS primary node.
B. Choose a failover priority for the source DB instance.
C. Allow long-running transactions to complete on the source DB instance.
D. Create a global table and specify the AWS Regions where the table will be available.
E. Enable automatic backups on the source instance by setting the backup retention period to a value other than 0.
Solution
*C. Long-running transactions can prevent the read replica from catching up with the source DB instance. Allowing these
transactions to complete before creating the read replica can help ensure that the replica is able to stay synchronized with
the source.
**E. Automatic backups must be enabled on the source DB instance for read replicas to be created. This is done by setting
the backup retention period to a value other than 0.
NOT D. Creating a global table and specifying AWS Regions is related to Aurora Global Databases, which is not the same as
creating a read replica for a standard RDS instance.
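A minimal boto3 (Python) sketch of the two prerequisites (E) and the follow-up replica creation; the instance identifiers and retention period are placeholders.

import boto3

rds = boto3.client("rds")

# E: automated backups must be on before a replica can be created
# (backup retention period > 0; the value here is illustrative)
rds.modify_db_instance(
    DBInstanceIdentifier="shop-mysql",
    BackupRetentionPeriod=7,
    ApplyImmediately=True,
)

# Once long-running transactions have completed (C), create the read replica
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="shop-mysql-replica-1",
    SourceDBInstanceIdentifier="shop-mysql",
)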
Question
You are working for a large financial institution and preparing for disaster recovery and upcoming DR drills. A key
component in the DR plan will be the database instances and their data. An aggressive Recovery Time Objective (RTO)
dictates that the database needs to be synchronously replicated. Which configuration can meet this requirement?
Solution
RDS Multi-AZ
Amazon RDS Multi-AZ deployments provide enhanced availability and durability for RDS database (DB) instances, making
them a natural fit for production database workloads. When you provision a Multi-AZ DB instance, Amazon RDS
automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different
Availability Zone (AZ). Each AZ runs on its own physically distinct, independent infrastructure, and is engineered to be highly
reliable.
Question
A new startup is considering the advantages of using Amazon DynamoDB versus a traditional relational database in AWS
RDS. The NoSQL nature of DynamoDB presents a small learning curve to the team members who all have experience with
traditional databases. The company will have multiple databases, and the decision will be made on a case-by-case basis.
Which of the following use cases would favor Amazon DynamoDB?
-
Online analytical processing (OLAP)/data warehouse implementations
Managing web session data
Strong referential integrity between tables
Storing metadata for S3 objects
Storing binary large object (BLOB) data
High-performance reads and writes for online transaction processing (OLTP) workloads
Solution
-
Managing web session data
Storing metadata for S3 objects
High-performance reads and writes for online transaction processing (OLTP) workloads
Amazon DynamoDB is a NoSQL database that supports key-value and document data structures. A key-value store is a
database service that provides support for storing, querying, and updating collections of objects that are identified using a
key and values that contain the actual content being stored. Meanwhile, a document data store provides support for
storing, querying, and updating items in a document format such as JSON, XML, and HTML. Amazon DynamoDB’s fast and
predictable performance characteristics make it a great match for handling session data. Plus, since it’s a fully-managed
NoSQL database service, you avoid all the work of maintaining and operating a separate session store. Amazon DynamoDB
Session Manager for Apache Tomcat.
Storing metadata for Amazon S3 objects is correct because the Amazon DynamoDB stores structured data indexed by
primary key and allows low-latency read and write access to items ranging from 1 byte up to 400KB. Amazon S3 stores
unstructured blobs and is suited for storing large objects up to 5 TB. In order to optimize your costs across AWS services,
large objects or infrequently accessed data sets should be stored in Amazon S3, while smaller data elements or file pointers
(possibly to Amazon S3 objects) are best saved in Amazon DynamoDB.
High-performance reads and writes are easy to manage with Amazon DynamoDB, and you can expect performance that is effectively constant across widely varying loads.
NOT
Amazon DynamoDB can store binary items up to 400 KB, but DynamoDB is not generally suited to storing documents or images. A better architectural pattern for this implementation is to store pointers to Amazon S3 objects in a DynamoDB table.
Chapter 8 - Networking / virtual private cloud
VPC
A VPC is a logical data centre within AWS.
Consists of internet gateways, route tables, NACLs, subnets and SGs.
1 subnet sits in exactly 1 AZ.
NAT Gateway
You can use a NAT gateway so that instances in a private subnet can connect to services outside your VPC but external
services cannot initiate a connection with those instances.
Public – (Default) Instances in private subnets can connect to the internet through a public NAT gateway.
Private – Instances in private subnets can connect to other VPCs or your on-premises network through a private NAT gateway.
Exam Scenario: If you have resources in multiple AZs and they share a single NAT gateway, and the AZ where the NAT gateway resides goes down, all the other AZs lose internet access.
For high availability you need a NAT gateway in each AZ (see the sketch below).
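A minimal boto3 (Python) sketch of the per-AZ pattern: one NAT gateway in each AZ's public subnet, and each private route table pointing at its local NAT gateway. The subnet and route table IDs are placeholders.

import boto3

ec2 = boto3.client("ec2")

# One public subnet and one private route table per AZ (placeholder IDs)
azs = [
    {"public_subnet": "subnet-aaa1", "private_route_table": "rtb-aaa1"},
    {"public_subnet": "subnet-bbb2", "private_route_table": "rtb-bbb2"},
]

for az in azs:
    eip = ec2.allocate_address(Domain="vpc")
    nat = ec2.create_nat_gateway(
        SubnetId=az["public_subnet"],
        AllocationId=eip["AllocationId"],
    )
    nat_id = nat["NatGateway"]["NatGatewayId"]
    # Wait until the NAT gateway is usable before routing traffic to it
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])
    # Send the private subnet's internet-bound traffic to the NAT gateway in the SAME AZ
    ec2.create_route(
        RouteTableId=az["private_route_table"],
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_id,
    )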
Internet Gateway
An internet gateway enables resources in your public subnets (such as EC2 instances) to connect to the internet if the
resource has a public IPv4 address or an IPv6 address.
If a subnet's traffic is routed to an internet gateway, the subnet is known as a public subnet.
It is not possible to assign a static IP address to an internet gateway.
A public subnet is a subnet that's associated with a route table that has a route to an internet gateway. Reference: VPC with
public and private subnets (NAT) - Overview.
Access Control Lists
NACL
A network access control list (ACL) allows or denies specific inbound or outbound traffic at the subnet level. You can use the
default network ACL for your VPC, or you can create a custom network ACL for your VPC with rules that are similar to the
rules for your security groups in order to add an additional layer of security to your VPC.
Rule number. Rules are evaluated starting with the lowest numbered rule. As soon as a rule matches traffic, it's applied
regardless of any higher-numbered rule that might contradict it.
Each network ACL also includes a rule whose rule number is an asterisk (*). This rule ensures that if a packet doesn't match
any of the other numbered rules, it's denied. You can't modify or remove this rule.
Exam Scenario: if you want to block a dodgy IP, use a NACL, not an SG (security groups cannot have deny rules); a sketch follows below. Going one step further – if the scenario is more about protecting your application from common web exploits (SQL injection or cross-site scripting), then AWS WAF is the more suitable choice.
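A minimal boto3 (Python) sketch of blocking one address with a low-numbered NACL deny rule; the NACL ID and IP are placeholders.

import boto3

ec2 = boto3.client("ec2")

# Deny all inbound traffic from a single address. The low rule number (90) means this
# rule is evaluated before a broader allow rule (e.g. rule 100) and wins.
ec2.create_network_acl_entry(
    NetworkAclId="acl-0123456789abcdef0",
    RuleNumber=90,
    Protocol="-1",            # all protocols
    RuleAction="deny",
    Egress=False,             # inbound rule
    CidrBlock="203.0.113.25/32",
)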
Security Groups
You can change the security groups for an instance when the instance is in the running or stopped state.
SG vs NACL
Security group is the firewall of EC2 Instances.
Network ACL is the firewall of the VPC Subnets.
Stateful vs Stateless
Security groups are stateful. Once traffic is allowed in one direction, the return traffic for that connection is automatically allowed, regardless of the rules in the other direction.
e.g. If you allow inbound port 80, the response traffic on that connection is automatically allowed out.
-
Once a connection has been formed, data can go back and forth. If the first move is outbound and the outbound rule denies it, the traffic is blocked and no two-way connection is formed. If the first move is outbound and it is allowed, the inbound rules won't matter for the responses – a two-way connection is formed.
Network ACLs are stateless. Every packet that passes through the subnet the ACL is attached to is evaluated against the rules, in both directions; allowing traffic in one direction does not automatically allow the responses.
e.g. If you allow inbound port 80, you also need to apply a rule for the outgoing (response) traffic.
Network ACLs are stateless, which means that responses to allowed inbound traffic are subject to the rules for outbound
traffic (and vice versa). https://docs.aws.amazon.com/vpc/latest/userguide/vpc-network-acls.html#default-network-acl
The following are the basic characteristics of security groups for your VPC:
Security groups are stateful. If you send a request from your instance, the response traffic for that request is allowed to
flow in regardless of inbound security group rules. Responses to allowed inbound traffic are allowed to flow out, regardless
of outbound rules. https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html
ENI – Elastic Network Interface
Think of the Ethernet port on the side of a laptop – that is the Network Interface Card. The NIC equivalent for an EC2 instance is the ENI – this provides the EC2 instance with network connectivity.
When you build an EC2, each instance in your VPC has a default network interface (the primary network interface — eth0)
that assigns a private IPv4 address from the IPv4 address range of your VPC.
EN – Enhanced Networking – uses single root I/O virtualisation (SR-IOV) for high performance.
Provides higher bandwidth, higher packets per second (between 10-100 Gbps) and lower latency.
Depending on your instance type, EN can be enabled via the Elastic Network Adapter (ENA) or the Intel Virtual Function (VF) interface. EXAM TIP: Always go for ENA; it's better and faster.
EFA – Elastic Fabric Adapter – accelerates High Performance Computing (HPC) and machine learning applications.
EXAM: when the question talks about high-performance computing and asks which network adapter to use, pick EFA.
Use OS-bypass on Linux, which lets applications skip the OS kernel and talk directly to the EFA.
Direct Connect
Directly connects your data centre to AWS.
Good for high-throughput workloads.
Traffic travels across the AWS global network rather than the public internet – while in transit, your network traffic remains on the AWS global network and never touches the public internet.
VPN vs Direct Connect
VPN - It provides a connection between an on-premises network and a VPC, using a secure and private connection with
IPsec and TLS.
A VPC/VPN Connection utilizes IPSec to establish encrypted network connectivity between your intranet and Amazon VPC
over the Internet.
DC – (Expensive option) A private, dedicated network connection between your facilities and AWS
AWS Direct Connect is a cloud service solution that makes it easy to establish a dedicated network connection from your
premises to AWS. Using AWS Direct Connect, you can establish private connectivity between AWS and your datacenter,
office, or colocation environment, which in many cases can reduce your network costs, increase bandwidth throughput,
and provide a more consistent network experience than internet-based connections.
VPN vs DC Conclusion - AWS Direct Connect provides a consistent, non-fluctuating network experience, while an AWS VPN runs over shared public networks, so its bandwidth and latency fluctuate. Direct Connect is the more performant option.
VPC Endpoints
Use these when you want to connect to AWS services without leaving the Amazon internal network; no public IP is required.
2 Types:
Interface Endpoints – interface endpoints are Elastic Network Interfaces (ENIs) with private IP addresses. The ENI acts as the entry point for traffic destined for a particular service.
Gateway Endpoints – a gateway endpoint is a gateway that you specify as the target of a route in the route table, used to route traffic to the destined AWS service. Only supported for S3 and DynamoDB (see the sketch below).
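A minimal boto3 (Python) sketch of a gateway endpoint for S3 attached to a private subnet's route table; the Region, VPC ID and route table ID are placeholders. This is also the pattern behind the S3 cost-reduction question later in this chapter.

import boto3

ec2 = boto3.client("ec2")

# Gateway endpoint for S3: traffic to S3 from the associated route tables stays on
# the AWS network and avoids NAT gateway data processing charges.
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],   # the private subnet's route table
)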
VPC peering
-
Allows you to connect 1 VPC to another via a direct network route, using private IPs.
Instances behave like they're on the same network.
Can peer with VPCs in other AWS accounts.
AWS Private Link
AWS PrivateLink provides private connectivity between VPCs, AWS services, and your on-premises networks, without
exposing your traffic to the public internet
Scenario: for a question about connecting your VPC to 1000s of customer VPCs, think AWS PrivateLink.
Doesn’t require VPC peering, no route tables, NAT gateways, internet gateways.
Requires a Network Load Balancer on the service VPC and an ENI on the customer VPC (see the sketch below).
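A minimal boto3 (Python) sketch of both sides of PrivateLink; in reality the two calls run in different AWS accounts, and the NLB ARN, VPC, subnet and security group IDs are placeholders.

import boto3

ec2 = boto3.client("ec2")

# Provider side: publish an endpoint service fronted by a Network Load Balancer
svc = ec2.create_vpc_endpoint_service_configuration(
    NetworkLoadBalancerArns=[
        "arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/net/provider-nlb/abc123"
    ],
    AcceptanceRequired=True,
)
service_name = svc["ServiceConfiguration"]["ServiceName"]   # com.amazonaws.vpce.us-east-1.vpce-svc-...

# Consumer side: create an interface endpoint (an ENI in the consumer's VPC)
ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-consumer123",
    ServiceName=service_name,
    SubnetIds=["subnet-consumer1"],
    SecurityGroupIds=["sg-consumer1"],
)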
AWS Transit Gateway
-
Can use route tables to limit how VPCs talk to each other.
Works with Dir Conn as well as VPC connections.
Scenario: simplifying network topology / IP multicasting – saves you from having loads of VPC peering connections.
VPN Hub
-
Scenario: simplifies vpn network topology
AWS Wavelength
-
Scenario Q about mobile 5g edge computing.
Question:
You are evaluating the security setting within the main company VPC. There are several NACLs and security groups to
evaluate and possibly edit. What is true regarding NACLs and security groups?
-
Network ACLs are stateless, and security groups are stateful.
Network ACLs are stateful, and security groups are stateless.
Network ACLs and security groups are both stateless.
Network ACLs and security groups are both stateful.
Answer:
Network ACLs are stateless, and security groups are stateful.
Question
Solution
The request will be allowed.
Correct. Rules are evaluated starting with the lowest numbered rule. As soon as a rule matches traffic, it's applied
immediately regardless of any higher-numbered rule that may contradict it. The following are the basic things that you
need to know about network ACLs: Your VPC automatically comes with a modifiable default network ACL. By default, it
allows all inbound and outbound IPv4 traffic and, if applicable, IPv6 traffic. You can create a custom network ACL and
associate it with a subnet. By default, each custom network ACL denies all inbound and outbound traffic until you add rules.
Each subnet in your VPC must be associated with a network ACL. If you don't explicitly associate a subnet with a network
ACL, the subnet is automatically associated with the default network ACL.
A network ACL has separate inbound and outbound rules, and each rule can either allow or deny traffic. Network ACLs are
stateless, which means that responses to allowed inbound traffic are subject to the rules for outbound traffic (and vice
versa).
Question
A company has a service that reads and writes large amounts of data from an Amazon S3 bucket in the same AWS Region.
The service is deployed on Amazon EC2 instances within the private subnet of a VPC. The service communicates with
Amazon S3 over a NAT gateway in the public subnet. However, the company wants a solution that will reduce the data
output costs.
Which solution will meet these requirements MOST cost-effectively?
A. Provision a dedicated EC2 NAT instance in the public subnet. Configure the route table for the private subnet to use the
elastic network interface of this instance as the destination for all S3 traffic.
B. Provision a dedicated EC2 NAT instance in the private subnet. Configure the route table for the public subnet to use the
elastic network interface of this instance as the destination for all S3 traffic.
C. Provision a VPC gateway endpoint. Configure the route table for the private subnet to use the gateway endpoint as the
route for all S3 traffic.
D. Provision a second NAT gateway. Configure the route table for the private subnet to use this NAT gateway as the
destination for all S3 traffic.
Solution
A VPC gateway endpoint allows you to privately access Amazon S3 from within your VPC without using a NAT gateway or
NAT instance. By provisioning a VPC gateway endpoint for S3, the service in the private subnet can directly communicate
with S3 without incurring data transfer costs for traffic going through a NAT gateway.
Question
A company runs workloads on AWS. The company needs to connect to a service from an external provider. The service is
hosted in the provider's VPC. According to the company’s security team, the connectivity must be private and must be
restricted to the target service. The connection must be initiated only from the company’s VPC.
Which solution will meet these requirements?
A. Create a VPC peering connection between the company's VPC and the provider's VPC. Update the route table to connect
to the target service.
B. Ask the provider to create a virtual private gateway in its VPC. Use AWS PrivateLink to connect to the target service.
C. Create a NAT gateway in a public subnet of the company's VPC. Update the route table to connect to the target service.
D. Ask the provider to create a VPC endpoint for the target service. Use AWS PrivateLink to connect to the target service
Solution
D
Ask the provider to create a VPC endpoint for the target service
Use AWS PrivateLink to connect to the target service
The reasons are:
PrivateLink provides private connectivity between VPCs without using public internet.
The provider creates a VPC endpoint in their VPC for the target service.
The company uses PrivateLink to securely access the endpoint from their VPC.
Connectivity is restricted only to the target service.
The connection is initiated only from the company's VPC.
Options A, B, and C would either fail to restrict access to the target service, route traffic over the public internet, or require infrastructure changes in the provider's VPC.
PrivateLink enables private, restricted connectivity to the target service without VPC peering or public exposure.
Question
A company uses Amazon API Gateway to run a private gateway with two REST APIs in the same VPC. The BuyStock RESTful
web service calls the CheckFunds RESTful web service to ensure that enough funds are available before a stock can be
purchased. The company has noticed in the VPC flow logs that the BuyStock RESTful web service calls the CheckFunds
RESTful web service over the internet instead of through the VPC. A solutions architect must implement a solution so that
the APIs communicate through the VPC.
Which solution will meet these requirements with the FEWEST changes to the code?
A. Add an X-API-Key header in the HTTP header for authorization.
B. Use an interface endpoint.
C. Use a gateway endpoint.
D. Add an Amazon Simple Queue Service (Amazon SQS) queue between the two REST APIs.
Solution
an interface endpoint is a horizontally scaled, redundant VPC endpoint that provides private connectivity to a service. It is
an elastic network interface with a private IP address that serves as an entry point for traffic destined to the AWS service.
Interface endpoints are used to connect VPCs with AWS services
Chapter 9 – Route 53
ELBs don't have a predefined IPv4 address; you resolve them using a DNS name.
DNS = converts a human-friendly name such as www.google.com to an IP address.
NS record = the name server record for your hosted zone.
Alias record = used to map resource record sets in your hosted zone to an ELB, a CloudFront distribution or an S3 bucket configured as a website – a human-readable name that's linked, for instance, to a static website bucket in S3.
CNAME = canonical name, used to resolve one name to another, e.g. m.acloud.guru and mobile.acloud.guru going to the same place.
Always choose alias over CNAME.
Question
A company hosts its web application on AWS using seven Amazon EC2 instances. The company requires that the IP
addresses of all healthy EC2 instances be returned in response to DNS queries.
Which policy should be used to meet this requirement?
A. Simple routing policy
B. Latency routing policy
C. Multivalue routing policy
D. Geolocation routing policy
Solution
C - Use a multivalue answer routing policy to help distribute DNS responses across multiple resources; for example, when you want to associate your routing records with Route 53 health checks, or when you need to return multiple values for a DNS query and route traffic to multiple IP addresses (a sketch follows below).
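A minimal boto3 (Python) sketch of one multivalue answer record tied to a health check; in practice you would create one record per instance (seven here). The hosted zone ID, record name, IP and health check ID are placeholders.

import boto3

route53 = boto3.client("route53")

# One record per EC2 instance; Route 53 returns only the values whose health checks pass
route53.change_resource_record_sets(
    HostedZoneId="Z0123456789ABCDEFGH",
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "A",
                    "SetIdentifier": "web-1",          # unique per record
                    "MultiValueAnswer": True,
                    "TTL": 60,
                    "ResourceRecords": [{"Value": "203.0.113.10"}],
                    "HealthCheckId": "11111111-2222-3333-4444-555555555555",
                },
            }
        ]
    },
)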
Chapter 10 – Elastic Load Balancer
Question
A solutions architect is designing a solution to run a containerized web application by using Amazon Elastic Container
Service (Amazon ECS). The solutions architect wants to minimize cost by running multiple copies of a task on each container
instance. The number of task copies must scale as the load increases and decreases. Which routing solution distributes the
load to the multiple tasks?
A. Configure an Application Load Balancer to distribute the requests by using path-based routing.
B. Configure an Application Load Balancer to distribute the requests by using dynamic host port mapping.
C. Configure an Amazon Route 53 alias record set to distribute the requests with a failover routing policy.
D. Configure an Amazon Route 53 alias record set to distribute the requests with a weighted routing policy.
Application Load Balancer
An Application Load Balancer is fronting an Auto Scaling Group of EC2 instances, and the instances are backed by an RDS
database. The Auto Scaling Group has been configured to use the Default Termination Policy. You are testing the Auto
Scaling Group and have triggered a scale-in. Which instance will be terminated first?
Answer: The instance launched from the oldest launch configuration.
Chapter 11 - Cloudwatch
Amazon CloudWatch collects and visualizes real-time logs, metrics, and event data in automated dashboards to streamline
your infrastructure and application maintenance.
– always use this for monitoring- and alarm-related questions.
Default metrics – CPU Util %, Network Throughput
Custom – EC2 memory Util, EC2 Storage Capacity
Alarms aren’t default, always need to configure.
Standard is 5min intervals, detailed is 1min at a greater cost.
CloudWatch Logs
"Monitor AWS resources and custom metrics generated by your applications and services". With Amazon CloudWatch, you
gain system-wide visibility into resource utilization, application performance, and operational health.
You can use Amazon CloudWatch Logs to monitor, store, and access your log files from Amazon Elastic Compute Cloud
(Amazon EC2) instances, AWS CloudTrail, Route 53, RDS, Lambda and other sources.
CloudWatch Logs enables you to centralize the logs from all of your systems, applications, and AWS services that you use, in
a single, highly scalable service. You can then easily view them, search them for specific error codes or patterns, filter them
based on specific fields, or archive them securely for future analysis. CloudWatch Logs enables you to see all of your logs,
regardless of their source, as a single and consistent flow of events ordered by time.
CloudWatch Logs also supports querying your logs with a powerful query language, auditing and masking sensitive data in
logs, and generating metrics from logs using filters or an embedded log format.
Log Event – Data point, record of what happened, it contains a timestamp and the data.
Log Stream – Collection of log events from a single source (e.g. an Ec2)
Log Group – Combo of Log Streams, (e.g. a group of Ec2s)
Exam Tip
If the Q mentions storage instead of processing, S3 might be more of a likely solution than CW.
If Q asks for realtime logs, probably want to select Kinesis.
It’s agent based, you need to install on your host
If SQL-style querying of logs is mentioned – use 'CloudWatch Logs Insights'
If monitoring AWS Standards want to use AWS Config (AWS Config continually assesses, audits, and evaluates the
configurations and relationships of your resources on AWS, on premises, and on other clouds.) - Config gives you a
detailed inventory of your AWS resources and their current configuration, and continuously records configuration
changes
If it's about EventBridge event logs / API activity – CloudTrail is better suited
Question
A company has a legacy application running on servers on premises. To increase the application's reliability, the company
wants to gain actionable insights using application logs. A Solutions Architect has been given following requirements for the
solution:
✑ Aggregate logs using AWS.
✑ Automate log analysis for errors.
✑ Notify the Operations team when errors go beyond a specified threshold.
What solution meets the requirements?
A. Install Amazon Kinesis Agent on servers, send logs to Amazon Kinesis Data Streams and use Amazon Kinesis Data
Analytics to identify errors, create an Amazon CloudWatch alarm to notify the Operations team of errors
B. Install an AWS X-Ray agent on servers, send logs to AWS Lambda and analyze them to identify errors, use Amazon
CloudWatch Events to notify the Operations team of errors.
C. Install Logstash on servers, send logs to Amazon S3 and use Amazon Athena to identify errors, use sendmail to notify the
Operations team of errors.
D. Install the Amazon CloudWatch agent on servers, send logs to Amazon CloudWatch Logs and use metric filters to identify
errors, create a CloudWatch alarm to notify the Operations team of errors.
Solution
The answer is D, even though A would work too.
A is more complex, and Kinesis Data Analytics is built for analytics; while it might be possible to repurpose it for automated log analysis, it's not ideal. CloudWatch Logs metric filters are purpose-built for turning log errors into metrics (see the sketch below).
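A minimal boto3 (Python) sketch of the metric-filter-plus-alarm half of option D; the log group, metric names, threshold and SNS topic ARN are placeholders, and installing the CloudWatch agent on the on-premises servers is not shown.

import boto3

logs = boto3.client("logs")
cloudwatch = boto3.client("cloudwatch")

# Turn "ERROR" lines in the application log group into a custom metric
logs.put_metric_filter(
    logGroupName="/onprem/legacy-app",
    filterName="app-errors",
    filterPattern="ERROR",
    metricTransformations=[{
        "metricName": "AppErrorCount",
        "metricNamespace": "LegacyApp",
        "metricValue": "1",
    }],
)

# Alarm when errors exceed the threshold; the alarm action notifies the Ops team via SNS
cloudwatch.put_metric_alarm(
    AlarmName="legacy-app-error-rate",
    Namespace="LegacyApp",
    MetricName="AppErrorCount",
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=10,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:ops-alerts"],
)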
Amazon Managed Grafana
Fully managed AWS service for secure data visualisations – Grafana is a popular open-source analytics platform that enables you to query, visualize, and alert on your metrics, logs, and traces (similar to Splunk).
AWS builds the Grafana workspaces, scales them and controls security.
Use Cases:
Several built-in data sources – CloudWatch, Amazon Managed Service for Prometheus & X-Ray.
Container metric visualisations – connect to data sources like Prometheus for visualising EKS, ECS or your own Kubernetes cluster metrics.
IoT – a vast set of data plugins allows monitoring of edge-device data and IoT.
Troubleshooting
Amazon Managed service for Prometheus
Prometheus - An open-source monitoring system with a dimensional data model, flexible query language, efficient time
series database and modern alerting approach.
Container metric visualisations – e.g. Kubernetes
-
Auto scaling, highly available across 3 AZs
Works with Amazon EKS or self-managed Kubernetes clusters
Uses PromQL (its own query language)
Data retention, stored for 180 days
Exam Tips
-
Grafana = best managed service for visualisation of container/IoT metrics
Prometheus = container metrics at scale
Managed Services -
Chapter 12 – Scaling
CloudWatch collects and aggregates usage data, such as CPU and network I/O, across your Auto Scaling instances. You use
these metrics to create scaling policies that adjust the number of instances in your Auto Scaling group as the selected
metric's value increases and decreases.
Only for EC2s.
Vertical – a bigger instance type, e.g. t3.micro -> t3.large
Horizontal – many Ec2s
How to Scale?
Launch template – all the settings needed for spinning up an EC2 instance, so you don't have to repeat the same wizard steps each time. Supports versioning, is used for auto scaling, and is more granular.
Vs
Launch Configurations – the "older version". Only for auto scaling, immutable, limited config options – not recommended (launch templates preferred).
Exam Tips
– For launch templates, know that they include the AMI, EC2 instance type/size, security groups and potentially networking info.
User data is included in the template and can run a script on first setup.
Can be versioned.
Use Elastic Load Balancer health checks to direct traffic to healthy instances; the Auto Scaling group can also be configured to use those health checks to terminate unhealthy instances.
Auto Scaling Groups - An Auto Scaling group contains a collection of EC2 instances that are treated as a logical grouping for
the purposes of automatic scaling and management.
Steps
1. Create LT
2. Networking & Purchasing – best to select the network (VPC / subnets etc) at this stage. Use multiple AZs to allow for high availability. For purchasing, decide whether Spot / Reserved / On-Demand instances are best.
3. ELB Config – you can put all the EC2s behind a load balancer, but CRUCIALLY you can use the health checks from the LB for scaling – otherwise you will only have the standard system checks.
4. Set Scaling Policies – min, max and desired capacity need to be set to ensure you don't run too many (or too few) resources.
5. Notifications – use SNS to inform you of scaling events.
Restrictions
Min = best to set a min of 2
Max = it’s the roof
Desired = how much you want right NOW.
Instance Warm up and down
Warmup
Instances have a grace period while they are being created. While an instance is being set up, its health check would probably fail (it needs to boot, run the user data script, install apps etc – this can take 10 minutes or more) and it would otherwise be terminated before it is fully ready, so it is not added to the Load Balancer until the warmup completes.
The default instance warmup lets you specify how long after an instance reaches the InService state it waits before counting towards the group's metrics.
When scaling out, Auto Scaling is aware of instances still in the warmup state, so it won't add too many instances if there's a big surge.
Cooldown
After your Auto Scaling group launches or terminates instances, it waits for a cooldown period to end before any further
scaling activities initiated by simple scaling policies can start. The intention of the cooldown period is to prevent your Auto
Scaling group from launching or terminating additional instances before the effects of previous activities are visible.
default 5mins
Types of scaling
Reactive – respond once the load is there: assess a metric (e.g. memory utilisation %), then determine whether you need to scale out or in.
Scheduled – predictable workload: have the capacity ready before it's needed, e.g. the finance team needs the accounts every Friday PM.
Predictive – Uses ML algorithms, determines when you’ll likely need to scale. Evaluated every 24 hours to work out needs
for the next 48 hours.
Niche EXAM Scenario:
Steady state auto scaling group – if you have old-school legacy software which can't be split across EC2 instances, the instance needs to stay online but only as a single copy. Use an Auto Scaling group with min, max and desired capacity all set to 1.
If the instance or its AZ goes down, it gets automatically rebuilt in another AZ.
Question
Your team has provisioned Auto Scaling groups in a single Region. The Auto Scaling groups, at max capacity, would total 40
EC2 On-Demand Instances between them. However, you notice that the Auto Scaling groups will only scale out to a portion
of that number of instances at any one time. What could be the problem?
Solution
There is a vCPU-based On-Demand Instance limit per Region.
Your AWS account has default quotas, formerly referred to as limits, for each AWS service. Unless otherwise noted, each quota is Region-specific. You can request increases for some quotas; other quotas cannot be increased. Remember that each EC2 instance can have a different number of vCPUs, depending on its type and your configuration, so the number of instances you can launch is capped by the Regional vCPU quota.
Example Q
A solutions architect is designing the architecture for a software demonstration environment. The environment will run on
Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer (ALB). The system will experience
significant increases in traffic during working hours but is not required to operate on weekends.
Which combination of actions should the solutions architect take to ensure that the system can scale to meet demand?
(Choose two.)
A. Use AWS Auto Scaling to adjust the ALB capacity based on request rate.
B. Use AWS Auto Scaling to scale the capacity of the VPC internet gateway.
C. Launch the EC2 instances in multiple AWS Regions to distribute the load across Regions.
D. Use a target tracking scaling policy to scale the Auto Scaling group based on instance CPU utilization.
E. Use scheduled scaling to change the Auto Scaling group minimum, maximum, and desired capacity to zero for
weekends. Revert to the default values at the start of the week
DE
Question
A transaction processing company has weekly scripted batch jobs that run on Amazon EC2 instances. The EC2 instances are
in an Auto Scaling group. The number of transactions can vary, but the baseline CPU utilization that is noted on each run is
at least 60%. The company needs to provision the capacity 30 minutes before the jobs run.
Currently, engineers complete this task by manually modifying the Auto Scaling group parameters. The company does not
have the resources to analyze the required capacity trends for the Auto Scaling group counts. The company needs an
automated way to modify the Auto Scaling group’s desired capacity.
Which solution will meet these requirements with the LEAST operational overhead?
A. Create a dynamic scaling policy for the Auto Scaling group. Configure the policy to scale based on the CPU utilization
metric. Set the target value for the metric to 60%.
B. Create a scheduled scaling policy for the Auto Scaling group. Set the appropriate desired capacity, minimum capacity,
and maximum capacity. Set the recurrence to weekly. Set the start time to 30 minutes before the batch jobs run.
C. Create a predictive scaling policy for the Auto Scaling group. Configure the policy to scale based on forecast. Set the
scaling metric to CPU utilization. Set the target value for the metric to 60%. In the policy, set the instances to pre-launch 30
minutes before the jobs run.
D. Create an Amazon EventBridge event to invoke an AWS Lambda function when the CPU utilization metric value for the
Auto Scaling group reaches 60%. Configure the Lambda function to increase the Auto Scaling group’s desired capacity and
maximum capacity by 20%.
Solution
A scheduled scaling policy allows you to set up specific times for your Auto Scaling group to scale out or scale in. By
creating a scheduled scaling policy for the Auto Scaling group, you can set the appropriate desired capacity, minimum
capacity, and maximum capacity, and set the recurrence to weekly. You can then set the start time to 30 minutes before
the batch jobs run, ensuring that the required capacity is provisioned before the jobs run.
Option C, creating a predictive scaling policy for the Auto Scaling group, is not necessary in this scenario since the company
does not have the resources to analyze the required capacity trends for the Auto Scaling group counts. This would require
analyzing the required capacity trends for the Auto Scaling group counts to determine the appropriate scaling policy.
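A minimal boto3 (Python) sketch of option B: a weekly scheduled action 30 minutes before the batch window, plus a scale-in action afterwards. The group name, capacities and cron expressions (evaluated in UTC) are placeholders.

import boto3

autoscaling = boto3.client("autoscaling")

# Scale out 30 minutes before the weekly batch jobs (placeholder: jobs assumed to start 02:00 UTC Saturday)
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="batch-workers",
    ScheduledActionName="pre-warm-weekly-batch",
    Recurrence="30 1 * * 6",
    MinSize=4,
    MaxSize=10,
    DesiredCapacity=8,
)

# Scale back in once the jobs have finished
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="batch-workers",
    ScheduledActionName="scale-in-after-batch",
    Recurrence="0 6 * * 6",
    MinSize=1,
    MaxSize=10,
    DesiredCapacity=1,
)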
Scale Relational DBs
Vertical – sometimes you do need a stronger spec instance, can’t horizontally scale out lots of t2.micros if the processing is
too much. More RAM/ CPU can be the best first thing to check.
Scaling Storage – Storage can be increased but not reduced. (NB Aurora Autoscale disk space in 10Gb increments)
Read Replicas – For read heavy workloads, spread Read only copies across AZ/ regions. (Up to 15 RRs with Aurora)
Aurora Serverless – for an unpredictable workload on Aurora, use Aurora Serverless.
Exam Tip
Scaling vs Refactoring – In the exam, refactoring to DynamoDB is a legitimate option. It changes the data model from relational to non-relational, which is a lot of work – but it scales easily and is a more managed AWS service.
Choose Aurora – whenever the question asks for a relational DB, Aurora is the tuned-up, more AWS-managed version of MySQL/PostgreSQL.
Q, A solutions architect is reviewing the resilience of an application. The solutions architect notices that a database
administrator recently failed over the application's Amazon Aurora PostgreSQL database writer instance as part of a scaling
exercise. The failover resulted in 3 minutes of downtime for the application.
Which solution will reduce the downtime for scaling exercises with the LEAST operational overhead?
A. Create more Aurora PostgreSQL read replicas in the cluster to handle the load during failover.
B. Set up a secondary Aurora PostgreSQL cluster in the same AWS Region. During failover, update the application to use the
secondary cluster's writer endpoint.
C. Create an Amazon ElastiCache for Memcached cluster to handle the load during failover.
D. Set up an Amazon RDS proxy for the database. Update the application to use the proxy endpoint.
D - The question is about the writer instance, not the readers. Amazon RDS Proxy automatically routes write requests to the healthy writer, minimizing downtime during failover.
Amazon RDS Proxy is a fully managed, highly available database proxy for Amazon Relational Database Service (RDS) that
makes applications more scalable, more resilient to database failures, and more secure.
Many applications, including those built on modern serverless architectures, can have a large number of open connections
to the database server and may open and close database connections at a high rate, exhausting database memory and
compute resources. Amazon RDS Proxy allows applications to pool and share connections established with the database,
improving database efficiency and application scalability.
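A minimal boto3 (Python) sketch of creating an RDS Proxy and registering the Aurora PostgreSQL cluster behind it; the proxy name, secret ARN, IAM role, subnet IDs and cluster identifier are placeholders. The application then only needs its connection string changed to the proxy endpoint.

import boto3

rds = boto3.client("rds")

# Create the proxy (placeholder secret, role and subnet IDs)
rds.create_db_proxy(
    DBProxyName="app-db-proxy",
    EngineFamily="POSTGRESQL",
    Auth=[{
        "AuthScheme": "SECRETS",
        "SecretArn": "arn:aws:secretsmanager:us-east-1:111122223333:secret:db-creds",
        "IAMAuth": "DISABLED",
    }],
    RoleArn="arn:aws:iam::111122223333:role/rds-proxy-secrets-role",
    VpcSubnetIds=["subnet-aaa1", "subnet-bbb2"],
)

# Point the proxy at the Aurora PostgreSQL cluster
rds.register_db_proxy_targets(
    DBProxyName="app-db-proxy",
    DBClusterIdentifiers=["app-aurora-cluster"],
)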
Question:
A company has deployed a web application on AWS. The company hosts the backend database on Amazon RDS for MySQL
with a primary DB instance and five read replicas to support scaling needs. The read replicas must lag no more than 1
second behind the primary DB instance. The database routinely runs scheduled stored procedures.
As traffic on the website increases, the replicas experience additional lag during periods of peak load. A solutions architect
must reduce the replication lag as much as possible. The solutions architect must minimize changes to the application code
and must minimize ongoing operational overhead.
Which solution will meet these requirements?
A. Migrate the database to Amazon Aurora MySQL. Replace the read replicas with Aurora Replicas, and configure Aurora
Auto Scaling. Replace the stored procedures with Aurora MySQL native functions.
B. Deploy an Amazon ElastiCache for Redis cluster in front of the database. Modify the application to check the cache
before the application queries the database. Replace the stored procedures with AWS Lambda functions.
C. Migrate the database to a MySQL database that runs on Amazon EC2 instances. Choose large, compute optimized EC2
instances for all replica nodes. Maintain the stored procedures on the EC2 instances.
D. Migrate the database to Amazon DynamoDB. Provision a large number of read capacity units (RCUs) to support the
required throughput, and configure on-demand capacity scaling. Replace the stored procedures with DynamoDB streams.
Answer:
Option A is the most appropriate solution for reducing replication lag without significant changes to the application code
and minimizing ongoing operational overhead. Migrating the database to Amazon Aurora MySQL allows for improved
replication performance and higher scalability compared to Amazon RDS for MySQL. Aurora Replicas provide faster
replication, reducing the replication lag, and Aurora Auto Scaling ensures that there are enough Aurora Replicas to handle
the incoming traffic. Additionally, Aurora MySQL native functions can replace the stored procedures, reducing the load on
the database and improving performance.
Option B is not the best solution since adding an ElastiCache for Redis cluster does not address the replication lag issue, and
the cache may not have the most up-to-date information. Additionally, replacing the stored procedures with AWS Lambda
functions adds additional complexity and may not improve performance.
Question
A company is launching an application on AWS. The application uses an Application Load Balancer (ALB) to direct traffic to
at least two Amazon EC2 instances in a single target group. The instances are in an Auto Scaling group for each
environment. The company requires a development environment and a production environment. The production
environment will have periods of high traffic.
Which solution will configure the development environment MOST cost-effectively?
A. Reconfigure the target group in the development environment to have only one EC2 instance as a target.
B. Change the ALB balancing algorithm to least outstanding requests.
C. Reduce the size of the EC2 instances in both environments.
D. Reduce the maximum number of EC2 instances in the development environment’s Auto Scaling group.
solution
Option A because it can't be option D as there should be at least two EC2 instances in Auto scaling group, and can't be
reduced to one as said in option D.
So, simply reconfigure the target group in the development environment to have only one EC2 instance as a target as said in
option A to reduce cost.
Scale Non Relational DBs (DynamoDB)
DynamoDB = fully managed, only control the data and the read/ write capacity.
The partition key is a field that acts as a primary key to keep each item distinct.
Provisioned – predictable workload.
On-demand – sporadic workload.
You can switch between the two, but only once every 24 hours, to stop people gaming the system (a sketch follows below).
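A minimal boto3 (Python) sketch of provisioned mode with auto scaling on read capacity; the table name and capacity limits are placeholders. A comment shows the on-demand alternative for sporadic traffic.

import boto3

dynamodb = boto3.client("dynamodb")
appscaling = boto3.client("application-autoscaling")

# For sporadic traffic you could instead switch the table to on-demand mode:
#   dynamodb.update_table(TableName="Orders", BillingMode="PAY_PER_REQUEST")
# (mode changes are limited to roughly one per 24 hours)

# Provisioned mode: let Application Auto Scaling track read capacity utilisation
appscaling.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/Orders",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    MinCapacity=5,
    MaxCapacity=500,
)
appscaling.put_scaling_policy(
    PolicyName="orders-read-target-tracking",
    ServiceNamespace="dynamodb",
    ResourceId="table/Orders",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {"PredefinedMetricType": "DynamoDBReadCapacityUtilization"},
    },
)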
Exam Qs:
A company hosts its application in the AWS Cloud. The application runs on Amazon EC2 instances behind an Elastic Load
Balancer in an Auto Scaling group and with an Amazon DynamoDB table. The company wants to ensure the application can
be made available in another AWS Region with minimal downtime.
What should a solutions architect do to meet these requirements with the LEAST amount of downtime?
A. Create an Auto Scaling group and a load balancer in the disaster recovery Region. Configure the DynamoDB table as a
global table. Configure DNS failover to point to the new disaster recovery Region's load balancer.
B. Create an AWS CloudFormation template to create EC2 instances, load balancers, and DynamoDB tables to be launched
when needed Configure DNS failover to point to the new disaster recovery Region's load balancer.
C. Create an AWS CloudFormation template to create EC2 instances and a load balancer to be launched when needed.
Configure the DynamoDB table as a global table. Configure DNS failover to point to the new disaster recovery Region's load
balancer.
D. Create an Auto Scaling group and load balancer in the disaster recovery Region. Configure the DynamoDB table as a
global table. Create an Amazon CloudWatch alarm to trigger an AWS Lambda function that updates Amazon Route 53
pointing to the disaster recovery load balancer.
A
Configure DNS failover: Use DNS failover to point the application's DNS record to the load balancer in the disaster recovery
Region. DNS failover allows you to route traffic to the disaster recovery Region in case of a failure in the primary Region.
Question:
A company is designing a new web application that will run on Amazon EC2 Instances. The application will use Amazon
DynamoDB for backend data storage. The application traffic will be unpredictable. The company expects that the
application read and write throughput to the database will be moderate to high. The company needs to scale in response to
application traffic.
Which DynamoDB table configuration will meet these requirements MOST cost-effectively?
A. Configure DynamoDB with provisioned read and write by using the DynamoDB Standard table class. Set DynamoDB auto
scaling to a maximum defined capacity.
B. Configure DynamoDB in on-demand mode by using the DynamoDB Standard table class.
C. Configure DynamoDB with provisioned read and write by using the DynamoDB Standard Infrequent Access (DynamoDB
Standard-IA) table class. Set DynamoDB auto scaling to a maximum defined capacity.
D. Configure DynamoDB in on-demand mode by using the DynamoDB Standard Infrequent Access (DynamoDB Standard-IA)
table class.
Answer
When cost is a major factor is best to utilize provisioned capacity mode. You can also buy capacity reservations in
provisioned capacity mode which helps reduce costs even more by committing to a 1 or 3 year capacity reservation.
On-demand mode is great for unpredictable traffic; had cost not been the main factor in this workload, option B would have been best.
Option A takes the win, burst capacity can help with handling the unpredictable aspect of the workload and application
autoscaling will ensure it scales easily. Provisioned capacity mode and possible reservations ensure its cost effective.
A company is using Amazon DynamoDB with provisioned throughput for the database tier of its ecommerce website.
During flash sales, customers experience periods of time when the database cannot handle the high number of transactions
taking place. This causes the company to lose transactions. During normal periods, the database performs appropriately.
Which solution solves the performance problem the company faces?
A. Switch DynamoDB to on-demand mode during flash sales.
B. Implement DynamoDB Accelerator for fast in memory performance.
C. Use Amazon Kinesis to queue transactions for processing to DynamoDB.
D. Use Amazon Simple Queue Service (Amazon SQS) to queue transactions to DynamoDB.
A
Chapter 13 – Decoupling Workflows
Loose Coupling: Always better, more scalable and highly available.
3 services that help:
1, SQS: Simple Queue Service – fully managed message queue service that allows you to decouple and scale microservices,
distributed systems and serverless applications.
A queue can sit where the load balancer would: it receives the messages from the frontend EC2s, and the backend then pulls from the queue – no live connection between the tiers is needed.
-
Poll based messaging – asynchronous messaging, frontend creates the message and then backend polls for the
message from SQS. (It’s not direct messaging, gets the message when ready).
SQS Settings (Learn!):
- Delivery delay: default 0 minutes, can go up to 15 minutes.
- Message size: up to 256 KB.
- Queue depth: can be used to trigger auto scaling.
- Messages are encrypted in transit by default and can be encrypted at rest too.
- Messages are retained for 4 days by default; configurable between 1 minute and 14 days.
- Short polling: the backend connects, asks for messages, disconnects and keeps reconnecting – uses more CPU. Long polling: the backend connects, asks for messages and then waits for them (usually the better choice).
Visibility Timeout: When a consumer receives and processes a message from a queue, the message remains in the queue.
Amazon SQS doesn't automatically delete the message. Because Amazon SQS is a distributed system, there's no guarantee
that the consumer actually receives the message (for example, due to a connectivity issue, or due to an issue in the
consumer application).
Immediately after a message is received, it remains in the queue. To prevent other consumers from processing the message again, Amazon SQS sets a visibility timeout, during which the message is hidden from other consumers.
The message is deleted once it has been processed successfully.
Typically, you should set the visibility timeout to the maximum time that it takes your application to process and delete a
message from the queue.
Dead Letter Queue: if, for example, a message contains an error and isn't getting processed properly, the backend will keep retrying it. You can set a retry limit, and once a message hits that limit it is moved to a second SQS queue – the dead-letter queue – which sidelines the unsuccessful messages.
Visibility timeout and the DL queue are for when you have issues in the queue – then use CloudWatch to monitor the queue depth (a sketch of these settings follows below).
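A minimal boto3 (Python) sketch wiring up a work queue with long polling, a visibility timeout, 4-day retention and a dead-letter queue; queue names and all values are placeholders.

import json
import boto3

sqs = boto3.client("sqs")

# Dead-letter queue for messages that repeatedly fail processing
dlq_url = sqs.create_queue(QueueName="orders-dlq")["QueueUrl"]
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# Main queue: long polling (20 s), visibility timeout sized to the processing time,
# 4-day retention, and redrive to the DLQ after 5 failed receives
queue_url = sqs.create_queue(
    QueueName="orders",
    Attributes={
        "ReceiveMessageWaitTimeSeconds": "20",
        "VisibilityTimeout": "120",
        "MessageRetentionPeriod": "345600",
        "RedrivePolicy": json.dumps({"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "5"}),
    },
)["QueueUrl"]

# Consumer: long poll, process, then delete so the message isn't redelivered
messages = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20)
for msg in messages.get("Messages", []):
    # ... process msg["Body"] here ...
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])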
SQS Message Ordering:
FIFO: guarantees ordering and no duplicates; limited to 300 messages a second and costs more due to deduplication.
The message group ID ensures messages within a group are processed in order.
Standard: best-effort ordering, occasional duplicates, unlimited transactions a second, cheaper.
2, SNS: Fully managed messaging service, can do App2App or App2Person.
If you want to take 1 notification and proactively deliver to an endpoint instead of leaving it in a queue.
Push Based Messaging: Frontend delivers it and backend has to be ready.
SNS Settings:
-
Subscribers – Kinesis Firehose, SQS, Lambda, email, http, SMS.
Message size 256kb
DLQueue support
FIFO or Standard – only use FIFO when combining with SQS.
Encrypted in transit but can be encrypted at rest too with a KMS key.
Can also use SQS DLQ for further processing and investigation into what went wrong.
3, API Gateway: Fully managed service that makes it easy for developers to create, publish, maintain and monitor APIs at
any scale.
Secure and Safe front door for your applications, allows you to put restrictions on the endpoints of API. If it’s a custom built
app, use API gateway to front it.
API Gateway better than using access keys in application
Features:
Security – easily protect the endpoints by attaching a web Application Firewall WAF (More Details chapter 16)
Stop abuse – can do DDOS protection and rate limiting to curb abuse of endpoints.
Easy peasy to setup.
Supports versioning.
A Solutions Architect wants to make sure that only AWS users or roles with suitable permissions can access a new Amazon
API Gateway endpoint. The Solutions
Architect wants an end-to-end view of each request to analyze the latency of the request and create service maps.
How can the Solutions Architect design the API Gateway access control and perform request inspections?
A. For the API Gateway method, set the authorization to AWS_IAM. Then, give the IAM user or role execute-api:Invoke
permission on the REST API resource. Enable the API caller to sign requests with AWS Signature when accessing the
endpoint. Use AWS X-Ray to trace and analyze user requests to API Gateway.
ACG – use S3 to store static content and then for dynamic (update user deets/ add comment) use API Gateway to kick off a
Lambda call which updates Aurora.
EXAM TIPS: Need to decide when best to use a LB over a messaging service
Fanout: you can combine SNS & SQS – subscribe multiple SQS queues to one topic, so the same message is delivered to multiple queues (see the sketch below). The SQS access policy needs to allow messages from the specific SNS topic, using the appropriate ARNs.
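A minimal boto3 (Python) sketch of the fan-out pattern, including the queue access policy that allows the specific topic to send; topic and queue names are placeholders.

import json
import boto3

sns = boto3.client("sns")
sqs = boto3.client("sqs")

topic_arn = sns.create_topic(Name="order-events")["TopicArn"]

for name in ["fulfilment-queue", "analytics-queue"]:
    queue_url = sqs.create_queue(QueueName=name)["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]

    # Access policy: allow only this SNS topic to send messages to the queue
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "sns.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
            "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
        }],
    }
    sqs.set_queue_attributes(QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)})

    sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)

# One publish now fans out to every subscribed queue
sns.publish(TopicArn=topic_arn, Message=json.dumps({"orderId": "123", "status": "PLACED"}))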
SES = Simple Email Service – this is a distractor; it's mostly for email marketing campaigns.
Question
A company is building a game system that needs to send unique events to separate leaderboard, matchmaking, and
authentication services concurrently. The company needs an AWS event-driven system that guarantees the order of the
events.
Which solution will meet these requirements?
A. Amazon EventBridge event bus
B. Amazon Simple Notification Service (Amazon SNS) FIFO topics
C. Amazon Simple Notification Service (Amazon SNS) standard topics
D. Amazon Simple Queue Service (Amazon SQS) FIFO queues
Solution
The answer is B. SNS FIFO topics should be combined with SQS FIFO queues in this case. The question asks for events to be delivered to different services in the correct order, so it is asking for an SNS fan-out to individual SQS queues with ordering guaranteed – i.e. SNS FIFO.
Question:
A retail company uses a regional Amazon API Gateway API for its public REST APIs. The API Gateway endpoint is a custom
domain name that points to an Amazon Route 53 alias record. A solutions architect needs to create a solution that has
minimal effects on customers and minimal data loss to release the new version of APIs.
Which solution will meet these requirements?
A. Create a canary release deployment stage for API Gateway. Deploy the latest API version. Point an appropriate percentage of traffic to the canary stage. After API verification, promote the canary stage to the production stage.
B. Create a new API Gateway endpoint with a new version of the API in OpenAPI YAML file format. Use the import-to-update operation in merge mode into the API in API Gateway. Deploy the new version of the API to the production stage.
C. Create a new API Gateway endpoint with a new version of the API in OpenAPI JSON file format. Use the import-to-update
operation in overwrite mode into the API in API Gateway. Deploy the new version of the API to the production stage.
D. Create a new API Gateway endpoint with new versions of the API definitions. Create a custom domain name for the new
API Gateway API. Point the Route 53 alias record to the new API Gateway API custom domain name.
A - keyword: "release the new version of APIs"
Canary release is a software development strategy in which a "new version of an API" (as well as other software) is
deployed for testing purposes.
A solutions architect is designing a REST API in Amazon API Gateway for a cash payback service. The application requires 1
GB of memory and 2 GB of storage for its computation resources. The application will require that the data is in a relational
format.
Which additional combination ofAWS services will meet these requirements with the LEAST administrative effort? (Choose
two.)
A. Amazon EC2
B. AWS Lambda
C. Amazon RDS
D. Amazon DynamoDB
E. Amazon Elastic Kubernetes Services (Amazon EKS)
B&C
A company provides an API interface to customers so the customers can retrieve their financial information. The company
expects a larger number of requests during peak usage times of the year.
The company requires the API to respond consistently with low latency to ensure customer satisfaction. The company
needs to provide a compute host for the API.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use an Application Load Balancer and Amazon Elastic Container Service (Amazon ECS).
B. Use Amazon API Gateway and AWS Lambda functions with provisioned concurrency.
C. Use an Application Load Balancer and an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.
D. Use Amazon API Gateway and AWS Lambda functions with reserved concurrency.
B - The company requires the API to respond consistently with low latency, especially during peak periods, and there is no mention of cost efficiency. Hence provisioned concurrency is the best option.
Provisioned concurrency is the number of pre-initialized execution environments you want to allocate to your function.
These execution environments are prepared to respond immediately to incoming function requests. Configuring
provisioned concurrency incurs charges to your AWS account
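A minimal sketch of configuring provisioned concurrency with boto3 – the function name, alias and count below are placeholders:

import boto3

lambda_client = boto3.client("lambda")

# Pre-initialise 100 execution environments on the "live" alias so requests
# during peak periods never hit a cold start.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="financial-info-api",
    Qualifier="live",                      # alias or version, not $LATEST
    ProvisionedConcurrentExecutions=100,
)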
AWS Batch
Batch processing is the method computers use to periodically complete high-volume, repetitive data jobs. Certain data
processing tasks, such as backups, filtering, and sorting, can be compute intensive and inefficient to run on individual data
transactions.
AWS Batch Definition:
- Workflows – allows you to run batch computing workloads within AWS, either on EC2 or Fargate.
- No need to worry about configuration and management of the infrastructure.
- Automatically provisions and scales compute resources and optimises distribution of workloads.
- No installation of batch computing software required.
AWS Batch Components (a minimal job-submission sketch follows this list):
- Job – unit of work submitted to AWS Batch; can be a script or a Docker image.
- Job Definition – specifies how the job will be run; the blueprint for the resources of the job.
- Job Queues – jobs are submitted to specific queues and reside there until scheduled to run in a compute environment.
- Compute Environment – set of managed or unmanaged compute resources used to run your jobs.
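A minimal boto3 sketch of submitting a job – the queue and job definition names are placeholders:

import boto3

batch = boto3.client("batch")

response = batch.submit_job(
    jobName="nightly-report",
    jobQueue="reporting-queue",            # the job waits here until scheduled
    jobDefinition="report-job-def:3",      # blueprint: image, vCPUs, memory, role
    containerOverrides={"environment": [{"name": "RUN_DATE", "value": "2024-01-01"}]},
)
print(response["jobId"])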
Fargate - AWS Fargate is a serverless, pay-as-you-go compute engine that lets you focus on building applications without
managing servers or clusters. AWS Fargate is compatible with both Amazon Elastic Container Service (Amazon ECS) and
Amazon Elastic Kubernetes Service (Amazon EKS)
Fargate is generally recommended – it is best at provisioning just the right amount of compute for each job.
Sometimes EC2 – when you need a custom AMI, more than 16 vCPUs or more than 120 GiB of memory, a GPU or Graviton CPU, specific Linux parameters, or a very large number of jobs (EC2 can dispatch them at a higher rate, giving better concurrency).
AWS Batch or Lambda
- Lambda has a 15-minute time limit; Batch has no time limit.
- Lambda has limited disk space.
- Lambda is fully serverless but is natively limited to its supported runtimes.
- Batch uses Docker, so any runtime can be used.
Developers describe AWS Batch as "Fully Managed Batch Processing at Any Scale". It enables developers, scientists, and
engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. It dynamically provisions
the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and
specific resource requirements of the batch jobs submitted. On the other hand, AWS Lambda is detailed as "Automatically
run code in response to modifications to objects in Amazon S3 buckets, messages in Kinesis streams, or updates in
DynamoDB". AWS Lambda is a compute service that runs your code in response to events and automatically manages the
underlying compute resources for you. You can use AWS Lambda to extend other AWS services with custom logic, or create
your own back-end services that operate at AWS scale, performance, and security.
In general, AWS Lambda can be more cost-effective for smaller, short-lived tasks or for event-driven computing use cases.
For long running or computation heavy tasks, AWS Batch can be more cost-effective, as it allows you to provision and
manage a more robust computing environment.
Sample use cases where AWS Lambda and AWS Batch fit well:
AWS Lambda can be used for image processing, such as resizing and watermarking, in real time.
AWS Lambda can be used for file processing as and when files are uploaded to or deleted from S3 buckets.
AWS Lambda can be used to send emails, notifications or messages.
AWS Batch can be used to process large amounts of data, such as converting raw data into a usable format or aggregating data from multiple sources.
AWS Batch can be used to run large-scale parallel processing workloads, such as weather forecasting.
Managed vs Unmanaged Compute Environments (a sketch of creating a managed Fargate environment follows this list)
Managed:
- AWS manages capacity and instance types.
- Compute resource specs are defined by you.
- ECS instances are launched into VPC subnets – the instances need access to the ECS service endpoint.
- By default, uses the most recent approved Amazon ECS AMI (you can use your own).
- Can leverage Fargate, Fargate Spot or EC2 Spot.
Unmanaged:
- You manage your own compute resources entirely.
- Need to use an AMI that meets the ECS AMI specifications.
- You manage the scaling and sizing.
- Good for complex or extremely specific requirements.
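A minimal boto3 sketch of creating a managed Fargate compute environment – the names, subnet, security group and role ARN are placeholders:

import boto3

batch = boto3.client("batch")

batch.create_compute_environment(
    computeEnvironmentName="fargate-ce",
    type="MANAGED",                        # AWS manages capacity and instance types
    computeResources={
        "type": "FARGATE",
        "maxvCpus": 16,
        "subnets": ["subnet-0abc1234"],
        "securityGroupIds": ["sg-0abc1234"],
    },
    serviceRole="arn:aws:iam::123456789012:role/AWSBatchServiceRole",
)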
Exam tips:
- Long running (15+ minutes): use AWS Batch.
- Batch is a managed service.
- Remember jobs, job definitions, job queues and compute environments.
- A common scenario is deciding between Batch and Lambda.
Question
A data analytics company wants to migrate its batch processing system to AWS. The company receives thousands of small
data files periodically during the day through FTP. An on-premises batch job processes the data files overnight. However,
the batch job takes hours to finish running.
The company wants the AWS solution to process incoming data files as soon as possible with minimal changes to the FTP
clients that send the files. The solution must delete the incoming data files after the files have been processed successfully.
Processing for each file needs to take 3-8 minutes.
Which solution will meet these requirements in the MOST operationally efficient way?
A. Use an Amazon EC2 instance that runs an FTP server to store incoming files as objects in Amazon S3 Glacier Flexible Retrieval. Configure a job queue in AWS Batch. Use Amazon EventBridge rules to invoke the job to process the objects nightly from S3 Glacier Flexible Retrieval. Delete the objects after the job has processed the objects.
B. Use an Amazon EC2 instance that runs an FTP server to store incoming files on an Amazon Elastic Block Store (Amazon EBS) volume. Configure a job queue in AWS Batch. Use Amazon EventBridge rules to invoke the job to process the files nightly from the EBS volume. Delete the files after the job has processed the files.
C. Use AWS Transfer Family to create an FTP server to store incoming files on an Amazon Elastic Block Store (Amazon EBS) volume. Configure a job queue in AWS Batch. Use an Amazon S3 event notification when each file arrives to invoke the job in AWS Batch. Delete the files after the job has processed the files.
D. Use AWS Transfer Family to create an FTP server to store incoming files in Amazon S3 Standard. Create an AWS Lambda function to process the files and to delete the files after they are processed. Use an S3 event notification to invoke the Lambda function when the files arrive.
Solution
Use AWS Transfer Family for the FTP server to receive files directly into S3. This avoids managing FTP servers.
Process each file as soon as it arrives using Lambda triggered by S3 events. Lambda provides fast processing time per file.
Lambda can also delete files after processing succeeds.
Options A, B, C involve more operational overhead of managing FTP servers and batch jobs. Processing latency would be
higher waiting for batch windows.
Storing files in Glacier (Option A) adds latency for retrieving files.
Process incoming data files with minimal changes to the FTP clients that send the files = AWS Transfer Family.
Process incoming data files as soon as possible = S3 event notification.
Processing for each file needs to take 3-8 minutes = AWS Lambda function.
Delete file after processing = AWS Lambda function.
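A minimal sketch of the Lambda handler in option D – the bucket and key come from the S3 event notification, and process() stands in for the 3-8 minute processing step:

from urllib.parse import unquote_plus

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded

        obj = s3.get_object(Bucket=bucket, Key=key)
        process(obj["Body"].read())        # assumed processing step

        # Delete the incoming file only after successful processing.
        s3.delete_object(Bucket=bucket, Key=key)

def process(data):
    # Placeholder for the file-processing logic.
    ...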
Question:
A company is migrating an old application to AWS. The application runs a batch job every hour and is CPU intensive. The
batch job takes 15 minutes on average with an on-premises server. The server has 64 virtual CPU (vCPU) and 512 GiB of
memory.
Which solution will run the batch job within 15 minutes with the LEAST operational overhead?
A. Use AWS Lambda with functional scaling.
B. Use Amazon Elastic Container Service (Amazon ECS) with AWS Fargate.
C. Use Amazon Lightsail with AWS Auto Scaling.
D. Use AWS Batch on Amazon EC2.
Answer:
D
Question
An ecommerce company needs to run a scheduled daily job to aggregate and filter sales records for analytics. The company
stores the sales records in an Amazon S3 bucket. Each object can be up to 10 GB in size. Based on the number of sales
events, the job can take up to an hour to complete. The CPU and memory usage of the job are constant and are known in
advance.
A solutions architect needs to minimize the amount of operational effort that is needed for the job to run.
Which solution meets these requirements?
A. Create an AWS Lambda function that has an Amazon EventBridge notification. Schedule the EventBridge event to run
once a day.
B. Create an AWS Lambda function. Create an Amazon API Gateway HTTP API, and integrate the API with the function.
Create an Amazon EventBridge scheduled event that calls the API and invokes the function.
C. Create an Amazon Elastic Container Service (Amazon ECS) cluster with an AWS Fargate launch type. Create an Amazon
EventBridge scheduled event that launches an ECS task on the cluster to run the job.
D. Create an Amazon Elastic Container Service (Amazon ECS) cluster with an Amazon EC2 launch type and an Auto Scaling
group with at least one EC2 instance. Create an Amazon EventBridge scheduled event that launches an ECS task on the
cluster to run the job.
Solution
The requirement is to run a daily scheduled job to aggregate and filter sales records for analytics in the most efficient way
possible. Based on the requirement, we can eliminate option A and B since they use AWS Lambda which has a limit of 15
minutes of execution time, which may not be sufficient for a job that can take up to an hour to complete.
Between options C and D, option C is the better choice since it uses AWS Fargate which is a serverless compute engine for
containers that eliminates the need to manage the underlying EC2 instances, making it a low operational effort solution.
Additionally, Fargate also provides instant scale-up and scale-down capabilities to run the scheduled job as per the
requirement.
Therefore, the correct answer is:
C. Create an Amazon Elastic Container Service (Amazon ECS) cluster with an AWS Fargate launch type. Create an Amazon
EventBridge scheduled event that launches an ECS task on the cluster to run the job.
Amazon MQ
Message brokers allow software systems, which often use different programming languages on various platforms, to communicate and exchange information. Amazon MQ is a managed message broker service for Apache ActiveMQ and
RabbitMQ that streamlines setup, operation, and management of message brokers on AWS. With a few steps, Amazon MQ
can provision your message broker with support for software version upgrades.
- Can have one broker in a single AZ, or a cluster deployment of 3 broker nodes across multiple AZs behind a Network Load Balancer.
- Highly available – use Amazon MQ with active/standby brokers configured across two Availability Zones.
- Private networking only.
- Single-instance brokers are ideal for dev environments.
- Specific messaging protocols – JMS, AMQP 0-9-1, MQTT, OpenWire and STOMP.
SQS or MQ
- If migrating an existing messaging system, lean towards MQ.
- Setting up a new one, use SQS – easier and works well in the AWS environment.
- Private networking, VPC and Direct Connect = MQ; public = SNS/SQS.
- MQ has no default AWS integrations, unlike SQS.
SQS is a simple queueing service. It doesn't support many higher level abstractions like message routing, fanouts,
distribution lists etc. It is a queue - a message is produced, and a message is delivered. It is useful when you need a Queue
with limited backing logic.
AWS MQ is a managed Apache ActiveMQ(or RabbitMQ) broker service.
This provides you a fully managed Apache ActiveMQ system in the cloud, with support for a variety of industry-standard
queue and broadcast protocols like AMQP, JMS etc. It is useful when you have complicated delivery rules - or when you're
migrating an existing system from outside AWS into AWS, and your systems happen to talk to one another with a standard
queueing protocol.
In simple terms, Amazon SQS is a simplified version of ActiveMQ. In ActiveMQ you have the concept of a broker, which contains one or more queues.
Exam tip
Look out for mentions of the messaging protocols or of Apache ActiveMQ and RabbitMQ.
Question
A company runs its ecommerce application on AWS. Every new order is published as a message in a RabbitMQ queue that
runs on an Amazon EC2 instance in a single Availability Zone. These messages are processed by a different application that
runs on a separate EC2 instance. This application stores the details in a PostgreSQL database on another EC2 instance. All
the EC2 instances are in the same Availability Zone.
The company needs to redesign its architecture to provide the highest availability with the least operational overhead.
What should a solutions architect do to meet these requirements?
A. Migrate the queue to a redundant pair (active/standby) of RabbitMQ instances on Amazon MQ. Create a Multi-AZ Auto
Scaling group for EC2 instances that host the application. Create another Multi-AZ Auto Scaling group for EC2 instances that
host the PostgreSQL database.
B. Migrate the queue to a redundant pair (active/standby) of RabbitMQ instances on Amazon MQ. Create a Multi-AZ Auto
Scaling group for EC2 instances that host the application. Migrate the database to run on a Multi-AZ deployment of Amazon
RDS for PostgreSQL.
C. Create a Multi-AZ Auto Scaling group for EC2 instances that host the RabbitMQ queue. Create another Multi-AZ Auto
Scaling group for EC2 instances that host the application. Migrate the database to run on a Multi-AZ deployment of Amazon
RDS for PostgreSQL.
D. Create a Multi-AZ Auto Scaling group for EC2 instances that host the RabbitMQ queue. Create another Multi-AZ Auto
Scaling group for EC2 instances that host the application. Create a third Multi-AZ Auto Scaling group for EC2 instances that
host the PostgreSQL database
Solution
Migrating to Amazon MQ reduces the overhead of queue management, so C and D are dismissed.
Deciding between A and B means choosing, for the database, between an Auto Scaling group of EC2 instances and RDS for PostgreSQL (both Multi-AZ). The RDS option has less operational impact, because the required tools and software are provided as a managed service – consider, for instance, the effort of adding an extra node such as a read replica to a self-managed database.
B
Question
A company recently migrated a message processing system to AWS. The system receives messages into an ActiveMQ queue
running on an Amazon EC2 instance. Messages are processed by a consumer application running on Amazon EC2. The
consumer application processes the messages and writes results to a MySQL database running on Amazon EC2. The
company wants this application to be highly available with low operational complexity.
Which architecture offers the HIGHEST availability?
A. Add a second ActiveMQ server to another Availability Zone. Add an additional consumer EC2 instance in another
Availability Zone. Replicate the MySQL database to another Availability Zone.
B. Use Amazon MQ with active/standby brokers configured across two Availability Zones. Add an additional consumer EC2
instance in another Availability Zone. Replicate the MySQL database to another Availability Zone.
C. Use Amazon MQ with active/standby brokers configured across two Availability Zones. Add an additional consumer EC2
instance in another Availability Zone. Use Amazon RDS for MySQL with Multi-AZ enabled.
D. Use Amazon MQ with active/standby brokers configured across two Availability Zones. Add an Auto Scaling group for the
consumer EC2 instances across two Availability Zones. Use Amazon RDS for MySQL with Multi-AZ enabled.
Solution
Amazon MQ active/standby brokers across AZs for queue high availability
Auto Scaling group with consumer EC2 instances across AZs for redundant processing
RDS MySQL with Multi-AZ for database high availability
This combines the HA capabilities of MQ, EC2 and RDS to maximize fault tolerance across all components. The auto scaling
also provides flexibility to scale processing capacity as needed.
D
AWS Step Functions
AWS Step Functions is a visual workflow service that helps developers use AWS services to build distributed applications, automate processes, orchestrate microservices, and create data and machine learning (ML) pipelines. It lets you combine AWS Lambda functions with other AWS services.
AWS Step Functions lets you add resilient workflow automation to your applications in minutes—without writing code.
Workflows built with Step Functions include built-in error handling, parameter passing, recommended security settings, and
state management, reducing the amount of code you have to write and maintain.
Standard vs Express
Standard:
- Exactly-once execution.
- Can run for up to 1 year.
- Good for long-running workflows that need an auditable history.
- Up to 2,000 executions started per second.
- Priced per state transition.
Express:
- At-least-once execution.
- Runs for up to 5 minutes.
- Good for high-event-rate workloads, e.g. IoT data streaming and ingestion.
- Pricing based on number of executions, duration and memory consumed.
State machine
- Provides a graphical console to visualize each step of the workflow.
Exam Tips (a minimal state machine sketch follows this list):
- When a question asks for a serverless orchestration service for event-driven tasks, think Step Functions.
- Know Express vs Standard.
- Workflows are written in the Amazon States Language.
- States are the elements within a state machine – every step in a workflow is a state.
- Many integrations – Lambda, API Gateway & EventBridge.
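A minimal sketch of a state machine definition in the Amazon States Language, created with boto3 – the Lambda and IAM role ARNs are placeholders:

import json
import boto3

sfn = boto3.client("stepfunctions")

# Two Lambda task states chained together; each state is a step in the workflow.
definition = {
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:validate",
            "Next": "ProcessPayment",
        },
        "ProcessPayment": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:pay",
            "End": True,
        },
    },
}

sfn.create_state_machine(
    name="order-workflow",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsRole",
    type="STANDARD",                       # or "EXPRESS" for short, high-volume runs
)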
Question
An ecommerce company is building a distributed application that involves several serverless functions and AWS services to
complete order-processing tasks. These tasks require manual approvals as part of the workflow. A solutions architect needs
to design an architecture for the order-processing application. The solution must be able to combine multiple AWS Lambda
functions into responsive serverless applications. The solution also must orchestrate data and services that run on Amazon
EC2 instances, containers, or on-premises servers.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use AWS Step Functions to build the application.
B. Integrate all the application components in an AWS Glue job.
C. Use Amazon Simple Queue Service (Amazon SQS) to build the application.
D. Use AWS Lambda functions and Amazon EventBridge events to build the application.
Answer
AWS Step Functions allows you to easily coordinate multiple Lambda functions and services into responsive serverless workflows with visual workflows. Step Functions is designed for building distributed applications that combine services and require human approval steps. A
Question
A company is moving its data management application to AWS. The company wants to transition to an event-driven
architecture. The architecture needs to be more distributed and to use serverless concepts while performing the different
aspects of the workflow. The company also wants to minimize operational overhead.
Which solution will meet these requirements?
A. Build out the workflow in AWS Glue. Use AWS Glue to invoke AWS Lambda functions to process the workflow steps.
B. Build out the workflow in AWS Step Functions. Deploy the application on Amazon EC2 instances. Use Step Functions to
invoke the workflow steps on the EC2 instances.
C. Build out the workflow in Amazon EventBridge. Use EventBridge to invoke AWS Lambda functions on a schedule to
process the workflow steps.
D. Build out the workflow in AWS Step Functions. Use Step Functions to create a state machine. Use the state machine to
invoke AWS Lambda functions to process the workflow steps.
Solution
Answer is D.
Step Functions is based on state machines and tasks. A state machine is a workflow. A task is a state in a workflow that
represents a single unit of work that another AWS service performs. Each step in a workflow is a state.
Depending on your use case, you can have Step Functions call AWS services, such as Lambda, to perform tasks.
Amazon AppFlow
With Amazon AppFlow you can automate bi-directional data flows between SaaS applications and AWS services in just a few clicks.
Run the data flows at the frequency you choose, whether on a schedule, in response to a business event, or on demand.
Simplify data preparation with transformations, partitioning, and aggregation. Automate preparation and registration of
your schema with the AWS Glue Data Catalog so you can discover and share data with AWS analytics and machine learning
services.
- e.g. getting data from Salesforce into S3 storage.
Use Cases:
- Transferring data from Salesforce to S3/Redshift
- Ingesting and analysing Slack conversations in S3
- Sending Zendesk support tickets to Snowflake
- Transferring aggregated data on a scheduled basis to S3
Question
A company wants to securely exchange data between its software as a service (SaaS) application Salesforce account and
Amazon S3. The company must encrypt the data at rest by using AWS Key Management Service (AWS KMS) customer
managed keys (CMKs). The company must also encrypt the data in transit. The company has enabled API access for the
Salesforce account.
Which solution will meet these requirements?
A. Create AWS Lambda functions to transfer the data securely from Salesforce to Amazon S3.
B. Create an AWS Step Functions workflow. Define the task to transfer the data securely from Salesforce to Amazon S3.
C. Create Amazon AppFlow flows to transfer the data securely from Salesforce to Amazon S3.
D. Create a custom connector for Salesforce to transfer the data securely from Salesforce to Amazon S3.
Solution
- Amazon AppFlow can securely transfer data between Salesforce and Amazon S3.
- AppFlow supports encrypting data at rest in S3 using KMS CMKs.
- AppFlow supports encrypting data in transit using HTTPS/TLS.
- AppFlow provides built-in support and templates for Salesforce and S3, requiring less custom configuration than solutions like Lambda, Step Functions, or custom connectors.
The answer is C.
Question
A company's application integrates with multiple software-as-a-service (SaaS) sources for data collection. The company runs
Amazon EC2 instances to receive the data and to upload the data to an Amazon S3 bucket for analysis. The same EC2
instance that receives and uploads the data also sends a notification to the user when an upload is complete. The company
has noticed slow application performance and wants to improve the performance as much as possible.
Which solution will meet these requirements with the LEAST operational overhead?
A. Create an Auto Scaling group so that EC2 instances can scale out. Configure an S3 event notification to send events to an
Amazon Simple Notification Service (Amazon SNS) topic when the upload to the S3 bucket is complete.
B. Create an Amazon AppFlow flow to transfer data between each SaaS source and the S3 bucket. Configure an S3 event
notification to send events to an Amazon Simple Notification Service (Amazon SNS) topic when the upload to the S3 bucket
is complete.
C. Create an Amazon EventBridge (Amazon CloudWatch Events) rule for each SaaS source to send output data. Configure
the S3 bucket as the rule's target. Create a second EventBridge (Cloud Watch Events) rule to send events when the upload
to the S3 bucket is complete. Configure an Amazon Simple Notification Service (Amazon SNS) topic as the second rule's
target.
D. Create a Docker container to use instead of an EC2 instance. Host the containerized application on Amazon Elastic
Container Service (Amazon ECS). Configure Amazon CloudWatch Container Insights to send events to an Amazon Simple
Notification Service (Amazon SNS) topic when the upload to the S3 bucket is complete.
Solution
Option A suggests using an Auto Scaling group to scale out EC2 instances, but it does not address the potential bottleneck
of slow application performance and the notification process.
Option C involves using Amazon EventBridge (CloudWatch Events) rules for data output and S3 uploads, but it introduces
additional complexity with separate rules and does not specifically address the slow application performance.
Option D suggests containerizing the application and using Amazon Elastic Container Service (Amazon ECS) with
CloudWatch Container Insights, which may involve more operational overhead and setup compared to the simpler solution
provided by Amazon AppFlow.
Therefore, option B offers the most streamlined solution with the least operational overhead by utilizing Amazon AppFlow
for data transfer, configuring S3 event notifications for upload completion, and leveraging Amazon SNS for notifications
without requiring additional infrastructure management.
Chapter 14 Big Data
Amazon Redshift
Fully managed petabyte-scale data warehouse (up to 16 PB) – a very large relational DB (think RDS on a much larger scale).
Can use SQL and BI tools.
A Redshift Multi-AZ deployment leverages compute resources in multiple AZs to scale data warehouse workload
processing. In situations where there is a high level of concurrency Redshift will automatically leverage the resources
in both AZs to scale the workload for both read and write requests using active-active processing.
A Multi-AZ deployment is provisioned in multiple Availability Zones simultaneously: Amazon Redshift automatically selects two subnets from two different Availability Zones and deploys an equal number of compute nodes in each Availability Zone.
Question
A company is migrating an on-premises application to AWS. The company wants to use Amazon Redshift as a solution.
Which use cases are suitable for Amazon Redshift in this scenario? (Choose three.)
A. Supporting data APIs to access data with traditional, containerized, and event-driven applications
B. Supporting client-side and server-side encryption
C. Building analytics workloads during specified hours and when the application is not active
D. Caching data to reduce the pressure on the backend database
E. Scaling globally to support petabytes of data and tens of millions of requests per minute
F. Creating a secondary replica of the cluster by using the AWS Management Console
Solution
Amazon Redshift is a data warehouse solution, so it is suitable for:
-Supporting encryption (client-side and server-side)
-Handling analytics workloads, especially during off-peak hours when the application is less active
-Scaling to large amounts of data and high query volumes for analytics purposes
The following options are incorrect because:
A) Data APIs are not typically used with Redshift. It is more for running SQL queries and analytics.
D) Redshift is not typically used for caching data. It is for analytics and data warehouse purposes.
F) Redshift clusters do not create replicas in the management console. They are standalone clusters. you could create DR
cluster from snapshot and restore to another region (automated or manual) but I do not think this what is meant in this
option.
Amazon EMR (Elastic MapReduce)
ETL: Extract, Transform, Load – turning raw data into useful data.
EMR:
- Managed big data platform (a fleet of EC2 instances running open-source tools) that allows you to process vast amounts of data.
- Hadoop, HBase, Spark & Presto.
- Can use Reserved Instances and Spot Instances to save money.
You can use Amazon EMR with a customized version of Hive that includes connectivity to DynamoDB to perform operations on data stored in DynamoDB.
Amazon Kinesis
- Ingest, process and analyse real-time streaming data.
Kinesis Data Streams
- As fast as possible, real time.
- You are responsible for the consumers and for scaling the solution.
- More work to manage.
Kinesis Data Firehose
- Data transfer tool to get data into S3, Redshift, Splunk or Elasticsearch/OpenSearch.
- Near real time – up to 60 seconds.
- Managed solution.
Kinesis vs SQS
- SQS – a messaging broker that is simple but doesn't provide real-time delivery.
- Kinesis – more complicated to configure but is real time and good for big data.
Kinesis Data Analytics – the easiest way to process data going through Kinesis, using SQL.
Amazon Athena
Interactive query service that makes it easy to analyse data in S3 using SQL without having to load it into a database.
AWS Glue
Serverless data integration service – perform ETL without having to manage EC2 servers (kind of replaces EMR).
Glue & Athena: Glue structures the unstructured data, Athena then runs queries on it without having to load it into a database – the results could then be visualized in Amazon QuickSight (dashboarding). A minimal sketch of the combination follows the exam tip below.
Exam Tip: when a question asks for a serverless SQL solution, Athena is the best option – it is the only service that allows you to directly query data stored in S3.
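A minimal boto3 sketch of the Glue + Athena combination – the crawler, database, table and result bucket names are placeholders:

import boto3

glue = boto3.client("glue")
athena = boto3.client("athena")

# Glue crawls the raw S3 data and adds/updates table definitions in the Data Catalog.
glue.start_crawler(Name="sales-data-crawler")

# Athena then queries the catalogued S3 data directly with SQL - no database to load.
athena.start_query_execution(
    QueryString="SELECT region, SUM(amount) FROM sales GROUP BY region",
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://query-results-bucket/"},
)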
Question
A company’s reporting system delivers hundreds of .csv files to an Amazon S3 bucket each day. The company must convert
these files to Apache Parquet format and must store the files in a transformed data bucket.
Which solution will meet these requirements with the LEAST development effort?
A. Create an Amazon EMR cluster with Apache Spark installed. Write a Spark application to transform the data. Use EMR
File System (EMRFS) to write files to the transformed data bucket.
B. Create an AWS Glue crawler to discover the data. Create an AWS Glue extract, transform, and load (ETL) job to transform
the data. Specify the transformed data bucket in the output step.
C. Use AWS Batch to create a job definition with Bash syntax to transform the data and output the data to the transformed
data bucket. Use the job definition to submit a job. Specify an array job as the job type.
D. Create an AWS Lambda function to transform the data and output the data to the transformed data bucket. Configure an
event notification for the S3 bucket. Specify the Lambda function as the destination for the event notification.
Solution
AWS Glue is a fully managed ETL service that simplifies the process of preparing and transforming data for analytics. Using
AWS Glue requires minimal development effort compared to the other options.
Option A requires more development effort as it involves writing a Spark application to transform the data. It also
introduces additional infrastructure management with the EMR cluster.
Option C requires writing and managing custom Bash scripts for data transformation. It requires more manual effort and
does not provide the built-in capabilities of AWS Glue for data transformation.
Option D requires developing and managing a custom Lambda for data transformation. While Lambda can handle the
transformation, it requires more effort compared to AWS Glue, which is specifically designed for ETL operations.
Therefore, option B provides the easiest and least development effort by leveraging AWS Glue's capabilities for data
discovery, transformation, and output to the transformed data bucket.
Question
A company is running an application in the AWS Cloud. The application collects and stores a large amount of unstructured
data in an Amazon S3 bucket. The S3 bucket contains several terabytes of data and uses the S3 Standard storage class. The
data increases in size by several gigabytes every day.
The company needs to query and analyze the data. The company does not access data that is more than 1 year old.
However, the company must retain all the data indefinitely for compliance reasons.
Which solution will meet these requirements MOST cost-effectively?
A. Use S3 Select to query the data. Create an S3 Lifecycle policy to transition data that is more than 1 year old to S3 Glacier
Deep Archive.
B. Use Amazon Redshift Spectrum to query the data. Create an S3 Lifecycle policy to transition data that is more than 1 year
old to S3 Glacier Deep Archive.
C. Use an AWS Glue Data Catalog and Amazon Athena to query the data. Create an S3 Lifecycle policy to transition data that
is more than 1 year old to S3 Glacier Deep Archive.
D. Use Amazon Redshift Spectrum to query the data. Create an S3 Lifecycle policy to transition data that is more than 1 year
old to S3 Intelligent-Tiering.
Solution
The other options are not correct because:
A. Using S3 Select is good for filtering data in S3, but it may not be a suitable solution for querying and analyzing large
amounts of data.
B. Amazon Redshift Spectrum can be used to query data stored in S3, but it may not be as cost-effective as using Amazon
Athena for querying unstructured data
D. Using Amazon Redshift Spectrum with S3 Intelligent-Tiering could be a good solution, but S3 Intelligent-Tiering is
designed to optimize storage costs based on access patterns and it would not be the best solution for compliance reasons
as S3 Intelligent-Tiering will move data to other storage classes according to access patterns.
The correct answer is C. Use an AWS Glue Data Catalog and Amazon Athena to query the data. Create an S3 Lifecycle policy
to transition data that is more than 1 year old to S3 Glacier Deep Archive.
This solution allows you to use Amazon Athena and the AWS Glue Data Catalog to query and analyze the data in an S3
bucket. Amazon Athena is a serverless, interactive query service that allows you to analyze data in S3 using SQL. The AWS
Glue Data Catalog is a managed metadata repository that can be used to store and retrieve table definitions for data stored
in S3. Together, these services can provide a cost-effective way to query and analyze large amounts of unstructured data.
Additionally, by using an S3 Lifecycle policy to transition data that is more than 1 year old to S3 Glacier Deep Archive, you
can retain the data indefinitely for compliance reasons while also reducing storage costs.
Question
A solutions architect manages an analytics application. The application stores large amounts of semistructured data in an
Amazon S3 bucket. The solutions architect wants to use parallel data processing to process the data more quickly. The
solutions architect also wants to use information that is stored in an Amazon Redshift database to enrich the data.
Which solution will meet these requirements?
A. Use Amazon Athena to process the S3 data. Use AWS Glue with the Amazon Redshift data to enrich the S3 data.
B. Use Amazon EMR to process the S3 data. Use Amazon EMR with the Amazon Redshift data to enrich the S3 data.
C. Use Amazon EMR to process the S3 data. Use Amazon Kinesis Data Streams to move the S3 data into Amazon Redshift so
that the data can be enriched.
D. Use AWS Glue to process the S3 data. Use AWS Lake Formation with the Amazon Redshift data to enrich the S3 data.
Solution
Option B is the correct solution that meets the requirements:
Use Amazon EMR to process the semi-structured data in Amazon S3. EMR provides a managed Hadoop framework
optimized for processing large datasets in S3.
EMR supports parallel data processing across multiple nodes to speed up the processing.
EMR can integrate directly with Amazon Redshift using the EMR-Redshift integration. This allows querying the Redshift data
from EMR and joining it with the S3 data.
This enables enriching the semi-structured S3 data with the information stored in Redshift
Question
A company needs to ingest and handle large amounts of streaming data that its application generates. The application runs
on Amazon EC2 instances and sends data to Amazon Kinesis Data Streams, which is configured with default settings. Every
other day, the application consumes the data and writes the data to an Amazon S3 bucket for business intelligence (BI)
processing. The company observes that Amazon S3 is not receiving all the data that the application sends to Kinesis Data
Streams.
What should a solutions architect do to resolve this issue?
A. Update the Kinesis Data Streams default settings by modifying the data retention period.
B. Update the application to use the Kinesis Producer Library (KPL) to send the data to Kinesis Data Streams.
C. Update the number of Kinesis shards to handle the throughput of the data that is sent to Kinesis Data Streams.
D. Turn on S3 Versioning within the S3 bucket to preserve every version of every object that is ingested in the S3 bucket.
Solution
Sneaky one – you would normally increase shards for more throughput, but the question mentions the Kinesis Data Streams default settings and "every other day". With the default retention period of 24 hours, the data is no longer in the stream by the time the application consumes it, so the retention period must be increased. - A
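A minimal boto3 sketch of the fix – raising the stream's retention above the 24-hour default so the every-other-day consumer still finds the data (stream name and hours are illustrative):

import boto3

kinesis = boto3.client("kinesis")

kinesis.increase_stream_retention_period(
    StreamName="app-events",
    RetentionPeriodHours=72,               # default is 24; maximum is 8760 (365 days)
)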
Question
A company needs to configure a real-time data ingestion architecture for its application. The company needs an API, a
process that transforms data as the data is streamed, and a storage solution for the data.
Which solution will meet these requirements with the LEAST operational overhead?
A. Deploy an Amazon EC2 instance to host an API that sends data to an Amazon Kinesis data stream. Create an Amazon
Kinesis Data Firehose delivery stream that uses the Kinesis data stream as a data source. Use AWS Lambda functions to
transform the data. Use the Kinesis Data Firehose delivery stream to send the data to Amazon S3.
B. Deploy an Amazon EC2 instance to host an API that sends data to AWS Glue. Stop source/destination checking on the
EC2 instance. Use AWS Glue to transform the data and to send the data to Amazon S3.
C. Configure an Amazon API Gateway API to send data to an Amazon Kinesis data stream. Create an Amazon Kinesis Data
Firehose delivery stream that uses the Kinesis data stream as a data source. Use AWS Lambda functions to transform the
data. Use the Kinesis Data Firehose delivery stream to send the data to Amazon S3.
D. Configure an Amazon API Gateway API to send data to AWS Glue. Use AWS Lambda functions to transform the data. Use
AWS Glue to send the data to Amazon S3.
Solution
The company needs an API = Amazon API Gateway API
A real-time data ingestion = Amazon Kinesis data stream
A process that transforms data = AWS Lambda functions
Kinesis Data Firehose delivery stream to send the data to Amazon S3.
A storage solution for the data = Amazon S3.
This combination matches option C.
Question
A company wants to ingest customer payment data into the company's data lake in Amazon S3. The company receives
payment data every minute on average. The company wants to analyze the payment data in real time. Then the company
wants to ingest the data into the data lake.
Which solution will meet these requirements with the MOST operational efficiency?
A. Use Amazon Kinesis Data Streams to ingest data. Use AWS Lambda to analyze the data in real time.
B. Use AWS Glue to ingest data. Use Amazon Kinesis Data Analytics to analyze the data in real time.
C. Use Amazon Kinesis Data Firehose to ingest data. Use Amazon Kinesis Data Analytics to analyze the data in real time.
D. Use Amazon API Gateway to ingest data. Use AWS Lambda to analyze the data in real time.
Solution
By leveraging the combination of Amazon Kinesis Data Firehose and Amazon Kinesis Data Analytics, you can efficiently
ingest and analyze the payment data in real time without the need for manual processing or additional infrastructure
management. This solution provides a streamlined and scalable approach to handle continuous data ingestion and analysis
requirements. - C
Question
A company has two applications: a sender application that sends messages with payloads to be processed and a processing
application intended to receive the messages with payloads. The company wants to implement an AWS service to handle
messages between the two applications. The sender application can send about 1,000 messages each hour. The messages
may take up to 2 days to be processed. If the messages fail to process, they must be retained so that they do not impact the
processing of any remaining messages.
Which solution meets these requirements and is the MOST operationally efficient?
A. Set up an Amazon EC2 instance running a Redis database. Configure both applications to use the instance. Store, process,
and delete the messages, respectively.
B. Use an Amazon Kinesis data stream to receive the messages from the sender application. Integrate the processing
application with the Kinesis Client Library (KCL).
C. Integrate the sender and processor applications with an Amazon Simple Queue Service (Amazon SQS) queue. Configure a
dead-letter queue to collect the messages that failed to process.
D. Subscribe the processing application to an Amazon Simple Notification Service (Amazon SNS) topic to receive
notifications to process. Integrate the sender application to write to the SNS topic.
Solution
SQS provides a fully managed message queuing service that meets all the requirements:
SQS can handle the sending and processing of 1,000 messages per hour
Messages can be retained for up to 14 days to allow the full 2 days for processing
Using a dead-letter queue will retain failed messages without impacting other processing
SQS requires minimal operational overhead compared to running your own message queue server. The answer is C; a minimal sketch of the queue setup follows.
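A minimal boto3 sketch of option C's queue setup – queue names and maxReceiveCount are placeholders:

import json
import boto3

sqs = boto3.client("sqs")

# Dead-letter queue that collects messages which repeatedly fail processing.
dlq = sqs.create_queue(QueueName="payload-dlq")
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq["QueueUrl"], AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# Main queue: retention covers the 2-day processing window, failed messages
# are redriven to the DLQ so they don't block the remaining messages.
sqs.create_queue(
    QueueName="payload-queue",
    Attributes={
        "MessageRetentionPeriod": "345600",    # 4 days, in seconds
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": "5",
        }),
    },
)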
Amazon QuickSight
Fully managed BI data visualization service that allows you to easily create dashboards.
Question
A company has an Amazon S3 data lake that is governed by AWS Lake Formation. The company wants to create a
visualization in Amazon QuickSight by joining the data in the data lake with operational data that is stored in an Amazon
Aurora MySQL database. The company wants to enforce column-level authorization so that the company’s marketing team
can access only a subset of columns in the database.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use Amazon EMR to ingest the data directly from the database to the QuickSight SPICE engine. Include only the required
columns.
B. Use AWS Glue Studio to ingest the data from the database to the S3 data lake. Attach an IAM policy to the QuickSight
users to enforce column-level access control. Use Amazon S3 as the data source in QuickSight.
C. Use AWS Glue Elastic Views to create a materialized view for the database in Amazon S3. Create an S3 bucket policy to
enforce column-level access control for the QuickSight users. Use Amazon S3 as the data source in QuickSight.
D. Use a Lake Formation blueprint to ingest the data from the database to the S3 data lake. Use Lake Formation to enforce
column-level access control for the QuickSight users. Use Amazon Athena as the data source in QuickSight.
Solution
This solution leverages AWS Lake Formation to ingest data from the Aurora MySQL database into the S3 data lake, while
enforcing column-level access control for QuickSight users. Lake Formation can be used to create and manage the data
lake's metadata and enforce security and governance policies, including column-level access control. This solution then uses
Amazon Athena as the data source in QuickSight to query the data in the S3 data lake. This solution minimizes operational
overhead by leveraging AWS services to manage and secure the data, and by using a standard query service (Amazon
Athena) to provide a SQL interface to the data.
QUESTION
A company wants to analyze and troubleshoot Access Denied errors and Unauthorized errors that are related to IAM
permissions. The company has AWS CloudTrail turned on.
Which solution will meet these requirements with the LEAST effort?
A. Use AWS Glue and write custom scripts to query CloudTrail logs for the errors.
B. Use AWS Batch and write custom scripts to query CloudTrail logs for the errors.
C. Search CloudTrail logs with Amazon Athena queries to identify the errors.
D. Search CloudTrail logs with Amazon QuickSight. Create a dashboard to identify the errors.
SOLUTION
Athena allows you to run SQL queries on data in Amazon S3, including CloudTrail logs. It is the easiest way to query the logs
and identify specific errors without needing to write any custom code or scripts.
With Athena, you can write simple SQL queries to filter the CloudTrail logs for the "AccessDenied" and
"UnauthorizedOperation" error codes. This will return the relevant log entries that you can then analyze. - C
Question
A company is using Amazon CloudFront with its website. The company has enabled logging on the CloudFront distribution,
and logs are saved in one of the company’s Amazon S3 buckets. The company needs to perform advanced analyses on the
logs and build visualizations.
What should a solutions architect do to meet these requirements?
A. Use standard SQL queries in Amazon Athena to analyze the CloudFront logs in the S3 bucket. Visualize the results with
AWS Glue.
B. Use standard SQL queries in Amazon Athena to analyze the CloudFront logs in the S3 bucket. Visualize the results with
Amazon QuickSight.
C. Use standard SQL queries in Amazon DynamoDB to analyze the CloudFront logs in the S3 bucket. Visualize the results
with AWS Glue.
D. Use standard SQL queries in Amazon DynamoDB to analyze the CloudFront logs in the S3 bucket. Visualize the results
with Amazon QuickSight.
Solution
Option B: Amazon Athena allows you to run standard SQL queries directly on the data stored in the S3 bucket.
Amazon QuickSight is a business intelligence (BI) service that allows you to create interactive and visual dashboards to
analyze data. You can connect Amazon QuickSight to Amazon Athena to visualize the results of your SQL queries from the
CloudFront logs.
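A minimal sketch of such a query run from boto3 – it assumes a cloudfront_logs table has already been defined over the log bucket, and the database, column and result bucket names are placeholders:

import boto3

athena = boto3.client("athena")

# Count requests per HTTP status code across the CloudFront access logs.
query = """
SELECT status, COUNT(*) AS requests
FROM cloudfront_logs
GROUP BY status
ORDER BY requests DESC
"""

athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "default"},
    ResultConfiguration={"OutputLocation": "s3://athena-results-bucket/"},
)
# QuickSight can then use Athena as a data source to build visualizations
# on top of the same queries.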
Question
A reporting team receives files each day in an Amazon S3 bucket. The reporting team manually reviews and copies the files
from this initial S3 bucket to an analysis S3 bucket each day at the same time to use with Amazon QuickSight. Additional
teams are starting to send more files in larger sizes to the initial S3 bucket.
The reporting team wants to move the files automatically to the analysis S3 bucket as the files enter the initial S3 bucket. The
reporting team also wants to use AWS Lambda functions to run pattern-matching code on the copied data. In addition, the
reporting team wants to send the data files to a pipeline in Amazon SageMaker Pipelines.
What should a solutions architect do to meet these requirements with the LEAST operational overhead?
A. Create a Lambda function to copy the files to the analysis S3 bucket. Create an S3 event notification for the analysis S3
bucket. Configure Lambda and SageMaker Pipelines as destinations of the event notification. Configure
s3:ObjectCreated:Put as the event type.
B. Create a Lambda function to copy the files to the analysis S3 bucket. Configure the analysis S3 bucket to send event
notifications to Amazon EventBridge (Amazon CloudWatch Events). Configure an ObjectCreated rule in EventBridge
(CloudWatch Events). Configure Lambda and SageMaker Pipelines as targets for the rule.
C. Configure S3 replication between the S3 buckets. Create an S3 event notification for the analysis S3 bucket. Configure
Lambda and SageMaker Pipelines as destinations of the event notification. Configure s3:ObjectCreated:Put as the event
type.
D. Configure S3 replication between the S3 buckets. Configure the analysis S3 bucket to send event notifications to Amazon
EventBridge (Amazon CloudWatch Events). Configure an ObjectCreated rule in EventBridge (CloudWatch Events). Configure
Lambda and SageMaker Pipelines as targets for the rule.
Solution
Option D is the solution with the least operational overhead:
Use S3 replication between buckets
Send S3 events to EventBridge
Add Lambda and SageMaker as EventBridge rule targets
The reasons this has the least overhead:
S3 replication automatically copies new objects to analysis bucket
EventBridge allows easily adding multiple targets for events
No custom Lambda function needed for copying objects
Leverages managed services for event processing
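A minimal boto3 sketch of option D's event wiring – bucket, rule and function names are placeholders, and the analysis bucket must have EventBridge notifications turned on:

import json
import boto3

events = boto3.client("events")

# Rule that matches new objects landing in the analysis bucket.
events.put_rule(
    Name="analysis-bucket-object-created",
    EventPattern=json.dumps({
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": ["analysis-bucket"]}},
    }),
)

# The pattern-matching Lambda function (and, similarly, the SageMaker pipeline)
# can be attached as targets of the same rule; the function also needs a
# resource-based permission allowing events.amazonaws.com to invoke it.
events.put_targets(
    Rule="analysis-bucket-object-created",
    Targets=[{
        "Id": "pattern-matching-lambda",
        "Arn": "arn:aws:lambda:eu-west-1:123456789012:function:pattern-match",
    }],
)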
Question
A company produces batch data that comes from different databases. The company also produces live stream data from
network sensors and application APIs. The company needs to consolidate all the data into one place for business analytics.
The company needs to process the incoming data and then stage the data in different Amazon S3 buckets. Teams will later
run one-time queries and import the data into a business intelligence tool to show key performance indicators (KPIs).
Which combination of steps will meet these requirements with the LEAST operational overhead? (Choose two.)
A. Use Amazon Athena for one-time queries. Use Amazon QuickSight to create dashboards for KPIs.
B. Use Amazon Kinesis Data Analytics for one-time queries. Use Amazon QuickSight to create dashboards for KPIs.
C. Create custom AWS Lambda functions to move the individual records from the databases to an Amazon Redshift cluster.
D. Use an AWS Glue extract, transform, and load (ETL) job to convert the data into JSON format. Load the data into multiple
Amazon OpenSearch Service (Amazon Elasticsearch Service) clusters.
E. Use blueprints in AWS Lake Formation to identify the data that can be ingested into a data lake. Use AWS Glue to crawl
the source, extract the data, and load the data into Amazon S3 in Apache Parquet format.
Solution
AE
AWS Lake Formation and Glue provide automated data lake creation with minimal coding. Glue crawlers identify sources
and ETL jobs load to S3.
Athena allows ad-hoc queries directly on S3 data with no infrastructure to manage.
QuickSight provides easy cloud BI for dashboards.
Options C and D require significant custom coding for ETL and queries.
Redshift and OpenSearch would require additional setup and management overhead.
Question
A company hosts a data lake on AWS. The data lake consists of data in Amazon S3 and Amazon RDS for PostgreSQL. The
company needs a reporting solution that provides data visualization and includes all the data sources within the data lake.
Only the company's management team should have full access to all the visualizations. The rest of the company should
have only limited access.
Which solution will meet these requirements?
A. Create an analysis in Amazon QuickSight. Connect all the data sources and create new datasets. Publish dashboards to
visualize the data. Share the dashboards with the appropriate IAM roles.
B. Create an analysis in Amazon QuickSight. Connect all the data sources and create new datasets. Publish dashboards to
visualize the data. Share the dashboards with the appropriate users and groups.
C. Create an AWS Glue table and crawler for the data in Amazon S3. Create an AWS Glue extract, transform, and load (ETL)
job to produce reports. Publish the reports to Amazon S3. Use S3 bucket policies to limit access to the reports.
D. Create an AWS Glue table and crawler for the data in Amazon S3. Use Amazon Athena Federated Query to access data
within Amazon RDS for PostgreSQL. Generate reports by using Amazon Athena. Publish the reports to Amazon S3. Use S3
bucket policies to limit access to the reports.
Solution
Keywords:
- Data lake on AWS.
- Consists of data in Amazon S3 and Amazon RDS for PostgreSQL.
- The company needs a reporting solution that provides data VISUALIZATION and includes ALL the data sources within the
data lake.
A - Incorrect: Amazon QuickSight only supports users (Standard edition) and groups (Enterprise edition). These users and groups exist only within QuickSight; dashboards cannot be shared with IAM roles. We use users and groups to control who can view the QuickSight dashboards.
B - Correct: as explained in A, and QuickSight can create dashboards from S3, RDS, Redshift, Aurora, Athena, OpenSearch and Timestream.
C - Incorrect: this approach does not support visualization and does not mention how to process the RDS data.
D - Incorrect: this approach does not support visualization and does not mention how to combine the RDS and S3 data.
Option B involves using Amazon QuickSight, which is a business intelligence tool provided by AWS for data visualization and
reporting. With this option, you can connect all the data sources within the data lake, including Amazon S3 and Amazon
RDS for PostgreSQL. You can create datasets within QuickSight that pull data from these sources.
The solution allows you to publish dashboards in Amazon QuickSight, which will provide the required data visualization capabilities. To control access, you share the dashboards selectively with the appropriate QuickSight users and groups, assigning full access only to the company's management team and limiting access for the rest of the company.
Question
An ecommerce company wants to use machine learning (ML) algorithms to build and train models. The company will use
the models to visualize complex scenarios and to detect trends in customer data. The architecture team wants to integrate
its ML models with a reporting platform to analyze the augmented data and use the data directly in its business intelligence
dashboards.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use AWS Glue to create an ML transform to build and train models. Use Amazon OpenSearch Service to visualize the
data.
B. Use Amazon SageMaker to build and train models. Use Amazon QuickSight to visualize the data.
C. Use a pre-built ML Amazon Machine Image (AMI) from the AWS Marketplace to build and train models. Use Amazon
OpenSearch Service to visualize the data.
D. Use Amazon QuickSight to build and train models by using calculated fields. Use Amazon QuickSight to visualize the data.
Solution
Amazon SageMaker is a fully managed service that provides a complete set of tools and capabilities for building, training,
and deploying ML models. It simplifies the end-to-end ML workflow and reduces operational overhead by handling
infrastructure provisioning, model training, and deployment.
To visualize the data and integrate it into business intelligence dashboards, Amazon QuickSight can be used. QuickSight is a
cloud-native business intelligence service that allows users to easily create interactive visualizations, reports, and
dashboards from various data sources, including the augmented data generated by the ML models. === B
AWS Data Pipeline
- Managed ETL service for automating the movement and transformation of your data.
- Data-driven workflows to create dependencies between tasks and activities.
- Several storage integrations – DynamoDB, RDS, Redshift & S3.
- Compute integrations – EC2.
- KEYWORDS – "managed ETL service" or "auto-retries for data-driven workflows".
Use Cases:
- Processing data in EMR using Hadoop streaming
- Importing/exporting data in DynamoDB
- Copying CSV files between S3 buckets
- Exporting RDS data to S3
- Copying data to Redshift
Amazon MSK
- Fully managed service for running data streaming applications that leverage Apache Kafka.
- Control plane operations for creating, updating and deleting clusters as required.
- Data plane operations for producing and consuming streaming data.
- Good for existing apps that need to leverage Kafka.
- Can encrypt with KMS for server-side encryption (SSE) requirements.
Question
A company has more than 10,000 sensors that send data to an on-premises Apache Kafka server by using the Message
Queuing Telemetry Transport (MQTT) protocol. The on-premises Kafka server transforms the data and then stores the
results as objects in an Amazon S3 bucket.
Recently, the Kafka server crashed. The company lost sensor data while the server was being restored. A solutions architect
must create a new design on AWS that is highly available and scalable to prevent a similar occurrence.
Which solution will meet these requirements?
A. Launch two Amazon EC2 instances to host the Kafka server in an active/standby configuration across two Availability
Zones. Create a domain name in Amazon Route 53. Create a Route 53 failover policy. Route the sensors to send the data to
the domain name.
B. Migrate the on-premises Kafka server to Amazon Managed Streaming for Apache Kafka (Amazon MSK). Create a Network
Load Balancer (NLB) that points to the Amazon MSK broker. Enable NLB health checks. Route the sensors to send the data
to the NLB.
C. Deploy AWS IoT Core, and connect it to an Amazon Kinesis Data Firehose delivery stream. Use an AWS Lambda function
to handle data transformation. Route the sensors to send the data to AWS IoT Core.
D. Deploy AWS IoT Core, and launch an Amazon EC2 instance to host the Kafka server. Configure AWS IoT Core to send the
data to the EC2 instance. Route the sensors to send the data to AWS IoT Core.
Solution
Those who selected option B, please note that while Amazon MSK itself does not have built-in ETL (Extract, Transform, Load) capabilities, it can be used in conjunction with other AWS services to build an ETL pipeline.
There are three asks in the question:
1) Pull the sensor data using MQTT -> AWS IoT Core can connect to MQTT devices.
2) ETL the data -> we could send data from IoT Core straight to Lambda, but for large-scale data Kinesis Data Firehose is used, and Firehose uses Lambda for the transformation.
3) Send the data to the destination, which is S3 -> the Firehose delivery stream delivers to S3.
Option C is the right answer.
Question
A company hosts more than 300 global websites and applications. The company requires a platform to analyze more than
30 TB of clickstream data each day.
What should a solutions architect do to transmit and process the clickstream data?
A. Design an AWS Data Pipeline to archive the data to an Amazon S3 bucket and run an Amazon EMR cluster with the data
to generate analytics.
B. Create an Auto Scaling group of Amazon EC2 instances to process the data and send it to an Amazon S3 data lake for
Amazon Redshift to use for analysis.
C. Cache the data to Amazon CloudFront. Store the data in an Amazon S3 bucket. When an object is added to the S3 bucket,
run an AWS Lambda function to process the data for analysis.
D. Collect the data from Amazon Kinesis Data Streams. Use Amazon Kinesis Data Firehose to transmit the data to an
Amazon S3 data lake. Load the data in Amazon Redshift for analysis.
Solution
Option D is the most appropriate solution for transmitting and processing the clickstream data in this scenario.
Amazon Kinesis Data Streams is a highly scalable and durable service that enables real-time processing of streaming data at
a high volume and high rate. You can use Kinesis Data Streams to collect and process the clickstream data in real-time.
Amazon Kinesis Data Firehose is a fully managed service that loads streaming data into data stores and analytics tools. You
can use Kinesis Data Firehose to transmit the data from Kinesis Data Streams to an Amazon S3 data lake.
Once the data is in the data lake, you can use Amazon Redshift to load the data and perform analysis on it. Amazon Redshift
is a fully managed, petabyte-scale data warehouse service that allows you to quickly and efficiently analyze data using SQL
and your existing business intelligence tools.
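A minimal sketch of the producer side of this pattern: a client putting one clickstream record into a Kinesis data stream with boto3. The stream name, region, and event fields are assumptions for illustration only.

```python
# Hypothetical sketch: sending a single clickstream event into a Kinesis data stream.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

event = {"user_id": "u-123", "page": "/home", "timestamp": "2024-01-01T00:00:00Z"}

kinesis.put_record(
    StreamName="clickstream",                      # assumed stream name
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["user_id"],                 # spreads records across shards
)
```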
Question
A company is preparing a new data platform that will ingest real-time streaming data from multiple sources. The company
needs to transform the data before writing the data to Amazon S3. The company needs the ability to use SQL to query the
transformed data.
Which solutions will meet these requirements? (Choose two.)
A. Use Amazon Kinesis Data Streams to stream the data. Use Amazon Kinesis Data Analytics to transform the data. Use
Amazon Kinesis Data Firehose to write the data to Amazon S3. Use Amazon Athena to query the transformed data from
Amazon S3.
B. Use Amazon Managed Streaming for Apache Kafka (Amazon MSK) to stream the data. Use AWS Glue to transform the
data and to write the data to Amazon S3. Use Amazon Athena to query the transformed data from Amazon S3.
C. Use AWS Database Migration Service (AWS DMS) to ingest the data. Use Amazon EMR to transform the data and to write
the data to Amazon S3. Use Amazon Athena to query the transformed data from Amazon S3.
D. Use Amazon Managed Streaming for Apache Kafka (Amazon MSK) to stream the data. Use Amazon Kinesis Data Analytics
to transform the data and to write the data to Amazon S3. Use the Amazon RDS query editor to query the transformed data
from Amazon S3.
E. Use Amazon Kinesis Data Streams to stream the data. Use AWS Glue to transform the data. Use Amazon Kinesis Data
Firehose to write the data to Amazon S3. Use the Amazon RDS query editor to query the transformed data from Amazon S3.
Solution
AB
A uses Kinesis Data Streams for streaming, Kinesis Data Analytics for transformation, Kinesis Data Firehose for writing to S3,
and Athena for SQL queries on S3 data.
B uses Amazon MSK for streaming, AWS Glue for transformation and writing to S3, and Athena for SQL queries on S3 data.
Option C: This option is not ideal for streaming real-time data as AWS DMS is not optimized for real-time data ingestion.
Option D & E: These options are not recommended as the Amazon RDS query editor is not designed for querying data in S3,
and it is not efficient for running complex queries.
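For the SQL-on-S3 step, a hedged sketch of kicking off an Athena query with boto3. The database, table, and results bucket names are assumptions.

```python
# Hypothetical sketch: running a SQL query with Athena over transformed data in S3.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

athena.start_query_execution(
    QueryString="SELECT page, COUNT(*) AS hits FROM clickstream_transformed GROUP BY page",
    QueryExecutionContext={"Database": "analytics"},                  # assumed Glue database
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```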
Amazon OpenSearch (formerly Elasticsearch)
Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring,
website search, and more. OpenSearch is an open source, distributed search and analytics suite derived from Elasticsearch.
- Quickly ingest, search, and analyse data in clusters.
- Easily scale the cluster.
- Leverage IAM for access control – VPC support, encryption at rest / in transit.
- Multi-AZ capable.
- Integrations with other AWS services.
Exam Tip: look for scenarios that talk about a logging solution or visualization of log file analytics.
Question
A company stores its application logs in an Amazon CloudWatch Logs log group. A new policy requires the company to store
all application logs in Amazon OpenSearch Service (Amazon Elasticsearch Service) in near-real time.
Which solution will meet this requirement with the LEAST operational overhead?
A. Configure a CloudWatch Logs subscription to stream the logs to Amazon OpenSearch Service (Amazon Elasticsearch
Service).
B. Create an AWS Lambda function. Use the log group to invoke the function to write the logs to Amazon OpenSearch
Service (Amazon Elasticsearch Service).
C. Create an Amazon Kinesis Data Firehose delivery stream. Configure the log group as the delivery streams sources.
Configure Amazon OpenSearch Service (Amazon Elasticsearch Service) as the delivery stream's destination.
D. Install and configure Amazon Kinesis Agent on each application server to deliver the logs to Amazon Kinesis Data
Streams. Configure Kinesis Data Streams to deliver the logs to Amazon OpenSearch Service (Amazon Elasticsearch Service).
Solution - C
Using Kinesis Data Firehose will allow near real-time delivery of the CloudWatch logs to Amazon Elasticsearch Service with
the least operational overhead compared to the other options.
Firehose can be configured to automatically ingest data from CloudWatch Logs into Elasticsearch without needing to run
Lambda functions or install agents on the application servers. This makes it the most operationally simple way to meet the
stated requirements.
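A sketch of the wiring for answer C: subscribing a log group to a Firehose delivery stream that (in this scenario) targets OpenSearch. The log group name, delivery stream ARN, and IAM role ARN are assumptions; the role must allow CloudWatch Logs to put records into the stream.

```python
# Hypothetical sketch: CloudWatch Logs subscription filter -> Kinesis Data Firehose.
import boto3

logs = boto3.client("logs", region_name="us-east-1")

logs.put_subscription_filter(
    logGroupName="/app/production",                                          # assumed log group
    filterName="to-opensearch",
    filterPattern="",                                                        # empty pattern = all events
    destinationArn="arn:aws:firehose:us-east-1:123456789012:deliverystream/app-logs",
    roleArn="arn:aws:iam::123456789012:role/CWLtoFirehoseRole",              # assumed role
)
```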
Chapter 15 - Serverless Architecture
Lambda
Serverless compute service that lets you run code without provisioning servers.
- If a Lambda function needs to make an API call, you need to attach a role.
- Networking – you have to define the VPC, subnets, etc. if the function needs VPC access.
- Define resources – CPU and RAM.
- Define what triggers the function – it won't run otherwise.
- Can't run for longer than 15 minutes.
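A minimal sketch of what a Lambda function looks like in the Python runtime; the handler only runs when a trigger invokes it, and execution is capped at 15 minutes.

```python
# Minimal Lambda handler sketch (Python runtime).
import json

def lambda_handler(event, context):
    # 'event' carries the trigger payload (S3 notification, API Gateway request, etc.)
    print(json.dumps(event))
    return {"statusCode": 200, "body": "ok"}
```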
Questions:
-
https://www.examtopics.com/discussions/amazon/view/116925-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/103598-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/109285-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/116979-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/102180-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/102144-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/102128-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/109405-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/99756-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/102178-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/111430-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/109513-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/85816-exam-aws-certified-solutions-architect-associatesaa-c03/
AWS Serverless Application Repository
- Helps you publish your own custom serverless apps.
- Define whole apps via an AWS SAM template.
- Deploy and publish.
You can search for and deploy serverless applications that have been published to the AWS Serverless Application
Repository.
When you publish a serverless application to the AWS Serverless Application Repository, you make it available for others to
find and deploy.
Container
Standard unit of software that packages up code and all its dependencies, so the app runs quickly and reliably from one
computing env to another.
Containers are a key component of modern app development. They have become the standard way to organize compute
resources, and manage the content of your application deployments.
Containers provide a discrete reproducible compute environment. They also provide a way to simplify packaging and
dependency management. From the orchestration of very large multi-cluster estates to web applications - or even testing
your work and doing a proof of concept on your laptop - they are a great way to get started and build software to deploy in
the cloud.
-
Dockerfile – set of instructions to build a container.
Image – contains code, library, dependencies and config needed to run an app.
Registry – stores the docker image
Container – running copy of the image.
ECS/ EKS – running containers
Elastic Container Service – manage many containers without hosting them on servers you manage; keeps them running, integrates with ELBs, can
assign a role & is much easier! (only works in AWS)
Elastic Container Registry (ECR) – store a Docker image in registry
Kub – Open source orchestration platform. On-prem & cloud.
EKS – Elastic Kubernetes Service
EKS vs ECS
ECS – ease of use but all in for AWS
EKS – best when not wanting to be fully AWS.
AWS Fargate
Serverless compute engine for containers
EC2 vs Fargate
Use EC2 for long-running containers.
Use Fargate when you don't want to manage the OS, for short-running workloads (batch processing), or when you need an isolated environment.
Lambda vs Fargate
Fargate – better for more consistent workloads; allows Docker use across the org & a greater level of control for developers.
Lambda – better for unpredictable workloads, AWS-only, excels with a single function.
(Basically: do I need to have containers, or do I just want to deal with my code?)
Questions:
-
https://www.examtopics.com/discussions/amazon/view/99813-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/109463-exam-aws-certified-solutions-architectassociate-saa-c03/ - C
https://www.examtopics.com/discussions/amazon/view/109664-exam-aws-certified-solutions-architectassociate-saa-c03/ - ECS slightly cheaper than EKS and says only runs on AWS.
Question
A company is designing a containerized application that will use Amazon Elastic Container Service (Amazon ECS). The
application needs to access a shared file system that is highly durable and can recover data to another AWS Region with a
recovery point objective (RPO) of 8 hours. The file system needs to provide a mount target in each Availability Zone within a
Region.
A solutions architect wants to use AWS Backup to manage the replication to another Region.
Which solution will meet these requirements?
A. Amazon FSx for Windows File Server with a Multi-AZ deployment
B. Amazon FSx for NetApp ONTAP with a Multi-AZ deployment
C. Amazon Elastic File System (Amazon EFS) with the Standard storage class
D. Amazon FSx for OpenZFS
Solution
Answer: C (Amazon EFS with the Standard storage class).
Q: What is Amazon EFS Replication?
EFS Replication can replicate your file system data to another Region or within the same Region without requiring
additional infrastructure or a custom process. Amazon EFS Replication automatically and transparently replicates your data
to a second file system in a Region or AZ of your choice. You can use the Amazon EFS console, AWS CLI, and APIs to activate
replication on an existing file system. EFS Replication is continual and provides a recovery point objective (RPO) and a
recovery time objective (RTO) of minutes, helping you meet your compliance and business continuity goals.
Amazon Eventbridge (AKA Cloudwatch Events)
- Serverless event bus; allows you to pass events from a source to an endpoint (the glue that holds your serverless application together).
- Trigger an action based on something that happened in AWS.
- Fastest way to respond to things in your environment.
Amazon EventBridge works with AWS CloudTrail, a service that records actions from AWS services. CloudTrail
captures API calls made by or on behalf of your AWS account from the EventBridge console and to EventBridge
API operations.
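A hedged sketch of publishing a custom event onto the default event bus so a rule can match it and invoke a target. The source and detail-type values are assumptions.

```python
# Hypothetical sketch: putting a custom event onto EventBridge.
import json
import boto3

events = boto3.client("events", region_name="us-east-1")

events.put_events(
    Entries=[
        {
            "Source": "com.example.orders",          # assumed custom source
            "DetailType": "OrderPlaced",
            "Detail": json.dumps({"orderId": "1234", "total": 42}),
            # The default event bus is used when "EventBusName" is omitted.
        }
    ]
)
```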
Question
An Amazon EventBridge rule targets a third-party API. The third-party API has not received any incoming traffic. A solutions
architect needs to determine whether the rule conditions are being met and if the rule's target is being invoked.
Which solution will meet these requirements?
A. Check for metrics in Amazon CloudWatch in the namespace for AWS/Events.
B. Review events in the Amazon Simple Queue Service (Amazon SQS) dead-letter queue.
C. Check for the events in Amazon CloudWatch Logs.
D. Check the trails in AWS CloudTrail for the EventBridge events.
Solution - D
The key reasons:
AWS CloudTrail provides visibility into EventBridge operations by logging API calls made by EventBridge.
Checking the CloudTrail trails will show the PutEvents API calls made when EventBridge rules match an event pattern.
CloudTrail will also log the Invoke API call when the rule target is triggered.
CloudWatch metrics and logs contain runtime performance data but not info on rule evaluation and targeting.
SQS dead letter queues collect failed event deliveries but won't provide insights on successful invocations.
CloudTrail is purpose-built to log operational events and API activity so it can confirm if the EventBridge rule is being
evaluated and triggering the target as expected.
Amazon ECR
- AWS-managed container image registry (regional) that offers secure, scalable and reliable infrastructure.
- Amazon Elastic Container Registry (Amazon ECR) is an AWS-managed container image registry service that is secure, scalable, and reliable. You can store Docker images, Open Container Initiative (OCI) images, and OCI-compatible artifacts.
- Need an auth token to push/pull images.
- Repo policy – controls access on a per-repo basis.
- Lifecycle policies – management of images; define rules for cleanup and test rules before applying them (see the sketch below).
- Image scanning – scan images for vulnerabilities in the container; can scan on push and retrieve the report.
- Can cache.
- Can combine with Docker on-prem, ECS, EKS.
- Supports Docker images, OCI images.
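A sketch of applying an ECR lifecycle policy that expires untagged images after 14 days. The repository name and rule values are assumptions chosen for illustration.

```python
# Hypothetical sketch: ECR lifecycle policy to clean up untagged images.
import json
import boto3

ecr = boto3.client("ecr", region_name="us-east-1")

lifecycle_policy = {
    "rules": [
        {
            "rulePriority": 1,
            "description": "Expire untagged images after 14 days",
            "selection": {
                "tagStatus": "untagged",
                "countType": "sinceImagePushed",
                "countUnit": "days",
                "countNumber": 14,
            },
            "action": {"type": "expire"},
        }
    ]
}

ecr.put_lifecycle_policy(
    repositoryName="example-app",                      # assumed repository
    lifecyclePolicyText=json.dumps(lifecycle_policy),
)
```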
Open Source Kubernetes in EKS Distro
- Fully managed by you, can run anywhere (AWS, on-prem).
- Have to upgrade it yourself.
EKS Anywhere
EKS Anywhere provides an installable software package for creating and operating Kubernetes clusters on-premises and
automation tooling for cluster lifecycle operations. EKS Anywhere is certified Kubernetes conformant, so existing
applications that run on upstream Kubernetes are compatible with EKS Anywhere.
- On-prem way to manage K8s clusters with the same practices used for AWS.
- Based on the EKS Distro.
- Offers full lifecycle mgmt. of multiple K8s clusters.
ECS Anywhere
Amazon Elastic Container Service (ECS) Anywhere is a feature of Amazon ECS that lets you run and manage container
workloads on your infrastructure. This feature helps you meet compliance requirements and scale your business without
sacrificing your on-premises investments.
- Mgmt. of container-based apps on-prem.
- No need to install local container orchestration software – more operational efficiency.
- Completely managed solution.
- No ELB support for inbound traffic.
- Need to have the SSM Agent, ECS Agent & Docker installed.
SSM Agent - SSM Agent makes it possible for Systems Manager to update, manage, and configure these resources. The
agent processes requests from the Systems Manager service in the AWS Cloud, and then runs them as specified in the
request.
Question
Due to strict compliance requirements, your company cannot leverage AWS cloud for hosting their Kubernetes clusters, nor
for managing the clusters. However, they do want to try to follow the established best practices and processes that the
Amazon EKS service has implemented. How can your company achieve this while running entirely on-premises?
- Run Amazon ECS Anywhere.
- Run the clusters on-premises using Amazon EKS Distro.
- Run Amazon EKS.
- This cannot be done.
Answer
Amazon EKS is based on the EKS Distro, which allows you to leverage on-premises the best practices and established processes that Amazon EKS uses in AWS. Reference: Amazon EKS Distro
ECS Anywhere does not support Kubernetes deployments.
Aurora Serverless
- On-demand, auto-scaling version of Aurora.
- Automates monitoring workloads and adjusting capacity for databases.
- Capacity is based on demand.
- Billing is charged only for resources used – budget-friendly service.
- 6 copies of data across 3 AZs.
- Multi-AZ deployments.
- ACU – Aurora Capacity Units: set a min and max, a measurement of how your clusters scale – billed per second for used resources (see the sketch after the use cases below).
Use cases
- Good for variable workloads.
- Dev and testing / new apps – rather than guessing capacity.
- Good for capacity planning.
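A hedged sketch of the min/max ACU idea: creating an Aurora Serverless v2 cluster with a scaling range. Identifiers and the hard-coded password are placeholders only; in practice the credentials would come from Secrets Manager, and the engine version must be one that supports Serverless v2.

```python
# Hypothetical sketch: Aurora Serverless v2 cluster with min/max ACUs.
import boto3

rds = boto3.client("rds", region_name="us-east-1")

rds.create_db_cluster(
    DBClusterIdentifier="app-serverless",                   # assumed identifier
    Engine="aurora-mysql",
    MasterUsername="admin",
    MasterUserPassword="change-me-use-secrets-manager",      # placeholder only
    ServerlessV2ScalingConfiguration={"MinCapacity": 0.5, "MaxCapacity": 4},
)

# Serverless v2 still needs a DB instance of the special "db.serverless" class.
rds.create_db_instance(
    DBInstanceIdentifier="app-serverless-instance-1",
    DBClusterIdentifier="app-serverless",
    DBInstanceClass="db.serverless",
    Engine="aurora-mysql",
)
```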
Question
A company has a web application with sporadic usage patterns. There is heavy usage at the beginning of each month,
moderate usage at the start of each week, and unpredictable usage during the week. The application consists of a web
server and a MySQL database server running inside the data center. The company would like to move the application to the
AWS Cloud, and needs to select a cost-effective database platform that will not require database modifications.
Which solution will meet these requirements?
A. Amazon DynamoDB
B. Amazon RDS for MySQL
C. MySQL-compatible Amazon Aurora Serverless
D. MySQL deployed on Amazon EC2 in an Auto Scaling group
Solution
C: Aurora Serverless is a MySQL-compatible relational database engine that automatically scales compute and memory
resources based on application usage, with no upfront costs or commitments required.
A: DynamoDB is NoSQL.
B: Fixed cost based on the RDS instance class.
D: More operations required.
Question
A company runs an application on AWS. The application receives inconsistent amounts of usage. The application uses AWS
Direct Connect to connect to an on-premises MySQL-compatible database. The on-premises database consistently uses a
minimum of 2 GiB of memory.
The company wants to migrate the on-premises database to a managed AWS service. The company wants to use auto
scaling capabilities to manage unexpected workload increases.
Which solution will meet these requirements with the LEAST administrative overhead?
A. Provision an Amazon DynamoDB database with default read and write capacity settings.
B. Provision an Amazon Aurora database with a minimum capacity of 1 Aurora capacity unit (ACU).
C. Provision an Amazon Aurora Serverless v2 database with a minimum capacity of 1 Aurora capacity unit (ACU).
D. Provision an Amazon RDS for MySQL database with 2 GiB of memory.
Solution
The answer is C. Aurora Serverless v2 is MySQL-compatible, is a fully managed service, and automatically scales capacity (from a configurable minimum ACU) to handle unexpected workload increases, so it meets the requirements with the least administrative overhead.
Questions:
-
https://www.examtopics.com/discussions/amazon/view/109608-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/119590-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/99769-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/117272-exam-aws-certified-solutions-architectassociate-saa-c03/
-
https://www.examtopics.com/discussions/amazon/view/99948-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/109532-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/102143-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/87570-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/87570-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/95319-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/102140-exam-aws-certified-solutions-architectassociate-saa-c03/
Amazon X-Ray
Application insight – a service that collects data about the requests and responses within your application, giving you insights into how it behaves.
AWS X-Ray provides a complete view of requests as they travel through your application and filters visual data across
payloads, functions, traces, services, APIs, and more with no-code and low-code motions.
Questions
-
GraphQL (AWS AppSync)
Scalable GraphQL interface for front-end app data fetching – good for iOS.
Chapter 16 - Security
DDOS
Layer 4 DDOS attack (SYN Flood) – at the transport layer (TCP/UDP).
TCP does a 3-way handshake: the client sends a SYN packet, the server replies with a SYN-ACK, then the client responds with an ACK.
Once complete, the TCP connection is established and the app can send data at the application layer, layer 7 (HTTP).
SYN Flood – sends loads of SYN packets and doesn't wait for the SYN-ACK; the server sends out a SYN-ACK and reserves a
port for a TCP connection – it can eventually get overwhelmed because it has too many unused open connections, preventing legitimate
requests.
NTP Amplification Attack
An attacker may send a 3rd-party server (e.g. a Network Time Protocol (NTP) server) a request using a spoofed IP address (actually
the target's IP). NTP amplifies the payload and sends it back to the target IP.
Layer 7 Attack
Web server receives a flood of GET or POST requests, usually from a botnet or a large number of compromised computers.
CloudTrail – Logging API calls
“CCTV monitoring for your AWS account”
Records user and resource activity from their Mgmt Console actions and API calls, Source IP and when they were made.
Near real time and you can combine with Lambda.
Logs Stored in S3
What is logged:
- Metadata around API calls
- Source IP of the API caller
- Identity of the API caller
- Time of the API call
- Request parameters
- Response elements returned from the server
AWS Shield
Free DDoS protection on ELB, CloudFront & Route 53.
Protects against SYN / UDP floods and reflection attacks.
Layer 3 or 4 attacks only.
Shield Advanced
- Enhanced protections against larger attacks.
- Offers always-on, flow-based monitoring of network traffic and active application monitoring to provide near-real-time
notifications of DDoS attacks.
- 24/7 DDoS response team on hand.
- Protects your AWS bill from large resource usage caused by DDoS attacks.
AWS WAF (Web App Firewall)
Lets you monitor the HTTP / HTTPS requests that are forwarded to CloudFront or an Application Load Balancer (layer 7).
Configure conditions such as which IP addresses are allowed to make a request or which query string parameters need to be
passed.
- Rules for SQL injection or cross-site scripting.
- Can block countries.
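A hedged sketch of attaching an existing WAFv2 web ACL (for example, one with SQL injection rules) to an Application Load Balancer. Both ARNs are placeholders; for CloudFront the web ACL is associated on the distribution itself instead.

```python
# Hypothetical sketch: associating a WAFv2 web ACL with an ALB.
import boto3

wafv2 = boto3.client("wafv2", region_name="us-east-1")

wafv2.associate_web_acl(
    WebACLArn="arn:aws:wafv2:us-east-1:123456789012:regional/webacl/app-acl/abcd1234",   # assumed
    ResourceArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/50dc6c495c0c9188",  # assumed
)
```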
Centralising WAF management via AWS Firewall Manager
Firewall Manager – allows you to centrally set up and manage firewall rules across multiple AWS accounts and apps in AWS
Organizations.
Can create WAF rules with it – simplifies management and enforces policies on existing and new resources.
Question
A media company hosts its website on AWS. The website application’s architecture includes a fleet of Amazon EC2
instances behind an Application Load Balancer (ALB) and a database that is hosted on Amazon Aurora. The company’s
cybersecurity team reports that the application is vulnerable to SQL injection.
How should the company resolve this issue?
A. Use AWS WAF in front of the ALB. Associate the appropriate web ACLs with AWS WAF.
B. Create an ALB listener rule to reply to SQL injections with a fixed response.
C. Subscribe to AWS Shield Advanced to block all SQL injection attempts automatically.
D. Set up Amazon Inspector to block all SQL injection attempts automatically.
Solution
° Use AWS WAF in front of the Application Load Balancer
° Configure appropriate WAF web ACLs to detect and block SQL injection patterns
The key points:
° Website hosted on EC2 behind an ALB with Aurora database
° Application is vulnerable to SQL injection attacks
° AWS WAF is designed to detect and block SQL injection and other common web exploits. It can be placed in front of the
ALB to inspect all incoming requests. WAF rules can identify malicious SQL patterns and block them.
Question
A company has implemented a self-managed DNS service on AWS. The solution consists of the following:
• Amazon EC2 instances in different AWS Regions
• Endpoints of a standard accelerator in AWS Global Accelerator
The company wants to protect the solution against DDoS attacks.
What should a solutions architect do to meet this requirement?
A. Subscribe to AWS Shield Advanced. Add the accelerator as a resource to protect.
B. Subscribe to AWS Shield Advanced. Add the EC2 instances as resources to protect.
C. Create an AWS WAF web ACL that includes a rate-based rule. Associate the web ACL with the accelerator.
D. Create an AWS WAF web ACL that includes a rate-based rule. Associate the web ACL with the EC2 instances.
Solution
AWS Shield is a managed service that provides protection against Distributed Denial of Service (DDoS) attacks for
applications running on AWS. AWS Shield Standard is automatically enabled to all AWS customers at no additional cost.
AWS Shield Advanced is an optional paid service. AWS Shield Advanced provides additional protections against more
sophisticated and larger attacks for your applications running on Amazon Elastic Compute Cloud (EC2), Elastic Load
Balancing (ELB), Amazon CloudFront, AWS Global Accelerator, and Route 53.
Question:
A solutions architect must design a highly available infrastructure for a website. The website is powered by Windows web
servers that run on Amazon EC2 instances. The solutions architect must implement a solution that can mitigate a large-scale
DDoS attack that originates from thousands of IP addresses. Downtime is not acceptable for the website.
Which actions should the solutions architect take to protect the website from such an attack? (Choose two.)
A. Use AWS Shield Advanced to stop the DDoS attack.
B. Configure Amazon GuardDuty to automatically block the attackers.
C. Configure the website to use Amazon CloudFront for both static and dynamic content.
D. Use an AWS Lambda function to automatically add attacker IP addresses to VPC network ACLs.
E. Use EC2 Spot Instances in an Auto Scaling group with a target tracking scaling policy that is set to 80% CPU utilization.
Solution:
Option A. Use AWS Shield Advanced to stop the DDoS attack.
It provides always-on protection for Amazon EC2 instances, Elastic Load Balancers, and Amazon Route 53 resources. By
using AWS Shield Advanced, the solutions architect can help protect the website from large-scale DDoS attacks.
Option C. Configure the website to use Amazon CloudFront for both static and dynamic content.
CloudFront is a content delivery network (CDN) that integrates with other Amazon Web Services products, such as Amazon
S3 and Amazon EC2, to deliver content to users with low latency and high data transfer speeds. By using CloudFront, the
solutions architect can distribute the website's content across multiple edge locations, which can help absorb the impact of
a DDoS attack and reduce the risk of downtime for the website.
Amazon GuardDuty
Threat detection service that uses ML to continuously monitor for malicious behaviour.
- Port scanning, failed logins
- Compromised instances
- Unauthorized deployments
- Unusual API calls
- Monitors VPC Flow Logs, CloudTrail and DNS logs
- Uses an updated DB of known malicious domains
- Centralises threat detection across multiple AWS accounts; automated response using CloudWatch Events & Lambda
Macie
Amazon Macie is designed specifically for discovering and classifying sensitive data, like personally identifiable information
(PII), in S3. This makes it the optimal service for PII discovery.
Findings can be sent to EventBridge or other SIEM tools.
Amazon Inspector
Amazon Inspector is an automated vulnerability management service that continually scans AWS workloads for software
vulnerabilities and unintended network exposure on EC2s and VPCs
- Network assessment – network configuration analysis to check for ports reachable from outside the VPC.
- Host assessment – vulnerable software assessments, host hardening & security best practices.
Question:
A company has deployed its newest product on AWS. The product runs in an Auto Scaling group behind a Network Load
Balancer. The company stores the product’s objects in an Amazon S3 bucket.
The company recently experienced malicious attacks against its systems. The company needs a solution that continuously
monitors for malicious activity in the AWS account, workloads, and access patterns to the S3 bucket. The solution must also
report suspicious activity and display the information on a dashboard.
Which solution will meet these requirements?
A. Configure Amazon Macie to monitor and report findings to AWS Config.
B. Configure Amazon Inspector to monitor and report findings to AWS CloudTrail.
C. Configure Amazon GuardDuty to monitor and report findings to AWS Security Hub.
D. Configure AWS Config to monitor and report findings to Amazon EventBridge.
Solution:
Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized
behavior. It analyzes AWS CloudTrail, VPC Flow Logs, and DNS logs.
GuardDuty can detect threats like instance or S3 bucket compromise, malicious IP addresses, or unusual API calls.
Findings can be sent to AWS Security Hub which provides a centralized security dashboard and alerts.
Amazon Macie and Amazon Inspector do not monitor the breadth of activity that GuardDuty does. They focus more on
data security and application vulnerabilities respectively.
AWS Config monitors for resource configuration changes, not malicious activity.
Question:
An IAM user made several configuration changes to AWS resources in their company's account during a production
deployment last week. A solutions architect learned that a couple of security group rules are not configured as desired. The
solutions architect wants to confirm which IAM user was responsible for making changes.
Which service should the solutions architect use to find the desired information?
A. Amazon GuardDuty
B. Amazon Inspector
C. AWS CloudTrail
D. AWS Config
Solution:
C. AWS CloudTrail
The best option is to use AWS CloudTrail to find the desired information. AWS CloudTrail is a service that enables
governance, compliance, operational auditing, and risk auditing of AWS account activities. CloudTrail can be used to log all
changes made to resources in an AWS account, including changes made by IAM users, EC2 instances, AWS management
console, and other AWS services. By using CloudTrail, the solutions architect can identify the IAM user who made the
configuration changes to the security group rules.
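A hedged sketch of how that lookup could be done with the CloudTrail API: filtering recent events by event name to see who made the change. The event name used as the filter is an assumption; other write events could be queried the same way.

```python
# Hypothetical sketch: finding who changed a security group via CloudTrail.
import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

response = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventName", "AttributeValue": "AuthorizeSecurityGroupIngress"}
    ],
    MaxResults=10,
)

for event in response["Events"]:
    print(event.get("Username"), event["EventTime"], event["EventName"])
```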
Question:
A security audit reveals that Amazon EC2 instances are not being patched regularly. A solutions architect needs to provide a
solution that will run regular security scans across a large fleet of EC2 instances. The solution should also patch the EC2
instances on a regular schedule and provide a report of each instance’s patch status.
Which solution will meet these requirements?
A. Set up Amazon Macie to scan the EC2 instances for software vulnerabilities. Set up a cron job on each EC2 instance to
patch the instance on a regular schedule.
B. Turn on Amazon GuardDuty in the account. Configure GuardDuty to scan the EC2 instances for software vulnerabilities.
Set up AWS Systems Manager Session Manager to patch the EC2 instances on a regular schedule.
C. Set up Amazon Detective to scan the EC2 instances for software vulnerabilities. Set up an Amazon EventBridge scheduled
rule to patch the EC2 instances on a regular schedule.
D. Turn on Amazon Inspector in the account. Configure Amazon Inspector to scan the EC2 instances for software
vulnerabilities. Set up AWS Systems Manager Patch Manager to patch the EC2 instances on a regular schedule.
Solution:
Amazon Inspector is a security assessment service that automatically assesses applications for vulnerabilities or deviations
from best practices. It can be used to scan the EC2 instances for software vulnerabilities. AWS Systems Manager Patch
Manager can be used to patch the EC2 instances on a regular schedule. Together, these services can provide a solution that
meets the requirements of running regular security scans and patching EC2 instances on a regular schedule. Additionally,
Patch Manager can provide a report of each instance’s patch status.
Question:
A solutions architect needs to review a company's Amazon S3 buckets to discover personally identifiable information (PII).
The company stores the PII data in the us-east-1 Region and us-west-2 Region.
Which solution will meet these requirements with the LEAST operational overhead?
A. Configure Amazon Macie in each Region. Create a job to analyze the data that is in Amazon S3.
B. Configure AWS Security Hub for all Regions. Create an AWS Config rule to analyze the data that is in Amazon S3.
C. Configure Amazon Inspector to analyze the data that is in Amazon S3.
D. Configure Amazon GuardDuty to analyze the data that is in Amazon S3.
Solution:
Amazon Macie is designed specifically for discovering and classifying sensitive data like PII in S3. This makes it the optimal
service to use.
Macie can be enabled directly in the required Regions rather than enabling it across all Regions which is unnecessary. This
minimizes overhead.
Macie can be set up to automatically scan the specified S3 buckets on a schedule. No need to create separate jobs.
Security Hub is for security monitoring across AWS accounts, not specific for PII discovery. More overhead than needed.
Inspector and GuardDuty are not built for PII discovery in S3 buckets. They provide broader security capabilities.
Key Management Service (KMS)
Managed service that makes it easy for you to create and control the encryption keys used to encrypt your data.
Integrates with EBS, S3, RDS & others.
CMK – a Customer Master Key is the logical representation of a master key – it includes metadata such as the key ID, creation date and key
state.
HSM – a Hardware Security Module is a physical computing device that safeguards and manages digital keys and performs
encryption and decryption functions.
Key policies are the best way to manage access to KMS keys – you must attach a resource-based policy (the key policy) to each CMK.
Exam Tips:
You start using the service by requesting the creation of a CMK – you can control the lifecycle of the CMK.
What is the minimum length of time before you can schedule a KMS key to be deleted? 7 days (see the sketch after the lists below).
3 ways to generate a CMK
1. AWS creates the CMK – the key material for the CMK is generated in an HSM managed by AWS KMS.
2. Import key material from your own key mgmt. infrastructure and associate it with the CMK.
3. Have the key material generated and used in an AWS CloudHSM cluster as part of a custom key store.
3 ways to control permissions
1. Use the key policy.
2. Use an IAM policy in combination with the key policy.
3. Use grants in combination with the key policy.
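A hedged sketch of the 7-day minimum deletion window mentioned above: scheduling a customer managed key for deletion with boto3. The key ID is a placeholder.

```python
# Hypothetical sketch: scheduling a KMS key for deletion with the minimum waiting period.
import boto3

kms = boto3.client("kms", region_name="us-east-1")

kms.schedule_key_deletion(
    KeyId="1234abcd-12ab-34cd-56ef-1234567890ab",   # assumed key ID
    PendingWindowInDays=7,                          # minimum is 7 days, maximum is 30
)
```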
Q, A company stores confidential data in an Amazon Aurora PostgreSQL database in the ap-southeast-3 Region. The
database is encrypted with an AWS Key Management Service (AWS KMS) customer managed key. The company was
recently acquired and must securely share a backup of the database with the acquiring company's AWS account in ap-southeast-3.
What should a solutions architect do to meet these requirements?
A. Create a database snapshot. Copy the snapshot to a new unencrypted snapshot. Share the new snapshot with the
acquiring company’s AWS account.
B. Create a database snapshot. Add the acquiring company’s AWS account to the KMS key policy. Share the snapshot with
the acquiring company’s AWS account.
C. Create a database snapshot that uses a different AWS managed KMS key. Add the acquiring company’s AWS account to
the KMS key alias. Share the snapshot with the acquiring company's AWS account.
D. Create a database snapshot. Download the database snapshot. Upload the database snapshot to an Amazon S3 bucket.
Update the S3 bucket policy to allow access from the acquiring company’s AWS account.
A,
To securely share a backup of the database with the acquiring company's AWS account in the same Region, a solutions
architect should create a database snapshot, add the acquiring company's AWS account to the AWS KMS key policy, and
share the snapshot with the acquiring company's AWS account.
Option A, creating an unencrypted snapshot, is not recommended as it will compromise the confidentiality of the data.
Option C, creating a snapshot that uses a different AWS managed KMS key, does not provide any additional security and will
unnecessarily complicate the solution. Option D, downloading the database snapshot and uploading it to an S3 bucket, is
not secure as it can expose the data during transit.
Therefore, the correct option is B: Create a database snapshot. Add the acquiring company's AWS account to the KMS key
policy. Share the snapshot with the acquiring company's AWS account.
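A hedged sketch of the sharing half of answer B: granting another account restore access to an encrypted Aurora cluster snapshot. The snapshot identifier and account ID are placeholders, and the acquiring account must also be allowed to use the customer managed KMS key in the key policy before it can copy or restore the snapshot.

```python
# Hypothetical sketch: share an encrypted Aurora cluster snapshot with another account.
import boto3

rds = boto3.client("rds", region_name="ap-southeast-3")

rds.modify_db_cluster_snapshot_attribute(
    DBClusterSnapshotIdentifier="confidential-db-snapshot",   # assumed snapshot ID
    AttributeName="restore",
    ValuesToAdd=["111122223333"],                             # acquiring company's account ID (assumed)
)
```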
Q,
A company is using AWS Key Management Service (AWS KMS) keys to encrypt AWS Lambda environment variables. A
solutions architect needs to ensure that the required permissions are in place to decrypt and use the environment
variables.
Which steps must the solutions architect take to implement the correct permissions? (Choose two.)
A. Add AWS KMS permissions in the Lambda resource policy.
B. Add AWS KMS permissions in the Lambda execution role.
C. Add AWS KMS permissions in the Lambda function policy.
D. Allow the Lambda execution role in the AWS KMS key policy.
E. Allow the Lambda resource policy in the AWS KMS key policy
A,
To decrypt environment variables encrypted with AWS KMS, Lambda needs to be granted permissions to call KMS APIs. This
is done in two places:
The Lambda execution role needs kms:Decrypt and kms:GenerateDataKey permissions added. The execution role governs
what AWS services the function code can access.
The KMS key policy needs to allow the Lambda execution role to have kms:Decrypt and kms:GenerateDataKey permissions
for that specific key. This allows the execution role to use that particular key. - BD
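A hedged sketch of the execution-role half of that answer: the policy statement that could be added to the Lambda execution role so it can decrypt the environment variables. The key ARN is a placeholder, and the KMS key policy must also allow this role.

```python
# Hypothetical sketch: IAM policy statement for the Lambda execution role (KMS decrypt).
import json

execution_role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
            "Resource": "arn:aws:kms:us-east-1:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab",  # assumed key ARN
        }
    ],
}

print(json.dumps(execution_role_policy, indent=2))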
More Questions:
-
https://www.examtopics.com/discussions/amazon/view/85942-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/111385-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/102187-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/102135-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/99702-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/99705-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/95325-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/100232-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/87582-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/84747-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/102125-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/109398-exam-aws-certified-solutions-architectassociate-saa-c03/
https://www.examtopics.com/discussions/amazon/view/99790-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/99790-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/exams/amazon/aws-certified-solutions-architect-associate-saa-c03/view/22/
https://www.examtopics.com/discussions/amazon/view/85993-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/95040-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/85941-exam-aws-certified-solutions-architect-associatesaa-c03/
https://www.examtopics.com/discussions/amazon/view/111246-exam-aws-certified-solutions-architectassociate-saa-c03/
^This needs revisiting
AWS Secrets Manager
Securely stores, encrypts and rotates DB credentials & other secrets.
- Encryption in transit and at rest using KMS.
- Auto-rotates credentials.
- Apply fine-grained access control using IAM.
- Costs money but is highly scalable.
The app makes an API call to Secrets Manager to retrieve the secret programmatically, which reduces the risk of credentials being exposed (they are not hard-coded in).
Works for RDS creds, non-RDS creds, SSH keys, API keys.
Exam Topic – auto-rotation: when you enable automatic rotation, Secrets Manager immediately rotates the credentials once to test that it works – ensure all apps that use the
creds have been updated to retrieve them from Secrets Manager first (not hard-coded in).
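A hedged sketch of the programmatic retrieval described above: an app fetching database credentials from Secrets Manager at runtime instead of hard-coding them. The secret name and its JSON fields are assumptions.

```python
# Hypothetical sketch: retrieving DB credentials from Secrets Manager at runtime.
import json
import boto3

secrets = boto3.client("secretsmanager", region_name="us-east-1")

secret = secrets.get_secret_value(SecretId="prod/app/db-credentials")   # assumed secret name
credentials = json.loads(secret["SecretString"])                        # e.g. {"username": "...", "password": "..."}
```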
Parameter Store
A capability of AWS Systems Manager that provides secure, hierarchical storage for configuration data management and secrets management.
- Can store passwords, DB connection strings, AMI IDs, license codes, etc. (as plain text or encrypted).
SM or PS?
- PS is free.
- If you need key rotation, more than 10k parameters, or password generation with CloudFormation – use Secrets Manager.
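A hedged sketch of reading a SecureString parameter from Parameter Store and decrypting it with the associated KMS key. The parameter name is an assumption.

```python
# Hypothetical sketch: reading a SecureString parameter from Parameter Store.
import boto3

ssm = boto3.client("ssm", region_name="us-east-1")

param = ssm.get_parameter(Name="/prod/app/db-connection-string", WithDecryption=True)   # assumed name
connection_string = param["Parameter"]["Value"]
```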
IAM & KMS
https://docs.aws.amazon.com/kms/latest/developerguide/iam-policies.html
Presigned URLs
S3 objects are private by default, and only the object owner has access by default. The owner can optionally share objects with others
by using a presigned URL that points directly to S3; they use their own security credentials to grant time-limited permissions to download
the objects.
To generate a presigned URL you must provide:
- Your security credentials
- Bucket name / object key
- The HTTP method
- Expiration date & time
Exam Scenario: if you want to share a file in a private bucket, use a presigned URL. If you want to share multiple files, use presigned cookies,
which will reside on the user's PC.
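A hedged sketch of generating a download presigned URL with boto3. The bucket and key names are assumptions; the caller's own credentials are used to sign the URL.

```python
# Hypothetical sketch: time-limited presigned URL to download a private S3 object.
import boto3

s3 = boto3.client("s3")

url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-private-bucket", "Key": "reports/q1.pdf"},   # assumed names
    ExpiresIn=3600,   # URL expires after one hour
)
print(url)
```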
Question:
A social media company is building a feature for its website. The feature will give users the ability to upload photos. The
company expects significant increases in demand during large events and must ensure that the website can handle the
upload traffic from users.
Which solution meets these requirements with the MOST scalability?
A. Upload files from the user's browser to the application servers. Transfer the files to an Amazon S3 bucket.
B. Provision an AWS Storage Gateway file gateway. Upload files directly from the user's browser to the file gateway.
C. Generate Amazon S3 presigned URLs in the application. Upload files directly from the user's browser into an S3 bucket.
D. Provision an Amazon Elastic File System (Amazon EFS) file system. Upload files directly from the user's browser to the file
system.
Answer:
C is the best solution to meet the scalability requirements.
Generating S3 presigned URLs allows users to upload directly to S3 instead of application servers. This removes the
application servers as a bottleneck for upload traffic.
S3 can scale to handle very high volumes of uploads with no limits on storage or throughput. Using presigned URLs
leverages this scalability.
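A hedged sketch of answer C's upload path: the application generates a presigned PUT URL so the browser uploads directly to S3 and bypasses the application servers. Bucket and key are assumptions.

```python
# Hypothetical sketch: presigned PUT URL for direct browser-to-S3 upload.
import boto3

s3 = boto3.client("s3")

upload_url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "example-photo-uploads", "Key": "uploads/user-123/photo.jpg"},   # assumed names
    ExpiresIn=300,   # the browser must start the upload within 5 minutes
)
# The browser then issues: PUT <upload_url> with the photo bytes as the request body.
```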
Question:
A media company uses Amazon CloudFront for its publicly available streaming video content. The company wants to secure
the video content that is hosted in Amazon S3 by controlling who has access. Some of the company’s users are using a
custom HTTP client that does not support cookies. Some of the company’s users are unable to change the hardcoded URLs
that they are using for access.
Which services or methods will meet these requirements with the LEAST impact to the users? (Choose two.)
A. Signed cookies
B. Signed URLs
C. AWS AppSync
D. JSON Web Token (JWT)
E. AWS Secrets Manager
Solution:
I thought that option A was totally wrong, because the question mentions "HTTP client does not support cookies". However
it is right, along with option B. Check the link bellow, first paragraph.
https://aws.amazon.com/blogs/media/secure-content-using-cloudfront-functions/
IAM Policies
A JSON doc that defines permissions.
- Identity policy – attached to users, groups and roles.
- Resource policy – applied to resources such as S3 buckets or KMS CMKs.
- Policies have no effect until attached.
ARN – Amazon Resource Name, uniquely identify AWS resources:
arn:partition:service:region:account-id:resource-type:resource-id
Examples
IAM user: arn:aws:iam::123456789012:user/johndoe
SNS topic: arn:aws:sns:us-east-1:123456789012:example-sns-topic-name
VPC: arn:aws:ec2:us-east-1:123456789012:vpc/vpc-0e9801d129EXAMPLE
Figure 1: a policy doc is a list of statements; each statement matches an API request.
- Effect is either allow / deny.
- Requests are matched based on their Action.
- Resource is the resource the action is performed against.
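A hedged sketch of the statement structure above: a small least-privilege identity policy attached inline to an IAM group. The group, policy, and bucket names are assumptions.

```python
# Hypothetical sketch: attach a least-privilege inline policy to an IAM group.
import json
import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::example-bucket",   # assumed bucket
        }
    ],
}

iam.put_group_policy(
    GroupName="analysts",                       # assumed group
    PolicyName="list-example-bucket",
    PolicyDocument=json.dumps(policy_document),
)
```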
Permission Boundaries
Used to delegate admin to other users; limits the permissions a user can grant to various resources – controls the maximum
permissions an IAM policy can grant.
To have a play: https://awspolicygen.s3.amazonaws.com/policygen.html
IAM Policies vs IAM Roles
An AWS IAM policy regulates access to AWS resources to help ensure that only authorized users have access to specific
digital assets. Permissions defined within a policy either allow or deny access for the user to perform an action on a specific
resource.
IAM identities provide access to resources under specific conditions within an AWS account. The purpose of an identity is
not only to provide access to resources, but to ensure that users are authenticated and authorized so that your digital
resources can be properly managed and remain secure. Identities come in three varieties: users, groups, and roles.
Once a role is created, it can be assigned to as many individuals as needed. This makes roles particularly useful when
assigning permissions to new users or changing permissions to users who have shifted jobs within their organization.
The difference between IAM roles and policies in AWS is that a role is a type of IAM identity that can be authenticated and
authorized to utilize an AWS resource, whereas a policy defines the permissions of the IAM identity.
Apply least-privilege permissions
When you set permissions with IAM policies, grant only the permissions required to perform a task. You do this by defining
the actions that can be taken on specific resources under specific conditions, also known as least-privilege permissions. You
might start with broad permissions while you explore the permissions that are required for your workload or use case. As
your use case matures, you can work to reduce the permissions that you grant to work toward least privilege. For more
information about using IAM to apply permissions, see Policies and permissions in IAM.
Exam Tips:
Not explicitly allowed == implicitly denied.
Explicit deny > everything else
Only attached policies have an effect
Question
A solutions architect has created two IAM policies: Policy1 and Policy2. Both policies are attached to an IAM group.
A cloud engineer is added as an IAM user to the IAM group. Which action will the cloud engineer be able to perform?
A. Deleting IAM users
B. Deleting directories
C. Deleting Amazon EC2 instances
D. Deleting logs from Amazon CloudWatch Logs
Solution:
ec2:* Allows full control of EC2 instances, so C is correct
The policy only grants get and list permission on IAM users, so not A
ds:Delete deny denies delete-directory, so not B, see
https://awscli.amazonaws.com/v2/documentation/api/latest/reference/ds/index.html
The policy only grants get and describe permission on logs, so not D
Question:
A company is expecting rapid growth in the near future. A solutions architect needs to configure existing users and grant
permissions to new users on AWS. The solutions architect has decided to create IAM groups. The solutions architect will
add the new users to IAM groups based on department.
Which additional action is the MOST secure way to grant permissions to the new users?
A. Apply service control policies (SCPs) to manage access permissions
B. Create IAM roles that have least privilege permission. Attach the roles to the IAM groups
C. Create an IAM policy that grants least privilege permission. Attach the policy to the IAM groups
D. Create IAM roles. Associate the roles with a permissions boundary that defines the maximum permissions
Solution:
An IAM policy is an object in AWS that, when associated with an identity or resource, defines their permissions. Permissions
in the policies determine whether a request is allowed or denied. You manage access in AWS by creating policies and
attaching them to IAM identities (users, groups of users, or roles) or AWS resources.
The most secure option here is C – create a least-privilege IAM policy and attach it to the IAM groups. Option B is sometimes argued for, but roles are not attached to IAM groups; policies are the mechanism for granting permissions to groups.
Question
A solutions architect needs to securely store a database user name and password that an application uses to access an
Amazon RDS DB instance. The application that accesses the database runs on an Amazon EC2 instance. The solutions
architect wants to create a secure parameter in AWS Systems Manager Parameter Store.
What should the solutions architect do to meet this requirement?
A. Create an IAM role that has read access to the Parameter Store parameter. Allow Decrypt access to an AWS Key
Management Service (AWS KMS) key that is used to encrypt the parameter. Assign this IAM role to the EC2 instance.
B. Create an IAM policy that allows read access to the Parameter Store parameter. Allow Decrypt access to an AWS Key
Management Service (AWS KMS) key that is used to encrypt the parameter. Assign this IAM policy to the EC2 instance.
C. Create an IAM trust relationship between the Parameter Store parameter and the EC2 instance. Specify Amazon RDS as a
principal in the trust policy.
D. Create an IAM trust relationship between the DB instance and the EC2 instance. Specify Systems Manager as a principal
in the trust policy.
Solution
CORRECT Option A
To securely store a database user name and password in AWS Systems Manager Parameter Store and allow an application
running on an EC2 instance to access it, the solutions architect should create an IAM role that has read access to the
Parameter Store parameter and allow Decrypt access to an AWS KMS key that is used to encrypt the parameter. The
solutions architect should then assign this IAM role to the EC2 instance.
This approach allows the EC2 instance to access the parameter in the Parameter Store and decrypt it using the specified
KMS key while enforcing the necessary security controls to ensure that the parameter is only accessible to authorized
parties.
Option B, would not be sufficient, as IAM policies cannot be directly attached to EC2 instances.
Option C, would not be a valid solution, as the Parameter Store parameter and the EC2 instance are not entities that can be
related through an IAM trust relationship.
Option D, would not be a valid solution, as the trust policy would not allow the EC2 instance to access the parameter in the
Parameter Store or decrypt it using the specified KMS key.
AWS Certificate Manager
ACM allows you to create, manage and deploy public/private SSL certificates for use with other AWS services.
Integrates with other services – ELB, CloudFront distributions and API Gateway – allowing you to easily manage and deploy
SSL certificates in your AWS environment.
- It's free.
- Automatic renewal and deployment.
Audit Manager
Continually audit your AWS usage to make sure you stay compliant with industry standards and regs.
Automated service that produces reports specific to auditors for PCI compliance, GDPR and more.
AWS Artifact
Single source you can visit to get compliance related info that matters to you, such as AWS security and compliance reports.
E.g. ISO reports
Normally a distractor
Amazon Cognito (Comes up a lot)
With Amazon Cognito, you can add user sign-up and sign-in features and control access to your web and mobile
applications. Amazon Cognito provides an identity store that scales to millions of users, supports social and enterprise
identity federation, and offers advanced security features to protect your consumers and business. Built on open identity
standards, Amazon Cognito supports various compliance regulations and integrates with frontend and backend
development resources.
Provides authentication, authorisation and user mgmt. for your web and mobile apps in a single service without the need
for custom code. Can use 3rd-party logins like Facebook / Google.
Recommended for all mobile apps.
Question:
A company wants to restrict access to the content of one of its main web applications and to protect the content by using
authorization techniques available on AWS. The company wants to implement a serverless architecture and an
authentication solution for fewer than 100 users. The solution needs to integrate with the main web application and serve
web content globally. The solution must also scale as the company's user base grows while providing the lowest login
latency possible.
Which solution will meet these requirements MOST cost-effectively?
A. Use Amazon Cognito for authentication. Use Lambda@Edge for authorization. Use Amazon CloudFront to serve the web
application globally.
B. Use AWS Directory Service for Microsoft Active Directory for authentication. Use AWS Lambda for authorization. Use an
Application Load Balancer to serve the web application globally.
C. Use Amazon Cognito for authentication. Use AWS Lambda for authorization. Use Amazon S3 Transfer Acceleration to
serve the web application globally.
D. Use AWS Directory Service for Microsoft Active Directory for authentication. Use Lambda@Edge for authorization. Use
AWS Elastic Beanstalk to serve the web application globally.
Solution
Amazon Cognito is a serverless authentication service that can be used to easily add user sign-up and authentication to web
and mobile apps. It is a good choice for this scenario because it is scalable and can handle a small number of users without
any additional costs.
Lambda@Edge is a serverless compute service that can be used to run code at the edge of the AWS network. It is a good
choice for this scenario because it can be used to perform authorization checks at the edge, which can improve the login
latency.
Amazon CloudFront is a content delivery network (CDN) that can be used to serve web content globally. It is a good choice
for this scenario because it can cache web content closer to users, which can improve the performance of the web
application.
User pools – User directories that provide sign-up/ sign-in options for users of your app.
An Amazon Cognito user pool is a user directory for web and mobile app authentication and authorization. From the
perspective of your app, an Amazon Cognito user pool is an OpenID Connect (OIDC) identity provider (IdP). A user pool
adds layers of additional features for security, identity federation, app integration, and customization of the user
experience.
You can, for example, verify that your users’ sessions are from trusted sources. You can combine the Amazon Cognito
directory with an external identity provider. With your preferred AWS SDK, you can choose the API authorization model that
works best for your app. And you can add AWS Lambda functions that modify or overhaul the default behavior of Amazon
Cognito.
Identity pools – Allows your users to access other AWS services
An Amazon Cognito identity pool is a directory of federated identities that you can exchange for AWS credentials. Identity
pools generate temporary AWS credentials for the users of your app, whether they’ve signed in or you haven’t identified
them yet. With AWS Identity and Access Management (IAM) roles and policies, you can choose the level of permission that
you want to grant to your users. Users can start out as guests and retrieve assets that you keep in AWS services.
Question
company is hosting a web application from an Amazon S3 bucket. The application uses Amazon Cognito as an identity
provider to authenticate users and return a JSON Web Token (JWT) that provides access to protected resources that are
stored in another S3 bucket.
Upon deployment of the application, users report errors and are unable to access the protected content. A solutions
architect must resolve this issue by providing proper permissions so that users can access the protected content.
Which solution meets these requirements?
A. Update the Amazon Cognito identity pool to assume the proper IAM role for access to the protected content.
B. Update the S3 ACL to allow the application to access the protected content.
C. Redeploy the application to Amazon S3 to prevent eventually consistent reads in the S3 bucket from affecting the ability
of users to access the protected content.
D. Update the Amazon Cognito pool to use custom attribute mappings within the identity pool and grant users the proper
permissions to access the protected content.
Solution
To resolve the issue and provide proper permissions for users to access the protected content, the recommended solution
is:
A. Update the Amazon Cognito identity pool to assume the proper IAM role for access to the protected content.
Explanation:
Amazon Cognito provides authentication and user management services for web and mobile applications.
In this scenario, the application is using Amazon Cognito as an identity provider to authenticate users and obtain JSON Web
Tokens (JWTs).
The JWTs are used to access protected resources stored in another S3 bucket.
To grant users access to the protected content, the proper IAM role needs to be assumed by the identity pool in Amazon
Cognito.
By updating the Amazon Cognito identity pool with the appropriate IAM role, users will be authorized to access the
protected content in the S3 bucket.
Option D is incorrect because updating custom attribute mappings in Amazon Cognito will not directly grant users the
proper permissions to access the protected content.
Question
A company hosts its application on AWS. The company uses Amazon Cognito to manage users. When users log in to the
application, the application fetches required data from Amazon DynamoDB by using a REST API that is hosted in Amazon
API Gateway. The company wants an AWS managed solution that will control access to the REST API to reduce development
efforts.
Which solution will meet these requirements with the LEAST operational overhead?
A. Configure an AWS Lambda function to be an authorizer in API Gateway to validate which user made the request.
B. For each user, create and assign an API key that must be sent with each request. Validate the key by using an AWS
Lambda function.
C. Send the user’s email address in the header with every request. Invoke an AWS Lambda function to validate that the user
with that email address has proper access.
D. Configure an Amazon Cognito user pool authorizer in API Gateway to allow Amazon Cognito to validate each request.
Solution
Option D is the best solution with the least operational overhead:
Configure an Amazon Cognito user pool authorizer in API Gateway to allow Amazon Cognito to validate each request.
The key reasons are:
º Cognito user pool authorizers allow seamless integration between Cognito and API Gateway for access control.
º API Gateway handles validating the access tokens from Cognito automatically without any custom code.
º This is a fully managed solution with minimal ops overhead.
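A rough sketch of how this could be wired up with boto3 is below; the REST API ID, resource ID and user pool ARN are placeholders, and the same configuration can equally be done in the console or CloudFormation.

```python
import boto3

# Sketch: attach a Cognito user pool authorizer to an existing REST API in
# API Gateway. The IDs and the user pool ARN below are placeholders.
apigw = boto3.client("apigateway", region_name="eu-west-1")

authorizer = apigw.create_authorizer(
    restApiId="abc123def4",                      # placeholder REST API ID
    name="cognito-user-pool-authorizer",
    type="COGNITO_USER_POOLS",
    providerARNs=[
        "arn:aws:cognito-idp:eu-west-1:111122223333:userpool/eu-west-1_EXAMPLE"
    ],
    identitySource="method.request.header.Authorization",
)

# Each protected method then references the authorizer, so API Gateway
# validates the Cognito token before invoking the backend - no custom code.
apigw.update_method(
    restApiId="abc123def4",
    resourceId="res456",                         # placeholder resource ID
    httpMethod="GET",
    patchOperations=[
        {"op": "replace", "path": "/authorizationType", "value": "COGNITO_USER_POOLS"},
        {"op": "replace", "path": "/authorizerId", "value": authorizer["id"]},
    ],
)
```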
Amazon Detective
Operates across multiple AWS services – analyse, investigate and quickly identify the root cause of potential security
issues or suspicious activities.
Detective pulls data in from AWS resources and uses ML to help you find the root cause of security issues.
Data sources: VPC Flow Logs, CloudTrail, EKS audit logs.
Exam Tip:
Usually a distractor – just know what it does and look out for the words 'root cause'.
Not to be confused with Amazon Inspector – an automated vulnerability management service that scans EC2 instances and container
workloads for software vulnerabilities and unintended network exposure.
AWS Network Firewall
Managed service to deploy network firewall protection across your VPCs, with the firewall infrastructure managed by AWS.
Exam Tip: Qs about filtering your network traffic before it reaches your internet gateway , or if you require Intrusion
prevention systems or hardware firewall requirements.
Question:
A company recently migrated to AWS and wants to implement a solution to protect the traffic that flows in and out of
the production VPC. The company had an inspection server in its on-premises data center. The inspection server
performed specific operations such as traffic flow inspection and traffic filtering. The company wants to have the same
functionalities in the AWS Cloud.
Which solution will meet these requirements?
A. Use Amazon GuardDuty for traffic inspection and traffic filtering in the production VPC.
B. Use Traffic Mirroring to mirror traffic from the production VPC for traffic inspection and filtering.
C. Use AWS Network Firewall to create the required rules for traffic inspection and traffic filtering for the production VPC.
D. Use AWS Firewall Manager to create the required rules for traffic inspection and traffic filtering for the production
VPC.
Solution:
AWS Network Firewall is a managed firewall service that provides filtering for both inbound and outbound network traffic.
It allows you to create rules for traffic inspection and filtering, which can help protect your production VPC.
- AWS Network Firewall is a managed network security service that provides stateful inspection of traffic and allows you to
define firewall rules to control the traffic flow in and out of your VPC.
- With AWS Network Firewall, you can create custom rule groups to define specific operations for traffic inspection and
filtering.
- It can perform deep packet inspection and filtering at the network level to enforce security policies, block malicious traffic,
and allow or deny traffic based on defined rules.
- By integrating AWS Network Firewall with the production VPC, you can achieve similar functionalities as the on-premises
inspection server, performing traffic flow inspection and filtering.
AWS Security Hub
Single place to view all security alerts from services like GuardDuty, Inspector, Macie and Firewall Manager.
AWS Network Firewall vs AWS Firewall Manager (easy slip up)
NF - A managed service that makes it easy to deploy firewall protection across your VPCs via its managed
infrastructure (i.e., firewall appliances that are managed by AWS, covering "hardware firewall" style requirements).
FM - A security management service that allows you to centrally configure and manage firewall rules across your accounts
and applications.
Chapter 17 – Automation
Manual steps – lead to errors.
Always automate over manual
CloudFormation
Infrastructure as code, supports JSON or YAML.
- Parameters – values the user is asked for when the template is launched.
- Mappings – lookup tables of values (e.g. per-Region AMI IDs) that the template fills in by itself.
- Resources – where all the resources to create are declared.
Cross-region issues – often caused by hard-coded IDs; AMI IDs from us-east-1 won't work in eu-west-1 (see the template sketch below).
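A minimal sketch of what such a template might look like, launched with boto3, is below. The AMI IDs, stack name and instance type are placeholders; the RegionMap illustrates how to avoid the hard-coded-AMI cross-region problem.

```python
import boto3

# Sketch: a minimal template showing the Parameters, Mappings and Resources
# sections, with a per-Region AMI mapping so the same template works in more
# than one Region. The AMI IDs are placeholders.
TEMPLATE = """
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  InstanceType:
    Type: String
    Default: t3.micro
Mappings:
  RegionMap:
    us-east-1:
      AMI: ami-01111111111111111
    eu-west-1:
      AMI: ami-02222222222222222
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
      ImageId: !FindInMap [RegionMap, !Ref 'AWS::Region', AMI]
"""

# Launch the stack in eu-west-1; the mapping picks the matching AMI.
cfn = boto3.client("cloudformation", region_name="eu-west-1")
cfn.create_stack(
    StackName="web-demo",
    TemplateBody=TEMPLATE,
    Parameters=[{"ParameterKey": "InstanceType", "ParameterValue": "t3.micro"}],
)
```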
A company’s application is having performance issues. The application is stateful and needs to complete in-memory tasks
on Amazon EC2 instances. The company used AWS CloudFormation to deploy infrastructure and used the M5 EC2 instance
family. As traffic increased, the application performance degraded. Users are reporting delays when the users attempt to
access the application.
Which solution will resolve these issues in the MOST operationally efficient way?
A. Replace the EC2 instances with T3 EC2 instances that run in an Auto Scaling group. Make the changes by using the AWS
Management Console.
B. Modify the CloudFormation templates to run the EC2 instances in an Auto Scaling group. Increase the desired capacity
and the maximum capacity of the Auto Scaling group manually when an increase is necessary.
C. Modify the CloudFormation templates. Replace the EC2 instances with R5 EC2 instances. Use Amazon CloudWatch built-in EC2 memory metrics to track the application performance for future capacity planning.
D. Modify the CloudFormation templates. Replace the EC2 instances with R5 EC2 instances. Deploy the Amazon
CloudWatch agent on the EC2 instances to generate custom application latency metrics for future capacity planning.
Answer:
D. R5 instances are better optimized for the in-memory workload than M5.
Auto Scaling alone doesn't handle stateful applications well; manual capacity adjustments would still be needed.
Custom latency metrics give better visibility than built-in metrics for capacity planning.
"In-memory tasks" => you need the "R" (memory-optimized) EC2 instance family, so the choice is between C and D.
EC2 instances don't publish memory metrics to CloudWatch by default, so the CloudWatch agent has to be installed to
achieve this – hence D.
Question:
A solutions architect is using an AWS CloudFormation template to deploy a three-tier web application. The web application
consists of a web tier and an application tier that stores and retrieves user data in Amazon DynamoDB tables. The web and
application tiers are hosted on Amazon EC2 instances, and the database tier is not publicly accessible. The application EC2
instances need to access the DynamoDB tables without exposing API credentials in the template.
What should the solutions architect do to meet these requirements?
A. Create an IAM role to read the DynamoDB tables. Associate the role with the application instances by referencing an
instance profile.
B. Create an IAM role that has the required permissions to read and write from the DynamoDB tables. Add the role to the
EC2 instance profile, and associate the instance profile with the application instances.
C. Use the parameter section in the AWS CloudFormation template to have the user input access and secret keys from an
already-created IAM user that has the required permissions to read and write from the DynamoDB tables.
D. Create an IAM user in the AWS CloudFormation template that has the required permissions to read and write from the
DynamoDB tables. Use the GetAtt function to retrieve the access and secret keys, and pass them to the application
instances through the user data.
Answer:
Option B is the correct approach to meet the requirements:
Create an IAM role with permissions to access DynamoDB
Add the IAM role to an EC2 Instance Profile
Associate the Instance Profile with the application EC2 instances
This allows the instances to assume the IAM role to obtain temporary credentials to access DynamoDB.
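As an illustrative sketch only (the exam answer itself is about the CloudFormation template), the same role and instance-profile wiring looks roughly like this in boto3; the names, managed policy and instance ID are placeholders.

```python
import json
import boto3

iam = boto3.client("iam")
ec2 = boto3.client("ec2", region_name="eu-west-1")

# Sketch: create a role EC2 can assume, grant DynamoDB access, wrap it in an
# instance profile, and attach the profile to an application instance.
assume_role_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="app-dynamodb-role",
    AssumeRolePolicyDocument=json.dumps(assume_role_policy),
)
# Placeholder policy; a least-privilege custom policy is preferable in practice.
iam.attach_role_policy(
    RoleName="app-dynamodb-role",
    PolicyArn="arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess",
)

iam.create_instance_profile(InstanceProfileName="app-dynamodb-profile")
iam.add_role_to_instance_profile(
    InstanceProfileName="app-dynamodb-profile",
    RoleName="app-dynamodb-role",
)

# The instance then receives temporary credentials via the instance metadata
# service, so no access keys ever appear in the template or user data.
ec2.associate_iam_instance_profile(
    IamInstanceProfile={"Name": "app-dynamodb-profile"},
    InstanceId="i-0123456789abcdef0",   # placeholder instance ID
)
```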
Question:
A company runs an application on Amazon EC2 instances. The company needs to implement a disaster recovery (DR)
solution for the application. The DR solution needs to have a recovery time objective (RTO) of less than 4 hours. The DR
solution also needs to use the fewest possible AWS resources during normal operations.
Which solution will meet these requirements in the MOST operationally efficient way?
A. Create Amazon Machine Images (AMIs) to back up the EC2 instances. Copy the AMIs to a secondary AWS Region.
Automate infrastructure deployment in the secondary Region by using AWS Lambda and custom scripts.
B. Create Amazon Machine Images (AMIs) to back up the EC2 instances. Copy the AMIs to a secondary AWS Region.
Automate infrastructure deployment in the secondary Region by using AWS CloudFormation.
C. Launch EC2 instances in a secondary AWS Region. Keep the EC2 instances in the secondary Region active at all times.
D. Launch EC2 instances in a secondary Availability Zone. Keep the EC2 instances in the secondary Availability Zone active at
all times.
Answer:
Option A: Add complexity and management overhead.
Option B: Creating AMIs for backup and using AWS CloudFormation for infrastructure deployment in the secondary Region
is a more streamlined and automated approach. CloudFormation allows you to define and provision resources in a
declarative manner, making it easier to maintain and update your infrastructure. This solution is more operationally
efficient compared to Option A.
Option C: could be expensive and not fully aligned with the requirement of using the fewest possible AWS resources during
normal operations.
Option D: might not be sufficient for meeting the DR requirements, as Availability Zones are still within the same AWS
Region and might be subject to the same regional-level failures.
Question:
A company is making a prototype of the infrastructure for its new website by manually provisioning the necessary
infrastructure. This infrastructure includes an Auto Scaling group, an Application Load Balancer and an Amazon RDS
database. After the configuration has been thoroughly validated, the company wants the capability to immediately deploy
the infrastructure for development and production use in two Availability Zones in an automated fashion.
What should a solutions architect recommend to meet these requirements?
A. Use AWS Systems Manager to replicate and provision the prototype infrastructure in two Availability Zones
B. Define the infrastructure as a template by using the prototype infrastructure as a guide. Deploy the infrastructure with
AWS CloudFormation.
C. Use AWS Config to record the inventory of resources that are used in the prototype infrastructure. Use AWS Config to
deploy the prototype infrastructure into two Availability Zones.
D. Use AWS Elastic Beanstalk and configure it to use an automated reference to the prototype infrastructure to
automatically deploy new environments in two Availability Zones.
Answer:
Just think: Infrastructure as Code == CloudFormation.
Elastic Beanstalk
Platform as a Service (PaaS) – you write the code and the provider builds, deploys and manages your app.
- Speed, automation, ease of use – use Beanstalk.
- NOT serverless – uses standard EC2 infrastructure.
Systems Manager
Free suite of tools designed to let you view, control and automate both your AWS architecture and on-prem resources.
- Automation Docs / Runbooks: can be used to control your instances or AWS resources. Automation, a capability of AWS Systems Manager, simplifies common maintenance, deployment, and remediation tasks for AWS services like Amazon EC2, Amazon RDS, Amazon Redshift and Amazon S3.
- Automate processes such as patching and resource changes across AWS, on premises, and other clouds. Quickly diagnose and remediate operational issues before they affect users.
- Parameter Store – securely store secret values.
- Hybrid Activations – control on-prem architecture using Systems Manager.
Q, What are common use cases for the AWS Systems Manager Parameter Store "SecureString" parameter?
A, AWS recommends SecureString parameters if you want to use data/parameters across AWS services without exposing
the values as plaintext in commands, functions, agent logs, or CloudTrail logs. AWS Systems Manager Parameter Store.
A SecureString parameter is any sensitive data that needs to be stored and referenced in a secure manner. If you have data
that you don't want users to alter or reference in plaintext, such as passwords or license keys, create those parameters
using the SecureString data type. Only the value of a SecureString parameter is encrypted. Parameter names, descriptions,
and other properties aren't encrypted. Reference Documentation: AWS Systems Manager Parameter Store
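A small sketch of storing and reading a SecureString with boto3; the parameter name and value are placeholders, and encryption uses the default AWS-managed KMS key unless a KeyId is supplied.

```python
import boto3

ssm = boto3.client("ssm", region_name="eu-west-1")

# Store a secret as a SecureString so the value never appears in plaintext
# in commands, logs, or CloudTrail.
ssm.put_parameter(
    Name="/myapp/db/password",          # placeholder parameter name
    Value="S3cr3t-value",               # placeholder secret
    Type="SecureString",
    Overwrite=True,
)

# Read it back; without WithDecryption=True the value stays encrypted.
param = ssm.get_parameter(
    Name="/myapp/db/password",
    WithDecryption=True,
)
print(param["Parameter"]["Value"])
```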
Chapter 18 - Caching
Cloudfront
CDN – content delivery network; distributes static content to edge locations – reduces latency.
Allows HTTP connections by default, and you can use a custom SSL certificate instead of the default CloudFront certificate.
Want to front CloudFront with WAF to protect it.
Commonly sits in front of S3 buckets.
CloudFront signed cookies are useful when you want to grant access to multiple files.
CloudFront Signed URLs are commonly used to distribute paid content through dynamically generated signed URLs.
Question
A company has a static website that is hosted on Amazon CloudFront in front of Amazon S3. The static website uses a
database backend. The company notices that the website does not reflect updates that have been made in the website’s
Git repository. The company checks the continuous integration and continuous delivery (CI/CD) pipeline between the Git
repository and Amazon S3. The company verifies that the webhooks are configured properly and that the CI/CD pipeline is
sending messages that indicate successful deployments.
A solutions architect needs to implement a solution that displays the updates on the website.
Which solution will meet these requirements?
A. Add an Application Load Balancer.
B. Add Amazon ElastiCache for Redis or Memcached to the database layer of the web application.
C. Invalidate the CloudFront cache.
D. Use AWS Certificate Manager (ACM) to validate the website’s SSL certificate.
Solution
Invalidate the CloudFront cache to ensure that the latest updates from the Git repository are reflected on the static
website. When updates are made to the website's Git repository and deployed to Amazon S3, the CloudFront cache may
still be serving the old cached content to users. By invalidating the CloudFront cache, you're instructing CloudFront to fetch
fresh content from the origin (Amazon S3) and serve it to users.
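A minimal sketch of triggering that invalidation with boto3 (the distribution ID and path are placeholders); in a real CI/CD pipeline this call would typically run as the final deployment step.

```python
import time
import boto3

cloudfront = boto3.client("cloudfront")

# Invalidate all cached objects so CloudFront fetches fresh content from the
# S3 origin on the next request. The distribution ID is a placeholder.
cloudfront.create_invalidation(
    DistributionId="E1234EXAMPLE",
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/*"]},
        "CallerReference": str(time.time()),   # must be unique per request
    },
)
```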
Question:
A media company uses an Amazon CloudFront distribution to deliver content over the internet. The company wants only
premium customers to have access to the media streams and file content. The company stores all content in an Amazon S3
bucket. The company also delivers content on demand to customers for a specific purpose, such as movie rentals or music
downloads.
Which solution will meet these requirements?
A. Generate and provide S3 signed cookies to premium customers.
B. Generate and provide CloudFront signed URLs to premium customers.
C. Use origin access control (OAC) to limit the access of non-premium customers.
D. Generate and activate field-level encryption to block non-premium customers.
Answer:
Use CloudFront signed URLs or signed cookies to restrict access to documents, business data, media streams, or content
that is intended for selected users, for example, users who have paid a fee.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/PrivateContent.html#:~:text=CloudFront%20signed%20URLs
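For illustration, a hedged sketch of generating a CloudFront signed URL with botocore's CloudFrontSigner. The key pair ID, private key file and content URL are placeholders, and the private key must correspond to a key group or CloudFront key pair trusted by the distribution.

```python
from datetime import datetime, timedelta

from botocore.signers import CloudFrontSigner
import rsa  # assumes the 'rsa' package; any RSA-SHA1 signer function works

KEY_PAIR_ID = "K2JCJMDEHXQW5F"        # placeholder public key ID
PRIVATE_KEY_FILE = "private_key.pem"  # placeholder path to the private key


def rsa_signer(message):
    # Sign the CloudFront policy with the private key (RSA-SHA1).
    with open(PRIVATE_KEY_FILE, "rb") as f:
        private_key = rsa.PrivateKey.load_pkcs1(f.read())
    return rsa.sign(message, private_key, "SHA-1")


signer = CloudFrontSigner(KEY_PAIR_ID, rsa_signer)

# Generate a URL that expires one hour from now; only premium customers who
# receive this URL can fetch the object through the distribution.
signed_url = signer.generate_presigned_url(
    "https://d111111abcdef8.cloudfront.net/movies/rental.mp4",
    date_less_than=datetime.utcnow() + timedelta(hours=1),
)
print(signed_url)
```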
Question:
A company uses an Amazon CloudFront distribution to serve content pages for its website. The company needs to ensure
that clients use a TLS certificate when accessing the company's website. The company wants to automate the creation and
renewal of the TLS certificates.
Which solution will meet these requirements with the MOST operational efficiency?
A. Use a CloudFront security policy to create a certificate.
B. Use a CloudFront origin access control (OAC) to create a certificate.
C. Use AWS Certificate Manager (ACM) to create a certificate. Use DNS validation for the domain.
D. Use AWS Certificate Manager (ACM) to create a certificate. Use email validation for the domain.
Solution:
The key reasons are:
AWS Certificate Manager (ACM) provides free public TLS/SSL certificates and handles certificate renewals automatically.
Using DNS validation with ACM is operationally efficient since it automatically makes changes to Route 53 rather than
requiring manual validation steps.
ACM integrates natively with CloudFront distributions for delivering HTTPS content.
CloudFront security policies and origin access controls do not issue TLS certificates.
Email validation requires manual steps to approve the domain validation emails for each renewal.
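A short sketch of requesting such a certificate with boto3 (the domain names are placeholders); note that certificates used by CloudFront must be requested in us-east-1, and with DNS validation in Route 53 renewals are fully automatic.

```python
import boto3

# CloudFront only accepts ACM certificates from us-east-1.
acm = boto3.client("acm", region_name="us-east-1")

# Request a public certificate with DNS validation. If the hosted zone is in
# Route 53, ACM can create the validation CNAME for you; otherwise add it
# manually once, and renewals then happen automatically.
response = acm.request_certificate(
    DomainName="www.example.com",                 # placeholder domain
    SubjectAlternativeNames=["example.com"],      # placeholder SAN
    ValidationMethod="DNS",
)
print(response["CertificateArn"])
```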
Question
A global video streaming company uses Amazon CloudFront as a content distribution network (CDN). The company wants
to roll out content in a phased manner across multiple countries. The company needs to ensure that viewers who are
outside the countries to which the company rolls out content are not able to view the content.
Which solution will meet these requirements?
A. Add geographic restrictions to the content in CloudFront by using an allow list. Set up a custom error message.
B. Set up a new URL for restricted content. Authorize access by using a signed URL and cookies. Set up a custom error
message.
C. Encrypt the data for the content that the company distributes. Set up a custom error message.
D. Create a new URL for restricted content. Set up a time-restricted access policy for signed URLs.
Solution
A. Geographic restrictions (geo-blocking) in CloudFront with an allow list of the rollout countries prevent viewers outside those countries from accessing the content, and a custom error message can be returned to blocked viewers.
Question
Organizers for a global event want to put daily reports online as static HTML pages. The pages are expected to generate
millions of views from users around the world. The files are stored in an Amazon S3 bucket. A solutions architect has been
asked to design an efficient and effective solution.
Which action should the solutions architect take to accomplish this?
A. Generate presigned URLs for the files.
B. Use cross-Region replication to all Regions.
C. Use the geoproximity feature of Amazon Route 53.
D. Use Amazon CloudFront with the S3 bucket as its origin.
Solution
The most effective and efficient solution would be Option D (Use Amazon CloudFront with the S3 bucket as its origin.)
Amazon CloudFront is a content delivery network (CDN) that speeds up the delivery of static and dynamic web content,
such as HTML pages, images, and videos. By using CloudFront, the HTML pages will be served to users from the edge
location that is closest to them, resulting in faster delivery and a better user experience. CloudFront can also handle the
high traffic and large number of requests expected for the global event, ensuring that the HTML pages are available and
accessible to users around the world.
Using the S3 bucket as the origin, CloudFront can fetch the files once and cache them globally.
Elasticache
Managed version of the open-source Memcached and Redis engines.
Memcached – simple database caching solution; nothing is permanent (not a source of truth), isn't a DB by itself, no failover, Multi-AZ support or backups.
Redis – can function as a non-relational DB (the next best option if DynamoDB isn't an option); supports failover, Multi-AZ and backups.
Normally Memcached and Redis will sit in front of a relational DB – like RDS.
DynamoDB Accelerator (DAX)
When you want to cache a non-SQL (DynamoDB) solution.
In-memory caching solution.
Issue = ProvisionedThroughputExceededException – use DAX.
DAX vs ElastiCache
DAX for DynamoDB, ElastiCache for other solutions (really excels in front of RDS) – see the sketch below.
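A rough sketch of the "minimal application change" point, assuming the amazondax Python package: only the client construction changes, while the DynamoDB calls stay the same. The cluster endpoint, table and key names are placeholders, and the exact client constructor may differ by SDK version, so treat this as illustrative.

```python
import botocore.session
from amazondax import AmazonDaxClient  # assumes the 'amazondax' package is installed

# DAX exposes the same low-level API as DynamoDB, so the only application
# change is pointing the client at the DAX cluster endpoint (placeholder).
session = botocore.session.get_session()
dax = AmazonDaxClient(
    session,
    region_name="eu-west-1",
    endpoints=["my-dax-cluster.abc123.dax-clusters.eu-west-1.amazonaws.com:8111"],
)

# Reads are served from the in-memory cache when possible (microsecond latency).
item = dax.get_item(
    TableName="Messages",                                  # placeholder table
    Key={"ChatId": {"S": "room-1"}, "MessageId": {"S": "42"}},
)
print(item.get("Item"))
```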
Question
A company has a mobile chat application with a data store based in Amazon DynamoDB. Users would like new messages to
be read with as little latency as possible. A solutions architect needs to design an optimal solution that requires minimal
application changes.
Which method should the solutions architect select?
A. Configure Amazon DynamoDB Accelerator (DAX) for the new messages table. Update the code to use the DAX endpoint.
B. Add DynamoDB read replicas to handle the increased read load. Update the application to point to the read endpoint for
the read replicas.
C. Double the number of read capacity units for the new messages table in DynamoDB. Continue to use the existing
DynamoDB endpoint.
D. Add an Amazon ElastiCache for Redis cache to the application stack. Update the application to point to the Redis cache
endpoint instead of DynamoDB.
Solution
Amazon DynamoDB Accelerator (DAX): DAX is an in-memory cache for DynamoDB that provides low-latency access to
frequently accessed data. By configuring DAX for the new messages table, read requests for the table will be served from
the DAX cache, significantly reducing the latency.
Minimal Application Changes: With DAX, the application code can be updated to use the DAX endpoint instead of the
standard DynamoDB endpoint. This change is relatively minimal and does not require extensive modifications to the
application's data access logic.
Low Latency: DAX caches frequently accessed data in memory, allowing subsequent read requests for the same data to be
served with minimal latency. This ensures that new messages can be read by users with minimal delay.
Option B (Add DynamoDB read replicas) involves creating read replicas to handle the increased read load, but it may not
directly address the requirement of minimizing latency for new message reads.
Question
An entertainment company is using Amazon DynamoDB to store media metadata. The application is read intensive and
experiencing delays. The company does not have staff to handle additional operational overhead and needs to improve the
performance efficiency of DynamoDB without reconfiguring the application.
What should a solutions architect recommend to meet this requirement?
A. Use Amazon ElastiCache for Redis.
B. Use Amazon DynamoDB Accelerator (DAX).
C. Replicate data by using DynamoDB global tables.
D. Use Amazon ElastiCache for Memcached with Auto Discovery enabled.
Solution
A. Using Amazon ElastiCache for Redis would require modifying the application code and is not specifically designed to
enhance DynamoDB performance.
C. Replicating data with DynamoDB global tables would require additional configuration and operational overhead.
D. Using Amazon ElastiCache for Memcached with Auto Discovery enabled would also require application code
modifications and is not specifically designed for improving DynamoDB performance.
In contrast, option B, using Amazon DynamoDB Accelerator (DAX), is the recommended solution as it is purpose-built for
enhancing DynamoDB performance without the need for application reconfiguration. DAX provides a managed caching
layer that significantly reduces read latency and offloads traffic from DynamoDB tables.
Question
A company deployed a serverless application that uses Amazon DynamoDB as a database layer. The application has
experienced a large increase in users. The company wants to improve database response time from milliseconds to
microseconds and to cache requests to the database.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use DynamoDB Accelerator (DAX).
B. Migrate the database to Amazon Redshift.
C. Migrate the database to Amazon RDS.
D. Use Amazon ElastiCache for Redis.
Solution
A. DynamoDB Accelerator (DAX) provides microsecond read latency and caches requests to DynamoDB with the least operational overhead.
Question
A company runs a three-tier web application in the AWS Cloud that operates across three Availability Zones. The application
architecture has an Application Load Balancer, an Amazon EC2 web server that hosts user session states, and a MySQL
database that runs on an EC2 instance. The company expects sudden increases in application traffic. The company wants to
be able to scale to meet future application capacity demands and to ensure high availability across all three Availability
Zones.
Which solution will meet these requirements?
A. Migrate the MySQL database to Amazon RDS for MySQL with a Multi-AZ DB cluster deployment. Use Amazon ElastiCache
for Redis with high availability to store session data and to cache reads. Migrate the web server to an Auto Scaling group
that is in three Availability Zones.
B. Migrate the MySQL database to Amazon RDS for MySQL with a Multi-AZ DB cluster deployment. Use Amazon ElastiCache
for Memcached with high availability to store session data and to cache reads. Migrate the web server to an Auto Scaling
group that is in three Availability Zones.
C. Migrate the MySQL database to Amazon DynamoDB. Use DynamoDB Accelerator (DAX) to cache reads. Store the session
data in DynamoDB. Migrate the web server to an Auto Scaling group that is in three Availability Zones.
D. Migrate the MySQL database to Amazon RDS for MySQL in a single Availability Zone. Use Amazon ElastiCache for Redis
with high availability to store session data and to cache reads. Migrate the web server to an Auto Scaling group that is in
three Availability Zones.
Solution
The key reasons why option A is preferable:
RDS Multi-AZ provides high availability for MySQL by synchronously replicating data across AZs. Automatic failover handles
AZ outages.
ElastiCache for Redis is better suited for session data caching than Memcached. Redis offers more advanced data structures
and flexibility.
Auto scaling across 3 AZs provides high availability for the web tier
Memcached is best suited for caching data, while Redis is better for storing data that needs to be persisted. If you need to
store data that needs to be accessed frequently, such as user profiles, session data, and application settings, then Redis is
the better choice
Global Accelerator
AWS Global Accelerator is a service that uses edge locations to look for the optimal pathway from your users to your
applications.
AWS Global Accelerator is a networking service that helps you improve the availability, performance, and security of your
public applications. Global Accelerator provides two global static public IPs that act as a fixed entry point to your
application endpoints, such as Application Load Balancers, Network Load Balancers, Amazon Elastic Compute Cloud (EC2)
instances, and elastic IPs.
Fixes client IP-caching issues – the two static anycast IPs never change.
Traffic Routing – Route 53, Global Accelerator or CloudFront
The key features of Global Accelerator are use of the AWS backbone network and latency-based Region selection with failover.
In CloudFront, CDN traffic between the edge location and the origin also uses the internal AWS network. Route 53 offers multiple options for geo-based DNS, where the nearest Region is selected automatically, with customisable health checks and failover.
CloudFront is best for videos and static content.
Advantages of an AWS Global Accelerator solution over a CloudFront-based solution:
No need for Route 53 and an Alias record for the domain. Global Accelerator provides 2 dedicated IPs, which can be used with any third-party DNS.
Better for non-HTTP use cases like TCP/UDP.
Faster configuration. Unlike CloudFront configuration changes, Global Accelerator updates propagate almost immediately, so in critical situations changes can be applied faster.
Disadvantages of Global Accelerator over a solution with CloudFront:
There is no caching at the edge locations. Not possible to cache website assets or API responses, or do full-page caching.
Not possible to run Lambda@Edge at the edge locations.
Conclusion
For a domain that has only API endpoints and never needs caching – better to set up AWS Global Accelerator with a multi-regional application behind it.
For a domain with a website, or API endpoints that might need caching – better to go for a solution with CloudFront + Route 53.
Question
A company has implemented a self-managed DNS solution on three Amazon EC2 instances behind a Network Load Balancer
(NLB) in the us-west-2 Region. Most of the company's users are located in the United States and Europe. The company
wants to improve the performance and availability of the solution. The company launches and configures three EC2
instances in the eu-west-1 Region and adds the EC2 instances as targets for a new NLB.
Which solution can the company use to route traffic to all the EC2 instances?
A. Create an Amazon Route 53 geolocation routing policy to route requests to one of the two NLBs. Create an Amazon
CloudFront distribution. Use the Route 53 record as the distribution’s origin.
B. Create a standard accelerator in AWS Global Accelerator. Create endpoint groups in us-west-2 and eu-west-1. Add the
two NLBs as endpoints for the endpoint groups.
C. Attach Elastic IP addresses to the six EC2 instances. Create an Amazon Route 53 geolocation routing policy to route
requests to one of the six EC2 instances. Create an Amazon CloudFront distribution. Use the Route 53 record as the
distribution's origin.
D. Replace the two NLBs with two Application Load Balancers (ALBs). Create an Amazon Route 53 latency routing policy to
route requests to one of the two ALBs. Create an Amazon CloudFront distribution. Use the Route 53 record as the
distribution’s origin.
Solution
B. Create a standard accelerator in AWS Global Accelerator. Create endpoint groups in us-west-2 and eu-west-1. Add the
two NLBs as endpoints for the endpoint groups.
Global Accelerator: AWS Global Accelerator is designed to improve the availability and performance of applications by using
static IP addresses (Anycast IPs) and routing traffic over the AWS global network infrastructure.
Endpoint Groups: By creating endpoint groups in both the us-west-2 and eu-west-1 Regions, the company can effectively
distribute traffic to the NLBs in both Regions. This improves availability and allows traffic to be directed to the closest
Region based on latency.
The right answer is B: the service being hosted is DNS, so it does not make sense to use a CloudFront distribution here. The scenario would be different if the service were HTTP/HTTPS.
AWS Global Accelerator allows routing traffic to endpoints in multiple AWS Regions. It uses the AWS global network to
optimize availability and performance.
Creating an accelerator with endpoint groups in us-west-2 and eu-west-1 allows traffic to be distributed across both
regions.
Adding the NLBs in each region as endpoints allows the traffic to be routed to the EC2 instances behind them.
This provides improved performance and availability compared to just using Route 53 geolocation routing.
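A hedged boto3 sketch of that setup is below; the NLB ARNs are placeholders, and note that the Global Accelerator API endpoint lives in us-west-2 regardless of where your application endpoints are.

```python
import boto3

# Sketch: create a standard accelerator with a UDP listener (port 53 for DNS)
# and one endpoint group per Region, each pointing at that Region's NLB.
ga = boto3.client("globalaccelerator", region_name="us-west-2")

accelerator = ga.create_accelerator(Name="dns-service", Enabled=True)
accelerator_arn = accelerator["Accelerator"]["AcceleratorArn"]

listener = ga.create_listener(
    AcceleratorArn=accelerator_arn,
    Protocol="UDP",
    PortRanges=[{"FromPort": 53, "ToPort": 53}],
)
listener_arn = listener["Listener"]["ListenerArn"]

# Placeholder NLB ARNs for the two Regions.
nlbs = {
    "us-west-2": "arn:aws:elasticloadbalancing:us-west-2:111122223333:loadbalancer/net/dns-usw2/abc",
    "eu-west-1": "arn:aws:elasticloadbalancing:eu-west-1:111122223333:loadbalancer/net/dns-euw1/def",
}

for region, nlb_arn in nlbs.items():
    ga.create_endpoint_group(
        ListenerArn=listener_arn,
        EndpointGroupRegion=region,
        EndpointConfigurations=[{"EndpointId": nlb_arn, "Weight": 128}],
    )
```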
Question:
A company has an online gaming application that has TCP and UDP multiplayer gaming capabilities. The company uses
Amazon Route 53 to point the application traffic to multiple Network Load Balancers (NLBs) in different AWS Regions. The
company needs to improve application performance and decrease latency for the online game in preparation for user
growth.
Which solution will meet these requirements?
A. Add an Amazon CloudFront distribution in front of the NLBs. Increase the Cache-Control max-age parameter.
B. Replace the NLBs with Application Load Balancers (ALBs). Configure Route 53 to use latency-based routing.
C. Add AWS Global Accelerator in front of the NLBs. Configure a Global Accelerator endpoint to use the correct listener
ports.
D. Add an Amazon API Gateway endpoint behind the NLBs. Enable API caching. Override method caching for the different
stages.
Solution
The key considerations are:
The application uses TCP and UDP for multiplayer gaming, so Network Load Balancers (NLBs) are appropriate.
AWS Global Accelerator can be added in front of the NLBs to improve performance and reduce latency by intelligently
routing traffic across AWS Regions and Availability Zones.
Global Accelerator provides static anycast IP addresses that act as a fixed entry point to application endpoints in the
optimal AWS location. This improves availability and reduces latency.
The Global Accelerator endpoint can be configured with the correct NLB listener ports for TCP and UDP.
Question
A company is designing the network for an online multi-player game. The game uses the UDP networking protocol and will
be deployed in eight AWS Regions. The network architecture needs to minimize latency and packet loss to give end users a
high-quality gaming experience.
Which solution will meet these requirements?
A. Setup a transit gateway in each Region. Create inter-Region peering attachments between each transit gateway.
B. Set up AWS Global Accelerator with UDP listeners and endpoint groups in each Region.
C. Set up Amazon CloudFront with UDP turned on. Configure an origin in each Region.
D. Set up a VPC peering mesh between each Region. Turn on UDP for each VPC.
Solution
AWS Global Accelerator = TCP/UDP minimize latency
Chapter 19 – Managing Accounts and Organisations
AWS Organisations
Governance tool that allows you to create and manage multiple AWS accounts from one place.
Saves replicating resources – saving money.
Question.
A large company with hundreds of AWS accounts has a newly established centralized internal process for purchasing new or
modifying existing Reserved
Instances. This process requires all business units that want to purchase or modify Reserved Instances to submit requests to
a dedicated team for procurement or execution. Previously, business units would directly purchase or modify Reserved
Instances in their own respective AWS accounts autonomously.
Which combination of steps should be taken to proactively enforce the new process in the MOST secure way possible?
(Choose two.)
1. Ensure all AWS accounts are part of an AWS Organizations structure operating in all features mode.
2. Use AWS Config to report on the attachment of an IAM policy that denies access to the
ec2:PurchaseReservedInstancesOffering and ec2:ModifyReservedInstances actions.
3. In each AWS account, create an IAM policy with a DENY rule to the ec2:PurchaseReservedInstancesOffering and
ec2:ModifyReservedInstances actions.
4. Create an SCP that contains a deny rule to the ec2:PurchaseReservedInstancesOffering and
ec2:ModifyReservedInstances actions. Attach the SCP to each organizational unit (OU) of the AWS Organizations
structure.
5. Ensure that all AWS accounts are part of an AWS Organizations structure operating in consolidated billing features
mode.
Solution
1 & 4 – the organization must be running in all features mode (consolidated billing alone does not support SCPs), and an SCP denying the Reserved Instance purchase/modify actions, attached to each OU, enforces the new process (see the SCP sketch below).
All features – The default feature set that is available to AWS Organizations. It includes all the functionality of consolidated
billing, plus advanced features that give you more control over accounts in your organization. For example, when all
features are enabled the management account of the organization has full control over what member accounts can do. The
management account can apply SCPs to restrict the services and actions that users (including the root user) and roles in an
account can access.
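For illustration, a sketch of what that SCP and its attachment might look like with boto3; the OU ID is a placeholder, and this must be run from the management account of an organization with all features enabled.

```python
import json
import boto3

org = boto3.client("organizations")

# SCP that denies Reserved Instance purchases/modifications everywhere it is
# attached, regardless of IAM permissions in the member accounts.
scp = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": [
            "ec2:PurchaseReservedInstancesOffering",
            "ec2:ModifyReservedInstances",
        ],
        "Resource": "*",
    }],
}

policy = org.create_policy(
    Content=json.dumps(scp),
    Description="Block RI purchases outside the procurement team",
    Name="deny-ri-purchases",
    Type="SERVICE_CONTROL_POLICY",
)

# Attach the SCP to an organizational unit (placeholder OU ID).
org.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="ou-ab12-example1",
)
```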
Centralised Logging – set up a dedicated AWS account to store logs; CloudTrail can ship all the accounts' logs to one location.
Consolidated billing – can share Reserved Instances between accounts.
Service Control Policies – act as a global, all-powerful policy. Applied to every resource inside the account – the ultimate way to restrict permissions; they even apply to the root user.
An Allow in the policy doesn't grant permissions by itself – it only gives you the potential to set an allow further inside – so SCPs are mostly just used to restrict.
Scenario – want centralised logs and want to ensure that no one can delete them.
AWS Resource Access Manager RAM – Sharing Resources
Free service that allows you to share resources with other accounts in your org – e.g. when you want to share a network/VPC.
- Transit Gateways, VPC subnets, licences, Route 53.
RAM vs VPC Peering
Sharing resources within the same region? Use RAM
Sharing across regions? VPC Peering
VPC Peering – excels when you're connecting two distinct networks together; RAM is better for straightforward resource sharing between accounts in your org due to ease.
Cross Account Role Access
As the number of AWS accounts you manage increases, duplicating IAM users across them creates a security vulnerability. Cross-account role access gives you the ability to set up temporary access that you can easily control.
Those roles we used for CityFibre/ NHS – very useful!!!
AWS Config
Config == enforcing standards (Automation Docs or Lambda)
Inventory management and control tool; allows you to show the history of your infra along with creating rules to make sure it conforms to best practices.
- Discover and query what's in your AWS account, including what has been deleted.
- Alert to flag when a resource violates a rule – e.g. an S3 bucket is made public.
- History of your environment – who deleted a resource and when (useful for troubleshooting).
- Best way to check what standards are applied to your architecture.
- Can set a remediation action – run an automated remediation which will amend the issue found by Config (see the sketch below).
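A hedged sketch of a managed rule plus automatic remediation using boto3; the remediation role ARN is a placeholder and the SSM document parameters may need adjusting for your account, so treat it as illustrative.

```python
import boto3

config = boto3.client("config", region_name="eu-west-1")

# Managed Config rule that flags S3 buckets allowing public read access.
config.put_config_rule(
    ConfigRule={
        "ConfigRuleName": "s3-bucket-public-read-prohibited",
        "Source": {
            "Owner": "AWS",
            "SourceIdentifier": "S3_BUCKET_PUBLIC_READ_PROHIBITED",
        },
    }
)

# Automatic remediation via an AWS-owned SSM automation document; the role
# must allow the automation to modify the offending bucket.
config.put_remediation_configurations(
    RemediationConfigurations=[{
        "ConfigRuleName": "s3-bucket-public-read-prohibited",
        "TargetType": "SSM_DOCUMENT",
        "TargetId": "AWS-DisableS3BucketPublicReadWrite",
        "Automatic": True,
        "MaximumAutomaticAttempts": 3,
        "RetryAttemptSeconds": 60,
        "Parameters": {
            "AutomationAssumeRole": {
                "StaticValue": {"Values": ["arn:aws:iam::111122223333:role/config-remediation"]}
            },
            "S3BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
        },
    }]
)
```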
Question:
A company's solutions architect is designing an AWS multi-account solution that uses AWS Organizations. The solutions
architect has organized the company's accounts into organizational units (OUs).
The solutions architect needs a solution that will identify any changes to the OU hierarchy. The solution also needs to notify
the company's operations team of any changes.
Which solution will meet these requirements with the LEAST operational overhead?
A. Provision the AWS accounts by using AWS Control Tower. Use account drift notifications to identify the changes to the
OU hierarchy.
B. Provision the AWS accounts by using AWS Control Tower. Use AWS Config aggregated rules to identify the changes to the
OU hierarchy.
C. Use AWS Service Catalog to create accounts in Organizations. Use an AWS CloudTrail organization trail to identify the
changes to the OU hierarchy.
D. Use AWS CloudFormation templates to create accounts in Organizations. Use the drift detection operation on a stack to
identify the changes to the OU hierarchy.
Answer:
The key advantages of Control Tower here:
Fully managed service simplifies multi-account setup.
Built-in account drift notifications detect OU changes automatically.
More scalable and less complex than Config rules or CloudTrail.
Better security and compliance guardrails than custom options.
Lower operational overhead compared to the other solutions.
https://docs.aws.amazon.com/controltower/latest/userguide/what-is-control-tower.html
https://docs.aws.amazon.com/controltower/latest/userguide/prevention-and-notification.html
AWS Directory Service
Fully managed version of Active Directory – run AD inside AWS without having to set it up on-prem.
Types:
- Managed Microsoft AD – the entire AD suite.
- AD Connector – creates a tunnel between your AWS environment and on-prem AD.
- Simple AD – standalone directory, simple auth service (less common in practice).
Cost Explorer
Tool to visualise your cloud costs – generate reports based on a variety of factors including resource tags.
Break down by service, by time – can also forecast future spending.
Tags are super useful.
AWS Budgets
Tools to plan and set expectations to track ongoing spend.
Cost budget – how much are we spending?
Usage budget – how much are we using? Are we using all our architecture?
Reservation budget – are we being efficient with RIs?
Savings Plans budget – are our actions covered by a Savings Plan?
Users can be alerted on current spend/ projected spend. Can filter with tags too.
AWS Cost and Usage Reports (AWS CUR)
Most comprehensive set of cost data
Publishes billing reports to S3.
Breaks down by time, service, resource or tags.
Delivers a CSV at least once a day.
Integrates with Athena and Redshift, or visualise in QuickSight.
Can be used with AWS Organizations – entire OU groups or individual member accounts.
Monitor On-Demand capacity reservations.
View data transfer charges – external and inter-Region costs.
Deep dive into cost allocation tags.
Scenario: mention of a detailed cost breakdown, daily usage reports or tracking Savings Plans utilisation.
AWS Compute Optimiser
- AWS Compute Optimizer is a service that analyzes the configuration and utilization metrics of your AWS resources.
- It reports whether your resources are optimal, and generates optimization recommendations to reduce the cost and improve the performance of your workloads.
- Compute Optimizer also provides graphs showing recent utilization metric history data, as well as projected utilization for recommendations, which you can use to evaluate which recommendation provides the best price-performance trade-off. The analysis and visualization of your usage patterns can help you decide when to move or resize your running resources, and still meet your performance and capacity requirements.
Works with EC2, Auto Scaling groups, EBS and Lambda.
Integrates with AWS Organisations – can control from the management account.
Disabled by default.
AWS gives you recommendations for optimising savings.
Savings Plans – 1 or 3 year terms; pay all upfront, partial upfront or no upfront. Types: EC2 Instance, Compute or SageMaker.
AWS Trusted Advisor
Fully managed best practice audit tool – recommends best practices in your architecture.
5 areas:
- Cost optimisation
- Performance – do some EBS volumes need more IOPS?
- Security – holes in your SGs
- Fault tolerance – do you have Multi-AZ on RDS DBs?
- Service limits – do you have room to scale?
Exam tips: focus on answers that have an automated response (kick off a Lambda function / send an alert); use SNS to alert users; most checks require a paid support plan; Trusted Advisor doesn't fix problems, it just notifies; use EventBridge (CloudWatch Events) to kick off a Lambda function to solve the problem.
AWS Control Tower
Easy way to govern an AWS multi-account environment.
Automates account creation and security controls.
Extends AWS Organizations to prevent governance drift and leverages guardrails.
Landing Zone: well-architected, multi-account environment based on compliance and security best practices.
Account Factory: configurable account template for standardising pre-approved configs.
Shared accounts – management account, log archive account and audit account.
Guardrails – high-level rules providing continuous governance for the AWS environment.
Preventative = ensure accounts remain in compliance by not allowing bad actions; uses Service Control Policies (e.g. blocking public S3 buckets).
Detective = detect and alert on non-compliance; uses AWS Config rules; these only work in certain Regions – Control Tower isn't available in every Region yet.
Exam Scenario: automated, multi-account governance, guardrails, account orchestration and governed user account provisioning.
AWS Licence Manager
Simplifies managing software licences (Microsoft, SAP, Oracle); centralises licences, sets usage limits and reduces the penalties of exceeding licence limits.
Key exam words: AWS-hosted licence management, hybrid licence management, prevent licence abuse.
AWS Health
Visibility of resource performance and availability of AWS services and accounts.
Exam Tips:
Receive notifications and alerts for affected resources and upcoming events
Can automate actions based on events using amazon eventbridge
Specific and public events
Keywords: Qs about checking for alerts for service health, automating the reboot of EC2 instances for AWS maintenance.
AWS Service Catalog
Allows orgs to create and manage catalogs of IT services.
AWS Service Catalog lets you centrally manage your cloud resources to achieve governance at scale of your infrastructure
as code (IaC) templates, written in CloudFormation or Terraform configurations.
With AWS Service Catalog, you can meet your compliance requirements while making sure your customers can quickly
deploy the cloud resources they need.
Users can deploy pre-approved CF / TF templates within org.
Fine grained access control with IAM
Can be versioned.
AWS Proton
- Service that creates and manages infra and deployment tooling for users as well as serverless and container-based apps.
- Automates Infrastructure as Code provisioning and deployments.
- Defines standardised infra for your serverless and container-based apps.
- Uses templates to define and manage app stacks.
- Supports Terraform and CloudFormation.
- Developers can use it as a self-service tool to stand up infra, allowing them to spend more time on dev.
- Provisions resources, configures CI/CD and deploys the code.
Well architected tool
Measures your env against the best practices.
Enables assistance with documenting workloads and architecture decisions.
Chapter 20 – Migration
AWS Snow
A sack of hard drives (petabytes of capacity) that Amazon ships to you – very fast upload for high volumes of data.
Snowcone – up to 8 TB
Snowball – 50–80 TB
Exam Scenarios (reasons to use Snowball):
- Not wanting to transfer over the internet.
- Having a large amount of data.
- Very slow internet connection.
Question
A company needs to transfer 600 TB of data from its on-premises network-attached storage (NAS) system to the AWS
Cloud. The data transfer must be complete within 2 weeks. The data is sensitive and must be encrypted in transit. The
company’s internet connection can support an upload speed of 100 Mbps.
Which solution meets these requirements MOST cost-effectively?
A. Use Amazon S3 multi-part upload functionality to transfer the files over HTTPS.
B. Create a VPN connection between the on-premises NAS system and the nearest AWS Region. Transfer the data over the
VPN connection.
C. Use the AWS Snow Family console to order several AWS Snowball Edge Storage Optimized devices. Use the devices to
transfer the data to Amazon S3.
D. Set up a 10 Gbps AWS Direct Connect connection between the company location and the nearest AWS Region. Transfer
the data over a VPN connection into the Region to store the data in Amazon S3.
Solution
With the existing 100 Mbps link the transfer would take roughly 550–600 days in the best case (see the rough calculation below), so (A) and (B) are not applicable. Solution (D) could meet the target with a transfer time of about 6 days, but the lead time for a Direct Connect deployment can take weeks. Thus, (C) is the only valid solution.
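The rough calculation behind that figure (a back-of-envelope sketch that ignores protocol overhead, so reality is even slower):

```python
# Why a 100 Mbps link can't move 600 TB within 2 weeks.
data_bits = 600 * 10**12 * 8          # 600 TB expressed in bits
link_bps = 100 * 10**6                # 100 Mbps

seconds = data_bits / link_bps
days = seconds / 86_400
print(f"~{days:.0f} days")            # roughly 555 days, far beyond 2 weeks
```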
Question
A company needs to migrate a MySQL database from its on-premises data center to AWS within 2 weeks. The database is
20 TB in size. The company wants to complete the migration with minimal downtime.
Which solution will migrate the database MOST cost-effectively?
A. Order an AWS Snowball Edge Storage Optimized device. Use AWS Database Migration Service (AWS DMS) with AWS
Schema Conversion Tool (AWS SCT) to migrate the database with replication of ongoing changes. Send the Snowball Edge
device to AWS to finish the migration and continue the ongoing replication.
B. Order an AWS Snowmobile vehicle. Use AWS Database Migration Service (AWS DMS) with AWS Schema Conversion Tool
(AWS SCT) to migrate the database with ongoing changes. Send the Snowmobile vehicle back to AWS to finish the migration
and continue the ongoing replication.
C. Order an AWS Snowball Edge Compute Optimized with GPU device. Use AWS Database Migration Service (AWS DMS)
with AWS Schema Conversion Tool (AWS SCT) to migrate the database with ongoing changes. Send the Snowball device to
AWS to finish the migration and continue the ongoing replication
D. Order a 1 GB dedicated AWS Direct Connect connection to establish a connection with the data center. Use AWS
Database Migration Service (AWS DMS) with AWS Schema Conversion Tool (AWS SCT) to migrate the database with
replication of ongoing changes.
Solution
The answer is A. https://docs.aws.amazon.com/dms/latest/userguide/CHAP_LargeDBs.html
Why not D?
When you initiate the process by requesting an AWS Direct Connect connection, it typically starts with the AWS Direct
Connect provider. This provider may need to coordinate with AWS to allocate the necessary resources. This initial setup
phase can take anywhere from a few days to a couple of weeks.
A couple of weeks is no good.
The keyword "20 TB" points to AWS Snowball, which leaves A or C. Option C adds a GPU, which is not relevant here, so choose A.
Question
A company has 5 PB of archived data on physical tapes. The company needs to preserve the data on the tapes for another
10 years for compliance purposes. The company wants to migrate to AWS in the next 6 months. The data center that stores
the tapes has a 1 Gbps uplink internet connectivity.
Which solution will meet these requirements MOST cost-effectively?
A. Read the data from the tapes on premises. Stage the data in a local NFS storage. Use AWS DataSync to migrate the data
to Amazon S3 Glacier Flexible Retrieval.
B. Use an on-premises backup application to read the data from the tapes and to write directly to Amazon S3 Glacier Deep
Archive.
C. Order multiple AWS Snowball devices that have Tape Gateway. Copy the physical tapes to virtual tapes in Snowball. Ship
the Snowball devices to AWS. Create a lifecycle policy to move the tapes to Amazon S3 Glacier Deep Archive.
D. Configure an on-premises Tape Gateway. Create virtual tapes in the AWS Cloud. Use backup software to copy the
physical tape to the virtual tape.
Solution
Option C is likely the most cost-effective solution given the large data size and limited internet bandwidth. The physical data
transfer and integration with the existing tape infrastructure provides efficiency benefits that can optimize the cost.
Question
A company has 700 TB of backup data stored in network attached storage (NAS) in its data center. This backup data need to
be accessible for infrequent regulatory requests and must be retained 7 years. The company has decided to migrate this
backup data from its data center to AWS. The migration must be complete within 1 month. The company has 500 Mbps of
dedicated bandwidth on its public internet connection available for data transfer.
What should a solutions architect do to migrate and store the data at the LOWEST cost?
A. Order AWS Snowball devices to transfer the data. Use a lifecycle policy to transition the files to Amazon S3 Glacier Deep
Archive.
B. Deploy a VPN connection between the data center and Amazon VPC. Use the AWS CLI to copy the data from on premises
to Amazon S3 Glacier.
C. Provision a 500 Mbps AWS Direct Connect connection and transfer the data to Amazon S3. Use a lifecycle policy to
transition the files to Amazon S3 Glacier Deep Archive.
D. Use AWS DataSync to transfer the data and deploy a DataSync agent on premises. Use the DataSync task to copy files
from the on-premises NAS storage to Amazon S3 Glacier.
Solution
By ordering Snowball devices, the company can transfer the 700 TB of backup data from its data center to AWS. Once the
data is transferred to S3, a lifecycle policy can be applied to automatically transition the files from the S3 Standard storage
class to the cost-effective Amazon S3 Glacier Deep Archive storage class.
Option B would require continuous data transfer over the public internet, which could be time-consuming and costly given
the large amount of data. It may also require significant bandwidth allocation.
Option C would involve additional costs for provisioning and maintaining the dedicated connection, which may not be
necessary for a one-time data migration.
Option D could be a viable option, but it may incur additional costs for deploying and managing the DataSync agent.
Therefore, option A is the recommended choice as it provides a secure and efficient data transfer method using Snowball
devices and allows for cost optimization through lifecycle policies by transitioning the data to S3 Glacier Deep Archive for
long-term storage.
Storage Gateway
Hybrid cloud storage service that helps you merge on-premises resources with the cloud (one-time migration or long-term pairing).
File Gateway
Network file share – mount it locally and it backs data up into S3; can also keep the most recently accessed files cached on-prem and everything else in S3.
Helps with migrations.
Scenario: you don't have enough on-prem storage space – set up a cached File Gateway to extend on-prem storage into the cloud.
A File Gateway supports storage on S3 and combines a service and a virtual software appliance. By using this combination,
you can store and retrieve objects in Amazon S3 using industry-standard file protocols such as Network File System (NFS)
and Server Message Block (SMB).
Volume Gateway
iSCSI mount – it backs up the disks that the VMs are currently reading from.
Can create EBS snapshots to restore volumes.
Easy way to migrate on-prem volumes to EBS volumes on AWS.
Scenario – want to maintain local access to data with low latency.
Tape Gateway
A Tape Gateway provides cloud-backed virtual tape storage.
With a Tape Gateway, you can cost-effectively and durably archive backup data in S3 Glacier Flexible Retrieval or S3 Glacier
Deep Archive. A Tape Gateway provides a virtual tape infrastructure that scales seamlessly with your business needs and
eliminates the operational burden of provisioning, scaling, and maintaining a physical tape infrastructure.
Storage/Volume Gateway vs AWS DataSync
Rule of thumb: use DataSync when you are migrating data to the cloud; use a Storage/Volume Gateway when you need to keep on-site access while also moving data to the cloud.
Question:
A company wants to implement a disaster recovery plan for its primary on-premises file storage volume. The file storage
volume is mounted from an Internet Small Computer Systems Interface (iSCSI) device on a local storage server. The file
storage volume holds hundreds of terabytes (TB) of data.
The company wants to ensure that end users retain immediate access to all file types from the on-premises systems
without experiencing latency.
Which solution will meet these requirements with the LEAST amount of change to the company's existing infrastructure?
A. Provision an Amazon S3 File Gateway as a virtual machine (VM) that is hosted on premises. Set the local cache to 10 TB.
Modify existing applications to access the files through the NFS protocol. To recover from a disaster, provision an Amazon
EC2 instance and mount the S3 bucket that contains the files.
B. Provision an AWS Storage Gateway tape gateway. Use a data backup solution to back up all existing data to a virtual tape
library. Configure the data backup solution to run nightly after the initial backup is complete. To recover from a disaster,
provision an Amazon EC2 instance and restore the data to an Amazon Elastic Block Store (Amazon EBS) volume from the
volumes in the virtual tape library.
C. Provision an AWS Storage Gateway Volume Gateway cached volume. Set the local cache to 10 TB. Mount the Volume
Gateway cached volume to the existing file server by using iSCSI, and copy all files to the storage volume. Configure
scheduled snapshots of the storage volume. To recover from a disaster, restore a snapshot to an Amazon Elastic Block Store
(Amazon EBS) volume and attach the EBS volume to an Amazon EC2 instance.
D. Provision an AWS Storage Gateway Volume Gateway stored volume with the same amount of disk space as the existing
file storage volume. Mount the Volume Gateway stored volume to the existing file server by using iSCSI, and copy all files to
the storage volume. Configure scheduled snapshots of the storage volume. To recover from a disaster, restore a snapshot to
an Amazon Elastic Block Store (Amazon EBS) volume and attach the EBS volume to an Amazon EC2 instance
Solution
D is the correct answer
Volume Gateway CACHED vs STORED
Cached = stores only a subset of frequently accessed data locally.
Stored = retains the ENTIRE data set ("all file types") in the on-prem data centre, with asynchronous backups to AWS.
Q
A company has several on-premises Internet Small Computer Systems Interface (ISCSI) network storage servers. The
company wants to reduce the number of these servers by moving to the AWS Cloud. A solutions architect must provide
low-latency access to frequently used data and reduce the dependency on on-premises servers with a minimal number of
infrastructure changes.
Which solution will meet these requirements?
A. Deploy an Amazon S3 File Gateway.
B. Deploy Amazon Elastic Block Store (Amazon EBS) storage with backups to Amazon S3.
C. Deploy an AWS Storage Gateway volume gateway that is configured with stored volumes.
D. Deploy an AWS Storage Gateway volume gateway that is configured with cached volumes.
Solution
D. The Storage Gateway volume gateway provides iSCSI block storage using cached volumes. This allows replacing the on-premises iSCSI servers with minimal changes.
Cached volumes store frequently accessed data locally for low latency access, while storing less frequently accessed data in
S3.
This reduces the number of on-premises servers while still providing low latency access to hot data.
EBS does not provide iSCSI support to replace the existing servers.
S3 File Gateway is for file storage, not block storage.
Stored volumes would store all data on-premises, not in S3.
Q
A company needs to store data from its healthcare application. The application’s data frequently changes. A new regulation
requires audit access at all levels of the stored data.
The company hosts the application on an on-premises infrastructure that is running out of storage capacity. A solutions
architect must securely migrate the existing data to AWS while satisfying the new regulation.
Which solution will meet these requirements?
A. Use AWS DataSync to move the existing data to Amazon S3. Use AWS CloudTrail to log data events.
B. Use AWS Snowcone to move the existing data to Amazon S3. Use AWS CloudTrail to log management events.
C. Use Amazon S3 Transfer Acceleration to move the existing data to Amazon S3. Use AWS CloudTrail to log data events.
D. Use AWS Storage Gateway to move the existing data to Amazon S3. Use AWS CloudTrail to log management events
Solution
A. AWS DataSync securely migrates the existing data to Amazon S3, and CloudTrail data events provide object-level audit logging ("all levels of the stored data").
Question
A company has an aging network-attached storage (NAS) array in its data center. The NAS array presents SMB shares and
NFS shares to client workstations. The company does not want to purchase a new NAS array. The company also does not
want to incur the cost of renewing the NAS array’s support contract. Some of the data is accessed frequently, but much of
the data is inactive.
A solutions architect needs to implement a solution that migrates the data to Amazon S3, uses S3 Lifecycle policies, and
maintains the same look and feel for the client workstations. The solutions architect has identified AWS Storage Gateway as
part of the solution.
Which type of storage gateway should the solutions architect provision to meet these requirements?
A. Volume Gateway
B. Tape Gateway
C. Amazon FSx File Gateway
D. Amazon S3 File Gateway
Solution
D. The Amazon S3 File Gateway provides an easy way to lift-and-shift file data from the existing NAS to Amazon S3. It presents SMB and NFS file shares that client workstations can access just like the NAS shares.
Behind the scenes, it moves the file data to S3 storage, storing it durably and cost-effectively.
S3 Lifecycle policies can be used to transition less frequently accessed data to lower-cost S3 storage tiers like S3 Glacier.
From the client workstation perspective, access to files feels seamless and unchanged after migration to S3. The S3 File
Gateway handles the underlying data transfers.
It is a simple, low-cost gateway option tailored for basic file share migration use cases.
Amazon S3 File Gateway provides a file interface to objects stored in S3. It can be used for a file-based interface with S3,
which allows the company to migrate their NAS array data to S3 while maintaining the same look and feel for client
workstations. Amazon S3 File Gateway supports SMB and NFS protocols, which will allow clients to continue to access the
data using these protocols. Additionally, Amazon S3 Lifecycle policies can be used to automate the movement of data to
lower-cost storage tiers, reducing the storage cost of inactive data.
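To make the lifecycle part concrete, here is a hedged boto3 sketch that transitions objects to S3 Glacier Flexible Retrieval after 30 days and to Deep Archive after 180 days; the bucket name and day counts are made-up examples, not values from the question.

import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and day counts; adjust to the actual access pattern.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-nas-migration-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-inactive-data",
                "Filter": {"Prefix": ""},  # apply to the whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "GLACIER"},        # Flexible Retrieval
                    {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},  # long-term archive
                ],
            }
        ]
    },
)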
Question
A company that primarily runs its application servers on premises has decided to migrate to AWS. The company wants to
minimize its need to scale its Internet Small Computer Systems Interface (iSCSI) storage on premises. The company wants
only its recently accessed data to remain stored locally.
Which AWS solution should the company use to meet these requirements?
A. Amazon S3 File Gateway
B. AWS Storage Gateway Tape Gateway
C. AWS Storage Gateway Volume Gateway stored volumes
D. AWS Storage Gateway Volume Gateway cached volumes
Solution
The best AWS solution to meet the requirements is to use AWS Storage Gateway cached volumes (option D).
The key points:
Company migrating on-prem app servers to AWS
Want to minimize scaling on-prem iSCSI storage
Only recent data should remain on-premises
The AWS Storage Gateway cached volumes allow the company to connect their on-premises iSCSI storage to AWS cloud
storage. It stores frequently accessed data locally in the cache for low-latency access, while older data is stored in AWS.
Question
A recent analysis of a company's IT expenses highlights the need to reduce backup costs. The company's chief information
officer wants to simplify the on-premises backup infrastructure and reduce costs by eliminating the use of physical backup
tapes. The company must preserve the existing investment in the on-premises backup applications and workflows.
What should a solutions architect recommend?
A. Set up AWS Storage Gateway to connect with the backup applications using the NFS interface.
B. Set up an Amazon EFS file system that connects with the backup applications using the NFS interface.
C. Set up an Amazon EFS file system that connects with the backup applications using the iSCSI interface.
D. Set up AWS Storage Gateway to connect with the backup applications using the iSCSI-virtual tape library (VTL) interface.
Solution
D - https://aws.amazon.com/storagegateway/vtl/?nc1=h_ls
Question
A company is implementing a shared storage solution for a media application that is hosted in the AWS Cloud. The company
needs the ability to use SMB clients to access data. The solution must be fully managed.
Which AWS solution meets these requirements?
A. Create an AWS Storage Gateway volume gateway. Create a file share that uses the required client protocol. Connect the
application server to the file share.
B. Create an AWS Storage Gateway tape gateway. Configure tapes to use Amazon S3. Connect the application server to the
tape gateway.
C. Create an Amazon EC2 Windows instance. Install and configure a Windows file share role on the instance. Connect the
application server to the file share.
D. Create an Amazon FSx for Windows File Server file system. Attach the file system to the origin server. Connect the
application server to the file system.
Solution
A. involves a volume gateway, which presents block storage over iSCSI rather than file shares, so it does not meet the requirement of using SMB clients to access data.
B. involves using Storage Gateway with tape gateway configuration, which is primarily used for archiving data to S3. It does
not provide native support for SMB clients to access data.
C. involves manually setting up and configuring a Windows file share on an EC2 Windows instance. While it allows SMB
clients to access data, it is not a fully managed solution as it requires manual setup and maintenance.
D. involves creating an FSx for Windows File Server file system, which is a fully managed Windows file system that supports
SMB clients. It provides an easy-to-use shared storage solution with native SMB support.
Based on the requirements of using SMB clients and needing a fully managed solution, option D is the most suitable choice.
Question
A research company uses on-premises devices to generate data for analysis. The company wants to use the AWS Cloud to
analyze the data. The devices generate .csv files and support writing the data to an SMB file share. Company analysts must
be able to use SQL commands to query the data. The analysts will run queries periodically throughout the day.
Which combination of steps will meet these requirements MOST cost-effectively? (Choose three.)
A. Deploy an AWS Storage Gateway on premises in Amazon S3 File Gateway mode.
B. Deploy an AWS Storage Gateway on premises in Amazon FSx File Gateway mode.
C. Set up an AWS Glue crawler to create a table based on the data that is in Amazon S3.
D. Set up an Amazon EMR cluster with EMR File System (EMRFS) to query the data that is in Amazon S3. Provide access to
analysts.
E. Set up an Amazon Redshift cluster to query the data that is in Amazon S3. Provide access to analysts.
F. Set up Amazon Athena to query the data that is in Amazon S3. Provide access to analysts.
Solution
A, C, F. I initially thought B, but A is better: FSx File Gateway supports SMB, but so does the S3 File Gateway (SMB v2 and v3), and landing the .csv files in S3 lets a Glue crawler catalogue them and Athena query them with SQL. Hence A, C, F is the correct combination.
https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-csv-home.html
https://aws.amazon.com/blogs/aws/amazon-athena-interactive-sql-queries-for-data-in-amazon-s3/
https://aws.amazon.com/storagegateway/faqs/
Question
A Solutions Architect must create a cost-effective backup solution for a company's 500MB source code repository of
proprietary and sensitive applications. The repository runs on Linux and backs up daily to tape. Tape backups are stored for
1 year.
The current solution is not meeting the company's needs because it is a manual process that is prone to error, expensive to
maintain, and does not meet the need for a Recovery Point Objective (RPO) of 1 hour or Recovery Time Objective (RTO) of 2
hours. The new disaster recovery requirement is for backups to be stored offsite and to be able to restore a single file if
needed.
Which solution meets the customer's needs for RTO, RPO, and disaster recovery with the LEAST effort and expense?
A. Replace local tapes with an AWS Storage Gateway virtual tape library to integrate with current backup software. Run
backups nightly and store the virtual tapes on Amazon S3 standard storage in US-EAST-1. Use cross-region replication to
create a second copy in US-WEST-2. Use Amazon S3 lifecycle policies to perform automatic migration to Amazon Glacier and
deletion of expired backups after 1 year.
B. Configure the local source code repository to synchronize files to an AWS Storage Gateway file gateway to store
backup copies in an Amazon S3 Standard bucket. Enable versioning on the Amazon S3 bucket. Create Amazon S3 lifecycle
policies to automatically migrate old versions of objects to Amazon S3 Standard - Infrequent Access, then Amazon Glacier,
then delete backups after 1 year.
C. Replace the local source code repository storage with a Storage Gateway stored volume. Change the default snapshot
frequency to 1 hour. Use Amazon S3 lifecycle policies to archive snapshots to Amazon Glacier and remove old snapshots
after 1 year. Use cross-region replication to create a copy of the snapshots in US-WEST-2.
D. Replace the local source code repository storage with a Storage Gateway cached volume. Create a snapshot schedule to
take hourly snapshots. Use an Amazon CloudWatch Events schedule expression rule to run an hourly AWS Lambda task to
copy snapshots from US-EAST -1 to US-WEST-2.
Solution
Tough question, but I support answer B.
Even though there is no cross-region replication, that is not a requirement in the question. The requirement is simply an offsite (AWS) disaster recovery copy, so a single copy in AWS does the job.
There is also the tricky requirement of restoring a SINGLE FILE.
Snapshots from a Storage Gateway (cached, stored, or tape) store the backup as a whole, not as individual files. To restore, you have to create a volume from the snapshot, mount it on an EC2 instance, and then pull out the file – extra effort. The file gateway stores the files as objects in an S3 bucket, so retrieving a single file is straightforward.
https://d1.awsstatic.com/whitepapers/aws-storage-gateway-file-gateway-for-hybrid-architectures.pdf
A: nightly backups do not meet the 1-hour RPO, plus the restoration effort.
C & D: both are workable solutions, but the restoration effort and cost are higher than with B.
Again, it is a tough decision.
Summary of differences: NFS vs. SMB
With NFS, a user (or client device) can connect to a network server and access files on the server. It has rules that allow
multiple users to share the same file without data conflicts. Similarly, SMB also allows users to read files on the server.
However, it offers more flexibility, so clients can share files with each other as well. Clients can use SMB to establish
connections with any other networked devices—like printers or file servers. The client can then access the device’s files as if
it were local to the client.
NFS = Network File System; SMB = Server Message Block.
Best suited for: NFS – Linux-based network architectures; SMB – Windows-based architectures.
Shared resources: NFS – files and directories; SMB – a wide range of network resources, including file and print services, storage devices, and virtual machine storage.
Client can communicate with: NFS – servers; SMB – servers, plus clients can communicate with other clients by using the server as a mediator.
DataSync
What is it?
Agent-based solution for migrating on-prem storage to AWS. Easily move data between NFS and SMB shares and AWS storage services.
AWS DataSync is an online data movement and discovery service that simplifies and accelerates data migrations to AWS as
well as moving data to and from on-premises storage, edge locations, other cloud providers, and AWS Storage services.
Scenarios – DataSync is good for one-time transfers; Storage Gateway is better for continuous sync.
While DataSync usually uses an agent, an agent is not required when transferring between AWS storage services in the
same AWS account.
Question
A company has a business system that generates hundreds of reports each day. The business system saves the reports to a
network share in CSV format. The company needs to store this data in the AWS Cloud in near-real time for analysis.
Which solution will meet these requirements with the LEAST administrative overhead?
A. Use AWS DataSync to transfer the files to Amazon S3. Create a scheduled task that runs at the end of each day.
B. Create an Amazon S3 File Gateway. Update the business system to use a new network share from the S3 File Gateway.
C. Use AWS DataSync to transfer the files to Amazon S3. Create an application that uses the DataSync API in the automation
workflow.
D. Deploy an AWS Transfer for SFTP endpoint. Create a script that checks for new files on the network share and uploads
the new files by using SFTP.
Solution
Option C has the least administrative overhead because:
Using DataSync avoids having to rewrite the business system to use a new file gateway or SFTP endpoint.
Calling the DataSync API from an application allows automating the data transfer instead of running scheduled tasks or
scripts.
DataSync directly transfers files from the network share to S3 without needing an intermediate server
Answer B, creating an Amazon S3 File Gateway and updating the business system to use a new network share from the S3
File Gateway, is not the best solution because it requires additional configuration and management overhead.
Question
A company is running an SMB file server in its data center. The file server stores large files that are accessed frequently for
the first few days after the files are created. After 7 days the files are rarely accessed.
The total data size is increasing and is close to the company's total storage capacity. A solutions architect must increase the
company's available storage space without losing low-latency access to the most recently accessed files. The solutions
architect must also provide file lifecycle management to avoid future storage issues.
Which solution will meet these requirements?
A. Use AWS DataSync to copy data that is older than 7 days from the SMB file server to AWS.
B. Create an Amazon S3 File Gateway to extend the company's storage space. Create an S3 Lifecycle policy to transition the
data to S3 Glacier Deep Archive after 7 days.
C. Create an Amazon FSx for Windows File Server file system to extend the company's storage space.
D. Install a utility on each user's computer to access Amazon S3. Create an S3 Lifecycle policy to transition the data to S3
Glacier Flexible Retrieval after 7 days.
Solution
B
Question
A medical research lab produces data that is related to a new study. The lab wants to make the data available with
minimum latency to clinics across the country for their on-premises, file-based applications. The data files are stored in an
Amazon S3 bucket that has read-only permissions for each clinic.
What should a solutions architect recommend to meet these requirements?
A. Deploy an AWS Storage Gateway file gateway as a virtual machine (VM) on premises at each clinic
B. Migrate the files to each clinic’s on-premises applications by using AWS DataSync for processing.
C. Deploy an AWS Storage Gateway volume gateway as a virtual machine (VM) on premises at each clinic.
D. Attach an Amazon Elastic File System (Amazon EFS) file system to each clinic’s on-premises servers.
Solution
AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to provide
seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure.
By deploying a file gateway as a virtual machine on each clinic's premises, the medical research lab can provide low-latency
access to the data stored in the S3 bucket while maintaining read-only permissions for each clinic. This solution allows the
clinics to access the data files directly from their on-premises file-based applications without the need for data transfer or
migration.
AWS Transfer Family
Move files in and out of S3/ EFS using SFTP, FTPS or FTP.
Good when there's a collection of older apps that already rely on these file-transfer protocols.
Question:
A company runs a highly available SFTP service. The SFTP service uses two Amazon EC2 Linux instances that run with elastic
IP addresses to accept traffic from trusted IP sources on the internet. The SFTP service is backed by shared storage that is
attached to the instances. User accounts are created and managed as Linux users in the SFTP servers.
The company wants a serverless option that provides high IOPS performance and highly configurable security. The company
also wants to maintain control over user permissions.
Which solution will meet these requirements?
A. Create an encrypted Amazon Elastic Block Store (Amazon EBS) volume. Create an AWS Transfer Family SFTP service with
a public endpoint that allows only trusted IP addresses. Attach the EBS volume to the SFTP service endpoint. Grant users
access to the SFTP service.
B. Create an encrypted Amazon Elastic File System (Amazon EFS) volume. Create an AWS Transfer Family SFTP service with
elastic IP addresses and a VPC endpoint that has internet-facing access. Attach a security group to the endpoint that allows
only trusted IP addresses. Attach the EFS volume to the SFTP service endpoint. Grant users access to the SFTP service.
C. Create an Amazon S3 bucket with default encryption enabled. Create an AWS Transfer Family SFTP service with a public
endpoint that allows only trusted IP addresses. Attach the S3 bucket to the SFTP service endpoint. Grant users access to the
SFTP service.
D. Create an Amazon S3 bucket with default encryption enabled. Create an AWS Transfer Family SFTP service with a VPC
endpoint that has internal access in a private subnet. Attach a security group that allows only trusted IP addresses. Attach
the S3 bucket to the SFTP service endpoint. Grant users access to the SFTP service.
Solution
B. First, the solution must be serverless with high IOPS – that points to EFS.
Second, the shared storage is attached to both Linux instances at the same time; only EFS supports that.
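For context on what option B looks like in practice, the sketch below creates an AWS Transfer Family SFTP server backed by EFS with a VPC endpoint; the VPC, subnet, and security group IDs are placeholders I have assumed for illustration, and trusted IPs would be restricted via the security group rules.

import boto3

transfer = boto3.client("transfer", region_name="us-east-1")

# Hypothetical network IDs.
server = transfer.create_server(
    Domain="EFS",  # back the SFTP service with Amazon EFS
    Protocols=["SFTP"],
    EndpointType="VPC",
    EndpointDetails={
        "VpcId": "vpc-0123456789abcdef0",
        "SubnetIds": ["subnet-0123456789abcdef0"],
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
    },
    IdentityProviderType="SERVICE_MANAGED",  # manage SFTP users within Transfer Family
)
print("Transfer Family server ID:", server["ServerId"])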
Question
A university research laboratory needs to migrate 30 TB of data from an on-premises Windows file server to Amazon FSx for
Windows File Server. The laboratory has a 1 Gbps network link that many other departments in the university share.
The laboratory wants to implement a data migration service that will maximize the performance of the data transfer.
However, the laboratory needs to be able to control the amount of bandwidth that the service uses to minimize the impact
on other departments. The data migration must take place within the next 5 days.
Which AWS solution will meet these requirements?
Solution
AWS DataSync is a data transfer service that can copy large amounts of data between on-premises storage and Amazon FSx
for Windows File Server at high speeds. It allows you to control the amount of bandwidth used during data transfer.
• DataSync uses agents at the source and destination to automatically copy files and file metadata over the network. This
optimizes the data transfer and minimizes the impact on your network bandwidth.
• DataSync allows you to schedule data transfers and configure transfer rates to suit your needs. You can transfer 30 TB
within 5 days while controlling bandwidth usage.
• DataSync can resume interrupted transfers and validate data to ensure integrity. It provides detailed monitoring and
reporting on the progress and performance of data transfers.
Option A - AWS Snowcone is more suitable for physically transporting data when network bandwidth is limited. It would not
complete the transfer within 5 days.
Option B - Amazon FSx File Gateway only provides access to files stored in Amazon FSx and does not perform the actual
data migration from on-premises to FSx.
Option D - AWS Transfer Family is for transferring files over FTP, FTPS and SFTP. It may require scripting to transfer 30 TB and does not offer the bandwidth-throttling control this scenario requires.
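The bandwidth control mentioned above is exposed as a task option. Below is a hedged boto3 sketch that creates a DataSync task between two already-created locations and caps throughput at roughly 100 MB/s; the location ARNs and the limit are assumptions for illustration.

import boto3

datasync = boto3.client("datasync", region_name="us-east-1")

# Hypothetical location ARNs (e.g. an on-prem SMB location and an FSx for Windows location).
task = datasync.create_task(
    SourceLocationArn="arn:aws:datasync:us-east-1:111122223333:location/loc-source-EXAMPLE",
    DestinationLocationArn="arn:aws:datasync:us-east-1:111122223333:location/loc-dest-EXAMPLE",
    Name="lab-30tb-migration",
    Options={
        "VerifyMode": "ONLY_FILES_TRANSFERRED",  # integrity check on transferred files
        "BytesPerSecond": 100 * 1024 * 1024,     # throttle to ~100 MB/s to protect the shared link
    },
)

# Kick off an execution; the bandwidth limit can also be overridden per execution.
datasync.start_task_execution(TaskArn=task["TaskArn"])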
Migration Hub
Track progress of your app migration to AWS with a GUI. Discover existing servers, plan migration efforts and track
migration status. Integrates with Database/ server Migration Services.
Only discovers and plans migrations – needs other services listed below to actually DO THE MIGRATION.
Server migration service
Automates converting your VM into an EBS snapshot, which you can turn into an AMI to be used on an EC2 instance.
Supports lots of VM types (including Azure).
Can do incremental testing, with incremental migration.
Avoids downtime.
Database migration service
Takes an old Oracle/SQL database, runs it through the AWS Schema Conversion Tool, and outputs to an Aurora DB.
Can migrate to both AWS and on-prem targets at the same time (hybrid).
One-time or continuous migration.
Can consolidate multiple DBs into one large DB.
Schema Conversion Tool – converts database schemas.
Change Data Capture – CDC guarantees transactional integrity of the target db.
How it works
DMS is a server running replication software.
Create source & target connections for loading.
Schedule tasks to run on DMS server to move data
AWS creates tables and Primary Keys
SCT converts tables, indexes and more
Can do MySQL to RDS MySQL (same engine), or a different engine (e.g. MySQL to RDS Oracle).
Migration Types
1. Full load – all existing data is moved from source to target in parallel (changes are cached on the replication server).
2. Full load and CDC – also captures changes on the source tables during the migration (guarantees transactional integrity).
3. CDC only – only replicate changes from the source DB.
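A rough boto3 sketch of creating a full-load-plus-CDC replication task is below; the endpoint and replication instance ARNs are placeholders, and the table mapping simply includes every table in a hypothetical "sales" schema.

import json
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Select every table in a hypothetical "sales" schema.
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-sales-schema",
            "object-locator": {"schema-name": "sales", "table-name": "%"},
            "rule-action": "include",
        }
    ]
}

dms.create_replication_task(
    ReplicationTaskIdentifier="sales-full-load-and-cdc",
    SourceEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:SOURCE-EXAMPLE",
    TargetEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:TARGET-EXAMPLE",
    ReplicationInstanceArn="arn:aws:dms:us-east-1:111122223333:rep:INSTANCE-EXAMPLE",
    MigrationType="full-load-and-cdc",  # migration type 2 from the list above
    TableMappings=json.dumps(table_mappings),
)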
AWS Application Discovery service
- Helps you easily plan app migrations to the AWS Cloud.
- Agentless discovery performed through vCenter (easy VM discovery).
- Agent-based discovery (install an agent on Linux/Windows to collect details on VMs and physical hosts).
AWS Application Migration Service
This service is an automated lift-and-shift solution for expediting migration of apps to AWS and can be used for physical,
virtual, or cloud servers to avoid cutover windows or disruptions.
Automated lift-and-shift for migrating infrastructure to AWS.
Replicates source servers (VMs, physical, or cloud) into AWS for non-disruptive cutovers.
RTO is measured in minutes, RPO in sub-seconds.
Question
What tool would you recommend as the best choice for transferring large amounts of data between on-premises storage,
edge locations, other clouds, and AWS Storage services?
Solution
AWS Snowmobile is not the most cost-effective choice and is not as flexible in its use cases as AWS DataSync. Reference:
AWS DataSync
Correct Answer
AWS DataSync is great for online data transfers to simplify, automate, and accelerate copying large amounts of data
between on-premises storage, edge locations, other clouds, and AWS Storage services. Reference: AWS DataSync
Chapter 21 Front-End Web and Mobile
AWS Amplify
Tools for front end web and mobile devs to quickly build full-stack applications.
Amplify Studio – easy auth, dev & ready-to-use components.
Scenario – managed server-side rendering, easy mobile dev & running full-stack apps.
AWS Device Farm
App testing service for iOS, Android, and web apps.
Remote access – login and interact with the phone
Scenario – need phones/ tablet for app testing, use this
Amazon Pinpoint
Engage with customers through a variety of channels – email, SMS, push notifications.
Who uses it? Marketers and business users, to promote products, send order confirmations, or send bulk communications to target groups; includes machine learning for user-engagement predictions.
Chapter 22 Machine Learning Front-End Web and Mobile
Amazon Alexa
- Uses Amazon Polly, Lex and Transcribe.
Textract = OCR, extracts text from images and documents.
Amazon Pinpoint - allows users to easily engage millions of customers via different communication channels.
Polly – text into lifelike speech
Kendra – create intelligent search service powered by ML.
Amazon Comprehend - a natural-language processing (NLP) service that uses machine learning (ML) to uncover
information in unstructured data
Amazon SageMaker - build and train machine learning models
Amazon Forecast is a time-series forecasting service based on machine learning (ML) and built for business metrics
analysis.
Amazon Transcribe – converts speech to text.
Lex – build conversational chatbots (voice and text).
Amazon Rekognition – image and video analysis (objects, scenes, faces).
Amazon Elastic Transcoder - to convert (or “transcode”) media files from their source format into versions that will playback
on devices like smartphones, tablets and PCs.
Misc Qs
Question
A small startup company has multiple departments with small teams representing each department. They have hired you to
configure Identity and Access Management in their AWS account. The team expects to grow rapidly, and promote from
within which could mean promoted team members switching over to a new team fairly often. How can you configure IAM
to prepare for this type of growth?
Solution
Create the user accounts, create a group for each department, create and attach an appropriate policy to each group, and
place each user account into their department’s group. When new team members are onboarded, create their account and
put them in the appropriate group. If an existing team member changes departments, move their account to their new IAM
group.
An IAM group is a collection of IAM users. Groups let you specify permissions for multiple users, which can make it easier to
manage the permissions for those users. For example, you could have a group called Admins and give that group the types
of permissions that administrators typically need. Any user in that group automatically has the permissions that are
assigned to the group. If a new user joins your organization and needs administrator privileges, you can assign the
appropriate permissions by adding the user to that group. Similarly, if a person changes jobs in your organization, instead of
editing that user's permissions, you can remove the user from the old groups and add the user to the appropriate new
groups.
Question
You work for an Australian company, who are currently being audited and need some compliance reports regarding your
applications that are hosted on AWS. Specifically, they need an Australian Hosting Certification Framework - Strategic
Certification certificate. You need to get this as quickly as possible. What should you do?
-
Use AWS Certificate Manager to generate the certificate
Use AWS Trusted Advisor to generate the report
Use Amazon Detective to generate the report
Use AWS Artifact to download the certificate
Solution
NOT - AWS Certificate Manager is used to create and store SSL certificates, not certification certificates.
AWS Artifact is a single source you can visit to get the compliance-related information that matters to you, such as AWS
security and compliance reports or select online agreements.
Question
Your company has gotten back results from an audit. One of the mandates from the audit is that your application, which is
hosted on EC2, must encrypt the data before writing this data to storage. It has been directed internally that you must have
the ability to manage dedicated hardware security module instances to generate and store your encryption keys. Which
service could you use to meet this requirement?
-
AWS CloudHSM
AWS KMS
Amazon EBS encryption
AWS Security Token Service
Solution
The AWS CloudHSM service helps you meet corporate, contractual, and regulatory compliance requirements for data
security by using dedicated Hardware Security Module (HSM) instances within the AWS cloud. A Hardware Security Module
(HSM) provides secure key storage and cryptographic operations within a tamper-resistant hardware device. HSMs are
designed to securely store cryptographic key material and use the key material without exposing it outside the
cryptographic boundary of the hardware. You should use AWS CloudHSM when you need to manage the HSMs that
generate and store your encryption keys. In AWS CloudHSM, you create and manage HSMs, including creating users and
setting their permissions. You also create the symmetric keys and asymmetric key pairs that the HSM stores. AWS
Documentation: When to use AWS CloudHSM.
NOT - AWS Security Token Service (AWS STS) is a web service that enables you to request temporary, limited-privilege
credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users).
Question
You have been put in charge of S3 buckets for your company. The buckets are separated based on the type of data they are
holding and the level of security required for that data. You have several buckets that have data you want to safeguard from
accidental deletion. Which configuration will meet this requirement?
-
Configure cross-account access with an IAM Role prohibiting object deletion in the bucket.
Signed URLs to all users to access the bucket.
Enable versioning on the bucket and multi-factor authentication delete as well.
Archive sensitive data to Amazon Glacier.
Solution
Correct. Versioning is a means of keeping multiple variants of an object in the same bucket. You can use versioning to
preserve, retrieve, and restore every version of every object stored in your Amazon S3 bucket. With versioning, you can
easily recover from both unintended user actions and application failures. When you enable versioning for a bucket, if
Amazon S3 receives multiple write requests for the same object simultaneously, it stores all of the objects. Key point:
versioning is turned off by default. If a bucket's versioning configuration is MFA Delete–enabled, the bucket owner must
include the x-amz-mfa request header in requests to permanently delete an object version or change the versioning state of
the bucket. References: https://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html
https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMFADelete.html
Incorrect. Cross-account access has to do with the ability to access buckets in other accounts. A second account was not
mentioned in the question. And you do not want to completely prohibit deletion in the bucket. There may be valid reasons
to delete objects. The key phrase is “safeguard from accidental deletion”.
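A hedged boto3 sketch of turning on versioning with MFA Delete is below; note that MFA Delete can only be enabled by the bucket owner's root credentials, and the bucket name, MFA device ARN, and token are placeholders.

import boto3

s3 = boto3.client("s3")

# Placeholders: the MFA argument is "<mfa-device-arn> <current-token>".
s3.put_bucket_versioning(
    Bucket="example-sensitive-data-bucket",
    MFA="arn:aws:iam::111122223333:mfa/root-account-mfa-device 123456",
    VersioningConfiguration={
        "Status": "Enabled",      # keep every version of every object
        "MFADelete": "Enabled",   # require MFA to permanently delete a version
    },
)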
Question
An application team has decided to leverage AWS for their application infrastructure. The application performs proprietary,
internal processes that other business applications utilize for their daily workloads. It is built with Apache Kafka to handle
real-time streaming, which virtual machines running the application in docker containers consume the data from. The team
wants to leverage services that provide less overhead but also cause the least amount of disruption to coding and
deployments. Which combination of AWS services would best meet the requirements?
-
Amazon Kinesis Data Streams
Amazon ECS Fargate
Amazon SNS
Amazon MSK
AWS Lambda
Amazon MQ
Solution
Fargate containers offer the least disruptive changes, while also minimizing the operational overhead of managing the
compute services. Reference: What is AWS Fargate?
MSK - This service is meant for applications that currently use or are going to use Apache Kafka for messaging. It allows for
managing of control plane operations in AWS. Reference: Welcome to the Amazon MSK Developer Guide
Question
Your company is in the process of creating a multi-region disaster recovery solution for your database, and you have been
tasked to implement it. The required RTO is 1 hour, and the RPO is 15 minutes. What steps can you take to ensure these
thresholds are met?
- Take EBS snapshots of the required EC2 instances nightly. In the event of a disaster, restore the snapshots to another region.
- Use Redshift to host your database. Enable "multi-region" failover with Redshift. In the event of a failure, do nothing, as Redshift will handle it for you.
- Use RDS to host your database. Create a cross-region read replica of your database. In the event of a failure, promote the read replica to be a standalone database. Send new reads and writes to this database.
- Use RDS to host your database. Enable the Multi-AZ option for your database. In the event of a failure, cut over to the secondary database.
Solution
RDS with RR - This would handle both your Recovery Time Objective and Recovery Point Objective. Your data is kept in the
secondary region and could easily be accessed when needed. https://aws.amazon.com/rds/features/read-replicas/
EBS snapshots would be an issue for both the Recovery Time Objective and the Recovery Point Objective. You cannot lose more than 15 minutes of data, and in this scenario you are only taking one backup per day, so the potential data loss would not be acceptable.
Question
You work for an organization that has multiple AWS accounts in multiple regions and multiple applications. You have been
tasked with making sure that all your firewall rules across these multiple accounts and regions are consistent. You need to
do this as quickly and efficiently as possible. Which AWS service would help you achieve this?
-
Amazon Detective
AWS Web Application Firewall (AWS WAF)
AWS Network Firewall
AWS Firewall Manager
Solution
YES - AWS Firewall Manager is a security management service in a single pane of glass. This allows you to centrally set up
and manage firewall rules across multiple AWS accounts and applications in AWS Organizations.
NO - Amazon Detective is a service that can analyze, investigate, and quickly identify the root cause of potential security
issues or suspicious activities. It is not used for firewall management.
Question
You have been instructed to configure a web application using stateless web servers. Which services can you use to handle
session state data?
Amazon RDS
Amazon DynamoDB
Amazon S3 Glacier
Amazon ElastiCache
Amazon Redshift
Solution
Correct. Amazon RDS can store session state data. It is slower than Amazon DynamoDB, but may be fast enough for some
situations.
ElastiCache and DynamoDB can both be used to store session data. https://aws.amazon.com/caching/session-management/
https://docs.aws.amazon.com/sdk-for-net/v2/developer-guide/dynamodb-session-net-sdk.html
Incorrect. The nature of Glacier is such that quick retrieval of session data is not possible.
Question
An international company has many clients around the world. These clients need to transfer gigabytes to terabytes of data
quickly and on a regular basis to an S3 bucket. Which S3 feature will enable these long distance data transfers in a secure
and fast manner?
-
Cross-account replication
Transfer Acceleration
AWS Snowmobile
Multipart upload
Solution
You might want to use Transfer Acceleration on a bucket for various reasons, including the following: You have customers
that upload to a centralized bucket from all over the world. You transfer gigabytes to terabytes of data on a regular basis
across continents. You are unable to utilize all of your available bandwidth over the Internet when uploading to Amazon S3.
https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration.html
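Enabling the feature is a one-call bucket configuration change; clients then upload through the accelerate endpoint. A minimal sketch with a hypothetical bucket name and object key:

import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Turn on Transfer Acceleration for a hypothetical bucket.
s3.put_bucket_accelerate_configuration(
    Bucket="example-global-uploads-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Clients then upload through the accelerate endpoint by opting in via the SDK config.
accel_s3 = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
accel_s3.upload_file("report.csv", "example-global-uploads-bucket", "uploads/report.csv")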
Question
A small software team is creating an application which will give subscribers real-time weather updates. The
application will run on EC2 and will make several requests to AWS services such as S3 and DynamoDB. What is the
best way to grant permissions to these other AWS services?
Create an IAM user, grant the user permissions, and pass the user credentials to the application.
Embed the appropriate credentials to access AWS services in the application.
Create an IAM role that you attach to the EC2 instance to give temporary security credentials to applications
running on the instance.
Create an IAM policy that you attach to the EC2 instance to give temporary security credentials to applications
running on the instance.
Solution
Create an IAM role in the following situations: You're creating an application that runs on an Amazon Elastic Compute Cloud
(Amazon EC2) instance and that application makes requests to AWS. Don't create an IAM user and pass the user's
credentials to the application or embed the credentials in the application. Instead, create an IAM role that you attach to the
EC2 instance to give temporary security credentials to applications running on the instance. When an application uses these
credentials in AWS, it can perform all of the operations that are allowed by the policies attached to the role. For details, see
Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances.
https://docs.aws.amazon.com/IAM/latest/UserGuide/id.html#id_which-to-choose_role.
Incorrect. It is bad practice to embed credentials in this manner. It introduces vulnerabilities to attacks.
Question
You work in healthcare for an IVF clinic. You host an application on AWS, which allows patients to track their medication
during IVF cycles. The application also allows them to view test results, which contain sensitive medical data. You have a
regulatory requirement that the application is secure and you must use a firewall managed by AWS that enables control
and visibility over VPC-to-VPC traffic and prevents the VPCs hosting your sensitive application resources from accessing
domains using unauthorized protocols. What AWS service would support this?
-
AWS Firewall Manager
AWS WAF
AWS Network Firewall
AWS PrivateLink
Solution
The AWS Network Firewall infrastructure is managed by AWS, so you don’t have to worry about building and maintaining
your own network security infrastructure. AWS Network Firewall’s stateful firewall can incorporate context from traffic
flows, like tracking connections and protocol identification, to enforce policies such as preventing your VPCs from accessing
domains using an unauthorized protocol. AWS Network Firewall gives you control and visibility of VPC-to-VPC traffic to
logically separate networks hosting sensitive applications or line-of-business resources.
NOT - AWS WAF is a web application firewall that helps protect web applications from attacks by allowing you to configure
rules that allow, block, or monitor (count) web requests based on conditions that you define. These conditions include IP
addresses, HTTP headers, HTTP body, URI strings, SQL injection and cross-site scripting.
Question
Jennifer is a cloud engineer for her application team. The application leverages several third-party SaaS vendors to
complete their workflows within the application. Currently, the team uses numerous AWS Lambda functions for each SaaS
vendor that run daily to connect to the configured vendor. The functions initiate a transfer of data files, ranging from one
megabyte up to 80 gibibytes in size. These data files are stored in an Amazon S3 bucket and then referenced by the
application itself. The data transfer routinely fails due to execution timeout limits in the Lambda functions, and the team
wants to find a simpler and less error-prone way of transferring the required data. Which solution or AWS service could be
the best fit for their solution?
1. Amazon AppFlow
2. Amazon EKS with Auto Scaling
3. Amazon EC2 Auto Scaling Groups
4. Increase the Lambda function timeouts to one hour
Solution
AppFlow offers a fully managed service for easily automating the exchange of data between SaaS vendors and AWS services
like Amazon S3. You can transfer up to 100 gibibytes per flow, and this avoids the Lambda function timeouts. Reference:
What is Amazon AppFlow? Tutorial: Transfer data between applications with Amazon AppFlow
While this could help remove the timeout errors, this would in turn require more operational overhead and would not be
cost-effective. There is a simpler solution.
QUESTION 18
You are managing data storage for your company, and there are many EBS volumes. Your management team has given you
some new requirements. Certain metrics on the EBS volumes need to be monitored, and the database team needs to be
notified by email when certain metric thresholds are exceeded. Which AWS services can be configured to meet these
requirements?
(Choose 2)
-
SWF
CloudWatch
SES
SQS
SNS
Solution
Cloudwatch & SNS
Question
A company’s security team requests that network traffic be captured in VPC Flow Logs. The logs will be frequently accessed
for 90 days and then accessed intermittently.
What should a solutions architect do to meet these requirements when configuring the logs?
A. Use Amazon CloudWatch as the target. Set the CloudWatch log group with an expiration of 90 days
B. Use Amazon Kinesis as the target. Configure the Kinesis stream to always retain the logs for 90 days.
C. Use AWS CloudTrail as the target. Configure CloudTrail to save to an Amazon S3 bucket, and enable S3 Intelligent-Tiering.
D. Use Amazon S3 as the target. Enable an S3 Lifecycle policy to transition the logs to S3 Standard-Infrequent Access (S3
Standard-IA) after 90 days.
solution
D
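A minimal sketch of what option D looks like with boto3 is below: publish the flow logs straight to an S3 bucket, then attach a lifecycle rule that moves them to S3 Standard-IA after 90 days. The VPC ID and bucket name are assumptions for illustration.

import boto3

ec2 = boto3.client("ec2")
s3 = boto3.client("s3")

# Send VPC Flow Logs directly to a hypothetical S3 bucket.
ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],
    ResourceType="VPC",
    TrafficType="ALL",
    LogDestinationType="s3",
    LogDestination="arn:aws:s3:::example-flow-logs-bucket",
)

# After 90 days the logs are only accessed intermittently, so transition them to Standard-IA.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-flow-logs-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "flow-logs-to-standard-ia",
                "Filter": {"Prefix": "AWSLogs/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 90, "StorageClass": "STANDARD_IA"}],
            }
        ]
    },
)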
question
A company recently migrated its entire IT environment to the AWS Cloud. The company discovers that users are
provisioning oversized Amazon EC2 instances and modifying security group rules without using the appropriate change
control process. A solutions architect must devise a strategy to track and audit these inventory and configuration changes.
Which actions should the solutions architect take to meet these requirements? (Choose two.)
A. Enable AWS CloudTrail and use it for auditing.
B. Use data lifecycle policies for the Amazon EC2 instances.
C. Enable AWS Trusted Advisor and reference the security dashboard.
D. Enable AWS Config and create rules for auditing and compliance purposes.
E. Restore previous resource configurations with an AWS CloudFormation template.
Answer:
A. Enable AWS CloudTrail and use it for auditing. CloudTrail provides event history of your AWS account activity, including
actions taken through the AWS Management Console, AWS Command Line Interface (CLI), and AWS SDKs and APIs. By
enabling CloudTrail, the company can track user activity and changes to AWS resources, and monitor compliance with
internal policies and external regulations.
D. Enable AWS Config and create rules for auditing and compliance purposes. AWS Config provides a detailed inventory of
the AWS resources in your account, and continuously records changes to the configurations of those resources. By creating
rules in AWS Config, the company can automate the evaluation of resource configurations against desired state, and receive
alerts when configurations drift from compliance.
NOT B) Data lifecycle policies control when EC2 instances are backed up or deleted but do not audit configuration changes.
C) AWS Trusted Advisor security checks may detect some compliance violations after the fact but do not comprehensively
log changes like AWS CloudTrail and AWS Config do.
Q, https://www.examtopics.com/discussions/amazon/view/99795-exam-aws-certified-solutions-architect-associate-saac03/
A, Reckon C
Question
A company runs a web application on Amazon EC2 instances in multiple Availability Zones. The EC2 instances are in private
subnets. A solutions architect implements an internet-facing Application Load Balancer (ALB) and specifies the EC2
instances as the target group. However, the internet traffic is not reaching the EC2 instances.
How should the solutions architect reconfigure the architecture to resolve this issue?
A. Replace the ALB with a Network Load Balancer. Configure a NAT gateway in a public subnet to allow internet traffic.
B. Move the EC2 instances to public subnets. Add a rule to the EC2 instances’ security groups to allow outbound traffic to
0.0.0.0/0.
C. Update the route tables for the EC2 instances’ subnets to send 0.0.0.0/0 traffic through the internet gateway route. Add
a rule to the EC2 instances’ security groups to allow outbound traffic to 0.0.0.0/0.
D. Create public subnets in each Availability Zone. Associate the public subnets with the ALB. Update the route tables for
the public subnets with a route to the private subnets.
Solution
https://repost.aws/knowledge-center/public-load-balancer-private-ec2
D. is the correct solution. By creating public subnets and associating them with the ALB, inbound internet traffic can reach
the ALB. The route tables for the public subnets are updated to include a route to the private subnets, allowing traffic to
reach the EC2 instances in the private subnets. This setup enables secure access to the application while allowing internet
traffic to reach the EC2 instances through the ALB.
QUESTION 5
You have been evaluating the NACLs in your company. Currently, you are looking at the default network ACL. Which
statement is true about NACLs?
1. The default configuration of the default NACL is Deny, and the default configuration of a custom NACL is Allow.
2. The default configuration of the default NACL is Allow, and the default configuration of a custom NACL is Deny.
3. The default configuration of the default NACL is Allow, and the default configuration of a custom NACL is Allow.
4. The default configuration of the default NACL is Deny, and the default configuration of a custom NACL is Deny.
Solution
The correct answer is 2: the default configuration of the default NACL is Allow, and the default configuration of a custom NACL is Deny.
Your VPC automatically comes with a modifiable default network ACL. By default, it allows all inbound and outbound IPv4
traffic and, if applicable, IPv6 traffic. You can create a custom network ACL and associate it with a subnet. By default, each
custom network ACL denies all inbound and outbound traffic until you add rules.
https://docs.aws.amazon.com/vpc/latest/userguide/vpc-network-acls.html#default-network-acl
Question:
A team of architects is designing a new AWS environment for a company which wants to migrate to the Cloud. The
architects are considering the use of EC2 instances with instance store volumes. The architects realize that the data on the
instance store volumes are ephemeral. Which action will not cause the data to be deleted on an instance store volume?
1. The underlying disk drive fails.
2. Hardware disk failure.
3. Reboot
4. Instance is stopped
Solution
Correct answer: 3 (Reboot). Some Amazon Elastic Compute Cloud (Amazon EC2) instance types come with a form of directly attached, block-device storage known as the instance store. The instance store is ideal for temporary storage, because data in the instance store is lost under any of the following circumstances:
- The underlying disk drive fails
- The instance stops
- The instance terminates
- Hardware disk failure
Data on the instance store persists through a reboot.
References: https://aws.amazon.com/premiumsupport/knowledge-center/instance-store-vs-ebs/
Question
You are working in a large healthcare facility that uses EBS volumes on most of the EC2 instances. The CFO has approached
you about some cost savings, and it has been decided that some of the EC2 instances and EBS volumes would be deleted.
What step can be taken to preserve the data on the EBS volumes and ensure the data can be restored to a new EBS volume
within minutes?
- Store the data in CloudFormation user data.
- Move the data to Amazon S3.
- Take point-in-time snapshots of your Amazon EBS volumes.
- Use S3 Glacier using the Standard retrieval tier.
Solution
You can back up the data on your Amazon EBS volumes to Amazon S3 by taking point-in-time snapshots. Snapshots are
incremental backups, which means that only the blocks on the device that have changed after your most recent snapshot
are saved. This minimizes the time required to create the snapshot and saves on storage costs by not duplicating data.
When you delete a snapshot, only the data unique to that snapshot is removed. Each snapshot contains all of the
information that is needed to restore your data (from the moment when the snapshot was taken) to a new EBS volume.
Reference: Amazon EBS Snapshots
S3 Glacier's Standard retrieval tier takes several hours for data restoration, which would not meet the time requirement.
Additionally, the process involves transferring data to S3 before archiving, making it far less efficient than using EBS
snapshots.
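As a quick illustration of the snapshot/restore cycle described above, here is a hedged boto3 sketch; the volume ID, Availability Zone, and tags are made-up placeholders.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Take a point-in-time, incremental snapshot of a hypothetical EBS volume.
snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="Pre-decommission backup",
    TagSpecifications=[{"ResourceType": "snapshot",
                        "Tags": [{"Key": "team", "Value": "healthcare-it"}]}],
)
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])

# Later: restore the data to a brand-new volume within minutes.
ec2.create_volume(
    SnapshotId=snapshot["SnapshotId"],
    AvailabilityZone="us-east-1a",
    VolumeType="gp3",
)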
Question
After an IT Steering Committee meeting you have been put in charge of configuring a hybrid environment for the company’s
compute resources. You weigh the pros and cons of various technologies based on the requirements you are given. Your
primary requirement is the necessity for a private, dedicated connection, which bypasses the Internet and can provide
throughput of 10 Gbps. Which option will you select?
-
AWS Direct Gateway
VPC Peering
AWS VPN
AWS Direct Connect
Solution
AWS Direct Connect – it provides a private, dedicated connection that bypasses the internet and can deliver 10 Gbps of throughput.
Question
You work for an online retailer where any downtime at all can cause a significant loss of revenue. You have architected your
application to be deployed on an Auto Scaling Group of EC2 instances behind a load balancer. You have configured and
deployed these resources using a CloudFormation template. The Auto Scaling Group is configured with default settings and
a simple CPU utilization scaling policy. You have also set up multiple Availability Zones for high availability. The load balancer
does health checks against an HTML file generated by script. When you begin performing load testing on your application
and notice in CloudWatch that the load balancer is not sending traffic to one of your EC2 instances. What could be the
problem?
- The instance has not been registered with CloudWatch.
- You are load testing at a moderate traffic level and not all instances are needed.
- The EC2 instance has failed EC2 status checks.
- The EC2 instance has failed the load balancer health check.
Solution
Load balancer health check
Question
The AWS team in a large company is spending a lot of time monitoring EC2 instances and maintenance when the instances
report health check failures. How can you most efficiently automate this monitoring and repair?
- Create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically reboots the instance if a health check fails.
- Create a Lambda function which can be triggered by a failed instance health check. Have the Lambda function destroy the instance and spin up a new instance.
- Create a cron job which monitors the instances periodically and starts a new instance if a health check has failed.
- Create a Lambda function which can be triggered by a failed instance health check. Have the Lambda function deploy a CloudFormation template which can perform the creation of a new instance.
Solution
You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically reboots the
instance. The reboot alarm action is recommended for Instance Health Check failures (as opposed to the recover alarm
action, which is suited for System Health Check failures). An instance reboot is equivalent to an operating system reboot. In
most cases, it takes only a few minutes to reboot your instance. When you reboot an instance, it remains on the same
physical host, so your instance keeps its public DNS name, private IP address, and any data on its instance store volumes.
Rebooting an instance doesn't start a new instance billing hour, unlike stopping and restarting your instance.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/UsingAlarmActions.html
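A hedged boto3 sketch of that alarm is below: it watches the instance status check and triggers the built-in EC2 reboot action. The instance ID and region are placeholders.

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Reboot a hypothetical instance when its instance status check fails twice in a row.
cloudwatch.put_metric_alarm(
    AlarmName="reboot-on-instance-check-failure",
    Namespace="AWS/EC2",
    MetricName="StatusCheckFailed_Instance",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=2,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=["arn:aws:automate:us-east-1:ec2:reboot"],  # built-in EC2 reboot action
)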
Question
A new startup is considering the advantages of using Amazon DynamoDB versus a traditional relational database in AWS
RDS. The NoSQL nature of DynamoDB presents a small learning curve to the team members who all have experience with
traditional databases. The company will have multiple databases, and the decision will be made on a case-by-case basis.
Which of the following use cases would favor Amazon DynamoDB?
(Choose 3)
- Storing binary large object (BLOB) data
- Storing metadata for S3 objects
- Strong referential integrity between tables
- Managing web session data
- Online analytical processing (OLAP)/data warehouse implementations
- High-performance reads and writes for online transaction processing (OLTP) workloads
Solution
The three use cases that favor DynamoDB are: storing metadata for S3 objects, managing web session data, and high-performance reads and writes for OLTP workloads. BLOB storage is better suited to S3, strong referential integrity to a relational database, and OLAP/data warehousing to Redshift.
Areas not best understood
- Options for high traffic, with VPC endpoints, R53, CloudFront, LBs etc. – not sure of the best option
- Lambda – difference between provisioned and reserved concurrency.
- AWS Transfer Family - https://aws.amazon.com/aws-transfer-family/
- Redshift Spectrum …
- Lake Formation
- Endpoints – VPC endpoints / public endpoints
- IoT Core
- Traffic routing best options – Route 53, Global Accelerator, CloudFront (done in caching chapter)
- Stateful vs stateless (NACLs & SGs)
- Kinesis or CloudWatch - https://www.examtopics.com/discussions/amazon/view/5338-exam-aws-certified-solutions-architect-professional-topic-1/
- Privileges – when to use groups, roles / resource-based policies -
https://www.examtopics.com/discussions/amazon/view/85816-exam-aws-certified-solutions-architect-associate-saac03/
https://www.examtopics.com/discussions/amazon/view/46539-exam-aws-certified-solutions-architect-associate-saac02/
Question
A company is preparing to deploy a new serverless workload. A solutions architect must use the principle of least
privilege to configure permissions that will be used to run an AWS Lambda function. An Amazon EventBridge (Amazon
CloudWatch Events) rule will invoke the function.
Which solution meets these requirements?
A. Add an execution role to the function with lambda:InvokeFunction as the action and * as the principal.
B. Add an execution role to the function with lambda:InvokeFunction as the action and Service:
lambda.amazonaws.com as the principal.
C. Add a resource-based policy to the function with lambda:* as the action and Service: events.amazonaws.com as
the principal.
D. Add a resource-based policy to the function with lambda:InvokeFunction as the action and Service:
events.amazonaws.com as the principal.
Solution
The correct solution is D. Add a resource-based policy to the function with lambda:InvokeFunction as the action and
Service: events.amazonaws.com as the principal.
The principle of least privilege requires that permissions are granted only to the minimum necessary to perform a
task. In this case, the Lambda function needs to be able to be invoked by Amazon EventBridge (Amazon CloudWatch
Events). To meet these requirements, you can add a resource-based policy to the function that allows the
InvokeFunction action to be performed by the Service: events.amazonaws.com principal. This will allow Amazon
EventBridge to invoke the function, but will not grant any additional permissions to the function.
- https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-use-resource-based.html#eb-lambda-permissions
Why the other options are wrong
Option A is incorrect because an execution role defines what the function itself is allowed to do; it cannot grant invoke permission to a caller, and using * as a principal would be overly permissive in any case.
Option B is incorrect for the same reason: permission for another service to invoke the function belongs in the function's resource-based policy, not its execution role, and lambda.amazonaws.com is the wrong principal because it is EventBridge, not the Lambda service, that invokes the function.
Option C is incorrect because it grants the lambda:* action to the Service: events.amazonaws.com principal, which would allow Amazon EventBridge to perform any action on the function and goes beyond the minimum permissions needed.
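Option D can be expressed with the Lambda AddPermission API. The sketch below is illustrative only; the function name, rule ARN, and account ID are placeholders.
Example (Python, boto3):
import boto3

lambda_client = boto3.client("lambda")

# Grant only EventBridge permission to invoke only this function, optionally limited to one rule.
lambda_client.add_permission(
    FunctionName="my-function",                                       # placeholder function name
    StatementId="AllowEventBridgeInvoke",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn="arn:aws:events:eu-west-1:123456789012:rule/my-rule",   # placeholder rule ARN
)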
Question
A company is migrating its on-premises workload to the AWS Cloud. The company already uses several Amazon EC2
instances and Amazon RDS DB instances. The company wants a solution that automatically starts and stops the EC2
instances and DB instances outside of business hours. The solution must minimize cost and infrastructure
maintenance.
Which solution will meet these requirements?
A. Scale the EC2 instances by using elastic resize. Scale the DB instances to zero outside of business hours.
B. Explore AWS Marketplace for partner solutions that will automatically start and stop the EC2 instances and DB
instances on a schedule.
C. Launch another EC2 instance. Configure a crontab schedule to run shell scripts that will start and stop the existing
EC2 instances and DB instances on a schedule.
D. Create an AWS Lambda function that will start and stop the EC2 instances and DB instances. Configure Amazon
EventBridge to invoke the Lambda function on a schedule.
Solution
Option D leverages AWS Lambda and Amazon EventBridge to start and stop the resources automatically on a schedule:
- Lambda runs the stop/start code without any servers to manage.
- EventBridge invokes the function on a schedule, so no cron host is needed.
- No extra EC2 instance or third-party tools are required.
- The solution is serverless and effectively maintenance-free, minimizing cost.
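A minimal sketch of the kind of function option D describes is below; the instance identifiers are placeholders, and the start/stop action is assumed to arrive via the EventBridge rule's constant JSON input (e.g. {"action": "stop"}).
Example (Python, boto3):
import boto3

ec2 = boto3.client("ec2")
rds = boto3.client("rds")

EC2_INSTANCE_IDS = ["i-0123456789abcdef0"]   # placeholder IDs
RDS_INSTANCE_ID = "mydb"                     # placeholder identifier

def lambda_handler(event, context):
    # Two EventBridge schedule rules invoke this function, passing {"action": "start"} or {"action": "stop"}.
    action = event.get("action", "stop")
    if action == "stop":
        ec2.stop_instances(InstanceIds=EC2_INSTANCE_IDS)
        rds.stop_db_instance(DBInstanceIdentifier=RDS_INSTANCE_ID)
    else:
        ec2.start_instances(InstanceIds=EC2_INSTANCE_IDS)
        rds.start_db_instance(DBInstanceIdentifier=RDS_INSTANCE_ID)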
Question
A company has resources across multiple AWS Regions and accounts. A newly hired solutions architect discovers a
previous employee did not provide details about the resources inventory. The solutions architect needs to build and
map the relationship details of the various workloads across all accounts.
Which solution will meet these requirements in the MOST operationally efficient way?
A. Use AWS Systems Manager Inventory to generate a map view from the detailed view report.
B. Use AWS Step Functions to collect workload details. Build architecture diagrams of the workloads manually.
C. Use Workload Discovery on AWS to generate architecture diagrams of the workloads.
D. Use AWS X-Ray to view the workload details. Build architecture diagrams with relationships.
Solution
Workload Discovery is purpose-built to automatically generate visual mappings of architectures across accounts and
Regions. This makes it the most operationally efficient way to meet the requirements.
AWS Well-Architected Framework
Framework Overview
The AWS Well-Architected Framework describes key concepts, design principles, and architectural best practices for
designing and running workloads in the cloud. By answering a few foundational questions, learn how well your architecture
aligns with cloud best practices and gain guidance for making improvements.
6 Pillars
Operational Excellence Pillar
The operational excellence pillar focuses on running and
monitoring systems, and continually improving processes
and procedures. Key topics include automating changes,
responding to events, and defining standards to manage
daily operations.
Security Pillar
The security pillar focuses on protecting information and
systems. Key topics include confidentiality and integrity of
data, managing user permissions, and establishing
controls to detect security events.
Performance Efficiency Pillar
The performance efficiency pillar focuses on structured
and streamlined allocation of IT and computing resources.
Key topics include selecting resource types and sizes
optimized for workload requirements, monitoring
performance, and maintaining efficiency as business needs
evolve.
Cost Optimization Pillar
The cost optimization pillar focuses on avoiding
unnecessary costs. Key topics include understanding
spending over time and controlling fund allocation,
selecting resources of the right type and quantity, and
scaling to meet business needs without overspending.
Reliability Pillar
The reliability pillar focuses on workloads performing their
intended functions and how to recover quickly from failure
to meet demands. Key topics include distributed system
design, recovery planning, and adapting to changing
requirements.
Sustainability Pillar
The sustainability pillar focuses on minimizing the
environmental impacts of running cloud workloads. Key
topics include a shared responsibility model for
sustainability, understanding impact, and maximizing
utilization to minimize required resources and reduce
downstream impacts.
Design Resilient Architectures
AWS Shared Responsibility Model
Question
A company is migrating its applications and databases to the AWS Cloud. The company will use Amazon Elastic
Container Service (Amazon ECS), AWS Direct Connect, and Amazon RDS.
Which activities will be managed by the company's operational team? (Choose three.)
A. Management of the Amazon RDS infrastructure layer, operating system, and platforms
B. Creation of an Amazon RDS DB instance and configuring the scheduled maintenance window
C. Configuration of additional software components on Amazon ECS for monitoring, patch management, log
management, and host intrusion detection
D. Installation of patches for all minor and major database versions for Amazon RDS
E. Ensure the physical security of the Amazon RDS infrastructure in the data center
F. Encryption of the data that moves in transit through Direct Connect
Solution
B: Creating an RDS DB instance and configuring the scheduled maintenance window is done by the customer.
C: Adding monitoring, patch management, log management, and host intrusion detection software on ECS is the customer's responsibility.
F: Encrypting data in transit over Direct Connect is handled by the customer.
Question
A company wants to use Amazon Elastic Container Service (Amazon ECS) clusters and Amazon RDS DB instances to
build and run a payment processing application. The company will run the application in its on-premises data center
for compliance purposes.
A solutions architect wants to use AWS Outposts as part of the solution. The solutions architect is working with the
company's operational team to build the application.
Which activities are the responsibility of the company's operational team? (Choose three.)
A. Providing resilient power and network connectivity to the Outposts racks
B. Managing the virtualization hypervisor, storage systems, and the AWS services that run on Outposts
C. Physical security and access controls of the data center environment
D. Availability of the Outposts infrastructure including the power supplies, servers, and networking equipment within
the Outposts racks
E. Physical maintenance of Outposts components
F. Providing extra capacity for Amazon ECS clusters to mitigate server failures and maintenance events
Solution
According to the AWS Shared Responsibility Model, AWS operates, manages, and controls the components from the host operating system and virtualization layer down to the physical security of the facilities in which a service operates. With Outposts, however, the racks sit in the customer's data center, so the customer is responsible for the physical security and access controls of that environment and for providing resilient power and network connectivity to the Outposts racks.
A and C are therefore correct. D is wrong because availability of the hardware "within the Outposts racks" (power supplies, servers, networking equipment) remains AWS's responsibility. Between E and F, E is wrong because the Outposts rack FAQs (https://aws.amazon.com/outposts/rack/faqs/) state: "If there is a need to perform physical maintenance, AWS will reach out to schedule a time to visit your site. AWS may replace a given module as appropriate but will not perform any host or network switch servicing on customer premises." That leaves F: the customer must provide extra capacity for the ECS clusters to mitigate server failures and maintenance events.
Online tips
I wrote some things down after the exam about topics I didn't quite understand or wasn't sure about (and could still remember); hope this helps others!
- FSx replicating to DR – best option DataSync?
- Lots of migration questions using Application Migration Service, Server Migration Service, Application Discovery Service, Migration Evaluator
- Lots of questions about Storage Gateway
- Lots of questions about Organizations
- DynamoDB auto scaling (application auto scaling!), how to be cost effective on peak loads
- Babelfish for Aurora PostgreSQL for MSSQL migration
- Session states in IAM
- Direct Connect and gateways
- When to use Fargate/Lambda/ECS/EKS/... regarding operational efficiency, cost efficiency, using managed services, ...
Useful Tutorials Dojo cheat sheets:
https://tutorialsdojo.com/aws-cheat-sheets/