Cloud Concepts
• cloud computing - using a network of remote servers hosted on the Internet to store, manage and process data, rather than doing it locally
• evolution of cloud hosting
  – dedicated server - one physical machine
  – VPS - one physical machine virtualized into sub-machines
  – shared hosting - one physical machine shared by hundreds of businesses - cheap, but poor isolation
  – cloud hosting - multiple physical machines acting as one system, abstracted into multiple cloud services - distributed computing
• AWS - Amazon Web Services - CSP - Cloud Service Provider
  – first service - Simple Queue Service SQS
  – Simple Storage Service S3
  – Elastic Compute Cloud EC2

CSP
• CSP
  – provides cloud services
  – services can be chained into cloud architectures
  – accessible via a Single Unified API eg. AWS API
  – metered billing
  – rich monitoring eg. AWS CloudTrail
  – IaaS offering - infrastructure as a service
  – automation via IaC (Infrastructure as Code)
  – landscape
    ∗ tier 1 - AWS, Azure, Google Cloud Platform GCP, Alibaba Cloud (China)
    ∗ tier 2 - IBM Cloud, Oracle Cloud, OpenStack
    ∗ tier 3 (VPS turned to IaaS) - Digital Ocean (with serverless), Linode, Vultr (with Kubernetes)
• Magic Quadrant (Gartner) - market research

Cloud Services
• Common Cloud Services
  – the 4 core types of IaaS services
    ∗ compute
    ∗ networking
    ∗ storage
    ∗ databases
  – AWS has 200+ cloud services
• AWS technology overview
  – compute - EC2 Virtual Machines
  – storage - EBS Virtual Hard Drives
  – DB - RDS SQL DBs
  – networking/CDN - VPC Private Cloud Network

Cloud Computing
• Dedicated -> VMs -> Containers -> Functions (Serverless)
• types of cloud computing
  – SaaS - Software as a Service - for customers - eg. Office 365
  – PaaS - Platform as a Service - for developers - eg. Heroku
  – IaaS - Infrastructure as a Service - for admins - eg. AWS, Azure
• deployment models
  – public cloud - everything is built on the CSP - new or small companies
  – private cloud - built on the company's own data centers (On-Premise)
  – hybrid - On-Premise + CSP - big companies, banks, FinTech
  – Cross-Cloud - multiple CSPs - eg. Azure Arc, Anthos

AWS Account
• root-user
• IAM - service
  – Account Alias - eg. company name
  – users - add user - name, access key (programmatic) and password for the console
  – user groups - different access policies - AdministratorAccess, PowerUserAccess
  – always use an IAM user (not root)
• region
  – some services have a region attached
  – CloudFront and S3 are Global
• metered billing
• budget
• SNS - Simple Notification Service

Digital Transformation
• Kondratiev waves - cycle-like waves in the global economy
• burning platform
• AWS PDF about transformation
• general computing (Elastic Compute Cloud EC2), GPU computing (Inf1), quantum computing (Amazon Braket)

Benefits of Cloud
• six core benefits defined by the AWS whitepaper
  – trade capital expense for variable expense
  – benefit from massive economies of scale
  – stop guessing capacity
  – increase speed and agility
  – no running data centers
  – go global in minutes
• generalized
  – cost-effective - pay-as-you-go pricing
  – agility
  – global
  – secure
  – reliable
  – scalable
  – elastic
  – current (maintained and patched)

Global Infrastructure
• regions with one or more Availability Zones AZ (generally 3)
• us-east-1 is the main region (billing, first to launch new services)
• choosing a region
  – regulatory compliance
  – cost
  – available services
  – distance/latency to end user
• global services - S3, CloudFront, Route53, IAM
• AZ - physical location of one or more datacenters (generally 3 - High Availability)
  – described as a letter after the region eg. eu-central-1a
  – when launching a resource you choose a subnet of an AZ
• fault tolerance
  – fault domain - section of a network vulnerable to damage; if a failure occurs, it will not cascade outside the domain
  – fault level - collection of fault domains
  – AWS regions are isolated
  – AZ as an independent failure zone
  – Multi-AZ for High Availability
• AWS Global Network
  – Edge Locations - in and out
  – VPC Endpoints - ensure resources stay within the AWS Network, not over the public Internet

PoPs
• Point of Presence PoP
  – between AWS Region and end user
  – Edge Locations - cache the most popular files
  – Regional Edge Caches - cache less popular files
  – PoPs are at the edge/intersection of two networks
  – a tier 1 network is a network which can reach any other network on the internet without purchasing IP transit or paying for peering
  – AWS AZs are redundantly connected to multiple tier 1 transit providers
• services
  – CloudFront - CDN
  – S3 Transfer Acceleration - generates a URL for the end user to upload files to an edge location
  – Global Accelerator - optimal path from end user to web server

Direct Connect
• AWS Direct Connect - private/dedicated connection between datacenter/office and AWS
  – lower- and higher-bandwidth options
  – consistent network connection
  – increases bandwidth
• Direct Connect Locations - trusted partner data centers for Direct Connect

Zones
• Local Zones - datacenters located very close to densely populated areas
  – only specific services
  – for high demand
• Wavelength Zones - edge computing on 5G networks - EC2 instances at the edge of the targeted 5G network

Data Residency
• data residency - physical location of data
  – compliance boundaries
  – data sovereignty
  – AWS Outposts - physical rack of servers with AWS infrastructure
    ∗ 42U full rack
    ∗ 1U and 2U servers
  – AWS Config - AWS resources configuration
  – IAM Policies - denying access to specific AWS Regions

Gov
• government
  – regulatory compliance programs
  – GovCloud - FedRAMP (program with a standardized approach)
  – CSP offers an isolated region to run FedRAMP workloads
• AWS China - separate (within The Great Firewall) - needs certificate

Sustainability
• The Climate Pledge
• renewable energy by 2025
• cloud efficiency
• water stewardship

AWS Ground Station
• satellite communications
• schedule a Contact

Cloud Architecture
• solutions architect - technical solutions using multiple
• cloud architect - technical solutions using cloud services
  – availability
    ∗ workload across multiple AZs - Elastic Load Balancer
    ∗ min 2
  – scalability
    ∗ vertical scaling - bigger server
    ∗ horizontal scaling - more servers of the same size
  – elasticity
    ∗ automatically changing capacity
    ∗ horizontal scaling out (more) and in (less)
    ∗ Auto Scaling Groups ASG
    ∗ vertical scaling is difficult
  – fault tolerance
    ∗ no single point of failure
    ∗ fail-over - plan to shift traffic
    ∗ ex. secondary DB
    ∗ RDS Multi-AZ - secondary in a different AZ
  – disaster recovery
    ∗ prevent the loss of data
    ∗ backup, corruption prevention
    ∗ CloudEndure Disaster Recovery
  – two business factors - how secure? and how expensive?
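The elasticity bullets above (capacity changing automatically, scaling out and in) can be sketched as a simple proportional rule. This is a minimal illustration, not the algorithm AWS Auto Scaling uses: the function name, the metric (for example average CPU percent) and the default `max_size` are assumptions made for the example, while the default minimum of 2 mirrors the "min 2" availability note.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int = 2, max_size: int = 10) -> int:
    """Return how many instances to run so the per-instance metric nears target.

    Hypothetical helper: scales the fleet proportionally to metric / target,
    then clamps the result to the [min_size, max_size] bounds.
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    wanted = math.ceil(current * metric / target)   # scale out or in
    return max(min_size, min(max_size, wanted))     # respect group bounds


if __name__ == "__main__":
    print(desired_capacity(4, metric=90, target=60))   # load above target: scale out
    print(desired_capacity(4, metric=15, target=60))   # load below target: scale in
```

The clamp is the part that ties elasticity back to availability and cost: the lower bound keeps at least two instances running, and the upper bound caps spending.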
• Business Continuity Plan BCP - document on how to operate during an unplanned disruption
  – Recovery Point Objective RPO - how much data can be lost (time since the last recovery point)
  – Recovery Time Objective RTO - maximum downtime
• Disaster Recovery Options
  – Backup & Restore - hours (low cost)
  – Pilot Light - data replicated to another region with minimal services - 10 min
  – Warm Standby - scaled down copy - minutes
  – Multi-site - full copy in another region - real time (good for mission critical services)
• RTO against Acceptable Recovery Cost

AWS API
• HTTP API
• each AWS Service has a Service Endpoint
• AWS Management Console
  – console.aws.amazon.com
  – Service Consoles
• AWS SDK
• AWS tools for PowerShell

ARN
• Amazon Resource Name - unique IDs for AWS resources across all of AWS
• format arn:partition:service:region:account-id:resource-id
  – also arn:partition:service:region:account-id:resource-type:resource-id
  – also arn:partition:service:region:account-id:resource-type/resource-id
• partition - aws, aws-cn, aws-us-gov
• account id (12 digits)
• wildcard * for policies

AWS CLI
• Command Line Interface
  – terminal - text interface
  – console - physical computer used to input commands
  – shell - command line program to interact with - eg. Bash
• Python executable program
• aws
• AWS CloudShell in Console ex. aws s3 ls
• ~/.aws/credentials - setting up credentials

AWS SDK
• software development kit
• collection of software dev tools in one installable package
• can get a part of the bundle - for a specific service
• export ENV_VAR=123 - env variable in bash
• Cloud9 - IDE on EC2
  – automatic credentials
  – micro is free

AWS CloudShell
• shell in console
• AWS CLI, Python, Node, vim, wget, pip…
• 1 GB storage free per region
• Bash, PowerShell, Zsh
• only in some regions

IaC
• infrastructure as code
• configuration script automating creation, updates and destruction of cloud infrastructure
• blueprint
• CloudFormation CFN - declarative IaC - verbose, JSON / YAML
  – CDK generates CloudFormation
  – can lead to large files
  – limited in dynamic / repeatable infrastructure - CDK is better at it
  – example - S3 Bucket in YAML
      AWSTemplateFormatVersion: "2010-09-09"
      Resources:
        MyBucket:
          Type: AWS::S3::Bucket
      Outputs:
        MyBucketDomain:
          Value: !GetAtt MyBucket.DomainName
  – in the CFN service create a stack from the YAML file
  – in CLI aws cloudformation create-stack --stack-name <name> --template-body file://<file_path>
• Cloud Development Kit CDK - imperative IaC - implicit, does more, in Python / JS
  – TS was first
  – generates CFN templates
  – large library of reusable cloud components - CDK Constructs
  – great for CI/CD
  – testing framework for Unit and Integration testing
  – CDK manages state, unlike SDK
  – npm cdk
  – cdk init
  – cdk deploy

AWS Toolkit plugin for VSCode
• Explorer - resources in linked account
• CDK Explorers - stacks in CDK
• Elastic Container Services - IntelliSense for ECS task-definition files
• Serverless Applications

Access Keys
• aka AWS Credentials
• access key ID and secret access key
• you can have two active access keys per user
• ~/.aws/credentials
• in SDK - export
• aws configure

Shared Responsibility Model
• cloud security framework
• obligations of customer to CSP
• visualization of responsibility of customer and AWS
• basically if you can configure or insert data - you are responsible
• if you can't configure - CSP is responsible
• customer - security in the cloud - IN - Data and configuration
  – managed services / third party software
    ∗ platforms
    ∗ applications
    ∗ IAM
  – configuration of virtual infrastructure
    ∗ operating system
    ∗ network
    ∗ firewall
  – security configuration of data
    ∗ client side data encryption
    ∗ server side encryption
    ∗ network traffic protection
    ∗ customer data
• AWS - security of the cloud - OF - global infrastructure and hardware
  – hardware/global infrastructure - regions, AZ, Edge Locations, physical security
  – software - compute, storage, db, networking
• types of cloud computing responsibility
  – On-Premise - customer responsible for all
  – IaaS
    ∗ CSP responsible for hardware, VMs, networking
    ∗ customer - apps, data, middleware, OS
  – PaaS - customer only apps and data
  – SaaS - CSP responsible for all

Compute
• IaaS
  – EC2 Bare Metal - customer can configure all (OS, Hypervisor)
  – EC2 - customer responsible for guest OS, container runtime; AWS has the hypervisor
  – Elastic Container Service ECS
    ∗ customer - configuration, deployment, storage of containers
    ∗ AWS - OS, hypervisor, container runtime
• PaaS
  – Elastic Beanstalk (Managed Platform)
    ∗ customer - uploading code, config of env, deployment strategies
    ∗ AWS - server, OS…
• SaaS
  – Amazon WorkDocs (content collaboration) - customer only contents of files and management of them
• FaaS
  – Lambda
    ∗ customer - code upload
    ∗ AWS - deployment, container runtime and the rest

Architecture
• traditional - more responsibility - EC2, Elastic Beanstalk
• microservices - Fargate, ECS/EKS
• serverless - less responsibility - Lambda, Amplify (Serverless Framework)

EC2
• launching VM - emulation of a physical computer
• creating, copying, resizing and migrating a server
• launched VM - instance
• EC2 is a highly configurable server - choose AMI (Amazon Machine Image - predefined config)
  – vCPUs
  – RAM
  – Network bandwidth
  – OS - Win, Ubuntu, Amazon Linux 2…
• backbone of AWS - S3, RDS, DynamoDB, Lambda - all use it

Computing Services
• Lightsail - managed virtual server service - friendly EC2 - e.g. for WordPress
• containers (microservices)
  – Elastic Container Service ECS - container orchestration service - supports Docker and launches a cluster of servers on EC2 instances
    ∗ task definition file - kind of docker compose
      · image url
      · specify how much memory and vCPU (1024 CPU units per vCPU)
    ∗ service / task
  – Elastic Container Registry ECR - repository for images
  – ECS Fargate - serverless container orchestration service - AWS manages the servers, so you only pay on-demand per running container
  – Elastic Kubernetes Service EKS - fully managed Kubernetes service - K8s is the standard for managing microservices
• AWS Lambda - serverless function service - run code without managing servers
  – choose how much memory and how long it can run before timing out
  – charged per runtime rounded to nearest 100ms

Higher Performance Computing
• HPC - cluster of hundreds of thousands of servers with fast connections to boost computing capacity
• nitro system - dedicated hardware and lightweight hypervisor
  – nitro cards - specialized for VPC
  – nitro security chips
  – nitro hypervisor
• bare metal instance - M5 and R5 EC2 instances
• Bottlerocket - Linux based OS from AWS for running containers on VMs or bare metal hosts
• AWS ParallelCluster - AWS cluster management tool to deploy and manage HPC clusters

Edge Computing
• edge computing - pushing computing workloads outside of the network to run closer to the destination - on phones, IoT devices, external servers
• hybrid computing - running workloads on an on-premise datacenter and AWS Virtual Private Cloud VPC
• AWS Outposts
• AWS Wavelength - run in a telecom datacenter - 5G network
• VMware Cloud on AWS - on premise VM
• AWS Local Zones - edge datacenters

Cost and capacity management
• EC2 Spot instances, reserved instances and savings plans
• AWS Batch - batch computing workloads
• AWS Compute Optimizer - ML analyzes history and suggests how to reduce costs and improve performance
• EC2 Auto Scaling Groups
• Elastic Load Balancers ELB - different AZs, healthy/unhealthy instances
• AWS Elastic Beanstalk

Storage

Types of Storage Services
• block - virtual hard drive for a VM - Elastic Block Store EBS - only single write volume
• file - AWS Elastic File System EFS - data with metadata, multiple reads, writing locks the file - when VMs need to access the same drive
• object - S3 - data, metadata, unique ID - multiple reads and writes

S3
• manages data as objects
• unlimited storage
• S3 Object
  – key - name of the object
  – value - data itself, as a sequence of bytes
  – version ID
  – metadata
  – between 0 bytes and 5 TB
• S3 Bucket
  – holds objects, can have folders
  – S3 is a universal namespace, bucket names have to be globally unique

S3 storage classes - from most expensive
• S3 Standard (default) - 99.99% availability, 11 nines of durability, replicated across at least 3 AZs
• S3 Intelligent Tiering - ML to analyze object usage and automatically switch to the proper tier
• S3 Standard-IA - Infrequent Access - cheaper if you access less than once a month, fee per retrieval, 50% less than Standard
• S3 One-Zone-IA - in one AZ, 99.5% availability, retrieval fee, 20% less
• S3 Glacier - long term cold storage, retrieval can take minutes to hours, but it's very cheap
• S3 Glacier Deep Archive - retrieval 12 hours

AWS Snow Family
• storage and compute devices to physically move data in or out of the cloud
• when moving over the internet is too slow/difficult/expensive
• Snowcone - 8 TB / 14 TB
• Snowball Edge - storage optimized 80 TB and compute optimized 39.5 TB
• Snowmobile - 100 PB
• data delivered to Amazon S3

Storage Services
• Simple Storage Service S3 - serverless object storage service
• S3 Glacier - cold storage on older HDD
• Elastic Block Store EBS - persistent block storage service
  – SSD, IOPS SSD, Throughput HDD, Cold HDD
  – volumes
• Elastic File System EFS - cloud-native NFS file system service - sharing files between multiple servers
• Storage Gateway - hybrid cloud storage
  – File Gateway - extends local storage to S3
  – Volume Gateway - caches local drives to S3 for continuous backup
  – Tape Gateway - stores files on virtual tapes for backup (cost effective)
• Snow Family
• AWS Backup - managed backup service that centralizes backup across many services eg. EC2, EBS, RDS, DynamoDB, Storage Gateway
• CloudEndure Disaster Recovery - continuous replication of machines into a low-cost staging area
• Amazon FSx - feature rich and performant file system
  – FSx for Windows File Server
  – FSx for Lustre - Linux

DB

Databases
• relational row-oriented data store
• data-store for semi-structured and structured data
• complex data store because it requires formal design and modelling techniques
• relational - tabular data - row-oriented or column-oriented
• non-relational - semi-structured
• specialized language to query and modelling strategies to optimize retrieval

Data Warehouse
• relational db for analytic workloads - column-oriented data-store
• in TBs
• mainly performs aggregation - grouping column data
• designed to be HOT - very fast even though millions of rows
• infrequently accessed

Key value store
• NoSQL with a simple key-value method to store data
• dumb and fast
• no relationships, indexes, aggregation
• unique keys
• a simple key/value store will interpret data (bits) resembling a dictionary, associative array or hash
• will look like tabular data but it doesn't have consistent columns per row (schemaless)
• can scale well beyond relational dbs

Document store
• NoSQL db that stores documents as the primary data structure
• can be XML but more commonly JSON or JSON-like
• sub-class of key/value stores
• documents with fields
• embedding and linking instead of joins
• have indexes

NoSQL Services
• DynamoDB - serverless NoSQL key/value and document db
  – designed to scale to billions of records
  – no shards
  – flagship AWS db
  – for a massively scalable database
• DocumentDB - NoSQL document database
  – MongoDB compatible
  – very popular amongst devs
  – when you want MongoDB
• Amazon Keyspaces - fully managed Apache Cassandra db (open source NoSQL key/value db)
  – similar to DynamoDB
  – some added functionality
  – when you want Apache Cassandra

Relational DB Services
• Relational Database Service RDS - multiple SQL engines
  – Online Transactional Processing OLTP
  – SQL most commonly used
  – engines
    ∗ MySQL
    ∗ MariaDB - fork of MySQL
    ∗ Postgres PSQL - most popular for devs, added complexity
    ∗ Oracle - enterprise companies, license to use it
    ∗ Microsoft SQL Server - license to use it
    ∗ Aurora - fully managed AWS db
• Aurora
  – managed db of MySQL (5x faster) and PSQL (3x faster)
  – when you want a highly available, durable, scalable and secure db for MySQL or PSQL
• Aurora Serverless
  – serverless on-demand Aurora
  – when you want most benefits of Aurora, but can accept cold starts or don't have big traffic demand
  – cheaper
• RDS on VMware - deploy an RDS supported engine on premise
  – datacenter must be using VMware for virtualization
  – when you want an AWS managed db on premise

Other DB services
• Redshift - petabyte-size data warehouse
  – for Online Analytical Processing OLAP
  – expensive because data is hot
  – very fast, very complex queries
  – when you want to quickly generate analytics or reports from a large amount of data
• ElastiCache - managed db of the in-memory and caching open-source dbs Redis or Memcached
  – when you want to improve app performance by adding a caching layer in front of a web server or db
• Neptune - managed graph db
  – data as interconnected nodes
  – when you need to understand connections between data eg. social media relationships
• Amazon Timestream - fully managed time series db
  – devices that send data over time - IoT devices
  – when you need to measure how things change over time
• Amazon Quantum Ledger Database - ledger db
  – transparent, immutable, cryptographically verifiable transaction logs
  – when you need to record a history of financial activities that can be trusted
• Database Migration Service DMS
  – from on premise to AWS
  – between two dbs in different SQL engines
  – from SQL db to NoSQL db

Networking

Cloud Native Networking Services
• VPC - Virtual Private Cloud - logically isolated section of the AWS cloud, where you can launch resources
• Internet Gateway
• Route Tables - where traffic from subnets is directed
• AZ with resources
• subnets - logical partition of an IP network into multiple smaller network segments
• NACLs - firewalls at the subnet level
• Security Groups - firewall at the instance level

Enterprise/Hybrid Networking Services
• AWS Virtual Private Network VPN - secure connection between on premise, remote offices …
• Direct Connect - dedicated gigabit connection from on premise data center to AWS
• PrivateLink - VPC interface endpoints - keeps traffic within the AWS network so it does not traverse the internet, in order to keep traffic secure

VPC and subnets
• in a VPC you choose a range of IPs using a CIDR range eg. 10.0.0.0/16 = 65 536 IP addresses
• subnets break up the IP range of the VPC into smaller networks eg. 10.0.0.0/24 = 256 addresses
• public subnet - can reach the internet
• private subnet - cannot, but this has to be ensured by the user

Security groups vs NACLs
• Network Access Control Lists NACLs
  – virtual firewall at the subnet level
  – Allow and Deny rules
  – eg. block traffic from a specific IP address known for abuse
• Security Groups
  – virtual firewall at the instance level
  – implicitly denies traffic
  – only Allow rules
  – eg. allow an EC2 instance access on port 22 for SSH
  – you cannot block a single IP address

AWS CloudFront
• global CDN
• distributions
  – origin eg. bucket
  – protocols
  – caching rules
  – geographic restrictions - allow and block lists
  – invalidations

EC2
• highly configurable virtual server/VM
• resizable compute capacity
• choices
  – OS via Amazon Machine Image AMI
  – instance type eg. t2.nano
  – add storage - EBS, EFS and the type of it
• configure instance
  – security groups
  – key pairs
  – user data
  – IAM roles
  – placement groups

Instance Families
• combinations of CPU, memory, storage and networking capacity
• different hardware for different app requirements
• general purpose - e.g. T2, Mac
  – well balanced
  – for web servers, code repositories
• compute optimized eg. C5, C4
  – high performance processor
  – for scientific modelling, gaming servers and ad server engines
• memory optimized eg. R4, R5
  – for workloads with large data sets in memory
  – for memory caches, in-memory dbs, real time big data analytics
• accelerated computing eg. P2, P3
  – hardware accelerators, co-processors
  – for ML, computational finance, speech recognition
• storage optimized eg. I3, D2
  – high sequential read and write access to very large data on local storage
  – for NoSQL, data warehousing

Instance Types
• Instance Type is size + family
• common pattern: nano, micro, small, medium, large, xlarge, 2xlarge, 4xlarge…
• c6g.metal - bare metal machine
• not always powers of 2
• sizes generally double in price

Dedicated Instance vs Dedicated Host
• dedicated hosts are single-tenant EC2 instances designed for Bring-Your-Own-License (BYOL) based on machine characteristics
• host has physical server isolation
• host has per host billing (vs per instance)
• host has visibility of physical characteristics of the server (sockets, cores, host ID)
• host has additional control over instance placement on the physical server
• host can add capacity using an allocation request

Three Levels of Tenancy
• Dedicated Host - whole server for the customer, always the same machine, and control of physical attributes
• Dedicated Instance - part of the server resources for the customer, still always the same machine
• Default - instance lives in the same place until reboot

Additional
• auto assign public IP - to reach the internet
• user data - can set commands to run
• key pair - create and download as a .pem file
  – this file is used for ssh -i
  – can upload it to the CLI in the Console (CloudShell) and run ssh -i from there
  – ec2-user
• session manager as the preferred way of connecting
• Elastic IP
  – normally on every reboot the instance will have a different IP
  – allocate an IP address from the AWS pool of addresses
  – associate this IP address with an instance
• Actions -> Images and Templates ->
  – create image - the AMI
  – create template - AMI + all settings
• Auto Scaling Group ASG
  – from template
  – can be in different AZs, subnets
  – load balancer
  – min, desired, max number of instances
  – health checks
  – scaling policies
• Application Load Balancer ALB
  – can attach SSL cert
  – application LB, network LB, gateway LB
  – security group
  – AZs and subnets
  – target groups - can associate ASG with ALB
  – you get a DNS name

EC2 Pricing Models
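As a rough orientation for the pricing models in this section, the sketch below compares what one instance might cost under different models. It is an estimate under stated assumptions, not an AWS pricing formula: the hourly rate is hypothetical, the discounts are the "up to" figures quoted in these notes (75% Reserved, 90% Spot), and applying per-second billing with a 60-second minimum to every model is a simplification.

```python
import math

# Upper-bound discounts quoted in these notes; real discounts vary by
# instance type, region, term and payment option.
MAX_DISCOUNT = {"on_demand": 0.0, "reserved": 0.75, "spot": 0.90}


def billable_seconds(run_seconds: float) -> int:
    """Per-second billing with a 60-second minimum."""
    return max(60, math.ceil(run_seconds))


def estimate_cost(hourly_rate: float, run_seconds: float,
                  model: str = "on_demand") -> float:
    """Estimate the cost of one run given a (hypothetical) On-Demand hourly rate."""
    discounted_rate = hourly_rate * (1.0 - MAX_DISCOUNT[model])
    return discounted_rate * billable_seconds(run_seconds) / 3600


if __name__ == "__main__":
    for model in MAX_DISCOUNT:
        # One hour at an assumed rate of 0.10 per hour.
        print(model, round(estimate_cost(0.10, 3600, model), 4))
```

A run of 10 seconds is still billed as 60 seconds under this rule, which is why very short jobs are often a better fit for Lambda than for EC2.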
On-Demand • • • • • • • • • • least commitment low-cost, flexible short term, spiky, unpredictable workloads cannot be interrupted for first time apps (new apps) pay as you go model PAYG default model no upfront payment and no commitment minimum 60s per second (more common) or per hour (on pricing shows always per hour) Spot • biggest savings 18 • • • • • • • • up to 90% savings request spare computing capacity (idle servers) can handle interruptions - server randomly starts and stops for non critical background jobs AWS can terminate at any time and you won’t be charged rarely really get terminated when you terminate you will be charged for any hour that it ran AWS Batch is easy and convenient way to use Spot Pricing Reserved (RI) • • • • • • best long term savings up to 75% off steady state or predictable usage or require reserved capacity commit to EC2 for 1-3 years can resell unused server instances reduced pricing based on Term x Class Offering x RI Attributes x Payment Option – term - 1-3 year - don’t renew automatically, they switch to OnDemand – class - the less flexible the greater savings ∗ standard - up to 75% off ∗ convertible - up to 54% - can exchange RI based on RI Attributes (if greater or equal in value) – payment option ∗ all upfront ∗ partial upfront ∗ no upfront – 4 RI Attributes ∗ Instance Type - Family + Size ∗ Region ∗ Tenancy - shared is default ∗ platform - eg. 
Linux • can be shared between multiple accounts within an aws organization • can be sold on Reserved Instance Marketplace • when purchasing you determine the scope of RI – doesn’t affect the price – Regional RI ∗ doesn’t reserve capacity ∗ access to multiple AZs ∗ RI discount applies to instance Family regardless of size ∗ you can queue purchases – Zonal RI ∗ reserves capacity ∗ one AZ - no AZ flexibility ∗ no instance size flexibility 19 • RI Limits – per month – 20 Regional RI per Region - you cant exceed the limit of running on demand instances – 20 Zonal RI per AZ - you can exceed • Capacity Reservations – finite amount of servers in AZ – request a reservation of EC2 Instance Type – charged whether the instance is running in it or not – can benefit from RI • standard vs convertible RI – standard ∗ RI attributes can be modified · change AZ in Region · change scope Regional/Zonal ∗ cant be exchanged ∗ can be bought or sold on RI Marketplace – convertible ∗ RI attr can’t be modified ∗ can be exchanged for another convertible RI ∗ can’t be on RI Marketplace • RI Marketplace – sell unused standard RI – active for at least 30 days and after upfront payment – US bank account – min one month remaining – seller sets the upfront price – usage price and configuration remains the same as it was purchased – term length is rounded down to nearest month – max 20 000$ in sales per year – RI from GovCloud cant be sold here Dedicated • • • • • • most expensive dedicated server can be on-demand, reserved or spot guarantee of isolate hardware (enterprise requirements) designed to meet regulatory requirements strict-bound licensing which won’t support multi-tenancy or cloud deployments – multi-tenant - virtual isolation – single tenant - physical isolation • for – on-demand – RI - 60% off 20 – Spot - 90% off • choose tenancy when you launch EC2 • security concerns or regulatory obligations for enterprises Savings Plans • similar to RI but simplifies purchasing process • 3 types – 
Compute Savings Plans - most flexibility ∗ cost reduction up to 66% ∗ automatically apply to EC2, Fargate, Lambda ∗ flexibility between families – EC2 Instance Savings Plans ∗ lowest prices - to 72% ∗ commitment to usage of individual instance families in a region ∗ flexibility to change usage between instances in a family – SageMaker Savings Plans - for SageMaker (ML) to 64% • term - 1-3 year • choose hourly commitment • choose payment option (as above) Identity Zero Trust • model – trust no one, verify everything – identity is the primary security perimeter ∗ first line of defense ∗ defines security controls – old way - network centric ∗ firewalls, VPNs ∗ network is the boundary – new way - identity centric ∗ remote workstations ∗ MFA ∗ provisional access ∗ it augments not replaces network centric • on AWS IAM • AWS is considered to not have a true zero trust offering, third party services need to be used - aws doesn’t have ready to use intelligent identity controls • aws services can be set up in intelligent way in direction of identity concerns but requires expert knowledge – AWS CloudTrail - tracks all API calls 21 – Amazon GuardDuty - detects suspicious or malicious activity based on CloudTrail and other logs – Amazon Detective - analyze, investigate and identify security issues (can ingest findings from Guard Duty) • third parties – Azure Active Directory - real time and calculated risk detection based on more data points than aws eg. Device and Application, robust logic restriction – Google BeyondCorp – JumpCloud – you can connect these solutions to AWS Single Sign On (SSO) Directory Service • maps the names of network resources to their network addresses • shared information infrastructure for locating, managing, administering and organizing resources (objects) eg. 
volumes, folders, users, devices … • critical component of a network operating system • directory server (name server) • well known ones – Domain Name Service DNS - for the internet – Microsoft Active Directory – Apache Directory Server – Oracle Internet Directory – Cloud Identity • Active Directory – single identity per user – forest of domains Identity Providers (IdPs) • system entity that creates, maintains and manages identity information • provides authentication for applications within a federation or distributed network • eg. FB, Amazon, Google, Github, LinkedIn • federated identity - method of linking users identity across multiple separate identity management systems • OpenId - open standard and decentralized authentication protocol - about providing who are you • OAuth2.0 - standard protocol for authorization – uses authorization tokens to prove identity (instead of user + password) – about granting access to functionality • SAML - Security Assertion Markup Language – open standard for exchanging authentication and authorization between identity provider and service provider 22 – important use cas is Single-Sign-On via web browser • Single Sign On SSO – authentication scheme that allows user to log in with single ID and password to different systems and software – allows IT departments to only manage single identity – eg. 
Azure Active Directory -> SAML -> SSO -> Slack, aws, stash… – login for SSO is seamless - one login screen LDAP • Lightweight Directory Access Protocol • vendor neutral, standard application protocol for accessing and maintaining distributed directory information services over IP (Internet Protocol) network • common use is to provide a central storage for usernames and passwords • enables same-sign on - single ID and password every time you login • most SSO systems use LDAP • LDAP was not designed to work natively with web apps • some systems only work with LDAP not SSO MFA • protection against stolen passwords • on aws is strongly recommended Security Keys • • • • • secondary device as a second step in authentication looks like a memory stick generates and autofill a security token eg. Yubikey FIDO2, WebAuthn, U2F AWS IAM • IAM policies – JSON documents with permissions for specific user, group, role – attached to IAM identities • IAM permission - API actions restrictions in IAM Policy document • IAM Identities – IAM Users - end users – IAM Groups - group of users with the same permission level – IAM Roles - grant aws resources permissions to specific aws api actions • permission boundaries • Service Control Policies - organization wide policies 23 • IAM Policy Conditions – restrict IP address – restrict on region – restrict if MFA is off – restrict based on time of day • IAM Policy – policy language version - 2012-10-17 – statement container - can have multiple statements – Sid - optional - labels – Effect - Allow/Deny – Action - list of actions that the policy allows or denies – Principal - account/user/role/federated user – Resource – Condition - optional - circumstances Principle of Least Privilege PoLP • computer security concept of providing a user with the least amount of permissions to perform operation • Just-Enough-Access JEA - smallest permissions • Just-In-Time JIT - smallest duration • risk based adaptive policies - each attempt to access generates a 
risk score (device, location, services, MFA, IP) - aws doesn’t have it built into IAM • ConsoleMe - open source Netflix project to self-serve short-lived IAM policies for JEA and JIT AWS Account Root User • account - holds all AWS resources • root user - special account with full access which can’t be deleted – email and password to login – permissions can’t be limited (only with AWS Organizations service control policy SCP) – never use root user access keys • user - account for common tasks - logs in with account id or alias, username and password • administrative tasks for root user – change account settings - name, email, root user password and access keys – restore IAM user permissions – activate IAM access to the billing and cost management console – view certain tax invoices – close aws account – change or cancel aws support plan – register as a seller on RI Marketplace – enable mfa delete on S3 Bucket – sign up for GovCloud – create an organization AWS SSO • create or connect workforce identities in aws once and manage access centrally across aws organization • choose Identity source – AWS SSO – Active Directory – SAML 2.0 IdP • manage user permissions centrally – aws account and apps – SAML apps • users get single click access Application Integration • process of letting two independent apps communicate and work together, commonly facilitated by an intermediate system • cloud workloads encourage systems and services to be loosely coupled • aws has many services for common application integrations – queueing – streaming – Pub/Sub – API gateways – state machines – event bus Queuing and SQS • messaging system - asynchronous communication that decouples processes via messages / events - from sender to receiver (producer and consumer) • queuing system - messaging system that generally deletes messages once they are consumed • simple communication, not real time, consumers have to poll, not reactive • Simple Queue Service SQS - fully managed queuing service to decouple and
scale microservices, distributed systems and serverless apps • use case - queue up transactional emails to be sent eg. registration Streaming • multiple consumers can react to events (messages) • can react in real time to events live in the stream • Amazon Kinesis - fully managed solution for collecting, processing and analyzing streaming data in the cloud • for real time data Pub/Sub • publish-subscribe pattern - common in messaging systems • publishers don’t send messages directly to receivers • they send them to event bus, which categorizes into groups • subscribers subscribe to the groups • publisher has no knowledge of subscribers • subscribers don’t pull messages - they are automatically pushed to them • use - real-time chat system, web-hook system • Simple Notification Service SNS - highly available, durable, secure, fully managed pub/sub messaging service – used to decouple microservices, distributed systems and serverless apps – eg. Service -> SNS -> Lambda – more managed than Kinesis API Gateway • program that sits between clients and multiple backends and provides a single entry point • allows for throttling, logging, routing logic and formatting req and res • Amazon API Gateway - solution for creating secure APIs in cloud at any scale State Machines • abstract model which decides how state is changed based on a series of conditions (flow chart) • AWS Step Functions – coordinate multiple AWS services into a serverless workflow – graphical console – automatically triggers and tracks each step – logs each step for debugging Event Bus • receives events from a source and routes events to a target based on rules • EventBridge – serverless event bus for application integration by streaming real-time data to apps – formerly Amazon CloudWatch Events – buses ∗ default event bus ∗ custom event bus - scoped to multiple accounts ∗ SaaS event bus - scoped to third-party SaaS providers – producers - aws services which emit events – events - JSON objects which stream within the bus – partner
sources - third party app that can emit events – rules - which events to capture and pass (max 100 rules per bus) – targets - aws services that consume events (5 targets per rule) ex. Lambda All Services • Simple Notification Service SNS - pub/sub messaging system (email, webhooks, SMS) • Simple Queue Service SQS - queueing messaging system (background jobs) • Step Functions - state machine service • EventBridge - serverless event bus • Kinesis - real time streaming data service • Amazon MQ - managed message broker service on Apache ActiveMQ • Managed Streaming for Apache Kafka MSK - fully managed Apache Kafka service (Kafka is open source platform for real time streaming data pipelines and apps) - similar to Kinesis but more robust • API Gateway • AppSync - fully managed GraphQL service Containers • EC2 - host operating system, hypervisor (manages VMs) - on it VMs with a lot of unnecessary stuff - always some unused space • Containers on EC2 - host operating system, docker daemon - on it containers - flexible containers - space is not wasted it is available for more containers • VMs don’t make best use of space and apps are not isolated - config conflicts, security and resource hogging • containers - virtually isolated, configure OS dependencies per container Microservices • monolithic architecture - one app on one server responsible for everything - functionality is tightly coupled • microservices architecture - multiple apps each responsible for one thing - functionality is isolated and stateless Kubernetes K8s • open source container orchestration system for automated deployment, scaling and management of containers • created by Google and maintained by Cloud Native Computing Foundation CNCF • advantage of K8s over docker is the ability to run multiple containers across multiple VMs • Pods - group of containers with shared storage, network resources and settings • Nodes with Pods with Containers • designed to be used with tens to hundreds of different services Docker • set of PaaS
products that use OS-level virtualization to deliver software in packages (containers) • earliest popular open-source container platform • Docker CLI • Dockerfile • Docker Compose • Docker Swarm - orchestration tool for managing deployed multi-container architectures • Dockerhub • Open Container Initiative OCI - established by Docker and maintained by the Linux Foundation • is losing favor due to introduction of paid open-source model Podman • container engine • OCI compliant, drop-in replacement for Docker • pods like in K8s • daemon-less • meant to be used with Buildah (tool to build OCI Images) and Skopeo (tool for moving images between different types of container storage) Container Services • primary – Elastic Container Service ECS - no cold start, self-managed EC2 (you pay even when nothing is running) – AWS Fargate - more robust than lambda, scale to zero cost, aws-managed EC2, cold start – Elastic Kubernetes Service EKS - open source, avoid vendor lock-in – AWS Lambda - only about code, short running tasks, can deploy custom containers • provisioning and deployment – Elastic Beanstalk EB - easier ECS, PaaS – App Runner - PaaS specifically for containers – AWS Copilot CLI - build, release and operate containerized app on App Runner, ECS, Fargate • supporting – Elastic Container Registry ECR - repos for images – X-Ray - analyze and debug microservices – Step Functions - stitch together lambda and ECS tasks Governance Organizations and accounts • aws Organizations - create accounts, manage billing, control access, compliance, security • root account user - single sign-in identity with complete access - Master/Root account • Organizational Units (OU) - group of accounts, which can contain OU hierarchy • every account will have root user account and can have more user accounts • service control policies - central control over permissions • can be turned on, after enabling can’t be turned off • aws account is not the same as a user account AWS Control Tower • quickly set up secure aws
multi-account environment • provides baseline environment (Landing Zone) to build multi-account architecture – aws SSO enabled – centralized logging for CloudTrail – cross account security auditing • Account Factory - automates provisioning new accounts in organization (account and network configs, region selection, AWS Service Catalog - self service with restrictions) • Guardrails - pre-packaged governance rules for security, operations and compliance - enterprise wide rules • replaced aws Landing Zones AWS Config • change management - formal process to monitor, enforce and remediate changes in cloud infrastructure • Compliance-as-code CaC - programming to automate change management • aws Config is CaC framework in aws account on per region basis – use when you want a resource to stay configured in a specific way for compliance – keep track of config changes to resources – list of all resources in a region – analyze potential security weaknesses from detailed historical information AWS Quick Starts • prebuilt templates by aws and aws partners • 3 parts – a reference architecture for deployment – aws CloudFormation templates that automate and configure deployment – deployment guide (docs) Tagging • key and value pair assigned to aws resources • eg.
dept, status, team, environment, project, location • organize – resource management - environments – cost management and optimization - cost tracking, budgets, alerts – operations management - mission critical services – security - can be used with IAM policies – governance and regulatory compliance – automation – workload optimization • EC2 basic tag - Name • Resource Groups – collection of resources that share a tag – metrics, alarms, configs – in Global Console Header and under Systems Manager – work with IAM policies Business Centric Services • Amazon Connect - virtual call center service • WorkSpaces - virtual remote desktop service • WorkDocs - shared collaboration service • Chime - video-conferences • WorkMail - managed business email, contacts and calendar service • Pinpoint - marketing campaign management service - sending targeted emails, SMS, push notifications • Simple Email Service SES - transactional email service - integrated into an app to send emails • QuickSight - business intelligence BI service - visualize data Provisioning • allocation or creation of resources to a customer • AWS provisioning services set up and manage aws services • Elastic Beanstalk EB - PaaS for deploying web apps – EC2, S3, SNS, CloudWatch, EC2 ASG, ELB – on CloudFormation templates – not recommended for production in enterprise – RDS, security – can run dockerized environments • AWS OpsWorks - config management service • CloudFormation - infrastructure modelling and provisioning service - JSON or YAML files - IaC • QuickStarts • aws Marketplace - digital catalogue • AWS Amplify - mobile and web app framework which provisions multiple aws services for serverless backend • AWS App Runner - PaaS for containers • AWS Copilot - CLI to manage containers • AWS CodeStar - UI for managing software dev activities in common stacks eg.
LAMP • AWS Cloud Development Kit CDK - IaC tool for generating CloudFormation Serverless • when underlying servers, infrastructure and OS are managed by CSP • generally highly available, scalable and cost-effective • DynamoDB • S3 • ECS Fargate - with ECS you have to keep server running, with Fargate pay on demand per running container • AWS Lambda • Step Functions - Lambdas and Fargate Tasks • Aurora Serverless - most benefits of aurora but cold-starts • serverless architecture describes fully managed cloud services • degree of serverless – elastic and scalable – highly available, durable – secure by default – billed based on the execution of the task – can Scale-to-Zero - when you don’t use the resource it costs nothing – Pay-for-Value - don’t pay for idle servers Windows on AWS • Windows servers on EC2 • SQL Server on RDS • aws Directory Service - run Microsoft Active Directory AD as managed service • aws License Manager – Bring-Your-Own-License BYOL - use own license to run vendor software on cloud vendor’s computing service – License Mobility from Microsoft for server applications covered by Microsoft Software Assurance SA – software licensed based on vCPUs, physical cores, sockets, number of machines – EC2 (Dedicated Instances, Dedicated Hosts and spot instances), RDS (Oracle DB) – for microsoft windows server and microsoft SQL server license you generally need a Dedicated host • Amazon FSx for Windows File Server - fully managed scalable storage for Windows • aws SDK for .NET • Amazon Workspaces - launch Windows desktop • Lambdas support PowerShell • AWS Migration Acceleration Program MAP - migration methodology for large enterprise Logging • CloudTrail - logs all API Calls (SDK, CLI) between aws services - attribution (who to blame) – detect dev misconfiguration – detect malicious actors – automate responses – governance, compliance, operational auditing, risk auditing – where, when, who, what – logs by default and collects for last 90 days via Event History – for more days
create a Trail – Trail outputs to S3 and to analyze it use Amazon Athena • CloudWatch - collection of services – Logs - store cloud services and app logs ∗ log streams ∗ log events in a log file ∗ Logs Insights - interactively search and analyze log data · more robust than simple Filter events in Log stream · easier than export to S3 and analyze with Athena · Query Syntax · single request can query max 20 log groups and will time out after 15 min – Metrics ∗ time-ordered set of data points ∗ EC2 PerInstance metrics eg. CPUUtilization, NetworkIn, DiskReadBytes … – Events (EventBridge) - event based on condition eg. snapshot of server every hour – Alarms - notifications based on metrics ∗ alarm breaches (outside defined threshold) and changes state OK, ALARM, INSUFFICIENT_DATA ∗ when state changes action can be triggered - notification, ASG, EC2 Action ∗ period - interval of checking – Dashboard - visualizations • AWS X-Ray - distributed tracing system – pinpoint issues with microservices – how data moves between apps, how long and if fails Machine Learning • AI - ML - DL • Amazon SageMaker - fully managed service to build, train and deploy ML models at scale – Apache MXNet - open source DL framework – TensorFlow - ML library – PyTorch - ML framework • Amazon SageMaker Ground Truth - data-labelling service - self-service humans label • Amazon Augmented AI - human-intervention review service - mark for human review • Amazon CodeGuru - ML code analysis service • Amazon Lex - conversational interface service - voice and text chatbots • Amazon Personalize - real time recommendations • Amazon Polly - text-to-speech service • Amazon Rekognition - image and video recognition • Amazon Transcribe - speech-to-text • Amazon Textract - OCR (extract text from scanned documents) service • Amazon Translate - neural ML translation service • Amazon Comprehend - natural language processing NLP service • Amazon Forecast - time-series forecasting service • AWS DL AMIs • AWS DL Containers •
AWS DeepComposer • AWS DeepLens • AWS DeepRacer • Amazon Elastic Inference - attach GPU acceleration to EC2 to reduce cost up to 75% • Amazon Fraud Detector - fully managed fraud detection service • Amazon Kendra - enterprise ML search engine service Big Data and Analytics • big data - massive volumes of structured/unstructured data - difficult to move and process using traditional db and software techniques • Amazon Athena - serverless interactive query service - takes CSV or JSON files in an S3 Bucket, loads them into temporary SQL tables to run SQL queries - when you want to query CSV or JSON files • Amazon CloudSearch - fully managed full-text search service - when you want to add search to your app • Amazon Elasticsearch Service ES - managed Elasticsearch cluster – elasticsearch is open-source full-text search engine – more robust than CloudSearch but requires more resources • Amazon Elastic MapReduce EMR - data processing and analytics - like Redshift but more suited for unstructured data • Kinesis Data Streams - real-time streaming data service • Kinesis Firehose - serverless and simpler than Data Streams, pay on demand • Amazon Kinesis Data Analytics • Amazon Kinesis Video Streams - analyze or process real time streaming video • Amazon Managed Streaming for Apache Kafka MSK - fully managed Apache Kafka service • Redshift - when you need to quickly generate analytics or reports from a large amount of data • Amazon QuickSight - uses SPICE (Super-fast, Parallel, In-memory Calculation Engine), uses ML for responding to prompts • AWS Data Pipeline - movement of data • AWS Glue - extract, transform and load ETL service - transform and move the data - like DMS but more robust • AWS Lake Formation - centralized, curated and secured repository that stores all data - data lake is a storage repository that holds vast amount of raw data in its native format • AWS Data Exchange - catalog of third party data AWS Well-Architected Framework • whitepaper by aws for building using best
practices • foundation - general definitions, general design principles and the review process • 5 pillars with their own detailed whitepapers – Operational Excellence - run and monitor systems – Security - protect data and systems, mitigate risk – Reliability - mitigate and recover from disruptions – Performance Efficiency - use computing resources effectively – Cost Optimization • trade off pillars based on business context • pillar – design principles – definition - overview of best practice categories – best practices - for each aws service – resources - docs, whitepapers • general definitions – Component - code, configuration, aws resource against a requirement – Workload - set of components that work together to deliver business value – Milestones - key changes in architecture through product life cycle – Architecture - how components work together in a workload – Technology Portfolio - collection of workloads required for business to operate Team architecture • enterprises generally have centralized teams with specific roles (TOGAF, Zachman) • aws uses distributed teams with flexible roles – Practices eg. Team Experts (Raise the Bar) – Mechanisms eg.
Automated Checks for Standards – Amazon Leadership Principles - 16 ∗ set of principles for decision making, problem solving, simple brainstorming and hiring ∗ customer obsession ∗ ownership - responsibility ∗ invent and simplify ∗ learn and be curious ∗ hire and develop the best ∗ insist on the highest standards ∗ think big ∗ bias for action ∗ frugality ∗ earn trust ∗ dive deep ∗ have backbone - disagree and commit ∗ deliver results ∗ success and scale bring broad responsibility General Design Principles • stop guessing your capacity needs • test systems at production scale - clone prod env to testing and turn it off after • automate to make architectural experimentation easier - CloudFormation with ChangeSets • allow for evolutionary architectures - CI/CD, Lambdas deprecate runtimes forcing you to evolve • drive architecture using data - CloudWatch • improve through game days - simulate traffic on prod or kill EC2 instances to test recovery Operational Excellence - design principles • perform operations as code - the same discipline in code and cloud infrastructure - limit human error and enable consistent responses to events - IaC • make frequent small reversible changes - rollbacks, incremental changes, blue/green, CI/CD • refine operations procedures frequently - game days • anticipate failure - post mortems on system failures, test code, test recovery • learn from all operational failures - share knowledge across entire organization Security - design principles • implement a strong identity foundation - PoLP, centralized identity, avoid long-lived credentials • enable traceability - monitor, log, alert, audit • apply security at all layers • automate security best practices • protect data in transit and at rest • keep people away from data • prepare for security events - detect, investigate and recover Reliability - design principles • automatically recover from failure - monitor Key Performance Indicators KPIs and trigger automation on threshold • test recovery
procedures • scale horizontally to increase aggregate system availability - one point of failure won’t be critical • stop guessing capacity • manage change through automation - IaC Performance Efficiency - design principles • democratize advanced technologies • go global in minutes • use serverless architectures • experiment more often - right-sizing, virtual and automatable resources enable quick comparative testing • consider mechanical sympathy - use technology that aligns best with your workload Cost Optimization - design principles • implement cloud financial management • adopt a consumption model • measure overall efficiency • stop paying for undifferentiated heavy lifting - managing physical servers • analyze and attribute expenditure AWS Well-Architected Tool • an auditing tool to assess workloads for alignment with the framework • essentially a checklist • apply lenses AWS Architecture Center • web-portal with best practices • reference architecture with CloudFormation Total Cost of Ownership TCO • financial estimate of costs of a product or service • useful when migrating from on-premise to cloud • big migration investments • on-premise – software-license fees – implementation – config – training – physical security – hardware – IT personnel – maintenance • cloud - aws – subscription fees – implementation – config – training • maybe 50-75% savings with cloud CAPEX vs OPEX • capital expenditure CAPEX – spending money upfront on physical infrastructure – deducting expenses from tax bill over time – servers, storage, network (routers, cables, switches) costs – backups and recovery – data center costs - rent, cooling, physical security – personnel – you have to guess upfront what you plan to spend • operational expenditure OPEX – concerned with non-physical costs – leasing software and customizing features – training employees in cloud services – paying for cloud support – billing based on cloud metrics - compute and store usage – try without investing in
equipment • shifting IT teams - not redundant – needed during migration phase – transition roles into new cloud roles – hybrid approach – change employees’ activities from managing infrastructure to revenue generating activities - retraining AWS Pricing Calculator • free cost estimate tool in web browser • used to be TCO calculator • export final estimate to CSV AWS Migration Evaluator • formerly known as TSO Logic • estimate tool to compare on-premise cost with cloud cost including migration cost • Agentless Collector to collect on-premise infrastructure data - extract on-premise cost EC2 VM Import/Export • import VM into EC2 • for VMWare, Citrix, Microsoft Hyper-V, Windows and Linux VHD from Azure • prepare virtual image -> upload to S3 -> AWS CLI import to generate AMI Database Migration Service DMS • source db -> source endpoint -> replication instance with replication task -> target endpoint -> target db • very flexible for very different dbs • AWS Schema Conversion Tool - auto convert source db schema to target db schema AWS Cloud Adoption Framework CAF • whitepaper for migration • six focus areas – business - managers, budget owners - how to update staff skills and organizational resources to optimize business value – people - HR - optimize and maintain workforce and ensure competencies – governance - CIO, Project Managers - ensure business governance in the cloud, manage cloud investments to evaluate business outcomes – platform - CTO, IT managers - deliver and optimize cloud solutions and services – security - CISO, IT Security - architecture in the cloud is compliant with security requirements – operations - system health and reliability during migration and then to operate agile ongoing cloud computing best practices Billing Free Services • IAM • VPC • Auto Scaling • CloudFormation • Elastic Beanstalk • Opsworks, Amplify, AppSync, CodeStar • Organizations and Consolidated Billing • AWS Cost Explorer, SageMaker • however the ones that
provision resources can cost money - like CloudFormation AWS Support Plans (important) • Basic – email support only – for billing and account, service limit increase – 7 trusted advisor checks – 0 USD / month • Developer – tech support via email - 24h reply – no third party support (Azure, Ruby on Rails) – general guidance < 24h – system impaired < 12h – 7 trusted advisor checks – >29 USD / month or 3% of monthly aws usage - whichever is greater • Business – tech support via email - 24h reply – third party support – tech support via chat and phone anytime 24/7 – general guidance < 24h – system impaired < 12h – prod system impaired < 4h – prod system down < 1h – all trusted advisor checks – >100 USD / month or percent of usage in different brackets (incremental) • Enterprise – tech support via email - 24h reply – third party support – tech support via chat and phone anytime 24/7 – general guidance < 24h – system impaired < 12h – prod system impaired < 4h – prod system down < 1h – business critical system down < 15m – personal concierge – TAM - technical account manager - architecture, billing – all trusted advisor checks – >15 000 USD / month or percent of usage in different brackets (incremental) TAM • proactive guidance and reactive support • build solutions, advocate for customers • healthy, reduce complexity • consulting • customer obsession • find opportunities for value • consultative expertise • solve problems during migration • uplift customer capabilities by running workshops, brown bag sessions • TAMs follow Amazon Leadership Principles - especially customer obsession • only in Enterprise Support tier AWS Marketplace • curated digital catalogue of software listings from independent software vendors • run software that already runs on aws • free or with associated charges (added to aws bill - aws pays the vendor) • sales channel that lets ISVs and Consulting Partners sell their products • AMIs, CloudFormation templates, SaaS offerings, Web ACL, aws WAF rules Consolidated
Billing • organization can pay for multiple aws accounts with one bill • master/root account pays the charges for all member accounts • Cost Explorer to visualize usage (in US-East-1) • Volume Discounts – the more you use the more you save – only for aws organizations AWS Trusted Advisor • recommendation tool • monitors account and provides actionable recommendations • automated checklist of best practices • 5 categories – Cost Optimization ∗ idle load balancers ∗ unassociated Elastic IP address – Performance ∗ high utilization on EC2 instance – Security ∗ key rotation – Fault Tolerance ∗ backups – Service Limits • basic/dev - 7 checks – mfa on root – security groups - specific unrestricted ports – S3 Bucket permissions – EBS Public Snapshots – RDS Public Snapshots – IAM use - don’t use root access – service limits • business/enterprise - all Service Level Agreements SLA • formal commitment about expected level of service - if not met customer might receive compensation • SLI - Service Level Indicator - metric • SLO - Service Level Objective - target percentage over period of time – 99.95% – 99.99% – nine nines – eleven nines • in DynamoDB less than 95% -> 100% Service Credit refund • Compute SLAs • per service Service Health Dashboard • general status of aws services Personal Health Dashboard • alerts and guidance for aws events in your environment • all aws customers can use AWS Abuse • AWS Trust And Safety - team concerned with abuses • contact them if an aws owned IP address is involved in – spam – port scanning – DoS attacks – Intrusion attempts – hosting prohibited content – distributing malware • abuse@amazonaws.com - abuse form AWS Free Tier • for 12 months after sign up • free usage up to certain monthly limit forever • EC2 - t2.micro 750h/month for 1 year • RDS MySQL or Postgres - db.t2.micro 750h/month for 1 year • ELB - 750h/month for 1 year • CloudFront - 50GB transfer for 1 year • ElastiCache - cache.t3.micro 750h/month for 1 year • SES - 62 000 emails per month forever
• CodePipeline - 1 pipeline • AWS Lambda – 1 M free requests per month – 3.2 M seconds of compute time per month AWS Credits • promotional credits - equivalent to USD • earned by joining aws activate startup program, hackathons, surveys • expiry date • cannot be used in some cases eg. purchasing a domain via Route53 AWS Partner Network APN • consulting or technology partner • tiers: Select, Advanced, Premier • annual fee commitments • knowledge requirements – aws certifications – aws apn-exclusive certs • promotional credits • requirement to be a sponsor with a vendor booth at aws events AWS Budgets • alerts • cost, usage or reservation budgets • support EC2, RDS, Redshift, ElastiCache reservations • can forecast costs but limited • Budgets API • get notified via email or chatbot • first two budgets are free, cheap to use AWS Budget Reports • daily/weekly/monthly reports via email • more convenient AWS Cost and Usage Reports CUR • generates detailed spreadsheet • places in S3, analyze with Athena • QuickSight to visualize Cost Allocation Tags • optional metadata attached to aws resources • two types of tags – user defined – aws generated Billing Alarms • CloudWatch Alarms for monitoring spend • turn on Billing Alerts • much more flexible than AWS Budgets - for complex use cases Pricing API • programmatically access pricing for services • Query API - The Pricing Service API via JSON • Batch API - The Price List API via HTML • or subscribe to SNS Security • vulnerability - design flaw or implementation bug that allows attacker to cause harm to stakeholders of an application – buffer overflow – improper data validation – least privilege violation – memory leak – using a broken or risky cryptographic algorithm • encryption - encoding information using a key or a cipher to store sensitive data in an unintelligible format - plaintext -> ciphertext • symmetric encryption - the same key is used for encoding and decoding eg.
Advanced Encryption Standard AES • asymmetric encryption - two keys are used eg. Rivest-Shamir-Adleman RSA • hashing function - accepts arbitrary size value and maps it to fixed-size data structure (can reduce the size of the stored value) – one way deterministic process – used to store passwords in dbs - not in plaintext (in theory it is one-way so safe) – popular MD5, SHA256 and Bcrypt – if attacker knows the function they can enumerate a dictionary of passwords to determine the password – salting passwords - random string added to the password before hashing to mitigate the deterministic nature of the hash Defense in Depth • the 7 layers – Data - access, encryption – Application - secure and free of security vulnerabilities – Compute - access to VM - ports, on premise, cloud – Network - limits communication - segmentation and access controls – Perimeter - protection against DDoS - filter large-scale attacks – Identity and Access - primary layer of consideration for aws customer – Physical - limit access to data center CIA Triad • Confidentiality, Integrity and Availability triad - model describing the foundation of security principles and their trade offs • Confidentiality - keeping data private and protected against unauthorized access - cryptographic keys to encrypt data and keys to encrypt keys (envelope encryption) • Integrity - maintaining and assuring accuracy and completeness of data over its entire life cycle - utilizing ACID compliant dbs for valid transactions and tamper-evident / tamper-proof hardware security modules HSM • Availability - information available when needed - high availability, DDoS mitigation, decryption access Digital Signatures and Signing • digital signature - mathematical scheme for verifying authenticity of digital messages and documents • gives tamper-evidence - modification and source • three algorithms – key generation - public and private key – signing - generating digital signature with private key and inputted message – signing verification -
verify the authenticity of the message with public key
• SSH - uses public and private key to authorize access to remote machine eg. VM
  – it is common to generate RSA keys with ssh-keygen
• code signing - digital signature to ensure code has not been tampered with

In-Transit vs At-Rest Encryption
• in-transit
  – TLS - Transport Layer Security - encryption protocol for data integrity between two or more communicating computer applications - 1.0 and 1.1 deprecated, 1.2 and 1.3 current best practice
  – SSL - Secure Sockets Layer - as above - 1.0, 2.0 and 3.0 are deprecated
• at-rest - AES, RSA

Common Compliance Programs
• set of internal policies and procedures to comply with laws, regulations and to uphold business reputation
• ISO - International Organization for Standardization and International Electrotechnical Commission
  – ISO/IEC 27001 - control implementation guidance
  – ISO/IEC 27017 - focus on cloud security
  – ISO/IEC 27018 - protection of personal data in the cloud
  – ISO/IEC 27701 - Privacy Information Management System PIMS framework
• System and Organization Controls SOC - SOC 1,2,3
• Payment Card Industry Data Security Standard PCI DSS - all companies processing credit card info
• Federal Information Processing Standard FIPS 140-2 - cryptographic modules protecting sensitive info
• Health Insurance Portability and Accountability Act HIPAA - protected health information
• Cloud Security Alliance CSA STAR Certification
• Federal Risk and Authorization Management Program FedRAMP - security authorizations for cloud service offerings
• General Data Protection Regulation GDPR

Pen Testing
• authorized simulated cyber attack to evaluate security of the system
• pen testing is allowed on AWS
  – permitted - EC2, NAT Gateways, ELB, RDS, CloudFront, Aurora, API Gateways, Lambda, Lightsail, Elastic Beanstalk
  – prohibited - DNS zone walking via Route 53 hosted zones, DoS and DDoS (there is a separate policy), port, protocol and request flooding
• for other simulated events submit a request

AWS
Artifact
• self-service portal for on-demand access to AWS compliance reports
• choose a report and download pdf

Amazon Inspector
• hardening - eliminating as many security risks as possible - common for VMs security checks
• runs a security benchmark against specific EC2 instances
• network and host assessments
• install AWS agent on EC2 instance, run assessment, then review and remediate security issues
• popular benchmark is CIS with 699 checks (Center for Internet Security)

AWS Shield
• managed DDoS protection service
• when you route traffic through Route 53 or CloudFront you use AWS Shield Standard
• protects against layer 3, 4 and 7 attacks - network, transport and app
• Shield Standard - free
  – most common DDoS attacks
  – best practices to build DDoS resilient architecture
• Shield Advanced - 3000 USD / month with a 1-year commitment
  – larger and more sophisticated attacks
  – available on Route 53, CloudFront, ELB, Global Accelerator, Elastic IP
  – visibility and reporting on layers 3, 4 and 7
  – DDoS cost protection
  – comes with SLA
• both plans integrate with AWS Web Application Firewall WAF to give layer 7 protection

Amazon GuardDuty
• IDS - intrusion detection system
• IPS - intrusion prevention system
• threat detection service
• uses ML to analyze AWS logs
  – CloudTrail logs
  – VPC Flow Logs
  – DNS logs
• alerts through EventBridge

Amazon Macie
• fully managed service monitoring S3 data access for anomalies, unauthorized access, data leaks
• uses ML to analyze CloudTrail logs
• identifies most at-risk users

AWS VPN
• secure and private tunnel from your network or device to the AWS global network
• AWS Site-to-Site VPN - connect on-premise network or branch office site to VPC
• AWS Client VPN - connect users to AWS or on-premise networks
• IPSec - Internet Protocol Security - secure network protocol suite - encrypted communication over Internet Protocol network - used in VPNs

AWS WAF
• Web Application Firewall
• rules to allow or deny traffic based on contents of http req
• ruleset from trusted
AWS security partner or AWS WAF Rules Marketplace
• attached to CloudFront or ALB
• protects from attacks in the OWASP (Open Web Application Security Project) top 10 most dangerous attacks
  – injection
  – broken authentication
  – sensitive data exposure
  – XML External Entities XXE
  – broken access control
  – security misconfig
  – cross site scripting XSS
  – insecure deserialization
  – components with known vulnerabilities
  – insufficient logging and monitoring

Hardware Security Module HSM
• piece of hardware designed to store encryption keys
• holds keys in memory - never writes them to disk
• follows FIPS
  – multi-tenant - FIPS 140-2 level 2 compliant - AWS KMS
  – single-tenant - FIPS 140-2 level 3 - AWS CloudHSM (for regulatory enterprise compliance)

AWS KMS
• Key Management Service
• create and control encryption keys
• many AWS services integrated to use KMS to encrypt data
• uses Envelope Encryption
  – master key encrypts
  – data key, which encrypts
  – data

CloudHSM
• automates hardware provisioning, software patching, high availability and backups
• HSM Client on VM
• built on HSM industry standards
• can migrate keys to other commercial HSM solutions
• config KMS to use CloudHSM as a custom key store

Variation Study

AWS Config vs AWS AppConfig
• AWS Config
  – governance tool for Compliance as Code (CaC)
• AWS AppConfig
  – automates the process of deploying configuration variable changes to web-apps
  – write a validator to check that the change won't break the app
  – monitor deployments

SNS vs SQS
• both connect apps via messages
• SNS
  – PubSub
  – multiple protocols - HTTP, email, SQS, SMS
  – plain text emails triggered by AWS services eg.
billing alarms
  – can retry for HTTPS
  – really good for webhooks, simple internal emails and triggering Lambda functions
• SQS
  – Queue, guaranteed delivery
  – apps poll the queue with AWS SDK
  – can retain messages for up to 14 days
  – can send in sequential order or parallel
  – can ensure only one message is sent
  – can ensure message is delivered at least once
  – really good for delayed tasks, queueing up emails

SNS vs SES vs Pinpoint vs WorkMail
• all send emails
• SNS - practical and internal emails
• SES
  – transactional emails (in app actions - sign up, reset password, invoice)
  – can send html emails
  – email templates
  – custom domain name email
  – monitor email reputation
• Pinpoint
  – promotional emails
  – campaigns
  – customer journeys via emails
  – A/B email testing
• WorkMail
  – email web client
  – web client within AWS management console

Amazon Inspector vs AWS Trusted Advisor
• both are security tools that perform audits
• Inspector - audits a single EC2 instance - generates a report from a long list of security checks
• Trusted Advisor - doesn’t generate pdf report, holistic view of recommendations across multiple services

Connect Names Services
• Direct Connect - dedicated fiber optics connection from your data center to AWS
• Amazon Connect - Call Center as a Service - IVR (Interactive Voice Response)
• MediaConnect - AWS Elemental MediaConnect - transports live video (converting videos into different video types is MediaConvert)

Elastic Transcoder vs MediaConvert
• Elastic Transcoder - the old way
• AWS Elemental MediaConvert - the new way - more robust, overlays images, captions data

Artifact vs Inspector
• both produce PDF reports
• Artifact - why should an enterprise trust AWS - global compliance frameworks
• Inspector - is EC2 instance secure - audit tool for EC2 instances

Load Balancers
• ELB - main service with 4 types of load balancers
• ALB
  – layer 7 HTTP/S
  – routing rules (based on information in http req), can attach WAF
• NLB
  – layer 4 (TCP and UDP)
  – extreme performance for TCP and TLS traffic
  – ultra low latencies
  – optimized for sudden and
volatile traffic patterns while using single static IP address per AZ
  – used eg. in video games
• GWLB
  – when you need to deploy a fleet of third party virtual appliances that support GENEVE
• CLB
  – layer 4 and 7
  – for apps built in EC2-Classic network
  – doesn’t use Target Groups
  – EC2-Classic was retired in 2022
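
Sketch - ALB routing rules
• toy model of the ALB note above: layer 7 rules inspect the HTTP request (here only the Host header and the path) and pick a target group
• not the AWS API - Rule, route, the hostnames and the target group names are all hypothetical, made up for illustration
• assumes the first matching rule wins and that unmatched requests fall through to a default target group

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical layer 7 rule: every condition that is set must match."""
    target_group: str
    host: Optional[str] = None         # exact match on the Host header
    path_prefix: Optional[str] = None  # prefix match on the request path


def route(rules: List[Rule], host: str, path: str, default: str) -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in rules:
        if rule.host is not None and rule.host != host:
            continue
        if rule.path_prefix is not None and not path.startswith(rule.path_prefix):
            continue
        return rule.target_group
    return default


# Hypothetical rule set: API traffic by path, admin traffic by hostname.
rules = [
    Rule(target_group="api-servers", path_prefix="/api/"),
    Rule(target_group="admin-servers", host="admin.example.com"),
]

print(route(rules, "www.example.com", "/api/users", default="web-servers"))
print(route(rules, "admin.example.com", "/login", default="web-servers"))
print(route(rules, "www.example.com", "/index.html", default="web-servers"))
```

• an NLB could not apply rules like these - it works on TCP/UDP connections and does not look at the HTTP host or path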