EBOOK GitOps and Everything-as-Code for cloud-native applications How to achieve scalable, consistent, and secure management of environments for distributed applications © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. 1 GITOPS AND EVERYTHING-AS-CODE FOR CLOUD-NATIVE APPLICATIONS Table of contents Implementing scalable strategies for your DevOps efforts ……………..…….……….… 3 Infrastructure as Code (IaC) ……..………..………..……………………………………….… 7 Environments as Code (EaC)…….………………………………………………………….… 9 An introduction to GitOps ……..…..………..………..……..…….…………………..……. 10 Going beyond GitOps …….…………………………………………………………………. 12 Configuration as Code (CaC) ……..…..……………........……………………………..……. 13 Policy as Code (PaC) ……….……..…..…………………………….………….………..…… 14 Implementing guardrails ..…...…………………...….…..….…..….…..….…..….…..….… 16 Lessons from application development ….....….……………………………………….… 17 Everything-as-Code (EaC) summary …….....…….……...……..….…………………….… 18 Tools for building cloud-native ………..………………………………………………....… 19 Conclusion …….…….…….…………..…………...……….…..……..…………...……….… 32 © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. 2 Implementing scalable strategies for your DevOps efforts As engineering teams seek to increase agility by breaking apart monoliths and adopting cloud-native application architectures, the dynamic nature of managing infrastructure, applications, and configuration in these environments can be daunting. Without the proper knowledge, tooling, and roadmap, it’s easy for things to get out of control. Here, we’ll dive deep into the concept of GitOps and outline how to build an Everything-as-Code (EaC) practice using version control to ensure managing environments for distributed applications is scalable, consistent, secure, and portable. You’ll take away practical next steps to help you increase IT agility and be empowered to innovate faster. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. 3 The evolution of application design and infrastructure Along with evolution of software architecture, there was also an evolution of infrastructure— how we build it and maintain it, and how new tools and roles have emerged. When organizations started out, the approach was simple: You had single-server designs in which the entire application stack, database, and helper applications were all installed on a single server. If you needed to install something or troubleshoot a problem, you just logged into the server. This design worked well for a little while. The focus was more on hardware and building in performance and reliability, and hardware vendors were making extremely reliable servers with multiple power units and disk arrays. Servers kept increasing in size, but there were limits to the redundancies that were intended to make them resilient: If a motherboard went out, for example, the server was useless. And there was no application segregation, as multiple unrelated applications were typically installed on a single server to optimize space. Due to flaws in this design, systems architects began decoupling components. Databases and unrelated applications had their own dedicated servers. This design was, at first, still very manageable. Admin teams grew and pretty soon there were fleets of people managing the operating system (OS), patches, and applications. 
Job roles were refined even further and silos were born, the thinking being that infrastructure and application strategies had become so complex that you needed individuals who specialized in those areas to keep things running.

[Figure: the evolution of application architecture—mainframe monolithic (1980s), client server (1990s), web application (2000s), service-oriented architecture (2010s), and microservices (present)—compared across input/output, processing, data storage, and communication.]

An inflection point with this technology arrived as a shift in design and application architectures. The shift was about using commodity hardware and building in redundancy horizontally, rather than trying to build ever more robust hardware. The idea was to treat servers as fungible components that you expected would someday fail. Application design also changed, as one now had to think about session and data persistence.

This shift launched a rapid expansion in the number of servers and the amount of infrastructure that teams had to support. To keep up, tools started popping up on the market to manage and maintain all these components—monitoring systems, logging systems, firewall management systems, patching systems … the list is practically endless. This huge variety of tools solved one problem but caused many others—they were so specialized that they required their own teams of experts to manage them.

Today, many organizations are still using this pattern to manage and maintain applications, infrastructure, and configuration. The challenge is that most organizations now have a proliferation of tools and technology, but the culture built around this tooling slows down application release processes. The process got in the way of the very reason developers adopted the technology in the first place. Instead of being empowered to address these challenges head-on, developers were often blamed as the cause of outages and security issues, which led to multiple gates and checks being put in place to address the symptom. Cloud-native technologies and DevOps strategies are turning this mentality on its head.

Everything-as-Code provides a flexible and powerful approach

Today, you no longer need to manage and maintain servers as was previously necessary. Servers, patching, configuration, and infrastructure are all treated as fungible components that don't change between environments. This is made possible by cloud-native technologies, which make everything you need to consume accessible through an application programming interface (API).

In the past, when deploying a new server, you needed to consider usage projections, purchase the hardware, and have it installed in a data center. Then someone needed to install an OS, configure application dependencies, join it to the network, add backups and monitoring, install the application, and complete many other burdensome tasks. This process often took months. With cloud-native and AWS, you can define all these components in a file and launch them in minutes anywhere around the world in an AWS Region. Everything-as-Code (EaC) makes this possible.
EaC is the idea of codifying all aspects of software development—the infrastructure, schemas, pipelines, configuration; anything associated with the cloud should be codified. This gives you several benefits:

• You can deploy in minutes anywhere around the world
• The variability that is often introduced by human error is removed, resulting in better consistency and predictability
• Code is not only used to tell computers what to do, it's also a form of documentation and communication
• Code is reusable—you can move it from one project to another and create templates for other teams to consume
• Code is easier to test and validate compared to physical components

Now that we've looked at some application development history, let's get to the next step in your journey to cloud-native maturity: implementing an EaC approach in your own practice.

Infrastructure as Code (IaC)

Let's first take a look at the most common subset of EaC outside of applications and software. IaC enables you to treat infrastructure the same way applications are treated—infrastructure is defined in a declarative way and stored in a source code control system. There are many different tools on the market to help you manage deployed infrastructure, but the difference for cloud-native comes down to some underpinning best practices that help you unlock speed and agility.

Infrastructure as Code (IaC) best practices

Infrastructure should be treated as immutable. Ideally, once deployed, there should be no attempt to alter the configuration of resources, such as deploying patches. With cloud-native, a new container should be built and deployed; for AWS Lambda, a new version should be published with the latest patches. This also includes the nodes on which containers run. When nodes need updating, it's best to deploy a new node that has already been updated and leverage a container orchestrator such as Kubernetes to migrate the containers to the new node.

Infrastructure code and application code should live in close proximity to one another. When using the cloud, you are often going to need related services, and both code bases are often tied together. There are two schools of thought here:

1. Have IaC live in the same repository as application code. Both should be treated as a single system as they are interlocked with one another. They can be deployed together and, more importantly, tested together.
2. Have IaC live in multiple separate repositories near your application code. Changes happen independently, and there is no reason to deploy an application update when a piece of infrastructure changes, and vice versa.

IaC should be dynamic. Don't hard-code parameters and settings. The problem with hard-coding is that you need to run tests in the same environment that the hard-coded parameter targets. As an example, if you have hard-coded a database endpoint, your application will not work in an environment that cannot connect to that endpoint.

resource "aws_db_instance" "db" {
  allocated_storage = 10
  ...
}

resource "aws_ecs_service" "ecs" {
  name = "ecsApp"
  ...
  ordered_placement_strategy {
    type = "binpack"
    ...
  }
}
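One way to keep IaC dynamic is to pull anything environment-specific into variables and read endpoints from the resources themselves. The following is a minimal sketch, not code from this ebook—the variable, identifier, and output names are illustrative:

variable "environment" {
  description = "Deployment environment, for example dev, staging, or prod"
  type        = string
}

variable "db_password" {
  description = "Database password; in practice this would come from a secrets store"
  type        = string
  sensitive   = true
}

resource "aws_db_instance" "db" {
  identifier          = "app-db-${var.environment}"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 10
  username            = "appuser"
  password            = var.db_password
  skip_final_snapshot = true
}

# The endpoint is read from the resource rather than hard-coded, so the same
# code works in whichever environment it is deployed to.
output "db_endpoint" {
  value = aws_db_instance.db.endpoint
}

Deploying to a different environment then becomes a matter of supplying different variable values (for example, through a per-environment .tfvars file) instead of editing the template.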
Environments as Code (EaC)

Environments as Code is an abstraction over IaC that provides a declarative way of defining an entire environment. Think of environments as consisting of such elements as development, testing, staging, and production. An environment should include all the components necessary to stand up an entire environment—not only the compute components, but also DNS, networking, firewall, and load balancers. The intent of the Environments as Code concept is to codify everything so you can spin up new environments using a single code base.

The advantage you gain here is multi-pronged: It allows you to quickly spin up new environments for testing and developer sandboxes and, as a bonus, you have part of your disaster recovery plan codified and testable with very little effort. Using Environments as Code provides you with a single source of truth for all environments.

Cost optimization

Having Environments as Code as a capability is extremely powerful, not only for testing and developer sandboxes, but also to drive cost savings. With this approach, you no longer need to have long-running environments that are sitting idle most of the time. No more paying for idle time from which you are not getting any value. Development environments can and should be turned off when developers are not using them, such as nights and weekends—40 hours a week versus the traditional 168 hours a week means a potential 76 percent cost saving.

An introduction to GitOps

We've looked at IaC and Environments as Code as ways to codify infrastructure and environments. Now let's transition to GitOps—what it is and how you can use it as part of your IaC strategy.

The company Weaveworks originally coined the term GitOps, and in their definition it means two things:

• An operating model for Kubernetes and cloud-native that provides a set of best practices to integrate deployment, management, and monitoring for containerized clusters and applications.
• A path toward a developer-centric experience for managing applications—applying Git workflows to not only development but also operations.

First and foremost, GitOps is centered around Git and Kubernetes. Let's look at how these two components work together. As an engineer, you would have an infrastructure manifest stored in Git. This manifest can be defined using plain YAML, Helm charts, Kustomize files, or Jsonnet.

Example use case: If you want to change something running in Kubernetes, such as the number of replicas, you would change the manifest and push the change to Git. Kubernetes would recognize that your repository has a new commit and will apply these changes in Kubernetes. This process works because Kubernetes is designed around the idea of a desired state configuration. Kubernetes is aware of its current running configuration, and when a new configuration is updated, Kubernetes compares the old configuration with the new and updates its running state accordingly.

Another important piece of the puzzle is a helper application that runs on Kubernetes, watches for changes in a Git repository, and notifies Kubernetes of a new manifest when it notices a change. But the helper application does more than just notify Kubernetes of changes; it also manages versions, has a user interface (UI), keeps logs and statistics, and more.
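To make the replica-count use case above concrete, the manifest being edited might look like the following minimal sketch (the application name, image, and port are illustrative, not taken from this ebook):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                 # illustrative application name
  labels:
    app: myapp
spec:
  replicas: 3                 # changing this value and pushing the commit is the whole "deployment"
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: private.example.com/myapp:1.4.2   # illustrative image reference
          ports:
            - containerPort: 8080

Committing a change to this file in the watched repository is all a deployment takes; the helper application detects the new commit and reconciles the cluster to match.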
Two popular helper applications that fill this role are Argo CD and Flux CD.

[Figure: the GitOps flow—application code and Infrastructure as Code in Git act as the source of truth; an update to the code source triggers a CI/CD pipeline, and the pipeline runs a series of tasks, resulting in the update of the runtime environment (Kubernetes and configuration management) to match the source.]

AWS takes this concept one step further. With Amazon Elastic Kubernetes Service (EKS) and AWS Controllers for Kubernetes (ACK), you can use GitOps not only to define your Kubernetes infrastructure, but also to define and use AWS services and resources outside your cluster. As an example, if your application needs an Amazon DynamoDB table, you define this in your manifest file and let GitOps take care of creating and managing the DynamoDB table and schema for you. A key takeaway is that you can use GitOps not just for containers, but also for managing AWS resources such as AWS Lambda, Amazon Relational Database Service (Amazon RDS), Amazon ElastiCache, and many others.

Going beyond GitOps

We've talked about GitOps as it relates to two different technologies, Git and Kubernetes, but when we're seeing so many benefits to operating in a GitOps-based manner, why not extend the approach to other technologies? Let's look at Git as a single source of truth for more than just infrastructure. You can use it for configuration, DevSecOps, policy, guardrails—basically anything we want to define in files so that actions are repeatable and consistent. In this way, Git becomes not only a single source of truth, it also becomes a collaboration tool where teams from security, development, testing, operations, and compliance all work together. Git is also used as a self-service tool. For example, if you need a new public hostname record, you can go to the Git repository for the DNS entries, add the entry you need, and create a pull request.

Inner-sourcing

The idea of using Git as the center of everything is a precursor to the concept of inner-sourcing. Inner-sourcing is similar to open-sourcing except that projects are internal to an organization. With inner-sourcing you are sharing code with the rest of your organization with the idea that:

• The more people that use it, contribute to it, and extend it, the better the code and solution become.
• It makes sense to reuse existing components—if someone in your own organization has built something that meets your needs, why reinvent the wheel?

Going beyond GitOps in this way is an implementation evolution of DevOps, and it is really only possible with the tooling available in a cloud-native approach.

Configuration as Code (CaC)

One of the core points of the practices we're discussing here is the use of Git as a single source of truth, so it's fitting that it would be used as a source of truth for configuration as well. Configuration is the information that is needed by an application at startup and runtime. While IaC concerns itself with the deployment of the underlying infrastructure, CaC treats configuration just like application code: it is stored in Git, versioned, and deployed through a continuous integration/continuous delivery (CI/CD) pipeline.
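As a minimal sketch of what configuration kept as code can look like—the file name and settings below are illustrative, not from this ebook—an environment's non-secret settings might live in a versioned YAML file that the pipeline delivers alongside the application:

# config/production.yaml — versioned in Git and reviewed like any other change
log_level: info
http:
  port: 8080
  request_timeout_seconds: 30
feature_flags:
  new_checkout: true
# environment-specific endpoints are injected at deploy time from IaC outputs
# rather than being hard-coded here

Because the file is code, a configuration change follows the same path as a software change: a pull request, a review, and a pipeline run.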
CaC versus IaC There is occasional confusion around the fundamental differences between these two concepts. Configuration pertains to the information that is needed by an application at startup and runtime while IaC pertains to the deployment of the underlying infrastructure. There are two conventions for CaC: 1. One is to store it in a separate repository per environment. As the environment is being deployed, it combines the configuration for that environment with the application. This method works well if you deploy applications separately from configuration changes. 2. The second convention is to store application and configuration code together. This method has an added benefit of being able to test using static code analysis tools and linters, as the environment does not need to be up and running to go through initial tests. This method happens to be a natural choice if you have a language that supports this such as Node.js with .env files. What about secrets? Something that is important to note is that you shouldn’t store secrets (bits of information you wouldn’t want others to know such as credentials, API keys, or tokens) in configuration files. If you have secrets, you will want to store these in either AWS Systems Manager Parameter Store or AWS Secrets Manager. Most IaC and CaC tools support reading directly from either of these two AWS services without exposing the secret. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. 13 Policy as Code (PaC) A policy is a rule or procedure that guides all users accessing and using an organization’s IT resources. Policies can be written or implemented using technology, but written policies don’t scale very well as they rely on humans for implementation. On the other hand, technology-defined policies are cumbersome to maintain as they are typically implemented directly in the product or application. An example is an access control policy for a database. To provide a person access, someone with grant abilities needs to add the user and provide the necessary permissions. This seems easy enough, but what if the same user now needs to be able to launch Kubernetes containers and pull messages off of a messaging queue? Now the problem starts to get a little more complex—you can probably write a script for this, but what happens when the policy changes? Now you have to go to each of the different systems and change the policy manually, hoping that you didn’t make a typo. This approach can quickly become unmanageable. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. 14 Open Policy Agent (OPA) To address the problem discussed on the previous page, there is an open-source project called Open Policy Agent. OPA is a general-purpose policy engine that separates policy decision making from your software. What this practically means is that you can define a general policy— for example, no containers unless they are from your repository named private.example.com —and any application that is integrated with OPA can ask it if this action can be allowed. In the context of Policy as Code, OPA is codified and the policy is deployed to an API server. The API server’s main job is to respond to queries regarding policies. This design decouples policy from the application, which means that you can share policies between applications regardless of language or framework. 
Code snippet from constraint.yaml:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: allow-only-private-registry
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    repos:
      - "private.example.com"

[Figure: with this constraint applied to Amazon EKS, myapp pulled from private.example.com is allowed to run, while otherapp pulled from public.example.com is denied.]

A key concept here is that you can define Policy as Code in a Git repository with OPA. You deploy your policies through a DevOps pipeline to an OPA API server and have any number of applications checking against those policies. If a policy changes, you update the repository and all applications leveraging OPA will then have the latest policies. OPA is a cloud-native tool and has a number of supporting applications.

Implementing guardrails

Guardrails are processes or practices that reduce both the occurrence and the blast radius of undesirable application behavior. Cloud-native has empowered developers to move quickly, but security is still a topmost concern. A mechanism is needed that ensures security but doesn't encumber developers with policies that can stifle their creativity. Guardrails provide that mechanism.

In the context of EaC, an example of a guardrail might be "no public access for an Amazon Simple Storage Service (Amazon S3) bucket" or "no secrets checked into Git." Guardrails are both proactive and reactive, but fully automated. Guardrails should not prevent someone from doing something; instead, they act as a warning that comes either before or after a triggering event. When everything is defined as code, guardrails can be applied during the development or CI/CD process.

Guardrails in action—an example scenario

An engineer creates an IaC template to create a public Amazon S3 bucket. When they deploy this template, they'll get a warning that they need to enable this action at the account level before they can actually create the Amazon S3 bucket. After reviewing the warning, they can decide whether they still need a public bucket. They decide that they do, so they change the account policy and redeploy the template. Now their public bucket has been created.

The guardrail doesn't stop there. After a short period of time, a ticket will automatically open indicating that the developer has a public bucket in their account. The ticket will have instructions on how they can either disable public access to the bucket or file an exception. For the sake of this example, the developer files an exception. The guardrail made the developer review best practices for making an Amazon S3 bucket and certify that they did, in fact, intend for it to be public. The guardrail doesn't prevent them from creating the public bucket; it provides them with warnings and best practices in the event that they created the public bucket inadvertently.
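The account-level switch in this scenario maps to the S3 account public access block, which can itself be kept as code. The sketch below is illustrative—it shows the locked-down default rather than the exception the developer filed, and it is not taken from this ebook's architecture:

# Account-wide guardrail: block public access for every S3 bucket in the account.
# Relaxing any of these flags is a reviewable code change, which is the proactive
# half of the guardrail; the automated ticket described above is the reactive half.
resource "aws_s3_account_public_access_block" "account" {
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}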
Lessons from application development

We've talked through IaC, CaC, Policy as Code, and guardrails. Let's also consider some of the key strategies many of us know from our experience in application development and see how we can apply them as best practices to EaC.

Dev tools: Develop using an integrated development environment (IDE) with plugins that support your IaC tool, linting, scanning, testing, and even formatting. Just as IDEs make it easier to develop application code, they also make it easier to develop EaC.

Treat your IaC as if it were real code: Develop tests and use principles such as DRY (don't repeat yourself) and keep it simple.

Fluid software: Build for change. Just like software, EaC will constantly change. Make sure you name resources and variables in a way that makes it intuitive for those who come behind you to read the code and determine what is happening and what the code supports.

Modularize: Break your problem space into smaller, manageable components. Everything-as-Code can get very complex, but when you modularize, you reduce that complexity. Once you have modules, consider publishing them both as separate repositories in Git and into an artifact repository such as AWS CodeArtifact. With this pattern, other teams can import these modules like any other code module so they are not reinventing the wheel.

CI/CD: Create a continuous integration/continuous delivery (CI/CD) pipeline for EaC.

Everything-as-Code (EaC) summary

We've covered a lot of ground so far. Let's take a moment to revisit some highlights:

• Define all infrastructure as immutable, and store IaC with or close to application code
• Define your entire environment with IaC, including edge cases such as DNS and certificates
• GitOps is about using Git and Kubernetes to deploy and manage infrastructure
• You can use Git as a single source of truth for more than just infrastructure—use it for anything you want to define in files so actions are repeatable and consistent
• Configuration as Code (CaC) is treating your application configuration the same way you treat software
• Use OPA to create a Policy as Code framework that can be used across a number of applications to define and enforce different org policies
• Guardrails allow developers to go fast without encumbering them with policies that can stifle their creativity
• Use best practices from application development, such as creating a CI/CD pipeline to build, test, and deploy, and modularizing EaC to break problems into smaller, more manageable pieces

Tools for building cloud-native

In this section, we'll look at some of the best-fit tools you could use to achieve the tenets discussed in the previous section. At AWS, we've long been believers in enabling builders to use the right tool for the job—and when you build with AWS, you're provided with choice. You can build using the native services AWS provides or use AWS Marketplace to acquire third-party software offered by AWS Partners to take away the heavy lifting and allow your development teams to focus on delivering value to customers. Let's take a deeper look at three key components at this stage of your cloud-native journey: a way to define IaC, the ability to set up policies and guardrails, and tools for observability.

[Figure: Adding development capabilities with AWS Marketplace—find, try, and acquire tools across the DevOps landscape (plan, test, build, secure, release, operate), with sample AWS and AWS Marketplace solutions such as Amazon CloudWatch, Amazon Kinesis, Amazon EventBridge, Amazon EKS, Amazon DynamoDB, AWS CodeCommit, AWS Device Farm, AWS Cloud9, AWS CodeDeploy, Amazon CodeCatalyst, and AWS Lambda; 3,000+ vendors and 13,000+ products.]

AWS Marketplace is a cloud marketplace that makes it easy to find, try, and acquire the tools you need to build cloud-native.
More than 13,000 products from over 3,000 Independent Software Vendors are listed in AWS Marketplace—many of which you can try for free and, if you decide to use them, will be billed through your AWS account.

Infrastructure as Code (IaC)

The first component will help automate the process of managing and provisioning IaC so you don't have to do it through time-consuming manual processes.

[Figure: a Terraform Cloud workflow—teams (storage, network, application, R&D, security) publish versioned modules (storage, app-stack, Vault cluster, Kubernetes cluster, compute) to a private module registry; a version control system feeds a build pipeline that plans and applies changes through per-application workspaces across development, staging, and production, with Sentinel policy enforcement, unique per-workspace settings, remote workspace references, and outputs shared for consumption by other workspaces and cloud services.]

Terraform is an IaC tool, and Terraform Cloud is an IaC service that provides features and functionality to support enterprises. Terraform Cloud provides a number of features that support optimized IaC capabilities:

• Private modules allow users to create and share infrastructure modules within an organization
• Cost estimation features can provide insights into what a planned deployment may cost
• Remote execution provides consistency and visibility for provisioning operations
• Single sign-on (SSO) integration allows seamless authentication to Terraform for end users
• Integration with version control systems provides a cloud-native GitOps approach for deployments
• Audit logging provides visibility on who, what, where, and how

Get started in AWS Marketplace › Watch a demo › Start a hands-on lab ›

Guardrails

The second component will help with setting rules, standards, and best practices related to your development pipeline, from coding and building through testing and release.

[Figure: Bridgecrew guardrails across the pipeline—IDE and pre-commit scanning with fixes, Infrastructure as Code scanning on commit and build, build-time scanning on pull request or build triggers with pull request fixes pushed back into the version control system, and run-time configuration analysis and drift detection with a remediation Lambda, all driven by a policy engine that feeds notifications, dashboards, and compliance reports.]

Bridgecrew provides security and compliance from development to production for IaC such as Terraform, container image scanning, and secrets. It scans against hundreds of pre-built security and compliance benchmarks.

Try it in AWS Marketplace ›

Bridgecrew works with the IDE to provide fast feedback to engineers while they have context.
And for teams using a Git or GitOps workflow, it integrates with version control systems to scan on pull requests and build triggers. Once IaC is deployed, Bridgecrew has integrations with AWS and Terraform Cloud to validate compliance and detect drift. Alerts and notifications can be configured to update ticket systems such as Jira or post to communication systems such as Slack. 21 Cloud-native monitoring Amazon EC2 End-user experience AWS Elastic Beanstalk AWS Lambda Data center Code-level visibility Service and infrastructure monitoring Microservice Observability Amazon Route 53 Infrastructure visibility Database visibility Amazon RDS AWS Storage Gateway Amazon Redshift Auto Scaling Amazon SNS Amazon EMR Amazon ElastiCache Amazon CloudSearch Elastic Load Balancing Amazon API Gateway AWS Billing Monitoring AWS OpsWorks Amazon EBS Third parties Business metrics One platform – One U.I. Observability Amazon DynamoDB Amazon SQS The third component of our solution will provide end-to-end visibility across application resources and infrastructure to help troubleshoot issues and determine the performance of your IaC configuration, as well as the services and systems that run your applications. AppDynamics is an application and business performance monitoring solution that allows you to prioritize your workflows with unified visibility. Context-aware visualizations and correlated insights across application domains allow users to build AWSnative applications with confidence while tying the deployments back to business outcomes in real time. Try it in AWS Marketplace › AppDynamics provides end-to-end visibility of the performance of the resources running your applications. You can use infrastructure visibility to identify and troubleshoot problems that can affect application performance such as service failures, JVM crashes, and network packet loss. AppDynamics baselines and measures transaction health across different microservices to provide a single-threaded view of any business transaction. Lastly, AppDynamics provides end-to-end visibility of the performance of your database, helps you troubleshoot problems such as slow response times and excessive load, and provides metrics on database activities, such as: • SQL statements or stored procedures that are consuming undue system resources • Statistics on procedures and SQL statements • Time spent on fetching, sorting, or waiting on a lock • Activity history from the previous day, week, or month © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. 22 How the Infrastructure as Code (IaC) GitOps architecture works A blue/green deployment is a deployment strategy in which you create two separate, but identical environments. One environment (blue) is running the current application version and one environment (green) is running the new application version. Below, you’ll find a high-level GitOps architecture for IaC. The architecture uses the principles of EaC and sets an application pipeline to support blue/green deployments with Amazon Elastic Container Service (Amazon ECS). Terraform provides templates and modules to reduce the effort required to develop IaC. Guardrails are provided by Bridgecrew, which checks the Terraform code to ensure that best practices are being followed. AppDynamics provides visibility and Observability to all the resources into a single pane of glass. This provides fast feedback to developers to optimize the application and IaC configuration. 
[Figure: the high-level GitOps architecture for IaC—a developer works on Terraform code in an IDE and opens a pull-merge request against the project repo; the merge to main triggers a Terraform Cloud plan, Bridgecrew scans against policies and guardrails (including drift detection), and the deployment flows through AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy into an environment running Amazon Elastic Container Service (Amazon ECS) on AWS Fargate, with Amazon Elastic Container Registry (Amazon ECR) and Amazon Inspector for image scanning, shared Terraform modules in a Git repo, and AppDynamics collecting logs, metrics, traces, and events for observability across the developer, SRE/Ops, and security roles.]

Modules from Git

As a developer, creating an environment should be straightforward. First, fork a template—you want to fork so that if the base template changes, you can pull those changes into your project. Also, if you create useful changes to the template, you have the ability to propose those changes back to the original template repository. Once you fork a template, you'll add in any missing modules and assign custom values to variables.

In the example below, we use modules from other Git repositories to compose our IaC project. Something else to note is that we are also using a shared repository for the Git modules. As other teams iterate and improve infrastructure, it's expected that everyone shares back improvements so that any other team using that module can benefit from them.

In software engineering, a project fork happens when developers take a copy of source code from one software package and start independent development on it, creating a distinct and separate piece of software. Free and open-source software is that which, by definition, may be forked from the original development team without prior permission and without violating copyright law.

provider "aws" {
  region = "us-west-2"
}

module "network" {
  source     = "git::https://example.com/net.git?ref=v1.2.0"
  domainname = var.domainname
}

module "ecs_app_stack" {
  source = "git::https://example.com/ecs.git?ref=v2.4.1"
}

Terraform code snippet: Amazon ECR enhanced scanning

In the example below, we are setting up continuous scanning in Terraform for Amazon Elastic Container Registry (Amazon ECR). When a container is pushed to Amazon ECR, a scan is triggered in Amazon Inspector. Once the scan completes, the results are published to Amazon EventBridge. A rule in Amazon EventBridge sends the message to an Amazon Simple Notification Service (Amazon SNS) topic, where a notification is posted and can be sent to subscribers as email or Slack messages, for example. Amazon EventBridge also has capabilities for calling API destinations, so webhooks can be triggered to perform actions such as a Jira or ServiceNow update. In this configuration, Amazon Inspector is set up for continuous scanning of Amazon ECR, which means that scans trigger not only when an image is pushed to Amazon ECR, but also when a new common vulnerability or exposure is added to the database.

resource "aws_ecr_registry_scanning_configuration" "configuration" {
  scan_type = "ENHANCED"
  rule {
    scan_frequency = "CONTINUOUS_SCAN"
    repository_filter {
      filter      = "*"
      filter_type = "WILDCARD"
    }
  }
}
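The EventBridge rule and SNS topic described above could be wired up in Terraform along the following lines. This is a hedged sketch—the resource names, the email subscriber, and the exact Inspector event pattern are assumptions, not code from this ebook:

resource "aws_sns_topic" "ecr_scan_findings" {
  name = "ecr-scan-findings"   # illustrative name
}

# Allow EventBridge to publish to the topic.
resource "aws_sns_topic_policy" "allow_eventbridge" {
  arn = aws_sns_topic.ecr_scan_findings.arn
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "events.amazonaws.com" }
      Action    = "sns:Publish"
      Resource  = aws_sns_topic.ecr_scan_findings.arn
    }]
  })
}

# Match Amazon Inspector findings published to EventBridge; the pattern may
# need adjusting for your account and the finding types you care about.
resource "aws_cloudwatch_event_rule" "inspector_findings" {
  name = "inspector-ecr-findings"
  event_pattern = jsonencode({
    "source"      = ["aws.inspector2"]
    "detail-type" = ["Inspector2 Finding"]
  })
}

resource "aws_cloudwatch_event_target" "to_sns" {
  rule = aws_cloudwatch_event_rule.inspector_findings.name
  arn  = aws_sns_topic.ecr_scan_findings.arn
}

resource "aws_sns_topic_subscription" "email" {
  topic_arn = aws_sns_topic.ecr_scan_findings.arn
  protocol  = "email"
  endpoint  = "platform-team@example.com"   # illustrative subscriber
}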
[Figure: scan on push—an image pushed to Amazon Elastic Container Registry (Amazon ECR) in the environment is scanned by Amazon Inspector; the results flow to Amazon EventBridge, which notifies engineers through Amazon Simple Notification Service (Amazon SNS).]

Terraform best practices for GitOps

Let's take a moment to look at some best practices for Terraform:

• Split your variables into a separate file instead of modifying the main template. With Terraform, you can do this with a .tfvars file.
• Use the Terraform workspaces feature to create different environments. You want to use the same code base for your deployments, and workspaces allow you to do that.
• Pin modules to Git versions. This will prevent any deployment surprises and breaking changes. Keep in mind that you will also need to periodically check for updates to see if it's time to bump versions, as you want to receive improvements.
• Create modules that are not too granular. If you find yourself creating a module for an Amazon Virtual Private Cloud (Amazon VPC), that's too granular—you should use the resource for an Amazon VPC instead.

Bridgecrew fast feedback guardrails and policies

Bridgecrew provides the guardrails and policies that are applied against the Terraform IaC components. It takes the Terraform code and compares it against best practices for infrastructure as well as custom policies that are created in the Bridgecrew system. Let's look at how Bridgecrew interacts with the different components:

[Figure: Bridgecrew in the GitOps pipeline—(1) scans on pull-merge requests against the project repo, (2) integration with Terraform Cloud plans for policies, guardrails, and drift detection, (3) scans of the shared Terraform modules Git repo, and (4) feedback and results delivered to developers in the IDE.]

1. A Bridgecrew scan is triggered when a pull-merge request is created. The result of the scan can be added to the pull request and/or prevent the merge, based on rules that are defined in the Bridgecrew UI.
2. Bridgecrew has direct integrations with Terraform Cloud to test Terraform plans against guardrails, policies, and configuration drift. For GitOps to be effective, you should ensure that resources and changes can't be introduced outside of the Git repository.
3. Bridgecrew scans against the shared Terraform modules repository. This ensures that any module being committed is using best practices for IaC.
4. Results are meaningless unless engineers get the fast feedback they need to improve IaC. So Bridgecrew provides that fast feedback while engineers have context for the IaC module or project they are developing against. To get even faster feedback, Bridgecrew has a plugin for IDEs.

Codified pipeline with environment resources

Let's take a moment to look at some of the architecture components we defined with Terraform, namely the AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy stack. This pipeline is meant for your application stack, and it's a best practice to define this pipeline while deploying the infrastructure.

[Figure: the codified pipeline—AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy deploying into an environment running Amazon ECS on AWS Fargate, with Amazon ECR and Amazon Inspector.]

The reasoning behind this is that the infrastructure components are being deployed for the application, so the IaC has context regarding what was installed, the resources, and the endpoints. Naturally, this is a good place and time to pass this information to the pipeline configuration so it's available to the application while building and deploying.
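For example, infrastructure outputs can be handed to the build stage as environment variables. The snippet below is a hedged sketch—the project name, IAM role, ECR repository, and ECS cluster references are illustrative assumptions, not the ebook's actual code:

# Pass values the IaC already knows (ECR repository URL, ECS cluster name)
# into the build project so build and deploy steps don't hard-code them.
resource "aws_codebuild_project" "app_build" {
  name         = "app-build"                    # illustrative name
  service_role = aws_iam_role.codebuild.arn     # assumed to be defined elsewhere

  artifacts {
    type = "CODEPIPELINE"
  }

  environment {
    compute_type = "BUILD_GENERAL1_SMALL"
    image        = "aws/codebuild/standard:7.0"
    type         = "LINUX_CONTAINER"

    environment_variable {
      name  = "ECR_REPOSITORY_URL"
      value = aws_ecr_repository.app.repository_url   # assumed ECR repository resource
    }

    environment_variable {
      name  = "ECS_CLUSTER_NAME"
      value = aws_ecs_cluster.app.name                # assumed ECS cluster resource
    }
  }

  source {
    type = "CODEPIPELINE"
  }
}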
Codified Amazon ECS AppSpec

The pipeline IaC setup is also a good time to configure AWS CodeDeploy to use a blue/green deployment strategy. Let's look at a CodeDeploy AppSpec example for Amazon ECS. In this example, the AppSpec YAML file goes into the root directory of the application's source. This file provides AWS CodeDeploy with the task definition, container, and port to deploy. The hooks section is how you can optionally control the deployment:

version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: "taskdefinition-ARN"
        LoadBalancerInfo:
          ContainerName: "ECS-container-name"
          ContainerPort: "container-port-to-route-traffic-to"
Hooks:
  - BeforeInstall: "LambdaFunction"
  - AfterInstall: "LambdaFunction"
  - AfterAllowTestTraffic: "LambdaFunction"
  - BeforeAllowTraffic: "LambdaFunction"
  - AfterAllowTraffic: "LambdaFunction"

• BeforeInstall runs before the Amazon ECS application is installed on the replacement task set.
• AfterInstall runs after the Amazon ECS application is installed.
• AfterAllowTestTraffic runs after the test listener sends traffic to the Amazon ECS application.
• BeforeAllowTraffic runs before sending traffic to the production listener.
• AfterAllowTraffic runs after sending traffic to the production listener.

The hooks section is where one would define AWS Lambda functions to monitor and gather test metrics. If something goes wrong, you would have the AWS Lambda function return a Failed status, and the deployment would be stopped and marked as failed. AWS CodeDeploy will then take care of the rollback which, in the case of a blue/green deployment, is switching traffic back to blue.

Here is the strategy for AWS CodeDeploy and Amazon ECS using blue/green deployment:

• AWS CodeDeploy provisions the green tasks
• Traffic is shifted by the test traffic listener to the green target group
• Validation occurs against the test traffic listener—synthetic transactions and any other necessary tests are run
• When tests pass, success is reported to AWS CodeDeploy and the production traffic listener switches to the green target group
• Validation happens again and either a success or a fail is reported to AWS CodeDeploy

[Figure: shift production traffic to green, rolling back in case of alarm—an Application Load Balancer in front of an AWS Fargate service sends 100% of production traffic through the production traffic listener (port 80) to the blue target group (blue tasks: v1 code) and 100% of test traffic through the test traffic listener (port 8080) to the green target group (green tasks: v2 code).]

In the event of a failure or alarm, a fast rollback to the blue tasks happens in seconds. And with a blue/green deployment model, no users would be impacted until production traffic is routed to the green target group.

AppDynamics Observability across all environments

We've seen AWS CodeDeploy codification and a blue/green deployment strategy used for Amazon ECS container deployments. Now let's look at how we achieve visibility into all the different components of our model. All metrics, events, logs, and traces (MELT) data is sent to AppDynamics to provide full-stack visibility into all the different components of an application.
It’s important to have a unified place to collect and surface this information as cloud-native architectures span multiple microservices and all the application components may not be completely under your control. As you leverage services to abstract away complexity, these services become gaps in visibility and an Observability tool like AppDynamics makes it easy to consume and correlate all those services. AWS Cloud AppDynamics Developer Logs/metrics/ traces/events Observability SRE / Ops Security AWS CodePipeline AWS CodeBuild AWS CodeDeploy Environment Logs/metrics/traces/events AWS Cloud Server Server Server Amazon Elastic Container Service (Amazon ECS) AWS Fargate Amazon Elastic Container Registry (Amazon ECR) Amazon Inspector SaaS Database © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. Server Server Server Server 31 Continue your journey with AWS Marketplace To get started, visit: https://aws.amazon.com/marketplace/solutions/devops AWS Marketplace Over 13,000 products from 3,000+ vendors: Third-party research has found that customers using AWS Marketplace are experiencing an average time savings of 49 percent when needing to find, buy, and deploy a third-party solution. And some of the highest-rated benefits of using AWS Marketplace are identified as: Time to value Cloud readiness of the solution Return on Investment Part of the reason for this is that AWS Marketplace is supported by a team of solution architects, security experts, product specialists, and other experts to help you connect with the software and resources you need to succeed with your applications running on AWS. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. Buy through AWS Billing using flexible purchasing options: • Free trial • Pay-as-you-go • Hourly | Monthly | Annual | Multi-Year • Bring your own license (BYOL) • Seller private offers • Channel Partner private offers Deploy with multiple deployment options: • AWS Control Tower • AWS Service Catalog • AWS CloudFormation (Infrastructure as Code) • Software as a Service (SaaS) • Amazon Machine Image (AMI) • Amazon Elastic Container Service (ECS) • Amazon Elastic Kubernetes Service (EKS) 32 Get started today Visit http://aws.amazon.com/marketplace to find, try and buy software with flexible pricing and multiple deployment options to support your use case. https://aws.amazon.com/marketplace/solutions/devops Authors: James Bland Global Tech Lead for DevOps, AWS Aditya Muppavarapu Global Segment Leader for DevOps, AWS © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. 33