NEW YORK | JULY 26, 2023
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.

ANT201
LaunchDarkly scales real-time pipelines by 100x using Amazon Kinesis
Mike Zorn, Software Architect, LaunchDarkly
Nihar Sheth, Senior Product Manager, AWS

Agenda
• Why real-time data processing
• LaunchDarkly customer story

Batch vs. real-time data processing
                   Batch data processing               Real-time data processing
Data sets          Bounded data sets (tables, files)   Unbounded data sets (streams)
Processing mode    Periodic                            Continuous
Data freshness     Hours to days                       Milliseconds to minutes

Why real-time processing?
[Figure: value of data to decision-making (vertical axis) falls as data ages (horizontal axis: real time, seconds, minutes, hours, days, months). Labels along the curve: preventive/predictive, actionable, reactive, historical. Time-critical decisions sit at the real-time end; traditional "batch" business intelligence sits at the older end.]
Source: Mike Gualtieri, Forrester, Perishable Insights

What is streaming data?
• High volume
• Continuous
• Ordered, incremental
• Low latency
Sources: mobile apps, web clickstream, application logs, metering records, IoT sensors, smart buildings

Real-time data processing use cases
• Fraud detection
• Healthcare
• Customer experience
• Marketing campaigns
• Log analytics
• Predictive maintenance

Data Streaming Pipelines
INGEST, STORE, AND ANALYZE HIGH VOLUMES OF HIGH-VELOCITY DATA FROM A VARIETY OF SOURCES IN REAL TIME
• Source: Devices and/or applications that produce real-time data at high velocity
• Stream ingestion: Data from tens of thousands of data sources can be collected and ingested in real time
• Stream storage: Data is stored in the order it was received for a set duration of time, and can be replayed indefinitely during that time
• Stream processing: Records are read in the order they're produced, allowing for real-time analytics or streaming ETL
• Destination: Data lake, data warehouse (most common); database (least common)

Challenges of self-managed data streaming
Apache Kafka, Apache Flink
• Difficult to set up
• Tricky to scale
• Hard to achieve high availability
• Integration requires development
• Error prone and complex to manage
• Expensive to maintain

Amazon Kinesis services
EASILY COLLECT, PROCESS, AND ANALYZE DATA STREAMS IN REAL TIME
• Amazon Kinesis Data Streams: Collect and store data streams for analytics
• Amazon Kinesis Data Firehose: Load data streams into AWS data stores
• Amazon Kinesis Data Analytics: Analyze data streams with Kinesis Data Analytics Studio or Apache Flink
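
The stream storage and stream processing stages described in the pipeline slide above (arrival-order storage, a fixed retention window, replay within that window) can be illustrated with a toy in-memory model. This is only a sketch for intuition: `ToyStream`, `Record`, and the caller-supplied `now` clock are hypothetical names invented here, and nothing about it reflects how Kinesis Data Streams is implemented.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterator


@dataclass(frozen=True)
class Record:
    sequence: int      # position in the stream, assigned on arrival
    arrived_at: float  # arrival time in seconds, from the caller's clock
    data: bytes


class ToyStream:
    """Hypothetical in-memory stand-in for stream storage."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self._records: Deque[Record] = deque()
        self._next_sequence = 0

    def put(self, data: bytes, now: float) -> int:
        """Append a record in arrival order and return its sequence number."""
        record = Record(self._next_sequence, now, data)
        self._records.append(record)
        self._next_sequence += 1
        return record.sequence

    def expire(self, now: float) -> None:
        """Drop records that are older than the retention window."""
        while self._records and now - self._records[0].arrived_at > self.retention_seconds:
            self._records.popleft()

    def read_from(self, sequence: int, now: float) -> Iterator[Record]:
        """Yield retained records with sequence >= `sequence`, oldest first."""
        self.expire(now)
        for record in list(self._records):
            if record.sequence >= sequence:
                yield record
```

Because reading does not remove records, a consumer can call `read_from(0, now)` repeatedly to replay everything still inside the retention window, and several consumers can read the same records independently.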

Amazon Kinesis Data Streams
Producers
• AWS SDK
• Amazon Kinesis Producer Library (KPL)
• Amazon Kinesis Agent
• Open-source agents (Fluent Bit)
• 10+ AWS services (Amazon CloudWatch, AWS IoT Core, Amazon DynamoDB)
Kinesis Data Streams: Ingests and stores data streams for processing
Consumers
• Amazon Kinesis Data Analytics
• Amazon Kinesis Data Firehose
• Spark on EMR
• Amazon EC2
• AWS Lambda
Output: Analyze streaming data using your favorite BI tools
Features
• Easy administration and low cost
• Performance at scale
• Serverless scaling
• Concurrent consumers at low latency
• Security, durability, and availability out of the box
• Data retention up to 1 year

Amazon Kinesis Data Firehose
Sources
• AWS SDK – Direct put
• Amazon Kinesis Data Streams
• Amazon Kinesis Agent
• Open-source agents (Fluent Bit)
• 20+ AWS services (Amazon CloudWatch, AWS IoT Greengrass)
Kinesis Data Firehose: Prepares and loads the data continuously to the destinations you choose
Destinations
• Amazon S3
• Amazon Redshift
• Amazon OpenSearch Service
• Splunk
• HTTP endpoints
Output: Analyze streaming data using your favorite BI tools
Features
• Zero administration and seamless scaling
• Data format conversion to Parquet/ORC
• Direct-to-data store integration
• Dynamic partitioning to Amazon S3
• Serverless data transformations
• Buffer and batching flexibility
• Deliver data directly to 15+ destinations (Datadog, Sumo Logic, New Relic, and MongoDB)

Customers are generating insights at massive scale
• Trillions of records per day processed worldwide
• 2 million unique streams ingest tens of PB of data per day
• Largest streams are ingesting > 5 gigabytes per second

LaunchDarkly customer story
• What is LaunchDarkly?
• Problem statement
• Design goals & Kinesis Data Streams architecture
• Migration
• Cost optimization

LaunchDarkly accelerates your digital business
• De-risked releases: With feature management, every developer can now deploy more often and release without risk
• Technology migrations: Migrate and modernize your tech stack without the pain – cloud, database, API: we have you covered
• Targeted experiences: Deliver targeted product experiences to any customer segment
• Product experimentation: Maximize the business impact of every product feature through experimentation
• Coordinated mobile releases: Release mobile features and fixes on your schedule, not theirs

LaunchDarkly circa 2017
[Architecture diagram, built up over three slides. Annotations added in later builds: "⏳ slows down ⏳", then 🔥 across the diagram.]

Problems with the old architecture
• Back pressure through host memory on event-recorder
• Workloads affect one another
• Impossible to recover lost data

Problems with this architecture              Goals for new architecture
• Back pressure is RAM on event-recorder     • Durable
• Workloads affect one another               • Isolated
• Impossible to recover lost data            • Replayable

Technology options in 2017
              Kafka   Amazon SNS & Amazon SQS   Kinesis Data Streams
Durability    ✅      ✅                        ✅
Isolation     ✅      ✅                        ✅
Replay        ✅      -                         ✅
Managed       -       ✅                        ✅

Kinesis-powered architecture
[Architecture diagram; includes Kinesis Data Firehose]

Aside: Where did Kinesis Data Firehose come from?
Amazon Kinesis Data Firehose

Can we migrate gradually?
[Photo: © Frank Schulenburg]

Migration over months
[Architecture diagram; includes Kinesis Data Firehose]

Migration workflow
🛫 Turn on new architecture for 30% more users
📈 Notice a graph go up and to the right
🛬 Switch that 30% back to the old way
👨‍💻 Fix the thing that impacted the graph and repeat

Benefits of feature-flagged rollout
• Avoid data loss
• Deliver value sooner
• Iterate through issues

Kinesis optimization
Cost & client error handling

Cost optimization: AWS PrivateLink

Cost optimization: Batching (or KPL)

Client error handling: Rate limits
[Chart label: HEADROOM]

Client error handling: Retries

How do we scale to 100 TB/day?
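
The batching and retry slides above give only titles, so the following is a minimal sketch of what those two ideas can look like together on the producer side, not LaunchDarkly's actual code. It assumes a client object with the same `put_records` call and response shape as the boto3 Kinesis client (`FailedRecordCount`, plus a per-record `ErrorCode` on failures). The function names, the attempt count, and the backoff values are hypothetical. It enforces the 500-records-per-request limit of `PutRecords` but not the request size limit.

```python
import random
import time
from typing import Any, Callable, Dict, Iterator, List

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def chunked(items: List[Any], size: int) -> Iterator[List[Any]]:
    """Split a list into consecutive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def put_with_retries(
    client: Any,
    stream_name: str,
    records: List[Dict[str, Any]],
    max_attempts: int = 5,           # hypothetical default
    base_delay: float = 0.1,         # hypothetical default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, Any]]:
    """Send records in batches and retry only the records that failed.

    Returns the records that were still failing after `max_attempts`.
    """
    undelivered: List[Dict[str, Any]] = []
    for batch in chunked(records, MAX_RECORDS_PER_CALL):
        pending = batch
        for attempt in range(max_attempts):
            response = client.put_records(StreamName=stream_name, Records=pending)
            if response["FailedRecordCount"] == 0:
                pending = []
                break
            # Results are returned in request order; failed entries carry an ErrorCode.
            pending = [
                record
                for record, result in zip(pending, response["Records"])
                if "ErrorCode" in result
            ]
            if attempt < max_attempts - 1:
                # Exponential backoff with full jitter before the next attempt.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
        undelivered.extend(pending)
    return undelivered
```

With boto3 this would be called as `put_with_retries(boto3.client("kinesis"), "my-stream", records)`, where `"my-stream"` is a placeholder and each record is a dict with `Data` and `PartitionKey`. Retrying only the failed subset matters because `PutRecords` can partially succeed, and resending the whole batch would duplicate records that were already accepted.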

Conclusion
• 1 TB ➡️ 100 TB ingested per day
• 99.99% availability
• 99.99999% data durability
• Kinesis Data Streams enabled an expansion of data workloads
• Use PrivateLink
• When appropriate for your use case
  ▪ Use batching
  ▪ Implement rate limits
  ▪ Increase client retries
• On-demand streams solve scaling

Resources
AWS Kinesis Services
https://aws.amazon.com/kinesis/
Blog post: “LaunchDarkly’s journey from ingesting 1 TB to 100 TB per day with Amazon Kinesis Data Streams”
https://aws.amazon.com/blogs/big-data/launchdarklys-journey-from-ingesting-1-tb-to-100-tb-per-day-with-amazon-kinesis-data-streams/

Thank you!
Nihar Sheth: linkedin.com/in/niharshethaws/
Mike Zorn: LaunchDarkly Booth, Booth #601
Please complete the session survey in the mobile app