NEW YORK | JULY 26, 2023
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
ANT201
LaunchDarkly scales real-time pipelines by 100x using Amazon Kinesis
Mike Zorn
Software Architect
LaunchDarkly
Nihar Sheth
Senior Product Manager
AWS
Agenda
Why real-time data processing
LaunchDarkly customer story
Batch vs. real-time data processing
Data sets: batch uses bounded data sets (tables, files); real time uses unbounded data sets (streams)
Processing mode: batch is periodic; real time is continuous
Data freshness: batch is hours to days; real time is milliseconds to minutes
Why real-time processing?
[Figure: value of data to decision-making plotted against data age. Time axis: real time, seconds, minutes, hours, days, months. Curve labels: preventive/predictive, actionable, reactive, historical. Regions: time-critical decisions; traditional "batch" business intelligence.]
Source: Mike Gualtieri, Forrester, Perishable Insights
What is streaming data?
High volume
Continuous
Ordered, incremental
Low latency
Mobile apps
Web clickstream
Application logs
Metering records
IoT sensors
Smart buildings
Real-time data processing use cases
Fraud detection
Healthcare
Customer experience
Marketing campaigns
Log analytics
Predictive maintenance
Data Streaming Pipelines
INGEST, STORE, AND ANALYZE HIGH VOLUMES OF HIGH-VELOCITY DATA FROM A VARIETY OF SOURCES IN REAL TIME
Source: Devices and/or applications that produce real-time data at high velocity
Stream ingestion: Data from tens of thousands of data sources can be collected and ingested in real time
Stream storage: Data is stored in the order it was received for a set duration of time, and can be replayed indefinitely during that time
Stream processing: Records are read in the order they're produced, allowing for real-time analytics or streaming ETL
Destination: Data lake, data warehouse, database (ordered from most common to least common)
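The stream storage stage above (ordered, retained for a set duration, replayable during that time) can be illustrated with a toy in-memory log. This is only a sketch of the concept: `ReplayableStream` and its methods are hypothetical names, not part of any Kinesis API, and a real service adds sharding, durability, and distribution.

```python
import time
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class ReplayableStream:
    """Toy stand-in for stream storage (hypothetical, not a Kinesis API)."""

    retention_seconds: float
    records: List[Tuple[int, float, Any]] = field(default_factory=list)
    next_sequence: int = 0

    def put(self, data: Any, now: Optional[float] = None) -> int:
        """Append a record in arrival order and return its sequence number."""
        now = time.time() if now is None else now
        sequence = self.next_sequence
        self.next_sequence += 1
        self.records.append((sequence, now, data))
        return sequence

    def read_from(self, start_sequence: int, now: Optional[float] = None) -> List[Tuple[int, Any]]:
        """Return retained records at or after start_sequence, oldest first.

        Reading does not consume records, so any consumer can replay the
        same range again until the retention window expires.
        """
        now = time.time() if now is None else now
        cutoff = now - self.retention_seconds
        self.records = [r for r in self.records if r[1] >= cutoff]
        return [(seq, data) for seq, _, data in self.records if seq >= start_sequence]
```

Because consumers track their own position (the sequence number), two independent workloads can read the same data at different speeds without affecting each other.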
Challenges of self-managed data streaming
Apache Kafka
Apache Flink
Difficult to set up
Tricky to scale
Hard to achieve high availability
Integration requires development
Error prone and complex to manage
Expensive to maintain
Amazon Kinesis services
EASILY COLLECT, PROCESS, AND ANALYZE DATA STREAMS IN REAL TIME
Amazon Kinesis Data Streams: Collect and store data streams for analytics
Amazon Kinesis Data Firehose: Load data streams into AWS data stores
Amazon Kinesis Data Analytics: Analyze data streams with Kinesis Data Analytics Studio or Apache Flink
Amazon Kinesis Data Streams
Sources: AWS SDK, Amazon Kinesis Producer Library (KPL), Amazon Kinesis Agent, open-source agents (Fluent Bit), 10+ AWS services (Amazon CloudWatch, AWS IoT Core, Amazon DynamoDB)
Kinesis Data Streams: Ingests and stores data streams for processing
Consumers: Amazon Kinesis Data Analytics, Amazon Kinesis Data Firehose, Spark on EMR, Amazon EC2, AWS Lambda
Output: Analyze streaming data using your favorite BI tools
• Easy administration and low cost
• Performance at scale
• Serverless scaling
• Concurrent consumers at low latency
• Security, durability, and availability out of the box
• Data retention up to 1 year
Amazon Kinesis Data Firehose
Sources: AWS SDK – Direct put, Amazon Kinesis Data Streams, Amazon Kinesis Agent, open-source agents (Fluent Bit), 20+ AWS services (Amazon CloudWatch, AWS IoT Greengrass)
Kinesis Data Firehose: Prepares and loads the data continuously to the destinations you choose
Destinations: Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Splunk, HTTP endpoints
Output: Analyze streaming data using your favorite BI tools
• Zero administration and seamless scaling
• Data format conversion to Parquet/ORC
• Direct-to-data store integration
• Dynamic partitioning to Amazon S3
• Serverless data transformations
• Buffer and batching flexibility
• Deliver data directly to 15+ destinations (Datadog, Sumo Logic, New Relic, and MongoDB)
Customers are generating insights at massive scale
Trillions of records per day processed worldwide
2 million unique streams ingest tens of PB of data per day
Largest streams are ingesting > 5 Gigabytes per second
LaunchDarkly customer story
• What is LaunchDarkly?
• Problem statement
• Design goals & Kinesis Data Streams architecture
• Migration
• Cost optimization
LaunchDarkly accelerates your digital business
De-risked releases: With feature management, every developer can now deploy more often and release without risk
Technology migrations: Migrate and modernize your tech stack without the pain. Cloud, database, or API, we have you covered
Targeted experiences: Deliver targeted product experiences to any customer segment
Product experimentation: Maximize the business impact of every product feature through experimentation
Coordinated mobile releases: Release mobile features and fixes on your schedule, not theirs
LaunchDarkly circa 2017
🔥🔥🔥
🔥🔥🔥
🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥
⏳slows down ⏳
Problems with the old architecture and goals for the new architecture
• Problem: Back pressure through host memory (RAM) on event-recorder. Goal: Durable
• Problem: Workloads affect one another. Goal: Isolated
• Problem: Impossible to recover lost data. Goal: Replayable
Technology options in 2017
Durability: Kafka ✅, Amazon SNS & Amazon SQS ✅, Kinesis Data Streams ✅
Isolation: Kafka ✅, Amazon SNS & Amazon SQS ✅, Kinesis Data Streams ✅
Replay: Kafka ✅, Kinesis Data Streams ✅
Managed: Amazon SNS & Amazon SQS ✅, Kinesis Data Streams ✅
Kinesis powered architecture
Kinesis Data Firehose
Aside: Where did Kinesis Data Firehose come from?
Amazon Kinesis Data Firehose
Can we migrate gradually?
© Frank Schulenburg
Migration over months
Kinesis Data Firehose
Migration workflow
🛫 Turn on new architecture for 30% more users
📈 Notice a graph go up and to the right
🛬 Switch that 30% back to the old way
👨‍💻 Fix the thing that impacted the graph and repeat
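The workflow above depends on being able to move a stable slice of users onto the new architecture and back again. A minimal sketch of one way to do that is deterministic hash bucketing. This is an illustration only, not LaunchDarkly's actual rollout algorithm or SDK; `bucket`, `use_new_pipeline`, and the flag key `new-event-pipeline` are hypothetical names.

```python
import hashlib


def bucket(user_key: str, flag_key: str) -> float:
    """Map a user to a stable position in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode("utf-8")).hexdigest()
    # Use the first 32 bits of the hash as a uniform integer.
    return int(digest[:8], 16) / 0x100000000 * 100


def use_new_pipeline(user_key: str, rollout_percent: float,
                     flag_key: str = "new-event-pipeline") -> bool:
    """Return True when this user falls inside the current rollout slice."""
    return bucket(user_key, flag_key) < rollout_percent
```

Because each user's bucket never changes, raising the percentage only adds users, and lowering it again returns exactly the added users to the old path. That is what makes the "switch that 30% back" step safe to repeat.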
Benefits of feature flagged rollout
● Avoid data loss
● Deliver value sooner
● Iterate through issues
Kinesis optimization
Cost & client error handling
Cost optimization: AWS PrivateLink
Cost optimization: Batching (or KPL)
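To see why batching can reduce cost, here is a rough arithmetic sketch. It assumes provisioned-mode pricing, where each record is billed in PUT payload units of 25 KB rounded up per record; the 500-byte event size and the helper names are hypothetical, and actual pricing should be checked against the current AWS pricing page.

```python
import math
from typing import Iterable, List

# Assumption: one PUT payload unit covers up to 25 KB of a single record.
PAYLOAD_UNIT_BYTES = 25 * 1024


def payload_units(record_sizes: Iterable[int]) -> int:
    """Count billed units when each size is sent as its own record."""
    return sum(math.ceil(size / PAYLOAD_UNIT_BYTES) for size in record_sizes)


def batch_events(event_sizes: Iterable[int],
                 max_batch_bytes: int = PAYLOAD_UNIT_BYTES) -> List[int]:
    """Greedily pack events into batches; return the byte size of each batch."""
    batches: List[int] = []
    current = 0
    for size in event_sizes:
        if current and current + size > max_batch_bytes:
            batches.append(current)
            current = 0
        current += size
    if current:
        batches.append(current)
    return batches


events = [500] * 1000                      # 1,000 small events of 500 bytes
unbatched = payload_units(events)          # one unit per event
batched = payload_units(batch_events(events))
```

With these assumed numbers, 1,000 separate records cost 1,000 payload units, while packing 51 events into each record brings that down to 20. The KPL performs a similar aggregation on the producer side.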
Client error handling: Rate limits
HEADROOM
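One reading of the "headroom" label is that the client limits its own send rate to a value below the stream's write limit, so that bursts do not trigger throttling errors. The sketch below shows that idea with a token bucket. The class, the helper, the limit of 1,000 records per second, and the 20% headroom are all assumptions chosen for illustration, not values from the talk.

```python
import time
from typing import Callable


class TokenBucket:
    """Simple client-side rate limiter (illustrative, not an AWS API)."""

    def __init__(self, rate_per_second: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; otherwise return False."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


def limiter_with_headroom(stream_limit_per_second: float,
                          headroom_fraction: float,
                          clock: Callable[[], float] = time.monotonic) -> TokenBucket:
    """Build a limiter that stays headroom_fraction below the stream limit."""
    rate = stream_limit_per_second * (1.0 - headroom_fraction)
    return TokenBucket(rate_per_second=rate, capacity=rate, clock=clock)
```

A caller that gets False from `try_acquire` can buffer, delay, or drop the record locally instead of sending a request that is likely to be throttled.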
Client error handling: Retries
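A common way to implement client retries is exponential backoff with jitter. The sketch below is a generic illustration under that assumption, not the retry logic of any AWS SDK; `call_with_retries` and its default values are hypothetical.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 0.1,
                      max_delay: float = 5.0,
                      retryable: Tuple[Type[BaseException], ...] = (Exception,),
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> T:
    """Call operation(), retrying retryable errors with jittered backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Backoff ceiling doubles each attempt, capped at max_delay;
            # jitter picks a random point below that ceiling.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(ceiling * rng())
    raise ValueError("max_attempts must be at least 1")
```

Raising `max_attempts` gives transient throttling or network errors more chances to clear, at the cost of holding records in client memory longer while the retries run.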
How do we scale to 100 TB/day?
Conclusion
• 1 TB ➡️ 100 TB ingested per day
• 99.99% availability
• 99.99999% data durability
• Kinesis Data Streams enabled an expansion of data workloads
• Use PrivateLink
• When appropriate for your use case
▪ Use batching
▪ Implement rate limits
▪ Increase client retries
• On-demand streams solve scaling
Resources
AWS Kinesis Services
https://aws.amazon.com/kinesis/
Blog post: “LaunchDarkly’s journey from ingesting 1 TB to 100 TB per day with Amazon Kinesis Data Streams”
https://aws.amazon.com/blogs/big-data/launchdarklys-journey-from-ingesting-1-tb-to-100-tb-per-day-with-amazon-kinesis-data-streams/
Thank you!
Nihar Sheth
linkedin.com/in/niharshethaws/
Mike Zorn
LaunchDarkly Booth
Booth #601
Please complete the session survey in the mobile app