Prophecy Dummies book webinar

Low-code data engineering on Databricks for Dummies
An essential guide to boosting business data team productivity on the lakehouse
Our speakers
Mitesh Shah, VP, Market Strategy
Nathan Tong, Sales Engineer
Roberto Salcido, Senior Solutions Architect
A non-intimidating primer to low-code data engineering on Databricks
Goals
✓ An approachable guide to the lakehouse and low-code data engineering
✓ Reinforce content of Low-code Data Engineering on Databricks for Dummies
✓ Bring to life the concepts in the book through demo, Q&A, and more
Assumptions
✓ You are early in your low-code data engineering journey
✓ You might be early in your lakehouse journey
✓ You might be new to the concepts, but you’re no dummy :-)
Agenda for today
✓ The need for lakehouse and low-code data engineering
✓ Demo
✓ The role of AI
✓ Tales from the field: How Prophecy & Databricks drive customer impact
Tech leaders are to the right of the Data Maturity Curve
From hindsight to foresight
[Figure: Data Maturity Curve. Horizontal axis: Data + AI Maturity. Vertical axis: Competitive Advantage. Stages in ascending order: Clean Data; Reports ("What happened?"); Ad Hoc Queries; Data Exploration; Predictive Modeling ("What will happen?"); Prescriptive Analytics ("How should we respond?"); Automated Decision Making ("Automatically make the best decision").]
©2023 Databricks Inc. — All rights reserved
Realizing this requires two disparate, incompatible data platforms
[Figure: the Data Maturity Curve split across two platforms. Data Warehouse for BI covers Clean Data, Reports, Ad Hoc Queries, and Data Exploration ("What happened?"). Data Lake for AI covers Predictive Modeling, Prescriptive Analytics, and Automated Decision Making ("What will happen?"). Axes: Data + AI Maturity and Competitive Advantage.]
There is no need to have two disparate platforms
[Figure: a Data Warehouse stack (structured tables; Governance and Security: Table ACLs; Business Intelligence, SQL Analytics) beside a Data Lake stack (structured and unstructured files; Governance and Security: Files and Blobs; Data Science & ML, Data Streaming), with subsets of data copied between them.]
Callouts:
● Highly reliable and efficient
● All of the data and very adaptable
● Incomplete support for use cases
● Incompatible security and governance models
● Disjointed and duplicative data silos
This is the lakehouse paradigm
● Technologies: AI/ML, SQL, BI, and streaming use cases
● One security and governance approach for all data assets on all clouds
● A reliable data platform to efficiently handle all data types
[Figure: a single Data Lakehouse stack. Workloads: Data Science & ML, Data Streaming, Business Intelligence, SQL Analytics. Governance and Security (Files and Blobs and Table ACLs) through Unity Catalog: fine-grained governance for data and AI. Delta Lake: data reliability and performance. Data Lakehouse: structured tables and unstructured files.]
Databricks Lakehouse Platform
[Figure: platform layers, top to bottom. Data Applications. Workloads: Data Warehousing, Data Engineering, Data Streaming, Data Science and ML. Unity Catalog: fine-grained governance for data, analytics and AI. Delta Lake: data reliability and performance. Cloud Data Lake: all structured and unstructured data.]
● Simple: Unify your data warehousing and AI use cases on a single platform
● Multicloud: One consistent data platform across clouds
● Open: Built on open source and open standards
Raw data is rarely suitable for immediate consumption
Data transformations are the key to building
AI- and analytics-ready data products
[Figure: medallion architecture on Databricks. Bronze (Raw Ingestion and History), Silver (Filtered, Cleaned, Augmented), Gold (Business Aggregates & Data Models). Curated Data serves Data Science and Machine Learning (Databricks Machine Learning) and Enterprise Reporting and BI (Databricks SQL, DBSQL Warehouse). Data Governance powered by Databricks Unity Catalog. Other labels: Real time CDC, EDC.]
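The Bronze, Silver, and Gold stages on this slide can be sketched in plain Python. This is an illustration only, not Databricks or Prophecy code: the sample records, the field names, and the `to_silver` / `to_gold` helpers are all assumptions made for the example. A real lakehouse pipeline would typically express the same steps in Spark or SQL against Delta tables.

```python
from collections import defaultdict

# Hypothetical raw records as they might land in a bronze table (made-up data).
bronze = [
    {"order_id": "1", "region": "EMEA", "amount": "120.50"},
    {"order_id": "2", "region": "emea ", "amount": "79.50"},
    {"order_id": "2", "region": "emea ", "amount": "79.50"},  # duplicate record
    {"order_id": "3", "region": "APAC", "amount": None},       # missing amount
    {"order_id": "4", "region": "APAC", "amount": "200.00"},
]


def to_silver(rows):
    """Silver step: filter incomplete rows, drop duplicates, clean types."""
    seen = set()
    silver = []
    for row in rows:
        if row["amount"] is None:        # filter records with no amount
            continue
        if row["order_id"] in seen:      # keep only the first copy of an order
            continue
        seen.add(row["order_id"])
        silver.append({
            "order_id": int(row["order_id"]),
            "region": row["region"].strip().upper(),  # normalize region text
            "amount": float(row["amount"]),           # cast string to number
        })
    return silver


def to_gold(rows):
    """Gold step: aggregate cleaned rows into a per-region business total."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping each stage as a separate function mirrors the slide's point that raw data is refined step by step, so each intermediate result can be inspected on its own.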
Existing transformation options have significant shortcomings
Code / Scripts
X SQL is limited to DWH/relational
X Dependency on skilled data engineers
X Missing capabilities: orchestration, lineage, deployment, …
Legacy low-code solutions
X Lock-in
X Non-native performance
X No support for DataOps
Prophecy is the low-code platform with native-to-cloud execution
Data platform team; Business data teams
● ACCESS LAYER: Low-code interfaces
● INTELLIGENCE LAYER: Generative AI with knowledge graphs on metadata
● SYSTEMS LAYER: Execution, orchestration, observability, search, quality, lineage, … (uniquely with code & extensibility)
for orchestration, scheduling
The complete low-code data transformation platform for enterprises in the cloud
✓ Complete: Prophecy is a complete data engineering platform that spans data pipeline development, deployment, management, and orchestration
✓ Low code: Visual UI provides a drag-and-drop interface for building pipelines; Prophecy Data Copilot generates pipelines based on natural language prompts
✓ Open: Visual=Code generates 100% open, git-committable code that is native to the underlying cloud data platform, enabling DataOps practices and preventing lock-in
✓ Standardization & Reuse: Framework Builder enables users to add a library of visual components, building standards for data and enabling reuse across data stakeholders
✓ AI: Data engineers can quickly build generative AI applications on unstructured enterprise data, powering use cases that immediately improve enterprise efficiency
Live demo
Discover how Prophecy democratizes
data engineering
“The hottest new programming
language is English.”
ANDREJ KARPATHY
Prophecy Data Copilot
Type your query and Copilot will create a data pipeline.
AI assistant for trusted data pipelines
● Enable a broader set of users through natural language
● Accelerate pipeline creation with auto-generated pipelines
● Improve pipeline quality with intelligent recommendations
Databricks Assistant
A context-aware AI assistant
● Generates SQL queries and code, explains code, and fixes issues
● Leverages Unity Catalog to provide personalized responses
● Available in Databricks Notebooks, SQL editor, and file editor
Generates SQL queries and code
● Takes requests in natural language and creates code snippets
● Applies details from code cells, libraries, runtime, and more to improve accuracy
● Explains, diagnoses, and fixes issues from within a cell
Databricks AI Assistant + Prophecy Data Copilot
Using natural language to accelerate & democratize work across the data engineering stack
● Generate SQL or Python code
● Generate pipelines
● Autocomplete code or queries
● Suggest transformations
● Fix issues
● Auto-documentation
Tales from the field
Prophecy + Databricks drive business impact
Driving home faster data pipelines & league-winning analytics
Legacy data architecture limits data insights
● Rigid architectures and siloed data complicate analytics
● Limited data engineering resources slow time to insights
Unlocking data access and transformation for all
● All data users can build high-quality data pipelines and products quickly, collaboratively, and easily
● Flexibility and extensibility have enabled self-serve analytics at scale
Building their competitive edge with data
● 7x faster pipeline development
● 3x more analysts and developers building pipelines
● 10x faster meeting stakeholder KPIs
24x faster data inspires confident investments
Complex ETL blocking investment growth
● Manual processes were impacting time-to-value
● Legacy tooling caused missed investment opportunities
Boosting business data user productivity
● Low-code approach empowered data team without reliance on engineering
● Access to underlying code to ensure pipelines are reliable and performant
Data that moves at the speed of the market
● 42x higher data team productivity
● 24x faster time-to-insights
Q&A with Roberto & Nathan
● Prophecy + Databricks
● When to migrate
● Common questions
FREE TRIAL
A low-code approach to data transformation
Start for free
https://app.prophecy.io/metadata/auth/signup
Q&A
● How are you different from Alteryx? Matillion?
● Can you define what Prophecy is all about in a single definition?
● What types of data does Prophecy work with? Unstructured data?
● Is all 'raw' data absorbed into the system deposited into 'bronze' tables, or can preprocessing steps occur within input elements? For example, consider multiple high-volume Kubernetes streams.
Prophecy Free Trial
https://app.prophecy.io/metadata/auth/signup
Key takeaways
● Low-code tooling enables all data team members and boosts their productivity
● We are able to maintain a high-quality codebase and optimize as needed
● Integration with Databricks Lakehouse unifies a low-code approach with data, analytics, and AI use cases
Thank you
prophecy.io
Data transformation for everyone
A primer on the power of low-code data engineering and how visual data transformation can enable both technical and business data users to convert raw data into analytics- and machine learning-ready data.
Read this ebook to learn:
● How the lakehouse architecture has transformed the modern data stack
● Why Spark and SQL expertise does not have to be a blocker to data engineering success
● How Prophecy’s visual, low-code data solution democratizes data engineering
● Real-world use cases where Prophecy has unlocked the potential of the lakehouse
● And much, much more
Get the book
Prophecy overview
Simplify data engineering for technical and
business data users