Talend Big Data Partners Playbook

Value Proposition, Qualification, Objection Handling and more…
2013-04 | v1.0
Table of Contents
1. Big Data Overview
2. Talend Big Data Value Proposition
3. Talend Big Data Products
4. How to Detect/ Create/ Qualify Opportunities
5. Pricing
6. Market Overview
7. Competitive Intelligence
8. Customer Case Studies
9. Partners
10. Glossary/Background
Introduction
Welcome to the Talend Big Data Partner Playbook!
How to use
This document is meant as a reference guide for Talend partners and is confidential. It falls under the Talend
nondisclosure agreement signed as part of the standard partner agreement and must not be distributed. It is meant
for Talend Partners only.
Important Note: In some sections of this document you will notice clickable icons. By clicking on them, you will access more detailed information on the corresponding section. The goal is to keep the main document concise while offering additional information where needed. If, when clicking on an icon, you get an error message about a “Word converter”, please use this link to correct the issue.
1. Big Data Overview
What is big data?
Big data represents a significant paradigm shift in enterprise technology. This advance allows organizations to differentiate themselves by processing data they never thought possible, increasing the speed at which they analyze and improve immense amounts of data.
Big data encompasses a complex and large set of diverse structured and unstructured datasets
that are difficult to process using traditional data management practices and tools. For example,
there is an increasing desire to collect call detail records, web logs, data from sensor networks,
financial transactions, social media and Internet text, and then analyze it together with existing data sources.
Over 80 percent of the world’s data is unstructured and it is growing at 15 times the rate of
structured data.
Big data technology is rather new and rooted in open source communities. Hadoop, the most widely used big data technology, only reached version 1.0 in January 2012.
What are the challenges of implementing big data projects?
While Hadoop and other big data technologies are standards-compliant, they still require a specialized skill set to master and software tools to manage and deploy. Companies need to integrate big data and non-relational (NoSQL) data sources, which requires big data integration tools like Talend.
The primary implementation challenges include:
1. Lack of development knowledge and skills
The big data technologies are new and the underlying concepts, such as MapReduce, are complex. In this nascent market, there are limited tools available to aid development and implementation of these projects. You are required to find resources that understand these complexities in order to be successful, but only a handful are available. Compounding this challenge, the technology is not easy to learn.
2. Lack of big data project management
Big data projects at this point are just that, projects. It is early in the adoption process and
most organizations are trying to sort out potential value and create an exploratory project or
special project team. Typically these projects go unmanaged. As with any corporate data
management discipline however, they will eventually need to comply with established
corporate standards and accepted project management norms for organization, deployment
and sharing of project artifacts.
3. Poor big data quality can lead to big problems
Depending on the goal of a big data project, poor data quality can have a big impact on
effectiveness. It can be argued that inconsistent or invalid data could have exponential impact
on analysis in the big data world. As analysis on big data grows, so too will the need for
validation, standardization, enrichment and resolution of data. Even identification of linkages
can be considered a data quality issue that needs to be resolved for big data.
How can Talend help?
Talend Big Data is a powerful and versatile solution that simplifies integrating big data technologies and data sources without writing and maintaining complex code.
2. Talend Big Data Value Proposition
Talend presents an intuitive development and project management environment to aid in the deployment of a big
data program. It extends typical data management functions into big data with a wide range of functions across
integration and data quality. It not only simplifies development but also increases effectiveness of a big data program.
2.1 Talend’s Big Data Strategy
There are four key strategic pillars to the Talend big data strategy.
1. Big Data Integration
Landing big data (large volumes of log files, data from operational systems, social media, sensors, or other sources) in Hadoop via HDFS, HBase, Sqoop or Hive loading is considered an operational data integration problem. Talend is the link between traditional resources, such as databases, applications and file servers, and these big data technologies.
It provides an intuitive set of graphical components and a workspace that allow interaction with a big data source or target without the need to learn and write complicated code. A big data connection is configured graphically and the underlying code is automatically generated; it can then be deployed as a service, an executable or a stand-alone job. The full set of Talend data integration components (application, database, service and even a master data hub) is available, so data movement can be orchestrated from almost any source into almost any target. Finally, Talend provides graphical components that enable easy configuration of NoSQL technologies such as MongoDB, Cassandra, Neo4J, Hive and HBase to provide random, real-time read/write, column-oriented access to big data.
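To make this concrete, the sketch below shows the kind of low-level operation such a graphical component hides: copying a local log file into HDFS through the standard Hadoop client API. It is a minimal, hand-written illustration (the NameNode URI and file paths are hypothetical), not the code Talend generates.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal sketch: load a local web log into HDFS using the Hadoop client API.
// The NameNode URI and both paths are hypothetical placeholders.
public class HdfsLoadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode.example.com:8020"), conf);

        Path localFile = new Path("/var/log/web/access.log");
        Path hdfsTarget = new Path("/data/raw/weblogs/access.log");

        // copyFromLocalFile streams the file into the cluster block by block.
        fs.copyFromLocalFile(localFile, hdfsTarget);
        fs.close();
    }
}
```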
2. Big Data Manipulation
A range of tools enables a developer to take advantage of big data parallelization to perform transformations on massive amounts of data. Languages such as Apache Pig and Hive provide scripting capabilities to compare, filter, evaluate and group data within an HDFS cluster. Talend abstracts
these functions into a set of components that allow these scripts to be defined in a graphical
environment and as part of a data flow so they can be developed quickly and without any knowledge of
the underlying language.
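For reference, the sketch below shows the sort of Pig logic such components express, driven here through Pig’s Java API (PigServer). Paths and the record layout are hypothetical; with Talend the equivalent flow would be configured graphically rather than scripted by hand.

```java
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

// Minimal sketch: filter and group web-log records in HDFS with Pig Latin,
// submitted from Java. Paths and the field layout are hypothetical.
public class PigFilterGroupSketch {
    public static void main(String[] args) throws Exception {
        PigServer pig = new PigServer(ExecType.MAPREDUCE);

        // Load raw records, keep only server errors, and count them per page.
        pig.registerQuery("logs = LOAD '/data/raw/weblogs' USING PigStorage('\\t') "
                + "AS (ip:chararray, page:chararray, status:int);");
        pig.registerQuery("errors = FILTER logs BY status >= 500;");
        pig.registerQuery("by_page = GROUP errors BY page;");
        pig.registerQuery("counts = FOREACH by_page GENERATE group AS page, COUNT(errors) AS hits;");

        // store() triggers execution; Pig compiles the script into MapReduce jobs.
        pig.store("counts", "/data/out/error_counts");
        pig.shutdown();
    }
}
```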
3. Big Data Quality
Talend presents data quality functions that take advantage of the massively parallel processing (MPP) environment of Hadoop, and because we rely only on generating native Hadoop code, users can immediately apply data quality across their cluster. It provides explicit functions that use this massively parallel environment to identify duplicate records across huge data stores in moments, not days. It also extends into profiling big data and other important quality issues, as the Talend data quality functions can be employed for big data tasks. This is a natural extension of a proven, integrated data quality and data integration solution.
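As an illustration of the duplicate-identification idea, the sketch below is a simplified, hand-written MapReduce job (new Hadoop API) that keys records on one field, for example a customer ID, and keeps a single record per key. It is illustrative only and is not Talend’s generated matching code, which also covers fuzzy matching and survivorship.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Simplified duplicate detection: records are keyed on their first field
// (e.g. a customer ID) and the reducer keeps one record per key.
public class DedupSketch {

    public static class KeyMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(",", 2);
            ctx.write(new Text(fields[0]), line);   // key = first column
        }
    }

    public static class FirstWinsReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> records, Context ctx)
                throws IOException, InterruptedException {
            // Emit only the first record seen for each key; the rest are duplicates.
            for (Text record : records) {
                ctx.write(key, record);
                break;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "dedup-sketch");
        job.setJarByClass(DedupSketch.class);
        job.setMapperClass(KeyMapper.class);
        job.setReducerClass(FirstWinsReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```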
4. Big Data Project Management and Governance
While most of the early big data projects are free of explicit project management structure, this will
surely change as they become part of the bigger system. With that change, companies will need to wrap
standards and procedures around these projects just as they have with data management in the past.
Talend provides a complete set of functions for project management. With Talend, the ability to
schedule, monitor and deploy any big data job is included as well as a common repository so that
developers can collaborate on and share project metadata and artifacts.
2.2 Elevator Pitch
Talend Big Data is a powerful and versatile open source solution for big data integration that delivers integration at
any scale, both technically and economically, enabling profound change throughout businesses by allowing them to
unlock the power of their data.
Talend provides an easy-to-use graphical development environment that allows for interaction with big data sources
and targets without the need to learn and write complicated code or learn complex MapReduce techniques. Talend's
big data components have been tested and certified to work with leading big data Hadoop distributions, including
Amazon EMR (Elastic MapReduce), Cloudera, Google, Greenplum/Pivotal, Hortonworks and MapR. Talend provides
out-of-the-box support for a range of big data platforms from the leading appliance vendors including Greenplum,
Netezza, Teradata, and Vertica.
Talend simplifies the development of big data and facilitates the organization and orchestration required by these
projects so that you can focus on the key question… “What use should we make of data, big and small, and how am I
going to be the leader in using data to help my business?” Talend provides distinct value to your technical teams that are tasked with big data implementation.
2.3 Key Benefits
Talend Big Data benefits and capabilities include:
Flexible. Deliver solutions for all your needs.
• Provides comprehensive big data support. Native support for Hadoop HDFS, HBase, Hive, Pig, Sqoop and BigQuery. Certified for all major Hadoop-based distributions – Amazon EMR, Cloudera, Hortonworks, MapR, Greenplum. Comprehensive support for NoSQL. Talend provides the necessary big data functions and extends this with over 450 components that allow for integration with nearly any application, warehouse or database. Additionally, you can deploy big data jobs as a service, a self-contained executable or a scheduled task.
• Provides easy-to-use, graphical code generating tools that simplify big data integration without writing or maintaining complex big data code. Reduces time-to-market with drag-and-drop creation and configuration, prebuilt packages and documented examples based on real-world experiences.
• Talend Big Data is the only solution to natively generate MapReduce, Pig and HiveQL code.
• Deploy into production with confidence – rely on enterprise support and services from Talend.
• The world’s first E-L-T mapping tool for Hive – move data from Hive to Hive (this is visionary).
Scalable. Minimize disruption as your business volume increases.
• Rapidly deploy big data jobs on Hadoop. Faster performance through Talend big data code generation (i.e. MapReduce, Pig, Hive) that is optimized for these massively parallel processing (MPP) environments, because the data job is 100% Hadoop native.
• Big data quality jobs can be run in parallel in Hadoop.
• Requires no installation (zero install on the Hadoop cluster), no performance overhead, and is easy to manage.
Open. Improve your productivity as Talend Big Data provides:
• A large collaborative community. As of January 2013 there have been over 45,000 downloads of Talend Big Data (June 2012 to January 2013).
• Software created through open standards and development processes, eliminating vendor lock-in. Talend Big Data is powered by the most widely used open source projects in the Apache community. We certify with Hortonworks, Cloudera and MapR, but we can work with any Hadoop platform and with the applications on these platforms, because we don’t have any special or proprietary code.
• Ready to start today. Talend Open Studio for Big Data is free to download and use for as long as you want. No budget battles or endless delays – just accessible, reliable open source integration, starting today.
• Administer and manage even the most complex teams and projects whose members have different roles and responsibilities.
• A proven lower TCO based on Talend customer successes.
• The Talend development studio increases developer productivity with a graphical environment that allows developers to drag, drop and configure components to implement big data projects in minutes, not days or weeks. Talend also provides a shared repository so developers can check in/check out metadata and big data artifacts and reuse expertise across the project.
3. Talend Big Data Products
Talend provides three big data products:
1. Talend Open Studio for Big Data combines big data technologies into a unified open source environment, simplifying the loading, extraction, transformation and processing of large and diverse data sets.
o There are a few differences between this and Talend Open Studio for Data Integration. First, TOS4BD includes big data components for Hadoop (HDFS), HBase, HCatalog, Hive, Sqoop and Pig. Second, it does not include functions for context, metadata manager, business modeler and documentation.
2. Talend Enterprise Big Data is the subscription license version of Open Studio with Gold support and many advanced features, including versioning and a shared repository.
o A customer will upgrade to this product for all the same reasons one would upgrade from any open source product to Talend commercial products. Currently, most big data projects do not employ much project management. Talend provides these functions in our commercial offering.
3. Talend Platform for Big Data extends the Talend Enterprise Big Data product with all of our data quality features, for example big data profiling and big data matching, and support for non-big-data features of data integration and data quality. It also provides advanced scalability options and platinum support.
A complete feature/benefit comparison matrix between Talend’s Big Data products is at:
http://www.talend.com/products/big-data/matrix
Many of your opportunities will be upselling TOS users to Talend
Enterprise Big Data or Talend Platform for Big Data. In addition to
sharing the detailed matrix to highlight differences, this upselling
guide shows the salient points.
4. How to Detect/ Create/ Qualify Opportunities
There are three distinct audiences in the Talend Big Data sales opportunity: the CTO, the big data developer and the
data scientist. The majority of people exploring big data are exactly that, explorers. They can see the value but may
not be able to see how to get to where they need to be. They are all looking for the “right” use case for their
organization. While they all share these common concerns each has some individual characteristics worth noting.
CXO, Line of Business
• CIO, CTO
• Director/VP of IT
• Director/VP of Application Architecture
• Director of Systems Architecture
The CTO is looking for resources and a use case for big data. They can see the value and are looking to implement quickly to set differentiation and establish competitive advantage in their marketplace. With Talend the learning curve for big data is shortened, so they can use existing resources to get started on big data projects today. The use cases are still building, but the early winners are analytics and storage. For the CTO we need to ask them “why big data?”
Development
• Data Stewards: working on quality assurance, security and metadata management projects
• Developers using Java, Hive, SQL, Pig, MapReduce, Python or other Apache big data tools
Big data technologies are new and fairly complex. There are not many resources familiar with them and many are trying to learn them quickly. With Talend, big data technologies are simplified or abstracted into intuitive, graphical components that generate the complex code. This eliminates the need to learn the complexities and shortens the learning curve.
Data Scientist
• Data Scientist / Architect: statistics, SQL and Hive programming, scripting language skills, data mining and analysis, business analysis and industry expertise
The Data Scientist has become the critical link between big data and business value, as they are tasked with analyzing a business problem and deriving a solution that leverages the available data. They are nowhere without the data. Talend simplifies big data technologies so that they can focus on the task at hand, the analysis. It is the critical link to supply data to a BI tool without the complexities of coding the interface.
Big data is an emerging space. The following use cases, taken from industry sources, illustrate the types of situations in which big data is being used today.
4.1 Qualification Questions
To qualify an opportunity, you will need to collect information at all stages of the sales cycle. The list below could be
applied to any opportunity, but is especially valid for big data opportunities that are usually bigger and more strategic:
• What is the business issue your customer is trying to solve?
• Do they have a compelling event (i.e. a reason for choosing a date for the application to be in production)?
• What is your customer's budget?
• Who is the key decision maker? Who are the influencers?
• What is the customer's buying process?
• What are the customer's decision-making criteria?
• What data integration vendor solution do they use today for ETL?
In addition to those generic questions, you could use the following matrix to identify information more specific to big
data opportunities:
BIG DATA NEED
• Does the prospect have a big data business case identified?
• Do they have a problem that can be resolved by big data but are not aware of a solution?
BIG DATA MATURITY
• Does the prospect understand the issues of big data – volume, variety, velocity, complexity?
• Do you or your customer have a strong data architect or data scientist who is leading the project?
• Are they planning to use Hadoop, NoSQL, or a big data appliance?
• Is there a partner with significant big data expertise involved? Perhaps Cloudera or Greenplum?
VOLUME
• What type and how much data do they have, or will they have in the future? (e.g. terabytes of unstructured data)
• Is the data corporate, social media, HTML, or heavily text-based?
QUALITY
• Do they have a data quality practice today?
• Have they considered the impact of their big data project on data quality? How do you apply data quality to masses of data? Is trending more important than absolutely perfect underlying data?
• Are they looking to standardize their data? Do they augment data with third-party sources?
GOVERNANCE
• Do they have manual processes that could benefit from automation?
• Do they have a data governance team?
REAL-TIME
• Big data is many things, but providing real-time access to data can be tricky [1]. Has the customer considered whether they require real-time data access and how they will provide real-time access to Hadoop?
[1] http://www.odbms.org/blog/2012/09/hadoop-and-nosql-interview-with-j-chris-anderson/
5. Pricing
Please contact your Talend Partner manager for more details.
6. Market Overview
6.1 What is the market – definition, size and segmentation?
Big data allows organizations to process data they never thought possible and increase the speed at which they analyze and improve immense amounts of data in order to establish differentiation. With big data, processes are improved and critical decisions can be made with more information. It is changing entire markets and enabling solutions to challenges that had not even been thought of in the past.
According to Gartner, in 2013, big data is forecast to drive $34 billion of IT spending and will total $232 billion of IT
spending by 2016. Big data currently has the most significant impact in social network analysis and content analytics
with 45 percent of new spending each year. In traditional IT supplier markets, application infrastructure and
middleware is most affected (10 percent of new spending each year is influenced by big data in some way) when
compared with storage software, database management systems, data integration/quality, business intelligence or supply chain management (SCM). [2]
It is important to note that this current research estimates revenues across some existing companies. This is the nature of a nascent space; the estimates are a bit unreliable, but they are still substantial. The next-generation data warehouse companies are listed here, and it can be argued that they are not true “big data” companies… yet. Further insight into the current numbers can be seen in the following breakdown, which mixes multiple segments. As defined by Wikibon, the big data market includes the following technologies, tools, and services:
• Hadoop distributions, software, subprojects and related hardware;
• Next-generation data warehouses and related hardware;
• Big data analytic platforms and applications;
• Business intelligence, data mining and data visualization platforms and applications as applied to big data;
• Data integration platforms and tools as applied to big data;
• Big data support, training, and professional services.
[2] http://www.gartner.com/newsroom/id/2200815
7. Competitive Intelligence
7.1 Key Differentiators And Important Questions to Ask Your Customer
Talend Big Data has several key differentiators as outlined in this playbook, but here are those that are most
significant:
1. Cluster Scale vs. Non-native Engines
Talend holds a unique and technically superior position in that our software generates 100% native Hadoop data jobs. This means that our data jobs can scale infinitely. Processing massive data sets is the cornerstone of big data, and meeting this need comes down to either:
• Hand-coding, which means higher developer fees and maintenance costs, or
• Using Talend – re-use existing skills and reduce maintainability issues, as Talend tools abstract the underlying complexity.
Our competitors require that their engine, which was written for a different environment, be installed, configured and managed on an ongoing basis simply to run a data job on Hadoop. This is not MapReduce, and it does not scale, either technically or economically.
• How many engines do you need to install?
• What pricing model is applied?
• What if I need to scale out the cluster if my data grows in the next 5 years?
With Talend we do not require the customer to install any special Talend software. A major proof point is the fact that
Hortonworks Data Platform embeds and promotes Talend Open Studio for Big Data as their preferred solution for big
data integration. Our code generation also delivers the following immediate benefits to our customers:
• Setup fees are zero. Talend does not require any pre-installed Talend software on either Hadoop or NoSQL platforms.
• Upgrade and maintenance costs are largely non-existent.
• Our big data jobs are 100% compatible and ready to run, scale and process just about any size of data, all within the cluster.
2. Predictable Costs vs. Runtime Based Pricing
Unlike competitors, Talend does not charge for runtime or nodes for our big data solutions. We also include all connectors to both enterprise data sources and, of course, the big data platforms within the same fee. We do not apply additional charges based on where the software is hosted – you can run on or off the big data platform for the same
fee.
• How many nodes do you expect in your big data cluster? How much will it grow every year? Does your budget include a perpetual increase of the license cost (as data usually increases every year)?
• Have you considered the costs for each connector for the products you are evaluating? Talend's fee is all-inclusive of connectors. Do you expect to run some jobs inside the cluster and some outside? Have you considered the additional server charges for this setup?
• Do you intend to use the data quality features, and how do other vendors charge for this additional functionality?
3. Unified Platform vs. Patchwork of Incomplete Solutions
Talend provides an enterprise integration solution with a complete set of the required functions for successful
implementation. There is no extra charge for various connectors or integration and cleansing functions.
• Have you considered all the technologies you will need for this project? Do you have resources with expertise in all the required functions? How many will you need to learn? Can you share these resources across the products? How well do they fit together? Are they all upgraded at the same time?
• What is the Total Cost of Ownership (TCO) for your integration project? Have you considered all of the licenses you will need for Big Data, DI, DQ, MDM, ESB and BPM? Are they provided as one license, by the same vendor? On the same platform? Do the other vendors provide a common repository to share project metadata and artifacts?
4. Open Source and Extensible vs. Black Box Proprietary Solution
Talend Big Data is extensible and open source. This is not black-box proprietary software. You can open it up,
investigate and extend it as necessary. No other vendor provides this.
• Can you customize the other big data solutions you are considering? If there is a function that you need but do not have, can you extend the Studio to include it without waiting for the vendor to make an update? Does the other solution have a community of developers who create extensions to the solution and share them with each other? Our partners are also providing connectors for their NoSQL solutions.
• How can you determine whether an issue you have is a bug or an implementation problem? If it is a bug, do you have access to track the fix and install a patch as soon as it is available? Can you investigate the software and possibly fix it yourself? Do the other vendors have a vital and active community that provides another level of free support?
5. Big Data Management vs. Simple Hadoop Connectivity
Talend Big Data goes beyond data integration by delivering a real killer app for big data: big data profiling. As identified by our customers, big data presents a great opportunity but also a major challenge. All financial companies are required to meet data compliance and governance obligations. For example, in the US, banks are required to assert that all customers have well-formatted social security numbers and that no duplicate entries exist. With big data we can upload as much data as we have into Hadoop, but how do we audit the quality of the data when regulators come knocking? That is where Talend big data quality (as part of the Talend Platform for Big Data) comes in.
Today we offer:
• Profiling of Hive and the ability to run analyses remotely on Hive databases, leveraging the processing power of the cluster.
• Matching in Hadoop – matching is one of the most computationally intensive functions of DQ, therefore running it on Hadoop is not only desirable, it is simply mandatory, because:
1. The data is already loaded, with no option to download and process it elsewhere.
2. The cluster's processing power is required to tackle such a large dataset, therefore you must run on the cluster.
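As a hedged illustration of the profiling idea (not Talend's actual profiling engine), the sketch below runs two HiveQL checks over a hypothetical customers table through Hive's JDBC driver: one counts social security numbers that do not match the NNN-NN-NNNN pattern, the other lists duplicated values. Host, table and column names are assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Minimal sketch: profile a (hypothetical) customers table in Hive over JDBC.
// Host, table and column names are placeholders; a real deployment would also
// need authentication settings.
public class HiveSsnProfileSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");   // HiveServer2 driver
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver.example.com:10000/default", "", "");
        Statement stmt = conn.createStatement();

        // Rows whose SSN is not in the NNN-NN-NNNN format.
        ResultSet badFormat = stmt.executeQuery(
                "SELECT COUNT(*) FROM customers "
                + "WHERE ssn NOT RLIKE '^[0-9]{3}-[0-9]{2}-[0-9]{4}$'");
        if (badFormat.next()) {
            System.out.println("Malformed SSNs: " + badFormat.getLong(1));
        }

        // SSN values that appear more than once.
        ResultSet dupes = stmt.executeQuery(
                "SELECT ssn, COUNT(*) c FROM customers GROUP BY ssn HAVING COUNT(*) > 1");
        while (dupes.next()) {
            System.out.println("Duplicate SSN " + dupes.getString(1) + " x" + dupes.getLong(2));
        }
        conn.close();
    }
}
```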
8. Customer Case Studies
Customer case studies are made available through Talend.com.
9. Partners
9.1 Summary
Talend has a broad ecosystem of big data partners which significantly benefits our customers as our products are built
to run in big data environments. We also have many established partners using our technology who can assist with
implementation and services if needed. The current (April 2013) list of partners includes the following:
Hadoop Distribution Partners
Talend supports all the common Hadoop distributions, across:
• Amazon EMR
• Apache
• Cloudera
• EMC/Greenplum
• Hortonworks
• MapR
Big Data Appliance/Cloud Partners
• Greenplum/Pivotal
• Netezza
• Vertica
• Teradata
• Google Platform
NoSQL Partners
• DataStax/Cassandra*
• 10Gen/MongoDB*
• Couchbase*
• Redis
• Riak
• HBase*
• Membase
• Neo4J*
* indicates that Talend supported components are available
10. Glossary/Background
Big data is defined by countless new terms and technologies. Below is a small set of terms that are used within this
document.
• Cassandra – Apache Cassandra is an OSS distributed database. It is a key/value store that provides its own query language and can be tuned to optimize huge commodity server farms. It was originally developed by Facebook to power its inbox search system.
• Hadoop – Hadoop was born because existing approaches were inadequate to process huge amounts of data. Hadoop was built to address the challenge of indexing the entire World Wide Web every day. Google described a paradigm called MapReduce in 2004; Hadoop began in 2005 as an open source implementation of MapReduce, with Yahoo! as an early major contributor, and grew into a top-level open source project. While it may have started as a MapReduce implementation, it has extended well beyond this and has transformed into a massive operating system for distributed parallel processing of huge amounts of data. MapReduce was the first way to use this operating system, but it has been joined by many other techniques, such as the Apache Hive and Pig open source projects, that make Hadoop easier to use for particular purposes. Much like any other operating system, Hadoop has the basic constructs needed to perform computing: it has a file system, a language to write programs, a way of managing the distribution of those programs over a distributed cluster, and a way of accepting the results of those programs, ultimately combining them back into one result set.
• HBase – HBase is an OSS database that runs on top of the Hadoop filesystem (HDFS). It is a columnar database that provides fault-tolerant storage and quick access to large quantities of sparse data. It is used by Facebook to serve parts of its messaging system.
• HCatalog – Largely developed by Hortonworks and now part of Apache Hadoop, HCatalog addresses the need for metadata to describe the structure of the underlying data stored in Hadoop. This makes the development and maintenance of big data applications more efficient.
• Hive – Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, ad-hoc query, and analysis of large datasets. It provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. It was originally developed by Facebook but is used in production by many companies.
• MapReduce – MapReduce is a software framework introduced by Google in 2004. It allows a programmer to express a transformation of data that can be executed on a cluster that may include thousands of computers operating in parallel. At its core, it uses a series of “maps” to divide a problem across multiple parallel servers and then uses a “reduce” to consolidate responses from each map and identify an answer to the original problem.
• NoSQL – (Not only SQL) this refers to a large class of data storage mechanisms that differ significantly from the well-known, traditional relational data stores (RDBMS). These technologies implement their own query language and are typically built on advanced programming structures for key/value relationships, defined objects, tabular methods or tuples. The term is often used to describe the wide range of data stores classified as big data.
• Pig – the Apache Pig project is a high-level data-flow programming language and execution framework for creating MapReduce programs used with Hadoop. The abstract language for this platform is called Pig Latin, and it abstracts the programming into a notation which makes MapReduce programming similar to that of SQL for RDBMS systems. Pig Latin can be extended using UDFs (User Defined Functions), which the user can write in Java and then call directly from the language. Talend takes advantage of UDFs extensively.
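To illustrate the UDF mechanism, a minimal Pig UDF written in Java might look like the sketch below; the function name and behavior are invented for the example. Once the jar is registered, the function can be called from Pig Latin like any built-in.

```java
package myudfs;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Example UDF: trims and upper-cases a single chararray field.
// Hypothetical Pig Latin usage:
//   REGISTER myudfs.jar;
//   clean = FOREACH logs GENERATE myudfs.Normalize(page);
public class Normalize extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```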