Data Mining

Scratching the surface

And now you have…

Datameer

RapidMiner

Windows Azure Marketplace

by Prateek Burman

Datameer

• Integrate, Analyze, Visualize

• Scalable

• Secured access

• Excel-like interface

• Targeted at Hadoop users

• Around since 2009

Integration

• Oracle, DB2, MS SQL, MySQL

• Teradata, Greenplum

• XML, JSON, CSV

• HBase, Cassandra

Datameer cont’d.

• Twitter, Facebook, LinkedIn

• Email

• Log files

• SaaS – CRM, GitHub, JIRA

Analytics

• Time series analysis

• Clustering

• Decision trees

• Built-in Recommendation engine

• Column Dependencies

• Predictive analysis with R, PMML
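
As a rough illustration of the clustering item above, here is a minimal k-means sketch in plain Python. It is not Datameer's implementation. The function name `kmeans_1d`, the sample values, and the starting centroids are all assumptions made for this example.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (Lloyd's algorithm)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical sample data with two obvious groups.
centroids, clusters = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
```

The two returned centroids settle on the means of the low and the high group. Point-and-click tools hide this assign-then-update loop behind a single menu entry.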

Datameer cont’d.

Visualization

• Graphs

• Maps

• Shapes

• Tables

• Dashboard

• HTML5

• Visualization apps from the app market

RapidMiner – formerly Yet Another Learning Environment (YALE)

• Around since 2001

• Open source – older versions only

• Client/server model, with the server offered as SaaS

• Widely used for data analytics

• GUI based – no need to write code

• Predictive analysis

• Text mining

• Sentiment analysis

• Direct Marketing

• Predictive Maintenance

RapidMiner cont’d…

• LabVIEW-style layout

• No coding – minimal likelihood of error

• One operator's output is another operator’s input

• Only structured datasets

• 3D graphics & Interactive dashboards
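
The operator chaining described above, where one operator's output is another operator's input, can be sketched in a few lines of Python. This is only an analogy for the drag-and-drop process canvas, not RapidMiner's API. The operator functions and the sample rows are hypothetical.

```python
from functools import reduce


def run_process(operators, data):
    """Feed data through each operator in order; each output becomes the next input."""
    return reduce(lambda current, op: op(current), operators, data)


# Hypothetical operators, standing in for boxes wired together on the canvas.
def drop_missing(rows):
    """Remove rows whose 'amount' is missing."""
    return [r for r in rows if r.get("amount") is not None]


def add_flag(rows):
    """Add a boolean 'large' attribute to every row."""
    return [{**r, "large": r["amount"] > 100} for r in rows]


def count_large(rows):
    """Reduce the table to a single number: how many rows are flagged."""
    return sum(1 for r in rows if r["large"])


rows = [{"amount": 50}, {"amount": 250}, {"amount": None}, {"amount": 120}]
result = run_process([drop_missing, add_flag, count_large], rows)
```

Reordering the list changes the process without touching any operator, which is roughly what rewiring boxes in the GUI does.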

Windows Azure Marketplace

• Launched in 2010

• Hundreds of apps

• Thousands of subscriptions

• Trillions of data points

• Scalable – load balanced

• No need to move data

Data

• GitHub/svn of data

• Point of discoverability

• Clean, ready-to-use data

• An economic model for broad access

• OData standard

• Excel, SQL Server, Office

• Delivered through RESTful web service access
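
To make the OData and REST items above concrete, here is a minimal sketch that builds an OData query URL with Python's standard library. The `$filter`, `$top` and `$format` system query options are part of OData. The service root, the `Cities` entity set, the `Population` property, and the helper name `odata_url` are assumptions for illustration, not a real marketplace feed.

```python
from urllib.parse import quote, urlencode


def odata_url(service_root, entity_set, **options):
    """Build an OData query URL; keyword names map to $-prefixed system query options."""
    query = {f"${name}": value for name, value in options.items()}
    # Use quote (not quote_plus) so spaces become %20, and keep "$" readable.
    encoded = urlencode(query, quote_via=quote, safe="$")
    return f"{service_root.rstrip('/')}/{entity_set}?{encoded}"


url = odata_url(
    "https://example.com/odata",      # hypothetical service root
    "Cities",                         # hypothetical entity set
    filter="Population gt 1000000",   # hypothetical property name
    top=5,
    format="json",
)
print(url)
```

Because the query is just a URL, any HTTP client can request it, which is why OData feeds are reachable from many different tools.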

Marketplaces

• Infochimps

• Factual

• Datamarket

• Gnip

• Datasift

• Kasabi

R

• Cutting-edge algorithms

• Learning curve

• Need to import data

• Slow

Datameer

• Point & click

• Excel-like interface

• Extensible to R, Python etc.

• Need to import data

• Supports many Hadoop distributions

• Optimized for Hadoop

• Business infographics & dashboards

• HTML5 – view anywhere

RapidMiner

• Intuitive

• Can execute R scripts

• Can be extended using Java or Ruby scripts

• Pretty graphics

• Need to import data

• Cron scheduler

Azure Marketplace

• Known tools like Excel

• Data readily available

• Cleaner data

• Other Windows services

Q & A
