https://in-database.com/in-database-processing/

In-Database Processing

In-Database Processing means running data management workloads inside the production data warehouse instead of on separate workstations. It is popular for improving analytic efficiency and is often called In-Database Analytics. In fact, In-Database Processing has been around for a long time, and In-Database Analytics is only its crown: the visible tip of an in-database iceberg. Supporting that tip are vast, largely hidden data management and data mining processes. Data management and data mining represent about 75% of In-Database Processing, and analytics about 25%.

At In-DataBase Pioneers, our passion and muscle are Data Mining and Ad Hoc Analytics. By implementing In-Database Processing, we reduce work effort from hours to seconds. Gone are the data transfers across the network. Gone are the complex joins and aggregations on inefficient workstations. Instead, the data warehouse is used for what it does best: managing data. Analytical Data Sets are created at the source, in seconds rather than hours. This part of In-Database Processing, the hidden part of the iceberg, is where business analytics gains the most.

In-DataBase Pioneers takes In-Database Processing to the next level. We remove the requirement for business analysts to manage large data volumes or complex data structures. In-DataBase Pioneers bridges the gap between business analysts and the data management processes, removing 75% of the work and generating ad hoc analytics in seconds instead of hours, so analysts can focus on analytics.

In-Database Analytics

Preparing data for analytics (data prep, or data mining) takes about 75% of the entire analytics process. This is where joins, sorts and aggregations are done, often across several hundred million data rows, which significantly increases the time required to generate analytics.
So, if an analytics process takes eight hours, six hours are spent preparing the data and only two hours are spent on actual analytics. The analytics, the remaining 25%, is appropriately called In-Database Analytics when it is integrated into the data warehouse.

Institutionalizing the data mining and analytic processes in the data warehouse brings several benefits. First, depending on the database, a two-hour data mining and analytic process may reduce to under one minute. Second, the data mining and analytic processes become completely repeatable, with the same results on every run. Third, up to 85% of the data mining and analytic processes might be institutionalized with no additional hardware or software requirements, and up to 100% when the analytics are simple statistics.

Data Mining

At In-DataBase Pioneers, Data Mining and Ad Hoc Analytics are our passion and our muscle. If Ad Hoc Analytics is the crown, then Data Mining is the brick-solid foundation of everything we do.

Simply put, Data Prep or Data Mining, recently called “data janitor work”, is the data management process of creating Analytical Data Sets and Ad Hoc Analytics.

“far too much handcrafted work – what data scientists call ‘data wrangling’, ‘data munging’, ‘data janitor work’ – is required … spending from 50% to 80% of time mired in this mundane labor of collecting & preparing unruly data….” – New York Times, Aug 2014

The Analytical Data Set (ADS) is the final data set used to conduct analytics. Creating the ADS is where the “heavy lifting” is done (joins, sorts, merges, aggregations, etc.), and it can include hundreds of millions of data rows. Seasoned data miners consider data prep challenging and rewarding, not janitor work! In fact, Data Mining is when new data relationships and data transformations are most often discovered or created. It is also when data scientists develop analytical approaches to answering the business question.
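As a rough illustration of that heavy lifting done inside the database, the sketch below joins and aggregates two tables with SQL and returns only the summary rows to the client. Everything in it is an assumption made for illustration: Python's built-in sqlite3 module stands in for the production data warehouse, and the `customers` and `orders` tables, their columns, and the `build_ads` helper are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, segment TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'retail'), (2, 'retail'), (3, 'wholesale');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0),
                              (12, 2, 50.0), (13, 3, 400.0);
""")


def build_ads(conn):
    """Join and aggregate inside the database; fetch only the summary rows."""
    query = """
        SELECT c.segment,
               COUNT(DISTINCT c.customer_id) AS customers,
               COUNT(o.order_id)             AS orders,
               SUM(o.amount)                 AS revenue
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.segment
        ORDER BY c.segment
    """
    return conn.execute(query).fetchall()


ads = build_ads(conn)
for segment, customers, orders, revenue in ads:
    print(segment, customers, orders, revenue)
```

Only one row per segment crosses the connection, however many order rows the database scans. On a real warehouse the same idea applies, but connection details and SQL dialect will differ.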
Many business analysts don’t engage in data prep simply because they lack the data management skills. Instead, they depend on IT or reporting tools to provide data dumps, which increases the risk of errors in analytics through bad joins or improper aggregations. These errors are particularly damaging because they are buried in undocumented desktop code. And searching the analytics for these errors is like searching the tip of an iceberg for navigational danger.

Data prep can take 75% of the analytics process, which significantly increases the time required to generate analytics. For example, if the complete analytics process takes eight hours, six hours are spent preparing the data and only two hours are spent on actual analysis.

Consequently, Data Mining is where we see the biggest gains from In-Database Processing: the low-hanging fruit. The larger the data volumes and the more complex the data structures, the bigger the gains. And the brilliant part is that the entire Data Mining process can often be institutionalized with no additional hardware or software expense.

Contact Thomas today to explore how In-DataBase Pioneers might reduce your data mining and analytics processing from hours to seconds.

Ad Hoc Analytics

To remain competitive, Marketers rely heavily on custom Ad Hoc Analytics, such as Consumer Behavior and Retention Analytics. Ad Hoc Analytics might require drilling deeper into existing analytics or creating new deep-dive analytics, and it is routinely attempted using in-house reporting tools. Reporting tools typically fail at Ad Hoc Analytics when the data is out of reach or the data management is too complex. Then the business analyst must create, from scratch, a unique Analytical Data Set (ADS), and only after creating the ADS can the actual “analytics” begin. A common hurdle, however, is that business analysts aren’t trained to manage complex data structures or disparate data sources, and risk creating errors.
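One common form of the bad join mentioned above is a fan-out: joining to a table with several rows per key silently duplicates rows and inflates totals. The sketch below shows the effect and a simple row-count check that catches it. It is illustrative only: sqlite3 stands in for the warehouse, and the `orders` and `contacts` tables and the `join_is_safe` helper are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, amount REAL);
    CREATE TABLE contacts (customer_id INTEGER, email TEXT);
    INSERT INTO orders VALUES (1, 100, 50.0), (2, 200, 70.0);
    -- Customer 100 has two contact rows, so joining on customer_id fans out.
    INSERT INTO contacts VALUES (100, 'a@example.com'),
                                (100, 'b@example.com'),
                                (200, 'c@example.com');
""")

JOINED = """
    FROM orders AS o
    JOIN contacts AS c ON c.customer_id = o.customer_id
"""

# Total computed on the orders table alone.
true_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Total computed after the join: order 1 is counted once per contact row.
joined_total = conn.execute("SELECT SUM(o.amount)" + JOINED).fetchone()[0]


def join_is_safe(conn):
    """Return True when the join leaves the number of order rows unchanged."""
    before = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    after = conn.execute("SELECT COUNT(*)" + JOINED).fetchone()[0]
    return before == after


print(true_total, joined_total, join_is_safe(conn))
```

Comparing row counts before and after each join is a cheap validation step that can live alongside the ADS definition rather than in someone's desktop code.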
“Fewer than one in four firms collect customer data like demographics or buying habits, and among those that do, more than half lack adequate staff to access customer data and generate business insights.” – Barb Brynstad, Loyalty360, August 2014

What makes errors in Analytical Data Sets particularly damaging is that they are hidden, buried in improvised desktop code. Examining the resulting analytics for these errors is like searching the visible part of an iceberg for navigational danger. The errors are too deeply hidden.

Data Mining and Ad Hoc Analytics

“The lack of good data is the most common barrier to organisations seeking to employ predictive analytics.” – Tom Davenport, Sep 2014

“It is not true that companies need good data to use predictive analytics … The techniques can be robust in the face of terrible data, because they were invented by people who had terrible data,” – James Taylor, September 2014

At In-DataBase Pioneers, we solve your Ad Hoc Analytics challenges, while understanding that you have limited resources. Our passion, our muscle and our core competencies are deep-dive Data Mining and Ad Hoc Analytics. We work closely with Marketers to develop Marketing Directives. With directives in hand, we identify, develop and validate data interactions and data relationships. Through transformations, joins and aggregations, we build the Analytical Data Sets; that’s 75% of the analytics process. We work collaboratively with Marketers to ensure the ADS is completely error-free. With a well-built, brick-solid ADS focused on Marketing’s directives, the actual analytics become much, much easier.

“It’s an absolute myth that you can send an algorithm over raw data and have insights pop up…” – The New York Times, August 18, 2014

Finally, if you plan to repeat the analytics, then it’s best to build the Analytical Data Set once and only once, as an institutionalized ADS.
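One possible way to build an ADS once and reuse it is to define it as a named object inside the database, for example a view. The sketch below is an assumption-laden illustration, not a description of any particular warehouse: sqlite3 stands in for the warehouse, and the `purchases` table, the `ads_customer_activity` view, the `retained_customers` helper, and its simplified definition of retention are all hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (customer_id INTEGER, purchase_year INTEGER);
    INSERT INTO purchases VALUES (1, 2013), (1, 2014), (2, 2013), (3, 2014);

    -- Define the ADS once, inside the database, as a named view.
    CREATE VIEW ads_customer_activity AS
        SELECT customer_id,
               COUNT(*)           AS purchases,
               MIN(purchase_year) AS first_year,
               MAX(purchase_year) AS last_year
        FROM purchases
        GROUP BY customer_id;
""")


def retained_customers(conn, year):
    """Customers who first bought before `year` and last bought in `year`.

    This is a deliberately simplified, hypothetical definition of retention.
    """
    rows = conn.execute(
        "SELECT customer_id FROM ads_customer_activity "
        "WHERE first_year < ? AND last_year = ? ORDER BY customer_id",
        (year, year),
    ).fetchall()
    return [row[0] for row in rows]


# Two different analyses reuse the same ADS without rebuilding it.
customer_count = conn.execute(
    "SELECT COUNT(*) FROM ads_customer_activity"
).fetchone()[0]
print(customer_count, retained_customers(conn, 2014))
```

A plain view is recomputed on every query. For large volumes, a table refreshed on a schedule (or a materialized view, where the database supports one) may be a better fit. That choice depends on the warehouse.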
By collaborating with Marketing and Technology, we develop Analytical Data Sets directly inside your production data warehouse so they’re available to your analysts 24/7. Generally, up to 85% of the entire Ad Hoc Analytics process can be institutionalized without any additional software or hardware expenses. Marketers control the Marketing Technology!

So if you do plan to repeat the analytics, don’t repeat the hours of rebuilding the same Analytical Data Sets when they can be available within seconds, 24/7!

Contact In-DataBase Pioneers today to explore how we might reduce your data mining and analytics processing from hours to seconds!