White Book of Big Data

Since mid-2011, Big Data has been used across industries to maximize business benefit from a variety of data sources. The 3V model defines Big Data by volume, velocity and variety. For semi-structured data, the idea of ‘Linked Data’ cross-references different pieces of information in a loose web, which can improve the quality and accuracy of queries. The book also concentrates on a fourth V, value. Compared to traditional business analytics systems, a Big Data solution generates information in ‘real time’, so an organization can respond more quickly to market trends. The improved business insights help an organization see patterns in its products or customers. The key functions in the structure of a Big Data solution are data integration, data storage, search and data visualization, which together deliver useful insights and help a company make better decisions.

The 3V model defines what Big Data is, and velocity matters in particular: data speed lets business decision-makers and business partners see product information in real time. In addition, the fourth V, value, helps an organization understand the business value of its data. Business insights keep track of products or customers toward long-term goals by using data analysis systems and concepts. Compared to traditional data warehousing and BI systems, a Big Data solution requires IT departments to provide platforms that can deliver quick answers to various teams based on their issues and challenges. Many organizations need to be aware that current data warehousing and BI systems operate only on structured data. However, there are three basic data types: structured, unstructured and semi-structured. Big Data isn’t just search, because search can’t handle velocity.
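The three basic data types named above can be illustrated with a short Python sketch. The sample values, field names and variable names are all made up for illustration and do not come from the book. The sketch only shows how differently each type has to be handled.

```python
import csv
import io
import json

# Structured data: every row has the same fixed columns (hypothetical sample).
structured_text = "customer_id,age,country\n101,34,DE\n102,27,US\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured data: self-describing, but fields may be nested or missing.
semi_structured_text = (
    '{"customer_id": 101, "tags": ["loyal"], "address": {"city": "Berlin"}}'
)
record = json.loads(semi_structured_text)

# Unstructured data: free text with no schema; here we only split it into words.
unstructured_text = "Great product, but delivery to Berlin took two weeks."
words = unstructured_text.lower().replace(",", "").replace(".", "").split()

print(rows[0]["age"])             # column lookup by name
print(record["address"]["city"])  # nested field lookup
print("berlin" in words)          # simple keyword check on free text
```

A traditional data warehouse would typically load only the first kind directly, while the other two need parsing or text processing before they can be analyzed.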
Based on the four-V definition of Big Data, the structure of a Big Data solution has four key processes: data integration, platform infrastructure, data access interface and visualization. What is new in a Big Data solution is the data storage function, because the data can be processed and analyzed in near real time. Data visualization makes it easier for business people to understand the story behind massive data. On the other hand, data privacy becomes more important as access to data grows easier.

The Eight-Fold Path of Data Science

The emerging community of data scientists summarizes the practical applications of data science. Big Data technologies provide a connection between engineers and scientists to find meaningful insights about products or customers. The revolution focuses on connecting smart devices and their data to improve actions and insights, in order to identify issues and prevent damage. Annika Jimenez introduces an eight-fold path for data science projects, consisting of four phases and four differentiating factors. The four phases ultimately deliver a framework for applying a model and taking action, and the four differentiating factors are required in each phase to ensure that the application integrates the model with a business need.

The data scientist role is popular in both education and industry, and people have begun to pay attention to data science. The basic methodology of analytics is data mining, since it underlies data structuring tools through data collection and the connection of various data sources. The first phase of the eight-fold path to a successful data science project is problem formulation. The best problem formulation relates to the company's targets and aims to solve the problem. The second phase is the data step, which builds the data process by connecting different tables. The third phase is the modeling step, which applies the right data selection and data modeling to make predictions from the datasets.
The fourth phase is application, which creates a framework for taking action. The first differentiating factor of the eight-fold path is technology selection: choosing the right tools and the right platform to answer a question. The second differentiating factor is creativity: coming up with a new idea that solves the issue faster and more efficiently. The third differentiating factor is an iterative approach: making sure each phase iterates toward the goals brings great impact to the company. The fourth differentiating factor is building a narrative, which explains clearly the impact of the platform, such as what was done and what changed.

Transforming Your Company into a Data Science-Driven Enterprise

Data science is the core of Big Data, and it introduces new methodologies for analyzing data. Big Data is pushing C-level executives to use platforms to collect, store and analyze data. An efficient enterprise depends on being data science-driven, for example through predictive analytics, rather than merely data-driven. Initiating a data science-powered transformation does not always lead to a good outcome, because it involves some degree of uncertainty and vulnerability. If a data science-driven enterprise has a more thoughtful vision, its business value can increase faster. In the lines of business, a company frequently switches between centralized and decentralized models based on project prioritization.

Compared to Big Data, data science creates new ways to analyze data, such as predictive modeling, machine learning, MapReduce and database algorithms. It is a more efficient methodology for solving problems and predicting trends such as customer needs and product sales. An enterprise that begins to use data science faces the change at multiple levels, such as the C-level, which must consider the strategic shift and its risk. For a data science-driven transformation, there are two extremes: a good outcome and a bad outcome.
If each level of the transformation catalyst follows the enterprise’s vision, the organization creates more business value. In the value chain, data scientists sit in the middle of the influence, and their function is to build models to centralize or decentralize data. Data availability is fundamental to a data science-driven enterprise: it makes it easier to access and query data and to deliver data visualization. To improve effectiveness in the company, the important idea is to choose the right platform for collecting and manipulating data. For advanced analytics, there are two categories of toolkits: “Big Data” toolkits and “Small Data” toolkits. The tools include R, Stata, Matlab, SAS and SPSS. Building a data science team requires people with a solid data science background and the right analytical leadership. The path to operationalization is key to delivering value to the business. The key factors are production, storage, application and reporting. A lack of defined processes and program management can lead to unsuccessful delivery. To deliver wins, instrumentation is key: for example, product managers need to drive and deploy the project requirements.

Data Mining from A to Z

Predictive analytics and data mining have become more important for delivering insight and informing business people so that they can make better decisions. Data mining finds patterns in sales, customers and marketing, and is used to predict future trends. The methodology can help a company detect fraud, minimize risk and increase revenue. The SAS data mining process has five steps. Data mining can be applied in different industries such as finance, marketing and health insurance. The article introduces three SAS tools: SAS Enterprise Miner, SAS Rapid Predictive Modeler and SAS Model Manager. They focus on different features and serve different clients.
Data mining can segment customers by demographics or attitudes, such as age, education and gender, in order to target valuable customers and increase revenue. The first step of the data mining process is to sample the data so that the sample represents the target data sets. The sets include a training set and a validation set. The second step is to explore the data, using techniques such as clustering, classification and regression to search for relationships. The third step is to modify the data to identify the significant variables during the model selection process. The fourth step is to model the data using different analytical techniques depending on the issue. The final step is to assess the data and models for usefulness and reliability.

SAS Enterprise Miner is a great tool for data mining and predictive analytics, and it follows the five steps of the data mining process. SAS Rapid Predictive Modeler can quickly generate predictive models and offers a user-friendly interface for business analysts. SAS Model Manager is a tool for model governance, with functions such as promoting champion models, validating models and monitoring model performance. The first two SAS tools allow business analysts to develop baseline models automatically, and they can also apply data mining and statistical methodology to analyze the data. The last tool, SAS Model Manager, can deliver a performance-monitoring dashboard.
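The five-step process described above (sample, explore, modify, model, assess) can be sketched in plain Python. This is not SAS code and does not reflect how SAS Enterprise Miner works internally. The data set, the function names and the choice of a simple least-squares line as the model are all assumptions made for illustration.

```python
import random
import statistics

# Hypothetical data: (ad_spend, sales) pairs that follow sales = 2 * ad_spend + 1.
records = [(x, 2 * x + 1) for x in range(20)]


def sample(data, train_fraction=0.7, seed=0):
    """Step 1 (sample): shuffle, then split into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def explore(train_set):
    """Step 2 (explore): summarize the variables before modeling."""
    xs = [x for x, _ in train_set]
    ys = [y for _, y in train_set]
    return {"mean_x": statistics.mean(xs), "mean_y": statistics.mean(ys)}


def modify(data, mean_x):
    """Step 3 (modify): center the input variable on the training mean."""
    return [(x - mean_x, y) for x, y in data]


def model(train_set):
    """Step 4 (model): fit y = intercept + slope * x by ordinary least squares."""
    xs = [x for x, _ in train_set]
    ys = [y for _, y in train_set]
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in train_set)
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return mean_y - slope * mean_x, slope


def assess(params, validation_set):
    """Step 5 (assess): mean squared error on the held-out validation set."""
    intercept, slope = params
    errors = [(intercept + slope * x - y) ** 2 for x, y in validation_set]
    return statistics.mean(errors)


train, valid = sample(records)
summary = explore(train)
train_mod = modify(train, summary["mean_x"])
valid_mod = modify(valid, summary["mean_x"])
intercept, slope = model(train_mod)
mse = assess((intercept, slope), valid_mod)
print(f"slope={slope:.3f} validation_mse={mse:.6f}")
```

Because the made-up data lies exactly on a line, the fitted slope recovers the true value and the validation error is essentially zero. With real data, the assess step is where a poor model would show up.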