Running Head: Big Data: An Opportunity or a Disaster

Big Data: An Opportunity or a Disaster
Maikel Hennen
Seattle Pacific University

Abstract

This paper explores the concept of Big Data, defines its characteristics, and systematically provides evidence to support the need for Big Data. Specifically, this study frames the positive impact of Big Data around the concept of turning Big Data into big money. Big Data is a movement currently sweeping the information systems management world. Research has shown that data creation is growing at an exponential rate. Every day, 2.5 quintillion bytes of data are added to the cloud. The Big Data movement is helping users, businesses, and governments unlock value that was previously unattainable or unknowable.

Keywords: Big Data, big money, characteristics, 4 V’s, volume, variety, velocity, veracity.

Introduction

“Data is the new science. Big Data holds the answers.” (Bhasin, 2014)
“Data are becoming the new raw material of business.” (Cokins, 2014)
“Information is the oil of the 21st century, and analytics is the combustion engine.” (Greengard, 2014)

Big Data has become an all-encompassing, somewhat sprawling term. Seemingly, it has as many definitions as it does applications. What is Big Data, how can it be characterized, and what does it mean in the context of government, business, and consumer usage? Harris Interactive, one of the world’s leading market research firms, recently surveyed 154 companies, asking them to define “Big Data”. For 28% it meant "massive growth of transaction data." But 24% thought it referred to new technologies for managing massive data, and 19% defined it as the "requirement to store and archive data for regulatory compliance." (Davenport, 2014). None of the 154 companies was able to define or explain it precisely.
To better utilize such a broad concept, its characteristics must be defined, studied, and understood. Big Data is a far-reaching phenomenon with many benefits. By reviewing current Big Data trends, anyone can take advantage of it and turn it into big money. Turning Big Data into big money is a concept built around converting publicly available, or free, data into revenue (Ohlhorst, 2012). Many current Fortune 500 companies were created with the simple idea of mining publicly available data, and grew into massive corporations worth billions of dollars. Big Data is the 21st-century equivalent of the 1800s Gold Rush. Individuals and firms with the right methodologies and tools can strike gold and unlock the vast potential of Big Data and analytics. Big Data is no longer a term. It is a movement (Goes, 2014).

Big Data

What is Big Data? Big Data, in a nutshell, is all the information floating around inside the Internet matrix. According to IBM: “Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone.” (Gantz, 2007). Big Data can also be defined as extremely large datasets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions. Many users, including both medium and large corporations, are Big Data illiterate; they do not understand the concept well enough to take an educated approach to this phenomenon. Yet they use and contribute to Big Data on a daily basis. A typical user on any social platform, such as Facebook or Instagram, is likely using and contributing to Big Data without realizing it. Google Now, Siri, and IBM’s Watson supercomputer are by-products of Big Data and analytics. A recent Forbes magazine article indicates “Data has always been used to develop high-level metrics and business intelligence.
Smart organizations have long relied on data to help make strategic business decisions. But the power and allure of Big Data is how it enables organizations to leverage unconventional data points.” The invisible influence of Big Data affects everyone who uses an internet-connected device (Sardana, 2013).

Big Data Characteristics

To better leverage and understand the concept of Big Data, its characteristics must be defined. Big Data has been characterized by the 4 V’s: Volume, Velocity, Variety, and Veracity (Tirunillai & Tellis, 2014).

Volume

Volume, in the context of Big Data, is the quantity of data that is generated. The size of the data determines its value and potential, and whether it can be considered Big Data at all. Consumers are creating and consuming an incredible amount of data. Every time a picture is posted to Instagram, a post is “Liked” on Facebook, or a box of diapers is purchased through Amazon, millions of bytes of data change hands and enter the data matrix. The data becomes valuable information, available to be mined and utilized. The amount of data currently in circulation is a fraction of the volume expected in the next 5 to 10 years. As of December 31, 2013, as shown in Figure 1.1, only 39% of the world population was using the internet ("World Internet Users Statistics and 2014 World Population Stats", 2013). As technology becomes more widespread and cheaper to own, the remaining 61% of the world population will increasingly come online and join the World Wide Web. A recent white paper presented by Cisco ("Cisco Visual Networking Index: Forecast and Methodology, 2013–2018", 2014) found that “Annual global IP traffic will surpass the zettabyte (1000 exabytes) threshold in 2016. Global IP traffic will reach 1.1 zettabytes per year or 91.3 exabytes (one billion gigabytes) per month in 2016.
By 2018, global IP traffic will reach 1.6 zettabytes per year, or 131.6 exabytes per month.” To put things in perspective, a zettabyte is equal to 1,000,000,000,000,000,000,000 bytes (10^21 bytes). As the first V indicates, Big Data is gargantuan.

Figure 1.1 ("World Internet Users Statistics and 2014 World Population Stats", 2013).

Velocity

Velocity, in the context of Big Data, refers to the speed of data generation: how fast the data is generated and processed to meet the demands of user queries. As technology becomes cheaper, thanks in part to Moore’s Law, billions of people and devices are becoming connected. These devices are creating data and adding it to our digital universe at an extremely rapid rate. EMC, one of the largest providers of data storage systems in the world, indicated that the New York Stock Exchange captures 1 terabyte of trade information during each trading session; that is about 142 gigabytes every hour. EMC also projects that there will be 18.9 billion network connections by 2016. That is almost 2.5 connections per person on earth (Davenport, 2014). High velocity, or fast data, can mean millions of rows of data per second. In a recent study, IBM indicated that data generation by humans has been growing exponentially for quite some time, fueling the growth of companies like EMC. In fact, 90% of the world’s data was created in the last 2 years (Bottles, 2014). This demonstrates how quickly the world has embraced Big Data.

Variety

Variety, in the context of Big Data, refers to the different types of data that can be used. Historically, structured data received most of the attention, since it fit neatly into tables or relational databases. Yet 80% of the world’s data is unstructured, such as voice, video, and images (Bottles, 2014).
With Big Data technology, computational algorithms can now analyze and bring together data of different types, such as social media, photos, sensor data, and video recordings. This data can be structured and mined to provide meaningful results. According to IBM ("What Is Big Data", 2013), it is anticipated that by the end of 2014 there will be 420 million wearable, wireless health monitors collecting a variety of health information such as heart rate, calorie burn, and blood pressure; YouTube users, on average, will have watched 4 billion hours of video every month; and Facebook users will have shared 30 billion pieces of content every month. The data provided by these various conduits gives a glimpse into the variety of data expected from the information systems of the future. On the topic of Big Data and information systems (IS) research, Professor Paulo Goes (Goes, 2014), from the University of Arizona, states: “I personally find variety the most interesting dimension of Big Data from an IS perspective. Putting together data from sensors, the “Internet of things,” and the vast repository that we call the Web, allows researchers to ask and answer questions that explain and predict individual behavior and detect population trends.” Figure 1.2 shows the variety of Big Data and some of its sources.

Figure 1.2 ("Intelligence by Variety - Where to Find and Access Big Data", 2013)

Veracity

Veracity is defined as truth or accuracy; the quality of being truthful or honest; and conformity with truth or fact. In the context of Big Data, veracity refers to the messiness or trustworthiness of the data. With the volume of data generated every minute, quality and accuracy are less controllable. Data veracity is often overlooked, and sometimes ignored, but it is as important as the other 3 V’s of Big Data: volume, velocity, and variety.
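To make the idea of veracity more concrete, the following is a minimal Python sketch of screening incoming records for trustworthiness before they are stored. It is an illustration under stated assumptions, not a method taken from the cited sources: the field names (heart_rate, systolic_bp), the numeric bounds, and the helper functions are all hypothetical, chosen only to echo the health-monitor example above.

```python
# Hypothetical plausibility bounds (illustrative assumptions, not clinical guidance).
BOUNDS = {
    "heart_rate": (20, 250),   # beats per minute
    "systolic_bp": (50, 300),  # mmHg
}


def is_trustworthy(record: dict) -> bool:
    """Return True only if every bounded field is present, numeric, and in range."""
    for field, (low, high) in BOUNDS.items():
        value = record.get(field)
        # Reject missing or non-numeric values (bool is excluded explicitly,
        # because Python treats True/False as integers).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        # Reject values outside the plausible range.
        if not low <= value <= high:
            return False
    return True


def split_by_veracity(records):
    """Partition records into (accepted, rejected) lists."""
    accepted, rejected = [], []
    for record in records:
        (accepted if is_trustworthy(record) else rejected).append(record)
    return accepted, rejected


# Made-up sample readings, for illustration only.
readings = [
    {"heart_rate": 72, "systolic_bp": 118},     # plausible
    {"heart_rate": 0, "systolic_bp": 120},      # out of range (sensor dropout)
    {"heart_rate": 80},                         # incomplete record
    {"heart_rate": "n/a", "systolic_bp": 110},  # non-numeric value
]

accepted, rejected = split_by_veracity(readings)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

In this sketch, rejected records are kept in a separate list rather than discarded, so that doubtful data can still be reviewed or corrected later.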
The concept of GIGO, garbage in, garbage out, states that if a database is fed inaccurate data, it will output garbage information. Without data boundaries, databases and computer models have no way of discerning data inconsistency and incompleteness. Data veracity is a “filter” that can be utilized to approximate data that is in doubt. The Big Data revolution will force future designers to rethink the traditional data-warehouse and business intelligence architecture to accept massive amounts of both structured and unstructured data at great velocity.

Trends and Benefits: Big Data and Big Money

Big Data is critical to any information technology system. As research indicates, Big Data is the future of information management (Goes, 2014, p. 38). The 4 V’s discussed all seem to indicate that Big Data is complex and expensive to store, analyze, and properly use. Even though Big Data is a complicated concept, its use has become critical and widespread. If deployed efficiently, Big Data can provide the end user with a more immersive and customized experience (Greengard, 2014, pp. 12-14). Frank Ohlhorst, in his book Big Data Analytics, explains that Big Data is not a new concept: “Big Data has its roots in the scientific and medical communities. Historically, scientists depended on complex analysis of large data sets, for the development of physics modeling, drug testing, and other forms of research. While the use of Big Data is now widespread, the concept of scientific research has changed what Big Data has come to be.” (Ohlhorst, 2012, pp. 76-80) The main driver of Big Data today is the concept of turning Big Data into big money. Big Data has grown from a buzzword into a practical definition for what it is all about: Big Analytics and profit (Goes, 2014).
As Big Data is analyzed, mined, and organized, industries small and large are starting to use the findings to introduce greater efficiencies, build new processes for revenue discovery, and fuel innovation. On realizing the value of Big Data, Frank Ohlhorst indicates “a number of industries, including health care, the public sector, retail and manufacturing, can obviously benefit from analyzing their rapidly growing mounds of data. Collecting and analyzing transactional data gives organizations more insight into their customers’ preferences, so the data can then be used as a basis for the creation of products and services. This allows the organizations to remedy emerging problems in a timely and more competitive manner.” (Ohlhorst, 2012, pp. 25-28)

Research indicates that the use of Big Data and analytics is becoming key to competition and growth in several industries, and that it will fuel new opportunities for productivity and growth. Big Data can provide large corporations and end users with these benefits: knowledge discovery, actionable information, short- and long-term benefits, and resolution of pain points (Bhasin, 2014). Imagine having the power to execute a complex plan, with predictable outcomes, all built around the four benefits mentioned above, with the click of a button. All of that is becoming possible due to advancements in information storage and analytics.

Big Data is utilized every time a consumer asks Google Now or Apple’s Siri a question. Google Now, for example, collects massive and detailed information about its end users through their interactions with various Google services. Google Now is able to provide reminders, directions, food recommendations, travel tips, targeted advertisement, and a form of “artificial intelligence” at the click of a button. This process of data mining and analytics provides Google and its developers with a method to turn Big Data into big money.
While a typical user does not have the resources needed to fund a massive data mining operation, most of the resources can be obtained for free or for a small monthly charge. YouTube, an online video streaming service, has been a great crowdsourcing location for creative users. Typically, users post their videos, which Google hosts for free, and may choose to monetize their creations through advertisements. Google does all the heavy lifting, mines the viewers, and provides the content creators with an automated method to generate revenue. Google is able to provide this service for free by retaining a portion of the advertisement revenues. Anyone with an internet connection can now cash in on Big Data and turn it into big money.

General Electric (GE) is starting to implement the concept of Big Data and analytics across its broad portfolio of products. A recent GE webinar on Big Data 2.0 outlined the fast adoption of Big Data and analytics in the commercial, government, and consumer markets. Mark Little, GE’s chief technology officer, discussed how the company’s new light fixtures would no longer be merely a source of light. Instead, each would be a Big Data storage and analytics engine in the form of a light fixture (Little, 2014). GE’s street light fixtures, for example, will be able to function as a Wi-Fi hotspot, a security camera hub, a module to detect gunshots and poisonous gases, and, lastly, a light fixture. A simple light fixture is now able to serve the three main e-commerce contributors: government, consumer, and business. All of that is possible due to a connected device that is capable of providing real-time data to an analytics engine, utilizing Big Data to make decisions. The data collected will be monetized, within GE’s legal rights, and offered to developers to be turned into additional services they can sell. This process is, again, turning publicly available data into big money.
The concept of turning Big Data into big money is no longer a dream. It is the reality of some of the top Fortune 500 companies in Silicon Valley. Two billion people use the Internet daily (Wright, 2014); that is two billion minds to mine and two billion people who are in need of products and services. Companies, and even small start-ups, are relying on Big Data to unlock the hidden value of the fast-expanding data. Big Data is driving increased growth and profitability. In the last 10 years, many inventors capitalized on the concept and currently operate multi-billion-dollar companies. Facebook is a prime example. With 1.23 billion active users (Sedghi, 2014), Facebook is continually mining “free data” provided by its users to sell billions of dollars in advertisement products back to those users. Redfin also uses publicly available data, and the Internet, to target and sell homes to individuals wishing to avoid the hassle of buying a home the traditional way. Big Data truly is “Big Money”. “Big Data is not a fad. It’s a foregone conclusion. And I believe the application phase of Big Data’s development will be one of the biggest business drivers of the next ten years.” (Bawa, Forbes, 2013, p. 2).

Conclusion

“More and more decisions will be coming from self-learning, self-adjusting software and machines. You’re already beginning to see it in products such as IBM Watson, robotic vacuum cleaners, learning thermostats and self-driving cars. We, at Rocket Fuel, are also leveraging such systems to analyze petabytes of data and drive heroic success for our customer’s marketing campaigns,” (Gupta, Forbes, 2013, p. 1).

Data creation is developing at a record rate. A white paper study by EMC predicts that between 2009 and 2020, digital data will grow 44-fold (Gantz, 2007). The volume, velocity, variety, and veracity of the data created provide a great opportunity for analytics and data mining, and in turn big money.
Big Data will expand our knowledge in human genomics, health care, finance, and many other areas. Big Data is also a crucial element in allowing our information systems to move towards more autonomous decision making by leveraging Artificial Intelligence. The technology is evolving to a point where it can analyze patterns and provide guidance never before possible. The Big Data movement can truly transform lives. Benefits range from unlocking the secrets of our universe to providing a source of income (Tambe, 2014). Big Data should be studied, understood, and accepted as the reality of information systems going forward.

References

Bhasin, M. K. (2014). Numbersense: How to Use Big Data to Your Advantage. Financial Analysts Journal, 70(3), 57-58.
Bottles, K., Begoli, E., & Worley, B. (2014). Understanding the Pros and Cons of Big Data Analytics. Physician Executive, 40(4), 6-12.
Cisco Visual Networking Index: Forecast and Methodology, 2013–2018. (2014, June 10). Retrieved November 25, 2014, from http://www.cisco.com/c/en/us/solutions/collateral/service-provider/ip-ngn-ip-nextgeneration-network/white_paper_c11-481360.pdf
Cokins, G. (2014). Mining the Past to See the Future (cover story). Strategic Finance, 96(11), 23-30.
Davenport, T. H. (2014). How strategists use “Big Data” to support internal business decisions, discovery and production. Strategy & Leadership, 42(4), 45-50. doi:10.1108/SL-05-20140034
Gantz, J. (2007). A Forecast of Worldwide Information Growth Through 2010. Retrieved November 15, 2014, from https://www.emc.com/collateral/analyst-reports/expandingdigital-idc-white-paper.pdf
Goes, P. B. (2014). Big Data and IS Research. MIS Quarterly, 38(3), iii-viii.
Greengard, S. (2014). Weathering a New Era of Big Data. Communications of the ACM, 57(9), 12-14. doi:10.1145/2641225
Intelligence by Variety - Where to Find and Access Big Data. (2013, January 1).
Retrieved November 25, 2014, from http://www.kapowsoftware.com/resources/infographics/intelligence-by-variety-where-tofind-and-access-big-data.php
Little, M. (Director) (2014, October 1). Big Data 2.0. Lecture conducted from General Electric.
Ohlhorst, F. J. (2012). Big Data Analytics: Turning Big Data into Big Money. Retrieved from http://www.eblib.com
Sardana, S. (2013, November 20). Big Data: It's Not A Buzzword, It's A Movement. Retrieved November 25, 2014, from http://www.forbes.com/sites/sanjeevsardana/2013/11/20/bigdata/
Sedghi, A. (2014, February 12). Facebook: 10 years of social networking, in numbers. Retrieved November 25, 2014, from http://www.theguardian.com/news/datablog/2014/feb/04/facebook-in-numbers-statistics
Tambe, P. (2014). Big Data Investment, Skills, and Firm Value. Management Science, 60(6), 1452-1469. doi:10.1287/mnsc.2014.1899
Tirunillai, S., & Tellis, G. J. (2014). Mining Marketing Meaning from Online Chatter: Strategic Brand Analysis of Big Data Using Latent Dirichlet Allocation. Journal of Marketing Research, 51(4), 463-479.
What Is Big Data. (2013, March 1). Retrieved November 30, 2014, from http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-Vs-of-big-data.jpg
World Internet Users Statistics and 2014 World Population Stats. (2013, December 12). Retrieved November 25, 2014, from http://www.internetworldstats.com/stats.htm
Wright, A. (2014). Big Data Meets Big Science. Communications of the ACM, 57(7), 13-15. doi:10.1145/2617660