Brief Summary on Big Data

Brief Summary on Big Data Abstract. This paper provides a brief summary on Big Data research. Summary The age of data science has just started and is experiencing the tactical evolution by leaps and bound in all dimensions of science. The digital footprints and sizeable traces of the data are being produced and left to deposit in massive quantities by various data outlet harbingers such as Facebook, Twitter, Wikipedia, Google to name a few. The scientists from almost every community are expressing significant interest and eagerness to access the spoors of the raw data to facilitate the optimal decision makings. Disparate scientific communities unilaterally cherish the granary of such a voluminous data, but argue about the accessibility mechanism and echo opinions to usher new technologies to cater the data processing needs of this very hour. The fast-paced growth of the modern data - structured and unstructured - raises the doubts on the computing capacity of the existing database models. In the last few years, due to the internet sites, the sources of data collection have been increased. The data is inundated from many potential sources such as sensors - gathering climatic information, power meter readings, social media sites, videos and digital pictures, traffic data, etc. These sources are responsible for the rapid growth of voluminous data about 2.5 quintillion bytes every day. Because of the growth of the data with such a fast velocity, the size of data is a central moving issue today. In 2012, in a single dataset, the size of the data likely to increase from few dozen terabytes to many petabytes and about 90% of the whole data in world today is produced in the last two years only. Not only for storage capacity, but also from processing aspects, the existing database management systems fall short to handle them. In this era of data science, in its entirety, the term “data” is further redefined, which is popularly known as “Big Data” today. This may help to convincingly reformulate the operational-oriented definition of “Big Science” into data-centric definition and can be rephrased as –“The science that deals with Big Data is Big Science.” The concept of Big Data is relatively new and has a large scope of further research. This section highlights some important aspects about the Big Data. Primarily, the Big Data is described to have five concerns: volume of the data, velocity of the data, variety of the data, veracity of the data, and the value of the data. All of the concerns are briefly described here. Today, the industries are inundated with voluminous data of all types. The data is growing easily from terabytes to even petabytes. Tweets from the social network sites alone are responsible for up to 12 terabytes of data every day. On an annual basis the power meters contribute to a huge amount of data – 350 billion readings. The data, getting collected at the enterprises is at a very fast pace. Sometimes even the delay of a minute may cause the discrepancy in the analyzed output. For especially time-sensitive systems such as fraud detection, the Big Data must be analyzed as it comes into the enterprises to obtain the best results such as scanning though the 5 million trade transactions, produced every day, to capture the potential frauds. As per the description about Big Data, it can be of any type, regardless of its structured or unstructured nature. The data type majorly includes – text, audio and video data, sensor data, log files, etc. One of the main reasons that the enterprises prefer to work with Big Data is the finding of new insights, when analyzing all such data types together. The examples include – monitoring live video feeds from surveillance cameras in order to narrow down to the point of interest; to analyze the bulk data growth in the videos, documents and images to enhance the customer satisfaction. In the business world, the leaders do not completely trust on the information extracted from Big Data. Though, they exploit this information to obtain better decision making. If someone does not trust on the input, how she/ he can trust on the output and changes that one will act of such outputs are next to negligible. As the Big Data is growing in size and variety every day, instating the confidence in Big Data poses greater challenges. Regardless the engineering arena, enormous efforts are invested for the curation of Big Data with the expectations of being it content-rich and valuable. Though, the richness of the data may be highly valuable for certain communities and may be of no use for others that do not see the apt outcome from the extraction. So, in the latter case, the efforts investment does not warrant any return. So, a primary careful analysis about the value of the Big Data before its curation may help building the competitive advantage. Conclusion. This paper was a one pager on Big Data research. 1 References Sharma, S. (2016). Expanded cloud plumes hiding Big Data ecosystem. Future Generation Computer Systems, 59, 63-92. Sharma, S., Chang, V., Tim, U. S., Wong, S, Gadia, S. (2016). Cloud-based Emerging Services Systems. International Journal of Information Management, Elsevier. Sharma, S. (2016). Concept of Association Rule of Data Mining Assists Mitigating the Increasing Obesity. International Journal of Information Retrieval Research (IJIRR), IGI Global, 2016. Sharma, S., Chang, V., Tim, U. S., Wong, S, Gadia, S. (2016). Growing Cloud Density & as-a-Service Modality and OTH-Cloud Classification in IOT Era. Sharma, S., Tim, U. S., Gadia, S., & Wong, J. (2016). Proliferating Cloud Density through Big Data Ecosystem, Novel XCLOUDX Classification and Emergence of as-a-Service Era. Sharma, S. (2015). Evolution of as-a-Service Era in Cloud. arXiv preprint arXiv:1507.00939. Sharma, S., Tim, U. S., Payton, M., Cohly, H., Gadia, S., Wong, J., & Karakala, S. (2015). Contextual motivation in physical activity by means of association rule mining. Egyptian Informatics Journal, 16(3), 243-251. Sharma, S., Gadia, S., Tim, U. S. (2016). SubVizCon: Visual Subsetting, Conversion and Complex Queries Exploitation in Spatio-Temporal Compound of Big Data. Sharma, S., Tim, U. S., Gadia, S., Wong, J., Shandilya, R., & Peddoju, S. K. (2015). Classification and comparison of NoSQL big data models. International Journal of Big Data Intelligence, 2(3), 201-221. Sharma, S., Shandilya, R., Patnaik, S., & Mahapatra, A. (2015). Leading NoSQL models for handling Big Data: a brief review. International Journal of Business Information Systems, Inderscience, 18(4). Sharma, S., Tim, U. S., Wong, J., Gadia, S., & Sharma, S. (2014). A brief review on leading big data models. Data Science Journal, 13(0), 138-157. Sharma, S., Tim, U. S., Gadia, S., & Wong, J. (2014). Does SNAP eligibility have racial or ethnic gradients: a geospatial social exploratory. International Journal of Information and Communication Technology, 6(2), 189-212. Sharma, S. (2014). Racial disparity in SNAP? No: a geospatial study for Iowa. International Journal of Indian Culture and Business Management, 9(3), 340-352. SUGAM, S., SHASHI, G., & UDOYARA SUNDAY, T. (2013). Parametric database approach integration for handling temporal data in GIS. Geo-spatial Information Science, 16(2), 91-99. SHARMA, S., TIM, U. S., & GADIA, S. (2012). AutoConViz: automating the conversion and visualization of spatio-temporal query results in GIS. Geo-spatial Information Science, 15(2), 85-93. Sharma, S., Wong, J., Tim, U. S., & Gadia, S. (2013). Bidirectional migration between variability and commonality in product line engineering of smart homes. International Journal of System Assurance Engineering and Management, 4(1), 1-12. Sharma, S., Goyal, S. B., Shandliya, R., & Samadhiya, D. (2012). Towards XML Interoperability. In Advances in Computer Science, Engineering & Applications (pp. 1035-1043). Springer Berlin Heidelberg. Shandilya, R., Sharma, S., & Qamar, S. (2012). A Domain Specific Indexing Technique for Hidden Web Documents. Communications in Information Science and Management Engineering. Sharma, S., Tim, U. S., Gadia, S., & Smith, P. (2011). Geo-spatial Pattern Determination for SNAP Eligibility in Iowa Using GIS. In Advances in Computing and Communications (pp. 191-200). Springer Berlin Heidelberg. Sharma, S., Yang, H. I., Wong, J., & Chang, C. K. (2011). Wrenching: transient migration from commonality to variability in product line engineering of smart homes. In Toward Useful Services for Elderly and People with Disabilities (pp. 230-235). Springer Berlin Heidelberg. 2 Meghanathan, N., Sharma, S., & Skelton, G. W. (2010). On Energy Efficient Dissemination in Wireless sensor Networks using Mobile Sinks. Journal of Theoretical and Applied Information Technology, 19(2), 7991. Sharma, S., Gadia, S., Kumar, N., Narayanan, V., & Zhao, X. (2010). Climate Analysis in IOWA Using XML and Spatio-Temporal Dataset-NC94. Int. J. Database Manage. Syst, 2(3), 82-93. Sharma, S., & Gadia, S. (2010). Perl Status Reporter (SRr) on Spatiotemporal Data Mining. International Journal Computer Science and Engineering Survey, AIRCC-IJCSES (August 2010). Sharma, S., & Gadia, S. K. (2010). On analyzing the degree of coldness in Iowa, a north central region, United States: An XML exploitation in spatial databases. In Recent Trends in Network Security and Applications (pp. 613-624). Springer Berlin Heidelberg. Sharma, S., Gadia, S., & Goyal, S. B. (2010, September). On the Calculation of Coldness in Iowa, a North Central Region, United States: A Summary on XML Based Scheme. In Information and Communication Technologies (pp. 400-405). Springer Berlin Heidelberg. Sharma, S., & Gadia, S. K. (2010, April). An XML-based range variation approach to render the coldness in Iowa, a North Central region, United States. In Information Management and Engineering (ICIME), 2010 The 2nd IEEE International Conference on (pp. 567-571). IEEE. Sharma, S., Goyal, S. B., & Qamar, S. (2009, December). Four-Layer Architecture Model for Energy Conservation in Wireless Sensor Networks. In Embedded and Multimedia Computing, 2009. EM-Com 2009. 4th International Conference on (pp. 1-3). IEEE. Sharma, S., & Cohly, H. (2009, December). On enhancing the energy conservation by ATIM window in wireless sensor networks. In Internet Multimedia Services Architecture and Applications (IMSAA), 2009 IEEE International Conference on (pp. 1-4). IEEE. Sharma, S., & Rani, M. (2009, October). Three-layer architecture model (TLAM) for energy conservation in wireless sensor networks. In Ultra Modern Telecommunications & Workshops, 2009. ICUMT'09. International Conference on (pp. 1-5). IEEE. Sharma, S., Rani, M., & Goyal, S. B. (2009, October). Energy efficient data dissemination with ATIM window and dynamic sink in wireless sensor networks. In Advances in Recent Technologies in Communication and Computing, 2009. ARTCom'09. International Conference on (pp. 559-564). IEEE. Sharma, S., Cohly, H., & Pei, T. (2010). On Generation of Firewall Log Status Reporter (SRr) Using Perl. arXiv preprint arXiv:1004.0604. Sharma, S., Pei, T., & Cohly, H. (2010). On Utilization and Importance of Perl Status Reporter (SRr) in Text Mining. arXiv preprint arXiv:1001.3277. Meghanathan, N., Sugam, S., & Skelton, G. W. (2008). Use of Mobile Sinks to Disseminate Data in WSNs. International Journal of Information Processing, 2(2). SHARMA, S., PEI, T., & COHLY, H. (2008). ToAccess PubMed Database to Extract Articles Using Perl_Su. IJCSES, 2(2), 92. Challa, V. S., Indracanti, J., White, L. D., Sharma, S., Baham, J. M., Rabarison, M. K., & Anjaneyulu, Y. (2007, September). Study of Meso-Scale Coastal Circulations in Mississippi Gulf coast with Mesonet Observations and Modeling. In Seventh Conference on Coastal Atmospheric and Oceanic Prediction and Processes. Singh, S. P., Bhanot, K., & Sharma, S. (2016). Critical Analysis of Clustering Algorithms for Wireless Sensor Networks. In Proceedings of Fifth International Conference on Soft Computing for Problem Solving (pp. 783-793). Springer Singapore. Sharma, S. (2015). An Extended Classification and Comparison of NoSQL Big Data Models. arXiv preprint arXiv:1509.08035. 3 4

Brief Summary on Big Data

Related documents

Products

Support

Brief Summary on Big Data

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib