Principles of Data Management
Hotel Prediction Dataset

TEAM MEMBERS
1. Hafizul Izwan Zulkifli - 2021123017
2. Muhamad Fathurrahman Bin Muhamadkharudin - 2020641832
3. Syed Abdullah Bin Syed Yusoff - 2021744185
4. Nur Madihah Binti Shaipol Yusoff - 2021591527
5. Nur Amirah Izzati Binti Mohd Said - 2021439618

INTRODUCTION
● The Covid-19 pandemic has disrupted travel plans across the globe through travel restrictions and flight cancellations, and we are all aware of the shock waves it has caused.
● As a consequence, backpackers and travelers are cancelling their bookings for hotels, flights and tours.
● The global travel industry has been overwhelmed by the large number of coronavirus-induced cancellations.
● So, can we predict whether or not a hotel is likely to receive a disproportionately high number of special requests?

The Dataset
1. The dataset used in this project is the Hotel Booking Demand Dataset by Nuno Antonio, Ana Almeida and Luis Nunes, published in Data in Brief in 2019 and obtained from the Kaggle platform.
2. It consists of one CSV file, “Hotel_bookings”, with a total of 32 columns and 119391 rows.

Objectives
● To determine which features are most crucial in predicting hotel cancellations, and how significant each feature is
● To predict which customers are most likely to cancel their reservations, in order to improve projections and lower the risk associated with business decisions
● To build a model that can predict bookings with a high cancellation probability
● To predict the future number of guests for each hotel type

ROOT CAUSE IDENTIFICATION

POTENTIAL SOLUTION, ISSUES AND CHALLENGES

POTENTIAL SOLUTION
➔ We were able to resolve every problem we encountered.
➔ We learned new information while researching the solutions.
➔ We were able to generate a new dataframe and remove any columns that were not necessary for our project.
➔ This plan of action was created to make sure that whatever we wanted to represent through data visualization was in good condition.
To prevent errors while operating the data system, we must manage the data structure, regardless of the data type, so that the data is accurate.

ISSUES AND CHALLENGES
❖ Before finding suitable data
It was difficult for us to find the best and most appropriate data for this project.
❖ During the data analysis process
Library and Packages: We spent too much time installing packages in the Anaconda Prompt. These packages include matplotlib, numpy, pandas, seaborn, and the Auto ARIMA model used to predict the future number of guests.
Data cleaning: We dropped the company and agent columns, then checked for missing values after the drop.

DATA QUALITY ASSESSMENT
Data Quality Assessment Link: https://drive.google.com/file/d/1xRCmmNYK4kLGFMIWo2wIjQF59gui48Jt/view?usp=share_link

CONCLUSION
1. Big data is transformational for businesses in every industry. Data is now part of every sector and function of the global economy and, like other essential factors of production such as hard assets and human capital, much of modern economic activity simply could not take place without it.
2. From telecommunications to fitness, and from banking to manufacturing, big data is improving business operations, customer experience, resource optimization, and supply-chain efficiency.
3. The use of big data, meaning large pools of data that can be brought together, will become a key basis of competition and growth for individual firms, enhancing productivity and creating significant value for the world.
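
The data cleaning step described under Issues and Challenges (dropping the company and agent columns, then checking for missing values) could look roughly like the sketch below. It is only an illustration, not the team's actual notebook: the helper names `drop_sparse_columns` and `missing_value_report` are hypothetical, and the small in-memory sample stands in for the real CSV file so the example runs on its own. With the real data, the sample would presumably be replaced by a `pd.read_csv(...)` call on the downloaded file.

```python
import pandas as pd


def drop_sparse_columns(df, columns=("company", "agent")):
    """Return a copy of df without the given columns (absent ones are ignored)."""
    return df.drop(columns=list(columns), errors="ignore")


def missing_value_report(df):
    """Count missing values per column, keeping only columns that have any."""
    counts = df.isna().sum()
    return counts[counts > 0].sort_values(ascending=False)


# Tiny made-up sample standing in for the real dataset (values are illustrative only).
sample = pd.DataFrame({
    "hotel": ["Resort Hotel", "City Hotel", "City Hotel"],
    "is_canceled": [0, 1, 0],
    "children": [0.0, None, 1.0],
    "country": ["PRT", None, None],
    "agent": [None, 9.0, None],
    "company": [None, None, 110.0],
})

cleaned = drop_sparse_columns(sample)    # step 1: drop company and agent
report = missing_value_report(cleaned)   # step 2: check what is still missing
print(report)
```

Checking missing values after the drop, as the slide describes, shows which remaining columns still need attention before visualization or modelling.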