Uploaded by Syed Abdullah

ICT550 FINAL PROJECT POWERPOINT

Principles of Data Management
Hotel Prediction Dataset
TEAM MEMBERS
1. Hafizul Izwan Zulkifli - 2021123017
2. Muhamad Fathurrahman Bin Muhamadkharudin - 2020641832
3. Syed Abdullah Bin Syed Yusoff - 2021744185
4. Nur Madihah Binti Shaipol Yusoff - 2021591527
5. Nur Amirah Izzati Binti Mohd Said - 2021439618
INTRODUCTION
● The Covid-19 pandemic has disrupted travel plans across the globe through travel restrictions and flight cancellations, and we are all aware of the shock waves it has caused.
● As a consequence, backpackers and travelers are cancelling their bookings for hotels, flights and tours.
● The global travel industry has been overwhelmed by the large number of coronavirus-induced cancellations.
● So, can we predict whether or not a hotel is likely to receive a disproportionately high number of special requests?
The Dataset
1. The dataset used in this project is the Hotel Booking Demand dataset by Nuno Antonio, Ana Almeida and Luis Nunes, published in Data in Brief in 2019 and obtained from the Kaggle platform.
2. It consists of one CSV file, "Hotel_bookings", with a total of 32 columns and 119391 rows.
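As a minimal sketch of how the size of the dataset can be checked after loading it with pandas: the snippet below reads a tiny in-memory stand-in instead of the real file, and the column names `hotel`, `is_canceled` and `lead_time` as well as the sample values are assumptions made for illustration. With the real file, the same `read_csv` call would take the path to the CSV.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the CSV file (values are made up for illustration).
sample_csv = io.StringIO(
    "hotel,is_canceled,lead_time\n"
    "Resort Hotel,0,342\n"
    "City Hotel,1,88\n"
)

# Load the data; with the real file, pass its path instead of the buffer.
bookings = pd.read_csv(sample_csv)

# shape is a (rows, columns) tuple, useful for confirming the dataset size.
n_rows, n_cols = bookings.shape
print(f"{n_rows} rows, {n_cols} columns")
```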
Objectives
● To determine which features are most crucial in predicting hotel cancellations, or how significant each feature is
● To predict which customers are most likely to cancel their reservations, in order to improve projections and lower the risk associated with business decisions
● To build a model that can predict bookings with a high cancellation probability
● To predict the future number of guests for each hotel type
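One rough way to get a first impression of how strongly each feature relates to cancellation is to rank numeric columns by their absolute correlation with the cancellation flag. The sketch below is only an illustration of that idea, not the method used in this project: the column names (`is_canceled`, `lead_time`, `total_of_special_requests`, `adults`) and all values are assumptions, and correlation is only a simple proxy for feature importance.

```python
import pandas as pd

# Made-up sample rows; column names are assumed for illustration.
bookings = pd.DataFrame({
    "is_canceled": [1, 1, 0, 0, 1, 0],
    "lead_time": [200, 150, 10, 30, 120, 5],
    "total_of_special_requests": [0, 0, 2, 1, 0, 3],
    "adults": [2, 2, 2, 1, 2, 2],
})

# Pearson correlation of every feature column with the cancellation flag.
features = bookings.drop(columns="is_canceled")
correlation = features.corrwith(bookings["is_canceled"])

# Rank features by the strength of the relationship, ignoring its sign.
ranking = correlation.abs().sort_values(ascending=False)
print(ranking)
```

A trained model, such as a tree-based classifier, would usually give a more reliable picture of feature significance than correlation alone.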
ROOT CAUSE IDENTIFICATION
POTENTIAL SOLUTION, ISSUES AND CHALLENGES
POTENTIAL SOLUTION
➔ We were able to resolve every problem we encountered.
➔ We learned new information while researching the solutions.
➔ We generated a new dataframe and removed the columns that were not necessary for our project.
➔ This plan of action was created to make sure that the data we wanted to represent through data visualization was in good condition. To prevent errors while working with the data, we must organize it, regardless of its type, into an accurate and consistent structure.
ISSUES AND CHALLENGES
❖ Before finding suitable data
It was difficult for us to find the most appropriate dataset for this project.
❖ During the data analysis process
Libraries and packages
We spent a lot of time installing packages from the Anaconda Prompt. These packages include matplotlib, numpy, pandas, seaborn, and the Auto ARIMA model used to predict the future number of guests.
For data cleaning, we dropped the company and agent columns, then checked for missing values after the drop.
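The cleaning step described above can be sketched in pandas as follows. The `company` and `agent` column names come from the description; the other columns and all values are made-up stand-ins for the real dataset.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the hotel bookings table (values are made up for illustration).
bookings = pd.DataFrame({
    "hotel": ["Resort Hotel", "City Hotel", "City Hotel"],
    "is_canceled": [0, 1, 0],
    "children": [0.0, np.nan, 1.0],
    "agent": [np.nan, 9.0, np.nan],
    "company": [np.nan, np.nan, 110.0],
})

# Drop the two columns that were removed during cleaning.
cleaned = bookings.drop(columns=["company", "agent"])

# Count the missing values that remain in each column after the drop.
missing_after_drop = cleaned.isna().sum()
print(missing_after_drop)
```

Any column that still shows a non-zero count here would need a further decision, such as filling the gaps or dropping the affected rows.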
DATA QUALITY ASSESSMENT
Data Quality Assessment link:
https://drive.google.com/file/d/1xRCmmNYK4kLGFMIWo2wIjQF59gui48Jt/view?usp=share_link
CONCLUSION
1. Big data is transformational for businesses in every industry. Data is now part of every sector and function of the global economy and, like other essential factors of production such as hard assets and human capital, much of modern economic activity simply could not take place without it.
2. From telecommunications to fitness and from banking to manufacturing, big data is improving business operations, customer experience, resource optimization, and supply-chain efficiency.
3. The use of big data, meaning large pools of data that can be brought together, will become a key basis of competition and growth for individual firms, enhancing productivity and creating significant value for the world.