
Phase -3 Final Draft

Team Project Phase-3 (Team No. 6)
Statistical data analysis of Zomato Inc. using Correlation and Regression
Mihir Ishwarbhai Solanki (2241428)
Grishma Manoj Macwan (2304814)
Nneka Jessica Madukwe (2238287)
Rushvic Padala (2308518)
Jenish Dholariya (2300967)
Honey Patel (2232107)
University Canada West
MBAF 502 Quantitative Reasoning and Analysis
Maira Jimenez Sanchez
September, 2023
Index
PART-A
01  Introduction and overview of the study
02  Correlation analysis for dining service rating and doorstep delivery service rating
03  Regression analysis between prices and dining service rating for all restaurants and forecasting
PART-B
04  Share prices analysis of Pepsi Co. Inc., using correlation and predicting future prices based on linear equation
05  References
06  Peer review
PART A
Introduction
Zomato Inc. is a multinational food delivery company that runs cloud kitchens and partners with thousands of restaurants to facilitate food delivery. This section presents a statistical analysis (correlation and regression) of data collected from six restaurants in Hyderabad.
The same food is served in-house and delivered to the customer's doorstep. As a result, customer feedback (on a scale of 0 to 5, where 0 means the food is not good and 5 means excellent food quality) should be correlated between dining service and delivery service. The first part examines this correlation.
The second part discusses a causal analysis between food prices and dining service ratings. If food prices are high, the food is perceived to be of good quality and the atmosphere appealing, so customers are expected to rate it higher, and vice versa. Based on this assumption, the team performed a regression analysis of food prices and dining service ratings. Here, food price is the independent variable and rating is the dependent variable. The detailed analysis is summarized in the second part.
1. Correlation analysis
In the restaurant business, food quality is a key differentiator. Many customers now prefer takeaway or delivery to dining in at a restaurant or café. A restaurant prepares the same food whether it is served in-house or delivered to the customer's doorstep. Hence, the team expected a significant correlation between the dining service and doorstep delivery ratings. The first underlying assumption is that the same food quality is delivered to the customer's doorstep. The second assumption is that, from the customer's point of view, the most essential element of feedback is the overall quality of the food.
The team selected six restaurants for the preliminary study and collected the data from the Zomato restaurants dataset (Chandrakanth, 2023). The sample size (number of dishes) for which dining and doorstep delivery ratings were taken is given in Table 1.
Table 1
List of sample size collected from all six restaurants.
Sr. No.   Name of the restaurant/café      Sample size (No. of dishes)
1         Doner King                       45
2         Taco Bell                        96
3         Brown Bear                       34
4         Crystal Restaurant & Bar         231
5         Shah Ghouse Special Shawarma     96
6         Siddique Kabab Centre            46
The team analyzed correlation in MS Excel using the CORREL and PEARSON functions. Both functions return the same correlation coefficient. Table 2 shows the results of this exercise. All tests are performed at a 95% confidence level (a significance level of 0.05), which the team treats as standard for the restaurant business.
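For readers without Excel, the calculation behind CORREL and PEARSON can be sketched in a few lines of Python. Both Excel functions return the Pearson correlation coefficient, which is what the hypothetical helper `pearson_r` below computes. The two rating lists are made-up illustrative values, not data from the Zomato dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings for five dishes (NOT taken from the dataset).
dining_ratings = [3.8, 4.1, 4.5, 3.2, 4.0]
delivery_ratings = [3.6, 4.0, 4.4, 3.5, 3.9]

r = pearson_r(dining_ratings, delivery_ratings)
print(f"r = {r:.4f}")
```

A value close to +1 means dishes rated highly for dining also tend to be rated highly for delivery.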
Table 2
Correlation coefficients and significance level for dining and delivery ratings.
Sr. No.   Restaurant / Café                Correlation coefficient    Strength and Direction   Sig. F
                                           CORREL      PEARSON
1         Doner King                       0.8813      0.8813         High Positive            6.36E-16
2         Taco Bell                        0.8372      0.8372         High Positive            1.20E-26
3         Brown Bear                       0.9122      0.9122         Very High Positive       2.44E-14
4         Crystal Restaurant & Bar         0.5905      0.5905         Moderate Positive        3.38E-23
5         Shah Ghouse Special Shawarma     0.8558      0.8558         High Positive            6.00E-29
6         Siddique Kabab Centre            0.6299      0.6299         Moderate Positive        2.09E-06
The correlation coefficients range from 0.59 to 0.91. The correlation is very high and positive for Brown Bear, high and positive for Doner King, Taco Bell, and Shah Ghouse Special Shawarma, and moderately positive for Siddique Kabab Centre and Crystal Restaurant & Bar. In every case, the significance F value is far below the significance level (0.05), indicating a statistically significant association between the dining and delivery ratings. The results support the underlying assumptions, which are:
1) The most important element for feedback is food quality.
2) The same food quality is served in-house and at doorstep delivery.
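As a rough cross-check of the significance claim, the sketch below converts each correlation coefficient into a t statistic using the standard formula t = r·sqrt(n − 2) / sqrt(1 − r²), with the sample sizes from Table 1 and the coefficients from Table 2. The cutoff of 2.05 is an assumption chosen for illustration: it sits slightly above the tabulated two-tailed 5% critical values for 32 or more degrees of freedom. This is not the procedure Excel used to produce the Sig. F column; it is only an independent sanity check.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: no linear correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# name: (sample size from Table 1, correlation coefficient from Table 2)
restaurants = {
    "Doner King": (45, 0.8813),
    "Taco Bell": (96, 0.8372),
    "Brown Bear": (34, 0.9122),
    "Crystal Restaurant & Bar": (231, 0.5905),
    "Shah Ghouse Special Shawarma": (96, 0.8558),
    "Siddique Kabab Centre": (46, 0.6299),
}

# Assumed conservative cutoff for a two-tailed test at the 0.05 level.
T_CUTOFF = 2.05

t_stats = {name: correlation_t_statistic(r, n) for name, (n, r) in restaurants.items()}
for name, t in t_stats.items():
    verdict = "significant" if abs(t) > T_CUTOFF else "not significant"
    print(f"{name}: t = {t:.2f} ({verdict} at the 0.05 level)")
```

Every t statistic lands well above the cutoff, which agrees with the very small Sig. F values reported in Table 2.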
Brown Bear is a luxurious café. It prioritizes every element of a superior in-house experience, such as food quality and ambiance. It also provides premium packaging in insulated boxes so that delivered food keeps its appeal and texture. As a result, the team was not surprised to see a very high positive correlation between the two types of service. On the flip side, Siddique Kabab Centre is a convenience restaurant that operates on the principle of economical pricing. It has a moderate correlation. The reason may be the packaging. It is also located in the old town, which is usually crowded, resulting in delivery delays. Together, these factors may contribute to the differences between dining and delivery ratings of the same food and hence to the moderate correlation. Shah Ghouse Special Shawarma is also an economical restaurant located in a congested area. However, it has a high positive correlation. Doner King and Taco Bell are modern cafés that serve snacks and fast food for dates, meetings, or leisure time. Both are popular in the city and consistently deliver the same taste and quality, even for home delivery. As a result, both have a high positive correlation between the ratings.
2. Regression Analysis
The team believed that dishes with higher price tags offer significant value to consumers in terms of taste and flavour, aroma, texture, appearance and presentation, and food safety. As a result, the team hypothesized that higher-priced dishes serve consumers better and that customers therefore rate them higher when giving feedback. The team thus expected food item prices (independent variable) and their ratings (dependent variable) to have a significant causal relationship. For this purpose, the team collected food item prices and ratings from the six previously selected restaurants and cafés. The sample size for the regression analysis is the same as for the correlation analysis (Table 1).
The team first constructed a scatter plot to check whether the relationship between prices and ratings is positive, negative, or random. The scatter plots showed a certain degree of positive correlation between the two variables. Figure 1 shows the scatter plots for all six restaurants, along with the trend line and R² value.
Figure 1
Scatter plots showing the correlation between prices and ratings for all six restaurants.
[Six scatter plots, one per restaurant, each with a linear trend line. x-axis: Prices of food items (₹); y-axis: Dining rating of food items (0 to 5).]
Doner King: y = 0.0156x - 0.2278, R² = 0.5381
Taco Bell: y = 0.0093x + 0.4256, R² = 0.7041
Brown Bear: y = 0.0032x + 1.1647, R² = 0.6036
Crystal Restaurant & Bar: y = 0.0035x + 2.1682, R² = 0.4581
Shah Ghouse Special Shawarma: y = 0.0169x - 0.7092, R² = 0.7484
Siddique Kabab Centre: y = 0.0069x + 1.5653, R² = 0.747
To understand the causal relationship better, the team added a linear regression equation and the R² value to each chart using the chart formatting options. The regression equation and R² statistic are shown in Figure 1. The slope is highest (0.0169) for Shah Ghouse Special Shawarma, indicating that an extra ₹100 in price is associated with a 1.69-point increase in the predicted rating. The slope is lowest for Brown Bear (0.0032), indicating that the rating (customer satisfaction) changes slowly with price: an extra ₹100 corresponds to only about 0.32 points. Interestingly, high-priced food items are consistently rated between 4 and 5 (the top right of every chart in Figure 1), indicating that these items meet the expectations their prices set.
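The slope interpretation is simple arithmetic, sketched below for all six trend lines. The slopes are taken from the equations in Figure 1; the variable names are illustrative only.

```python
# Trend-line slopes from Figure 1 (rating points per rupee).
slopes = {
    "Doner King": 0.0156,
    "Taco Bell": 0.0093,
    "Brown Bear": 0.0032,
    "Crystal Restaurant & Bar": 0.0035,
    "Shah Ghouse Special Shawarma": 0.0169,
    "Siddique Kabab Centre": 0.0069,
}

# Predicted change in rating for a price increase of 100 rupees.
rating_change_per_100 = {name: slope * 100 for name, slope in slopes.items()}

steepest = max(slopes, key=slopes.get)
flattest = min(slopes, key=slopes.get)

for name, change in rating_change_per_100.items():
    print(f"{name}: +{change:.2f} rating points per extra INR 100")
print(f"Steepest line: {steepest}; flattest line: {flattest}")
```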
Furthermore, to check whether the observed correlation is statistically significant, the team performed a regression analysis on the samples collected for each restaurant. The team adopted a 95% confidence level for the study, which it considers the industry standard and a reasonable choice for the restaurant business. The regression analysis was performed using the MS Excel Analysis ToolPak, and the resulting R², Multiple R (correlation coefficient), and significance levels are given in Table 3.
Table 3
Regression analysis statistics (R², Multiple R, and significance level) for all the restaurants under study.
Sr.   Name of the restaurant or café     R²        Multiple R   Correlation level    Significance F
1     Doner King                         0.5380    0.7335       High Positive        6.678E-09
2     Taco Bell                          0.7040    0.8391       High Positive        7.341E-27
3     Brown Bear                         0.6036    0.7769       High Positive        4.073E-08
4     Crystal Restaurant & Bar           0.4581    0.6768       Moderate Positive    1.923E-32
5     Shah Ghouse Special Shawarma       0.7483    0.8650       High Positive        3.235E-30
6     Siddique Kabab Centre              0.7470    0.8643       High Positive        5.041E-15
Regression Analysis
Table 3 shows the regression analysis results. Depending on the restaurant, the price of a food item explains between 46% and 75% of the variation in the feedback customers provide (R² from 0.46 to 0.75). This indicates that customers who pay higher prices expect better food quality, taste, hygiene, texture, and service, and vice versa; when those expectations are met, they rate the food higher. For Crystal Restaurant & Bar, R² is 0.46, the lowest among all the restaurants. It is worth noting that this restaurant also has the largest menu in the sample (231 dishes). For Shah Ghouse Special Shawarma and Siddique Kabab Centre, prices explain about 75% of the variation in customer feedback. The value is 60% for Brown Bear and 70% for Taco Bell. For Doner King, prices explain 54% of the variation in customer feedback. Statistically (per the R² values), a relationship exists between the variables under study. Overall, the relationship between prices and ratings is positive, which supports the team's hypothesis of a causal link: costlier items deliver functional as well as pleasure value.
Significance level and Correlation coefficient
The significance F value for all the restaurants and cafés is far below the significance level of 0.05, which means the regression between food prices and their feedback is statistically significant.
The Pearson correlation coefficient (Multiple R) shows a moderate positive correlation for Crystal Restaurant & Bar. For all other restaurants, it shows a strong positive correlation between prices and ratings, indicating that these restaurants justify the prices of their food items.
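In a simple linear regression with a single predictor, Multiple R is the square root of R². The short check below applies that identity to the values in Table 3; the tolerance of 0.001 is an arbitrary allowance for the rounding in the published figures.

```python
import math

# name: (R squared, Multiple R) as reported in Table 3
table_3 = {
    "Doner King": (0.5380, 0.7335),
    "Taco Bell": (0.7040, 0.8391),
    "Brown Bear": (0.6036, 0.7769),
    "Crystal Restaurant & Bar": (0.4581, 0.6768),
    "Shah Ghouse Special Shawarma": (0.7483, 0.8650),
    "Siddique Kabab Centre": (0.7470, 0.8643),
}

TOLERANCE = 0.001  # assumed allowance for rounding in the table

consistent = {}
for name, (r_squared, multiple_r) in table_3.items():
    derived = math.sqrt(r_squared)
    consistent[name] = abs(derived - multiple_r) < TOLERANCE
    print(f"{name}: sqrt(R^2) = {derived:.4f}, reported Multiple R = {multiple_r:.4f}")
```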
Discussion on regression analysis
Further, the team conducted a survey to learn more about customers' perspectives on feedback and found that food quality is a significant factor on which customers base their feedback. However, customer feedback is an overall metric that also reflects factors such as ambiance, staff behaviour, timely service, parking and valet facilities, menu variety, location and convenience, hygiene and safety, waiting time, the availability of space for hosting personal events, and a few others.
Prediction of optimum price-rating combination (Forecasting)
Based on the derived linear regression equations, the team tried to understand the degree of relationship between food prices and their ratings. The team modelled the relationship to predict the price range each restaurant should maintain to offer the best price-service combination, one that satisfies customers' spoken and unspoken needs. Table 4 lists the modelled linear equations relating prices to customer feedback. Table 5 shows the predicted rating as the price changes, which is shown graphically in Figure 2.
Table 4
Modelled linear equation for predicting optimum prices of food items.
Restaurant / Café                Modelled linear equation
Doner King                       y = 0.0156x - 0.2278
Taco Bell                        y = 0.0093x + 0.4256
Brown Bear                       y = 0.0032x + 1.1647
Crystal Restaurant & Bar         y = 0.0035x + 2.1682
Shah Ghouse Special Shawarma     y = 0.0169x - 0.7092
Siddique Kabab Centre            y = 0.0069x + 1.5653
Note: y = customer feedback based on a value derived from food and its prices; x = food prices (in ₹).
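The equations in Table 4 can be turned into a small prediction helper, sketched below. Clamping the result to the 0 to 5 rating scale is an assumption made here, since a straight line would otherwise predict ratings above 5 at high prices. Because the published coefficients are rounded, results may differ from the team's Excel output in the last digit.

```python
# name: (slope, intercept) from Table 4; x is the price in rupees.
EQUATIONS = {
    "Doner King": (0.0156, -0.2278),
    "Taco Bell": (0.0093, 0.4256),
    "Brown Bear": (0.0032, 1.1647),
    "Crystal Restaurant & Bar": (0.0035, 2.1682),
    "Shah Ghouse Special Shawarma": (0.0169, -0.7092),
    "Siddique Kabab Centre": (0.0069, 1.5653),
}

MIN_RATING, MAX_RATING = 0.0, 5.0

def predict_rating(name, price):
    """Predicted rating from the linear model, clamped to the rating scale.

    The clamping step is an assumption of this sketch, not a stated
    part of the team's method.
    """
    slope, intercept = EQUATIONS[name]
    raw = slope * price + intercept
    return max(MIN_RATING, min(MAX_RATING, raw))

for price in (100, 300, 500):
    row = ", ".join(f"{name}: {predict_rating(name, price):.1f}" for name in EQUATIONS)
    print(f"INR {price} -> {row}")
```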
Restaurants cannot raise customer feedback ratings simply by setting high, uniform prices for all food products; that is neither feasible nor realistic. The forecasting model is only a guide that helps identify the price range in which restaurants should keep their hero products (highest-selling food items).
Furthermore, restaurants target different market segments and provide many services besides food. Some (such as Shah Ghouse Special Shawarma) provide only functional value at a reasonable price, while others (such as Brown Bear) offer a niche menu and a premium experience and are located downtown. The selling price therefore includes the cost of both food and services. The model suggests that convenience restaurants such as Shah Ghouse Special Shawarma and Doner King should price their highest-selling products in the range of ₹300 to ₹400. According to the team, this selling price will provide value to their customers and maximize revenue. Similarly, premium restaurants such as Crystal Restaurant & Bar and Brown Bear should try to maintain prices in the range of ₹750 to ₹1000. At these prices, customers will see the items as value for money. The ideal price range is ₹400 to ₹500 for Taco Bell and Siddique Kabab Centre.
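The report does not state how these price ranges were derived. One way to put them in context, sketched below as an assumption rather than the team's method, is to solve each modelled equation for the price at which the predicted rating reaches the top of the scale (y = 5), that is x = (5 − intercept) / slope.

```python
# name: (slope, intercept) from the modelled linear equations in Table 4.
EQUATIONS = {
    "Doner King": (0.0156, -0.2278),
    "Taco Bell": (0.0093, 0.4256),
    "Brown Bear": (0.0032, 1.1647),
    "Crystal Restaurant & Bar": (0.0035, 2.1682),
    "Shah Ghouse Special Shawarma": (0.0169, -0.7092),
    "Siddique Kabab Centre": (0.0069, 1.5653),
}

MAX_RATING = 5.0

def price_at_max_rating(slope, intercept, max_rating=MAX_RATING):
    """Price at which the fitted line reaches the maximum rating."""
    return (max_rating - intercept) / slope

ceiling_prices = {
    name: price_at_max_rating(slope, intercept)
    for name, (slope, intercept) in EQUATIONS.items()
}

for name, price in ceiling_prices.items():
    print(f"{name}: predicted rating reaches 5.0 at about INR {price:.0f}")
```

Beyond these prices the linear model predicts no further gain in rating, which is one reason to read it as a guide rather than a pricing rule.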
Table 5
Predicted customer feedback in accordance with prices of items.
Restaurant Name                  Prices of food items (₹)
                                 100    150    200    250    300    350    400    500    750    1000   1500
Doner King                       1.3    2.1    2.9    3.6    4.4    5.0    5.0    5.0    5.0    5.0    5.0
Taco Bell                        1.4    1.8    2.3    2.8    3.2    3.7    4.1    5.0    5.0    5.0    5.0
Brown Bear                       1.5    1.6    1.8    1.9    2.1    2.2    2.4    2.7    3.5    4.3    5.0
Crystal Restaurant               2.5    2.7    2.9    3.0    3.2    3.4    3.6    3.9    4.8    5.0    5.0
Shah Ghouse Special Shawarma     1.0    1.8    2.7    3.5    4.4    5.0    5.0    5.0    5.0    5.0    5.0
Siddique Kabab Centre            2.2    2.6    2.9    3.3    3.6    3.9    4.3    5.0    5.0    5.0    5.0
Figure 2
Customer feedback prediction based on food item prices.
[Line chart: Customer feedback prediction based on food item prices at all six restaurants. x-axis: Prices of food items (₹); y-axis: Predicted customer feedback on a scale of 0 to 5. Series: Doner King, Taco Bell, Brown Bear, Crystal Restaurant & Bar, Shah Ghouse Special Shawarma, Siddique Kabab Centre.]
PART - B
The problem statement provides data on Pepsi Co., Inc.'s share price from 1990 to 2002, reported at the end of each year. Table 6 summarizes the given data and the share prices predicted by the forecasting model. The team constructed a line chart (Figure 3) in MS Excel from the given data. The line chart clearly shows a positive trend. A trendline and its linear equation were then added. The fitted linear equation is:
Predicted share price at the end of a given year = 9.533 + 3.0106 × (reference count of that year)
Here the reference count is 1 for 1990 and increases by 1 each year, so it is 17 for 2006 and 34 for 2023. Based on the linear equation, the predicted share prices at the end of 2006 and 2023 are $60.71 and $111.89, respectively.
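The forecast can be reproduced with a few lines of Python. The sketch below assumes the reference count equals the year minus 1989, which matches the numbering in Table 6 (1990 is count 1); the function and constant names are illustrative.

```python
INTERCEPT = 9.533   # intercept of the fitted trend line
SLOPE = 3.0106      # increase in predicted price per year
BASE_YEAR = 1989    # assumed offset so that 1990 has reference count 1

def predicted_share_price(year):
    """Predicted year-end share price (in dollars) from the linear trend."""
    reference_count = year - BASE_YEAR
    return INTERCEPT + SLOPE * reference_count

for year in (2006, 2023):
    print(f"{year}: ${predicted_share_price(year):.2f}")
```

Both forecasts extrapolate well beyond the 1990 to 2002 data, so they assume the same linear trend continues unchanged.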
Table 6
Share prices of Pepsi Co., Inc., at the end of each year
Reference Count   Year   Actual Share Price ($)   Predicted Share Price ($)
1                 1990   12.91                    12.5436
2                 1991   16.83                    15.5542
3                 1992   20.61                    18.5648
4                 1993   20.30                    21.5754
5                 1994   18.32                    24.586
6                 1995   27.75                    27.5966
7                 1996   29.06                    30.6072
8                 1997   36.02                    33.6178
9                 1998   40.61                    36.6284
10                1999   35.02                    39.639
11                2000   49.56                    42.6496
12                2001   48.68                    45.6602
13                2002   42.22                    48.6708
17                2006   -                        60.7132
34                2023   -                        111.8934
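The trend line itself can be recomputed from the actual prices in Table 6 with an ordinary least squares fit. The sketch below is a plain-Python stand-in for Excel's trendline, not the team's workflow. Because the tabulated prices are rounded to two decimals, the fitted values may differ from the published equation in the last digit.

```python
# Year-end share prices from Table 6; reference counts 1..13 map to 1990..2002.
actual_prices = [12.91, 16.83, 20.61, 20.30, 18.32, 27.75, 29.06,
                 36.02, 40.61, 35.02, 49.56, 48.68, 42.22]
reference_counts = list(range(1, len(actual_prices) + 1))

def fit_line(xs, ys):
    """Ordinary least squares fit y = slope * x + intercept, plus R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

slope, intercept, r_squared = fit_line(reference_counts, actual_prices)
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r_squared:.4f}")
```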
Figure 3
Line chart with trend line showing share prices of Pepsi Co., Inc.
[Line chart: selling price for a share of PepsiCo, Inc., at year end, 1990 to 2002, with data labels and a linear trend line. x-axis: Year; y-axis: Price. Trend line: y = 3.0106x + 9.533, R² = 0.8964.]
References
Chandrakanth, U. K. (2023, July 30). Zomato Restaurants Dataset. Kaggle. https://www.kaggle.com/datasets/bharathdevanaboina/zomato-restaurants-dataset
Peer Review
Name                        Grade   Activity
Mihir Ishwarbhai Solanki    100     - Responsible for conducting correlation and regression analysis (Part A & B) and auditing.
                                    - The detailed write-up and APA.
Grishma Macwan              100     - Responsible for conducting correlation and regression analysis.
Nneka Jessica Madukwe       100     - Responsible for conducting correlation and regression analysis.
Rushvic Padala              100     - Responsible for conducting correlation and regression analysis.
Jenish Dholariya            100     - Responsible for conducting correlation and regression analysis (Part A & B).
                                    - Auditing
Honey Patel                 100     - Responsible for conducting correlation and regression analysis.