Team Project Phase-3 (Team No. 6)
Statistical data analysis of Zomato Inc. using Correlation and Regression

Mihir Ishwarbhai Solanki (2241428)
Grishma Manoj Macwan (2304814)
Nneka Jessica Madukwe (2238287)
Rushvic Padala (2308518)
Jenish Dholariya (2300967)
Honey Patel (2232107)

University Canada West
MBAF 502 Quantitative Reasoning and Analysis
Maira Jimenez Sanchez
September, 2023

Index

PART-A
01 Introduction and overview of the study
02 Correlation analysis for dining service rating and doorstep delivery service rating
03 Regression analysis between prices and dining service rating for all restaurants and forecasting
PART-B
04 Share prices analysis of PepsiCo, Inc., using correlation and predicting future prices based on linear equation
05 References
06 Peer review

PART A

Introduction

Zomato Inc. is a multinational food delivery company that runs Cloud Kitchen and partners with thousands of restaurants to facilitate food delivery. This section presents a statistical analysis (correlation and regression) of data collected from six restaurants in Hyderabad. The same food is served in-house and delivered to the customer's doorstep. As a result, customer ratings for dining service and delivery service (on a scale of 0 to 5, where 0 means the food is not good and 5 means excellent food quality) should be correlated. The first part examines this correlation. The second part discusses a causal analysis of food prices and dining service ratings. Higher-priced food is perceived to offer better quality and a better atmosphere, so customers are expected to rate it higher, and vice versa. Based on this assumption, the team performed a regression analysis with food price as the independent variable and rating as the dependent variable. The detailed analysis is summarized in the second part.

1. Correlation analysis

In the restaurant business, food quality is a key differentiator. People nowadays often prefer takeaway to dining in a restaurant or café. A restaurant prepares the same food whether it is served on the premises or delivered to the customer's doorstep. Hence, the team expected a significant correlation between the dining service and doorstep delivery ratings. The first underlying assumption is that the same food quality is delivered to the customer's doorstep. The second assumption is that, from the customer's point of view, the most essential element of feedback is the overall quality of the food.

The team selected six restaurants for the preliminary study and collected data (Chandrakanth, 2023). Table 1 lists the sample size (number of dishes) for which dining and doorstep delivery ratings were taken.

Table 1 Sample size collected from all six restaurants.

| Sr. No. | Name of the restaurant/café | Sample size (No. of dishes) |
|---|---|---|
| 1 | Doner King | 45 |
| 2 | Taco Bell | 96 |
| 3 | Brown Bear | 34 |
| 4 | Crystal Restaurant & Bar | 231 |
| 5 | Shah Ghouse Special Shawarma | 96 |
| 6 | Siddique Kabab Centre | 46 |

The team analyzed correlation in MS Excel using the CORREL and PEARSON functions. Both functions return the same value. Table 2 shows the results. All tests are performed at a 95% confidence level (significance level of 0.05), which is pretty standard for the restaurant business.

Table 2 Correlation coefficients and significance level for dining and delivery ratings.

| Sr. No. | Restaurant / Café | CORREL | PEARSON | Strength and direction | Significance F |
|---|---|---|---|---|---|
| 1 | Doner King | 0.8813 | 0.8813 | High Positive | 6.36E-16 |
| 2 | Taco Bell | 0.8372 | 0.8372 | High Positive | 1.20E-26 |
| 3 | Brown Bear | 0.9122 | 0.9122 | Very High Positive | 2.44E-14 |
| 4 | Crystal Restaurant & Bar | 0.5905 | 0.5905 | Moderate Positive | 3.38E-23 |
| 5 | Shah Ghouse Special Shawarma | 0.8558 | 0.8558 | High Positive | 6.00E-29 |
| 6 | Siddique Kabab Centre | 0.6299 | 0.6299 | Moderate Positive | 2.09E-06 |

The correlation coefficients range from 0.59 to 0.91. The correlation is very high and positive for Brown Bear, high and positive for Doner King, Taco Bell, and Shah Ghouse Special Shawarma, and moderate and positive for Siddique Kabab Centre and Crystal Restaurant & Bar. The significance F value is below the significance level (0.05) for every restaurant, indicating a statistically significant association between the dining and delivery ratings. The underlying assumptions therefore hold true in this case:
1) The most important element for feedback is food quality.
2) The same food quality is served in-house and at doorstep delivery.

Brown Bear is a luxurious café. It prioritizes all the elements of a superior in-house experience, such as food quality and ambiance. It provides premium packaging in insulated boxes so that the food does not lose its charm and texture. As a result, the team was not surprised to see a very high positive correlation between the two types of service.

On the flip side, Siddique Kabab Centre is a convenience restaurant that does business on the principle of economic pricing. It has a moderate correlation. One reason may be the packaging. It is also located in the old town, which is usually crowded, resulting in longer delivery times. Together, these factors contribute to differences between the dining and delivery ratings of the same food and hence to the moderate correlation. Shah Ghouse Special Shawarma is also an economy restaurant located in a congested area. However, it has a high positive correlation.
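As a minimal sketch of the calculation behind Table 2, the Pearson coefficient that Excel's CORREL and PEARSON functions return can be computed by hand, together with the t statistic commonly used to test whether a correlation differs from zero. The rating lists below are hypothetical placeholders, not values from the dataset, and the function names are chosen only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n paired observations."""
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical dining and delivery ratings for six dishes (illustration only).
dining = [4.1, 3.8, 4.5, 3.2, 4.0, 3.6]
delivery = [4.0, 3.5, 4.4, 3.0, 4.1, 3.3]

r = pearson_r(dining, delivery)
print(f"r = {r:.4f}, t = {t_statistic(r, len(dining)):.2f}")

# Using a value reported in Table 2: Doner King, r = 0.8813 with 45 dishes.
print(f"Doner King t = {t_statistic(0.8813, 45):.2f}")
```

For a fixed r, the t statistic grows with the sample size, which is one way to see why a restaurant with many dishes can show a very small significance value even when its correlation is only moderate.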
Doner King and Taco Bell are modern-day cafés that serve snacks and fast food for dates, meetings, or leisure time. Both cafés are popular in the city and consistently deliver the same taste and quality, even for home delivery. As a result, both show a high positive correlation between the ratings.

2. Regression Analysis

The team believed that dishes with higher price tags offer significant value to consumers in terms of taste and flavour, aroma, texture, appearance and presentation, and food safety. The team therefore hypothesized that higher-priced dishes serve consumers better and that customers rate them higher when giving feedback. Accordingly, the team expected food item price (independent variable) and rating (dependent variable) to have a significant causal relationship. For this purpose, the team collected food item prices and ratings from the six restaurants and cafés selected earlier. The sample size for the regression analysis is the same as for the correlation analysis (Table 1).

The team first constructed scatter plots to check whether the relationship between prices and ratings is positive, negative, or random. The scatter plots showed a certain degree of positive correlation between the two variables. Figure 1 shows the scatter plots for all six restaurants, along with the trend line and R² value.

Figure 1 Scatter plots showing the correlation between prices and ratings for all six restaurants.
[Figure 1 panels: dining rating of food items (0 to 5) plotted against prices of food items (₹), one scatter plot per restaurant, each with a linear trend line]
Doner King: y = 0.0156x - 0.2278, R² = 0.5381
Taco Bell: y = 0.0093x + 0.4256, R² = 0.7041
Brown Bear: y = 0.0032x + 1.1647, R² = 0.6036
Crystal Restaurant & Bar: y = 0.0035x + 2.1682, R² = 0.4581
Shah Ghouse Special Shawarma: y = 0.0169x - 0.7092, R² = 0.7484
Siddique Kabab Centre: y = 0.0069x + 1.5653, R² = 0.747

To understand the causal relationship better, a linear regression equation and the R² value were added to each chart from the formatting toolbox; both are shown in the figure. The slope is highest (0.0169) for Shah Ghouse Special Shawarma, indicating that a price increase of ₹100 is associated with a predicted rating increase of 1.69 points. The slope is lowest for Brown Bear (0.0032), indicating that ratings (customer satisfaction) change slowly with price there. Interestingly, high-priced food items are consistently rated in the range of 4 to 5 (visible at the top right of every chart in Figure 1), indicating that the price justifies customers' expectations.

Furthermore, to check whether the observed correlation is statistically significant, the team performed a regression analysis on all the samples collected from all the restaurants. The team adopted a 95% confidence level for the study, as it is a reasonable choice for the restaurant business. The regression analysis was performed using the MS Excel Analysis ToolPak, and the results for R², Multiple R (correlation coefficient), and significance F are reported in Table 3.

Table 3 Regression analysis statistics (R², Multiple R, and significance level) for all the restaurants under study.

| Sr. No. | Name of the restaurant or café | R² | Multiple R | Correlation level | Significance F |
|---|---|---|---|---|---|
| 1 | Doner King | 0.5380 | 0.7335 | High Positive | 6.678E-09 |
| 2 | Taco Bell | 0.7040 | 0.8391 | High Positive | 7.341E-27 |
| 3 | Brown Bear | 0.6036 | 0.7769 | High Positive | 4.073E-08 |
| 4 | Crystal Restaurant & Bar | 0.4581 | 0.6768 | Moderate Positive | 1.923E-32 |
| 5 | Shah Ghouse Special Shawarma | 0.7483 | 0.8650 | High Positive | 3.235E-30 |
| 6 | Siddique Kabab Centre | 0.7470 | 0.8643 | High Positive | 5.041E-15 |

Regression Analysis

Table 3 shows the regression analysis results.
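The statistics in Table 3 can be cross-checked with a short sketch. For a regression with a single predictor, R² equals the square of Multiple R, and the slope multiplied by 100 gives the predicted rating change per ₹100. The price and rating lists below are hypothetical and the function name is illustrative; only the R² and Multiple R pairs are taken from Table 3.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept; also returns R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical menu prices (in rupees) and dining ratings, for illustration only.
prices = [120, 150, 180, 220, 260, 300]
ratings = [2.0, 2.4, 3.1, 3.3, 4.2, 4.5]

slope, intercept, r2 = fit_line(prices, ratings)
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r2:.4f}")
print(f"Multiple R = {sqrt(r2):.4f}")
print(f"Predicted rating change per 100 rupees = {slope * 100:.2f}")

# (R squared, Multiple R) pairs as reported in Table 3.
table3 = {
    "Doner King": (0.5380, 0.7335),
    "Taco Bell": (0.7040, 0.8391),
    "Brown Bear": (0.6036, 0.7769),
    "Crystal Restaurant & Bar": (0.4581, 0.6768),
    "Shah Ghouse Special Shawarma": (0.7483, 0.8650),
    "Siddique Kabab Centre": (0.7470, 0.8643),
}

# Largest difference between reported R squared and Multiple R squared.
max_gap = max(abs(r_sq - mult_r ** 2) for r_sq, mult_r in table3.values())
print(f"Largest gap between R^2 and (Multiple R)^2: {max_gap:.5f}")
```

A small gap here only confirms that the two columns are internally consistent; it says nothing about whether a straight line is the right model for the data.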
Price explains between 46% and 75% of the variation in customer ratings (R² of 0.46 to 0.75). This indicates that customers who pay higher prices expect better food quality, taste, hygiene, texture, and service, and that they rate the food higher when those expectations are met, and vice versa. For Crystal Restaurant & Bar, the R² value is 0.46, the lowest among all the restaurants. It is worth noting that this restaurant offers the widest menu (231 items). For Shah Ghouse Special Shawarma and Siddique Kabab Centre, price explains about 75% of the variation in customer ratings. This value is 60% for Brown Bear and 70% for Taco Bell. For Doner King, price explains 54% of the variation in ratings. Based on the R² values, the team concludes that a positive causal relationship exists between prices and ratings, indicating that costly items deliver functional as well as pleasure value.

Significance level and correlation coefficient

The significance F value for all the restaurants and cafés is far below 0.05 (the significance level), which means the regression between food prices and ratings is statistically significant. The Pearson correlation coefficient (Multiple R) shows a moderate positive correlation for Crystal Restaurant & Bar and a strong positive correlation for all other restaurants, indicating that all the restaurants justify the prices of their food items.

Discussion on regression analysis

Further, the team conducted a survey to learn more about customers' perspectives on feedback and found that food quality is a significant component on the basis of which customers give feedback.
However, customer feedback is an overall metric that also includes factors such as ambiance, staff behaviour, timely service, parking and valet facility, menu variety, location and convenience, hygiene and safety, waiting time, availability of space for hosting personal events, and a few others.

Prediction of optimum price-rating combination (Forecasting)

Based on the derived linear regression equations, the team tried to understand the degree of relationship between food prices and ratings. The team modelled the relationship to predict the price range in which these restaurants can offer the best price-service combination, one that satisfies customers' spoken and unspoken needs. Table 4 lists the modelled linear equation relating prices to customer feedback for each restaurant. Table 5 shows the predicted rating as the price changes, which is shown graphically in Figure 2.

Table 4 Modelled linear equation for predicting optimum prices of food items.

| Restaurant / Café | Modelled linear equation |
|---|---|
| Doner King | y = 0.0156x - 0.2278 |
| Taco Bell | y = 0.0093x + 0.4256 |
| Brown Bear | y = 0.0032x + 1.1647 |
| Crystal Restaurant & Bar | y = 0.0035x + 2.1682 |
| Shah Ghouse Special Shawarma | y = 0.0169x - 0.7092 |
| Siddique Kabab Centre | y = 0.0069x + 1.5653 |

Note: y = customer feedback based on a value derived from food and its prices, x = food prices (in ₹).

To increase customer feedback ratings, it is not feasible for restaurants to keep prices high and the same for all food products; that is not practical in real life. The forecasting model is only a guide that helps identify the range in which restaurants should price their hero products (highest-selling food items).

Furthermore, all restaurants try to attract different market segments; besides food, they provide many other services.
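The equations in Table 4 can be evaluated directly to produce a price-to-rating forecast. The sketch below is illustrative only: the function names are hypothetical, and limiting predictions to the 0 to 5 rating scale is an assumption inferred from the fact that the ratings are defined on that scale.

```python
# (slope, intercept) pairs copied from Table 4; x is the price in rupees.
EQUATIONS = {
    "Doner King": (0.0156, -0.2278),
    "Taco Bell": (0.0093, 0.4256),
    "Brown Bear": (0.0032, 1.1647),
    "Crystal Restaurant & Bar": (0.0035, 2.1682),
    "Shah Ghouse Special Shawarma": (0.0169, -0.7092),
    "Siddique Kabab Centre": (0.0069, 1.5653),
}


def predict_rating(restaurant, price):
    """Predicted rating for a price, limited to the 0-5 scale (assumption)."""
    slope, intercept = EQUATIONS[restaurant]
    return min(5.0, max(0.0, slope * price + intercept))


def price_for_rating(restaurant, target_rating):
    """Price at which the fitted line reaches the target rating."""
    slope, intercept = EQUATIONS[restaurant]
    return (target_rating - intercept) / slope


for name in EQUATIONS:
    row = [round(predict_rating(name, p), 1) for p in (100, 300, 500, 1000)]
    print(f"{name}: {row}, reaches 5.0 near ₹{price_for_rating(name, 5.0):.0f}")
```

Because Table 4 lists rounded coefficients, values computed this way may differ in the last digit from figures obtained with the unrounded Excel trend lines.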
Some restaurants (such as Shah Ghouse Special Shawarma) only provide functional value at a reasonable price, while others (such as Brown Bear) provide a niche menu and a premium experience and are located in the downtown area. So, the selling price includes the cost of both food and services. The model suggests that convenience restaurants such as Shah Ghouse Special Shawarma and Doner King should price their highest-selling products in the range of ₹300 to ₹400. This selling price will provide value to their customers and maximize revenue. Similarly, premium restaurants such as Crystal Restaurant & Bar and Brown Bear should try to maintain prices in the range of ₹750 to ₹1000. If items are sold at this price, customers will see them as value for money. The ideal price range is ₹400 to ₹500 for Taco Bell and Siddique Kabab Centre.

Table 5 Predicted customer feedback in accordance with prices of items.

| Restaurant name | ₹100 | ₹150 | ₹200 | ₹250 | ₹300 | ₹350 | ₹400 | ₹500 | ₹750 | ₹1000 | ₹1500 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Doner King | 1.3 | 2.1 | 2.9 | 3.6 | 4.4 | 5.0 | 5.0 | 5.0 | 5.0 | 5.0 | 5.0 |
| Taco Bell | 1.4 | 1.8 | 2.3 | 2.8 | 3.2 | 3.7 | 4.1 | 5.0 | 5.0 | 5.0 | 5.0 |
| Brown Bear | 1.5 | 1.6 | 1.8 | 1.9 | 2.1 | 2.2 | 2.4 | 2.7 | 3.5 | 4.3 | 5.0 |
| Crystal Restaurant & Bar | 2.5 | 2.7 | 2.9 | 3.0 | 3.2 | 3.4 | 3.6 | 3.9 | 4.8 | 5.0 | 5.0 |
| Shah Ghouse Special Shawarma | 1.0 | 1.8 | 2.7 | 3.5 | 4.4 | 5.0 | 5.0 | 5.0 | 5.0 | 5.0 | 5.0 |
| Siddique Kabab Centre | 2.2 | 2.6 | 2.9 | 3.3 | 3.6 | 3.9 | 4.3 | 5.0 | 5.0 | 5.0 | 5.0 |

Figure 2 Customer feedback prediction based on food item prices.

[Figure 2: line chart of predicted customer feedback (scale of 0 to 5) against prices of food items (₹) for all six restaurants]

PART - B

The problem statement provides data on PepsiCo, Inc.'s share price from 1990 to 2002. The share prices are reported at the end of each year.
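The trend line for this data set can be reproduced with an ordinary least-squares fit. The sketch below is an illustration rather than the team's Excel workflow: it assumes the year-end prices listed in Table 6 and a reference count of 1 for 1990, and the helper names are hypothetical.

```python
# Year-end share prices for 1990 to 2002, as listed in Table 6.
prices = [12.91, 16.83, 20.61, 20.30, 18.32, 27.75, 29.06,
          36.02, 40.61, 35.02, 49.56, 48.68, 42.22]
counts = list(range(1, len(prices) + 1))  # reference count: 1990 -> 1


def fit_trend(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reference_count(year, first_year=1990):
    """Map a calendar year to its reference count (assumes 1990 -> 1)."""
    return year - first_year + 1


slope, intercept = fit_trend(counts, prices)


def predict_price(year):
    """Predicted year-end share price from the fitted trend line."""
    return intercept + slope * reference_count(year)


print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
for year in (2006, 2023):
    print(year, round(predict_price(year), 2))
```

The fitted coefficients should land close to the trend-line values reported in this part, and the same function can be evaluated for any other year by changing the argument.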
Table 6 summarizes the given data and the share prices predicted by the forecast. The team constructed a line chart (Figure 3) in MS Excel based on the given data. The line chart clearly shows a positive trend. A trendline and its linear equation were then added. The fitted linear equation is:

Predicted share price at the end of given year = 9.533 + 3.0106 × (Reference count of that year)

Based on the linear equation, the predicted share prices at the end of 2006 and 2023 are $60.71 and $111.89, respectively.

Table 6 Share prices of PepsiCo, Inc., at the end of each year

| Reference count | Year | Actual share price | Predicted share price |
|---|---|---|---|
| 1 | 1990 | 12.91 | 12.5436 |
| 2 | 1991 | 16.83 | 15.5542 |
| 3 | 1992 | 20.61 | 18.5648 |
| 4 | 1993 | 20.30 | 21.5754 |
| 5 | 1994 | 18.32 | 24.586 |
| 6 | 1995 | 27.75 | 27.5966 |
| 7 | 1996 | 29.06 | 30.6072 |
| 8 | 1997 | 36.02 | 33.6178 |
| 9 | 1998 | 40.61 | 36.6284 |
| 10 | 1999 | 35.02 | 39.639 |
| 11 | 2000 | 49.56 | 42.6496 |
| 12 | 2001 | 48.68 | 45.6602 |
| 13 | 2002 | 42.22 | 48.6708 |
| 17 | 2006 | - | 60.7132 |
| 34 | 2023 | - | 111.8934 |

Figure 3 Line chart with trend line showing the year-end selling price of a share of PepsiCo, Inc.

[Figure 3: line chart of year-end share price against year (1990 to 2002) with linear trend line y = 3.0106x + 9.533, R² = 0.8964]

References

Chandrakanth, U. K. (2023, July 30). Zomato Restaurants Dataset. Kaggle. https://www.kaggle.com/datasets/bharathdevanaboina/zomato-restaurants-dataset

Peer Review

| Name | Grade |
|---|---|
| Mihir Ishwarbhai Solanki | 100 |
| Grishma Macwan | 100 |
| Nneka Jessica Madukwe | 100 |
| Rushvic Padala | 100 |
| Jenish Dholariya | 100 |
| Honey Patel | 100 |

Activity
- Responsible for conducting correlation and regression analysis (Part A & B) and auditing.
- The detailed write-up and APA.
- Responsible for conducting correlation and regression analysis.
- Responsible for conducting correlation and regression analysis.
- Responsible for conducting correlation and regression analysis.
- Responsible for conducting correlation and regression analysis (Part A & B).
- Auditing
- Responsible for conducting correlation and regression analysis.