HERNANI BINTI MARIKAN 2022741801 QPSP4

FACULTY OF BUSINESS AND MANAGEMENT
DEGREE IN OFFICE SYSTEM MANAGEMENT (BA232)
MGT 555 BUSINESS ANALYTIC
INDIVIDUAL ASSIGNMENT
MINI PROJECT 2: DATA SET 2 SHOPEE WEBSITE VISITOR

PREPARED FOR: HJH NURNAZIRAH BT JAMADIN
PREPARED BY: HERNANI BINTI MARIKAN (2022741801)
SUBMISSION DATE: 30 JANUARY 2025

I would like to express my gratitude to Allah SWT, the Almighty God, for the blessing, kindness, and inspiration that enabled me to accomplish this assignment. Without Him, I could not have stayed patient and in control while producing this assignment from the first page to the last. Second, Shalawat and Salam are always dedicated to our beloved prophet Muhammad SAW, our Muallim, the last prophet, who brought us from the darkness to the brightness.

I realize that I could not have completed this assignment without the aid of others. It would be impossible to mention everyone who helped me, but I wish to give my sincerest gratitude and appreciation to all of them. My deepest appreciation goes to Madam Hjh Nurnazirah bt Jamadin, whose guidance and expertise were invaluable throughout this journey. Her constant support and insightful feedback shaped the trajectory of this project in countless ways, and her expert advice was crucial in navigating its more challenging aspects.

I would also like to acknowledge the support of my family and friends, who provided the resources and environment needed to conduct this project and greatly facilitated my work. To all other friends and colleagues who contributed in various capacities: your support did not go unnoticed.
Whether it was administrative assistance, technical support, or simply words of encouragement, it all played a significant role in my journey. Finally, I express my gratitude for being entrusted with this project. Your faith in my abilities fuelled my motivation and ambition to deliver my best. This project was not just a professional milestone but also a learning experience, and the lessons I have learned will be invaluable for my future endeavours. Thank you all for your contributions, big and small, which culminated in the successful completion of this assignment.

HERNANI BINTI MARIKAN (2022741801)

MGT555 MINI PROJECT 2
DATA SET 2 SHOPEE WEBSITE VISITOR

Shopee is the leading e-commerce online shopping platform in Southeast Asia and Taiwan. It provides customers with an easy, secure and fast online shopping experience through strong payment and logistical support. Shopee has a wide selection of product categories ranging from consumer electronics to home & living, health & beauty, baby & toys, fashion, and fitness equipment. As of 2021, Shopee is considered the largest e-commerce platform in Southeast Asia with 343 million monthly visitors. Given this data, answer all questions.

QUESTION 1
Write hypothesis statement, compute ANOVA, and prepare full report for each of the following sub-questions.

a. Is there any difference in the number of page loads among Shopee website visitors on Friday, Saturday and Sunday?

Hypothesis Statement:
(i) There is no statistically significant difference in the number of page loads between Friday, Saturday and Sunday.
(ii) There is a statistically significant difference in the number of page loads between Friday, Saturday and Sunday.

RESULT
F-Test       624.8046     (F > F-Critical)
F-Critical   2.999874
P Value      0.00000052   (< 0.05)
Decision     REJECT H0

The total number of observations reported in the table is 2,172, with 724 observations each for Friday, Saturday and Sunday.
As can be seen in the ANOVA table, the F value is 624.8046 and the F-Critical value is 2.999874, so F > F-Critical. The P value is 0.0000052, which is less than 0.05. Therefore, the decision for this data analysis is to reject the null hypothesis. There is a statistically significant difference in the number of page loads between Friday, Saturday and Sunday.

b. Is there any difference in customer unique visits on the Shopee website on Friday, Saturday and Sunday?

Hypothesis Statement:
(i) There is no statistically significant difference in customer unique visits between Friday, Saturday and Sunday.
(ii) There is a statistically significant difference in customer unique visits between Friday, Saturday and Sunday.

The total number of observations reported in the table is 2,172, with 724 observations each for Friday, Saturday and Sunday. As can be seen in the ANOVA table, the F value is 581.8818 and the F-Critical value is 2.999874, so F > F-Critical. The P value is 0.0000049, which is less than 0.05. Therefore, the decision for this data analysis is to reject the null hypothesis. There is a statistically significant difference in customer unique visits between Friday, Saturday and Sunday.

c. Is there any difference in the number of Shopee's first-time website visitors from Monday to Sunday?

Hypothesis Statement:
(i) There is no statistically significant difference in the number of Shopee's first-time website visitors from Monday to Sunday.
(ii) There is a statistically significant difference in the number of Shopee's first-time website visitors from Monday to Sunday.

The total number of observations reported in the table is 5,068, with 724 observations for each day from Monday to Sunday. As can be seen in the ANOVA table, the F value is 980.3771 and the F-Critical value is 2.10038, so F > F-Critical. The P value is reported as 0, which is less than 0.05. Therefore, the decision for this data analysis is to reject the null hypothesis.
There is a statistically significant difference in the number of Shopee's first-time website visitors from Monday to Sunday.

QUESTION 2
Write hypothesis statement, compute t test and prepare full report.

a. Is there any difference in the number of page loads among Shopee website visitors on Friday and Saturday?

Hypothesis Statement:
(i) There is no significant difference in page loads among Shopee website visitors on Friday and Saturday.
(ii) There is a significant difference in page loads among Shopee website visitors on Friday and Saturday.

The t-statistic is positive (37.79) and the P value is 0.00000082. Because the P value is less than 0.05, the null hypothesis is rejected and the result is significant. Therefore, there is a significant difference in page loads among Shopee website visitors on Friday and Saturday.

b. Is there any difference in customer unique visits on the Shopee website on Tuesday and Sunday?

Hypothesis Statement:
(i) There is no significant difference in customer unique visits on the Shopee website on Tuesday and Sunday.
(ii) There is a significant difference in customer unique visits on the Shopee website on Tuesday and Sunday.

The t-statistic is positive (37.22) and the P value is 0.000000839. Because the P value is less than 0.05, the null hypothesis is rejected and the result is significant. Therefore, there is a significant difference in customer unique visits on the Shopee website on Tuesday and Sunday.

QUESTION 3
Write hypothesis statement, compute F test and prepare full report.

Is there any difference in the number of Shopee's first-time website visitors between Monday and Saturday?
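One way to compute the F statistic for this question is sketched below, assuming the F test compares the sample variances of the two days. The function name `variance_ratio` and the toy visitor counts are hypothetical and are not taken from the Shopee data set; the critical value would still need to come from an F table or statistical software.

```python
from statistics import variance


def variance_ratio(sample_a, sample_b):
    """Return (F, df_a, df_b) where F is the ratio of the two sample variances."""
    var_a = variance(sample_a)  # sample variance (n - 1 in the denominator)
    var_b = variance(sample_b)
    return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1


# Hypothetical first-time visitor counts, for illustration only
monday = [2, 4, 6, 8]
saturday = [1, 2, 3, 4]

f_value, df_monday, df_saturday = variance_ratio(monday, saturday)
print(f"F = {f_value:.4f}, df = ({df_monday}, {df_saturday})")
```

An F value close to 1 means the two samples have nearly equal variances, which is the situation in which the null hypothesis is not rejected.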
Hypothesis Statement:
(i) There is no significant difference in the number of Shopee's first-time website visitors between Monday and Saturday.
(ii) There is a significant difference in the number of Shopee's first-time website visitors between Monday and Saturday.

The F value is 1, the F-Critical value is 0.885 and the P value is 0.5. Because the P value of 0.5 is greater than 0.05, we fail to reject the null hypothesis and the result is not significant. Therefore, there is no significant difference in the number of Shopee's first-time website visitors between Monday and Saturday.

QUESTION 4
Select data from row 2 until row 29, prepare 4 weeks and overall trendline analysis chart. Which day shows highest number of visitors, and which day shows the lowest number of visitors? Write a full report.

WEEK 1

Data points (x,y) & number of observation
The report analyses date and visitor-count data consisting of 7 samples, with data points (x,y) and the trendline chart shown above.

Data Trend
The data trend is downward, indicating that the number of visitors decreased over the days of Week 1.

Linear Equation
The linear equation for this data is Y = a + bx, with an intercept of 3023.7 and a slope of -228.25.

Coefficient of Determination (R Square)
The coefficient of determination (R-square) takes a value between 0 and 1. In this case the R-square value is 0.7236, indicating that 72.36% of the variation in the number of visitors is explained by the linear equation.

Conclusion
In conclusion, the data analysis shows that the number of visitors decreased as the week went on.

WEEK 2

Data points (x,y) & number of observation
The report analyses date and visitor-count data consisting of 7 samples, with data points (x,y) and the trendline chart shown above.

Data Trend
The data trend is downward, indicating that the number of visitors decreased over the days of Week 2.
Linear Equation
The linear equation for this data is Y = a + bx, with an intercept of 3528.9 and a slope of -265.

Coefficient of Determination (R Square)
The coefficient of determination (R-square) takes a value between 0 and 1. In this case the R-square value is 0.5468, indicating that 54.68% of the variation in the number of visitors is explained by the linear equation.

Conclusion
In conclusion, the data analysis shows that the number of visitors decreased as the week went on.

WEEK 3

Data points (x,y) & number of observation
The report analyses date and visitor-count data consisting of 7 samples, with data points (x,y) and the trendline chart shown above.

Data Trend
The data trend is downward, indicating that the number of visitors decreased over the days of Week 3.

Linear Equation
The linear equation for this data is Y = a + bx, with an intercept of 3359 and a slope of -243.71.

Coefficient of Determination (R Square)
The coefficient of determination (R-square) takes a value between 0 and 1. In this case the R-square value is 0.7961, indicating that 79.61% of the variation in the number of visitors is explained by the linear equation.

Conclusion
In conclusion, the data analysis shows that the number of visitors decreased as the week went on.

WEEK 4

Data points (x,y) & number of observation
The report analyses date and visitor-count data consisting of 7 samples, with data points (x,y) and the trendline chart shown above.

Data Trend
The data trend is downward, indicating that the number of visitors decreased over the days of Week 4.

Linear Equation
The linear equation for this data is Y = a + bx, with an intercept of 3638.6 and a slope of -261.43.

Coefficient of Determination (R Square)
The coefficient of determination (R-square) takes a value between 0 and 1.
In this case the R-square value is 0.7623, indicating that 76.23% of the variation in the number of visitors is explained by the linear equation.

Conclusion
In conclusion, the data analysis shows that the number of visitors decreased as the week went on.

OVERALL ANALYSIS

1. Looking at Trends Over 4 Weeks
The data gathered over four weeks shows that the number of visitors decreased steadily within each week. The trendline equations for each week are:
Week 1: Y = 3023.7 - 228.25X, R² = 0.7236
Week 2: Y = 3528.9 - 265X, R² = 0.5468
Week 3: Y = 3359 - 243.71X, R² = 0.7961
Week 4: Y = 3638.6 - 261.43X, R² = 0.7623

The trendline from the overall scatterplot gives the linear regression equation:
Y = 2346.2 + 2.9595x
The intercept (2346.2) is the predicted number of visitors at the beginning of the observation period (Day 0). This figure sets the baseline for predicting visitor numbers. The slope (2.9595) indicates a small rise in the number of visitors for each additional day in the data. The slope is positive, which does not match the falling trend seen within each week. This implies that a single linear model may not fully represent how the data behaves.

2. Trend analysis based on chart
The scatterplot shows that the number of visitors goes up and down on certain days. Even with these changes, the data from the 27 reports shows that visitor numbers are going down by Week 2. The linear trendline, however, shows a slight upward trajectory, which contradicts the downward trend visible in the scatterplot.

3. Coefficient of Determination (R²)
The R² value for the overall trendline is 0.0014 (0.14%), which is very low. This means that the trendline explains just 0.14% of the variation in visitor numbers.

QUESTION 5
Compute correlation analysis based on the data. Identify independent variables (X) and dependent variable (Y).
Write a full report. Your result should be supported by correlation interpretation table.

1. The table above shows a correlation analysis between page loads, unique visits, first-time visits and returning visits. The total number of observations analysed in this data is 5,068 samples.

2. The significance level set for this data is P < 0.05.

3. Based on the correlation interpretation table, the correlation coefficients show the following:
- There is a very strong correlation between unique visits and page loads (r = 0.988106981).
- There is a very strong correlation between first-time visits and page loads (r = 0.986961747).
- There is a very strong correlation between returning visits and page loads (r = 0.830574305).
- There is a very strong correlation between first-time visits and unique visits (r = 0.99302759).
- There is a very strong correlation between returning visits and unique visits (r = 0.865291462).
- There is a very strong correlation between returning visits and first-time visits (r = 0.80016761).
These interpretations follow a commonly used guideline for interpreting correlation coefficients.

QUESTION 6
Prepare a regression analysis. Write a complete report.

REGRESSION STATISTIC

1. Number of observations
The number of samples analysed in this model is 5,068 observations.

2. Multiple R (r)
The correlation coefficient for these samples is r = 1.

3. R Square (Coefficient of Determination)
The R Square (coefficient of determination) of this model is reported as 1. R Square is the proportion of the variability in the dependent variable that is explained by the model, so a value of 1 means that 100% of the data fit the line. In other words, 100% of the variation in returning visits can be explained by page loads, unique visits and first-time visits, which shows a strong relationship between these variables.

4. Standard error
The standard error of estimate of this model is 0.000000439.

ANOVA
The ANOVA is used to test the entire model. The significance value (P value) is represented by Significance F. The significance value for this model is p = 0, which is less than 0.05. This means the whole model is significant.

COEFFICIENT AND WRITE EQUATION

Slope: The slope for first-time visits is less than 0 (negative), with a coefficient of -1. The slope for unique visits is greater than 0 (positive), with a coefficient of 1. Lastly, the slope for page loads is greater than 0 (positive), with a coefficient of 0.000000444.

P Value: The P value for both first-time visits and unique visits is 0, which is significant. The P value for page loads is 0.000000291, which is also significant.

Equation: The equation for this model takes the form Y = a + b1X1 + b2X2 + b3X3, where:
returning visits = -0.000000123 + 0.000000444 (page loads) + 1 (unique visits) - 1 (first-time visits)

CONCLUSION
In conclusion, the model is a strong fit, with 100% of the variation explained. The P value for unique visits is less than 0.05, which is significant, so the decision is to reject the null hypothesis. There is a strong relationship between unique visits and returning visits on the Shopee website.
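To illustrate how the fitted equation above is used, the sketch below evaluates it for one hypothetical day. The coefficients are the ones reported in this section; the function name `predict_returning_visits` and the input values (2000 page loads, 1500 unique visits, 1200 first-time visits) are made up for illustration and do not come from the data set.

```python
# Coefficients as reported in the regression output of this report
INTERCEPT = -0.000000123
COEF_PAGE_LOADS = 0.000000444
COEF_UNIQUE_VISITS = 1.0
COEF_FIRST_TIME_VISITS = -1.0


def predict_returning_visits(page_loads, unique_visits, first_time_visits):
    """Evaluate the fitted regression equation for a single observation."""
    return (
        INTERCEPT
        + COEF_PAGE_LOADS * page_loads
        + COEF_UNIQUE_VISITS * unique_visits
        + COEF_FIRST_TIME_VISITS * first_time_visits
    )


# Hypothetical inputs, for illustration only
predicted = predict_returning_visits(2000, 1500, 1200)
print(f"Predicted returning visits: {predicted:.4f}")
```

Because the intercept and the page-loads coefficient are both very close to zero, the prediction is almost exactly unique visits minus first-time visits, which is consistent with the reported R Square of 1.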
TOTAL: 100 MARKS

Appendix
[Chart: Total Sales (RM) in '000 by Month]
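As supplementary material, the F versus F-Critical decision used in Question 1 can be sketched in Python as follows. This is a minimal illustration on hypothetical toy page-load counts, not the Shopee data set. The function name `one_way_anova` and the sample values are assumptions, and the critical value is read from a standard F table rather than computed.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical page-load counts, for illustration only
friday = [10, 12, 11]
saturday = [20, 22, 21]
sunday = [30, 32, 31]

f_stat, df_b, df_w = one_way_anova([friday, saturday, sunday])

# Critical value for F(2, 6) at alpha = 0.05, taken from a standard F table
f_critical = 5.14
reject_h0 = f_stat > f_critical
print(f"F = {f_stat:.2f}, df = ({df_b}, {df_w}), reject H0: {reject_h0}")
```

With three groups of 724 observations each, as in Question 1(a), the degrees of freedom would be 2 and 2,169, which is where the F-Critical value of about 3.00 comes from.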