ASSESSMENT 2 GUIDE
COMM1190 Data, Insights and Decisions
Term 2, 2024

Assessment Details

Icon legend: Due Date, Weighting, Format, Length/Duration, Submission

Turnitin

Turnitin is an originality checking and plagiarism prevention tool that enables checking of submitted written work for improper citation or misappropriated content. Each Turnitin assignment is checked against other students' work, the Internet and key resources selected by your Course Coordinator.

If you are instructed to submit your assessment via Turnitin, you will find the link to the Turnitin submission in your Moodle course site. You can submit your assessment well before the deadline and use the Similarity Report to improve your academic writing skills before submitting your final version. You can find out more information in the Turnitin information site for students.

Late Submissions

The parameters for late submissions are outlined in the UNSW Assessment Implementation Procedure. For this course, if you submit your assessments after the due date, you will incur penalties for late submission unless you have Special Consideration (see below). The late penalty is 5% per day (including weekends), calculated from the marks allocated to that assessment (not your grade). Assessments will not be accepted more than 5 days late.

Extensions

You are expected to manage your time to meet assessment due dates. If you do require an extension to your assessment, please make a request as early as possible before the due date via the special consideration portal on myUNSW (My Student profile > Special Consideration). You can find more information on Special Consideration and the application process below. Lecturers and tutors do not have the ability to grant extensions.

Special Consideration

Special consideration is the process for assessing the impact of short-term events beyond your control (exceptional circumstances) on your performance in a specific assessment task.

What are circumstances beyond my control?
These are exceptional circumstances or situations that may:
• Prevent you from completing a course requirement,
• Keep you from attending an assessment,
• Stop you from submitting an assessment,
• Significantly affect your assessment performance.

Available here is a list of circumstances that may be beyond your control. This is only a list of examples, and your exact circumstances may not be listed.

You can find more detail and the application form on the Special Consideration site, or in the UNSW Special Consideration Application and Assessment Information for Students.

Use of AI

For this assessment, you may use AI-based software; however, please take note of the attribution requirements described below:

Coding: You may freely use generative AI to generate R code for your analysis, without attribution. You do not need to report this.

Written report: Use of generative AI in any way to produce the written (prose) portions of the report must be completely documented by providing full transcripts of the input and output from generative AI with your submission. Examples of use that must be documented (this is not an exhaustive list): editing your first draft, generating text for the report, translation from another language into English. Any output of generative AI software that is used within your written report must be attributed with full referencing.

If the outputs of generative AI software form part of your submission and are not appropriately attributed, your marker will determine whether the omission is significant. If so, you may be asked to explain your understanding of your submission. If you are unable to satisfactorily demonstrate your understanding of your submission, you may be referred to the UNSW Conduct & Integrity Office for investigation for academic misconduct and possible penalties.
AI-related resources and support:
• Ethical and Responsible Use of Artificial Intelligence at UNSW
• Referencing and acknowledging the use of artificial intelligence tools
• Guide to Using Microsoft Copilot with Commercial Data Protection for UNSW Students

UNSW Business School

Assessment 2: Customer churn project

Stage 1 – Individual Report
• Due date: Week 7, 5:00pm Wednesday 10 July 2024
• Weighting: 10%
• Format: Individual report (template provided)
• Length: 2 pages
• Submission: Via Turnitin and attached as appendices with Stage 2 submission

Stage 2 – Group Report
• Due date: Week 9, 5:00pm Wednesday 24 July 2024
• Weighting: 20%
• Format: Group report
• Length: ~ 4 pages
• Submission: Via Turnitin

Description of assessment tasks

This is a group assessment with reporting done in two stages. Students will be assigned to groups in Week 5 when this documentation is released. While reporting is done in two stages, students are encouraged to commence their collaboration within their group early in the process, before the submission of the Stage 1 individual reports.

Stage 1: Complete the 2-page individual task using a data set specific to your Assessment 2 project group. This first-stage submission contains key inputs into the group work that will result in the single group report produced in Stage 2. Students who do not submit a complete, legitimate attempt of this assessment will not be awarded marks for Stage 2. This individual task will be separately assessed together with the Stage 2 group report, and associated marks will be available together with Stage 2 marks. Because of the nature of the relationship between the Stage 1 and 2 tasks, you will not receive your Stage 1 marks before submitting Stage 2.

Stage 2: As a group, use the results from Stage 1 to produce a report for the Head of Management Services. You will use R to explore a dataset that includes the pilot data together with extra observations and variables (see attached Appendix A Data Dictionary).
The pilot data are common to all students, but the extra observations will vary across students according to their SID, as they did in Assessment 1. A group-specific data set will be determined by nominating the SID of one of the group members to generate the single data set used by all group members in both stages of the project. Details for obtaining the personalized group-specific data set will be provided on Moodle.

Your Stage 2 group mark will be common to all students in your group who have submitted a complete, legitimate attempt of Stage 1.

Note: The course content from Weeks 4, 5, and 7 will be of particular relevance to completing this Assessment.

Context of assessment tasks

The Head of Management Services of Freshland, a large grocery store chain in Australia, has made use of your updated report (from Assessment 1) to deliver a presentation to the Senior Executive Group.

→ Access this presentation via your Moodle course site.

Based on this initial analysis and recommendations, approval has been given for further analysis of customer loyalty and churn using an expanded data set. The core task will involve a comparison of predictive models and subsequent recommendations on how to use and improve these to inform future retention policies.

The analysis in the presentation to the Senior Executive Group was based on the initial pilot data set, which was used by the intern to produce the initial report and was part of the data provided to you with Assessment 1. These data have now been extended, with extra variables being added.
These extra variables are:

Variable   Description
ltmem      =1 if member ≥ 3
mamt1      Average monthly expenditure ($) in first 6 months of previous year (2023)
mamt2      Average monthly expenditure ($) in second 6 months of previous year (2023)
fr1        Frequency of monthly transactions in first 6 months of previous year; 1 (low), 2 (medium), 3 (high)
fr2        Frequency of monthly transactions in second 6 months of previous year; 1 (low), 2 (medium), 3 (high)
rind       XYZ risk index in the form of a predicted probability of customer churn

You and your team have been tasked with investigating alternative algorithms for predicting customer churn. Given the structure of the data that have been made available, you have been advised to define churning as occurring when a customer has previously had non-zero transactions for at least 6 months but then has zero transactions in the next six-month period. The outcome of interest is the binary variable churn. Given the available data, an observation for a customer will have churn = 1 if mamt1 > 0 and mamt2 = 0, and churn = 0 if mamt1 > 0 and mamt2 > 0.

The Head of Management Services has given you authority to use your expert judgment to make the necessary modelling choices but has outlined an overarching research plan for you and your group to follow:
• Currently, Management Services has a basic regression model (details below) that can be used to predict future customer expenditure for members of the rewards program. It has been suggested that this could be used to generate a risk index, with customers whose predicted expenditures are low relative to actual expenditures being deemed at high risk of no longer shopping at the store.
• However, there were suggestions that the existing model could be improved as a predictor of expenditures, and your group has been asked to evaluate a range of model extensions.
• The current focus is on predicting churn.
Based on the performance of the alternative models in predicting expenditures, choose one and analyse whether it also performs well in predicting churn.
• An analytics firm, XYZ, that uses proprietary predictive methodology has offered a trial of their products by providing a predictor of churn. Your evaluation of predictive performance should include a comparison of this predictor with that generated by your chosen regression-based predictor.
• Based on this analysis, make recommendations on using such algorithms in initiatives targeting customers at risk of churning with the aim of retaining them as loyal customers.
  o Notice that any recommendation to employ the predictors of the analytics firm would involve additional cost compared to a method produced in-house by Management Services.
  o In addition, any decision to employ the predictors of XYZ will not include documentation of the methodology used to generate the predictions.
  o It might also be that you conclude that neither predictor is adequate and that it would be appropriate to explore alternative predictors or approaches. You are not expected to explore such alternatives.

The base regression model used for prediction by Management Services for the ith customer takes the following form:

$$\mathit{mamt1}_i = \beta_0 + \beta_1 \mathit{age}_i + \beta_2 \mathit{female}_i + \beta_3 \mathit{metro}_i + \beta_4 \mathit{ltmem}_i + u_i$$

Each individual group member will use the group data common to all group members to compare the predictive performance of this base model with one of the following extended models:

A: add age squared to the base model
B: add a regional dummy variable to the base model (regional = 1 if location = 2; = 0 otherwise)
C: to the base model add age squared and replace variable ltmem with variable member
D: to the base model add age squared and regional and replace ltmem with member.

In the case of groups with fewer than four members, prioritize A and C, with B and D being optional. So, a group of 3 would choose A, C and one of B or D.
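The churn definition and the extended models A to D can be made concrete with a short sketch. The example below is written in Python purely to illustrate the logic (the assessment itself must be carried out in R). The function names `churn_label` and `regressors`, the key `age_sq`, and the sample customer record are hypothetical, and customers with mamt1 = 0 are returned as undefined because this guide does not define churn for them.

```python
from typing import Optional


def churn_label(mamt1: float, mamt2: float) -> Optional[int]:
    """Return 1 (churned), 0 (retained), or None when the guide's
    definition does not apply (no spending in the first half-year)."""
    if mamt1 > 0 and mamt2 == 0:
        return 1
    if mamt1 > 0 and mamt2 > 0:
        return 0
    return None


def regressors(row: dict, model: str = "base") -> dict:
    """Build the explanatory variables for the base model or extension A-D."""
    if model not in ("base", "A", "B", "C", "D"):
        raise ValueError(f"unknown model: {model}")
    x = {"age": row["age"], "female": row["female"], "metro": row["metro"]}
    if model in ("A", "C", "D"):
        x["age_sq"] = row["age"] ** 2  # age squared
    if model in ("B", "D"):
        x["regional"] = 1 if row["location"] == 2 else 0  # regional dummy
    if model in ("C", "D"):
        x["member"] = row["member"]  # years of membership replaces ltmem
    else:
        x["ltmem"] = row["ltmem"]  # long-term member indicator
    return x


# Hypothetical customer record, for illustration only.
customer = {"age": 40, "female": 1, "metro": 0, "location": 2,
            "member": 4, "ltmem": 1, "mamt1": 250.0, "mamt2": 0.0}

print(churn_label(customer["mamt1"], customer["mamt2"]))
print(regressors(customer, "D"))
```

Keeping the variable selection in one function makes it easier to check that each group member estimates the base model and the assigned extension on identical data.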
Approach to the assessment tasks

Stage 1 instructions

Compare the predictive performance of the base model with the indicated extension, using the pilot data for estimation (training) and the non-pilot data as the testing sample. Provide a justification of this split and document any modifications to the samples used due to missing data and/or outliers.

Complete the questions and table in the attached Appendix B Individual Report Template. All questions must be attempted, and the report submitted by the due date for a student to qualify for the Stage 2 mark received by their group.

Stage 2 instructions

The group task is to use the Stage 1 results to inform the choice of data and preferred predictive model to use in predicting churn. Recall that the idea is to associate predicted expenditures that are low relative to actual expenditures in one 6-month period with a high probability of churning in the subsequent 6-month period. The Head of Management Services has suggested classifying a customer as someone predicted to churn if their regression residual (actual minus predicted expenditure) was below a cutoff of, say, the 25th percentile of expenditure residuals in the pilot data. Ultimately, it is up to you how you proceed, but this does seem like a sensible approach to consider.

The risk index provided by the analytics firm XYZ provides an alternative predictor of churning. The Head of Management Services is especially interested in the relative predictive performance of this index as the use of this in the future would involve extra costs to the firm.

Recall that the focus of the analysis and subsequent recommendations arising from the group work is to inform management about identifying customers at risk of churning and, hence, potential retention strategies.
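The residual cutoff rule suggested by the Head of Management Services can be sketched as follows. This is a minimal Python illustration of the logic only (the assessment itself must be carried out in R). It follows the explicit rule stated above: residual = actual minus predicted, with a customer flagged when the residual falls below the 25th percentile of pilot-sample residuals. The function names, the nearest-rank percentile convention, the 0.5 threshold applied to the XYZ index, and all numbers are assumptions made for the example; none of them come from the assessment data.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile (one of several common conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def residuals(actual, predicted):
    """Residual = actual minus predicted expenditure."""
    return [a - f for a, f in zip(actual, predicted)]


def classify_by_residual(resid, cutoff):
    """Predict churn (1) when the residual is below the cutoff."""
    return [1 if r < cutoff else 0 for r in resid]


def classify_by_index(rind, threshold=0.5):
    """Predict churn (1) when the XYZ index reaches an assumed threshold."""
    return [1 if p >= threshold else 0 for p in rind]


def accuracy(predicted, observed):
    """Share of customers whose predicted class matches observed churn."""
    return sum(p == c for p, c in zip(predicted, observed)) / len(observed)


# Made-up pilot (training) sample: the cutoff comes from these residuals.
pilot_actual = [40, 60, 80, 90, 100, 110, 130, 150]
pilot_pred = [100] * 8
cutoff = percentile(residuals(pilot_actual, pilot_pred), 25)

# Made-up non-pilot (testing) sample.
test_actual = [90, 210, 40]
test_pred = [100, 200, 130]
test_rind = [0.6, 0.2, 0.7]
test_churn = [0, 0, 1]

resid_class = classify_by_residual(residuals(test_actual, test_pred), cutoff)
index_class = classify_by_index(test_rind)

print("cutoff:", cutoff)
print("residual rule accuracy:", accuracy(resid_class, test_churn))
print("XYZ index accuracy:", accuracy(index_class, test_churn))
```

Accuracy is only one possible yardstick. Because churners are usually a minority, a group may also want to look at how many actual churners each rule catches and how many loyal customers it flags by mistake.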
The Head of Management Services has outlined an overall strategy on how to proceed, but other details have been left unspecified, and it is the responsibility of you and your group to make the associated decisions. These are decisions that require judgement and will not necessarily be right or wrong. What is important is that the decisions are supported by sensible arguments.

Your report should explain your strategy, subsequent analysis and the recommendations that follow. Include a critical evaluation of the strengths and weaknesses of the alternative predictors and any potential improvements, as this will be an ongoing focus of management.

Use the Assessment 2 marking rubric for the Stage 2 group submission as a tool to check your work before submission and to ensure that you have addressed the assessment task in full.

Structure: Your Stage 2 submission should take the form of a report to the Head of Management Services. Use the initial report provided in Assessment 1 to guide you. Your Stage 2 Report should be approximately the same length and structure as that report, although the section headings will likely change, and the emphasis on graphical presentations is likely to be less. Ultimately, decisions about presentation are to be made by your group.

Your group must also submit a separate file containing the R code used to conduct your analysis and generate your visualisations. No marks will be associated with this code file, but your submission will be deemed incomplete and given a mark of zero if such a file is not included.

Overall, the assessment is designed to help you develop your skills in using R for data analysis and in communicating insights from such exercises.
Writing support

The following links will take you to resources for writing support and study skills:
• Writing Skills Support
• Academic Skills One-to-One Consultations

Submission instructions

• Submit your Stage 1 report using the Turnitin assessment submission link on Moodle.
• Submit your Stage 2 report and code file as separate documents using the Turnitin assessment submission link on Moodle. You are free to choose the structure of the code file and it is not subject to a word limit.
• Late submission will incur a penalty of 5% per day or part thereof (including weekends) from the due date and time. An assessment will not be accepted more than 5 days (120 hours) after the original deadline unless special consideration has been approved. For further information please refer to Policies and Support.
• Special consideration will only be granted in the case of serious illness, misadventure, or bereavement, which must be supported with documentary evidence. In these circumstances, students must apply for Special Consideration.

Because of the sequential nature of the assessment tasks, it is very difficult to allow extensions without impacting the academic integrity of the assessment. As such, this course does not use the short extension process that you may have seen in other courses. Moreover, if you are granted special consideration due to exceptional circumstances precluding you from completing the assessment task on time, you are likely to have your final exam reweighted rather than being granted an extension.

Appendix A: Data dictionary

The personalized data set contains information on customers from the rewards program database.
It includes the following variables:

Variable   Description
age        Age of the customer in years
female     =1 if customer is female; =0 otherwise
member     Number of years as member of loyalty club (top coded at 4)
metro      =1 if customer located in metropolitan area; =0 otherwise
location   Customer location: 1=metropolitan; 2=regional; 3=all other regions
ID         Unique customer identifier
last       Amount ($) of the last transaction in previous 12 months
cash       =1 if the last transaction paid in cash; =0 otherwise
sat        Satisfaction rating of last transaction; 1 (highest=excellent) to 5 (lowest=poor)
pilot      =1 if initial data collected for pilot study; =0 otherwise
fr1        Frequency of monthly transactions in first 6 months of previous year (2023); 1 (low), 2 (medium), 3 (high)
fr2        Frequency of monthly transactions in second 6 months of previous year (2023); 1 (low), 2 (medium), 3 (high)
ltmem      =1 if member for 3 or more years; =0 otherwise
mamt1      Average monthly expenditure ($) in first 6 months of previous year (2023)
mamt2      Average monthly expenditure ($) in second 6 months of previous year (2023)
rind       XYZ risk index in the form of a predicted probability of customer churn

Appendix B: Assessment 2 Individual Report Template

Name:
Assigned extended model: A B C D
Group specific data SID:
SID:

Q1: (3 MARKS) What modifications, including treatment of outliers and missing data, did you undertake to decide on the final data set used in comparing your predictive models? Justify these choices.

Q2: (3 MARKS) Divide your modified data set into two subsamples according to the binary variable pilot. Compare the relative performance of your two models in predicting average monthly expenditure using the subsample with pilot = 1 as your estimation (training) sample and that with pilot = 0 as your testing sample. Justify this process to evaluate predictive performance.
Q3: (4 MARKS) In the table below present your regression results for your two models estimated using the pilot data subsample of your modified (cleaned) data. Which model do you prefer, the base model or your assigned extension? What are the justifications for your choice?

Table: Regression results using your chosen estimation sample

Dependent variable     mamt1          mamt1
                       Base model     Extended model
Constant
age
age squared
female
metro
regional
ltmem
member
Adjusted R squared
No. of observations

Notes: (i) This table reports regression coefficients estimated by ordinary least squares using your selected estimation (training) sample. (ii) The estimated standard errors are reported in brackets ( ) under these estimates. (iii) As with the base model, not all variables will have an associated parameter estimate. This will depend on which extended model is being compared to the base model.

Assessment 2 Marking Rubric

Grade bands: Fail (0%-49%), Pass (50%-64%), Credit (65%-74%), Distinction (75%-84%), High Distinction (85%-100%)

Criterion: Analysis (50%)
• Fail: Fails to demonstrate a basic understanding of the business problem or issue or of the link with appropriate analytical techniques to be employed. Demonstrates limited awareness of tools and methods to use and of the modelling decisions that needed to be made.
• Pass: Demonstrates a proficient understanding of the business problem or issue; attempts to link the business problem and the analytical techniques employed are not always clear. Sometimes applies appropriate tools and methods in an attempt to extract insights from the data. Explanations of the issues identified and modelling decisions that needed to be made are often incomplete or not compelling.
• Credit: Demonstrates a good understanding of the business problem or issue and attempts to link the business problem and the analytical techniques employed. Usually chooses and uses appropriate tools and methods to extract insights from the data. Explanations of the issues identified and modelling decisions that needed to be made are provided but are not always complete or compelling.
• Distinction: Demonstrates an advanced understanding of the business problem or issue and presents a link between the business problem and the analytical techniques employed. Chooses and uses appropriate tools and methods to extract useful insights from the data while providing good explanations of the issues identified and modelling decisions that needed to be made.
• High Distinction: Demonstrates an exceptional understanding of the business problem or issue and presents a coherent and clear logic linking the business problem with the analytical techniques employed. Chooses and expertly uses appropriate tools and methods to extract useful and perceptive insights from the data while providing compelling explanations of the issues identified and modelling decisions that needed to be made.

Criterion: Evaluation and role of Stage 1 predictors in Stage 2 (15%)
• Fail: Little understanding of the alternative predictor models considered in Stage 1 and how they should be compared. Little or no use of the Stage 1 analyses in Stage 2.
• Pass: Demonstrates a satisfactory understanding of the alternative predictor models considered in Stage 1 and how they should be compared. Some attempt to link Stage 1 and Stage 2 analyses but not always with success.
• Credit: Demonstrates a good understanding of the alternative predictor models considered in Stage 1 and how they should be compared. Good use of the comparison of Stage 1 predictors and choice of data to justify the preferred predictor that is compared to the XYZ risk index.
• Distinction: Demonstrates a very good understanding of the alternative predictor models considered in Stage 1 and how they should be compared. Very good use of the comparison of Stage 1 predictors and choice of data to justify the preferred predictor that is compared to the XYZ risk index.
• High Distinction: Demonstrates an outstanding understanding of the alternative predictor models considered in Stage 1 and how they should be compared. Compelling use of the comparison of Stage 1 predictors and choice of data to justify the preferred predictor that is compared to the XYZ risk index.

Criterion: Quality of conclusions and recommendations (25%)
• Fail: Develops no conclusions or conclusions that are not based on the results of the analysis. Recommendations are absent or not relevant.
• Pass: Provides conclusions that are not always well-justified by the results of the analysis. Recommendations are provided but are not always clear or actionable. The link with the supporting analysis is sometimes tenuous.
• Credit: Develops appropriate conclusions based on the results of the analysis. Recommendations are provided that are sensible but the link with the supporting analysis is sometimes tenuous.
• Distinction: Develops appropriate conclusions that are well-supported by the analysis. Recommendations are clear, actionable, and supported by the analysis provided.
• High Distinction: Develops perceptive, appropriate conclusions that are well-supported by the analysis. Recommendations are clear, actionable, and are well-supported by the analysis provided.

Criterion: Report structure (10%)
• Unsatisfactory: Did not follow the instructions of the Head and produced a report that differs markedly from that of the intern in dimensions other than section headings.
• Satisfactory: Followed instructions of the Head and produced a report that approximates that of the intern after appropriate modification to section headings.