Questions:

We look at the weighted average because we want to see how retention changes over time. This shows whether our company is truly creating stickier users, that is, users who are retained and increasingly rely on our service (here, door-to-door delivery of products, currently mostly food). Basic insights show whether the company is becoming more and more useful to a user.

We can also see whether the company's activation of new users increases. If it does, the value of our product to a new user is increasing relative to old users, which means a higher investment of time, energy, and resources from our users. Here is a graph to show that:

As you can see, month 2 retention, month 3 retention, month 4 retention, and so on increase month over month. Each subsequent line, from bottom to top, shows how retention in the second month, the third month, and so forth has evolved.

From a business perspective, this shows us how different company priorities correspond to our retention. Does the dip in June suggest that the product had significant outages, or is June a slow month for repeat orders (maybe because it is summertime)? Linking our actions to our data gives us a better picture of product-market fit. Overall, this paints the picture that our product becomes more valuable to customers over time.

Exercise

The solution is much neater if the exercise allows the use of code from GitHub.

1. Use https://github.com/ankane/groupdate.sql

    -- 'America/Los_Angeles' is the time zone for San Francisco (SF)
    SELECT gd_month(actual_delivery_time, 'America/Los_Angeles') AS month,
           COUNT(*)
    FROM consumer_delivery
    GROUP BY month;

CONSUMER_DELIVERY

    Month                  Count
    2014-11-01 00:00:00    W
    2014-12-01 00:00:00    X
    2015-01-01 00:00:00    Y
    2015-02-01 00:00:00    Z

2.
Assuming the date format is 4-digit-year/2-digit-month/2-digit-day:

    SELECT gd_month(consumer_delivery.actual_delivery_time, 'America/Los_Angeles') AS month,
           restaurant_restaurant.market_id AS market,
           COUNT(consumer_delivery.id) AS monthly_total_deliveries
    FROM consumer_delivery
    JOIN consumer_consumer ON consumer_consumer.id = consumer_delivery.id
    -- assumed join key; adjust to the real schema
    JOIN restaurant_restaurant ON restaurant_restaurant.id = consumer_delivery.restaurant_id
    -- restricts the result to one month; drop this filter to list every month
    WHERE DATE_FORMAT(consumer_delivery.actual_delivery_time, '%Y/%m') = '2014/11'
    GROUP BY restaurant_restaurant.market_id, month

    Month      Market    monthly_total_deliveries
    2014/11    1         3
    2014/12    2         22
    2015/01    4         55
    2015/02    5         88

3. Join the results from #1 and #2 into a derived table called Retention that carries the "Month" and "monthly_total_deliveries" style columns, keeping the same column names for simplicity. Retention percentage is the number of active users in a month divided by the total users in the cohort, times 100:

    SELECT retention.cohort,
           retention.month,
           retention.total_users,
           totals.total,
           retention.total_users / totals.total * 100 AS retention_percentage
    FROM (
        -- cohort is read here as the signup month from consumer_consumer.date,
        -- matching the totals subquery below; month is the delivery month
        SELECT DATE_FORMAT(consumer_consumer.date, '%Y/%m') AS cohort,
               DATE_FORMAT(consumer_delivery.actual_delivery_time, '%Y/%m') AS month,
               COUNT(DISTINCT consumer_consumer.id) AS total_users
        FROM consumer_consumer
        JOIN consumer_delivery ON consumer_consumer.id = consumer_delivery.id
        GROUP BY cohort, month
    ) AS retention
    JOIN (
        SELECT DATE_FORMAT(date, '%Y/%m') AS cohort,
               COUNT(id) AS total
        FROM consumer_consumer
        GROUP BY cohort
    ) AS totals ON retention.cohort = totals.cohort
    WHERE retention.total_users / totals.total * 100 BETWEEN 0 AND 100
      AND retention.month < DATE_FORMAT(NOW(), '%Y/%m')

4. See above.
We will cohort our users by the number of months they have been active, using the DATEDIFF function: subtract the first-order date (identified via consumer_ordercart.is_first_order_cart) from the most recent event firing date, then divide by 30.4 (the average number of days in a month). The expression goes into the subquery from #3 where indicated:

    (
        SELECT
            -- DATEDIFF expression goes here, aliased AS months
            DATE_FORMAT(consumer_delivery.actual_delivery_time, '%Y/%m') AS cohort,
            COUNT(DISTINCT consumer_consumer.id) AS total_users
        FROM consumer_consumer
        JOIN consumer_delivery ON consumer_consumer.id = consumer_delivery.id
        GROUP BY cohort, months
    ) AS retention

5. If you want to see a particular market, see #3 but this time add:

    AND restaurant_restaurant.market_id = 1

If you want to group by market, see #3 but this time add:

    GROUP BY restaurant_restaurant.market_id, month
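As a rough sketch of the month-number cohorting described in #4, the query below assigns each delivery a month number relative to the consumer's first delivery and counts distinct active consumers per cohort and month number. It assumes MySQL and a hypothetical consumer_delivery.consumer_id column linking deliveries to consumers. It also uses the earliest actual_delivery_time as a stand-in for the first-order date, since how consumer_ordercart.is_first_order_cart maps to a date is not shown here.

```sql
-- Sketch only. Assumptions (not confirmed by the schema in this answer):
--   * consumer_delivery.consumer_id is a hypothetical foreign key to consumer_consumer.id
--   * a consumer's first delivery stands in for the first-order date
SELECT
    DATE_FORMAT(f.first_delivery, '%Y/%m') AS cohort,
    -- whole months since the first delivery, using 30.4 days per month
    FLOOR(DATEDIFF(d.actual_delivery_time, f.first_delivery) / 30.4) AS month_number,
    COUNT(DISTINCT d.consumer_id) AS active_users
FROM consumer_delivery AS d
JOIN (
    -- earliest completed delivery per consumer
    SELECT consumer_id,
           MIN(actual_delivery_time) AS first_delivery
    FROM consumer_delivery
    WHERE actual_delivery_time IS NOT NULL
    GROUP BY consumer_id
) AS f ON f.consumer_id = d.consumer_id
WHERE d.actual_delivery_time IS NOT NULL
GROUP BY cohort, month_number
ORDER BY cohort, month_number;
```

Month number 0 is each consumer's first month, so its active_users count equals the cohort size. Dividing active_users for a later month number by the month-0 count of the same cohort gives that cohort's retention percentage.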