Week 11 Tutorial Worksheet AY23/24 Semester 1 No submission required Question 1 In this tutorial, we continue to work with the Olympics data from TidyTuesday. We will use the following two data sets: • olympics.csv contains information on Olympic games from 1896 to 2016. • regions.csv maps the 3-digit NOC codes to the country/region names. 1. Create a graph on the number of nations and events in each Summer Olympics from 1896 to 2016 using ggplot2 plotting. Your graph can be similar to the following. Number of Events and Nations in Summer Olympics 300 Events 200 Nations 100 0 1890 1920 1950 1980 2010 2. Which five countries won the most gold medals in 2016? Use your dplyr skills answer this question. After that, extract the names (in noc) of the top five countries as a vector named summer_top_5. The vector should take the following value: 1 summer_top_5 ## [1] "USA" "GBR" "CHN" "RUS" "GER" 3. Compute the number of gold medals received for the summer_top_5 countries since the year of 2000. Use it to re-create the plot below, as closely as you can. Gold medals received for selected countries, by year 50 Number of golds received USA 40 30 UK China 20 Russia Germany 10 2000 2004 2008 2012 2016 4. Instead of the total number of golds, let us now examine the countries’ rankings in golds received across years. Prepare the data to obtain the rankings in golds received for the summer_top_5 countries from 2000 to 2016. Save your resulting data frame as an object named ranks_top_5. Hint: Ranking functions in R vary in how they handle tied values. dplyr provides two handy functions: min_rank() and dense_rank(). For sports data, the former is typically the conventional choice. More information is available in their documentation. 5. Use ranks_top_5 to re-create, as much as you can, the plot below. 2 Rankings of golds received, by year for countries that won the most golds in 2016 1 USA 2 UK 3 China 4 Russia 5 Germany 6 7 8 9 2000 2004 2008 2012 2016 Note: dplyr::min_rank() was used to compute country rankings Requirements • You code should generate a vector named summer_top_5 and a data frame named ranks_top_5. • The knitted HTML should contain three plots, one each for Question 1.1, Question 1.3, and Question 1.5. 3