Assignment 1 – Commerce 3DA3

This homework assignment contains 6 questions. Your submission should be one Jupyter Notebook file. The code in your Jupyter notebook should produce results that reflect the items asked in the questions below. For full marks, avoid manual operations (such as manual data entry, copy/paste, etc.) as much as possible. If something can be done using code, you must use code to do it.

Submit the assignment by uploading the file(s) into the "Assignment 1" folder on Avenue to Learn. You can find this folder under Assessments>Assignments on the course webpage. Submitting work after the deadline may result in a deduction of points or a penalty. The deadline for your submission is 11:59PM on Thursday, February 15.

Data

We are going to use the data stored in the dataset client_data.csv. This dataset includes information on clients (Name, Surname, Email, CLIENT_ID, TRANSACTION_ID, Country, Region) as well as on the employees associated with each client (SALES_MANAGER, SALES_VICE_PRESIDENT), together with the number of years the employees have been associated with the client, the sales of the current year and of the previous year on the same day, and the respective category (YEARS_AS_CLIENT, CURRENT_YEAR_SALES, PREVIOUS_YEAR_SALES, CATEGORY).

Question 0

In the first cell of your Jupyter notebook, insert the following text as markdown. Add your first name, last name, your student ID number, and the section number.

Question 1

Import the data from the "client_data.csv" dataset and assign it to a dataframe called df. Explore the structure of this dataframe (number of rows, column types, etc.). Display the first few rows of the dataframe. In a markdown cell, describe the dataset to the best of your ability. Try to be as comprehensive as possible. You do not need to run additional code to write this description; it should be based solely on the procedures performed in this question.
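As an illustration of the import-and-explore steps described in Question 1, here is a minimal sketch. Because client_data.csv is not reproduced in this handout, the sketch reads a small in-memory stand-in whose rows are invented; only the column names come from the data description, and only a subset of them is used. In the notebook itself, the file name would be passed to pd.read_csv directly.

```python
import io

import pandas as pd

# Hypothetical stand-in for client_data.csv: the rows below are invented
# and only some of the real columns are included.
sample_csv = io.StringIO(
    "Name,Surname,Country,YEARS_AS_CLIENT,CURRENT_YEAR_SALES,PREVIOUS_YEAR_SALES,CATEGORY\n"
    "Ana,Silva,Portugal,3,1200.5,1100.0,A\n"
    "Ben,Okoro,Canada,5,980.0,,B\n"
    "Chloe,Martin,France,1,1500.0,1450.25,A\n"
)

# In the notebook this would be: df = pd.read_csv("client_data.csv")
df = pd.read_csv(sample_csv)

# Structure of the dataframe: dimensions, column types, non-null counts.
print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # type of each column
df.info()         # column types together with non-null counts

# First few rows.
print(df.head())
```

The outputs of shape, dtypes, info() and head() are the kind of evidence the markdown description can draw on: how many observations and variables there are, which variables are numeric or text, and whether any column has fewer non-null values than rows.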
Question 2

Show how many variables in the dataframe have missing values. Then update the dataframe by removing rows with missing values. Confirm that the dataframe now has no missing values.

Question 3

Create a new dataframe, df_EU, containing all the variables but only for customers whose country is a member of the European Union. Reset the index of this dataframe. Update the dataframe by removing the Country variable. Confirm that the variable has been removed. Create a CSV file in your working directory, df_EU.csv, based on this dataframe.

Question 4

Working with the dataframe df, create a bar chart based on the CATEGORY variable. Add labels for the axes and a title. Change the color of the bars and modify the tick marks on the horizontal axis to show all the unique categories. Create a pie chart based on the CATEGORY variable. Add a legend. In a markdown cell, based on the charts, describe the distribution of the CATEGORY variable in as much detail as possible.

Question 5

Working with the dataframe df, create a scatter plot that shows the relationship between the CURRENT_YEAR_SALES and PREVIOUS_YEAR_SALES variables. Indicate which polynomial degree would be appropriate for this association if you were to add a trend line (you do not need to add the trend line). Also, explain what this relationship implies about customers' current-year and previous-year sales.
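The cleaning and filtering steps in Questions 2 and 3 can be sketched as follows. This is one possible approach, not a model answer. The dataframe is a small invented stand-in for client_data.csv, the EU_COUNTRIES set is a hand-written list of the 27 current member states, and the sketch assumes the Country column uses these exact English spellings (the real file may differ, for example "Czech Republic"). The output path points to a temporary directory here; in the notebook the file would be written to the working directory.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Invented stand-in for the real data (only a few of its columns).
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Chloe", "Dieter", "Emi"],
    "Country": ["Portugal", "Canada", "France", "Germany", "Japan"],
    "CURRENT_YEAR_SALES": [1200.5, 980.0, 1500.0, None, 760.0],
    "CATEGORY": ["A", "B", "A", "C", None],
})

# Question 2: count the variables (columns) with at least one missing value,
# drop incomplete rows, then confirm that nothing is missing.
n_missing_cols = int(df.isna().any().sum())
print("Variables with missing values:", n_missing_cols)
df = df.dropna()
print("Missing values remaining:", int(df.isna().sum().sum()))

# Assumption: country names in the data match these spellings exactly.
EU_COUNTRIES = {
    "Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czechia",
    "Denmark", "Estonia", "Finland", "France", "Germany", "Greece",
    "Hungary", "Ireland", "Italy", "Latvia", "Lithuania", "Luxembourg",
    "Malta", "Netherlands", "Poland", "Portugal", "Romania", "Slovakia",
    "Slovenia", "Spain", "Sweden",
}

# Question 3: keep EU customers, renumber the rows from 0, drop Country.
df_EU = df[df["Country"].isin(EU_COUNTRIES)].reset_index(drop=True)
df_EU = df_EU.drop(columns="Country")
print("Country" in df_EU.columns)

# Temporary location for this sketch; in the notebook, "df_EU.csv" alone
# would write to the working directory.
out_path = Path(tempfile.gettempdir()) / "df_EU.csv"
df_EU.to_csv(out_path, index=False)
```

Passing drop=True to reset_index discards the old row labels instead of keeping them as a new column, and index=False in to_csv keeps the row labels out of the saved file.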