ENGR 132 Spring 2023 A11 ∙ Linear Regression Introduction Assignment Goals This assignment uses Excel to build on the least squares analysis you learned in ENGR 131 and then adds two MATLAB linear regression problems with engineering contexts. Successful Completion This assignment has 3 problems. All problems go with Classes A and B. 1. Read Notes Before You Start, on Page 1. 2. Read each problem carefully. You are responsible for following all instructions within each problem. a. The deliverables list within each problem contains everything you are expected to submit. 3. Complete the problems using the problem-specific templates when a template is provided in the assignment download. 4. For any file, replace template or login in the filename with your Purdue Career Account login. 5. Review your work using the learning objective evidences. 6. When your work is complete, confirm your deliverables are submitted to Gradescope. a. Note the three different assignments in Gradescope. i. A11 – All Problems: submit your deliverables for all problems. Help link. ii. A11 – Team Planning: submit your team plan for Problem 3 as a team. Help link. 7. b. You can resubmit your work as many times as you want; only the final submission will be graded. c. Do NOT upload any document not listed in the deliverables. Do not upload temporary versions of mfiles (*.m~ or *.asv) – these files will be ignored by Gradescope. d. You must submit an m-file or xlsx file when one is requested; failing to do so will result in a 0 on the problem. Late submissions will be accepted up to 24 hours after the due date and will result in a 25% penalty. Learning Objectives & Grading This course uses learning objectives (LOs) to assess your work. You can find a full list of the course LOs here. Review the grading outline at the end of each problem in this assignment to see each problem’s LOs. Notes Before You Start Helpful MATLAB Commands Learn about the following built-in MATLAB commands, which might be useful in your solutions: polyfit, polyval Publishing code You will publish m-files as PDF documents in this assignment. See detailed instructions on publishing here. Page 1 of 10 ENGR 132 Spring 2023 Problem 1: Excel Linear Regression Introduction This problem has you using Excel to determine least squares model and goodness of fit for a data set. You will compare models and discuss their differences. You will use your knowledge of your data set to make reasonable predictions using your model. Submission Gradescope Assignment A11 – All Problems Deliverables Requested results and information Assignment Type Individual A11Prob1_wireBonds_login.xlsx A11Prob1_figure_login.png Problem Wire bonding is a method that uses thin wires and heat to attach and connect integrated circuits in microelectronic devices. The wire bonds the semiconductor to a terminal on the chip package. These circuits are used in electronic applications across all engineering disciplines. For some applications, the strength of the connections must be quantified and confirmed. The United States military has a standard method for testing wire bond strength for critical aviation and space components. The National Aeronautics and Space Administration (NASA) also uses this standard for testing mission critical components. The method can be done destructively or non-destructively. For the destructive test, a machine hooks onto a wire in the circuit and pulls up until the bond fails and the connection between the semiconductor and the terminal is broken. You work as a quality control engineer for a microprocessor manufacturer that sells components for aviation and space applications. Your team selected random circuits from production batches and conducted destructive wire bond testing on them. You placed the circuits in a testing machine that places a hook on the wire and pulls with an upward force until the wire bond fails. The machine reports the applied force and the height of the hook (i.e., a measure of wire stretch) at the point of failure. Your company has been using a model, 𝐹 = 21.9ℎ + 3.3 to predict wire strength in the circuits, where ℎ is the hook height above its starting point (mm) and 𝐹 is the pull strength at which the bond fails (grams-force). You must perform least-squares regression to find a new model for the pull strength as a function of hook height at failure. You will find the data in the CALCULATIONS sheet of the provided Excel template. Instructions Excel Calculations Using the CALCULATIONS sheet in the Excel template, you must 1. Use the data and the least-squares equations to determine the least-square line for the data. Page 2 of 10 ENGR 132 2. 3. Spring 2023 Plot the data and confirm your least-squares line using Excel’s “add a trendline” feature. Show both the model equation and the r-squared value. Update the equation display using clear and appropriate variable names in place of 𝑥 and 𝑦. Format the plot for technical presentation. Determine the goodness of fit (SSE, SST, and r2) for both the provided model and your least-squares model. Save a plot image After you have completed the calculations and created the plot, you need to save an image of your Excel plot that shows the data, the trendline, and the least-squares model (as described in Step 2 above). Save an image of your Excel plot with the trendline and r-squared value visible. • If you have a Mac, right click the chart and select “Save as picture”. • If you have Windows, you may need to Copy the figure, paste into Powerpoint as an image, right click the image in Powerpoint and select “Save as picture”. *You can use any image file type (png, jpg, tiff, etc.) that will render in Gradescope but name the file appropriately. Answer Questions & Submit to Gradescope After you complete the calculations, open Gradescope and select the assignment A11 – All Problems. You will find 5 parts within Q1. Excel – Linear Regression. Answer all the questions and their parts. Save your answers periodically in case you lose internet connection or need to stop working on the assignment before it is complete. a. Submit your XLSX file, with the appropriate file name. b. Submit your plot image, with the appropriate file name. c. Answer the questions presented in Gradescope. Note: Failure to submit the XLSX worksheet will result in a 0 on this problem. References National Aeronautics and Space Administration. (n.d.). Wirebond – Basic Information. https://nepp.nasa.gov/index.cfm/20911 National Institute of Standards and Technology. (1976). The destructive bond pull test. https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nbsspecialpublication400-18.pdf Grading LOs: PC05, MAT03, MOD01, EPS01, EPS02 Point value: 10 points. Problem is graded by question. Partial credit is available and is based on evidences in MAT03, MOD01, EPS01, EPS02. If you do not meet the PC05 expectations, you will lose additional credit. Q.1.(1) Q.1.(2) Q.1.(3) Q.1.(4) Q.1.(5) Figure Points 2 1 1 2 2 2 PC05 (1) PC05 (2) PC05 (4) PC05 (8) Points -100% -25% -15% -100% Page 3 of 10 ENGR 132 Spring 2023 Grading Process Page 4 of 10 ENGR 132 Spring 2023 Problem 2: MATLAB Linear Regression Introduction You will write script that performs least squares regression using MATLAB and use the model you find to make reasonable predictions with justifications. You will learn to publish your MATLAB file as a PDF. Submission Individual Gradescope Assignment A11 – All Problems Deliverables A11Prob2_oxygenLevels_login.m Assignment Type Individual Supporting files: A11Prob2_oxygenLevels_login.pdf Data_DO_concentrations.csv Problem A large focus within biological engineering is limnology, the study of inland bodies of water. Besides water itself, the second most important factor for living organisms underwater is dissolved oxygen (DO). DO is an important measure for water quality and has a big impact on the living creatures in natural waters. Too little oxygen in the water can cause aquatic plants and animals to die, and prolonged low oxygen levels can destroy an ecosystem. There are a lot of factors that influence the DO levels in water, but the one that we are concentrating on is temperature. Warmer water holds less DO, while colder water tends to hold more DO. You are working with a team of biological engineers to predict the concentration of dissolved oxygen in the Colorado river at different temperatures based on samples that were collected by your team. The data collected can be found in the data file Data_DO_concentrations.csv. The temperature is measured in degrees Celsius (°C) and the dissolved oxygen concentration is measured in milligrams per liter (mg/l). Using linear regression, you need to create a model that can estimate the concentration of dissolved oxygen at a given temperature. You will create a script that must do the following: • Perform least squares analysis and goodness of fit calculations: o Perform linear regression on the data to get the least-squares coefficients. o Determine the predicted values of the linear model. o Calculate SSE, SST, and r2 values for the model. Page 5 of 10 ENGR 132 • Spring 2023 Display information to the Command Window, using professional communication standards for all displays: o The linear model equation with clear, descriptive variable names in the format 𝑦 = 𝑎𝑥 + 𝑏. o The goodness of fit results for SSE, SST, and r2. • Create a figure that displays the raw data and the model line on the same axes. Format the figure for technical presentation. • Use the least-squares model to calculate the predicted dissolved oxygen levels at 17 deg C and 45 deg C. Publish your code When your code is working properly, publish your m-file as a PDF. When you publish, MATLAB will run your code and will create a PDF file that has your code with all outputs and displays it produces. Be sure to suppress commands and variable assignments so that only the requested displays (text and figure) are visible in the PDF. 1. 2. 3. Open your m-file in the MATLAB Editor and then click on the Publish tab. Click the bottom half of the Publish button and then click Edit Publishing Options… Set the Output Format to pdf and click Publish. Detailed instructions and videos for publishing are on this website for both MATLAB Online and MATLAB desktop. If you encounter an issue while publishing, check the Troubleshooting section. Answer questions & submit to Gradescope Open the Gradescope assignment. • • • • Enter your predicted values for 17 and 45 deg C in the appropriate space in Gradescope. Do not just enter values alone. Use your knowledge of the data to justify your prediction for each value. Use professional communication standards in both answers. Copy-paste your Command Window text display into Gradescope (equation and goodness of fit results). Submit your published PDF that shows your code, text display, and figure all in one document. Submit your m-file and data file. References Oxygen Saturation Technology. (n.d.). Solitude Lake Management. Retrieved September 30, 2022, from https://www.solitudelakemanagement.com/blog/oxygen-saturation-technology-blog/ La Hee, J. (September 8, 2020). Stop pond and lake odors at the source. Vertex Aquatic Solutions. https://vertexaquaticsolutions.com/aeration-for-pond-lake-odors/ Grading LOs: PC05, MAT03, MOD01, EPS01, EPS02 Point value: 10 points. Partial credit is available and is based on evidences in MAT03, MOD01, EPS01, EPS02. If you do not meet the PC05 expectations, you will lose additional credit. Evidence Penalty Graded item LOs Points PC05 (1) PC05 (2) PC05 (3) PC05 (4) Lose full credit on problem Lose 25% of full credit on problem Lose 25% of full credit on problem Lose 15% of full credit on problem Predictions Model and goodness of fit values and display EPS01 MAT03, MOD01, EPS01 2.5 4.5 Page 6 of 10 ENGR 132 Spring 2023 Figure Display Problem 3: EPS02 3 Greenhouse Gas Analysis Introduction This problem will allow you to apply linear regression concepts to data sets that contain environmental data used in news, commercial, academic, and scientific settings. You will discuss what you know about the models using your knowledge of the data. Submission Individual Gradescope Assignment A11 – All Problems Deliverables A11Prob3_airPollution_login.m Assignment Type Individual Supporting files: A11Prob3_airPollution_login.pdf Data_NOAA_ESRL_co2_trend_1980-2022-global.csv Data_NOAA_ESRL_n2o_trend_2001-2022.txt Team Plan Gradescope Assignment A11 – Team Planning Deliverables Requested information Assignment Type Team Problem Greenhouse gases cause the Earth’s atmosphere to retain heat by absorbing infrared radiation and blocking it from being reemitted back into space. Earth is habitable due to naturally occurring greenhouse gases that form a carbon exchange cycle, but industrial production of these gases well exceeds what the natural environment’s cycle can manage. Fossil fuel emissions and wildfires are primary sources of greenhouse gases and contribute more carbon to the atmosphere than the land and ocean carbon sinks can remove, which leads to unnatural and dangerous warming on a global scale. Greenhouse gases and their effects are studied in almost all engineering disciplines, including environmental, agricultural, mechanical, chemical, construction, aerospace, and civil engineering. Carbon dioxide (CO2) and nitrous oxide (N2O) emissions are two significant greenhouse gases. The National Oceanic and Atmospheric Administration’s Earth System Research Laboratory (ESRL) maintains records of atmospheric greenhouse gas concentrations. Page 7 of 10 ENGR 132 Spring 2023 For this problem, you must model the global averages in CO2 and N2O atmospheric concentrations. You will find the CO2 data, with its complete NOAA header, in the file named Data_NOAA_ESRL_co2_trend_1980-2022global.csv and the N2O data in the file Data_NOAA_ESRL_n2o_trend_2001-2022.txt. Write a MATLAB no-input, no-output function to perform least squares regression on the provided data. You will model average CO2 (in ppm) as a function of decimal year and then model average N2O (in ppb) as a function of decimal year. Your function must • Perform linear regression on the data to get the least-squares coefficients, first for CO2 and then for N2O. • Determine the predicted values of each linear model. • Calculate SSE, SST, and r2 values for each model. • Display the linear model equation (with clear variable names), SSE, SST, and r2 to the Command Window for each model. Make sure you can differentiate between the information for CO2 and N2O. • Generate one figure with two subplots stacked vertically. On the top subplot, display the data and the trend line on the same axes for CO2. On the bottom subplot, display the data and trend line on the same axes for N2O. Format both subplots and the figure for technical presentation. o Learn about the xlim function to set the x-axis limits of both subplots. Make your subplots have identical axes limits on the x-axis and show all data in both subplots. Enter information and answer analysis questions in Gradescope In Gradescope, you will need to enter your Command Window text display, and you find two questions to answer. Copy and paste your equations and goodness of fit displays for both gases. For the analysis questions, your answers must meet course communication expectations. Q1. From your analyses, can you draw a conclusion about the accuracy of the data measurements? Provide justification for your answer. Q2. For which data set does a linear model best explain the variation that exists in the data? Clearly state the basis of your reasoning and provide justification for your answer. Publish your code When your code is working properly, publish your m-file as a PDF. When you publish, MATLAB will run your code and will create a PDF file that has your code with all outputs and displays it produces. Be sure to suppress commands and variable assignments so that only the requested displays (text and figure) are visible in the PDF. 1. 2. 3. Open your m-file in the MATLAB Editor and then click on the Publish tab. Click the bottom half of the Publish button and then click Edit Publishing Options… Set the Output Format to pdf and click Publish. Detailed instructions and videos for publishing are on this website for both MATLAB Online and MATLAB desktop. If you encounter an issue while publishing, check the Troubleshooting section. Instructions 1. Read through the entire problem statement. 2. Team Planning: As a team, outline a plan to solve this problem that has at least 2-3 steps, is not highly detailed, and does not contain code. Submit the plan to Gradescope. Link to thorough instructions. Page 8 of 10 ENGR 132 Spring 2023 o 3. All teammates confirm that you get a submission email and verify that you can see the team plan submission in your Gradescope. Individually: a. Complete your m-file and run it to get your results. o The team plan is an initial start on the problem. It may not be completely correct, and you may find flaws in the plan once you start coding. You should make any individual changes that are necessary to obtain the best solution. You will be assessed on your individual solution to the problem. b. Cite any peers you worked with in your script header if their help changed how you decided to solve the problem. Make sure you also completed the rest of the script header. c. Answer the questions and submit your properly named files to the individual Gradescope assignment (see the submission list at the beginning of this problem). o Do not submit any other files. References Ed Dlugokencky, NOAA/GML (gml.noaa.gov/ccgg/trends_n2o/) Ed Dlugokencky and Pieter Tans, NOAA/GML (www.esrl.noaa.gov/gmd/ccgg/trends/) https://gml.noaa.gov/ccgg/carbontracker/ Grading LOs: PC05, MAT03, MOD01, EPS01, EPS02 Team planning: 1 points Point value: 10 points. Problem is graded by question. Partial credit is available and is based on evidences in MAT03, MOD01, EPS01, EPS02. If you do not meet the PC05 expectations, you will lose additional credit. LO Table (1) (2) (3) (4) (5) (6) (7) (8) PC05 -100% -25% -10% -15% 0 0 0 0 EPS02 0.4 0.4 0.4 0.4 0.4 0.4 0.5 0 MAT08 0.5 1.5 0.5 0 0 0 0 0 MAT03 0.6 0.6 0.6 0.6 0 0 0 0 Analysis Q1 Q2 Points* 1 1.2 * Assessment of analysis questions is based on LOs in EPS01 Page 9 of 10 ENGR 132 Spring 2023 Grading Process Page 10 of 10