MATH 1180: Calculus for Biologists II (Spring 2011)
Lab Meets: April 6, 2011
Report due date: April 20, 2011
Section 002: Wednesday 9:40-10:30am
Section 003: Wednesday 10:45-11:35am
Lab location - LCB 115
Lab instructor - Erica Graham, graham@math.utah.edu
Lab webpage - www.math.utah.edu/~graham/Math1180.html
Lab office hours - Monday, 9:00 - 10:30 a.m., LCB 115

Lab 10

General Lab Instructions

In-class Exploration

Review: Last week, we used simulated data to find various probability distributions pertaining to two random variables and explored the relationship between them.

Background: In the current lab, we will see a similar example and explore the concepts of covariance and correlation between two random variables. In particular, we will look at virus susceptibility and acquisition in two different populations: (1) immune-compromised individuals; and (2) elementary school teachers. The two interacting random variables will be the level of exposure to pathogens (L) and health status (S).

    restart;
    with(Statistics):
    read("/u/ma/graham/public_html/Math1180/1180files/susceptibility"):

Assume that each person in the simulation is tested at a single time point for viral illness and that we know their pathogen exposure level by design. The values that L and S can take on and their meanings are as follows:

    L                               S
    0   low exposure level          0   not sick
    1   moderate exposure level     1   sick
    2   high exposure level

Notice that the number of outcomes for each random variable is different.

In the susceptibility file we imported, we can use the SickData( ) command to simulate the 2 populations' characteristics. Let's do this for N = 50 people. We'll save the data for the immune-compromised group to 'immunecomp' and for the elementary school teachers to 'teachers'. SickData( ) takes two arguments. The first is the number of people in the population considered, and the second is the population type. Type 1 is immune-compromised; type 2 is teachers.
The output will be the values for L and S.

    N:=50;
    immunecomp:=SickData(N,1);
    teachers:=SickData(N,2);

        N := 50

        immunecomp :=
          "L" = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
                 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
          "S" = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0,
                 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1]

        teachers :=
          "L" = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2,
                 1, 2, 2, 1, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 0, 2, 1, 2, 2, 2, 2, 0],
          "S" = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
                                                                                   (2.1)

We can get an idea of the population characteristics by looking at the marginal probabilities. The MargProb( ) command in the 'susceptibility' file will print these for us. Notice that teachers experience much more intense pathogen exposure than immune-compromised people do, yet far fewer teachers are sick (1 of 50, versus 22 of 50 in the immune-compromised group).

    iMarg:=MargProb(immunecomp);
    tMarg:=MargProb(teachers);

        iMarg := "Pr(L)" = [0, 24/25], [1, 1/50], [2, 1/50],   "Pr(S)" = [0, 14/25], [1, 11/25]
        tMarg := "Pr(L)" = [0, 1/25], [1, 7/50], [2, 41/50],   "Pr(S)" = [0, 49/50], [1, 1/50]
                                                                                   (2.2)

The marginal probabilities, of course, tell us little about how the two variables interact within each population. For this, we look to the conditional probabilities, which we can get from CondProb(data,version) in our imported file. Version 1 will give Pr(S|L) and version 2 will give Pr(L|S).
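For reference, the two conditional distributions can be written out from their definitions. This is only a sketch of the standard formulas, not a description of how CondProb( ) is implemented; the worked value uses the teachers' data in (2.1) and the marginal Pr(L = 2) = 41/50 from (2.2).

```latex
\[
\Pr(S = s \mid L = l) = \frac{\Pr(L = l,\ S = s)}{\Pr(L = l)},
\qquad
\Pr(L = l \mid S = s) = \frac{\Pr(L = l,\ S = s)}{\Pr(S = s)}
\]
% Teachers in (2.1): 41 of 50 have L = 2, and exactly one of those 41 is sick.
\[
\Pr(S = 1 \mid L = 2) = \frac{1/50}{41/50} = \frac{1}{41}
\]
```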
    Icond:=CondProb(immunecomp,1),CondProb(immunecomp,2);
    Tcond:=CondProb(teachers,1),CondProb(teachers,2);

        Icond:
          "Pr(S|L)"     S = 0    S = 1          "Pr(L|S)"     L = 0    L = 1    L = 2
            L = 0       7/12     5/12             S = 0       1        0        0
            L = 1       0        1                S = 1       10/11    1/22     1/22
            L = 2       0        1

        Tcond:
          "Pr(S|L)"     S = 0    S = 1          "Pr(L|S)"     L = 0    L = 1    L = 2
            L = 0       1        0                S = 0       2/49     1/7      40/49
            L = 1       1        0                S = 1       0        0        1
            L = 2       40/41    1/41
                                                                                   (2.3)

It appears that teachers, for example, are more likely to stay healthy given low and moderate levels of exposure, whereas in this sample every immune-compromised person with moderate or high exposure is sick. We can also compare the two populations by looking at the expected values of each random variable. We'll save the data in a more user-friendly form, then use Maple's ExpectedValue( ) command.

    iL:=rhs(immunecomp[1]): iS:=rhs(immunecomp[2]):
    tL:=rhs(teachers[1]): tS:=rhs(teachers[2]):
    ExpectedValue(iL); ExpectedValue(iS);
    ExpectedValue(tL); ExpectedValue(tS);

        0.06000000000
        0.4400000000
        1.780000000
        0.02000000000                                                              (2.4)

These expectations say that teachers are most likely to be exposed to severe levels (close to 2) of pathogens and immune-compromised people to lower levels (close to 0), in line with the marginal probabilities above. Yet teachers are more likely to be healthy, while immune-compromised people are pretty much equally likely to be sick or healthy.

Another thing we could do is determine the correlations of the random variables for both groups. To do this, we need the covariances and standard deviations for L and S within each population. For now, we'll use Maple's Covariance( ) and StandardDeviation( ) commands. Luckily, you'll be able to calculate at least one of these by brute force in your homework.
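As a hand check, the covariance and correlation can be written out from their definitions. The sketch below uses the immune-compromised data in (2.1) and the expected values in (2.4), and it assumes each of the N = 50 simulated people carries equal weight 1/50; that normalization is an assumption made here, not something the worksheet states.

```latex
\[
\mathrm{Cov}(L,S) = E(LS) - E(L)\,E(S),
\qquad
\rho_{L,S} = \frac{\mathrm{Cov}(L,S)}{\sigma_L\,\sigma_S}
\]
% Only two immune-compromised people have L > 0 (one with L = 1, one with L = 2),
% and both of them are sick (S = 1).
\[
E(LS) = \frac{1\cdot 1 + 2\cdot 1}{50} = 0.06,
\qquad
\mathrm{Cov}(L,S) = 0.06 - (0.06)(0.44) = 0.0336
\]
% Variances with the same 1/N weighting:
\[
\sigma_L^2 = \frac{1^2 + 2^2}{50} - (0.06)^2 = 0.0964,
\qquad
\sigma_S^2 = (0.44)(0.56) = 0.2464
\]
\[
\rho_{L,S} = \frac{0.0336}{\sqrt{(0.0964)(0.2464)}} \approx 0.218
\]
```

Under this 1/N weighting the hand values agree with icov in (2.5) and with the Correlation( ) output in (2.6).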
    icov:=Covariance(iL,iS);
    icorr:=icov/(StandardDeviation(iL)*StandardDeviation(iS));
    tcov:=Covariance(tL,tS);
    tcorr:=tcov/(StandardDeviation(tL)*StandardDeviation(tS));

        icov := 0.03360000000
        icorr := 0.2136517648
        tcov := 0.004400000000
        tcorr := 0.06140382117                                                     (2.5)

Or, we could just use Maple's Correlation( ) command.

    Correlation(iL,iS); Correlation(tL,tS);

        0.2180120048
        0.06265696037                                                              (2.6)

These values differ slightly from icorr and tcorr. In both populations the ratio is 49/50 = (N - 1)/N (for example, 0.2136517648/0.2180120048 = 0.98). That is what we would expect if the covariance in (2.5) were normalized by N and the standard deviations by N - 1, so the gap looks more like a difference in normalization than like floating-point round-off.

In any event, we can make some conclusions about the relationship between L and S in both populations. For example, L and S are positively but only weakly correlated in the immune-compromised population (correlation of about 0.2).

Please copy the entire section below into a new worksheet, and save it as something you'll remember.

Lab 10 Homework Problems

Your Full Name:
Your (registered) Lab Section:

Useful Tip #1: Read each problem carefully, and be sure to follow the directions specified for each question! I will take a vow of silence if you ask me a question whose answer is clearly stated in a problem.

Useful Tip #2: Minimize your code by not simply copying and pasting absolutely everything we do in class. See if you can eliminate unnecessary commands by knowing what it is you have to do and what tools you (minimally) need to do it.

Useful Tip #3: When in doubt, restart!

Useful Tip #4: When you reopen a saved .mw file, remember to re-execute everything in order to use previously defined things. Maple can show you the output of what you've done before, but it won't remember how it got there.

Useful Tip #5: Read through your completed assignment again before handing it in, to make sure that things make sense. It helps both the learning and grading processes.

Paper-saving tip: Make the size of your output graphs smaller to save paper when you print them. Please ask me if you're unsure of how to do this. (You can see how much paper you'd use beforehand by going to File > Print Preview.)
Also, please DO NOT attach printer header sheets (usually yellow, pink or blue) to your assignment. Recycle them instead!

(0) Import the Maple Statistics package and susceptibility file we used in class.

    with(Statistics):
    read("/u/ma/graham/public_html/Math1180/1180files/susceptibility"):

Note #1: You will be penalized one raw point for each unnecessary Maple command. Make sure you understand what's needed and what's superfluous!

Note #2: I will not grade anything (i.e. you will lose full points for any problem) that is wrong because you failed to follow directions in any capacity. Be careful!

(1) Simulate a population of 100 immune-compromised individuals and one of 100 elementary school teachers. Save your lists to 'i1' and 't1', respectively. Suppress your output.

    ## immune-compromised
    ## teachers

(2)(a) Calculate the marginal and conditional distributions for the immune-compromised group.

    ## marginal distribution
    ## conditional distributions

(b) Calculate the marginal and conditional distributions for teachers.

    ## marginal distribution
    ## conditional distributions

(3) Use the above probabilities to calculate (by hand!) the joint distributions for both populations. Fill in the following table with your results. Keep your answers in fraction form.

         Immune-compromised                  Teachers
                   S                             S
                0       1                     0       1
          0    ____    ____             0    ____    ____
     L    1    ____    ____        L    1    ____    ____
          2    ____    ____             2    ____    ____

(4)(a) Use Maple to find the expected value of L and S for both groups. Save these to iLbar and iSbar for immune-compromised people and to tLbar and tSbar for teachers. For example, tLbar should be the expected value of L for teachers.

    ## 4 expected values

(b) Calculate Cov(L,S), by hand, for both groups using the joint probabilities and expected values you calculated along with the following formula:

\[
\mathrm{Cov}(L,S) = \sum_{j=1}^{2} \sum_{i=1}^{3} \bigl(l_i - E(L)\bigr)\bigl(s_j - E(S)\bigr)\, p_{ij}
\]

where l_i and s_j range over all possible values of L and S, respectively, and p_ij = Pr(L = l_i, S = s_j) is the corresponding joint probability. Show your work.

immunecomp: Cov(L,S) =

teachers: Cov(L,S) =

(c) Use the appropriate Maple command to verify your answers.
    ## verifications

(5)(a) Are L and S in the immune-compromised group positively or negatively correlated?

(b) Are L and S in the elementary teacher group positively or negatively correlated?

Now you'll explore the inclusion of an additional level of exposure: none. The values of L and S will therefore be as follows:

    L                               S
    0   no exposure                 0   not sick
    1   low exposure level          1   sick
    2   moderate exposure level
    3   high exposure level

(6) The SickData2(N, type) command will simulate this situation for the same 2 populations. Generate data for 200 people in each population (recall that type 1 = immune-compromised and type 2 = teachers). Save your lists to i2 and t2, and suppress your output.

    ## immune-compromised
    ## teachers

(7)(a) Determine the expected values of L and S for i2.

    ## expected L
    ## expected S

(b) Interpret these expectations for this population.

(8)(a) Determine the expected values of L and S for t2.

    ## expected L
    ## expected S

(b) Interpret these values for this population.

(9) Find both conditional distributions for i2 and t2 using the command CondProb2(data,version).

    ## 4 conditional distributions

(10)(a) Given the conditional distributions for the immune-compromised, what does exposure level say about health status?

(b) Under what circumstances does this population do best?

(c) What does health status indicate about exposure level for this group?

(d) Why does this make sense with respect to this population? Give a thorough explanation.

(11)(a) Given the conditional distributions for the elementary school teachers, what does exposure level say about health status?

(b) Under what circumstances does this population do worst?

(c) What does health status indicate about exposure level for this group?

(d) Why does this make sense with respect to this population? Give a thorough explanation.

(12) What Maple statistical commands did you use today (there are 4 of them)? List and describe what they do. Do not include any commands in the 'susceptibility' file.
Did you remember to save paper?