MATH 1180: Calculus for Biologists II (Spring 2011)

Lab Meets: March 2, 2011
Report due date: March 9, 2011
Section 002: Wednesday 9:40-10:30am
Section 003: Wednesday 10:45-11:35am
Lab location: LCB 115
Lab instructor: Erica Graham, graham@math.utah.edu
Lab webpage: www.math.utah.edu/~graham/Math1180.html
Lab office hours: Monday, 9:00-10:30 a.m., LCB 115
Lab 07
General Lab Instructions
In-class Exploration
Review: Last week, we simulated a discrete-time stochastic diffusion model and observed just how
obnoxious ReallyObnoxiousStuff could be.
Background: In today’s lab, we will simulate the rare disease example from the text to determine how good
screening tests are at actually catching sick people.
restart;
with(Statistics):
Suppose a rare disease infects 5% of a population. A diagnostic test always identifies people with the
disease, but it also returns a false positive for 10% of healthy people. The test will return a 1 for those
who test positive and a 0 for those who test negative. We can simulate a fake population of 150 people to
see how many actually have the disease before we deal with testing inaccuracy. Again, we’ll need the
RandomVariable( ) and ProbabilityTable( ) commands to set up the characteristics of the population.
pdisease:=0.05: ## probability of having the disease
P1:=[1-pdisease,pdisease]; ## probability table, yields 1 if healthy, 2 if sick
X:=RandomVariable(ProbabilityTable(P1)): ## random variable that returns 1 or 2 with these probabilities
P1 := [0.95, 0.05]
(2.1)
Now we can simulate the sick and healthy people for the initial population size. Because sampling X will
return only 1s and 2s, we need to subtract 1 from each result so that a 1 becomes 0 (healthy) and a 2
becomes 1 (sick), like we did last week. The difference is we’re sampling 150 people, so we need to
organize things using seq( ). Our entire population, reduced to a series of 0s and 1s, is represented by
the list ’status’.
population:=150:
xsample:=Sample(X,population): ## sample of 'population' people (1s and 2s)
status:=[seq(xsample[j]-1,j=1..population)]: ## convert sample to 0s and 1s
To figure out how many sick people are in the population, we will add up all the 1s in ’status.’ The number
of healthy people is simply the total population size minus the number of sick people.
sickpeople:=add(status[j],j=1..population); ## all the 1s
healthypeople:=population-sickpeople; ## all the 0s
sickpeople := 9
healthypeople := 141
(2.2)
We now know what our population looks like. For all the healthy people, the original setup states that the
test will give a false positive 10% of the time. We’ll see how many people in the healthy population would
receive a false positive if they were tested. To do this, we can follow the same process as before, assigning
different names for things.
pfalse:=0.1: ## probability of getting a false positive
P2:=[1-pfalse,pfalse];
X2:=RandomVariable(ProbabilityTable(P2)):
P2 := [0.9, 0.1]
(2.3)
Our new sample size is ’healthypeople.’
x2sample:=Sample(X2,healthypeople):
diagnostic:=[seq(x2sample[j]-1,j=1..healthypeople)]:
falsepositives:=add(diagnostic[j],j=1..healthypeople); ## how many healthy people tested positive
falsepositives := 15
(2.4)
What fraction of the positives detected actually identified sick people? Put another way, what is the
probability of being sick given a positive result?
There are 2 ways to answer this question:
[1] We can estimate the fraction using the results from our simulation. (We’ll do this now.)
[2] We can use Bayes’ theorem for the actual fraction. (You’ll do this later.)
For method [1], we need the following fraction: (true positives)/(true positives + false positives). Since
the test is 100% accurate for catching the people with the disease, we know that all ’sickpeople’ were
flagged with 1s, so the number of true positives is ’sickpeople’. The false positives that we just found
add to the total number of positives. So, the percent ’score’ for this particular diagnostic test in
identifying sick people is simply 100*(true positives)/(true positives + false positives).
score:=evalf(100*sickpeople/(sickpeople+falsepositives)); ## percentage of positive tests that identify sick people
score := 37.50000000
(2.5)
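Because the sample is random, the score will change every time the worksheet is re-executed. The steps above can be collected into one procedure so that the whole simulation is easy to repeat. This is only a sketch: the name SimulateScore and its arguments are hypothetical and are not part of the lab, and it assumes the Statistics package behaves as in the commands above.

```maple
with(Statistics):

## Hypothetical helper, not part of the original lab: runs the whole
## simulation once and returns the percent 'score'.
SimulateScore := proc(population, pdisease, pfalse)
    local X, X2, xsample, x2sample, sick, healthy, falsepos, j;
    ## disease status: Sample returns 1s (healthy) and 2s (sick)
    X := RandomVariable(ProbabilityTable([1 - pdisease, pdisease]));
    xsample := Sample(X, population);
    sick := round(add(xsample[j] - 1, j = 1 .. population));
    healthy := population - sick;
    ## test results for healthy people only: a 2 means a false positive
    falsepos := 0;
    if healthy > 0 then
        X2 := RandomVariable(ProbabilityTable([1 - pfalse, pfalse]));
        x2sample := Sample(X2, healthy);
        falsepos := round(add(x2sample[j] - 1, j = 1 .. healthy));
    end if;
    ## no positives at all: the score is undefined for this run
    if sick + falsepos = 0 then
        return FAIL;
    end if;
    return evalf(100 * sick / (sick + falsepos));
end proc:

## ten independent runs with the in-class parameters
scores := [seq(SimulateScore(150, 0.05, 0.1), i = 1 .. 10)];
```

Each entry of 'scores' is one estimate like the one in (2.5). Looking at how much the entries differ gives a feel for how far a single simulated score can land from the actual value.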
Please copy the entire section below into a new worksheet, and save it as something you’ll
remember.
Lab 07 Homework Problems
Your Full Name:
Your (registered) Lab Section:
Useful Tip #1: Read each problem carefully, and be sure to follow the directions specified for each
question! I will take a vow of silence if you ask me a question whose answer is clearly stated in a problem.
Useful Tip #2: Try to minimize your code by not simply copying and pasting absolutely everything we do
in class. See if you can eliminate unnecessary commands by knowing what it is you have to do and what
tools you (minimally) need to do it.
Useful Tip #3: Don’t be afraid to troubleshoot! Does your answer make sense to you? If not, explore why.
If you’re still unsure, ask me.
Useful Tip #4: Whenever you re-open Maple to complete an assignment, you will need to re-execute
everything that you’ve done. Maple has no memory of anything you did; it just shows you your
non-suppressed output.
Paper−saving tip: Make the size of your output graphs smaller to save paper when you print them. Please
ask me if you’re unsure of how to do this. (You can see how much paper you’d use beforehand by going to
File → Print Preview.) Also, please DO NOT attach printer header sheets (usually yellow, pink or blue) to
your assignment. Recycle them instead!
NOTE: For all assignments, you should re−define/assign any parameters or functions we used in class, as
needed. This will require an understanding of what’s being asked of you. Again, everything that we did in
class does not necessarily need to be done here.
(0) Import the Maple Statistics package we used in class.
with(Statistics):
(1)(a) Use method [2] at the end of the in-class exploration to determine the actual percentage of the
flagged positives that identified sick people. This is equivalent to 100*Pr(sick|positive).
Hint: For any 2 events, A and B, Bayes’ theorem says that Pr(B|A) = (Pr(A|B)*Pr(B))/Pr(A). You have all
the necessary pieces of information to answer this problem. Try not to get overwhelmed.
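As a sketch of how the hint lines up with this problem, take B to be "sick" and A to be "positive". The denominator Pr(positive) is not given directly, but it can be split over the sick and healthy groups using the law of total probability. The event names below are just labels chosen for illustration, and \text assumes the amsmath package.

```latex
\[
\Pr(\text{sick} \mid \text{positive})
  = \frac{\Pr(\text{positive} \mid \text{sick})\,\Pr(\text{sick})}{\Pr(\text{positive})}
\]
\[
\Pr(\text{positive})
  = \Pr(\text{positive} \mid \text{sick})\,\Pr(\text{sick})
  + \Pr(\text{positive} \mid \text{healthy})\,\Pr(\text{healthy})
\]
```

Every probability on the right-hand sides is stated in the setup of the in-class exploration.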
Note: You DO NOT need Maple for this problem.
(b) Compare your result to the ’score’ we calculated from our simulation. Is the simulated score better or
worse?
Now suppose we know that a particular subgroup of 100 people within a different population from before is
more likely to have the disease than the general population. Assume that the probability of being sick in
this target group is 0.4.
(2)(a) Simulate only the disease statuses of this fake subpopulation.
## simulate 100 fake people given their probability of being sick
(b) Count the number of individuals here who are sick.
## how many are sick?
(3)(a) With the group’s healthy people, simulate the situation in which the test generates 10% false
positives.
## simulate tests for healthy people
(b) How many false positives are there among the healthy subgroup?
## how many were told they were sick, but weren’t?
(4) Estimate the percentage of positive tests that actually identify people who are sick. Use our method
from class as a guide.
## what’s the ’score’ this time?
(5)(a) What is the actual chance (in percentage) that a person in this target group who tests positive is
actually sick? (You’ll need Bayes’ theorem for this.)
(b) How does this answer compare to your estimated result?
(6)(a) What do you notice about the actual scores of the diagnostic tests for this subpopulation versus the
population we simulated in class? Which test performs better?
(b) What’s so good about identifying highly susceptible subpopulations for a rare disease? Explain your
answer.
Did you remember to save paper?