Lab 2 – Functions and Antibiotics

Date: August 30, 2011
Assignment and Report Due Date: September 6, 2011

Goal: In this lab you will learn about a model for antibiotic efficacy and see how it relates to data. We will consider the relationship between dosage and efficacy during antibiotic treatment of an infection. During the course of the lab you will learn two new commands: function() and rep(). First, you will learn how to define and use functions in R. You will then use this knowledge to plot a given model and see how it relates to experimental data.

New commands
function() – defines a function and its inputs
rep() – creates a vector with repeated entries, e.g. (1,1,1,1,1)

Introduction to Functions in R

During last week's lab we learned how to create input and output vectors in order to plot functions. There is actually an easier way to do this! We can explicitly define functions in R using the function() command. Recall that last week we plotted the line f(x) = 2*x + 1 using the following code. We first defined the slope and y-intercept.

> m = 2
> b = 1

Next we defined the input vector.

> x = 1:5

Then we defined the output vector using the function f(x) = m*x + b.

> f = m*x + b

Finally we plotted the result.

> plot(x,f,type = "o")

Now let's see how we can do this using function definitions. Instead of just defining an output vector f, we will define the actual function, which can be used to give us multiple outputs. Let's call this function F.

> F = function(input){m*input + b}

Let's talk through the pieces of this definition. In the parentheses we put our input, which for now I have called "input". Inside the curly brackets we put the formula for what we are going to do to the input to get the output. In this case we are plotting a line, so we use the slope-intercept form of a line as our formula.

NOTE: remember that variables can be called anything you would like, as long as you are consistent within the function definition.
We could also have defined the function in this way.

> F = function(x){m*x + b}

So now let's see why this is useful. Imagine we want to know the output when our input is 1.

> F(1)

This gives the output when our input is one. Before, when we wanted to plug in the value 1 for our input, we typed the following.

> m*1 + b

Function definitions make evaluating functions at given values much easier. We can also define a set of input and output vectors very simply in order to plot them.

> x = 1:5
> f = F(x)
> plot(x,f,type="o")

Now let's change the input vector to see a different part of the line.

> x = -10:0
> plot(x,F(x),type="o")

We can even call the function F inside the plot command to simplify our life more, since we know F(x) will give us the output values we want.

What if we want to plot a different line by changing the values of m and b? Since m and b are included in our definition of F, this is very simple. First redefine m and b. Then just replot the function.

> m = -1
> b = 0
> plot(x,F(x),type="o")

As you can see, R knows just how to handle this since we have used functions. F looks up the current values of m and b each time it is called, so the new slope and intercept take effect without redefining F. We only had to define our function once, and then it becomes simple to evaluate and plot the function as we change parameters or change the vector of input values.

In the lecture you discussed point-slope forms of lines by using a function of the following form.

f(x) = m*(x - x0) + y0

Here we must define the parameters m, x0, and y0 to plot this line. Try using the following parameters.

> m = -65
> x0 = 8
> y0 = 305

Define your input vector, x.

> x = 8:13

Define the point-slope function.

> F = function(x){m*(x-x0)+y0}

Plot the function.

> plot(x,F(x),type="o")

What could this line represent? Give an interpretation that could relate to everyday life.

Let's try one more function to make sure we get it. This time we will plot a parabola, which has three parameters: a, h, and k. For fun we will let our input be called w.
> a = 1
> h = 0
> k = 0
> parabola = function(w){a*(w-h)^2+k}

First let's evaluate our function "parabola" at a couple of points and see what we get.

> parabola(-3)
> parabola(2)

Let's define a new input vector over which to plot this parabola. We'll call this input vector "y".

> y = seq(from=-3, to = 3, by = .1)
> plot(y,parabola(y),type = "o")

There is our parabola! Change the values of the parameters and see what happens to the parabola. If you have any questions about function definitions or how to use them, please ask now.

Biology Background

Physicians treat many patients for otitis media, commonly known as an ear infection. When the middle ear becomes infected with bacteria, it can cause pressure and pain. Bacteria such as Streptococcus pneumoniae or Pseudomonas aeruginosa may be the culprits behind painful earaches. As you know, antibiotics are used to treat bacterial infections. Amoxicillin is often prescribed to children and adults for such infections. Amoxicillin inhibits the synthesis of the bacterial cell wall, which ultimately kills the bacteria. It has proven to be a very effective medication for treating otitis media.

A natural question to ask is: how does antibiotic dosage affect the amount of bacteria killed? This can be thought of as antibiotic efficacy. How much antibiotic is needed to efficiently kill the bacteria causing an infection? If you give a patient 250 mg of amoxicillin, how many bacteria will that kill? How about if you give them a dose of 500 mg? Will that kill twice as many as a dose of 250 mg? Take a minute to think about what a plot might look like if you were to draw it on the following figure.

Model describing Antibiotic Efficacy

We will use the following function to describe antibiotic efficacy.

E(d) = M*d^2 / (k^2*M^2 + d^2)

In this equation our input, d, represents the dose of antibiotic, and the output, E(d), is how we will measure efficacy: the number of bacteria killed in response to the antibiotic. There are two parameters in this equation: M and k.
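Before plotting, it can help to see numerically what the two parameters control. The following is only a minimal sketch: it uses the formula as it is entered in R later in this lab, but the function name efficacy_sketch, the choice to pass M and k as arguments, and the names M_demo and k_demo are illustrative assumptions, not part of the lab's own code. It checks three properties of the model: a zero dose kills nothing, a dose of d = k*M gives exactly half of M, and very large doses level off just below M.

```r
# Hypothetical helper (not the lab's E): the same efficacy formula,
# with the parameters M and k passed in as arguments.
efficacy_sketch = function(d, M, k){
  M*d^2/(k^2*M^2 + d^2)
}

M_demo = 50   # illustrative value for M
k_demo = 5    # illustrative value for k

# Zero dose: nothing is killed, so the result is 0.
efficacy_sketch(0, M_demo, k_demo)

# Dose d = k*M (here 250): the result is M/2 (here 25).
efficacy_sketch(k_demo*M_demo, M_demo, k_demo)

# Very large dose: the result approaches M (here 50) but stays below it.
efficacy_sketch(1e6, M_demo, k_demo)
```

Under this reading, M sets the ceiling of the curve and k*M is the dose at which half of that ceiling is reached.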
Let's define this function in R and plot it to see what it looks like. Before the function definition, define the parameters as shown.

> M = 50
> k = 5
> E = function(d){M*d^2/(k^2*M^2+d^2)}

Let's first set up a blank plot of the appropriate size and label it.

> plot(c(0,1100),c(0,90),type="n",ann=FALSE)
> title(main = "Antibiotic Efficacy",xlab = "Amoxicillin Dosage (mg)",ylab = "# of Dead Bacteria (*10^8)")

Since our input values are dosages, we will define them to be in a realistic dosage range. Patients are generally prescribed less than 1000 mg/day, so we will set our dosage vector to range from 0 to 1100.

> DoseVector = seq(from=0,to=1100,by=25)
> lines(DoseVector,E(DoseVector),col="blue")

Think about this curve. Why is it a good model for antibiotic efficacy? Let's now see how our model relates to some sample data.

Hypothetical Experiment

Suppose a research lab conducted a study on the efficacy of amoxicillin. For six months, the lab monitored patients who presented with otitis media. Each patient was given one of the following doses of amoxicillin (100, 250, 500, or 750 mg) for their first day of treatment. The amount of dead bacteria was recorded before treatment continued on the second day. The table below shows this data for 60 patients.

PATIENT DATA

Dose   # Dead Bacteria    Dose   # Dead Bacteria    Dose   # Dead Bacteria    Dose   # Dead Bacteria
100    5.3                250    9.75               500    7.36               750    8.07
100    6.21               250    11.91              500    9.01               750    11.44
100    9.22               250    15.08              500    15.12              750    14.94
100    8.99               250    18.75              500    23.74              750    20.05
100    9.35               250    19.23              500    26.93              750    21.03
100    9.97               250    22.52              500    28.08              750    28.34
100    6.63               250    24.82              500    40.13              750    40.24
100    6.95               250    24.89              500    41.4               750    42.81
100    7.32               250    25                 500    42.57              750    46.9
100    4.77               250    22.39              500    47.74              750    59.21
100    4.83               250    22.84              500    48.27              750    59.76
100    5.2                250    23.74              500    49.02              750    60.25
100    3.81               250    19.27              500    49.92              750    67.55
100    4.03               250    20.07              500    49.99              750    68.26
100    4.1                250    20.81              500    50                 750    69.77

Let's add these data points to our plot in R.
We will first plot the points for patients who were given 100 mg. In order to do this we can use the rep() command to make a vector with repeated dosages. First let's see what rep() does.

> rep(1,times=5)

As you can see, this creates a vector with five ones. We will now use rep() when plotting our data points, since we have repeated dosages.

> points(rep(100,times=15),
    c(5.3,6.21,9.22,8.99,9.35,9.97,6.63,6.95,7.32,4.77,4.83,5.2,3.81,4.03,4.1),pch = 3)
> points(rep(250,times=15),
    c(9.75,11.91,15.08,18.75,19.23,22.52,24.82,24.89,25,22.39,22.84,23.74,19.27,20.07,20.81),pch = 3)
> points(rep(500,times=15),
    c(7.36,9.01,15.12,23.74,26.93,28.08,40.13,41.4,42.57,47.74,48.27,49.02,49.92,49.99,50),pch = 3)
> points(rep(750,times=15),
    c(8.07,11.44,14.94,20.05,21.03,28.34,40.24,42.81,46.9,59.21,59.76,60.25,67.55,68.26,69.77),pch = 3)

You should now have your data points plotted as black + signs on your plot. Does our model (the blue curve) look like a good fit for the experimental data? Why or why not? What could be missing in our model? Why doesn't the data fit the curve we plotted before?

STOP here and make sure you answer the questions before continuing!!! They will be needed for your report.

We have not taken body mass into account! The data table also does not tell us how big the patients receiving the specified doses of amoxicillin were. They could be infants or large sumo wrestlers. Our model actually does include body mass, in the parameter M, which we set to 50 when we originally plotted our function. This body mass is measured in kilograms, so what we have plotted so far is a curve describing antibiotic efficacy for a person weighing 50 kg, or about 110 lbs.

We now need to see how our model changes for different parameter values. We will add lines to our existing plot to show what the curves look like for different body masses.
We will use the following values of mass to represent different groups of people and will plot each line in a different color.

M = 10     Babies
M = 25     Children
M = 50     Small Adults
M = 75     Medium Adults
M = 100    Large Adults

> M = 10
> lines(DoseVector,E(DoseVector),col=2)
> M = 25
> lines(DoseVector,E(DoseVector),col=3)
> M = 50
> lines(DoseVector,E(DoseVector),col=4)
> M = 75
> lines(DoseVector,E(DoseVector),col=5)
> M = 100
> lines(DoseVector,E(DoseVector),col=6)

We have now plotted curves for five different body masses. You can see that the curves for different body masses are very different, which could account for the spread in the data points we plotted from the table. Let's now look at a data table that includes the mass of each patient.

Dose   mass     #Dead    Dose   mass     #Dead    Dose   mass     #Dead    Dose   mass     #Dead
100    5.74     5.3      250    10.15    9.75     500    7.4      7.36     750    8.08     8.07
100    6.96     6.21     250    12.68    11.91    500    9.09     9.01     750    11.44    11.44
100    13.31    9.22     250    16.78    15.08    500    15.48    15.12    750    15.09    14.94
100    31.94    8.99     250    22.58    18.75    500    25.25    23.74    750    20.42    20.05
100    28.93    9.35     250    23.47    19.23    500    29.24    26.93    750    21.47    21.03
100    21.71    9.97     250    31.41    22.52    500    30.74    28.08    750    29.43    28.34
100    52.75    6.63     250    44.38    24.82    500    50.28    40.13    750    43.65    40.24
100    49.48    6.95     250    45.59    24.89    500    53.06    41.4     750    47.01    42.81
100    45.9     7.32     250    49.08    25       500    55.86    42.57    750    52.68    46.9
100    78.86    4.77     250    80.66    22.39    500    73.62    47.74    750    73.38    59.21
100    77.75    4.83     250    76.91    22.84    500    76.56    48.27    750    74.5     59.76
100    71.35    5.2      250    69.15    23.74    500    81.86    49.02    750    75.52    60.25
100    101.01   3.81     250    106.19   19.27    500    94.39    49.92    750    94.16    67.55
100    95.14    4.03     250    99.39    20.07    500    98.04    49.99    750    96.53    68.26
100    93.15    4.1      250    93.32    20.81    500    99.96    50       750    102.08   69.77

We will now color code these points according to their masses when plotting them. Use the color codes shown in the table below.
mass <= 20          Babies           col = 2
20 < mass <= 40     Children         col = 3
40 < mass <= 60     Small Adults     col = 4
60 < mass <= 80     Medium Adults    col = 5
80 < mass           Large Adults     col = 6

We will plot all the values for the baby masses first as an example of how to plot the other points.

> points(c(100,100,100,250,250,250,500,500,500,750,750,750),
    c(5.3,6.21,9.22,9.75,11.91,15.08,7.36,9.01,15.12,8.07,11.44,14.94),pch=3,col=2)

You should now have your data points for baby masses plotted in red.

> points(c(100,100,100,250,250,250,500,500,500,750,750,750),
    c(8.99,9.35,9.97,18.75,19.23,22.52,23.74,26.93,28.08,20.05,21.03,28.34),pch=3,col=3)

You should now have your data points for children's masses plotted in green.

> points(c(100,100,100,250,250,250,500,500,500,750,750,750),
    c(6.63,6.95,7.32,24.82,24.89,25,40.13,41.4,42.57,40.24,42.81,46.9),pch=3,col=4)

You should now have your data points for small adult masses plotted in blue. Follow these examples to plot the data points for medium adults (light blue) and large adults (magenta).

How does the data seem to match our model now? Does it appear that our model is a good fit for the data collected? What have you learned about models and how they relate to data from this lab?
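Optional: the color coding in this lab was done by sorting the points into groups by hand. As a rough sketch of how R could do that sorting for us, the code below uses the built-in cut() command on the fifteen patients from the table who received 100 mg. The variable names mass, dead, dose, group, and point_col are illustrative choices and not part of the lab's required code. With its default settings, cut() includes the right endpoint of each interval, which matches rules such as "20 < mass <= 40".

```r
# Masses (kg) and dead-bacteria counts for the fifteen 100 mg patients,
# copied from the data table that includes mass.
mass = c(5.74, 6.96, 13.31, 31.94, 28.93, 21.71, 52.75, 49.48, 45.9,
         78.86, 77.75, 71.35, 101.01, 95.14, 93.15)
dead = c(5.3, 6.21, 9.22, 8.99, 9.35, 9.97, 6.63, 6.95, 7.32,
         4.77, 4.83, 5.2, 3.81, 4.03, 4.1)
dose = rep(100, times = 15)

# cut() assigns each mass to an interval: (0,20], (20,40], (40,60], (60,80], (80,Inf).
# labels = FALSE returns the interval number (1 to 5) instead of a text label.
group = cut(mass, breaks = c(0, 20, 40, 60, 80, Inf), labels = FALSE)

# The lab's color codes run from 2 (babies) to 6 (large adults),
# so the color is the group number plus one.
point_col = group + 1

# Blank plot as in the lab, then all fifteen points in a single call.
plot(c(0,1100), c(0,90), type = "n", ann = FALSE)
points(dose, dead, pch = 3, col = point_col)
```

Because col accepts a vector, each point gets its own color, so a single points() call can replace one call per mass group.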