Stat 330 - Homework 1

Maximum score is 20 points, due date is Wednesday, Sep 6th 12pm. You can either hand in the solution electronically with WebCT or on paper during class.

1 Broken Components

Four components are inspected and three events are defined as follows:
   A: “all four components are found to be defective”
   B: “exactly two components are found defective”
   C: “at most three components are found defective”
Interpret the following events. Start by defining a suitable sample space Ω:
a) B ∪ C.
b) B ∩ C.
c) A ∪ C.
d) A ∩ C.
(2 points)

2 Playing Sports

In a group of 40 students, all play at least one of badminton, volleyball or table tennis.
   8 students play all three games,
   10 students play badminton and table tennis,
   20 students play table tennis and volleyball,
   12 students play badminton and volleyball,
   30 students play table tennis,
   25 students play volleyball.
Draw a Venn diagram of the situation.
(a) How many of the students play only badminton?
(b) How many of the students play badminton?
Assume one student is picked at random out of this group. What is the probability that he/she
(c) plays badminton?
(d) plays at least two sports (out of badminton, volleyball and table tennis)?
Assume the student you’ve picked is a volleyball player.
(e) What is now the probability that he/she is a badminton player?
(f) Why is the probability for picking a badminton player in (e) different from the probability in (c)?
(3 points)

3 Sample Spaces

Suggest suitable sample spaces, and identify the subset corresponding to the event A, for the following situations:
(a) A coin, which can show heads (H) or tails (T), is tossed three times; A is the event that the coin shows heads twice.
(b) A game of football is played; A is the event that the match is drawn.
(c) A couple have two children; A is the event that both are girls.
(d) A shot hits a circular target of radius 10 centimetres; A is the event that the shot hits within 3 centimetres of the centre of the target.
(e) A money value x is chosen at random from the financial pages of a newspaper; A is the event that the first digit of x is 1.
In each case discuss what the probability P(A) might be, commenting on how precise you can be, and on any assumptions you need to make.
(5 points)

4 Counting

(a) The Delta Secret Philosophy society has just recruited three football players, two nerds, and four computer geeks. Big Nico picks four of these new members at random for the philosophical discussion team.
   (i) How many different sets of four new members are possible?
   (ii) How many of those sets of 4 contain both nerds and two computer geeks?
(b) Egg Hunt: How many different possibilities are there to distribute 20 eggs to three nests?
(c) At a party six people shake hands: each one shakes hands with everybody else (only once). How many handshakes does that make in total?
(d) A delivery of 50 transistors contains 40 good ones and 10 defectives. In a test five of them are checked. How many possibilities are there to have 3 good ones and 2 defective transistors in the test set?
(e) Someone has 15 books - 3 on cooking, 5 on music and 7 novels. How many ways are there to arrange the books on a shelf, if books on the same topic are supposed to be together?
(f) How many ways are there to arrange the letters M I S S I S S I P P I into different “words”?
(6 points)

5 Break-in

A hacker wants to break into a computer, which is password-protected. Assume that there are 300 equally likely passwords out of which only one is the correct password. The hacker chooses passwords from this set independently at random and tries them. Determine the probability that the hacker is successful in exactly the 5th attempt if
a) unsuccessful passwords are not eliminated from further selections,
b) unsuccessful passwords are eliminated.
(4 points)

Stat 330 - Homework 2

Maximum score is 20 points (plus 2 extra points), due date is Friday, Sep 17th 12pm.
You can either hand in the solution electronically with WebCT or on paper during class.

6 Voter Percentages

The following two-way table shows the age of U.S. civilians and their participation in the presidential election of 1984. Each column refers to a single age group and contains the number of people in that age group who voted, registered but didn’t vote, and didn’t register to vote. The total population in each age group is given at the bottom of the table. The numbers are millions of people.

                                    Age Group
Voting Status          18-24   25-34   35-44   45-64    ≥ 64   Total
Voted                   11.4    22.0    19.5    30.9    18.1   101.9
Registered, no vote      3.0     3.5     2.3     3.0     2.5    14.3
Not registered          13.6    14.8     8.9    10.4     6.2    53.9
Total                   28.0    40.3    30.7    44.3    26.8   170.1

(a) For each age group compute the proportion of the group that voted in the 1984 presidential election. Compute the proportion of the total population that voted in the 1984 election.
(b) Write a brief description of the relationship between voting behavior and age.
(c) Suppose we are interested in 18-24 year-old voters.
   i. Find the percentage of 18-24 year-olds that voted in 1984. (You can get this from (a).)
   ii. Find the percentage of 1984 voters that are 18-24 years old.
   iii. These two percentages differ because they relate to two different conditional probabilities - which ones? Write the mathematical expressions.
(5 points)

7 Milk

There are five containers of milk on a shelf; unknown to you, two of them have passed their use-by date. You grab two at random. What’s the probability that neither has passed its use-by date? Suppose someone else has got in just ahead of you, taking one container after examining the dates. What’s the probability that the two you take after that are ahead of their use-by dates?
(2 points)

8 From Trivedi, Problem 4, p. 44

Plants A, B, C produce 35%, 15% and 50% of the total output. Nondefective output is 75%, 95% and 85% respectively. A customer receives a defective product. What is the probability that it came from plant C?
(3 points)

9 Debugging with Probabilities

Consider the following program segment:

    if B then
        while B1 do S
    else
        while B2 do S

Assume that P(B = true) = p, P(B1 = true) = 2/5, and P(B2 = true) = 3/5. S prints the message ’good day’ in each iteration. Past evaluations have shown that the probability for exactly two ’good day’ messages is 3/25. Use a tree diagram to determine the value of p. (You will not be able to draw the whole tree, since it is infinitely large. Just draw the part you need to solve the problem.)
(3 points)

10 Missile Protection System

A missile protection system consists of n radar sets operating independently, each with probability 0.9 of detecting a missile entering a zone that is covered by all of the units.
a) Find a nice expression for the probability that at least one unit detects the missile. (Tip: what is the probability that a missile is not detected by any of the sets? What has this probability to do with the above?)
b) How large must n be if we require that the probability of detecting a missile that enters the zone be 0.999? 0.999999?
c) If n = 5 and a missile enters the zone, what is the probability that exactly 4 sets detect the missile? At least one set?
(4 points)

11 Taxis in Athens

(From D. Kahneman & A. Tversky (1982) “Evidential Impact of Base Rates” in D. Kahneman, P. Slovic & A. Tversky (eds) Judgement Under Uncertainty, Cambridge.)
You are a witness of a night-time hit-and-run accident involving a taxi in Athens. All taxis in Athens are blue or green. You swear, under oath, that the taxi was blue. Extensive testing shows that under the dim lighting conditions, discrimination between blue and green is 75% reliable, i.e. the probability to mistake one color for the other is 25%. Is it possible to calculate the most likely colour for the taxi of the accident? (Hint: distinguish between the proposition that the taxi is blue and the proposition that it appears blue and think BAYES.)
What now, given that 9 out of 10 Athenian taxis are green?
(5 points)

Stat 330 - Homework 3

Maximum score is 20 (+2) points, due date is Friday, Sep 19th 12pm. You can either hand in the solution electronically with WebCT or on paper during class.

12 Driving Test

An individual repeatedly attempts to pass their driving test. Suppose that the probability that the test is passed is 0.25, and that the results of successive tests are independent. Let X be the random variable corresponding to the number of tests taken until the individual passes. Find the probability mass function of X, and evaluate the probability that
a) the test is passed in three or fewer attempts,
b) five or more attempts are necessary to pass the test.
(4 points)

13 Sum of two Dice

Assume you throw two dice. Denote by X the random variable for their sum. Find the probability mass function of X, i.e. fill out the following table:

   x     P(X = x)
   2
   3
   4
   ...

What is the expected value and the variance of X?
(4 points)

14 Hat Problem

Three players enter a room and a red or blue hat is placed on each person’s head. The color of each hat is determined by a coin toss, with the outcome of one coin toss having no effect on the others. Each person can see the other players’ hats but not his own. No communication of any sort is allowed, except for an initial strategy session before the game begins. Once they have had a chance to look at the other hats, the players must simultaneously guess the color of their own hats or pass. The group shares a hypothetical $3 million prize if at least one player guesses correctly and no players guess incorrectly.
One obvious strategy for the players, for instance, would be for one player to always guess “red” while the other players pass.
a) What is the expected amount of money the players win following the above strategy?
b) Suggest a different strategy and compute the expected win for it (extra point, if your strategy is better than the above).
(4 + 1 points)

15 Racing Candidate

Suppose that 40% of a large population are in favor of candidate A. A pollster selects a random sample of 20 people and asks them about their opinion about the candidate. Let X be the number of people in favor of candidate A.
a) What is the sample space of X, what is the distribution of X?
b) What is the probability that 10 people in the sample are in favor of candidate A?
c) What is the probability that between 6 and 10 people in the sample are in favor of candidate A? (limits are included)
d) What is the probability that the majority of people in the sample favors candidate A?
e) What is the expected value of X, what is the variance?
(4 points)

16 Snow in October

After October 1st the probability that a blizzard will occur on any particular day in the Midwest of Northern America is 0.1. To simplify the problem, assume for the following questions that October has infinitely many days.
a) What is the probability that there won’t be a blizzard in the first 10 days of October?
b) What is the probability that the first blizzard will occur on October 20?
c) What is the expected date for the first blizzard (starting October 1st)?
d) Extra point question: How does the distribution change when October only has 31 days?
(4+1 points)

Stat 330 - Homework 4

Maximum score is 20 points, due date is Friday, Sep 26th 12pm. You can either hand in the solution electronically with WebCT or on paper during class.

17 Finding Anthony Smith

You are trying to locate an old high school friend who lives in Chicago. Unfortunately, your friend’s name is Anthony Smith and the Chicago phone book lists phone numbers for 24 different people named Anthony Smith.
a) If you call 10 of these Anthony Smiths at random, what is the probability that you will call your friend? (Assume that your friend’s phone number is listed in the phone book, and that you don’t call anybody twice.)
b) Let X be the number of calls you need to make until you find your friend.
Give the probability mass function for X.
c) How many calls do you expect to make until you find your friend? (Again, assume that your friend’s phone number is listed in the phone book, and that you don’t call anybody twice.)
(4 points)

18 Tossing a Coin

A fair coin is tossed 10 times. Let X be the random variable corresponding to the difference between the number of heads and the number of tails observed.
a) Find the image of X and compute the probability mass function of X. (Hint: Instead of looking at X directly, define a new random variable Y, which just counts the number of heads in 10 tosses. What is the relationship between X and Y?)
b) Draw a diagram to show the probability mass function. Would you be able to tell the expected value of X from the diagram? Explain.
c) Compute the variance and standard deviation of X.
(4 points)

19 Deaths from Horse Kicks

After analyzing data from the Prussian cavalry for a period of 20 years, statisticians came to the conclusion that the number of deaths from horse kicks follows a Poisson distribution with λ = 0.6. Let X be the number of deaths from horse kicks in 20 years. What is the probability that
a) no deaths occurred?
b) one or two deaths occurred?
c) 3 or more deaths occurred?
d) more than 3 deaths occurred?
(2 points)

20 Which distribution is it?

Some jobs submitted for processing on a particular CPU have fatal programming errors, while others do not. Suppose that the long run fraction of jobs with fatal programming errors is p = 0.05.
a) Find the probability that among the next 10 jobs submitted there are fewer than 3 with fatal errors. Define a random variable and state your distribution assumption.
b) Find the probability that among the next 25 jobs submitted there are fewer than 5 with fatal errors. Define a random variable and state your distribution assumption.
c) One begins monitoring the processing of jobs and lets X = the total number of jobs processed before the first with a fatal error.
What is the distribution of X? Find P(X ≤ 30).
d) Monitor the processing of jobs again and define Y = the total number of jobs processed until the 2nd with a fatal error. Find the probability mass function for Y.
(6 points)

21 Discrete Compound PMFs

The joint pmf of two discrete r.v. X and Y is given as:

   X\Y     −2      −1       1       2
   −1     1/16    1/8     1/8     1/16
    0     1/16    1/16    1/16    1/16
    1     1/16    1/8     1/8     1/16

a) Find the following probabilities:
   (i) P(X ≥ 2)
   (ii) P(X > Y)
   (iii) P(Y > 0)
b) Are X and Y independent?
c) Are X and Y uncorrelated?
(4 points)

Stat 330 - Homework 5

Maximum score is 20+2 points, due date is Friday, Oct 10th 12:00pm. You can either hand in the solution electronically with WebCT or on paper during class.

22 Robodogs

Imagine you and your friend have built your own Robodogs. Now, you want to battle against each other (the Robodogs are going to bark at each other until the battery of one of them gives up). Using the same kind of batteries you know that your Robodog lasts on average for five hours, while your friend’s Robodog lasts for four hours.
Let X be the time until your Robodog’s battery stops working and Y be the time until your friend’s Robodog stops barking. Assume further that X and Y are exponentially distributed variables, i.e. $X \sim Exp_\lambda$, $Y \sim Exp_\mu$.
(a) Find µ and λ.
(b) Let Z be the time until the contest is over, i.e. until one of the dogs stops barking. Z can then be written as Z = min(X, Y). Find the distribution of Z, i.e. compute P(Z ≤ z).
Hint: min(X, Y) ≤ z ≡ (X ≤ z) ∪ (Y ≤ z); you can also assume that X and Y are independent variables.
(c) Do you recognize the distribution of Z? State expected value and variance. Interpret the expected value.
(3+2 points)

23 Density Functions

(a) What are the two properties of a probability density function f(x)?
(b) Which of the following are valid probability density functions? Explain why or why not; a yes or no is not a sufficient answer.
$f(x) = \begin{cases} x^2 + 2x, & \text{for } -1 \le x \le 1 \\ 0, & \text{otherwise.} \end{cases}$
$g(x) = \begin{cases} 2x, & \text{for } 0 \le x \le 1 \\ 0, & \text{otherwise.} \end{cases}$
(c) Find expected value and variance of the random variable X with distribution function
$F_X(t) = \begin{cases} 0 & \text{for } t < 0 \\ 1 - e^{-3t} & \text{for } t \ge 0 \end{cases}$
(5 points)

24 Continuous Random Variables

For a continuous random variable X the following function is given:
$f(x) = \begin{cases} k(2 - 2x) & \text{for } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$
(a) Find k, so that f(x) is a density function.
(b) Compute P(X = 0.5).
(c) Compute P(X ≤ 0.5).
(d) Find E[X].
(4 points)

25 Customers in a Bank

In the Ames International Campus Bank (open 24h every day) 5 customers arrive on average during an hour. For the following questions state each time which random variable you use and what distribution assumption you make.
(a) What is the probability that during an hour no customer arrives?
(b) What is the probability that during an hour more than 7 customers arrive?
(c) What is the probability that there’s more than 30 minutes between the 2nd and 3rd customer on New Year’s Day?
(d) Starting at some time 0: what is the probability that the first customer arrives after exactly 10 min? Within the first ten minutes?
(e) What is the probability that you have to wait less than an hour for seven customers to arrive?
(f) How many minutes do you expect to wait until the 12th customer arrives?
(g) How many minutes do you expect to wait on average between arrivals?
(h) How many customers do you expect to arrive within 3 hours?
(i) Why can’t we compute the probability that the bank is empty at a given time t? (yet)
(8 points)

Stat 330 - Homework 6

Maximum score is 20 points, due date is Wednesday, Mar 5th 4:00pm. You can either hand in the solution electronically with WebCT or on paper during class.

26 Apple-Tree Farm

Able and Baker are both apple-tree farmers (they grow apple trees). Assume that apple trees grow according to a normal distribution. On the Able Farm, trees grow with a mean of 1m per year and a standard deviation of 25 cm.
Baker manages to get an average growth of 1.1m per year with a standard deviation of 35cm. For the following questions again state each time which random variable you use and what distribution assumption you make.
(a) What is the probability for a tree on the Able Farm to grow more than 1.2m in a year? What if the tree was on the Baker Farm?
(b) What is the probability for a tree on the Able Farm to grow less than 0.8m in a year? What if the tree was on the Baker Farm?
(c) What is the probability for a tree on the Able Farm to grow between 0.7m and 0.9m in a year? What if the tree was on the Baker Farm?
(d) Assume you’ve got two trees, one from Able and one from Baker. What can you say about the difference D in their heights? What is the distribution of D?
(e) On average, trees from the Baker farm will grow more than Able’s trees. But what is the exact probability that a Baker tree has grown more than an Able tree in one year?
(4 points)

27 Normal Approximation of Binomial

(a) Find the probability of exactly 55 heads in 100 tosses of a coin; compute both the actual value and an approximation.
(b) A certain college would like to have 1050 freshmen. This college cannot accommodate more than 1060. Assume that each applicant accepts with probability 0.6 and that the acceptances can be modeled by a Binomial distribution. If the college accepts 1700, what is the probability that it will have too many acceptances?
(2 points)

28 1st Midterm Exam

1. Counting methods
(i) Suppose that a restaurant offers 7 different starters, 12 main courses, and 9 desserts. How many different three course meals does the restaurant offer?
(ii) Suppose that 20 candies are distributed to five children. How many distributions are there if all candies are different?
(iii) How many anagrams (= different “words”) are there from the letters ‘BANANA BREAD’? Do not ignore the space and count rearrangements of these letters so that there are two distinct words (each must have at least one letter).
(iv) A club has 20 members; 12 are women and 8 are men. A committee of 6 members is to be chosen. How many different committees are possible if:
   i. There are no other restrictions.
   ii. The committee must have 4 women and 2 men.
   iii. The committee must have at least 1 woman and 1 man.

2. Lawyers and Liars
A fashionable club has 100 members, 30 of whom are lawyers. 25 of the members are liars, while 55 are neither lawyers nor liars.
(i) How many of the lawyers are liars?
(ii) A liar is chosen at random. What is the probability that he is a lawyer?

3. Continuous Distributions
The probability density function of the lifetime of a certain type of artificial heart valve (measured in minutes) is given by
$f(x) = \begin{cases} c/x^3 & \text{for } x > 10 \\ 0 & \text{for } x \le 10 \end{cases}$
(i) Find c.
(ii) Graph the distribution function (not the density!) using your value of c from above. Label the vertical axis appropriately, and give the indicated values.
   F(0) =
   F(10) =
   F(20) =
(iii) Find the probability that a valve will last at least 15 months.

4. Thanksgiving Break
I haven’t decided what to do over Thanksgiving break. There is a 50% chance that I’ll go skiing, a 30% chance that I’ll go hiking, and a 20% chance that I’ll stay home and play soccer. The probability of my getting injured is 30% if I go skiing, 10% if I go hiking, and 20% if I play soccer.
(i) What is the probability that I will get injured over the break?
(ii) If I come back from vacation with an injury, what is the probability that I got it skiing?

5. Table Look-Ups
Use the distribution tables to find the following probabilities. Show all the steps you make:
(i) $X \sim B_{10,0.3}$: P(X ≤ 5) =
(ii) $X \sim Po_{2}$: P(X < 4) =
(iii) $X \sim B_{7,0.25}$: P(3 < X < 5) =
(iv) $X \sim Po_{1.3}$: P(X = 4) =
(v) $X \sim B_{30,0.01}$: P(X ≥ 2) =

6. Discrete Compound PMFs
The joint pmf of two discrete r.v. X and Y is given as:

   X\Y     −2      −1       1       2
   −1     1/16    1/8     1/8     1/16
    0     1/16    1/16    1/16    1/16
    1     1/16    1/8     1/8     1/16

(i) Find the following probabilities:
   i. P(X ≥ 2)
   ii. P(X > Y)
   iii. P(Y > 0)
(ii) Are X and Y independent? Explain.
(iii) Are X and Y uncorrelated? Explain.

7. Traffic Tickets
During an internship you drive to work with your car each day, five days a week. The probability that you get a traffic ticket is 0.05 on any given day.
(i) What is the probability that you get at most 2 tickets in 15 days? Define a random variable and state your distribution assumption.
(ii) What is the probability that you get at least 4 tickets in the 3 months (12 weeks) you are working there? Define a random variable and state your distribution assumption.
(iii) You start monitoring your driving behavior and define
   X = number of days you drive until first ticket.
State a distribution assumption for X and give the probability that you get the first ticket within 15 days (of driving). What is the expected day of the first ticket?
(iv) Now define Y = number of days you drive until 3rd ticket.
   i. What is the expected day of the third ticket?
   ii. Find the probability mass function of Y.
(10 points)

Stat 330 - Homework 7

Maximum score is 20 points, due date is Friday, Oct 24 12pm.

29 Central Limit Theorem

A bank accepts rolls of pennies and gives 50 cents credit to a customer without counting the contents. Assume that a roll contains 49 pennies 30 percent of the time, 50 pennies 60 percent of the time, and 51 pennies 10 percent of the time.
(a) Find the expected value and the variance for the amount that the bank loses on a typical roll.
(b) Estimate the probability that the bank will lose more than 25 cents in 100 rolls.
(c) Estimate the probability that the bank will lose exactly 25 cents in 100 rolls.
(d) Estimate the probability that the bank will lose any money in 100 rolls.
(e) How many rolls does the bank need to collect to have a 99 percent chance of a net loss?
(4 points)

30 Random Number Generator

Adjust the Marsaglia-Multicarry Random Number Generator (shown below) to a programming language of your choice:

    #define znew ((z=36969*(z&65535)+(z>>16))<<16)
    #define wnew ((w=18000*(w&65535)+(w>>16))&65535)
    #define IUNI (znew+wnew)
    #define UNI (znew+wnew)*2.328306e-10
    static unsigned long z=362436069, w=521288629;
    void setseed(unsigned long i1,unsigned long i2){z=i1; w=i2;}

(2 points)

31 Webpage Counter

Based on the Marsaglia-Multicarry Random Number Generator, simulate a counter on a webpage. Assume the rate of hits per minute is 2 and that hits occur independently.
(a) Simulate the first 20 hits on the web page. Print out a table of the hit times. Draw a graph of hits versus time.
(b) Now suppose that a surfer stays on average 45 s on the page. Assume that the staying times are exponentially distributed. Simulate the staying times for the first 20 surfers. Print out a table of the staying times. Draw a graph of the number of surfers on the page over time.
(Remember: times between hits are exponentially distributed, the number of hits in k minutes is Poisson with rate kλ)
(8 points)

32 Board Game - Simulation

Assume your favourite board game lets you roll a number of dice and look for the highest die. Computing the probability that the highest die out of three is a 6 is fairly easy, as well as computing the probability that the maximum is 1. All other values are fairly tricky to get at; doing a simulation will be a lot easier.
(a) Simulate throwing three dice 1000 times. For each throw of three dice, we are interested in the maximum number of spots. Compute the (empirical) probability mass function for the maximum number of spots in each turn (i.e. compute the (empirical) probability that the maximum of the three dice is 1, 2, 3, 4, 5 or 6 (six probabilities)).
Note: If you use the software R (www.r-project.org) the command pmax computes a vector of maximal values when fed with several vectors, see also help(pmax) in R.
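For readers not working in R, the elementwise maximum described in the note on pmax can be sketched in Python. This is only an illustration under stated assumptions: the helper name pmax is borrowed from R and is not a Python built-in, the three short vectors are hypothetical hand-picked values rather than simulated throws, and, unlike R, this sketch does not recycle shorter vectors (zip simply stops at the shortest input).

```python
def pmax(*vectors):
    """Elementwise maximum across sequences (a rough analogue of R's pmax).

    Assumption: all inputs have the same length; zip() truncates to the
    shortest input instead of recycling values the way R does.
    """
    return [max(values) for values in zip(*vectors)]


# Hypothetical example: each list holds one die's result over four turns.
die1 = [1, 6, 3, 2]
die2 = [4, 2, 3, 5]
die3 = [2, 2, 6, 1]

# One maximum per turn, taken across the three dice.
highest = pmax(die1, die2, die3)
print(highest)
```

Applied to three vectors of simulated throws, the same call would give one maximum per turn, from which the empirical frequencies can be tabulated.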
(b) Compare your results of (a) for a maximum of 6 or 1 with the exact probabilities.
(c) Simulate 1000 throws of another two dice. Compute the probability that the maximum of three dice is higher than the maximum of two dice, i.e. compute how often the above maximum is higher than the maximum of the two dice for each of the 1000 turns. Does the result surprise you? Explain.
(6 points)

Stat 330 - Homework 8

Maximum score is 20 points, due date is Monday, Nov 3rd 12pm.

33 Convenience Store

A small but busy convenience store has room in its building for only 4 customers. It has 2 employees who work (interchangeably) waiting on customers and performing other tasks (such as stocking shelves). The work rule the employees follow is this: When there are 3 or 4 customers in the store, both wait on customers. When there are 2 or fewer customers in the store, only 1 employee waits on customers. Customers arrive at the store according to a Poisson process with rate λ = 1 (and go away if there is no room in the store). We will assume that they spend no time finding their items and that either employee can service them at a rate µ = 0.8.
(a) Carefully draw an appropriate birth and death process transition rate diagram for this scenario.
(b) What is the (large t) probability that there is no customer in the store?
(c) What is the probability that there are fewer than 3 customers in the store?
(d) What is the probability that an arriving customer is turned away?
(6 points)

34 Digital Communication System

A secure digital communications system has a receiving station with a CPU that can decode messages at a rate of 1 per minute, and a buffer that will hold up to 3 of these messages. Suppose that messages arrive according to a Poisson process with rate 0.7 per minute. Making the assumption that decode times are exponential random variables, answer the following questions about the large t behavior of the receiving station.
(a) Draw a state diagram of the number of messages in the system.
(b) What fraction of the time is the CPU idle?
(c) What fraction of incoming messages are lost (i.e. arrive when the buffer is full)?
(d) What is the mean number of messages in the buffer?
(6 points)

35 Bank Robberies

Chicago has been struck by a crime wave. Alarmed by the increasing number of bank robberies and concerned about their effect on bank customers, the Banking Upper Management Society (BUMS) adopts the following policies at each bank:
• one teller’s window is reserved for the exclusive use of bank robbers.
• in order to conserve space, bank robberies may be committed only by a lone bandit.
• if two or more robberies occur simultaneously, the robbers are served on a first-come, first-serve basis.
You are engaged as a consultant by the Bank Robbers Federation (BARF). Your job is to determine if the proposed arrangement with the BUMS is adequate. The data you are given is:
• Robbers arrive at random 24/7 (during open hours), the average arrival rate is 2 robberies per hour.
• Teller service is exponential with an average of 10 min (special robber withdrawal forms expedite service).
You are asked to determine
(a) the probability that there is no robber in the bank (large t probability).
(b) the probability that the robber’s teller is busy.
(c) the average number of robbers in the bank.
(d) the average time a robber must queue for service (a robbery).
(e) the average time required for a robbery (queuing time plus service time).
(f) the probability that a robber spends more than 15 min in the bank.
(the original version of this problem is due to Shelly Weinberg of IBM)
(8 points)

Stat 330 - Homework 9

Maximum score is 20 points, due date is Monday, Nov 10th 12 pm.

36 M/M/c/c Queue

Based on your notes in class for queueing systems discuss an M/M/c/c queue as thoroughly as possible. You can think of properties of interest yourself - but you should come up with at least five properties.
(Why are $W_q$ and $L_q$ very easy to compute?) Don’t forget to draw a state diagram! Give a real-life example that can be modelled as an M/M/c/c queue. Explain!
(10 points)

37 Argument at Exxtol

Two analysts at Exxtol Petrol Ltd, Log Jam and Bot L. Neck, are having an argument. They are comparing an M/M/1 system with mean arrival rate λ and mean service rate 2µ (we assume λ < 2µ) with an M/M/2 system with mean arrival rate λ and mean service rate µ for each server. Log says the M/M/2 system is best because $W_q^{M/M/2} < W_q^{M/M/1}$. Bot responds that Log has it all wrong because $W^{M/M/2} > W^{M/M/1}$, and therefore the M/M/1 system is better. Who is right? Explain.
Note that the two systems have equal capacity. Log and Bot are comparing a single-server system to a double-server system with half-speed servers.
(5 points)

38 Collecting Data - Google Response Times

Google.com claims to find “the needle in 10^9 haystacks - 10^6 times a day”. For that they have to be fairly quick. For each query Google states the response time in seconds. Make 20 queries to Google and write down the response times.
(a) Hand in (email to hofmann@iastate.edu) a table with your results in the form:

   query            stat330    egg hunt    VEISHEA    ...
   response time
   X ?
   Y ?

(b) Think about what X and Y could be. What are (potential) additional important issues that may influence your results?
(c) Describe response times in terms of data summaries. Draw a histogram of the response times. Based on the histogram, which distribution would you choose for the response times?
(5 points)

Stat 330 - Homework 10

Maximum score is 20 points, due date is Friday, Dec 5th.

39 Biased-Unbiased Parameter Estimation

Show that for n i.i.d. values $X_1, \ldots, X_n$ with $E[X_i] = \mu$ and $Var[X_i] = \sigma^2$
• $s^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$ is a biased parameter estimate for $\sigma^2$.
• $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ is an unbiased parameter estimate for $\sigma^2$.
Hint: The sum $\sum_{i=1}^{n} (X_i - \bar{X})^2$ can be written as $\left(\sum_{i=1}^{n} X_i^2\right) - n\bar{X}^2$.
If Y is a random variable, the expected value of $Y^2$ can be computed as: $E[Y^2] = Var[Y] + (E[Y])^2$
(3 points)

40 Maximum Likelihood Estimator

The Weibull distribution has two parameters, α and β; its density function f(x) is given as
$f_{\alpha,\beta}(x) = \alpha \beta^{-\alpha} x^{\alpha-1} e^{-(x/\beta)^{\alpha}}$ for $x \ge 0$.
(You could go to the web site http://calculators.stat.ucla.edu/cdf/weibull/weibullplot.php to have a look at the shape of this density function. Try to vary the parameters to get different shapes.)
Assuming response times of the Google search machine have a Weibull distribution with α = 2 (and are independent), find an ML Estimator for β. Show all five steps of the way.
The following ten numbers are Google response times (in seconds). Using your above results, find $\hat{\beta}$.

   0.10, 0.08, 0.07, 0.08, 0.04, 0.06, 0.24, 0.12, 0.05, 0.17

Draw a histogram of the response times, draw the Weibull density function with the parameter estimates you found next to it. Do they match?
(5 points)

41 Reading Assignment

Read pages 87 to 95 of the (online) lecture notes and come up with a table for all α · 100% confidence intervals described. You can use the following format:

   Parameter: µ
   Estimate: $\hat{\mu}$
   two-sided C.I.: $\hat{\mu} \pm \frac{\hat{\sigma}}{\sqrt{n}} N_{0,1}^{-1}\left(\frac{\alpha+1}{2}\right)$
   one-sided C.I.: $\hat{\mu} + \frac{\hat{\sigma}}{\sqrt{n}} N_{0,1}^{-1}(\alpha)$ or $\hat{\mu} - \frac{\hat{\sigma}}{\sqrt{n}} N_{0,1}^{-1}(\alpha)$
   ...

(2 points)

42 Large Sample Confidence Intervals

(a) Highway Speed
There is concern about the speed of automobiles traveling over a particular stretch of highway. For a random sample of thirty automobiles, radar indicated the following speeds, in miles per hour:

   82 78 64 78 77 57 74 70 81 80
   75 78 85 77 78 69 73 79 78 66
   71 70 61 65 66 57 72 67 64 74

   (i) Find the sample mean and variance.
   (ii) Find a 95% confidence interval for the mean speed of all automobiles travelling over this stretch of highway.
   (iii) Could one argue that people are speeding, if the legal speed on this highway is 65 mph?
(b) Laboratory Scale
To assess whether a laboratory scale is accurate we can take a standard weight known to weigh exactly 100 grams and weigh it repeatedly. Suppose that the scale readings are normally distributed with standard deviation σ = 0.25 grams. If the scale is accurate then the population mean µ (the mean obtained in many repeated weighings) would be 100 grams, but if the scale is inaccurate the population mean could be higher or lower.
   (i) The weight is weighed 35 times and the sample mean is X̄ = 99.90. Construct a 99% confidence interval for µ. Do you believe the scale is accurate based on this interval?
   (ii) Construct a 90% confidence interval for µ. Do you believe the scale is accurate based on this interval?
   (iii) Explain why one confidence interval finds the scale inaccurate while the other finds the scale accurate.
   (iv) The analysis is criticized because it turns out that the scale’s measurements are not approximately normal. Explain why this is not really a problem.
(c) Program Execution Times
A program was tested on 30 data sets, execution times were measured. Sample mean and deviation for execution times are: X̄ = 65 ms and s = 6 ms. Compute a 90% and a 95% confidence interval for the mean response time.
(10 points)
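
As a companion to the table format suggested in problem 41, the two-sided large-sample interval (estimate plus or minus the standard normal quantile at (α+1)/2 times σ̂/√n) might be computed as in the following Python sketch. The function name two_sided_ci and the input numbers are hypothetical and chosen only for illustration; they are not taken from any of the problems above, and the sketch assumes the normal approximation is appropriate.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_ci(mean, sd, n, level):
    """Two-sided large-sample confidence interval for a mean.

    Uses mean +/- z * sd / sqrt(n), where z is the standard normal
    quantile at (level + 1) / 2. Assumes n is large enough for the
    normal approximation to be reasonable.
    """
    z = NormalDist().inv_cdf((level + 1) / 2)
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical inputs: sample mean 10.0, sample sd 2.0, n = 100, level 95%.
low, high = two_sided_ci(mean=10.0, sd=2.0, n=100, level=0.95)
print(f"95% C.I.: ({low:.3f}, {high:.3f})")
```

Raising the level widens the interval, because the quantile at (level + 1)/2 grows while σ̂/√n stays fixed.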