
A UW database technician has brought you the following dataset (server) and this explanation:
We have 1000 random times from the last year at which we recorded the usage level on our server, and for
each measurement we also recorded the response time the server was taking per request at that moment. We
know that as usage rises the server's response time gets longer, but we need to know more exactly what
that relationship is. A few months ago we predicted badly and the server actually crashed for a few days.
We don't want that to happen again.
We don’t want to read a report. We just want:
1) A prediction equation that tells us the response time based on the usage level.
(Delete negatives and zeros)            time = 3.682 - 0.0007356*usage + 0.00002694*usage^2
(Delete negatives, leave in zeros)      time = 2.857 + 0.002583*usage + 0.00002392*usage^2
(Delete zeros, make negatives positive) time = 3.685 - 0.0007526*usage + 0.00002629*usage^2
A bad way to do it:
(What you get if you DON'T fix any errors) time = 2.788 + 0.003153*usage + 0.00002303*usage^2
2) Predictions specifically for when the usage level is:
a. At 200: 4.58608
b. At 1000: 29.2224
c. At 5000: 657.172 (although this is extrapolation)
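These three numbers come straight from the prediction equation in 1). A quick check in R (the function name predict_time is just illustrative, not part of the original analysis):

```r
# Prediction equation from the fit (negatives flipped positive, zeros dropped)
predict_time <- function(usage) 3.685 - 0.0007526 * usage + 0.00002629 * usage^2

predict_time(200)   # 4.58608
predict_time(1000)  # 29.2224
predict_time(5000)  # 657.172 (extrapolation: well beyond the observed usage range)
```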
3) A rule for how much we can expect the response time to climb per unit increase in usage.
How fast it climbs depends on how high the usage already is: because the fit is quadratic, the slope is -0.0007526 + 2(0.00002629)*usage, so each additional unit of usage costs more time at higher usage levels. See the plot of the fitted curve.
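That rule is just the derivative of the prediction equation; a small sketch (climb_per_unit is an illustrative name):

```r
# Slope of the quadratic prediction equation: d(time)/d(usage)
# where time = 3.685 - 0.0007526*usage + 0.00002629*usage^2
climb_per_unit <- function(usage) -0.0007526 + 2 * 0.00002629 * usage

climb_per_unit(200)   # ~0.0098 s of response time per extra unit of usage
climb_per_unit(1000)  # ~0.0518 s per extra unit of usage
```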
4) The code you used to analyze the data
# Read in the dataset (tab-delimited)
server <- read.delim("C:\\Users\\scrawfo8\\AppData\\Local\\Temp\\RtmpGAZdjX\\data7504008b4")

# Look at the raw relationship
plot(server$time ~ server$usage)

# Fix the data errors: make negative times positive, then drop the zeros
server[server$time < 0, 2] <- server[server$time < 0, 2] * -1
server <- server[server$time > 0, ]

# Eyeball two candidate shapes for the relationship
plot(server$time ~ server$usage^2)
plot(server$time ~ exp(server$usage))

# Fit a quadratic model and a log-linear model
fit1 <- lm(time ~ usage + I(usage^2), data = server)
fit2 <- lm(log(time) ~ usage, data = server)

# Diagnostic plots for both fits
par(mfrow = c(2, 2))
plot(fit1)
plot(fit2)
par(mfrow = c(1, 1))

summary(fit1)

# Plot the fitted prediction equation
usage <- seq(0, 1000, length = 1000)
time <- 3.685 - 0.0007526 * usage + 0.00002629 * usage^2
plot(time ~ usage, type = "l")
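Retyping the coefficients by hand, as in the last three lines above, invites transcription errors; predict() applies the fitted model directly. A sketch of that approach, with simulated data standing in for the real server file (which isn't available here):

```r
# Simulated stand-in for the cleaned server data: quadratic trend plus noise
set.seed(1)
usage  <- runif(1000, 0, 1000)
time   <- 3.685 - 0.0007526 * usage + 0.00002629 * usage^2 + rnorm(1000, sd = 0.5)
server <- data.frame(usage = usage, time = time)

# Same quadratic fit as fit1 in the analysis above
fit1 <- lm(time ~ usage + I(usage^2), data = server)

# Let predict() evaluate the fitted equation at the requested usage levels
predict(fit1, newdata = data.frame(usage = c(200, 1000, 5000)))
```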