November 3 Part 2

Capacity Setting and Queuing Theory
BAMS 580B
Capacity and Resources
 A key lever for improving patient flow.
 How do we measure capacity?
 What is the capacity of a 20 seat restaurant?
 A 16 bed ward?
 Capacity is a RATE
 Patients/day
 Customers/hour
 We can view a 16 bed ward as a queuing system with 16
servers
 What is the capacity of a bed?
 Does this analogy apply to the restaurant?
 A system is composed of resources with
capacities.
 Often we use the expressions “resource” and “capacity”
interchangeably (hopefully without confusion)
How Much Capacity is Needed?
or How Many Resources are Needed?
Ward Occupancy
[Figure: midnight census (axis from 15 to 30) plotted by day (axis from 0 to 400), with reference lines marking base capacity and surge capacity.]
Capacity tradeoffs when demand is variable
 Too much capacity or too many resources =
idleness
 Not enough capacity – waits
 Should we set capacity equal to demand?
 What does this mean?
 This is called a balanced system
 It works perfectly when there is no variation in
the system
 It works terribly when there is variation! Why?
• Once behind, you never can catch up.
 Queuing theory quantifies these tradeoffs in
terms of performance measures.
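The claim that a balanced system "never catches up" can be illustrated with a small simulation. This is only a sketch: it assumes a single server, exponential inter-arrival and service times, and first-come first-served order, and the function name `mean_wait` and the seed are hypothetical choices made for illustration.

```python
import random


def mean_wait(arrival_rate: float, service_rate: float,
              customers: int = 20000, seed: int = 1) -> float:
    """Average wait in queue for a simulated single-server FCFS queue."""
    rng = random.Random(seed)
    wait = 0.0    # wait of the current customer
    total = 0.0   # running sum of waits
    for _ in range(customers):
        total += wait
        service = rng.expovariate(service_rate)   # this customer's service time
        gap = rng.expovariate(arrival_rate)       # time until the next arrival
        # Lindley recursion: leftover work carries over to the next customer,
        # but idle time can never be banked for later.
        wait = max(0.0, wait + service - gap)
    return total / customers


balanced = mean_wait(arrival_rate=1.0, service_rate=1.0)   # capacity = demand
with_slack = mean_wait(arrival_rate=0.8, service_rate=1.0)  # 20% safety capacity
print(f"balanced system, mean wait: {balanced:.1f} service times")
print(f"80% utilized,    mean wait: {with_slack:.1f} service times")
```

The asymmetry sits in the `max(0.0, ...)` step: a backlog carries forward, while idle capacity is lost, so random fluctuations accumulate when there is no slack.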
Queuing Models
 (Mathematical) queuing models help us set
capacity (or determine the number of resources
needed) to meet:
 Service level targets
 Average wait time targets
 Average queue length targets
 Queuing models provide an alternative to
simulation
 They provide insights into how to plan, operate
and manage a system
 Where are there queues in the health care
system?
A single server queuing system
Buffer
Server
• A queue forms in a buffer
• Servers may be people or physical space
• The buffer may have a finite or unlimited capacity
• The most basic models assume “customers” are of one type
and have common arrival and service rates
A multiple server queuing system
Server
Buffer
Server
Server
Several parallel single server queues
Buffer
Buffer
Buffer
Server
Server
Server
Parallel Queues vs. Multiple server Queues
 Provide examples of multiple server queues
(MSQs)
 Provide examples of parallel queues (PQs)
 In what situations would each of these
queuing systems be most appropriate? Why?
Networks of queues
 Most health care systems are interconnected
networks of queues and servers with multiple
waiting points and heterogeneous customers.
 What examples have we seen in the
course?
 Often we model these complex systems
with simulation.
• But in some cases we can use formulae to get
results
Queuing Theory background
 Developed by Erlang in the early 1900s to analyze telephone systems.
 How many lines are needed to ensure that a caller who tries to dial obtains a “line”?
 Applied to analyze internet traffic,
telecommunications systems, call centers,
airport security lines, banks and restaurants,
rail networks, etc.
Queues and Variability
 There are two components of a queuing system
subject to variability
 The inter-arrival times of “jobs”
 The service times or LOS
 Why are these variable?
 We describe the variability by
 Mean
 Standard deviation
 Probability distribution
• Usually the normal distribution doesn’t fit well
• Often an exponential distribution fits well
– If we know its rate or mean we know everything about it.
The exponential distribution
 P(T ≤ t) = 1 − e^(−λt)
 The quantity λ is the rate.
 The mean and the standard deviation of the exponential distribution both equal 1/rate (1/λ).
 Example: Patients arrive at rate 4 per hour.
 The mean interarrival time is 15 minutes.
 What is the probability the time between two arrivals is less than 10 minutes (1/6 of an hour)?
• P(T ≤ 1/6) = 1 − e^(−4∙(1/6)) = 1 − e^(−2/3) = 1 − .513 = .487
 The exponential distribution underlies queuing theory.
 A queue with exponential service times and exponential inter-arrival
times and one (FCFS) server is called an M/M/1 queue.
 Exponential distributions don’t allow negative times and have a small
probability of long service times.
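The interarrival example can be checked numerically. A minimal sketch, assuming the arrival rate of 4 per hour used on this slide; the helper name `exp_cdf` is introduced here for illustration only.

```python
import math


def exp_cdf(rate: float, t: float) -> float:
    """P(T <= t) for an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t)


arrival_rate = 4.0                  # patients per hour
mean_gap = 1.0 / arrival_rate       # mean (and standard deviation) in hours
p_within_10_min = exp_cdf(arrival_rate, 10 / 60)

print(f"mean interarrival time: {mean_gap * 60:.0f} minutes")   # 15 minutes
print(f"P(gap <= 10 minutes):   {p_within_10_min:.3f}")          # 0.487
```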
Capacity management and queuing
systems
 Capacity management involves determining the
number of servers to use and the size of the
waiting rooms.
 Examples
 How many long term care beds are needed?
 How many porters are needed?
 How many nurses are needed?
 How many cubicles are needed in an ED?
 Some healthcare systems have no buffers; all the
waiting is done outside of the system or
upstream.
 ALC cases waiting for LTC beds
Analyzing a queuing system
 Inputs
 Arrival Rate
 Service Rate
 Number of Servers
 Buffer Size
 Queue Analyzer: QUEUMMCK_EMBA.xls
 Outputs
 Capacity Utilization
 Wait Time in Queue
 Queue Length
 Blocking Probability
 Service Levels
Single server queues – some definitions
 Ri (= λ) – average inflow rate (customers/time)
 1/Ri – average time between customer arrivals
 Tp – average processing time by one server
 1/Tp (= μ) – average processing rate of a single server
 c – number of servers
 Rp = c/Tp – system service rate (often c = 1)
 K – buffer capacity (often K = ∞)
A single server queuing system is stable whenever Rp > Ri
A single server queuing system is balanced whenever Rp = Ri
Examples
 A Finite Capacity Loss System
 Model for an (old-fashioned) phone system
• c servers
• K=0
• When all servers are busy, system is blocked
and customers are lost
 Performance measure – fraction of lost
jobs – this is legislated!
 Walk-in Clinic with 6 seats and 1 doctor
 c = 1
 K = 6
Characteristics and Performance
Measures
 System characteristics
 Traffic Intensity (or utilization) = ρ = arrival rate/service rate
 Safety Capacity = Rs = Service rate – arrival rate
 Performance Measures
 Average waiting time (in queue) – Ti
 Average time spent at the server – Tp
 Average flow time (in process) – T = Ti + Tp
 Average queue length – Ii
 Average number of customers being served – Ip
 Average number of customers in the system – I = Ii + Ip
Performance measure formulas
(M/M/1 queue – no limit on queue size)
 System Utilization = P(Server is occupied) = ρ
 If traffic intensity increases, the likelihood the server is occupied increases
 This occurs if the arrival rate increases or the service rate decreases
 P(System is empty) = 1 − ρ
 P(k in system) = ρ^k (1 − ρ)
 Average Time in System = 1/Safety capacity
 Average Time in Queue = Average time in system − average service time
 If safety capacity decreases, time in queue increases!
 Average Number of jobs in the system (including being served) = ρ/(1 − ρ)
 Average Queue Length = ρ²/(1 − ρ)
 If we know safety capacity, service time and traffic intensity, we can compute all system properties
 Little’s Law holds too
 number in queue = arrival rate × waiting time in queue

An Example - M/M/1 Queue

 Customers arrive at rate 4 per hour; mean service time is 10 minutes.
 Service rate is 6 per hour
 System utilization = Probability the server is occupied = ρ = 4/6 = 2/3
 Safety capacity = service rate − arrival rate = 2 per hour
 P(System is empty) = 1 − ρ = 1/3
 P(k in the system) = ρ^k (1 − ρ) = (1/3)(2/3)^k
 Average Time in system = 1/safety capacity = ½ hour
 Average Time in queue = Average time in system − average service time = ½ − 1/6 = 1/3 hour
 Average Queue Length = ρ²/(1 − ρ) = 4/3
 Suppose arrival rate increases to 5.9 customers per hour.
 Then ρ = 5.9/6 = .9833
 So P(System is empty) = .0167; Average time in system = 1/(6 − 5.9) = 10 hours and Average number of customers in the system = ρ/(1 − ρ) = 59 customers!
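The M/M/1 formulas can be collected into one small function and applied to both arrival rates in this example. This is a sketch only; the function name `mm1_metrics` and the dictionary keys are hypothetical, and rates are per hour.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state measures for an M/M/1 queue with an unlimited buffer."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate                # traffic intensity
    safety_capacity = service_rate - arrival_rate
    time_in_system = 1.0 / safety_capacity
    time_in_queue = time_in_system - 1.0 / service_rate
    return {
        "utilization": rho,
        "p_empty": 1.0 - rho,
        "time_in_system": time_in_system,            # hours
        "time_in_queue": time_in_queue,              # hours
        "number_in_system": rho / (1.0 - rho),
        "queue_length": rho ** 2 / (1.0 - rho),
    }


for lam in (4.0, 5.9):
    m = mm1_metrics(lam, service_rate=6.0)
    # Little's Law: number in queue = arrival rate x waiting time in queue
    assert abs(m["queue_length"] - lam * m["time_in_queue"]) < 1e-9
    print(f"arrival rate {lam}:")
    for name, value in m.items():
        print(f"  {name:17s} {value:8.3f}")
```

Raising the arrival rate from 4 to 5.9 per hour shrinks the safety capacity from 2 to 0.1 per hour, which is what drives the time in system from half an hour to 10 hours.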
About QUEUMMCK.xls
 An M/M/c queue is the same as an M/M/1 queue except that there
may be more than one server.
 In this model, there is a single buffer and c servers in the resource pool.
 Customers are processed on a FIFO basis.
 When there are more than c customers in the system, the buffer is
occupied and waiting for service occurs.
 An M/M/c/K queue is an M/M/c queue with a finite buffer of size K.
 There are at most K + c customers in the system.
 When the buffer is filled, the system is blocked and customers are lost.
 QUEUMMCK.xls, which is now called performance.xls, computes
performance measures including blocking probabilities for the
M/M/c/K queue.
Problem 1
 Patients arrive at rate 5/hr. They require on average 1
hour of treatment.
 How many service providers do we need to ensure that
the average wait time is 30 minutes?
 Assume a large waiting room.
 Running QUEUMMCK.xls we find that with
 6 service providers: the average wait for patients who are delayed is 1 hour and the average number waiting is 2.94
 7 service providers: the average wait for patients who are delayed is ½ hour and the average number waiting is about .81
 Note that with 7 service providers, average utilization is only 5/7 (about 71%).
 Thus we trade off waiting time against capacity utilization
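Problem 1 can be reproduced outside the spreadsheet with the standard Erlang C formula for an M/M/c queue. The sketch below is not the spreadsheet's own code; the names `erlang_b` and `mmc_metrics` are hypothetical. It reports two wait measures: the average over all patients (from Little's Law) and the average over only those patients who are delayed, 1/(c/Tp − Ri). The 1 hour and ½ hour figures correspond to the second measure.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Erlang B blocking probability, computed with the standard recursion."""
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def mmc_metrics(arrival_rate: float, mean_service_time: float, servers: int) -> dict:
    """Steady-state measures for an M/M/c queue with an unlimited buffer."""
    offered_load = arrival_rate * mean_service_time
    utilization = offered_load / servers
    if utilization >= 1.0:
        raise ValueError("unstable: need servers > arrival rate x service time")
    b = erlang_b(servers, offered_load)
    p_wait = b / (1.0 - utilization * (1.0 - b))        # Erlang C
    queue_length = p_wait * utilization / (1.0 - utilization)
    wait_all = queue_length / arrival_rate               # Little's Law
    return {
        "utilization": utilization,
        "p_wait": p_wait,
        "queue_length": queue_length,
        "wait_all": wait_all,                            # averaged over everyone
        "wait_if_delayed": wait_all / p_wait,            # averaged over waiters
    }


for c in (6, 7):
    m = mmc_metrics(arrival_rate=5.0, mean_service_time=1.0, servers=c)
    print(f"{c} providers: number waiting {m['queue_length']:.2f}, "
          f"wait if delayed {m['wait_if_delayed']:.2f} h, "
          f"wait over all patients {m['wait_all']:.2f} h, "
          f"utilization {m['utilization']:.0%}")
```

The recursion for Erlang B avoids large factorials, so the same function can be reused for much larger systems such as a ward with dozens of beds.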
Problem 2 – A LTC Facility
 Bed requests arrive at the rate of 3 per month
 Patients remain in beds for about 15 months.
 How many beds are required so that the average wait for beds is 1 month?
 Trial and error with QUEUMMCK.xls shows that 59 beds are required.
 Also we can see that there is only a 3% chance
of waiting and average occupancy is 45 beds.
 We can also do sensitivity analysis on arrival rates and lengths of stay
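A sensitivity analysis of this kind can be sketched in a few lines. The code below assumes the M/M/c model with 59 beds and uses the Erlang C formula for the chance that a request has to wait. The alternative arrival rates and lengths of stay are hypothetical values chosen only to show the shape of the analysis, and `p_wait` is an illustrative name.

```python
def p_wait(arrival_rate: float, mean_stay: float, beds: int) -> float:
    """Probability an arriving request must wait (Erlang C) in an M/M/c queue."""
    load = arrival_rate * mean_stay          # average number of occupied beds
    if load >= beds:
        return 1.0                           # unstable: every request waits
    blocking = 1.0                           # Erlang B by recursion
    for n in range(1, beds + 1):
        blocking = load * blocking / (n + load * blocking)
    utilization = load / beds
    return blocking / (1.0 - utilization * (1.0 - blocking))


beds = 59
print("arrivals/month  stay (months)  occupancy  P(wait)")
for arrival_rate in (2.7, 3.0, 3.3):         # base case is 3 per month
    for mean_stay in (14.0, 15.0, 16.0):     # base case is 15 months
        occupancy = arrival_rate * mean_stay
        chance = p_wait(arrival_rate, mean_stay, beds)
        print(f"{arrival_rate:14.1f}  {mean_stay:13.0f}  "
              f"{occupancy:9.1f}  {chance:7.1%}")
```

Reading across the rows shows how quickly the chance of waiting grows as average occupancy approaches the 59 available beds.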
Problem 3
 A walk in clinic has 3 doctors;
 Average time spent with a patient is 15
minutes
 Patients arrive at rate of 12 per hour
 How many chairs should we have in the
waiting room so only 5% of patients are
turned away?
 QUEUMMCK.xls suggests 17 chairs.
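The blocking probability for Problem 3 can be computed from the steady-state distribution of an M/M/c/K queue. This is a sketch under the model's assumptions, not the spreadsheet's implementation; `mmck_blocking` is a hypothetical name. Here each doctor serves 4 patients per hour, so the three doctors together exactly match the arrival rate of 12 per hour, and the finite waiting room is what keeps the system from growing without bound.

```python
def mmck_blocking(arrival_rate: float, service_rate: float,
                  servers: int, buffer_size: int) -> float:
    """Fraction of arrivals turned away in an M/M/c/K queue."""
    load = arrival_rate / service_rate
    weights = [1.0]                      # unnormalized P(n in system), n = 0
    for n in range(1, servers + buffer_size + 1):
        # With n in the system, min(n, servers) servers are working.
        weights.append(weights[-1] * load / min(n, servers))
    # An arrival is lost only when the system is completely full.
    return weights[-1] / sum(weights)


for chairs in range(15, 20):
    lost = mmck_blocking(arrival_rate=12.0, service_rate=4.0,
                         servers=3, buffer_size=chairs)
    print(f"{chairs} chairs: {lost:.2%} of patients turned away")
```

With 17 chairs the computed loss is very close to 5%, and each additional chair lowers it only slightly, which reflects how little the buffer helps when there is no safety capacity.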
Implications of queuing formulas
 As the safety capacity vanishes, or equivalently, the
traffic intensity increases to 1:
 waiting time increases without bound!
 queue lengths become arbitrarily long!
 In the presence of variability in inter-arrival times and
service times, a balanced system will be highly unstable.
 These formulas enable the manager to derive
performance measures on the basis of a few basic
descriptors of the queuing system
 The arrival rate
 The service rate
 The number of servers
 When the system has a finite buffer, the percentage of
jobs that are blocked can also be computed
Don’t Match Capacity with Demand
 If service rate is close to arrival rate then there will be long wait times.
 Recall average queue length = ρ²/(1 − ρ)
• If traffic intensity is near 1, queue length will be very large.
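A short tabulation makes the growth concrete. The sketch assumes the M/M/1 queue length formula ρ²/(1 − ρ); the traffic intensities listed are illustrative.

```python
def queue_length(rho: float) -> float:
    """Average M/M/1 queue length at traffic intensity rho (0 <= rho < 1)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("traffic intensity must be in [0, 1)")
    return rho ** 2 / (1.0 - rho)


print("traffic intensity  average queue length")
for rho in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99):
    print(f"{rho:17.2f}  {queue_length(rho):20.2f}")
```

Going from 50% to 90% utilization multiplies the average queue by about 16, and the last few percentage points before 1 add far more than all the earlier ones combined.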
[Figure: Queue Length vs. Traffic Intensity. Queue length (axis from 0 to 120) plotted against traffic intensity (axis from 0.5 to 1).]
Idle Capacity And Wait Time Targets
[Figure: Relationship between Wait Times and Idle Capacity. Proportion of patients exceeding wait time target (axis from 0 to 100) plotted against percentage of time there is idle capacity (axis from 0 to 40).]
 To ensure only 5% of patients exceed the wait time target, there will be idle capacity 23% of the time.
Summary
 When the manager knows the arrival rate and service rate, he/she can compute:
 The average number of jobs in the queue.
 The average time spent in the queue.
 The probability an arriving patient has to wait.
 The system utilization.
 This can be done without simulation!
 This information can be used to set capacity or
explore the sensitivity of recommendations to
assumptions or changes.
 Thus queuing theory provides a powerful tool to
manage capacity.