Minimizing Flow Time on Multiple Machines

Nikhil Bansal
IBM Research, T.J. Watson
Scheduling
Collection of m machines, n jobs
Arrival time or release time (r_j)
Service requirement or size (p_j)
[Figure: timeline for m=1 starting at t=0, with release times r_1, r_2, r_3 and completion times C_1, C_2, C_3; one job is preempted.]
Scheduling
Flow Time = Time a job spends in the system
= Completion time - release time
= Waiting + Processing
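[Figure: timeline for m=1 starting at t=0, with release times r_1, r_2, r_3 and completion times C_1, C_2, C_3; the flow time of job 2 is marked.]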
Goal: minimize total flow time
[Figure: the m=1 timeline again, with the flow time of job 3 marked.]
Total Flow Time (Another View)
Imagine each job costs $1 per unit time.
Cost of a job = Its flow time
Total cost
= Total flow time
Total cost = t cost at time t
= t # jobs at time t
Total Flow Time (Single Machine)
Total cost = t # jobs at time t
Processor has a “to do” list of jobs
Goal: Minimize number of jobs on list
Work on the job it can finish earliest.
Shortest remaining processing time (SRPT):
Optimal algorithm
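The two views of total flow time can be checked on a small instance. The following is a minimal sketch, assuming integer release times and sizes; the function names `srpt_flow_times` and `jobs_in_system_sum` and the three-job instance are illustrative and not taken from the talk.

```python
import heapq

def srpt_flow_times(jobs):
    """Simulate preemptive SRPT on one machine.

    jobs: list of (release, size). Returns flow times in input order.
    """
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    heap = []                      # (remaining time, job index) of jobs in the system
    flow = [0] * len(jobs)
    t, nxt = 0, 0
    while nxt < len(order) or heap:
        if not heap:               # machine idle: jump to the next release
            t = max(t, jobs[order[nxt]][0])
        while nxt < len(order) and jobs[order[nxt]][0] <= t:
            j = order[nxt]
            heapq.heappush(heap, (jobs[j][1], j))
            nxt += 1
        rem, j = heapq.heappop(heap)            # shortest remaining processing time
        next_release = jobs[order[nxt]][0] if nxt < len(order) else None
        if next_release is not None and t + rem > next_release:
            # Run until the next arrival, then reconsider (possible preemption).
            heapq.heappush(heap, (rem - (next_release - t), j))
            t = next_release
        else:
            t += rem
            flow[j] = t - jobs[j][0]
    return flow

def jobs_in_system_sum(jobs, flow):
    """Sum over unit time steps t of the number of jobs in the system at t."""
    completion = [r + f for (r, _), f in zip(jobs, flow)]
    return sum(
        sum(1 for (r, _), c in zip(jobs, completion) if r <= t < c)
        for t in range(max(completion))
    )

jobs = [(0, 5), (1, 2), (2, 1)]    # hypothetical instance: (release, size)
flow = srpt_flow_times(jobs)
print(flow, sum(flow), jobs_in_system_sum(jobs, flow))
```

On this instance the first job is preempted twice, and the sum of the flow times equals the sum over time of the number of jobs in the system.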
Flow Time on Multiple Machines (m ≥ 2)
NP-Hard:
Breakthrough: O(log n) competitive
[Leonardi, Raz 97]
Works for arbitrary # of machines (m)
Any online algorithm: Ω(log n) competitive
Improvements:
No migrations [Awerbuch et al 99]
Immediate dispatch [Avrahami and Azar 03]
Flow Time on Multiple Machines
What about approximation algorithms?
O(log n) best known, even for m=2
Lower bounds: NP-Hard, APX-Hard ?
Flow Time on Multiple Machines
Main Result: A (1+ε) approximation scheme
Running Time = n^{O(m log n)}
Or, n^{O(log n)} for m=O(1)
Suggests: PTAS likely for O(1) machines
Basic Idea
Rounding: Simplify the input without losing too much quality
Search: Dynamic programming over some reasonable space of schedules
Related Problem
Minimizing total completion time:
(Σ_i c_i, or equivalently Σ_i (r_i + f_i))
Same as flow time with respect to optimality
But easier for approximation
PTASes known with runtime poly(n,m) [Afrati et al 99]
Techniques not applicable to flow time
Rounding for Flow Time
Flow Time is quite sensitive
Suppose we round sizes to powers of (1+ε)
Cannot distinguish between
Job of size 1 arrives at t=1,2,…,n
Job of size 1+ε arrives at t=1,2,…,n
Very different: Θ(n) vs Θ(n²)
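The gap between the two instances can be computed directly. This is a small sketch on one machine, assuming ε = 1/10 and n = 100 for illustration; `equal_size_total_flow` is a hypothetical helper name. With equal sizes, SRPT never preempts, so jobs simply run in arrival order.

```python
from fractions import Fraction

def equal_size_total_flow(n, size):
    """Total flow time on one machine for jobs of equal size released at t = 1..n.

    A newly released job is never shorter than the remaining time of the
    running job, so SRPT processes the jobs in arrival order.
    """
    t = 0
    total = 0
    for release in range(1, n + 1):
        t = max(t, release) + size   # completion time of this job
        total += t - release         # its flow time
    return total

eps = Fraction(1, 10)                # assumed value of epsilon
print(equal_size_total_flow(100, 1))         # each job finishes before the next arrives
print(equal_size_total_flow(100, 1 + eps))   # a backlog builds up, adding eps per job
```

For size 1+ε the i-th job has flow time 1 + iε, so the total is n + ε·n(n+1)/2, which grows quadratically in n for fixed ε.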
Rounding for Flow Time
Can show:
Let B be the largest size.
Rounding r_i, p_i to multiples of εB/n² is fine
Proof: Each job affected by ≤ εB/n
Opt ≥ B
Implies:
Sizes ∈ [1, n²/ε]
Events at times in [1, n³/ε]
Still bad for exhaustive search over all schedules.
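The arithmetic behind the rounding claim can be spelled out as follows. This is one reading of the proof sketch, under the assumption (not stated on the slide) that a job's flow time is shifted by at most n rounding errors of εB/n² each.

```latex
\begin{align*}
\text{change in one job's flow time} &\le n \cdot \frac{\varepsilon B}{n^{2}} = \frac{\varepsilon B}{n},\\
\text{change in total flow time} &\le n \cdot \frac{\varepsilon B}{n} = \varepsilon B \le \varepsilon \cdot \mathrm{Opt}
  \quad (\text{since } \mathrm{Opt} \ge B),\\
\text{largest size in rounding units} &= \frac{B}{\varepsilon B / n^{2}} = \frac{n^{2}}{\varepsilon}.
\end{align*}
```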
Restricting possible schedules
Jobs assigned to a machine, worked in SRPT order.
Given a machine, which jobs are assigned to it?
(2^n possibilities)
Approximate state under SRPT in O(log² n) bits of info.
Store for each machine.
Dynamic program: For (state, t), what's the best flow time achievable?
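For intuition, the restricted space of schedules (each job assigned to one machine, each machine running SRPT) can be searched exhaustively on tiny inputs. The sketch below is a brute force over all m^n assignments, not the dynamic program from the talk; it assumes integer release times and sizes of at least 1, and the names `srpt_total_flow` and `best_assignment` are illustrative.

```python
from itertools import product

def srpt_total_flow(jobs):
    """Total flow time of preemptive SRPT on one machine, in unit time steps.

    jobs: list of (release, size) with integer values, size >= 1.
    Uses the view "total cost = sum over t of # jobs in the system at t".
    """
    pending = sorted(jobs)           # by release time
    remaining = []                   # remaining times of jobs in the system
    t = total = i = 0
    while i < len(pending) or remaining:
        while i < len(pending) and pending[i][0] <= t:
            remaining.append(pending[i][1])
            i += 1
        total += len(remaining)      # every job in the system costs 1 this step
        if remaining:
            k = remaining.index(min(remaining))   # shortest remaining time
            remaining[k] -= 1
            if remaining[k] == 0:
                remaining.pop(k)
        t += 1
    return total

def best_assignment(jobs, m):
    """Try every assignment of jobs to m machines; return (best total, assignment)."""
    best = None
    for assign in product(range(m), repeat=len(jobs)):
        total = sum(
            srpt_total_flow([job for job, k in zip(jobs, assign) if k == i])
            for i in range(m)
        )
        if best is None or total < best[0]:
            best = (total, assign)
    return best

print(best_assignment([(0, 3), (0, 3), (1, 1)], m=2))
```

The loop over `product(range(m), repeat=n)` is exactly the exponential blow-up that the compact state description is meant to avoid.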
State
Properties
1) Enough information: State at t+1 computable from that at time t.
2) Gives number of jobs to within a (1+ε) factor
Property of SRPT
At any time, among jobs with size ∈ [a,b], at most one has remaining processing < a.
Proof:
[Figure: two jobs with sizes between a and b; the second is not executed until the first (blue) finishes.]
So both cannot have remaining processing < a at the same time.
Suppose a = (1+ε)^i, b = (1+ε)^{i+1}
Given the total remaining size (x) of jobs with size p_j ∈ [a,b]:
x/b ≤ # of jobs ≤ x/a + 1
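Both the SRPT property and the resulting bounds on the number of jobs can be checked empirically. The following is a sketch on random integer instances with classes [a, 2a] (that is, ε = 1, chosen only to keep the arithmetic in integers); `srpt_snapshots` and `check_class` are hypothetical names.

```python
import random

def srpt_snapshots(jobs):
    """Run unit-step SRPT on one machine; yield the unfinished jobs after each step.

    jobs: list of (release, size), integers, size >= 1.
    Each snapshot is a list of (size, remaining) pairs.
    """
    pending = sorted(jobs)
    alive = []                       # [size, remaining] for jobs in the system
    t = i = 0
    while i < len(pending) or alive:
        while i < len(pending) and pending[i][0] <= t:
            alive.append([pending[i][1], pending[i][1]])
            i += 1
        if alive:
            job = min(alive, key=lambda s: s[1])   # shortest remaining time
            job[1] -= 1
            if job[1] == 0:
                alive.remove(job)
        t += 1
        yield [tuple(s) for s in alive]

def check_class(jobs, a, b):
    """Check the property and the count bounds for the size class [a, b]."""
    for snap in srpt_snapshots(jobs):
        in_class = [rem for size, rem in snap if a <= size <= b]
        # At most one job of the class has remaining processing < a.
        assert sum(1 for rem in in_class if rem < a) <= 1
        x, count = sum(in_class), len(in_class)
        # x/b <= count <= x/a + 1, written without division.
        assert x <= count * b and (count - 1) * a <= x
    return True

random.seed(0)
jobs = [(random.randint(0, 30), random.randint(1, 16)) for _ in range(40)]
print(all(check_class(jobs, a, 2 * a) for a in (1, 2, 4, 8)))
```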
Configuration on a machine
Consider O(log n/ε) size-classes [(1+ε)^i, (1+ε)^{i+1}]
For each class,
• Total remaining processing time
• 1/ε largest remaining processing times
x/(1+ε)^{i+1} ≤ # of jobs ≤ x/(1+ε)^i + 1
Class 1: (Total_1, x_1, x_2, …, x_{1/ε})
…
Class k: (Total_k, y_1, y_2, …, y_{1/ε})
k=O(log n)
In all O(log² n) bits
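A per-machine configuration of this shape might be represented as below. This is only a sketch of the data layout listed on the slide, assuming half-open classes [(1+ε)^i, (1+ε)^{i+1}) so that each size falls in exactly one class, and exact rational arithmetic; the names `size_class`, `configuration`, and `count_bounds` are illustrative and not from the paper.

```python
from fractions import Fraction
from math import ceil

def size_class(size, eps):
    """Index i with (1+eps)^i <= size < (1+eps)^(i+1), for size >= 1."""
    i, upper = 0, 1 + eps
    while size >= upper:
        upper *= 1 + eps
        i += 1
    return i

def configuration(alive, eps):
    """Summarize the jobs on one machine.

    alive: list of (size, remaining). Returns a dict mapping each class index
    to (total remaining time, the ceil(1/eps) largest remaining times).
    """
    k = ceil(1 / eps)
    by_class = {}
    for size, rem in alive:
        by_class.setdefault(size_class(size, eps), []).append(rem)
    return {
        i: (sum(rems), sorted(rems, reverse=True)[:k])
        for i, rems in by_class.items()
    }

def count_bounds(i, total, eps):
    """Bounds x/(1+eps)^(i+1) <= # of jobs <= x/(1+eps)^i + 1 for class i."""
    a, b = (1 + eps) ** i, (1 + eps) ** (i + 1)
    return total / b, total / a + 1

eps = Fraction(1, 2)                               # assumed value of epsilon
alive = [(4, 4), (5, 2), (5, 5), (2, 1)]           # hypothetical (size, remaining) pairs
config = configuration(alive, eps)
print(config)
print(count_bounds(3, config[3][0], eps))          # class 3 holds three jobs
```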
Updating a configuration
At most O(m log² n) bits of information
Gives number of jobs to within a (1+ε) factor
How to update, as time passes?
Class 1: (Total_1, x_1, x_2, …, x_{1/ε})
…
Class j: (Total_j, y_1, y_2, …, y_{1/ε})
On arrival, guess the machine & update state (m branches)
Working step: For each machine, guess the class with the smallest remaining time job [(log n)^m choices]
Fitting it all together
At any time,
O(m log² n/ε²) total bits of info.
Know how to update.
Dynamic program over all possible states.
Weighted Flow Time (Σ_i w_i f_i)
NP-Hard for m=1
No o(n) approximation known, even for m=1
m=1: (1+ε) approx, time n^{O(log B log W)} [Chekuri, Khanna 02]
B: max/min size
W: max/min wt
This paper: Extend to m=O(1), time n^{O(m log Bn log Wn)}
Hardness: Exponential dependence on m likely
(1+ε) approx with running time 2^{O(polylog(n,m,W,B))}
⇒ NP ⊆ DTIME(n^{polylog(n)})
Open Problems
1) PTAS or O(1) approx for minimizing
flow time on O(1) machines?
[Our QPTAS => PTAS likely]
2) For an arbitrary number of machines:
PTAS or APX-Hard?