EE5900 Advanced Embedded
System For Smart Infrastructure
Static Scheduling
Time Frame
• Given a set of periodic tasks, let H denote the hyperperiod, i.e., the least common multiple of all task periods.
– Tasks are given as (execution time, period): T1=(1,4), T2=(1.8,5), T3=(1,20), T4=(2,20)
– H=20
• Divide time into frames of equal size f, where f must divide H.
– f could be 2, 4, 5, 10, or 20
• Choose a small frame size, since a finer time grid makes the resulting schedule more useful.
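As a quick check of the numbers on this slide, the following Python sketch computes the hyperperiod and the candidate frame sizes. The function names are illustrative, and the `min_size=2` filter is an assumption (a frame at least as long as the longest execution time), which would explain why 1 does not appear in the list above.

```python
from functools import reduce
from math import gcd


def hyperperiod(periods):
    """Least common multiple of all (integer) task periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def candidate_frame_sizes(periods, min_size=1):
    """All divisors f of the hyperperiod with f >= min_size (assumed lower bound)."""
    H = hyperperiod(periods)
    return [f for f in range(1, H + 1) if H % f == 0 and f >= min_size]


periods = [4, 5, 20, 20]          # periods of T1..T4 from the slide
print(hyperperiod(periods))       # 20
print(candidate_frame_sizes(periods, min_size=2))  # [2, 4, 5, 10, 20]
```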
Network flow formulation
• Denote all the jobs in one hyperperiod as J1, J2, …, Jn
• Vertices
– n job vertices
– H/f time frame vertices
– Source
– Sink
• Edges
– Source to job vertex Ji, with capacity set to the execution time ei
– Job vertex to time frame vertex, with capacity f, if the job can run in that time frame
– Time frame vertex to sink, with capacity f
Flow network
Computing scheduling
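• If the obtained maximum flow is equal to the sum of the execution times of all jobs, then the task set is schedulable.
• The flow on each job-to-frame edge then gives the amount of time the job is allotted in that frame.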
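The formulation and the schedulability test can be tried on the example task set. This is a minimal sketch under stated assumptions, not the course's implementation: times are in integer units of 0.1 (so 1.8 becomes 18), each job's deadline is assumed to be the end of its period, and a job is assumed to be allowed in a frame only if the whole frame lies between its release time and its deadline. The names `max_flow` and `build_network` are illustrative.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Shortest-augmenting-path max flow on a dict-of-dicts capacity map."""
    res = defaultdict(lambda: defaultdict(int))      # residual capacities
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:             # BFS in the residual graph
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        bottleneck, v = float("inf"), t
        while parent[v] is not None:                 # smallest residual on the path
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:                 # push flow, add back edges
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        total += bottleneck


def build_network(tasks, H, f):
    """tasks: list of (execution time, period). Returns (capacities, total demand)."""
    cap = defaultdict(dict)
    demand = 0
    for i, (e, p) in enumerate(tasks):
        for k in range(H // p):                      # one vertex per job
            job = ("job", i, k)
            release, deadline = k * p, (k + 1) * p   # assumed: deadline = period end
            cap["s"][job] = e
            demand += e
            for j in range(H // f):
                if release <= j * f and (j + 1) * f <= deadline:
                    cap[job][("frame", j)] = f
    for j in range(H // f):
        cap[("frame", j)]["t"] = f
    return cap, demand


tasks = [(10, 40), (18, 50), (10, 200), (20, 200)]   # T1..T4 in units of 0.1
cap, demand = build_network(tasks, H=200, f=20)
print(max_flow(cap, "s", "t") == demand)             # True: schedulable with f = 2
```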
Flow network
• Given a directed graph G in which each edge e has a capacity c(e)
• A source node s
• A sink node t
Goal: send as much flow as possible from s to t
Flows
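An s-t flow is a function f on the edges which satisfies:
• $0 \le f(e) \le c(e)$ for every edge $e$ (capacity constraint)
• $\sum_{e \text{ into } v} f(e) = \sum_{e \text{ out of } v} f(e)$ for every vertex $v \ne s, t$ (conservation of flow at intermediate vertices)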
Value of the flow
$|f| = \sum_{e \text{ out of } s} f(e) - \sum_{e \text{ into } s} f(e)$ (the net flow leaving s)
Maximum flow problem: maximize this value
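The two flow constraints and the flow value can be checked mechanically. The sketch below assumes a flow and the capacities are stored as dictionaries keyed by directed edge `(u, v)`; this representation and the function names are illustrative choices, not part of the slides.

```python
def is_valid_flow(cap, flow, s, t):
    """cap, flow: dicts mapping a directed edge (u, v) to a number."""
    # Capacity constraint: 0 <= f(e) <= c(e) on every edge.
    for e, c in cap.items():
        if not 0 <= flow.get(e, 0) <= c:
            return False
    # Flow may only be placed on edges that exist.
    if any(e not in cap for e in flow):
        return False
    # Conservation: inflow equals outflow at every vertex except s and t.
    vertices = {x for e in cap for x in e}
    for x in vertices - {s, t}:
        inflow = sum(f for (u, v), f in flow.items() if v == x)
        outflow = sum(f for (u, v), f in flow.items() if u == x)
        if inflow != outflow:
            return False
    return True


def flow_value(flow, s):
    """Net flow leaving the source s."""
    out_of_s = sum(f for (u, v), f in flow.items() if u == s)
    into_s = sum(f for (u, v), f in flow.items() if v == s)
    return out_of_s - into_s


cap = {("s", "a"): 3, ("a", "t"): 2}
good = {("s", "a"): 2, ("a", "t"): 2}
bad = {("s", "a"): 3, ("a", "t"): 2}   # 3 units enter a, only 2 leave
print(is_valid_flow(cap, good, "s", "t"), flow_value(good, "s"))  # True 2
print(is_valid_flow(cap, bad, "s", "t"))                          # False
```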
[Figure: example s-t flow on a graph G]
Value = 19
Cuts
• An s-t cut is a set of edges whose removal disconnects s from t
• The capacity of a cut is the sum of the capacities of the edges in the cut
Minimum s-t cut problem: minimize the capacity of an s-t cut
Flows ≤ cuts
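• Let C be a cut and S be the set of vertices reachable from s in G-C (the connected component of G-C containing s).
• Every edge leaving S belongs to C, so the value of any flow is at most the capacity of C.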
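The inequality can be written out as a short derivation. This is the standard argument, sketched here with $|f|$ for the value of a flow $f$ and $c(C)$ for the capacity of the cut $C$ (notation chosen for this note):

```latex
\begin{aligned}
|f| &= \sum_{e \text{ out of } S} f(e) - \sum_{e \text{ into } S} f(e) \\
    &\le \sum_{e \text{ out of } S} c(e) \\
    &\le \sum_{e \in C} c(e) = c(C)
\end{aligned}
```

The first line follows by summing flow conservation over the vertices of S, the second uses $0 \le f(e) \le c(e)$, and the last holds because every edge leaving S must belong to C.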
Main result
• Value of max s-t flow ≤ capacity of min s-t cut
• (Ford Fulkerson 1956)
Max flow = Min cut
• A polynomial time algorithm
Greedy method?
• Find an s-t path where every edge has f(e) < c(e)
• Add this path to the flow
• Repeat until no such path can be found.
• Does it work?
A counterexample
[Figure: a small graph from s to t with edge capacities 20, 10, 30, 10, 20]
The greedy algorithm produces a flow of value 20 while the
maximum flow has value of 30.
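The counterexample can be reproduced in a few lines. The figure's topology is not recoverable from the text, so the sketch below assumes the usual four-vertex layout (s->u 20, s->v 10, u->v 30, u->t 10, v->t 20); the vertex names and the function `greedy_flow` are hypothetical. The greedy method here only follows forward edges with f(e) < c(e) and happens to try s->u->v->t first.

```python
def greedy_flow(edges, s, t):
    """Augment along forward edges with f(e) < c(e) only; no push-back."""
    remaining = {(u, v): c for u, v, c in edges}     # c(e) - f(e)
    adj = {}
    for u, v, _ in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])

    def find_path(u, seen):
        if u == t:
            return [u]
        seen.add(u)
        for v in adj[u]:
            if v not in seen and remaining[(u, v)] > 0:
                rest = find_path(v, seen)
                if rest:
                    return [u] + rest
        return None

    value = 0
    while True:
        path = find_path(s, set())
        if path is None:
            return value
        hops = list(zip(path, path[1:]))
        bottleneck = min(remaining[h] for h in hops)
        for h in hops:
            remaining[h] -= bottleneck
        value += bottleneck


# Assumed topology for the capacities 20, 10, 30, 10, 20.
edges = [("s", "u", 20), ("s", "v", 10), ("u", "v", 30),
         ("u", "t", 10), ("v", "t", 20)]
print(greedy_flow(edges, "s", "t"))                  # 20

# A better flow on the same graph: only 10 units go over u->v.
better = {("s", "u"): 20, ("s", "v"): 10, ("u", "v"): 10,
          ("u", "t"): 10, ("v", "t"): 20}
print(better[("s", "u")] + better[("s", "v")])       # 30
```

After the first path saturates s->u and v->t, the greedy method has no way to undo the 20 units routed over u->v, which is what the residual graph fixes.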
Residual graph
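• Key idea: allow flow to be pushed back.
• Example: an edge e with f(e) = 2 and c(e) = 10 becomes a forward residual edge with capacity 8 and a backward residual edge with capacity 2. We can send 8 more units forward or push 2 units back.
• Advantage of this representation: there is no need to distinguish between sending forward and pushing back.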
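The residual capacities for a single edge follow directly from f(e) and c(e). A minimal sketch, with a hypothetical helper name and edge endpoints `u`, `v` chosen for illustration:

```python
def residual_edges(u, v, capacity, flow):
    """Residual edges induced by one original edge u->v carrying `flow`."""
    edges = {}
    if capacity - flow > 0:
        edges[(u, v)] = capacity - flow   # room left to send forward
    if flow > 0:
        edges[(v, u)] = flow              # flow that can be pushed back
    return edges


print(residual_edges("u", "v", capacity=10, flow=2))
# {('u', 'v'): 8, ('v', 'u'): 2}
```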
Ford-Fulkerson Algorithm
1. Start from an empty flow f
2. While there is an s-t path P in the residual graph, augment f along P (the update is applied to the edges of the original graph)
3. Return f
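The three steps can be sketched in Python as follows. This is one possible rendering, assuming integer capacities stored in a dict keyed by edge `(u, v)` and a depth-first search for the residual path; the slides do not prescribe how the path is found. The flow is kept on the original edges, and residual edges are derived from it on the fly.

```python
def ford_fulkerson(cap, s, t):
    """cap: dict mapping (u, v) -> integer capacity. Returns (value, flow)."""
    flow = {e: 0 for e in cap}                       # step 1: empty flow

    def residual_neighbors(u):
        for (a, b), c in cap.items():
            if a == u and flow[(a, b)] < c:
                yield b, (a, b), +1                  # forward residual edge
            if b == u and flow[(a, b)] > 0:
                yield a, (a, b), -1                  # backward residual edge

    def find_path(u, seen):
        if u == t:
            return []
        seen.add(u)
        for v, e, sign in residual_neighbors(u):
            if v not in seen:
                rest = find_path(v, seen)
                if rest is not None:
                    return [(e, sign)] + rest
        return None

    value = 0
    while True:                                      # step 2: augment while possible
        path = find_path(s, set())
        if path is None:
            return value, flow                       # step 3
        bottleneck = min(cap[e] - flow[e] if sign > 0 else flow[e]
                         for e, sign in path)
        for e, sign in path:
            flow[e] += sign * bottleneck             # update on the original edge
        value += bottleneck


# Hypothetical test graph (vertex names are illustrative).
cap = {("s", "u"): 20, ("s", "v"): 10, ("u", "v"): 30,
       ("u", "t"): 10, ("v", "t"): 20}
value, flow = ford_fulkerson(cap, "s", "t")
print(value)   # 30
```

The recursive search is fine for small graphs; a large graph would need an iterative search to stay within Python's recursion limit.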
Ford-Fulkerson Algorithm (example)
[Figure sequence: the graph G, with each edge labeled by flow and capacity, and its residual graph Gf, with residual capacities, redrawn after every augmentation]
• Flow value after successive augmentations: 0, 8, 10, 16, 18, 19
• At flow value 19 no s-t path remains in Gf
Flow value = 19
Cut capacity = 19
Max-flow min-cut theorem
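• Consider the set S of all vertices reachable from s in the final residual graph, and let T be the remaining vertices
• s is in S, but t is not in S (otherwise an augmenting path would remain)
• No flow enters S from T (otherwise it could be pushed back, and the tail of that edge would be reachable)
• Every edge from S to T is used at full capacity
• So the flow value equals the capacity of the cut between S and T: a min cut!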
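The construction of S can be tried on a small instance. The sketch below assumes a maximum flow is already known and stored per edge; the graph, the flow values, and the function name are illustrative and not taken from the slides.

```python
def min_cut_from_flow(cap, flow, s):
    """Return (S, cut edges, cut capacity) from a maximum flow."""
    S, stack = {s}, [s]
    while stack:                                     # search the residual graph
        u = stack.pop()
        for (a, b), c in cap.items():
            if a == u and flow[(a, b)] < c and b not in S:
                S.add(b)
                stack.append(b)
            if b == u and flow[(a, b)] > 0 and a not in S:
                S.add(a)
                stack.append(a)
    cut = [(a, b) for (a, b) in cap if a in S and b not in S]
    return S, cut, sum(cap[e] for e in cut)


# Hypothetical graph with a maximum flow of value 30.
cap = {("s", "u"): 20, ("s", "v"): 10, ("u", "v"): 30,
       ("u", "t"): 10, ("v", "t"): 20}
flow = {("s", "u"): 20, ("s", "v"): 10, ("u", "v"): 10,
        ("u", "t"): 10, ("v", "t"): 20}
S, cut, capacity = min_cut_from_flow(cap, flow, "s")
print(S, cut, capacity)   # {'s'} [('s', 'u'), ('s', 'v')] 30
```

Here both edges out of s are saturated, so S contains only s and the cut capacity matches the flow value.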
Integrality theorem
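• If every edge has integer capacity, then there is a maximum flow in which every edge carries an integer amount of flow (Ford-Fulkerson only ever augments by integer amounts).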
Complexity
• Assume integer edge capacities between 1 and C
• At most mC iterations, since the maximum flow is at most mC and each augmentation adds at least 1
• Finding an s-t path can be done in O(m) time
• Total running time $O(m^2 C)$
Speedup with capacity scaling
• Use capacity scaling to find paths with large capacity
– Find p such that $2^{p-1} \le C \le 2^p$
– For i from p-1 down to 0
• Compute the (residual) graph restricted to edges with capacity at least $2^i$
• Find a maximum flow there
• At iteration i, there are at most m edges, the capacity of the min cut is at most $m 2^{i+1}$, and each augmenting path has flow value at least $2^i$, so there are at most 2m augmentations. The runtime is bounded by $O(m^2 \log C)$.
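A compact sketch of capacity scaling follows. It assumes integer capacities in a dict keyed by `(u, v)` and starts the threshold `delta` at the largest power of two not exceeding the maximum capacity, then halves it; the names and the test graph are illustrative.

```python
def max_flow_scaling(cap, s, t):
    """Capacity-scaling max flow; cap maps (u, v) -> integer capacity."""
    res = dict(cap)                                  # residual capacities
    for (u, v) in cap:
        res.setdefault((v, u), 0)                    # back edges start at 0
    adj = {}
    for (u, v) in res:
        adj.setdefault(u, []).append(v)

    def find_path(delta):
        """s-t path using only residual edges of capacity >= delta."""
        parent = {s: None}
        stack = [s]
        while stack:
            u = stack.pop()
            for v in adj.get(u, []):
                if v not in parent and res[(u, v)] >= delta:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            return None
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        return path

    delta = 1
    while delta * 2 <= max(cap.values()):
        delta *= 2
    value = 0
    while delta >= 1:
        path = find_path(delta)
        while path is not None:
            bottleneck = min(res[e] for e in path)
            for (u, v) in path:
                res[(u, v)] -= bottleneck
                res[(v, u)] += bottleneck
            value += bottleneck
            path = find_path(delta)
        delta //= 2                                  # next, finer scaling phase
    return value


cap = {("s", "u"): 20, ("s", "v"): 10, ("u", "v"): 30,
       ("u", "t"): 10, ("v", "t"): 20}
print(max_flow_scaling(cap, "s", "t"))   # 30
```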
Speedup with BFS
• In each iteration, run a breadth first search in the residual graph and choose the augmenting path with the fewest edges.
• Let $\mathrm{level}_i(v)$ denote the distance (number of edges) from s to v in the residual graph $G_i$ of iteration i.
• $\mathrm{level}_i(v)$ cannot decrease over the iterations. Prove by induction on $\mathrm{level}_{i+1}(v)$.
– Let u->v be the last edge on a shortest path from s to v in $G_{i+1}$, so $\mathrm{level}_{i+1}(v) = \mathrm{level}_{i+1}(u) + 1$. If u->v is also an edge of $G_i$, then $\mathrm{level}_i(v) \le \mathrm{level}_i(u) + 1 \le \mathrm{level}_{i+1}(u) + 1 = \mathrm{level}_{i+1}(v)$ by induction.
– Otherwise, v->u is on the augmenting path of iteration i, which is a shortest path, so $\mathrm{level}_i(v) = \mathrm{level}_i(u) - 1 < \mathrm{level}_i(u) + 1 \le \mathrm{level}_{i+1}(u) + 1 = \mathrm{level}_{i+1}(v)$.
• Each edge cannot appear and disappear many times.
• Consider an edge u->v that disappears from the residual graph after $G_i$ and next reappears after $G_j$. Then u->v is on the augmenting path of $G_i$ and v->u is on the augmenting path of $G_j$, so $\mathrm{level}_i(u) + 1 = \mathrm{level}_i(v)$ and $\mathrm{level}_j(v) + 1 = \mathrm{level}_j(u)$. Since $\mathrm{level}_j(v) \ge \mathrm{level}_i(v)$, we have $\mathrm{level}_j(u) \ge \mathrm{level}_i(u) + 2$.
Speedup with BFS (2)
• The distance from s to u increases by at least 2 between a disappearance and the next reappearance of an edge u->v. The level is at most n, so the number of disappearances of each edge is bounded by n/2.
• There are m edges in total, so the total number of disappearances is at most nm/2.
• Every augmentation makes at least one edge disappear, so there are at most nm/2 iterations.
• Total runtime $O(nm^2)$, since each iteration takes O(m) time.
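The BFS variant differs from plain Ford-Fulkerson only in how the path is chosen. A minimal sketch, assuming integer capacities in a dict keyed by `(u, v)`; it also counts augmentations so the nm/2 bound can be compared on a small (hypothetical) graph.

```python
from collections import deque


def shortest_path_max_flow(cap, s, t):
    """Always augment along a fewest-edge residual path. Returns (value, augmentations)."""
    res = dict(cap)
    for (u, v) in cap:
        res.setdefault((v, u), 0)
    adj = {}
    for (u, v) in res:
        adj.setdefault(u, []).append(v)
    value = augmentations = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:             # BFS gives a shortest path
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value, augmentations
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= bottleneck                # saturates at least one edge
            res[(v, u)] += bottleneck
        value += bottleneck
        augmentations += 1


cap = {("s", "u"): 20, ("s", "v"): 10, ("u", "v"): 30,
       ("u", "t"): 10, ("v", "t"): 20}
value, rounds = shortest_path_max_flow(cap, "s", "t")
n, m = 4, len(cap)
print(value, rounds <= n * m / 2)   # 30 True
```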
Precedence and nonpreemption
• Suppose that J1 needs to be scheduled before J2. Make sure that the release time of J1 is before that of J2. In the resulting schedule, if (part of) J1 is scheduled after (part of) J2, then just swap them.
• Nonpreemption cannot be handled this way: nonpreemptive scheduling is NP-hard.
NP completeness proof
• Reduce from the 3-partition problem
• Given a set S of 3m elements where each element s has a value v(s) and $\sum_{s \in S} v(s) = mB$, one asks whether S can be partitioned into m disjoint subsets S1, S2, …, Sm such that $\sum_{s \in S_i} v(s) = B$ for each subset Si.
Reduction
• Given an instance of 3-partition, form an instance of the nonpreemptive scheduling problem which contains 3m+1 tasks, $T_1, T_2, …, T_{3m+1}$, as follows (p is the period, d the deadline, c the execution time).
• For each element si, create a task Ti with p=d=mB+m and c=v(si).
• Create a task $T_{3m+1}$ with p=B+1 and d=c=1.
• We claim that the task set is schedulable if and only if the 3-partition instance is feasible.
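The construction is mechanical, as the following sketch shows. The dictionary layout and the function name are illustrative; only the values of p, d, and c come from the slide.

```python
def reduce_3partition(values, m, B):
    """Build the 3m+1 tasks of the scheduling instance from a 3-partition instance."""
    assert len(values) == 3 * m and sum(values) == m * B
    long_period = m * B + m
    tasks = [{"name": f"T{i + 1}", "p": long_period, "d": long_period, "c": v}
             for i, v in enumerate(values)]
    # The extra task has d = c = 1, so it must run right at each release.
    tasks.append({"name": f"T{3 * m + 1}", "p": B + 1, "d": 1, "c": 1})
    return tasks


tasks = reduce_3partition([1, 2, 3, 1, 2, 3], m=2, B=6)
print(len(tasks))   # 7
print(tasks[-1])    # {'name': 'T7', 'p': 7, 'd': 1, 'c': 1}
```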
Only if direction
• When the task set is schedulable
– Task $T_{3m+1}$ is scheduled at time 0, B+1, 2(B+1), … (it has d=c=1, so each of its jobs must start at its release time)
– Consider the hyperperiod mB+m. All of the first 3m tasks need to be scheduled within it.
– During this hyperperiod, $T_{3m+1}$ runs m times, for a total time of m.
– Thus, mB time is left for all other tasks.
– The available time between the first and the second run of $T_{3m+1}$ is B.
– Without preemption, a task cannot straddle a run of $T_{3m+1}$, so the tasks scheduled between them have total time bounded by B. Let S1 denote the corresponding set in S, so $\sum_{s \in S_1} v(s) \le B$.
– Similarly, $\sum_{s \in S_i} v(s) \le B$ for all $1 \le i \le m$, since $T_{3m+1}$ runs m times.
– On the other hand, $\sum_{s \in S_1} v(s) + \sum_{s \in S_2} v(s) + … + \sum_{s \in S_m} v(s) = mB$. Hence $\sum_{s \in S_i} v(s) = B$ for each i.
If direction
• When there is a feasible 3-partition solution,
– One can schedule $T_{3m+1}$ at time 0, B+1, 2(B+1), …
– One then places the other tasks according to the 3-partition solution: the tasks of subset Si fill the i-th gap of length B.
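The schedule described above can be built explicitly. In this sketch each subset of the 3-partition solution is given as a list of execution times, and the output is a list of (start, end, label) intervals; the label `"T3m+1"` and the function name are illustrative.

```python
def schedule_from_partition(groups, B):
    """groups: m lists of values, each summing to B. Returns (start, end, label) slots."""
    schedule = []
    for k, group in enumerate(groups):
        start = k * (B + 1)
        schedule.append((start, start + 1, "T3m+1"))   # the short periodic task
        clock = start + 1
        for v in group:                                # fill the gap of length B
            schedule.append((clock, clock + v, v))
            clock += v
        assert clock == (k + 1) * (B + 1), "each group must sum to B"
    return schedule


slots = schedule_from_partition([[1, 2, 3], [1, 2, 3]], B=6)
print(slots[0], slots[4], slots[-1])
# (0, 1, 'T3m+1') (7, 8, 'T3m+1') (11, 14, 3)
```

The last interval ends at mB+m = 14, so every task finishes within the hyperperiod.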
3-partition
• First show that numerical 4DM is NP-complete. Reduce from 3DM.
• The numerical 4DM problem: given four sets S1, S2, S3, S4, each of which consists of distinct elements with values, and a collection C ⊆ S1⨯S2⨯S3⨯S4 of candidate sets, one asks whether there exists a subcollection C’ that partitions the union of the four sets such that the sum of the values in each set of C’ is B.
Reduce from 3DM to numerical 4DM
• Create four elements for each candidate set (xa,yb,zc) in M: e1 in S1, e2 in S2, e3 in S3, and e4 in S4.
• If xa is in the candidate set, create an element e1 with value either $2q^3 + aq^2$ (core) or $aq^2$ (dummy).
• If yb is in the candidate set, create an element e2 with value either $bq$ (core) or $q^3 + bq$ (dummy).
• If zc is in the candidate set, create an element e3 with value either $c$ (core) or $q^3 + c$ (dummy).
• Create an element e4 with value $2q^3 - aq^2 - bq - c$.
• If there is only one occurrence of a variable (e.g., x1) in M, then only one core element is generated.
• If there are k occurrences of a variable (e.g., z7) in M, then k elements are generated: one core element and k-1 dummy elements. Note that different elements can have the same value.
• Candidate sets in 4DM are created such that each contains either all core elements or all dummy elements. Enumerate all possible candidate sets.
• Set $B = 4q^3$.
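The choice of values makes both kinds of candidate set sum to B. Writing out the sums for the indices a, b, c (a worked check of the values listed above, not an additional construction step):

```latex
\begin{aligned}
\text{all core:}  \quad & (2q^3 + aq^2) + bq + c + (2q^3 - aq^2 - bq - c) = 4q^3 \\
\text{all dummy:} \quad & aq^2 + (q^3 + bq) + (q^3 + c) + (2q^3 - aq^2 - bq - c) = 4q^3
\end{aligned}
```

In both cases the terms in a, b, and c cancel against e4, which is why e4 must carry the same indices as the other three elements of its set.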
Reduction example
• Suppose that the candidate sets M in 3DM are as follows.
• (x1,y5,z7), (x2,y2,z7), (x2,y5,z5) …
• (x1,y5,z7) produces e11 with value $2q^3+q^2$, e21 with value $5q$, e31 with value $q^3+7$, e41 with value $2q^3-q^2-5q-7$.
• (x2,y2,z7) produces e12 with value $2q^3+2q^2$, e22 with value $2q$, e32 with value $7$, e42 with value $2q^3-2q^2-2q-7$.
• (x2,y5,z5) produces e13 with value $2q^2$, e23 with value $q^3+5q$, e33 with value $5$, e43 with value $2q^3-2q^2-5q-5$.
• If (x1,y5,z7) is picked in M, we pick (e11 e21 e32 e41).
• Since (x2,y2,z7), (x2,y5,z5) are not picked, we pick ($2q^2$ $q^3+2q$ e31 e42) and ($2q^2$ e23 e33 e43).
• The elements given by their values are those generated from other candidate sets in M.
• e12 e22 e13 are not picked here; they will be picked in correspondence with some other sets picked in M.
If direction
• When there is a solution of the 3DM problem,
• If a set is picked in 3DM, the corresponding core set is picked in numerical 4DM. Otherwise, the corresponding dummy set is picked.
• Each variable is picked exactly once in 3DM, so each core element is picked exactly once. Note that core elements generated from multiple sets in M could be combined together and picked (since we enumerate candidate sets in numerical 4DM).
• Given k occurrences of a variable in M, they are in k candidate sets in M. One of them is picked (so is the corresponding core element), and k-1 of them are not picked (so the corresponding k-1 dummy elements are picked). Thus, each generated element is picked exactly once. There is only one e4 for each set in M, which is used to make the sum of values $4q^3$.
• This is the subcollection of sets that partitions the union of the four sets, with the values in each set summing to B.
Only if direction
• Given a solution to numerical 4DM, each core element is covered exactly once. The picked sets that contain only core elements correspond to sets in M, and one can pick these corresponding sets as the 3DM solution.