Introduction to Algorithms: 6.006 Massachusetts Institute of Technology September 15, 2011

advertisement
Introduction to Algorithms: 6.006
Massachusetts Institute of Technology
Professors Erik Demaine and Srini Devadas
September 15, 2011
Problem Set 2
Problem Set 2
Both theory and programming questions are due Tuesday, September 27 at 11:59PM.
Please download the .zip archive for this problem set, and refer to the README.txt file for
instructions on preparing your solutions.
Remember, your goal is to communicate. Full credit will be given only to a correct solution
which is described clearly. Convoluted and obtuse descriptions might receive low marks, even
when they are correct. Also, aim for concise solutions, as it will save you time spent on write-ups,
and also help you conceptualize the key idea of the problem.
We will provide the solutions to the problem set 10 hours after the problem set is due, which
you will use to find any errors in the proof that you submitted. You will need to submit a critique of
your solutions by Thursday, September 29th, 11:59PM. Your grade will be based on both your
solutions and your critique of the solutions.
Problem 2-1. [40 points] Fractal Rendering
You landed a consulting gig with Gopple, who is about to introduce a new line of mobile phones
with Retina HD displays, which are based on unicorn e-ink and thus have infinite resolution. The
high-level executives heard that fractals have infinite levels of detail, and decreed that the new
phones’ background will be the Koch snowflake (Figure 1).
Figure 1: The Koch snowflake fractal, rendered at Level of Detail (LoD) 0 through 5.
Unfortunately, the phone’s processor (CPU) and the graphics chip (GPU) powering the display do
not have infinite processing power, so the Koch fractal cannot be rendered in infinite detail. Gopple
engineers will stop the recursion at a fixed depth n in order to cap the processing requirement. For
example, at n = 0, the fractal is just a triangle. Because higher depths result in more detailed
drawing, this depth is usually called the Level of Detail (LoD).
The Koch snowflake at LoD n can be drawn using an algorithm following the sketch below:
1
Problem Set 2
S NOWFLAKE(n)
1 e1 , e2 , e3 = edges of an equilateral triangle with side length 1
2 S NOWFLAKE -E DGE(e1 , n)
3 S NOWFLAKE -E DGE(e2 , n)
4 S NOWFLAKE -E DGE(e3 , n)
S NOWFLAKE -E DGE(edge, n)
1 if n == 0
2
edge is an edge on the snowflake
3 else
4
e1 , e2 , e3 = split edge in 3 equal parts
5
S NOWFLAKE -E DGE(e1 , n − 1)
6
f2 , g2 = edges of an equilateral triangle whose 3rd edge is e2 , pointing outside the snowflake
7
∆(f2 , g2 , e2 ) is a triangle on the snowflake’s surface
8
S NOWFLAKE -E DGE(f2 , n − 1)
9
S NOWFLAKE -E DGE(g2 , n − 1)
10
S NOWFLAKE -E DGE(e3 , n − 1)
The sketch above should be sufficient for solving this problem. If you are curious about the missing
details, you may download and unpack the problem set’s .zip archive, and read the CoffeeScript
implementation in fractal/src/fractal.coffee.
In this problem, you will explore the computational requirements of four different methods for
rendering the fractal, as a function of the LoD n. For the purpose of the analysis, consider the
recursive calls to S NOWFLAKE -E DGE; do not count the main call to S NOWFLAKE as part of the
recursion tree. (You can think of it as a super-root node at a special level -1, but it behaves differently from all other levels, so we do not include it in the tree.) Thus, the recursion tree is actually
a forest of trees, though we still refer to the entire forest as the “recursion tree”. The root calls to
S NOWFLAKE -E DGE are all at level 0.
Gopple’s engineers have prepared a prototype of the Koch fractal drawing software, which you can
use to gain a better understanding of the problem. To use the prototype, download and unpack the
problem set’s .zip archive, and use Google Chrome to open fractal/bin/fractal.html.
First, in 3D hardware-accelerated rendering (e.g., iPhone), surfaces are broken down into triangles
(Figure 2). The CPU compiles a list of coordinates for the triangles’ vertices, and the GPU is
responsible for producing the final image. So, from the CPU’s perspective, rendering a triangle
costs the same, no matter what its surface area is, and the time for rendering the snowflake fractal
is proportional to the number of triangles in its decomposition.
(a) [1 point] What is the height of the recursion tree for rendering a snowflake of LoD n?
1. log n
2. n
2
Problem Set 2
Figure 2: Koch snowflake drawn with triangles.
3. 3 n
4. 4 n
(b) [2 points] How many nodes are there in the recursion tree at level i, for 0 ≤ i ≤ n?
1.
2.
3.
4.
3i
4i
4i+1
3 · 4i
(c) [1 point] What is the asymptotic rendering time (triangle count) for a node in the
recursion tree at level i, for 0 ≤ i < n?
1. 0
2. Θ(1)
i
3. Θ( 19 )
i
4. Θ( 13 )
(d) [1 point] What is the asymptotic rendering time (triangle count) at each level i of the
recursion tree, for 0 ≤ i < n?
1.
2.
3.
4.
0
i
Θ( 49 )
Θ(3i )
Θ(4i )
(e) [2 points] What is the total asymptotic cost for the CPU, when rendering a snowflake
with LoD n using 3D hardware-accelerated rendering?
1.
2.
3.
4.
Θ(1)
Θ(n)
n
Θ( 43 )
Θ(4n )
3
Problem Set 2
Second, when using 2D hardware-accelerated rendering, the surfaces’ outlines are broken down
into open or closed paths (list of connected line segments). For example, our snowflake is one
closed path composed of straight lines. The CPU compiles the list of cooordinates in each path to
be drawn, and sends it to the GPU, which renders the final image. This approach is also used for
talking to high-end toys such as laser cutters and plotters.
(f) [1 point] What is the height of the recursion tree for rendering a snowflake of LoD n
using 2D hardware-accelerated rendering?
1.
2.
3.
4.
log n
n
3n
4n
(g) [1 point] How many nodes are there in the recursion tree at level i, for 0 ≤ i ≤ n?
1.
2.
3.
4.
3i
4i
4i+1
3 · 4i
(h) [1 point] What is the asymptotic rendering time (line segment count) for a node in the
recursion tree at level i, for 0 ≤ i < n?
1. 0
2. Θ(1)
i
3. Θ( 19 )
i
4. Θ( 13 )
(i) [1 point] What is the asymptotic rendering time (line segment count) for a node in the
last level n of the recursion tree?
1.
2.
3.
4.
0
Θ(1)
n
Θ( 19 )
n
Θ( 13 )
(j) [1 point] What is the asymptotic rendering time (line segment count) at each level i
of the recursion tree, for 0 ≤ i < n?
1.
2.
3.
4.
0
i
Θ( 49 )
Θ(3i )
Θ(4i )
(k) [1 point] What is the asymptotic rendering time (line segment count) at the last level
n in the recursion tree?
4
Problem Set 2
1.
2.
3.
4.
Θ(1)
Θ(n)
n
Θ( 43 )
Θ(4n )
(l) [1 point] What is the total asymptotic cost for the CPU, when rendering a snowflake
with LoD n using 2D hardware-accelerated rendering?
1.
2.
3.
4.
Θ(1)
Θ(n)
n
Θ( 43 )
Θ(4n )
Third, in 2D rendering without a hardware accelerator (also called software rendering), the CPU
compiles a list of line segments for each path like in the previous part, but then it is also responsible
for “rasterizing” each line segment. Rasterizing takes the coordinates of the segment’s endpoints
and computes the coordinates of all the pixels that lie on the line segment. Changing the colors of
these pixels effectively draws the line segment on the display. We know an algorithm to rasterize a
line segment in time proportional to the length of the segment. It is easy to see that this algorithm
is optimal, because the number of pixels on the segment is proportional to the segment’s length.
Throughout this problem, assume that all line segments have length at least one pixel, so that the
cost of rasterizing is greater than the cost of compiling the line segments.
It might be interesting to note that the cost of 2D software rendering is proportional to the total
length of the path, which is also the power required to cut the path with a laser cutter, or the amount
of ink needed to print the path on paper.
(m) [1 point] What is the height of the recursion tree for rendering a snowflake of LoD n?
1.
2.
3.
4.
log n
n
3n
4n
(n) [1 point] How many nodes are there in the recursion tree at level i, for 0 ≤ i ≤ n?
1.
2.
3.
4.
3i
4i
4i+1
3 · 4i
(o) [1 point] What is the asymptotic rendering time (line segment length) for a node in
the recursion tree at level i, for 0 ≤ i < n? Assume that the sides of the initial triangle
have length 1.
1. 0
5
Problem Set 2
2. Θ(1)
i
3. Θ( 19 )
i
4. Θ( 13 )
(p) [1 point] What is the asymptotic rendering time (line segment length) for a node in
the last level n of the recursion tree?
1.
2.
3.
4.
0
Θ(1)
n
Θ( 19 )
n
Θ( 13 )
(q) [1 point] What is the asymptotic rendering time (line segment length) at each level i
of the recursion tree, for 0 ≤ i < n?
1.
2.
3.
4.
0
i
Θ( 49 )
Θ(3i )
Θ(4i )
(r) [1 point] What is the asymptotic rendering time (line segment length) at the last level
n in the recursion tree?
1.
2.
3.
4.
Θ(1)
Θ(n)
n
Θ( 43 )
Θ(4n )
(s) [1 point] What is the total asymptotic cost for the CPU, when rendering a snowflake
with LoD n using 2D software (not hardware-accelerated) rendering?
1.
2.
3.
4.
Θ(1)
Θ(n)
n
Θ( 43 )
Θ(4n )
The fourth and last case we consider is 3D rendering without hardware acceleration. In this case,
the CPU compiles a list of triangles, and then rasterizes each triangle. We know an algorithm to
rasterize a triangle that runs in time proportional to the triangle’s surface area. This algorithm is
optimal, because the number of pixels inside a triangle is proportional to the triangle’s area. For
the purpose of this problem, you can assume that the area of a triangle with side length l is Θ(l2 ).
We also assume that the cost of rasterizing is greater than the cost of compiling the line segments.
(t) [4 points] What is the total asymptotic cost of rendering a snowflake with LoD n?
Assume that initial triangle’s side length is 1.
6
Problem Set 2
1.
2.
3.
4.
Θ(1)
Θ(n)
n
Θ( 43 )
Θ(4n )
(u) [15 points] Write a succinct proof for your answer using the recursion-tree method.
Problem 2-2. [60 points] Digital Circuit Simulation
Your 6.006 skills landed you a nice internship at the chip manufacturer AMDtel. Their hardware
verification team has been complaining that their circuit simulator is slow, and your manager decided that your algorithmic chops make you the perfect candidate for optimizing the simulator.
A combinational circuit is made up of gates, which are devices that take Boolean (True / 1 and
False / 0) input signals, and output a signal that is a function of the input signals. Gates take some
time to compute their functions, so a gate’s output at time τ reflects the gate’s inputs at time τ − δ,
where δ is the gate’s delay. For the purposes of this simulator, a gate’s output transitions between
0 and 1 instantly. Gates’ output terminals are connected to other gates’ inputs terminals by wires
that propagate the signal instantly without altering it.
For example, a 2-input XOR gate with inputs A and B (Figure 3) with a 2 nanosecond (ns) delay
works as follows:
Time (ns)
0
1
2
3
4
5
Input A
0
0
1
1
Input B
0
1
0
1
Output O
0
1
1
0
A
Explanation
Reflects inputs at time -2
Reflects inputs at time -1
0 XOR 0, given at time 0
0 XOR 1, given at time 1
1 XOR 0, given at time 2
1 XOR 1, given at time 3
O
B
Figure 3: 2-input XOR gate; A and B supply the inputs, and O receives the output.
The circuit simulator takes an input file that describes a circuit layout, including gates’ delays,
probes (indicating the gates that we want to monitor the output), and external inputs. It then
simulates the transitions at the output terminals of all the gates as time progresses. It also outputs
transitions at the probed gates in the order of the timing of those transitions.
This problem will walk you through the best known approach for fixing performance issues in a
system. You will profile the code, find the performance bottleneck, understand the reason behind
it, and remove the bottleneck by optimizing the code.
7
Problem Set 2
To start working with AMDtel’s circuit simulation source code, download and unpack the problem
set’s .zip archive, and go to the circuit/ directory.
The circuit simulator is in circuit.py. The AMDtel engineers pointed out that the simulation
input in tests/5devadas13.in takes too long to run. We have also provided an automated
test suite at test-circuit.py, together with other simulation inputs. You can ignore these
files until you get to the last part of the problem set.
(a) [8 points] Run the code under the python profiler with the command below, and
identify the method that takes up most of the CPU time. If two methods have similar
CPU usage times, ignore the simpler one.
python -m cProfile -s time circuit.py < tests/5devadas13.in
Warning: the command above can take 15-30 minutes to complete, and bring the CPU
usage to 100% on one of your cores. Plan accordingly.
What is the name of the method with the highest CPU usage?
(b) [6 points] How many times is the method called?
(c) [8 points] The class containing the troublesome method is implementing a familiar
data structure. What is the tightest asymptotic bound for the worst-case running time
of the method that contains the bottleneck? Express your answer in terms of n, the
number of elements in the data structure.
1.
2.
3.
4.
5.
6.
O(1).
O(log n).
O(n).
O(n log n).
O(n log2 n).
O(n2 ).
(d) [8 points] If the data structure were implemented using the most efficient method
we learned in class, what would be the tightest asymptotic bound for the worst-case
running time of the method discussed in the questions above?
1.
2.
3.
4.
5.
6.
O(1).
O(log n).
O(n).
O(n log n).
O(n log2 n).
O(n2 ).
(e) [30 points] Rewrite the data structure class using the most efficient method we learned
in class. Please note that you are not allowed to import any additional Python
libraries and our test will check this.
8
Problem Set 2
We have provided a few tests to help you check your code’s correctness and speed. The
test cases are in the tests/ directory. tests/README.txt explains the syntax
of the simulator input files. You can use the following command to run all the tests.
python circuit test.py
To work on a single test case, run the simulator on the test case with the following
command.
python circuit.py < tests/1gate.in > out
Then compare your output with the correct output for the test case.
diff out tests/1gate.gold
For Windows, use fc to compare files.
fc out tests/1gate.gold
We have implemented a visualizer for your output, to help you debug your code. To
use the visualizer, first produce a simulation trace.
TRACE=jsonp python circuit.py < tests/1gate.in > circuit.jsonp
On Windows, use the following command instead.
circuit jsonp.bat < tests/1gate.in > circuit.jsonp
Then use Google Chrome to open visualizer/bin/visualizer.html
We recommend using the small test cases numbered 1 through 4 to check your implementation’s correctness, and then use test case 5 to check the code’s speed.
When your code passes all tests, and runs reasonably fast (the tests should complete
in less than 30 seconds on any reasonably recent computer), upload your modified
circuit.py to the course submission site.
9
MIT OpenCourseWare
http://ocw.mit.edu
6.006 Introduction to Algorithms
Fall 2011
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Download