HPC ISE-2 AY 23-24
Two questions per batch:
B1: 1,8
B2: 2,7
B3: 3,6
B4: 4,5
B5: 2,8
B6: 3,7
Students must include a snapshot of their system configuration in the document and must comment on the
analysis based on their configuration. The document must include graphs along with tables of the analysis.
1. Execute the all-to-all broadcast operation (Program 3.2.1.c) with varying message sizes. Plot the performance
of the operation for message sizes from 1K to 10K words (with a constant number of processes, 8). Explain the
performance observed (see the sketch after this list).
2. Execute the all-reduce operation (Program 3.2.2.c) with a varying number of processors (1 to 16) and a fixed
message size of 10K words. Plot the performance of the operation as the number of processors varies (with
constant message size). Explain the performance observed (see the sketch after this list).
3. Consider the implementation of the gather operation given to you (Program 3.3.1.c): in this implementation,
each processor sends its message to processor 0, which gathers the messages. Compare this code to the one that
uses the MPI gather operation (Program 3.3.1b.c). Compare the performance for a fixed message size (1K
words at each processor) with a varying number of processors (1 to 16). Which implementation is better?
Why? (See the sketch after this list.)
4. Run the scatter operation (Program 3.3.2.c) with varying message sizes (10K to 100K words) and a fixed number
of processors (8). Plot the runtime as a function of the message size. Explain the observed performance (see the sketch after this list).
5. Consider the implementation of the all-to-all personalized operation given to you (Program 3.4.1.c): in
this implementation, each processor simply sends messages to all other processors. Plot the time for a fixed
message size (8K words at each processor, divided equally among all processors) with a varying number of
processors (1, 2, 4, 8); see the sketch after this list.
6. Run the all-to-all personalized operation (Program 3.4.2.c) with a fixed message size (8K words at each
processor) and a varying number of processors (1, 2, 4, and 8). Plot the runtime as a function of the
number of processors. Compare the performance to that of Program 3.4.1.c above. Which implementation
is better? Why? (See the sketch after this list.)
7. Consider two implementations of one-to-all broadcast. The first implementation uses the MPI
implementation (Program 3.5.1.c). The second implementation splits the message and executes the
broadcast in two steps (Program 3.5.1b.c). Plot the runtime of the two implementations with a varying
number of processors (1, 2, 4, 8) and a constant message size of 100K words. Explain the observed
performance of the two implementations (see the sketch after this list).
8. Execute the MPI program (Program 3.1.1.c) with varying sized broadcasts. Plot the performance of the
broadcast with message sizes varying from 1K words to 100K words (with a constant number of processes,
8). Explain the performance observed (see the sketch after this list).
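
The sketches below are illustrative only: they are not the distributed Program 3.x.y.c files, and the buffer types and timing scheme are assumptions (one word is taken as one int, time is measured with MPI_Wtime between barriers, and the slowest rank's time is reported). Each sketch compiles with mpicc and runs with mpirun -np <p> ./a.out.

For question 1, a minimal sketch that times an all-to-all broadcast, here realized as MPI_Allgather, for message sizes from 1K to 10K words with a constant number of processes:

/* Question 1 sketch (hypothetical, not Program 3.2.1.c): time an all-to-all
 * broadcast done with MPI_Allgather; one "word" is assumed to be one int. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    for (int words = 1024; words <= 10 * 1024; words += 1024) {
        int *sendbuf = malloc(words * sizeof(int));
        int *recvbuf = malloc((size_t)words * nproc * sizeof(int));
        for (int i = 0; i < words; i++) sendbuf[i] = rank;

        MPI_Barrier(MPI_COMM_WORLD);             /* start all ranks together */
        double t0 = MPI_Wtime();
        MPI_Allgather(sendbuf, words, MPI_INT,
                      recvbuf, words, MPI_INT, MPI_COMM_WORLD);
        double t = MPI_Wtime() - t0;

        double tmax;                              /* report the slowest rank */
        MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("%d words: %f s\n", words, tmax);

        free(sendbuf);
        free(recvbuf);
    }
    MPI_Finalize();
    return 0;
}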
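
For question 2, a minimal sketch timing MPI_Allreduce on a fixed 10K-word message (assuming an integer sum reduction); the processor count is varied externally with mpirun -np 1 through 16:

/* Question 2 sketch (hypothetical, not Program 3.2.2.c): time MPI_Allreduce on
 * a fixed 10K-word message; vary the processor count with mpirun -np 1..16. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define WORDS (10 * 1024)

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    int *in  = malloc(WORDS * sizeof(int));
    int *out = malloc(WORDS * sizeof(int));
    for (int i = 0; i < WORDS; i++) in[i] = rank + i;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    MPI_Allreduce(in, out, WORDS, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    double t = MPI_Wtime() - t0;

    double tmax;                                  /* report the slowest rank */
    MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("p=%d, %d words: %f s\n", nproc, WORDS, tmax);

    free(in); free(out);
    MPI_Finalize();
    return 0;
}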
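
For question 3, a hypothetical rendering of the two gather variants: a hand-rolled gather in which every rank sends its 1K-word block to rank 0 with point-to-point calls, timed against a single MPI_Gather on the same data:

/* Question 3 sketch (hypothetical): a naive gather in which every rank sends
 * its 1K-word block to rank 0 with MPI_Send, timed against MPI_Gather. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define WORDS 1024

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    int *block = malloc(WORDS * sizeof(int));
    int *all   = malloc((size_t)WORDS * nproc * sizeof(int));
    for (int i = 0; i < WORDS; i++) block[i] = rank;

    /* Naive gather: every rank sends to rank 0, which receives in rank order. */
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    if (rank == 0) {
        for (int i = 0; i < WORDS; i++) all[i] = block[i];
        for (int src = 1; src < nproc; src++)
            MPI_Recv(all + src * WORDS, WORDS, MPI_INT, src, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else {
        MPI_Send(block, WORDS, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }
    double t_naive = MPI_Wtime() - t0;

    /* Library gather: one collective call over the same data. */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    MPI_Gather(block, WORDS, MPI_INT, all, WORDS, MPI_INT, 0, MPI_COMM_WORLD);
    double t_gather = MPI_Wtime() - t0;

    if (rank == 0)
        printf("p=%d  naive gather: %f s  MPI_Gather: %f s\n",
               nproc, t_naive, t_gather);

    free(block); free(all);
    MPI_Finalize();
    return 0;
}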
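
For question 4, a minimal sketch timing MPI_Scatter with a fixed number of processes; it assumes the quoted sizes (10K to 100K words) are the per-process block sizes:

/* Question 4 sketch (hypothetical, not Program 3.3.2.c): time MPI_Scatter for a
 * range of per-process block sizes; run with mpirun -np 8. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    for (int words = 10 * 1024; words <= 100 * 1024; words += 10 * 1024) {
        int *sendbuf = NULL;
        int *recvbuf = malloc(words * sizeof(int));
        if (rank == 0) {                          /* only the root owns the full array */
            sendbuf = malloc((size_t)words * nproc * sizeof(int));
            for (size_t i = 0; i < (size_t)words * nproc; i++) sendbuf[i] = (int)i;
        }

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        MPI_Scatter(sendbuf, words, MPI_INT, recvbuf, words, MPI_INT,
                    0, MPI_COMM_WORLD);
        double t = MPI_Wtime() - t0;

        double tmax;
        MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("%d words per process: %f s\n", words, tmax);

        free(recvbuf);
        if (rank == 0) free(sendbuf);
    }
    MPI_Finalize();
    return 0;
}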
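
For question 5, a hypothetical hand-rolled all-to-all personalized exchange: 8K words per process are split into equal blocks and each block is sent directly to its destination; non-blocking calls are used here to keep the sketch deadlock-free, although Program 3.4.1.c may use plain blocking sends:

/* Question 5 sketch (hypothetical, not Program 3.4.1.c): every rank sends one
 * block of its 8K-word buffer to every rank, using non-blocking point-to-point
 * calls so the sketch cannot deadlock. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define TOTAL_WORDS (8 * 1024)

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    int block = TOTAL_WORDS / nproc;              /* words destined for each rank */
    int *sendbuf = malloc(TOTAL_WORDS * sizeof(int));
    int *recvbuf = malloc(TOTAL_WORDS * sizeof(int));
    MPI_Request *reqs = malloc(2 * nproc * sizeof(MPI_Request));
    for (int i = 0; i < TOTAL_WORDS; i++) sendbuf[i] = rank;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int p = 0; p < nproc; p++)               /* post all receives first */
        MPI_Irecv(recvbuf + p * block, block, MPI_INT, p, 0,
                  MPI_COMM_WORLD, &reqs[p]);
    for (int p = 0; p < nproc; p++)               /* then send a block to everyone */
        MPI_Isend(sendbuf + p * block, block, MPI_INT, p, 0,
                  MPI_COMM_WORLD, &reqs[nproc + p]);
    MPI_Waitall(2 * nproc, reqs, MPI_STATUSES_IGNORE);
    double t = MPI_Wtime() - t0;

    double tmax;
    MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("p=%d, %d words per block: %f s\n", nproc, block, tmax);

    free(sendbuf); free(recvbuf); free(reqs);
    MPI_Finalize();
    return 0;
}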
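
For question 6, the same exchange expressed as a single MPI_Alltoall call, so that its timing can be set against the hand-rolled version of question 5:

/* Question 6 sketch (hypothetical, not Program 3.4.2.c): the same personalized
 * exchange as in the question 5 sketch, done with a single MPI_Alltoall call. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define TOTAL_WORDS (8 * 1024)

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    int block = TOTAL_WORDS / nproc;              /* words destined for each rank */
    int *sendbuf = malloc(TOTAL_WORDS * sizeof(int));
    int *recvbuf = malloc(TOTAL_WORDS * sizeof(int));
    for (int i = 0; i < TOTAL_WORDS; i++) sendbuf[i] = rank;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    MPI_Alltoall(sendbuf, block, MPI_INT, recvbuf, block, MPI_INT,
                 MPI_COMM_WORLD);
    double t = MPI_Wtime() - t0;

    double tmax;
    MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("p=%d, %d words per block: %f s\n", nproc, block, tmax);

    free(sendbuf); free(recvbuf);
    MPI_Finalize();
    return 0;
}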
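
For question 7, a sketch that times a plain MPI_Bcast of 100K words against one possible two-step split broadcast, here assumed to be a scatter of the message pieces followed by an MPI_Allgather; Program 3.5.1b.c may split the message differently:

/* Question 7 sketch (hypothetical): plain MPI_Bcast of 100K words versus an
 * assumed two-step split broadcast (scatter the pieces, then allgather them);
 * the split used by Program 3.5.1b.c may differ. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define WORDS (100 * 1024)

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    int block = WORDS / nproc;                    /* assume nproc divides WORDS */
    int *msg   = malloc(WORDS * sizeof(int));
    int *piece = malloc(block * sizeof(int));
    if (rank == 0) for (int i = 0; i < WORDS; i++) msg[i] = i;

    /* Implementation 1: single library broadcast. */
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    MPI_Bcast(msg, WORDS, MPI_INT, 0, MPI_COMM_WORLD);
    double t_bcast = MPI_Wtime() - t0;

    /* Implementation 2: scatter the pieces, then allgather them on every rank. */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    MPI_Scatter(msg, block, MPI_INT, piece, block, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Allgather(piece, block, MPI_INT, msg, block, MPI_INT, MPI_COMM_WORLD);
    double t_split = MPI_Wtime() - t0;

    if (rank == 0)
        printf("p=%d  MPI_Bcast: %f s  split broadcast: %f s\n",
               nproc, t_bcast, t_split);

    free(msg); free(piece);
    MPI_Finalize();
    return 0;
}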
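
For question 8, a minimal sketch timing MPI_Bcast as the message grows from 1K to 100K words, with a constant number of processes:

/* Question 8 sketch (hypothetical, not Program 3.1.1.c): time MPI_Bcast from
 * rank 0 for a range of message sizes; one "word" is assumed to be one int. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    int sizes[] = {1024, 2048, 5120, 10240, 20480, 51200, 102400};
    int nsizes = (int)(sizeof sizes / sizeof sizes[0]);

    for (int s = 0; s < nsizes; s++) {
        int words = sizes[s];
        int *buf = malloc(words * sizeof(int));
        if (rank == 0) for (int i = 0; i < words; i++) buf[i] = i;

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        MPI_Bcast(buf, words, MPI_INT, 0, MPI_COMM_WORLD);
        double t = MPI_Wtime() - t0;

        double tmax;
        MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("p=%d, %d words: %f s\n", nproc, words, tmax);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}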