Joshua Pelkey and Dr. George Riley
WNS3, March 25, 2011
Overview
• Standard sequential simulation techniques struggle with substantial network traffic
– Lengthy execution times
– Large amount of computer memory
• Parallel and distributed discrete event simulation [1]
– Allows single simulation program to run on multiple interconnected processors
– Reduced execution time! Larger topologies!
Overview (cont.)
• Important Note
– It is mandatory that distributed simulations produce the same results as identical sequential simulations
Overview: terminology
• Logical Process (LP)
– An individual sequential simulation
• Rank or system id
– The unique number assigned to each LP
Figure 1. Simple point-to-point topology, distributed
Overview: related work
• Parallel/Distributed ns (PDNS) [2]
• Georgia Tech Network Simulator (GTNetS) [3]
– Both use a federated approach and a conservative (blocking) synchronization mechanism
Implementation Details in ns-3
• LP communication
– Message Passing Interface (MPI) standard
– Send/Receive time-stamped messages
– MpiInterface in ns-3
• Synchronization
– Conservative algorithm using lookahead
– DistributedSimulator in ns-3
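Putting these two pieces together, a minimal sketch of a distributed script looks as follows. MpiInterface, DistributedSimulatorImpl, and the calls shown are the ns-3 MPI API described above; the overall structure is illustrative, not a complete example.

#include <iostream>

#include "ns3/core-module.h"
#include "ns3/mpi-interface.h"

using namespace ns3;

int
main (int argc, char *argv[])
{
  // Select the distributed (conservative, lookahead-based) simulator
  // implementation before any simulation objects are created.
  GlobalValue::Bind ("SimulatorImplementationType",
                     StringValue ("ns3::DistributedSimulatorImpl"));

  // Start MPI; each MPI process becomes one LP.
  MpiInterface::Enable (&argc, &argv);

  uint32_t systemId = MpiInterface::GetSystemId ();  // this LP's rank
  uint32_t systemCount = MpiInterface::GetSize ();   // total number of LPs
  std::cout << "LP " << systemId << " of " << systemCount << std::endl;

  // ... build the topology, assigning nodes to ranks (next slide) ...

  Simulator::Run ();
  Simulator::Destroy ();
  MpiInterface::Disable ();  // shuts down the MPI interface
  return 0;
}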
Implementation Details in ns-3 (cont.)
• Assigning rank to nodes
– Handled manually in simulation script
• Remote point-to-point links
– Created automatically by the point-to-point helper between nodes with different ranks
– When a packet is sent across a remote point-to-point link, it is transmitted via MPI using our interface (see the sketch below)
• Merged since ns-3.8
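A sketch of the rank assignment and remote-link creation just described, continuing the script skeleton above (the Node system-id constructor and the helper behavior are the merged ns-3 mechanism; the link attribute values are illustrative):

// Inside main (), after MpiInterface::Enable ().  Also requires:
//   #include "ns3/network-module.h"
//   #include "ns3/point-to-point-module.h"

// Create one node on rank (system id) 0 and one on rank 1.
Ptr<Node> n0 = CreateObject<Node> (0);
Ptr<Node> n1 = CreateObject<Node> (1);
NodeContainer nodes (n0, n1);

PointToPointHelper p2p;
p2p.SetDeviceAttribute ("DataRate", StringValue ("5Mbps"));  // illustrative
p2p.SetChannelAttribute ("Delay", StringValue ("2ms"));      // illustrative

// n0 and n1 have different ranks, so Install () creates a remote
// point-to-point link; packets crossing it travel between LPs via MPI,
// and the channel delay provides the lookahead.
NetDeviceContainer devices = p2p.Install (nodes);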
Implementation Details in ns-3: limitations
• All nodes created on all LPs, regardless of rank
– It is up to the user to install applications only on the correct rank (see the guard sketch after this list)
• Nodes are assigned rank manually
– An MpiHelper class could be used to assign rank to nodes automatically. This would enable easy distribution of existing simulation scripts.
• Pure distributed wireless is currently not supported
– At least one point-to-point link must exist in order to divide the simulation
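As an example of the first limitation, an application install is typically wrapped in a rank guard of this shape (a sketch continuing the code above; PacketSinkHelper, the UDP factory string, and port 9 are just one illustrative choice):

// Also requires the internet and applications modules for the
// protocol stack and PacketSinkHelper.
if (MpiInterface::GetSystemId () == 1)
  {
    // Only the LP that owns n1 installs the receiving application;
    // every other LP still creates n1, but leaves it empty.
    PacketSinkHelper sink ("ns3::UdpSocketFactory",
                           InetSocketAddress (Ipv4Address::GetAny (), 9));
    ApplicationContainer apps = sink.Install (n1);
    apps.Start (Seconds (0.0));
  }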
Performance Study
• DARPA NMS campus network simulation
– Using nms-p2p-nix-distributed example available in ns-3
– Allows creation of very large topologies
– Any number of campus networks are created and connected together
– Different campus networks can be placed on different LPs
– Tested with 2, 4, 6, 8, and 10 campus networks (CNs)
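Each campus network placed on its own LP maps to one MPI process; such an example is typically launched along the lines of mpirun -np 2 ./waf --run nms-p2p-nix-distributed (the exact invocation depends on the local MPI installation and the ns-3 build).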
Performance Study: campus network topology
Figure 2. Campus network topology block [4] (inter-campus link delay: 200 ms, later varied down to 10 µs)
Performance Study: Georgia Tech clusters used
• Hogwarts Cluster
– 6 nodes, each with 2 quad-core processors and 48 GB of RAM
• Ferrari Cluster
– Mix of machines, including 3 quad-core nodes and 8 dual-core nodes
Performance Study: simulations on Hogwarts
Figure 3. Campus network simulations on Hogwarts with (A) 2 CNs, (B) 4 CNs, (C) 6 CNs, (D) 8 CNs, and (E) 10 CNs
Performance Study: simulations on Ferrari
Figure 4. Campus network simulations on Ferrari with (A) 2 CNs, (B) 4 CNs, (C) 6 CNs, (D) 8 CNs, and (E) 10 CNs
Performance Study: speedup
Figure 5. Speedup using distributed simulation for campus network topologies on the (A) Hogwarts cluster and (B) Ferrari cluster
Performance Study: speedup (cont.)
Table 1: Speedup for Hogwarts and Ferrari

           2 CNs   4 CNs   6 CNs   8 CNs   10 CNs
Hogwarts    1.8     3.3     5.8     6.9     8.3
Ferrari     1.9     1.6     2.0     2.3     2.4
• Near-linear speedup on Hogwarts, but not on Ferrari. Further investigation revealed that Ferrari consists of a mix of machines, with the first two nodes considerably faster than the rest
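Speedup here has the usual definition (our notation, not the slides'): the run time of the identical sequential simulation divided by the distributed run time,

$$ S = \frac{T_{\text{sequential}}}{T_{\text{distributed}}}, $$

so the 10-CN Hogwarts runs finished roughly 8.3 times faster on 10 LPs than their sequential counterparts.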
Performance Study: changing the lookahead
• By changing the delay between campus networks, the lookahead was varied from 200 ms down to 10 µs
• For Hogwarts and Ferrari, the 10 µs simulations ran, on average, 25% and 47% slower, respectively
• As expected, a smaller lookahead time decreases the potential speedup, as the simulators must synchronize with a greater frequency
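This follows from the standard conservative bound (a textbook formulation after [1]; the notation is ours): with lookahead $L$, LP $i$ can only safely process events with timestamps below

$$ \mathrm{LBTS}_i = \min_{j \neq i} \left( T_j + L \right), $$

where $T_j$ is the current simulation time of LP $j$. A smaller $L$ shrinks every safe window, so more synchronization rounds are needed to advance the same simulated time.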
Future Work
• MpiHelper class to facilitate creating distributed topologies
– Nodes assigned rank automatically
– Existing simulation scripts could be distributed easily
• Distributing the topology could occur at the node level, rather than at the application level
– Ghost (placeholder) nodes would save memory
• Pure distributed wireless support
Summary
• Distributed simulation in ns-3 allows a user to run a single simulation in parallel on multiple processors
• Very large-scale simulations can be run in ns-3 using the distributed simulator
• Distributed simulation in ns-3 can approach optimal linear speedup over identical sequential simulations
References
[1] R. M. Fujimoto. Parallel and Distributed Simulation Systems. Wiley Interscience, 2000.
[2] PDNS - Parallel/Distributed ns. http://www.cc.gatech.edu/computing/compass/pdns, March 2004.
[3] G. F. Riley. The Georgia Tech Network Simulator. In Proceedings of the ACM SIGCOMM Workshop on Models, Methods and Tools for Reproducible Network Research (MoMeTools '03), pages 5-12, New York, NY, USA, 2003. ACM.
[4] Standard baseline NMS challenge topology. http://www.ssfnet.org/Exchange/gallery/baseline, July 2002.