An Integrated Experimental Environment for Distributed Systems and Networks
B. White, J. Lepreau, L. Stoller, R. Ricci, S. Guruprasad, M. Newbold, M. Hibler, C. Barb, A. Joglekar
Presented by Sunjun Kim and Jonathan di Costanzo, 2009/04/13

Outline
- Motivation
- Netbed structure
- Validation and testing
- Netbed contribution
- Conclusion

Motivation

The problem
- Researchers need a platform on which they can develop, debug, and evaluate their systems
- One lab is not enough: resources are lacking
  ▪ More computers are needed
  ▪ Scalability in terms of distance and number of nodes cannot be reached
- Developing large-scale experiments requires a huge amount of time

Existing approaches
- Simulation (ns): a controlled, repeatable environment, but it loses accuracy due to abstraction
- Live networks (PlanetLab): achieve realism, but experiments are not easy to repeat
- Emulation (Dummynet, nse): controlled packet loss and delay, but manual configuration is tedious

What is Netbed?
- Derives from "Emulab Classic"
- A universally available time- and space-shared network emulator
- Automatic configuration from an ns script
- Adds virtual topologies for network experimentation
- Integrates simulation, emulation, and live-network experimentation on wide-area nodes in a single framework

Design goals
- Accuracy: provide an artifact-free environment
- Universality: anyone can use anything the way they want
- Conservative resource-allocation policy: no multiplexing (no virtual machines), so the resources of one node can be fully utilized

Resource types
- Local-area resources
- Distributed resources
- Simulated resources
- Emulated resources: WAN emulator (already integrated); PlanetLab and ModelNet integration still in the works

Netbed structure

Resource life cycle [diagram]

Local-area resources
- 3 clusters: 168 PCs in Utah, 48 in Kentucky, and 40 in Georgia
- Each node can be used as an edge node, a router, a traffic-shaping node, or a traffic generator
- A machine belongs exclusively to one experiment for the experiment's duration
- A default OS is provided but is entirely replaceable

Distributed resources
- Also called wide-area resources
- 50-60 nodes at approximately 30 sites
- Provide characteristic live-network conditions
- Very few nodes, shared among many users
- FreeBSD jail mechanism (a kind of virtual machine); non-root access

Simulated resources
- Based on nse (ns emulation)
- Enable interaction with real traffic
- Provide scalability beyond the physical resources: many simulated nodes can be multiplexed onto one physical node

Emulated resources
- VLANs emulate wide-area links within a local area
- Dummynet emulates queue and bandwidth limitations, introducing delays and packet loss between physical nodes
- Dummynet nodes act as Ethernet bridges, transparent to experimental traffic

Experiment life cycle [diagram]: experiment specification (e.g. $ns duplex-link $A $B 1.5Mbps 20ms) -> parsing -> global resource allocation -> node self-configuration -> experiment control, with swap-in/swap-out between runs

Experiment creation
- A project leader proposes a project on the web; the Netbed staff accept or reject it
- All experiments are accessible from the web
- Experiment management: log in to the allocated nodes or to the users host (the fileserver)
- The fileserver serves the OS images and the home and project directories to the other nodes

Experiment specification
- Experimenters write ns scripts in Tcl and can use as many functions and loops as they want (a script sketch follows the parsing notes below)
- Netbed defines a small set of ns extensions
  ▪ Possibility of choosing specific hardware
  ▪ Choice of simulation, emulation, or real implementation
- Program objects can be defined using a Netbed-specific ns extension
- A graphical UI is also available

Parsing
- A front-end Tcl/ns parser recognizes the subset of ns relevant to topology and traffic generation
- The database stores an abstraction of everything about the experiment
  ▪ Fixed generated events
  ▪ Information about hardware, users, and experiments
  ▪ Procedures
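The deck quotes only a single duplex-link line; below is a minimal sketch of what a full specification might look like. The tb-* commands stand for Netbed's ns extensions, and the image and hardware names (FBSD-STD, pc850) are illustrative placeholders, not values taken from the talk.

    set ns [new Simulator]
    source tb_compat.tcl             ;# pull in the Netbed ns extensions

    set A [$ns node]
    set B [$ns node]
    tb-set-node-os $A FBSD-STD       ;# choose the disk image / OS to load
    tb-set-hardware $B pc850        ;# request a specific hardware type

    # The link from the life-cycle figure: 1.5 Mbps with 20 ms of delay.
    $ns duplex-link $A $B 1.5Mbps 20ms DropTail

    # Plain Tcl control flow is allowed, e.g. a loop attaching extra nodes.
    for {set i 0} {$i < 3} {incr i} {
        set n($i) [$ns node]
        $ns duplex-link $n($i) $A 100Mbps 5ms DropTail
    }

    $ns rtproto Static               ;# let Netbed compute static routes
    $ns run

At parse time the front end does not run this script; it stores the topology and its events in the database, which is what lets the same specification be realized later on simulated, emulated, or live-network resources.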
Resource allocation
- Binds the abstractions in the database to physical or simulated entities
- Best effort to match the specification
- On-demand allocation (no reservations)
- 2 different algorithms for local and distributed nodes (different constraints)
  ▪ Simulated annealing
  ▪ Genetic algorithm
- Over-reservation of the bottleneck: the inter-switch bandwidth is too small (2 Gbps), which goes against their conservative policy
- Dynamic changes to the topology are allowed: nodes can be added and removed
- Consistent naming across instantiations: virtualization of IP addresses and host names

Node self-configuration
- Dynamic linking and loading from the DB give each node the proper context (hostname, disk image, script to start the experiment)
- No persistent configuration state: only volatile memory on the node
- If required, the current soft state can be stored in the DB as hard state (swap-out / swap-in)
- Local nodes: all nodes are rebooted in parallel and contact the masterhost, which loads the kernel directed by the database; a second-level boot may be required
- Distributed nodes: boot from a CD-ROM, then contact the masterhost; a new FreeBSD jail is instantiated
- Driven by the Testbed Master Control Client (TMCC)

Experiment control
- Netbed supports dynamic experiment control
  ▪ Start, stop, and resume processes, traffic generators, and network monitors
  ▪ Signals between nodes
- Uses a publish/subscribe event routing system: static events are retrieved from the DB, and dynamic events are possible
- The ns configuration file gives only high-level control; experimenters can also exert low-level control
  ▪ On local nodes: root privileges (kernel modification and access to raw sockets)
  ▪ On distributed nodes: jail-restricted root privileges (access to raw sockets bound to a specific IP address)
- Each local node has a separate control network, isolated from the experimental one, so a node can be controlled through a tunnel, as if we were on it, without interfering with the experiment

Preventing idleness
- Netbed tries to prevent idling, using 3 metrics: traffic, use of pseudo-terminal devices, and CPU load average
- To be sure, a message is sent to the user, who can disapprove manually
- This is a challenge for distributed nodes hosting several jails
- Netbed offers automated batch experiments when no interaction is required; they make it possible to wait for available resources

Validation and testing

Link emulation [table not reproduced]
- 1st row: emulation overhead
- Dummynet gives better results than nse; they expect better results with future improvements of nse

Application validation
- 5 nodes communicating over 10 links
- Evaluation of a derivative of DOOM; the goal is to send 30 tics/sec

Testing Netbed itself
- Challenges: it depends on physical artifacts (it cannot be cloned), it should evaluate arbitrary programs, and it must run continuously
- Minibed: 8 separate Netbed nodes
  ▪ Test mode: prevents hardware modifications
  ▪ Full-test mode: provides isolated hardware

Netbed contribution

- An all-in-one set of tools
- Automated and efficient realization of virtual topologies
- Efficient use of resources through time-sharing and space-sharing
- Increased fault tolerance (resource virtualization)

Examples
- The "dumbbell" network: 3 h 15 min --> 3 min
- Improved utilization of a scarce and expensive infrastructure (12 months on the 168 PCs in Utah)
  ▪ Time-sharing (swapping): 1064 nodes
  ▪ Space-sharing (isolation): 19.1 years
- Virtualization of names and IP addresses: no problem across swappings

Experiment creation and swapping [chart]
- Breakdown: mapping, reservation, reboot issuing, reboot, miscellaneous
- Booting a custom disk image doubles the boot time

Mapping local resources: assign
- Matches the user's requirements
- Based on simulated annealing (see the sketch below)
- Tries to minimize the number of switches and the inter-switch bandwidth
- Runs in less than 13 seconds
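The deck does not show assign's internals, only that it uses simulated annealing to minimize switch crossings. The following is a toy sketch of that idea, not the real algorithm: the switchOf, links, and phys structures are hypothetical, and the one-experiment-per-node exclusivity constraint is ignored for brevity.

    # Toy simulated annealing for mapping virtual nodes onto physical ones.
    # Cost = number of virtual links whose endpoints land on different
    # switches (a stand-in for scarce inter-switch bandwidth).
    proc cost {placement links switchOf} {
        set c 0
        foreach pair $links {
            lassign $pair a b
            set pa [dict get $placement $a]
            set pb [dict get $placement $b]
            if {[dict get $switchOf $pa] ne [dict get $switchOf $pb]} {
                incr c
            }
        }
        return $c
    }

    proc anneal {placement links switchOf phys {temp 10.0} {steps 10000}} {
        set cur $placement
        set curCost [cost $cur $links $switchOf]
        set best $cur
        set bestCost $curCost
        for {set i 0} {$i < $steps} {incr i} {
            # Perturb: move one random virtual node to a random machine.
            set vnodes [dict keys $cur]
            set v [lindex $vnodes [expr {int(rand() * [llength $vnodes])}]]
            set p [lindex $phys [expr {int(rand() * [llength $phys])}]]
            set next [dict replace $cur $v $p]
            set nextCost [cost $next $links $switchOf]
            # Always accept improvements; accept regressions with a
            # probability that shrinks as the temperature cools.
            if {$nextCost <= $curCost ||
                rand() < exp(($curCost - $nextCost) / $temp)} {
                set cur $next
                set curCost $nextCost
            }
            if {$curCost < $bestCost} {
                set best $cur
                set bestCost $curCost
            }
            set temp [expr {$temp * 0.999}]
        }
        return $best
    }

    # Hypothetical data: four hosts on two switches, three virtual nodes.
    set switchOf {pc1 sw1 pc2 sw1 pc3 sw2 pc4 sw2}
    set links    {{v1 v2} {v2 v3}}
    set start    {v1 pc1 v2 pc3 v3 pc4}
    puts [anneal $start $links $switchOf {pc1 pc2 pc3 pc4}]

The real assign must also honor hardware types, link capacities, and node exclusivity, which is what makes the problem hard enough that a sub-13-second run time is worth reporting.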
Mapping distributed resources: wanassign
- Different constraints
  ▪ Nodes are fully connected via the Internet
  ▪ "Last mile": what matters is the type of connection rather than the topology
  ▪ Specific topologies may still be guaranteed by requesting particular network characteristics (bandwidth, latency, and loss)
- Based on a genetic algorithm
- 16 nodes and 100 edges: ~1 sec; 256 nodes and 40 edges/node: 10 min to 2 h

Disk reloading
- 2 possibilities
  ▪ Complete disk image loading
  ▪ Incremental synchronization (hashes over files or blocks)
- Good: faster (in their specific case); no corruption
- Bad: wastes time when similar images are needed repeatedly; reloading a freed node is slow (the node stays reserved for one user meanwhile)

Frisbee
- Performance techniques
  ▪ Uses a domain-specific algorithm to skip unused blocks
  ▪ Delivers images via a custom reliable multicast protocol
- 117 sec to load 80 nodes, writing 550 MB instead of 3 GB

Scaling of simulated resources
- Simulated nodes are multiplexed onto 1 physical node, which must keep up with real time at the event rate the user's specification implies
- Test of live TCP at 2 Mb CBR on an 850 MHz PC with UDP background traffic (2 Mb CBR / 50 ms): up to 150 links for 300 nodes
- Routing becomes a problem in very complex topologies

Batch experiments
- Several batch experiments can be programmed, modifying only 1 parameter at a time
- The Armada file system from Oldfield & Kotz: 7 bandwidths x 5 latencies x 3 application settings x 4 configurations of 20 nodes = 420 tests in 30 hours (~4.3 min per experiment)

Conclusion

- Netbed deals with 3 test environments in a single framework
- ns scripts can be reused, and the test environment is set up quickly
- Virtualization techniques provide the artifact-free environment
- Enables qualitatively new experimental techniques
  ▪ Reliability / fault tolerance
  ▪ Distributed debugging: checkpoint/rollback
  ▪ Security "Petri dish"