Setting up a Testbed with the Netbed/Emulab Software

Detailed documentation of all this is in the source tree: doc/setup*.txt

Steps
• Get the network infrastructure set up
• Install/configure FreeBSD on boss and ops
  – Linux should work, but it has never been done. Glitches will occur, most likely in standard services rather than in Emulab itself. Rutgers WINLAB is about to tackle it.
• Run the installation scripts on boss and ops
• Put information about the hardware into the database
• Bring up the nodes

Control Network VLANs
• External – connection to the campus network
• Node control – the nodes' control interfaces
• Private – boss and other 'secure' nodes
• Public – ops and other nodes with user accounts
• Control hardware – SNMP devices that must be protected from users

Switch (Cisco) Setup
• Enable SNMP read-write access
• Control network
  – 'set port host'
  – Router: set the 'UDP helper' address
• Experimental network
  – Disable noisy automatic protocols: CDP, spantree, bpdu-guard
• Connect the control and experimental networks
  – A port in the 'Control Hardware' VLAN on the control side
  – Some VLAN with a switch IP interface on the experimental side

Installing FreeBSD
• FreeBSD 4.9 recommended right now (11/03)
• Use the standard FreeBSD installer
• Pick the 'Developer' distribution set
• Install the ports collection
• Pick where you will put the important filesystems
• Configure hostname, IP address, etc.
• Install your favorite editors, etc.

Important Filesystems on ops
• /usr/testbed/
• /users/
• /proj/
• /groups/
• /share/

Important Filesystems on boss
• /usr/
• /usr/testbed/
• /var/

Configuring Testbed Software
• Done with GNU autoconf
• Most configuration options are stored in a 'defs' file
  – passed to configure with '--with-TBDEFS='
• 'defs-example' provides a template
• Use an object tree outside your source tree
• Uses GNU make (gmake on FreeBSD)
• Install targets (see the command sketch below)
  – boss-install, ops-install
  – post-install, to be run as root, on boss

Defs File Example
# The name of this installation
THISHOMEBASE=Example.Emulab.Net
# Domain this testbed resides in
OURDOMAIN=example.emulab.net
# Host name of our web server (or host:port)
WWWHOST=www.example.emulab.net
# Fully-qualified hostname of the boss node
BOSSNODE=boss.example.emulab.net
# Fully-qualified hostname of the ops (also called users) node
USERNODE=ops.example.emulab.net
# Fully-qualified hostname of the fileserver (will probably
# be the same as the ops node)
FSNODE=fs.example.emulab.net

Mailing Lists
• testbed-ops
  – User support
  – The most serious error messages
• testbed-approval
  – Requests to start new projects
• testbed-logs
  – Experiment creation/teardown
• testbed-audit
  – Audit of user and project creation and changes

Mailing Lists (contd.)
• testbed-www
  – Web reports
• testbed-stated
  – Reports from the state daemon
• testbed-users-archive
  – Archive for the automatically-generated testbed-users list; should deliver directly to a file
• testbed-active-users-archive
  – Archive for the testbed-active-users lists
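
To make the configure-and-install flow on the "Configuring Testbed Software" slide concrete, here is a minimal command sketch. The source and object tree locations and the defs file name are illustrative assumptions; only the '--with-TBDEFS=' option and the gmake targets come from the slides.

# Minimal build sketch -- paths and the defs file name are illustrative only
cd /usr/obj/testbed                                # object tree, kept outside the source tree
/usr/src/testbed/configure \
    --with-TBDEFS=/usr/src/testbed/defs-mysite     # your copy of defs-example
gmake                                              # GNU make is 'gmake' on FreeBSD
gmake ops-install                                  # run on ops
gmake boss-install                                 # run on boss
gmake post-install                                 # run on boss, as root
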
Setting Up ops
• Run install/ops-install from your object tree
• Install the ops part of the testbed software
  – 'gmake ops-install'
• A special kernel is needed if using a Cyclades Z-series on FreeBSD
• Set up the mailing lists on ops
• Put boss in ops' /etc/hosts file
• Reboot ops

Setting Up boss
• Use the same defs file you used for ops
  – You need a duplicate object tree for boss
• Start install/boss-install
• Install the ports
• Finish install/boss-install
• Add {ops,users,fs} to /etc/hosts
• Install the boss part of the software
  – Two targets: boss-install and post-install
• ssh from boss to ops as root

Other Things To Do on boss
• Set the setuid bit on /usr/bin/suidperl
• Get boss set up as the nameserver for your domain
• SSL certificates for the web server
• 1000 Hz kernel
• Bootloaders and MFSes come from Utah
• Disk images for the nodes come from Utah, for now

Bootstrapping Users and Projects
• Create the first user with the 'firstuser' script
• Create holding experiments via the web
  – In the 'emulab-ops' project
  – 'hwdown' – for nodes that have failed
  – 'reloadpending' – for nodes awaiting disk loads
  – 'reloading' – for nodes having their disks loaded

What Goes in the Database
• Switches (nodes)
  – Note: the switch node_id must be resolvable
  – Types (node_types)
  – Stacks (switch_stacks, switch_stack_types)
  – Interconnects (interfaces, interface_types, wires)
• Power controllers (nodes, node_types)
• Images and OSids (images, os_info)

What Goes in the Database – Nodes
• The node itself (nodes)
• Virtual nodes (nodes)
• Node type (node_types)
• Interfaces (interfaces, interface_types)
• Connection to switches (wires)
• Serial line (tiplines)
• Power control outlet (outlets)

Bringing up Nodes – Overview
• Basic idea: let nodes report in automatically
• Nodes boot with a temporary IP address
• Unknown nodes boot a 'newnode' FreeBSD MFS
• Nodes report their hardware to boss
• boss guesses node names and addresses
• A script finds the interface ports on the switches
• The administrator adds nodes to the testbed via a web interface (at any time)

Bringing up Nodes – Details
• Enter node and interface types
• BIOS setup
  – PXE in the boot order
  – Power-loss behavior
  – Serial console redirection; password
• Dynamic IP range in dhcpd.conf (see the snippet below)
  – Start with dhcpd.conf.template
  – Use a range of IP addresses that will not be used by nodes
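
The "Dynamic IP range in dhcpd.conf" item above might look roughly like this fragment. The subnet, netmask, router, and address range are illustrative assumptions (your control-network numbering will differ); the real starting point is the dhcpd.conf.template shipped with the software.

# Illustrative dhcpd.conf fragment only -- substitute your control-network numbers
subnet 10.11.0.0 netmask 255.255.0.0 {
    option routers 10.11.0.1;
    # Temporary addresses handed out to unknown (new) nodes;
    # keep this range outside the addresses assigned to real nodes.
    range 10.11.254.1 10.11.254.50;
}
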
Bringing up Nodes (contd.)
• Start with one node, to tweak the name and IP
  – The others will be inferred from the first
• Re-numbering interfaces
  – Convert from FreeBSD to Linux order
• Searching switches for the nodes' interfaces
  – Searches all switches in the database
  – Switch ports must be enabled and in a VLAN
• Create the nodes
• Free the nodes from the 'hwdown' experiment
• The scripts don't do serial lines and power control yet

Creating a Disk Image for Emulab Test PCs

Step by step
• Install a stock {Linux,BSD} of your choice
  – Make one image with both OSes (if both are popular)
• Get the serial console working
• Set up root access
  – root password (for console login)
  – boss root pubkey in ~root/.ssh/authorized_keys
• Ensure perl, rpm, and sudo are installed
• Create or copy over the node host keys
• "make client-install" from the build tree

Saving the Image
• If Emulab is up and running:
  – Use the image creation web form
• Otherwise (see the command sketch below):
  – Shut down to single-user mode (filesystems still mounted)
  – Run /usr/local/etc/emulab/prepare
  – Unmount the filesystems and remount / read-only
  – Reboot from CD or MFS
  – Use imagezip to create the image
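
A minimal sketch of the manual path described on the "Saving the Image" slide above. The disk device (/dev/ad0) and the destination path are illustrative assumptions; adjust both for your node hardware and for wherever (e.g., NFS space) you want the image to land.

# Illustrative sketch only -- device and destination path will differ per site
# On the node itself, before rebooting into the CD/MFS environment:
shutdown now                          # drop to single-user mode (filesystems still mounted)
/usr/local/etc/emulab/prepare         # scrub node-specific state
# ... unmount the other filesystems, remount / read-only, then reboot from CD or MFS ...

# From the CD/MFS environment:
imagezip /dev/ad0 /proj/testbed/images/mybase.ndz   # whole-disk image of the first IDE disk
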
Details: Programs added
• Daemons: idle detection, hardware health monitor, watchdog
• Agents: traffic generation, link control, program execution
• Synchronization server
• TMCC
• Emulab startup/shutdown hooks

Details: Kernel
• Stock kernels mostly work
• For FreeBSD
  – Adding IPoD is a simple patch
  – Shaping nodes require additional features
  – BSD virtual nodes require massive changes
• For Linux
  – Adding IPoD is a simple patch

All Linux, All the Time
• Some environments just do Linux
• Is doing FreeBSD as well asking too much?
• Converting the servers: boss, ops
  – Get the Emulab software to build
  – Get RPM equivalents of the ports
  – Easy as that!

The BSD-free environment
• MFS: for creating and installing images
  – There should be PXE boot environments out there
• Delay nodes
  – Need Linux to act as a bridge
  – Is "tc" up to the task?
• Multiplexed node hosts
  – UML (probably not – too slow)
  – vservers (used in PlanetLab, which we support)
  – Xen (looks good)

Emulab Idle Detection and Policy

Idle Detection
• Why do it?
  – Ease contention for resources
  – Encourage better resource-use practices
• How activity is defined
  – CPU load (tunable threshold – currently 0.8)
  – TTY activity (modification time on /dev entries)
  – Network traffic
    • Control network vs. experimental network: different levels
  – External activity
    • E.g., a user-instantiated reboot

Goals and Challenges
• Don't affect the experiment's performance!
• No false positives!
  – Could ruin an experimenter's day (or week), since swapout saves no node state (currently)
• Few false negatives
• Non-goal: detecting intentional circumvention

Idle Detection – slothd
• A small daemon runs on each test node
• Tracks the last observed metrics and compares them against current values: have activity thresholds been crossed?
  – Runs every 5 minutes when the node is active
  – Runs every 5 seconds when inactive (aggressive mode)
    • Instant detection of the idle-to-active transition
• Reports via UDP to a central collector
  – Every five minutes, or immediately on an idle-to-active transition
  – The collector stores last-activity timestamps in the DB
    • One for each type of activity

Idle Detection – policy enforcement
• A script runs every 5 minutes on the server
  – May send email warning the user of inactivity
  – May swap out the experiment (idle, or maximum duration reached)
• Policy is dictated by tunable site variables
  – Frequency of email warnings
  – When to swap an experiment out
• Honors per-experiment override settings
  – E.g., experiment not swappable; maximum duration; idle threshold
  – The user has control over these both at experiment creation and after swap-in, by editing the experiment metadata

Idle Detection (contd.)
• Last-seen activity is viewable on a web page
  – Per experiment
  – Drill down to the node level (CPU, TTY, network)
• Issues
  – Easy to circumvent intentionally
  – Sometimes defeated unintentionally
    • E.g., a user leaves a traffic generator running
  – Hard (impossible?) to determine what activity is meaningful (be context sensitive)

Emulab's Administrative Structure

Executive Summary
• Emulab has a two-level administrative structure: "Projects" and "Groups"
  – Plus "Elab Admins" on top and "Users" at the bottom
• Permits a variety of common organizational structures
  – A single-user project with no groups
  – A class project with multiple, isolated groups managed by TAs
• Administrative control is delegated to "leaders" at each level

A "Project"
• The central administrative entity
• Started by a faculty member or senior student
  – Submitted through the web interface
  – A user account gets created for the experiment leader
• Approval of project users is delegated to the leader
  – Saves on administrative overhead
  – The project leader is responsible for the users' behaviour
  – The leader may grant leader rights to one or more users
    • Essential for lazy faculty
• A project gets its own disk space/tree
• Users may join multiple projects

A "Group"
• Projects may have multiple groups
• Project groups are created by the project leader
  – Designates group leaders for each
  – Delegates group membership approval to them
• Groups are independent
  – Files/experiments are protected from each other
• Groups can share
  – They share the common project file hierarchy