Setting up a Testbed With the Netbed/Emulab Software

Detailed documentation of all of this is in the source tree: doc/setup*.txt
Steps
• Get network infrastructure set up
• Install/configure FreeBSD on boss and ops
– Linux should work, but it has never been done. Glitches will
occur, most likely in standard services rather than in Emulab
itself. Rutgers WINLAB is about to tackle it.
• Run installation scripts on boss and ops
• Put information about the hardware into the database
• Bring up nodes
Control Network VLANs
• External – connection to campus network
• Node control – nodes’ control interfaces
• Private – boss and other ‘secure’ nodes
• Public – ops and other nodes with user accounts
• Control hardware – SNMP devices that must be protected from users
Switch (Cisco) Setup
• Enable SNMP read-write access
• Control network
– ‘set port host’
– Router: Set ‘UDP helper’ address
• Experimental Network
– Disable noisy automatic protocols
– CDP, spantree, bpdu-guard
• Connect control and experimental networks
– Port in the ‘Control Hardware’ VLAN on the control side
– Some VLAN with a switch IP interface on the experimental side
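As a rough sketch, the settings above might look like this on a Cisco switch (CatOS syntax; the module/port numbers, community string, VLAN, and addresses are all made up, and exact syntax varies by platform and software version):

```
! Enable SNMP read-write access (pick your own community string)
set snmp community read-write secret-rw-string

! Control-net node ports: 'set port host' enables portfast and
! disables trunking/channeling
set port host 2/1-48

! On the router (IOS syntax): forward nodes' DHCP broadcasts to boss
interface Vlan101
 ip helper-address 10.1.1.3
```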
Installing FreeBSD
• FreeBSD 4.9 recommended right now (11/03)
• Use standard FreeBSD installer
• Pick 'Developer' distribution set
• Install the ports collection
• Pick where you will put important filesystems
• Configure hostname, IP address, etc.
• Install your favorite editors, etc.
Important Filesystems on ops
• /usr/testbed/
• /users/
• /proj/
• /groups/
• /share/
Important Filesystems on boss
• /usr/
• /usr/testbed/
• /var/
Configuring Testbed Software
• Done with GNU autoconf
• Most configuration options stored in 'defs' file
– '--with-TBDEFS='
• 'defs-example' provides a template
• Use an object tree outside your source tree
• Uses GNU make (gmake in FreeBSD)
• Install targets
– boss-install, ops-install
– post-install, to be run as root, on boss
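The configure-and-build flow above can be sketched as follows (all paths here are examples, not a required layout):

```sh
# Hypothetical paths -- adjust to your own source/object layout.
mkdir -p /usr/obj/testbed && cd /usr/obj/testbed
/usr/src/testbed/configure --with-TBDEFS=/usr/src/testbed/defs-mysite
gmake                # GNU make; FreeBSD's stock 'make' will not work
gmake ops-install    # on ops (boss-install on boss)
gmake post-install   # as root, on boss only
```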
Defs File Example
# The name of this installation
THISHOMEBASE=Example.Emulab.Net
# Domain this testbed resides in
OURDOMAIN=example.emulab.net
# Host name of our web server (or host:port)
WWWHOST=www.example.emulab.net
# Fully-qualified hostname of the boss node
BOSSNODE=boss.example.emulab.net
# Fully-qualified hostname of the ops (also called users) node
USERNODE=ops.example.emulab.net
# Fully-qualified hostname of the fileserver (will probably
# be the same as the ops node)
FSNODE=fs.example.emulab.net
Mailing Lists
• testbed-ops
– User support
– The most serious error messages
• testbed-approval
– Requests to start new projects
• testbed-logs
– Experiment creation/teardown
• testbed-audit
– Audit user and project creation and changes
Mailing Lists (contd.)
• testbed-www
– Web reports
• testbed-stated
– Reports from state daemon
• testbed-users-archive
– Archive for the automatically-generated testbed-users list – should be directed to a file
• testbed-active-users-archive
– Archive for testbed-active-users lists
Setting Up ops
• Run install/ops-install from your object tree
• Install the ops part of the testbed software
– 'gmake ops-install'
• Special kernel if using Cyclades Z-series on FreeBSD
• Mailing lists on ops
• Put boss in ops' /etc/hosts file
• Reboot ops
Setting Up boss
• Use same defs file you did with ops
– Need a duplicate obj tree for boss
• Start install/boss-install
• Install ports
• Finish install/boss-install
• Add {ops,users,fs} to /etc/hosts
• Install the boss part of the software
– Two targets, boss-install and post-install
• ssh from boss to ops as root
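Condensed into commands, the boss sequence above looks roughly like this (the paths and address are examples, and the exact invocation of the install script may differ in your source tree):

```sh
# Run on boss, in an object tree configured with the same defs file as ops.
cd /usr/obj/testbed-boss
../testbed/install/boss-install       # start; stops when ports are needed
# ...install the required FreeBSD ports, then finish boss-install...
gmake boss-install                    # install the boss side of the software
gmake post-install                    # run as root
echo "10.1.1.4 ops users fs" >> /etc/hosts   # example address
```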
Other Things To Do on boss
• setuid bit on /usr/bin/suidperl
• Get boss set up as the nameserver for your domain
• SSL certificates for the web
• 1000Hz kernel
• Bootloaders and MFSes from Utah
• Disk images for nodes come from Utah, for now
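The suidperl item above is a one-liner; the FreeBSD Perl port typically installs suidperl with the setuid bit off, so restore it (standard path assumed):

```sh
chmod u+s /usr/bin/suidperl
```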
Bootstrapping Users and Projects
• Create first user with 'firstuser' script
• Create holding experiments via the web
– In the 'emulab-ops' project
– 'hwdown' – for nodes that have failed
– 'reloadpending' – for nodes awaiting disk loads
– 'reloading' – for nodes having their disks loaded
What Goes in the Database
• Switches (nodes)
– Note: Switch node_id must be resolvable
– Types (node_types)
– Stacks (switch_stacks, switch_stack_types)
– Interconnects (interfaces, interface_types, wires)
• Power controllers (nodes, node_types)
• Images and OSids (images, os_info)
What Goes in the Database – Nodes
• Node itself (nodes)
• Virtual nodes (nodes)
• Node type (node_types)
• Interfaces (interfaces, interface_types)
• Connection to switches (wires)
• Serial line (tiplines)
• Power control outlet (outlets)
Bringing up Nodes - Overview
• Basic idea: Let nodes report in automatically
• Nodes boot with a temporary IP address
• Unknown nodes boot a 'newnode' FreeBSD MFS
• Nodes report their hardware to boss
• boss guesses node names and addresses
• Script finds interface ports on switches
• Administrator adds nodes to the testbed via a Web interface (at any time)
Bringing up Nodes – Details
• Enter node and interface types
• BIOS setup
– PXE in boot order
– Power-loss behavior
– Serial console redirection; password
• Dynamic IP range in dhcpd.conf
– Start with dhcpd.conf.template
– Use a range of IP addresses not to be used by nodes
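The dynamic range above might look like this in dhcpd.conf (all addresses are examples only):

```
# Hypothetical control-net subnet; pick a range no known node will use.
subnet 10.11.0.0 netmask 255.255.0.0 {
  option routers 10.11.0.1;
  # Temporary addresses handed out to unknown ('newnode') machines:
  range 10.11.254.1 10.11.254.254;
}
```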
Bringing up Nodes (contd.)
• Start with one node, to tweak name and IP
– Others will be inferred from the first
• Re-numbering interfaces
– Convert from FreeBSD to Linux order
• Searching switches for nodes’ interfaces
– Searches all switches in the database
– Switch ports must be enabled and in a VLAN
• Create nodes
• Free nodes from ‘hwdown’ experiment
• Scripts don't do serial and power yet
Creating a Disk Image for Emulab Test PCs
Step by step
• Install a stock {Linux,BSD} of your choice
– Make one image with both OS’s (if both popular)
• Get the serial console working
• Set up root access
– root password (for console login)
– boss root pubkey to ~root/.ssh/authorized_keys
• Ensure perl, rpm, and sudo are installed
• Create or copy over node host keys
• “make client-install” from build tree
Saving the Image
• If Emulab up and running:
– Use image creation web form
• Otherwise:
– Shut down to single-user (FS’s still mounted)
– Run /usr/local/etc/emulab/prepare
– Unmount FS’s and mount / read-only
– Reboot from CD or MFS
– Use imagezip to create the image
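A sketch of the final imagezip step, with a hypothetical disk device and destination path:

```sh
# From the CD or MFS environment; 'ad0' and the target paths are examples.
imagezip /dev/ad0 /mnt/images/mynode.ndz
# Or stream the compressed image to ops over ssh:
imagezip /dev/ad0 - | ssh ops 'cat > /proj/emulab-ops/images/mynode.ndz'
```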
Details: Programs added
• Daemons: idle detection, HW health monitor, watchdog
• Agents: traffic generation, link control, program execution
• Synchronization server
• TMCC
• Emulab startup/shutdown hooks
Details: Kernel
• Stock kernels mostly work
• For FreeBSD
– Adding IPoD a simple patch
– Shaping nodes require additional features
– BSD virtual nodes require massive changes
• For Linux
– Adding IPoD a simple patch
All Linux, All the Time
• Some environments just do Linux
• Is doing FreeBSD as well asking too much?
• Converting servers: boss, ops
– Get Emulab software to build
– Get RPM equivalents to ports
– Easy as that!
The BSD-free environment
• MFS: for creating and installing images
– Should be PXE boot environments out there
• Delay nodes
– Need Linux to act as a bridge
– Is “tc” up to the task?
• Multiplexed node hosts
– UML (probably not, too slow)
– vservers (used in planetlab, which we support)
– Xen (looks good)
Emulab Idle Detection and Policy
Idle Detection
• Why do it?
– Ease contention for resources
– Encourage better resource use practices
• How activity is defined
– CPU load (tunable threshold – currently 0.8)
– TTY activity (mod time on /dev entries)
– Network traffic
• Control network vs. Experimental network: different levels
– External activity
• E.g.: user-instantiated reboot.
Goals and Challenges
• Don’t affect experiment’s performance!
• No false positives!
– Could ruin an experimenter’s day, or week, since swapout saves no node state (currently)
• Few false negatives
• Non-goal: detect intentional circumvention
Idle Detection - slothd
• Small daemon runs on each test node
• Tracks last observed metrics & compares vs.
current values: crossed activity thresholds?
– Runs every 5 minutes when node is active
– Runs every 5 seconds when inactive (aggressive mode).
• Instant detection of idle->active transition
• Reports via UDP to central collector
– Every five minutes, or send immediately if transitions
from idle->active
– Collector stores last activity timestamps in the DB
• One for each type of activity.
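The threshold comparison slothd performs can be illustrated with a tiny sketch (the 0.8 figure is the tunable CPU threshold mentioned above; the function name and use of shell are illustrative only, since the real slothd is a compiled daemon):

```shell
#!/bin/sh
# Sketch of the activity test: compare the current 1-minute load
# average against the tunable CPU threshold (0.8 in this example).
CPU_THRESH="0.8"

is_active() {
  # $1 = current load average; exit 0 (true) if it crosses the threshold
  awk -v load="$1" -v t="$CPU_THRESH" 'BEGIN { exit !(load >= t) }'
}

if is_active 1.25; then echo "active"; else echo "idle"; fi   # prints "active"
```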
Idle Detection – Policy Enforcement
• Script runs every 5 minutes on server.
– May send email to user warning of inactivity.
– May swap out experiment (idle, or max duration).
• Policy is dictated by tunable site-vars.
– Frequency of email warnings.
– When to swap experiment out.
• Honors per-experiment override settings.
– E.g.: expt not swappable; max duration; idle threshold.
– User has control over these both at experiment creation,
and after swapin by editing expt. metadata.
Idle Detection cont.
• Last seen activity viewable on web page.
– Per experiment.
– Drill down at node level (CPU, TTY, Network).
• Issues
– Easy to circumvent intentionally
– Sometimes defeated unintentionally
• E.g.: User leaves traffic generator running
– Hard (impossible?) to determine what activity is
meaningful (be context sensitive)
Emulab’s Administrative Structure
Executive Summary
• Emulab has a two level administrative structure:
“Projects”, and “Groups”
– Plus “Elab Admins” on top, and “Users” on the bottom
• Permits a variety of common organizational
structures
– A single user project with no groups
– A class project with multiple, isolated groups managed
by TAs
• Administrative control is delegated to “leaders” at
each level
A “Project”
• Central administrative entity
• Started by a faculty member or senior student
– Submitted through web interface
– User account gets created for experiment leader
• Approval of project users delegated to leader
– Saves on administrative overhead
– Project leader responsible for users' behaviour
– Leader may grant leader rights to one or more users
• Essential for lazy faculty
• Project gets its own disk space/tree
• Users may join multiple projects
A “Group”
• Projects may have multiple groups
• Project groups created by the project leader
– Designates group leaders for each
– Delegates group membership approval to them
• Groups are independent
– Files/experiments are protected from each other
• Groups can share
– Share the common project file hierarchy