Herding Cats
Managing a mobile Unix platform
in the enterprise
Who are we?
• IT department of Cisco TAC
• Sun Solaris servers/desktops, HA
environment
• 5600 active accounts, 1200 workstations, 5
major sites globally
• 18 people
• Mail us: maarten@cisco.com, wout@cisco.com
This presentation
• Desktop replacement plan
1. Choose Platform & Tools
2. ???
3. Profit !!!
• Desktop analysis and source code available
• http://radmind.org/contrib/LISA05
Choose platform
• Find out what’s important to you, attach weights
• For every platform, give a score for each attribute
• sum(), total()
• Important in our case:
 Unix networking and engineering tools
 Only one platform
 Supportability
 Laptop experience: hotplugging, battery life, …
 Security
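
The weighting approach on this slide can be sketched as a small weighted-sum calculation. The attribute keys mirror the list above, but the weights, the scores, and the names platform_a and platform_b are made up for illustration; they are not the numbers used in the actual evaluation.

```python
# Hypothetical weights (1 = nice to have, 5 = essential); not the authors' values.
WEIGHTS = {
    "unix_tools": 5,
    "single_platform": 3,
    "supportability": 4,
    "laptop_experience": 4,
    "security": 5,
}


def weighted_total(scores, weights=WEIGHTS):
    """Return the sum of score * weight over every weighted attribute."""
    return sum(weights[attr] * scores[attr] for attr in weights)


def rank(platforms, weights=WEIGHTS):
    """Return (platform, total) pairs sorted from best to worst."""
    totals = {name: weighted_total(s, weights) for name, s in platforms.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores on a 1-5 scale for two unnamed candidates.
    candidates = {
        "platform_a": {"unix_tools": 4, "single_platform": 5, "supportability": 4,
                       "laptop_experience": 5, "security": 4},
        "platform_b": {"unix_tools": 5, "single_platform": 2, "supportability": 2,
                       "laptop_experience": 2, "security": 3},
    }
    for name, total in rank(candidates):
        print(name, total)
```

Keeping the weights separate from the scores makes it easy to rerun the comparison when priorities change.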
Herding Cats
• Managing laptops is like herding cats
• Always know where your cats are
 They’re off the network before you know it
 Perform hourly client-side update checks with
fast tools
 Use asset tracking to know IP address and
current system state
This is a personal laptop!
Herding Cats
• Keep your cats healthy & happy
 Feed them regular software updates
 Stroke their ego by giving them control over
the update process
 File system checking makes sure there are no
invading organisms
 With backups, your users can have nine lives
too!
Tools and tricks
These apply not just to OS X !
How
• Manage file system: rpm, tripwire?
• Asset database: ethers?
• Uniquely identify systems: SSL/PGP cert?
• Naming service: LDAP?
• Secure data: FileVault?
• Business continuity: Backups?
Managing the file system
• Our rules:
 Keep full control over the client systems
• No root access for the user
 One image to rule them all
• Identical problems/solutions
• Fast system replacement
• No need to back up system image
 Enforce known-good state on client
• Install what we want there
• Remove what we don’t want there
• Do it without bothering the sysadmins
Which toolkit?
• Apple’s OS X package system is lacking
 No consistent uninstall
 No dependency tracking
• Most package tools can’t manage OS X
 They assume they’re alone on the system
 Resource forks / extended attributes
 File system is case insensitive
So… radmind
• We chose RAdminD
• Large installed base of radmind on OS X
• http://radmind.org
• Thanks, UMich!
Why … radmind?
• Can repair broken systems without reimaging
 Bit flips, rootkits, user with sudo rights
⇒ repair automatically!
 Can handle OS upgrades
• Server can run on any platform
 re-use existing infrastructure
• KISS, easily extensible, Unix philosophy
How radmind works
System state: command.K
    p base-system-10.3.4.T
    n os-negative.T
    p omnigraffle.T
os-negative.T
    d /Users 0755 0 0
    d /private/tmp 1777 0 0
omnigraffle.T
    d /Applications/OmniGraffle.app 0755 0 80
    f /Applications/OmniGraffle.app/.DS_Store 0755 0 0 1096985459 6148 WCf1IuqHcXrNGZDUiX+Buucs83Q=
    d /Applications/OmniGraffle.app/Contents 0755 0 80
    …
• Command file determines the overloads that make up the system
• Positive transcripts add files, negative ignore files
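
To make the file formats on this slide concrete, here is a minimal Python sketch that reads the two kinds of lines shown: command-file entries (p for a positive transcript, n for a negative one) and directory entries of the form "d path mode uid gid". It is an illustration only, not radmind's own parser: it handles just these record types, assumes paths contain no whitespace, and the function names are invented for this example.

```python
def parse_command_file(text):
    """Split command-file lines into positive and negative transcript names."""
    positive, negative = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        kind, name = line.split(None, 1)
        if kind == "p":
            positive.append(name)
        elif kind == "n":
            negative.append(name)
        else:
            raise ValueError(f"record type not handled in this sketch: {kind!r}")
    return positive, negative


def parse_directory_entry(line):
    """Parse a 'd path mode uid gid' transcript line into a dict."""
    kind, path, mode, uid, gid = line.split()
    if kind != "d":
        raise ValueError("only directory entries are handled in this sketch")
    # The mode is written in octal, e.g. 1777 for a sticky world-writable directory.
    return {"path": path, "mode": int(mode, 8), "uid": int(uid), "gid": int(gid)}


if __name__ == "__main__":
    command_k = "p base-system-10.3.4.T\nn os-negative.T\np omnigraffle.T\n"
    print(parse_command_file(command_k))
    print(parse_directory_entry("d /private/tmp 1777 0 0"))
```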
User experience
• Updates run while user is working
• User gets prompted before downloading
Radmind conclusion
• 1 mechanism
• Fixes all problems
• Introduces new ones
 Solved by client side trigger scripts
• Normally used in lab setting at reboot time
• Normally not used while user is working
 But it works really well for us
Our radmind setup
• Global DNS service points to nearest, available
radmind server
 Produces a scalable, highly available setup
 SSL certs should contain DNS alias
• 3 ports for 3 trees: stable, testing, staging
 Reduces operator error
 Shared file tree for disk space optimization
• Master host to maintain trees
 Push changes downstream using rsync
 Script checks correctness and dependencies before
pushing (dist-it)
Distributed setup
Multi-release setup
• The file storage is shared between releases
• Stable, testing and staging are the source dirs for 3 radmind daemons on 3 ports
• Symlinks allow fast switching
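
One common way to get fast switching from symlinks is to build the new link under a temporary name and rename it over the old one, so readers never see a missing link. The sketch below shows that general technique on a POSIX system. The directory names (release-1, release-2, stable) and the function name are hypothetical; the actual layout used on the master host is not shown on the slide.

```python
import os
import tempfile


def switch_release(link_path, new_target):
    """Repoint the symlink at link_path to new_target with a single rename."""
    tmp_link = link_path + ".new"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)          # clear a leftover from an aborted run
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)  # rename() replaces the old link atomically on POSIX


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in ("release-1", "release-2"):
            os.mkdir(os.path.join(root, name))
        stable = os.path.join(root, "stable")
        switch_release(stable, "release-1")   # initial link
        switch_release(stable, "release-2")   # promote the next release
        print(os.readlink(stable))
```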
AssetInterface
• Asset tracking software tracks:
 Owner of machine (set on first login)
 IP addresses
 Logins
 Etc
• Saves data locally until machine can reach the server
• Precompiled SQL
• Once system leaves us, we don’t see it again until it breaks ⇒ Margaritas @ the beach!
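
The "save locally until the server is reachable" behaviour is a store-and-forward queue. A minimal sketch of the idea follows, assuming a JSON-lines spool file and a caller-supplied send function that raises OSError while the server is unreachable. The class name, file format, and event fields are invented for this example and are not taken from AssetInterface itself.

```python
import json
import os
import tempfile


class Spool:
    """Append events to a local file and deliver them when the server is reachable."""

    def __init__(self, path):
        self.path = path

    def record(self, event):
        """Queue one event locally."""
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(event) + "\n")

    def flush(self, send):
        """Send queued events in order; keep whatever could not be delivered."""
        if not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as fh:
            pending = [json.loads(line) for line in fh if line.strip()]
        sent = 0
        for event in pending:
            try:
                send(event)
            except OSError:
                break  # server unreachable: stop and retry on the next flush
            sent += 1
        with open(self.path, "w", encoding="utf-8") as fh:
            for event in pending[sent:]:
                fh.write(json.dumps(event) + "\n")
        return sent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        spool = Spool(os.path.join(tmp, "asset-events.jsonl"))
        spool.record({"type": "login", "user": "example"})
        delivered = []
        print(spool.flush(delivered.append), delivered)
```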
RegServ
• Radmind differentiates systems based on IP or
certificate name
 We encode system info in the certificate name
 e.g. ppc7450.PowerBook5,4.W842219KQW3.mthibaut
 Wildcards allow matching any of the parts
• RegServ uses generic client certificate installed
on base image to securely provide a machine
specific certificate
 Secure as long as client base image is secure
 Based on radmind code
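
The certificate-name scheme above can be illustrated with shell-style wildcards. This sketch uses Python's fnmatch module as a stand-in: radmind's own matching rules may differ in detail. The field labels (cpu, model, serial, owner) are guesses based on the example name and are not stated on the slide.

```python
from fnmatch import fnmatchcase

# Guessed labels for the four dot-separated parts of the example name.
FIELDS = ("cpu", "model", "serial", "owner")


def split_name(cert_name):
    """Split a certificate name into its dot-separated parts."""
    parts = cert_name.split(".")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} parts, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def matches(cert_name, pattern):
    """Case-sensitive shell-style match of a certificate name against a pattern."""
    return fnmatchcase(cert_name, pattern)


if __name__ == "__main__":
    name = "ppc7450.PowerBook5,4.W842219KQW3.mthibaut"
    print(split_name(name))
    print(matches(name, "*.PowerBook5,4.*.*"))   # select by hardware model
    print(matches(name, "*.*.*.mthibaut"))       # select by owner
```

Encoding several attributes in one name lets a single pattern select machines by hardware, by serial number, or by user.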
How - LDAP
• We used our existing LDAP servers
• OS X can cache the credentials
• Lots of policy enforcement possible
 provide default or forced custom settings for any
Cocoa application, lots more
• Sounds simple, but we had quite a bit of trouble
 “MCX” keywords are undocumented
 Trial and error
 Final solution: use same LDAP layout as OS X server
in a subtree, allows using Apple’s GUI tools
• Overall it works, could be better
How - FileVault
• Secures user data using AES-128 encryption
 Data is stored in a resizing disk image
 Master certificate allows password recovery by admin
• Deemed mandatory in our organization
 We had to hack things a bit
 A script runs at login time and verifies existence
 Creates FileVault if not there
 Works reasonably well, fast (2-3 MiB/s)
Backups
• KISS
• We already back up our home directories on Solaris
• So we wrote a GUI for rsync over SSH
• Works fine, even though resource forks not copied
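
As a rough idea of what a front end for rsync over SSH has to assemble, the sketch below builds (but does not run) an rsync command line. The flags -a (archive), -z (compress) and -e ssh (use SSH as transport) are standard rsync options; the host, user, and paths are placeholders, and the options the actual GUI passes are not given on the slide.

```python
import shlex


def build_backup_command(source_dir, user, host, dest_dir):
    """Return an rsync argument list that copies source_dir to host over SSH."""
    # A trailing slash tells rsync to copy the directory's contents,
    # not the directory itself.
    source = source_dir.rstrip("/") + "/"
    return ["rsync", "-az", "-e", "ssh", source, f"{user}@{host}:{dest_dir}"]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_backup_command("/Users/example", "example",
                               "backup.example.com", "/home/example/laptop")
    print(shlex.join(cmd))
```

Returning a list rather than a string means the command can be handed to subprocess.run without any shell quoting concerns.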
Wrapping up
Mac OS X vs Solaris
In our experience:
 Better application availability/installation
 Little need to manually compile tools (e.g. fink)
 GUI/usability is not an afterthought
• User mountable filesystems
• User installable programs
 It Just Works
• We can concentrate on our users
Giving back
• All our programs, utilities and scripts for this project have been published
• http://radmind.org/contrib/LISA05/
• We apologize for the inconvenience
 Hard coded paths
 No installation scripts
 Little documentation
 Choose your license, but don’t call Cisco TAC about these!
Questions?
Backup Slides
Because we didn’t tell you
everything :)
Desktop replacement
Why
• Laptops are becoming standard
 Telecommuting
 Lab work
 Customer visits
 One system for everything
 One computer per employee
• Our users need/want Unix
Why
• Needed a supportable platform
• Experiments with Linux failed
• OS X was the only viable Unix for portable
systems
 It Just Works
Why … not Linux?
 No vendor hardware support
 No drivers for recent hardware
 Missing UI for common laptop tasks
• Networking setup
• Mounting network file systems
• Hotplugging disks, audio, video, …
Why … choose OS X?
• Unix!
 X-Windows capability
• Well designed, e.g. fixing long-time Unix issues (launchd, directory services, …)
• Everything is integrated
 Applescript
 User preference system
 Server side preference setting overrides
 Naming services
 Etc (very extensive)
Why … choose OS X?
• Security
 Admin access hardly needed
• User installs, network setup, USB drives, audio
devices, …
 FileVault
 Secure screen lock
• Good software support from Apple
• Reasonable software availability for
platform
Why … choose Apple?
• One vendor for hardware and software
• Perfect platform support for OS X
• Quality hardware
• Supportability:
 Firewire target mode for disk rescue
 One disk image for all system types
• Except for brand-new systems
Why … not OS X?
• Politics (MS centric world)
• Apple not geared towards enterprise
 HW release schedule forces use of buggy OS
releases
 Undocumentation & secrecy
 No on-site or phone support as with Sun
 Support costs $$ per admin rather than $$ per
machine
Why … not OS X?
• LDAP caching
 System always contacts LDAP server when
cached data expires and name lookup occurs
 Where can we change expiry timers?
 Can’t this be done asynchronously?
• Have to learn “the Apple Way”
 Sometimes there is no other way
• Using X-Windows is a pain
FileVault architecture
• Sparse disk image
• Encrypted by long key stored with disk image
• Key is encrypted with 2 keys, each can unlock it:
 User password
 Master FileVault certificate
• Master FileVault certificate is encrypted with an
admin password
What is radmind?
• Command files + overloads
 Describe system state, files + checksums
• ktcheck
 Download wanted system state
• fsdiff
 Compare file system with wanted state
• lapply
 Repair changes found by fsdiff
• All done using encryption and authentication
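
The compare-then-repair cycle can be pictured with a small conceptual model: the wanted state and the actual file system are both maps from path to checksum, the diff lists what must change, and applying the diff makes the two equal. This is a toy illustration of the idea only, not how fsdiff or lapply are implemented, and it ignores negative transcripts. The function names and the "+", "-", "~" markers are invented for this example.

```python
def diff_state(wanted, actual):
    """List the changes needed to turn the actual state into the wanted state."""
    changes = []
    for path in sorted(wanted.keys() | actual.keys()):
        if path not in actual:
            changes.append(("+", path))   # missing on the client: install
        elif path not in wanted:
            changes.append(("-", path))   # not in the wanted state: remove
        elif wanted[path] != actual[path]:
            changes.append(("~", path))   # checksum differs: replace
    return changes


def apply_changes(changes, wanted, actual):
    """Return a repaired copy of actual after applying the listed changes."""
    repaired = dict(actual)
    for op, path in changes:
        if op == "-":
            del repaired[path]
        else:
            repaired[path] = wanted[path]
    return repaired


if __name__ == "__main__":
    wanted = {"/bin/tool": "aaa", "/etc/conf": "bbb"}
    actual = {"/etc/conf": "tampered", "/tmp/rootkit": "zzz"}
    changes = diff_state(wanted, actual)
    print(changes)
    print(apply_changes(changes, wanted, actual) == wanted)
```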
Our overload layout