Vac, Vac-in-a-Box, and Vcycle
Andrew McNab
University of Manchester
LHCb
Overview
● Vac and Vcycle status
  ● Most news is about Vac
● Introduction to Vac-in-a-Box
● Containers and CernVM
● Plans
● Observations about OpenStack
Vac, Vac-in-a-Box, Vcycle - Andrew.McNab@cern.ch - GridPP35, Sep 2015, Liverpool
Vac and Vcycle
● Two GridPP systems aimed at running VMs
● Vac - autonomous hypervisors
  ● Each factory machine creates VMs in response to observed demand for each type of VM
● Vcycle - uses OpenStack etc
  ● Factories created via Cloud API in response to observed demand for each type of VM
  ● OCCI (EGI) and Azure (MS) plugins from CERN
  ● Now also working with commercial OpenStack provider (DataCentred at MediaCity)
Vac, Vcycle, HTCondor Vacuum deployment
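Columns: Vac | Vcycle | ATLAS | CMS | LHCb | GridPP DIRAC
(which column each tick belongs to is not recoverable from the extracted text)
Manchester: ✔ ✔ ✔ ✔
Oxford: ✔ ✔ ✔ ✔
Lancaster: ✔ ✔ ✔
UCL: ✔ ✔
CERN (LHCb): ✔
CERN (Dev): ✔ ✔ ✔ ✔
Imperial: ✔ ✔ ✔ ✔
CC-IN2P3: ✔
DataCentred: ✔ ✔ ✔ ✔
HTCondor Vacuum at RAL: ✔ ✔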
Vac installation and support
● Detailed Admin Guide in distribution and on website
  ● Plus mailing list and other website contents
● There is a Support Unit for Vac/Vcycle in GGUS
  ● And a service type for Vac in GOCDB
● The Vac distribution includes a Puppet module
  ● So it can be managed like a conventional service or piece of grid middleware
● Aim is to make the Vac daemon easy for system admins to manage
  ● Factory machines are autonomous so less worrying about small failures of services that have big consequences
  ● man pages, Vac rereads config files each ~minute, config merger of /etc/vac.conf and /etc/vac.d/*.conf
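The merging of /etc/vac.conf with /etc/vac.d/*.conf can be pictured with Python's standard configparser module. This is a minimal sketch, not taken from the Vac source: the function name, the read order (main file first, then drop-in files in sorted order) and the rule that later files override earlier ones are all assumptions made for illustration.

```python
import configparser
import glob
import os


def load_merged_config(main_path="/etc/vac.conf", dropin_dir="/etc/vac.d"):
    """Read main_path, then every *.conf file in dropin_dir in sorted order.

    Assumption for illustration: a value set in a later file overrides the
    same option in an earlier file. Vac's real merge rules may differ.
    """
    # interpolation=None so that '%' characters in values are kept literally
    parser = configparser.ConfigParser(interpolation=None)
    paths = [main_path] + sorted(glob.glob(os.path.join(dropin_dir, "*.conf")))
    # ConfigParser.read() skips files that do not exist
    parser.read(paths)
    return parser
```

Calling such a function once a minute and comparing the result with the previous one would be one way to pick up changes without restarting the daemon.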
Vac site
[Diagram: a Vac factory machine hosting several VMs, with labels for the vacd daemon, puppet client and PXE BIOS.]
HEP-specific software is hidden inside the VMs apart from Vac.
Vac site
[Diagram: a Vac factory machine hosting several VMs, with labels for the vacd daemon, puppet client and PXE BIOS, surrounded by separate site services: DNS service, DHCP service, Puppet server, TFTP service, Squid server, Kickstart / HTTP server.]
HEP-specific software is hidden inside the VMs apart from Vac.
But still need to run the usual set of Linux site services.
Even easier: Vac-in-a-Box
● Vac-in-a-Box (ViaB) aims to make it ridiculously easy to install Vac-based sites
● Extend the autonomous factory machine model to managing the operating system too
● Remove all reliance on central service machines at the site
  ● But be flexible to accommodate site policy about DHCP etc
● Each factory not only runs Vac to create machines, but also
  ● A Squid cache used by CernVM-FS in the VMs
  ● All the services for any other machine to install itself
● Configuration is managed through viab.gridpp.ac.uk website
  ● Files fetched from website are static, so easy to scale up multiple instances of the web server
Vac-in-a-Box website
● Configuration is managed by site and/or regional admins
  ● Only required when changes are needed
● Right to modify settings based on per-site lists of DNs
  ● Intend to pull these from GOCDB lists of site ops
● Sites use GOCDB-style names like UKI-NORTHGRID-MAN-HEP
● Sites consist of one or more “spaces”, equivalent to CEs
● In Vac:
  ● Spaces consist of factory machines, with IP, MAC etc
  ● Spaces have one or more vmtypes
    ● details of how to contextualise that type of VM
    ● and its target share of the space
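The site, space, factory and vmtype hierarchy described above could be modelled roughly as follows. This is a sketch for orientation only: the class and field names are hypothetical and do not come from the ViaB website code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Factory:
    hostname: str
    ip: str
    mac: str


@dataclass
class VmType:
    name: str
    user_data_url: str   # how to contextualise this type of VM
    target_share: float  # relative share of the space


@dataclass
class Space:
    name: str
    factories: List[Factory] = field(default_factory=list)
    vmtypes: List[VmType] = field(default_factory=list)

    def normalised_shares(self) -> Dict[str, float]:
        """Return each vmtype's target share as a fraction of the space."""
        total = sum(v.target_share for v in self.vmtypes)
        if total == 0:
            return {}
        return {v.name: v.target_share / total for v in self.vmtypes}


@dataclass
class Site:
    name: str  # GOCDB-style name, e.g. UKI-NORTHGRID-MAN-HEP
    spaces: List[Space] = field(default_factory=list)
```

Keeping target shares relative rather than absolute means a vmtype can be added to a space without editing the others.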
Site configuration
Space configuration
Host (factory) configuration
Vmtype configuration
Vac-in-a-Box docs
DNS and DHCP in ViaB
● Since we have a list of factories including IP/DNS, can provide DNS within ViaB
  ● Instead of running private DNS services, ViaB just lists each factory in /etc/hosts on each factory
  ● Can use site DNS servers for general DNS queries or generic DNS like 8.8.8.8/8.8.4.4
● Each factory runs a DHCP server
  ● Again based on the configured list of IP/DNS/MAC
  ● Set non-authoritative, so can co-exist with site DHCP
  ● If necessary, site DHCP can be configured to respond for factories instead
● All this means you can run a ViaB site without your own DNS/DHCP servers, and without requesting changes to them if someone else runs them
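Listing every factory in /etc/hosts amounts to turning the configured factory list into one line per machine. A minimal sketch, assuming the list is available as (IP, fully qualified name) pairs; the function name and the choice to add the short hostname as an alias are illustrative, not ViaB's actual behaviour.

```python
def hosts_lines(factories):
    """Build /etc/hosts-style lines from (ip, fqdn) pairs.

    Each line has the form: IP address, canonical name, short alias.
    Lines are sorted by name so the output is stable between runs.
    """
    lines = []
    for ip, fqdn in sorted(factories, key=lambda pair: pair[1]):
        short = fqdn.split(".", 1)[0]
        lines.append(f"{ip}\t{fqdn} {short}")
    return lines
```

Because the output depends only on the published factory list, every factory ends up with the same name-to-address mapping without any DNS server being involved.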
Squid caches
● Need access to Squid HTTP caches for CernVM-FS
  ● VMs use CernVM, which gets OS via CernVM-FS too
● Site Squids can be single points of failure, performance bottlenecks etc
● ViaB puts a Squid cache on each factory machine
  ● Used for VM CernVM-FS and RPM installs on that machine
  ● Squids use ICP UDP protocol to check if any other Squid in the space has the requested file in their cache
    ● Only if none does is an upstream or direct request made
    ● Least loaded Squid will typically reply first via ICP.
● Avoids need for site Squid(s). Very robust.
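In Squid, ICP siblings are declared with cache_peer lines of the form "cache_peer host sibling http-port icp-port", and ICP is enabled with icp_port. The sketch below generates such lines for one factory from the list of factories in its space. Whether ViaB's generated squid.conf looks like this is an assumption; the function name, the default ports and the proxy-only option are illustrative choices.

```python
def sibling_peer_lines(this_host, space_hosts, http_port=3128, icp_port=3130):
    """Return squid.conf lines making every other factory an ICP sibling.

    this_host:   name of the factory the configuration is for
    space_hosts: names of all factories in the space (may include this_host)
    """
    lines = [f"icp_port {icp_port}"]  # listen for ICP queries from siblings
    for host in sorted(space_hosts):
        if host == this_host:
            continue  # a cache is not its own sibling
        # proxy-only: do not store a second copy of objects fetched from a sibling
        lines.append(f"cache_peer {host} sibling {http_port} {icp_port} proxy-only")
    return lines
```

With siblings declared this way, a cache miss triggers ICP queries to the other factories before any upstream request, which matches the behaviour described on the slide.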
PXE installation in ViaB
● A new factory boots with its PXE BIOS and uses a DHCP reply from one other ViaB machine (the “responder”)
● The responder DHCP tells PXE BIOS on installing machine to contact the responder’s TFTP too
● TFTP on each ViaB factory has the usual install files, plus a pointer to use the ViaB Kickstart file from viab.gridpp.ac.uk
● Kickstart file tells Anaconda installer to use the responder’s Squid cache for fetching the SL6, CA, APEL etc RPMs
  ● The site’s viab-conf RPM is fetched from viab.gridpp.ac.uk
● Installation proceeds as usual, taking ~5 minutes
● The machine boots straight into SL6 using kexec, so no PXE boot looping
  ● A reboot forces a reinstall, unless you manually skip PXE
● To get the first factory up, a USB boot image is provided
Vac-in-a-Box configuration management
● Configuration of the site is managed on viab.gridpp.ac.uk
● Each time the “Publish” button is used, a new viab-conf RPM is generated and put in the site-specific YUM repo
  ● Includes a list of factory IP/DNS/MAC
  ● vmtype configurations for Vac
  ● X.509 certificates and encrypted keys in .p12
  ● SSH authorized_keys list
  ● Idempotent viab-conf-postinstall to apply any changes
  ● RPM dependencies to install more RPMs via YUM
● A persistent partition at /etc/viabkeys has the .p12 password
  ● Has to be set up once per factory, but survives reinstalls
● YUM auto updates are enabled to ensure security updates are applied
  ● Less risk of breaking everything since no middleware other than APEL
  ● All YUM updates other than viab-conf are done via factory Squid caches
Vac+Puppet site
[Diagram: a Vac factory machine hosting several VMs, with labels for the vacd daemon, puppet client and PXE BIOS, surrounded by separate site services: DNS service, DHCP service, Puppet server, TFTP service, Squid server, Kickstart / HTTP server.]
HEP-specific software is hidden inside the VMs apart from Vac.
But still need to run the usual set of Linux site services.
Vac-in-a-Box site
[Diagram: a Vac factory machine hosting several VMs, with labels for the vacd daemon, YUM+viab-conf, DHCP, Squid, hosts, TFTP and PXE BIOS; the only external label is viab.gridpp.ac.uk.]
HEP-specific software is hidden inside the VMs apart from Vac.
Site services are hidden inside the Vac factory machines.
Back to generic Vac …
Containers
● CernVM group now offering a technology preview of container-based machines using Docker
● Plan is to use this to offer containers managed by Vac
  ● Will run CernVM-FS on the factory
  ● So a trade-off in managing that vs advantages of containers
● So where I say “virtual machine” in these slides, I could say “logical machine” to be general
Vac next steps
● New VacQuery UDP protocol to scale up beyond ~100 factories in the future
● Container support
● Fetch ops DNs for viab.gridpp.ac.uk from GOCDB
● Provide an (even more) OpenStack-like environment for the VMs from Vac
● Integrate Squid-on-factory mode in Vac Puppet module
● … but stay near current codebase of ~3300 lines of Python
OpenStack codebase commits
“CY15-Q1 Community Analysis — OpenStack vs OpenNebula vs Eucalyptus vs CloudStack”
http://www.qyjohn.net/?p=3801
Summary
● Vac and Vcycle continue to be in use by a matrix of sites and experiments
● Vac-in-a-Box now further simplifies creating a Vac site
● More Vac and Vcycle development in the pipeline
● OpenStack appears to be the clear winner among Open Source cloud implementations
Target shares: ATLAS vs LHCb
Each autonomous Vac machine uses the VacQuery UDP protocol to discover what else is happening at the site.
Compares this against the target share for each type of VM (~1 per experiment).
Creates new VMs for experiments currently under their share.
But backs off creating types of VM which are failing to find any work to do.
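The decision each factory makes can be sketched as: skip any vmtype that recently failed to find work, then pick the one furthest below its target share. The code below is an illustration under stated assumptions, not Vac's implementation: the function name, the input dictionaries and the "lowest running-to-share ratio" rule are all hypothetical simplifications.

```python
import time


def choose_vmtype(target_shares, running, last_no_work, backoff_seconds, now=None):
    """Pick the vmtype to create next, or None if every type is backing off.

    target_shares:   {vmtype: relative target share of the space}
    running:         {vmtype: VMs of that type currently running in the space}
    last_no_work:    {vmtype: unix time a VM of that type last found no work}
    backoff_seconds: how long to avoid a type after it found no work
    """
    now = time.time() if now is None else now
    best, best_ratio = None, None
    for vmtype, share in target_shares.items():
        if share <= 0:
            continue  # a type with no share is never created
        stopped = last_no_work.get(vmtype)
        if stopped is not None and now - stopped < backoff_seconds:
            continue  # still backing off after finding no work
        # A low ratio means the type is far below its target share
        ratio = running.get(vmtype, 0) / share
        if best_ratio is None or ratio < best_ratio:
            best, best_ratio = vmtype, ratio
    return best
```

In this picture the `running` counts would come from the answers other factories give to VacQuery requests, so no central scheduler is needed.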
Example of Vac configuration
● Section of vac.conf used to enable LHCb VMs at Manchester
● They just need this and to create a hostcert/key.pem
● Compare what YAIM has to do to add a VO to a CE/Batch site
[vmtype lhcbprod]
vm_model = cernvm3
root_image = https://lhcbproject.web.cern.ch/lhcbproject/Operations/VM/cernvm3.iso
rootpublickey = /root/.ssh/id_rsa.pub
backoff_seconds = 600
fizzle_seconds = 600
max_wallclock_seconds = 172800
log_machineoutputs = True
accounting_fqan=/lhcb/Role=NULL/Capability=NULL
heartbeat_file = vm-heartbeat
heartbeat_seconds = 600
user_data = https://lhcbproject.web.cern.ch/lhcbproject/Operations/VM/user_data
user_data_option_dirac_site = VAC.Manchester.uk
user_data_option_cvmfs_proxy = http://squid-cache.tier2.hep.manchester.ac.uk:3128
user_data_file_hostcert = hostcert.pem
user_data_file_hostkey = hostkey.pem
Vcycle
Apply Vac ideas to OpenStack etc
[Diagram: Vcycle VM factories, shown at the site, centrally and at a third party, start pilot VMs at a cloud site; the pilot VMs send requests for real jobs to the central agents & services, where the Matcher & Task Queue holds user and production jobs.]
Vcycle implements an external VM factory that manages VMs.
Can be run centrally by experiment or by site itself or by a third party.
VMs started and monitored by Vcycle, but not managed in detail (“black boxes”).
Pilot VM. Runs Job Agent to fetch from TQ.
No direct communication between Vcycle and task queue.