Three languages for fabric configuration

Alexander Holt
School of Informatics, University of Edinburgh
lex@fixedpoint.org

Abstract

The highly automated configuration of large scale computing fabrics increasingly exploits declarative representations of the fabric’s intended state. We examine three languages for expressing these representations: LCFG, Pan and SmartFrog. We give examples of their use and examine which language properties are of particular relevance to the configuration of grid computing fabrics.

1 Introduction

Local computing fabrics require correct configuration to function properly. We take ‘configuration’ to include operating system and application software installation, the setting of system parameters and configuration files, and the configuration of software components: everything that is under machine control. Traditionally, large networked UNIX installations have been managed with a variety of tools, offering varying degrees of centralisation and automation. These tools have tended to be ad hoc, reinvented in different ways at different sites, and lacking an explicit model of the intended state of each node in the fabric. Instead, the information about how a node should be configured would be implicit in the design of a special-purpose script, or hidden in various parameters, or perhaps supplied on the fly as arguments.

This approach makes it hard to tell whether a system is in its intended state or not; it also makes it difficult to compare configurations, as the relevant scripts may need to be executed before their effects are apparent. And since in these traditional tools configuration policy often gets mixed in with the scripts that apply it, modifying configurations is intrinsically vulnerable to programmer error.

Historically, these drawbacks have to some extent been hidden by the relatively small scale of installations, or by the availability of sufficient numbers of experienced system admin staff. In the grid world, these assumptions no longer hold: we require configuration systems that scale to 10,000-node fabrics.

An important step in improving older tools is to adopt an explicit and declarative representation of the desired configuration state for a fabric. This immediately partitions the configuration problem into two parts: (a) developing the correct description, and (b) applying it to all relevant nodes. Only the second part requires a procedural solution, and can be made neutral with respect to specific configuration values.

2 Languages

Declarative representations amenable to automated processing don’t have to be presented as formal languages. For example the Rocks clustering toolkit (Papadopoulos et al. 2003) relies on a declarative representation that consists of the Red Hat Kickstart (Red Hat, Inc. 2003) configuration language (with some additional structure) augmented by a node-oriented network parameter database.

But a single well-defined language has advantages, such as the ease with which its semantics can be expressed. This paper briefly compares some contemporary options for such a language.

The LCFG system (Anderson and Scobie 2002) is a configuration tool already used by some grid computing sites that relies on a declarative representation, and its language is the first one considered in this study. The second language is Pan, recently created for the European DataGrid (EDG) project (Cons and Poznański 2002). The third language is that of the SmartFrog system, currently under development at HP Laboratories (Goldsack et al. 2003).

3 Examples

Suppose we have a ‘template’, or set of defaults, that specifies how the first disk in a system should be partitioned, indicating partition size, filesystem type and mount point (where relevant). We want to use this template when defining the configuration of individual nodes, but sometimes we wish to override certain values.
Here’s a simple example of this situation, coded up in the three languages we are studying. Each case consists of a template definition in one file, followed by a second file defining the configuration for a node called mysys, which changes the size of the first partition, and adds a third partition. First, here’s LCFG:

disk.h:

    disk.deftype      ext2
    disk.parts        hda1 hda2
    disk.size_hda1    5000
    disk.type_hda1    <%disk.deftype%>
    disk.mount_hda1   /
    disk.size_hda2    512
    disk.type_hda2    swap

mysys:

    #include "disk.h"
    !disk.parts       mADD(hda3)
    !disk.size_hda1   mSET(7000)
    disk.size_hda3    3000
    disk.type_hda3    <%disk.deftype%>
    disk.mount_hda3   /opt

In LCFG, the language doesn’t actually know about templates, hence the use of a separate file (disk.h) to package up the default values, which is then included by the C preprocessor. The situation is different in Pan and SmartFrog, where there is template-specific syntax.

This is how it looks in Pan:

disk.tpl:

    template disk;
    "/disk/deftype" = "ext2";
    "/disk/parts/hda1" = nlist(
      "size", 5000,
      "type", value("/disk/deftype"),
      "mount", "/",
    );
    "/disk/parts/hda2" = nlist(
      "size", 512,
      "type", "swap"
    );

mysys.tpl:

    object template mysys;
    include disk;
    "/disk/parts/hda1/size" = 7000;
    "/disk/parts/hda3" = nlist(
      "size", 3000,
      "type", value("/disk/deftype"),
      "mount", "/opt",
    );

Compared to LCFG the most obvious difference is that LCFG lacks a clear syntax for compound values, and instead uses an ‘underscore with index’ convention to indicate the intended structure.

Here’s the example in SmartFrog; as with Pan, this could be done in a single file:

diskTemplate.sf:

    diskTemplate extends {
      disk {
        deftype "ext2";
        parts extends {
          hda1 extends {
            size 5000;
            type ATTRIB deftype;
            mount "/";
          }
          hda2 extends {
            size 512;
            type "swap";
          }
        }
      }
    }

mysys.sf:

    #include "diskTemplate.sf";
    mysys extends diskTemplate {
      disk:parts:hda1:size 7000;
      disk:parts:hda3 extends {
        size 3000;
        type ATTRIB deftype;
        mount "/opt";
      }
    }

4 Compilation and order dependence

We can think of all three languages being compiled to a similar tree-structured attribute-value low level representation. (This representation then forms the basis for applying, or deploying, the configuration to nodes in the local fabric.)

A significant property of the compilation process is the extent to which the textual order in which language elements occur in the input files affects the final representation. If a language is very order dependent, it can make it harder to keep modules independent and reusable, as their meaning then depends on exactly where they occur in the input stream. It also becomes more difficult to describe the semantics of the language clearly, since the ‘current state of processing’ needs to be modelled as well.

All three languages exhibit significant order-dependence, though SmartFrog’s pervasive use of template structures helps reduce the impact. Only LCFG, augmenting an otherwise fairly impoverished syntax, actually requires an explicit marker to override a value defined earlier in the source file: see the lines beginning with ! in the file mysys.

Another common theme is a ‘reference’ mechanism for retrieving values from elsewhere in the representation. In the examples above the value of the deftype attribute is referenced by the files which define the mysys node. (In fact, both LCFG and SmartFrog provide more than one kind of reference, making it possible to access attribute values at different stages of the compilation or deployment process. Again, from the point of view of declarative comprehension, this is not necessarily a good thing.)

5 Structuring defaults

Managing the configuration of a large scale computing fabric is made easier if redundant information is minimized. Collecting together default settings in named templates can make a big contribution to this goal.
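The template-and-override pattern can be illustrated outside the three languages. The following Python sketch is only an illustration under stated assumptions, not the compiler of LCFG, Pan or SmartFrog: it holds the disk template from Section 3 as a nested dictionary, merges node-specific settings over it, and then resolves references to deftype. The names Ref, merge, lookup and resolve are hypothetical and appear in none of the systems discussed.

```python
import copy


class Ref:
    """Hypothetical marker for a reference to another attribute, given as a path."""

    def __init__(self, *path):
        self.path = path


def merge(defaults, overrides):
    """Return a new tree in which values from overrides replace or extend defaults."""
    result = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are subtrees: merge them attribute by attribute.
            result[key] = merge(result[key], value)
        else:
            # Scalar override, or a subtree that the defaults do not have.
            result[key] = copy.deepcopy(value)
    return result


def lookup(tree, path):
    """Follow a path of keys from the root of the tree."""
    node = tree
    for key in path:
        node = node[key]
    return node


def resolve(tree, root=None):
    """Replace each Ref with the value it points to (one level of indirection only)."""
    root = tree if root is None else root
    resolved = {}
    for key, value in tree.items():
        if isinstance(value, dict):
            resolved[key] = resolve(value, root)
        elif isinstance(value, Ref):
            resolved[key] = lookup(root, value.path)
        else:
            resolved[key] = value
    return resolved


# Defaults corresponding to the disk template of Section 3.
disk_template = {"disk": {"deftype": "ext2", "parts": {
    "hda1": {"size": 5000, "type": Ref("disk", "deftype"), "mount": "/"},
    "hda2": {"size": 512, "type": "swap"},
}}}

# Node-specific settings for mysys: resize hda1 and add hda3.
mysys_overrides = {"disk": {"parts": {
    "hda1": {"size": 7000},
    "hda3": {"size": 3000, "type": Ref("disk", "deftype"), "mount": "/opt"},
}}}

mysys = resolve(merge(disk_template, mysys_overrides))
print(mysys["disk"]["parts"]["hda1"])
```

Because the sketch merges before it resolves references, an override of deftype for one node would also change the type of every partition that refers to it. Whether the real compilers resolve references at this stage is not something the sketch claims.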
Furthermore, it’s often the case that a site has a hierarchy of defaults, depending on, say, hardware types, or on the primary functions of nodes, or on network location. So a good configuration language should make it easy to capture this kind of structure and exploit it when defining and modifying configurations.

LCFG has no intrinsic support for default hierarchies. One is limited to expressing structure through careful naming of preprocessor-included files. Similarly, since LCFG doesn’t have any template syntax, the only way to indicate which settings in an included file are being overridden is to textually group them after the #include line, perhaps with appropriate comments. Both Pan and SmartFrog, on the other hand, have syntax for invoking a template and simultaneously supplying the more specific attribute-value settings.

SmartFrog’s template mechanism is unusual in that it is syntactically extremely simple, yet allows templates to be ‘refined’ into other templates in arbitrarily long chains. This frees the configuration architect from being forced to define all templates at a single level.

There is a strong argument to be made that in large scale fabrics of any complexity, some collections of defaults (perhaps better called aspects, by analogy with aspect-oriented programming) do not fit neatly into hierarchical organizations. Treating these aspects accurately is fundamentally impossible in languages where the semantics of template composition is ultimately defined by textual ordering, as in the three under examination: more sophisticated models of aspect composition are required.

6 Types and validation

One reason to use a declarative configuration language (that is distinct from the process of applying configuration values to nodes) is to get compile-time detection of type-related errors. As with programming languages, preventing a large class of errors from causing exceptional behaviour at run time is highly desirable.

All three languages have provision for enforcing type constraints on values. Since LCFG doesn’t really recognize compound values beyond lists, it doesn’t have support for types beyond that level, but it has built-in provision for basic scalar types, as well as the ability to define arbitrary subtypes of ‘string’ based on regular expressions (examples: IP addresses, URLs).

The languages also allow ‘validation expressions’ to be constructed that permit other kinds of well-formedness and validity tests to be carried out by the compiler. LCFG can test for list membership and whether a line is contained in a file, as well as offering access to DNS resolution. Pan has a much more comprehensive data manipulation language that can be used to test for quite complex conditions; Cons and Poznański (2002) give the example “for all machines, any filesystems mounted via NFS must be exported by the corresponding server”. The downside of this expressiveness is that Pan’s validation expressions are fragments of procedural code that must be re-evaluated whenever their target value changes, posing potential issues for both human comprehension and efficiency. SmartFrog has an extensive but somewhat more declarative type system under development.

7 Specifications as constraints

Many of the stipulations found in actual configuration files are in fact more precise than they need to be, owing to the inability of contemporary languages to express appropriately under-specified constraints. For example, suppose one is trying to describe the set of services that a node should run. If a node is to act as a web server, then one might want to make sure httpd is in that set. But in practice, one has to explicitly add httpd to a list of services at the point that the template for a web server is evaluated. In LCFG, this might be done with a line like

    !boot.services mADD(httpd)

But this only works properly provided that other templates which modify the services list are evaluated in the right order, and that’s not always easy to guarantee. Ideally, it should be possible to say something like

    boot.services contains httpd

and have the compiler assemble a value for boot.services at the end of the day which meets all the various constraints. None of the languages currently offers this functionality.

8 Configuration languages for grids

We have tried above to emphasize some areas in configuration language design of particular relevance to grid computing fabrics. Other issues of importance include:

• Security and delegation. All three languages derive from configuration systems that have devoted effort to addressing security concerns. However none of the languages is entirely suitable in its current form for use in a large scale environment where authors, maintainers and system operators work at different security levels and require a declarative model of the site’s configuration that respects delegation boundaries.

• Change management. In production environments it’s essential that configurations can be easily rolled back to earlier versions with known properties, and more generally, that configuration change can itself be tracked and reported. While mechanisms for accomplishing this can clearly be developed independent of particular languages, it is not yet clear whether language-specific support for change processes (perhaps in conjunction with delegation) would be beneficial.

• Asymptotic configuration and autonomic computing. The term ‘asymptotic configuration’ has been coined to describe systems of sufficient scale that it’s impossible to guarantee all nodes will be in a certain configuration at any given instant. Instead, the aim is merely maximum convergence.

  Taking the argument further, it may even be unrealistic to expect that any single point in the fabric, such as a central server, can have a comprehensive view of how every parameter of the entire fabric should be configured. Instead, nodes may need to behave with more autonomy, making some configuration decisions for themselves, within limits set by fabric-wide policy. The challenge is to devise languages which can both adequately express these policy constraints, and be used dynamically by individual nodes to adjust their own configuration. An experiment in using a combination of LCFG and SmartFrog to permit nodes some degree of reconfiguration via peer-to-peer protocols is described by Beckett et al. (2003).

Acknowledgements

We gratefully acknowledge the European Union’s support of the EDG project and UK support for EDG WP4 through PPARC grant Ppa/G/S/2000/00696. The author is indebted to colleagues in EDG WP4 and at Edinburgh for many illuminating discussions, and to Paul Anderson in particular.

References

Anderson, P. and Scobie, A. (2002) LCFG: the next generation. In Proceedings of UKUUG LISA/Winter Conference 2002, 13–14 February 2002, SOAS, University of London. UKUUG, Buntingford, Herts. http://www.lcfg.org/doc/ukuug2002.pdf.

Beckett, G., Kavoussanakis, K., Mecheneau, G., Toft, P., Goldsack, P., Anderson, P., Paterson, J. and Edwards, C. (2003) GridWeaver: automatic, adaptive, large-scale fabric configuration for grid computing. This volume.

Cons, L. and Poznański, P. (2002) Pan: A high-level configuration language. In Proceedings of LISA 2002: 16th Systems Administration Conference, Philadelphia, Pennsylvania, USA, November 2–8, 2002. USENIX Association, Berkeley, Ca., pp. 83–98. http://hep-proj-grid-fabric-config.web.cern.ch/hep-proj-grid-fabric-config/documents/pan-lisa.pdf.

Goldsack, P., Guijarro, J., Lain, A., Mecheneau, G., Murray, P. and Toft, P. (2003) SmartFrog: Configuration and automatic ignition of distributed applications. In Proceedings of HP OpenView University Association 10th Workshop, University of Geneva, Switzerland, July 6–9, 2003. http://www.smartfrog.org/papers/SmartFrog_Overview_HPOVA03.May.pdf.

Papadopoulos, P. M., Katz, M. J. and Bruno, G. (2003) NPACI Rocks: tools and techniques for easily deploying manageable Linux clusters. Concurrency and Computation: Practice and Experience, 15, 707–725. Special issue: Cluster 2001. http://download.interscience.wiley.com/cgi-bin/fulltext?ID=104524123.

Red Hat, Inc. (2003) Red Hat Linux 9: Red Hat Linux Customization Guide. Red Hat, Inc., Raleigh, NC. Chapter 7: Kickstart Installations. ftp://ftp.redhat.com/pub/redhat/linux/9/en/doc/RH-DOCS/pdf-en/rhl-cg-en.pdf.