LabKey Server Modules Author: Last Changed: Dave Stearns February 8, 2016 Purpose This document proposes a new design for LabKey Server Modules. This design is based on the previously-discussed ideas of “simplified modules,” “mini-modules,” and “file-based modules.” Since these concepts were never defined fully, this document seeks to provide a concrete design that incorporates the important ideas from each. My intention is that we use this design for all of our modules so that we have only one module loader and only one format for modules. Thus part of the work involved in this design is adjusting our build so that the files in our existing modules are organized according to this specification. What is a Module? In this design, a module is simply a set of resources, packaged together into a format that is easy to develop and deploy. Modules may contain many kinds of resources, the types of which include (but are not limited to): Java classes Configuration files Simple Views (JSP and HTML) Reports Queries Validator Scripts Assay Types Database Scripts Schema Definitions and Migrations (e.g., domains) Static web content Modules may be as simple as a few static files, or as complex as the modules we currently build today. Furthermore, migrating a module from simple to complex should be a clear and seamless process, requiring only incremental knowledge to move from one level of complexity to another. Module Packaging When developing a module, one typically wants the module files to be directly accessible via the file system; when that module is deployed to a customer, one typically wants it to be compressed into a single file so that it is easy to transfer and relatively opaque to the customer (discouraging potentially 1 destructive fiddling). Therefore, the server should support loading a module either in compressed or uncompressed form. An uncompressed module is simply a directory in the modules/ or externalModules/ directory. The structure of a module directory is described below. A compressed module is simply a zip of that uncompressed directory with a .module file extension. The server will automatically extract resources that must exist as separate files at runtime (e.g., JARs, static web content). If the .module file is updated, the module loader will automatically extract the new versions of these files, replacing the older uncompressed versions. Whether we support doing this without a redeploy of the web application is subject to requirements, time and funding constraints. Module Directory Structure The structure of a module directory will follow these design principles: Everything is optional: there should be reasonable defaults for everything, but developers should be able to override many of those defaults by supplying additional files or configuration. Convention over configuration: wherever possible, we will rely on well-known directory and file names/extensions rather than require developers to explicitly register things via configuration files. Less is more: we will define structure only for those resources we currently think we understand, and leave room for extensions and modifications in the future. Reflect logical arrangement, not internal architecture: the directory structure should reflect the logical arrangement of resources, not our current internal architecture (which is often less than ideal and subject to change). Module Configuration File A module may include a root configuration file that contains meta-data about the module, but this too is optional. If supplied, it should be named module.xml and be placed in the config/ directory. The exact format of this file can be worked out in development, but XML seems like a logical choice for the encoding. The Spring DI format might be appropriate, but I personally find it verbose and obtuse. The following information may be supplied in this file: Name Version Name RequiredServerVersion Dependencies Description The module’s version number. If not supplied, the version is assumed to be 0.0. A name for the module. If not supplied, the directory name is assumed to be the module name. A version of the server that this module requires. If supplied and the server’s current version does not match this value, the server will not load the module and will display an suitable error message on the admin console. A list of other module names this module depends upon. If not supplied, the 2 ModuleClass BuildDetails/ SVNRevision SVNURL BuildUser BuildTime BuildOS BuildPath SourcePath module is assumed to have no dependencies. The name of the Java class that implements the Module interface. If not supplied, the system will supply a default Module implementation that uses various pluggable loaders to load and register the file-based resources in the module. A branch of information about how this module was built. If not supplied, this information will simply not be available to the system. The current SVN revision number at the time this module was built. The URL to the SVN server at the time this module was built. Name of the user who built this module Time the module was built. The OS on which this module was built. The local path in which this module was built. The local path from which the sources of this module came. Top Level Directories The module directory may contain any of the following sub-directories. A module may include as many of these as necessary, but none are required. We will add to this list over time as needs arise. Some of these may not be supported in the 9.1 release, but we will reserve these names for future releases. Name config/ lib/ views/ reports/ queries/ assays/ schemas/ web/ Description Contains the module configuration file, plus any Spring configuration files that might be needed (e.g., pipeline config). Contains any JARs that the module needs. Contains simple HTML views. These can be served by the default module and controller that is automatically supplied by the system if this module does not contain Module and Controller Java classes. In this case, the controller name is the same as the module name, and the action name is the same as the base file name of the view. A branch of report definitions. See the Reports section later in this document for details. A branch of custom-query definitions. See the Queries section later in this document for details. A branch of assay definition. The structure of this directory will be worked out by the file-based assay project. Schema XML files and database transition scripts. See the section “schemas” later in this document for details. Static web content (images, stylesheets, static HTML, etc.). This will be made available via a URL like this: /<context-path>/<module>/… Files and Associated Meta-Data Wherever possible and appropriate, files should contain their own meta-data. For example, if a file can be expressed in XML, that file could contain both its content and meta-data in one XML document. 3 However, there are many cases where it is not appropriate to include meta-data and content in the same file. For example, developers would typically want to edit their R report in a simple text file, and R script files have no syntax for expressing meta-data. This is, of course, a classic problem, and there seems to be a few typical solutions: Use an optional companion file for meta-data. This file would have the same base name as the content file, but use a different extension (e.g., .properties, .xml, etc.). If not supplied, reasonable defaults should be assumed. Use an optional companion file, but compress the content and companion files together into a single file. Since we will support compressing module directories into a single file, it may or may not be advantageous to also support compressing individual files together with their associated meta-data. However, if this is desired, our various resource loaders should support loading both compressed and uncompressed versions of these files. The resource loaders are the glue that transform a file-based resource into a live running Java object of the type the system expects. For example, the reports resource loader would load a file-based report definition into the appropriate subclass of ReportDescriptor. Schemas The schemas directory contains information about database schemas the module needs to define. For 9.1, this will include the schema XML files and the database-specific transition scripts we currently define for modules. In the future, this may be extended to include declarative schema and migration files, similar to Rails. The structure of this directory will be as follows: schemas/ <schema-name>.xml dbscripts/ postgresql/ <schema-name>-<from-version>-<to-version>.sql sqlserver/ <schema-name>-<from-version>-<to-version>.sql Q: Instead of splitting the scripts into a directory per database type, should we instead just use different extensions per database type? For example: study-0.00-0.01.pg.sql (PostgreSQL), study-0.00-0.01.ss.sql (SQL Server), study-0.00-0.01.oracle.sql (Oracle). General consensus is “no” for now. Reports The reports directory contains a set of report definitions, organized by the object to which each report is tied. Most reports require a dataset and are thus tied to a particular schema and query. Thus, a reasonable organization for the reports directory would group reports by schema and then by query. 4 Each report could then have an optional meta-data file naming the particular view of that query is needs, or an explicit list of field keys once we support that feature. reports/ schemas/ study/ participant/ my-report.r my-report.xml another-report.r another-report.xml nab/ my-assay/ third-report.r (meta-data) For example, the “my-report.r” report would show up as an R report view whenever the user was viewing the participant query in the study schema, and the report runner would use the view of that query named in the my-report.xml meta-data file to generate the dataset. Reports defined in modules will be read-only via the web interface. Since modules can be upgraded, it would be problematic to let users change a report defined in a module, as all those changes would be lost once the module changed. However, we could make it possible for module-defined reports to become a template for new reports. Queries The queries directory, like the reports directory, contains a set of custom query definitions. Queries, by definition, are tied to a particular schema, so it stands to reason that they should be organized according to schema. The structure would thus look like this: queries/ study/ my-query.sql my-query.xml (meta-data) nab/ another-query.sql another-query.xml Module Deployment Deploying a module will still require file-system access to the externalModules/ directory, which is a peer of the modules/ directory of the LabKey web application by default, but can be configured to point to some other file share. There is currently no plan to support deployment of a module file via any other mechanism (web file upload, DAV, etc.). 5