BROCKMANN CONSULT Programming with BEAM Norman Fomferra Carsten Brockmann 20. – 22.01.2009 Finnish Environment Institute, Syke Helsinki 1 BROCKMANN CONSULT Day 1: Introduction 1. Software Overview An introduction to VISAT, its user interface elements and the BEAM command line tools. Includes a detailed presentation of the BEAM standard I/O format. 2. Architecture and APIs A high level introduction into the BEAM 4.x architecture including the diverse application programming interfaces. 3. Runtime Configuration A detailed presentation of the BEAM installation directory structure, the standard modules, the 3rd party libraries used, the main entry points into the BEAM code. 4. Development and Build Process An introduction to Java development with BEAM. The goal is to compile and run a simple program that uses the BEAM API with your preferred Java IDE. Programming BEAM, Syke, 20.-22.01.2010 2 BROCKMANN CONSULT Day 2: Using the BEAM API 1. A simple Data Processor We develop a simple data processor, which will ingest an AATSR Level 1 data product and write a trivial cloud mask. The processor will use a simple command line interface. We will then review the output in VISAT. 2. BEAM Graph Processing Framework (GPF) A detailed introduction to the new BEAM graph processing framework. 3. Developing a GPF Operator We convert the simple data processor into a GPF operator. The processor will use the standard command line interface provided by the GPF. 4. BEAM Scripting An introduction to the new Scripting API introduced in Java 1.6 and a discussion of its relevance for BEAM. Programming BEAM, Syke, 20.-22.01.2010 3 BROCKMANN CONSULT Day 3: Extending BEAM 1. BEAM Module Development An introduction to BEAM modules (plug-ins), the module descriptor file and module activators. 2. The Maven Build System Maven is a highly configurable build system which allows us to quickly create a module for BEAM. 3. Creating a BEAM Module We add a module descriptor to the code developed so far in order to create a versioned, deployable and updateable BEAM module. The GPF operator becomes part of BEAM. 4. Integration with VISAT We develop a simple graphical user interface for the GPF operator. We integrate the GPF operator and its GUI into VISAT by adding a command which allows the users to invoke the operator from the VISAT Tools menu. Programming BEAM, Syke, 20.-22.01.2010 4 BROCKMANN CONSULT Wish list of participants • Writing plug-ins (like a simple algorithm) • Batch processing • Automatic retrieval of new satellite products (wouldn't that be a nice feature) • Command-line interfacing, interfacing with MATLAB • Automatically producing graphics or georeferenced sub-products • Pixel value retrieval, coordinate-based retrieval for a large sample set • Overlaying transect data from external sources with beam-supported products Programming BEAM, Syke, 20.-22.01.2010 5 BROCKMANN CONSULT Why develop with BEAM? • • • • • • • Access EO data, metadata, no-data, flags Simple & effective programming models Reuse a rich software infrastructure It is free, it is open source, its Java Review and reuse source code BEAM code is widely used, it is in use BEAM is supported by ESA Programming BEAM, ESRIN, 22.-26.09.2008 6 BROCKMANN CONSULT Development Use Cases 1. Use it Write batch-mode scripts Program stand-alone applications Develop web-services 2. Extend it Add EO data readers, writers Add EO data processing nodes Add map projections, DEMs Add VISAT actions, tool bars and tool windows 3. Clone it Customize and brand VISAT Programming BEAM, ESRIN, 22.-26.09.2008 7 BROCKMANN CONSULT Architecture Overview Programming BEAM, ESRIN, 20.-22.01.2010 8 BROCKMANN CONSULT Module-based Architecture • BEAM module: – Name – Description – Version – Authors – Changelog – Dependencies – Extension points – Extensions • Simple extension model: – Host module provides extension point, e.g. “product-reader” – Client module provides extension to host's extension point, e.g. “seawifs-reader” – Clients can be hosts Programming BEAM, ESRIN, 20.-22.01.2010 9 BROCKMANN CONSULT VISAT Module Manager Programming BEAM, Syke, 20.-22.01.2010 10 BROCKMANN CONSULT Development Platform Usages Programming BEAM, Syke, 20.-22.01.2010 11 BROCKMANN CONSULT GPF - Graph Processing Framework Source Product Tile Tilereuse! reuse! Source Sourceproduct: product: 33bands, bands, 33xx44tiles tiles Read Tile Tilereuse! reuse! Tile Tile(0,1) (0,1) NoiseRed • Rapid processor development CloudMask Forget about processing environment AtmCorr Concentrate on data and algorithm Write Target Product Programming BEAM, ESRIN, 22.-26.09.2008 12 BROCKMANN CONSULT BEAM Development Process at BC • 3-4 core developers • 1-2 application developers • 1 permanent tester • Developers also do user support • Agile Development Process • Feature list planning with ESA every 2 months • Weekly iteration meetings • Daily stand-up meetings Programming BEAM, Syke, 20.-22.01.2010 13 BROCKMANN CONSULT Tools used for BEAM Development • Development IntelliJ IDEA (10) and Eclipse (2) IDEs Version control system is Subversion Maven - Project model & dependency management • Testing JUnit – achieve good tested(10) code coverage QF Test – Automatic application & GUI testing • Build Maven – The Java make Install4j – Installer builder TeamCity build server Programming BEAM, Syke, 20.-22.01.2010 14 BROCKMANN CONSULT Getting started • Online programming tutorial and guides www.brockmann-consult.de/beam/wiki • From BEAM website Download & install BEAM 4.6.1 Download BEAM API documentation Download BEAM source code • Install a Java IDE IntelliJ IDEA, Eclipse, NetBeans Programming BEAM, ESRIN, 22.-26.09.2008 15 BROCKMANN CONSULT Naming Conventions in Java Code • Interface / Class names CamaleCase, first character upper-case GeoCoding, Product, RasterDataNode • Variable names CamelCase, first character lower-case backgroundColor, threshold, minValue • Constant names All letters upper-case, underscore DIALOG_TITLE, BUFFER_SIZE_LIMIT • Method / Function names CamelCase, first character lower-case getSampleValue, computeTile , isClosed • Package names all lower-case beam, syke, binning Programming BEAM, Syke, 20.-22.01.2010 16 BROCKMANN CONSULT BEAM Installation Directory • beam-4.x.y bin BEAM installation directory Application binary and script files visat.exe, gpt.bat, ceres-launcher.jar, ... config Application configuration file(s) beam.config lib Common, 3rd -party libraries xtream.jar, jdom.jar, gt-*.jar, ... modules Application and plug-in BEAM modules ceres-*.jar beam-*.jar Programming BEAM, Syke, 20.-22.01.2010 17 BROCKMANN CONSULT Invocation of Command-line Tools Invocation of gpt.bat via a dedicated launcher program (cereslauncher.jar). All JARs found in BEAM’s lib and modules directories are put on the classpath: @echo off set BEAM4_HOME=C:\Program Files\beam-4.6.1 "%BEAM4_HOME%\jre\bin\java.exe" ^ -Xmx1024M ^ -Dceres.context=beam ^ "-Dbeam.mainClass=org.esa.beam.framework.gpf.main.Main" ^ "-Dbeam.home=%BEAM4_HOME%" ^ "-Dncsa.hdf.hdflib.HDFLibrary.hdflib=%BEAM4_HOME%\...\jhdf.dll" ^ "-Dncsa.hdf.hdf5lib.H5.hdf5lib=%BEAM4_HOME%\...\jhdf5.dll" ^ -jar "%BEAM4_HOME%\bin\ceres-launcher.jar" %* exit /B 0 Programming BEAM, Syke, 20.-22.01.2010 18 BROCKMANN CONSULT The BEAM Graph Processing Framework (GPF) Programming BEAM, Syke, 20.-22.01.2010 19 BROCKMANN CONSULT Terms • ((EO) Data) Processing System A system that utilizes hardware and manages processes running on it • ((EO) Data) Processor A program that transforms source products into target products • ((EO) Data) Processing Framework A software and API with an architecture designed for developing processors • Framework “Don’t call us, we call you.” Abstract interfaces comprising callback functions clients must implement Programming BEAM, Syke, 20.-22.01.2010 20 BROCKMANN CONSULT Processing System (1) • Main objective: Translate a job into a number of tasks running in parallel on a cluster of machines • Scalability – Add CPUs – Add more storage space • Data locality – Avoid network I/O – Force local data I/O • Load balancing – Assign tasks to idle compute nodes • Backup tasks – Handle tasks taking unusually long time to complete • Fault tolerance – Recover from HW failures • Monitoring – Statistics & Performance 21 BROCKMANN CONSULT Processing System (2) • A processing system places some requirements on the processes (processors) it is running: A processor shall be configurable Input and output files Processing parameters Runtime behaviour (e.g. memory, tiling, buffer sizes) A processor shall be observable Progress Status Logging Exit code Programming BEAM, Syke, 20.-22.01.2010 22 BROCKMANN CONSULT A Processing Graph MERIS TOA rad AATSR TOA refl VGT TOA rad MERIS SDR Proc AATSR SDR Proc VGT SDR Proc MERIS SDR A)ATSR SDR VGT SDR BB Albedo Proc BB Albedo Programming BEAM, Syke, 20.-22.01.2010 CalVal Proc 23 BROCKMANN CONSULT Processing Framework Requirements • Create processing chains or even graphs composed of processing nodes. Processing nodes shall be reusable in other graphs, e.g. MERIS Cloud Screening be easily configurable in terms of processing parameters be easily configurable in terms of the algorithm implementation • Processor implementation code shall abstract from physical file format of EO data file I/O configuration & parameterisation, parameter value access • Avoid I/O overhead between processing steps • Exploit multi-core CPU architectures • The framework shall be extendible, e.g. it shall recognise new processing node implementations at runtime Programming BEAM, Syke, 20.-22.01.2010 24 BROCKMANN CONSULT BEAM GPF • Allows to construct directed, acyclic processing graphs • Implements the “pull-processing” paradigm A processor is represented by its processing graph Product processing requests are translated into raster-tilerequests propagated backwards through the graph Tile requests are automatically parallelised depending on CPU cores • Simple data processing and programming model For a particular node, clients (Java developers) implement how a raster tile is being computed Parameter values are “injected” by the framework • Generated user interfaces: Command-line (gpt) and VISAT GUI • Operators can be dynamically added to BEAM via plug-in modules Programming BEAM, Syke, 20.-22.01.2010 25 BROCKMANN CONSULT Product Data Model (1) Programming BEAM, ESRIN, 22.-26.09.2008 26 BROCKMANN CONSULT Product Data Model (2) Programming BEAM, ESRIN, 22.-26.09.2008 27 BROCKMANN CONSULT GPF Architecture (1) knows its ► Band 0..n bands ◄ creates target ◄ uses as source Product 1 targetProduct 0..n sourceProducts System Module beam-core ◄ creates <<interface>> Tile setSample(x, y, v) getSampleInt(x, y): v getSampleFloat(x, y): v ... ◄ computes target tiles for ◄ provides source tiles for ◄ computes target samples for ◄ retrieves source samples from Operator initialize() computeTile(...) computeTileStack(...) dispose() ◄ maintains OperatorSpi getOperatorType(): Class createOperator(): Operator ◄ knows its <<interface>> OperatorSpiRegistry getOperatorSpi(id) addOperatorSpi(spi) removeOperatorSpi(spi) GPF createProduct(op, ...) createProduct(op, ...) createProduct(op, ...) createProductNS(op, ...) 1 instance System Module beam-gpf CloudMaskOp initialize() computeTile(...) CloudMaskOpSpi createOperator() Client Module cloudmask-op Programming BEAM, ESRIN, 22.-26.09.2008 28 BROCKMANN CONSULT GPF Architecture (2) • Clients provide their compiled Operator Java code packed into JAR modules • A JAR module publishes its operator services via metadata (Service Provider Interface, SPI) META-INF/services/ org.esa.beam.framework.gpf.OperatorSpi • Once GPF is started, e.g. by visat.exe or gpt.bat, it loads all operator services it finds on its (dynamic) classpath %BEAM_HOME%/modules/*.jar %BEAM_HOME%/lib/*.jar Programming BEAM, Syke, 20.-22.01.2010 29 BROCKMANN CONSULT GPF Processing Node Anatomy (1) @OperatorMetadata(alias = “NoiseRedOp", version = "1.0", authors = "Ralf Quast", copyright = "(c) 2008 by Brockmann Consult", description = "Performs a noise reduction on” + “ CHRIS/Proba images.") public class NoiseRedOp extends Operator { @SourceProduct(alias = "source") private Product sourceProduct; @TargetProduct private Product targetProduct; @Parameter(defaultValue=“false”) private boolean slitCorrection; @Parameter(interval=“(0,50]”, defaultValue=“25”) private int smoothingOrder; @Parameter(valueSet={“N2”, “N4”, “N8”}, defaultValue=“N2”) private String neighbourhoodType; // ... } Programming BEAM, Syke, 20.-22.01.2010 30 BROCKMANN CONSULT GPF Processing Node Anatomy (2) If single target bands can be computed independently of each other, e.g. NDVI or algorithms which perform single band filtering: public class NoiseRedOp extends Operator { public void initialize() throws OperatorException { // validate and process source products and parameter values // create, configure and set the target product } public void computeTile(Band targetBand, Tile targetTile, ProgressMonitor pm) throws OperatorException { // Obtain source tiles for used bands of source products // Process samples of source tiles to samples of target tile // Set samples of single target tile } } Programming BEAM, Syke, 20.-22.01.2010 31 BROCKMANN CONSULT GPF Processing Node Anatomy (3) If single target bands cannot be computed independently of each other, e.g. model inversion algorithms based on neural networks: public class NoiseRedOp extends Operator { public void initialize() throws OperatorException { // validate and process source products and parameter values // create, configure and set the target product } public void computeTileStack(Map<Band, Tile> targetTiles, Rectangle targetTileRectangle, ProgressMonitor pm) throws OperatorException { // Obtain source tiles for used bands of source products // Process samples of source tiles to samples of target tiles // Set samples of all given target tiles } } Programming BEAM, Syke, 20.-22.01.2010 32 BROCKMANN CONSULT How GPF “drives” Processing Nodes 1. GPF knows about all registered Operators (Plug-in) 2. GPF encounters a required node (GUI, CLI, XML graph) 1. The client‘s Operator class is found by its alias 2. The SourceProduct, TargetProduct and Parameter annotations are analysed 3. The client‘s Operator object is created 4. Source products and parameter values are injected 3. GPF calls Operator.initialize(). In this method, the client provides code to 1. validate and process source products and parameter values 2. create, configure and set the target product 4. GPF calls either Operator.computeTile() or Operator.computeTileStack()in case raster data is required Programming BEAM, ESRIN, 22.-26.09.2008 33 BROCKMANN CONSULT gpt – The GPF Command-Line Tool Programming BEAM, Syke, 20.-22.01.2010 34 BROCKMANN CONSULT GPF Processing Node Configuration Programming BEAM, Syke, 20.-22.01.2010 35 BROCKMANN CONSULT GPF Processing Node GUI Programming BEAM, Syke, 20.-22.01.2010 36 BROCKMANN CONSULT GPF Programming Model Product p0 = ProductIO.readProduct(“MER_1P.N1”); Map<String,Object> params1 = ...; Map<String,Object> params2 = ...; Map<String,Object> params3 = ...; Product p1, p2, p3; p1 = GPF.createProduct(“SmileCorrOp”, params1, p0); p2 = GPF.createProduct(“CloudMaskOp”, params2, p1); p3 = GPF.createProduct(“NoiseRedOp”, params3, p1, p2); WriteOp.writeProduct(p3, new File(“TEST.DIM”), “BEAM-DIMAP”, ProgressMonitor.NULL); Programming BEAM, Syke, 20.-22.01.2010 37 BROCKMANN CONSULT End Thanks for using BEAM! http://www.brockmann-consult.de/cms/web/beam/forum Programming BEAM, Syke, 20.-22.01.2010 38