Online Resource 3: Data and analysis of list of AEMs and hydrodynamic drivers

Method

From the list given in Online Resource 1, we selected the models that qualify as an AEM according to our definition. For 42 models we were able to obtain data from experts to categorize them. The categories relate to the modelling approach, the environmental domain of the model (e.g. shallow lake), the application domain (e.g. eutrophication), the types of analysis that are available, the availability and legal status of source code and executable, hydrodynamic aspects, spatial aspects, the modelling framework, the programming language, the toolboxes and databases used and finally the type of user interface to access the model (see Online Resource 5 for the full list). We analysed the frequency distribution of the models over the different subcategories. The most intriguing outcomes are summarized in a spider chart, which can be found in the main text under section ‘Categorizing diversity’.

Results

Analysis type available

Simulate temporal dynamics    90%
Scenario analysis             71%
Sensitivity analysis          51%
Calibration                   49%
Validation                    43%
Uncertainty analysis          41%
Bifurcation analysis          16%

90% of the models for which we obtained data are dynamic and therefore allow the simulation of temporal dynamics. The remaining models are static and based on statistical relations. Over two out of three models have tools for scenario evaluation. Half of the models have tools for sensitivity analysis and calibration. Validation and uncertainty analysis are implemented for over 40% of the models. Bifurcation analysis, a technique that is extensively employed in theoretical ecology, is available for one out of six of the AEMs that we analysed.

Application domain

Eutrophication        98%
Climate change        79%
Carbon cycle          68%
Fisheries             34%
Biodiversity loss     17%
Adaptive processes     8%

Eutrophication was the application domain of no less than 98% of the analysed models.
Next come, in decreasing order of importance, climate change, carbon cycle, fisheries, biodiversity loss and adaptive processes. This signals that the fields of biodiversity and evolution are not well covered by AEMs, although we know that eutrophication, climate change and fisheries can have severe consequences for biodiversity and form an important selective force. The resulting changes in biodiversity and genetic composition may have important feedbacks on ecosystem functioning.

Availability source code

Free on request           50%
Free download             36%
Not available             18%
Licence can be bought      7%

The majority of models that we analysed use an open source policy (free on request or free download). For one out of six models, the source code is not available at all. For 7% of the models, a licence for the source code can be bought.

Availability executable

Free through download            41%
Free through own compilation     36%
Free on request                  33%
Not available                    15%
Licence can be bought            12%
Free through web application      7%

For just over 40% of the analysed models the executable can be freely downloaded, and for a slightly smaller percentage of models the executable can be obtained for free through own compilation. For one out of three models, the executable is freely provided on request. For one out of six models, the executable is not available. For 12% of the models, a licence for the executable can be bought. For three of the forty-four models that we analysed, free access to the executable is provided through a web application.

Copyright policy source

GPL            26%
Proprietary    26%
LGPL           21%

Just under half of the models are distributed under a GPL (General Public Licence) or an LGPL (Lesser General Public Licence) copyright policy for the model source. One out of four models is proprietary with respect to the source code.
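The percentages in these tables are shares of models per subcategory. Because a model can fall into several subcategories at once, the shares within one table need not sum to 100% (the four source-code availability classes, for instance, sum to 111%). A minimal Python sketch of such a tally is given below. It assumes a hypothetical mapping from model name to a set of subcategories; the function name, model names and assignments are invented for illustration and are not the survey data.

```python
from collections import Counter


def category_shares(assignments):
    """Return {subcategory: share of models in whole percent}.

    `assignments` maps a model name to the set of subcategories that
    apply to it. Categories are non-exclusive, so shares may sum to
    more than 100.
    """
    n_models = len(assignments)
    # Count each subcategory at most once per model.
    counts = Counter(cat for cats in assignments.values() for cat in set(cats))
    return {cat: round(100 * k / n_models) for cat, k in counts.most_common()}


# Hypothetical example data (NOT the responses collected for this resource).
example = {
    "model_A": {"Free on request", "Free download"},
    "model_B": {"Free on request"},
    "model_C": {"Not available"},
    "model_D": {"Free download", "Licence can be bought"},
}

shares = category_shares(example)
for category, percent in shares.items():
    print(f"{category:<25}{percent:>4}%")
```

Dividing by the number of models rather than by the number of category assignments is what makes each row directly readable as "x% of the models".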
Copyright policy executable

Free to use and distribute           50%
Free to use but not distribute       22%
Proprietary                          16%

For half of the analysed models, users are free to use and distribute the executable. For 22% of the models, users are free to use the executable but not to distribute it. One out of six models is proprietary with respect to the executable.

Database definition of model available

DATM    14%

The idea of implementing models in a database format was launched only recently. It is therefore not surprising that only one out of seven of the analysed models is implemented in this way.

Environmental domain

Shallow lakes      81%
Reservoirs         52%
Deep lakes         51%
Rivers             43%
Estuaries          39%
Seas               33%
Ditches/canals     33%
Oceans             31%
Wetlands           29%
Catchments         22%
Global             22%
Coastal            10%

Although the AEMON community that provided the analysed models is traditionally biased towards lake modelling, the 42 analysed models cover just about every aquatic habitat, with even 22% of the models claiming global applicability. A deeper analysis of the extent to which models of different aquatic habitats are currently linked, or could potentially be linked, would be very interesting.

Hydrodynamic driver

Simple 0D mass balance          34%
Built-in hydrodynamic model     32%
GOTM                            20%
GETM                            15%
MOM                             13%
FVCOM                            8%
GLM                              8%
NEMO                             8%
ROMS                             8%
DUFLOW                           7%
Delft3D-FLOW                     5%
Delft3D-WAVE                     5%
SOBEK                            5%
Coherens                         3%
DYRESM                           3%
ELCOM                            3%
Mike11                           3%
Mike21                           3%
Mike-SHE                         3%
PERSIST                          3%
Simstrat                         3%

One third of the analysed models contain a simple 0D water balance and another (partly overlapping) third contain a built-in hydrodynamic or hydrological driver. Among the external drivers, GOTM, GETM and MOM are most in use. Next comes a list of no fewer than 16 hydrodynamic and hydrological drivers, each used by a few or even a single model in our set.

Hydro-eco process linker

FABM      15%
DELWAQ     7%
DUFLOW     5%
MOSSCO     2%
ESMF       0%

Over the years several laudable initiatives have been taken to develop standardized interfaces between hydrological drivers and process models.
It is interesting to see that there is evolving diversity at this level too. For instance, FABM was developed recently, while DELWAQ has been around for decades, although it has been available as open source only since 2013. Within the set of analysed models, the former framework is most commonly used (by one out of six models).

Mathematical format

Partial differential equations      50%
Ordinary differential equations     48%
Difference equations                14%
Input-output relation               10%
Agent-based event driven             5%
Lattice differential equations       2%

The mathematical solution techniques for partial and ordinary differential equations are equally important for implementing AEMs. Other formats in use are, in decreasing order of popularity: difference equations, input-output relations, agent-based event driven and finally lattice differential equations.

Model stored in repository

FABM process model repository       15%
DELWAQ process model repository      7%
DUFLOW process model repository      5%

Some modelling frameworks aim to host a suite of AEMs. Among the 42 models that we analysed, one out of four was available as part of either the FABM, DELWAQ or DUFLOW model repository.
Modelling approaches

Aquatic Ecosystem model               100%
Dynamical model                        86%
Process-based model                    81%
Biogeochemical model                   79%
Mass balanced model                    78%
Compartment model                      76%
Complex dynamical model                76%
Stoichiometric model                   62%
Spatially explicit model               59%
Competition model                      52%
Consumer-resource model                50%
Food web model                         50%
Community model                        45%
NPZD model                             26%
Hydrodynamic model                     24%
Individual-based community model       14%
Trait-based model                      12%
Dynamic Energy Budget model            12%
Environmental niche model              11%
Hydraulic model                        10%
Meta model                             10%
Statistical model                       7%
Optimization model                      5%
Generalized Lotka-Volterra model        5%
Hydrological model                      5%
Neural network model                    5%
Structural equation model               5%
Physiologically structured model        2%
Regression model                        2%
Minimal dynamical model                 0%

We listed thirty types of modelling approaches to get a better insight into the nature of the 42 analysed models. These categories are non-exclusive and not exactly defined. Yet, we believe they give an idea of which approaches are dominant and which are rarer in aquatic ecosystem modelling. By definition, all models reported here were qualified as being an AEM. Over 75% of them were qualified as being dynamic, process-based, biogeochemical, mass-balanced, compartmented and complex dynamical. At least 45% of them were qualified as being stoichiometric and spatially explicit as well as being a competition, a consumer-resource, a food web and a community model. About 25% of them were qualified as being of the NPZD type as well as being a hydrodynamical model. One out of seven of the analysed models contained individual-based approaches, more specifically being an individual-based community model, a trait-based model or a dynamic energy budget model.
Rarer were the following qualifications (in decreasing order of importance): environmental niche model, hydraulic model, meta model, statistical model, optimization model, generalized Lotka-Volterra model, hydrological model, neural network model, structural equation model, physiologically structured model and regression model. The low score of statistical models, based on either regression or neural networks, shows that this approach is not well represented in our dataset. None of the models was scored as being a minimal dynamical model, although this approach provides all the building blocks for process-based AEMs and conceptually can form an AEM in itself (see figure 1). One could also argue that the NPZD models could be qualified as minimal dynamical models.

Modelling framework

R/deSolve                  19%
FABM                       15%
Web application tool        8%
DELWAQ                      5%
DUFLOW                      5%
Matlab                      5%
AQUASIM                     3%
Ecopath with Ecosim         3%
Stella                      3%
ACSL                        1%
GRIND for Matlab            1%
OSIRIS                      1%
Mathematica                 0%
SENECA                      0%
SMART                       0%
VisSim                      0%

More than two out of three models are developed within an existing modelling framework. R/deSolve and FABM come out as the most used frameworks (19% and 15% respectively). Thereafter follows a long list of frameworks in use, including web application tools, DELWAQ, DUFLOW, Matlab, AQUASIM, Ecopath with Ecosim, Stella, ACSL, GRIND for Matlab and OSIRIS. One can rightfully say that the field of aquatic ecosystem modelling is quite scattered when it comes to the use of modelling frameworks. This notion was one of the incentives for developing DATM (Mooij et al. 2014).

Numerical integrator method or library

Euler                             49%
Runge Kutta 4th order             22%
R integrators                     15%
Matlab integrators                 7%
Numerical recipes integrators      7%
DELWAQ integrators                 5%
DUFLOW integrators                 5%
Odepack integrators                5%
ACSL integrators                   1%

All AEMs that are formulated as differential equations require numerical integration for simulating temporal dynamics.
Except for Euler integration and fixed time step Runge Kutta, these integrators are quite complex, and the use of existing and well-tested routines from software libraries is therefore highly recommended. In decreasing popularity, the following library integrators are in use in our set of models: R/deSolve, Matlab, Numerical recipes, DELWAQ, DUFLOW, Odepack and ACSL integrators.

Parameter database used

BLOOM parameter database                   2%
Ecopath with Ecosim parameter database     2%
Parameter prior database                   2%
PROTECH parameter database                 2%
AQUATOX parameter database                 0%
DEB parameter database                     0%

We consider it good modelling practice to use databases of parameters. Currently, however, these databases are not widely used. In fact, we identified four databases in use, each by one model.

Programming language

FORTRAN                                              50%
C                                                    15%
Delphi                                               15%
R                                                    13%
C++                                                  11%
Matlab                                               10%
Python                                                9%
Visual Basic                                          9%
DUPROL                                                6%
Through Graphical User Interface (e.g. Stella)        6%
JAVA                                                  3%
ACSL                                                  1%
GRIND for MATLAB                                      0%
Mathematica                                           0%

With 50%, FORTRAN is the dominant programming language for implementing AEMs. Next comes C or C++ (together 26%), followed by a long list of programming languages, including Delphi, R, Matlab, Python, Visual Basic, DUPROL, JAVA, ACSL or GRIND for MATLAB. One can rightfully say that the field of aquatic ecosystem modelling is quite scattered regarding the use of programming languages. This notion was one of the incentives for developing DATM.

Programming style

Procedural         68%
Scripting          29%
Object-oriented    26%

68% of the models for which we obtained data are written in a traditional procedural programming style (e.g. procedural FORTRAN or C). 29% are written in a scripting language (e.g. R or Python) and 26% in an object-oriented style (e.g. C++ or Delphi).

Spatial configuration

Box model          54%
Network            26%
Cubic grid         24%
Curvilinear        23%
Flexible mesh      16%
Finite Element     15%
Triangular mesh    11%
Polar               5%

Just over half of the models for which we obtained data can be run as a box model.
With respect to spatially explicit models, various spatial configurations are employed. In decreasing popularity these are: networks, cubic grid, curvilinear, flexible mesh, finite element, triangular mesh and polar.

Spatial dimension

0D                            51%
1D vertical                   39%
2D horizontal                 32%
3D                            29%
1D horizontal                 28%
Networks of linear waters     22%
2D vertical                   20%

Half of the models for which we obtained data can be run in a 0D mode. Next comes the 1D vertical mode, typically with a focus on stratification. About one out of three models can be run in a 2D horizontal, a 3D or a 1D horizontal mode. One out of five models can be run as a network of linear waters or in a 2D vertical mode.

Toolboxes used

R-packages                13%
Matlab toolboxes           4%
Numerical recipes          4%
Libreoffice toolboxes      3%
GRIND toolboxes            1%
LIN/EISPACK libraries      0%

We consider it good modelling practice to use existing toolboxes for tasks such as numerical integration, statistical analysis and graphical presentation. R-packages come out as the most used (one out of seven models). Numerical recipes, which was THE modeller's handbook of the 1990s, seems to be fading away, although some of the methods in more recent toolboxes may actually contain code from it. This certainly applies to LINPACK/EISPACK, which were never mentioned.

User interface

Graphical User Interface    63%
Console                     42%
Excel                       13%

Almost two out of three of the models for which we obtained data can be accessed through a graphical user interface. Just over 40% are controlled from a command line on a console. An interesting development is the use of Excel for specifying model input and inspecting model output. This approach is used by one out of seven of the models for which we obtained data. We observed that model access through Excel is particularly liked by inexperienced users, because they know the software and have access to it.
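The two fixed-step schemes that head the list under ‘Numerical integrator method or library’ (Euler and 4th order Runge Kutta) are simple enough to sketch in a few lines of Python. The example below is illustrative only: the logistic growth equation, its parameter values, the step size and all function names are assumptions chosen for the demonstration and are not taken from any of the analysed models.

```python
import math


def euler_step(f, t, y, h):
    """One explicit (forward) Euler step of size h for dy/dt = f(t, y)."""
    return y + h * f(t, y)


def rk4_step(f, t, y, h):
    """One classical 4th order Runge Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(step, f, y0, t_end, h):
    """Advance from t = 0 to t_end with a fixed time step h."""
    t, y = 0.0, y0
    for _ in range(round(t_end / h)):
        y = step(f, t, y, h)
        t += h
    return y


# Assumed test problem: logistic growth with rate R and carrying capacity K.
R, K = 1.0, 10.0


def logistic(t, y):
    return R * y * (1 - y / K)


def logistic_exact(t, y0):
    """Closed-form solution of the logistic equation, used as reference."""
    return K / (1 + (K / y0 - 1) * math.exp(-R * t))


y_true = logistic_exact(5.0, 1.0)
y_euler = integrate(euler_step, logistic, 1.0, 5.0, 0.5)
y_rk4 = integrate(rk4_step, logistic, 1.0, 5.0, 0.5)

print(f"Euler error: {abs(y_euler - y_true):.2e}")
print(f"RK4 error:   {abs(y_rk4 - y_true):.2e}")
```

With the same step size, the Runge Kutta result lies much closer to the exact solution than the Euler result, at the cost of four evaluations of the right-hand side per step instead of one. Adaptive or implicit schemes for stiff problems are considerably more involved, which is why well-tested library routines are preferable for them.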
References

Mooij, WM, Brederveld, RJ, De Klein, JJM, DeAngelis, DL, Downing, AS, Faber, M, Gerla, DJ, Hipsey, MR, ’t Hoen, J, Janse, JH, Janssen, ABG, Jeuken, M, Kooi, BW, Lischke, B, Petzoldt, T, Postma, L, Schep, SA, Scholten, H, Teurlincx, S, Thiange, C, Trolle, D, Van Dam, AA, Van Gerven, LPA, Van Nes, EH and Kuiper, JJ 2014. Serving many at once: How a database approach can create unity in dynamical ecosystem modelling. Environ Modell Softw 61:266-273.