EOI.FP6.2002
Expression of Interest: Integrated Projects

INTRODUCING TOMORROW'S IMAGING TECHNOLOGY FOR LARGE-SCALE MONITORING OF MICROSCOPIC, AQUATIC ORGANISMS

Acronym: IMAQUA

Thematic areas:
1.1.6.3 - Global change and ecosystems - Biodiversity and ecosystems
1.1.2.iv - Information Society Technology
1.1.5 - Food safety

Submitted by
Prof. Hans du Buf
CINTAL - Technological Research Centre of the Algarve
Faro, Portugal

Abstract

Two European pilot projects, DiCANN (Dinoflagellate Categorisation by Artificial Neural Network) and ADIAC (Automatic Diatom Identification and Classification), have shown that image identification by computer can compete with human experts. In order to prepare field-tested tools that can be applied in all areas in which large-scale monitoring of microscopic organisms must be done, these projects must be continued in an integrated framework at a much bigger scale that includes more organisms and applications. The proposed framework integrates experts in pattern recognition, taxonomists and researchers with different applications, such as water quality (drinking, recreation, harmful algal blooms, shellfish production), global change and biodiversity. It addresses EU policies such as the Water Framework Directive (WFD), Integrated Coastal Zone Management (ICZM) and Information Society Technologies. After the DiCANN and ADIAC projects, Europe is already the leader in this field. Now this technology must be developed and integrated at the European scale.

Need and relevance

Global change, coastal zone management, biodiversity and water quality assessment all require routine monitoring on a large scale. This involves much time-intensive work by highly trained analysts. Recent European projects studied the possibility of automating this routine work, so that researchers can make more efficient use of their time and concentrate on their applications.
These projects demonstrated that dinoflagellates and diatoms can be identified with a precision that can compete with that of human experts. The next logical step is to widen the scope of the small pilot projects, DiCANN and ADIAC, such that (a) more organisms can be identified, (b) all relevant monitoring applications can be covered, and (c) the technology can be applied throughout Europe. This involves further algorithm development, professional software development, extending the taxonomic databases to provide comprehensive cover of major habitat types, extensive field-testing of the technology, and, most importantly, the establishment of technology portals in all countries such that all researchers can have access to the technology. These portals will play an essential role in the future, because they serve to coordinate activities and provide hands-on experience to bootstrap new activities and researchers. The most ambitious goal is to develop a centralised database that covers most taxa of most habitats, such that the taxa necessary for an application can be selected and identifications can be started without the need to build a special database.

Only by developing computer-based technology, in combination with shared databases and a common user platform, will it be possible to develop more efficient solutions for studying global change, biodiversity and ecosystems. These requirements are essential for and directly relevant to EU policies such as the Water Framework Directive (WFD) and Integrated Coastal Zone Management (ICZM). In addition, the project integrates perfectly into Information Society Technologies (IST) and other initiatives such as Marie Curie fellowships and the Erasmus exchange programme.

Scale of ambition and critical mass

Ten years ago, before the DiCANN and ADIAC projects, automatic identification of microorganisms did not exist. Now, after these projects, Europe has developed the necessary technology, albeit at a pilot scale.
Instead of proposing small follow-up projects that would serve to fine-tune the technology, we propose an ambitious framework that will make Europe the leader in both development and application within this area. A large, co-ordinated effort is necessary, first to develop a critical mass of pure and applied researchers to challenge established paradigms, second to provide scope for the necessary synergies to develop between experts, and third to introduce the new technology at the European scale.

A major objective of the proposed work is to bring together scientists from very different disciplines: biology, environment and computing science. The aim is to introduce to biologists the advantages that technology can bring to the field of microscopy and taxonomy. By working directly with the end users, i.e. biologists and environmental scientists, computer scientists can tune the technology to be user friendly and to provide real solutions to the real problems of the disciplines. Furthermore, establishing simple data banking procedures and bringing together the information produced by the various centres will bring benefits to all contributors and users.

If Europe is going to realise policies such as WFD and ICZM, there is an absolute need for (a) standardised sampling and analysis procedures, (b) centralised databasing concerning all microorganisms and habitats, and (c) efficient tools to be applied, throughout Europe, on a routine basis.

This Integrated Project serves to:
1. Establish collaborations between major players in aquatic microbiology
2. Establish electronic databases that serve most key applications and habitats
3. Develop, test and optimise the technology
4. Apply the technology to key issues, from predicting harmful algal blooms to coastal zone management
5. Train young researchers in using, and developing, computer technology

The main aim is to bring together all players and all applications, such that all are involved in the development and all will use the technology. This integration at the largest scale is the best guarantee that the technology will be useful for all. This needs to be done now in order to have the technology up and running in 5-6 years from now, to guarantee that European policies can be realised, and to ensure that European research in this area maintains its cutting edge.

Integration

The structure of the project may look simple but requires many interactions:
1. To collect representative samples from different habitats (applications)
2. To validate the samples by expert taxonomists
3. To construct databases with additional information (taxonomy, ecology)
4. To develop feature extraction algorithms for different organisms
5. To develop an identification scheme using multiple classifiers
6. To test algorithms using the databases
7. To develop professional software tools
8. To field-test the tools on new samples

The scale of this project requires the establishment of different, pan-European, but multidisciplinary task forces that address these logical units. This will involve a number of core activities that also coordinate the project, plus satellite activities for collecting data etc., which can be dynamic.

The basic idea is to create specific or common feature extraction engines optimised for images of different organisms (diatoms, dinoflagellates, coccoliths etc.), and to apply these in conjunction with one identification engine (classifier). One important requirement is that the databases should contain sufficient samples for training the identifier: at least 20 samples per taxon. This is the work to be done under points 1, 2 and 3. Points 4, 5 and 6 address experts in computer vision, but require much input from biologists.
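The database requirement (at least 20 samples per taxon) and the idea of combining multiple classifiers can be illustrated with a minimal sketch. It is not the DiCANN or ADIAC implementation: the function names, the example taxon labels and the choice of simple majority voting as the combination rule are all assumptions made for illustration, since the text only states that multiple classifiers are to be used.

```python
from collections import Counter

MIN_SAMPLES_PER_TAXON = 20  # threshold stated in the text


def undersampled_taxa(labels, minimum=MIN_SAMPLES_PER_TAXON):
    """Return {taxon: count} for taxa with fewer than `minimum` validated samples."""
    counts = Counter(labels)
    return {taxon: n for taxon, n in counts.items() if n < minimum}


def majority_vote(predictions):
    """Combine the labels proposed by several classifiers for one image.

    Majority voting is an assumed combination rule (hypothetical here).
    Ties are broken in favour of the label that was proposed first.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical database content: one taxon label per validated image.
labels = ["Navicula"] * 25 + ["Ceratium"] * 12
print(undersampled_taxa(labels))

# Hypothetical outputs of three classifiers for the same image.
print(majority_vote(["Navicula", "Ceratium", "Navicula"]))
```

A check of this kind could be run before training, so that taxa with too few validated images are flagged for further sampling rather than silently included.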
Point 7 can also be done by experts in computer vision, but requires feedback from the field workers who are going to use the tools (user friendliness, robustness). The most important difference with the previous projects DiCANN and ADIAC is the fact that the biologists, i.e. field workers and taxonomists, are the key players: they must specify what they want. This needs to be done right from the start of the project, and this must be guaranteed throughout the project.

By definition, the project is a dissemination vehicle that addresses, apart from the wider audience through scientific publications, direct colleagues in many laboratories and institutes through the training of young researchers. At the same time it is a demonstration of the technology's applicability, because expert knowledge is included and prior studies were rather successful, i.e. a further development will lead to even better results: identification (ID) rates close to 100%.

The most important aspect is the training of young researchers, primarily biologists but also specialists in pattern recognition. On the basis of the experience established in the DiCANN and ADIAC projects, rigorous procedures must be applied in collecting and imaging samples and in recording additional data for the databases, such as taxonomic and ecological information. These researchers must learn how to integrate the tools into statistical analyses in order to develop the efficiency necessary for large-scale monitoring tasks. They will also play key roles in establishing the technology portals, i.e. at least one centre of excellence in each participating country that serves to disseminate the technology and to provide a backup for all other end users.

With respect to the necessary resources, hardware is not an issue because of fast PCs, cheap disks, and accessible scientific-grade CCD cameras. Most partners already have suitable microscopes.
There is one point that has not been mentioned before: methods have already been developed (ADIAC) to automate slide scanning, but this requires a completely computer-controlled microscope. These are very expensive, but it is possible that one microscope in each portal can be shared by many users.

The main costs of this project are related to the training of young researchers, PhD students and postdocs, including expenses involved with many partner visits, which are expected to have a duration of up to several months. The project will foster many new grants and partner visits (Marie Curie postdoc fellowships, exchange of PhD students within the Erasmus programme). In addition, a successful integration implies many meetings in order to exchange experiences and to evaluate the progress. It is anticipated that, apart from personal contacts established during partner visits, meetings will be organised every 6 months.

Preliminary (confirmed) partnership

BIOLOGY AND APPLICATIONS:

Name; Institution; City, Country; Expertise
Dr. Martyn Kelly; Bowburn Consultancy; Durham, UK; limnology, water quality
Dr. Joachim Huerlimann; AquaPlus; Zug, Switzerland; bioindication, palaeolimnology, forensics
Dr. Michel Coste; CEMAGREF Water Quality Research Unit; Cestas, France; diatom taxonomy and ecology, bioindication, global change
Dr. Jean Prygiel; Agence de l'Eau Artois-Picardie; Douai, France; monitoring, bioindication, ecology, rivers, diatoms
Dr. Lucas Stal; Netherlands Institute of Ecology - KNAW; Yerseke, The Netherlands; ecophysiology and diversity of benthic and planktonic cyanobacteria and diatoms
Dr. Arnold Veen; Institute for Inland Water Management and Waste Water Treatment (RIZA); Lelystad, The Netherlands; phyto- and zooplankton surveys, ISO 17025 phytoplankton standardisation, quality assurance
Dr. Ana Cristina Cardoso; Joint Research Centre - IES; Ispra, Italy; monitoring ecological quality indicators, climate
Dr. Lucien Hoffmann; Public Research Center Gabriel Lippmann; Luxembourg, Luxembourg; taxonomy of cyanobacteria (blue-green algae)
Prof. Eugen Rott; Univ. of Innsbruck; Innsbruck, Austria; periphyton-based river monitoring, taxonomy and ecology
Prof. Alice Newton; Univ. of Algarve; Faro, Portugal; oceanography, coastal zone monitoring
Prof. Margarida Reis; Univ. of Algarve; Faro, Portugal; environmental microbiology, ecology, cyanobacteria
Dr. Fran Saborido-Rey; Institute of Marine Research; Vigo, Spain; marine fisheries ecology, monitoring
Dr. Beatriz Reguera; Spanish Institute of Oceanography - IEO; Vigo, Spain; morphological variability of microalgae
Dr. Celia Marrase, Dr. Jordi Camp, Dr. Lluisa Cros, Dr. Carlos Pedros-Alio; Institute of Marine Sciences CMIMA-CSIC; Barcelona, Spain; phytoplankton taxonomy and ecology, monitoring HABs
Dr. Wim Vyverman; Ghent Univ.; Gent, Belgium; protistology and aquatic ecology
Dr. Daniel Daniliedis; Univ. Athens; Athens, Greece; diatom taxonomy and biomonitoring
Prof. Blahoslav Marsalek; Czech Acad. of Sciences; Brno, Czech Republic; fluorescent probes for metabolic quantification
Prof. David Mann; Royal Botanic Garden; Edinburgh, UK; taxonomy/identification of freshwater and marine algae
Dr. Jeremy Young; Natural History Museum; London, UK; taxonomy/identification of coccolithophores
Dr. Steve Juggins, Dr. Richard Telford; Univ. of Newcastle; Newcastle upon Tyne, UK; freshwater and coastal water quality, global change
Dr. Raymond Leakey, Dr. Christine Campbell; Scottish Association for Marine Sciences; Argyll, UK; quantification of marine phytoplankton, culture collection
Dr. Eva-Maria Noethig; Alfred Wegener Institute for Polar and Marine Research; Bremerhaven, Germany; phyto- and protozooplankton ecology
Dr. Karen Wiltshire, Dr. Mona Hoppenrath; Alfred Wegener Institute for Polar and Marine Research; Helgoland, Germany; biological oceanography, phytoplankton, taxonomy of diatoms and dinoflagellates
Dr. Regine Jahn; Botanic Garden and Museum Berlin-Dahlem; Berlin, Germany; freshwater eucaryotic microalgae, diatoms
Dr. Martin Beutler; Max-Planck Institute for Limnology; Ploen, Germany; primary productivity, mathematical fit algorithms
Dr. Serena Fonda Umani; Univ. of Trieste; Trieste, Italy; microplankton
Prof. Svetislav Krstic; Institute of Biology; Skopje, Macedonia; diatom taxonomy, monitoring
Dr. Jacob Larsen, Dr. Henrik Enevoldsen, Dr. Gert Hansen; IOC Science and Communication Centre on Harmful Algae; Copenhagen, Denmark; taxonomy and identification of dinoflagellates

PATTERN RECOGNITION AND SOFTWARE DEVELOPMENT:

Name; Institution; City, Country; Expertise
Prof. Hans du Buf; Univ. of Algarve; Faro, Portugal; computer vision and pattern recognition (diatoms)
Prof. Phil Culverhouse; Univ. of Plymouth; Plymouth, UK; computer vision and pattern recognition (dinoflagellates, zooplankton, fish larvae)
Prof. Hans Thierstein, Dr. Joerg Bollmann, Dr. Patrick Quinn; ETHZ; Zuerich, Switzerland; automated microscopy and identification (coccoliths, foraminifera)
Prof. Horst Bunke; Univ. of Bern; Bern, Switzerland; computer vision and pattern recognition (diatoms)
Prof. David Marshall; Cardiff Univ.; Cardiff, UK; computer vision and pattern recognition
Dr. Gabriel Cristobal; CSIC; Madrid, Spain; automatic slide scanning, pattern recognition
Dr. Fyllis Tafas; Univ. of Athens; Athens, Greece; very high speed optical scanning for toxic dinoflagellates
Prof. Josef Bigun; Halmstad Univ.; Halmstad, Sweden; pattern recognition
Dr. Jon French; University College; London, UK; in situ imaging, water column profiling