INCF All PONS MEETING Minutes January 27-29, 2010

January 27th, 2010

Attendees:

Oversight Committee Chair: Maryann Martone, University of California, San Diego, maryann@ncmir.ucsd.edu

Project Coordinator: Jyl Boline, Informed Minds Inc., Wilton Manors FL, USA, jylboline@gmail.com

Neuron Registry Task Force:
Lead: Giorgio Ascoli, George Mason University, Fairfax VA, USA, ascoli@gmu.edu
Sean Hill, L’Ecole polytechnique fédérale de Lausanne, Switzerland, sean.hill@epfl.ch
Gordon Shepherd, Yale University School of Medicine, New Haven, USA, gordon.shepherd@yale.edu
Menno Witter, Norwegian University of Science and Technology, Trondheim, Norway, menno.witter@ntnu.no

Representation and Deployment Task Force:
Lead: Alan Ruttenberg, Science Commons, Cambridge MA, USA, alanruttenberg@gmail.com
Mihail Bota, University of Southern California, Los Angeles, USA, mbota@usc.edu
Gully Burns, University of Southern California, Los Angeles, USA, gully@usc.edu
Alexander Diehl, Jackson Laboratory, Bar Harbor ME, USA, adiehl@informatics.jax.org
Sarah Maynard, University of California, San Diego, USA, smaynard@ncmir.ucsd.edu
Onard Mejino, University of Washington, Seattle, USA, mejino@u.washington.edu
David Osumi-Sutherland, University of Cambridge, United Kingdom, djs93@gen.cam.ac.uk

Structural Lexicon Task Force:
Lead: David Van Essen, Washington University, St. Louis, USA, vanessen@brainvis.wustl.edu
Rembrandt Bakker, Radboud University, Nijmegen, Netherlands, R.Bakker@donders.ru.nl
Stephen Larson, University of California, San Diego, USA, slarson@ncmir.ucsd.edu
Laszlo Zaborszky, Rutgers University, Newark, USA, zaborszky@axon.rutgers.edu

Other Participants:

Digital Atlasing Infrastructure Task Force:
Lead: Ilya Zaslavsky, University of California and San Diego Supercomputer Center, USA, zaslavsk@sdsc.edu
Christian Haselgrove, University of Massachusetts, Worcester, USA, christian.haselgrove@umassmed.edu

Metadata Standards Oversight Committee:
David Kennedy, University of Massachusetts, Worcester, USA, david.kennedy@umassmed.edu

INCF Secretariat:
Janis Breeze (Manager of Programs), janis.breeze@incf.org
Raphael Ritz (Scientific Officer), raphael.ritz@incf.org

Albert Burger, Bernard de Bono, Melissa Haendel, Rolf Kötter, and Nicolas Le Novere all joined in for part of the day via teleconference.

Overview and Goals: Maryann Martone

Overview: The goal of the INCF program on Ontologies of Neural Structures (PONS) is to promote data exchange and integration across disciplines, species, developmental stages, and structural scales by developing terminology standards and formal ontologies for neural structures. A focus will be to create solutions to help resolve many of the barriers around sharing information and data integration, which revolve around people’s use of ill-defined terms and structures.

History: The first Oversight Committee meeting was held in Stockholm in the fall of 2007. They focused on general principles for developing a systematic, useful, and scientifically appropriate framework for neuroanatomical nomenclature and created basic principles for implementation strategies.
The report from the workshop discusses the problems arising from the use of different parcellation schemes and terminologies and highlights the need for a universal vocabulary for describing the structural organization of the nervous system.

A second workshop was held in Stockholm in the fall of 2009 with a focus on building informatics infrastructure. The scope was defined and three task forces were formed to focus on neuronal structures, neurons, and the supporting infrastructure (which has evolved into deployment and representation). Structural lexicon infrastructure was developed under the umbrella of NIF for neuroscientists to enter definitions and expose standards to humans and machines.

The goals of the current workshop are to:
• Establish necessary interactions among task forces so that we end up with a consistent terminology structure for cross-scale queries
  o Neuron registry should reference structural lexicon and vice versa
• Define the use cases
• Refine deliverables
  o Common high-level ontology for cross rodent-primate mammalian anatomy that ties together nomenclatures in common use for rodent and primate (Doug Bowden and Laszlo Zaborszky)
  o Conventions for naming of neurons and brain regions
  o Definition of a standard set of properties for describing neurons and brain regions
    - Connectivity across scales
  o Common strategy for representing species differences
  o Set of software tools for populating lexicon and knowledge bases to be built from lexicon
  o Strategy for tying efforts to atlasing, modeling and metadata task forces
• Determine who will do the work

Goal: Demonstration of products at Neuroinformatics 2010 in Kobe at the end of August

Overview of SLTF: David Van Essen

The SLTF has been focusing on the entity pages for brain structural entities in NeuroLex.
Efforts of this group (mostly David and Stephen Larson) have revolved around improving the NeuroLex wiki interface and creating tools for bulk uploads. David has been trying to enter Area 25 in the primate and demonstrated the multiple difficulties that arise because of differences of opinion and parcellation criteria. He also showed that this is an extremely complex issue: there may be some boundaries, but more often the brain, especially the cortex, is a continuous sheet, and different criteria often lead to different parcellation schemes across the cortex. Ideally, probabilistic atlases could be used to help illustrate these characteristics. Whatever is created needs to be useful for many other data types.

SLTF Goals for this meeting:
• Refine NeuroLex metadata categories, characterization
  o Exemplar use cases (e.g., ‘area 25’)
  o Minimal vs optimal characterizations
• Build upon Scalable Brain Atlas
  o Additional parcellations/atlases in SBA?
  o Additional parcellations in BrainInfo, SumsDB?
  o How to link NeuroLex to these resources?
• Prepare for the future
  o Connectome-based parcellation
  o MR-based architectonics

Key Issues for Cortex:
• Multiplicity of parcellation schemes; won’t go away
  o Examples - Paxinos et al (PHT00), Ferry et al. (FOAP00)
• Individual variability (especially in humans)
  o Need for probabilistic maps
• Graded strength of areal boundaries
  o Architectonics
  o Connectivity gradient analyses (DI, R-fMRI)
• Areas, area clusters, networks
  o Modularity, graph-theoretic approaches

Overview of NRTF: Giorgio Ascoli

The goal of the NRTF is to create a resource to browse and search neuron types based on their properties and properties based on the neuron types in which they are found. Giorgio gave examples of use cases and discussed how the lack of consensus in this field presents one of the largest hurdles to their goals.
Fairly recently, there has been a significant advancement with the creation of the Petilla terminology, a stepping stone within a larger classification effort that began in 2004. The group did not move on any classifications, but agreed on a list of terms for properties that are crucial for the description of neurons.

In addition, Giorgio discussed some of the challenges, both scientific and technical, of trying to create a resource such as this and, in conjunction, an interface that gathers information from the domain experts. So far NRTF members have focused on investigating multiple domain-expert interfaces rather than NeuroLex, since that interface isn’t ideal for what they want to do. Whatever is used must be compatible and integrated with the backbone of what’s in NeuroLex. They’re tackling the problem in a pragmatic way, by narrowing their focus, linking to the SLTF for location and chemoarchitecture, and leaving the ontology organization to the ontology experts.

Proposed timeline:
Phase I: June-Dec 2009
Refining scope, agreeing on properties, designing curator interface
Phase II: Jan 2010-August 2010
Testing curator interface, seed populating Registry, designing user interface
Phase III: Sept-March 2011
Testing user interface, devising system of ongoing curation, release of beta Neuron Registry, and write report for publication

Goal for this meeting is an operational decision about how to proceed over the next ~9 months and to start putting together other longer-term goals and actions.

Overview of RDTF: Alan Ruttenberg

The group has focused on starting to collect potential relations (how the different things are connected to each other, e.g. located in). David Osumi-Sutherland has started collating the multiple relations from different efforts (many without definitions) and the group is starting to look at how to navigate across them.
They’ve started scoping what is needed as supporting representation “high level entities” (initiated by Gully Burns), and started reviewing and further formalizing BAMS.

During this meeting, he’d like to work on trying to represent neurons and brain regions together by evaluating existing work and recording what doesn’t work. He’d also like to ensure the group understands the formal representation task and distinguishes it from the user interface, and that people understand each other. The group would like to do some initial formalization, which will result in an OWL file, and to review and set goals for NeuroLex, as it is a main component of the user interface.

Representation Task: Be clear about what is being represented, communicate with clarity and good documentation, reuse and integrate resources and ensure that we can express and answer queries precisely.

User Interface Task: This is used to allow domain experts to effectively communicate their knowledge, and needs to be expressed in formal representations.

These two views need to be kept in sync. Some of this depends on what has already been developed, but if we are defining the ontology, we can create a new one that is very controlled and specific (formally defined in such a way that future people can follow the same path).

What is a Type:

This session was primarily to help the Representation and Deployment TF get up to speed on the associated representation issues, primarily the thinking behind the different types of criteria used for classifying things.

Cell Classifications: Giorgio discussed how different people use different criteria to “classify” cells. He showed a table (created in his lab on the basis of extensive literature mining) that gives an example of how neurons in the hippocampus can be “classified” by the patterns of distinct subregional layers invaded by axons and dendrites.
This doesn’t include the physiology or chemistry of the cell. He emphasized that people will use different criteria, which leads to very different classification results. The Registry can help by showing possible alternatives with an observed subset of properties. Moreover, the Registry would also indicate when something isn’t known. We need to put an information structure in place to address these issues, and we need to ensure that we have use cases in place that guide the development of this infrastructure.

Structure Parcellations: David discussed how people use parcellation as a way to get a handle on function and connectivity. Connections are not binary, as they are usually described, but may be thought of more as a “probability.” The range in connection strengths is a factor of 10^5 or 10^6. Even many of the “weak pathways” are very consistent across organisms, which strongly suggests an important functional role. When we think about our descriptions of cell properties and connections, we should keep in mind that the range of “connection strengths” will be tremendously diverse, and have some ways to support that information if we can.

He also touched again on the issue of the different parcellations set by different people. All these areas may not have equal value and robustness, and this comes back to the idea that the cortex may be better described as the functional components of a network than as individual parcellations. Again, the evidence for gradients within a parcel shouldn’t be ignored; what may be needed to characterize domains throughout the brain is a mixture of parcellation and location within a large parcel. This of course ties strongly to what the atlasing group is doing, so we should think about how we can engage with them as we move forward.
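As an illustration of the point about graded connection strengths, here is a minimal sketch of how a pathway might be recorded with a numeric strength instead of a yes/no flag. The class name, field names, area names, and values are all invented for this example and do not come from the meeting; it only shows how a binary view discards the weak but consistent pathways discussed above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """A directed pathway with a graded (not binary) strength."""
    source: str
    target: str
    strength: float  # relative strength in arbitrary units; must be > 0

    def __post_init__(self):
        if self.strength <= 0:
            raise ValueError("strength must be positive")

    def log10_strength(self) -> float:
        # Strengths spanning several orders of magnitude are easier to compare in log units.
        return math.log10(self.strength)


def dynamic_range(connections):
    """Ratio between the strongest and the weakest recorded connection."""
    strengths = [c.strength for c in connections]
    return max(strengths) / min(strengths)


def as_binary(connections, threshold):
    """Collapse graded strengths to present/absent, which loses the weak pathways."""
    return {(c.source, c.target): c.strength >= threshold for c in connections}


# Hypothetical pathways; names and numbers are placeholders, not measured data.
pathways = [
    Connection("area A", "area B", 100000.0),
    Connection("area A", "area C", 250.0),
    Connection("area B", "area C", 1.0),
]

print(dynamic_range(pathways))
print(as_binary(pathways, threshold=1000.0))
```

Keeping the numeric value, and thresholding only at query time, lets the same record serve users who want a binary wiring diagram and users who care about weak pathways.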
There was a brief discussion about current technology that may aid in dealing with the issue of granularity of descriptions that justify parcellations, in order to identify and encode the chunks of publication that relate to a particular area. Gully’s group is creating LATISI (literature annotation tool from the information sciences). A user can download a pdf, render it and interact with the text. In the future, there would be a way to interact with the database.

OBO (Open Biomedical Ontologies) Foundry Approach: Alan

Alan discussed the OBO Foundry approach, which includes a subset of ontologies whose developers have agreed in advance to accept a common set of principles reflecting best practice in ontology development designed to ensure:
• tight connection to the biomedical basic sciences
• compatibility
• interoperability, common relations
• formal robustness
• support for logic-based reasoning

Their principles for ontology development are built on collaboration and consensus using clear communication, documentation, attribution, and curation. Results should enable understanding of how ontologies are developed and the relations across different ontologies. They should also support reasoning, so you get out more than what is put into the ontology.
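To make "you get out more than what is put in" concrete, the sketch below treats a part_of relation as transitive and derives statements that were never entered by hand. This is only a toy illustration in plain Python, not the OBO Foundry tooling or an OWL reasoner; the relation name and the three asserted statements are assumptions chosen for the example.

```python
def transitive_closure(pairs):
    """Return all (part, whole) pairs implied by treating the relation as transitive."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # If a part_of b and b part_of d, then a part_of d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical asserted part_of statements, chosen only for illustration.
asserted = {
    ("CA1 pyramidal layer", "CA1"),
    ("CA1", "hippocampus"),
    ("hippocampus", "brain"),
}

# Statements that follow from the assertions but were not entered explicitly.
inferred = transitive_closure(asserted) - asserted
for part, whole in sorted(inferred):
    print(f"{part} part_of {whole}")
```

Three asserted statements yield three further ones, so a query for everything that is part of the brain returns all three structures even though only one was linked to it directly.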
Creating ontologies and their components:

Upper level distinctions:
• Continuants (Things)
  o Independent Continuants, exist independently: Cells, Molecules
  o Dependent Continuants, exist in relation to something else
    - Qualities: Shape, Mass, Reflectivity (subject of PATO)
    - Realizables: Kinase function, IRB member role, being a drug
  o Generically dependent continuants (Information)
• Occurrents (Processes)
  o Having difficulty breathing
  o A metabolic reaction
  o A mass spectroscopy run

“Instances” are defined as:
• Objects (particulars, independent continuants): “fully present at every time when it exists”
  – one coho salmon
  – The National Oceanic and Atmospheric Administration
  – one person
  – one notebook
• Properties (dependent continuants): “fully present at every time when it exists”
  – The ability of a passive integrated transponder tag to respond to a radio signal (a function)
  – NOAA Fisheries Service charge to review locally prepared salmon recovery plans (a role)
  – one salmon’s fork length (changes over time)
  – the measured value of one salmon’s fork length in millimeters
• Processes: “takes place (unfolds) over a period of time”
  – A salmon swimming up a salmon ladder
  – Member of the NOAA Fisheries Service preparing advice to augment a salmon recovery plan (realizing their role)
  – A PTT responding to a radio signal (executing its function)
  – A fisheries staff member weighing a salmon

Relations between instances:
• trap17 located_in xxx river
• salmon23 has_quality {fork length of 12.2 mm} now
• antenna78 has_function PTT detection
• John Samuels has_role Biostatistician

Classes:
• Those entities that are alike in some way
• Mostly expressible as the relationships that their instances have to other instances
• A PTT Antenna is an antenna that has function PTT detection function
  – ALL instances of PTT Antenna are an instance of Antenna that has_function SOME instance of PTT detection function

There was a discussion about how deep we will go. We must define the situation where a given term is used so a user can evaluate effectively whether or not it fits their needs. In practice, when there’s a dispute, the process goes to collaborative debate and sometimes these terms get left behind and two different terms get created. We want to get people to use what exists already and, if something doesn’t exist, to get them to define it in terms of things that already exist. We don’t have to get it right the first time; we can modify it and it will get better over time. We are not building a dictionary; the definitions we develop will fit our needs and won’t satisfy everyone. There needs to be an interface tool that allows people to see the definitions and classes and has the ability to issue queries.

Original Requirements of Steering Committee: Maryann

Maryann reviewed the original Steering Committee requirements about the development of the PONS infrastructure with the purpose of seeing what we have achieved and to get a better handle on what we still need to do.

Initial Requirements were as Follows:

Infrastructure:
• Infrastructure should handle information ready to go into a textbook as well as areas still under debate (controversy should be allowed and made accessible)
• Must be able to handle versioning
• The first Neuropedia (now NeuroLex) is unlikely to allow query across a lot of different sources, but this would be desirable in future implementations
• Need to define links with space and atlases. Tying annotations to an atlas is a registration issue and beyond dealing with just images. Work with the Atlasing Task Force on this.
• Need to define how to link this to electrophysiology, possibly link an electrophysiology database via the ontology and atlasing framework.
• Need to provide services for tool access in the future (e.g., atlasing tools)
• Authority
• Easy to use interface for trained users to add information (e.g. workshops for the dedicated domain experts)

Usability:
• Whatever is created needs to be market-tested and modified to fit needs
• Needs to be tested by tool-builders accessing services

Metadata Structure for the Lexicon:

There was a lot of discussion of initial metadata structure for the structural lexicon entries and a template was created that includes items such as the species, partitioning scheme, etc. This has evolved over time and is used for inputting information into the NeuroLex wiki. Also, there was a fair amount of discussion over the term “metadata,” as it seems to be a loaded term and may or may not be appropriate for this group and its goals.

There was a lot of discussion around methods for building this for people vs. text mining. Distinguishing different synonyms with the context of IDs is useful, but there are other methods to further define these individual terms. We need to be careful about putting constraints on our ontologies for text mining. Our primary goal should be defining terms well; text miners can then figure out that there are synonyms, etc., but we should not build for that particular purpose.

More Requirements were Developed over Time and Include:
• Access to synonyms
• Link the definitions to an atlas that visually displays the definition (being done now)
• In the future, if someone links to a space in an atlas, all the terms for that space should be shown; these might be ranked with a few “INCF-stamped” registries
• History of term
• Problems with the term
• Link back to figure in published literature
  o INCF to negotiate with publishers for rights
• Link to a higher order brain structure
  o Classical anatomy
  o BrainInfo structure
• Citations for every relationship

The goal is to review these requirements and come back with recommendations for improvements, or if we think something can’t or shouldn’t be done.
Scalable Brain Atlas (SBA): Rembrandt Bakker

Rembrandt started with a review of the CoCoMac-Paxinos 3D viewer (precursor to the Scalable Brain Atlas) and how it didn’t quite fit the needs they were trying to fill. They wanted a tool that didn’t have to be downloaded and would be easy to use. To meet these needs, they built the Scalable Brain Atlas (SBA, http://scalablebrainatlas.incf.org/). SBA is web-based, and they can put in any atlas template that supports scalable vector graphics (SVG). They also plan to provide services to other websites and databases. It’s currently being used as an image service to NeuroLex for structures. They will also supply other services such as brain region coordinates in XML and a connectivity display for CoCoMac.

He gave an example of viewing the Paxinos macaque atlas in SBA, where it’s possible to see the structure in pseudo-3D, or 2D plus. It isn’t complete yet and was hand-copied from the paper atlas, so they haven’t had too many copyright complaints. They may wish to go with public domain atlases such as the Allen Brain Atlas.

Discussion issues:
• Copyright:
  o We’ll need to have people look into it for the different atlases
  o May be able to get around by modifying them enough (e.g. apply automated parsing algorithms)
• We need something else as a method of recording location:
  o It should connect to high resolution datasets
  o It should be possible to “paint” locations on them (more visual rather than names)
• May be worth trying to put Menno’s hippocampus atlas into this format

NeuroLex Overview and Tutorial: Stephen Larson

Since the oversight meeting reviewed by Maryann, the San Diego group has moved forward in creating a wiki tool called NeuroLex (http://neurolex.org/wiki) that fills many of these requirements. It is sponsored by NIF and INCF and is meant to allow people to enter neuroscience terms.
Much of the content of NeuroLex can be browsed from the main page under major categories, arranged as hierarchies or as tables. Note that brain partonomies for a generalized mammalian brain are accessible from the main page and at http://neurolex.org/wiki/Brain_Partonomy_(general_mammalian). There are also quick links on the main page to add new content. Each “type” of new entry has a different set of fields that can be filled in.

Given the way these are built as “blocks,” you can create different “views” of the data based on the query. In addition, you can upload bulk content with an Excel template developed with David Van Essen. However, at this point bulk uploads need to be done with the help of a curator to help prevent problems.

Cell Ontology Overview: Alex Diehl

He gave a brief presentation to illustrate the power of using different ontologies to help build up different views of the information. By using the relationships when cells are given definitions and described with properties, you can create more powerful linkages that weren’t necessarily specifically defined to begin with. For neurons, there is a set of properties that are required to define them. If there are 8 ontologies, they express these properties 8 different ways. But if there is a core set, we can focus on those properties.

Discussion Items:
• We should choose the properties, relationships, etc. that we want to focus on and ensure there is a page in NeuroLex that explains them.
• The NRTF will work with the RaD TF on what they need to express.
• SLTF will also focus on a core for properties and relationships so people can build from them.
• The ontologists will be responsible for putting them into a formal structure.
• As we move forward we need to realize that user interface and underlying infrastructure are not the same thing. However, often both need to be developed in conjunction.
• The RaD TF has started compiling a set of existing relationships and terms from different efforts. The table is huge and ungainly at the moment, but it lists the existing properties that different groups have used, although different people use them slightly differently. They are in the process of pulling out some essential features from this and can share their findings with the other groups soon.
• The plan is to go through specific examples tomorrow and determine how to create representations.
• A suggestion for a potential deliverable for Kobe was put forward by Gully: a combined representation of connectivity in the hippocampus between individual cell types and areas, capable of handling two levels (structures and cells) and also capable of expressing lineage information.
  o We may wish to start by taking advantage of the connectivity information from Menno and the work in progress as demonstrated by Giorgio.
• Recurring themes throughout the day:
  o We will need to work with the Metadata Program on these issues
  o We need open access to journals and, if not, we at least need some sort of solution for some images

Thursday January 28:

Groups broke into the Structural Lexicon and Neuron Registry Task Force core groups, and members of the Representation and Deployment group split between these two.

Neurons and Properties:

Attendees: Jyl Boline
NRTF: Menno Witter, Gordon Shepherd, Sean Hill, Giorgio Ascoli
RDTF: Alex Diehl, David Osumi-Sutherland, Stephen Larson, Bernard de Bono (joined by teleconference for part of the day)

Overview: The group started with an overview of the issues discussed yesterday followed by a review of an interface Giorgio has been working on for the Neuron Registry Task Force and Menno’s new Rat Brain Workbench.
Discussion Topics:
• Goal is to give domain experts a chance to enter information in a flexible manner and in a way that makes sense to them.
• Ensure we define relations that people will use to input information.
• The question arose whether people will be entering single instances of neurons. Instead it seems that people will be putting in information from publications, which is more of a category that has been characterized in a paper.
• Can use lists of properties to aid us in checking for identical entries.
• Is there a minimal or necessary set of properties that every neuron must have in order to be identified? Or can any neuron be entered as long as it has proper references?
• Classification will be the last thing that happens and there will be disagreement.
• Keep in mind the different viewpoints of ontologists vs. domain experts: we can accommodate both as often as possible, but focus on the domain experts when we can’t.

Focus on Properties:
• Petilla contains a list of features, and the group is working on common definitions of properties; it will be an iterative process.
• Use “query” as an aid in quality control; it can alert us if things are not right.
• What level of granularity for description is required? Using the relation “part_of,” we should give as much detail as possible.
• How to deal with properties (such as fast spiking) that should have a range. A potential option is to have an interface (e.g. Giorgio’s might allow this) where the property has to be defined first and then values can be put in.
  o Any properties that are added also need to have the option to have publications and curator attached.

Giorgio’s Neuron Registry Interface Prototype:

This interface has hierarchically ordered folders with drop-down menus. The folders can function as placeholders (not required to be populated).
To create a new type of cell (names are irrelevant because the name can be edited later), start attaching and entering information into the properties that describe that cell. However, an in-depth discussion on properties of properties (see below) indicated this may not be able to accommodate the desired data structure, so Giorgio would like to get more information on this so he can modify the interface accordingly.

Compare this to Cell Ontology entries (Alex Diehl): for the Cell Ontology, people give textual information and he has a curator who incorporates the information into the ontology.

Discussion-State or Conditions of Experiment:

There was a great deal of discussion about how neurons often have properties that themselves have properties (e.g. firing rate) that depend on “state” or experimental details (e.g. species, age, behavioral conditions etc.). How much of this information is needed and how specific should it be? There was also discussion about what these types of properties are called: some call them metadata, others annotation, and others data. This seems to be an especially important issue in the case of physiology. Gordon uses models, which force specific information about each property and the physiological expression. They simulate some specific properties such as bursting and regular firing.

Discussion-Necessary and Sufficient:

There was a great deal of discussion around whether the properties corresponding to a neuron type in the Registry should (each) be necessary and (all together) sufficient for our definitions and for classifying neurons, especially since there will always be edge cases. An example is the pyramidal cell, which is usually associated with a soma shape; however, there are pyramidal cells that fit the definition without having a pyramidal soma shape.
What is considered necessary to define a cell is for it to meet the criteria of a complete set of properties (which will vary depending on a cell). What is sufficient to define a cell type is the set of minimum properties (matching the criteria for that cell type), which vary from type to type.

Rodent Brain Workbench: Menno Witter

Menno gave an overview and showed their new web-based interface that links to the rat hippocampus from the rat brain workbench:
• It is still under development so not publicly available yet.
• It contains a repository of rat brain sections (focus on the hippocampus, entorhinal, and perirhinal area) in all three planes.
• They currently have two markers but other markers could be added.
• Segmentations are contained in an overlay.
• Descriptions of the areas (including cell layers) are linked to the names.
• Text describes a clicked area and its closest relationships in the brain.
• They don’t include definition of the cells in this interface but cell types are mentioned in the text. It would be great to cross-link to the neuron information once it’s available.
• Next aim is to get the 3D properties working.
• Linkage to wiring information:
  o Colors are linked to another web-site that contains connectional data, cell layers and topology.
  o People can use this site to build a routing diagram.
  o This is for rat, but they plan to create one for the monkey and rat retrosplenial cortex.
  o Their code should be general enough for other areas of the brain.
  o This is built on a reference manager database (~160 papers).

Related Discussions:
• Coordinating and syncing this information to NeuroLex is further down the road but high on the list of next steps.
• These area descriptions may be useful for the SLTF.
• This could act as a scaffold for many of the entries of the neuron registry TF.
It would be good to integrate this with the atlasing effort, although it might be a little early (since that focus is on the whole brain and mouse at this point). It would be great to share some of this via services (XML for boundaries and the access to the annotation page). 16 INCF All PONS MEETING Minutes January 27-29, 2010 Properties Structure: A fairly lengthy discussion was had about properties and the organization of the data structure. What is a property and how do we determine what properties are assigned to a neuron? This led to a discussion of a relation and its value. A relation was described as how two instances relate to each other, for instance: A neuron: Has shape Has location Has terminus Has role Has neurotransmitter Develops from Has orientation Has size Has firing pattern Has afferent Has efferent Part of Participates in (e.g. awake, behaving) A relationship was described as how an instance has relation to a value. Example: basal dendrites of CA1 pyramidal cells has terminus in area X At this point, the group broke into smaller ones. One group worked on graphically describing an example CA1 pyramidal cell with limited properties. Another group worked on trying to find what might be a set of core defining properties in order to ensure definitions are developed and vetted for them. Structural Lexicon and Ontology Development Attendees: Maryann Martone, Janis Breeze SLTF: David Van Essen, Rembrandt Bakker, Stephen Larson, Laszlo Zaborszky RDTF: Alan Ruttenberg, Mihail Bota, Gully Burns, Sarah Maynard, Onard Mejino 17 INCF All PONS MEETING Minutes January 27-29, 2010 Overview: Maryann presented the 3 scientist ‘use cases’ that help define the goals of the SLTF and the upcoming demos in Kobe: 1. If I have an electrophysiology/fMRI study in primates that shows activation in areas of cortex, and I want to compare these results with genes expressed in the Allen Brain atlas, then I will need to map across primate cortex to mouse cortex. 
What are the underlying shared structures that allow me to make this cross-atlas and cross-species comparison?
2. If I’m creating a gene expression atlas in mouse brain, and would like a simple and computable hierarchy of high-level structures, where can I find consistent definitions of high-level structures across species?
3. If I have created an atlas with its own parcellation scheme, how can I incorporate an existing nomenclature? Can I represent my atlas in the Scalable Brain Atlas (SBA)?
More specifically, such a user might be a:
1. Scientist looking for data, e.g. genes in cerebral cortex
2. Anatomist with a new atlas
3. Molecular biologist with lots of money to create a mouse brain-protein expression atlas
4. Computer developer
Regarding the various parcellation schemes that already exist or will be created:
We should encourage all atlases to have representation in the SBA, which ties to NeuroLex.
Perhaps a future INCF service could be a registration process for new atlases.

Review of current resources for neuro-ontologies: FMA and BAMS
Review of FMA (Onard)
Onard reviewed the principles underlying Basic Formal Ontology (BFO) and the Foundational Model of Anatomy (FMA) Ontology (PowerPoint available).
A first question for our group: How high do we want to define the hierarchy? Once the higher-level classes are defined, then ontologists will determine the underlying structure, independence, etc.
The FMA is essentially a top-down approach: you start at the root, determine which anatomical entity is the root class, and follow with a “single inheritance class hierarchy”; things get grouped together based on common properties.
This process allows for a 1/2/3-dimensional representation of the brain.
All anatomical entities are either material or immaterial:
all entities must have dimension
material entities must have mass
immaterial entities (“spaces”) don’t have mass, but they’re still 3-D
inheritance of properties is propagated
“portion/substance” does not inherit 3-D space
“cardinal” structures refer to things like head, trunk, limbs
Review of “fiat” boundaries versus physical/bona fide boundaries:
fiat boundaries can be anchored or floating (e.g., dependent on a surgeon’s eye)
fiat boundaries are typical of neuroanatomy
Example from rodent barrel cortex: There’s nothing on the surface of the brain that indicates barrel cortex, but staining reveals clear zones. Filling cells shows that while some respect boundaries, others don’t. Staining reveals some cell organization, but other cells can move freely between zones. What type of boundary does this suggest? Probably bona fide, as there is a clear zone, depending on the level of granularity (i.e. one will see bona fide boundaries at microscopic resolution, whereas gross divisions tend to be fiat).
There are different kinds of parts: regional (cell body of neuron) and constitutional (plasma membrane of neuron):
Regional parts of brain: forebrain, midbrain, hindbrain (like geographic states)
Constitutional parts: neural tissue, vasculature, ventricular system (like mountains, lakes, trees)
A common problem with nervous system cells is that they don’t respect the divisions/territories we would like to define, e.g., amygdala (Heimer vs. Swanson), which has changed because of observations from staining. Laszlo explained that the classic definition of amygdala was “extended” because staining revealed continuity with surrounding areas. Another example is the substantia innominata, so named because classically it was not known what it should belong to; however, staining now shows chemo-architectural commonalities with surrounding structures (striatum, amygdala).
Different partitioning schemes also exist based on the scientist/practitioner’s approach: histological, surgical, anatomical/morphological, and the challenge is how to correlate the three. Other anatomic entities have a framework for doing this (e.g., prostate, kidney, etc.), but their partitioning is simpler.
For a particular definition, you can add as many properties as you want. So neuron has multiple definitions: it is nucleated, it is a neural cell, it has a cell body, etc. These definitions deal only with structure (not function). Later, one can perhaps include connectivity, constituent molecules rather than molecules that are expressed, location, etc.
Surgeons might say “parietal lobe” and by that term include subcortical structures as well, or at least the white matter; neuroanatomists would use that term to mean only cortex.
To solve the problem of differing use of the same term within the community, we give the terms suffixes and therefore a unique label (this is analogous to Mihail’s approach, in which every term is associated with a nomenclature). Thus FMA ensures that each term is qualified, but can also be extended.
This leads to an important clarification:
In FMA: if 2 names refer to the same thing, they are still one entity.
In BAMS: if 2 names refer to the same thing, they are still considered separate entities.
This requires careful use of synonyms, cross-linking, etc.
FMA is currently working with RadLex to enhance the neuroanatomy content of RadLex to improve the annotations they use for fMRI.
Need to reconcile all the different parcellation and/or naming schemes, whether based on topographic or cytoarchitectonic approaches (e.g., for human brain there is Talairach vs. Freesurfer vs. NeuroLex).

Review of BAMS (Mihail)
The constructing principle is that any term/concept is always defined by the reference in which it is published. Every term is associated with a publication.
For brain regions, we don’t know which one label is best. We use a single nomenclature for anchors (though there are many out there).
Nomenclatures are species-specific. A nomenclature can be a single term, or can refer to an entire region or species.
The anchor here is the “BAMS Neuroanatomic Ontology”, which is in the rat domain (Swanson 1988), but upper-level brain regions can be applied to any mammalian species. This is general enough in definition to apply to humans. At the upper level, we have organism and cell. (Mihail to talk to editors for Swanson about getting atlas definitions online to incorporate into SBA; legal issues, etc.)
Nomenclatures are not “fixed”: they can be merged, expanded, revised, etc.
Gene expression data is available across nomenclatures.
Questions to ask if we wish to standardize a nomenclature:
Should the terms reflect the way that people would organize brain regions? OR
Should they be organized the way an anatomist would prefer to see it? (e.g. midbrain and hindbrain as part of brain stem)
Alan: Terms are always ambiguous. We want to accommodate different nomenclatures as the major ontologists are doing. The strength of BAMS is the ability to grab the citation easily; with FMA it’s more difficult to get back to the citation.
Best practice that INCF should adopt: for any representation that is presented, people should be able to disagree.
How do users define striatum for usability/experimentation/etc.? Currently NeuroLex says “Striatum of X (2000)” and “Striatum of Y (2002)”. We can have an overall, classical “striatum” under which all striatums are grouped. Striatum mammal can be a term that refers to this piece of brain shared by all mammals.
What about terms existing in multiple nomenclatures?
Alan: When things are identical, they are synonyms. Every term must refer to some piece of brain.
If terms have identical definitions, then that term becomes one, and they point to two different papers or nomenclatures. If they are not identical, they will be striatum A and striatum B, and each will refer to its respective source. If possible, we will link out pictures to accompany these terms.
Examples of the FMA definitions of Thalamus and Brain Stem are shown below. Definitions can be modified, expanded, and include previous definitions, always referencing the source.
Thalamus, rat
Thalamus, SW 92 (identical to SW 98)
Thalamus, SW 98
Thalamus, SW 04 (e.g., “thalamus as defined by Swanson in 2004 publication”)
Brain Stem
Brain Stem, SW 92
Brain Stem, SW 98 (part of SW 04)
Brain Stem, SW 04 (part of SW 98)
How do we get to a high-level shared parcellation if we have different fundamental definitions of brain stem? Hierarchy is a strict consequence of definition. (DVE: Hierarchical relationships get problematic when Swanson is putting thalamus in the brainstem.) For this nomenclature, we get Thalamus-Swanson-1998, which is part of Brainstem-Swanson-1998. (Note the definition of thalamus from FMA is entirely structural, whereas the Swanson definition is primarily developmental.)
We could create Thalamus, consensus to include all of SW 92 98 04.

Plan before Kobe:
Each expert TF member gives his/her definition of a structure, then Stephen and Maryann deal with the NeuroLex side of mapping it across rat/mouse/monkey/human.
We should start with the cerebral cortex of rat/primate/human/macaque (perhaps getting Rembrandt’s help via CoCoMac).
Cross-species definitions will be much more generic, e.g. for Cerebral Cortex, one can’t say that “it contains lobes” since that statement is not true of all species.
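The rule discussed above for terms that occur in several nomenclatures (identical definitions collapse to one term that points at several sources, while differing definitions stay separate) might be sketched as follows. This is only an illustration: the class `ScopedTerm`, the function `merge_identical`, and the placeholder definition strings are assumptions made for the example and do not reflect how BAMS, FMA, or NeuroLex actually store terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedTerm:
    """A brain-region name qualified by the nomenclature that defines it."""
    name: str          # e.g. "Thalamus"
    nomenclature: str  # short source code as used in the minutes, e.g. "SW 98"
    definition: str    # definition text from that source (placeholders below)

    @property
    def label(self) -> str:
        # Unique label in the style of the minutes: "Thalamus, SW 98"
        return f"{self.name}, {self.nomenclature}"


def merge_identical(terms):
    """Group terms whose name and definition text match exactly.

    Returns a dict mapping (name, definition) to the list of nomenclatures
    that share that definition. Matching here is naive string equality after
    lower-casing and whitespace normalization (an assumption of this sketch).
    """
    merged = {}
    for term in terms:
        key = (term.name.lower(), " ".join(term.definition.lower().split()))
        merged.setdefault(key, []).append(term.nomenclature)
    return merged


# Placeholder definitions only; the real definition texts are not in the minutes.
terms = [
    ScopedTerm("Thalamus", "SW 92", "placeholder definition A"),
    ScopedTerm("Thalamus", "SW 98", "placeholder definition A"),
    ScopedTerm("Thalamus", "SW 04", "placeholder definition B"),
]
merged = merge_identical(terms)
for (name, _definition), sources in merged.items():
    print(name, "->", sources)
```

With these placeholders, SW 92 and SW 98 end up as one entry with two sources, and SW 04 remains its own entry. In practice, deciding that two definitions are "identical" would need expert curation rather than string comparison.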
Maryann: As a starting point, we can use Doug Bowden’s “Rosetta Stone” list derived from BrainInfo: a collection of terms that people have commonly used (available at http://braininfo.rprc.washington.edu/OtherModels.aspx?requestID=3071&questID=21&pTerm=NeuroLex+Mammalian+Brain).
We will take that list of structures and create consensus definitions for all of them, starting with pan-mammalian structures.
Underneath pan-mammalian will be rodent/primate versions.
The technical back-end of this nomenclature will provide translation among branches and among correlated structures across species.
Clarification: Consensus structures are those where people “know” what they mean. (An example of a non-consensus structure is ‘lentiform nucleus’; it’s not a “primary” term, as it refers to a bunch of structures.)
To do (Stephen): Create disambiguation pages for NeuroLex (similar to Wikipedia).

EXERCISES IN DEFINING TERMS AND PARTS
Hippocampus (Group joined by Menno Witter)
How to define hippocampus? There are many possible definitions:
NeuroLex: http://neurolex.org/wiki/Category:Hippocampus
Gully: Use gross anatomy based on how it looks on Nissl stain.
DVE: “Architectonically distinct part that includes sheet-like region that includes CA1-CA3 or CA4, adjoined by the subiculum (CA1) and terminating in dentate gyrus.”
Then describe parts of hippocampus. There are 3 layers:
Molecular layer (outer layer)
Cell layer (Pyramidal in CA1)
Polymorph layer (Oriens in CA1 and CA2)
Then break down further….

In-depth review of BAMS
Using as an example the Bed nuclei of the stria terminalis (BST) in rat: Bed nuclei of the stria terminalis (can take a while to load and/or does not always load).
(Noted that NeuroLex definitions are currently lacking:
http://neurolex.org/wiki/Category:Stria_terminalis (currently defined via Wikipedia)
http://neurolex.org/wiki/Category:Nucleus_of_stria_terminalis )
Start with the first citation: this seems to be Johnston 1923 (Journal of Comparative Neurology), who called it the bed of stria terminalis; it can also be called the special nucleus of stria terminalis.
This was followed by Gurdjian 1925 (Journal of Comparative Neurology), which provided topographical data.
Until 1960, there was considered to be a single BST (bn.st).
Then in 1963, Blzier divided the structure into BST a, b, c, d, e (based on cytoarchitecture data: Nissl stains and landmarks).
Two BST paradigms then emerged:
divide BST into posterior and anterior
divide BST into medial and lateral
In 1987, Bayer looked at BST from the point of view of ontogenesis and recognized anterior and posterior parts. This is the current definition of BST.

January 29th
Attendees: All from the 27-28th, in addition:
Ilya Zaslavsky, University of California, San Diego, CA, USA, zaslavsk@sdsc.edu
David Kennedy, University of Massachusetts, Worcester, MA, USA, david.kennedy@umassmed.edu
Albert Burger and Clif Saper joined for part of the day by phone.
The day began with a continuation of the small group working sessions that started on the 28th.

Discussion of Structural Delineation Issues: Laszlo Zaborszky
Laszlo gave an example of an important issue moving forward. In the basal forebrain, it is difficult to create boundaries using the cytoarchitectural or other structural features.
They found that four different cell types in this area display a general pattern of high-density clusters. The other three cell types form twisted bands along a central dense core of cholinergic cells traversing the traditionally defined basal forebrain regions.
A representation of the space that the cell types occupy is X…but the space covers multiple regions.
We need a general representation of what space this occupies in the brain.
These cell populations make odd shapes in the brain, but these shapes don’t necessarily correspond with the structures that people delineate in the brain.
This exemplifies the difference between cell types and cell populations (or clusters): here you’re talking about the instance of the type as a cell population vs. the instance of the type as a cell, which are modeled in different ways. A brain region is not the same as the space of a region.
This can be described with the ontologies using “scattered aggregates”: we can create concentrations and densities, and describe containers in which these aggregates sit.

Wrap up of Working Sessions:
NRTF summary (Giorgio):
Review of major topics covered by the group during their working sessions:
Ensure that information can be easily accessed, visualized, and entered.
Discussed what is a necessary and what is a sufficient set of properties that every neuron must have in order to be identified, e.g. we discussed pyramidal cells that don’t actually have a pyramidal shaped soma, in which case we either create 2 subclasses, or remove this property if it is not actually necessary to classify those cells.
Discussed what might be the difference between the informatics and domain expert points of view and issues.
Work on common definitions of properties (will be iterative).
Reviewed potential Neuron Registry interface to define properties and neurons. There needs to be a protocol put into place that links tools like this to NeuroLex.
Clarified 5 components for assigning characteristics: relation, value, part of the neuron that gains that relation, a reference (pub ID), and a free-text optional note by the curating domain expert.
Created a graph (below) for visualization and to help us better understand each other.
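As an aside, the five components named in this summary (relation, value, part of the neuron, reference, optional note) could be held in a record like the one below. This is a hypothetical sketch, separate from the graph and interfaces produced at the meeting; the class name `PropertyStatement`, its field names, and the placeholder reference string are assumptions made for illustration, and the example values reuse the "basal dendrites of CA1 pyramidal cells has terminus in area X" example from the Properties Structure discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyStatement:
    """One curated characteristic of a neuron type (hypothetical structure)."""
    neuron: str                 # the neuron type being described
    part: str                   # part of the neuron that gains the relation
    relation: str               # e.g. "has terminus in", "has shape"
    value: str                  # the value the relation points to
    reference: str              # publication ID supporting the statement
    note: Optional[str] = None  # optional free-text note by the curator

    def sentence(self) -> str:
        # Render the statement in the plain-language form used in the minutes.
        return f"{self.part} of {self.neuron} {self.relation} {self.value}"


example = PropertyStatement(
    neuron="CA1 pyramidal cell",
    part="basal dendrites",
    relation="has terminus in",
    value="area X",
    reference="pub-id-placeholder",  # placeholder, not a real publication ID
)
print(example.sentence())
```

Keeping the reference and note on each statement, rather than on the neuron as a whole, matches the point that references and authorship need to be assigned to every property separately.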
A few points came out of this exercise:
o A dedicated interface is needed to facilitate entry and representation of this kind of structure.
o We need to assign references and authorship to each and every property separately.
o We will have to investigate OWL to see if it can accommodate this.

SLTF summary (Maryann):
Review of major topics covered by the group during their working sessions:
Discussed some use case scenarios of who might want to use this, including people searching for information, people who want to annotate data, people who want to adhere to the structure label standards of this group, and computer scientists who might need services.
Reviewed FMA model (Onard): human anatomy with a top-down approach.
Reviewed BAMS model (Mihail): rodent, deals with a lot of different nomenclatures and defines spatial relations between them. Based on Larry Swanson’s view of a brain hierarchy (some of the other anatomists don’t necessarily agree with it).
This led to a lengthy discussion and some disagreement about the best way to create a consensus view, or whether it’s even possible.
They decided there needs to be disambiguation information, e.g. tell people that “hippocampus” is used x different ways and give recommendations for how to use it.
Decided there can still be a “Rosetta Stone” for anatomy, with structures most people can agree on, and include relations to other hierarchies/nomenclatures.
o Worked on list: David VE: cerebral cortex; Laszlo: brainstem and basal forebrain
o BrainInfo is a disambiguation resource; it shows how many times structures come up in the different hierarchies, but has no computable structures.
o BAMS has a lot of work already, especially at the level of nomenclatures
o FMA has a model that can handle some of the representations we need
o Use these resources together to tackle this issue

Additional SLTF Summary (David VE):
He’s been working with Stephen and Rembrandt off-line, with the goal of creating a mapping of parcellation schemes into NeuroLex with as much description as possible, along with linkage from NeuroLex to the extended Scalable Brain Atlas and other resources.
Example: The description of Area 25 in NeuroLex at http://neurolex.org/wiki/Category:Ongur,_Price,_and_Ferry_(2003)_area_25 could include a link that goes directly to SumsDB, which can launch WebCaret, showing the area in an average MRI volume (surface or volume) in both human and monkey.
Secondly, they’d like to get some of the SumsDB parcellation schemes (human, macaque, and possibly rat and mouse) into SBA.
They may also be able to map mouse to WHS and interact with the atlasing group. This was followed by a general discussion of how to interact with the atlasing group. In summary, the needs of PONS must be prioritized for the atlasing group (discussion to follow this meeting).

Planning:
Kobe Demonstration: Mammalian hippocampus representation.
Using statements of cellular connectivity, show that we can go to regional connectivity and vice versa.
Need formalization of the structures (Menno), cells (Giorgio), and logical formalizations (RD TF).
Need to follow up with RD TF on other pieces that are needed to reach this goal.
Do this first for a set number of brain structures and cells, then we can write best practices for this process (start after Kobe, although we may be able to start parts).
Workplan:
Create specific use cases
Create detailed representation of hippocampus structures and cells (Menno and Giorgio)
Logical formalizations set up by RD TF
Put it into a formal model (OWL)
Implement in FMA
Show cross scale reasoning
Show cross species query
Show how INCF can recommend a clearer nomenclature
See what can go into NeuroLex and how
Identify what still needs to be required and what needs to be displayed
Demo 2: Registering a new nomenclature to NeuroLex.

Action Items:
NRTF Action items:
For Kobe demonstration:
o Create a set of entity lists that NRTF needs to complete this demo
o Contribute definitions that the group already has in different resources
o Create specific queries that might be used for this demonstration; also generate additional queries in areas outside of hippocampus to ensure what we build isn’t too focused on one area
Have electrophysiologists review our interfaces (NeuroLex, Giorgio’s, and Paul’s) to see if it is possible for them to describe electrophysiology characteristics.
Examine OWL for limitations
Review relevant ontologies in a systematic manner
Review current terms in Giorgio’s and NeuroLex interface to see how an ontologist would structure them.
Examine property list and determine how it can be divided into properties and values and how it relates to ontologies
Look at definitions in SenseLab (Giorgio) and put them into GO
Compile existing and missing textual definitions of locations for the entorhinal-hippocampal complex (notably, layers) from Menno’s interface and determine how to sync with NeuroLex in the future (Menno, send to Jyl & Giorgio for distribution to NeuroLex and other PONS)
Convergence on a set of core properties that are likely to be commonly considered “necessary and sufficient” to describe a cell, e.g.
a few use cases suggested 3-4 descriptors would do, likely including location and/or shape of some combination of soma, dendrites, and axon, as well as the neurotransmitter and possibly the cell marker; firing pattern was good to have but did not necessarily seem to be a defining characteristic for these use cases.
Annotation of neurons:
o Menno: an entorhinal and a presubicular neuron
o Sean: S1 (non-barrel) Martinotti cell and S1 (non-barrel) nested basket cell
o Gordon: two (potentially more) undecided cells
o Complete annotation of 8 more cells from remaining NRTF members (NRTF members) [mid-April]
o People must also define properties if they don’t already exist
Ensure the properties we define and create are handed over to PATO
Work with the cell graph and convert into OWL (David OS)
Complete the Excel template with relations, values, property structure, and one plausible-looking example and circulate to Jyl, Menno, Sean, Gordon, and David OS (Giorgio)
Map relations and values to existing unique identifiers in NeuroLex or existing ontologies, and note the missing ones (David OS) [mid-February]
Write up NRTF operating principles we agreed on and circulate to TF (Giorgio)
Implement a working interface for use by task force members and test it with the 14 examples above (Giorgio) [July]
Ensure interoperability of wiki, spreadsheet, and curator interface with model formalism, access to current ontology terms, definitions, and synonyms (Giorgio, Stephen, RDTF) [July]

SLTF Action Items:
For Kobe demonstration:
o Create a set of entity lists that SLTF needs to complete this demo
o Contribute definitions that the group already has in different resources
o Create specific queries that might be used for this demonstration; also generate additional queries in areas outside of hippocampus to ensure what we build isn’t too focused on one area
List of consensus structures for mammalian brains (Laszlo, Maryann and Doug)
o March 31st: Review Doug’s Rosetta Stone structures and proposed
hierarchy
o Identify which ones are consensus between rodent and primate
o Add the appropriate structure to NeuroLex
WHS structures delineated with consensus structures (Laszlo)
o Protocol one day meeting ~April (Laszlo, Doug, Maryann, Seth, Jyl)
o Subset of x number of structures ready by Kobe
o Finish structure delineations within a year
Add all rodent brain structures in BAMS to NeuroLex Wiki (Stephen and Mihail)
For Allen Atlas Brain Structures, link to the Scalable Brain Atlas structures (Rembrandt and Stephen)
Add each parcellation scheme to NeuroLex (can some major ones be added by Kobe?) (Stephen, Alan, Maryann, Doug, David VE, Jyl)
o Determine naming protocol (Alan)
o Properties of each parcellation need to be defined (e.g. defining criteria) (perhaps Mihail, Doug, David VE, Maryann)
o Ensure synonyms and equivalencies are recognized (BAMS and BrainInfo have much of this information)
Tie different nomenclature parcellations to the NeuroLex consensus structures and preferably to an atlas. Get some started before Kobe, a year and beyond (BAMS, BrainInfo, David VE, Menno)
Recommend best practices for new brain parcellations (begin after Kobe)
o Protocol
o Best practices
Define all properties associated with Brain Regions in NeuroLex
o Ensure that connectivity property can be expressed in terms of neuron to neuron connectivity property from NRTF
Create OWL representation of BAMS (Alan and Mihail)
Investigate Scalable Brain Atlas (SBA) copyright issues with Paxinos (INCF, Gully)
Investigate legal issues of getting Swanson into SBA (Mihail)
SumsDB atlases/parcellations (human, macaque, possibly rat and mouse) into SBA (David VE, Rembrandt, Stephen)
SumsDB mouse atlas mapped to WHS (David VE, atlasing)

Issues for RDTF:
Help determine the workflow and coordination among resources: What should go where?
Content will be exposed through the NeuroLex Wiki, but we have multiple resources with representations that should be populated.
o NeuroLex
o BAMS
o FMA
o BrainInfo
o etc.
Who will define the higher level “data types”, e.g., populations of cells, that will be necessary for some of the cross scale anatomy?
Cell component from NIF is being submitted to GO (Chris Mungall is submitting them and the NIF ID is becoming a secondary ID). NIF has to maintain a mapping.
o Recommendation: GO should maintain them as secondary IDs. A process needs to be put into place to handle this.
We will need multiple models from RDTF for how to formalize:
o brain region (above)
o a cell
o a population of cells
o OWL file is standard, but set workflow for syncing to other infrastructure
o also develop recommendations for how others might extend this
Outline set of properties so that inferences can be made across scales
o e.g. brain region X projects to brain region Y if a principal neuron with soma or dendrites in brain region X contacts a cell with part of the cell in brain region Y
Put the example cell created by Sean and Giorgio into OWL
Split to work more with the other TFs, with occasional sessions within the group
Determine if meeting in Kobe for review of best practices for hierarchies

Infrastructure:
Create nomenclature pages
Create disambiguation pages (so people can see synonyms while entering), e.g. NeuroLex wiki: high priority, but easy to do via curation (do you mean this, this, this, or this)
Define wiki and OBO Foundry ontology process for interaction and what needs to be embedded in NeuroLex. How to define things in terms of the other ontologies and to import mechanisms and just the pieces we need. Define some processes and tools. (Alan, David OS, Maryann, Stephen), first pass cell component with GO, should be a process by Kobe (and hopefully some level of automation).
Alan and David OS will start discussing this first, soon.
Determine Scalable Brain Atlas (SBA) issues with Paxinos copyright (Rolf, Rembrandt, INCF and Gully).
NeuroLex images of structures need to be linked to the images
o mouse: ABA images via SBA
o macaque and human: via WebCaret and SBA (probably faster)
o rat and mouse hippocampus: Menno
o look into Google 3D plug-in and 3D PDF (Stephen)
Recommend best practices for a common model to be used for the major atlas hierarchies (how to join together anatomical parts; this is how it should be expressed, OWL or OBO and use certain sets of relations, etc.). May not have this in place by Kobe, but could have a start and then work while there.
Services for accessing ontology
3 months to start pushing data between NeuroLex and OWL (BAMS and FMA)
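
The cross-scale inference example listed under the RDTF issues (brain region X projects to brain region Y if a principal neuron with soma or dendrites in X contacts a cell with a part in Y) can be illustrated with a small sketch. Everything here is hypothetical: the `Cell` class, the `infer_region_projections` function, the part names, and the toy cells are assumptions for illustration, and a real implementation would presumably express the rule in OWL rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """A cell with the brain region that contains each of its parts."""
    name: str
    is_principal: bool = False
    # Maps a cell part ("soma", "dendrites", "axon", ...) to a region name.
    part_regions: dict = field(default_factory=dict)


def infer_region_projections(cells, contacts):
    """Derive region-level projections from cell-level contacts.

    contacts is an iterable of (source_cell_name, target_cell_name) pairs.
    Region X projects to region Y if a principal cell with soma or dendrites
    in X contacts a cell that has any part in Y.
    """
    by_name = {cell.name: cell for cell in cells}
    projections = set()
    for source_name, target_name in contacts:
        source, target = by_name[source_name], by_name[target_name]
        if not source.is_principal:
            continue  # the rule only applies to principal neurons
        source_regions = {
            source.part_regions[part]
            for part in ("soma", "dendrites")
            if part in source.part_regions
        }
        target_regions = set(target.part_regions.values())
        for x in source_regions:
            for y in target_regions:
                projections.add((x, y))
    return projections


# Toy data: cell names and regions are placeholders, not real anatomy.
cells = [
    Cell("cell A", is_principal=True,
         part_regions={"soma": "region X", "dendrites": "region X"}),
    Cell("cell B", part_regions={"dendrites": "region Y"}),
    Cell("cell C", is_principal=False, part_regions={"soma": "region Y"}),
]
contacts = [("cell A", "cell B"), ("cell C", "cell A")]
projections = infer_region_projections(cells, contacts)
print(projections)
```

In this toy data only the contact from the principal cell A produces a regional projection; the contact from the non-principal cell C is ignored. The sketch does not exclude within-region projections, since the rule as stated in the minutes does not address them.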