Potential issues for TNRS development
Jan. 23, 2014
At meeting: Martha, Naim, Nicole, Brian

1. Nomenclatural synonyms erroneously marked as "Accepted"
The problem:
o Tropicos API incorrectly marks many pairs of resolvable synonyms as both "Accepted", or one "Accepted" and the other "No opinion", or both "No opinion"
o These names can be resolved manually using the Tropicos web interface, and therefore should be resolvable automatically
o Consequences are severe: many spurious species in BIEN (hundreds or thousands?)
Possible solutions:
o Find support ($$$) for Tropicos to fix their algorithm
o Fix Tropicos algorithm ourselves (programming time), with their permission
o Get full information from API and build our own Computed Acceptance algorithm
o Use The Plant List
Decisions:
o Go after TPL (see next item for details)
o Send strongly worded/polite/official letter to Tropicos requesting they fix their algorithm (Naim?)

2. Missing Old World names and synonymy
The problem:
o Current taxonomic sources do not include most Old World plant names
o As a result, many names are not resolved, or are matched to the (incorrect) name of an unrelated New World taxon
Possible solutions:
o Add The Plant List
o Add multiple sources of more local taxonomy (e.g., Flora Melanesia)
Decisions:
o Research options for accessing the data immediately ourselves (Brad)
   o Download directly from website
   o Rod Page's methods
o Contact TPL officially to request permission to use the entire content
   o Check with Barbara Thiers (Martha)
   o Write official request (Naim)

3. No fully-functional API
The problem:
o Missing critical options of the web interface, in particular the ability to specify taxonomic source
o These shortcomings may make the API unusable for many purposes
o Lack of a usable API makes TNRS useless to many potential external and internal (iPlant) users
Possible solutions:
o Redevelop the API from scratch
o Expose existing internal functions, a la Aaron ("quick and dirty")
Decisions:
o Apply for additional funding
o If we get funding, rebuild from scratch
o If no funding, "quick and dirty" solution

4. Challenges with adding new taxonomic sources
The problem:
o Currently only Brad can add new sources
o No detailed documentation
o In general, ingest is automatic once source data has been formatted to one of four schemas (easiest is Simplified Darwin Core). Most validations are already performed automatically as part of the TNRS DB pipeline. However, in the past, some critical validations have been performed ad hoc by Brad
o In addition, many users would like to use the TNRS with external or private sources. We currently do not support this. Also, we cannot reasonably add every source requested by users
Possible solutions:
o Script validations and add to TNRS pipeline
o Write documentation for adding new sources
o Develop ability to access external sources
Decisions:
o Add outstanding validations to pipeline (Brad)
o Write documentation (Brad)
o Seek funding to support development of ability to use external sources of taxonomy (Naim, with input from Brad)

5. Weakness in algorithm that selects single best name
The problem:
o Algorithm at end of TNRS pipeline that selects "Best match" sometimes chooses arbitrary or obviously incorrect names
o Especially weak when using multiple sources
o The problem will become increasingly serious as we add more taxonomic sources
Possible solutions:
o Make improvements to "Best match" algorithm (iPlant programmer with input from Brad)
o Improvements to TNRS database may also be needed (Brad)
Decisions:
o Will need extra funding
o Include in grant (Naim, with help from Brad)

6. Extend TNRS to other codes
The problem:
o Currently TNRS only handles plants
o Many requests to make TNRS work for other codes as well
o The changes needed involve both the database and the name resolution code
o Moderately challenging; not super complex, but will definitely need dedicated funding
Decisions:
o Animals a priority
o Naim will include in grant
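
Note on item 5: the notes do not describe how "Best match" is currently chosen, so the following is only a minimal sketch of one way the choice could be made non-arbitrary when several sources return candidates. Everything in it is an assumption for illustration: the Candidate fields, the STATUS_RANK ordering, the source_priority list, and the placeholder names and source labels. It is not the existing TNRS algorithm.

```python
from dataclasses import dataclass

# Hypothetical preference order for taxonomic status (lower is preferred).
# This ranking is an assumption, not the TNRS's actual rule.
STATUS_RANK = {"Accepted": 0, "Synonym": 1, "No opinion": 2}


@dataclass(frozen=True)
class Candidate:
    name: str      # matched scientific name
    source: str    # taxonomic source that supplied the name
    score: float   # match score, higher is better
    status: str    # taxonomic status reported by the source


def best_match(candidates, source_priority):
    """Pick one candidate using explicit, ordered tie-breakers.

    Order of comparison: highest score, then preferred status, then
    position of the source in source_priority, then name (so the result
    never depends on the order of the input list).
    """
    if not candidates:
        return None
    src_rank = {src: i for i, src in enumerate(source_priority)}

    def sort_key(c):
        return (
            -c.score,
            STATUS_RANK.get(c.status, len(STATUS_RANK)),  # unknown status last
            src_rank.get(c.source, len(src_rank)),        # unlisted source last
            c.name,
        )

    return min(candidates, key=sort_key)


# Placeholder data: "Aus bus" / "Aus cus" and sourceA / sourceB are made up.
candidates = [
    Candidate("Aus bus", "sourceB", 0.95, "Synonym"),
    Candidate("Aus bus", "sourceA", 0.95, "Accepted"),
    Candidate("Aus cus", "sourceA", 0.80, "Accepted"),
]
chosen = best_match(candidates, ["sourceA", "sourceB"])
print(chosen.name, chosen.source, chosen.status)
```

Because every tie-breaker is explicit, adding more sources changes the outcome only through the score, the status ranking, or the stated source priority, never through the order in which candidates happen to arrive.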