Potential issues for TNRS development
Jan. 23, 2014
At meeting: Martha, Naim, Nicole, Brian
1. Nomenclatural synonyms erroneously marked as "Accepted"
 The problem:
o Tropicos API incorrectly flags many resolvable synonym pairs: both names
marked "Accepted", one "Accepted" and the other "No opinion", or both
"No opinion"
o These names can be resolved manually using the Tropicos web interface,
and therefore should be resolvable automatically
o Consequences are severe: many spurious species in BIEN (hundreds
or thousands?)
 Possible solutions:
 Find support ($$$) for Tropicos to fix their algorithm
 Fix the Tropicos algorithm ourselves (programming time), with their
permission
 Get full information from the API and build our own Computed
Acceptance algorithm
 Use The Plant List
 Decisions:
 Go after TPL (see next item for details)
 Send strongly worded/polite/official letter to Tropicos requesting
they fix their algorithm (Naim?)
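
A minimal sketch of what an in-house Computed Acceptance step might look like, should we go the route of pulling full information from the API. Everything here is an assumption for illustration: the Opinion record shape, the majority-then-most-recent rule, and the placeholder names are hypothetical and do not describe the Tropicos algorithm or its API output.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One published opinion about a name (hypothetical record shape)."""
    name: str           # the name being judged
    accepted_name: str  # the name this opinion treats as accepted
    year: int           # publication year of the opinion


def computed_acceptance(name, opinions):
    """Return (status, accepted_name) for `name`.

    Assumed rule: the accepted name backed by the most opinions wins;
    ties go to the name with the most recent supporting opinion.
    A name with no opinions at all is reported as "No opinion".
    """
    relevant = [o for o in opinions if o.name == name]
    if not relevant:
        return ("No opinion", None)
    votes = Counter(o.accepted_name for o in relevant)
    latest = {}
    for o in relevant:
        latest[o.accepted_name] = max(latest.get(o.accepted_name, 0), o.year)
    winner = max(votes, key=lambda n: (votes[n], latest[n]))
    status = "Accepted" if winner == name else "Synonym"
    return (status, winner)


# Placeholder names, not real taxa.
opinions = [
    Opinion("Aus bus", "Aus cus", 2001),
    Opinion("Aus bus", "Aus cus", 2005),
    Opinion("Aus bus", "Aus bus", 1990),
    Opinion("Aus cus", "Aus cus", 2005),
]
print(computed_acceptance("Aus bus", opinions))  # ('Synonym', 'Aus cus')
print(computed_acceptance("Aus cus", opinions))  # ('Accepted', 'Aus cus')
```

The point of the sketch is that a pair of names always comes out as one "Accepted" and one "Synonym" whenever any opinion links them, which is the behavior we are not getting from the API today.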
2. Missing Old World names and synonymy
 The problem:
o Current taxonomic sources do not include most Old World plant
names
o As a result, many names are not resolved, or are matched to the
(incorrect) name of an unrelated New World taxon
 Possible solutions:
o Add The Plant List
o Add multiple sources of more local taxonomy (e.g., Flora Melanesia)
 Decisions:
o Research options for accessing the data immediately ourselves (Brad)
 Download directly from website
 Rod Page's methods
o Contact TPL officially to request permission to use the entire content
 Check with Barbara Thiers (Martha)
 Write official request (Naim)
3. No fully-functional API
 The problem:
o Missing critical options of web interface, in particular ability to specify
taxonomic source
o These shortcomings may make the API unusable for many purposes
o Lack of a usable API makes TNRS useless to many potential external and
internal (iPlant) users
 Possible solutions:
o Redevelop the API from scratch
o Expose existing internal functions, a la Aaron ("quick and dirty")
 Decisions:
o Apply for additional funding
o If we get funding, rebuild from scratch
o If no funding, "quick and dirty" solution
4. Challenges with adding new taxonomic sources
 The problem:
o Currently only Brad can add new sources
o No detailed documentation
o In general, ingest is automatic once source data has been formatted to
one of four schemas (easiest is Simplified Darwin Core). Most
validations are already performed automatically as part of the TNRS
DB pipeline. However, in the past, some critical validations have been
performed ad hoc by Brad
o In addition, many users would like to use the TNRS with external or
private sources. We currently do not support this. Also, we cannot
reasonably add every source requested by users
 Possible solutions:
o Script validations and add to TNRS pipeline
o Write documentation for adding new sources
o Develop ability to access external sources
 Decisions:
o Add outstanding validations to pipeline (Brad)
o Write documentation (Brad)
o Seek funding to support development of ability to use external
sources of taxonomy (Naim, with input from Brad)
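
As an illustration of what "script validations and add to pipeline" could mean in practice, here is a minimal sketch of a pre-ingest check on a source file. The column names (taxonID, scientificName, acceptedNameUsageID) are standard Darwin Core terms, but whether the TNRS Simplified Darwin Core schema uses exactly these columns is an assumption, and the three checks shown are hypothetical examples, not the validations Brad has actually been running.

```python
import csv
import io


def validate_source(rows):
    """Return a list of human-readable problems found in `rows`.

    `rows` is a list of dicts, one per name record. Checks (all assumed):
    taxonID present and unique, scientificName non-empty, and every
    acceptedNameUsageID pointing at a taxonID that exists in the file.
    """
    problems = []
    known = {r.get("taxonID", "") for r in rows}
    seen = set()
    for i, row in enumerate(rows, start=1):
        tid = row.get("taxonID", "")
        if not tid:
            problems.append(f"row {i}: empty taxonID")
        elif tid in seen:
            problems.append(f"row {i}: duplicate taxonID {tid}")
        seen.add(tid)
        if not row.get("scientificName", "").strip():
            problems.append(f"row {i}: empty scientificName")
        target = row.get("acceptedNameUsageID", "")
        if target and target not in known:
            problems.append(f"row {i}: acceptedNameUsageID {target} not found")
    return problems


# Tiny made-up source file with placeholder names; row 3 is deliberately bad.
SAMPLE = (
    "taxonID,scientificName,acceptedNameUsageID\n"
    "1,Aus bus,\n"
    "2,Aus cus,1\n"
    "2,,9\n"
)
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for problem in validate_source(rows):
    print(problem)
```

Returning a list of problems rather than stopping at the first one lets whoever is preparing a source fix everything in one pass, which matters if people other than Brad are to add sources.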
5. Weakness in algorithm that selects single best name
 The problem:
o Algorithm at end of TNRS pipeline that selects "Best match"
sometimes chooses arbitrary or obviously incorrect names
o Especially weak when using multiple sources
o The problem will become increasingly serious as we add more
taxonomic sources
 Possible solutions:
o Make improvements to "Best match" algorithm (iPlant programmer
with input from Brad)
o Improvements to TNRS database may also be needed (Brad)
 Decisions:
o Will need extra funding
o Include in grant (Naim, with help from Brad)
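
One way to remove the "arbitrary" choices is to make every tie-break explicit. The sketch below is an assumption-laden illustration, not the current TNRS algorithm: the Candidate fields, the ordering of criteria (match score, then accepted status, then a user-supplied source priority, then name), and the source labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate match for a submitted name (hypothetical shape)."""
    name: str
    source: str      # label of the taxonomic source that supplied the name
    score: float     # match score in [0, 1], higher is better
    accepted: bool   # True if the source treats the name as accepted


def best_match(candidates, source_priority):
    """Pick a single candidate using explicit, deterministic tie-breaking.

    Order of criteria (assumed): highest score, accepted before
    non-accepted, earlier position in `source_priority`, then name.
    Sources missing from `source_priority` rank last.
    """
    if not candidates:
        return None
    rank = {s: i for i, s in enumerate(source_priority)}

    def key(c):
        return (-c.score, not c.accepted, rank.get(c.source, len(rank)), c.name)

    return min(candidates, key=key)


# Placeholder names and source labels.
candidates = [
    Candidate("Aus bus", "src_b", 0.95, False),
    Candidate("Aus bus", "src_a", 0.95, True),
    Candidate("Aus cus", "src_a", 0.80, True),
]
print(best_match(candidates, ["src_a", "src_b"]))
```

Because the sort key is total, the same input always yields the same best match regardless of the order in which sources return their candidates, which should matter more as sources are added.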
6. Extend TNRS to other codes
 The problem:
o Currently TNRS only handles plants
o Many requests to make TNRS work for other codes as well
o The changes needed involve both the database and name resolution
code
o Moderately challenging: not highly complex, but will definitely need
dedicated funding
 Decisions:
o Animals a priority
o Naim will include in grant