Questions about LOINCs for gene sequences from Jerry Sable
Discussion points (from C McDonald)



Tests for human mutations of a specific gene or genes are usually reported by exception: in other words, by noting what is different from “normal”, and not by reporting the whole sequence.
o This variation can be (and commonly is) reported in HGVS format, a standard syntax for describing mutations (http://www.hgvs.org/dblist/dblist.html). Ideally, such reports would also include a LOINC code for the reference sequence (whose identifier comes from the NCBI code system) so the position of the variation is crystal clear.
o The variation can also be reported using NCBI’s dbSNP database ID (http://www.ncbi.nlm.nih.gov/SNP/). dbSNP assigns a unique ID to each short sequence variation that occurs frequently enough in a population to be termed polymorphic. (To emphasize the comprehensive nature of dbSNP’s content, the full name of the database was changed from “database of Single Nucleotide Polymorphism” to the more inclusive “database of Short Genetic Variation” in July of 2011: http://www.ncbi.nlm.nih.gov/books/NBK174586/)
o Because HL7 v2 allows two coding systems to be reported in one OBX-5, both the HGVS notation and the dbSNP ID could be reported in one OBX. (See the HL7 LOINC-based specification for reporting genetic sequences: HL7 Version 2 Implementation Guide: Clinical Genomics; Fully LOINC-Qualified Genetic Variation Model.)
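As a rough illustration of the last point above, the following Python sketch assembles an OBX segment whose OBX-5 carries the HGVS expression as the primary code and the dbSNP ID as the alternate code. The helper names (cwe, obx_segment), the placeholder LOINC code in OBX-3, the coding-system labels "HGVS.c" and "dbSNP", and the example variant are all assumptions made for illustration, not values taken from the implementation guide; they should be checked against the guide before any real use.

```python
def cwe(identifier, text, system, alt_identifier="", alt_text="", alt_system=""):
    """Join the six components of an HL7 v2 coded element with '^'.

    Components 1-3 hold the primary code; components 4-6 hold the
    alternate code from a second coding system.
    """
    parts = [identifier, text, system, alt_identifier, alt_text, alt_system]
    while parts and parts[-1] == "":  # drop trailing empty components
        parts.pop()
    return "^".join(parts)


def obx_segment(set_id, value_type, obx3, obx5, status="F"):
    """Build a minimal OBX segment: OBX-1, -2, -3, -5 and -11 are filled."""
    fields = ["OBX", str(set_id), value_type, obx3, "", obx5,
              "", "", "", "", "", status]
    return "|".join(fields)


# Illustrative values only (assumed, not taken from the source document).
value = cwe("NM_000492.3:c.1521_1523delCTT", "", "HGVS.c",
            "rs113993960", "", "dbSNP")
# "XXXXX-X" is a placeholder, not a real LOINC code.
segment = obx_segment(1, "CWE", "XXXXX-X^Variant (placeholder)^LN", value)
print(segment)
```

The value type is written as CWE here; older v2 versions such as the 2.3 messages shown later in this document would use CE for the same two-code structure.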
A different nomenclature exists for deletions/duplications, and such abnormalities should be reported in ISCN (Shaffer LG, McGowan-Jordan J, Schmid M, Editors. ISCN 2013: An International System for Human Cytogenetic Nomenclature. Karger:2013. ISBN-13: 978-3318022537), which is a syntax for reporting every kind of abnormality that can be observed in any kind of cytogenetic study, from simple Q-bands to FISH and ArrCGH. It reports simple chromosome findings in the familiar way, e.g. XY, XX, or XXY, but can go much further and provide detail at every level of cytopathology precision. If every cytopathology test result included one structured variable (e.g. LOINC # 62356-1, Chromosome analysis result in ISCN expression….) that described the findings in ISCN syntax in addition to what is currently reported, receivers would have a fully computable result. If needed, LOINC could create additional LOINC codes that deliver ISCN nomenclature but are more specific to the tissue or purpose of the exam. For details on an HL7 LOINC-based specification for reporting everything in a cytopathology report, see the HL7 Version 2 Implementation Guide: Clinical Genomics; Fully LOINC-Qualified Cytogenetics model.
Mutation analyses for a given gene can be done in multiple ways.
o Historically (and still today), a set of probes for the variations known to be clinically important is run against the sample. A report would list the mutations found (in HGVS format and/or as dbSNP codes) but, ideally, would also report the mutations tested for. (Well over 1000 variations are known to cause cystic fibrosis, but only 30-80 are common enough to be tested. In LOINC such tests would be Prid tests. A parallel LOINC code for the same gene, but with the name “mutations tested for”, would ideally be reported.)
o It can be done by sequencing. The reporting of the variations found would be the same as for the probe case, but in contrast to probe testing, one could discover variations of unknown significance. The field is still working out how and whether to report these. The HL7-approved LOINC panel(s) for human sequencing provide LOINC terms (and appropriate answer lists) for describing the importance of such variations. Mutation analysis by sequencing should also describe the parts of the gene that were sequenced. This is usually described in text narrative.
o It can be done by other less common methods, which we won’t go into here.
Mutation analysis of infectious organisms
o NCBI’s dbSNP accepts reports of variations from any organism, and it now carries genetic sequences for over 800 different organisms,
including multiple independent submissions, frequency data, genotype data, and allele observations. NCBI has specialized databases of
genetic sequences for Influenza virus, West Nile Virus, and Dengue virus with plans for Rotavirus.
We still have to research what CDC has. They do have special databases for Salmonella and related organisms, but until recently those used an old technology; I believe they are moving to pure sequencing now.
A lot is known about variation among HIV, Hep C, and Hep B (at least), but each virus tends to use a different nomenclature for variation. Some have suggested using the HGVS system, which sounds like a good suggestion for reporting individual mutations in viruses and bacteria, though we would also want a way to report a reference sequence for each such organism.
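To make the idea of reporting by exception against a stated reference sequence concrete, here is a minimal Python sketch that compares a sample to a reference and emits HGVS-style substitution strings. It is only an illustration under strong assumptions: the two sequences are already aligned and of equal length, only single-base substitutions are handled (no insertions, deletions, or ambiguity codes), and the function name and the accession "REF.1" are hypothetical.

```python
def substitutions(reference, sample, accession, prefix="g"):
    """List single-base differences as '<accession>:<prefix>.<pos><ref>><obs>'.

    Positions are 1-based and counted along the reference sequence.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles pre-aligned, equal-length sequences")
    found = []
    pairs = zip(reference.upper(), sample.upper())
    for pos, (ref_base, obs_base) in enumerate(pairs, start=1):
        if ref_base != obs_base:
            found.append(f"{accession}:{prefix}.{pos}{ref_base}>{obs_base}")
    return found


# Toy sequences; "REF.1" stands in for a real reference sequence identifier.
print(substitutions("ACGTACGT", "ACGTTCGA", "REF.1"))
# prints ['REF.1:g.5A>T', 'REF.1:g.8T>A']
```

Because each string names the reference, a receiver can locate the variation without being sent the whole sequence.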
LOINC does have a mechanism for reporting actual sequences, as illustrated below. Seq as the property defines a LOINC term that carries a sequence. We have used Nom as the scale because these are long strings of letters (text, if you will). Public Health needed these for the literal sequence of viruses that had not yet been given a unique absolute ID, but the same approach would work for other uses, including those listed below.
Figure 1
From Jerry Sable
Hi Dan, I've attached a PDF with questions about LOINCs for gene sequences. This has come up in the context of ELR 2.5.1 messaging. The PDF includes
examples from messages. Cindy Johns has reviewed them already. Can you put this topic on the agenda for the next LOINC meeting? Let me know if you
have any questions.
Thanks,
A LabCorp example message: This message was discussed on an ELR call.
MSH|^~\&|LABCORP-CORP|LABCORP^34D0655205^CLIA|CA|CA-DOH|20120806152852||ORU^R01|2012080615285210001|P|2.3
PID||415183^^^^^TWIN CITIES COMMUNITY HOSPITAL&04238340|15309706470^^^^^CMBP Infectious Disease Dept.|342735|TEST^PATIENT^A||190101010000|M||U||||||||||||||||||||
OBR|1||15309706470|551697^GenoSure (R) MG^L|||2012060100001322|||||||||WRIGHT|||||||||F
ZLR||TWIN CITIES COMMUNITY HOSPITAL, INTERFACE ACCOUNT|1100 LAS TABLAS RD^^TEMPLETON^CA^93465|8054344501^^^
OBX|1|TX|33630-5^HIV protease gene mutations detected [Identifier]^LN^551657^HIV GenoSure(R)^L||
CCTCARATCACTCTTTGGCAACGACCCCTAGTCACAATAAARATAGGGGGGCAACTAAGGGAAGCTCTATTAGATACAGGRGCAGATGATACAGTATTAGAAGAAATAAA
TTTGCCAGGRAGATGGAAACCRAAAATGATAGGGGGAATTGGAGGTTTTATCAAAGTAAGACAGTATGATCAGRTAYTAGTAGAAATYTGYGGACAWARAGCCATAGGT
ACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGAAATCTGTTGACTCAGCTTGGCTGYACTTTAAATTTT||||||F|||2012060100001322
The LOINC used in OBX-3 is:
loinc_num | component | property | time_aspct | system | scale_typ | method_typ | class
33630-5 | HIV protease gene mutations detected [Identifier] | Prid | Pt | Isolate | Nom | (blank) | ABXBACT
Question: Is Prid/Nom the best choice for this kind of result?
CM: No. This term is intended to identify the specific mutations of importance via some coding system (which could be local), not the whole sequence. We would need a LOINC term with the same component but with a property of Seq.
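A receiver could flag this kind of mismatch automatically. The Python sketch below is a heuristic, not part of any specification: it tests whether an OBX-5 value consists only of IUPAC nucleotide codes (the example message uses ambiguity codes such as R, Y and W) and is long enough to be a literal sequence rather than a mutation identifier. The function names and the 30-character threshold are assumptions chosen for illustration.

```python
# IUPAC nucleotide codes, including ambiguity codes such as R, Y and W.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")
UNAMBIGUOUS = set("ACGTU")


def _normalize(value):
    """Remove whitespace (e.g. line wraps) and upper-case the value."""
    return "".join(value.split()).upper()


def looks_like_sequence(value, min_length=30):
    """Heuristic: True if the value is a long run of IUPAC nucleotide codes."""
    text = _normalize(value)
    return len(text) >= min_length and all(ch in IUPAC_NUCLEOTIDES for ch in text)


def count_ambiguous(value):
    """Count positions reported with an ambiguity code rather than A/C/G/T/U."""
    text = _normalize(value)
    return sum(1 for ch in text
               if ch in IUPAC_NUCLEOTIDES and ch not in UNAMBIGUOUS)


# First 47 characters of the OBX-5 value in the message above.
excerpt = "CCTCARATCACTCTTTGGCAACGACCCCTAGTCACAATAAARATAGG"
print(looks_like_sequence(excerpt))   # prints True
print(count_ambiguous(excerpt))       # prints 2
print(looks_like_sequence("M184V"))   # prints False
```

A value that passes this check while OBX-3 names a Prid term is a candidate for the Seq-property term discussed here.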
Another LabCorp example message
MSH|^~\&|LABCORP-CORP|LABCORP^34D0655205^CLIA|CA|CA-DOH|20120806152852||ORU^R01|2012080615285210002|P|2.3
PID||415183^^^^^TWIN CITIES COMMUNITY HOSPITAL&04238340|15309706470^^^^^CMBP Infectious Disease Dept.|342735|TEST^PATIENT^A||190101010000|M||U||||||||||||||||||||
OBR|1||15309706470|815959^HIV-1 Genotypic Drug Resist, Panel^L|||2012060100001322|||||||||WRIGHT|||||||||F
ZLR||TWIN CITIES COMMUNITY HOSPITAL, INTERFACE ACCOUNT|1100 LAS TABLAS RD^^TEMPLETON^CA^93465|8054344501^^^
CM: Same comment as above, this term would be used to report a discrete mutation. Need a different term with property of Seq and method of
Sequencing to report the whole sequence.
OBX|1|TX|30554-0^HIV reverse transcriptase gene mutations detected [Identifier]^LN^815962^Reverse Transcriptase Mutation^L|
|CCTATTAGTCCTATTGAAACTGTACCAGTAARATTAAARCCAGGAATGGATGGCCCAAAAGTTAAACAATGGCCATTGACAGAAGAAAAAATAAAAGCATTAGTAGAAAT
TTGYACAGAAATGGAAAAGGAAGGAAARATTTCAAAAATTGGGCCTGAAAATCCATACAATACTCCAGTATTTGCCATAAAGAAAAARGATAGTACTAAATGGAGRAART
TAGTAGATTTCAGAGAACTTAACAAGAGAACTCAAGACTTYTGGGAAGTTCAATTAGGAATACCACATCCTGCAGGRTTRAAAAAGAAAAAATCAGTAACAGTAGTGGAT
GTGGGTGATGCATATTTTTCAGTRCCCTTAGACAARGACTTCAGGAAATATACTGCATTTACCATACCTAGTATAAACAATGAGACACCAGGGATTAGATATCAGTACAATG
TGCTTCCACAGGGATGGAARGGATCACCAGCAATATTYCAAAGTAGCATGACAAAAATCTTAGAACCTTTTAGAAAACAAAATCCAGACATAGTTATCTACCAGTATATGG
ATGATTTGTATGTAGGWTCTGACTTAGAAATAGGGCAGCATAGARCAAAAATAGAGGAACTGAGAGAACATCTGTTGAGGTGGGGATTYACCACCCCAGACAAAAARCA
YCAGAAAGAACCYCCATTCCTTTGGATGGGTTATGAACTCCATCCTGATAAATGGACAGTACAGCCTATCATGCTRCCAGAAAAAGACAGCTGGACTGTCAATGACATACA
GAAGTTAGTRGGRAAATTRAATTGGGCAAGTCAAATYTATCCAGGGATTAARGTAAAGCAATTATGTAAACTCCTTAGAGGARCCAAAGCACTAACAGAAGTAGTACCACT
AACRGAAGAAGCAGAGCTAGAACTAGCAGAAAACAGGGARATTCTAARAGARCCAGTGCATGGAGTGTATTATGAYCCATCAAAAGACTTAATAGCAGAAATACAGAAG
CAGGGGYACGGCCAATGGACATATCARATTTATCAAGAGCCATTTAAAAATCTGAARACAGGRAAGTATGCAAGAATGAGGGGTGCCCACACTAATGATGTAAAACAATT
AACAGAGGCAGTGCAAAAAGTRGCCACAGAAAGCATAGTAATATGGGGAAAGACTCCCAAATTTAGATTACCCATACAAAAGGAAACATGGGAATCA||||||F|||2012
060100001322
LabCorp's LOINC for HIV sequencing:
loinc_num | component | property | time_aspct | system | scale_typ | method_typ
30554-0 | HIV reverse transcriptase gene mutations detected | Prid | Pt | Isolate | Nom | (blank)
The OBX-5 result above contains 1201 characters. Prid/Nom might not be the best choice for a result like this.
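A length like this can be checked mechanically. The Python sketch below is an illustration only: it extracts OBX-5 from a single OBX segment and reports its length, joining any line wraps first, since the messages in this document are wrapped for display. The function name is hypothetical, and the sketch assumes the default '|' field separator with no escaped separators inside the value.

```python
def obx5_length(segment):
    """Return the number of characters in OBX-5 of one OBX segment."""
    # Join display line wraps so a wrapped sequence is measured as one value.
    joined = segment.replace("\r", "").replace("\n", "")
    fields = joined.split("|")
    if fields[0] != "OBX" or len(fields) < 6:
        raise ValueError("not an OBX segment with an OBX-5 field")
    return len(fields[5])


# Shortened, wrapped toy segment modeled on the message above.
toy = ("OBX|1|TX|30554-0^HIV reverse transcriptase gene mutations detected "
       "[Identifier]^LN||CCTATTAG\nTCCTATTG||||||F")
print(obx5_length(toy))  # prints 16
```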
A possible new LOINC request for the test LabCorp is sending in this message:
component: HIV reverse transcriptase gene sequence
property: Seq
time_aspct: Pt
system: XXX
scale_typ: Nar
method_typ: Sequencing
class: ABXBACT
exmpl_answers: Listing of any mutations found
Comments:
o Component: Another choice for Component could be the existing one: "HIV reverse transcriptase gene mutations detected"
o Property: Seq is probably a better choice than Prid.
o Scale: Nar seems like a better choice than Nom for long sequences.
o Method: Sequencing is probably an improvement over methodless.

CM: Yes to all

Your feedback and questions are requested on this.
Some reference tables
LOINCs with Property = Seq
loinc_num | component | property | time_aspct | system | scale_typ | method_typ | class | status
48021-0 | Reverse primer | Seq | Pt | Bld/Tiss | Nom | Molgen | MOLPATH.MISC | ACTIVE
48020-2 | Forward primer | Seq | Pt | Bld/Tiss | Nom | Molgen | MOLPATH.MISC | ACTIVE
74311-2 | Porcine reproductive and respiratory syndrome virus ORF5 gene sequence | Seq | Pt | Isolate | Nom | Sequencing | MICRO | ACTIVE

comments:
48021-0: Example answer: tagttagtgtgaaagaagc. Definition: Carries the literal string that represents the reverse primer for the sequencing. Use the usual abbreviation for the four nucleotides.
48020-2: Example answer: cctcattagtcgtt. Definition: Carries the literal string that represents the forward primer used to perform one portion of the sequencing. The nucleotides are specified with the standard one letter codes.
74311-2: Sequencing of ORF5 gene in PRRSV and reporting the actual base sequence. The submitter's lab performs reverse transcriptase PCR on various specimens, including tissue, serum, oral fluid, and semen, to detect and isolate the ORF5 gene. PCR products are then used for sequencing.
LOINCs with Method = Sequencing
CM: Those that are not defined as “Seq” have answers that are either HGVS or some class of organism, not the actual sequence recorded in OBX-5.
loinc_num | component | property | time_aspct | system | scale_typ | method_typ | class
61101-2 | Influenza virus A neuraminidase RNA | Type | Pt | XXX | Nom | Sequencing | MICRO
67569-4 | Streptococcus pyogenes M protein (emm) gene | Type | Pt | Isolate | Nom | Sequencing | MICRO
74311-2 | Porcine reproductive and respiratory syndrome virus ORF5 gene sequence | Seq | Pt | Isolate | Nom | Sequencing | MICRO
71357-8 | PDGFRA gene exon 18 mutation analysis | Prid | Pt | Bld/Tiss | Nar | Sequencing | MOLPATH.MUT
72487-2 | TF gene full mutation analysis | Prid | Pt | Bld/Tiss | Nar | Sequencing | MOLPATH.MUT
69487-7 | TNFRSF13B gene full mutation analysis | Prid | Pt | Bld/Tiss | Nar | Sequencing | MOLPATH.MUT
73735-3 | ACADVL gene full mutation analysis | Prid | Pt | Bld/Tiss | Nom | Sequencing | MOLPATH.MUT
49122-5 | Anaplasma sp identified | Prid | Pt | XXX | Nom | Sequencing | MICRO
49123-3 | Bartonella sp identified | Prid | Pt | XXX | Nom | Sequencing | MICRO
49124-1 | Coxiella burnetii identified | Prid | Pt | XXX | Nom | Sequencing | MICRO
70907-1 | Cryptococcus sp rDNA | Prid | Pt | XXX | Nom | Sequencing | MICRO
49125-8 | Ehrlichia sp identified | Prid | Pt | XXX | Nom | Sequencing | MICRO
70294-4 | Entamoeba sp DNA | Prid | Pt | XXX | Nom | Sequencing | MICRO
39025-2 | Influenza virus A hemagglutinin cDNA | Prid | Pt | XXX | Nom | Sequencing | MICRO
49126-6 | Orientia tsutsugamushi identified | Prid | Pt | XXX | Nom | Sequencing | MICRO
74310-4 | Porcine reproductive and respiratory syndrome virus strain identified | Prid | Pt | Isolate | Nom | Sequencing | MICRO
70867-7 | Rabies virus strain identified | Prid | Pt | XXX | Nom | Sequencing | MICRO
49127-4 | Rickettsia sp identified | Prid | Pt | XXX | Nom | Sequencing | MICRO
49128-2 | Rickettsia typhus group identified | Prid | Pt | XXX | Nom | Sequencing | MICRO
71760-3 | Rotavirus identified | Prid | Pt | Stool | Nom | Sequencing | MICRO
71761-1 | Rotavirus identified | Prid | Pt | Isolate | Nom | Sequencing | MICRO
74306-2 | Porcine reproductive and respiratory syndrome virus ORF5 gene sequence homology to Fostera | NFr | Pt | Isolate | Qn | Sequencing | MICRO
74308-8 | Porcine reproductive and respiratory syndrome virus ORF5 gene sequence homology to IngelvacATP | NFr | Pt | Isolate | Qn | Sequencing | MICRO
74309-6 | Porcine reproductive and respiratory syndrome virus ORF5 gene sequence homology to IngelvacMLV | NFr | Pt | Isolate | Qn | Sequencing | MICRO
74307-0 | Porcine reproductive and respiratory syndrome virus ORF5 gene sequence homology to Lelystad | NFr | Pt | Isolate | Qn | Sequencing | MICRO
72767-7 | Influenza virus A hemagglutinin segment sequence identifier | ID | Pt | Isolate | Nom | Sequencing | MICRO
72201-7 | Influenza virus A matrix protein segment sequence identifier | ID | Pt | Isolate | Nom | Sequencing | MICRO
72200-9 | Influenza virus A neuraminidase segment sequence identifier | ID | Pt | Isolate | Nom | Sequencing | MICRO

Jerry Sable, APHL
2014-0527
jsable@tsjg.com