Guideline

Subject: Massively Parallel Sequencing Implementation Guidelines
Approval Date: May 2014, May 2015
Review Date: May 2018
Review Committee: Genetic Advisory Committee
Number: 3/2014
Common Abbreviations:
BAM File: Binary form of a SAM File
CNV: Copy number variant, such as a duplication or deletion
CSV File: Comma-separated values (sometimes TSV: tab-separated values); generally able to be opened with a spreadsheet program
FASTQ File: A text file for storing sequence data and associated quality scores
Indel: Short insertion or deletion (typically less than 200 bp)
MPS: Massively Parallel Sequencing (“Next Generation Sequencing”)
SAM File: A tab-delimited text file that contains sequence alignment data
SNV: Single nucleotide variant
Ti/Tv: Transition / Transversion Ratio
VCF File: Variant Call Format File
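To make the Ti/Tv abbreviation concrete, the following is a minimal Python sketch, not part of these guidelines, that counts transitions (A<->G and C<->T substitutions) and transversions (all other single-base substitutions) and returns their ratio. The function name and the input format (a list of reference/alternate base pairs) are assumptions chosen for illustration only.

```python
# Illustrative sketch only: names and input format are hypothetical.
BASES = {"A", "C", "G", "T"}
# Transitions swap purine<->purine (A, G) or pyrimidine<->pyrimidine (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs):
    """Return the transition/transversion ratio for (ref, alt) base pairs."""
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        # Skip anything that is not a single-base substitution (e.g. indels).
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ti / tv
```

For example, two transitions and one transversion, as in `ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")])`, give a ratio of 2.0.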
Background
This document is aimed at diagnostic laboratories preparing for implementation of next generation
sequencing based genomic methods. At the time of writing (second quarter of 2015), there are no
NPAAC standard publications specifically aimed at next generation sequencing in Australian
diagnostic laboratories. Australian diagnostic laboratories should adhere to these Guidelines to
ensure that the high quality of medical genetic testing across Australia is maintained. It is hoped that
this document may provide the basis for a standard in the future.
The first version of this guideline was launched at the Royal College of Pathologists of Australasia
(RCPA) Annual Scientific Meeting, Pathology Update, in February 2013 in Melbourne,
Victoria. This document is the second version, updated to reflect the developing knowledge and
changes in this area of testing in the last 2 years. The updated guidelines have been drafted under
the auspices of the Genetics Advisory Committee of the College, with the Chair of the Committee,
Melody Caramins, acting as Editor-In-Chief. As per the first version, separate writing committees
were developed to address each of the topics/chapters. The writing committees had nominees from
the RCPA discipline of Genetics, Faculty of Science of the RCPA, the HGSA, and other experts.
Chapter 1
Ethical and legal Issues
1. Introduction
Medical testing by genomic methods shares many ethical, legal and social issues with other forms of
clinical investigation. Genomic methods are simply methods and do not necessarily introduce new
issues. However, genomic testing is marked out by the opportunity and challenge of scale. Existing
issues of informed consent, incidental findings, the right not to know, family studies and re-contacting
are potentially magnified due to the volume of information that these tests yield.
Comprehensive genomic analyses (e.g. whole genome sequencing or exome sequencing) can
generate information pertinent to the management of diseases other than the targeted clinical
condition being investigated. Genomic testing can, therefore, be viewed as comprising both a
diagnostic and a screening function. The scale of this overlap in test function is unprecedented. The
implications of this complex testing scenario for the individual patient will require clear explanation in
order to obtain informed consent.
Genomic testing should not be performed without careful consideration of these broader issues. For
this reason, this chapter on ethical and legal implications of genomic testing precedes the chapters
detailing the analytical, interpretive, reporting, and resource requirements for such testing.
2. Medical Responsibilities
2.1 Medical genomic testing is subject to ethical guidelines for medical practice and existing
NPAAC guidelines.
There is an ethical dimension to all medical testing. Patients expect that they will be offered tests that
are safe and provide information that is accurate and useful in the management of their condition. If
tests are to be used in clinical management, it is expected that they will have undergone an
evaluation of the evidence for their safety, analytic and clinical validity, and clinical utility. Tests which
have not been evaluated in this fashion or where the evidence base is weak may be used for
research purposes, but any reports should be identified as research only, or not validated to clinical
standards (as appropriate) and the patient will need to give their informed consent to be a part of a
research study.
Guidance on ethical issues relating specifically to genetic testing comprises foundation documents
and a rapidly evolving peer reviewed literature, some of which are listed below. Medical genomic
testing is subject to the existing NPAAC guidelines. When genomic testing is used for targeted DNA
sequencing, i.e. analysis of genes known to cause the patient's current disease, it falls within current
guidelines for genetic testing by Sanger sequencing. Genomic testing for a clinical condition with a
suspected genomic causation but with an unknown a priori genetic basis, i.e. whole exome or whole
genome sequencing, falls more within current guidelines for microarray testing.
2.1.1 Resources
• ALRC-NHMRC (2003): Essentially yours – the protection of human genetic information in
Australia.
• NHMRC (2010): Medical genetic testing: Information for health professionals.
• NPAAC (2013): Requirements for Medical Pathology Services.
• NPAAC (2013): Requirements for medical testing of human nucleic acids.
• Australian Medical Council. Good medical practice: a code of conduct for doctors in
Australia.
• Australian Medical Association Code of Ethics.
• Standardisation in clinical laboratory medicine: an ethical reflection. Bossuyt, Louche, &
Wiik (2008).
• ACCE model process for evaluating genetic tests. CDC (2004).
• American College of Medical Genetics (2012): Points to Consider in the Clinical
Application of Genomic Sequencing.
• PHG Foundation (2011): Next steps in the sequence. The implications of whole genome
sequencing for health in the UK.
• Health Council of the Netherlands: The thousand dollar genome, an ethical exploration.
The Hague: Centre for Ethics and Health, 2010.
• Evans and Rothschild, Return of results: not that complicated? Genet Med 14(4):358-60;
2012.
• HGSA Commentary of ACMG recommendations 2014.
• Incidental findings in clinical genomics: a clarification. American College of Medical
Genetics and Genomics (2013) Genetics in Medicine 15 (8) 2013.
2.2 There should be an explicit medical consultation framework within which clinician and
laboratory operate whenever genomic testing is undertaken.
While all pathology test requests imply a consultation between the referring clinician and the
laboratory professional supervising the test, this should be an explicit requirement of referrals for
genomic testing.
The pathology teams need to know the specific clinical question being asked of testing to allow
planning of the analytical processes and to facilitate interpretation of the analytical result. The
referring clinician should provide adequate clinical and laboratory information to assist in these
decisions.
Laboratory Directors should clearly distinguish clinical testing with a strong evidence base
from testing for research purposes in which there is an evolving evidence base.
Laboratories should clearly state their policy on reporting incidental findings and variants of unknown
significance.
The referring clinician should know what analytical approach will be undertaken, the policies of the
pathology service with respect to reporting of findings, incidental findings (including carrier status for
recessive disorders), storage of data, and links with research bodies and biobanks, so that this
information can be conveyed to the patient during the informed consent process. The pathology team
and the clinician may wish to vary procedures regarding information provided, consent, and testing
performed on a patient-by-patient basis.
2.3 Patients should receive counselling prior to genomic testing by a genetic counsellor or
relevant medical specialist.
There is no consensus guideline on best practice with respect to the components and communication
elements of counselling for genomic testing, however the core principles of genetic testing apply and
should be discussed. These core principles include the requirement for a discussion of both expected
results and incidental findings, that the interpretation of results requires reference to population and
disease specific genomic databases, and that the interpretation of results may alter with increasing
knowledge. Information about the storage of data, and protocols for re-analysis and call-back, should
be communicated to the patient during the process of obtaining informed consent. The pre-test
counselling should also clarify that results will be conveyed to the patient, discussed during post-test
counselling, and distributed to specified clinicians and records. These principles also apply in settings
in which consent is provided by an appropriate proxy for the patient e.g. in a paediatric setting.
The HGSA commentary emphasises the importance of patient autonomy:
The HGSA Code of Ethics states that individual autonomy should be respected by “actively promoting
informed decision-making, which is not coerced, for all involved by providing accurate and balanced
information and an opportunity to deliberate, based on individual values and beliefs”. The ACMG
recommends pre- and post-test counselling, and notes that it could be considered coercive with respect
to predictive information to only offer a choice between receiving reported incidental findings and not
having genomic testing at all.
2.4 A formal consent process should be in place and patients should consent to testing for a
medical service.
A standard consent form should be developed which is acceptable to all jurisdictions.
Information, consent, and test ordering forms for various genomic assays provided by the Baylor
College of Medicine (USA) and Ambry Genetics (USA) are available online.
Please note that these resources are examples only, and no commitment is made that these are
suitable for specific purposes, times, or places.
2.4.1 Resources
Examples of genomic information sheets and consent forms:
• Genomic Information Sheet
• General Genomic Consent Form
• Cancer Exome Consent Form
• Exome Sequencing Information Sheet
• General Exome Consent Form
• Cancer Exome or Panel Consent Form
• SA Pathology informed consent for genetic testing
• SA Pathology informed consent for genomic testing
2.5 The patient should provide specific consent to allow the contribution of their de-identified
data to public databases.
The development of population reference ranges for laboratory analytes has been a fundamental
process in the development of diagnostic testing. The clinical validity and utility of tests which identify
deviations from population reference ranges can be assessed, and an evidence base for the use of
the test can be built. This fundamental principle also applies to genomic testing i.e. documenting the
frequency and clinical relevance of variants in different populations. All patients seeking genomic
testing should, therefore, be asked for permission during the consent process to include de-identified
results into publicly available databases for the common good.
2.6 Patients should receive a clear written record of the policy regarding the reporting of
incidental findings.
As yet, there is no consensus on whether and what incidental findings should be reported to the
patient. Patients may have the right to know, to know of some, or not to know about incidental
findings. Doctors have an obligation both to do what the informed patient has requested and to advise
the patient of any serious health risk revealed by testing. They also have an obligation to the blood
relatives of the patient. These ethical dilemmas are made more complex by the fact that the
significance of many findings is unknown and that the classification of benign and pathogenic
mutations may be unreliable and alter over time with accumulation of new evidence.
The recommended use of targeted analysis, where it does not interfere in reaching a diagnosis, is a
pragmatic approach to minimise these ethical dilemmas.
One approach is to classify findings into groups (or “bins”) as a function of the risk of disease and the
existence of effective therapy. A process has been proposed in which stakeholders would determine
which genes belong in the medically actionable bin (see Resources below). Achieving consensus on
the management of incidental findings is likely to be a complex process as studies show the wide
range of views of both patients and healthcare professionals as to what constitutes valuable
information. The construction of databases with phenotypic annotation of genomic variants and
evidence based research will be critical to the success of this approach.
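As a rough illustration of the binning idea described above, the following Python sketch assigns a finding to one of three bins according to whether a disease association is established and whether an effective therapy exists. It is a simplified, hypothetical model: the bin names, the two boolean criteria, and the `Finding` structure are assumptions made for this example and do not reproduce any published binning scheme. In practice, the content of each bin would be decided by stakeholder consensus, as noted above.

```python
from dataclasses import dataclass
from enum import Enum


class Bin(Enum):
    # Hypothetical bin labels, for illustration only.
    ACTIONABLE = "known disease risk, effective therapy exists"
    NOT_ACTIONABLE = "known disease risk, no effective therapy"
    UNKNOWN = "clinical significance unknown"


@dataclass
class Finding:
    gene: str                          # placeholder gene identifier
    established_disease_risk: bool     # assumed to come from a curated database
    effective_therapy_exists: bool     # assumed to come from clinical review


def assign_bin(finding):
    """Assign a finding to a bin using disease risk first, then therapy."""
    if not finding.established_disease_risk:
        return Bin.UNKNOWN
    if finding.effective_therapy_exists:
        return Bin.ACTIONABLE
    return Bin.NOT_ACTIONABLE
```

A laboratory's written disclosure policy could then be expressed per bin, for example by stating which bins are reported to the patient and which are not.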
It is current practice for doctors to seek consent from patients to notify them of actionable mutation
results identified during testing. The list of these mutations needs to be documented and regularly
reviewed. Should patients decline to be notified, many doctors have taken the decision not to proceed
with genomic testing and have offered other diagnostic pathways.
In this period while debate and guideline development are in progress, doctors are using standard
practices (clinical reasoning, the advice of peers and local ethics committees) to develop their own
approach to these issues.
Whatever approach they choose, clear verbal and written communication of the policy about what
findings will and will not be disclosed should be provided.
As this is an emerging technology, there is, as yet, no case law regarding medical liability of referring
clinicians or pathologists in this field.
2.6.1 Resources
• Berg JS, Khoury MJ, Evans JP. Deploying whole genome sequencing in clinical practice
and public health: meeting the challenge one bin at a time. Genet Med 2011
Jun;13(6):499-504.
• Evans and Rothschild, Return of results: not that complicated? Genet Med 14(4):358-60;
2012.
• Managing incidental and pertinent findings from WGS in the 100,000 Genome Project. A
discussion paper from the PHG Foundation April 2013.
• Rigter et al. Reflecting on Earlier Experiences with Unsolicited Findings: Points to
Consider for Next-Generation Sequencing and Informed Consent in Diagnostics. Hum
Mutat 2013 Jun 19.
2.7 Results from genomic studies should be conveyed to the patient in the context of post-test
genetic counselling by an appropriately qualified expert.
2.7.1 Resources
• NHMRC 2010 Medical genetic testing: Information for health professionals. Australian Health
Ethics Committee.
• NPAAC (2012) Requirements for medical testing of human nucleic acids. Refer to Appendices
regarding ethical categorization of tests.
• Biesecker, Opportunities and challenges for the integration of massively parallel genomic
sequencing into clinical practice: lessons from the ClinSeq project. Genet Med 14(4): 393-398;
2012.
• Green et al, Exploring concordance and discordance for return of incidental findings from
clinical sequencing. Genet Med 14(4):405-10; 2012.
• The Collaborative Institutional Training Initiative (CITI): a resource for research ethics
education.
3. Laboratory Responsibilities
3.1 The accountable laboratory professional should sight a copy of the consent form before
testing.
The existing NPAAC Requirements for Medical Testing of Human Nucleic Acids (see Appendix A)
distinguishes between two classes of DNA tests:
• Level 1 tests (the default classification; includes diagnostic testing and neonatal screening)
and
• Level 2 tests (DNA testing for which specialised knowledge is needed for the DNA test to be
requested, and for which professional genetic counselling should precede and accompany the
test; this includes predictive and pre-symptomatic tests).
The document notes that specific written consent and counselling issues are associated with Level 2
tests, and assigns responsibility to the laboratory director to document consent and defer testing if
there is a concern about the consent process. Although diagnostic testing using genomic methods
could be regarded as a Level 1 test, genomic testing should be regarded as a Level 2 test because of
the complexity of the issues associated with consent and variants of unknown significance.
Pathologists and scientists need to be assured that the patient has undergone pre-test counselling by
an accredited genetic counsellor or relevant medical specialist, that counselling has included
discussion of expected outcomes of testing and the likelihood and type of incidental findings, and that
the patient has given informed consent.
The NPAAC document does not require that the consent for Level 2 testing be sighted - only that the
accountable laboratory professional knows that such consent has been provided. However, to ensure
explicit consistency in the pre-analytical, analytical, and post-analytical phases of a genomic test (see
Sections above and below), the laboratory professional may request to sight a copy of the completed
consent form prior to testing. The laboratory should request a copy of the consent form for whole
exome or whole genome sequencing.
3.2 Data generated by genomic testing are subject to the privacy legislation operating in each
jurisdiction.
Under the Australian Privacy Act 1988 (Cth) (Privacy Act), the health service provider in the private
sector is responsible for the security and privacy of a patient’s health information. Commonwealth,
State or Territory laws apply to health service providers in the public sector. There are specific
provisions under the Federal Privacy Act to allow medical practitioners to disclose a patient’s genetic
information without their consent to a relative if there is a serious threat to the life, health or safety of the
relative and the use or disclosure is necessary to lessen or prevent that threat (APP 6). Note that
compliance with the NHMRC Guidelines (see below) is a legal requirement for anyone wishing to
utilize this legal provision.
There is debate as to whether additional legislative security is necessary because of the identifiability
of data generated by genomic testing. An alternative view has been put that it is not the data or DNA
sequences per se that are identifiable, but rather that identifiability only occurs at the time that the genome
sequence is matched with that of the patient. As use of the sample without the patient’s permission is
illegal under Australian privacy legislation, any matching of genomic data with a particular patient is
illegal and the perpetrator is subject to existing legal penalties. Further, under the Privacy Act, any
unsolicited or non-legally collected personal information should be de-identified and possibly
destroyed.
3.2.1 Resources
• Australian Privacy Principles.
• Privacy Act 1988 (Cth).
• Privacy Information Sheets.
• Privacy in the private health sector.
• Fact sheet on management of genetic data in the private sector.
• NHMRC Guidelines on disclosure of genetic information without consent.
• NHMRC National Statement on Ethical Conduct in Human Research 2007 – updated
2009.
• RCPA Guideline: Managing Privacy Information in Laboratories
https://www.rcpa.edu.au/getattachment/a631a573-0d07-4bd4-ba67cfe545618dd1/Managing-Privacy-Information-in-Laboratories.aspx
3.3 DNA samples and laboratory records should be retained in accordance with existing
NPAAC requirements.
The existing NPAAC standard for samples submitted for medical testing specifies the retention of
diagnostic material for “three months from the date of issue of the report for an individual or for
completion of a family study or for completion of testing: whichever of the three periods is longest”. It
is reasonable to apply this to samples submitted for genomic testing.
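The "whichever is longest" rule quoted above can be sketched as a small date calculation. The Python example below is one possible reading only: it assumes the three-month period runs from the latest of the three events (report issue, completion of a family study, completion of testing), and the function names are hypothetical. The NPAAC standard itself remains the authoritative source for how the periods are determined.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Add calendar months to a date, clamping to the last day of the month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def minimum_retention_end(report_issued, family_study_completed=None,
                          testing_completed=None):
    """Return the end of the minimum retention period under the assumed reading.

    Assumption: three months counted from the latest applicable event.
    Events that do not apply are passed as None.
    """
    events = [d for d in (report_issued, family_study_completed,
                          testing_completed) if d is not None]
    return add_months(max(events), 3)
```

For instance, with a report issued on 10 January 2015 and testing completed on 15 March 2015, the sketch returns 15 June 2015.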
The document specifies that “The copy of the original report, or ability to reprint the information
content of an original report has a minimum retention time of 100 years.” This may need to be altered
to accommodate genomic testing. It should be noted that the standard specifies that only reports are to be
kept indefinitely; the raw data files from genomic testing are very large and their storage poses a
significant cost and logistical burden.
The Privacy Act advises of the risks of keeping health information longer than is necessary as this
may increase the risk of privacy breaches.
A 3-year study at South Eastern Area Laboratory Service (SEALS) in NSW has commenced to
determine the need to access archived genomics reports. This is likely to inform laboratory practice.
In the interim, laboratories are recommended to retain at least an aliquot of the DNA and the
corresponding VCF file for 3 years.
3.3.1 Resources
• NPAAC Requirements of the retention of laboratory records and diagnostic material (Sixth
Edition 2013)
3.4 Data generated by genomic testing should be stored in accordance with the privacy
legislation operating in each jurisdiction.
Standards Australia AS/NZS ISO/IEC 17799:2001 and AS/NZS 7799.2:2000 incorporate electronic
storage of medical records. There are no specific standards for the storage of genetic information;
however, the NPAAC document “Requirements for Information communication” and the RCPA
“Standards for clinical databases of genetic variants” provide useful guidance.
Enforcement provisions for misuse or loss or disclosure without consent of stored health information
are legislated under Federal and State privacy legislation. There is no specific Australian legislation
around genetic information. If health data from Australian patients are analysed or stored on computing
platforms which are physically located in another country, the data remain subject to Australian privacy
legislation. If data are lost, disclosed or stolen, the Australian entity that transmitted the information
overseas may be found liable.
3.4.1 Resources
• See also section on IT infrastructure
3.5 Upon request, the Laboratory Director should give patients access to the personal
genomic information for which consent has been given.
Provisions are made for some exceptions in the Australian Privacy Principles (APP 12). Relevant
examples include:
• A serious threat to life, health or safety of an individual or to public health or public safety; or
• Giving access would have an unreasonable impact on the privacy of other individuals; or
• Various exceptions relating to current or anticipated legal proceedings or under certain legal
authorities.
Procedures may need to be put in place to address specific cultural sensitivities relating to the access
by patients to their genomic data.
3.5.1 Resources
• Guidelines from the Office of the Australian Information Commissioner.
• RCPA Guideline: Release of Pathology Results to Patients.
• Guidelines for Researchers on Health Research Involving Māori 2010 VERSION 2.
• NHMRC (2007): National Statement on Ethical Conduct in Human Research. See Section 4.7
re working with people of Aboriginal or Torres Strait Islander heritage.
3.6 The Director of the laboratory should observe the relevant provisions of privacy legislation
in that jurisdiction should the business circumstances of the laboratory change.
For a laboratory operating in the private sector (and hence falling under the requirements of the
Privacy Act), the relevant Australian Privacy Principles include numbers 3, 6, 11 and 12. Where
ownership of the health service provider changes (e.g. amalgamation, takeover, closure) but the
original purpose for which the information was used does not change, the health information stays
with the organisation and there is no requirement to inform or seek consent from the patient.
However, if the new health service provider intends to use the information for purposes other than for
which it was collected, the new provider may need to seek consent from the patient. Where a health
service provider’s business ceases altogether, arrangements will need to be put in place to securely
transfer and store the patient’s health information.
3.6.1 Resources (as per 3.5.1)
4. Scope of Testing
4.1 Throughout the process of testing, there should be explicit distinctions between targeted
diagnostic testing i.e. of selected genes, whole exome or genome sequencing, screening of an
unaffected person, and research studies.
An understanding of the role of genomic testing in investigation of disease is rapidly evolving.
Genomic testing may be indicated in the investigation of patients with a Mendelian phenotype or
family history which strongly implicates a genetic aetiology. The case for targeted diagnostic testing is
clear where the phenotype is consistent with a known disease in which mutations in a number of
genes are known to be causative.
Genomic testing may also play a role in the investigation of families with a Mendelian phenotype
where the specific genetic aetiology is not established i.e. genome-wide diagnostic testing. It may
also play a role in the investigation of multiple affected individuals from different families or single
individuals with very rare genetic disorders, where randomised clinical trials to assess clinical utility
and other measures of efficacy of genomic testing are not possible. These referrals are based on
clinical judgment. This approach is analogous to investigations such as cytogenetic analysis,
microarrays or tissue biopsy where the target pathology is unknown.
There is debate about the use of genomic methods in preconception carrier screening for relevant
mutations, prenatal screening, and as a first tier approach for newborn screening.
A recent report from the Foundation for Genomics and Population Health in the UK concludes that
“Extensive interrogation of genomic data for preventive purpose is not recommended.”
These different purposes of genomic testing involve different ethical considerations as well as significant
differences in analysis and interpretation. The clinician, laboratory, and patient should have a clear
understanding of the purpose and scope of a test, and this should be reflected in pre-test counselling
and consent, in the analysis and interpretation, and in the reporting and distribution of a genomic test.
4.2 When genomic testing is used in the investigation of a heritable disease, the analytical
approach should be targeted so that only genes relevant to the specific disease phenotype are
analysed, provided that this approach does not compromise test performance.
Current recommendations are that the analytic approach be clinically targeted at a candidate gene or
set of genes which are known to cause the disease phenotype in question. “Genome-wide” diagnostic
testing should only be considered if it is clear that testing with a narrower scope (using filters) will
yield insufficient results.
Where the phenotype is non-specific or not recognized as a particular syndrome, wider capture of
data with targeted data interrogation that can be performed in a tiered manner is useful.
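The tiered interrogation described above can be sketched as a simple filter over called variants. In the following Python example, the variant records (dictionaries with a `gene` key), the tier names, and the placeholder gene identifiers are all assumptions for illustration. The sketch examines gene lists from the narrowest to the widest and stops at the first tier that yields any variants, so that wider data are only interrogated when a narrower scope returns nothing.

```python
def tiered_filter(variants, tiers):
    """Return (tier_name, matching_variants) for the first tier with hits.

    variants: list of dicts, each with at least a "gene" key (assumed format).
    tiers: ordered list of (tier_name, set_of_gene_names), narrowest first.
    Returns (None, []) if no tier yields a match.
    """
    for tier_name, genes in tiers:
        hits = [v for v in variants if v["gene"] in genes]
        if hits:
            # Stop here: wider tiers are not interrogated once a tier has hits.
            return tier_name, hits
    return None, []
```

This design means variants in genes outside the interrogated tiers are never examined, which is one way of limiting incidental findings. Whether a hit in a narrow tier is clinically sufficient remains a matter for expert interpretation, not for the filter.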
4.2.1 Resources
• Health Council of the Netherlands: The thousand dollar genome, an ethical exploration. The
Hague: Centre for Ethics and Health, 2010. See Section 8.2 (p 48).
4.3 Patients should provide consent if their genomic data derived from clinical testing are to
be used for research purposes.
There are no specific conditions to be applied to the research use of samples or data that had been
obtained for diagnostic testing using genomic methods. Collaborations between pathology practices
providing diagnostic testing and researchers take several forms and are subject to the provisions of
privacy legislation in the various jurisdictions and to the NHMRC National Statement on Ethical
Conduct of Human Research.
Collaborations between institutions are subject to the framework outlined in the Australian Code for
the Responsible Conduct of Research (2007).
Patients should have provided informed consent for the use of their samples or data for research; this
consent should be distinct from the consent process for clinical testing. Consent may be:
• Specific to a project under consideration, or
• Extended where consent is given for the use of data or tissue in future research projects that
are an extension of or closely related to the original project or in the same general area of
research, or
• Unspecified where consent is given for the use of data or tissue in any future research.
There is ongoing debate in Australia about unspecified, also known as “open-ended”, consent to the
use of genomic data in research. Should the patient give extended or unspecified consent, further
consents are required to enter data or tissue into databases or biobanks. The informed consent
process should clearly state the protocol with respect to re-contacting the patient about incidental
findings identified during subsequent research projects.
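A laboratory information system might record the three research consent categories described in this section as a simple enumerated type. The Python sketch below is illustrative only: the type and function names are hypothetical, and the two boolean inputs stand in for a judgement (whether a proposed project is the original one, or is an extension of, closely related to, or in the same general area as it) that would in practice be made by the responsible ethics committee. It does not model the further consents required for entry into databases or biobanks.

```python
from enum import Enum


class ResearchConsent(Enum):
    SPECIFIC = "specific"        # one project under consideration
    EXTENDED = "extended"        # also related or same-area future projects
    UNSPECIFIED = "unspecified"  # any future research


def research_use_covered(scope, is_original_project, is_related_project):
    """Return True if the recorded consent scope covers the proposed use."""
    if scope is ResearchConsent.UNSPECIFIED:
        return True
    if scope is ResearchConsent.EXTENDED:
        return is_original_project or is_related_project
    # SPECIFIC consent covers only the project that was consented to.
    return is_original_project
```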
The view presented by NHMRC is that human tissue samples should always be regarded, in
principle, as re-identifiable. All requests from researchers for the release of de-identified data from
samples submitted for diagnostic testing and of associated laboratory data to biobanks or databases,
will require approval by the Ethics Committee with responsibility for oversight of activities of the
pathology service.
Provided a suitable ethical framework is in place, the diagnostic laboratory can provide samples and
data to the researchers, but should retain sufficient sample for the minimum retention period and
laboratory records to meet NPAAC requirements for the retention of the health record.
4.3.1 Resources
• NHMRC 2007 National Statement on Ethical Conduct of Human Research.
• Australian Code for the Responsible Conduct of Research.
• PHG Foundation (2011) Next steps in the sequence. The implications of whole genome
sequencing for health in the UK.
• US National Human Genome Research Institute Strategic Plan 2011.
• Data submission policy of dbGaP database.
Chapter 2
Wet lab
1. Introduction
Next generation sequencing (NGS) has been adopted in all areas of molecular diagnosis.
Laboratories must become familiar with the critical differences between NGS and traditional Sanger
sequencing. The wet laboratory process is one such area of critical difference. Robust quality
assurance and quality control procedures are essential to ensuring the reliability of NGS testing
results.
This chapter will focus on the “wet” laboratory issues including laboratory environment, sample/library
preparation, template generation, sequencing and quality assurance in genomic diagnostic
application.
1.1. Wet lab processes
Many of the guidelines in this document are common to all forms of nucleic acid testing. These
guidelines should be read in conjunction with ISO 15189 and all relevant NPAAC documents, but
particularly Requirements for Medical Testing of Human Nucleic Acids, and Requirements for the
Development and Use of In-House In Vitro Diagnostic Devices.
We propose that these principles and guidance could form a foundation for future specifications of
performance and formal regulations of genomic testing. It is not our intention to generate a user guide
and provide all the solutions. Instead, we try to include some relevant resources for your reference.
For example, some relevant “wet” laboratory issues can be found from the website of the Division of
Laboratory Programs, Standards, and Services (DLPSS) of the American Centers for Disease Control
and Prevention (CDC).
1.1.1 Resources
• GenomeWeb Clinical Sequencing News.
• CLSI: Molecular Methods for Clinical Genetics and Oncology Testing.
• NPAAC: Requirements for the Supervision of Medical Pathology Laboratories.
• Gargis AS, Kalman L, Berry MW, et al. Assuring the quality of next-generation sequencing in
clinical laboratory practice. Nat Biotechnol. 2012 30(11):1033-6.
• Best Practice Guidelines for the Use of Next Generation Sequencing (NGS) Applications in
Genome Diagnostics: A National Collaborative Study of Dutch Genome Diagnostic
Laboratories. Hum Mutat. 2013 Jun 17.
• Ellard S, Lindsay H, Camm N, et al. Practice guidelines for Targeted Next Generation
Sequencing Analysis and Interpretation. UK Association for Clinical Genetic Science; 2014.
• Linderman MD, Brandt T, Edelmann L, et al. Analytical validation of whole exome and whole
genome sequencing for clinical applications. BMC Medical Genomics 2014; 7: 20.
• Pritchard CC, Salipante SJ, Koehler K, et al. Validation and implementation of targeted
capture and sequencing for the detection of actionable mutation, copy number variation, and
gene rearrangement in clinical cancer specimens. The Journal of Molecular Diagnostics 2014;
16(1): 56-67.
• Aziz N, et al. (2014) College of American Pathologists' Laboratory Standards for
Next-Generation Sequencing Clinical Tests. Arch Pathol Lab Med.
• Rehm HL et al (2013) ACMG clinical laboratory standards for next-generation sequencing.
Genet. Med. 15 733-747.
• Van Keuren-Jensen, Keats and Craig (2014) Bringing RNA-seq closer to the clinic. Nat.
Biotechnol. 32:884-885.
• Zook JM, et al. (2014) Integrating human sequence data sets provides a resource of
benchmark SNP and indel genotype calls. Nat Biotechnol 32:246-251.
2. Measures to Control Contamination
2.1 The laboratory should be designed to minimise the contamination of samples at different
stages of the workflow with other specimens or amplified products.
Laboratories should ensure the physical design can accommodate separate areas for patient derived
samples and amplified material.
Possible cross contamination between these areas including by movement of equipment, staff, or
aerosols should be assessed and managed.
Measures should be available to both detect cross contamination between clinical samples, and to
eliminate it. Detection may include the use of processing blanks or environmental monitoring.
Elimination may include the use of hypochlorite or other decontamination measures.
For further information refer to NPAAC Requirements for the Medical Testing of Human
Nucleic Acids.
2.2 Cross contamination between samples due to carryover from equipment:
Laboratories should ensure recommended and appropriate maintenance and cleaning processes are
performed to eliminate carryover contamination.
Laboratories should include a monitoring process for carryover contamination as part of regular
internal quality control.
Sample indexes (barcodes) used to identify unique reads in pooled libraries can be used to detect
carryover contamination. These should be re-used on the longest cycle possible. Consecutive runs of
the same sequencing instrument using the same barcode indexes should be avoided. Frequent re-use
of the same set of barcode indexes will compromise the laboratory's ability to detect
cross-contamination at any stage of the sequencing procedure.
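The recommendation to avoid consecutive runs with the same indexes can be checked automatically from a run log. The following is a minimal sketch only; the run names, index names and the shape of the run history are hypothetical, and a real check would read this information from the laboratory's own records.

```python
def consecutive_index_reuse(runs):
    """Flag back-to-back runs on one instrument that share a barcode index.

    `runs` is a list of (run_name, indexes) tuples ordered oldest first.
    Returns a list of (earlier_run, later_run, shared_indexes) tuples.
    """
    flagged = []
    for (name_a, idx_a), (name_b, idx_b) in zip(runs, runs[1:]):
        shared = set(idx_a) & set(idx_b)
        if shared:
            flagged.append((name_a, name_b, sorted(shared)))
    return flagged


# Hypothetical run history for a single instrument, oldest first.
run_history = [
    ("run_001", ["IDX01", "IDX02", "IDX03"]),
    ("run_002", ["IDX04", "IDX05", "IDX06"]),
    ("run_003", ["IDX06", "IDX07", "IDX08"]),
]

flagged_pairs = consecutive_index_reuse(run_history)
print(flagged_pairs)
```

Here run_002 and run_003 are flagged because both used IDX06. The same idea extends to a longer look-back window if the laboratory wishes to rotate indexes over more than two runs.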
2.3 Sample Indexing should be performed at the earliest possible stage of library preparation
to allow subsequent detection of cross-contamination.
The laboratory should avoid workflows that offer the potential for undetectable sample
cross-contamination. Workflows that call for multiple manipulations, additions, and incubations of
samples prior to index ligation or amplification increase the risk of undetectable sample-to-sample
cross-contamination, whereas workflows that add unique indexes to each sample early in the library
preparation process provide a means to make cross-contamination detectable.
2.4 Laboratories should consider including identity SNPs within the assay to confirm patient
identity.
Identity SNPs could be included within each assay and interrogated with a second method to confirm
patient identity, if no unique variants are identified within the genes analysed. These SNPs can also
be used to monitor and detect any carryover contamination within the data. Where members of the
same pedigree have been analysed, bioinformatics analyses to confirm family relatedness may also
prove useful to highlight errors in specimen identification, processing or contamination.
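As an illustration of the identity check described above, the genotypes obtained at the identity SNPs by MPS can be compared with those from the second method. This is a sketch under stated assumptions: the SNP names and genotypes are invented, genotypes are represented as two-letter strings, and the acceptance threshold for concordance is left to the laboratory.

```python
def genotype_concordance(mps_calls, orthogonal_calls):
    """Fraction of identity SNPs typed by both methods whose genotypes agree.

    Genotypes are compared as unordered allele pairs, so "AG" equals "GA".
    """
    shared = [snp for snp in mps_calls if snp in orthogonal_calls]
    if not shared:
        raise ValueError("no identity SNPs typed by both methods")
    matches = sum(
        sorted(mps_calls[snp]) == sorted(orthogonal_calls[snp]) for snp in shared
    )
    return matches / len(shared)


# Hypothetical genotypes for four identity SNPs.
mps = {"snp1": "AG", "snp2": "CC", "snp3": "TT", "snp4": "AG"}
second_method = {"snp1": "GA", "snp2": "CC", "snp3": "CT", "snp4": "AG"}

concordance = genotype_concordance(mps, second_method)
print(concordance)
```

A concordance below the laboratory's own threshold would prompt investigation of a possible sample swap or contamination before any result is reported.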
3. Wet Workflow Validation
3.1 The genomic platform used must meet the specifications required for the diagnostic
purpose and be operated in accordance with best practice as determined by the manufacturer.
Consideration should be given to biases inherent in the platform of choice. Particular attention should
be given to ensuring that any systematic weaknesses or errors of the sequencing system do not limit
the diagnostic specificity of the assay, or that if such flaws exist, that orthogonal testing is employed
to detect variants in regions of bias. Examples include regions of high GC content or repetitive
regions.
3.2 Diagnostic laboratories should validate the operational performance of the wet laboratory
workflow used in molecular diagnosis.
Expansion of genomic methods for diagnostic applications makes it increasingly important to
demonstrate data quality, reliability and reproducibility. Diagnostic laboratories should empirically
determine their minimum requirements for data quality.
Analytic sensitivity and specificity are important performance characteristics for genomic diagnostic
applications. Diagnostic laboratories should document these aspects of the laboratory workflow by
comparison of test results obtained under conditions defined above, to those obtained from a gold
standard method (usually Sanger sequencing).
3.2.1 Resources
• Aziz N, et al. (2014) College of American Pathologists' Laboratory Standards for
Next-Generation Sequencing Clinical Tests. Arch Pathol Lab Med.
• Gargis AS, Kalman L, Berry MW, et al. Assuring the quality of next-generation sequencing in
clinical laboratory practice. Nat Biotechnol. 2012; 30(11): 1033-6. PubMed PMID: 23138292.
3.3 Diagnostic laboratories should regularly monitor the performance of the wet laboratory workflow
used in molecular diagnosis.
Inclusion of known DNA control/standard samples at <10% of the pooled libraries at regular intervals
would allow ongoing monitoring of assay performance and data analysis processes.
3.4 The use of outsourced platforms and services for diagnostic services should meet all of
the standards outlined in this document.
If part of the genomic testing process is to be outsourced, NATA accredited providers or providers
showing full compliance with NPAAC standards must be used. It remains the responsibility of the
clinical laboratory to review, retain and furnish for audit all documentation related to clinical testing.
4. Sample Preparation
4.1 The laboratory should assess the quantity and quality of DNA samples before proceeding
with diagnostic application.
Failure to exclude samples of poor quality or insufficient quantity of amplifiable DNA can significantly
affect the sensitivity and specificity of genomic diagnosis and lead to the possibility of false negative
results. This is of particular significance where the sample type may be associated with limiting
amounts of DNA, for example FFPE tissue or cell-free circulating DNA. Failure of sample exclusion
can also affect turnaround time, due to the long cycle of the genomic testing process.
In the case of measuring cell-free circulating DNA for the purposes of non-invasive prenatal screening
or testing, the laboratory should have a process to ensure that adequate amounts of foetal DNA (i.e.
in accordance with the sensitivity limit determined for the assay) are present in the sample prior to
data analysis and interpretation of results.
4.2 Diagnostic laboratories should determine an appropriate range of DNA sample
concentration and types to be included for an efficient test using genomic methods.
Where appropriate, consideration should be given to including related affected and unaffected
samples in the analysis. For example sequencing trios (proband and both parents) to confirm a de
novo change, or tumour and normal samples to exclude a cancer variant as germline.
4.3 When only small amounts of tissue are available for somatic testing, the laboratory should
determine the minimum specimen size and tumour proportion needed for successful analysis.
Assessment of tissue volume and cellularity is usually estimated by microscopic examination by a
competent person. Sufficient purity or proportion of targeted cells can then be achieved through
macro-dissection.
5. Library Preparation
5.1 The laboratory should have an effective system to track the samples during the
multiple-step process of library preparation.
For laboratories handling in excess of 1000 samples per year, a Laboratory Information Management
System capable of tracking a multistep workflow, with multiple samples, and QC steps should be
considered.
5.2 The laboratory should have a quality control procedure to assess the adequacy of DNA
fragmentation procedures.
For laboratories whose protocols involve DNA fragmentation, quality assessment of the
fragmentation procedure is essential to ensure the correct size distribution and an accurate amount of
fragmented DNA. The latter is critical for equimolar representation if multiple barcoded
samples are to be subsequently pooled for library preparation.
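The link between fragment size, mass concentration and equimolar pooling can be made concrete with a short calculation. The sketch below uses the commonly quoted approximation of 660 g/mol per base pair of double-stranded DNA; the library names, concentrations and fragment sizes are hypothetical, and laboratories should follow the quantification method validated for their own workflow.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a mass concentration (ng/uL) to molarity (nM) for dsDNA."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def equimolar_volumes(libraries, fmol_per_library):
    """Volume (uL) of each library that contributes the same number of fmol.

    `libraries` maps a name to (concentration in ng/uL, mean fragment size in bp).
    1 nM equals 1 fmol/uL, so volume = fmol / nM.
    """
    return {
        name: fmol_per_library / library_molarity_nm(conc, size)
        for name, (conc, size) in libraries.items()
    }


# Hypothetical libraries: same mass concentration, different fragment sizes.
libraries = {"lib_A": (6.6, 500), "lib_B": (6.6, 250)}
volumes = equimolar_volumes(libraries, fmol_per_library=40.0)
print(volumes)
```

Although both libraries have the same mass concentration, the library with shorter fragments contains twice as many molecules per microlitre, so half the volume is needed. This is why an accurate size distribution matters for pooling.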
5.3 The laboratory should undertake quality assurance measures during the validation phase
to demonstrate that no significant allele bias or allele dropout occurs during target enrichment
processes.
The laboratory should determine the optimal conditions for library preparation. Documented metrics of
performance of library preparation should be generated and used to QC library preparation steps on
all clinical samples. For example, effect of input mass of DNA, fragmentation conditions, PCR cycles,
etc. should be assessed. QC metrics in the form of Bioanalyser traces, spectrophotometric readings,
or real-time PCR results should be produced and routinely collected and compared to those of an
optimal validated run.
6. Template Generation
6.1 The laboratory should have a quality assessment procedure to assess the quality and
quantity of a prepared DNA library used for template generation.
An accurate estimation of DNA library quantity is essential for optimal clonal amplification.
Quantification should be based on amplifiable templates (i.e. DNA fragments with proper ligated
adaptors). For example, quantitative PCR (qPCR) has high levels of sensitivity and specificity and
can accurately measure quantities of DNA.
6.2 The laboratory should have a quality assessment procedure to assess the adequacy of
clonal amplification used for template generation.
Quality assessment of the clonal amplification procedure is essential to ensure an adequate
representation of DNA samples in the template. This is critical for equal representation if multiple
barcoded samples have been pooled during library preparation.
7. Data Generation
7.1 The laboratory should establish empirically the coverage necessary for accurate detection
of sequence variants and copy number changes, and provide the best estimation of false
positive and negative rates.
The laboratory should employ quality control measures that specify the quantity and quality of DNA
sequence data to accurately differentiate all targeted sequence variants. This is especially critical
when a multiplexed target enrichment procedure has been used to generate libraries.
The laboratory should ensure that there is sufficient coverage for the detection of aneuploidy e.g. in
non-invasive prenatal trisomy 21 testing.
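Empirical determination of coverage can be complemented by a simple theoretical estimate. The sketch below computes, under a binomial model, the probability of observing at least a chosen number of variant-supporting reads at a given depth. It assumes independent reads, no sequencing error and no allele bias, so it gives an optimistic bound rather than a substitute for validation data; the function name and parameters are illustrative.

```python
from math import comb


def prob_at_least_k_alt_reads(depth, min_alt_reads, allele_fraction=0.5):
    """Binomial probability of seeing >= min_alt_reads variant reads at `depth`.

    allele_fraction is the expected proportion of reads carrying the variant
    (0.5 for a germline heterozygote; lower for mosaic or somatic variants).
    """
    p_fewer = sum(
        comb(depth, i)
        * allele_fraction**i
        * (1.0 - allele_fraction) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_fewer


# Probability of at least 2 variant reads for a heterozygote at depth 4.
print(prob_at_least_k_alt_reads(4, 2))
```

Repeating the calculation across a range of depths and allele fractions gives a first indication of the minimum depth worth testing during validation.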
7.2 If multiple samples are to be sequenced simultaneously, the laboratory should have
quality assurance measures to demonstrate that DNA sequence data generated cannot be
attributed to the wrong sample.
Consideration should be given to the use of barcoded DNA samples and the possibility of sequence
data being misdirected to the wrong specimen.
7.2.1 Resources
• College of American Pathologists: Molecular Pathology Checklist 2012. Includes
massively parallel sequencing. Part of a suite of checklists available for purchase online;
not available separately.
7.3 Data should be stored as required for diagnostic DNA studies.
Consideration should be given to which data format is suitable for retention (see further discussion
in the Bioinformatics section). As a minimum, the raw reads and quality scores should be kept.
Data storage should also comply with overarching regulatory and legislative requirements (see the
section on Ethical & Legal Issues).
7.4 Any exception should be recorded for patient samples where steps used in the analytical
process deviate from laboratory standard operating procedures.
This exception log should be kept with the reason(s) for deviation and should retain links to the
patient sample.
7.4.1 Resources
• College of American Pathologists: Molecular Pathology Checklist 2012. Includes massively
parallel sequencing. Part of a suite of checklists available for purchase online; not available
separately.
• NPAAC: Requirements for the Retention of Laboratory Records and Diagnostic Material (Fifth
Edition 2009).
8. Quality Control and Quality Assurance
8.1 The laboratory director should be able to identify the appropriate quality metrics that are
suitable for their genomic tests.
Consideration should be given to cross platform confirmation. Sanger sequencing should be
considered to reduce false positive and/or negative rates, particularly in small indel variants.
The limitation of genomic testing should be presented in the final report (See the details in the
Reporting section).
QC of sequencing data may include:
• Base call quality scores
• Read depth
• Uniformity of read coverage
• Read enrichment (for capture-based methods)
• Percentage PCR duplicates (for capture-based methods)
• Allelic Read Percentage
• GC bias
• Decline in signal intensity along a read
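Two of the metrics listed above, read depth and uniformity of read coverage, can be summarised from per-base depth values. The following is a minimal sketch; the minimum depth of 20 and the uniformity definition (bases covered at 0.2 times the mean depth or more) are illustrative assumptions, not recommended thresholds, and the depth values are invented.

```python
def coverage_summary(depths, min_depth=20, uniformity_fraction=0.2):
    """Summarise per-base depths over the target region.

    Returns the mean depth, the percentage of bases at or above min_depth,
    and the percentage of bases at or above uniformity_fraction * mean depth.
    """
    n = len(depths)
    mean_depth = sum(depths) / n
    threshold = uniformity_fraction * mean_depth
    return {
        "mean_depth": mean_depth,
        "pct_at_min_depth": 100.0 * sum(d >= min_depth for d in depths) / n,
        "pct_uniform": 100.0 * sum(d >= threshold for d in depths) / n,
    }


# Hypothetical per-base depths for an eight-base target.
summary = coverage_summary([100, 100, 100, 100, 0, 15, 45, 40])
print(summary)
```

Values from each clinical run can then be compared with the ranges the laboratory established during validation.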
8.2 The laboratory should implement quality assurance measures that evaluate the entire
process.
Well-characterised DNA samples should be used as internal quality control samples. Cell lines are
renewable, but may have some balanced or unbalanced chromosomal rearrangements. Blood
samples from young subjects (<55 years) are typically free from such rearrangements, but have
limited supply. Rearrangements that are identified may reflect the age of the donor or be a
consequence of the culture process. Consideration should also be given to obtaining reference
materials from overseas. For example, the Food and Drug Administration of the United States of
America has recently completed the Sequencing Quality Control (SEQC) project, as a part of Phase
III of the MicroArray Quality Control (MAQC-III) project. Its aims were to assess the technical
performance of genomic platforms by generating benchmark datasets with reference samples, and to
evaluate the advantages and limitations of various bioinformatics strategies in RNA and DNA
analyses.
Acceptable intra- and inter-run variability should be established during validation and monitored in
diagnostic laboratories. It is important to determine assay precision, i.e., the degree to which repeated
measurements give the same result – both repeatability (within-run precision) and reproducibility
(between-run precision).
Genomic technologies are rapidly evolving. Consideration should be given to whether positive findings
in genomic analysis should be confirmed by a different chemistry or a second method, particularly at
the initial validation stage and for results that affect clinical decision-making.
The laboratory should monitor, implement and validate upgrades to instruments, sequencing
chemistries and reagents or kits used to generate genomic data.
8.2.1 Resources
• Forsberg LA, Rasi C, Razzaghian HR, et al. Age-related somatic structural changes in the
nuclear genome of human blood cells. Am J Hum Genet. 2012; 90: 217-28. PMID:
22305530. Free PMC Article.
• College of American Pathologists: Molecular Pathology Checklist 2012. Includes massively
parallel sequencing. Part of a suite of checklists available for purchase online; not available
separately.
• FDA: MicroArray Quality Control project.
• Roychowdhury S, et al. Personalized oncology through integrative high-throughput
sequencing: a pilot study. Sci Transl Med 2011; 3: 111-21.
8.3 Laboratories performing diagnostic genomic testing should participate in suitable
genomic proficiency testing or inter-laboratory sample exchange programs to meet the
requirements for external quality assessment measures.
Laboratories should establish a reportable range for each assay, such as multiple genes, exome and
large genomic regions.
8.3.1 Resources
• NPAAC: Requirements for Participation in External Quality Assessment (Fourth Edition
2009).
• CDC: Next-generation Sequencing: Standardization of Clinical Testing (Nex-StoCT)
Workgroup Principles and Guidelines.
CHAPTER THREE
Bioinformatics
1. Introduction:
1.1 Scope
Diagnostic applications of genomic testing span a wide range of approaches. These may include
copy number analysis using DNA microarrays or resequencing of single genes (in high multiplex),
gene panels, whole exomes, whole genomes, tumour profiling, non-invasive prenatal
screening/testing, methylation analyses and RNA-Seq.
The scope of this chapter is restricted to consideration of MPS technologies applied to clinical
diagnostic DNA analysis. Excluded from scope are analyses of RNA, transcriptomes, epigenetic and
methylation analysis and other applications of MPS.
Issues addressed cover the range of MPS testing for genes, panels of genes, exomes and whole
genomes. As the size and complexity of the analysis increases, additional procedures and
safeguards may need to be included to ensure robustness and reliability of the analysis.
1.2 The Bioinformatics Pipeline
A “bioinformatics pipeline” refers to a number of computational tasks, generally applied sequentially
(hence the term “pipeline”), which receive at the beginning the output of an MPS sequencing
instrument, such as image or FASTQ files, and progressively analyse this data through key steps,
ending up with a VCF file, or even further with an annotated spreadsheet (CSV, TSV) or Text file.
While there is no one standard pipeline, most bioinformatics pipelines convert the data through a
series of fairly standardised milestones.
A bioinformatics pipeline can be provided by the MPS instrument vendor, using proprietary software,
or using open-source software. None of these approaches has been shown to be innately superior to
the others, provided they are selected, tuned, validated or verified (as appropriate) and applied
correctly.
Primary analysis:
This phase receives raw electronic information from the MPS instrument, and converts it using the
vendor’s proprietary algorithms into genomic signals such as nucleotide positions and ordering (“base
calling”). The laboratory usually has relatively little control of this phase as it is under the instrument
manufacturer’s control.
Where multiplexing strategies have been applied, de-multiplexing is performed at this analysis stage;
de-multiplexing re-identifies the sample from which individual sequence reads were derived.
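In practice de-multiplexing is normally performed by the instrument vendor's software. The sketch below only illustrates the principle: each index read is compared with the expected sample indexes and assigned when exactly one sample matches within a mismatch tolerance. The index sequences, sample names and the tolerance of one mismatch are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, sample_indexes, max_mismatches=1):
    """Return the sample whose index matches the read, or None.

    None is returned when no sample matches within max_mismatches, or when
    more than one does, so ambiguous reads are never assigned.
    """
    hits = [
        sample
        for sample, index in sample_indexes.items()
        if len(index) == len(index_read)
        and hamming(index, index_read) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical sample sheet.
sample_indexes = {"sample_A": "ACGTAC", "sample_B": "TGCATG"}

print(assign_sample("ACGTAA", sample_indexes))  # one mismatch to sample_A
print(assign_sample("GGGGGG", sample_indexes))  # matches neither index
```

Leaving ambiguous or unmatched reads unassigned, and monitoring their proportion, is one way to reduce the chance of reads being attributed to the wrong sample.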
For amplicon sequencing strategies, primers have to be trimmed from the reads.
The outputs of the primary analysis phase are usually FASTQ files. Quality control (including machine
metrics) and acceptance criteria should be applied at this stage.
Secondary analysis:
This phase receives the FASTQ files from the primary analysis, maps (or aligns) the reads to the
reference sequence and identifies changes from the reference sequence (variant calling).
The secondary analysis pipeline must be tailored to the MPS technical platform used. For example,
duplicates arising from PCR strategies are typically marked for capture-/enrichment-based
approaches where this strategy helps identify clonally-derived sequences and potential sequence
artefacts. In contrast, PCR duplicates are not marked in amplicon-based sequencing strategies. Local
realignment can optimise mismatches to increase accuracy and minimise false-positive variant calls.
Variant calling is then performed to identify sequence variations from the reference such as SNVs
and small insertions/deletions, copy number alterations and structural changes.
The outputs of the secondary analysis phase are usually BAM and VCF files. There are a large
number of commercial, academic and in-house tools in use for the secondary analysis of MPS data.
Further quality control should be applied at this stage.
Tertiary analysis:
Tertiary analysis concerns the annotation of the identified sequence variants and may involve a
combination of the following strategies:
• Comparison of the identified sequence variants to those reported in the most appropriate of
the various polymorphism databases (e.g. dbSNP, dbVar, 1000Genomes, Exome Aggregation
Consortium, Exome Variant Server)
• Annotation of the resulting transcript consequences (synonymous, truncating, missense,
splice site etc.)
• Application of tools to predict the severity of the alteration, such as in silico pathogenicity
prediction tools, splice site prediction tools, Grantham difference, assessment of sequence
conservation, comparison to known protein domains
• Comparison to variants documented in clinical variant databases (e.g. ClinVar, HGMD, OMIM,
LOVD, DECIPHER) and locus- and disease-specific databases
• Review of functional data relevant to the variant/locus, including gene expression data, in
vitro, and in vivo studies
• Research of the variant/gene published in peer-reviewed literature
For large-scale genomic investigations, such as expanded gene panels, whole-exome or
whole-genome analysis, tertiary analysis further involves a process of variant filtering and prioritization, by
removal of findings of lesser interest. The aim of variant filtering and prioritization is to reduce the
number of candidate variants to those most-likely associated with disease. For genome-scale
investigations, variant filtering and prioritization is typically performed in a (semi-)automated fashion.
The resulting pre-filtered set of candidate variants is then manually reviewed in further detail to allow
clinical interpretation and classification of the sequence variants, and to take into account the current
limitations of annotation databases; clinical interpretation and reporting of findings are discussed in
chapter 5.
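The automated filtering step described above can be pictured as a set of rules applied to annotated variants. The sketch below keeps variants that are rare in population databases and have a potentially damaging transcript consequence. The field names, the frequency cut-off of 1% and the consequence categories are hypothetical choices for illustration; real filtering criteria depend on the disorder, inheritance model and the laboratory's validated procedure.

```python
def prioritise(variants, max_population_af=0.01,
               consequences=("truncating", "missense", "splice_site")):
    """Keep rare variants with a consequence of interest.

    A population frequency of None means the variant was absent from the
    population databases consulted and is treated as rare.
    """
    kept = []
    for variant in variants:
        af = variant["population_af"]
        is_rare = af is None or af <= max_population_af
        if is_rare and variant["consequence"] in consequences:
            kept.append(variant)
    return kept


# Hypothetical annotated variants.
variants = [
    {"id": "var1", "population_af": 0.20, "consequence": "missense"},
    {"id": "var2", "population_af": 0.0005, "consequence": "synonymous"},
    {"id": "var3", "population_af": 0.0001, "consequence": "truncating"},
    {"id": "var4", "population_af": None, "consequence": "splice_site"},
]

candidates = [v["id"] for v in prioritise(variants)]
print(candidates)
```

Because such rules discard variants without manual review, the filter settings themselves form part of the pipeline that needs to be documented and validated.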
The outputs of annotation and filtering phases commonly are annotated VCF or CSV/TSV
(spreadsheet) files. Further quality control system standards can be applied at this stage.
2. Documentation
Comment: Laboratories have a choice of using vendor-supplied pipelines, open-source pipelines, or
some combination of both. In general, less documentation is required for vendor-supplied pipelines,
but more customisation and fine-tuning is possible for in-house developed or applied software. The
requirements described in this section apply regardless of the source of the bioinformatics pipeline.
2.1 The laboratory must document all components of, changes to, and auditing of the
informatics pipeline.
The laboratory must document all components of the informatics pipeline, including software
packages, custom scripts and algorithms, reference sequences and databases. Any changes, patch
releases or updates in processes or version numbers must be documented with the date of
implementation such that the precise informatics pipeline and annotation sources used for each test
and report is traceable. If information from public websites is used, the date of access should be
documented.
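One way to make the pipeline traceable is to record a manifest alongside each analysis, listing component versions, the implementation date and checksums of reference files. The following is a sketch only: the component names and version strings are placeholders, the manifest layout is an assumption, and a throwaway file stands in for a real reference sequence.

```python
import hashlib
import os
import tempfile
from datetime import date


def sha256_of_file(path):
    """SHA-256 checksum of a file, read in chunks to handle large references."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(components, reference_files, implemented_on):
    """Collect versions, implementation date and reference checksums."""
    return {
        "implemented_on": implemented_on.isoformat(),
        "components": dict(components),
        "reference_files": {
            os.path.basename(path): sha256_of_file(path)
            for path in reference_files
        },
    }


# Demonstration with a temporary file standing in for a reference sequence.
with tempfile.TemporaryDirectory() as tmp:
    ref_path = os.path.join(tmp, "reference.fa")
    with open(ref_path, "wb") as handle:
        handle.write(b">chrTest\nACGT\n")
    manifest = build_manifest(
        components={"aligner": "1.2.3", "variant_caller": "4.5.6"},  # placeholders
        reference_files=[ref_path],
        implemented_on=date(2015, 5, 1),
    )

print(manifest)
```

Storing such a manifest with each report allows the exact pipeline configuration behind a result to be reconstructed later.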
2.2 The laboratory must use version control to track software releases and updates to analysis
methods.
The laboratory may consider use of dedicated version control software to assist with this requirement
for managing software code, such as Concurrent Versions System (CVS), Apache Subversion (SVN),
or Git. There are also dedicated software tools for management and control of laboratory method
documents and validation records.
2.3 The laboratory must document the quality metrics assessed during a test.
For the informatics pipeline, relevant quality metrics include but are not limited to: the total number of
reads passing quality filters, the percentage of reads aligned, the number of single nucleotide
polymorphisms (SNPs) and insertions and deletions (indels) called, and the percentage of variants in
dbSNP.
2.4 The laboratory must document the results of the pipeline validation.
The validation documentation must detail the performance of the pipeline such as the sensitivity,
specificity and accuracy of the pipeline to detect variants and any limitations of the pipeline. The
validation document must be readily available to staff involved in MPS based genetic testing.
2.5 The laboratory should document all training and staff qualifications.
Given the rapid advances in bioinformatics, laboratories implementing NGS-based assays need to
consider appropriate staff training and ongoing professional development of staff in bioinformatics.
Staff involved in the reporting of NGS results must have, as a minimum, an understanding of the
bioinformatics analysis steps and resources used for annotation.
2.6 The laboratory must document the process of data handling and storage.
The laboratory needs to define the minimum set of data to store. Typically, this will involve storage of
.bam and .vcf files but not image files. Alternatively, the laboratory may store .fastq files to allow
re-analysis of the primary data. Interpreted variant call files, such as those produced after review of the
initial calls, must also be stored.
2.7 The laboratory must define and document the conditions for data reanalysis.
As our understanding of sequence variation expands and our bioinformatics tool set improves, it may
be necessary to re-evaluate the annotation of a variant or to re-analyse the sequence data. The
laboratory must specify under which circumstances, if any, such reanalysis is to be performed.
3. Validation
The general principles of validation of laboratory tests (IVDs) (see NPAAC Requirements for the
Development and Use of in-house in-vitro Diagnostic Devices - 2014) also apply for MPS assays.
These include design, production, technical validation, and monitoring /improvement, and
documentation requirements. However, that document does not address aspects specific to
genomics and MPS, which are covered in greater depth in resource documents such as Clinical
and Laboratory Standards Institute MM09-A2: Nucleic Acid Sequencing Methods in Diagnostic
Laboratory Medicine; Approved Guideline - Second Edition (February 2014) and in Gargis et al.
(2012).
Risk of errors in bioinformatics pipeline: In an analysis pipeline for identification of sequence variants,
one must have high confidence that the resulting variant calls have high sensitivity and specificity.
Although true positives (TP) can be distinguished from false positives (FP) easily through external
validation, it is almost impossible to systematically distinguish false negatives (FN) from the vast
number of true negatives (TN). Different pipelines may vary widely in their degree of concordance of
classification of findings (e.g. O’Rawe et al. 2013), and the false negative rate is
particularly difficult to address, especially for indels compared with SNVs. The majority of differences
between variant calling pipelines appear, however, in ‘problem regions’ of the genomes, such as
repeat sequences, regions of sequence homology elsewhere, low complexity regions and regions
with errors in the reference assembly; the concordance between calls can often be further improved
by applying post-variant calling filters to remove artefactual calls (Li et al Bioinformatics 2014 PMID:
24974202).
Besides variant calling, the use of different variant annotation software programs and transcript
annotation files can also make a substantial difference to annotation results, a point that is not
commonly appreciated (McCarthy et al. 2014). These troubling reports highlight the need to ensure that
bioinformatics pipelines are subjected to rigorous validation and QC, especially for clinical diagnostic
applications.
3.1 Design of validation study
3.1.1 The validation study must be designed to provide objective evidence that the
bioinformatics pipeline is fit for the intended purpose.
Validation is the process of measuring the performance characteristics of a bioinformatics pipeline,
and ensuring that the pipeline meets certain pre-defined minimum performance characteristics before
it is deployed.
3.1.2 The validation study must identify and rectify common sources of errors that may
challenge the analytical validity of the bioinformatics pipeline.
As part of the validation study, it is important to gain an understanding of common error sources that
may compromise the validity of the pipeline, such as:
● Inherent limitation of individual programs
● Inadequate optimization of parameters of individual programs
● Problems with data flow between individual programs
● Use of incorrect auxiliary files (e.g. wrong human genome reference)
● Hardware or operating system failure
3.1.3 The validation study must establish the analytical validity of the bioinformatics pipeline
in terms of being able to correctly detect sequence variants (secondary analyses) and
correctly annotate sequence variants (tertiary analyses).
Analytical validity refers to the ability of a bioinformatics pipeline to correctly call and annotate a
variant. Analytical validity must be achieved before clinical validity can be established.
Clinical validity refers to the ability of a test to detect or predict a phenotype of interest. Clinical validity
must be established by external knowledge such as results from large-scale population studies or
functional studies (Refer to chapter on Reporting).
3.1.4 The laboratory must validate the entire bioinformatics pipeline as a whole, under the
given operational environment.
A laboratory may choose to put together its bioinformatics pipeline using any combination of
commercial, open-source, or custom software. Regardless of whether an individual component has
been validated, the laboratory is still required to validate the entire bioinformatics pipeline under their
operational environment (i.e., same hardware specification, same operating system, same parameter
setting, and same input load).
3.1.5 The validation study must be designed to avoid bias caused by testing on training data.
It is important to ensure that quality metrics are measured on reference materials that have not
been used for tuning (training) the parameters of any part of the pipeline. The use of training data
as testing data may lead to artefactually inflated measurement of various quality metrics.
3.2 Validation process
3.2.1 The laboratory must determine standardised performance metrics of the pipeline.
The use of standardised performance metrics ensures that validation results can be communicated
and compared unambiguously. Some commonly used performance metrics are:
• The frequency of True Positive, True Negative, False Positive, and False Negative results
• Accuracy
• Precision
• Sensitivity
• Specificity
• Reportable range
• Reference range
• Limit of detection
The usefulness of these metrics depends on testing on a diverse collection of Reference Materials in
an environment that realistically simulates the real operational environment. Depending on the
performance characteristics of the analytical system, it may be necessary to use replicate analyses or
duplicate samples to achieve satisfactory technical reproducibility.
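Several of the metrics listed above follow directly from the counts of true and false positives and negatives obtained by comparing pipeline calls with a reference material. The sketch below shows the standard definitions; the counts are invented for illustration. Positive predictive value is reported under its own name to avoid confusion with precision in the sense of repeatability and reproducibility.

```python
def performance_metrics(tp, tn, fp, fn):
    """Standard classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "positive_predictive_value": tp / (tp + fp),  # fraction of calls that are real
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # overall agreement
    }


# Hypothetical counts from comparison with a reference call set.
metrics = performance_metrics(tp=90, tn=890, fp=10, fn=10)
print(metrics)
```

Because true negatives vastly outnumber variants in genomic data, specificity and accuracy tend to be close to 1 even for a poor pipeline, so sensitivity and positive predictive value are often the more informative figures.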
3.2.2 The validation study must define valid ranges for commonly assessed quality metrics.
We generally do not know the correct answer associated with an input FASTQ file, except for the
case of reference materials. Nonetheless, based on the results of RM and other previous experience,
it is possible to establish some general statistics that we could expect from a valid pipeline. For
example, the return of the expected number of variants from a WES data set (generating 10,000 –
50,000 variants) can be checked. The transition/transversion ratio (Ti/Tv) can also be checked to
fall within a defined range. Deviation from these pre-defined ranges may indicate a necessity for
closer examination, but does not automatically imply a validity problem.
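A minimal sketch of how a Ti/Tv ratio might be computed from a list of SNVs and compared against a pre-defined range is shown below. The function names are hypothetical, and the range limits are placeholders rather than recommended values; each laboratory would substitute the range established in its own validation study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """A transition is purine<->purine or pyrimidine<->pyrimidine."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def titv_ratio(snvs):
    """Return transitions/transversions for (ref, alt) pairs, or None."""
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        # Skip anything that is not a simple single-base substitution.
        if len(ref) != 1 or len(alt) != 1 or ref == alt:
            continue
        if ref not in "ACGT" or alt not in "ACGT":
            continue
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else None


def within_range(value, low, high):
    """True when a metric was computed and lies inside the valid range."""
    return value is not None and low <= value <= high


# Placeholder limits; NOT recommended values.
HYPOTHETICAL_TITV_RANGE = (1.8, 3.5)

calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "C"), ("G", "T")]
ratio = titv_ratio(calls)  # 4 transitions / 2 transversions = 2.0
print(ratio, within_range(ratio, *HYPOTHETICAL_TITV_RANGE))
```

As stated above, a value outside the range is a prompt for closer examination, so the check returns a flag rather than raising an error.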
3.2.3 Acceptability criteria must be defined to describe clearly the minimum quality metrics
required to demonstrate the bioinformatics pipeline is fit for purpose.
One way to demonstrate acceptability and fitness for purpose is to undertake proficiency testing
carried out by a NATA accredited (or international equivalent) third party using a different set of
Reference Materials.
3.2.4 The laboratory must benchmark the bioinformatics pipeline using reference material,
where available. The reference materials chosen must be appropriate for assessing
performance of the pipeline for its intended purpose.
Validation of a bioinformatics pipeline generally involves executing it given some input data where the
correct status of the variant is known. These input data are called Reference Material (RM). The
usefulness of a RM depends on obtaining a large variety of input, from sequences containing only
simple SNVs to sequences containing complex indels. RM can be generated entirely by in silico
simulation, or sequencing real oligonucleotides of known sequences. Note that for the purposes of
specific bioinformatics Quality Assurance, this RM may consist of well characterised data sets (e.g.
FASTQ files), rather than physical materials such as DNA samples. It is possible to obtain a large
variety of RM from in silico simulation. Nonetheless, RM from real sequences should also be
employed as they likely better capture characteristics of real data. Examples of bioinformatics
reference materials are consensus variant calls distributed by the Genome in a Bottle Consortium for
NA12878, e.g. accessible on the Genome Comparison and Analytics Testing website
(http://www.bioplanet.com/gcat) and consensus calls distributed under the Illumina Platinum
Genomes initiative (http://www.illumina.com/platinumgenomes/). Both of these datasets include a
consensus set of calls from multiple pipelines to allow identification of pipeline-specific artefacts.
3.2.5 The laboratory should compare the results from multiple pipelines, where possible, to
allow identification of pipeline-specific artefacts.
Multiple pipelines could generate quite different variant calling results from the same input FASTQ
file. One strategy to validate a pipeline is to measure the concordance between the results of a given
pipeline against several other widely used pipelines. High concordance does not necessarily
guarantee correctness, but low concordance indicates problems. Poor concordance commonly
overlaps with ‘problem regions’ of the genome, e.g. low complexity regions, as discussed above. Any
limitations of the chosen pipeline must be defined as part of the validation study.
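One simple way to quantify concordance between two pipelines is to compare their call sets as sets of (chromosome, position, reference allele, alternate allele) keys. The sketch below is illustrative only; the function names and example variants are hypothetical, and it assumes that variant representation has already been normalised (for example, indels left-aligned) so that identical variants produce identical keys.

```python
def variant_key(chrom, pos, ref, alt):
    """Build a comparable key for one variant call."""
    return (str(chrom), int(pos), ref.upper(), alt.upper())


def concordance(calls_a, calls_b):
    """Summarise the overlap between two collections of variant keys."""
    a, b = set(calls_a), set(calls_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        "only_a": sorted(a - b),  # candidates for pipeline-specific artefacts
        "only_b": sorted(b - a),
        "jaccard": len(shared) / len(union) if union else None,
    }


# Hypothetical calls from two pipelines run on the same FASTQ input.
pipeline_a = {
    variant_key("1", 1000, "A", "G"),
    variant_key("1", 2000, "C", "T"),
    variant_key("2", 3000, "G", "A"),
}
pipeline_b = {
    variant_key("1", 2000, "C", "T"),
    variant_key("2", 3000, "G", "A"),
    variant_key("2", 4000, "T", "C"),
}

summary = concordance(pipeline_a, pipeline_b)
print(summary["shared"], summary["jaccard"])  # 2 shared calls, Jaccard 0.5
```

The discordant calls in "only_a" and "only_b" can then be reviewed against known problem regions when documenting the limitations of the chosen pipeline.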
3.2.6 The validation study must establish appropriate error handling within the pipeline.
A bioinformatics pipeline could fail due to the corruption of an input file generated by primary analysis
or intermediate steps within the pipeline. It could also fail due to excessive load on the server or
interrupted network connection. As part of the validation procedure, it is important to assess whether
the pipeline can detect corrupted files or interrupted execution, and generate appropriate error
messages.
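The following sketch shows one possible pre-flight check for a corrupted or truncated FASTQ input, assuming the common four-line record layout and optional gzip compression. The function name and messages are hypothetical; a production pipeline would typically add further checks (for example, checksums and exit-status monitoring of each step).

```python
import gzip
import zlib


def check_fastq(path):
    """Return (ok, message) after reading every record in a FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    records = 0
    try:
        with opener(path, "rt") as handle:
            while True:
                header = handle.readline()
                if not header:
                    break  # clean end of file
                seq = handle.readline().rstrip("\n")
                plus = handle.readline()
                qual = handle.readline().rstrip("\n")
                number = records + 1
                if not header.startswith("@"):
                    return False, f"record {number}: header lacks '@'"
                if not plus.startswith("+"):
                    return False, f"record {number}: separator lacks '+'"
                if not seq or len(seq) != len(qual):
                    return False, f"record {number}: sequence/quality length mismatch"
                records += 1
    except (OSError, EOFError, zlib.error, UnicodeDecodeError) as exc:
        # Covers unreadable files and truncated or corrupted gzip streams.
        return False, f"could not read file: {exc}"
    if records == 0:
        return False, "no records found"
    return True, f"{records} records read"
```

A caller would stop the pipeline and log the returned message when the first element is False, so that a damaged input produces an explicit error rather than a silently incomplete result.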
3.2.7 The validation study must establish appropriate hardware and operating system
environments to allow successful execution of the pipeline.
The bioinformatics pipeline can be executed in a dedicated computer server, a shared high
performance computing (HPC) environment, or the cloud. The successful execution of these
programs also depends on the use of an appropriate operating system, appropriate auxiliary software
programs, and supporting reference files (e.g., the human reference genome file, and gene annotation
file). Validation should be conducted in a system that closely resembles the actual operational
environment. See also issues raised in section 5 of this chapter.
3.2.8 When changes are made to the test system, the laboratory must demonstrate that
acceptable performance specifications have been met before using the changed test system
for clinical purposes.
3.2.9 The laboratory must define the limitations of the informatics pipeline.
Common limitations of the bioinformatics pipeline include but are not limited to: the maximum size of
indels detectable, regions of poor mapping and/or excessive read depth, regions of poor sequence
coverage, repeat regions and homopolymer sequence regions that may affect variant calling. There
may also be specific limitations of individual specimens that can affect the capability of a given
bioinformatics pipeline.
4. Quality Control and Quality Assurance
Quality control (QC) of sequencing data vs. QC of the bioinformatics pipeline: It is important to
distinguish QC for checking the quality of sequencing data, and QC for ensuring the correct execution
of the bioinformatics pipeline. Data QC is important for checking whether the sequencing data is of
sufficiently good quality to ensure variant calling can be performed to the required standard. On the
other hand, pipeline QC is concerned with whether the bioinformatics pipeline has been correctly
executed according to the predefined quality metrics for a given sequencing data input. Both types of
QC are important.
QC of bioinformatics pipeline may include the following metrics:
• Mapping quality
• Transition/Transversion ratio
• Presence of duplicate reads
• Expected number of variants
• Expected percentage of known variants (e.g. variants in dbSNP)
4.1.1 The laboratory must monitor quality metrics and acceptability criteria of the informatics
pipeline established during pipeline validation.
Quality metrics are to be recorded for each test performed and interpreted in the context of the
acceptability criteria that were defined during pipeline validation.
4.1.2 Deviation of achieved quality metrics from defined acceptability criteria must be
investigated and mitigated.
Significant deviations may require repeat of the test. For example, a deviation in the percentage of
SNPs in dbSNP observed may indicate a problem with variant calling for that sample.
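As a sketch of how recorded quality metrics might be compared with acceptability criteria for each test, the example below flags any metric that is missing or outside its defined range. The metric names and most limits are hypothetical placeholders; only the 10,000 to 50,000 variant count range is taken from the WES example in section 3.2.2.

```python
def check_metrics(observed, criteria):
    """Return a list of deviations of observed metrics from the criteria.

    observed: mapping of metric name -> recorded value
    criteria: mapping of metric name -> (low, high) acceptable range
    """
    deviations = []
    for name, (low, high) in criteria.items():
        value = observed.get(name)
        if value is None:
            deviations.append(f"{name}: not recorded")
        elif not (low <= value <= high):
            deviations.append(f"{name}: {value} outside {low}-{high}")
    return deviations


# Placeholder criteria; real limits come from the validation study.
criteria = {
    "titv": (2.0, 3.3),               # hypothetical
    "variant_count": (10000, 50000),  # WES example from section 3.2.2
    "pct_in_dbsnp": (90.0, 100.0),    # hypothetical
}
observed = {"titv": 2.7, "variant_count": 61000}

for deviation in check_metrics(observed, criteria):
    print("INVESTIGATE:", deviation)
```

An empty list indicates that all recorded metrics met the criteria; any entries would be investigated and documented in line with 4.1.2.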
4.1.3 Quality metrics and acceptability criteria must be reviewed regularly to ensure relevance
to current test performance.
Revalidation must be performed where ongoing deviations are observed and/or substantial changes
to the informatics pipelines have been made. Choice of appropriate quality metrics can be of
significant help in troubleshooting the source of the problem in an underperforming test. Trend
analysis of bioinformatics quality metrics may also prove to be useful. The appropriateness of the
chosen quality metrics to monitor test performance needs to be reviewed regularly, and at least
annually.
4.2
Confirmatory processes
4.2.1 The laboratory must define the policy for confirmation of reported variants.
The policy must include a statement as to the circumstances, if any, under which clinically actionable
findings are to be confirmed by use of an orthogonal technology. For example, this may involve resequencing using Sanger sequencing, or using a second, different MPS technology, or applying an
independent or different technique (such as protein, enzyme or functional assay). Confirmation of the
results in an independent sample with the same assay may be considered in an effort to minimise
stochastic effects. The circumstances may depend on the nature of the test request, the performance
characteristics of the assay (in particular the defined accuracy of the test), and the intended use of
the reported result.
4.2.3 The laboratory should consider use of multiple independent software tools to establish
consensus calls or for confirmation of calls.
Depending on the accuracy of individual software tools, establishing consensus of multiple tools may
significantly improve the accuracy of the prediction. The policy for use of multiple software tools and
the confirmation of calls should be established during pipeline validation.
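A consensus call set can be sketched as a vote across tools, retaining variants reported by at least a chosen number of them. The example below is illustrative only: the function name, the tool outputs and the support threshold are hypothetical, and variant keys are assumed to have been normalised so that the same variant is represented identically by each tool.

```python
from collections import Counter


def consensus_calls(call_sets, min_support=2):
    """Return variants reported by at least min_support of the call sets."""
    counts = Counter()
    for calls in call_sets:
        # set() ensures a tool is counted once per variant.
        counts.update(set(calls))
    return {variant for variant, n in counts.items() if n >= min_support}


# Hypothetical outputs of three independent variant callers.
tool_1 = [("1", 1000, "A", "G"), ("1", 2000, "C", "T")]
tool_2 = [("1", 2000, "C", "T"), ("2", 3000, "G", "A")]
tool_3 = [("1", 2000, "C", "T"), ("2", 3000, "G", "A"), ("3", 500, "T", "C")]

consensus = consensus_calls([tool_1, tool_2, tool_3], min_support=2)
print(sorted(consensus))
```

Raising min_support trades sensitivity for specificity, so the threshold is one of the parameters that would be fixed during pipeline validation.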
4.3 Quality assurance
4.3.1 The laboratory must participate in QAP programs for the analysis and interpretation of
DNA sequence variants, where such programs are available.
Examples of QAP programs include those organised by the RCPA and the EMQN network. Currently,
programs for MPS analysis are in pilot phases.
4.3.2 The laboratory should consider the use of reference materials for ongoing monitoring of
test performance.
For example, alignment and variant calling pipelines can be validated and monitored using the
Genome in a Bottle, Coriell NA12878, Illumina Platinum Genomes or similar reference materials.
4.3.3 The laboratory should establish the local process for proficiency testing.
Proficiency testing may involve an external QA program, sample exchange, use of electronic
sequence files, reference materials and other approaches.
5. General Informatics Aspects
This section refers to general issues that are applicable in all circumstances and environments.
Where a laboratory uses off-site or hosted facilities (including “cloud” facilities), these requirements
must be met for all stages of the process, including those not physically co-located or under the direct
control of the laboratory.
5.1 Data security and privacy
5.1.1 The laboratory must ensure that data management meets requirements for data integrity
and security including avoidance of tampering with primary data files and/or corruption of
result files.
MPS data may involve the management of very large data files (in excess of hundreds of GB) on
shared compute resources. Strategies need to be put in place to ensure the integrity of data files is
maintained (e.g. use of checksum tools during file transfer, management of data permissions and
‘write’ access rights) and that a secure copy of the primary data files (FASTQ) is maintained
elsewhere from ‘working copies’ which allows regeneration of results files (BAM, VCF, annotations), if
this should be required.
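The use of checksums during file transfer can be sketched as follows, using SHA-256 from the Python standard library. The choice of hash algorithm and the function names are assumptions made for illustration; files are read in chunks so that very large FASTQ or BAM files do not need to fit in memory.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source_path, copy_path):
    """True when the copy has the same checksum as the source file."""
    return sha256_of(source_path) == sha256_of(copy_path)
```

Recording the checksum of each primary data file at the time of generation also allows the secure copy to be re-verified later, before results files are regenerated from it.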
5.1.2 The laboratory must use structured databases wherever possible.
The use of spreadsheets or text files to store information is discouraged as these typically don’t allow
satisfactory traceability or auditing of changes made.
5.1.3 The laboratory must ensure that data management meets the requirements for
protecting patient privacy and autonomy.
General requirements for privacy as they relate to the practice of pathology can be found in the
NPAAC Standards: Requirements for Medical Pathology Services, and Requirements for Information
Communication. Patient autonomy here relates to a patient’s wish to learn, or not to learn, of incidental
findings that may arise in the course of testing and the general scope of testing to which the patient
consented. Data management strategies should consider the masking of information that is outside
the scope of testing for a given patient sample. This may involve masking of loci other than those
targeted for analysis in a given patient. Masking may be performed at any stage during the
bioinformatics analysis pipeline, but must be performed prior to providing annotated variant calls for
review to a laboratory scientist to ensure the scientist is not exposed to information outside the scope
of testing.
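Masking of loci outside the scope of testing can be sketched as a filter applied to variant calls before they are presented for review. The example below is illustrative only: the record layout, function names, region coordinates and gene labels are hypothetical, and positions are assumed to be 1-based with inclusive region boundaries.

```python
def in_targets(chrom, pos, targets):
    """True when a position lies inside any (chrom, start, end) region."""
    return any(c == chrom and start <= pos <= end for c, start, end in targets)


def mask_variants(variants, targets):
    """Keep only variants that fall within the regions consented for testing."""
    return [v for v in variants if in_targets(v["chrom"], v["pos"], targets)]


# Hypothetical target regions for one patient's requested test.
targets = [("17", 1000, 2000), ("13", 500, 900)]

# Hypothetical variant calls, including one outside the test scope.
variants = [
    {"chrom": "17", "pos": 1500, "ref": "A", "alt": "G"},
    {"chrom": "13", "pos": 900, "ref": "C", "alt": "T"},
    {"chrom": "4", "pos": 1500, "ref": "G", "alt": "A"},
]

for_review = mask_variants(variants, targets)
print(len(for_review), "of", len(variants), "variants passed to review")
```

Applying the filter before annotation output is generated means that out-of-scope calls never reach the reviewing scientist, while the unmasked primary data remain available should the scope of testing later be extended with consent.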
5.2 Data storage and backup
5.2.1 The laboratory must establish a procedure for the storage and backup of data with
particular reference to the management of raw sequence data, primary, secondary, and
tertiary analysis files. The data files to be stored long-term must be identified.
5.2.2 The laboratory must ensure adequate data storage and backup capacity is available.
For MPS data this may require terabytes (TB) of storage to accommodate primary and secondary analysis files.
Network speed to manage data transfer and access also needs to be considered.
6. Resources
● PHG Foundation (2011): Next steps in the sequence. The implications of whole genome sequencing for health in the UK.
● Gargis et al, Assuring the quality of next-generation sequencing in clinical laboratory practice. Nat Biotechnol. 2012 Nov;30(11):1033-6.
● Opportunities and challenges associated with clinical diagnostic genome sequencing: a report of the Association for Molecular Pathology. Schrijver et al. J Mol Diagn. 2012 Nov;14(6):525-40.
● Vihinen, Guidelines for Reporting and Using Prediction Tools for Genetic Variation Analysis. Hum Mutat 34:275–277, 2013.
● College of American Pathologists: Molecular Pathology Checklist 2012. Includes massively parallel sequencing. Part of a suite of checklists available for purchase online; not available separately.
● Pabinger et al, A survey of tools for variant analysis of next-generation genome sequencing data. Brief Bioinform. 2013 Jan 21. PubMed PMID: 23341494.
● Analysis of in silico tools for evaluating missense variants, A summary report. National Genetics Reference Laboratory, Manchester. 2012.
● Best Practice Guidelines for the Use of Next Generation Sequencing (NGS) Applications in Genome Diagnostics: A National Collaborative Study of Dutch Genome Diagnostic Laboratories. Hum Mutat. 2013 Jun 17. PubMed PMID: 23776008.
● EuroGentest, Guidelines for diagnostic next generation sequencing. 2014.
● NPAAC standard: Requirements for the Retention of Laboratory Records and Diagnostic Material.
● NPAAC standard: Requirements for Medical Pathology Services.
● NPAAC standard: Requirements for Information Communication.
● Clinical Laboratory Standards Institute. MM09-A2: Nucleic Acid Sequencing Methods in Diagnostic Laboratory Medicine; Approved Guideline - Second Edition (February 2014)
Validation (generic):
• Jennings et al. Recommended Principles and Practices for Validating Clinical Molecular Pathology Tests. Arch Pathol Lab Med. Vol 133, May 2009
• Mattocks et al. A standardized framework for the validation and verification of clinical molecular genetic tests. EJHG 2010.
• NPAAC: Requirements for the Development and Use of In-House In Vitro Diagnostic Medical Devices (Third Edition 2014)
Validation (secondary analyses):
• Linderman et al. Analytical validation of whole exome and whole genome sequencing for clinical applications. BMC Medical Genomics 2014. 7:20.
• Cornish and Guda. A comparison of variant calling pipelines using genome in a bottle as a reference. BioMed Research International.
• Heinrich et al. The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process. Nucleic acids research.
• Meynert et al. Variant detection sensitivity and biases in whole genome and exome sequencing. BMC Bioinformatics 2014.
• Zook et al. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls. Nature Biotechnology. 2014.
• Meynert et al. Quantifying single nucleotide variant detection sensitivity in exome sequencing. BMC Bioinformatics. 2013.
• Chin et al. Assessment of clinical analytical sensitivity and specificity of next-generation sequencing for detection of simple and complex mutations. BMC Genetics 2013.
• Pirooznia et al. Validation and assessment of variant calling pipelines for next-generation sequencing. Human Genomics. 2014.
• O’Rawe, J. et al. Low concordance of multiple variant-calling pipelines: practical implications for exome and genome sequencing. Genome Med. 5, 28 (2013).
Validation (tertiary analyses):
• Walters-Sen et al. Variability in pathogenicity prediction programs: impact on clinical diagnostics. Molecular Genetics and Genomic medicine. 2014.
• McCarthy, D. J. et al. Choice of transcripts and software has a large effect on variant annotation. Genome Med. 6, 26 (2014).
Guidelines (tertiary analyses/ annotation):
• ACGS Practice Guidelines for the Evaluation of Pathogenicity and the Reporting of Sequence Variants in Clinical Molecular Genetics.
• ACMG Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine.
• CMGS Practice guidelines for Targeted Next Generation Sequencing Analysis and Interpretation.
Other:
• College of American Pathologists’ Laboratory Standards for Next-Generation Sequencing Clinical Tests. doi: 10.5858/arpa.2014-0250-CP
CHAPTER FOUR
Reporting
1. Introduction
The goal of a genomics report is to convey accurate, interpretable and succinct information that is
relevant to patient care. In the massively parallel sequencing era, this simple statement is becoming
increasingly difficult to put into practice. This chapter aims to provide guidelines and establish
principles that should assist in the preparation of a genomics report.
Approaches to genomic analysis vary in terms of the technology and methodology used as well as
the breadth of genetic variation that is interrogated; the analysis may yield information about a single
class of genetic variation or may extend to encompass all sequence and structural variants. This
presents a number of challenges for the clinical laboratory when preparing a report based on
genomic data. The issue of clinical validity and utility is an important concept to keep in mind when
formulating the report, although addressing this is beyond the scope of this document, and is already
well covered by existing legislation in NPAAC standard publications (validation of in-house IVDs and
nucleic acid testing).
A key issue in reporting genomic tests is that variants of known or possible pathogenicity may be
identified which may be unrelated to the primary clinical indication for the test. Such incidental
findings are inevitable in high-resolution genomic studies utilising Massively Parallel Sequencing
techniques which interrogate a greater proportion of the human genomic sequence, presenting
difficulties both for laboratories in reporting such variants and for clinicians receiving unsolicited
information. The potential for false positive results is also amplified by the increasing number of
genes interrogated, and the low prevalence of some of the disorders which may be the subject of
investigation by genomic sequencing.
A second key issue is that a large number of variants of uncertain clinical significance can be
identified and the reporting of this information requires careful management to minimise potential
harm while providing the maximum available, relevant information, for clinical management.
It is essential that laboratories producing reports of genomic tests have clearly defined, evidence
based protocols for classifying the clinical significance of detected genetic variants and addressing
incidental findings. These protocols should define, prior to the analytical result being available, which
outcomes will be reported and which will not, and should be available to requestors of
genomic tests.
It is also essential that these results are reported clearly, consistently and unambiguously, using
established nomenclature guidelines such as those available from the Human Genome Variation
Society (http://www.hgvs.org/mutnomen/) and relevant standardised reporting formats such as the
RCPA Guidelines for reporting molecular genetic tests to medical practitioners 2009. It must be
recognised that laboratory reports may be read by both experts and non-experts, and may be stored
for years in a patient’s medical record.
This chapter provides a guide for the reporting of NGS results in the clinical context. It has been
developed in the interests of ensuring the analytical and clinical validity of genomic reports and the
consistency and clarity of reporting, thereby assisting in the production of a report that is accurate,
interpretable, succinct and relevant to patient care.
2. Published Guidelines
A number of national and international professional bodies have issued policy statements regarding
the clinical application of genomic sequencing that include guidelines for the reporting of genomic
testing. It is advised that these are consulted to provide a broader overview of the issues relating to
the reporting of genomic data.
Published Guidelines for the Clinical application of Genomics that include recommendations on the
reporting of massively parallel sequencing data:
• EuroGentest 2014
• Rehm HL et al. Working Group of the American College of Medical Genetics and Genomics Laboratory Quality Assurance Committee. ACMG clinical laboratory standards for next-generation sequencing. Genet Med. 2013;15(9):733-47
• van El CG et al. Whole-genome sequencing in health care. Recommendations of the European Society of Human Genetics. Eur J Hum Genet. 2013 Jun;21 Suppl 1:S1-5
• Brownstein CA et al. An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge. Genome Biol. 2014;15(3)
• Weiss, Van der Zwaag et al. Best practice guidelines for the use of next-generation sequencing applications in genome diagnostics: a national collaborative study of Dutch genome diagnostic laboratories. Hum Mutat 2013; 34:1313-21
• College of American Pathologists’ Laboratory Standards for Next-Generation Sequencing Clinical Tests. Arch Pathol Lab Med.
• Association for Clinical Genetic Science (ACGS) Practice guidelines for Targeted Next Generation Sequencing Analysis and Interpretation
• Scheuner MT, Hilborne L, Browne J, Lubin IM, et al. A report template for genetic tests designed to improve communication between the clinician and the laboratory. Genet Test Mol Biomarkers. 2012 Jul;16(7):761-9.
• Lubin IM, Caggana M, Constantin C, Gross SJ, Lyon E, Pagon RA, et al. Ordering molecular genetic tests and reporting results: practices in laboratory and clinical settings. J Mol Diagn. 2008 Sep;10(5):459-68.
• Lubin IM, McGovern MM, Gibson Z, Gross SJ, Lyon E, Pagon RA, Pratt VM, et al. Clinical perspective about molecular genetic testing for heritable conditions and development of a clinician friendly laboratory report. J Mol Diagn. 2009 Mar;11(2):162-71.
• CLSI standard MM09-A2 Nucleic Acid Sequencing Methods in Diagnostic Laboratory Medicine (second Edition Feb 2014)
3. The Clinical Context of Testing
3.1 The laboratory should have explicit requirements regarding the level and quality of clinical
information that must be provided when a genomic test is requested.
The Massively Parallel Sequencing Genomics report must include the context in which testing has
been requested. This assists with correct interpretation of the report, and re-statement of the clinical
context for testing is currently an NPAAC requirement. Detailed and informative clinical information is
also critical for the laboratory to produce a report that presents its conclusions in the most appropriate
clinical context.
The clinical interpretation of genomic analyses is improved when testing laboratories are provided
with discriminating clinical details. Bioinformatics pipelines are capable of providing interpretative
comment on identified variants. Reporting laboratories should ensure that they understand the
limitations of informatics interpretations and provide adequate review of automatically generated
interpretations in the clinical context of testing.
3.2 The laboratory should maintain active dialogue with requesting clinicians regarding
requests, interpretation, and reporting of genomic tests.
Strengthening laboratory liaison with requesting clinicians is essential in any rapidly developing field
of medical testing. The interpretation and reporting of genomic test results would benefit from a team
approach, whereby clinical laboratory scientists, pathologists, clinical geneticists, and other medical
professionals are involved in genomic data interpretation.
This liaison should extend from requesting of the test and defining the most appropriate genomic
testing through to interpretation of the report.
Astute clinical assessment and laboratory-clinician consultation can focus genomic analysis on
specific genomic regions, potentially enabling diagnosis of monogenic disorders, or clarify differential
diagnoses.
4. Variant Reporting
4.1 The laboratory should consistently classify genomic variants according to their clinical
significance.
Variants that are classified as benign or even likely benign could potentially not be included in a
report, with reports confined to variants classified as pathogenic or likely pathogenic (EuroGentest
2014 final draft; ACMG Standards and Guidelines 2015). Local policy can dictate which variant
categories are to be reported; however, a record of all variants identified should be maintained by the
testing laboratory and should be readily accessible for review and may be disclosed upon request to
a clinician. The report should clearly state the laboratory's reporting policy indicating which classes of
variants have been reported, highlighting the possible existence of variants which may not appear on
the report.
The classification of a variant as benign or pathogenic must be based on a secure evidence-base
such that significant reclassification in the future is unlikely without additional and convincing
functional data. However, the observation of a variant of unknown significance may require a fresh
evaluation of that variant to reflect new information that may be available.
To ensure consistency, it is highly recommended that a reporting laboratory maintain an in-house
database of variants that is professionally curated to a standard acceptable for clinical use (RCPA
Clinical database standards document reference) and submits variant data to a clinical standard
external database.
4.2 The laboratory should have a clear policy regarding the reporting of variants and this
policy must be readily available to referrers.
Irrespective of how genomic variants are classified, there is likely to be a substantial number of
variants of ‘unknown significance’ for which there is no relevant evidence to assist interpretation, i.e.
there is no evidence base on which to determine clinical significance for the condition under
investigation.
Reporting laboratories should be aware that results of uncertain significance may be over-interpreted
on the basis of a limited understanding of contextual information. As such,
reporting laboratories must minimise the potential for readers of genomic test reports to misconstrue
the clinical significance of certain clinical categories of genomic findings and the ensuing
anxiety/harm that this may bring to patients (and their families). The report should be transparent in
how it has reached its conclusions, and include information relating to how the findings may influence
subsequent clinical judgment, including suggestions for further testing, if necessary.
4.3 Assessing the Available Evidence
The laboratory’s interpretation of clinical significance should be based on sound evidence. Peer-reviewed literature and clinical or near clinical quality databases could be regarded as high quality
primary evidence in the assessment of clinical significance for a particular genomic variant. The
RCPA Standards for Clinical Databases of Genetic Variants document is a useful guide for measuring
the quality of a clinical database. Functional studies on genetic variants are occasionally performed
by clinical laboratories to evaluate clinical significance. Examples include RNA studies to determine
the effect of a variant on RNA splicing events, and protein functional studies to determine protein
activity. Caution must be exercised with in-house studies which have not been subject to peer review
processes to ensure that appropriate controls are included and that the results are analytically valid.
Furthermore, in assessing protein function, care must be taken in determining which functions of a
multi-functional protein are relevant to the disease state and therefore should be assessed.
When evaluating literature, the quality of the publication should be taken into account. Critical review
of literature cited in reports is a requisite competency for any laboratory geneticist. This should
include consideration of statistical significance requirements in case/control and comprehensive
familial series.
When utilising genomic databases consideration should be given to the purpose of each database
and the processes used for the classification of variants submitted to the database.
It should be noted that merely appearing in a database with a classification of pathogenic does not
constitute final proof that a particular variant should be reported as pathogenic in the clinical context
of the report. It is more likely that a database entry will provide the starting point for the collection of
evidence of proof of pathogenicity and that multiple databases and sources will be required before
arriving at this conclusion.
In general, classifications should be based on multiple independent lines of evidence such as in vitro
or in vivo functional data, segregation with disease, algorithmic prediction of protein function or RNA
transcription events and disease/normal population variant frequencies.
Care should be taken to avoid over-interpreting early reports showing enrichment of a variant in
“affected” populations, particularly those that have not been verified in replication studies. This should
also apply to variants found to confer low relative risk and predictive power for common
(multifactorial) diseases and traits.
4.4 Variant Interpretation
4.4.1 The laboratory professionals who provide clinical interpretations of genomic variants
should understand, have access to, and utilise up-to-date resources to aid them in their task.
4.4.2 The laboratory should develop a protocol for the assessment of variant pathogenicity.
See section 4.1 above.
Novel and rare genetic variants pose the greatest interpretational challenge owing largely to the
continued lack of high-quality, large-scale control data. Additionally, there is often a lack of
information on rare variants for evidence based assessment.
The task of assessing seemingly novel or rare variants is challenged by the accelerating pace of
discovery of rare variants associated with phenotypic abnormality, as well as a growing number of
clinically significant variants recognised to have incomplete penetrance and variable expression.
The potential influence of genetic background, in particular modifier sequence variants, is a
confounding factor and should be considered where there is any evidence to suggest there may be a
modifier effect.
Report writers should be aware of the possibility that the initial diagnosis arrived at in the laboratory
might be incomplete and that additional clinically significant genetic variants (that have gone
undetected or remain unclassified due to lack of an evidence base at the time of reporting) may
underlie the patient’s non-specific or variable phenotype.
The laboratory geneticist should not over-interpret the genomic analysis result, especially where a
variant of ‘uncertain’ significance is concerned, in recognition that these findings do not necessarily
indicate a diagnosis for the patient. Reported assertions regarding variant pathogenicity should ideally
include any mitigating clinical information such as the inheritance pattern, clinical context, and
phenotype.
Analysts should also provide insight into the extent of what remains unknown and the challenge this
brings to the task of clinically interpreting genomic findings.
The report should include or cite the evidence which justifies the conclusion regarding the clinical
significance of identified genomic variants.
4.5 Systematic Review of Variant Interpretations
4.5.1 The reporting laboratory must have a written protocol for the review of variants and this
protocol must be available to referrers.
Interpretation of the clinical significance status of a genomic variant may change in the light of new
information.
A key question requiring consideration is how (validated/issued) reports that require modification in
light of new information, with their attendant clinical risks, should be dealt with. In practical terms,
responsibility for a specific patient lies primarily with the physician with an ongoing patient
relationship.
Recommendations that call for continued review of variants (including ‘benign’, ‘likely benign’,
‘unknown’ and ‘uncertain’ significance findings), which collectively may number in the tens of
thousands per sample in the case of whole exome sequencing and whole genome sequencing, have
significant resource implications for laboratories, particularly for professional time. Scheduled review
based solely on defined time intervals also raises the possibility of re-issuing clinically inappropriate
reports performed in isolation from the referring clinic.
It is advised that laboratories have a formal process for evaluating new evidence, re-interpreting
previous, individual patient results, re-contacting referrers, and contributing to patient reviews, where
required. However, a bi-directional flow from the clinic through continued review of patient files can
also contribute to the timely review of variants.
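As an illustration of evidence-driven (rather than purely time-based) review, the sketch below compares the classification recorded when a report was issued with the laboratory's current classification and lists the reports that differ. It is a minimal sketch only: the record structure, the field names and the use of a five-class scale (see 6.1) are assumptions made for the example, not a prescribed protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedVariant:
    report_id: str       # hypothetical laboratory report identifier
    variant: str         # variant description as issued on the report
    reported_class: int  # class 1 (benign) to class 5 (pathogenic) at issue


def reports_needing_review(reported, current_classes):
    """Return (report_id, variant, old_class, new_class) for every issued
    variant whose current classification differs from the reported one.

    `current_classes` maps a variant description to its current class;
    variants missing from the mapping are left unflagged.
    """
    flagged = []
    for item in reported:
        new_class = current_classes.get(item.variant)
        if new_class is not None and new_class != item.reported_class:
            flagged.append(
                (item.report_id, item.variant, item.reported_class, new_class)
            )
    return flagged
```

A list produced this way only identifies candidates for review; whether and how a report is re-issued remains a matter for the laboratory's written protocol and consultation with the referrer.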
5. Incidental Findings
Genomic analysis will inevitably detect clinically significant variants that are unrelated to the clinical
features that prompted testing. The issues associated with detection of these remain under
discussion, but their solutions will no doubt involve an emphasis on counselling and education before
testing is performed, informed consent with a clear explanation of the current limits of testing and
interpretation, maintenance of privacy and confidentiality, and sensitivity to culture within families,
their heritage, and their communities. For further information see Section 1: Ethical and Legal Issues.
Commonly encountered examples of unsolicited findings detected during genomic testing include:
• Detection of consanguinity and incest, where this was not known to ordering clinicians and
families;
• Detection of carrier status for autosomal recessive disorders unrelated to the clinical indication
for genomic testing;
• Detection of variants involving highly penetrant genes associated with dominant, adult-onset
conditions.
5.1 Laboratories performing genomic testing should have clear policies in place for disclosure
of incidental findings.
It is advisable that clinicians and patients be informed of these policies and the types of incidental
findings that will be reported.
Clinicians may give patients the option of not receiving certain results. While these policies should be
in place, exceptional circumstances may arise which need to be handled judiciously on a case-by-case basis through laboratory-clinician consultation.
In keeping with the principles of good laboratory practice, uncertainty associated with reporting
incidental findings is usually best managed with input from a medical genetic specialist and/or the
referrer.
Reporting of incidental findings should be limited to variants that are unequivocally classified as
pathogenic or likely pathogenic and are deemed reportable according to the laboratory’s policy.
6. The Genomic Test Report
Clinical Genomic reports should follow the general principles for reporting of genetic tests as
described in the RCPA guidelines (Guidelines for reporting molecular genetic tests to medical
practitioners, 2009) https://www.rcpa.edu.au/getattachment/3cb35802-3cfa-49fa-9d2f4f199501c551/Guidelines-for-reporting-Molecular-Genetic-Tests.aspx and be consistent with ISO
15189 as a minimum requirement. However, the genomics report is likely to need to convey a much
larger amount of information both in terms of test data and in terms of descriptive information about
the test and particularly its limitations.
The challenge is to produce a report that contains all the relevant information for accurate
interpretation of the report while avoiding information overload and possible distraction from the
actual result required for patient care. One approach to solving this problem is to provide a report with
a prominent one-page patient summary containing all class 4 and 5 variants relevant to the request,
supplemented with further information relating to the actual test (limitations, bioinformatics
pipeline description and metrics) and variants other than class 4 and 5 if appropriate to the request.
Such multi-page reports would need to comply with all relevant standards and guidelines for reporting,
including page number, report date and the inclusion of patient demographics on each page to
ensure unambiguous linking of the entire report.
A consistent approach to reporting genomic findings is important, particularly for families dispersed
across state or national boundaries. Crucially, those responsible for reporting should appreciate that
interpretive differences may influence medical management and patient choices. Even if a report is
directed to the expert requesting clinician, it is also important to note that reports may be included in
medical records and hence be read by non-experts involved in the patient’s care. Hence, every effort
should be made to ensure that the report is succinct, clear and interpretable by as wide a range of
relevant clinicians as possible.
6.1 Key requirements of a Genomics Report
The minimum suggested content for a report is described below. The following list is not a
recommendation for the structure of the report, which could be ordered to provide a one-page
summary of important results relevant to patient care followed by additional pages detailing the test
and any further results that may be appropriate.
Report Details
• Reporting laboratory details
• Title of report
• Report status and Report authorisation
• Issue date and time of report
Patient identification
• Name
• Date of Birth
• Unique laboratory identifier
• Gender
• Ethnicity if relevant to testing
Patient diagnosis context
• Clinical details on request
• Specimen
• Type (blood, tissue and site, fluid)
• Secondary specimen identifier (Block number, referring laboratory identifier)
Test description
• Test category
• Purpose of test (e.g. to assist in the diagnosis of … or the exclusion of…)
• Genes tested list
• Methodology used including confirmation of variants by an orthogonal method if performed
• Limitations to test including any remaining uncertainty where it exists
Result summary
• Inheritance model used for sequencing data analysis if relevant
• Gene name using HGNC approved gene symbol
• Zygosity
• cDNA nomenclature utilising standardised nomenclature (HGVS recommended)
• Protein nomenclature utilising HGVS recommended nomenclature
• Genomic coordinates utilising HGVS recommended nomenclature based on an LRG where
available and a RefSeqGene record if not available.
• Reference sequences including genome build or reference sequence version
• Variant reporting policy for the reporting laboratory that complies with relevant guidelines
Interpretive comment
• Variant classification as class 1 (benign) through to class 5 (pathogenic)
• Narrative comment indicating the relevance of the identified variants to the reason for the test
request
• If applicable the need for follow up or confirmatory testing should be indicated on the report
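The result fields listed above lend themselves to a structured record that can be checked for completeness before a report is authorised. The following is a minimal sketch under stated assumptions: the class name, field names and the completeness rule are hypothetical and are not a format defined by this guideline.

```python
from dataclasses import dataclass, fields


@dataclass
class VariantResult:
    gene_symbol: str         # HGNC approved gene symbol
    zygosity: str
    cdna_hgvs: str           # cDNA description (HGVS nomenclature)
    protein_hgvs: str        # protein description (HGVS nomenclature)
    genomic_hgvs: str        # genomic coordinates (HGVS nomenclature)
    reference_sequence: str  # includes genome build or sequence version
    classification: int      # class 1 (benign) to class 5 (pathogenic)


def missing_fields(result):
    """Return the names of fields that are empty or out of range."""
    problems = [
        f.name for f in fields(result) if getattr(result, f.name) in ("", None)
    ]
    out_of_range = result.classification not in (1, 2, 3, 4, 5)
    if out_of_range and "classification" not in problems:
        problems.append("classification")
    return problems
```

An empty list from `missing_fields` indicates only that every field is populated; it does not check that the nomenclature itself is valid.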
6.2 Laboratories should include recommendations for appropriate follow-up in reports.
In situations in which further genetic studies may be warranted (e.g. parental testing, segregation
analysis, testing of other tissues), these recommendations should be included in the test report.
6.2.1 Examples/resources
• Caris NSCLC example report
• Foundation One Sample Report for Cancer Related Genes
7. Internal Laboratory Databases
7.1 Laboratories should establish an internal database of genomic findings.
This could serve the purpose of identifying common genomic variants specific to a patient population
and/or recurrent false-positive calls associated with a particular genomic platform.
The curation of an internal laboratory- and platform-specific list of common benign variants can assist
with the interpretation process. This could aid with the process of systematic review of variant
interpretations.
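A simple way to surface candidates for such a list is to tally how often each variant call recurs across the samples held in the internal database. The sketch below assumes a hypothetical in-memory mapping of sample identifiers to variant calls; the threshold is a laboratory decision and no value is recommended here.

```python
from collections import Counter


def recurrent_variants(calls_by_sample, min_fraction):
    """Return {variant: fraction of samples} for variants observed in at
    least `min_fraction` of samples.

    `calls_by_sample` maps a sample identifier to the collection of
    variant descriptions called in that sample.
    """
    n_samples = len(calls_by_sample)
    if n_samples == 0:
        return {}
    counts = Counter()
    for variants in calls_by_sample.values():
        counts.update(set(variants))  # count each variant once per sample
    return {
        variant: count / n_samples
        for variant, count in counts.items()
        if count / n_samples >= min_fraction
    }
```

A high recurrence fraction does not by itself distinguish a common population variant from a recurrent platform artefact; each flagged variant still needs assessment before it is added to a benign or false-positive list.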
Any handling of data derived from medical testing should comply with the relevant regulatory and
legislative requirements.
8. Sharing Genomic Data
8.1 Laboratories are strongly encouraged to submit genotypic data from genomic testing to
appropriate clinical databases to facilitate consistency of interpretation across laboratories.
Genomic testing would benefit from the availability of clinically vetted, regularly updated databases of
annotated variants that ideally would include population frequencies and referenced clinical relevance
for each variant. Thus, there is a need for consolidation of the various genotype-phenotype databases
available into a commonly available and perhaps centralized clinical-grade resource that is publicly
accessible.
The availability of phenotypic information is essential for the investigation of genotype-phenotype
relationships both at the individual patient level and at the global level. Integrating the phenotype
information into these clinical-grade resource databases will greatly assist with interpretation of
variants identified from genomic analyses. Laboratories should be encouraged to contribute
phenotypic information whilst being aware of the issues posed by privacy concerns, data complexity
and lack of uniform methods for collection of phenotypic data.
9. Reporting of Somatic Variants
The reporting of somatic variants in oncology should follow the recommendations described
previously for germline variants, with the following additional considerations. If the variants are
known to predict a likely response to specific therapeutic agents, then the relationship between the
presence or absence of the variant and the agent must be included on the report. Due to the
heterogeneous nature of tumours, the percentage of the variant allele that is considered significant to
recommend or not recommend use of the agent should be included on the report along with the
uncertainty of measurement for the variant. The percentage of tumour cells in a solid tumour sample
or blasts in a haematological malignancy must also be included in the report as contextual information.
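As a worked illustration of the variant allele percentage, the sketch below computes the variant allele fraction from read counts at a position, together with a Wilson score interval. The interval is an assumption made for the example: it reflects read-sampling error only and is not a complete estimate of the uncertainty of measurement, which also depends on the assay and the sample. The function names are hypothetical.

```python
import math


def variant_allele_fraction(ref_reads, alt_reads):
    """Fraction of reads at a position that support the variant allele."""
    if ref_reads < 0 or alt_reads < 0:
        raise ValueError("read counts must not be negative")
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


def wilson_interval(alt_reads, total_reads, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("need 0 <= alt_reads <= total_reads and total_reads > 0")
    p = alt_reads / total_reads
    denom = 1 + z * z / total_reads
    centre = (p + z * z / (2 * total_reads)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / total_reads + z * z / (4 * total_reads**2))
        / denom
    )
    return centre - half_width, centre + half_width
```

For example, 20 variant reads out of 100 give a fraction of 0.2, and the interval narrows as read depth increases.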
10. Summary Comment
The utility of a genomics report can be increased by the preparation of a standardised report that
adheres to established guidelines. However, of equal importance is the need to ensure that clinicians,
genetic counsellors, or others who read these reports have the necessary training and support to
optimise interpretation of genomic test results for patient care. In part, this can be achieved by greater
collaboration between requestors of genomic tests and the laboratories performing genomic tests;
however, formal education of requestors in the interpretation of genomics reports is likely to further
increase the utility of genomics reports.
Specific training for genomic testing needs to be incorporated into existing professional development
programmes for laboratory geneticists, clinical geneticists and other clinical specialities (e.g.
cardiology, obstetrics, neurology) and primary care medical practitioners who may be likely to request
genomic testing. With the advent of direct-to-consumer testing both in Australia and abroad, genomic
education of the general public should also be addressed, and may be made feasible by technological
advances in mobile applications.
It is suggested that individual laboratories reporting genomic tests play an active role in developing
and implementing a continuing professional development program focussed on genomic testing by
establishing structured in-house training programs for their staff and potential referrers who are
seeking to extend their professional competency into this arena. Such a programme should focus on
the tests offered by the individual reporting laboratory.
10.1 Resources
• NPAAC. Requirements for Medical Testing of Human Nucleic Acids 2012.
• NPAAC. Requirements for Information Communication 2007.
• International System for Human Cytogenetic Nomenclature (ISCN).
• Genetic testing in asymptomatic minors: Recommendations of the European Society of
Human Genetics. Eur J Hum Genet. 2009 Jun;17(6):720-1.
• RCPA Guidelines for reporting molecular genetic tests to medical practitioners 2009
• HGVS Nomenclature for the Description of Sequence Variants
• Riggs et al. Phenotypic Information in Genomic Variant Databases Enhances Clinical Care
and Research: The International Standards for Cytogenomic Arrays Consortium
Experience. Hum Mut 33:787–796, 2012
• Kircher et al. A general framework for estimating the relative pathogenicity of human genetic
variants. Nat Genet 2014;46:310-5
• MacArthur et al. Guidelines for investigating causality of sequence variants in human
disease. Nature 2014; 508:469-76
• Samocha et al. A framework for the interpretation of de novo mutation in human disease.
Nature Genetics 2014 46, 944-950
• Hofman et al. Yield of molecular and clinical testing for arrhythmia syndromes: report of 15
years' experience. Circulation. 2013 Oct 1;128(14):1513-21
• Thompson et al. Application of a 5-tiered scheme for standardized classification of 2,360
unique mismatch repair gene variants in the InSiGHT locus-specific database. Nat Genet
2014; 46:107-15
• Opportunities and Challenges Associated with Clinical Diagnostic Genome Sequencing. A
Report of the Association for Molecular Pathology. The Journal of Molecular Diagnostics,
2012.
• Emory University public-facing database
• de Leeuw et al. Diagnostic interpretation of array data using public databases and internet
sources. Hum Mut 33:930–940, 2012.
• ESHG. Whole Genome Sequencing and Analysis and the Challenges for Health Care
Professionals: Recommendations of the European Society of Human Genetics. 2012 (draft)
CHAPTER FIVE
IT Infrastructure
1. Introduction
Genomic technologies introduce complex analytical methods which require substantial bioinformatic
and IT infrastructure which are not the usual domain of regulatory and/or accreditation agencies. This
chapter will discuss the specific IT infrastructure issues that should be addressed by laboratories
considering genomic methods.
1.1 IT process overview
Following sequencing, primary analysis (base-calling) usually occurs on the sequencing instrument
and is beyond the control of the user.
Secondary analysis (alignment and variant-calling) can occur on- or off-instrument.
Tertiary analysis usually occurs off-instrument. Most of the sequencing manufacturers provide
appropriate computing power and storage on-instrument. Where analyses occur off-instrument, it is
necessary for the laboratory to consider the following issues (also see Figure):
• The level of processing power required to perform timely analyses
• The need to ensure data integrity during transfer across a network
• Data management and storage
Given the vast potential of genomic methods to generate genome-wide data, laboratories will need to
actively consider precisely which data they will store and the retention time of that data. In some
cases institutional IT departments and policies may be able to accommodate data
within centralized storage facilities. However, there may be many cases where this is not possible
and the problems will need to be addressed locally.
2. Data processing infrastructure and capacity
2.1 The computing hardware and other IT infrastructure should be fit for purpose.
Specific requirements will vary according to the platform and style of analysis (i.e. the requirements
for small-scale targeted sequencing will be different to those for whole-exome or whole-genome
sequencing). The choice of computing hardware specification (i.e. type and number of CPUs or
GPUs, amount of RAM, type and amount of storage, platform and operating system) will be governed
by the chosen software/analytical pipelines (see Bioinformatics chapter).
2.2 Computing hardware should at least meet the minimum specifications of the software.
Further consideration should be given to equipment which exceeds the minimum specifications in
order to reduce processing time, and hence turnaround time.
2.3 The laboratory should show that the choice of hardware and software can be maintained
appropriately, including installation, updates and troubleshooting.
The choice of operating system will also be largely determined by the specific software and analytical
programs being used. At a minimum, a 64-bit operating system should be installed (memory
allocation can be severely restricted in some/all 32-bit operating systems).
2.4 The chosen computing hardware should be shown capable of performing the required
analyses and/or capable of running the chosen software using training/control datasets (i.e.
datasets with characteristics consistent with clinical samples to be analysed).
Datasets may be supplied by software providers, or may be obtained externally.
3. Data Transfer
3.1 Wherever possible, data should not be transferred using USB “memory sticks” or external
hard drives.
3.2 Consideration should be given to the use of high-speed network connections between the
various components of the computing hardware.
Genomic methods have the capacity to generate very large data files. During analysis data may need
to be transferred to different computing hardware (i.e. from sequencer to analytical computer or from
sequencer to storage location). A speed of 1 gigabit/second (i.e. Gigabit Ethernet) is suggested as a
minimum data transfer speed. This requirement will affect network cables as well as routers/switches.
Infrastructure capable of faster transfers will reduce delays introduced by the transfer of large files.
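The effect of link speed on transfer delay can be estimated with simple arithmetic: transfer time is the file size in bits divided by the usable link rate. The sketch below is illustrative only; the function name, the `efficiency` parameter and the 100 GB file size are assumptions made for the example, not figures from this guideline.

```python
GIGABIT = 1_000_000_000  # bits per second (Gigabit Ethernet nominal rate)


def transfer_time_seconds(file_size_bytes, link_bits_per_second, efficiency=1.0):
    """Estimate the time to move a file over a link of the given speed.

    `efficiency` (greater than 0, at most 1) is a hypothetical allowance
    for protocol overhead and contention; 1.0 means the full nominal rate.
    """
    if link_bits_per_second <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    return file_size_bytes * 8 / (link_bits_per_second * efficiency)


# An illustrative 100 GB file (10**11 bytes) at the full nominal rate.
seconds_at_gigabit = transfer_time_seconds(100 * 10**9, GIGABIT)
seconds_at_100_megabit = transfer_time_seconds(100 * 10**9, GIGABIT // 10)
```

Under these assumptions the file takes 800 seconds (about 13 minutes) at 1 gigabit/second and ten times as long at 100 megabit/second; real transfers will be slower once overheads are included.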
3.3 Confidentiality of data should be maintained during data transfer.
3.4 Appropriate steps should be taken to ensure that data corruption does not occur during
transfer.
This is a significant issue, especially as files increase in size. Laboratories should implement a
system to show that data transferred between different elements of their computing hardware have
not been corrupted during the transfer. Consideration should also be given to similar mechanisms for
data transferred to external organisations for analysis. Checksums for individual files or compressed
files can be generated using a variety of software packages.
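As one example of such a mechanism, a checksum can be computed on the source file and again on the copy at the destination, and the two values compared. The sketch below uses SHA-256 from the Python standard library; the choice of algorithm and the function names are assumptions made for illustration, and other checksum tools can serve the same purpose.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks so that
    very large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source_path, destination_path):
    """True when the source file and its copy have identical digests."""
    return sha256_of_file(source_path) == sha256_of_file(destination_path)
```

In practice the digest of the source file is often recorded alongside the file before transfer, so that the destination can be verified later without needing access to the source.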
3.4.1 Resources
See also resources section in Ethical & Legal Issues.
4. Data management and storage
4.1 The laboratory should determine and justify which data are to be stored.
During data generation and analysis, a series of files of varying sizes are created.
In Sanger sequencing, the stored data includes unedited chromatograms (“raw” data), edited
chromatograms, sequence alignments and summarized results/reports. Equivalent components can
be identified within NGS pipelines, although the amount of storage required will be significantly larger.
Some genomic data may need to be repeatedly accessed and analysed over a greater period than
expected in typical data retention policies (e.g. whole genome or whole exome data). Where possible,
the laboratory should determine the feasibility of very long term data retention. The laboratory should
develop a formal data management policy which minimizes the possibility of data loss. During
analysis, genomic data will be transferred to a number of different computers for analysis and/or
storage.
4.2 The laboratory should ensure that data are stored in a manner that prevents loss in the
event of hardware failure (i.e. data should have redundant backup).
The specific choice of computing hardware for storage purposes will vary between laboratories. The
specifications of storage devices will be substantially different from the specifications of processing
devices (see above). The important characteristics of storage devices will be quantity, speed and
redundancy. It is suggested that “solid state” devices are inappropriate for long-term data storage as
their life-span has not been empirically determined.
Cloud storage has the potential for reducing the loss of data due to hardware failure, and is readily
scalable, but issues of bandwidth for access and confidentiality of identifiable data remain major
concerns.
4.2.1 Resources
• The potential for cloud computing services in Australia. A Lateral Economics report to
Macquarie Telecom. October 2011.
• Financial Considerations for Government use of Cloud Computing. Australian Dept of Finance
& Deregulation. Nov 2011.
• See also section in Ethical & Legal Issues.
4.2.2 Other Resources
This section lists some of the documents which address quality issues in genomic sequencing
generally. More specific references are provided in subsequent sections of this document.
• Gargis et al. Assuring the quality of next-generation sequencing in clinical laboratory
practice. Nat Biotechnol 2012 Nov;30(11):1033-6. doi: 10.1038/nbt.2403. PubMed PMID:
23138292.
• CDC. Next Generation Sequencing: Standardization of Clinical Testing (Nex-StoCT)
Working Groups.
• CDC. Next-generation Sequencing: Standardization of Clinical Testing (Nex-StoCT)
Workgroup Principles and Guidelines.
• Weiss et al. Best Practice Guidelines for the Use of Next Generation Sequencing (NGS)
Applications in Genome Diagnostics: A National Collaborative Study of Dutch Genome
Diagnostic Laboratories. Hum Mutat 2013 Jun 17. PubMed PMID: 23776008.
• Schrijver et al. Opportunities and challenges associated with clinical diagnostic genome
sequencing: a report of the Association for Molecular Pathology. J Mol Diagn 2012
Nov;14(6):525-40.
• Next Generation Sequencing (NGS) guidelines for somatic genetic variant detection. New
York State Department of Health, 2015 update.
• Rehm et al. ACMG clinical laboratory standards for next-generation sequencing. Genet
Med 2013 Jul 25. doi: 10.1038/gim.2013.92. [Epub ahead of print] PubMed PMID:
23887774.
Acknowledgements
The RCPA wishes to acknowledge Dr Melody Caramins (Editor-in-Chief, and Reporting chapter
writing committee), the immediate past Editor-in-Chief Professor Graeme Suthers, and the Chapter
Editors Dr Janice Fletcher (Ethical and legal issues), Associate Professor Bruce Bennetts (Wet Lab),
Professor Leslie Burnett (Bioinformatics), Dr Cliff Meldrum (Reporting) and Professor Graham Taylor
(IT infrastructure); the funding provided by the Australian Government Department of Health & Aging,
and the RCPA Genetics Advisory Committee. Writing committee contributors included: Ethical and
legal issues chapter: Associate Professor David Amor, Professor Leslie Burnett, Dr Mark Davis, Mr
Mike Ralston, and Associate Professor Meredith Wilson. Wet Lab Chapter: Dr Desiree DuSart, Dr
Andrew Fellowes, Professor Nelson Tang, Dr Elizabeth Tegg. Bioinformatics chapter: Dr Douglas
Chesher, Dr Warren Kaplan, Dr Karin Kassahn, and Dr Joshua Ho. Infrastructure chapter: Dr Denis
Bauer, Mr Ken Doig, Dr Arthur Hsu, and Associate Professor Andrew Lonie. Reporting Chapter:
Professor Leslie Burnett, Dr Melody Caramins, and Dr Peter Taylor.