MDD_MR3_guidance

advertisement
Medical Devices and MR sequences – GUIDANCE
File: MDD_MR3_guidance
Version – May 16th 2005
1. Introduction
a. Orientation: This document contains guidance notes, intended to support the MR physicist in
answering the questionnaire (MDD_MR2_questionnaire).
b. Background.: The EU Medical Devices Directive (MDD) places some obligations on anyone who
gives software (such as a MR sequence, or analysis program) to an external organisation. The
obligations also constitute a ‘best practice’ which improves the scientific quality of our devices.
Geoff Cusick, from the Department of Medical Physics and Bioengineering, UCLH, gave a very
informative talk to the Division on this topic on May 10th 2004. Then document MDD_MR_ER
gives the Essential Requirements (ER’s) arising from the MDD. This document addresses the issues
arising from work carried out by physicist members of the Division, and sets out to define a simple
procedure for validation of such products within the context of the MDD.
c. Aims of this project:
i. Understand the MDD and its Essential Requirements (ER’s)
ii. Identify the subset of ER’s relevant to MR (and particularly quantitative MR, qMR, seen as
a measuring device)
iii. Develop MR responses to the MDD, both generic and specific to a particular device
iv. Develop a friendly checklist for MR physicists who dread approaching the MDD itself.
d. Procedure/timeline/jobs: This is still a draft procedure, and feedback from various sources is
needed in order to establish its validity (to ensure we are operating within ‘best practice’, as seen by
the MR community). Specifically:
i. Within the Division of Neuroradiology and Neurophysics, each group has a representative
to be concerned about these issues. This should be convened to test and provide input.
ii. Feedback from UCLH trust (via Geoff Cusick). (also existing practice for measuring
devices in other areas e.g. electrical)
iii. IPEM MR Special Interest group (SIG), and any national bodies they recommend
iv. UCL HR (for issues of employer/employee liability)
v. I presented this work at an IPEM meeting on Software as a Medical Device on November
12th 2004. A summary of feedback is given in appendix 5.
vi. More is needed on the regulatory bodies limits on B1 SAR and dB/dt.
vii. We have to ‘beta-test’ this on a few people and devices internally.
viii. Eventually it’s probably good for web- and journal publication (MAGMA?)
ix. We should/could start adding ‘validation’ paragraphs to our methods journal papers, using
information from this MDD procedure.
x. We may need to set up formal collaborations, honorary contracts etc to prevent some
sequences being formal medical devices, or to limit liability.
xi. A document will be presented to the Division in January 2005
xii. The generic and guidance text should be applicable to IoN/Queen Square machines from
various manufacturers.
e. More background information is given in appendix 1.
2. General guidance notes for completing the questionnaire
a. These are proposed as part of developing good practice. Some are suggested as a result of errors
that have been made locally.
1
b. All devices should have an instruction manual of some kind; this could be a few lines of on-line
help, and/or a pointer to a fuller document.
c. When errors are made, these should be analysed and learnt from. The procedures should be
modified to at least prevent such known errors from taking place again. Some case studies are given
in the appendix. See also the cases from eWEEK (see references at end)
d. Devices should have ‘test modes’, where the likely errors can be anticipated. The appendix case
studies suggest ‘test modes’ for specific devices
e. Before being handed over for routine use by non-technical people, an independent person should
look at the device and check its operation.
f. For quantitative MR we are in essence producing a measuring instrument, and should follow, as
much as possible, the traditions that have built up in that area. These include specifying the
accuracy and precision, or systematic and random errors, or using the modern approach of
uncertainty and uncertainty budget. We cannot claim to be producing qMR tools unless we have
followed this practice.
g. Human error is always present (see the case studies!). Design has to include recognition and
acceptance of its presence in prototype devices, by designing test procedures to detect it.
h. Testing software and sequences: Gareth Barker’s advice: see his email of 28 June (appendix 4)
3. specific guidance notes for the questionnaire, for each Specific Response (SR). These are prompts
(questions) to drive your SR. For non-quantitative devices (e.g. a visualisation sequence) ignore SR3-6.
a. SR1. The device (whether sequence or data analysis program) must be shown to work as
intended [ER3, ER12.1,ER13.6p]
i. technical description: Name the device. What type of device is it? describe the design of
the device. Why was it developed? What kind of output does it give? How does it work?
ii. anticipate the most likely errors (up to three) that could occur and how the chance of these
happening can be minimised. Look at the case studies where errors have occurred.
Recognise that human error is always present.
iii. Provide test modes to anticipate problems (including those of human error through
inadequate training). For a sequence, this could be to vary one or more of the user-set
parameters (the ones that the user is expected to alter as part of using the device) and to
monitor that the response is as expected. For programs, provide test image data and ensure
the user can replicate the expected results (within given confidence limits).
iv. Provide independent review of the device by an experienced person or persons (one
physicist and maybe one non-technical user). The degree of independence (i.e. lack of
involvement in the development process) should be in accordance with the level of risk
associated with device failure.
v. Monitor the device usage during a beta-test period. This could be for one month, or 20
measurements, after the device has been handed over to the user.
vi. Through upgrades (hardware or software) all these validation procedures will have to be
repeated.
vii. Quality Assurance (QA) phantom results and normal control values may be used to
support the validity of the device
viii. Accuracy and precision data (see below) also support validity
b. SR2. The device have an instruction manual and a device label [ER13.1, ER13.3,
ER13.4,ER13.6a, ER13.6b, ER13.6p]
i. write a (short) set of instructions for a non-technical user. Test the manual with at least 2
users. The manual must include the author (manufacturer) of the device. Anticipate potential
problems and errors and include these in the manual. Include ‘test modes’ and QA data, and
suggested test intervals. Include the device label information where appropriate.
2
c.
d.
e.
f.
g.
h.
i.
ii. The manual should ideally be ‘attached’ to the device (i.e. accessible from the device,
through e.g. a website link, or ‘on-line help’.
iii. Device label: This is short and ideally fixed to the device. For a pulse sequence (GE or
Siemens) this is generally not possible. For an analysis program, this could be brief text
which is displayed when the program first runs.
iv. The label should include:
1. identification of the device
2. name of maker
3. ‘for research purposes only’
4. date of manufacture
5. version number
6. any special operating instructions, precautions or warnings
SR3. (if quantitative) the device must give output with proper units [ER10.3]
i. give the output in proper SI units where possible (for example ms, percent units (pu),
microseconds). Stored integer values in calculated maps should be at good enough
resolution so as not to degrade the data, whilst not coming near the 16-bit signed integer
limit of 32767 ; this typically means values of between 1000-10,000. Use the scaling factor
facility in the display program dispim and in the UNC file header so that real floating point
values can be used directly by the user, without seeing the integer values.
SR4. (if quantitative) the accuracy of the device must be specified [ER10.1, ER13.6p]
i. show that the quantity being produced is close to the truth, as far as possible. For a
sequence, use a search coil to confirm aspects of a new pulse that may be crucial,
particularly its amplitude. For an analysis program, show that the quantity being measured
is true in phantoms, if possible [some quantities, for example blood flow, may only
meaningfully exist in the brain, since accurate enough phantoms do not (yet) exist. Others,
such as volume, can meaningfully be tested in phantoms (test objects)].
SR5 (if quantitative) the precision of the device must be specified [ER10.1]
i. measure the reproducibility by repeated measurements, using repeated scans if necessary.
Use paired measurements and the Bland Altman analysis to estimate standard deviation and
95% confidence limits.
SR6. (if quantitative) the limits of accuracy of the device must be specified [ER10.1, ER13.6p]
i. ‘Limits of accuracy’. Presumably it means the 95% CL for total uncertainty (whether arising
from systematic or random sources). This can be calculated from the mean inaccuracy, and
the 95% confidence limit on repeated measurements, by combining the two quantities. Total
uncertainty, and the ‘uncertainty budget’, are discussed on p68 of QMRI of the brain
SR7. A sequence device must have safe RF power deposition SAR [ER1, ER9.2, ER11.1.1,
ER12.1]
i. GE machines: this is an ongoing subject under investigation. A statement from GE (or a
quote from the manual) on how SAR is calculated and controlled would help here.
ii. Siemens machines: There is no ‘research mode’. When a new sequence is written, and
passed through ‘Unit Test Mode’, all the inbuilt checks (including SAR and dB/dt) are all
applied. However if a local coil is built, this would need local testing for SAR. There is a
‘first level’ where inbuilt checks are disabled – however this is never used.
SR8. A sequence device must operate with the lowest reasonable static field, SAR and dB/dt
[ER9.2b, ER11.1.1]
i. Has anything been done that might increase the dB/dt? Are the maximum gradient and the
slew rate unchanged? (which gradient direction is most likely to have been altered?)
SR9. Ergonomic principles [ER10.2].
3
i. The process of using the device and reading the output should, as far as possible, be
convenient and user-friendly
4. Appendix 1a - Summary of Geoff Cusick’s talk on May 10th 2004 – need for compliance with MDD
(This can be downloaded via the Division website – see the MDD link)
a. Geoff Cusick, from the Department of Medical Physics and Bioengineering, UCLH, gave a very
informative talk to the Division on this topic on May 10th 2004. This document addresses the issues
arising from work carried out by physicist members of the Division, and sets out to define a simple
procedure for validation of such products within the context of the MDD.
b. What is a medical device? “a medical device means any .. apparatus… including software..
intended for human beings for…diagnosis [or]… monitoring…of disease” [slide 6]. “software may
be a medical device”; [this includes] “’instrument control” [and] “ image analysis” [slide 7]
c. The device has to be “placed in the market” for the MDD to formally (legally) apply [slide 8].
This could be a formal commercial sale, or giving the device to an external agency (e.g. from UCL
to the NHS or to another university). A letter of collaboration between the parties at the outset of
the work can remove the need for formal transfer at a later stage.
d. In a R&D setting, indemnity of the individuals has been an ongoing issue. Do the physicists need
personal indemnity cover (insurance) in case of a mishap? (This could be the MR instrument
damaging a subject, or incorrect image analysis leading to incorrect treatment). GC’s advice is that
provided the researcher has followed ‘best practice’, the employer would back the researcher, and
courts would see compliance with the MDD as compliance with best practice. Thus we are led to
conclude that to indemnify ourselves we should comply with MDD, even though in most cases it is
not legally required [slide 9]
5. Appendix 1b - Summary of Geoff Cusick’s talk – what the MDD requires
a. There are two primary components:
i. demonstrating the device works (i.e. establish the ‘benefits to the patient’)
ii. managing the risk (i.e. show the risks are ‘acceptable’). If the consequences of failure are
high, this needs more scrutiny, possibly including an external or independent assessor.
b. There is a whole set of Essential Requirements (ER’s) well described in the UCLH checklist
(downloadable – see below). Each one must be considered or declared ‘not applicable’ [slide 10]. I
have picked out the obviously relevant ones for us. There are [also ??] some ER’s where a blanket
statement will probably suffice, and can be recycled for each device. (e.g. for MR: static field can
be covered by saying the device has been supplied by a reputable manufacturer). A response to
these must include pointers to either generic or specific statements in the MR checklist (see below).
c. ER – does it work? :
i. ‘The device must achieve the performances intended by the manufacturer…’ [ER 3]
ii. ‘Devices with a measuring function must ... provide sufficient accuracy and stability. The
limits of accuracy must be indicated.” [ER10.1]
iii. ‘The … scale must be designed … with ergonomic principles’ [ER10.2]
iv. ‘The measurements must be expressed in legal units’ [ER10.3]
v. ‘Instructions for use must be included in the packaging for every device’ [ER 13.1]. This
implies an on-line instruction manual for image analysis programs. ‘Where appropriate, the
instructions must contain the degree of accuracy … for devices with a measuring function’
[ER 13.6 (p)]
d. ER – manage the risk:
i. ‘…any risks [must] constitute acceptable risks when weighed against the benefits to the
patient…’ [ER 1]
4
ii. ‘Devices must be designed .. to remove or minimise .. risks connected with magnetic fields
[and] temperature’ [ER 9.2]
iii. ‘devices shall be designed …[to minimise] exposure of patients … to [any form of]
radiation’ [ER 11.1.1].This includes RF heating and potentially dB/dt from the switched
gradients.
iv. ‘Devices incorporating electronic programmable systems must…ensure …performance of
these systems’. [ER12.1]. Applies to RF heating.
6. Appendix 1c - Summary of Geoff Cusick’s talk - how to do this at Queen Square?. GC’s advice is to
follow good design practice, which is something like:
a. Establish documentation, and populate it. It can be short and simple. [slide 13]
b. The technical document is:
i. Need – what need led to this device
ii. Design – how we are dealing with the need
iii. Implementation – how the device was built
iv. Validation – does it work?
v. Safety – consider and minimise the risks
vi. Much of this information might already be in a published paper, in which case this can be
referenced.
c. User documentation – instruction manual
d. Do this at the start, during development, and as part of support.
e. Developing MDD compliance at Queen Square.
i. GC’s view is that we can do this simply and deftly once we learn how to do it.
ii. Developing the validation part (and our skills at designing validation) would reduce the risk
of making fundamental errors (such as happened in the qMT sequence or the MTR maps
which are 10x the actual MTR value). However validation has to be designed to be
maximally sensitive to error whilst taking minimal time. If done well, it is probably a
mindset modification which achieves these 2 objectives, and also increases the scientific
value of the development and the ensuing journal papers. The ER’s I have picked out above
are good practice in scientific instrument design, and we should embrace those.
7. Appendix 2a: Example of specific response for a pulse sequence to ER’s
a. ongoing
8. Appendix 2b: Example of specific response for image analysis to ER’s
a. ongoing
9. Appendix 3a: Case study B1 mapping sequence.
a. These case studies are intended to be learning experiences; the individuals involved should not in
any way feel bad about this!
b. History: an inexperienced scientist implemented a B1 mapping technique, which worked correctly
under their use. It was handed over to a non-technical user, with an instruction sheet. The sheet had
an error in it. The result was that the sequence was used incorrectly. This continued for several
months without being detected.
c. Learning:
i. A more experienced independent person should have had oversight of this before handing
over the device.
5
ii. Inbuilt test mode would have forced the users to find the error (provided it was used!). For
this device, reduce TG (the transmitter output) by 10 units, and remeasure B1 at the coil
centre. It should be scaled by a factor of 10-10/200 = 0.891 (i.e. 11% reduction).
iii. Monitoring, by a technical person, of the device during its first weeks of use would
probably have detected the error.
iv. The manual is part of the device, and need to be tested! Silently watch how a naïve user
uses the device.
10. Appendix 3b: Case study 2D qMT sequence.
a. History:
i. A sequence to apply 3 different amplitudes of MT saturation pulse amplitude was written.
ii. The resulting data appeared to fit a model, and were published in 3 places
iii. An independent physicist, developing their own qMT sequence, found they could not
reproduce the original data.
iv. Several experienced physicists became involved, and suggested measuring the MT pulses
directly.
v. The amplitude of one pulse was found to be wrong, and the explanation found.
vi. A retrospective correction of old data could be made; the resulting corrected data were of
higher quality, as judged by fitting the model, and also agreeing with those from other
groups.
b. Learning:
i. Do not assume a new sequence is doing what you think. In particular, recognise that pulse
amplitudes may be wrong on a (GE) scanner.
ii. Invent independent ways of testing the sequence
1. progressively increase Bsat and observe the signal. Do this in a phantom, and use Bsat
values at least as high as in-vivo. Do this 20kHz off-resonance (where the behaviour
of the imaging pulse is monitored), and signal should be constant, and 1khz off
resonance, where a progressive reduction should be seen.
2. observe MT and imaging pulses with search coil for progressive experiment
described above; measure amplitude (with confidence limits) relative to first
imaging pulse. Measure its width, and estimate its area (and hence FA) relative to
the imaging pulse.
3. We still need a way of testing the offset frequency, although errors in this are less
likely
iii. Ask an independent person – do you believe this sequence is doing what I have
programmed it to do?
11. Appendix 3c: Case study qMT model fitting
a. History:
i. New qMT data were being fitted by a new model implementation, showing good agreement.
ii. An independent person tried to plot the same data and model, using an independent
implementation of the model, and could not.
iii. Comparison with a 3rd model implementation showed that the new implementation was
probably the one that was wrong.
iv. Detailed examination showed which part was in disagreement (the super-lorentzian
lineshape)
v. The source was a typographic error in a published paper (that had not been corrected by the
authors).
b. Learning:
6
i. With complicated mathematical formulae, implement everything twice, as independently
as possible (e.g. in C and a spreadsheet, with two different people).
ii. Recognise that published formula may have errors, and go back to original work and
several versions to check for agreement, or else re-derive the expression. Make basic checks
on published expressions (e.g. is the area under the lineshape equal to unity?).
12. Appendix 3d: Case study MTR factor of 10 error
a. History:
i. MTR values are in the range 10-40pu; they are stored in computer files as integers (range
100-400) to obtain 0.1pu precision.
ii. The clinical researcher was dealing with the integer values in all their analysis, including
graphing
iii. The integer values were passed to a statistician for complex analysis
iv. The statistician estimated values of difference and slope using the integer values
v. An independent person, who knows about MTR, recognised the factor of 10 error.
b. Learning:
i. Output all map values as proper floating point values, with proper units, with no access for
the user to the raw integer values.
ii. Independent review works
13. Appendix 3e: Case study – gains not fixed in Gd enhancement scans
a. History:
i. Scans pre- and post-Gd did not have the transmitter and receiver gains fixed. A nonresearch scanner was used. The auto-prescan procedure was repeated for the post Gd-scan,
and some scanner changes took place.
ii. These changes can be partly compensated retrospectively, but inaccurate attenuators and
models limit how well this works.
iii. A new radiographer noticed the problem, after about 18 months, and from then on data
collection was correct.
iv. In data analysis, methods that are independent of scanner gain were investigated.
b. Learning:
i. Analyse at least some of the data as soon as it is collected
ii. Beware of data collection problems with staff who are unfamiliar with a new kind of
scanning.
14. Appendix 3f: Case study – wrong volumes from oblique scans:
a. Dispim volumes had been used by an enterprising radiographer to measure tumour volumes
b. The results seemed ‘reasonable’
c. Oblique slices are incorrectly handled (the inter-slice distance is wrong)
d. Dispim volumes are underestimated by a variable amount, depending on the angulation; could be
up to 30% error.
e. Detailed analysis of volumes had been carried out, and a paper prepared.
f. With hindsight, the volumes should have been checked with simple phantoms
15.
Appendix 4: detailed and very helpful guidance from Gareth Barker (edited from his email of June
28th 2004)
a.
units: Dispimage has scale factors set in the header for intensity values, and to convert pixels too distance units, but no
idea about putting actual unit on anything. Adding 'mm' to the end of any measured/displayed value would be
(relatively) easy, but wouldn't actually be true/enforcable unless all the processing programs were compliant. It all
7
b.
c.
d.
e.
f.
stems from the original conversion, really - the values in the headers are whatever the scanner uses - in our case this is
mm for distances, and arbitrary units for anything else. This may not be true for other data (other scanners) and can
also easily be over written by anything along the way (eg by conversion to a format that doesn't support pixel sizes, or
by using a program that doesn't preserve the headers). Most of our programs should be OK, but we'd have to check.
(Getting dispunc to display MTR in pu is jut a matter of adding a header element called 'value_scale' and setting it to
0.1, by the way, but anything that processes these images will need too be updatetd to 'pass the change along').
Validation -A lot of this is what we already do, but don't really document. It may be that publishing the technique and
phantom data (if believable) would go some way towards this? Validating software formally is almost impossible,
though, particularly given that we don't have control over all of it (ie we have to take the GE stuff on trust).
Trials: What we do (are required to do) for clinical trials depends on where the software comes from. If it's
commercial, we're pretty much allowed to believe it. If it is 'small user base' (i.e. written by someone else, and used by
several people) we test it as a 'black box', but only within it's expected input range. If it's locally written, we have to
explicitly test it with 'bad' values and check that it falls over gracefully. We could do similar things for general
analysis programs (including passing in the wrong files, etc etc)
For pulse sequences we could do things like setup/run with a variety of input parameters (including some noto very
reasonable ones) and make sure that all the software SAR calculations (provided by GE, so not our responsibility) gave
sensible numbers, that scaled as we expected, and remained within acceptable ranges. Gradient strengths and slew rates
are controlled in hardware, so we can push responsibility there back to GE.
We'd also need to show that all the parameters needed by the analysis are correctly represented in the image headers.
(This means checking, one off, the transfer and conversion programs for things like TE and TR, and also checking on a
per sequence basis that things like the 'op user' CVs make sense and are correctly interpreted by the processing
packages.
Risk just say we obey all the NRPB and MDA guidelines? (Which everyone needs to actually read!! - they're far
from clear!)
16. Appendix 5 – feedback from IPEM talk on Nov 15th 2004
a. TickIT may provide a framework for software QA.(talk by Paul Ganney from Hull
(paul.ganney@hey.nhs.uk); based on a BSI document). Need not be onerous, he said. Includes a
‘decommissioning process’. This is used by DTI as a standard. Related to ISO9001. Looks good.
Paul Tofts is following this up.
b. Regarding Liability: there is no case law on Software Medical Devices, and no cases have ever
been brought for negligence, so the chances of being sued are ‘low risk’.
c. Regarding our employer: we should get them to ‘accept vicarious liability’ (which Trusts have
done)
d. IPEM report 90 is relevant: Report 90 Safe Design, Construction and Modification of Electromedical Equipment
e. An anaesthetist speaker: in ‘research mode’ you have to be able to take risks. Use the ethics
committee.
f. A computer manager speaker: distinguishes 2 kinds of software:
i. Home made for one user, can be written by an amateur
ii. Professionally written, by an expert, for distribution (though much good software is written
by people without formal computing qualifications).
g. If software environment (e.g. compiler) version changes, need to repeat (last stage of) validation
17. References:
a.
GC’s ppt talk: http://www.medphys.ucl.ac.uk/~geoff/Presentations/The%20Medical%20Devices%20Directive%20for%20R&D.ppt
b. UCLH checklist: http://www.medphys.ucl.ac.uk/~geoff/Presentations/MDD%20ER%20Checklist.doc
c. These are also linked from the Division website (MR MDD)
d. Penny Gowland (Nottingham) gave a very good summary of safety issues at the British Chapter
September 2004. http://www.magres.nottingham.ac.uk/teaching/edinburg_safety_new.ppt (7 Mb)
8
e. FDA document Criteria for Significant Risk Investigations of Magnetic Resonance Diagnostic
Devices http://www.fda.gov/cdrh/ode/guidance/793.pdf is very short and advocates the ‘least
burdensome approach’!
f. ‘Can software kill?’ cases from eWEEK. http://www.eweek.com/article2/0,1759,1544231,00.asp
g. Circulation: Neuroradiology and Neurophysics Division, Geoff Cusick, Richard Lanyon, Gareth
Barker
Paul Tofts
Currently Unresolved issues:
1. how can brief instructions be built into the sequence and program? (SR2)
9
Download