The Bland-Altman Method: How Often Has It Been

THE BLAND-ALTMAN LIMITS OF
AGREEMENT:
HOW OFTEN HAVE THEY BEEN
MISAPPLIED?
Introdução à Medicina (Introduction to Medicine)
23 May 2011
Class 13
INTRODUCTION
Background

Due to advances in technology, new
methods of clinical measurement appear
constantly, and they keep becoming
more innovative.1

In the 1980s, Bland and Altman noticed
that the correlation coefficient was widely
used to evaluate the agreement between
two methods of clinical measurement.

They realized it wasn’t adequate.

So they created their own method: the
Bland-Altman limits of agreement.2
1 - Zietman A, Goitein M, Tepper JE. Technology evolution: is it survival of the fittest? Journal of Clinical Oncology: official journal of the American
Society of Clinical Oncology, 2010 Sep 20; 28(27): 4275-4279.
2 - Altman DG, Bland JM. Statistical Methods For Assessing Agreement Between Two Methods of Clinical Measurement. Lancet, 1986; i: 307-310.
Statistical methods for assessing agreement between
two methods of clinical measurement

The Lancet, 1986

Objective of the method

Assess the agreement between two methods of clinical
measurement

Importance:

If the two methods do not agree, there is a high risk of
diagnostic errors, which may lead to severe consequences.3
3 - Stoker M. Common Errors in Clinical Measurement. Anesthesia & Intensive Care Medicine, 2008; 9(12): 553-558.
How do we apply the method?2
[Diagram: each subject is measured with Instrument 1 and Instrument 2. Instead of the correlation coefficient, the average and the difference of each pair of measurements are used.]
Mean of the differences = 0: no systematic error
Mean of the differences ≠ 0: systematic error
2 - Altman DG, Bland JM. Statistical Methods For Assessing Agreement Between Two Methods of Clinical Measurement. Lancet, 1986; i: 307-310.
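The calculation behind the diagram can be sketched as follows. This is a minimal illustration, not the authors' code: the readings are hypothetical, and the multiplier k = 1.96 is an assumption (a multiplier of 2 is also commonly used).

```python
from statistics import mean, stdev

def limits_of_agreement(method_1, method_2, k=1.96):
    """Return (bias, lower, upper) for two lists of paired measurements."""
    if len(method_1) != len(method_2) or len(method_1) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_1, method_2)]
    bias = mean(differences)      # mean difference; non-zero suggests systematic error
    spread = stdev(differences)   # sample standard deviation of the differences
    return bias, bias - k * spread, bias + k * spread

# Hypothetical paired readings, for illustration only.
instrument_1 = [10.0, 12.0, 14.0, 16.0, 18.0]
instrument_2 = [9.0, 12.0, 13.0, 17.0, 17.0]

# Averages go on the x-axis and differences on the y-axis of the usual plot.
averages = [(a + b) / 2 for a, b in zip(instrument_1, instrument_2)]
bias, lower, upper = limits_of_agreement(instrument_1, instrument_2)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

The limits are simply the mean difference plus and minus k standard deviations of the differences, so they are only meaningful when the assumptions of the method hold.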
If the limits of agreement are…
• Too wide: there are random errors associated with the measuring instrument; it is unacceptable for clinical use.
• Small, but the average of the differences is ≠ 0: there is a systematic error; the measuring device must be calibrated.
Judging whether the limits of agreement are
too wide or adequate can be somewhat subjective.
Therefore, it is important that the maximum limits
of agreement are defined according to the
clinical needs.
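One way to make this judgment less subjective is to compare the limits with a maximum acceptable difference chosen on clinical grounds. The sketch below assumes such a threshold exists; the function name and the numbers are hypothetical.

```python
def within_clinical_limits(lower, upper, max_acceptable_difference):
    """True when both limits of agreement lie inside +/- the acceptable difference."""
    if max_acceptable_difference < 0:
        raise ValueError("the acceptable difference must be non-negative")
    return -max_acceptable_difference <= lower and upper <= max_acceptable_difference

# Hypothetical limits of agreement and clinical thresholds.
example_lower, example_upper = -1.35, 2.15
strict = within_clinical_limits(example_lower, example_upper, 2.0)
lenient = within_clinical_limits(example_lower, example_upper, 3.0)
print(strict, lenient)
```

The same limits can be acceptable for one clinical purpose and unacceptable for another, which is why the threshold has to come from the clinical context rather than from the data.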
Assumptions
• The differences between the measured values must follow a normal distribution.
• The standard deviation of the differences must be constant, i.e. there must be no relation between the averages and the differences.
Images: Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003;
22, 85-93.
Example of the existence of a relation
between the averages and differences
Images: Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003;
22, 85-93.
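Both assumptions are usually checked graphically (a histogram of the differences and the averages vs. differences plot). As a rough numeric complement, and not a replacement for the plots, the sketch below computes the correlation between averages and differences and the skewness of the differences. The data are hypothetical: the second instrument is made to read 10% low, so the difference grows with the magnitude of the measurement.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation; raises ZeroDivisionError if either list is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def skewness(values):
    """Population skewness; close to 0 for a symmetric distribution."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical readings: instrument 2 reads 10% lower than instrument 1.
instrument_1 = [10.0, 20.0, 30.0, 40.0, 50.0]
instrument_2 = [9.0, 18.0, 27.0, 36.0, 45.0]
averages = [(a + b) / 2 for a, b in zip(instrument_1, instrument_2)]
differences = [a - b for a, b in zip(instrument_1, instrument_2)]

r = pearson_r(averages, differences)
print(f"correlation between averages and differences: {r:.2f}")
print(f"skewness of the differences: {skewness(differences):.2f}")
```

A correlation far from zero signals a relation between the averages and the differences, in which case a single pair of limits of agreement does not describe the whole measurement range.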
The Bland-Altman paper had a great impact on the scientific community
and,
after being published in The Lancet, was cited
more than 17,000
times.4
BUT,
some of the citations/applications of this method may not have been
made correctly!
Bland and Altman themselves noticed that their limits of
agreement were being misapplied, leading to false
conclusions about the agreement between two instruments of
clinical measurement.5
4 - Ryan TP and Woodall WH. The Most Cited Statistical Papers. Journal of Applied Statistics, 2005; 32: 461-474.
5 - Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003; 22, 85-93.
RESEARCH QUESTION AND
AIMS
Research Question
“What is the percentage of articles in
which the Bland-Altman method is
applied correctly?”
Our secondary aims are to find out:
• at what level the method is misapplied;
• what percentage of articles fit into each of the document types defined by ISI;
• if the percentage of articles applying the method correctly varies according to whether it is used to obtain primary or secondary data;
• if, through the years, the percentage of articles applying the method incorrectly has varied;
• if the impact factor of a journal influences the percentage of articles published in it that apply the method correctly;
• which assumption is the least fulfilled one.
METHODS
Methods

Sample
• 70 articles indexed by ISI that cite the article in which Bland and Altman present their method, published in The Lancet
Check-list

Evaluates each article in terms of:
• Verification of the assumptions;
• Application of the method itself;
• Interpretation of the obtained limits of agreement.
The check-list also gathers some relevant
data about each article: the type of article and
the year and journal in which it was published.
Reproducibility of the check-list
Student A and Student B each evaluate Article X.
Comparison of the answers given by
the two students.
To analyze our results…

We calculated the median impact factor and the median year of publication, and for each we created two groups:
≤ median
> median
How do we know whether the
differences are significant?
The data in the tables on the year of publication, the journal
of publication and the type of data of each
article were compared with the
Chi-Square Test.6
6 - Perla RJ, Carifio J. Use of the Chi-square Test to Determine Significance of Cumulative Antibiogram Data. American Journal of
Infectious Diseases, 2005; 1 (4): 162-167.
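The median split and the comparison of two groups can be sketched as below. This is an illustration under stated assumptions: the years and counts are hypothetical, the slides do not say whether a continuity correction was applied (none is used here), and the p-value formula is valid only for a 2x2 table (1 degree of freedom).

```python
from math import erfc, sqrt
from statistics import median

def median_split(values):
    """Label each value as '<= median' or '> median'."""
    cut = median(values)
    return ["<= median" if v <= cut else "> median" for v in values]

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_1, row_2, col_1, col_2 = a + b, c + d, a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row_1 * row_2 * col_1 * col_2)
    p_value = erfc(sqrt(statistic / 2))  # upper tail of chi-square with 1 df
    return statistic, p_value

# Hypothetical publication years.
groups = median_split([2001, 2003, 2005, 2007])

# Hypothetical counts: rows are the two groups, columns are correct / incorrect.
statistic, p_value = chi_square_2x2(10, 15, 20, 6)
print(groups)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

With small expected counts in any cell, the chi-square approximation becomes unreliable, which matters for a sample of this size.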
EXPECTED RESULTS
Many articles will have
misapplied the method

Main reasons:
• lack of verification of the assumptions;
• wrong verification of the assumptions.5
Least fulfilled assumption

verifying whether the differences follow a normal distribution
Why?
It requires the construction of a separate graph (a histogram of
the differences), while the other assumption can be verified by
inspecting the averages vs. differences plot, which is often
already drawn to display the limits of agreement.
5 - Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003; 22,
85-93.
The percentage of articles
misapplying the method
is expected to vary
throughout the years

WHY? Researchers started to notice that the method was
being misapplied.

HOW? They realized that two methods of clinical measurement
that had passed the Bland-Altman test of
agreement did not actually agree very well.

Example: the methods didn’t agree for values higher than the
ones used in the test.5
5 - Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003; 22,
85-93.
The impact factor of a journal
is expected to influence the
percentage of misapplications
of the method in the
articles published there
Why?
Higher impact factor
→ higher quality
→ more attention to
scientific
correctness
RESULTS
Reproducibility of the check list
- To ensure the correct analysis of the articles, two students analyzed the same article.
Of the 5 articles analyzed by two
different students,
there was 100% agreement on all
questions
except
for the one that asked whether the article had
interpreted the outcome correctly
according to the clinical needs, on which
the students disagreed for 1
article.
The two students who
disagreed re-evaluated the
question and came to an
agreement.
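The agreement figures above are simple percent agreement per question. A minimal sketch, with hypothetical per-article answers chosen so that the two students disagree on 1 of 5 articles:

```python
def percent_agreement(answers_a, answers_b):
    """Percentage of items on which two raters gave the same answer."""
    if len(answers_a) != len(answers_b) or not answers_a:
        raise ValueError("both raters must answer the same non-empty set of items")
    matches = sum(a == b for a, b in zip(answers_a, answers_b))
    return 100.0 * matches / len(answers_a)

# Hypothetical answers to one check-list question for 5 articles.
student_a = [True, True, False, True, False]
student_b = [True, True, False, True, True]
agreement = percent_agreement(student_a, student_b)
print(f"{agreement:.0f}% agreement")
```

Percent agreement does not correct for agreement expected by chance, so it is best read as a descriptive figure for a sample this small.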
What percentage of articles fit into each of the
document types defined by ISI.
n = 18360
Articles - 16230
Reviews - 291
Meeting abstracts - 70
Reprints - 2
Proceedings papers - 1059
Notes - 121
Corrections/Additions - 2
Correction - 1
Letters - 471
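The percentage for each document type follows directly from these counts. The sketch below uses the counts and the total n exactly as listed; the listed counts add up to less than n, and the slide does not say which document types account for the remainder.

```python
total = 18360  # n as stated on the slide

counts = {
    "Articles": 16230,
    "Reviews": 291,
    "Meeting abstracts": 70,
    "Reprints": 2,
    "Proceedings papers": 1059,
    "Notes": 121,
    "Corrections/Additions": 2,
    "Correction": 1,
    "Letters": 471,
}

# Share of each document type among all citing documents.
percentages = {doc_type: 100 * count / total for doc_type, count in counts.items()}
# Documents counted in n but not covered by any listed type.
unlisted = total - sum(counts.values())

for doc_type, share in sorted(percentages.items(), key=lambda item: -item[1]):
    print(f"{doc_type}: {share:.2f}%")
print(f"not covered by the listed types: {unlisted}")
```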
The Sample
Of the 70 articles in the sample, some were UNAVAILABLE (no full-text version, or written in
foreign languages) and 56 were AVAILABLE.
Out of those 56,
5 weren’t applications of the Bland and Altman limits of agreement, while
51 were.
THE MAIN FINDINGS of our
study in regards to our original
research question and aims:
Table 1. n (%) of articles which fulfill
each point of the check-list.
… if the impact factor of a journal influences the percentage of articles
published in it that apply the method correctly
p > 0.05
Table 2 – Percentage of articles fulfilling each main point of the check-list, divided according
to the impact factor of the journal where they were published. We used a Chi-Square test to
compare the percentages between the two levels of impact factor. LA – Limits of agreement.
…if, through the years, the percentage of articles applying the method
incorrectly has varied
p < 0.05
Table 3 – Percentage of articles fulfilling each main point of the check-list, divided according to
the year when they were published. We used a Chi-Square test to compare the percentages
between the two publication periods. LA – Limits of agreement. * - statistically significant.
…if the percentage of articles applying the method correctly varies according
to whether it is used to obtain primary or secondary data.
Table 4 – Percentage of articles fulfilling each main point of the check-list, divided according to
the type of data obtained by using the limits of agreement. We used a Chi-Square test to
compare the percentages between the two types of data. LA – Limits of agreement. * - statistically significant.
DISCUSSION
“What is the percentage of articles in which the Bland-Altman method is
applied correctly?”
Interestingly,
out of all the articles we analyzed,
THERE WAS NOT ONE article which
correctly applied the method in its entirety.
The method seems to be mostly misapplied at
the level of:
• verifying the assumptions
The least fulfilled assumption
The 7 articles where this assumption was
verified correctly (a mere 14%) also correctly
fulfilled the first one
So
the errors in the articles that correctly verified the
second assumption were only minor ones.
… if the impact factor of a journal influences the percentage of articles
published in it that apply the method correctly
• The articles published
in journals with a lower
impact factor (IF) appear to have a
higher percentage of
correct applications!
p > 0.05
• The differences are,
however, not
statistically significant.
…if, through the years, the percentage of articles applying the method
incorrectly has varied
• In every category, the articles
published more recently
have a higher percentage of correct
application of the method.
• Only one of the results is not statistically
significant.
p < 0.05
• Over time, authors have
come to realize that misapplying the
Bland-Altman method can
lead to incorrect findings.
• This would plausibly lead the authors of
more recent studies to be more careful when
employing the method.
…if the percentage of articles applying the method correctly varies according
to whether it is used to obtain primary or secondary data.
• Articles that use the
method to obtain primary data
have a higher percentage of
correct application of the method
than those that use it to obtain
secondary data.
• Authors are more likely
to pay attention to
the correct use of a
scientific method if it is their main
method, or one of their main
methods, for acquiring data.
Limitations of our work



Relatively small sample
Human error
No other works to cross-reference with
Acknowledgements




Prof. Dr. Cristina Santos
Prof. Dr. Altamiro Rodrigues da
Costa Pereira
João Cláudio Antunes, MSc
Class 4