Accuracy of fetal echocardiography: a systematic review

Introduction: Fetal echocardiography is the only way to study the fetal heart. It is of great
importance, since congenital heart disease has been reported to occur in 8/1000 live births.
Aim: To estimate the sensitivity and specificity of fetal echocardiography in the detection of cardiac anomalies.
Methods: We carried out a systematic review in Medline of studies on the sensitivity and specificity of fetal
echocardiography in the detection of cardiac anomalies.
The abstracts were screened in three phases: in the first and second, all abstracts were read so that
each one was assessed by two different reviewers; in the third, only the articles on which the two
reviewers disagreed were analysed by a third reviewer. We extracted the true positives, true
negatives, false positives and false negatives, which were processed in Metadisc to pool sensitivity and
specificity. A homogeneity test was performed.
We also assessed these articles against quality criteria to evaluate their relevance.
Results: We obtained 1696 articles, of which only 16 were included. The sensitivity plot was very
heterogeneous, so we divided the studies into two groups with different sensitivity values: 0.92 in the
first group and 0.46 in the second. The specificity was 1.00.
Discussion: The lack of concordance among the sensitivity results could be explained by methodological
differences (methods, population, quality and date). Differences between populations (high and low risk)
were the main cause of this heterogeneity.
Conclusion: Echocardiography is a very useful and reliable tool in the evaluation of the fetal
cardiovascular system, and has high sensitivity and specificity for congenital heart disease in high-risk
populations.
Key words: systematic review, fetal echocardiography, cardiac anomalies, sensitivity,
specificity, true positives, true negatives, false positives, false negatives
Fetal echocardiography is the only way to study the fetal heart.
This technique requires high-resolution ultrasound equipment with M-mode, pulsed and colour Doppler imaging capabilities [1].
Congenital heart disease has been reported to occur in 8/1000 live births. Prenatal
diagnosis of congenital heart disease is crucial to optimal obstetric and neonatal care. In
utero identification of congenital heart disease allows a variety of treatment options to
be considered, including delivery at an appropriate facility, termination of pregnancy
and in some cases in utero therapy.
With this presentation, we intend to study the specificity and sensitivity of fetal
echocardiography in the detection of cardiac anomalies.
Sensitivity and specificity are the statistics most widely used to describe a
diagnostic test:
- sensitivity is the probability of a positive test among patients with disease;
- specificity is the probability of a negative test among patients without disease.
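As a minimal sketch of these two definitions, assume the test results are summarised as counts of true positives (tp), false negatives (fn), false positives (fp) and true negatives (tn). The function names and the counts below are illustrative and are not taken from the included studies.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Probability of a positive test among fetuses with disease."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Probability of a negative test among fetuses without disease."""
    return tn / (tn + fp)


# Hypothetical counts, for illustration only.
tp, fn, fp, tn = 46, 4, 2, 948
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```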
Clinicians do not generally know whether or not the fetus has heart
disease; that is why they order the test in the first place. Thus, sensitivity and specificity
do not by themselves give the information needed to interpret an individual test result; they
only indicate how reliable the test is. No matter how high these
values may be, there is always a small margin of error.
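To illustrate why sensitivity and specificity alone do not settle how to read a positive result, the sketch below applies Bayes' theorem to obtain the probability of disease given a positive test. The prevalence of 8/1000 comes from the text; the sensitivity of 0.92 and especially the specificity of 0.99 are assumed values chosen only for the example, and the function name is hypothetical.

```python
def positive_predictive_value(sens: float, spec: float, prevalence: float) -> float:
    """P(disease | positive test), computed with Bayes' theorem."""
    true_positive_rate = sens * prevalence
    false_positive_rate = (1.0 - spec) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


prevalence = 8 / 1000  # congenital heart disease per live birth, from the text
sens = 0.92            # illustrative value
spec = 0.99            # assumed value, for illustration only

ppv = positive_predictive_value(sens, spec, prevalence)
print(f"positive predictive value = {ppv:.3f}")
```

With these illustrative inputs, fewer than half of the positive tests would correspond to true disease, even though both parameters are high; with a specificity of exactly 1.00 every positive test would be a true positive.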
Picture 1: M-mode echocardiography [1]
Picture 2: Pulsed Doppler echocardiography [1]
Picture 3: Colour Doppler echocardiography [1]
Picture 4: Power Doppler echocardiography [1]
Our study was designed to collect all the available information about the sensitivity and
specificity of fetal echocardiography in the detection of cardiac anomalies. This systematic
review was carried out to show whether this diagnostic test is effective and how much
trust we can place in it.
Pubmed and SCOPUS were the databases selected.
a) In Pubmed, our original query [2] was:
(((((("sensitivity and specificity"[All Fields] OR "sensitivity and specificity/standards"[All Fields]) OR "specificity"[All Fields])
OR "screening"[All Fields]) OR "false positive"[All Fields]) OR "false negative"[All Fields]) OR "accuracy"[All Fields])
((("predictive value"[All Fields] OR "predictive value of tests"[All Fields]) OR "predictive value of tests/standards"[All Fields])
OR "predictive values"[All Fields]) OR "predictive values of tests"[All Fields])
(("reference value"[All Fields] OR "reference values"[All Fields]) OR "reference values/standards"[All Fields])
((((((((((("roc"[All Fields] OR "roc analyses"[All Fields]) OR "roc analysis"[All Fields]) OR "roc and"[All Fields]) OR "roc
area"[All Fields]) OR "roc auc"[All Fields]) OR "roc characteristics"[All Fields]) OR "roc curve"[All Fields]) OR "roc curve
method"[All Fields]) OR "roc curves"[All Fields]) OR "roc estimated"[All Fields]) OR "roc evaluation"[All Fields])
((((( #1 OR #2) OR #3) OR "likelihoodratio"[All Fields]) AND notpubref [sb]) AND "human"[MeSH Terms])
This query is intended to retrieve all studies of sensitivity and specificity indexed in
Pubmed. To adapt it to our subject, we had to add some key words and
remove “notpubref” (because of technical problems):
AND ("echocardiography" OR echocardiography[MeSH Terms]) AND (Heart Defects,
Congenital[MeSH Terms] OR "cardiac anomalies")
Furthermore, we limited the search to studies in humans and to articles published
up to August 1st 2005.
Initially we obtained 1042 articles. When we repeated the Pubmed search, two more
articles were added, giving a total of 1044 articles.
b) Next, we applied a similar query in SCOPUS to obtain more articles.
First, we had to adapt the query used in Pubmed to the SCOPUS syntax, making
the following change:
- replace All Fields and MeSH Terms with ALL.
The result was:
(((((((((((ALL("sensitivity and specificity") OR ALL("sensitivity and specificity/standards"))
OR ALL("specificity")) OR ALL("screening")) OR ALL("false positive")) OR ALL("false
negative")) OR ALL("accuracy")) OR (((ALL("predictive value") OR ALL("predictive value
of tests")) OR ALL("predictive value of tests/standards")) OR ALL("predictive values")) OR
ALL("predictive values of tests"))) OR (ALL("reference value") OR ALL("reference values")
OR ALL("reference values/standards"))) OR ((((((((((((ALL("roc") OR ALL("roc analyses"))
OR ALL("roc analysis")) OR ALL("roc and")) OR ALL("roc area")) OR ALL("roc auc"))
OR ALL("roc characteristics")) OR ALL("roc curve")) OR ALL("roc curve method")) OR
ALL("roc curves")) OR ALL("roc estimated")) OR ALL("roc evaluation")))) AND
(("echocardiography" OR ALL("echocardiography")) AND ALL(congenital heart defects) OR
ALL("cardiac anomalies")))
We obtained 1370 articles.
Next, we applied the following limits to these articles:
- First, we added AND ( EXCLUDE(PUBYEAR,2006) ) to the query,
because our Pubmed query did not include articles published in 2006. The
number was reduced to 1342 articles.
- Second, we added AND (EXCLUDE(SUBJAREA, "VETE")) to restrict
our search to humans. This left 1331 articles.
- Finally, we added AND ALL(Fetus OR Fetuses OR fetal OR neonatal
OR prenatal OR antenatal OR perinatal OR embryonic OR
pregnancy OR pregnant OR pregnancies OR obstetric OR gestation
OR utero OR feto OR fetos OR pré-natal OR gravidez OR grávida
OR gestação OR útero) to include only studies of fetuses.
As a result, we obtained 810 articles.
In total, 1854 articles were found (1044 in Pubmed and 810 in SCOPUS).
To remove repeated articles from the search results we used EndNote,
a software package for managing bibliographic references. We
used it to separate the articles found in SCOPUS from those already
obtained in Pubmed. We found 158 articles in common, which we excluded.
Articles in Pubmed        1044
Articles in SCOPUS       + 810
Repeated articles        - 158
Total                     1696 articles
Table: total number of articles obtained
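The removal of repeated articles was done in EndNote. As a rough sketch of the same idea outside EndNote, the code below merges two record lists and drops SCOPUS records whose normalised title already appears. Matching on titles, the function names and the sample records are all assumptions made for illustration; they do not describe how EndNote detects duplicates.

```python
import re


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_without_duplicates(pubmed, scopus):
    """Return (merged records, number of SCOPUS records dropped as duplicates)."""
    seen = {normalise_title(record["title"]) for record in pubmed}
    merged = list(pubmed)
    duplicates = 0
    for record in scopus:
        key = normalise_title(record["title"])
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
            merged.append(record)
    return merged, duplicates


# Hypothetical records, for illustration only.
pubmed = [
    {"title": "Fetal echocardiography: a review."},
    {"title": "Screening for cardiac anomalies"},
]
scopus = [
    {"title": "Fetal Echocardiography - A Review"},
    {"title": "Prenatal diagnosis of heart defects"},
]
merged, duplicates = merge_without_duplicates(pubmed, scopus)
print(len(merged), "unique records,", duplicates, "duplicate removed")

# The counts reported in the text follow the same arithmetic.
total_after_deduplication = 1044 + 810 - 158
```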
In a first analysis, the articles were divided among all the students. Each student read 116
abstracts and selected those that met our inclusion/exclusion criteria:
- Studies in humans, specifically fetuses;
- Echocardiography as the only diagnostic test used;
- Cardiac anomalies as the subject of study;
- Abstract available;
- Sensitivity and specificity studies;
- Language: English or Portuguese.
A second reviewer analysed the abstracts and we compared the two sets of
decisions. Where there was no concordance, a third reviewer read the abstracts and came
to a decision. We used the statistical program SPSS, which made it easier to calculate
the frequencies of the two variables under study. As we can see in picture 5, we calculated the
number of articles included: the result is 157 (from Pubmed) + 91 (from SCOPUS) =
248 articles.
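The comparison of the two reviewers' decisions was done in SPSS. A minimal sketch of the same cross-tabulation, and of the selection of discordant abstracts for the third reviewer, could look like this; the identifiers, decisions and function names are hypothetical.

```python
from collections import Counter


def cross_tabulate(reviewer_a, reviewer_b):
    """Count each (decision A, decision B) pair, as in a two-way frequency table."""
    return Counter(zip(reviewer_a, reviewer_b))


def discordant_ids(ids, reviewer_a, reviewer_b):
    """Abstracts on which the two reviewers disagree (to be read by a third reviewer)."""
    return [i for i, a, b in zip(ids, reviewer_a, reviewer_b) if a != b]


# Hypothetical decisions, for illustration only.
ids = [101, 102, 103, 104, 105]
first = ["include", "exclude", "exclude", "include", "exclude"]
second = ["include", "exclude", "include", "include", "exclude"]

table = cross_tabulate(first, second)
to_third_reviewer = discordant_ids(ids, first, second)
print(dict(table))
print(to_third_reviewer)
```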
Once we had identified the articles of interest, we searched for their full text using the
following strategy, moving to the next source whenever the article was not available:
first, Pubmed; second, the faculty library; third,
contacting the author(s) by e-mail.
To evaluate how relevant these articles are, we needed to assess them against
quality criteria.
For this purpose, we selected the STARD checklist [3]. It consists of
twenty-five items, and for each article we counted how many of them it fulfils.
According to this checklist, the more items an article satisfies, the higher
the quality attributed to the study.
We adopted the following strategy: a) in a first approach, the articles were
analysed by one reviewer; b) subsequently, the remaining articles will be analysed for the
first time, and all of them will then be submitted to a second reviewer, in order to eliminate
as many errors as possible. The concordance between the two reviewers will also be analysed.
The results of this evaluation will be presented in an SPSS table and in Excel.
Metadisc was the software adopted to process the information extracted from the
full articles. This included the number of: a) true positives; b) true negatives; c) false
positives; and d) false negatives.
Once these results are entered, the software draws a forest plot. By analysing this
plot we are able to draw some conclusions about the sensitivity and
specificity of fetal echocardiography in the detection of cardiac anomalies.
The plot obtained so far is only a first interpretation of the final results.
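The text does not describe how Metadisc pools the extracted counts. As a sketch under the simplest assumption, namely that the counts are summed across studies before the proportions are computed, pooled sensitivity and specificity could be obtained as follows. The study counts and the function name are hypothetical.

```python
def pooled_estimates(studies):
    """Pool (tp, fp, fn, tn) tuples by summing counts across studies.

    Returns (pooled sensitivity, pooled specificity). This is a simple
    pooling rule assumed for illustration, not necessarily Metadisc's own.
    """
    tp = sum(s[0] for s in studies)
    fp = sum(s[1] for s in studies)
    fn = sum(s[2] for s in studies)
    tn = sum(s[3] for s in studies)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical studies as (tp, fp, fn, tn), for illustration only.
studies = [(18, 0, 2, 180), (28, 1, 2, 269)]
pooled_sens, pooled_spec = pooled_estimates(studies)
print(f"pooled sensitivity = {pooled_sens:.2f}")
print(f"pooled specificity = {pooled_spec:.3f}")
```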
Finally, we intend to study the homogeneity/heterogeneity of all our results. This
is important because we need to know whether the variance found can be explained by
the sampling variability alone.
So, we need to: a) evaluate the degree of heterogeneity; b) find and explore the
causes of this problem; c) adapt these results with the interpretation of the problem in
Heterogeneity will be analysed with a Chi-square test. This will allow us to evaluate
it by calculating the p value, with a significance level of 5%. If the p
value obtained is > 0.05, our results are relatively homogeneous, whereas if the p value is <
0.05, the heterogeneity is statistically significant.
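One common form of this test is a Pearson Chi-square on the table of true positives and false negatives per study, with one degree of freedom fewer than the number of studies. The sketch below assumes that form (the exact statistic computed by Metadisc is not stated in the text), uses hypothetical counts, and assumes that neither column total is zero. The helper chi2_sf is written out with the standard library so that the example is self-contained.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) times a finite sum of h**i / i!.
        term = math.exp(-h)
        total = term
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return total
    # Odd df: start from the df = 1 case and add the remaining terms.
    total = math.erfc(math.sqrt(h))
    term = math.exp(-h) * math.sqrt(h) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= h / (i + 0.5)
    return total


def homogeneity_test(studies):
    """Pearson chi-square on (tp, fn) counts; returns (statistic, df, p value)."""
    total_tp = sum(tp for tp, _ in studies)
    total_fn = sum(fn for _, fn in studies)
    grand_total = total_tp + total_fn
    statistic = 0.0
    for tp, fn in studies:
        n = tp + fn
        for observed, column_total in ((tp, total_tp), (fn, total_fn)):
            expected = n * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = len(studies) - 1
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical (tp, fn) counts for two studies, for illustration only.
statistic, df, p_value = homogeneity_test([(90, 10), (46, 54)])
verdict = "heterogeneous" if p_value < 0.05 else "relatively homogeneous"
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.3g}: {verdict}")
```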
With the forest plot and the degree of heterogeneity, we are able to interpret
the results obtained. If the situation justifies it,
we will proceed to a subgroup analysis, giving more weight to the results of higher
quality. In addition to dividing the studies into subgroups, we can change the scale or design a
We designed a flowchart reflecting the methods used in our project.
Forest plots of sensitivity and specificity were produced with Metadisc:
Graph 1 - Articles with high sensitivity
Graph 2 - Articles with low sensitivity
Graph 3 - Specificity
Table 1 shows a great variance in quality among the articles found. The
mean quality score, 15.8 out of 25, indicates that these articles have a reasonable
classification in terms of quality.
Analysing the sensitivity of the studies included, we noticed that there was great
heterogeneity among them. One group had a low sensitivity value and the other had a
high one.
The aspects considered to explain the different sensitivities were: methods,
population, quality and date.
In studies of high-risk populations, the sensitivity values tended to be high. In
studies of low-risk populations, by contrast, the sensitivity was lower.
We also noticed that studies that used extended echocardiography had a higher
sensitivity, and studies that used the common four-chamber view had lower sensitivity
values. Extended echocardiography comprised the four-chamber view and
visualisation of the left ventricular outflow tract, the right ventricular outflow tract and
the main pulmonary artery and its branches [4].
All the articles had a very high specificity value.
Nowadays, the four-chamber view is used in low-risk populations and extended
echocardiography in high-risk populations. However, we have to consider that most
neonates with congenital heart disease are born to women without any previously known
risk factors. Since the four-chamber view is less expensive and needs fewer human resources,
and since in high-risk populations the sensitivity tends to be higher with both techniques [4],
we conclude that the four-chamber view may be enough to detect cardiac
anomalies in high-risk populations. The four-chamber view appears to be ineffective as a
screening method in low-risk populations, so the screening technique used there, if any,
should be extended echocardiography. In the future, it would be useful to
investigate how much the costs differ.
As the specificity values are all very similar, we can assume that fetal
echocardiography has a very high specificity, close to 100%.
[1] Julia A. Drose, “Fetal Echography”, W.B. Saunders Company
Lindsey Allan, Lisa K Hornberger, Gurleen Sharland, “Text Book of fetal cardiology”,
[2] "Conducting systematic reviews of diagnostic studies: didactic guidelines" –
[3] "Towards complete and Accurate Reporting of Studies of diagnostic Accuracy: The STARD Initiative" – Patrick M. Bossuyt et al;
