Title Page
Validating the Diagnosis of Acute Ischemic Stroke in a National Health Insurance
Claims Data
Cheng-Yang Hsieh1,2, Chih-Hung Chen3,4, Chung-Yi Li5, Ming-Liang Lai2,3,4
Center and Department of Neurology, Tainan Sin Lau Hospital, Tainan, Taiwan.
of Clinical Pharmacy and Pharmaceutical Science, National Cheng Kung
University, Tainan, Taiwan. 3Stroke Center, National Cheng Kung University Hospital,
College of Medicine, National Cheng Kung University, Tainan, Taiwan. 4Department of
Neurology, College of Medicine, National Cheng Kung University, Tainan, Taiwan.
of Public Health, College of Medicine, National Cheng Kung University,
Tainan, Taiwan
Corresponding Author: Chih-Hung Chen, M.D.
Department of Neurology, College of Medicine, National Cheng Kung University, #1,
University Road, Tainan 701, Taiwan
Tel: +886-6-276-6187
Fax: +886-6-237-4285
E-mail: lchih@mail.ncku.edu.tw
Cover title: Acute ischemic stroke validation
4 tables, 1 figure, no supplemental file
Keywords: acute ischemic stroke; claims data; diagnosis; National Health Insurance;
Taiwan Stroke Registry; validation
Word count: 2461
Background/Purpose: The National Health Insurance Research Database, which uses
claims data from hospitals contracted with the National Health Insurance (NHI)
program in Taiwan, has been widely used for stroke research. The diagnostic accuracy
of the NHI claims data with regard to acute ischemic stroke (AIS) has rarely been
validated. The aim of this study was to validate the diagnosis of AIS in NHI claims
data using the Taiwan Stroke Registry (TSR) as a reference.
Methods: We retrieved patients with a discharge diagnosis of AIS (5-digit
International Classification of Diseases Code, 9th version [ICD-9 code]: 433xx or
434xx) in a single medical center from August 2006 to December 2008. We then
linked these patients to the TSR to validate their AIS diagnosis in the claims data. The
positive predictive value (PPV) and sensitivity were determined.
Results: We reviewed the claims data of 1736 consecutive AIS patients, of whom
1299 (74.8%) were linked successfully to the stroke registry database. After reviewing
the medical records and imaging results of other patients not linked to the registry
database (n=437), 235 patients were found to have had an AIS. The PPV was 88.4%
(95% CI: 86.8-89.8%) and sensitivity 97.3% (95% CI: 96.4%-98.1%). Forty-four (21.8%)
of the false-positive cases (n=202) were coded as 433x0 or 434x0.
Conclusion: The PPV of a diagnosis of AIS in the NHI claims data was high. Using
5-digit ICD-9 codes to identify AIS cases will markedly decrease the false positive rate
compared to using the commonly used 3-digit method.
The National Health Insurance Research Database (NHIRD), derived from the
claims data of the National Health Insurance (NHI) program of Taiwan, has been
widely used in studies on stroke.1-6 Although the accuracy of a diagnosis of stroke in
the NHIRD is critical for the veracity of study results, the only article reporting the
validity of the diagnosis of acute ischemic stroke (AIS) in the NHIRD referred to
clinical practice around 13 years ago.7 Advances in magnetic resonance imaging (MRI)
sequences (e.g. diffusion-weighted imaging [DWI]) and different case-mix effects (e.g.
increased age and comorbidities of the patients) may have substantially changed the
diagnostic accuracy of AIS.
The Taiwan Stroke Registry (TSR), established in May 2006, is the first
national stroke database to assess the quality of stroke care and represents
approximately 18% of stroke patients nationwide.8 The TSR prospectively identifies
acute stroke admissions, including subjects meeting any 1 of the 5 stroke type
definitions, namely ischemic stroke, transient ischemic attack, intracerebral
hemorrhage, subarachnoid hemorrhage (SAH), and cerebral venous thrombosis.8,9
Data are collected prospectively by TSR-trained neurologists and study nurses. The
key items in the TSR form include: (1) preadmission data; (2) inpatient elements
including clinical care during hospitalization, National Institutes of Health Stroke
Scale at admission, in-hospital complications, stroke risk factors, laboratory results of
blood tests, electrocardiography, computed tomography, and MRI findings, and
medications during admission; and (3) discharge status and follow-up information.8,9
TSR data are subject to strict quality control, which makes the TSR a well-validated
stroke database.8
The experience of the Registry of Canadian Stroke Network may be applicable in
Taiwan, since both administrative and clinical registry databases are now available for
stroke studies, and linkage of the NHI claims data with the TSR data is
expected to improve both research quality and the quality of stroke care.10 The aim of the
present study was to validate the diagnosis of AIS in the NHI claims data of a single
medical center using TSR data as a reference, which is more efficient than reviewing the
medical records of all patients.
Data sources and record linkage
Our hospital (National Cheng Kung University Hospital, NCKUH) is a tertiary
referral center contracted with the NHI, with approximately 1200 beds and an
average of 88,000 outpatient visits/month and 28,000 admissions/month. NCKUH
has been participating in the TSR program since August 2006. Instead of extracting
data from the NHIRD, we obtained the claims data that the hospital reported directly to the
Bureau of NHI, to reduce the chance of records being missed during extraction. Each
in-patient NHI claim contains up to five columns of discharge diagnoses. We retrieved the claims
data of NCKUH for hospitalized patients with 5-digit AIS diagnostic codes
(International Classification of Diseases, 9th version, with clinical modification,
[ICD-9-CM code], 433xx or 434xx) in any column of their discharge diagnoses (up to
five) from August 2006 to December 2008. This differs from the previous validation
study of AIS diagnosis in NHI claims data, in which the authors used 3-digit ICD-9
codes for case retrieval.7 Each patient in the claims data and TSR data was
anonymized with an encrypted identifier, which we used to link the claims patients to the
TSR database over the study period. If a patient had multiple hospitalizations for
AIS during this period, only the first hospitalization was included. The study protocol
was reviewed and approved by the Institutional Review Board of the National Cheng
Kung University Medical Center.
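The retrieval and linkage steps described above can be illustrated with a minimal sketch. The field names (`encrypted_id`, `admit_date`) and the toy records are hypothetical and are not the actual layout of the claims or TSR files; the sketch only shows the logic of keeping the first AIS hospitalization per patient and splitting patients by whether their identifier appears in the registry.

```python
from datetime import date


def first_ais_admissions(claims):
    """Keep only the earliest AIS hospitalization for each encrypted identifier."""
    first = {}
    for row in claims:
        pid = row["encrypted_id"]
        if pid not in first or row["admit_date"] < first[pid]["admit_date"]:
            first[pid] = row
    return first


def link_to_registry(claims, registry_ids):
    """Return (linked, unlinked) sets of patient identifiers."""
    first = first_ais_admissions(claims)
    linked = {pid for pid in first if pid in registry_ids}
    unlinked = set(first) - linked
    return linked, unlinked


# Hypothetical toy data: patient "a1" has two AIS admissions, "b2" has one.
claims = [
    {"encrypted_id": "a1", "admit_date": date(2007, 3, 1)},
    {"encrypted_id": "a1", "admit_date": date(2008, 1, 9)},   # later admission, dropped
    {"encrypted_id": "b2", "admit_date": date(2006, 9, 15)},
]
linked, unlinked = link_to_registry(claims, registry_ids={"a1"})
```

In this sketch the unlinked patients are the ones that would go on to manual review of discharge notes and imaging.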
Validating the diagnoses of ischemic stroke
The validation process is summarized in the Figure. AIS patients in the claims data
who were successfully linked to the TSR database with a consistent diagnosis of AIS
were considered to be accurately diagnosed. The TSR uses a strict definition of AIS,8
i.e. “Acute onset of neurological deficits with signs or symptoms
persisting for longer than 24 hours, presenting to the hospital within 10 days of onset,
with or without acute ischemic lesion(s) on brain computed tomography (CT) or with
acute ischemic DWI lesion(s) on MRI that corresponded to the clinical presentations”.
Not all of the AIS patients at our medical center were registered in the TSR database,
partly because the TSR definition of AIS is stricter than the World Health Organization
(WHO) definition that is routinely used in clinical practice, i.e. “Rapidly
developing clinical signs of focal (or global) disturbance of cerebral function, with
symptoms lasting for 24 hours or longer or leading to death, with no apparent cause
other than vascular origin,” plus “No evidence of hemorrhagic stroke on brain
imaging”, with admission within 10 days of symptom onset. For those not linked to the
TSR database, further validation was conducted by reviewing the medical records. A
neurologist (CY Hsieh) reviewed the electronic discharge notes and results of brain
imaging (either CT or MRI) of the patients not linked to the TSR. Patients who
fulfilled either the TSR or the WHO definition were considered true AIS cases;
all others were considered false-positive cases. The final diagnosis of each
false-positive case was also determined by the same neurologist (CY Hsieh) and
assigned to one of 6 categories, as follows:
1. Subacute ischemic stroke, i.e. presenting to the hospital 11-30 days after
symptom onset.
2. Old ischemic stroke, i.e. presenting to the hospital more than 30 days after
symptom onset (e.g. for rehabilitation of stroke-related disability).
3. Precerebral or cerebral artery occlusion without cerebral infarction, i.e.
ICD-9 CM code 433x0 or 434x0.
4. Vasospasm-related cerebral infarction after SAH.
5. “Ruled out” diagnosis, i.e. the AIS diagnosis was ruled out after clinical
evaluation and imaging studies were completed.
6. Other miscoding (e.g. encephalopathy, transient ischemic attack, etc.).
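The time windows used in the definitions above (admission within 10 days of onset for AIS, 11-30 days for subacute stroke, and more than 30 days for old stroke) can be written as a small helper. This is only an illustrative sketch; the function name `onset_category` is hypothetical, and in the study the category was assigned by a neurologist reviewing the records, not by a rule applied automatically.

```python
def onset_category(days_from_onset):
    """Map days between symptom onset and hospital presentation to a category."""
    if days_from_onset < 0:
        raise ValueError("days_from_onset must be non-negative")
    if days_from_onset <= 10:
        return "acute"      # within the 10-day window used for AIS
    if days_from_onset <= 30:
        return "subacute"   # false-positive category 1
    return "old"            # false-positive category 2


categories = [onset_category(d) for d in (0, 10, 11, 30, 31)]
```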
The patients who were considered false-negative using the NHI claims data (i.e.
true AIS cases registered in the TSR but with no relevant AIS diagnostic code in the
discharge diagnoses of the NHIRD) were linked to the whole-population
hospitalization files of the NHIRD using birth date, admission date, discharge date,
and sex. In this way, we were able to retrieve the diagnoses recorded for the false-negative cases.
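Matching on birth date, admission date, discharge date, and sex amounts to a join on a composite key. The sketch below illustrates that idea under stated assumptions: the field names, the `case_id` label, and the toy records are hypothetical, dates are plain ISO strings, and ambiguous matches (several hospitalizations sharing one key) are simply returned together for manual resolution.

```python
def composite_key(record):
    """Build the four-field key used for matching."""
    return (
        record["birth_date"],
        record["admit_date"],
        record["discharge_date"],
        record["sex"],
    )


def match_by_key(registry_cases, hospitalizations):
    """Return {case_id: [matching hospitalization records]}."""
    index = {}
    for hosp in hospitalizations:
        index.setdefault(composite_key(hosp), []).append(hosp)
    return {
        case["case_id"]: index.get(composite_key(case), [])
        for case in registry_cases
    }


# Hypothetical toy data.
registry_cases = [
    {"case_id": "r1", "birth_date": "1940-05-02", "admit_date": "2007-02-10",
     "discharge_date": "2007-02-20", "sex": "F"},
    {"case_id": "r2", "birth_date": "1951-11-30", "admit_date": "2008-06-01",
     "discharge_date": "2008-06-09", "sex": "M"},
]
hospitalizations = [
    {"birth_date": "1940-05-02", "admit_date": "2007-02-10",
     "discharge_date": "2007-02-20", "sex": "F", "dx1": "436"},
]
matches = match_by_key(registry_cases, hospitalizations)
```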
Statistical analysis and methods
We determined the positive-predictive value (PPV), sensitivity, and false-positive
rate of AIS diagnosis with corresponding 95% confidence intervals (CI) for the NHI
claims data after performing the two-step validation process mentioned above. For
the discharge diagnosis columns (up to five) of the patients’ claims data, we further
analyzed in which column (from principal diagnosis to fifth diagnosis) their AIS
diagnostic codes appeared, and performed a sensitivity analysis to determine how many
discharge diagnosis columns should be included when retrieving AIS patients from the NHI
claims data to obtain the best PPV and sensitivity. In addition, since AIS may be more
difficult to diagnose in the elderly, in frail patients, and in those with more
disabilities at baseline, we compared the PPV between the elderly (defined as age 65
years and over) and non-elderly subgroups using the chi-square test. Differences
were considered statistically significant when the two-sided p-value was less than
0.05. All analyses, including the 95% CIs for binomial proportions, were performed using SAS
9.1 for Windows (SAS Institute, Cary, NC).
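As an illustration of these measures, the sketch below recomputes the PPV, sensitivity, and false-positive rate from the counts reported in the Results (1534 true-positive, 202 false-positive, and 42 false-negative cases). The interval method is an assumption: the text does not say which binomial interval SAS was asked to produce, and the Wilson score interval is used here only because it needs nothing beyond the standard library, so its bounds may differ in the last digit from the published ones. Note that the false-positive rate as reported equals 202/1736, i.e. the share of claims-identified cases that were not true AIS.

```python
import math


def wilson_ci(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (assumed method)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


tp, fp, fn = 1534, 202, 42          # counts reported in the Results

ppv = tp / (tp + fp)                # true positives / all claims-coded AIS
sensitivity = tp / (tp + fn)        # true positives / all true AIS
false_positive_rate = fp / (tp + fp)

ppv_ci = wilson_ci(tp, tp + fp)
sensitivity_ci = wilson_ci(tp, tp + fn)
```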
From August 2006 to December 2008, there were 1736 consecutive patients
with AIS diagnostic codes in any one column of their discharge diagnoses in the NHI
claims data of NCKUH. Using the encrypted identifiers,
1299 (74.8%) patients were successfully linked to the TSR and were considered to have an
accurate diagnosis of AIS. For the other 437 patients not linked to the TSR database,
235 patients were considered true-positive AIS cases after review by the neurologist
(Figure). One hundred and fifty-five (66.0%) of these patients fulfilled the stricter TSR
definition of AIS, while 80 (34.0%) fulfilled only the less strict WHO definition
and not the TSR definition of AIS. Of the 155 patients who met the TSR definition but were
not registered in the TSR, 40 (25.8%) were admitted to non-neurological departments due to AIS,
and 52 (33.5%) had in-hospital stroke.
As shown in Table 1, the PPV, sensitivity, and false-positive rate of the NHI
claims data for the diagnosis of AIS were 88.4% (95% CI: 86.8%-89.8%), 97.3% (95%
CI: 96.4%-98.1%), and 11.6% (95% CI: 10.2%-13.2%), respectively. The final diagnoses
of the 202 false-positive AIS cases in the claims data are summarized in Table 2 and
Table 3. Of the false-negative AIS cases (n=42) in the claims data (i.e. true AIS cases in
the TSR, but no AIS diagnostic codes in the discharge diagnoses), 21 were miscoded
as 435xx (n=5), 436xx (n=5), 438xx (n=5), 431xx (n=4), and 437xx (n=2), and 21 had
no diagnostic code relevant to stroke (430xx to 438xx) in their first 5 discharge
diagnostic codes.
Of the true AIS patients (n=1534), 86.2% (n=1322) had the diagnostic codes of
433xx or 434xx as the principal diagnosis, 4.9% were in the second, 3.8% in the third,
2.7% in the fourth, and 2.4% in the fifth diagnostic column. Including all 5 columns of
discharge diagnoses of the claims data had the best PPV and sensitivity in retrieving
AIS cases (Table 4). The accuracy of AIS diagnosis did not differ between the elderly
and non-elderly (PPV: 88.3% and 88.5% for the elderly and non-elderly, respectively;
p = 0.94).
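The column-wise sensitivity analysis can be sketched as follows. The record layout is hypothetical: each retrieved patient carries `ais_column` (the first discharge diagnosis column, 1 to 5, in which an AIS code appears) and `true_ais` (the validation result), and `total_true` is the number of true AIS cases in the reference, including those with no AIS code at all. The toy data are invented for illustration and are not the study data.

```python
def metrics_by_columns(records, total_true, max_cols=5):
    """PPV and sensitivity when only the first k diagnosis columns are searched."""
    results = []
    for k in range(1, max_cols + 1):
        retrieved = [r for r in records if r["ais_column"] <= k]
        true_pos = sum(1 for r in retrieved if r["true_ais"])
        ppv = true_pos / len(retrieved) if retrieved else float("nan")
        sensitivity = true_pos / total_true
        results.append((k, ppv, sensitivity))
    return results


# Hypothetical toy data: 7 retrieved patients, 6 true AIS cases overall
# (one true case has no AIS code in any column and is never retrieved).
records = [
    {"ais_column": 1, "true_ais": True},
    {"ais_column": 1, "true_ais": True},
    {"ais_column": 1, "true_ais": True},
    {"ais_column": 1, "true_ais": False},
    {"ais_column": 2, "true_ais": True},
    {"ais_column": 3, "true_ais": False},
    {"ais_column": 5, "true_ais": True},
]
results = metrics_by_columns(records, total_true=6)
```

Each tuple in `results` holds the number of columns searched, the PPV, and the sensitivity, so the trade-off of adding further columns can be read off row by row.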
In the present study, we demonstrated that the diagnosis of AIS on inpatient
claims data in our medical center had an accuracy (PPV) and sensitivity of 88.4% and
97.3%, respectively, when including all discharge diagnoses (up to 5) to retrieve the
diagnostic codes for AIS. The PPV of the diagnosis of AIS in our study is comparable
to a previous systematic review validating data for AIS diagnosis using administrative
data, in which the PPV ranged from 82% to 92%.11 The diagnostic accuracy was not
affected by the age of the patient. To the best of our knowledge, this is the first
stroke study to link these two large databases (TSR and NHIRD) in Taiwan. Linked
administrative and registry data appear to be representative of the entire stroke
population in Taiwan and provide detailed clinical, laboratory, radiological,
and functional outcome parameters.
Compared with the previous validation study, in which only patients older than 55
years were enrolled to validate the AIS diagnosis in the NHIRD,7 a strength of the
present study is that we included a broader age range of AIS patients for validation,
i.e. 17.5% for those 18-55 years and 82.2% for those over 55 years of age. The
validation results from this study were therefore more representative of the general
population.
Because we linked the patients directly using encrypted identifiers, we were able
to identify the false-positive cases of AIS diagnosis in the claims data and assess the
final diagnosis of these cases. As shown in Table 2, 17.8% of the false-positive cases
were ischemic stroke patients admitted 11-30 days after symptom onset. There are two possible
explanations. First, AIS patients who presented with symptoms other than limb
weakness (e.g. higher cortical function deficits or visual field defects) may not have
been aware of the stroke and therefore came to the hospital more than 10
days after symptom onset. Second, the AIS patients who were beneficiaries of the NHI
would have been given critical illness cards for one month because of their AIS. Any
partial medical payments, including those for readmission due to any reason related
to this AIS episode (e.g. airway infection due to dysphagia or in-patient rehabilitation
for disability after AIS), would then be waived within one month after AIS. Patients
may have been given critical illness cards for AIS after being discharged from
admissions in other hospitals due to AIS, and may have subsequently been admitted
again to our center due to another reason. The probability of readmission within one
month after stroke has been reported to be 10% (95% CI: 9-11%) in Taiwan, with the
most common reason being infection.12 In such readmissions, AIS would not be
the principal discharge diagnosis, and the retained AIS code would therefore count as a
false-positive case in the claims data. The attending physicians were reluctant to delete the
diagnosis of AIS in a readmission within one month because
reimbursement by the NHI might be affected.
In addition, we also found that 15.3% of the false-positive cases were old strokes
coded as AIS, and 6.9% of the false-positive cases were infarction due to vasospasm
after SAH and thus were not true AIS cases. Because there is no corresponding
ICD-9-CM code for vasospasm after SAH, the disease classifier coded the result of
vasospasm, i.e. AIS. As shown in Table 3, 16 patients with either traumatic or
non-traumatic intracranial hemorrhage, as well as 6 patients with transient ischemic
attack were miscoded as AIS. Although these cases were few, the miscoding was
clear-cut, and coding staff should be reminded of the importance of
accurate coding for acute stroke.
Another strength of the present study was that we employed 5-digit ICD-9-CM
codes to retrieve the discharge diagnosis, instead of the 3-digit ICD-9 codes that we
used in our previous validation study7 and most other stroke studies using the
NHIRD.2-6 In total, 21.8% of the false-positive cases were coded as 433x0 or 434x0,
which indicates the occlusion of precerebral or cerebral arteries without cerebral
infarction. Because 99.9% of the discharge diagnoses of in-patient AIS in the NHIRD
use ICD-9-CM codes, at least 20% of the false-positive AIS cases may have been
avoided with the use of modifier codes, i.e. 433x1 or 434x1. Our findings are
different from those reported by Goldstein in 1998, in which the inclusion of modifier
codes did not have an appreciable effect on the accuracy of AIS diagnosis.13 This
difference may be because ICD-9-CM codes have now been used by the NHI for more
than 10 years, so the staff are more experienced in disease classification and thus
provide more appropriate coding.
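The difference between 3-digit and 5-digit retrieval can be made concrete with a small filter. This is a sketch under stated assumptions: the function names are hypothetical, and codes are assumed to arrive either with or without the decimal point (e.g. "434.91" or "43491"). The 5-digit filter keeps only codes whose fifth digit is 1 (with cerebral infarction), whereas the 3-digit filter also accepts 433x0 and 434x0.

```python
import re

_WITH_INFARCTION = re.compile(r"43[34]\d1")


def normalize(code):
    """Strip whitespace and the decimal point from an ICD-9-CM code."""
    return code.strip().replace(".", "")


def is_ais_five_digit(code):
    """True for 433x1 or 434x1 only."""
    return _WITH_INFARCTION.fullmatch(normalize(code)) is not None


def is_ais_three_digit(code):
    """True for any code in the 433 or 434 rubric, regardless of the fifth digit."""
    return normalize(code)[:3] in {"433", "434"}


# Share of false-positive cases coded 433x0 or 434x0, from Table 2.
share_x0 = 44 / 202
```

For example, "433.10" passes the 3-digit filter but is rejected by the 5-digit filter, which is the kind of case counted in the 433x0 or 434x0 row of Table 2.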
Only 74.8% of the patients with an AIS diagnostic code in our center’s claims data were
registered in the TSR database. This may be partly due to the stricter definition of AIS
used by the TSR, i.e. corresponding acute ischemic lesions should be demonstrated on either brain CT
or MRI. Among the 235 patients considered true AIS cases by the neurologist, 155 (66.0%)
also fulfilled the TSR’s definition of AIS. The remaining 80 (34.0%) patients
either did not receive an MRI examination when corresponding acute ischemic
lesions were not present in the initial CT, or the infarction was too small to be
identified on the MRI-DWI sequence.
One limitation of the present study is that the validation materials came from
only one medical center, so extrapolation of the results to other
institutions is limited. Different institutions may have different
reporting principles, diagnosis coding rules, and criteria for acute ischemic stroke. To
the best of our knowledge, there are no published reports on the validation of AIS
diagnosis in a non-medical center, and further studies in such settings may be
warranted. In addition, this study used the TSR as the
reference standard. Although the TSR is a well-designed registry, its
enrollment criteria for AIS may differ from those reflected in NHI claims data,
for example in the number of days since stroke onset and in the admitting department
(neurology vs. all departments). We tried to resolve this discrepancy by reviewing
the medical records and imaging results of cases not linked to the TSR to confirm whether
their AIS diagnoses were true. Finally, for patients with multiple hospitalizations for AIS,
we retrieved only the first AIS episode to avoid old strokes being miscoded as acute
ones. This may have excluded some patients with definite recurrent AIS.
The PPV and sensitivity of inpatient NHI claims data were both high in this
medical center. Using 5-digit ICD-9-CM codes to retrieve the AIS diagnostic codes (i.e.
433x1 or 434x1) will decrease the number of false-positive AIS cases identified from
the claims data by at least 20%, and it should be applied in the future for AIS studies
that use the NHIRD.
The authors wish to thank Edward Chia-Cheng Lai for his assistance in statistical
programming, Dr. Meng-Tsang Hsieh for his collection of medical records, and Professor
Yea-Huei Kao Yang and Assistant Professor Ching-Lan Cheng for their critical review of
our manuscript.
Sources of funding: This research was funded by National Cheng Kung University
Hospital (NCKUH-10101001), Tainan Sin Lau Hospital (SLH-10124), and the Taiwan
National Science Council (NSC 96-2320-B-006-028-MY3), Multidisciplinary Center of
Excellence for Clinical Trial and Research (DOH100-TD-B-111-002), Department of
Health, Executive Yuan, Taiwan. The funding sources had no role in the design,
analysis, interpretation, or reporting of results or in the decision to submit the
manuscript for publication.
Disclosures: The authors have no conflicts of interest to disclose.
1. National Health Research Institute: Background of National Health Insurance
Research Database. http://www.Nhri.Org.Tw/nhird/en/index.htm. Accessed
October 12, 2012.
2. Lin HC, Xirasagar S, Chen CH, Lin CC, Lee HC: Association between physician
volume and hospitalization costs for patients with stroke in Taiwan: a
nationwide population-based study. Stroke. 2007;38:1565-9.
3. Tung YC, Chang GM: The effect of cuts in reimbursement on stroke outcome: A
nationwide population-based study during the period 1998 to 2007.
4. Chang CH, Shau WY, Kuo CW, Chen ST, Lai MS: Increased risk of stroke associated
with nonsteroidal anti-inflammatory drugs: A nationwide case-crossover study.
5. Chen PC, Muo CH, Lee YT, Yu YH, Sung FC: Lung cancer and incidence of stroke:
A population-based cohort study. Stroke. 2011;42:3034-3039.
6. Sheu JJ, Kang JH, Lou HY, Lin HC: Reflux esophagitis and the risk of stroke in
young adults: A 1-year population-based follow-up study.
7. Cheng CL, Kao YH, Lin SJ, Lee CH, Lai ML: Validation of the National Health
Insurance Research Database with ischemic stroke cases in Taiwan.
Pharmacoepidemiol Drug Saf. 2011;20:236-242.
8. Hsieh FI, Lien LM, Chen ST, Bai CH, Sun MC, Tseng HP, Chen YW, Chen CH, Jeng JS,
Tsai SY, Lin HJ, Liu CH, Lo YK, Chen HJ, Chiu HC, Lai ML, Lin RT, Sun MH, Yip BS,
Chiou HY, Hsu CY; Taiwan Stroke Registry Investigators: Get With the
Guidelines-Stroke performance indicators: Surveillance of stroke care in the
Taiwan Stroke Registry: Get With the Guidelines-Stroke in Taiwan.
36526.DC1/CIR200805-Online_Appendix.pdf. Accessed September 24, 2012.
10. Fang J, Kapral MK, Richards J, Robertson A, Stamplecoski M, Silver FL: The
Registry of Canadian Stroke Network: An evolving methodology. Acta Neurol
11. Andrade SE, Harrold LR, Tjia J, Cutrona SL, Saczynski JS, Dodd KS, Goldberg RJ,
Gurwitz JH: A systematic review of validated methods for identifying
cerebrovascular accident or transient ischemic attack using administrative data.
Pharmacoepidemiol Drug Saf. 2012;21(Suppl 1):100-128.
12. Lin HJ, Chang WL, Tseng MC: Readmission after stroke in a hospital-based
registry: Risk, etiologies, and risk factors. Neurology. 2011;76:438-443.
13. Goldstein LB: Accuracy of ICD-9-CM Coding for the Identification of Patients
with Acute Ischemic Stroke: Effect of Modifier Codes.
Table 1: Validation of National Health Insurance (NHI) claims records on acute
ischemic stroke (AIS)
                                Validation results, number
NHI claims diagnosis for AIS    AIS (+)    AIS (-)
AIS (+)                         1534       202
AIS (-)                         42         NA
NA: not applicable.
Table 2: The reasons for false-positive AIS diagnoses (N=202) and their percentage
Reason                                     N (%)
Other miscoding                            57 (28.2)
433x0 or 434x0                             44 (21.8)
Subacute ischemic stroke                   36 (17.8)
Old ischemic stroke                        31 (15.3)
Ruled out diagnosis                        20 (9.9)
Vasospasm after subarachnoid hemorrhage    14 (6.9)
Definition of subacute: 11-30 days, and old stroke: >30 days after symptom onset.
Total percentage may not equal 100% because of rounding.
Table 3: Final diagnoses of other miscoding false-positive (N=57) cases and their
percentage
N (%)
Intracranial hemorrhage*
16 (28.1)
Other neurological diseases†
15 (26.3)
Toxic/metabolic/anoxic encephalopathy
Brain tumor‡
6 (10.5)
Transient ischemic attack
6 (10.5)
Hypoglycemia or hyperglycemia
2 (3.5)
1 (1.8)
*Intracranial hemorrhage included both spontaneous and traumatic cases;
†Other neurological diseases included neurodegenerative diseases, non-diabetic
ischemic oculomotor nerve palsy, cerebral autosomal dominant arteriopathy with
subcortical infarcts and leukoencephalopathy, and peripheral neuropathy;
‡Brain tumors included both primary and metastatic cases.
Table 4: Sensitivity analysis for the effect of enrolling different numbers of discharge
diagnoses on positive-predictive value (PPV) and sensitivity
Principal diagnosis only
Principal +
Principal + 2nd + 3rd diagnoses
Principal +
Principal + 2nd + 3rd + 4th + 5th diagnoses
PPV (%)
Figure legends
Figure: Algorithm of the validation process.