Quality Ratings and Premiums in the Medicare Advantage Market

Ian M. McCarthy∗
Department of Economics
Emory University
Michael Darden†
Department of Economics
Tulane University
January 2015
Abstract
We examine the response of Medicare Advantage contracts to published quality ratings. We
identify the effect of star ratings on premiums using a regression discontinuity design that exploits
plausibly random variation around rating thresholds. We find that 3, 3.5, and 4-star contracts
in 2009 significantly increased their 2010 monthly premiums by $20 or more relative to contracts
just below the respective threshold values. High quality contracts also disproportionately dropped
$0 premium plans or expanded their offering of positive premium plans. Welfare results suggest
that the estimated premium increases reduced consumer welfare by over $250 million among the
affected beneficiaries.
JEL Classification: D21; D43; I11; C51
Keywords: Medicare Advantage, Premiums, Quality Ratings, Regression Discontinuity
∗ Emory University, Rich Memorial Building, Room 306, Atlanta, GA 30322. Email: ian.mccarthy@Emory.edu
† 206 Tilton Memorial Hall, Tulane University, New Orleans, LA 70115. Email: mdarden1@tulane.edu
1 Introduction
The role of Medicare Advantage (MA) plans in the provision of health insurance to Medicare beneficiaries has grown substantially. Between 2003 and 2014, the share of Medicare eligible individuals
in an MA health plan increased from 13.7% to 30%.1 To better inform enrollees of MA quality, in
2007, the Centers for Medicare and Medicaid Services (CMS) introduced a five-star rating system that
provided a rating of one to five stars to each MA contract – a private organization that administers
potentially many differentiated plans – in each of five quality domains.2 For the 2009 enrollment
period, CMS began aggregating the domain-level quality scores into an overall star rating for each MA contract, with each plan offered by a contract displaying the contract’s quality star rating.
Since 2012, contracts have been incentivized to earn high quality star ratings through star-dependent
reimbursement and bonus schemes.
Early studies on the effects of the star rating program focus on the informational benefits to
Medicare beneficiaries. To this end, the program has been found to have a relatively small positive
effect on beneficiary choice, with heterogeneous effects across star ratings (Reid et al., 2013; Darden
& McCarthy, forthcoming). However, one area thus far overlooked concerns the supply-side response
to MA star ratings, where a natural consequence of the star rating program could be for contracts
to adjust premiums and other plan characteristics in response to published quality ratings.3 Indeed,
while the quality star program is often presented as a potential information shock to enrollees, the
program could also serve as an information shock to health insurance contracts, better informing them
of competitor quality and better informing contracts of their own signal of quality to the market. For
example, learning that its plans have the highest quality star rating in a market in 2009, a contract may
choose to price out its quality advantage in 2010 by raising plan premiums. Conversely, a relatively
low-rated contract may lower its 2010 premium in response to its 2009 quality star rating. More
generally, the extent to which policy may cause health insurance companies to adjust premiums is a
central question in health and public economics.4
The current paper provides a comprehensive analysis of 2010 premium adjustments to the 2009
publication of MA contract quality stars. We investigate the specific mechanisms by which contracts
can adjust their premiums in response to their quality ratings, and we calculate the corresponding welfare effects. We adopt a regression discontinuity (RD) design that exploits plausibly random variation
around 2009 star thresholds, allowing us to separately identify the effect of reported quality on price
from the overall relationship between quality and price. Our data on contract/plan market shares,
reported contract quality, plan premiums, and other plan characteristics come from several publicly
available sources. Our results suggest strong premium adjustments following the 2009 star rating program, with average to above average star-rated contracts significantly increasing premiums from 2009
to 2010. When we conduct our analysis at the contract level, we find that 3, 3.5, and 4-star contracts
increase their average premiums across existing plans by $33.60, $29.30, and $31.85, respectively, relative
to contracts with 2009 ratings just below the respective threshold values. At the plan level, we estimate
mean increases of $19.40, $41.99, and $31.52 for 3, 3.5, and 4-star contract/plans, respectively. These
effects are sizable compared to overall average premium increases of between $9 and $15. The results
are also broadly consistent across a range of sensitivity analyses, including consideration of alternative
bandwidths, falsification tests with counter-factual threshold values, and the exclusion of market-level
covariates.

1 Kaiser Family Foundation MA Update, available at http://kff.org/medicare/fact-sheet/medicare-advantage-fact-sheet/.
2 For example, one domain on which contracts were rated was “Helping You Stay Healthy.”
3 Preliminary evidence of a supply-side response to the publication of MA quality stars was found in Darden & McCarthy (forthcoming), albeit with a restricted sample of contract/plan/county/year observations.
4 For example, see Pauly et al. (2014) on the effects of the Affordable Care Act on individual insurance premiums.
While an MA contract may directly adjust its plans’ premiums in response to quality stars, the
contract may also adjust the mix of plans it offers within a market (county). For example, in response
to the published star ratings, a contract could alter the number of zero-premium plans; adjust the
number of plans that include Medicare Part D coverage; change the drug deductible in plans that offer
part D coverage; or add/drop plans entirely. Indeed, our data show that nearly all of the regional
variation in plan premiums is due to selection of plan offerings by contracts, as opposed to contracts
charging different premiums in different areas of the country. We find that contracts just above the
3 and 3.5-star thresholds in 2009 are more likely to drop $0 premium plans in 2010, with 3.5-star
contracts also more likely to introduce positive premium plans into new markets. We find no such
disproportionate change in $0 or positive premium plans among contracts with a 4-star rating in 2009.
Meanwhile, low quality contracts (those just above the 2.5-star threshold in 2009) maintain their 2009
plan offerings at largely the same premium levels in 2010, while contracts just below the 2.5-star
threshold in 2009 are much more likely to exit the market altogether in 2010.
Overall, our results suggest that the star rating program in 2009 may have caused low quality
contracts to drop plans while generating large premium increases among contracts receiving 3-star
ratings and above. Adopting the consumer welfare calculations used in Town & Liu (2003) and
Maruyama (2011), our estimated increases in premiums imply a reduction in consumer surplus of over
$250 million among those beneficiaries enrolled in the relevant plans. To the extent that higher quality
plans are replacing low quality plans at reasonable premium levels, plan entry and exit behaviors
induced by the star-rating program may partially offset this welfare loss; however, given the number of
new plans estimated to have entered the market due to the star ratings, such offsets are likely relatively
small (Maruyama, 2011).
In what follows, we discuss the institutional details of Medicare Advantage and the recent star
rating program in Section 2. The data and methods are discussed in Sections 3 and 4, respectively.
We present our results in Section 5, with a series of robustness checks discussed in Section 6. Section
7 examines the potential mechanisms underlying our estimated premium adjustments, and Section 8
summarizes the welfare effects associated with our estimated premium increases. The final section
concludes.
2 Institutional Background
Since Medicare’s inception, beneficiaries have had the option to receive benefits through private health
insurance plans. The Balanced Budget Act of 1997 (BBA) classified all private Medicare health insurance plans as Medicare Part C plans, and it allowed for additional types of business models including
Preferred Provider Organizations (PPOs), Provider-Sponsored Organizations (PSOs), Private fee-for-service (PFFS) plans, and Medical Savings Accounts (MSAs). Later, in addition to establishing the beneficiary entitlement to prescription drug coverage, the Medicare Modernization Act of 2003 renamed Medicare Part C plans as Medicare Advantage (MA) plans. Each year since 2003, Medicare beneficiaries have chosen between traditional fee-for-service (FFS) Medicare and an MA plan during an open enrollment period from November 1st through December 31st. Enrollees in an MA plan must pay Medicare Part B premiums in addition to any premium charged by the plan. In exchange,
MA plans provide at least (often more than) the services covered by traditional FFS Medicare. In
2009, 38% of MA plans charged no additional premium, while 77% of plans also offered prescription
drug coverage. Given the generosity of plan coverage at possibly no additional cost relative to traditional Medicare FFS, the MA market has grown dramatically in recent years, with the share of Medicare eligible individuals in an MA plan increasing from 13.7% in 2003 to 30% in 2014.5

5 Kaiser Family Foundation MA Update, available at http://kff.org/medicare/fact-sheet/medicare-advantage-fact-sheet/.
Broadly, an MA contract is an agreement between a private insurance company and CMS whereby
the company agrees to insure Medicare beneficiaries in exchange for reimbursement. A contract is
approved by CMS to operate in specific counties, and an approved contract typically offers a menu
of MA plans that are differentiated by premium, prescription drug coverage, and, if covered, the
prescription drug deductible. Most MA contracts are required to offer at least one plan that includes
prescription drug coverage. For the 2015 enrollment year, 78% of all Medicare beneficiaries live in a
county with access to at least one plan that offers prescription drug coverage (MA-PD) and charges no
additional premium (above the Part B premium).6 In 2009, the mean number of MA plans available to
beneficiaries was roughly 11 plans per county.7 However, there exists considerable regional variation in
the availability of MA plans, and enrollments in MA plans are concentrated in a few national contracts.
Indeed, according to the Kaiser Family Foundation (KFF), 60% of all plans offered in 2015 are affiliated
with just seven health insurance companies.8

6 http://kff.org/medicare/issue-brief/medicare-advantage-2015-data-spotlight-overview-of-plan-changes/
7 Author’s calculation. See Section 3 for a presentation of our data.
8 See http://kff.org/medicare/issue-brief/medicare-advantage-2015-data-spotlight-overview-of-plan-changes/.
Starting in the 2007 enrollment year, CMS began collecting and distributing a one to five-star
quality rating in each of five quality domains (e.g., “Helping You Stay Healthy”). Each domain
was itself an aggregation of many individual quality metrics such as the percentage of enrollees with
access to an annual flu vaccine. These individual quality metrics are calculated based on data from a
variety of sources, including the Healthcare Effectiveness Data and Information Set (HEDIS), the Consumer Assessment of Healthcare Providers and Systems
(CAHPS), the Health Outcomes Survey (HOS), the Independent Review Entity (IRE), the Complaints
Tracking Module (CTM), and CMS administrative data. Starting in enrollment year 2009, CMS began
aggregating the domain level quality stars to an overall contract rating of between one and five stars
(in half-star increments).9 Since 2011, CMS has constructed the contract-specific quality ratings as a function of Part D coverage, when relevant. Our focus is on the 2009 and 2010 enrollment years – the first two years of the overall contract star rating program and the years in which all contracts,
including those offering prescription drug coverage, were rated based on the same underlying quality
metrics.

9 For a complete discussion of the star rating program, see Darden & McCarthy (forthcoming).
The literature on the MA quality rating initiatives has generally focused on the enrollment effects.
Recently, Reid et al. (2013) find large effects of increases in star-ratings on enrollment that are homogeneous across the reported quality distribution, but results from that paper fail to disentangle the
effects of quality from quality reporting on enrollment. Attempting to disentangle these effects, Darden
& McCarthy (forthcoming) find heterogeneous effects of the quality star rating program on MA plan
enrollment in 2009 and no significant effect in 2010. At the plan level, they find that a marginally
higher rated contract at the lower end of the quality distribution (e.g., a 3-star as compared to a 2.5-star
contract) realized a positive and significant enrollment effect equal to 4.75 percentage points relative
to traditional FFS Medicare in 2009 enrollments. This effect diminishes for higher rated contracts,
and vanishes for the 2010 enrollment year. The lack of an enrollment response to 2010 quality stars
suggests that the 2009 star ratings may have acted as a one-time informational event, or that there
was a supply-side response in 2010 based on the 2009 ratings.
Generally, the potential for supply-side responses to Medicare Advantage policy has received little
attention from researchers. One recent exception is Stockley et al. (2014), who examine how MA plan
premiums and benefits respond to variation in the benchmark payment rate - the subsidy received
by the MA contract for each enrollee. Those authors find that contracts do not adjust premiums
directly as a result of changes in benchmark payment rates, but rather contracts adjust the generosity
of plan benefits in response. Conversely, Darden & McCarthy (forthcoming) find that contract/plans
in 2010 raise premiums in response to higher 2009 contract-level quality star ratings. However, the
sample used to estimate the supply-side response of contracts in 2010 was restricted to just those
contract/plans with (a) 10 or more enrollees in both 2009 and 2010 and (b) nonmissing quality ratings in 2010. Furthermore, that paper focuses only on direct premium increases, ignoring the possibility of indirect premium adjustments such as changing the number of zero-premium plans or adjusting the plan mix within a county. The current paper provides a comprehensive examination of the supply-side
response to quality star ratings, examining the full population of approved MA contracts to evaluate
several potential response mechanisms as well as potential welfare consequences.
3 Data
We collect data on market shares, contract/plan characteristics, and market area characteristics from
several publicly available sources for calendar years 2009 and 2010.10 As a base, we use the Medicare
Service Area files to form a census of MA contracts that were approved to operate in each county
in the United States in 2009 and 2010. To these contract/county/year observations, we merge contract/plan/county/year data on enrollment and other contract characteristics.11 To our market share
data, we merge further information on MA contract quality ratings, contract/plan premiums, countylevel MA market share, CMS benchmark rates, fee-for-service costs, hospital discharges, and census
data. The CMS quality information includes an overall summary star measure; star ratings for different domains of quality (e.g., helping you stay healthy); as well as star ratings and continuous summary
scores for each individual metric (e.g., percentage of women receiving breast cancer screening and an
associated star rating). Data are not available for the overall continuous summary score (i.e., the score
rounded to generate an overall star rating), but we are able to replicate this variable by aggregating the
specific quality measures following CMS instructions. We explain this process thoroughly in Appendix
B. Hospital discharge data are from the annual Hospital Cost Reporting Information System (HCRIS),
and CMS benchmark rates and average FFS costs by county are publicly available from CMS. Finally,
county-level demographic and socioeconomic information are from the American Community Survey
(ACS).
10 See Appendix C for a detailed discussion of our dataset and specific links.
11 CMS suppresses enrollment counts for contract/plans with 10 or fewer enrollees, but we keep these observations and impute enrollment. The Service Area files are needed because the enrollment files do not account for migration. For example, it is possible for the enrollment file to contain a positive enrollment record for a contract/plan in a county even if that contract is not approved to operate in the county. See Appendix C for further details.
Our enrollment data are available monthly; however, there is little variation in enrollments across
months due to the nature of the open enrollment process at the end of each calendar year. Furthermore, all other variables of interest are specific to a calendar year. Therefore, we take the average
enrollment of each plan across months in a given year. The resulting unit of observation is the contract/plan/county/year. Our analysis focuses only on health maintenance organizations (HMO), local
and regional preferred provider organizations (PPO), and private fee-for-service (PFFS) contracts.
We exclude all special needs plans and employer/union-specific plans (also known as 800-series plans),
and we drop all observations that pertain to United States Territories and Outlying Areas. Our final
sample includes 247,978 contract/plan/county/years.
Table 1 provides summary statistics for our final dataset at the plan, county, and contract level.
The data consist of 51,442 and 34,642 plan/county observations in 2009 and 2010, respectively, with
an increase in average MA enrollment per plan from 292 in 2009 to 361 in 2010.12 The county-level
summary statistics also reveal an increasing penetration of MA in the overall Medicare market, from
15.6% in 2009 to 16.5% in 2010, alongside a decrease in the number of plans offered per county,
an increase of just over $15 in average premiums, an increase in the percentage of plans offering
prescription drug coverage, and an increase in the proportion of HMO and PPO plans relative to
PFFS plans. Finally, the bottom panel of Table 1 illustrates a slight rightward shift in the distribution
of star ratings from 2009 to 2010, with 1.5-star contracts either improving in rating in 2010 or exiting
the market, and with a relative increase in the percentage of 4.5 and 5-star contracts in 2010.

12 As indicated in Table 1, enrollment data are not available for all plans, as CMS does not provide enrollment counts for plans with 10 or fewer enrollments. As such, the mean enrollment figures presented are higher than the true means, as they exclude a large number of plans with missing enrollment data.
Table 1
4 Methodology
Since star ratings are assigned to contracts (rather than specific plans operating within a contract),
our initial analysis follows Town & Liu (2003), Cawley et al. (2005), Dafny & Dranove (2008), Frakt
et al. (2012) and others in aggregating plan characteristics to the contract level by taking the mean
values across plans within a contract (in the same county). We then examine the relationship between
a contract’s quality star rating in 2009 and the contract’s premiums in 2010. Denoting the vector of mean characteristics in market $m$ (county) for contract $c$ by $\bar{y}_{cm} = \{\bar{y}_{cm,1}, \ldots, \bar{y}_{cm,K}\}$, we specify the mean characteristic $k$ for contract $c$ as follows:

$$\bar{y}_{cmk} = f(q_c, X_{cm}, W_m) + \varepsilon_{cmk}, \qquad (1)$$
where $q_c$ denotes the contract’s star rating in 2009, $X_{cm}$ denotes other contract characteristics, $W_m$ denotes 2010 market-level data on the age, race, and education profile of a given county, and $\varepsilon_{cmk}$ is an error term independently distributed across characteristics and markets.13 Given our focus on premiums, our plan characteristics of interest consist of the average premium and the proportion of the contract’s plans (in the same county) charging a $0 premium.14
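To make the contract-level aggregation concrete, the following is a minimal pandas sketch; the column names (contract_id, county, year, premium) are our own illustrative labels rather than the names in the CMS files:

```python
import pandas as pd

def contract_means(plans: pd.DataFrame) -> pd.DataFrame:
    """Average plan characteristics within each contract/county/year cell."""
    grouped = plans.groupby(["contract_id", "county", "year"])
    return grouped.agg(
        avg_premium=("premium", "mean"),
        # Proportion of the contract's plans in the county charging a $0 premium.
        share_zero_premium=("premium", lambda p: (p == 0).mean()),
    ).reset_index()
```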
The CMS quality rating system relies on a continuous summary score between 1 and 5, which is rounded to the nearest half star. A contract with a 2.24 summary score is therefore rounded down to a 2-star rating, while a contract with a 2.26 summary score is rounded up to a 2.5-star rating. Intuitively, these two contracts are essentially identical in quality but received different quality ratings. We propose to exploit the nature of this rating system using a regression discontinuity (RD) design.15
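The rounding rule is simple to state precisely; below is a minimal sketch (our own illustration, not CMS code). Python's built-in round() uses banker's rounding, so we round halves up explicitly, consistent with a score exactly at a threshold such as 2.25 earning the higher rating:

```python
import math

def star_rating(summary_score: float) -> float:
    """Round a continuous summary score to the nearest half star, halves up."""
    return math.floor(summary_score * 2 + 0.5) / 2

assert star_rating(2.24) == 2.0   # rounds down to 2 stars
assert star_rating(2.25) == 2.5   # threshold score rounds up
assert star_rating(2.26) == 2.5   # rounds up to 2.5 stars
```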
More formally, denote by $R_c$ the underlying summary score, by $\hat{R}$ the threshold summary score at which a new star rating is achieved (e.g., $\hat{R} = 2.25$ when considering the 2.5-star rating), and by $\tilde{R}_c = R_c - \hat{R}$ the distance of the contract’s summary score from the threshold. We then limit our analysis to contracts with summary scores within a pre-specified bandwidth, $h$, around each respective threshold value, $\hat{R}$. For example, to analyze the impact of improving from 2.0 to 2.5 stars, the sample is restricted to contracts with summary scores of $2.25 \pm h$.
To implement our approach, we specify plan/contract quality as follows:
$$q_c = \gamma_1 + \gamma_2 \times I\left(R_c > \hat{R}\right) + \gamma_3 \times \tilde{R}_c + \gamma_4 \times I\left(R_c > \hat{R}\right) \times \tilde{R}_c, \qquad (2)$$
where $\gamma_2$ is the main parameter of interest. Incorporating this RD framework into equation (1), and adopting a linear functional form for $f(\cdot)$, yields the final regression equation

$$\bar{y}_{cmk} = \gamma_1 + \gamma_2 \times I\left(R_c > \hat{R}\right) + \gamma_3 \times \tilde{R}_c + \gamma_4 \times I\left(R_c > \hat{R}\right) \times \tilde{R}_c + \beta_c X_{cm} + \beta_m W_m + \varepsilon_{cmk}, \qquad (3)$$
where $W_m$ and $X_{cm}$ are as discussed previously. Our baseline analysis estimates equation (3) using ordinary least squares with a bandwidth of $h = 0.125$. We consider alternative bandwidths in Section 6, as well as a more traditional RD design with a triangular kernel (Imbens & Lemieux, 2008).

13 We cluster standard errors by contract; however, the results are qualitatively unchanged when clustering standard errors at the county level.
14 The overall plan type (e.g., HMO versus PPO) is typically contract-specific and therefore does not vary across plans within the same contract.
15 See Imbens & Lemieux (2008) for a detailed discussion of the RD design and its application in economics.
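For concreteness, the following sketch implements the local linear estimator in equation (3) without the additional controls; the DataFrame and its column names (score for the 2009 summary score, y2010 for the 2010 outcome, contract_id for clustering) are our own illustrative assumptions rather than the authors' code:

```python
import statsmodels.formula.api as smf

def rd_estimate(df, threshold, h=0.125):
    """Estimate gamma_2, the jump in the 2010 outcome at a 2009 star threshold."""
    df = df.assign(
        r_tilde=df["score"] - threshold,               # distance to the threshold
        above=(df["score"] > threshold).astype(int),   # indicator I(R_c > R_hat)
    )
    local = df[df["r_tilde"].abs() <= h]               # bandwidth restriction
    # Slopes may differ on each side of the threshold; the coefficient
    # on `above` is the RD estimate of gamma_2.
    fit = smf.ols("y2010 ~ above + r_tilde + above:r_tilde", data=local).fit(
        cov_type="cluster", cov_kwds={"groups": local["contract_id"]}
    )
    return fit.params["above"], fit.bse["above"]
```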
Changes in mean premiums at the contract level can arise in several ways, most directly via changes
to premiums among specific plans. To investigate this possibility, we also estimate a regression of 2010
plan premiums as a function of the plans’ 2009 premiums, 2009 star ratings and other contract-level
variables, and 2009 county characteristics. This analysis is akin to estimating equation (3) but where
our analysis is at the plan level rather than aggregating to the contract level. For this analysis, we
examine only plans that were available in the same county in both 2009 and 2010.
5 Results

5.1 Average Premiums at the Contract Level
Table 2 presents the results of a standard OLS regression of mean contract characteristics in 2010
on the 2009 mean value, the contract’s 2009 star rating, as well as additional county and contract-level covariates. To the extent that contract quality is already reflected in the contract’s mean plan
characteristics, we would expect the effects of increasing star ratings to be relatively small in magnitude.
This is the case in Table 2, where we see small decreases in average premiums among 2.5 and 4-star
contracts with small increases in premiums among 3 and 3.5-star contracts (relative to contracts with
one-half star lower ratings). Note that, in order to better reflect the premium charged to a given
enrollee in a specific contract, our analysis of average premiums at the contract level excludes plans
with 10 or fewer enrollments.16 Our analysis at the plan level makes no such exclusion.
Table 2
The OLS results say little about the specific effects of an increase in reported quality on premiums.
To address this question directly, Table 3 presents the initial RD results at the contract level for a
bandwidth of h = 0.125. The results suggest a large premium increase for contracts receiving a 3, 3.5,
or 4 star rating in 2009, with these contracts increasing average premiums by between $29 and $34
per month from their 2009 levels relative to contracts with one-half star lower ratings. By contrast,
contracts receiving a 2.5-star rating showed no statistically significant increase in premiums. By virtue
of the RD design and the nature of the CMS star rating program, we argue that these estimates can be
interpreted as the causal effect of a one-half star increase in quality ratings separate from the quality
of the contract itself. For example, 3.5-star contracts of comparable “true” quality to 3-star contracts were able to increase their premiums by an average of $29 per month. Looking purely at sample averages, all other contracts receiving a 3.5-star rating in 2009 increased their premiums by an average of $12, while 3-star contracts falling just below the 3.25 threshold increased their premiums by just over $3. We provide extensive robustness and sensitivity analyses for these results in Section 6.

16 Not surprisingly, low star-rated plans with 10 or fewer enrollments also charge much higher premiums relative to the same quality plans with higher enrollments. For example, in 2010, the average premium among 2.5-star plans with 10 or fewer enrollments was $63, compared to just $32 among 2.5-star plans with 11 or more enrollments. The results are nonetheless consistent when we include all plans and an indicator variable for missing enrollment data.
Table 3
5.2 Premiums at the Plan Level

Table 4 summarizes the RD results for 2010 plan premiums as a function of 2009 premiums, county-level covariates, as well as the contract’s quality rating as specified in equation (2). This analysis
therefore estimates premium changes at the plan level (for the same plans offered in both 2009 and
2010), rather than analyzing average premiums at the contract level as in Table 3. For the same
plan/county/contract, the results again show a large and statistically significant increase in premiums
for 3, 3.5, and 4-star contracts, with premiums increasing by between $19 and $42 per month for the
same plans.
Table 4
6 Robustness and Sensitivity Analysis
The appropriateness of our proposed RD design depends critically on whether contracts can precisely manipulate their summary scores. Intuitively, it is unlikely that contracts can manipulate their scores
because the star ratings are calculated based on data two or three years prior to the current enrollment
period. Contracts would therefore not have the opportunity to manipulate other observable plan
characteristics in response to their same-year star ratings. To test this formally, McCrary (2008)
proposes a test of discontinuity in the distribution of summary scores around the threshold values.
The resulting t-statistics range from 0.15 to 0.96, suggesting no evidence of a discontinuity in the
running variable at any of the threshold values. In the remainder of this section, we investigate the
sensitivity of our results along several other dimensions, including: 1) bandwidth selection; 2) inclusion
of covariates; and 3) falsification tests with counter-factual threshold values.
6.1 Choice of Bandwidth
The choice of bandwidth is a common area of concern in the RD literature (Imbens & Lemieux, 2008;
Lee & Lemieux, 2010). To assess the sensitivity of our results to the choice of bandwidth, we replicated
the local linear regression analysis from Tables 3 and 4 for alternative bandwidths ranging from 0.1 to
0.25 in increments of 0.005. The results for mean plan premiums at the contract level (Table 3) are
illustrated in Figure 1, where each graph presents the estimated star-rating coefficient, $\hat{\gamma}_2$, along with
the upper and lower 95% confidence bounds. Similar results for plan-level premium adjustments are
presented in Figure 2. In general, our results are consistent across a range of alternative bandwidths.
Figure 1
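The bandwidth sweep behind Figures 1 and 2 is mechanical given the estimator described above; a sketch, reusing the hypothetical rd_estimate helper from Section 4 and assuming a DataFrame contracts of contract-level observations:

```python
import numpy as np

# Re-estimate the threshold jump for bandwidths from 0.1 to 0.25 in steps
# of 0.005, collecting 95% confidence bounds for plotting.
sweep = []
for h in np.arange(0.100, 0.250 + 1e-9, 0.005):
    gamma2, se = rd_estimate(contracts, threshold=2.75, h=h)
    sweep.append((h, gamma2, gamma2 - 1.96 * se, gamma2 + 1.96 * se))
```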
6.2 Inclusion of Covariates
The RD literature generally advises against including covariates in a standard RD design (Imbens
& Lemieux, 2008; Lee & Lemieux, 2010). The intuition for this advice is as follows: if treatment
assignment is random within the relevant bandwidth, then the covariates should also be randomly
assigned to the treated and control groups. However, in our setting, purely randomized quality scores
at the contract level would not necessarily imply randomization in county-level variables. As such, we
argue that county-level covariates belong in our analysis in order to control for geographic variation
influencing contract location and plan offerings.
Nonetheless, we assess the sensitivity of our analysis to the exclusion of these covariates by estimating a more traditional RD model with the right-hand side variables presented in equation (2). We estimate
the effect of a one-half star increase in quality ratings with a triangular kernel and our preferred
bandwidth of h = 0.125. The results, summarized in Table 8, are generally consistent with our initial
findings in Tables 3 and 4, where we again see large increases in average premiums among 3, 3.5, and
4-star contracts relative to contracts just below the respective star-rating thresholds. One exception is
the estimated effect on individual plan premiums for 4-star versus 3.5-star contracts presented in the
bottom right of Table 8. In this case, unlike the estimates in Table 4, we find no significant increase
in premiums among 4-star contracts along with a reduction in the magnitude of the estimated effect.
This is perhaps not surprising given the location of higher rated contracts throughout the country,
where 4-star contracts are more concentrated in specific geographic areas relative to lower star-rated
contracts.
Table 8
6.3 Falsification Tests
Finally, it is possible that the observed jumps at threshold values of 2.25, 2.75, etc. are driven by specific contracts that happen to fall above or below the threshold rather than by the star rating system itself.
As a test, we therefore considered a series of counter-factual threshold values above and below the
true threshold values. Intuitively, we should not see any jumps in premiums around these thresholds.
Figure 3 presents the results of this analysis for mean premiums at the contract/county level, where
we estimated the effects just as we did for Figure 1 and Table 3. The results support 2.75 and 3.25 as
the true threshold values, with the largest premium increases occurring just above those thresholds.
The results for 2.25 and 3.75 thresholds are less conclusive, with apparent jumps in premiums for what
should be irrelevant thresholds such as 1.9, 3.65, and 3.85.
Figure 3
7 Mechanisms for Premium Adjustment
Comparing our contract-level (Table 3) and plan-level (Table 4) analysis, we see larger premium
increases at the plan level for 3.5-star contracts and smaller increases at the plan level for 3-star
contracts. These results suggest that increases in average premiums at the contract level do not arise
solely from increases in premiums of the same plans from 2009 to 2010. Rather, the results suggest that
contracts also alter their plan mix from one year to the next (e.g., dropping plans within a contract,
introducing new plans under the same contract, or expanding plans to new counties).
Table 5 summarizes the exit behaviors from 2009 to 2010 by star rating, where we see low quality
plans were significantly more likely to exit their respective markets than plans associated with higher
star ratings. In particular, we see almost all 1.5-star plans leave the market from 2009 to 2010, with
very little exit among 4 and 4.5-star plans.17 Regarding plan entry, Table 5 shows that of the contracts
receiving a 1.5-star rating in 2009 that still operate in 2010, 37% of the underlying plans entered into a
new county in 2010. Similarly, 55% of 2-star plans (in 2009) entered into a new county in 2010, while
higher rated contracts were relatively less likely to enter into new markets. Collectively, the exit and
entry figures reflect larger turnover in plan offerings among lower rated contracts relative to higher
rated contracts. This is perhaps expected as higher rated contracts may be more deliberate in their
market entry/exit decisions and less likely to quickly cycle through new plans from one year to the
next.

17 The 1.5-star contracts that stayed in the market from 2009 to 2010 also had a marginally higher star rating in 2010. As such, there are no 1.5-star contracts remaining in 2010 (see Table 1).
Table 5
7.1 Analysis of Plan Exit
To examine plan exit more directly, we follow Bresnahan & Reiss (1991), Cawley et al. (2005), Abraham
et al. (2007), and others in assuming that an insurance company will only offer a plan in a given county
if the plan positively contributes to the contract’s profit. Assuming profit is additively separable across
geographic markets (counties), our observed plan choice indicator becomes:
$$y_{c(j)m} = \begin{cases} 1 & \text{if } \pi_{c(j)m} = g\left(W_m, X_{c(j)m}\right) + \varepsilon_{c(j)m} \geq 0 \\ 0 & \text{if } \pi_{c(j)m} < 0 \end{cases} \qquad (4)$$
where $W_m$ again denotes county-level demographics, $X_{c(j)m}$ denotes contract and plan characteristics (including the contract’s 2009 quality, $q_c$, plan premium, Part D participation, etc.), and $\varepsilon_{c(j)m}$ is an error term independently distributed across markets and plans.
We adopt a reduced form, linear profit specification with covariates including the benchmark CMS
payment rates, 2009 contract quality (qc ), the plan’s enrollments in 2009, the number of physicians
in the county, the average Medicare FFS cost per beneficiary in the county, and plan characteristics
such as premiums, whether the plan offers prescription drug coverage, and indicators for HMO or PPO
plan type. Within this specification, we also consider the RD design from equation (2). We estimate equation (4) with a linear probability model where $y_{c(j)m} = 1$ indicates that the contract continued to offer the plan in 2010 and $y_{c(j)m} = 0$ indicates the plan was dropped. By definition, this analysis is based on existing plans as of 2009.
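A minimal sketch of this linear probability model follows; all column names (offered_2010, score, premium, part_d, hmo, ppo, benchmark, ffs_cost, enroll09) are our own illustrative stand-ins for the variables described above, not the authors' code:

```python
import statsmodels.formula.api as smf

def exit_lpm(plans09, threshold, h=0.125):
    """Equation (4) as an LPM: offered_2010 = 1 if a 2009 plan survives to 2010."""
    df = plans09.assign(
        r_tilde=plans09["score"] - threshold,
        above=(plans09["score"] > threshold).astype(int),
    )
    local = df[df["r_tilde"].abs() <= h]
    formula = ("offered_2010 ~ above + r_tilde + above:r_tilde + premium"
               " + part_d + hmo + ppo + benchmark + ffs_cost + enroll09")
    return smf.ols(formula, data=local).fit()
```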
The results of our RD analysis of plan exit are summarized in Table 6. The top panel presents
results for all plans, while the remaining panels present results for plans with $0 premiums and plans
with positive premiums, respectively. Overall, we see that 2.5-star contracts are significantly less likely
to exit markets than 2-star contracts of similar overall quality. Relative to 2.5-star contracts, 3-star
contracts show no significant differences in exit behaviors, but they are significantly more likely to
drop their $0 premium plans and less likely to drop positive premium plans. Somewhat surprisingly,
contracts receiving a 3.5-star rating are more likely to drop plans overall; however, from the middle
panel of Table 6, we see that this result is entirely driven by 3.5-star contracts dropping their $0
premium plans. Finally, 4-star contracts are significantly less likely to exit overall, particularly for
their positive premium plans.18
Table 6
7.2 Analysis of Plan Entry
An important and relatively unique aspect of the MA market concerns the distinction between plan
and contract-level decisions. Specifically, contracts must obtain CMS approval in order to be offered
in a given county; however, conditional on receiving CMS approval, the decision of which plan(s) to
offer in a county is relatively less regulated. As a result, we argue that the fixed costs of entry are
primarily incurred at the contract level while the plan-level entry/exit decisions are based on the variable profits per enrollee (i.e., regardless of market share). With regard to plan entry, this unique CMS
approval process alleviates many of the traditional econometric issues surrounding multiple equilibria
or endogeneity of other players’ actions in models of market entry with incomplete information (Berry
& Reiss, 2007; Bajari et al., 2010; Su, 2012). Conditional on plan characteristics, our entry analysis
therefore need only consider variable cost shifters and should be largely independent of the number or
type of competing plans in the county.19
The full set of plans available to a contract in a given market m is identified by taking all plans
offered under that contract across the entire state in the same year. All such plans are therefore
considered “eligible” to be operated in any given county, and the contract must choose which of those
plans to offer in each county, where $y_{c(j)m} = 1$ indicates that the plan was added to the county (under that contract) in 2010, and $y_{c(j)m} = 0$ indicates that the plan was not offered. As with our analysis of plan exit, we estimate the entry-equivalent to equation (4) using a standard linear probability model, with entry considered as a function of 2010 county and plan characteristics as well as 2009 contract quality as in equation (2).
Table 7 summarizes the results of our RD analysis for plan entry. Note that these results only
apply to markets in which the contracts previously operated (i.e., we do not consider the contract-level
entry decisions and instead focus specifically on the plan-level entry of pre-existing contracts). The RD
results indicate that a one-half star improvement for 3 or 3.5-star contracts makes them significantly
more likely to expand their plans into new markets. The bottom panels of Table 7 further reveal that
the increase in probability of plan entry occurs for the positive premium plans, with 3.5-star contracts
significantly less likely to enter new markets with their $0 premium plans.20

18 The robustness of our plan exit results to bandwidth selection is summarized in Appendix D. The overall results (top panel of Table 6) at the 2.75 threshold appear relatively sensitive to bandwidth selection, with the statistical significance, magnitude, and sign of the point estimates changing within bandwidths from 0.1 to 0.2. In terms of hypothesis testing, we interpret this as evidence in favor of the null that the star rating has no effect on plan exit at the 2.75 threshold. As such, the qualitative findings from our point estimates in Table 6 are unchanged.
19 Results are robust when we weaken this assumption and allow predicted 2010 market shares to influence entry behaviors. The results are excluded for brevity but available upon request.
20 The robustness of our plan entry results to bandwidth selection is summarized in Appendix D.
Table 7
8 Welfare Effects
To examine the welfare effects of our estimated premium increases in Section 5, we follow Town &
Liu (2003) and Maruyama (2011) in estimating a standard Berry-type model of plan choice based on
market-level data (Berry, 1994). Specifically, let the utility of individual i from selecting Medicare
option $c(j)$ in market area $m$ be given as

$$U_{ic(j)m} = \delta_{c(j)m} + \xi_{c(j)m} + \zeta_{ig} + (1 - \sigma)\epsilon_{ic(j)m}, \qquad (5)$$
where $\delta_{c(j)m}$ and $\xi_{c(j)m}$ represent the mean level of utility derived from observed and unobserved contract-plan-market area characteristics, respectively. We include in $\delta_{c(j)m}$ observed characteristics at
the contract and plan level, including premiums, plan type (HMO, PPO, or PFFS), and the underlying
summary score of the contract. Similar to Town & Liu (2003), we partition the set of Medicare options
into two groups: 1) MA plans that offer prescription drug coverage (MA-PD plans); and 2) MA plans
that do not offer prescription drug coverage (MA-Only). Traditional Medicare FFS is taken as our
outside option.
In addition to the i.i.d. extreme value error $\epsilon_{ic(j)m}$, individual preferences are allowed to vary through group dummies $\zeta_{ig}$. This nested logit structure relaxes the independence of irrelevant alternatives assumption and allows for differential substitution patterns between nests. The nesting parameter, $\sigma$, captures the within-group correlation of utility levels.
Following Berry (1994) and others, the parameters in equation (5) can be estimated using market-level data on the relative share of MA plans. Specifically, our estimation equation is as follows:

$$\ln(S_{c(j)m}) - \ln(S_{0m}) = x_{c(j)m}\beta - \alpha F_{c(j)} + \sigma \ln(S_{c(j)m|g}) + \xi_{c(j)m}, \qquad (6)$$
where $x_{c(j)m}$ denotes observed plan/contract characteristics, and $\xi_{c(j)m}$ denotes the mean utility derived from unobserved plan characteristics. We estimate the parameters of equation (6) using two-stage least squares (2SLS) due to the endogeneity of within-group shares, $S_{c(j)m|g}$, and plan premiums, $F_{c(j)}$. We take as instruments the number of contracts operating in a county, the number of hospitals in a county, the Herfindahl-Hirschman Index (HHI) for hospitals in a county (based on discharges), and the number of physicians in the county. The results of this regression are presented in Appendix D.
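As a sketch of the estimator (our own numpy illustration of textbook 2SLS, not the authors' code): y stacks ln(S_c(j)m) − ln(S_0m) across plans and markets, X contains a constant, the exogenous characteristics, the premium, and the log within-group share, and Z replaces the two endogenous columns of X with the instruments listed above:

```python
import numpy as np

def tsls(y, X, Z):
    """Two-stage least squares: regress y on the projection of X onto Z."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)     # first-stage fitted values
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)  # second-stage coefficients
```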
With estimates of the mean observed utility, $\hat{\delta}_{c(j)m}$, and the within-group correlation, $\hat{\sigma}$, estimated monthly consumer surplus for a representative MA beneficiary is then derived as follows (Manski & McFadden, 1981; Town & Liu, 2003; Maruyama, 2011):

$$W_i = \frac{1}{\hat{\alpha}}\,(1 - \hat{\sigma}) \ln\left(\sum_{j \in J_m} \exp\left(\frac{\hat{\delta}_{c(j)m} + \hat{\xi}_{c(j)m}}{1 - \hat{\sigma}}\right)\right). \qquad (7)$$
Our results yield an estimated $120 reduction in yearly consumer surplus per beneficiary for every
$10 increase in premiums (all else equal). In 2010, there were approximately 1,080,000 beneficiaries
enrolled in a 3, 3.5, or 4-star MA plan with a summary score just above the relevant threshold value.
Assuming a $20 increase in premiums from 2009 to 2010 (the smallest estimated effect in Tables 3 and
4), this yields a total reduction in consumer surplus of approximately $259 million.
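A short sketch of how equation (7) and the aggregate figure are computed (illustrative code; delta_hat, xi_hat, alpha_hat, and sigma_hat stand for the estimates from equation (6)):

```python
import numpy as np

def monthly_surplus(delta_hat, xi_hat, alpha_hat, sigma_hat):
    """Equation (7): monthly surplus for a representative beneficiary, given
    estimated mean utilities for the plans available in one market."""
    inclusive = np.log(np.sum(np.exp((delta_hat + xi_hat) / (1.0 - sigma_hat))))
    return (1.0 - sigma_hat) * inclusive / alpha_hat

# Back-of-envelope aggregation from the text: a $120 yearly loss per $10
# premium increase, a $20 increase, and ~1.08 million affected beneficiaries.
total_loss = (20 / 10) * 120 * 1_080_000
print(f"${total_loss / 1e6:.0f} million")   # -> $259 million
```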
9 Discussion
The potential supply-side response of MA contracts to the CMS quality rating system is critical both
from a policy perspective as well as a consumer welfare perspective. If contracts can take advantage
of improved quality scores by increasing premiums (holding the contract’s true quality constant), then
this suggests a lack of competitiveness in the MA market with contracts raising prices without any
true improvement in quality. Building on the initial results of Darden & McCarthy (forthcoming),
the current paper finds strong evidence of such premium increases among average to above average
star-rated contracts.
Based on the results in Section 5 and the range of sensitivity analyses in Section 6, we conclude
that the increases in premiums for 3-star versus 2.5-star contracts (the 2.75 threshold) as well as 3.5-star versus 3-star contracts (the 3.25 threshold) are not due to chance but are instead reflective of a
true increase in premiums following an increase in reported quality. Meanwhile, we find no consistent
changes in premiums for 2.5 relative to 2-star contracts. We find some initial evidence for increases
in premiums among 4-star contracts relative to 3.5-star contracts; however, this finding is sensitive to
bandwidth specification, and the effect does not persist in our falsification tests. Plan-level results for
4-star rated contracts are also sensitive to the inclusion of market-level covariates.
There are likely several reasons for a contract to increase 2010 premiums in response to its prior-year quality ratings. One natural reason is pure rent extraction: contracts may seek to capitalize on their high reported quality by charging a higher price to their existing customers. However, contracts may also increase premiums in order to better curb adverse selection. In this case, contracts of higher reported quality but comparable true quality may want to price out certain customers from the market, particularly if sicker beneficiaries are more likely to make decisions based in part on
the quality ratings. With market level data, we cannot empirically identify either of these effects
individually. Nonetheless, our results generally suggest that the perceived benefits of the star rating
program in terms of beneficiary decision-making are at least partially offset by the supply-side response
of higher premiums.
References
Abraham, Jean, Gaynor, Martin, & Vogt, William B. 2007. Entry and Competition in Local Hospital
Markets. The Journal of Industrial Economics, 55(2), 265–288.
Bajari, Patrick, Hong, Han, Krainer, John, & Nekipelov, Denis. 2010. Estimating static models of
strategic interactions. Journal of Business & Economic Statistics, 28(4).
Berry, Steven, & Reiss, Peter. 2007. Empirical models of entry and market structure. In: Armstrong,
M., & Porter, R. (eds), Handbook of industrial organization, vol. 3. Amsterdam: Elsevier.
Berry, Steven T. 1994. Estimating discrete-choice models of product differentiation. The RAND
Journal of Economics, 242–262.
Bresnahan, Timothy F, & Reiss, Peter C. 1991. Entry and competition in concentrated markets.
Journal of Political Economy, 977–1009.
Cawley, John, Chernew, Michael, & McLaughlin, Catherine. 2005. HMO participation in Medicare+
Choice. Journal of Economics & Management Strategy, 14(3), 543–574.
Dafny, L., & Dranove, D. 2008. Do report cards tell consumers anything they don’t already know?
The case of Medicare HMOs. The RAND Journal of Economics, 39(3), 790–821.
Darden, M., & McCarthy, I. forthcoming. The Star Treatment: Estimating the Impact of Star Ratings
on Medicare Advantage Enrollments. Journal of Human Resources.
Frakt, Austin B, Pizer, Steven D, & Feldman, Roger. 2012. The Effects of Market Structure and
Payment Rate on the Entry of Private Health Plans into the Medicare Market. Inquiry, 49(1),
15–36.
Imbens, G.W., & Lemieux, T. 2008. Regression discontinuity designs: A guide to practice. Journal of
Econometrics, 142(2), 615–635.
Lee, David S, & Lemieux, Thomas. 2010. Regression Discontinuity Designs in Economics. Journal of
Economic Literature, 48, 281–355.
Manski, Charles F, & McFadden, Daniel. 1981. Structural analysis of discrete data with econometric
applications. MIT Press, Cambridge, MA.
Maruyama, Shiko. 2011. Socially optimal subsidies for entry: The case of Medicare payments to HMOs.
International Economic Review, 52(1), 105–129.
McCrary, Justin. 2008. Manipulation of the running variable in the regression discontinuity design: A
density test. Journal of Econometrics, 142(2), 698–714.
Pauly, Mark, Harrington, Scott, & Leive, Adam. 2014. ‘Sticker Shock’ in Individual Insurance under
Health Reform. Tech. rept. National Bureau of Economic Research.
Reid, Rachel O, Deb, Partha, Howell, Benjamin L, & Shrank, William H. 2013. Association Between Medicare Advantage Plan Star Ratings and Enrollment. JAMA, 309(3), 267–274.
Stockley, Karen, McGuire, Thomas, Afendulis, Christopher, & Chernew, Michael E. 2014. Premium
Transparency in the Medicare Advantage Market: Implications for Premiums, Benefits, and Efficiency. Tech. rept. National Bureau of Economic Research.
Su, Che-Lin. 2012. Estimating discrete-choice games of incomplete information: Simple static examples. Quantitative Marketing and Economics, 1–41.
Town, Robert, & Liu, Su. 2003. The welfare impact of Medicare HMOs. RAND Journal of Economics,
719–736.
Appendix A: Star Rating Metrics
The star rating system consists of five domains, with the names of each domain, the underlying metrics
in each domain, and the data sources for each metric changing over the years. The metrics and relevant
domains for 2009 are listed in Table 9.
Table 9
Appendix B: Star Rating Calculations
Although the domains and individual metrics changed from year to year, the way in which overall star
ratings were calculated was consistent across years. The calculations follow in five steps, as described
in more detail in the CMS technical notes of the 2009, 2010, and 2011 star rating calculations:
1. Raw summary scores for each individual metric are calculated as per the definition of the metric
in question. As discussed in the text, these scores are derived from a variety of different datasets
including HEDIS, CAHPS, HOS, and others. The resulting summary scores are observed in our
dataset.
2. The summary scores in each metric are translated into a star rating. For most measures, the
star rating is assigned based on percentile rank; however, CMS makes additional adjustments in
cases where the distribution of scores are skewed high or low. Scores derived from CAHPS have
a more complicated star calculation, based on the percentile ranking combined with whether or
not the score is significantly different from the national average. The resulting stars for each
individual metric are observed in our dataset.
3. The star values from each metric are averaged among each respective domain to form domain
level stars, provided a minimum number of metric-level scores are available for each domain.
For example, in 2009 and 2010, a domain-level star was only calculated if the contract had a star
value for at least 6 of the 12 individual measures. The domain-level star ratings are observed in
our dataset.
4. Overall Part C summary scores are then calculated by averaging the domain-level star ratings
and adding an integration factor (i-Factor). The i-Factor is intended to reward consistency in a
plan’s quality across domains, and is calculated as follows:
(a) Derive the mean and variance of all individual metric summary scores for each contract.
(b) Form the distribution of the mean and variance across contracts.
(c) Assign an i-Factor of 0.4 for low variance (below 30th percentile) and high mean (above 85th
percentile), 0.3 for medium variance (30th to 70th percentile) and high mean, 0.2 for low
variance and relatively high mean (65th to 85th percentile), and 0.1 for medium variance
and relatively high mean. All other contracts are assigned an i-Factor of 0.
5. Overall Part C star ratings are then calculated by rounding the overall summary score to the
nearest half-star value.
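A condensed sketch of steps 3 through 5 follows (our own illustration, not CMS code; the percentile cutoffs are passed in as plain numbers, such as the 2009 values reported below, and the sample-variance convention is an assumption):

```python
import math
import numpy as np

def i_factor(metric_stars, lo_var, hi_var, rel_hi_mean, hi_mean):
    """Assign the integration factor from the mean and variance of a
    contract's metric-level stars, per the cutoffs described above."""
    m, v = np.mean(metric_stars), np.var(metric_stars, ddof=1)  # ddof assumed
    if m > hi_mean:                       # high mean (above 85th percentile)
        return 0.4 if v < lo_var else (0.3 if v <= hi_var else 0.0)
    if m > rel_hi_mean:                   # relatively high mean (65th-85th)
        return 0.2 if v < lo_var else (0.1 if v <= hi_var else 0.0)
    return 0.0                            # all other contracts

def overall_star(domain_stars, metric_stars, cutoffs):
    """Steps 4-5: average domain stars, add the i-Factor, round to half stars."""
    summary = np.mean(domain_stars) + i_factor(metric_stars, *cutoffs)
    return math.floor(summary * 2 + 0.5) / 2
```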
We do not observe the i-Factors in the data. We therefore replicated the CMS methodology, ultimately
matching the overall star ratings for 98.8% and 98.5% of the plans in 2009 and 2010, respectively. As
discussed in the text, plans for which we were unable to replicate star ratings were dropped from the
analysis. Note also that star ratings are based on data from at least the previous calendar year and
sometimes further back depending on ease of access from CMS. New plans therefore do not have a star
rating available, nor was a star rating for such plans provided to beneficiaries.
Tables 10 and 11 present example calculations of the overall summary score and resulting star
values for 5 contracts in 2009. The table lists the summary scores for the individual metrics along
with the corresponding star values, each of which are observed in the raw data. The high mean and
low mean thresholds for i-Factor calculations were calculated to be 3.6667 and 3.2381, respectively.
Similarly, the high variance and low variance thresholds were 1.3462 and 1.0362, respectively.
Tables 10 and 11
The calculations for each contract in Table 10 are discussed individually below:
1. Contract H0150: With a mean star value of 2.583 and a variance of 0.879, the contract received
an i-Factor of 0 (due to a low mean), which provided an overall summary score of 2.583 and a
star rating of 2.5.
2. Contract H0151: With a mean star value of 2.667 and a variance of 0.8, the contract received an
i-Factor of 0 (again from a low mean), which provided an overall summary score of 2.667 and a
star rating of 2.5, just 0.083 points away from receiving a 3-star rating.
3. Contract H1558: With a mean star value of 3.967 and a variance of 1.275, the contract received
an i-Factor of 0.3 (high mean and medium variance), which provided an overall summary score
of 4.267, just 0.0167 above the 4.25 threshold required to round up to a 4.5-star rating.
4. Contract H0755: With a mean star value of 3.5278 and a variance of 1.285, the contract received an i-Factor of 0.1 (relatively high mean and medium variance), which provided an overall
summary score of 3.6278 and a star rating of 3.5.
5. Contract H1230: With a mean star value of 3.694 and a variance of 1.018, the contract received
an i-Factor of 0.4 (high mean and low variance), which provided an overall summary score of
4.094 and a star rating of 4.0.
Appendix C: Data
Our analysis merges publicly available data from several sources. As our starting point, we merge
together enrollment and contract information by month/year/contract id/plan id for all Medicare
Advantage (MA) contract/plans from June 2008 through December 2011.21 For a small number of
counties, CMS reports enrollment counts at the Social Security Administration (SSA) level.22 For
these observations, we aggregate enrollment to the county level, and, after limiting our focus to
HMO, PPO, and PFFS type contracts, we have a dataset of 50,269,123 observations at the contract
id/plan/county/month/year level.
The enrollment files alone do not provide a valid census of MA contracts that operate in a given
market (county) because of migration. For example, if contract A is approved to operate in Orange
County, North Carolina, and an enrollee in contract A moves to Miami-Dade County, Florida, the
enrollment files will report positive enrollment in contract A in Miami-Dade County regardless of whether
contract A is approved to operate in Miami-Dade. To overcome this problem, CMS provides separate
service area files that list all contracts that are approved to operate in a given county.23 In addition to
the CMS service files, we merge our enrollment dataset to quality star data at the contract/year level24 ;
CMS contract/plan premium data25 ; Medicare Advantage market share data at the county/contract
id level26 ; and county-level census data from the American Community Survey for 2006-2010 in wide
format.
Given the size of the resulting data, we proceed in cleaning the data for 2009 and 2010 separately. In
what follows, we document our cleaning of the 2009 data, with 2010 values in parentheses. Our 2009 (2010)
data contain 19,290,326 (13,427,779) contract id/plan id/county/month observations. We begin by
dropping the 331,272 (204,355) observations from U.S. Territories and Outlying Areas. Next, we drop
all contract/plans that are specific to an employer or union-only group (these are also known as the
“800-series plans”). While the decision to eliminate these plans reduces our sample by 17,051,609
(11,988,547) observations, these contract/plans are not available to the public and are not our primary
focus. Next, we drop the 231,655 (159,439) observations of special needs plans. Finally, we drop the
observations that did not merge perfectly between the CMS enrollment files and the service area files.
These reflect either contracts with positive enrollment in a month/year/county that were not approved
to operate in that county (due to migration) or contracts that were approved to operate in a county
but had no corresponding enrollment record. Our final sample size for 2009 is 1,422,887 (841,790)
contract id/plan id/county/month observations. We also collect hospital discharge data from the annual Hospital
Cost Reporting Information System (HCRIS) as well as CMS benchmark rates and average FFS costs
by county.27
27 Data
Year.html
are available at
http://www.cms.gov/Research-Statistics-Data-and-Systems/Files-for-Order/CostReports/Cost-Reports-by-Fiscal-
.
24
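To fix ideas, the sample construction described above can be sketched as follows. This is a minimal illustration in pandas; the file and column names are hypothetical, and actual CMS file layouts differ across years.

```python
# Minimal sketch of the sample construction (hypothetical file and column
# names; actual CMS layouts differ by year).
import pandas as pd

enroll = pd.read_csv("cms_enrollment_2009.csv")     # contract/plan/county/month enrollment
service = pd.read_csv("cms_service_area_2009.csv")  # contracts approved to operate, by county

# Drop U.S. Territories and Outlying Areas (state FIPS codes above 56).
enroll = enroll[enroll["state_fips"] <= 56]

# Drop employer/union-only "800-series" plans (plan ids 800-899) and
# special needs plans.
enroll = enroll[~enroll["plan_id"].between(800, 899)]
enroll = enroll[enroll["snp_flag"] == 0]

# Keep only records that match the service area files exactly. This discards
# enrollment generated by migration (positive enrollment in a county where the
# contract is not approved) as well as approved counties with no enrollment.
sample = enroll.merge(service, on=["contract_id", "county_fips"], how="inner")
```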
Appendix D: Additional Analyses
D.1 Robustness Checks
Figure 2 illustrates the sensitivity of the plan-level RD analysis to our bandwidth selection. As expected, the figure closely follows that of the contract-level analysis in Figure 1. Generally, Figure 2 suggests that the point estimates in Table 4 are relatively stable across alternative bandwidths, provided the bandwidths are sufficiently narrow yet still include a sufficient number of contracts.
[Figure 2 about here]
Figures 4 and 5 present similar graphs for the analysis of plan exit and plan entry, respectively.
The figures generally support the robustness of the point estimates in Tables 6 and 7 to our bandwidth
selection. Our analysis of plan exit and entry at the 2.75 threshold (2.5 versus 3-star contracts) is one
possible exception, with the statistical significance, magnitude, and sign of the point estimates changing
across bandwidths from 0.1 to 0.2. In terms of hypothesis testing, we interpret this as evidence in
favor of the null that the star rating has no effect on plan exit or entry at the 2.75 threshold. As
such, the qualitative findings from our point estimates in Table 6 are unchanged, while the overall
findings from our analysis of plan entry (top panel of Table 7) are less definitive among 3.0 relative
to 2.5-star rated contracts.
[Figures 4 and 5 about here]
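The bandwidth sensitivity exercise summarized in these figures can be sketched as follows: for each threshold, we re-estimate the RD specification on successively wider windows around the threshold and trace out the coefficient on the above-threshold indicator. The snippet below is a stylized version with hypothetical variable names; the actual regressions include the full control set listed in the table notes.

```python
# Stylized bandwidth sensitivity loop for the RD analysis (statsmodels;
# variable names are hypothetical). For each bandwidth h, estimate a local
# linear regression within h of the threshold and store the coefficient on
# the above-threshold indicator.
import numpy as np
import statsmodels.formula.api as smf

def rd_by_bandwidth(df, threshold, bandwidths):
    estimates = []
    for h in bandwidths:
        window = df[np.abs(df["summary_2009"] - threshold) <= h].copy()
        window["above"] = (window["summary_2009"] >= threshold).astype(int)
        window["dist"] = window["summary_2009"] - threshold
        fit = smf.ols("premium_2010 ~ above + dist + above:dist",
                      data=window).fit(cov_type="cluster",
                                       cov_kwds={"groups": window["county_fips"]})
        estimates.append((h, fit.params["above"], fit.bse["above"]))
    return estimates

# e.g., rd_by_bandwidth(plans, 2.75, np.arange(0.10, 0.51, 0.05))
```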
D.2 Welfare Analysis
The results of estimating equation (6) by OLS and 2SLS are presented in Table 12, along with the first-stage results for the 2SLS estimator.
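As a rough sketch of how such a specification can be estimated, the snippet below uses the linearmodels package, instrumenting premium and within-group share with the county-level market-structure variables reported in the first-stage panel of Table 12. Variable names are hypothetical and the control set is abbreviated.

```python
# Sketch of the OLS/2SLS estimation behind Table 12 (linearmodels;
# hypothetical variable names, abbreviated controls).
from linearmodels.iv import IV2SLS
import pandas as pd

df = pd.read_csv("welfare_sample.csv")
model = IV2SLS.from_formula(
    "mean_utility ~ 1 + hmo + ppo + part_d + part_d_cost + summary_score"
    " + [premium + within_share ~ contract_count + hospital_hhi"
    " + hospital_count + total_physicians]",
    data=df,
)
fit = model.fit(cov_type="clustered", clusters=df["contract_id"])
print(fit)              # second-stage estimates
print(fit.first_stage)  # first-stage diagnostics, as in Table 12
```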
Appendix E: Tables and Figures
Table 1: Summary Statistics

                                               2009               2010
                                               Mean (S.D.)        Mean (S.D.)
Plan-level Data, n=51,442 and 34,642
  Enrollment^a                                 291.55 (1,413)     361.17 (1,600)
  Overall Share, %                             1.18               1.26
  Within-nest Share, %                         28.87              31.07
  Premium                                      37.69 (42.23)      53.27 (52.97)
  Drug Coverage, %                             58.58              64.39
  HMO, %                                       16.32              24.12
  PPO, %                                       18.53              33.71
Market Characteristics, n=3,139 and 3,094
  MA Penetration                               15.59 (11.03)      16.50 (12.12)
  Mean Number of Plans                         37.38 (22.31)      26.61 (17.58)
  Population > 65 in 1,000s                    12.22 (34.90)      12.59 (35.74)
  Population > 85 in 1,000s                    1.72 (5.11)        1.79 (5.34)
  Unemployed, %                                5.79               9.01
  White, %                                     86.30              86.41
  Black, %                                     9.11               9.18
  Female, %                                    50.16              50.17
  College Graduates, %                         18.68              18.62
  South, %                                     42.02              42.63
Contract-level Star Ratings, %, n=252 and 295
  1.5                                          1.98               0.00
  2.0                                          9.92               4.07
  2.5                                          24.21              24.41
  3.0                                          28.97              29.83
  3.5                                          21.43              20.67
  4.0                                          11.11              12.20
  4.5                                          2.38               7.78
  5.0                                          0.00               1.02

a Enrollment data available for 20,768 plans in 2009 and 17,334 plans in 2010. Remaining plans have 10 or fewer enrollees, and specific enrollment counts are therefore not provided by CMS.
Table 2: OLS Results for Average Characteristics^a

Star Indicator            2.5        3.0        3.5        4.0
y = Average Premium
  γ̂2                    -5.18***    6.74***    6.15***   -8.84***
                         (1.55)     (1.39)     (1.54)     (2.52)
  N                       4,303      4,182      2,672      1,213
  R²                      0.52       0.66       0.71       0.75
y = Proportion of $0 Premium Plans
  γ̂2                     0.17***   -0.13***    0.03***   -0.03**
                         (0.02)     (0.01)     (0.01)     (0.01)
  N                       4,303      4,182      2,672      1,213
  R²                      0.36       0.70       0.75       0.63

a OLS regression of the 2010 mean characteristics on the relevant 2009 mean characteristic and star ratings. Regressions estimated separately for each star rating, with γ̂2 denoting the estimated effect of a one-half star increase in quality ratings. Contract-level averages are based on all plans with more than 10 enrollments. Standard errors in parentheses are robust to clustering at the county level. Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, and the number of MA plans and contracts in the county, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as contract-level variables including the number of counties in which the contract operated in 2009, whether the contract operates as an HMO or PPO, and the total number of enrollees under the contract in 2009. * p<0.1. ** p<0.05. *** p<0.01.
Table 3: RD Results for Average Characteristics^a

Star Threshold            2.25       2.75       3.25       3.75
y = Average Premium
  γ̂2                     4.81       33.60***   29.30***   31.85***
                         (4.27)     (7.27)     (6.12)     (6.38)
  N                       2,029      982        432        309
  R²                      0.39       0.72       0.69       0.92
y = Proportion of $0 Premium Plans
  γ̂2                    -0.14*     -0.16**     0.02      -0.13*
                         (0.08)     (0.06)     (0.04)     (0.07)
  N                       2,029      982        432        309
  R²                      0.21       0.90       0.72       0.55

a Results based on OLS regressions with RD approach and a bandwidth of h = 0.125. Robust standard errors in parentheses, clustered at the county level. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. Regressions estimated at the contract level, with dependent variables measured as the average value of each plan characteristic by contract (excluding plans with 10 or fewer enrollments). Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, and the number of MA plans and contracts in the county, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as contract-level variables including the number of counties in which the contract operated in 2009, whether the contract operates as an HMO or PPO, and the total number of enrollees under the contract in 2009. * p<0.1. ** p<0.05. *** p<0.01.
Table 4: RD Results for Plan-level Characteristics^a

Star Threshold            2.25       2.75       3.25       3.75
y = 2010 premium
  γ̂2                     5.00**     19.40***   41.99***   31.52***
                         (2.10)     (3.93)     (5.17)     (5.10)
  N                       4,912      6,894      1,024      1,082
  R²                      0.63       0.76       0.83       0.94
y = Indicator for $0 premium plan in 2010
  γ̂2                     0.04      -0.32***    0.02      -0.15***
                         (0.04)     (0.05)     (0.03)     (0.05)
  N                       4,912      6,894      1,024      1,082
  R²                      0.24       0.89       0.51       0.59

a Results based on OLS regressions with RD approach and a bandwidth of h = 0.125. Robust standard errors in parentheses, clustered at the county level. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. Regressions estimated at the plan level for all plans in the dataset. Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, and the number of MA plans and contracts in the county, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as the plan's total number of enrollees in 2009 (set to 0 if missing), an indicator variable for missing number of enrollees (<10 enrollees in the plan), an indicator for HMO or PPO plan type, and the lagged dependent variable. * p<0.1. ** p<0.05. *** p<0.01.
Table 5: Summary of Plan Exit and Entry^a

2009 Rating    Exit (%)    Entry (%)
1.5 Star        99.49       36.51
2.0 Star        51.40       55.16
2.5 Star        53.58       52.79
3.0 Star        29.37       23.91
3.5 Star        25.97       17.20
4.0 Star         8.25       32.45
4.5 Star         8.24        7.72
All             49.77       38.20

a Exit defined as the same plan-county-contract observation in 2009 no longer active in 2010.
Table 6: RD Results for Plan Exit^a

Star Threshold            2.25       2.75       3.25       3.75
Overall Results
  γ̂2                    -0.83***   -0.07       0.12**    -0.25***
                         (0.06)     (0.09)     (0.06)     (0.06)
  N                       10,791     9,806      1,177      1,435
Among Plans with Premiums = $0
  γ̂2                    -0.84***    0.25**     1.07***   -0.07
                         (0.06)     (0.11)     (0.30)     (0.05)
  N                       9,110      613        140        281
Among Plans with Premiums > $0
  γ̂2                    -1.37***   -0.82***    0.04      -0.36***
                         (0.13)     (0.12)     (0.05)     (0.07)
  N                       1,681      9,193      1,037      1,154

a Results based on linear probability model with RD approach and a bandwidth of h = 0.125. Robust standard errors in parentheses, clustered at the county level. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, and the number of MA plans and contracts in the county, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as 2009 plan characteristics and enrollment. * p<0.1. ** p<0.05. *** p<0.01.
Table 7: RD Results for Plan Entry^a

Star Threshold            2.25       2.75       3.25       3.75
Overall Results
  γ̂2                     0.06      -0.23***    0.18***    0.30***
                         (0.12)     (0.07)     (0.06)     (0.06)
  N                       6,352      2,453      1,252      852
Among Plans with Premiums = $0
  γ̂2                    -0.76***   -0.02      -1.80**     0.65***
                         (0.08)     (0.09)     (0.75)     (0.12)
  N                       3,360      793        171        331
Among Plans with Premiums > $0
  γ̂2                     2.34***   -1.28***    0.22***    0.20***
                         (0.16)     (0.19)     (0.07)     (0.06)
  N                       2,992      1,660      1,081      521

a Results based on linear probability model with RD approach and a bandwidth of h = 0.125. Robust standard errors in parentheses, clustered at the county level. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as plan characteristics (premium, Part D participation, and HMO versus PPO). * p<0.1. ** p<0.05. *** p<0.01.
Table 8: RD Results for Premiums without Covariates^a

Star Threshold            2.25       2.75       3.25       3.75
y = Mean Contract Premiums
  γ̂2                     12.82      16.25***   28.58***   26.97***
                         (3.26)     (4.53)     (5.09)     (12.66)
  N                       2,029      982        432        309
y = Individual Plan Premiums
  γ̂2                    -4.34***    10.88***   31.27***   8.36
                         (1.59)     (2.31)     (3.42)     (7.23)
  N                       4,912      6,894      1,024      1,082

a Results based on RD with triangular kernel and a bandwidth of h = 0.125. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. * p<0.1. ** p<0.05. *** p<0.01.
Table 9: Domains, Metrics, and Data Sources for 2009 MA Star Rating Program^a

Staying Healthy:
  Breast Cancer Screening (HEDIS); Colorectal Cancer Screening (HEDIS); Cardiovascular Care - Cholesterol Screening (HEDIS); Diabetes Care - Cholesterol Screening (HEDIS); Glaucoma Testing (HEDIS); Appropriate Monitoring of Patients Taking Long-Term Medications (HEDIS); Annual Flu Vaccine (CAHPS); Pneumonia Vaccine (CAHPS); Improving or Maintaining Physical Health (HOS); Improving or Maintaining Mental Health (HOS); Osteoporosis Testing (HOS); Monitoring Physical Activity (HOS)

Getting Timely Care from Doctors:
  Access to Primary Care Doctor Visits (HEDIS); Follow-up Visit within 30 Days of Discharge after Hospital Stay for Mental Illness (HEDIS); Doctor Follow-up for Depression (HEDIS); Getting Needed Care without Delays (CAHPS)

Plan Responsiveness and Care:
  Getting Appointments and Care Quickly (CAHPS); Overall Rating of Health Care Quality (CAHPS); Overall Rating of Health Plan (CAHPS); Call Answer Timeliness (HEDIS); Doctors Who Communicate Well (CAHPS); Customer Service (CAHPS)

Managing Chronic Conditions:
  Osteoporosis Management (HEDIS); Diabetes Care - Eye Exam (HEDIS); Diabetes Care - Kidney Disease Monitoring (HEDIS); Diabetes Care - Blood Sugar Controlled (HEDIS); Diabetes Care - Cholesterol Controlled (HEDIS); Antidepressant Medication Management (HEDIS); Controlling Blood Pressure (HEDIS); Rheumatoid Arthritis Management (HEDIS); Testing to Confirm COPD (HEDIS); Continuous Beta Blocker Treatment (HEDIS); Improving Bladder Control (HOS); Reducing the Risk of Falling (HOS)

Handling of Appeals:
  Plan Makes Timely Decisions about Appeals (IRE); Reviewing Appeals Decisions (IRE)

a Description of domains and additional details available at www.cms.gov. Data source for CMS calculations provided in parentheses.
Table 10: Star Rating Calculation Examples

Panel A: Stars
Measure                                              H0150  H0151  H1558  H0755  H1230
Breast Cancer Screening                                2      2      5      4      5
Colorectal Cancer Screening                            2      3      4      5      4
Cardiovascular Care - Cholesterol Screening            3      3      4      4      5
Diabetes Care - Cholesterol Screening                  3      2      4      4      4
Glaucoma Testing                                       3      3      5      5      4
Appropriate Monitoring for Long-term Medications       4      3      5      4      2
Annual Flu Vaccine                                     3      2      5      4      5
Pneumonia Vaccine                                      3      2      5      3      4
Improving or Maintaining Physical Health               3      3      3      3      3
Improving or Maintaining Mental Health                 3      3      3      3      3
Osteoporosis Testing                                   1      2      3      3      3
Monitoring Physical Activity                           3      3      3      3      3
Access to Primary Care Doctor Visits                   4      4      5      5      4
Follow-up after Hospital Visit for Mental Illness      3      2      --     4      5
Doctor Follow-up for Depression                        1      1      --     1      2
Getting Needed Care without Delays                     3      5      --     3      3
Getting Appointments and Care Quickly                  1      2      --     4      3
Overall Rating of Health Care Quality                  3      3      --     4      3
Overall Rating of Health Plan                          4      4      --     3      4
Call Answer Timeliness                                 4      2      2      3      5
Doctors Who Communicate Well                           3      4      3      4      4
Customer Service                                       3      3      1      3      3
Osteoporosis Management                                1      1      3      1      2
Diabetes Care - Eye Exam                               3      3      2      5      5
Diabetes Care - Kidney Disease Monitoring              3      3      4      4      5
Diabetes Care - Blood Sugar Controlled                 2      2      5      5      4
Diabetes Care - Cholesterol Controlled                 2      2      3      5      4
Antidepressant Medication Management                   2      2      4      2      5
Controlling Blood Pressure                             1      2      4      5      4
Rheumatoid Arthritis Management                        2      3      5      3      3
Testing to Confirm COPD                                2      2      5      2      2
Continuous Beta Blocker Treatment                      3      2      5      2      4
Improving Bladder Control                              2      2      5      2      2
Reducing the Risk of Falling                           3      3      4      5      4
Plan Makes Timely Decisions about Appeals              4      4      5      4      5
Reviewing Appeals Decisions                            1      4      5      3      3

Panel B: Raw Scores
Measure                                              H0150  H0151  H1558  H0755  H1230
Breast Cancer Screening                                59     57     87     75     87
Colorectal Cancer Screening                            35     45     62     71     59
Cardiovascular Care - Cholesterol Screening            79     81     93     90     96
Diabetes Care - Cholesterol Screening                  77     74     88     92     94
Glaucoma Testing                                       60     60     84     76     73
Appropriate Monitoring for Long-term Medications       90     88     93     90     82
Annual Flu Vaccine                                     67     66     87     77     84
Pneumonia Vaccine                                      67     63     80     68     77
Improving or Maintaining Physical Health               60     54     59     60     55
Improving or Maintaining Mental Health                 81     78     81     82     80
Osteoporosis Testing                                   56     58     68     68     71
Monitoring Physical Activity                           46     41     44     48     51
Access to Primary Care Doctor Visits                   94     92     99     97     89
Follow-up after Hospital Visit for Mental Illness      46     41     --     72     77
Doctor Follow-up for Depression                         5      3     --     20     22
Getting Needed Care without Delays                     83     88     --     86     83
Getting Appointments and Care Quickly                  68     72     --     77     75
Overall Rating of Health Care Quality                  84     85     --     86     85
Overall Rating of Health Plan                          86     87     --     86     87
Call Answer Timeliness                                 83     72     39     81     96
Doctors Who Communicate Well                           90     91     55     91     91
Customer Service                                       88     87     43     88     86
Osteoporosis Management                                17     16     79     19     28
Diabetes Care - Eye Exam                               53     55     33     79     91
Diabetes Care - Kidney Disease Monitoring              76     77     63     85     97
Diabetes Care - Blood Sugar Controlled                 53     55     82     87     83
Diabetes Care - Cholesterol Controlled                 33     30     77     63     59
Antidepressant Medication Management                   44     40     82     43     63
Controlling Blood Pressure                             33     51     58     68     62
Rheumatoid Arthritis Management                        68     71     90     73     75
Testing to Confirm COPD                                24     21     83     32     30
Continuous Beta Blocker Treatment                      79     69     90     73     85
Improving Bladder Control                              37     34     92     38     37
Reducing the Risk of Falling                           55     55     84     63     61
Plan Makes Timely Decisions about Appeals              86     88     93     91    100
Reviewing Appeals Decisions                            66     86     92     77     77

Note: Dashes indicate measures without a reported rating for contract H1558; its summary score in Table 11 is accordingly computed over the 30 rated measures.
Table 11: Star Rating Calculation Examples, Cont.

                          H0150    H0151    H1558    H0755    H1230
Mean Summary Score        2.5833   2.6667   3.9667   3.5278   3.6944
Variance Summary Score    0.8786   0.80     1.2747   1.2849   1.0183
i-Factor                  0        0        0.3      0.1      0.4
Summary Score             2.5833   2.6667   4.2667   3.6278   4.0944
Star Rating               2.5      2.5      4.5      3.5      4
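The final two rows of Table 11 follow mechanically from the first and third: the summary score adds the i-factor to the mean of the rated measure stars, and the star rating rounds the summary score to the nearest half star. The sketch below reproduces this step; CMS's rule for assigning the i-factor itself is not derived here, so the i-factor is taken as given from Table 11.

```python
# Minimal sketch of the last steps of the star rating calculation in
# Table 11 (i-factor taken as given; CMS's i-factor assignment rule is
# not reproduced here).
import numpy as np

def star_rating(metric_stars, i_factor):
    """Summary score = mean of rated measure stars + i-factor; the star
    rating rounds the summary score to the nearest half star."""
    summary = np.nanmean(metric_stars) + i_factor
    return summary, np.round(summary * 2) / 2

# Contract H1558 in Tables 10 and 11: 30 rated measures summing to 119
# stars, i-factor 0.3, so summary = 119/30 + 0.3 = 4.2667 -> 4.5 stars.
```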
Table 12: Welfare Analysis^a

                          OLS        2SLS
Premium                  -0.00**    -0.04***
                         (0.00)     (0.01)
Within-group Share        0.71***    0.74***
                         (0.03)     (0.10)
HMO                      -0.03      -1.26**
                         (0.09)     (0.61)
PPO                      -0.21      -0.55
                         (0.13)     (0.38)
Part D                    1.19***    2.22***
                         (0.11)     (0.48)
Part D Cost              -0.00      -0.00
                         (0.00)     (0.00)
Summary Score             0.43***    1.96***
                         (0.10)     (0.63)
N                         20,738     18,300

First-stage Statistics
                          Premium    Within-group Share
Contract Count            0.07      -0.00
                         (0.30)     (0.01)
Hospital Inpatient HHI   -0.27       1.45***
                         (1.31)     (0.04)
Hospital Count           -0.43***   -0.00
                         (0.10)     (0.00)
Total Physicians          0.00      -0.00***
                         (0.00)     (0.00)
F-stat                    9.80       647.94

a Robust standard errors in parentheses, clustered at the contract level. In the 2SLS estimation, premium and within-group share were instrumented using the number of contracts operating in a county, the number of hospitals in a county, the Herfindahl-Hirschman Index (HHI) for hospitals in a county (based on discharges), and the number of physicians in the county. * p<0.1. ** p<0.05. *** p<0.01.
Figure 1: Effect of Star Rating on Mean Contract Premium for Varying Bandwidths Around Thresholds 2.25, 2.75, 3.25 and 3.75
[Four panels: a. 2.25; b. 2.75; c. 3.25; d. 3.75. Each panel plots the star rating coefficient, γ̂2, against bandwidths ranging from 0.1 to 0.25.]
Figure 2: Effect of Star Rating on Plan Premiums for Varying Bandwidths Around Thresholds 2.25, 2.75, 3.25 and 3.75
[Four panels: a. 2.25; b. 2.75; c. 3.25; d. 3.75. Each panel plots the star rating coefficient, γ̂2, against bandwidths ranging from 0.1 to 0.5.]
Figure 3: Falsification Test: Effect of Star Rating on Mean Contract Premium around Counterfactual Thresholds
[Four panels: a. around the true 2.25 threshold; b. around the true 2.75 threshold; c. around the true 3.25 threshold; d. around the true 3.75 threshold. Each panel plots the star rating coefficient, γ̂2, against counterfactual thresholds near the true threshold.]
Figure 4: Effect of Star Rating on Plan Exit for Varying Bandwidths Around Thresholds 2.25, 2.75, 3.25 and 3.75
[Four panels: a. 2.25; b. 2.75; c. 3.25; d. 3.75. Each panel plots the star rating coefficient, γ̂2, against bandwidths ranging from 0.1 to 0.5.]
Figure 5: Effect of Star Rating on Plan Entry for Varying Bandwidths Around Thresholds 2.25, 2.75, 3.25 and 3.75
[Four panels: a. 2.25; b. 2.75; c. 3.25; d. 3.75. Each panel plots the star rating coefficient, γ̂2, against bandwidths ranging from 0.1 to 0.5.]