Using Resampling Techniques to Measure the Effectiveness of Providers in Workers’

Using Resampling Techniques to Measure the Effectiveness of Providers in Workers’ Compensation Insurance
David Speights
Senior Research Statistician
HNC Insurance Solutions
Irvine, California

Presentation Outline
• Introduction to the problem
• Introduction to Bootstrap Resampling
• Two resampling approaches for comparing two groups
• Examples
• Conclusions

Introduction to the Problem
• Compare two groups from observational data
  – Outcome (Y), e.g. claim cost
  – Characteristics (X) have distributions F1 and F2
• Difficulties
  – F1 ≠ F2
  – X is associated with Y (i.e. X is a confounder)
  – Example: claim severity is associated with claim cost

Introduction to the Problem
• Ideal solution
  – Randomize subjects into the two groups
  – Ideal solution not usually possible
• Alternate solution (topic of the paper)
  – Identify characteristics where F1 ≠ F2
  – Adjust the distribution of Y to account for the differing distributions of X

Introduction to Bootstrap Resampling
• Purpose
  – Obtain the distribution of a parameter estimate (i.e. the sampling distribution)
  – Avoid relying on assumptions about the underlying distribution
  – Often used when the parameter estimate
    • has a distribution that is difficult to obtain
    • relies heavily on unrealistic assumptions

Introduction to Bootstrap Resampling
• Given data
  – {X1, X2, …, Xn} where Xi is a p x 1 vector
  – X has unspecified distribution F
• Parameter of interest $\Theta$
  – $\Theta = T(F)$
• We want the distribution of $\hat{\Theta} = T(\hat{F})$

Introduction to Bootstrap Resampling
• Distribution of $\hat{\Theta}$
  – Usually obtained through theoretical properties, assuming repeated sampling from a population with a known distribution of X
  – Bootstrap techniques resample from the data to simulate repeated sampling from a population with an unknown distribution of X

Introduction to Bootstrap Resampling
Example -- Population Mean
• Example -- Population Mean
  $$\Theta = T(F) = \int x \, dF(x)$$
• Resample with replacement from the data
  – Data is $(X_1, \ldots, X_n)$
  – Each data point is equally likely to be selected
  – Resampled data is $(X^{(b)}_1, \ldots, X^{(b)}_n)$
  – $\hat{\Theta}^{(b)}$ is the bth bootstrap estimate of $\mu$:
  $$\hat{\Theta}^{(b)} = \int x \, d\hat{F}^{(b)}(x) = \frac{1}{n} \sum_{i=1}^{n} X^{(b)}_i = \bar{X}^{(b)}$$

Introduction to Bootstrap Resampling
Example -- Population Mean
• B bootstrap samples are drawn
• The distribution of $\hat{\Theta}$ is estimated with the empirical distribution function of $(\hat{\Theta}^{(1)}, \ldots, \hat{\Theta}^{(B)})$
• The mean and variance of this distribution are used to estimate the mean and variance of $\hat{\Theta}$

Two Resampling Methods for Comparing Two Groups
• Method 1: Normalized comparisons
  – Y is a response of interest
  – X is a categorical variable, a confounder
  – Z=1 for group 1, Z=2 for group 2
  – F(Y|Z=1), normalized for the distribution of X in group 2:
  $$F^{(2)}(Y \mid Z=1) = \sum_{\text{all } x_j} F(Y \mid Z=1, X=x_j) \, P(X=x_j \mid Z=2)$$
  – F(Y|Z=2), non-normalized:
  $$F(Y \mid Z=2) = \sum_{\text{all } x_j} F(Y \mid Z=2, X=x_j) \, P(X=x_j \mid Z=2)$$

Two Resampling Methods for Comparing Two Groups
• Method 1: Normalized comparisons
  – Resample from (Yi, Xi) separately for groups 1 and 2
  – Construct estimates of F(Y|X=xj) and P(X=xj) for the two groups
  – Construct estimates of the normalized distribution functions on the previous slide
  – Parameter estimates can be obtained from these

Two Resampling Methods for Comparing Two Groups
• Method 2: Bootstrapping linear regression
  – Y is a response of interest
  – X is a vector of variables, confounders
  – Z=1 for group 1, Z=2 for group 2
  – Use the regression model
  $$Y = \alpha + \beta \, I(Z=2) + X'\gamma + \varepsilon$$

Two Resampling Methods for Comparing Two Groups
• Method 2: Bootstrapping linear regression
  – Estimate $(\alpha, \beta, \gamma)$ with $(\hat{\alpha}, \hat{\beta}, \hat{\gamma})$, the least squares estimates on the original data
  – Resample with replacement from the residuals
  – Construct the bth bootstrap value of Y as
  $$Y_i^{(b)} = \hat{\alpha} + \hat{\beta} \, I(Z_i=2) + X_i'\hat{\gamma} + \varepsilon_i^{(b)}$$
  – The bth bootstrap sample is $(Y_i^{(b)}, Z_i, X_i)$, $i = 1, \ldots, n$

Two Resampling Methods for Comparing Two Groups
• Method 2: Bootstrapping linear regression
  – Construct estimates of $(\alpha, \beta, \gamma)$ with $(\hat{\alpha}, \hat{\beta}, \hat{\gamma})^{(b)}$, the least squares estimates on the bootstrap sample
  – Using the B bootstrap estimates of $(\alpha, \beta, \gamma)$, construct the distribution of the parameters of interest

Examples Using Data from a Nationwide Data Base of Workers’ Compensation Claims
• Normalized comparisons of percentiles
  – Y = Total claim cost
  – Group 1: Providers in network A
  – Group 2: Providers not in network A
  – X is a 10-level variable representing claim severity, derived through the ICD9 code on a claim
  – B = 500 bootstrap samples drawn
  – Median, 75th, and 95th percentiles compared
  – Normalization relative to group 1

Examples Using Data from a Nationwide Data Base of Workers’ Compensation Claims
• Normalized comparisons of percentiles

Examples Using Data from a Nationwide Data Base of Workers’ Compensation Claims
• Bootstrapping linear regression
  – Y = log(Total Indemnity Costs)
  – X consists of several variables
    • NCCI body part designation, nature of injury designation, accident cause, industry class code, and injury type
    • 10-level claim severity measure derived with the ICD9 code
    • Age and gender
  – Group 1: Specific provider of interest (Provider Z)
  – Group 2: All other providers
  – B = 500 bootstrap samples

Examples Using Data from a Nationwide Data Base of Workers’ Compensation Claims

Conclusions
• Bootstrap methodology is a flexible, robust method for deriving sampling distributions
• Can be used to compare two groups while accounting for possible confounding variables
• Useful method for observational studies
• Only a few examples are shown in this paper/presentation; the approach has much more potential
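
The population-mean example from the bootstrap introduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bootstrap_mean`, the seed, and the claim-cost values are hypothetical and do not come from the presentation. It draws B resamples of size n with replacement, computes the mean of each, and summarizes the resulting empirical distribution.

```python
import numpy as np

def bootstrap_mean(x, n_boot=500, seed=0):
    """Return n_boot bootstrap estimates of the mean of x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    # Each row of idx is one resample of size n, drawn with replacement,
    # with every observation equally likely to be selected.
    idx = rng.integers(0, n, size=(n_boot, n))
    return x[idx].mean(axis=1)

# Hypothetical claim costs, for illustration only.
costs = np.array([1200.0, 450.0, 980.0, 15300.0, 720.0, 2100.0, 640.0, 3300.0])

boot = bootstrap_mean(costs, n_boot=500)
mean_est = boot.mean()                  # estimate of the mean of the estimator
var_est = boot.var(ddof=1)              # estimate of its variance
ci = np.percentile(boot, [2.5, 97.5])   # a simple percentile interval
```

The percentile interval is one common way to summarize the bootstrap distribution; the slides themselves only mention its mean and variance.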
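
Method 1 (normalized comparisons) might be sketched as follows. The sketch assumes that the normalized distribution function can be estimated by reweighting each group-1 observation in severity level x_j by P(X = x_j | Z = 2) divided by the number of group-1 observations in that level, which gives a weighted empirical distribution function. The names `normalized_quantile` and `bootstrap_normalized_diff`, the renormalization of weights when a level is missing from a resample, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def normalized_quantile(y, x, x_ref, q):
    """Quantile q of y, reweighted so that x follows the distribution of x_ref."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x)
    x_ref = np.asarray(x_ref)
    levels, counts = np.unique(x, return_counts=True)
    # P(X = x_j | reference group), estimated by relative frequency.
    p_ref = np.array([(x_ref == lv).mean() for lv in levels])
    # Every observation in level j carries weight P_ref(x_j) / n_j.
    w = (p_ref / counts)[np.searchsorted(levels, x)]
    # Assumption: renormalize in case a reference level is absent from x.
    w = w / w.sum()
    order = np.argsort(y)
    cdf = np.cumsum(w[order])
    # Smallest y whose weighted empirical CDF reaches q.
    k = min(np.searchsorted(cdf, q), y.size - 1)
    return y[order][k]

def bootstrap_normalized_diff(y1, x1, y2, x2, q=0.5, n_boot=500, seed=0):
    """Bootstrap the difference in quantile q, group 1 normalized to group 2."""
    rng = np.random.default_rng(seed)
    y1, x1, y2, x2 = map(np.asarray, (y1, x1, y2, x2))
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample (Y, X) pairs separately within each group.
        i1 = rng.integers(0, y1.size, y1.size)
        i2 = rng.integers(0, y2.size, y2.size)
        q1 = normalized_quantile(y1[i1], x1[i1], x2[i2], q)  # normalized
        q2 = normalized_quantile(y2[i2], x2[i2], x2[i2], q)  # non-normalized
        diffs[b] = q1 - q2
    return diffs

# Synthetic data: group 1 sees more severe claims, cost depends on severity only.
rng = np.random.default_rng(1)
x1 = rng.choice(3, 300, p=[0.2, 0.3, 0.5])
x2 = rng.choice(3, 400, p=[0.5, 0.3, 0.2])
y1 = np.exp(6.0 + x1 + rng.normal(0.0, 0.5, 300))
y2 = np.exp(6.0 + x2 + rng.normal(0.0, 0.5, 400))

diffs = bootstrap_normalized_diff(y1, x1, y2, x2, q=0.5, n_boot=200)
```

Passing a group's own X values as the reference gives every observation weight 1/n, so the non-normalized quantile reduces to the ordinary empirical quantile.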
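
Method 2 (bootstrapping linear regression) can be sketched in the same spirit. The sketch fits the regression of Y on an intercept, the indicator I(Z = 2), and the covariates X by least squares, resamples the residuals with replacement, rebuilds Y from the fitted values plus resampled residuals, and refits. The function name `residual_bootstrap`, the simulated data, and the true group effect of -0.3 are hypothetical choices for illustration, not values from the presentation.

```python
import numpy as np

def residual_bootstrap(y, z, X, n_boot=500, seed=0):
    """Return least squares coefficients and their residual-bootstrap replicates."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    # Design matrix columns: intercept, I(Z = 2), then the covariates X.
    D = np.column_stack([np.ones(len(y)), (np.asarray(z) == 2).astype(float), X])
    coef = np.linalg.lstsq(D, y, rcond=None)[0]
    fitted = D @ coef
    resid = y - fitted
    boot = np.empty((n_boot, D.shape[1]))
    for b in range(n_boot):
        # Resample residuals with replacement and rebuild the response.
        e_b = rng.choice(resid, size=resid.size, replace=True)
        y_b = fitted + e_b
        # Refit on the bootstrap sample (y_b, Z, X); Z and X stay fixed.
        boot[b] = np.linalg.lstsq(D, y_b, rcond=None)[0]
    return coef, boot

# Simulated data, for illustration only.
rng = np.random.default_rng(2)
n = 500
z = rng.integers(1, 3, n)                  # group label: 1 or 2
X = rng.normal(size=(n, 2))                # two hypothetical confounders
y = 7.0 - 0.3 * (z == 2) + X @ np.array([0.8, -0.4]) + rng.normal(0.0, 0.5, n)

coef, boot = residual_bootstrap(y, z, X, n_boot=200)
group_effect = coef[1]                     # coefficient on I(Z = 2)
group_effect_se = boot[:, 1].std(ddof=1)   # bootstrap standard error
```

Because only the residuals are resampled, this scheme treats the design (Z and X) as fixed and assumes the residuals are exchangeable; resampling whole cases would be a different technique.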
