Model X-Ray Image Data into ADaM BDS Structure
Vincent Guo
NJ CDISC Users Group meeting, Sep 17, 2014

Introduction

X-ray image data is important and special efficacy data:
• It demonstrates long-term efficacy in preserving joint/bone structure
• A scoring system was developed to quantify the assessment
• It is complex

This presentation will cover:
• SDTM data for X-ray images
• Analysis requirements
• Challenges, options considered, and solutions for bridging the gap from source data to analysis
• Demo of the dataset

2 | Presentation Title | Presenter Name | Date | Subject | Business Use Only

SDTM Data

Data is collected in a custom domain. Assessments (X-ray images) are performed by:
• test
• location (joint)
• body side
• visit
• two different readers and possibly a third consensus read

The joint score is the result recorded in the source data.

USUBJID  VISIT  OMTEST   OMLOC  OMLT   OMEVAL     OMSTRESC
1        W24    EROSION  DIP4   RIGHT  READER 1   2
1        W24    EROSION  DIP4   RIGHT  READER 2   3
1        W24    EROSION  DIP4   LEFT   READER 1   2
1        W24    EROSION  DIP4   LEFT   READER 2   3
2        W24    EROSION  DIP4   RIGHT  READER 1   2
2        W24    EROSION  DIP4   RIGHT  READER 2   4
2        W24    EROSION  DIP4   RIGHT  CONSENSUS  3
2        W24    EROSION  DIP4   LEFT   READER 1   2
2        W24    EROSION  DIP4   LEFT   READER 2   4
2        W24    EROSION  DIP4   LEFT   CONSENSUS  3

Analysis Requirements

Evaluation of joint structural damage by visit:
• Parameter: modified total Sharp score (mTSS) change from baseline
• Covariate: modified total Sharp score (mTSS) baseline
• Consensus read to be used

Evaluation of the proportion of subjects without disease progression at each visit.

Comparison of the proportion of subjects with no disease progression between the two periods: from baseline to W24 versus from W24 to W52.

Definition and Derivation

Modified total Sharp score (mTSS) change from baseline for post-baseline assessments
• Defined as the sum of the joint score changes from baseline
• Imputation is needed when a joint score change from baseline is missing:
  - Joints are grouped into segments; the segment score is calculated as the subtotal of joint score changes from baseline within the segment:
    • Missing joints are imputed with the average change from baseline of the non-missing joints if >50% of joints are non-missing;
    • otherwise, the segment score is missing.
  - Total score (mTSS): sum of segment scores
    • Missing segments are imputed with the average of the non-missing segments if >50% of segments are non-missing;
    • otherwise, the total score is missing.

Definition and Derivation

Demo of imputation of missing joint change from baseline

Segment    Joint          Baseline               Post-baseline          Change
Segment 1  Joint 1        4                      6                      2
Segment 1  Joint 2        5                      2                      -3
Segment 1  Joint 3        6                      4                      -2
Segment 1  Joint 4        2                      3                      1
Segment 1  Joint 5        3                      4                      1
Segment 1  Joint 6        Missing (not imputed)  Missing (not imputed)  -0.2 (imputed)
Segment 1  Joint 7        Missing (not imputed)  Missing (not imputed)  -0.2 (imputed)
Segment 1  Joint 8        Missing (not imputed)  Missing (not imputed)  -0.2 (imputed)
Segment 1  Segment score  N/A                    N/A                    -1.6

Definition and Derivation

Modified total Sharp score (mTSS) baseline
• Defined as the sum of joint scores at baseline
• No imputation when joint scores are missing at baseline

Definition and Derivation

No disease progression
• At each visit: defined as mTSS change from baseline <= 0
• For the comparison between two periods: defined as the change in mTSS change from baseline <= 0

Challenges and Solutions

Challenge #1: How to create PARAM for mTSS change from baseline?

Solution:
• PARAM created for mTSS change from baseline (PARAMCD=TSSCBSI)
• AVAL stores change from baseline
• Only for post-baseline visits
• Different PARAMs for Reader 1, Reader 2, and the consensus read
• No PARAM created for individual joints or individual segments

Alternative:
• Because of how mTSS change from baseline is defined, the conventional method, which calculates an absolute total score at each visit and then the change from baseline at the total-score level, is not applicable.

Challenges and Solutions

Challenge #2: Need baseline score to be a covariate

Solution:
• PARAM created for mTSS baseline (PARAMCD=TSSBS)
• AVAL stores baseline
• Only for the baseline visit
• Different PARAMs for Reader 1, Reader 2, and the consensus read
• No PARAM created for individual joints or individual segments
• Custom variable BASESCO (baseline mTSS score) created as a column using AVAL of this PARAM

Alternative:
• Leave it to the reporting/analysis level without adding the baseline score as a variable in the dataset, which is not analysis ready. Conventional BASE is not applicable for this purpose.

Challenges and Solutions

Demo of ADaM Dataset for Challenge #1 and #2:

USUBJID  PARAMCD   AVISITN  AVAL  BASESCO
1        TSSBS1    0        10    10
1        TSSCBSI1  16       2     10
1        TSSCBSI1  24       3     10
1        TSSCBSI1  24       2     10
1        TSSCBSI1  52       -1    10
1        TSSBS2    0        11    11
1        TSSCBSI2  16       4     11
1        TSSCBSI2  24       6     11
1        TSSCBSI2  24       4     11
1        TSSCBSI2  52       0     11
1        TSSBS     0        10    10
1        TSSCBSI   16       3     10
1        TSSCBSI   24       4.5   10
1        TSSCBSI   24       3     10
1        TSSCBSI   52       -0.5  10

Challenges and Solutions

Challenge #3: How to handle various imputations?

(a) Challenge: Imputing missing data (linear extrapolation, LOCF)
    Solution: Apply ADaM methodology (insert new rows and use DTYPE)

(b) Challenge: Imputing a missing consensus read by taking the average of Reader 1 and Reader 2
    Solution:
    • New rows for the imputed consensus reads
    • Custom variable to indicate consensus type: original CONSENSUS (collected) or AVERAGE (imputed)
    Alternative:
    • Using DTYPE is not appropriate: the ADaM rule specifies that DTYPE should indicate rows derived within a given value of PARAM, whereas this imputation is done between parameters.

Challenges and Solutions

Demo of ADaM Dataset for Challenge #3:

USUBJID  PARAMCD   AVISITN  AVAL  DTYPE     CONSTYPE   BASESCO
1        TSSBS1    0        10                         10
1        TSSCBSI1  16       2                          10
1        TSSCBSI1  24       3     ENDPOINT             10
1        TSSCBSI1  24       2     LOCF                 10
1        TSSCBSI1  52       -1                         10
1        TSSBS2    0        11                         11
1        TSSCBSI2  16       4                          11
1        TSSCBSI2  24       6     ENDPOINT             11
1        TSSCBSI2  24       4     LOCF                 11
1        TSSCBSI2  52       0                          11
1        TSSBS     0        10              CONSENSUS  10
1        TSSCBSI   16       3               CONSENSUS  10
1        TSSCBSI   24       4.5   ENDPOINT  CONSENSUS  10
1        TSSCBSI   24       3     LOCF      CONSENSUS  10
1        TSSCBSI   52       -0.5            AVERAGE    10

Challenges and Solutions

Challenge #4: How to handle no disease progression?

(a) Challenge: Evaluation of the proportion of subjects without disease progression at each visit
    Solution:
    • AVAL is change from baseline (PARAMCD=TSSCBSI)
    • CRIT1 (AVAL<=0): no disease progression at each visit
    Pros:
    • No need to create a new PARAM (new rows)
    • DTYPE information for imputation (linear extrapolation, LOCF) is easily preserved because everything is on the same row
    Alternative:
    • Create a new PARAM
    Cons:
    • The dataset actually becomes more complex due to imputation.
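
As an illustration of Challenge #3(b) and Challenge #4(a), the following Python sketch builds one consensus-parameter row: it keeps a collected consensus read when one exists, otherwise averages Reader 1 and Reader 2, and then sets CRIT1FL from AVAL. The function name `consensus_row` and the dictionary layout are assumptions made for this example, not part of the presented dataset specification; returning nothing when a reader value is also missing is likewise an assumption.

```python
from typing import Optional


def consensus_row(reader1: Optional[float], reader2: Optional[float],
                  consensus: Optional[float]) -> Optional[dict]:
    """Build the consensus-parameter row for one subject and visit.

    Hypothetical helper: inputs are mTSS change-from-baseline values.
    """
    if consensus is not None:
        # A consensus read was collected: use it as-is.
        aval, constype = consensus, "CONSENSUS"
    elif reader1 is not None and reader2 is not None:
        # No consensus read: impute with the average of the two readers.
        aval, constype = (reader1 + reader2) / 2, "AVERAGE"
    else:
        # Assumption: with a reader value missing there is nothing to average.
        return None
    return {
        "PARAMCD": "TSSCBSI",
        "AVAL": aval,
        "CONSTYPE": constype,
        # CRIT1: no disease progression at this visit (AVAL <= 0).
        "CRIT1FL": "Y" if aval <= 0 else "N",
    }


# Week 52 in the demo data: Reader 1 = -1, Reader 2 = 0, no consensus read.
week52 = consensus_row(reader1=-1, reader2=0, consensus=None)
# Week 16 in the demo data: a consensus read of 3 was collected.
week16 = consensus_row(reader1=2, reader2=4, consensus=3)
print(week52)
print(week16)
```

Keeping the flag on the same row as AVAL mirrors the stated advantage of the CRIT1 approach: no extra parameter rows are needed.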

Challenges and Solutions

Demo of ADaM Dataset for Challenge #4a:

USUBJID  PARAMCD   AVISITN  AVAL  CRIT1FL (AVAL<=0)  DTYPE     CONSTYPE   BASESCO
1        TSSBS1    0        10                                            10
1        TSSCBSI1  16       2     N                                       10
1        TSSCBSI1  24       3     N                  ENDPOINT             10
1        TSSCBSI1  24       2     N                  LOCF                 10
1        TSSCBSI1  52       -1    Y                                       10
1        TSSBS2    0        11                                            11
1        TSSCBSI2  16       4     N                                       11
1        TSSCBSI2  24       6     N                  ENDPOINT             11
1        TSSCBSI2  24       4     N                  LOCF                 11
1        TSSCBSI2  52       0     Y                                       11
1        TSSBS     0        10                                 CONSENSUS  10
1        TSSCBSI   16       3     N                            CONSENSUS  10
1        TSSCBSI   24       4.5   N                  ENDPOINT  CONSENSUS  10
1        TSSCBSI   24       3     N                  LOCF      CONSENSUS  10
1        TSSCBSI   52       -0.5  Y                            AVERAGE    10

Challenges and Solutions

Challenge #4: How to handle no disease progression?

(b) Challenge: Comparison of the proportion of subjects with no disease progression between the two periods: from baseline to W24 versus from W24 to W52.
    Solution: For PARAMCD=TSSCBSI, populate:
    • BASETYPE (W24 AVAL as baseline)
    • BASE (W24 AVAL)
    • CHG (change in the change from baseline from W24 to W52 = W52 AVAL - W24 AVAL [BASE])
    • CRIT2 (BASE<=0): no disease progression from baseline to W24
    • CRIT3 (CHG<=0): no disease progression from W24 to W52, where AVISIT=W52
    Pros:
    • Analysis ready, “one proc away”
    • DTYPE information for imputation is easily kept
    • Data flow can be traced within the dataset
    Cons:
    • The dataset looks complex at first sight
    Alternative:
    • Create new PARAMs (e.g. one for disease progression from the baseline visit to W24, another for disease progression from W24 to W52)
    Pros:
    • The dataset looks simpler
    Cons:
    • Not analysis ready, “one proc away”
    • Data flow is not easily traced within the dataset
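
The derivation listed for Challenge #4(b) can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function name `add_period_flags` and the row dictionaries are hypothetical, and the Week 24 AVAL used as baseline is passed in explicitly because the slides do not say which Week 24 record (ENDPOINT or LOCF) supplies it.

```python
def add_period_flags(rows, base_aval):
    """Populate BASETYPE, BASE, CHG, CRIT2FL and CRIT3FL for one parameter.

    base_aval is the Week 24 AVAL chosen as baseline; which Week 24 record
    supplies it is an assumption left to the caller.
    """
    flagged = []
    for row in rows:
        new = dict(
            row,
            BASETYPE="WEEK 24 AVAL AS BASELINE",
            BASE=base_aval,
            # CRIT2: no progression from baseline to W24 (BASE <= 0).
            CRIT2FL="Y" if base_aval <= 0 else "N",
        )
        if row["AVISITN"] == 52:
            # CHG = W52 AVAL - W24 AVAL, populated on the Week 52 row only.
            new["CHG"] = row["AVAL"] - base_aval
            # CRIT3: no progression from W24 to W52 (CHG <= 0).
            new["CRIT3FL"] = "Y" if new["CHG"] <= 0 else "N"
        flagged.append(new)
    return flagged


# Reader 1 values from the demo data, with the Week 24 value 3 as baseline.
reader1 = [
    {"PARAMCD": "TSSCBSI1", "AVISITN": 16, "AVAL": 2},
    {"PARAMCD": "TSSCBSI1", "AVISITN": 24, "AVAL": 3},
    {"PARAMCD": "TSSCBSI1", "AVISITN": 52, "AVAL": -1},
]
result = add_period_flags(reader1, base_aval=3)
week52 = result[-1]
print(week52["CHG"], week52["CRIT2FL"], week52["CRIT3FL"])  # -4 N Y
```

With both flags on the Week 52 row, the two periods can be compared from a single record per subject and parameter.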

Challenges and Solutions

Demo of ADaM Dataset for Challenge #4b:

USUBJID  PARAMCD   AVISITN  AVAL  CRIT1FL (AVAL<=0)  BASE  CHG   CRIT2FL (BASE<=0)  CRIT3FL (CHG<=0)  DTYPE     CONSTYPE   BASESCO
1        TSSBS1    0        10                                                                                             10
1        TSSCBSI1  16       2     N                  3           N                                                         10
1        TSSCBSI1  24       3     N                  3           N                                    ENDPOINT             10
1        TSSCBSI1  24       2     N                  3           N                                    LOCF                 10
1        TSSCBSI1  52       -1    Y                  3     -4    N                  Y                                      10
1        TSSBS2    0        11                                                                                             11
1        TSSCBSI2  16       4     N                  4.5         N                                                         11
1        TSSCBSI2  24       6     N                  4.5         N                                    ENDPOINT             11
1        TSSCBSI2  24       4     N                  4.5         N                                    LOCF                 11
1        TSSCBSI2  52       0     Y                  4.5   -4.5  N                  Y                                      11
1        TSSBS     0        10                                                                                  CONSENSUS  10
1        TSSCBSI   16       3     N                  3           N                                              CONSENSUS  10
1        TSSCBSI   24       4.5   N                  3           N                                    ENDPOINT  CONSENSUS  10
1        TSSCBSI   24       3     N                  3           N                                    LOCF      CONSENSUS  10
1        TSSCBSI   52       -0.5  Y                  3     -3.5  N                  Y                           AVERAGE    10

• BASETYPE = WEEK 24 AVAL AS BASELINE on all post-baseline rows.
• ABLFL = Y flags the Week 24 record that supplies BASE for each parameter (row placement is not recoverable from the slide).
• At Week 52, CRIT1FL reflects CHG 0-52, CRIT2FL reflects CHG 0-24, and CRIT3FL reflects CHG 24-52.

Conclusion

• Data is collected in a custom domain that contains special elements not found in standard findings domains such as LB, VS, and EG.
• Complicated definitions and derivations lead to complexity in the design and implementation of the ADaM dataset.
• ADaM principles and methodology have been followed and adapted. This work demonstrates that sufficient tools are available to create a compliant and “analysis ready” ADaM dataset for this custom domain, although some special situations require going beyond what is specified in the ADaM IG.
• The ADaM dataset created allows us to perform analyses easily.
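
For reference, the imputation rule from the Definition and Derivation slides can be sketched in Python and checked against the Segment 1 demo. The helper name `impute_sum` is an assumption for this example, and the segment scores in the second call are made-up values used only to show that the same rule applies at the total-score level.

```python
from typing import Optional, Sequence


def impute_sum(values: Sequence[Optional[float]]) -> Optional[float]:
    """Sum values, imputing each missing one with the mean of the non-missing ones.

    Returns None (missing) unless more than 50% of the values are non-missing.
    """
    present = [v for v in values if v is not None]
    if not values or len(present) * 2 <= len(values):
        return None
    mean = sum(present) / len(present)
    n_missing = len(values) - len(present)
    return sum(present) + mean * n_missing


# Joint-level changes from baseline for Segment 1 in the demo (None = missing).
segment1 = [2, -3, -2, 1, 1, None, None, None]
segment1_score = impute_sum(segment1)
print(round(segment1_score, 1))  # -1.6

# Total score (mTSS change): the same rule applied to segment scores.
# The second and third entries are hypothetical, for illustration only.
mtss_change = impute_sum([segment1_score, 0.4, None])
print(mtss_change)
```

Five of the eight joints are non-missing (62.5%), so each missing joint receives the mean change of -0.2 and the segment score is -1.6, matching the demo slide.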