Analysis of Medical Data
Research Perspective

Nancy B. Clark, M.Ed.
Director of Medical Informatics Education
FSU College of Medicine
Spring 2003
http://www.med.fsu.edu/informatics

Objectives
• Determine what data exist relative to a clinical question or formal hypothesis
  • Use IT to locate existing data sources
  • Identify and locate existing data sets
    • Within institution
    • Outside institution
• Analyze, interpret and report findings
  • Select appropriate computer software: Excel, SPSS
  • Use software to perform simple statistical analysis and portray results graphically
  • Interpret reports

Prerequisite Skills (Step 1 USMLE)
• Fundamental concepts of measurement
  • Scales of measurement
  • Distribution, central tendency, variability, probability
  • Disease prevalence and incidence
  • Disease outcomes (eg, fatality rates)
  • Associations (correlation or covariance)
  • Health impact (eg, risk differences and ratios)
  • Sensitivity, specificity, predictive values

More Prerequisite Skills (Step 1 USMLE)
• Fundamental concepts of hypothesis testing and statistical inference
  • Confidence intervals
  • Statistical significance and type I error
  • Statistical power and type II error

More Step 1 Topics
• Fundamental concepts of study design
  • Types of experimental studies (eg, clinical trials, community intervention trials)
  • Types of observational studies (eg, cohort, case-control, cross-sectional, case series, community surveys)
  • Sampling and sample size
  • Subject selection and exposure allocation (eg, randomization, stratification, self-selection, systematic assignment)
  • Outcome assessment
  • Internal and external validity

Scales of Measure
• Nominal - gender, race, color, city
• Ordinal - socioeconomic status of families
• Interval - temperature
• Ratio - time or space

Distribution, Central Tendency…
• Mean

…Variability, Probability…
• Mean
• Median
• Mode
• Standard deviation

Statistical Significance
• p < .01
• Confidence interval

Statistical Significance: Type I and Type II Errors
Null hypothesis = Ho

                    Ho True             Ho False
Reject Ho           Type I error        Correct decision
Do Not Reject Ho    Correct decision    Type II error

Statistics Online Textbook
• The Statistics Homepage
  http://www.statsoftinc.com/textbook/stathome.html

Disease Prevalence and Incidence
• Prevalence: probability of disease in the entire population at any point in time
  • 2% of the population has diabetes
• Incidence: probability that a patient without disease develops disease during an interval
  • 0.2%, or 2 new cases per 1000 per year

Sensitivity, Specificity
• sensitivity = a / (a+c)
• specificity = d / (b+d)

                    Patients with disease    Patients without disease
Test is positive    a                        b
Test is negative    c                        d

Predictive Value
• Positive predictive value = a / (a+b)
• Negative predictive value = d / (c+d)
• Post-test probability of disease given positive test = a / (a+b)
• Post-test probability of disease given negative test = c / (c+d)

                    Patients with disease    Patients without disease
Test is positive    a                        b
Test is negative    c                        d

Good Resource: Sen, Spc, PV
• An Introduction to Information Mastery
  http://www.poems.msu.edu/InfoMastery/default.htm
• Diagnosis
  • Sensitivity and specificity
  • Predictive values
  • Likelihood ratios

Fundamental Concepts of Study Design
• Good Resource: Epidemiology for the Uninitiated
• BMJ Online Textbook
  http://bmj.com/collections/epidem/epid.shtml

Finding Health Statistics

Types of Health Statistics Questions
• Fact lookups
• Research
• Presentations
• Social and policy indicators

Strategies for Finding Health Stats
• Use portal
• Start at Internet site
• Start with book or article

Internet Portals of Health Stats
• Lists of links that provide starting points for browsing or searching
• Keyword search in portal vs Google
• General idea what you want
• Related Health Services Research Web Sites
  http://www.nlm.nih.gov/nichsr/hsrsites.html
• The NCHS portal
  http://www.cdc.gov/nchs/

Other Statistical Web Sites
• CDC Data and Statistics
  http://www.cdc.gov/scientific.htm
• FedStats Home Page
  http://www.fedstats.gov/
• Compare these two
• U Michigan’s Statistical Resources on the WEB – HEALTH
• What type of stats
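Worked example for the Sensitivity, Specificity and Predictive Value slides: a minimal Python sketch of the 2x2 table formulas, using the same cell labels (a, b, c, d) as the slides. The function name `diagnostic_metrics` and the counts are hypothetical, chosen only to illustrate the arithmetic; they are not from any real study.

```python
def diagnostic_metrics(a, b, c, d):
    """Compute test characteristics from a 2x2 table laid out as on the slides.

    a = test positive, disease present    b = test positive, disease absent
    c = test negative, disease present    d = test negative, disease absent
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        # Post-test probability of disease after a negative test
        "disease_given_negative_test": c / (c + d),
    }


# Hypothetical counts for 1000 patients (illustration only)
metrics = diagnostic_metrics(a=90, b=50, c=10, d=850)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the negative predictive value and the post-test probability of disease given a negative test always sum to 1, since both are computed from the same row of the table.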
Lexis-Nexis Statistical Universe
• Subscription resource
• Searches stat data
• Subject list
• Limit search
• Reports or tables
• http://web.lexisnexis.com/statuniv?B1=Connect+to+Statistical+Universe

MMWR
• Morbidity – illness
• Mortality – death
• http://www.cdc.gov/mmwr/
• Disease Trends
• Tables - searchable

Health Care Data
• Healthcare Cost and Utilization Project
  • HCUPnet
• Hospital discharges
• Ambulatory service
• Costs
• Amount of care
• By diagnosis and procedure
• Surveys of hospitals, physicians, nursing homes

Health Consequences
• Costs to society, individuals
• Cost from care
• Costs of illness
• Impact on infrastructure
• HCFA => CMS Health Accounts
  http://cms.hhs.gov/statistics/nhe/default.asp

State and International Data
• Floridahealthstat.com - Where Florida Health Data Resides
• DOH Epidemiology
• KFF State Health Facts Online
• United Nations Statistics Division
• World Health Organization

Research Tools
• Individual datasets
• EMR
• Billing
• CDCS
• Customized data collection tools

Data Analysis

Selecting the Appropriate Software
Spreadsheet
• Numerical (interval or ratio) data
• Sums
• Averages
• Standard deviations
• Simple charts and graphs
Statistical Software
• Nominal or ordinal data
• Comparisons of two+ groups
• Frequency tables
• Complicated charts and graphs
• Normal curves
• Class intervals
• Statistical significance

Statistical Software
• SPSS
  • Provided by request/justification
  • Lab computers
• Start => Programs => SPSS for Windows => SPSS 11.0 for Windows

Start Screen
• Don’t show this dialog in the future.
• OK

Open Breast Cancer Survival

Data View

Views
• Variables View

File Information
• Utilities Menu
  • File Info…
• Output window

Descriptive Statistics
• Analyze Menu
  • Descriptive Statistics
  • Frequencies
• Select Age ►
• Click Statistics button
  • In Central Tendency: Mean, Median, Mode
  • In Dispersion: Standard deviation, variance
  • In Percentile Values: Quartiles
• Continue
• OK

Graphing
• Graphs Menu
  • Pie…
• Summary for groups of cases
• Lymph Nodes ►
• OK

Histogram with Normal Curve
• Graphs Menu
  • Histogram…
• Select Age ►
• Check Display Normal Curve
• OK

Simple Correlation Analysis
Age and Tumor Size
• Analyze Menu
  • Correlate…
  • Bivariate
• Select Age ►
• Select Pathological Tumor Size ►
• Check Pearson and Spearman – Two tailed
• OK
• Is there a correlation?
• Negative or positive?
• Is it statistically significant?

Save Output
• Save on All Users drive
• Under Nancy.clark
• SPSS Output Files
• Name it your name: eg, KerryBachista.spo

Importing Data
• From Excel, SAS, dBase, etc.
• Variable names first row
• File Menu, Open
• Files of Type
• Data…
• Excel
• Tutorial, Samples
• Demo.exe
• Type in Labels
• Pick Type of variable
• Enter Value Labels
• Etc.

SPSS Tutorials
• In the Help Menu
• On Informatics Web page
• Books:
  • Statistics for Social & Health Research (Sage), Argyrous, George
  • Statistics Applied to Clinical Trials (Kluwer Academic Publishers), Cleophas, Ton J., et al

Objectives
• Determine what data exist relative to a clinical question or formal hypothesis
  • Use IT to locate existing data sources
  • Identify and locate existing data sets
    • Within institution
    • Outside institution
• Analyze, interpret and report findings
  • Select appropriate computer software: Excel, SPSS
  • Use software to perform simple statistical analysis and portray results graphically
  • Interpret reports

Questions?
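Supplementary sketch for the Descriptive Statistics and Simple Correlation Analysis slides: the same summary measures (mean, median, mode, standard deviation, quartiles) and a Pearson correlation can be approximated outside SPSS with Python's standard library. The ages and tumor sizes below are made-up values for illustration, not the SPSS Breast Cancer Survival sample data, and `pearson_r` is a hypothetical helper name. Quartile values may differ slightly from SPSS because packages use different interpolation rules, and this sketch does not compute the significance (p) value that SPSS reports.

```python
import math
import statistics

# Hypothetical data (illustration only): patient age in years, tumor size in cm
ages = [35, 42, 42, 50, 58, 61, 67]
tumor_cm = [1.2, 1.5, 1.1, 2.0, 2.4, 2.2, 3.0]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Measures corresponding to the Frequencies > Statistics dialog options
summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "stdev": statistics.stdev(ages),            # sample standard deviation
    "variance": statistics.variance(ages),      # sample variance
    "quartiles": statistics.quantiles(ages, n=4),
}

r = pearson_r(ages, tumor_cm)

print(summary)
print(f"Pearson r (age vs tumor size): {r:.3f}")
```

A positive r suggests that tumor size tends to increase with age in this toy data; whether such a correlation is statistically significant still requires the test that SPSS performs alongside the coefficient.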