DATA ANALYSIS
Module 4

Part 1 – Key Concepts

Learning Objectives
• Understand the definition and purpose of data analysis
• Define statistical and M&E key concepts in data analysis

Data Analysis
• Turning raw data into useful information
• Purpose is to provide answers to questions being asked at a program site or research questions
• Even the greatest amount and best quality data mean nothing if not properly analyzed, or if not analyzed at all

Data Analysis
• Analysis does not mean using a computer software package
• Analysis is looking at the data in light of the questions you need to answer:
– How would you analyze data to determine: “Is my program meeting its objectives?”

Answering programmatic questions
• Question: Is my program meeting its objectives?
• Analysis: Compare program targets and actual program performance to learn how far you are from target.
• Interpretation: Why you have or have not achieved the target and what this means for your program. May require more information.

Descriptive analysis
• Describes the sample/target population (demographic & clinic characteristics)
• Does not define causality – tells you what, not why
• Example – average number of clients seen per month

Basic terminology and concepts
Statistical terms
• Ratio
• Proportion
• Percentage
• Rate
• Mean
• Median

Ratio
• Comparison of two numbers expressed as: a to b, a per b, a:b
• Used to express such comparisons as clinicians to patients or beds to clients
• Calculation: a/b
• Example – In district X, there are 600 nurses and 200 clinics. What is the ratio of nurses to clinics?
  600/200 = 3 nurses per clinic, a ratio of 3:1

Calculating ratios
• In Kwakaba district, there are 160 nurses and 40 clinics
• What is the nurse-to-clinic ratio?
  160/40 = 4
  4:1 or 4 nurses to 1 clinic

Proportion
• A ratio in which all individuals in the numerator are also in the denominator.
• Used to compare part of the whole, such as the proportion of all clients who are less than 15 years old
• Example: If 20 of 100 clients on treatment are less than 15 years of age, what is the proportion of young clients in the clinic?
  20/100 = 1/5

Calculating proportions
• Example: If a clinic has 12 female clients and 8 male clients, then the proportion of male clients is 8/20, or 2/5
  12 + 8 = 20 clients in total
  8/20: divide numerator and denominator by 4 to reduce
  = 2/5 of clients are male

Percentage
• A way to express a proportion (proportion multiplied by 100)
• Expresses a number in relation to the whole
• Example: Males comprise 2/5 of the clients, or 40% of the clients are male (0.40 x 100)
• Allows us to express a quantity relative to another quantity. Can compare different groups, facilities, countries that may have different denominators

Rate
• Measured with respect to another measured quantity during the same time period
• Used to express the frequency of specific events in a certain time period (fertility rate, mortality rate)
• Numerator and denominator must be from the same time period
• Often expressed as a ratio (per 1,000)
Source: U.S. Census Bureau, International Database.

Infant Mortality Rate
• Calculation
• # of deaths ÷ population at risk in same time period x 1,000
• Example – 75 infants (less than one year) died out of 4,000 infants born that year
• 75/4,000 = 0.01875; 0.01875 x 1,000 = 18.75, or about 19 infants died per 1,000 live births

Calculating mortality rate
• In 2009, Mondello clinic had 31,155 patients on ART. During that same time period, 1,536 ART clients died.
  1,536/31,155 = 0.049; 0.049 x 1,000 = 49
  49 clients died (mortality rate) per 1,000 clients on ART

Rate of increase
• Calculation: total increase ÷ length of time over which the increase occurred
• Used to calculate monthly, quarterly, yearly increases in health service delivery. Example: increase in # of new clients, commodities distributed
• Example: Condom distribution in Jan. = 200; as of June = 1,100. What is the rate of increase?
  1,100 - 200 = 900; 900/6 = 150 (150 condoms per month)

Calculating rate of increase
• In Q1, there were 50 new FP users, and in Q2 there were 75. What was the rate of increase from Q1 to Q2?
  75 - 50 = 25; 25/3 = 8.33 new clients per month

Central tendency
• Measures of the location of the middle or the center of a distribution of data
– Mean
– Median

Mean
• The average of your dataset
• The value obtained by dividing the sum of a set of quantities by the number of quantities in the set
• Example: (22+18+30+19+37+33) = 159; 159 ÷ 6 = 26.5
• The mean is sensitive to extreme values

Calculating the mean
• Average number of clients counseled per month
– January: 30
– February: 45
– March: 38
– April: 41
– May: 37
– June: 40
  (30+45+38+41+37+40) = 231; 231 ÷ 6 = 38.5
  Mean or average = 38.5

Median
• The middle of a distribution (when numbers are in order: half of the numbers are above the median and half are below the median)
• The median is not as sensitive to extreme values as the mean
• Odd number of numbers, median = the middle number
  Median of 2, 4, 7 = 4
• Even number of numbers, median = mean of the two middle numbers
  Median of 2, 4, 7, 12 = (4+7)/2 = 5.5

Calculating the median
• Client 1 – 2
• Client 2 – 134
• Client 3 – 67
• Client 4 – 10
• Client 5 – 221
  In order: 2, 10, 67, 134, 221; median = 67
  If the two middle values were 67 and 134 (an even number of clients): (67+134)/2 = 201/2 = 100.5

Use the mean or median?
Client       CD4 count
Client 1     9
Client 2     11
Client 3     100
Client 4     95
Client 5     92
Client 6     206
Client 7     104
Client 8     100
Client 9     101
Client 10    92

Key messages
• Purpose of analysis is to provide answers to programmatic questions
• Descriptive analyses describe the sample/target population
• Descriptive analyses do not define causality – that is, they tell you what, not why

Part 2: Basic analyses

Part 2: Learning Objectives
• Identify approaches for setting targets
• Understand common analyses that calculate program coverage and retention
• Calculate program coverage and retention

Terminology
• Indicator
• Target
• Program coverage
– Service availability
– Service utilization
• Program retention

Indicator
• Program element that needs tracking
• Measures an aspect of a program’s performance
• Measures changes over a period of time
– # of new family planning users
– # of clients currently on ART
• Expressed as a number or percentage

Target
• Definition: A specified level of performance for a measure (indicator), at a predetermined point in time (i.e., achieve ‘x’ by ‘y’ date)
• Overall target
• Annual targets

Why Set Targets?
Targets help program staff with:
• Planning
– Staffing and service delivery
– Commodities
• Monitoring progress
– Break long-term goals into manageable pieces
– Check progress on indicators

Setting Reasonable Targets
• The range of values for a given indicator can be from 0% to 100%.
• Example: The theoretical range for the Polio indicator is between 0% of children immunized (bad) and 100% immunized (ideal)
• Is it appropriate to set the Polio indicator target at 100% for a given program? Why/why not?

Setting Reasonable Targets
• Example: In Somalia, the national CPR from 2007 to 2009 was 15%. The following year, a national target was set for 70%.
• Is it appropriate to set the CPR target for Somalia at 70%? Why/why not?
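
Looking back at Part 1, the rate, percentage, mean, and median calculations can be checked with a few lines of Python. This is a minimal sketch and not part of the original module: the helper names `percentage` and `rate_per_1000` are illustrative assumptions, and the figures are the ones used in the Part 1 examples (male clients, infant mortality, Mondello clinic ART mortality, and the CD4 counts from "Use the mean or median?").

```python
from statistics import mean, median


def percentage(part, whole):
    """Express a proportion (part of a whole) per 100."""
    return part / whole * 100


def rate_per_1000(events, population_at_risk):
    """Events per 1,000 people at risk; both counts must cover the same period."""
    return events / population_at_risk * 1000


# Figures copied from the Part 1 examples.
male_pct = percentage(8, 20)                 # 8 male clients out of 20
infant_mortality = rate_per_1000(75, 4000)   # 75 infant deaths, 4,000 births
art_mortality = rate_per_1000(1536, 31155)   # Mondello clinic, 2009

# CD4 counts for Clients 1-10 from the "Use the mean or median?" example.
cd4_counts = [9, 11, 100, 95, 92, 206, 104, 100, 101, 92]
cd4_mean = mean(cd4_counts)
cd4_median = median(cd4_counts)

print(f"Male clients: {male_pct:.0f}%")
print(f"Infant mortality: {infant_mortality:.2f} per 1,000 live births")
print(f"ART mortality: {art_mortality:.1f} per 1,000 clients on ART")
print(f"CD4 mean: {cd4_mean}, CD4 median: {cd4_median}")
```

For the CD4 counts, the mean works out to 91 and the median to 97.5. The two very low values (9 and 11) pull the mean down more than the one very high value (206) pulls it up, which is one reason the median may describe the typical client better in this dataset.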
Overall Target Setting Approaches
There are three approaches to setting a target:
• Established long-term goals, obtained by contacting the national program
• Past performance (of your program, increasing by no more than 10%)
• Local high performer (a stellar program nearby)
Consider the number of clients your program can realistically expect to serve during a given period of time

Annual Target Setting
• Determine the increase your program needs to gain to reach your overall target
• Divide that number by the number of years in which you would like to achieve the target
• Add that number to your baseline indicator for each successive year

Considerations for Target Setting
• Ensure you have an agreed-upon and realistic definition of target population
• Set a realistic target to achieve in the long term and short term

Importance of Defining the Target Population: Case Example
• Target was 372 children to be immunized
• Actual was 488 children immunized
• To calculate the % target achieved, use (Actual/Target) * 100
  488/372 = 1.31; 1.31 * 100 = 131%
• How could the clinic have surpassed its target by so much?

Implications of Incorrect Target Setting: Case Example
• You don’t really know to what extent you’re fully immunizing the children in your setting
• If your program purchases commodities (e.g., vaccines) based on the target set, supply could run out
• If you set your target too low, you may not have enough vaccines, leading to disease outbreaks

Common Analyses
• Program Coverage
– Extent to which a program reaches its intended target population, institution, or geographic area
– Compare current performance to prior year/quarter
– Compare performance between sites
• Program Retention
– Extent to which the range of services is being delivered as initially intended so that client dropouts are minimal

Why do we need to measure coverage?
• To understand program progress
• To determine if the target is reached
– Clients, commodities, adherence…
• To determine if one target is reached more effectively than another
– Are there underserved areas/regions, subpopulations?

Program coverage
• Extent to which a program reaches its intended target population, institution, or geographic area
• Utilization: Is the target population utilizing services, accessing commodities, being reached with services?
• Availability: Are the services available where there is a need?

Utilization calculation
• Percentage of the target population utilizing services
  (# of individuals in target population using a service ÷ # of individuals in target population) x 100

Utilization calculation: Example
• No. of persons educated as of 6/12/09 = 300
• Goal for 12/31/09 = 900
  300/900 = 0.33; 0.33 x 100 = 33%
• You have reached 33% of your target group with education messages

Comparison of time periods
• Compare percentage achieved toward target for different time periods, different sites, etc.
• Rate of increase
– As of January, 70 people educated; by June, 300 people
– 300 – 70 = 230 increase in people educated
– 230/6 = 38.3 new people educated per month over the 6 months

Utilization of PMTCT Programs
• All pregnant women (2,000)
• PMTCT target (1,000): target population
• Sought prenatal care (600): service users
• Counseled & tested for HIV (500)
• Utilization = service users ÷ target population = 600/1,000 = 0.6; 0.6 x 100 = 60%

Availability calculation
• Number of service outlets available per target population
• # of clinics with PMTCT per # of pregnant women
• Expressed as a ratio

PMTCT clinic availability
• There are 8 clinics offering PMTCT & 100,000 pregnant women in region X.
• Ratio of clinics to pregnant women: 8:100,000
• Reduce to 1:12,500 (1 clinic per 12,500 pregnant women)
• The standard recommendation is 1 clinic with PMTCT services per 10,000 pregnant women
• Clinic availability is not reaching the target

Availability + Utilization = Coverage
• Service availability is 1:12,500
• Service availability target is 1:10,000
• PMTCT service utilization is 25% off the target
• What can we conclude?
  Service availability and utilization are too low; the program is not meeting the needs of pregnant women.

Program retention
• Measures if the range of services is being delivered as initially intended
• Determines program retention, i.e., is the project keeping clients through the entire package of services?
• Important in clinical programs where drug adherence is an issue (TB, HIV/AIDS, immunization) and there are multiple steps (PMTCT)

Retention example: Immunization
• Enter service (utilization): Polio dose 1
• Polio dose 2
• Polio dose 3 (completion)

PMTCT Program Retention
• All pregnant women (2,000 women)
• PMTCT target (1,000)
• Sought prenatal care (600)
• Tested for HIV (500)
• 350 received HIV-negative result or no result
• 100 received HIV-positive result
• 40 received prophylaxis

Key messages
• Target setting – a specified level of
performance for a measure (indicator) at a predetermined point in time. Both overall and annual targets are set
• Coverage – extent to which a program reaches its intended target population, institution, or geographic area
• Retention – the extent to which the range of services is being delivered as initially intended, with clients retained throughout the full package of services

Part 3: Data Presentation and Interpretation

Part 3: Learning Objectives
• Understand different ways to best summarize data
• Choose the right table/graph for the right data
• Interpret data to consider the programmatic relevance

Summarizing data
• Tables
– Simplest way to summarize data
– Data are presented as absolute numbers or percentages
• Charts and graphs
– Visual representation of data
– Data are presented as absolute numbers or percentages

Basic guidance when summarizing data
• Ensure graphic has a title
• Label the components of your graphic
• Indicate source of data with date
• Provide number of observations (n=xx) as a reference point
• Add footnote if more information is needed

Tables: Frequency distribution
• Set of categories with numerical counts

Year    Number of births
1900    61
1901    58
1902    75

Tables: Relative frequency
• (number of values within an interval ÷ total number of values in the table) x 100

Tables
Percentage of births by decade between 1900 and 1929

Year         Number of births (n)    Relative frequency (%)
1900–1909    35                      27
1910–1919    46                      34
1920–1929    51                      39
Total        132                     100.0
Source: U.S. Census data; 1900–1929.
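
The relative frequency column in the table above can be reproduced with a short Python sketch. The function name `relative_frequencies` and the dictionary layout are illustrative assumptions, not part of the original module; the counts are those in the births-by-decade table.

```python
def relative_frequencies(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {category: n / total * 100 for category, n in counts.items()}


# Number of births per decade, copied from the table.
births_by_decade = {
    "1900-1909": 35,
    "1910-1919": 46,
    "1920-1929": 51,
}

total_n = sum(births_by_decade.values())
rel = relative_frequencies(births_by_decade)

# Print a small table: decade, count, relative frequency to one decimal place.
for decade, n in births_by_decade.items():
    print(f"{decade}  n={n:>3}  {rel[decade]:.1f}%")
print(f"Total      n={total_n:>3}  {sum(rel.values()):.1f}%")
```

Unrounded, the shares are about 26.5%, 34.8%, and 38.6%. Rounding each one to a whole number independently gives 27, 35, and 39, which add up to 101, so a published table may adjust one value (as the 34 in the table appears to do) so that the column totals 100.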
Charts and graphs
• Charts and graphs are used to portray:
– Trends, relationships, and comparisons
• The most informative are simple and self-explanatory

Use the right type of graphic
• Charts and graphs
– Bar chart: comparisons, categories of data
– Line graph: display trends over time
– Pie chart: show percentages or proportional share

Bar chart
• Comparing categories
[Figure: bar chart without title or axis labels. Series: Site 1, Site 2, Site 3. X-axis: Quarter 1 to Quarter 4.]

[Figure: bar chart, "Percentage of new enrollees tested for HIV at each site, by quarter". Series: Site 1, Site 2, Site 3. X-axis: months, by quarter (Q1 Jan–Mar, Q2 Apr–June, Q3 July–Sept, Q4 Oct–Dec). Y-axis: % of new enrollees tested for HIV. Data source: Program records, AIDS Relief, January 2009 – December 2009; Quarterly Country Summary: Nigeria, 2008.]

Has the program met its goal?
[Figure: bar chart, "Percentage of new enrollees tested for HIV at each site, by quarter", with a target line. Series: Site 1, Site 2, Site 3. X-axis: Quarter 1 to Quarter 4. Y-axis: % of new enrollees tested for HIV (0% to 60%). Data source: Program records, AIDS Relief, January 2009 – December 2009; Quarterly Country Summary: Nigeria, 2008.]

Stacked bar chart
• Represent components of whole & compare wholes
[Figure: stacked bar chart, "Number of Months Female and Male Patients Have Been Enrolled in HIV Care, by Age Group". Bars: Females (4, 10), Males (3, 6). Segments: 0-14 years, 15+ years. X-axis: number of months patients have been enrolled in HIV care (0 to 15). Data source: AIDSRelief program records January 2009 - 20011]

Line graph
• Displays trends over time
[Figure: line graph, "Number of Clinicians Working in Each Clinic During Years 1–4*". Series: Clinic 1, Clinic 2, Clinic 3. X-axis: Year 1 to Year 4. Y-axis: number of clinicians. *Includes doctors and nurses]

Line graph
[Figure: line graph, "Number of Clinicians Working in Each Clinic During Years 1-4*". Series: Clinic 1, Clinic 2, Clinic 3. X-axis: Y1 1995, Y2 1996, Y3 1997, Y4 1998. Y-axis: number of clinicians. Source: Zambia Service Provision Assessment, 2007. *Includes doctors and nurses]

Pie chart
• Contribution to the total = 100%
[Figure: pie chart, "Percentage of All Patients Enrolled by Quarter", N=150. Slices: 1st Qtr, 2nd Qtr, 3rd Qtr, 4th Qtr. Values shown: 8%, 10%, 23%, 59%.]

Interpreting data
• Adding meaning to information by making connections and comparisons and exploring causes and consequences
– Relevance of finding
– Reasons for finding
– Consider other data
– Conduct further research

Interpretation – relevance of finding
• Does the indicator meet the target?
• How far from the target is it?
• How does it compare (to other time periods, other facilities)?
• Are there any extreme highs and lows in the data?

Interpretation – possible causes?
• Supplement with expert opinion
• Others with knowledge of the program or target population

Interpretation – consider other data
• Use routine service data to clarify questions
– Calculate nurse-to-client ratio, review commodities data against client load, etc.
• Use other data sources

Interpretation – other data sources
• Situation analyses
• Demographic and health surveys
• Performance improvement data

Interpretation – conduct further research
• Data gap: conduct further research
• Methodology depends on questions being asked and resources available

Key messages
• Use the right graph for the right data
– Tables – can display a large amount of data
– Graphs/charts – visual, easier to detect patterns
• Label the components of your graphic
• Interpreting data adds meaning by making connections and comparisons to program
• Service data are good at tracking progress & identifying concerns – do not show causality

Activity 5: Calculating coverage and retention

Learning Objectives
• Use basic statistics to measure coverage and retention
• Develop graphs that display performance measures (utilization, trends)
• Interpret performance measures for programmatic decision making

Small group activity
• Form groups of 4–6
• Each group reviews 2 worksheets from Excel file and answers the questions (1 hr 45 min)
• Each group presents 2 findings from each worksheet, focusing on the programmatic relevance of the findings (10 min per group)
• Audience provides feedback on analysis and interpretation (notes errors, additional interpretation) (10 min per group)

THANK YOU!

MEASURE Evaluation is a MEASURE project funded by the U.S. Agency for International Development and implemented by the Carolina Population Center at the University of North Carolina at Chapel Hill in partnership with Futures Group International, ICF Macro, John Snow, Inc., Management Sciences for Health, and Tulane University. Views expressed in this presentation do not necessarily reflect the views of USAID or the U.S. Government.
MEASURE Evaluation is the USAID Global Health Bureau's primary vehicle for supporting improvements in monitoring and evaluation in population, health and nutrition worldwide. Visit us online at http://www.cpc.unc.edu/measure.