Translation and Cross-Cultural Equivalence of Health Measures

Context: why are we interested?
1. Multinational drug trials: need to ensure products are tested in a standard way in different countries.
2. Cross-cultural research within countries
3. International health studies (WHOQoL, etc.)
4. Evidence-based medicine: how confident can I be that results of a clinical trial overseas would apply here?

Translation
- Simply translate an instrument and administer it
- Differences in response interpreted as differences in prevalence: assumes equivalence of the measurement scale in the 2 countries
- Almost certainly inadequate
- Translation + back translation

Issues to consider in translating an instrument
- Do you wish to adapt the measure for a new country, or make comparisons across countries?
- Should it be a strict translation or an adaptation?
- Back-translation gives linguistic identity rather than equivalence
  - Example of long-term memory item in a dementia screen: “In what year did the First World War begin?”
- In most countries the ‘official’ language differs from the vernacular. Which do we use?
- We often ignore linguistic variations within countries
- Translation or cultural domination? Beware hidden biases in the whole idea of translation. Consider a comment by Alexander Leighton (circa 1955): he made “refinements and changes … here and there in order to convey the meaning of the English questions as accurately as possible…”
- Why was this instrument chosen? Is the content truly relevant in another culture?

Issues to consider - continued
- At least some of the content of most scales is culture-specific (e.g., items in Nottingham Health Profile were seen as blasphemous in an Arabic country)
- Was the scale developed on a particular cultural group?
- Verbal translation versus creating a cultural equivalent.

Words & cultural concepts
- An etic approach to language (phonetic) describes the physical properties of the word, without referring to its functional meaning: language
- The emic approach takes account of the context, meaning and purpose of a word: concepts

Translation example: Friends
- “Does poor health prevent you from seeing your friends?”
- Be careful: emic meaning of “friend” differs in UK, US, and Australian forms of English
- Even more differences between Ami(e), Amigo and Freund

Relevance of Culture
- Culture shapes the way we think about health and illness: it influences the attention we pay to symptoms, our reactions to pain, etc.
- Expectations & definitions of feeling good, etc.
- It influences customary behaviours and relationships with others, including people with clipboards & questionnaires: the ‘questionnaire sophistication’ of the group.
- It affects the way we interpret the language used in our questionnaires.

Level of abstraction
Concepts can be:
- Abstract but general
  - E.g., Happiness, Ability
  - These terms probably apply in different cultures, but are imprecise and subjective: their meaning may differ.
  - However, being subjective may be sufficient in itself: perhaps a person’s subjective answer is inherently valid.
  - (Discussion point: does it matter if happiness means slightly different things in different cultures?)
- Concrete and specific
  - E.g., number of hospital beds per capita
  - You can compare these across cultures, but they are very context-dependent, so less cross-culturally comparable.

Establishing Cross-Cultural Equivalence
- Are you using the same general measurement procedures? Or, at least, culturally equivalent approaches? (This could mean using different words)
- Item equivalence: items should mean the same thing to people in one culture as in another, and be similarly difficult. E.g., on the FAS test, the equivalent items in French are not F, A and S, but T, N and P.
- Response scale equivalence (e.g., is the distance between “moderately severe” and “severe” the same in both cultures? Will respondents feel equally comfortable with responses like “Disagree strongly”?)

Conceptual or Functional Equivalence
- Is the theme being measured really a universal experience?
- Does this construct mean the same thing in both cultures? (How do we know this?)
- Does it matter that a theme such as quality of life has a different range in 2 cultures? Should it be measured relative to local expectations, or in an absolute way?
- Do the same cause-effect relations exist in each culture? Does a similar situation lead to similar behaviours across cultures? (E.g., sick enough to go to a doctor)

Developing cross-cultural measures
1. Sequential approach
2. Simultaneous approach
3. Translate an instrument into another language
- Conceptualize & develop measure in each culture
- Choose a set of equivalent items that reflect the same construct in different cultures
- Core instrument plus culture-specific additional components

Common strategies for ensuring cross-cultural equivalence
- Direct translation and comparison
- Better translation techniques
- Multi-trait, multimethod matrix
- Item response theory methods
- Differential item functioning

Strategies, continued
- Response pattern method
- Factor analysis
- Multidimensional scaling
- Combined etic-emic approach
- Multi-strategy approach

Factor analysis
- Empirical analysis of how items relate to one another
- Shows how many concepts the scale measures and which items measure each concept
- Confirmatory: must have a theory about how items go together
- Simultaneous factor analysis in different populations
  - Factor structure should be the same
  - Test whether the data are similar enough to be called equal
  - Same factor pattern loadings
  - Same goodness of fit

Differential Item Functioning (DIF)
- Related to item response theory (IRT)
- DIF = a difference in an item score between two groups who are equal in overall ability (e.g., as indicated by equal total scores)
- E.g., male & female difference in responses to sports played or symptoms of depression
- Uniform DIF = a consistent difference between groups at all trait levels.
- Nonuniform DIF = a difference that varies with trait level (e.g., present only at certain trait levels).
- Needed because tests can have matching factor structures and still be biased.
- Example in Crane PK. Statistics in Medicine 2004;23:241.

DIF analyses
- Involves comparison of 2 or more groups (e.g., different languages)
- Step 1: match people on ability (total score)
- Step 2: for each score group, compare performance of reference and focal group on each item. (Reference is usually the original language)

Two types of DIF
- Uniform
  - Difference in item difficulty between reference and focal group
  - Item may be more difficult for one group (perhaps translation problem?)
- Non-uniform
  - Difference in item discrimination parameter between reference and focal group

Translation & cultural equivalence suggestions
- Plan cross-cultural applications from the outset
- Consider relevance of quality of life carefully: omit?
- Avoid questionnaires!
- Use ‘DIF’ analyses
- Run within-country analyses
- Develop measures within each country
- Search for a core set of universal items (e.g., WHOQoL)
- Make sure the values are explicit
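
Illustration: a DIF check in code
The two-step DIF procedure described earlier (match people on total score, then compare reference and focal groups on each item within each score group) can be sketched as follows. This is a minimal sketch, not a definitive method: the slides do not name a statistic, so the Mantel-Haenszel common odds ratio is an assumption here, chosen as one widely used option for dichotomous items. The function name, the "ref"/"focal" labels, and the toy data are all hypothetical. A value near 1 suggests no uniform DIF; a value well above 1 suggests the item is easier for the reference group at the same total score. This statistic is aimed at uniform DIF and may miss non-uniform DIF.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(responses, groups, item):
    """Common odds ratio for one dichotomous item (hypothetical helper).

    responses: list of per-person lists of 0/1 item scores
    groups:    list of "ref" or "focal", one label per person
    item:      index of the item being studied
    Returns None when the ratio is undefined (zero denominator).
    """
    # Step 1: match people on ability by stratifying on total score.
    # Each stratum holds counts as [wrong, right] for each group.
    strata = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for scores, group in zip(responses, groups):
        strata[sum(scores)][group][scores[item]] += 1

    # Step 2: within each score group, compare reference and focal
    # performance on the item, then pool across score groups.
    numerator = 0.0
    denominator = 0.0
    for cells in strata.values():
        ref_wrong, ref_right = cells["ref"]
        focal_wrong, focal_right = cells["focal"]
        n = ref_wrong + ref_right + focal_wrong + focal_right
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n

    if denominator == 0:
        return None
    return numerator / denominator


# Toy data (invented for illustration): three items, two total-score groups.
ref_people = [[1, 0, 0]] * 2 + [[0, 1, 0]] + [[1, 1, 0]] * 2 + [[0, 1, 1]]
focal_people = [[1, 0, 0]] + [[0, 1, 0]] * 2 + [[1, 1, 0]] + [[0, 1, 1]] * 2
responses = ref_people + focal_people
groups = ["ref"] * len(ref_people) + ["focal"] * len(focal_people)

odds_ratio_item0 = mantel_haenszel_odds_ratio(responses, groups, item=0)
print(round(odds_ratio_item0, 2))  # prints 4.0
```

In this toy data, people with the same total score are four times as likely (in odds terms) to endorse item 0 in the reference group as in the focal group, which would flag the item for review (perhaps a translation problem). Real analyses need far larger samples and a significance test alongside the odds ratio.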