Data in empirical research
Some fundamental issues
Daniel Gile
daniel.gile@yahoo.com
www.cirinandgile.com
D Gile Data in empir res
Reminder: Data, the foundation of progress in CSA (1)
In HSA, scholars can observe reality, and then speculate and
theorize with much freedom
The norms of caution and rigorous inferencing make this
impossible in CSA
In CSA theoretical speculation is acceptable
- As a starting point for further empirical
exploration
- As a basis for theory construction, but the theory
will need to be tested empirically
- As tentative ideas to explain findings
But unlike the situation in HSA, in CSA,
all progress is by definition based on data and their analysis
Reminder: Data, the foundation of progress in CSA (2)
So the quality of research is limited by
the quantity and quality of the data on which it is based
In many cases, it is difficult to:
- Collect valid, relevant data
- Measure the data in a way that will help advance
towards finding an answer to the research question(s)
- Extrapolate to the whole environment or population to
which the research question(s) apply from data that can
be collected on part of it only
If the data are not valid or representative of the population,
no reliable inferences on the population can be made
If you cannot measure them adequately, they are of limited use
Collecting data – Access and indicators
Access to the data is often problematic:
Cost, confidentiality, difficulty of detection…
Cost and complexity of technical equipment
Physical access to the location
Permission to observe/record…
But more fundamentally
How do you gain access to the content of dreams?
How do you gain access to mental processes?
How do you gain access to skills in order to observe them?
You cannot observe them directly
What you generally observe (and measure) are indicators
In other words, data are not the phenomenon itself, but an
indicator of the phenomenon – more later
Collecting data – Identifying target data
When collecting data on a phenomenon or an indicator,
it is not always easy to distinguish the target data from other
information picked up
When studying language skills
and using errors and infelicities as an indicator,
How do you identify errors and infelicities in linguistic data?
When studying translation tactics
(decisions made when confronting a problem)
How do you distinguish between the result of a tactic
and the result of insufficient skills?
(e.g. omissions, small semantic changes)
Problems with data validity (1)
Reminder: Research explores various phenomena in Reality
Generally, data are not the phenomena themselves, but
something believed to ‘correspond’ to them in some way
For instance,
When studying voting behavior, the data used, e.g. the number of
ballots cast in favor of a certain candidate, are not the voting
behavior itself. They are something that reflects voting behavior.
One could say that generally, data are indicators
Though the term ‘indicator’ tends to be used for ‘something’ that is
even more remotely connected to the reality it is supposed to represent
Data are said to be valid if they correspond strongly to what
they are supposed to correspond to.
Problems with data validity (2)
Data are valid if one or some of their features correspond
strongly to what they are supposed to correspond to in the
object of study.
Such correspondence may be required for detection only
i.e. if and only if a particular feature of the object of study exists,
the data take on a particular feature and vice-versa
(the presence of particular objects on archaeological sites is
valid data to indicate skills/beliefs/rituals in the population which
lived in these particular sites)
Quantitative correspondence may be required in other cases
(e.g. measuring the amount of radioactivity, of a particular
chemical substance etc…)
Data validity – uncertain correspondence (1)
Voting statistics are a valid indicator of voting behavior
What about voting intentions as stated in interviews?
Are they valid as an indicator of voting behavior?
They say something about voting behavior, but that something is
not enough to determine how people are going to vote
Because:
Some people may change their mind
Some people do not speak the truth
(Diagram relating Data to Phenomenon)
Data validity – uncertain correspondence (2)
One frequent problem with data validity is the uncertain
correspondence between the data and the target phenomenon
e.g. Native speakers’ assessment of a non-native speaker’s
mastery of their language
(How sensitive are they to errors and infelicities? What are their
personal norms? What are their expectations?…)
Students’ assessment of their teachers
(Personal bias, political correctness…)
Problems because of interference from affective factors + (often
subconscious) desire to preserve self-image
Ex.: In Translation Studies, relative weight of quality components
This problem is particularly frequent in behavioral sciences
Data validity – partial correspondence (1)
Are police reports about sexual assaults a valid indicator of
actual sexual assault activity in a given city?
Most police reports about sexual assaults probably report genuine
sexual assaults, but there are many which are never reported
because the victims are afraid to report them or ashamed
So the data are valid for one part of the phenomenon only
(Diagram relating Data to Phenomenon, with a question mark)
Data validity – partial correspondence (2)
When data are valid for one part of the phenomenon only,
whereas exploration of the whole phenomenon is sought
How safe is it to extrapolate from info on part of the phenomenon
only?
(This is distinct from the issue of representativeness, taken up
later)
Example:
A single test to test language proficiency?
Language proficiency is multi-dimensional
(declarative knowledge, procedural knowledge, distinct skills like
pronunciation, fluency, reading ability, listening
comprehension ability, flexibility in using various registers…)
Validity of other research environment components
The validity of the data/the indicator chosen is not the only
validity issue in empirical research
As will be seen later, especially in experimental research
Ecological validity can be an issue:
- Task
- Environment
- Participants
Measurable data
Often, advancing towards an answer to the research question(s)
requires some kind of measurement of data
(intensity, magnitude, amount, frequency…)
In some cases, this is rather easy
(thermometer, number of ballots cast, money/time spent…)
In other cases, it is difficult
(intensity of feelings, ‘amount’ of deviation from a norm…)
Representative data (1)
Generally, it is not possible to have data on the whole object of study
(cost, time [including future], physical access…)
You can only access data on part of it
They may be valid and measurable,
but are they representative of the whole object of study?
Or of part of it only?
(Diagram relating Data to Phenomenon)
Representative data (2)
If the phenomenon is very homogeneous,
or if the accessible part has the same relevant features as the whole,
the data are said to be representative
If not, you cannot legitimately make inferences from your sample
about the whole
(Diagram relating Data to Phenomenon)
Validity and Representativeness
They are not the same:
Data can be valid, that is, provide reliable indications
on part of a phenomenon/object of study
(for instance, on a sample of people from a population)
Without being representative
Because it is possible that the characteristics of the sample are
different from the characteristics of the population
(for instance, when estimating the average height of a population from
a sample of people with a high proportion of basketball players)
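The height example can be made concrete with a small sketch. All numbers below are invented for illustration (a hypothetical population in which 5% of people are tall basketball players); the point is only that accurately measured, hence valid, data can still give a misleading picture of the population when the sample is not representative.

```python
from statistics import mean

# Hypothetical population, heights in cm (invented numbers):
# 950 people at 170 cm and 50 basketball players at 200 cm (5%).
population = [170] * 950 + [200] * 50

# Every height is assumed to be measured accurately, so both samples
# are "valid" data about the people they contain.
biased_sample = [170] * 10 + [200] * 10        # 50% basketball players
proportional_sample = [170] * 19 + [200] * 1   # 5% basketball players

population_mean = mean(population)
biased_mean = mean(biased_sample)
proportional_mean = mean(proportional_sample)

print(f"population mean:          {population_mean:.1f} cm")
print(f"biased sample mean:       {biased_mean:.1f} cm")
print(f"proportional sample mean: {proportional_mean:.1f} cm")
```

Only the sample whose composition matches the population reproduces the population average; the biased sample overestimates it although no individual measurement is wrong.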
Priorities and strategies
Validity is particularly important
Scientifically legitimate inferences on a phenomenon
can only be made if the data are valid
Representativeness is less of a problem
Provided no generalization is asserted
Measurability can be important
If only to measure the actual impact of a particular factor or
feature on the object of study
Sometimes, measurability can be constructed
(scales)
But limited measurability does not mean nothing can be learned
about the object of study → Qualitative research
The effects of variability
One other important issue in empirical research is
variability
Variability can be intrinsic to the phenomenon
(for instance in meteorological phenomena)
It can also be a feature of the data collected
Due to intrinsic variability in the phenomenon and/or
Heterogeneity in the phenomenon and/or
Variability in the collection procedures
Its effects can be very large
CASE STUDY (FICTION): THE EFFECT OF
EXPERIENCE ON TRANSLATION QUALITY
• Suppose you want to investigate the effect of
experience on translation quality
• Suppose that in reality, on average, there is a fast
progression along the learning curve during the first 5
years, and over the next decades, translators continue
to improve, but at a lower and lower speed
“REAL” AVERAGE PERFORMANCE VS.
EXPERIENCE
As measured by some valid indicator on a scale from 1 to 10
Exper.   0 yrs   5 yrs   10 yrs   15 yrs   20 yrs   25 yrs
Qual.    1       5       7        8        8.5      8.8
“Real” average learning curve
(Chart: quality, on a scale from 0 to 10, against experience, from 0 to 25 years)
Effects of attitude
- The translators’ attitude towards translation may
influence the quality of their work.
- Attitudes may change over time
- Suppose that attitudes are very positive in the
beginning, that they become negative after a while
because translators are disappointed with market
conditions, and that they gradually become more
positive when they adapt to the situation.
Experience vs. Attitude
Very positive to very negative to positive
Exp.     0 yrs   5 yrs   10 yrs   15 yrs   20 yrs   25 yrs
Attit.   +++     ++      ---      -        +        +
The effect of attitude: two scenarios
Exp.           0 yrs   5 yrs   10 yrs   15 yrs   20 yrs   25 yrs
Large influ.   +3      +2      -3       -1       +1       +1
Small influ.   +0.3    +0.2    -0.3     -0.1     +0.1     +0.1
The effect of attitude
(Chart: quality against years of experience, from 0 to 25, for the “real” curve and for the large-influence and small-influence scenarios)
The effect of attitude
- In the small influence scenario, the output pattern is
only changed marginally
- In the large influence scenario, it is changed
considerably. In particular, real improvement seems
to occur only after 10 years of experience.
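The two scenarios can be reproduced with a few lines of arithmetic. The sketch below uses the “real” quality values and the attitude effects given in the slides, and assumes (this is an assumption, not something the slides state) that the attitude effect simply adds to the real quality score.

```python
# Values taken from the slides' tables.
experience_years = [0, 5, 10, 15, 20, 25]
real_quality = [1, 5, 7, 8, 8.5, 8.8]
attitude_effect = {
    "large": [3, 2, -3, -1, 1, 1],
    "small": [0.3, 0.2, -0.3, -0.1, 0.1, 0.1],
}

def observed(real, effect):
    """Observed quality, ASSUMING the attitude effect is additive."""
    return [round(q + e, 2) for q, e in zip(real, effect)]

observed_quality = {
    name: observed(real_quality, effect)
    for name, effect in attitude_effect.items()
}

print("years:", experience_years)
print("real: ", real_quality)
for name, values in observed_quality.items():
    print(f"{name}:", values)
```

Under this additive assumption the small-influence curve stays close to the real one, while the large-influence curve drops back at 10 years to the level it had at 0 years and only rises clearly after that.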
Controllability
- Experimenters may be able to control attitude, for instance by
telling participants that the quality of their output is important,
or that they will be assessed by peers, etc.
- But it is not realistic to assume they can control everything – the
participants’ personality, fatigue, biorhythm, likes or dislikes
of certain types of texts, themes, etc.
The effect of uncontrolled variability
Assume a variability of up to ±30%, either intrinsic
or from uncontrolled factors:
Exp.   0 yrs   5 yrs   10 yrs   15 yrs   20 yrs   25 yrs
Var.   +30%    -30%    -30%     +30%     -20%     -30%
The effect of uncontrolled variability
(Chart: quality against years of experience, from 0 to 25, for the “real” curve and for the curve with variability)
The effect of variability
- With such variability, very common in empirical studies in
translation and interpreting
(actually, in such studies variability is often of several hundred
percent),
the underlying “true” pattern is severely distorted
- In particular, from the data, it seems that improvement occurs
for 15 years, after which there is a steady decline in the quality
of the translation output.
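The distortion can be checked numerically. The sketch below applies the variations from the variability slide, read in the order in which they appear there (+30%, -30%, -30%, +30%, -20%, -30%), to the “real” quality values. It assumes that each percentage is applied to the real value at the same experience level; both the reading order and this multiplicative interpretation are assumptions.

```python
# "Real" values from the slides; variations in the order read from the slide.
experience_years = [0, 5, 10, 15, 20, 25]
real_quality = [1, 5, 7, 8, 8.5, 8.8]
variation = [0.30, -0.30, -0.30, 0.30, -0.20, -0.30]

# ASSUMPTION: each variation is a percentage of the real value.
observed_quality = [
    round(q * (1 + v), 2) for q, v in zip(real_quality, variation)
]

# Experience level at which the observed curve peaks.
peak_years = experience_years[observed_quality.index(max(observed_quality))]

print("real:    ", real_quality)
print("observed:", observed_quality)
print("observed peak at", peak_years, "years")
```

The real curve rises throughout, but the observed values peak at 15 years and are lower at 20 and 25 years, which is the apparent decline described above.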
Consequences and conclusion (1)
Variability is a major enemy of research, in that it is
likely to hide ‘true’ trends and suggest false trends.
In experiments, some variability is counter-balanced by
the use of control over relevant variables, both in
sampling and in the control of environmental and
independent variables
Variability is further reduced by strict design and
implementation of the experimental procedure
Replications also reduce the effects of variability by
providing data for different constellations of
parameters
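One way to see why replications help is a small simulation. This is a simplified sketch, not the procedure described in the slides: it assumes that each replication adds independent, uniformly distributed variability of up to ±30% to the “real” curve, and that the replications are simply averaged. The function names are hypothetical.

```python
import random

real_quality = [1, 5, 7, 8, 8.5, 8.8]  # "real" values from the slides

def noisy_curve(rng, max_variation=0.30):
    """One simulated study: real curve with independent +/-30% variability."""
    return [q * (1 + rng.uniform(-max_variation, max_variation))
            for q in real_quality]

def mean_relative_error(n_replications, n_trials=200, seed=0):
    """Average relative distance between the averaged replications
    and the real curve, over n_trials simulated research programs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        curves = [noisy_curve(rng) for _ in range(n_replications)]
        averaged = [sum(column) / n_replications for column in zip(*curves)]
        errors = [abs(a - q) / q for a, q in zip(averaged, real_quality)]
        total += sum(errors) / len(errors)
    return total / n_trials

for n in (1, 5, 25):
    print(f"{n:2d} replication(s): mean relative error "
          f"{mean_relative_error(n):.3f}")
```

Under these assumptions the averaged curve moves closer to the real one as the number of replications grows. Real replications differ in more than random noise, so the sketch shows only the averaging part of the argument.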
Consequences and conclusion (2)
But in behavioral sciences, residual variability is often very large
If you plan to do experimental research, expect to find high
variability, and do not be disappointed if this happens.
Unless you need to arrive at a ‘clear-cut result’, results that are
not clear cut can also be of interest
They may show for instance
that there is no regular, clear ‘superiority’ of one method or one
condition over another
so don’t let the probability of not reaching ‘significance’ stop you
from doing the research.
The sensitivity of indicators/tools (1)
The concepts of ‘signal’ and ‘noise’:
(from radio transmission)
In empirical research, when seeking to collect data, you need
tools with a certain sensitivity
For instance, casual listeners will not necessarily spot traces of
a foreign accent or infelicities in a non-native speaker’s speech
Their sensitivity to these phenomena may be too low
And they will miss the ‘signal’ which is supposed to be detected
Other listeners may be too sensitive and mistake ‘native’
deviations from norms for signs of non-native language use
(certain violations of rules of grammar, false cognates…)
Sensitivity of data collection tools (2)
Sensitivity levels a, b and c:
At level a: Not sensitive enough. Does not pick up the signal, or picks up
part of it only
At level b: Appropriate sensitivity. Picks up the signal, not the noise
At level c: Too sensitive. Picks up the signal and the noise
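The three situations can be illustrated with a detection threshold. The sketch below is hypothetical: the observations, their "strength" scores and the threshold values are invented, and it assumes that signal and noise differ clearly in strength.

```python
# Hypothetical observations: (strength of the deviation, is it real signal?)
observations = [
    (0.9, True), (0.8, True), (0.7, True),      # signal
    (0.4, False), (0.3, False), (0.2, False),   # noise
]

def detect(threshold):
    """Count what a tool picks up when it registers every observation
    whose strength reaches the threshold (lower = more sensitive)."""
    picked = [is_signal for strength, is_signal in observations
              if strength >= threshold]
    return {"signal": sum(picked), "noise": len(picked) - sum(picked)}

# Invented thresholds for the three situations.
thresholds = {
    "not sensitive enough": 0.85,
    "appropriate": 0.5,
    "too sensitive": 0.1,
}
results = {label: detect(t) for label, t in thresholds.items()}

for label, counts in results.items():
    print(f"{label}: {counts}")
```

With these invented values, the least sensitive tool picks up one of the three signal items, the appropriate one picks up all three and no noise, and the most sensitive one picks up everything. When signal and noise overlap in strength, no single threshold separates them this cleanly.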
The sensitivity of indicators/tools (3)
Very high sensitivity which may pick up the ‘noise’
(i.e. non-signal)
is all right if it is then possible to filter out the noise from the
signal
But often, this is not possible,
because the noise is very similar to the signal
Other tactics may help
One is triangulation,
i.e. using a different method to throw a different light on the
phenomenon/data, including qualitative methods