Error Why do we care?

Reliability
• Reliability
• Degree to which measures are free from
random error and, therefore, provide consistent
data.
• There are three ways to assess reliability
• Test-retest, equivalent forms, and internal
consistency (see the next slide)
Assessing the Reliability of a
Measurement Instrument
Test-retest reliability:
Use the same instrument a second time under
as nearly the same conditions as possible.
Equivalent form reliability:
Use two instruments that are as similar as
possible to measure the same object during the
same time period.
Internal consistency reliability:
Compare different samples of items being
used to measure a phenomenon during the
same time period.
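The test-retest and internal consistency checks above are often summarized with a correlation coefficient. The following is a minimal sketch, assuming Pearson's r as the statistic (the slides do not name one) and a split-half comparison as one way of "comparing different samples of items". All scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_r(item_scores):
    """Correlate totals of items 1, 3, ... with totals of items 2, 4, ...

    item_scores holds one row of item responses per respondent.
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    return pearson_r(first_half, second_half)


# Hypothetical data: the same five respondents measured twice.
test_scores = [10, 12, 15, 18, 20]
retest_scores = [11, 12, 14, 19, 21]
test_retest_r = pearson_r(test_scores, retest_scores)

# Hypothetical data: five respondents answering four 1-5 items.
items = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
]
internal_r = split_half_r(items)
```

A coefficient close to 1 suggests the measure gives consistent data. How high is "high enough" is a judgment the slides leave open.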
Validity
• Extent to which a measurement instrument
actually measures the attribute it was
intended to measure.
• Validity can be examined from a number of
different perspectives, including:
• Face, content, criterion-related, and construct
validity (see next slide)
Assessing the Validity of a
Measurement Instrument
Face validity
Researchers judge the degree to which a measurement
instrument seems to measure what it is supposed to.
Content validity
The degree to which the instrument items represent
the universe of the concept under study.
Criterion-related validity
The degree to which a measurement instrument can
predict a variable that is designated a criterion.
(a) Predictive validity; (b) Concurrent validity
Construct validity
The degree to which a measure confirms a hypothesis
created from a theory based upon the concepts under
study.
(a) Convergent validity; (b) Discriminant validity
Illustrations of Possible
Reliability and Validity Situations
in Measurement
[Figure: three diagrams of scattered and clustered points]
Situation 1: Neither reliable nor valid
Situation 2: Highly reliable but not valid
Situation 3: Highly reliable and valid
Measurement
The Concept of Measurement
and Measurement Scales
• Measurement
• Process of assigning numbers or labels to things
in accordance with specific rules to represent
quantities or qualities of attributes.
• Rule: A guide, method, or command that tells a
researcher what to do.
• Scale: A set of symbols or numbers constructed to
be assigned by a rule to the individuals (or their
behaviors or attitudes) to whom the scale is applied.
Types of Measurement Scales
• Nominal Scales
• Scales that partition data into mutually
exclusive and collectively exhaustive
categories.
• Ordinal Scales
• Nominal scales that can order data.
• Interval Scales
• Ordinal scales with equal intervals between
points to show relative amounts; may include
an arbitrary zero point.
• Ratio Scales
• Interval scales with a meaningful zero point so
that magnitudes can be compared
arithmetically.
Nominal: (illustration not preserved)
Ordinal: Win, Place, Show
Interval: 1 length, 2 lengths
Ratio: 40 to 1 long-shot pays $40
Type of Scale | Numerical Operation | Descriptive Statistics
Nominal | Counting | Frequency; Percentage; Mode
Ordinal | Rank ordering | (plus…) Median; Range; Percentile
Interval | Arithmetic operations on intervals between numbers | (plus…) Mean; Standard deviation; Variance
Ratio | Arithmetic operations on actual quantities | (plus…) Geometric mean; Coefficient of variation
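The "(plus…)" entries in the scale-by-statistic listing above mean each scale inherits the statistics of the scales below it. A minimal sketch of that cumulative rule follows. The names `SCALE_ORDER`, `ADDED_STATISTICS`, and `permissible_statistics` are hypothetical, chosen only for illustration.

```python
# Scales ordered from least to most informative.
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]

# Statistics each scale adds on top of the scales below it,
# copied from the listing in the slides.
ADDED_STATISTICS = {
    "nominal": ["frequency", "percentage", "mode"],
    "ordinal": ["median", "range", "percentile"],
    "interval": ["mean", "standard deviation", "variance"],
    "ratio": ["geometric mean", "coefficient of variation"],
}


def permissible_statistics(scale):
    """Return every descriptive statistic allowed for the given scale type."""
    level = SCALE_ORDER.index(scale.lower())  # raises ValueError if unknown
    allowed = []
    for name in SCALE_ORDER[: level + 1]:
        allowed.extend(ADDED_STATISTICS[name])
    return allowed


ordinal_stats = permissible_statistics("ordinal")
```

Under this rule a mean is not reported for ordinal data, while ratio data supports every statistic in the listing.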
Selecting appropriate univariate
statistical method
Scale | Business Problem | Statistical question to be asked | Possible test of statistical significance
Nominal scale | Identify sex of key executives | Is the number of female executives equal to the number of male executives? | Chi-square test
Nominal scale | Indicate percentage of key executives who are male | Is the proportion of male executives the same as the hypothesized proportion? | T-test
Ordinal scale | Compare actual and expected evaluations | Does the distribution of scores for a scale with categories of poor, good, excellent differ from an expected distribution? | Chi-square test
Interval or ratio scale | Compare actual and hypothetical values of average salary | Is the sample mean significantly different from the hypothesized population mean? | Z-test (sample is large); T-test (sample is small)
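Two of the tests listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not a full testing procedure: the counts and salaries are hypothetical, the function names are invented here, and the decision compares the statistic with tabled critical values at the 0.05 level (3.841 for chi-square with 1 degree of freedom; 2.365 for a two-tailed t with 7 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def one_sample_statistic(sample, hypothesized_mean):
    """(sample mean - hypothesized mean) / estimated standard error."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - hypothesized_mean) / standard_error


# Nominal scale: is the number of female executives equal to the number of males?
observed = [60, 40]   # hypothetical counts: male, female
expected = [50, 50]   # equal split under the null hypothesis
chi2 = chi_square_statistic(observed, expected)
reject_equal_split = chi2 > 3.841   # critical value, df = 1, alpha = 0.05

# Interval or ratio scale: does mean salary differ from a hypothesized 50?
salaries = [52, 48, 55, 51, 49, 53, 50, 54]   # hypothetical, in thousands
t_stat = one_sample_statistic(salaries, 50)
reject_mean = abs(t_stat) > 2.365   # two-tailed critical t, df = 7, alpha = 0.05
```

With a small sample the statistic is referred to the t distribution, as here. With a large sample the same ratio is usually treated as a Z statistic.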
Questionnaire design
A survey is only as good as the
questions it asks
What should you ask?
• The questions asked are a function of
previous decisions
• The questions asked are a function of future
decisions (such as statistical analysis)
Key criteria
• Questionnaire relevancy
• No unnecessary information is collected and
only information needed to solve the problem is
obtained. Be specific about your data needs; tie
each question to an objective
• Questionnaire accuracy
• Information is both reliable and valid
Phrasing Questions
• Open-ended response versus fixed-alternative
questions
• Decision criteria: type of research; time;
method of delivery; budget; concerns
regarding researcher bias
Avoid
• Leading questions
• Overly complex questions
• Use of jargon
• Loaded questions (can use a counterbiasing
statement)
• Ambiguity
• Double-barreled questions
• Making assumptions
Order?
• Order bias results from an alternative
answer’s position in a set of answers or
from the sequencing of questions
• Funneling technique: moving from general to specific
questions establishes the respondent's frame of reference first
• Anchoring effect: the first concept
measured tends to become a comparison
point from which subsequent evaluations
are made
Types of questions
Types of fixed alternative questions…
• Single dichotomy or dichotomous-alternative
questions
“Are you currently registered in a course at the
University of Lethbridge?
Yes____ No____”
• Respondent chooses one of two alternatives
(yes/no; male/female)
• What scale would this data create?
Types of fixed alternative questions…
• Multi-choice alternative
• Respondent chooses from several
alternatives
• Many types…
Multi-choice alternative questions…
• Determinant choice
• Choose only one from several possible responses
“Which faculty are you currently registered in at the
University of Lethbridge?”
Management ___
Education ____
Arts/Science____
Health sciences____
Combined degree____
• What type of scale would these data create?
• Frequency determination
• Asks for an answer about frequency of
occurrence
In a typical week, how often do you
purchase chocolate chip cookies?
__never
__ once
__ 2 or more times
What type of scale would these data create?
• Check list
• Provide multiple answers to a single question
• Should be mutually exclusive and exhaustive
“What brands of chocolate chip cookies have
you, to the best of your memory, purchased in
the past month (check all that apply)?”
__ Dare
__ Chips Ahoy!
__ President's Choice Decadent etc. etc.
• What type of scale would these data create?
• Attitude rating scales
Attitude:
An enduring disposition to consistently
respond to various aspects of the world,
including persons, events and objects
Typically seen as having three components:
• Cognitive
• Affective
• Behavioural
Attitude Scales: Scaling Defined
The term scaling refers to procedures for
attempting to determine quantitative measures of
subjective and sometimes abstract concepts. It is
defined as a procedure for the assignment of
numbers to a property of objects in order to
impart some of the characteristics of numbers to
the properties in question.
Affective
The feelings or emotions toward an
object
Cognitive
• Knowledge and beliefs
Behavioral
• Predisposition to action
• Intentions
• Behavioral expectations
Unidimensional Scaling
Procedures designed to measure only one
attribute of a respondent or object
Multidimensional Scaling
Procedures designed to measure several
dimensions of a respondent or object
Attitude measuring process
• Ranking
• Rating
• Sorting
• Choice
Types of attitude scales
• Simple attitude scales
• Most basic form – respondent responds to a single
question
• Do not allow for fine distinctions or placement on
continua
• You are at a company party and are feeling nervous, but
you are obligated to be there. Do you:
__ find someone you know to buddy up with
__ take it as an opportunity to meet new people
What type of scale would these data create?
• Category scales
• More sensitive; provides more information
• Overall, how satisfied are you with the high speed
performance of your Mercedes:
__ very satisfied
__ somewhat satisfied
__ neither satisfied nor dissatisfied
__ somewhat dissatisfied
__ very dissatisfied
If you could choose, how long would each term be?
___26 weeks __ 13 weeks __ 6 weeks ___4 weeks
What type of scale would these data create?
• Summated rating scales – the Likert scale
• Respondents indicate their attitudes by
checking how strongly they agree or disagree
with statements
• Chocolate chip cookies are my preferred variety
of cookie
Strongly disagree (1)  Disagree (2)  Uncertain (3)  Agree (4)  Strongly agree (5)
What type of scale would these data create?
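A summated rating scale combines the item responses into one score per respondent. The sketch below assumes a simple sum of 1 to 5 responses. Flipping negatively worded items before summing is a common practice that the slides do not mention, so `reverse_items` is an assumption, and the function name and data are hypothetical.

```python
def summated_score(responses, reverse_items=(), points=5):
    """Sum item responses on a 1..points scale.

    reverse_items holds the zero-based positions of negatively worded
    statements, whose responses are flipped (5 becomes 1, 4 becomes 2, ...).
    """
    total = 0
    for index, value in enumerate(responses):
        if not 1 <= value <= points:
            raise ValueError(f"response {value} is outside 1..{points}")
        if index in reverse_items:
            value = points + 1 - value
        total += value
    return total


# Hypothetical respondent: four statements, the third is negatively worded.
answers = [4, 5, 2, 4]
score = summated_score(answers, reverse_items={2})
```

Here the flipped third response counts as 4, so the respondent's summated score is 17 out of a possible 20.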
• Semantic Differential Rating scale
• An attitude measure consisting of a series of
seven-point bipolar rating scales allowing
response to a “concept”
Think of your favorite type of cookie. Rate it on each of
the following continua:
Hard------------------------------------------------------Soft
Lots of chips---------------------------------------Fewer chips
Crispy---------------------------------------------------Chewy
What type of scale would these data create?
• Numerical Rating scale
• Similar to a semantic differential except that it uses
numbers as response options to identify response
positions instead of verbal descriptions
Think of your favorite type of cookie. Rate it on each of the following
continua:
Hard   8   7   6   5   4   3   2   1   Soft
This scale is called an 8-point numerical scale. Why?
What type of scale would these data create?
• Constant Sum Scales
• Measures attributes based on their importance to the person.
Respondents are asked to divide a constant sum to
indicate the relative importance of attributes
Example: Suppose the photocopy budget per professor
was $100 per month. How much should be allocated to
each of the following? Divide the $100 according to your
preference:
____ photocopying for student needs;
____ photocopying for research needs;
____ photocopying for committee needs.
====
$100 TOTAL
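A constant-sum response is only usable if the allocations add up to the stated total. A minimal sketch of that check follows, converting each allocation to a share of the total. The function name and the respondent's answers are hypothetical.

```python
def relative_importance(allocations, total=100):
    """Check a constant-sum response and convert it to shares of the total."""
    if any(amount < 0 for amount in allocations.values()):
        raise ValueError("allocations cannot be negative")
    if sum(allocations.values()) != total:
        raise ValueError(f"allocations must sum to {total}")
    return {name: amount / total for name, amount in allocations.items()}


# Hypothetical response to the photocopy budget question.
response = {"student needs": 50, "research needs": 30, "committee needs": 20}
shares = relative_importance(response)
```

A response that does not sum to $100 raises an error rather than being silently rescaled. That is one possible design choice. A survey tool could instead prompt the respondent to correct the entry.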
• Stapel Scales
• An attitude measure that places a single adjective in the
center of an even-number range of numerical values
Example:
Research Methodology
+3
+2
+1
Exciting
-1
-2
-3
• Graphic Rating Scales
• An attitude measure consisting of a graphic
continuum that allows respondents to rate an
object by choosing any point on the continuum
• Rank-Order Scales
• Scales in which the respondent compares one
item with another or a group of items against
each other and ranks them.
Example: handout
Most important skills
• Adaptability to change
• Problem identification
• Listening skills
• Written communication
• Leadership
• Informal oral communication
• Analytical thinking/problem solving
• Time management
• Coping with stress/job pressures
• Interpersonal relations
• Formal oral presentations
Most important skills (two rank columns; column headings not preserved)
Adaptability to change | 8 | 9
Problem identification | 6 | 6
Listening skills | 1 | 1
Written communication | 2 | 4
Leadership | 4 | 2
Informal oral communication | 3 | 3
Analytical thinking/problem solving | 5 | 5
Time management | 7 | 10
Coping with stress/job pressures | 11 | 7
Interpersonal relations | 9 | 8
Formal oral presentations | 10 | 11
• Paired Comparison Scales
• Respondent is presented with two objects and is
asked to pick the preferred one.
Example: Which type of cookie do you prefer?
__ chocolate chip
__ oatmeal
__ I do not have a preference between these two
• Sorting
• Respondent indicates their attitudes or beliefs
by arranging items.
Example: Please sort the following cards with pictures of
cookies into the following categories
Like
Dislike
Neither like nor dislike
Decisions
• Ranking, sorting, rating or choice?
• How many categories or response
positions?
• Forced choice or nonforced choice?
• Single measure or index?