STATISTICS: DATA CONCEPTS

LEVELS OF MEASUREMENT

INTERVAL
Data is measured in intervals:
0-5, 5-10, 10-20,…

RATIO
• Some data is measured in intervals, for example height, weight, age.
• Arithmetic operations may be applied to those intervals with a true zero point.

ORDINAL
Fixed classification values on some kind of scale.
(Ordinal data – there is some order…)
This data is most commonly found in questionnaires with rating agreement, or some
other measurement.

NOMINAL
Fixed classification values where the classification value has no meaning in itself.
(Nominal data.)
For example: male, female may be coded as (0, 1); Car Manufacturer (Ford, Holden,
Toyota, other) may be coded as (0, 1, 2, 3) respectively. It is irrelevant which
category is coded as 0, as 1, etc.

MEASURES OF SPREAD

INTERQUARTILE RANGE
Upper quartile – Lower quartile
Gives the range covered by the middle 50% of the data.

RANGE
Maximum – Minimum
Gives a measurement of how wide the area covered by the data is.

VARIANCE
Variance = 1/(n-1) * Σ (xi – mean(x))²
Gives a measurement of spread, but not in the same units as the original data
(the units are squared).
Always use the standard deviation when making statements about the original data.

STANDARD DEVIATION
Square root of Variance
Gives a measurement of spread in the same units as the original data.
This allows checking of outliers / extreme values.
An outlier is a value that falls outside the interval:
Mean ± 3 x the standard deviation
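The measures of spread described above can be sketched in a few lines of Python. This is only an illustration using the standard `statistics` module. The helper name `find_outliers` and the `survey` data are hypothetical, and the eight sample values are the ones the brochure uses in its mean and median example. Quartile conventions differ between textbooks and software, so the interquartile range shown here may not match a hand calculation exactly.

```python
import statistics

# Sample values taken from the brochure's mean/median example.
values = [2, 4, 6, 8, 10, 24, 26, 28]

# Range: maximum minus minimum.
data_range = max(values) - min(values)  # 26

# Interquartile range: upper quartile minus lower quartile.
# statistics.quantiles uses its default ("exclusive") method here;
# other quartile conventions can give slightly different cut points.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Sample variance (1/(n-1) divisor) and its square root.
variance = statistics.variance(values)  # 114
std_dev = statistics.stdev(values)


def find_outliers(data):
    """Return the values outside mean ± 3 x standard deviation.

    Hypothetical helper written for this sketch, following the
    brochure's 3-standard-deviation rule.
    """
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    lower, upper = mean - 3 * sd, mean + 3 * sd
    return [v for v in data if v < lower or v > upper]


# Hypothetical data: twenty identical readings plus one extreme value.
survey = [10] * 20 + [100]

print("range:", data_range)
print("IQR:", iqr)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
print("outliers in values:", find_outliers(values))
print("outliers in survey:", find_outliers(survey))
```

Note that the variance is in squared units, so only the standard deviation is compared against the original data when the outlier interval is built.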
STUDENT LEARNING CENTRE
Compiled by David Munroe
Academic Mentor
Student Learning Centre
Massey University, Auckland, 2004
This brochure will help you with
some basic statistical concepts.
• Types of Variable
• Examining Distribution
• Measures of Centrality
• Measures of Spread
• Levels of Measurement
TYPES OF VARIABLE

CATEGORICAL VARIABLES
An individual is placed into one of several groups or categories.
Display using:
• Pie graphs (Avoid using these!)
• Bar graphs

QUANTITATIVE VARIABLES
Numerical values
You can apply arithmetic operations such as adding and averaging.
Display using:
• Stem plots
• Histograms
• Time plots can be used, e.g. rainfall data

DISTRIBUTION
This is the pattern of variation of a variable.
Tells us what value a variable takes and how often it takes these values.

EXAMINING DISTRIBUTION

OVERALL PATTERN
Look at Shape, Centre, and Spread
(See measures of centrality and spread)

Shape
• Unimodal = one peak
• Symmetrical?
• Right skewed (Mean > Median)
• Left skewed (Mean < Median)
[Figures: sketches of distribution shapes on x and y axes, with the Mean and Median marked on the skewed curves]

MEASURES OF CENTRALITY

MEAN (µ)
Average value
= Sum(values)/Number of values
= 1/n * Σ xi   (Σ = sum of)
e.g. Values = 2, 4, 6, 8, 10, 24, 26, 28
n = 8
Mean = (2 + 4 + 6 + 8 + 10 + 24 + 26 + 28) ÷ 8 = 13.5
Use with interval or ratio data

MEDIAN
Middle value
Sort data and take the middle value.
(Take the average of the two middle values if n, the number of values, is even.)
e.g. Values = 2, 4, 6, 8, 10, 24, 26, 28
n = 8
Median = (8 + 10) ÷ 2 = 9
Use with Interval or Ordinal data

MODE
The most common value(s)
Use with Ordinal or Nominal data
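The mean, median and mode calculations described above can be checked with a short Python sketch. It uses the standard `statistics` module and the brochure's own eight example values. The `ratings` list and the helper `describe_skew` are hypothetical, added only for illustration, and comparing the mean with the median is a rule of thumb rather than a formal test of skewness.

```python
import statistics

# The brochure's example values (already sorted, n = 8).
values = [2, 4, 6, 8, 10, 24, 26, 28]

mean = statistics.mean(values)      # 13.5
median = statistics.median(values)  # 9.0, the average of the middle values 8 and 10

# Hypothetical ordinal data: questionnaire ratings on a 1-5 scale.
ratings = [1, 2, 2, 3, 3, 3, 4, 5]
modes = statistics.multimode(ratings)  # [3]; a list, since there may be several modes


def describe_skew(mean_value, median_value):
    """Apply the brochure's rule of thumb for the direction of skew.

    Hypothetical helper: equal mean and median do not prove symmetry,
    they only mean this rule indicates no skew.
    """
    if mean_value > median_value:
        return "right skewed"
    if mean_value < median_value:
        return "left skewed"
    return "no skew indicated"


print("mean:", mean)
print("median:", median)
print("mode(s):", modes)
print("shape:", describe_skew(mean, median))
```

For the example values the mean (13.5) is larger than the median (9), so the rule of thumb points to a right-skewed distribution, pulled upward by the three large values 24, 26 and 28.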