Uploaded by Unaizah Mahomed

STK110 Chapter 1

Chapter 1: Data and Statistics
STK 110
“…and I thought DATA and STATISTICS are the same … will have to Google it!”

Mobile/QT clicker questions will be asked throughout the lecture. Slides with a clicker question can be distinguished from lecture slides by the red bar with the turning point icon.
Read the study guide for the rules regarding mobile/QT clicker practice.
When you see this icon you may discuss the question with your immediate peer(s). Decide on an answer that is your “group” attempt.
When you see this icon you may NOT discuss the question.
Terminology: Data and Data Sets
Data are the facts and figures collected, analysed and summarised for presentation and interpretation.
All the data collected in a particular study are called a data set.

Table 2: Means of transport used by people attending an educational institution by type of institution (numbers), 2001

Institution    On foot  By bicycle  By motor  By mini-bus/taxi   By bus  By train
Pre-school     619 005       3 727     3 020            56 587   18 844     3 169
School       9 897 862      70 007    28 937           605 161  462 151    81 202
College         73 092       2 023     1 663            65 344   23 040    17 350
Technikon       51 425       1 218     1 152            44 401   25 739    13 387
University      70 794       3 402     2 732            42 510   19 517     8 804
Other           20 638         434       352             5 505    3 074     1 114

Source: Census 2001

The table title describes the STUDY, each count (for example 65 344) is DATA, and all the counts together form the DATA SET.
Terminology: Elements, Variable, Observation
Elements
The entities on which data are collected.
Variable
A characteristic of interest for the elements.
Observation
The set of measurements obtained for a particular
element.
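
As a rough illustration of these three terms, the sketch below stores three rows of Table 2 in Python. The names `variables`, `data_set`, `elements` and `observation` are chosen here for illustration only and are not part of the course material; the column heading "By motor" is copied as printed.

```python
# Variables: the characteristics recorded for every element (the column headings).
variables = ["On foot", "By bicycle", "By motor", "By mini-bus/taxi", "By bus", "By train"]

# Data set: one list of measurements per element (type of institution).
# Only three rows of Table 2 are copied here to keep the sketch short.
data_set = {
    "Pre-school": [619005, 3727, 3020, 56587, 18844, 3169],
    "Technikon": [51425, 1218, 1152, 44401, 25739, 13387],
    "University": [70794, 3402, 2732, 42510, 19517, 8804],
}

# Elements: the entities on which data are collected.
elements = list(data_set)

# Observation: the set of measurements obtained for one particular element.
observation = dict(zip(variables, data_set["University"]))

print(elements)                  # ['Pre-school', 'Technikon', 'University']
print(observation["By train"])   # 8804
```

Each key of `data_set` is an element, each entry of `variables` is a variable, and each list of six counts is one observation.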

Table 2: Means of transport used by people attending an educational institution by type of institution (numbers), 2001

ELEMENTS: the types of institution in the rows (Pre-school, School, College, Technikon, University, Other).
VARIABLES: the means of transport in the columns (On foot, By bicycle, By motor, By mini-bus/taxi, By bus, By train).
OBSERVATION: the set of counts in one row, for example Pre-school: 619 005, 3 727, 3 020, 56 587, 18 844, 3 169.

Source: Census 2001

Data types and Measurement scales
Variables define data.

Variable: Time spent watching TV during weekdays (hours)
Data:
• 5 hours
• 3 hours
• 5 hours
• 0 hours
• 2 hours
Numerical meaning

Variable: Level of physical fitness (Very high, High, Low, Very low)
Data:
• Low
• Low
• Very low
• Very low
• High
Without numerical meaning

Difference?
Average time spent watching TV on weekdays?
Average level of physical fitness?
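
The two questions can be tried out in a small Python sketch using the five values listed for each variable. The variable names are illustrative assumptions. An average can be computed for the TV hours, whereas for the fitness levels the sketch falls back on counting how often each category occurs.

```python
from collections import Counter
from statistics import mean

tv_hours = [5, 3, 5, 0, 2]                                  # numerical meaning
fitness = ["Low", "Low", "Very low", "Very low", "High"]    # without numerical meaning

# An average is meaningful for data with numerical meaning.
average_tv_hours = mean(tv_hours)
print(average_tv_hours)   # 3

# Categories cannot be added or divided, so count each level instead.
fitness_counts = Counter(fitness)
print(fitness_counts["Low"], fitness_counts["Very low"], fitness_counts["High"])   # 2 2 1
```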

Data types and Measurement scales

Variable: The weekday on which you have the most leisure time
Data:
• Tuesday
• Tuesday
• Tuesday
• Friday
• Thursday
Without numerical meaning

Variable: The number of push-ups done in one minute
Data:
• 10
• 8
• 4
• 5
• 3
Numerical meaning

Difference?
Average weekday on which students have the most leisure time?
Average number of push-ups done in one minute?

Classification of data: Quantitative and Qualitative Data
DATA
• Numerical meaning, can do calculations: Quantitative data
• Without numerical meaning, cannot do calculations: Qualitative/Categorical data

Classification of data
Variables define data.

Variable: Time spent watching TV during weekdays (hours)
Data:
• 5,5 or 5½ hours
• 3,25 or 3¼ hours
• 5 hours
• 0,5 or ½ hour
• 2,75 or 2¾ hours
Decimals or fractions

Variable: Number of days in a week with at least 30 min. of leisure time
Data:
• 2
• 2
• 3
• 3
• 3
Integers or whole numbers

Difference?
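
A minimal Python sketch of this difference, assuming the mixed fractions are read as 5½, 3¼, ½ and 2¾ hours. The helper `all_whole_numbers` is a hypothetical name introduced here. It only inspects the recorded values, so it is a hint rather than a test: a variable that can take fractional values may still be recorded as whole numbers (such as 5 hours).

```python
from fractions import Fraction

# Time spent watching TV (hours): values may fall anywhere between whole numbers.
tv_hours = [Fraction(11, 2), Fraction(13, 4), Fraction(5), Fraction(1, 2), Fraction(11, 4)]

# Days per week with at least 30 minutes of leisure time: counts only.
leisure_days = [2, 2, 3, 3, 3]

def all_whole_numbers(values):
    """Return True if every recorded value is a whole number."""
    return all(Fraction(v).denominator == 1 for v in values)

print([float(h) for h in tv_hours])      # [5.5, 3.25, 5.0, 0.5, 2.75]
print(all_whole_numbers(tv_hours))       # False
print(all_whole_numbers(leisure_days))   # True
```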

Classification of data: Scales of measurement
DATA: Quantitative data

Data type:
• Continuous data: fractions or decimals (in theory, between any two values another value exists). Variables such as distance, height, weight, time.
• Discrete data: integers or whole numbers (the interval between values is expressed in terms of fixed values). Variables such as number of push-ups, goals scored in a soccer game.

Measurement scale: ratio scale or interval scale

Classification of data: Scales of measurement

Variable: Preference for spending time at friends (rate from 1 to 5, with 1 = not at all and 5 = very much)
Data:
• 5
• 5
• 3
• 3
• 4
Rank of data is meaningful

Variable: Venues where you spend most of your leisure time (At home, At shopping malls, At friends)
Data:
• At friends
• At friends
• At friends
• At home
• At friends
Data is classified into categories

Difference?
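
One way to see the difference is sketched below in Python with the values listed above; the variable names are illustrative assumptions. Because the ratings can be ranked, a middle value (median) can be read off. The venues have no order, so the sketch only counts which category occurs most often.

```python
from collections import Counter
from statistics import median

ratings = [5, 5, 3, 3, 4]   # rank is meaningful
venues = ["At friends", "At friends", "At friends", "At home", "At friends"]   # categories only

# Ranked data can be sorted, so the middle value is meaningful.
middle_rating = median(ratings)
print(middle_rating)   # 4

# Unordered categories can only be counted.
most_common_venue, count = Counter(venues).most_common(1)[0]
print(most_common_venue, count)   # At friends 4
```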

Classification of data: Scales of measurement
DATA
• Quantitative data
  - Fractions or decimals: Continuous data
  - Integers or whole numbers: Discrete data
  - Measurement scale: ratio scale or interval scale
• Qualitative data
  - Classified into categories: Nominal data
  - Rank is meaningful: Ordinal data

Note: A ratio scale requires a true zero, a value that means none of the quantity is present. E.g. 0% for a Maths test = no marks.
Note: On an interval scale, zero does not mean that nothing is present. E.g. 0°C does not mean there is no temperature; it means it is very cold.

Nominal data examples:
• Soft drinks: Pepsi, Coke, Fanta, Sprite
• Sport: Cricket, Rugby, Tennis, Swimming

Ordinal data examples:
• Rating: Excellent, Average, Poor
• Sizes of an item: Small, Medium, Large
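
The difference between an interval scale and a ratio scale can be checked with a little arithmetic, sketched here in Python. The temperatures (10°C and 20°C) and test marks (40% and 80%) are hypothetical values chosen for illustration, and `celsius_to_kelvin` is a helper name introduced here.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvin (absolute zero is 0 K)."""
    return celsius + 273.15

# Interval scale: 0 °C is an arbitrary reference point,
# so dividing two Celsius values gives a misleading ratio.
naive_ratio = 20 / 10
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)
print(naive_ratio)              # 2.0
print(round(kelvin_ratio, 3))   # 1.035

# Ratio scale: 0% means no marks, so 80% really is twice 40%.
marks_ratio = 80 / 40
print(marks_ratio)              # 2.0
```

On the kelvin scale, which has a true zero, 20°C turns out to be only about 3.5% above 10°C, not twice as hot.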

Cross-sectional vs. time series data
Cross-sectional data: data collected at the same or approximately the same point in time.
For example: Table 2 (means of transport used by people attending an educational institution by type of institution, Census 2001), in which every count refers to the same year, 2001.
Time series data: data collected over several time periods.
For example: [Graph: South African rand per 1 US dollar, Dec 2014 – Dec 2015]

Data and Statistics: Difference?
Statistics makes sense of numbers. Statistics gives:
• more information
• more organised information
• more understandable information

“The baby weighs 2.5 kg”: 2.5 kg = data
“The average weight of a newborn baby is 2.5 kg”: 2.5 kg = a statistic
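
The distinction can be sketched in Python. The three birth weights below are hypothetical values invented for this illustration, not measurements from the lecture: each single weight is data, and the average computed from all of them is a statistic.

```python
from statistics import mean

# Hypothetical birth weights in kg, invented for this sketch.
birth_weights_kg = [2.0, 2.5, 3.0]

# Data: one recorded fact about one baby.
single_weight = birth_weights_kg[1]
print(single_weight)    # 2.5

# Statistic: a summary calculated from all the data.
average_weight = mean(birth_weights_kg)
print(average_weight)   # 2.5
```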

The way forward
Statistics
Descriptive Statistics: Describe
• Express
• Explain
• Illustrate
• Portray
Inferential Statistics: Infer
• Speculate/Deduce/Reason
• Realise/Gather/Assume