Ungrouped Data - WordPress.com

Objective: to be able to determine which of the three measures (mean, median and mode) to apply to a given set of data for a given purpose.





Introduction
Definition of Measures of Central Tendency
Mean
  Arithmetic Mean
    • Ungrouped data
    • Grouped data
Median
    • Ungrouped data
    • Grouped data
    • Graphical Method
Mode
    • Ungrouped data
    • Grouped data

Measures of central tendency are single values that
are typical and representative of a group of numbers.

They are also called measures of location.

A representative value of location for a group of
numbers is neither the biggest nor the smallest, but a
number whose value is somewhere in the middle of
the group.

Such measures are often used to summarize a data
set and to compare one data set with another.

The mean is the average value of a set of data.

It is appropriate for describing measurement
data, e.g. heights of people, marks of
student papers.

It is often influenced by extreme values.
• For ungrouped data:

Sample mean:
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum x}{n}$$

Population mean:
$$\mu = \frac{\sum x}{N}$$

where
x = value of the x variable,
n = sample size,
N = population size,
$\sum$ = "the summation of".
• We seldom use the population mean, µ,
because the population is usually very large and it would be
troublesome to gather all the values. Instead, we
calculate the sample mean and use it to
estimate µ.
Example:
Find the average age of five students whose ages
are 18, 19, 19, 20 and 22 respectively.
Solution
$$\bar{x} = \frac{\sum x}{n} = \frac{98}{5} = 19.6 \text{ years old}$$
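The calculation can be checked with a few lines of Python. This is only a sketch: it assumes the five ages are 18, 19, 19, 20 and 22 (the values consistent with the stated total of 98), and the variable names are illustrative.

```python
# Ages assumed for the example: five values whose total is 98.
ages = [18, 19, 19, 20, 22]

total = sum(ages)        # sum of x
n = len(ages)            # sample size
sample_mean = total / n  # x-bar = (sum of x) / n

print(f"sum = {total}, n = {n}, mean = {sample_mean}")
```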
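• For grouped data:

1) Basic Method
$$\bar{x} = \frac{\sum fx}{n}$$
where f is the frequency of a class, x is the midpoint of that class, and n is the total frequency.

For example:

The distances travelled by 100 workers of XYZ
Company from their homes to the workplace are
summarized below. Find the mean
distance travelled by a worker.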
Distance (km)       No. of workers
0 and under 2       12
2 and under 4       35
4 and under 6       24
6 and under 8       18
8 and under 10      11
Total               100
Kilometers travelled   No. of workers (f)   Midpoint (x)   Total distance travelled (fx)
0 and under 2          12                   1              12
2 and under 4          35                   3              105
4 and under 6          24                   5              120
6 and under 8          18                   7              126
8 and under 10         11                   9              99
Total                  100                                 462

Mean = 462/100
= 4.62 km
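The grouped-mean calculation can be sketched in Python as follows. The representation of each class as a `(lower, upper, frequency)` tuple and the function name `grouped_mean` are assumptions made for illustration. Each class is represented by its midpoint, as in the table.

```python
# Each class is (lower bound, upper bound, frequency), taken from the distance table.
classes = [(0, 2, 12), (2, 4, 35), (4, 6, 24), (6, 8, 18), (8, 10, 11)]

def grouped_mean(classes):
    """Mean of grouped data: sum of f * midpoint, divided by the total frequency."""
    n = sum(f for _, _, f in classes)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / n

mean_distance = grouped_mean(classes)
print(f"mean distance = {mean_distance:.2f} km")
```

Because every value in a class is replaced by the class midpoint, the result is an estimate of the mean of the underlying individual distances, not an exact figure.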

Advantages of the mean

It is easy to calculate and understand.

It makes use of all the data points and can
be determined with mathematical exactness.

The mean is useful for performing statistical
procedures like comparing the means
between data sets.

Disadvantages of the mean

It can be significantly influenced by
extreme or abnormal values.

It may not be a value which corresponds to a
single item in the data set.

Every item in the data set is taken into
consideration when computing the mean. As a
result, it can be very tedious to compute when we
have a very large data set.

It is not possible for us to compute the mean with
open-ended classes.

The weighted mean enables us to calculate an
average which takes into account the relative
importance of each value to the overall total.

Example
A lecturer in XYZ Polytechnic has decided to
use weighted average in awarding final marks
for his students. Class participation will account
for 10% of the student’s grade, mid term test
15%, project 20%, quiz 5% and final exam 50%.


From the information given, compute the
final average for Zaraa, who is one of the
students.
Zaraa’s marks are as follows:

class participation   90
quiz                  80
project               75
mid term test         70
final examination     85
Solution

Subjects         Marks (x)   Weight (w)   wx
Participation    90          10           900
Quiz             80          5            400
Project          75          20           1500
Mid term test    70          15           1050
Final exam       85          50           4250
Total                        100          8100

$$\bar{x} = \frac{\sum wx}{\sum w} = \frac{8100}{100} = 81$$
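The weighted mean of Zaraa's marks can be reproduced with a short Python sketch. The dictionary keys and the function name `weighted_mean` are illustrative choices, not part of the original notes.

```python
# Marks (x) and weights (w, in percent) for each assessment component.
marks = {"participation": 90, "quiz": 80, "project": 75,
         "mid term test": 70, "final exam": 85}
weights = {"participation": 10, "quiz": 5, "project": 20,
           "mid term test": 15, "final exam": 50}

def weighted_mean(values, weights):
    """Weighted mean: sum of w * x divided by the sum of w."""
    weighted_sum = sum(weights[k] * values[k] for k in values)
    total_weight = sum(weights[k] for k in values)
    return weighted_sum / total_weight

final_mark = weighted_mean(marks, weights)
print(f"final average = {final_mark}")
```

Dividing by the sum of the weights means the weights do not have to add up to 100. Any set of relative weights gives the same kind of average.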

The median of a set of values is defined
as the value of the middle item when the
values are arranged in ascending or
descending order of magnitude.

If the data set has an odd number of
observations, the middle item will be the
required median after the data has been
arranged in either ascending or descending
order.

If the data array has an even number of
observations, we take the average of the
two middle items as the required median, as
shown in the example that follows.
Example:
a) Odd number of values
data set: 2, 1, 5, 2, 10, 6, 8
array: 1, 2, 2, 5, 6, 8, 10
median = value of the fourth item = 5
b) Even number of values
data set: 9, 6, 2, 5, 18, 4, 12, 10
array: 2, 4, 5, 6, 9, 10, 12, 18
median = (6 + 9) / 2
= 7.5
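The odd and even cases can be sketched in one small Python function. The function name `median` is illustrative, and Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Median of ungrouped data: middle item, or average of the two middle items."""
    ordered = sorted(values)  # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: the single middle item
        return ordered[mid]
    # Even number of observations: average of the two middle items
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_median = median([2, 1, 5, 2, 10, 6, 8])
even_median = median([9, 6, 2, 5, 18, 4, 12, 10])
print(odd_median, even_median)
```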
Example

Kilometers travelled   No. of workers   Cumulative frequency
0 and under 2          12               12
2 and under 4          35               47
4 and under 6          24               71
6 and under 8          18               89
8 and under 10         11               100
Total                  100

Middle term = n/2 = 100/2 = 50th term

We identify the median class by examining the cumulative frequencies. The 50th term lies in the class "4 and under 6", because that is the first class whose cumulative frequency (71) reaches 50.
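The search for the median class can be sketched in Python. The second function goes one step beyond what the notes show: it estimates the median by linear interpolation inside the median class, a commonly used method that assumes values are spread evenly within the class. The function names and the `(lower, upper, frequency)` layout are assumptions for illustration.

```python
from itertools import accumulate

# Each class is (lower bound, upper bound, frequency), from the distance table.
classes = [(0, 2, 12), (2, 4, 35), (4, 6, 24), (6, 8, 18), (8, 10, 11)]

def median_class(classes):
    """Return (index of the median class, list of cumulative frequencies)."""
    cum = list(accumulate(f for _, _, f in classes))
    half = cum[-1] / 2  # position of the middle term, n/2
    for i, cf in enumerate(cum):
        if cf >= half:  # first class whose cumulative frequency reaches n/2
            return i, cum
    raise ValueError("frequency table has no observations")

def interpolated_median(classes):
    """Assumed method (not shown in the notes): linear interpolation in the median class."""
    i, cum = median_class(classes)
    lo, hi, f = classes[i]
    before = cum[i - 1] if i > 0 else 0  # cumulative frequency below the class
    half = cum[-1] / 2
    return lo + (half - before) / f * (hi - lo)

index, cum = median_class(classes)
print(classes[index], cum, interpolated_median(classes))
```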

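We can also find the median using a graph. We
first plot the “less than” and “more than”
ogives (cumulative frequency curves) on the same axes.
The two ogives intersect at a cumulative frequency of n/2,
and the median is the value on the horizontal axis at
that intersection point.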
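[Figure: “less than” and “more than” ogives. Vertical axis: cumulative frequency (cf), with the intersection marked at cf = 50. Horizontal axis: distance (km), marked at 2, 4, 6, 8 and 10.]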
Example:

Distance (km)     No. of workers   Less than cf   More than cf
0 and under 2     12               12             88
2 and under 4     35               47             53
4 and under 6     24               71             29
6 and under 8     18               89             11
8 and under 10    11               100            0
Total             100
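The two cumulative-frequency columns used for the ogives can be generated from the class frequencies. This is a minimal sketch with illustrative variable names. Here "less than" is measured at each class's upper boundary and "more than" counts the observations at or above that same boundary.

```python
from itertools import accumulate

# Class frequencies for 0-2, 2-4, 4-6, 6-8 and 8-10 km.
freqs = [12, 35, 24, 18, 11]

less_than = list(accumulate(freqs))        # running total up to each upper boundary
n = less_than[-1]                          # total number of workers
more_than = [n - cf for cf in less_than]   # workers at or above each upper boundary

for f, lt, mt in zip(freqs, less_than, more_than):
    print(f, lt, mt)
```

At every boundary the two columns add up to n, which is why the ogives cross where both equal n/2.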

Advantages of the median

It is easy to compute and simple to understand.

It is not affected by extreme values.

It can be computed for distributions with open-ended classes.

It can deal with qualitative data that can be ranked.

Disadvantages of the median

It can be time-consuming to compute, as we have
to arrange the data in order first.

If we have only a few values, the median is not
likely to be representative.

It is usually less reliable than the mean for
statistical inference purposes. It is not suitable for
arithmetical calculations, and has limited use for
practical work.

The mode is the most frequent or repeated value.

In the case of a continuous variable, it is
possible that no two values will repeat
themselves. In such a situation, the mode is
defined as the point of highest frequency
density, i.e., where occurrences cluster
most closely together.

Like the median, the mode has very limited practical
use and cannot be subjected to arithmetical
manipulation.

However, being the value that occurs most often, it
provides a good representation of the data set.

The mode can be obtained simply by
inspection.
Example
• 1, 4, 10, 8, 10, 12, 13: Mode = 10
• 1, 3, 3, 7, 8, 8, 9: Mode = 3 and 8
• 1, 2, 3, 4, 9, 10, 11: No mode
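Finding the mode by inspection amounts to counting how often each value occurs. The sketch below uses an illustrative helper, `modes`, which follows the convention used in these examples that a data set in which no value repeats has no mode.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or an empty list if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 4, 10, 8, 10, 12, 13]))
print(modes([1, 3, 3, 7, 8, 8, 9]))
print(modes([1, 2, 3, 4, 9, 10, 11]))
```

Python's `statistics.multimode` uses a different convention for the last case: it returns every value when none repeats.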

In the case of grouped data, finding the mode
may not be as easy. Since a grouped frequency
distribution does not show individual values, it is
impossible to determine the value
which occurs most frequently.

Instead, we take the class with the highest frequency
as the modal class, even though we may
not know whether or not any data value occurs
more than once.
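As an illustration, the modal class can be picked out of a frequency table in one line of Python. The notes give no grouped example for the mode, so this sketch reuses the distance table from the mean and median examples. The `(lower, upper, frequency)` layout is an assumption, and comparing raw frequencies is only fair when all classes have the same width, as they do here.

```python
# Each class is (lower bound, upper bound, frequency), reused from the distance table.
classes = [(0, 2, 12), (2, 4, 35), (4, 6, 24), (6, 8, 18), (8, 10, 11)]

# The modal class is the class with the highest frequency.
modal_class = max(classes, key=lambda c: c[2])

lo, hi, f = modal_class
print(f"modal class: {lo} and under {hi} km ({f} workers)")
```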
Advantages of the mode

It can be used with open-ended classes.

It is not affected by extreme values.

It is also applicable to qualitative data.

Disadvantages of the mode

It is not clearly defined. Some data sets may have no mode.

It is difficult to interpret and compare if a data set has
more than one mode.
References :
Lecture & Tutorial Notes from
Department of Business &
Management, Institute Technology
Brunei, Brunei Darussalam.