COLLEGE OF BUSINESS MANAGEMENT AND ACCOUNTANCY
MODULE IN STATISTICAL ANALYSIS WITH SOFTWARE APPLICATION
Chapter 2: Collection, Organization & Presentation of Data
LEARNING OBJECTIVES:
At the end of the chapter, the students must be able to:
• Learn some important points in collecting data.
• Learn the different methods of data collection.
• Organize and display data into a frequency distribution.
• Present ways to portray quantitative and qualitative variables graphically.
Collection of Data
Data collection is the process of gathering data for a specific purpose. The
following guidelines can be used to manage the collection of data practically
and effectively.
Important Points in Collecting Data
1. If measurements of some characteristic from people (such as height) are being
obtained, better results will be achieved if the researcher does the measuring instead of
asking the respondent for the value.
2. The method of data collection used may expedite or delay the process. Avoid a medium
that would produce low response rates.
3. Ensure that the sample size is sufficiently large for the required purposes. The size of
the sample will depend on the variability of the population data, the cost of sampling,
and the margin of error.
4. If possible, the sampling method to be used to collect data should result in a sample
that is representative of the population.
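As an illustration of point 3, the sketch below shows how the required sample size grows with the variability of the data and shrinks as the tolerated margin of error widens. It assumes one common textbook formula for estimating a mean, n = (z·σ / E)², which is not stated in this module; the function name, the planning values for σ and E, and the 95% confidence multiplier z = 1.96 are all illustrative assumptions. The cost of sampling is not modeled.

```python
import math

def sample_size_for_mean(sigma, margin_of_error, z=1.96):
    """Smallest whole n with z * sigma / sqrt(n) <= margin_of_error.

    ASSUMPTION: uses n = (z * sigma / E)^2, a common textbook formula
    for estimating a population mean; it is not taken from this module.
    """
    if sigma <= 0 or margin_of_error <= 0:
        raise ValueError("sigma and margin_of_error must be positive")
    return math.ceil((z * sigma / margin_of_error) ** 2)

# Hypothetical planning values: the same margin of error, two levels of variability.
print(sample_size_for_mean(sigma=15, margin_of_error=3))  # less variable population
print(sample_size_for_mean(sigma=30, margin_of_error=3))  # more variable population
```

Doubling the assumed standard deviation roughly quadruples the required sample size, which is why variability has to be considered before data collection begins.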
Methods of Data Collection
1. Direct or Interview Method
The direct or interview method of data collection involves at least two persons (an
interviewer and one or more interviewees) exchanging information. This method tends to
give precise and consistent information because clarifications can be made: when a
question is not fully understood by the respondent, the interviewer can repeat or rephrase
it until it suits the interviewee’s level. However, the method is time-consuming and
expensive, and it has limited field coverage.
2. Indirect or Questionnaire Method
This is a method where written answers are given to prepared questions. The method
requires less time and is inexpensive since the questionnaires can simply be mailed or
hand-carried. Also, anonymity gives respondents a sense of freedom and encourages
honesty in answering the questions. Administering questionnaires online through various
social media platforms is now common.
3. Registration Method
This is a method enforced by certain laws, civil codes, and regulations of each country. It is a
process that requires the registration of vital events such as births, marriages, and deaths.
4. Observation Method
This is a method which involves personal observation of the behavior of individuals or
organizations in the study. It is also used when the respondents cannot read or write,
so that the interview or questionnaire method is not possible.
5. Experiment Method
This method is used when the objective of the study is to determine the cause and
effect of a certain phenomenon or event. It should be made clear that causality can only be
established with the use of the experimental method. Under this method, randomly chosen
subjects are randomly assigned to a particular treatment under a particular
experimental design adopted by the researcher.
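The two random steps in the experiment method (choosing subjects at random, then assigning them to treatments at random) can be sketched as follows. This is a minimal illustration, not a full experimental design: the population of ID numbers, the group labels, the seed, and the function name `select_and_assign` are hypothetical.

```python
import random

def select_and_assign(population, n_subjects, treatments, seed=None):
    """Randomly choose subjects, then randomly assign them to treatments."""
    rng = random.Random(seed)                    # seeded only for reproducibility
    chosen = rng.sample(population, n_subjects)  # random selection without replacement
    order = chosen[:]
    rng.shuffle(order)                           # random order for assignment
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(order):
        # Deal subjects out in turn so group sizes stay as equal as possible.
        groups[treatments[i % len(treatments)]].append(subject)
    return chosen, groups

# Hypothetical population: 100 people identified by ID numbers 1 to 100.
population = list(range(1, 101))
chosen, groups = select_and_assign(population, 12, ["treatment", "control"], seed=42)
for name, members in groups.items():
    print(name, sorted(members))
```

Because chance alone decides who receives which treatment, systematic differences between the groups are unlikely, which is what allows a cause-and-effect interpretation.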
Frequency Distributions
A frequency distribution is a table which summarizes arranged data into various
classes or categories. When the data are organized in this form, analysis and
interpretation of the data can easily be managed.
Parts of a Frequency Distribution Table
1. Table Heading – includes the table number and the title of the table.
2. Body – main part that contains the information or figures.
3. Stubs or classes – classification or categories describing the data and usually found at
the leftmost side of the table.
4. Caption – designations or identifications of the information contained in a column,
usually found at the top of the column.
Remark: In a quantitative frequency distribution table, the classes are class intervals, which
are composed of a Lower Limit (LL) and an Upper Limit (UL).
Example:
Table 2.1:
Frequency Distribution of Staff Perception of the Leadership Behavior of the Administrator
Perception of Leadership Behavior    Frequency
Strongly Favorable                          10
Favorable                                   11
Slightly Favorable                          12
Slightly Unfavorable                        14
Unfavorable                                 22
Strongly Unfavorable                        31
TOTAL                                      100
Types of Frequency Distribution Table (FDT)
1. Qualitative or Categorical FDT – a frequency distribution table where the data are
grouped into non-numerical categories according to some qualitative characteristic.
Example of a Qualitative FDT:
Table 2.2:
Frequency Distribution of the Gender of Respondents of a Survey
Gender of Respondents    Frequency
Male                            65
Female                          98
TOTAL                          163
2. Quantitative FDT – a frequency distribution table where the data are grouped according
to some numerical or quantitative characteristics.
Example of Quantitative FDT:
Table 2.3:
Frequency Distribution for the Weights of 50 Pieces of Luggage
Weight (in kilograms)    Frequency
7-9                              2
10-12                            8
13-15                           14
16-18                           19
19-21                            7
TOTAL                           50
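For a qualitative FDT such as Table 2.2, building the table amounts to counting how often each category occurs. The sketch below shows one way to do this with Python's `collections.Counter`. The raw survey responses are not given in the module, so the `responses` list is a hypothetical stand-in constructed to match the published counts.

```python
from collections import Counter

# ASSUMPTION: hypothetical raw responses, built to match the counts in Table 2.2.
responses = ["Male"] * 65 + ["Female"] * 98

counts = Counter(responses)      # tally each category
total = sum(counts.values())

# Stub column on the left, frequency column on the right, TOTAL row last.
print(f"{'Gender of Respondents':<24}{'Frequency':>9}")
for category in ("Male", "Female"):
    print(f"{category:<24}{counts[category]:>9}")
print(f"{'TOTAL':<24}{total:>9}")
```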
Steps in Constructing a Frequency Distribution Table
1. Determine the Range (R).
R = highest value − lowest value
2. Determine the number of classes (Κ)
Where N is the total number of observations in the data sheet
3. Determine the class size (c) by first calculating the preliminary class size c’.
Preliminary class size: c’ = R / K
Remarks:
a. It should have the same number of decimal places as in the raw data; i.e., if the
observations in the data set are all whole numbers, then c should be a whole
number.
b. The class size of an interval is the difference between the Upper Class Boundary
(UCB) and the Lower Class Boundary (LCB) of that interval.
4. Enumerate the classes or categories.
Remark: We usually take the lowest value as the lowest lower limit.
5. Tally the observations.
6. Compute the values in the other columns of the FDT as deemed necessary.
Note:
Sometimes the computed number of classes (K) is not followed exactly. An extra class is
added to accommodate the highest observed value in the data set, and a class is deleted if
it turns out to be empty.
Other Columns in the FDT
1. Class Boundaries (CB)
a. Lower class boundary (LCB)
LCB = LL – ½ unit of measure
b. Upper class boundary (UCB)
UCB = UL + ½ unit of measure
2. Class Mark (CM) – midpoint of the class interval, about which the observations tend to
cluster: CM = (LL + UL) / 2.
3. Relative Frequency (RF)
If the frequencies in a frequency distribution are expressed as percentages, the result is a
relative frequency distribution. It is obtained by dividing the frequency of each class by
the total frequency and multiplying by 100.
4. Cumulative Frequency Distribution
This indicates the number of observations that fall below or above particular class limits
of the class intervals.
Two Kinds of Cumulative Frequency Distribution
1. Less Than Cumulative Frequency (<CF) – total number of observations whose values
do not exceed the upper limit of the class.
2. Greater Than Cumulative Frequency (>CF) – total number of observations whose
values are not less than the lower limit of the class.
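The other columns can be computed mechanically once the classes and frequencies are known. The sketch below does this for the luggage weights in Table 2.3, taking the unit of measure to be 1 kilogram because the class limits are whole numbers. The variable names and the layout of the printed table are illustrative choices.

```python
# Classes (LL, UL) and frequencies taken from Table 2.3.
classes = [(7, 9, 2), (10, 12, 8), (13, 15, 14), (16, 18, 19), (19, 21, 7)]
unit = 1  # ASSUMPTION: whole-kilogram data, so the unit of measure is 1

total = sum(freq for _, _, freq in classes)
rows = []
running = 0
for lower, upper, freq in classes:
    running += freq
    rows.append({
        "class": f"{lower}-{upper}",
        "f": freq,
        "LCB": lower - unit / 2,         # lower class boundary
        "UCB": upper + unit / 2,         # upper class boundary
        "CM": (lower + upper) / 2,       # class mark (midpoint)
        "RF": 100 * freq / total,        # relative frequency, in percent
        "<CF": running,                  # observations <= this class's upper limit
        ">CF": total - running + freq,   # observations >= this class's lower limit
    })

print(f"{'Class':>6} {'f':>3} {'LCB':>5} {'UCB':>5} {'CM':>5} {'RF%':>5} {'<CF':>4} {'>CF':>4}")
for r in rows:
    print(f"{r['class']:>6} {r['f']:>3} {r['LCB']:>5} {r['UCB']:>5} "
          f"{r['CM']:>5} {r['RF']:>5.1f} {r['<CF']:>4} {r['>CF']:>4}")
```

Note that the <CF of the last class and the >CF of the first class both equal the total number of observations, which is a quick check on the arithmetic.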
Illustrative Example:
Given below are the raw data of the daily wages of 40 workers in Pangasinan.
Raw Data
201  324  649  623  458  322  486  234  650  493
453  129  568  357  145  540  583  349  695  698
124  127  389  405  340  267  653  321  276  295
390  489  680  395  601  212  175  489  203  392
Notice that here the lowest value is 124. Hence, it is assigned as the lowest
lower limit.
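The construction steps can be sketched in code for the 40 daily wages. The resulting table is not reproduced in this text, so the sketch rests on stated assumptions: the number of classes is taken as K = √N rounded to a whole number (one common rule; the module's own formula for K is not shown here), and the class size is R / K rounded to a whole number because the data are whole numbers. The function name `build_fdt` is illustrative.

```python
import math

wages = [201, 324, 649, 623, 458, 322, 486, 234, 650, 493,
         453, 129, 568, 357, 145, 540, 583, 349, 695, 698,
         124, 127, 389, 405, 340, 267, 653, 321, 276, 295,
         390, 489, 680, 395, 601, 212, 175, 489, 203, 392]

def build_fdt(data):
    """Return (classes, frequencies) for whole-number data."""
    n = len(data)
    lowest, highest = min(data), max(data)
    data_range = highest - lowest                # Step 1: R = highest - lowest
    k = round(math.sqrt(n))                      # Step 2: ASSUMED rule K = sqrt(N)
    class_size = max(1, round(data_range / k))   # Step 3: ASSUMED c' = R / K, rounded
    # Step 4: enumerate classes, starting at the lowest value. The loop keeps
    # adding classes until the highest value is covered (an extra class if needed).
    classes = []
    lower = lowest
    while lower <= highest:
        classes.append((lower, lower + class_size - 1))
        lower += class_size
    # Step 5: tally the observations falling in each class.
    freqs = [sum(1 for x in data if lo <= x <= hi) for lo, hi in classes]
    return classes, freqs

classes, freqs = build_fdt(wages)
for (lo, hi), f in zip(classes, freqs):
    print(f"{lo}-{hi}: {f}")
print("TOTAL:", sum(freqs))
```

Under these assumptions the sketch yields six classes of size 96, from 124-219 up to 604-699, with frequencies 8, 4, 11, 6, 4, and 7. A table built with a different rule for K or a different rounding of c would have different classes.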
Graphical Presentations of Frequency Distributions
1. Histogram. The classes are plotted on the horizontal axis and the frequencies on
the vertical axis. The lines that separate the bars intersect the x-axis at the
lower and upper boundaries of the class intervals. The height of each bar corresponds to
the frequency of its class interval. Since the intervals are continuous, the lower
boundary of any one interval is also the upper boundary of the previous interval, so the
vertical bars must touch each other rather than be spaced apart.
Example:
Consider the frequency distribution of the typing speed, in words per minute, of 60
individuals using a word processor, given in the table below.
2. Frequency Polygon. It consists of the line segments connecting the points whose
coordinates are the class midpoints and the class frequencies. The polygon is
closed by considering an additional class at each end, and the ends of the lines are
brought down to the horizontal axis at the midpoints of the additional classes.
Example: Frequency polygon of Table 2.5
3. Ogive. It is a graph where a point is plotted above each class boundary at a height
equal to the cumulative frequency corresponding to that boundary.
Example: Ogive of Table 2.5
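Table 2.5 is not reproduced in this text, so the sketch below uses the luggage weights of Table 2.3 instead to show which points each of the three graphs is built from. It only computes coordinates; drawing them is left to whatever plotting tool is at hand. The unit of measure of 1 kilogram and the variable names are assumptions made for illustration.

```python
# Classes (LL, UL) and frequencies from Table 2.3, used here as a stand-in.
classes = [(7, 9, 2), (10, 12, 8), (13, 15, 14), (16, 18, 19), (19, 21, 7)]
unit = 1                                       # ASSUMPTION: whole-kilogram data
width = classes[0][1] - classes[0][0] + unit   # class size (UCB - LCB)

# Histogram: one bar per class, spanning LCB to UCB, so adjacent bars touch.
bars = [(lo - unit / 2, hi + unit / 2, f) for lo, hi, f in classes]

# Frequency polygon: (class mark, frequency) points, closed with an extra
# zero-frequency class at each end.
marks = [(lo + hi) / 2 for lo, hi, _ in classes]
polygon = ([(marks[0] - width, 0)]
           + [(m, f) for m, (_, _, f) in zip(marks, classes)]
           + [(marks[-1] + width, 0)])

# Less-than ogive: cumulative frequency plotted above each class boundary,
# starting from 0 at the lowest boundary.
ogive = [(bars[0][0], 0)]
running = 0
for _, ucb, f in bars:
    running += f
    ogive.append((ucb, running))

print("bars:   ", bars)
print("polygon:", polygon)
print("ogive:  ", ogive)
```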
4. Stem-and-leaf Plots
A stem-and-leaf plot is another way of displaying data. It gives a quick picture of the
shape of a distribution while including the actual numerical values in the graph.
A stem is the common leading digit/s for a subset of the data set. A leaf is the trailing
digit that follows the stem.
To make a stem-plot:
1. Separate each observation into a stem and a leaf. Stems may have as many digits as
needed, but each leaf contains only a single digit.
2. Write the stems in a vertical column with the smallest at the top, and draw a vertical
line at the right of this column.
3. Write each leaf in the row to the right of its stem, in increasing order out from the stem.
Example:
Consider the following grades of 20 students in a statistics subject:
75  86  82  77  78  91  82  84  93  79
80  83  94  85  86  76  92  88  93  85
In constructing the stem-and-leaf plot, choose the leading digits of the data as the stems;
here these are 7, 8 and 9. Write the final digit, or leaf, of each number to the right of the
appropriate stem. Then, arrange the leaves in ascending order.
Figure 2.1: Stem-and-leaf Plot for the Grades Data
Frequency    Stem  &  Leaf
     5.00       7  -  56789
    10.00       8  -  02234 55668
     5.00       9  -  12334
Stem width:  10.00
Each leaf:   1 case(s)
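The three steps for making a stem-plot can be sketched in a few lines for the grades data. The sketch assumes two-digit whole numbers, so that the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` and the printed layout are illustrative.

```python
from collections import defaultdict

grades = [75, 86, 82, 77, 78, 91, 82, 84, 93, 79,
          80, 83, 94, 85, 86, 76, 92, 88, 93, 85]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting first keeps the leaves in increasing order
        stem, leaf = divmod(v, 10)    # step 1: split into stem and single-digit leaf
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(grades)
for stem in sorted(plot):             # step 2: stems in a column, smallest at the top
    leaves = "".join(str(leaf) for leaf in plot[stem])
    print(f"{len(plot[stem]):>3}  {stem} | {leaves}")   # step 3: leaves to the right
```

The leaves printed for stems 7, 8 and 9 match those in Figure 2.1, and the count at the start of each row reproduces the frequency column.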
Graphing Qualitative Variables
1. Column and Bar Graph. It consists of bars or heavy lines of equal widths, either all
vertical or horizontal; the lengths of bars represent the magnitudes of the quantities
being compared. (for nominal, ordinal and interval data)
Example: The data on the causes of death due to accidents or violence for males during
a recent year are as follows:
Table 2.6:
Causes of Death Due to Accidents or Violence for Males during the Recent Year
Cause of Death              Number
Motor vehicle accident      30,500
All other accident          27,500
Suicide                     20,234
Homicide                     8,342
Example: Bar graph of Table 2.6
2. Pie Chart. It is a circular graph that is useful in showing how a total quantity is
distributed among a group of categories. It is constructed by dividing a circle (a pie) into
sectors, each sector having a size proportional to the percentage it represents. (for
nominal data)
Example: Pie chart of Table 2.6
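Since each sector of a pie chart is proportional to the percentage it represents, the sector sizes for the causes-of-death data can be worked out before any drawing is done. The sketch below computes each category's percentage of the total and the corresponding central angle out of 360 degrees. The variable names are illustrative.

```python
# Counts from the causes-of-death table above.
causes = {
    "Motor vehicle accident": 30500,
    "All other accident": 27500,
    "Suicide": 20234,
    "Homicide": 8342,
}

total = sum(causes.values())

# For each category: (percentage of the total, central angle in degrees).
sectors = {name: (100 * n / total, 360 * n / total) for name, n in causes.items()}

for name, (pct, angle) in sectors.items():
    print(f"{name:<24}{pct:6.1f}% {angle:7.1f} deg")
print(f"{'TOTAL':<24}{100.0:6.1f}% {360.0:7.1f} deg")
```

The same counts, used directly as bar lengths, give the bar graph of this table.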