Branches of Statistics, Data Types, and Graphs

SECTION 7.1: BRANCHES OF STATISTICS, DATA TYPES, AND GRAPHS
STATISTICS DEFINED
Statistics is the branch of mathematics concerned with data collection (obtaining numerical
measurements), data tabulation/presentation (organizing data into tables, graphs, or
charts), data analysis (extracting relevant information from the given data), and data
interpretation (drawing conclusions from the analyzed data) (Pagoso et al., 1992).
Because data is the central component in this definition, statistics is often described succinctly as
the science of data.
BRANCHES OF STATISTICS
Statistics is divided into two branches: descriptive statistics and inferential statistics. Procedures focused on
collecting and describing a set of data to obtain relevant information are concerns of descriptive statistics.
These procedures apply only to the group (whether sample or population) from which the data has been
collected. On the other hand, procedures concerned with the analysis of the data from the sample in order to
make predictions or inferences about the population are the concerns of inferential statistics. These procedures
may be about making generalizations from samples to populations, doing estimations and hypothesis tests,
finding relationships among variables, and making predictions. Note that a population, as used in
statistics, is the totality of the group under study, while a sample is a subset of that group.
Example 1: Describing the enrollment in a university in terms of the percentages per level (freshman,
sophomore, junior, and senior) is a concern of descriptive statistics.
Example 2: Testing the hypothesis that male and female students differ significantly in performance on a
mathematics test is a concern of inferential statistics.
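As a minimal sketch of the descriptive summary in Example 1, the Python snippet below turns enrollment counts into percentages per level. The counts and the function name `percentages` are hypothetical, chosen only for illustration; they are not taken from the source.

```python
def percentages(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {level: 100 * count / total for level, count in counts.items()}


# Hypothetical enrollment counts per level (not from the source).
enrollment = {"freshman": 400, "sophomore": 300, "junior": 200, "senior": 100}

for level, share in percentages(enrollment).items():
    print(f"{level}: {share:.1f}%")
```

The summary describes only the group that was actually counted, which is what makes it descriptive rather than inferential.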
DATA AND THEIR TYPES
Data are pieces of information that serve as the basic component of any
statistical investigation. They are obtained whenever measurements are made or observations are recorded.
TYPES OF DATA
Data may be classified according to the type of variable being collected. Quantitative data are data obtained on
quantitative variables (variables that can be measured numerically). Qualitative data, on the other hand, are those
collected on qualitative variables (variables that cannot assume a numerical value but can be divided into two or
more non-numeric categories).
Examples of quantitative data: number of siblings, speed of a car, blood pressure reading, etc.
Examples of qualitative data: eye color, year level, socioeconomic status, etc.
Data may also be classified according to their measurement characteristics. Numbers serve
the following functions: to classify and to compare values, whether by ranking, taking differences, or forming
quotients. Nominal data are data where numbers can be assigned to categories only as labels; they cannot be ranked,
and no mathematical computation on them is meaningful. Ordinal data are data where numbers can be assigned to
categories and these numbers can be compared by ranking. Interval data are data where the numbers
can be subtracted and the resulting differences can be compared. Ratio data are data obtained from
measurements with a unique origin (a true zero), so quotients as well as differences are meaningful.
Examples:
Gender is nominal because if the number 1 is assigned to male and 2 to female, you cannot compare
the numbers. You cannot say that 1 < 2. You cannot perform subtraction either. You cannot say that 2 – 1 = 1.
Grade levels in basic education are ordinal. You can now compare the grade levels. The statement 1
< 2 is now meaningful: grade level 1 is lower than grade level 2. But just as with nominal data,
arithmetic on the labels is not meaningful. The statement “5 – 3 = 2” has no interpretation.
Temperature readings are interval data. There is no unique origin for temperature readings on the Celsius
scale, but you can compare differences. If, on Monday, the highest temperature is 36°C and the lowest is
30°C, the difference in temperature is 6°C. If, on Tuesday, the highest temperature is 37°C and the lowest is
29°C, the difference in temperature is 8°C. These differences can now be compared: the difference in
temperature on Monday is smaller than the difference in temperature on Tuesday.
Height of a person is ratio data. The measuring instrument has a unique origin (zero height), so
comparing by ranking, forming differences, and forming quotients are all possible.
TYPES OF GRAPHS
Bar – a graph made of bars, with heights representing the frequencies (or
percentages) of respective categories
Example: The graph shows the frequency of students from different levels in a
university with 1300 students.
Pie – a circle divided into slices/portions representing the percentages of a
population or a sample that belong to different categories
Example: The graph shows the percentages of students from different levels in
a community college with 650 students.
Line – a graph which shows the relationship between two or more sets of
quantities using lines
Example of a line graph: The graph below shows the enrollment for schools A
and B from 2017 to 2020.
Pictograph – a graph in which picture symbols are used to represent
values
Example of a pictograph: The graph below shows the number of months
that four students obtained a score of not lower than 80% from a spelling
test.
SECTION 7.2: MEASURES OF CENTRAL LOCATION AND MEASURES OF POSITION
When you want to describe a given set of numerical data, one useful description is a number that
represents its central value. There are three ways of describing the location of that central value: mean,
median, and mode.
Aside from describing the location of the central value of a given data set, you may also be
interested in locating the position of a certain value relative to the other values. In that
case, you have four possible measures: median, quartile, decile, and percentile.
MEASURES OF CENTRAL LOCATION/CENTRAL TENDENCY
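The three measures named above can be sketched with Python's standard `statistics` module. The snippet assumes the usual textbook definitions (arithmetic mean, middle value of the sorted data, most frequent value), since the source's own definitions are not reproduced here. The data are the ten test scores from the fractile example later in this section.

```python
import statistics

# Test scores taken from the fractile example in this section.
scores = [20, 16, 18, 30, 10, 12, 18, 13, 25, 28]

mean = statistics.mean(scores)      # sum of the values divided by their count
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequently occurring value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

With an even number of values, `statistics.median` averages the two middle values, which matches the usual classroom convention.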
COMPARING THE DIFFERENT MEASURES
Table 11 below shows the comparison of the
different measures in terms of characteristics,
stability of the measure, subsequent manipulations,
number of values, type of data, and effect of
extreme values.
Table 11: Comparison of the Different Measures
MEASURE OF POSITION/FRACTILES: DEFINITIONS AND INTERPRETATIONS
The definitions and interpretations of the different measures of position are given in the boxes below.
PROCEDURE IN COMPUTING THE FRACTILES
Different authors suggest different ways of computing the fractiles. The procedure adopted here is the one used
by Bluman (2013) and Freund & Simon (1997).
Let p be a fraction between 0 and 1, and let n be the number of values, arranged in increasing order.
Compute pn.
○ If pn is not an integer, use the next higher integer for the pth fractile position.
○ If pn is an integer, use the mean of the values in positions pn and (pn + 1) as the pth fractile.
Example:
Given the following scores obtained by students in a
statistics test:
20 16 18 30 10 12 18 13 25 28
a) Compute the three quartiles.
b) Compute the 7th decile.
c) Compute the 35th percentile.
ARRANGE: 10 12 13 16 18 18 20 25 28 30
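The procedure above can be sketched in Python and applied to the arranged scores. This is an illustrative sketch, not the source's worked solution: the function name `fractile` and its parameters are assumptions, and p is passed as the pair k/parts (for example 7/10 for the 7th decile) so that pn can be checked for being an integer with exact integer arithmetic.

```python
def fractile(data, k, parts):
    """Return the (k/parts) fractile of data using the pn rule.

    k and parts are integers with 0 < k < parts, so p = k / parts.
    """
    if not 0 < k < parts:
        raise ValueError("k must satisfy 0 < k < parts")
    values = sorted(data)
    n = len(values)
    # pn = k * n / parts; divmod keeps the integer check exact.
    whole, remainder = divmod(k * n, parts)
    if remainder:
        # pn is not an integer: take the next higher position (1-based whole + 1).
        return values[whole]
    # pn is an integer: average the values in positions pn and pn + 1 (1-based).
    return (values[whole - 1] + values[whole]) / 2


scores = [20, 16, 18, 30, 10, 12, 18, 13, 25, 28]

quartiles = [fractile(scores, k, 4) for k in (1, 2, 3)]
seventh_decile = fractile(scores, 7, 10)
percentile_35 = fractile(scores, 35, 100)

print("Quartiles:", quartiles)
print("7th decile:", seventh_decile)
print("35th percentile:", percentile_35)
```

Under this procedure the sketch gives Q1 = 13, Q2 = 18, Q3 = 25, a 7th decile of 22.5, and a 35th percentile of 16 for the scores above. Other fractile conventions, including those built into statistical software, can give slightly different values.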
SECTION 7.3: MEASURES OF VARIABILITY
Knowing the location of the central value of a data set and the position of a certain value relative to the
other values does not give a full description of the data set. Knowing how spread out
the scores are from each other is another way of describing the set. This is known as the measure of spread,
dispersion, or variability. Another important measure compares the variations of
two or more groups, which enables you to determine which of these groups is the most variable.
The measure used for this is known as the coefficient of variation.
MEASURES OF VARIABILITY
1. Range. This is a measure of spread obtained by subtracting the smallest value from the largest value in a
data set.
Example: The data below are the scores of 8 students in a Statistics examination.
43 46 41 39 36 48 41 28
Find the range: Range = 48 – 28 = 20
2. Quartile Deviation (QD). This is a measure of spread equal to one-half the range of the middle 50% of the
cases or observations. (This is also called the semi-interquartile range.)
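As a hedged sketch, the range and quartile deviation of the eight examination scores above can be computed as follows. The quartiles are found with the pn rule from Section 7.2; the helper names `fractile`, `value_range`, and `quartile_deviation` are illustrative assumptions, and a different quartile convention would change the QD.

```python
def fractile(data, k, parts):
    """(k/parts) fractile using the pn rule from Section 7.2 (0 < k < parts)."""
    values = sorted(data)
    whole, remainder = divmod(k * len(values), parts)
    if remainder:
        # pn is not an integer: next higher position.
        return values[whole]
    # pn is an integer: mean of positions pn and pn + 1 (1-based).
    return (values[whole - 1] + values[whole]) / 2


def value_range(data):
    """Largest value minus smallest value."""
    return max(data) - min(data)


def quartile_deviation(data):
    """One-half the distance between the third and first quartiles."""
    q1 = fractile(data, 1, 4)
    q3 = fractile(data, 3, 4)
    return (q3 - q1) / 2


scores = [43, 46, 41, 39, 36, 48, 41, 28]

print("Range:", value_range(scores))
print("Quartile deviation:", quartile_deviation(scores))
```

For these scores the sketch reproduces the range of 20 and gives Q1 = 37.5 and Q3 = 44.5, so QD = 3.5.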
MEASURE OF RELATIVE VARIATION: COEFFICIENT OF VARIATION (CV)
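A minimal sketch of the coefficient of variation follows, assuming the common definition CV = (standard deviation / mean) × 100%. Using the sample standard deviation is an assumption, since the formula itself does not appear in this text. Group A reuses the examination scores from the range example; group B is hypothetical.

```python
import statistics


def coefficient_of_variation(data):
    """CV in percent, assuming sample standard deviation over the mean."""
    return 100 * statistics.stdev(data) / statistics.mean(data)


group_a = [43, 46, 41, 39, 36, 48, 41, 28]  # scores from the range example
group_b = [60, 62, 58, 61, 59]              # hypothetical second group

cv_a = coefficient_of_variation(group_a)
cv_b = coefficient_of_variation(group_b)

print(f"CV of group A: {cv_a:.2f}%")
print(f"CV of group B: {cv_b:.2f}%")
print("More variable group:", "A" if cv_a > cv_b else "B")
```

Because the CV is expressed relative to the mean, it allows groups with different means or units to be compared.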
SECTION 7.4: FUNDAMENTAL PRINCIPLE OF COUNTING AND PROBABILITY
Knowledge of the fundamental principle of counting is needed in understanding probability and inferential
statistics. These topics will be discussed in this section.
Fundamental Principle of Counting or Multiplication Principle
If one event can occur in m ways and a second one
can occur in n ways, then the number of ways both
can occur is (m)(n).
Note: The principle also holds true for more than
two events.
Example 1:
A room has 4 doors. In how many ways can an
individual make a trip into this room and out again if
he must enter and leave only by means of the
doors?
Solution: There are two events: entrance and exit.
There are 4 doors available for entrance and the same 4 doors for exit.
n = (4)(4) = 16
Answer: There are 16 ways.
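The count in Example 1 can be checked by listing every (entrance, exit) pair. This is only an illustrative sketch; the door labels are made up for the enumeration.

```python
from itertools import product

# Hypothetical labels for the 4 doors.
doors = ["door 1", "door 2", "door 3", "door 4"]

# Every trip is an (entrance, exit) pair; the same door may be used twice.
trips = list(product(doors, repeat=2))

# The multiplication principle predicts (m)(n) = (4)(4) trips.
assert len(trips) == len(doors) * len(doors)
print("Number of ways:", len(trips))
```

Enumeration becomes impractical for large counts, which is why the multiplication principle is used instead of listing.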
PROBABILITY
Probability is a branch of mathematics that deals
with measuring or determining quantitatively the
likelihood that an event or experiment will have a
particular outcome.
DEFINITIONS
○ An experiment refers to some situation of interest whose outcome is determined by chance.
○ A sample space S is the set of all possible outcomes of an experiment. Each element in a sample space is called an outcome or sample point.
○ An event is any subset of a sample space.
Example 1:
A bowl has 4 orange marbles, 5 green marbles, and 6 yellow marbles. One marble is then picked at random.
Determine the probability that it is green.
Solution:
Let
S = the sample space, i.e., the set of all marbles that can be picked from the bowl
n(S) = 15
E = the event of selecting a green marble
n(E) = 5
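The setup above appears to be leading to the classical probability P(E) = n(E)/n(S), which applies when every marble is equally likely to be picked. That formula is an assumption here, because the remainder of the solution is not shown. A small sketch with exact fractions (the function name `probability` is illustrative):

```python
from fractions import Fraction

# Contents of the bowl from the example.
bowl = {"orange": 4, "green": 5, "yellow": 6}


def probability(color, contents):
    """Classical probability n(E)/n(S), assuming equally likely marbles."""
    n_s = sum(contents.values())  # n(S): total number of marbles
    n_e = contents[color]         # n(E): marbles of the chosen color
    return Fraction(n_e, n_s)


p_green = probability("green", bowl)
print("P(green) =", p_green)
```

Under that assumption, P(E) = 5/15 = 1/3.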