Uploaded by MARIGOLD JOYCE LACONSE

Chapter I Basic Concept in Statistics
Introduction:
Statistics affects many facets of our lives. In everyday life, whether at home or at work, we
usually keep records and read reports. An item in a record or report is a fact that is expressed in
terms of a numerical value or described by its quality or kind. A single item or fact is referred
to as a datum. The color of leaves, the number of students in a class, height and width, and the
number of bacterial colonies are all examples of data. How to deal with data is the major concern
of statistics.
Objectives:
1. Define biostatistics and identify its importance.
2. Explain the methods of collecting statistical data and variables.
3. Discuss different sampling techniques
Lesson 1.1. Biostatistics and its Importance
Overview:
For most people the word “statistics” is a scary thing that must be avoided as much as possible.
They think of statistics as a collection of numbers and formulas that have vague meanings. Actually,
without noticing it, people often apply statistics in their everyday life. When a clinician records
the result of a physical examination of a patient, the clinician is collecting data to aid in
diagnosing the patient’s illness and in determining the appropriate medical treatment to be
prescribed.
Objectives:
The students should be able to:
1. Define statistics and biostatistics
2. Discuss the inductive and deductive reasoning in medical diagnoses.
3. Explain the scientific method employed in medical research.
Content:
Statistics is a science that deals with the collection, organization, analysis, interpretation and
presentation of information that can be stated numerically.
Major areas of Statistics:
1. Descriptive Statistics- this includes anything done to the data that is designed to
summarize or describe them without going any further; that is, without attempting to infer
anything that goes beyond the data themselves.
2. Statistical Inference- comprises the methods concerned with the analysis of a subset of
data leading to predictions or inferences about the entire set of data. This analysis requires
generalizations that go beyond the data.
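The contrast between the two areas can be sketched in a few lines of Python. This is only an illustration: the rainfall values are hypothetical, and the constant 2.145 is assumed to be the two-sided 95% t critical value for 14 degrees of freedom. The mean and standard deviation describe the data at hand, while the interval is an inference about the long-run mean that goes beyond the data.

```python
import math
import statistics

# Hypothetical July rainfall totals (cm) for 15 years; illustrative values only.
rainfall_cm = [2.8, 3.1, 3.4, 2.9, 3.0, 3.3, 2.7, 3.2,
               3.0, 2.9, 3.1, 3.5, 2.6, 3.0, 2.5]

# Descriptive statistics: summarize the recorded data, nothing more.
mean_rainfall = statistics.mean(rainfall_cm)
sd_rainfall = statistics.stdev(rainfall_cm)

# Statistical inference: use the sample to say something about the long-run mean.
# Assumption: 2.145 is the two-sided 95% t critical value for 14 degrees of freedom.
T_CRIT_95_DF14 = 2.145
margin = T_CRIT_95_DF14 * sd_rainfall / math.sqrt(len(rainfall_cm))
interval = (mean_rainfall - margin, mean_rainfall + margin)

print("Descriptive: mean =", round(mean_rainfall, 2), "cm")
print("Inferential: 95% interval for the long-run mean =",
      tuple(round(x, 2) for x in interval))
```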
Biostatistics is statistics applied to the biological sciences.
Perhaps the most difficult aspect of statistics is the logic associated with inductive inference, yet all scientific
evidence is based on this type of statistical inference. The same logic is used, though not always
explicitly, when a physician practices medicine: what is observed for a particular patient is combined with
what has been observed for a large group of patients to make a specific decision about that particular patient.
When taking a clinical history, conducting a physical examination, or requesting laboratory analyses,
radiographic evaluations or other tests, a physician is collecting information (data) to help choose diagnostic
and therapeutic actions. The decisions reached are based on knowledge obtained from training, from
literature, from experience, or from similar sources.
General principles are applied to the specific situation at hand in order to reach the best decision possible
for a particular patient. This type of reasoning, from the general to the specific, is called deductive reasoning.
Much of basic medical training centers around deductive reasoning.
We conduct experiments and comparative studies to focus on questions that arise from our work. We
study a few patients (or experimental animals), and from what we observe we try to make rational
inferences about what happens in general. This type of reasoning, from the specific subject(s) at hand to
the general, is called inductive reasoning. This approach to medical research, which pushes
back the bounds of knowledge concerning human health, follows what is known as the Scientific
Method, which has four basic steps.
1. Making observations, i.e., gathering data.
2. Generating a hypothesis: the underlying law and order suggested by the data.
3. Deciding how to test the hypothesis: what critical data are required?
4. Experimenting (or observing): this leads to an inference that either rejects or affirms the
hypothesis. If the hypothesis is rejected, then we go back to step 2.
If it is affirmed, this does not necessarily mean it is true, only that in the light of current
knowledge and methods it appears to be so. The hypothesis is constantly refined and tested as
more knowledge becomes available.
All data collected from biological systems have variability. The statistician is concerned with summarizing
trends in data and drawing conclusions in spite of the uncertainty created by variability in the data. An
understanding of statistics will enhance your ability to interpret data, whether for the purpose of
treating a particular patient or for drawing general conclusions from a research study, as well as enable
you to distinguish fact from fancy in everyday life.
Summary:
Biostatistics deals with the collection, organization, presentation, analysis and interpretation of
biological information that can be stated numerically.
Activity:
A. Suppose that a set of measurements representing the total rainfall in the province of Sultan Kudarat
during the month of July has been recorded for the past 15 years. The statements below give values
describing the data.
Write “descriptive” or “inferential” statistics for each of the following statements based on the data above.
1. The average rainfall over the 15 years is 3.0 cm.
2. For 15 years, the month of July has had rain.
3. Next July we expect rain.
4. This July 2021 we expect between 3.2 and 3.4 cm of rain.
B. Decide what reasoning must be employed in each situation below in order to give a diagnosis and
treatment. Discuss why.
a. Stroke patient
b. Yellowing of leaves of your potted plant.
c. Swelling of gums and painful tooth.
C. Differentiate the following:
1. Statistics and Biostatistics
2. Deductive and Inductive reasoning
3. Descriptive and Inferential statistics
Study 1.2. Statistical data and Variables
Overview:
The basic unit of statistical analysis is data. There are generally two types of data and there is no
formula for selecting the best method to be used in gathering data. It depends on the researcher’s
design of the study, the type of data, the time available to complete the study, and the financial
capacity.
Objectives:
1. Identify the types and kinds of data.
2. Explain the methods in collecting data.
3. Discuss the types of variables.
4. Determine the scales of measurement.
Content:
Classification of Data
1. Quantitative Data- data that can be expressed in numbers. These are the things that can be
measured, like weight, length, number of colonies, mortality rate, etc.
2. Qualitative Data- facts for which no numerical measure exists. They are usually expressed in
categories or kinds. Examples are the color of the skin, which could be black, brown or white; a
person’s sex, which is male or female; the presence or absence of metallic sheen in a
colony of bacteria; and others.
In order to assure the accuracy of data, one must know the right sources and methods of collecting
them.
Types of data according to sources
1. Primary Data- refers to information that is gathered directly from an original source, or
that is based on direct, first-hand experience.
2. Secondary Data- refers to information that is taken from published data previously
gathered by other individuals or agencies, or data that comes from sources
other than the respondents.
Methods of Collecting Data.
1. Interview Method- a person-to-person exchange between the interviewer and the interviewee.
2. Questionnaire Method- written responses are given to prepared questions. A questionnaire is a
list of questions which are intended to elicit answers to the problem of a study. Questionnaires
may be mailed, sent online or hand-carried.
3. Registration Method- a method of gathering information that is enforced by certain laws. Examples are
the registration of births, deaths, motor vehicles, marriages and licenses.
4. Observation Method- the investigator observes the behavior of persons or organisms and their
outcomes. This is usually used when the subjects cannot talk or write.
5. Experimental Method- this method is used when the objective is to determine the cause-and-
effect relationship of certain phenomena under controlled conditions. Scientific researchers
usually use the experimental method.
Collected data must be organized in order to show their significant characteristics. They can be presented
in three forms:
1. Textual – data is presented in paragraph form.
2. Tabular – data is presented in rows and columns.
3. Graphical – data is presented in visual form.
Kinds of graphs
a. Bar graph
b. Pie graph
c. Line graph
A variable is a characteristic or attribute, numerical or categorical, associated with the population being studied.
Types of Variables
1. Categorical or qualitative variables are classified according to some attributes or categories.
Ex. Gender, religion, blood type, civil status…
Categories may be ordered, and may or may not be assigned specific numerical values, such as:
Performance rating (poor, fair, good, very good, excellent), IQ score (low, average, high)
2. Numerical-valued or quantitative variables are variables that are classified according to
numerical characteristics such as height, age, pulse rate, number of children, speed.
Numerical-valued variables are often grouped into class intervals.
Ex. Age in years- 5-9, 10-14, 15-19, and 20 & above.
Height in cm- 100-149, 150-199, 200-249
Numerical-valued variables are classified as:
1. Discrete – is a variable whose values are obtained by counting.
Ex. Number of children, number of persons with blue eyes, number of patients with T.B.,
number of males and females in a Statistics class.
2. Continuous – is a variable whose values are obtained by measuring, such as temperature,
distance, area, density, age, height. The values of such variables cannot be put into a list because they can take
any value in some interval of real numbers.
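Grouping a numerical-valued variable into class intervals, as in the age example above, can be sketched as follows. The ages are hypothetical, and the helper name `age_class` is introduced only for illustration. Lower class boundaries are used so that a measured (continuous) age such as 9.5 years still falls into a class.

```python
from collections import Counter

# Hypothetical ages in years; age is continuous even when recorded in whole years.
ages = [5, 7, 9, 10, 12, 14, 15, 16, 19, 20, 23, 31]

def age_class(age):
    """Return the class interval label for an age, using the intervals from the text."""
    if age < 5:
        return "below 5"  # not covered by the intervals in the text
    if age < 10:
        return "5-9"
    if age < 15:
        return "10-14"
    if age < 20:
        return "15-19"
    return "20 & above"

# Counting how many ages fall in each class gives a discrete variable (a frequency).
frequency = Counter(age_class(a) for a in ages)
for label in ["5-9", "10-14", "15-19", "20 & above"]:
    print(label, frequency[label])
```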
Scales of Measurement
In selecting the statistical tool to be used for drawing inferences from a random sample, the type of
measurement scale must be carefully identified. Measurement scales are classified into four types.
1. Nominal Scale - is a measurement scale that classifies elements into two or more categories or
classes, the numbers indicating that the elements are different but not according to order or magnitude.
Ex.
Table 1. Distribution of Medical Students of the University of the Philippines Grouped According to Race
and Civil Status
Race        Single   Married   Widow/er   Separated   Total
American      10        5         0           1         16
Chinese       29        8         5          10         52
Japanese      18       11         1           3         33
Filipino      32        3         4          20         59
Total         89       27        10          34        160
The medical students are classified according to race and civil status.
2. Ordinal Scale - is a measurement scale that ranks individuals in terms of the degree to which they
possess a characteristic of interest.
Ex.
Table 2. Anxiety Level of Patients with Mental Disorder in Hospital Q.
Sex        0     1     2     3    Total
Male       9    16     2     1     28
Female    21    10     4     7     42
Total     30    26     6     8     70
Legend:
0 = not anxious
1 = low anxiety level
2 = moderate anxiety level
3 = high anxiety level
3. Interval Scale – is a measurement scale that, in addition to ordering scores from high to low,
also establishes a uniform unit in the scale so that any equal distance between two
scores is of equal magnitude. The difference between aptitude scores of 80 and 90 is equal to the
difference between aptitude scores of 90 and 100 (both being equal to 10).
4. Ratio Scale – is a measurement scale that, in addition to being an interval scale, also has an
absolute zero in the scale.
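The four scales differ in which summaries are meaningful, and a short Python sketch may make this concrete. The counts come from the column totals of Table 1 and Table 2 and the aptitude scores from the interval-scale example; the weights used for the ratio scale are an assumed illustration that does not appear in the text.

```python
import statistics

# Nominal (Table 1 column totals): categories can only be counted, so the mode is meaningful.
civil_status_totals = {"Single": 89, "Married": 27, "Widow/er": 10, "Separated": 34}
modal_status = max(civil_status_totals, key=civil_status_totals.get)

# Ordinal (Table 2 column totals): codes 0-3 are ranked, so a median level is meaningful.
anxiety_totals = {0: 30, 1: 26, 2: 6, 3: 8}
all_codes = [code for code, count in anxiety_totals.items() for _ in range(count)]
median_level = statistics.median_low(all_codes)

# Interval: equal differences are meaningful (aptitude scores from the text).
diff_low = 90 - 80
diff_high = 100 - 90

# Ratio: an absolute zero makes ratios meaningful.
# Assumption: these weights in kg are illustrative and not taken from the text.
weight_ratio = 60.0 / 30.0

print("Most common civil status:", modal_status)
print("Median anxiety level code:", median_level)
print("Equal aptitude differences:", diff_low == diff_high)
print("A 60 kg subject is", weight_ratio, "times as heavy as a 30 kg subject")
```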
Summary:
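SCALE of Measurement
Nominal – each number represents a category
Ordinal – each number represents a category, and greater than and less than relationships
Interval – each number represents a category, greater than and less than relationships, and units of measurement
Ratio – each number represents a category, greater than and less than relationships, units of measurement, and absolute zero
Application:
A. Evaluate the data below. Write qualitative or quantitative.
1. 25 ft.
2. Medium size
3. 30%
4. 6 meters
5. 4 colonies
6. Male
7. Absent
8. 100 seeds
9. Blue eyes
10. 500 acres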
B. Write primary or secondary data.
1. Number of public vehicles in the city of Tacurong
2. Enrollees of SKSU from 2010 to 2020
3. Information from a diary
4. Data from the Daily Inquirer
5. TB patients of St. Louise Hospital from Jan. to June 2020
6. Information from a police investigator
7. Information from the victim of an accident
8. Response from your respondent
9. Data from the State of the Nation Address of the Philippines
10. Data from a research journal
C. Write D if discrete and C if continuous.
1. Number of foreigners migrating to the Philippines
2. Length of hair
3. Boiling point of water, 100°C
4. John’s height is 160 cm
5. Number of children in Brgy Sebu with a missing tooth
6. Average speed of UB express along national highways
7. Number of online students present in a Zoom meeting
8. Number of leaves affected by leaf rot
9. Leaf width and length
10. Number of vaccinated Filipinos
D. Write the advantages and disadvantages of each method of collecting data.
Study 1.3 Sampling Techniques
Overview:
Analysis of data in research work requires that the size of the population be determined and
specified if possible, so that the required sample size can easily be calculated based on sampling
techniques and research designs. If the population is small, it is sometimes convenient to obtain the
information by collecting the data for the whole of the population (total enumeration). However, if the
population is large, time and money can be saved by measuring only a sample drawn from the
population. When the measurement is destructive, sampling is of course unavoidable, since measuring
every unit would destroy the whole population.
Objectives:
At the end of the lesson, you should be able to:
1. compute the sample size;
2. enumerate the different sampling methods;
3. identify the use of different sampling methods in data collection.
Content:
Population – is the group of all study units about which a particular investigation may provide
information. Population size is denoted by “N”.
Target population – is the whole group of study units to which we are interested in applying our
conclusions.
Study population – is the group of study units to which we can legitimately apply our conclusions.
Sample – a subset or a representative part of the population; hence, the sample must possess the same
characteristics as the population. Sample size is denoted by “n”.
Sampling
[Figure: diagram relating the Population, the Sample, and Inference]
Types of Sampling:
1. Non-Probability or Judgment Sampling – Sampling is based on a judgment selection of “typical” or
representative elements of the population under study, considering arbitrarily set criteria.
1.1. Purposive Sampling – a sample is drawn from the population where what constitutes the
representative elements or sample is already a preconceived idea.
1.2. Quota Sampling – a sample is drawn for convenience and on the basis of a quota.
1.3. Haphazard Sampling – sampling is done haphazardly.
1.4. Volunteer Sampling – sampling which involves volunteers.
1.5. Convenience or Accidental Sampling – sampling where the elements of the sample are those
that are readily accessible to the sampler.
2. Probability Sampling – Sampling in which a definite set of rules and procedures for drawing the
sample is followed. It allows one to evaluate the probability of each element being part of
the sample, even prior to drawing the actual sample. Probability samples are suitable for
statistical analysis and scientific research.
2.1. Simple Random Sampling – the sample is drawn from the whole population,
without replacement and with equal probability of selection for every possible sample.
Methods of simple random sampling are:
a. The box method
b. Use of a table of random numbers
c. Use of a computer software package that generates random numbers
2.2. Systematic Sampling – a method of sampling wherein a sample is drawn by taking, say, every
kth unit in the population starting from the ith unit drawn at random. This is used when
there is a ready list of the total population. It is the most practical way of sampling.
2.3. Stratified Sampling – a sampling procedure wherein the population is divided into non-
overlapping strata. Each stratum is homogeneous, and a random sample is drawn
independently from each stratum. This scheme is used so that different groups of a
population are adequately represented in the sample.
2.4. Cluster Sampling – the total population is divided into a number of relatively small
subdivisions, and some of these subdivisions or clusters are randomly selected for inclusion in
the overall sample.
2.5. Multi-stage Sampling – the technique uses several stages or phases in getting the sample from
the general population. However, selection of the sample is still done at random. It is useful
in conducting a nationwide survey involving a large universe.
Determination of Sample Size (n)
Important criteria in determining the sample size (n):
1. Variability of the population (N)
2. Error that will be tolerated or accepted. This is the desired precision.
3. Degree of confidence desired for the estimate of the parameter. That is, one needs to
specify the confidence coefficient, (1 - α) x 100%, desired.
4. Resources available to obtain the data and the time duration to produce output.
5. Safety or risk of the enumerators.
Sampling is advisable if the population is equal to or more than 100, but it is inapplicable to a
population of less than 100. Total enumeration or census is advisable for a population of less than 100 for
categorization purposes. For a scientific determination of sample size, the formula below was
suggested by Calmorin and Calmorin (1997):
\[ \mathit{Ss} = \frac{NV + \mathit{Se}^{2}(1-p)}{N\,\mathit{Se} + V^{2}\,p(1-p)} \]
Where:
Ss = Sample size
N = Total number of population
V = The standard value (2.58) at the 1 percent level of probability with 0.99 reliability
Se = Sampling error (0.01)
p = The largest possible proportion (0.50)
For instance, if the total population is 500, the standard value at the 1% level of probability is 2.58 with 99%
reliability, the sampling error is 1% or 0.01, and the proportion of the target population is 50% or 0.50,
then the sample size is computed as follows:
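Given:
N = 500
V = 2.58
Se = 0.01
p = 0.50
\[ \mathit{Ss} = \frac{NV + \mathit{Se}^{2}(1-p)}{N\,\mathit{Se} + V^{2}\,p(1-p)} \]
\[ \mathit{Ss} = \frac{500(2.58) + (0.01)^{2}(1-0.50)}{500(0.01) + (2.58)^{2}(0.50)(1-0.50)} \]
\[ \mathit{Ss} = \frac{1290 + (0.0001)(0.50)}{5 + (6.6564)(0.50)(0.50)} \]
\[ \mathit{Ss} = \frac{1290.00005}{6.6641} \approx 193.57 \text{ or } 194 \]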
For a population of 500, the sample size is 194, which represents the subjects of the study.
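The computation above can be checked with a short Python sketch of the formula. The function name `sample_size` and its default arguments are introduced here only for illustration. Rounding the result up to the next whole subject is an assumption, since the text only reports "193.57 or 194".

```python
import math

def sample_size(N, V=2.58, Se=0.01, p=0.50):
    """Sample size formula attributed in the text to Calmorin and Calmorin (1997).

    N  = total number of population
    V  = standard value at the chosen level of probability
    Se = sampling error
    p  = largest possible proportion
    """
    numerator = N * V + Se ** 2 * (1 - p)
    denominator = N * Se + V ** 2 * p * (1 - p)
    return numerator / denominator

raw = sample_size(500)
print(round(raw, 2))   # 193.57
print(math.ceil(raw))  # 194 (assumption: round up to a whole subject)
```

The same function can be reused for other population sizes by changing the value of N.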
Summary:
In gathering statistical information for data analysis, the researcher:
1. must first identify the subject of the study.
2. delimit or determine the scope and coverage of the subject of the study.
3. determine the population and sample size.
4. determine the sampling methods or techniques to be utilized.
5. prepare the necessary data gathering instruments for purposes of investigation.
There are two types of samples: the probability sample and the non-probability sample.
Activity:
Choose the best answer among the choices.
1. The best random sampling design, because every individual in the population has an equal chance of
inclusion in the sample, is
a. Stratified random sampling
b. Simple random sampling
c. Restricted random sampling
2. The sampling design in which all individuals in the population are arranged in a methodical manner and
every nth name may be chosen in the construction of the sample is
a. Systematic sampling
b. Stratified random sampling
c. Unrestricted random sampling
3. The sampling design based on selecting the individuals as samples according to the criteria of the
researcher which serve as controls is
a. Quota sampling
b. Incidental sampling
c. Purposive sampling
d. Cluster sampling
4. The sampling design which is intended to improve the validity of the sample and is applicable when
the population being studied is homogeneous is
a. Cluster sampling
b. Simple random sampling
c. Stratified sampling
5. A population of 900 has a sample size of
a. 218
b. 217
c. 219
d. 220
6. Sampling is inapplicable to a population of
a. 100
b. 110
c. 99
7. Which of the following does not belong to the group?
a. Quota
b. Incidental sampling
c. Cluster Sampling
8. The sample size for a population of 750 is
a. 210
b. 211
c. 208
9. A population of 2000 has a sample size of
a. 236
b. 238
c. 232
10. Sampling design in which the population is grouped into small units such as blocks or districts is
a. Purposive sampling
b. Quota sampling
c. Cluster sampling
11. Which of the following does not belong to the group?
a. Purposive sampling
b. Multi-stage sampling
c. Cluster sampling
12. Sampling design in which the researcher simply takes the closest individuals as subjects of the study
because they are most available is
a. Quota sampling
b. Purposive sampling
c. Cluster sampling
13. A population of 300 has a sample size of
a. 181
b. 166
c. 165
14. The sampling design which is popular in the field of opinion research is
a. Incidental sampling
b. Cluster sampling
c. Quota sampling
15. The sample size for a population of 550 is
a. 196
b. 194
c. 192
II. Compute the sample size for each of the following populations.
1. 230
2. 340
3. 570
4. 890
5. 2,300