Uploaded by leanne decrepito

Intro to biostatistics

BIOSTATISTICS AND EPIDEMIOLOGY
3rd term / Prelims / Lecture 1 – Intro to Biostatistics
STATISTICS
Statistics
→ the science of conducting studies to collect,
organize, summarize, analyze, and draw
conclusions from data.
→ In the Information Age, data is everywhere,
making statistics a very valuable tool for
making sense of it.
Biostatistics
→ is the development and application of statistical
concepts and techniques to biological sciences.
→ Basically, it is statistics applied to biology, as
used in the field of medical laboratory science.
STATISTICAL TERMS
Variable
→ Characteristic or attribute that
can assume different values.
o Ex. Age, Sex
Random Variable
→ A variable whose values are determined by
chance or are yet to be determined.
→ It may still assume different values.
o Ex. Age – yet to be determined.
Data
→ Values that a variable can assume.
Data set
→ A collection of data values.
Data Value or Datum
→ Each individual value in a data set.
2 major branches of statistics
(1) Descriptive Statistics
→ The collection, organization, summarization,
and presentation of data.
→ It describes a situation or simply describes
the data.
o Ex. Census, no. of family members,
and age of family members.
(2) Inferential Statistics
→ It goes beyond describing the data and draws
conclusions from it.
Ex.
a) Generalization from samples to populations.
→ the concept of probability is used.
▪ Probability – the study of the chance of an
event occurring.
▪ Population (N) – consists of all subjects
that are being studied.
▪ Sample (n) – a group of subjects selected
from a population.
b) Performing estimations and hypothesis tests.
▪ Hypothesis testing – a decision-making
process for evaluating claims about the
population (deciding whether to reject or
retain a claim).
c) Determining relationships among variables.
d) Making predictions.
VARIABLES AND TYPES OF DATA
There are two (2) classifications of variables:
(1) Quantitative Variable
→ Numerical and can be ordered or ranked.
a. Discrete Variables – characterized by gaps in the
values they can assume. (Can be counted in whole numbers)
b. Continuous Variables – have no gaps between the
values they can assume. (Can have decimals; obtained by
measurement)
c. Dichotomous – can only assume two values.
(ex. male or female/yes or no)
(2) Qualitative Variable
→ Variables that can be placed into distinct
categories, according to some characteristic
or attribute.
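As a rough illustration of the classifications above, the sketch below guesses a variable's type from how one value is recorded. The rule (text means qualitative, whole numbers mean discrete, decimals mean continuous) and the names `classify_value`, `is_dichotomous`, and `record` are assumptions made for this example; a count stored with a decimal point would be misclassified.

```python
def classify_value(value):
    """Guess the variable type from one recorded value (heuristic only)."""
    if isinstance(value, (bool, str)):
        return "qualitative"
    if isinstance(value, int):
        return "quantitative (discrete)"      # counted in whole numbers
    if isinstance(value, float):
        return "quantitative (continuous)"    # measured, may have decimals
    raise TypeError(f"unsupported value: {value!r}")


def is_dichotomous(values):
    """A variable is dichotomous if it takes exactly two distinct values."""
    return len(set(values)) == 2


# Hypothetical record for one subject
record = {"sex": "female", "family_members": 4, "mass_g": 1.6}
types = {name: classify_value(value) for name, value in record.items()}

answers = ["yes", "no", "no", "yes"]
two_valued = is_dichotomous(answers)
```

Here `types` maps each variable name to its guessed classification, and `two_valued` is true because the answers take only two distinct values.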
RECORDED VALUES AND BOUNDARIES
Variable       Recorded Value    Boundaries
Length         15 cm             14.5 – 15.5 cm
Temperature    86 °F             85.5 – 86.5 °F
Time           0.43 sec          0.425 – 0.435 sec
Mass           1.6 g             1.55 – 1.65 g
• Since continuous variables must be measured,
answers must be rounded off because of the limits
of the measuring device.
• Creating boundaries:
o the boundaries should have 1 more decimal place
than the recorded value.
o they always end in 5.
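The boundary rule can be sketched in Python as follows. This is a minimal sketch, assuming the recorded value is given as text so that its number of decimal places is preserved; the function name `boundaries` is made up for this example.

```python
from decimal import Decimal


def boundaries(recorded):
    """Return (lower, upper) boundaries for a recorded value given as text.

    The boundaries sit half a unit of the last recorded digit below and
    above the value, so they carry one more decimal place and end in 5.
    """
    value = Decimal(recorded)
    last_place = value.as_tuple().exponent   # 0 for "15", -2 for "0.43"
    half_unit = Decimal(5).scaleb(last_place - 1)
    return value - half_unit, value + half_unit


length = boundaries("15")      # 14.5 and 15.5
time = boundaries("0.43")      # 0.425 and 0.435
mass = boundaries("1.6")       # 1.55 and 1.65
```

`Decimal` is used instead of `float` so that a value such as 0.43 keeps exactly two decimal places and the results match the table.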
MEASUREMENT SCALES
(1) Nominal Level
→ classifies data into categories that are:
o mutually exclusive (nonoverlapping)
o exhaustive (all possible responses are
captured, so every subject can be
classified)
o unordered (no order or ranking can be
imposed on the data)
→ it is categorical in nature.
→ it names observations.
(2) Ordinal Level
→ Classifies data into categories that can be
ranked; however, precise differences
between the ranks do not exist.
(3) Interval Level (quantitative)
→ Ranks data, and precise differences
between units of measure do exist;
however, there is no meaningful zero.
(4) Ratio Level
→ Possesses all the characteristics of interval
measurement, and there exists a true zero.
→ True ratios exist when the same variable is
measured on two different members of the
population.
DATA COLLECTION
• The sample should be representative of the whole
population.
Sources of data:
- Routinely kept records
- Surveys
- Experiments
- External records
SAMPLING TECHNIQUES
Random
→ Subjects are selected by chance
or random numbers.
Systematic
→ Subjects are selected by using
every kth number after the first
subject is randomly selected
from 1 through k.
Stratified
→ Subjects are selected by dividing
up the population into groups by
characteristics (strata), and
subjects are randomly selected
within groups.
Cluster
→ Population is divided into groups
called clusters by some means
like geographic area; some clusters
are then randomly selected, and all
subjects in them are used.
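The four sampling techniques can be sketched with Python's standard `random` module. This is an illustrative sketch, not a prescribed procedure: the function names, the made-up population of 20 subjects, and the fixed seed are all assumptions. For cluster sampling, the code follows the usual textbook step of randomly choosing whole clusters and keeping every subject in them.

```python
import random
from collections import defaultdict


def group_by(population, key):
    """Split the population into groups according to key(subject)."""
    groups = defaultdict(list)
    for subject in population:
        groups[key(subject)].append(subject)
    return groups


def simple_random_sample(population, n, rng):
    """Random: n subjects chosen by chance."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Systematic: random start among the first k, then every kth subject."""
    start = rng.randrange(k)   # index 0..k-1, i.e. subject 1 through k
    return population[start::k]


def stratified_sample(population, key, n_per_stratum, rng):
    """Stratified: random selection within each stratum."""
    sample = []
    for members in group_by(population, key).values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


def cluster_sample(population, key, n_clusters, rng):
    """Cluster: randomly choose whole clusters and keep all their subjects."""
    groups = group_by(population, key)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [subject for name in chosen for subject in groups[name]]


rng = random.Random(0)   # fixed seed so the sketch is repeatable

# Hypothetical population: 20 subjects tagged with sex and area
population = [
    {"id": i, "sex": "F" if i % 2 else "M", "area": f"area-{i % 4}"}
    for i in range(1, 21)
]

random_sample = simple_random_sample(population, 5, rng)
systematic = systematic_sample(population, 4, rng)
stratified = stratified_sample(population, lambda s: s["sex"], 3, rng)
clustered = cluster_sample(population, lambda s: s["area"], 2, rng)
```

Note the difference between the last two: stratified sampling takes a few subjects from every group, while cluster sampling takes every subject from a few groups.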
STATISTICAL STUDIES
Observational study
→ The researcher merely observes without
intervening and tries to draw conclusions.
→ advantage: it can be done in situations where
it would be unethical or downright dangerous to
conduct an experiment, because the researcher
does not manipulate the subjects.
Experimental study
→ The researcher manipulates one of the variables
(has control over the variable) and tries to
determine how the manipulation influences
other variables.
→ disadvantages:
o it may occur in an unnatural setting.
o Hawthorne effect – subjects who know they
are part of an experiment may change their
behavior.
VARIABLES IN STATISTICAL STUDIES
Independent variable (Explanatory variable)
→ Manipulated by the researcher.
Dependent variable (Outcome variable)
→ Dependent on the independent variable
→ Resultant variable that is heavily affected by the
independent variable.
Confounding variable
→ It influences the dependent or outcome variable
but was not separated from the independent
variable.
˃ Ex. you found out that men who have
lighters in their pockets have higher
chances of having lung cancer, but the
cause is the cigarette smoke, not the lighter.
Smoking is the confounding variable:
men who smoke tend to carry lighters,
so the two were not separated.
→ The role of the researcher is to separate the
effect of the confounding variable from that of
the independent variable.
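The lighter example can be checked with a small table of counts. The numbers below are invented purely for illustration and are not real data; the names `counts` and `cancer_rate` are also assumptions. Comparing men with and without lighters overall shows a large gap, but comparing them separately among smokers and among nonsmokers shows none.

```python
# Hypothetical counts (NOT real data):
# (smoker, has_lighter) -> (number of men, number with lung cancer)
counts = {
    (True, True): (900, 180),
    (True, False): (100, 20),
    (False, True): (100, 2),
    (False, False): (900, 18),
}


def cancer_rate(has_lighter, smoker=None):
    """Lung cancer rate for men with/without a lighter.

    If smoker is given, only that group (smokers or nonsmokers) is used.
    """
    men = cases = 0
    for (is_smoker, lighter), (n, c) in counts.items():
        if lighter == has_lighter and smoker in (None, is_smoker):
            men += n
            cases += c
    return cases / men


# Overall: lighters appear to be linked to lung cancer
overall_gap = cancer_rate(True) - cancer_rate(False)

# Within each smoking group the gap disappears
gap_smokers = cancer_rate(True, smoker=True) - cancer_rate(False, smoker=True)
gap_nonsmokers = cancer_rate(True, smoker=False) - cancer_rate(False, smoker=False)
```

With these made-up counts, the overall rates are 18.2% with a lighter against 3.8% without, while inside each smoking group the rates are identical. Separating the groups this way is one simple form of the separation the researcher is asked to perform.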
MISUSES OF STATISTICS
o Suspect samples
o Ambiguous averages
o Changing the subject
o Detached statistics
o Implied connections
o Misleading graphs
o Faulty survey questions