Introduction - Department of Mathematics and Statistics

620.152 Introduction to Biomedical Statistics
620-152
Introduction to Biomedical Statistics
Ray Watson
room 104 [Mon, Wed & Fri 1.15–2.15]
email: rayw@ms.unimelb.edu.au
Pre-requisite: VCE Math Methods
(and 620-151)
“Good, Watson! You always keep us flat-footed on the ground.”
Sherlock Holmes, The Adventure of the Creeping Man, 1927.
Course Notes
There are several recommended text-books (see below), but the course
notes, the tutorial problems and the computer prac-notes should be sufficient. These are all available from the course web-site:
Course web-site
Notes: Chapters 0 to 11; Summary Notes; Statistical Tables
Problems: Problem Sets 1 to 12 (Sets 8 and 12 are marked *); Assignment; Revision Exercises; Last Year’s Exam
Answers: Answers 1 to 12; Asst Answers; RE Answers; LYE Answers
Computer Labs: Computer Labs 1 to 11
Answers: CL Answers 1 to 11
Reference books:
If you find the course notes are not quite right for you (and even if
you do), there is a range of similar texts (mostly with Biostatistics and
Introduction in the title) which may suit you better, including:
Pagano M & Gauvreau K “Principles of Biostatistics” (Duxbury)
Rosner B “Fundamentals of Biostatistics” (Thomson)
Devore J & Peck R “Statistics, the Exploration and Analysis of Data” (Duxbury)
Utts JM & Heckard RF “Statistical Ideas and Methods” (Duxbury)
Lectures:
Monday 9.00 Laby Theatre
Wednesday 9.00 Laby Theatre
Friday 9.00 Laby Theatre
Lectures: The lecture notes will appear on the web-site — mostly the
weekend before, or possibly earlier.
Tutorials & Computer Labs start in the second week.
Tutorials: set problems, homeworks and general clarification of the stuff
I haven’t explained properly.
Computer Labs: One hour per week. Using MINITAB.
MINITAB will be essential for some of the homework questions and some of
the assignment questions, and will also be examined.
Computer packages:
MINITAB is the standard statistical package available in the Computer
Labs; it’s easy to use and will do all the statistical things you’ll need.
You will be expected to handle MINITAB and in particular to interpret
MINITAB output.
[A student version is available for ∼A$200; 5 months rental costs ∼$50.]
EXCEL is readily available and will get a lot of the Statistics done, even if
it is a bit DIY; there is a Statistics add-on, which is clunky and minimally
useful. It is often useful for data and for simple graphs . . . but not pie
charts!
WORD is useful for presentation of reports. At least one of the questions
on the assignment must be presented as a report.
Assessment:
end of semester exam 80%
weekly homeworks 10%
assignment 5%
prac-tests 5%
Exam:
Standard three hour format.
Questions like those in the problem sets and computer labs.
But . . . No calculators: simple and approximate arithmetic required.
Homework:
Each week a problem sheet will be handed out. This will contain a
number of homework problems to be submitted for assessment and a
number of problems for discussion in the tutorial.
Assignment:
There will be an assignment, handed out in week 8, with questions that
are a bit longer and more involved than those in the weekly homeworks.
Prac tests:
There will be five short and simple tests in the computer practical classes
(roughly every second week) to ensure that you can use MINITAB to do
some basic statistical analysis.
Statistical tables “Statistical Tables for Students”
are available from the web-site.
This set of tables will be available for use in the exam.
Summary notes
are available from the web-site.
These summary notes will be available for use in the exam.
Note: if there are any additional formulae that you want included, just ask.
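[Diagram: Probability leads from the population (model) to the sample (observations, data, description); Statistics leads back from the sample to the population. Labels attached to the diagram: types of studies; rates; life tables.]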
Course contents
0. Introduction
1. Exploratory data analysis
2. Studies and Design
3. Probability and applications
4. Probability distributions
5. Sampling and sampling distributions
6. Estimation: point and interval estimations
7. Hypothesis testing
8. Inference on proportions
9. Comparative inference
10. Correlation and regression
11. Life tables and standardisation
There are eleven “chapters” and there are twelve weeks in the semester. The
chapters correspond roughly (but only roughly) to weeks.
0.1 What is statistics?
0.2 Population and sample
0.3 Variability
References: Pagano & Gauvreau, Chapter 1
What is Statistics?
Statistics is the science of collecting, analysing and drawing conclusions from
data.
Statistics is the study of variability and uncertainty.
The science of collecting, organizing, and analyzing data.
A branch of applied mathematics concerned with the collection and interpretation of quantitative data and the use of probability theory to estimate
population parameters.
A branch of mathematics that deals with the analysis and interpretation of
numerical data in terms of samples and populations.
The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference
from sampling.
Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. It is applicable to
a wide variety of academic disciplines, from the physical and social sciences to the humanities.
Biostatistics or biometry is the application of statistics to a wide range of topics
in biology. It has particular applications to medicine and to agriculture.
Epidemiology is the science devoted to the statistical study of categories of
persons and the patterns of diseases from which they suffer, with the aim
of determining the events or circumstances causing these diseases.
Epidemiology is the use of medical science and statistics to track population
health and to find causes of disease in groups of people.
Statistics provides the tools scientists use to analyse their data, and
principles on how best to design their experiments to collect data. In
evidence-based medicine, the treatments and procedures advocated must
be supported by hard evidence: data that come from well-designed
experiments (which ensure valid and efficient outcomes) and that are
analysed by appropriate statistical methods.
Why might you study statistics?
• because it’s interesting, useful, enjoyable!
• to conduct research (in any field); to read and understand research
papers in your discipline area;
• to apply basic statistical methods in your course (project, lab work,
honours thesis); and more immediately, to pass this course.
Population and sample
Perhaps the most fundamental concepts in statistics are:
• Population — the totality of units under study, which may be
(and often is) hypothetical;
• Sample — the observed units, i.e. the units on which we have
information (measurements).
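[Diagram: Probability leads from the population (model) to the sample (observations, data, description); Statistics leads back from the sample to the population. Labels attached to the diagram: types of studies; rates; life tables.]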
We first describe and explore the data (i.e. the sample);
then we examine where it came from (i.e. the population) and how
[though in practice this would come first];
then we learn how to use the data to make inferences about where it came from
. . . possible generalisations.
By and large, we want to be able to say something about the population
on the basis of the information we have in the sample. This requires a
scientific investigation, which typically takes the following steps:
Question(s)
↓
* Study design (2)
↓
Data collection
↓
* Data display (EDA) (1)
↓
* Inference
↓
* Answers and Conclusions
↓
* Reporting results (1)
The steps marked with an asterisk all involve statistics.
Example
(Communication)
Data in Four Areas and Eight Three-Month Periods in 1998-1999
Area  13-15  16-18  19-21  22-24  25-27  28-30  31-33  34-36
A     97.63  92.24  98.90  90.39  95.69  94.44  91.13  97.81
B     48.29  42.31  49.98  39.09  46.38  49.74  41.74  37.39
C     75.23  75.16  77.04  74.23  74.23  76.97  71.66  76.47
D     49.69  57.21  75.19  51.09  52.88  49.41  59.32  52.56
Variability
We need to use statistics when the data show variability: i.e. all the time!
Usually the mean is “obvious” or “guessable”; but the variability is usually not. This is one reason why you need a course in Statistics.
For a lot of things you do in this course, modelling or estimating the
mean is “obvious”, but assessment of the variability (and hence the accuracy or precision) of your model or estimate is not so obvious.
32.4  59.1  53.6  89.9  30.0  58.4  50.6  51.7  58.9  87.3
61.0  58.0  71.4  63.0  54.4  63.5  67.8  63.3  36.9  52.2
45.1  57.4  44.2  75.9  56.6  42.9  46.6  66.1  41.1  41.2
60.9  47.9  39.6  73.9  67.9  72.1  55.7  78.5  40.6  61.3
44.2  37.3  49.1  39.4  29.6  61.7  73.2  53.5  99.9  47.1
60.8  50.5  51.3  48.9  45.4  32.1  40.7  75.8  27.9  34.1
67.9  57.0  43.2  61.3  32.7  44.1  52.4  42.1  54.7  42.1
43.1  61.5  28.2  54.6  37.9  56.1  60.3  60.4  63.5  52.1
61.4  42.2  65.6  72.9  56.1  51.6  46.2  48.9  41.1  45.6
57.5  37.2  66.0  55.9  61.7  63.2  60.6  28.9  39.1  41.3
[Dotplot of the 100 values of x; horizontal axis marked at 30, 45, 60, 75 and 90]
Summary statistics for x:
  N   Mean  StDev    Min     Q1    Med     Q3    Max
100  53.40  13.96  27.90  42.38  53.55  61.48  99.90
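The summary figures above can be reproduced from the 100 listed values. The following is a minimal sketch in Python rather than MINITAB. The quartile rule used here (position p(n+1) in the sorted data, with linear interpolation) is an assumption about how the package computes Q1 and Q3, and the helper name `quantile` is purely illustrative.

```python
import statistics

# The 100 observations listed in the notes, in the order given.
data = [
    32.4, 59.1, 53.6, 89.9, 30.0, 58.4, 50.6, 51.7, 58.9, 87.3,
    61.0, 58.0, 71.4, 63.0, 54.4, 63.5, 67.8, 63.3, 36.9, 52.2,
    45.1, 57.4, 44.2, 75.9, 56.6, 42.9, 46.6, 66.1, 41.1, 41.2,
    60.9, 47.9, 39.6, 73.9, 67.9, 72.1, 55.7, 78.5, 40.6, 61.3,
    44.2, 37.3, 49.1, 39.4, 29.6, 61.7, 73.2, 53.5, 99.9, 47.1,
    60.8, 50.5, 51.3, 48.9, 45.4, 32.1, 40.7, 75.8, 27.9, 34.1,
    67.9, 57.0, 43.2, 61.3, 32.7, 44.1, 52.4, 42.1, 54.7, 42.1,
    43.1, 61.5, 28.2, 54.6, 37.9, 56.1, 60.3, 60.4, 63.5, 52.1,
    61.4, 42.2, 65.6, 72.9, 56.1, 51.6, 46.2, 48.9, 41.1, 45.6,
    57.5, 37.2, 66.0, 55.9, 61.7, 63.2, 60.6, 28.9, 39.1, 41.3,
]


def quantile(sorted_values, p):
    """Quantile at 1-based position p*(n+1), linearly interpolated.

    This rule is an assumption made for illustration; other packages
    use slightly different conventions and give slightly different answers.
    """
    n = len(sorted_values)
    pos = p * (n + 1)
    lower = int(pos)        # index (1-based) of the observation just below
    frac = pos - lower      # how far to move towards the next observation
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + frac * (above - below)


ordered = sorted(data)
summary = {
    "N": len(data),
    "Mean": statistics.mean(data),
    "StDev": statistics.stdev(data),   # sample standard deviation (n - 1 divisor)
    "Min": ordered[0],
    "Q1": quantile(ordered, 0.25),
    "Med": quantile(ordered, 0.50),
    "Q3": quantile(ordered, 0.75),
    "Max": ordered[-1],
}

for name, value in summary.items():
    print(f"{name:>5}  {value:.2f}")
```

Different quartile conventions can shift Q1 and Q3 slightly, so small discrepancies between packages are to be expected.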
What is the underlying mean?
It is not enough to say that the estimate is 53.4.
Is it 53.4 ± 0.6? or 53.4 ± 2.6? or 53.4 ± 12.6?
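One rough way to choose between these candidates is to compute the standard error of the mean, s/√n, from the summary statistics above. The sketch below is in Python rather than MINITAB. It assumes that the 100 observations are independent draws from the population and that the normal multiplier 1.96 is adequate for an approximate 95% margin; neither assumption is stated in the notes.

```python
import math

# Summary statistics as reported in the notes for the 100 observations.
n = 100
mean = 53.40
stdev = 13.96

# Standard error of the sample mean: s / sqrt(n).
std_error = stdev / math.sqrt(n)

# Approximate 95% margin of error using the normal multiplier 1.96
# (assumption: independent observations, large-sample approximation).
margin = 1.96 * std_error

print(f"standard error = {std_error:.3f}")
print(f"approximate 95% interval: {mean:.1f} ± {margin:.1f}")
```

Under these assumptions the margin comes out near 2.7, so the middle candidate is of the right order, while ± 0.6 would overstate the precision and ± 12.6 would understate it.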
Statistics provides a handle on variability.
Statistics, as a scientific discipline, is concerned with:
1. dealing with variability in a population;
2. drawing conclusions that will stand the test of reproducibility.
“You are developing a certain unexpected vein of pawky humour, Watson,
against which I must learn to guard myself.”
Sherlock Holmes, The Valley of Fear, 1914.