WHATS NEW: Book Revision History
This is a list of all updates made to individual chapters. This is a quick reference to verify that
you have downloaded the latest revision of any particular chapter.
26 Nov 2012 Chapter 5-14 Added bootstrap validation of linear regression to end of chapter.
22 Oct 2012 Chapter 4-4 Added this chapter on power analysis for basic science applications
(small sample sizes), specifically, Fisher’s exact test.
2 Sep 2012 Chapter 2-16 Added a section on how the ICC is affected by having a very narrow
range of scores, a problem analogous to what happens to the Kappa
statistic when data clump up in a corner of a 2 x 2 table.
2 Sep 2012 Chapter 1-6 Revised the section on submitting graphs to journals, now
recommending EPS graphs, rather than TIFF graphs, and explaining
how to make the fonts of EPS graphs look crisp when importing into
Microsoft PowerPoint or Word.
16 Mar 2012 Chapters 3-98, 3-99 Added some homework problems for Chapter 3-5
20 Feb 2012 Chapter 1-4 Fixed a couple of confusing typos
20 Feb 2012 Title Page Added year 2012 to copyright line and modified it slightly
20 Feb 2012 Chapter 1-2 Updated for Stata version 12. Now shows the import command for
reading directly from an Excel file.
20 Feb 2012 Chapter 4-3 Added this chapter on sample size paragraphs for grants
19 Feb 2012 Chapter 4-2 Added this chapter on computing sample sizes and power for
noninferiority studies.
15 Feb 2012 Chapter 5-99 Updated a problem for Chapter 5-9 to match Stata version 12
15 Feb 2012 Chapter 5-98 Updated a problem for Chapter 5-9 to match Stata version 12
15 Feb 2012 Chapter 5-9 Added Stata version 12 command for imputation using chained
equations.
14 Feb 2012 Chapter 3-5 Corrected a calculation and wrong example—basically
minor typographical error corrections.
13 Feb 2012 Chapter 5-3 In formula for population standard deviation, replaced x-bar with
mu, so the notation is correct.
13 Feb 2012 Chapter 2-6 In formula for population standard deviation, replaced x-bar with
mu, so the notation is correct. Also, correctly ordered the
categories for the orthopaedic scale in the “a refinement of this
idea” section so that it is an ordinal scale.
13 Jan 2012 Chapter 5-2 It now reads better. Also, two nice graphs were added for
illustrating the slope formula and for illustrating how control for
confounding works. The do-file, Ch 5-2.do, was updated to reflect the
changes to the chapter. The data file Ch5-2.dta is no longer used
and can be deleted, as it is now created in the do-file editor.
… continued on page 4…
_____________________
Source: Stoddard GJ. Biostatistics and Epidemiology Using Stata: A Course Manual. Salt Lake City, UT: University of Utah
School of Medicine. Revision History Page. (Accessed September 2, 2012, at http://www.ccts.utah.edu/biostats/?pageId=5385).
Revision History (revision 26 Nov 2012)
p. 1
Last Revision Date of Each Chapter
Front Matter
20 Feb 2012 Title & copyright page
16 May 2010 Preface & suggestions for use & author contact information
15 Mar 2011 WHATS NEW: Book revision history
16 May 2010 Table of contents
Section 1. Stata: Data Management, Graphics, and Programming
8 Jan 2012 Chapter 1-1 Installing Stata and recovering Stata windows
20 Feb 2012 Chapter 1-2 Getting data into Stata and some other basics
15 Feb 2011 Chapter 1-3 Cleaning data
20 Feb 2012 Chapter 1-4 Merging files
23 Feb 2011 Chapter 1-5 Labeling variables and values
2 Sep 2012 Chapter 1-6 Basic graphics
4 Mar 2011 Chapter 1-7 Looping, collapsing, and reshaping
16 May 2010 Chapter 1-8 Operators, ifs, dates, and times
27 Jun 2011 Chapter 1-9 More graphics: popular scientific graphs
16 May 2010 Chapter 1-10 Programming Stata
16 May 2010 Chapter 1-11 Compilation of frequently used variable generation and
modifying commands (a chapter for quick look up)
14 Oct 2011 Chapter 1-12 Stata results into Excel & Word
9 Mar 2011 Chapter 1-13 replaced by Chapter 1-98 to allow for book expansion
9 Mar 2011 Chapter 1-14 replaced by Chapter 1-99
8 Jan 2012 Chapter 1-98 Homework problems
8 Jan 2012 Chapter 1-99 Homework problem solutions
Section 2. Biostatistics
8 Jan 2011 Chapter 2-1 Describing variables, levels of measurement, and choice of
descriptive statistics
15 Mar 2011 Chapter 2-2 Logic of significance tests
16 May 2010 Chapter 2-3 Choice of significance test
17 Oct 2011 Chapter 2-4 Comparison of two independent groups
22 Dec 2011 Chapter 2-5 Basics of power analysis
13 Feb 2012 Chapter 2-6 More on levels of measurement
16 May 2010 Chapter 2-7 Comparison of two paired groups
8 May 2011 Chapter 2-8 Multiplicity and the Comparison of 3+ Groups
16 May 2010 Chapter 2-9 Correlation
13 Feb 2011 Chapter 2-10 Linear regression
16 May 2010 Chapter 2-11 Logistic regression and dummy variables
17 Aug 2010 Chapter 2-12 Survival analysis: Kaplan-Meier graphs, Log-rank Test, and Cox
regression
6 Oct 2011 Chapter 2-13 Confidence intervals versus p values and trends toward
significance
16 May 2011 Chapter 2-14 Pearson correlation coefficient with clustered data
30 Aug 2011 Chapter 2-15 Equivalence and noninferiority tests
2 Sep 2012 Chapter 2-16 Validity and reliability
9 Jan 2011 Chapter 2-17 Bland-Altman analysis
16 May 2010 Chapter 2-18 One sample tests
8 May 2011 Chapter 2-19 replaced by 2-98 to allow for book expansion
8 May 2011 Chapter 2-20 replaced by 2-99
8 Jan 2012 Chapter 2-98 Homework problems
8 Jan 2012 Chapter 2-99 Homework problem solutions
Section 3. Epidemiology
16 May 2010 Chapter 3-1 Introduction to epidemiologic thinking
16 May 2010 Chapter 3-2 Sufficient/component cause theory of disease
1 Aug 2010 Chapter 3-3 Hill’s causal criteria
19 Aug 2011 Chapter 3-4 Logic and errors
14 Feb 2012 Chapter 3-5 Effect measures
16 May 2010 Chapter 3-6 Study designs
14 Jul 2011 Chapter 3-7 Randomization using Excel
16 May 2010 Chapter 3-8 Bias and confounding
16 May 2010 Chapter 3-9 Random error and statistics
16 May 2010 Chapter 3-10 Crude analysis
16 May 2010 Chapter 3-11 Stratified analysis
16 May 2010 Chapter 3-12 Standardization
16 May 2010 Chapter 3-13 Sensitivity (bias) analysis
16 May 2010 Chapter 3-14 Case-cohort study design
16 May 2010 Chapter 3-15 replaced by 3-98
16 Mar 2012 Chapter 3-98 Homework problems
16 Mar 2012 Chapter 3-99 Homework problem solutions
Section 4. Power Analysis
23 Jun 2010 Chapter 4-1 Sample size determination and power analysis for specific
applications
19 Feb 2012 Chapter 4-2 Sample size determination and power analysis for equivalence,
noninferiority, and nonsuperiority tests.
20 Feb 2012 Chapter 4-3 Grant sample size paragraphs
22 Oct 2012 Chapter 4-4 Basic science applications
8 Jan 2012 Chapter 4-98 Homework problems
8 Jan 2012 Chapter 4-99 Homework problem solutions
Section 5. Regression Models
16 May 2010 Chapter 5-1 What regression is and curvilinear correlation
13 Jan 2012 Chapter 5-2 Holding constant
13 Feb 2012 Chapter 5-3 Dichotomous predictor variables
16 May 2010 Chapter 5-4 Adjusted means, Analysis of Variance (ANOVA), and interaction
16 May 2010 Chapter 5-5 Deriving logistic regression
16 May 2010 Chapter 5-6 Exact logistic regression
16 May 2010 Chapter 5-7 Introducing Cox regression and Kaplan-Meier plots
16 May 2010 Chapter 5-8 Interaction
15 Feb 2011 Chapter 5-9 Missing data imputation
16 May 2010 Chapter 5-10 Linear regression robust to assumptions
16 May 2010 Chapter 5-11 Linear regression diagnostics and transformations
16 May 2010 Chapter 5-12 Variable selection and collinearity
16 May 2010 Chapter 5-13 Monte Carlo Simulation and Bootstrapping
26 Nov 2012 Chapter 5-14 Model Validation
16 May 2010 Chapter 5-15 Response feature (summary measure) analysis
16 May 2010 Chapter 5-16 Analysis of covariance (ANCOVA) versus change analysis
16 May 2010 Chapter 5-17 Conditional logistic regression
16 May 2010 Chapter 5-18 Repeated measures analysis of variance
16 May 2010 Chapter 5-19 Generalized estimating equations (GEE)
16 May 2010 Chapter 5-20 Multilevel (mixed effects) models
17 Aug 2010 Chapter 5-21 Regression post tests
16 May 2010 Chapter 5-22 Modeling cost
16 May 2010 Chapter 5-23 Cox regression proportional hazards assumption
16 May 2010 Chapter 5-24 Cluster analysis
16 May 2010 Chapter 5-25 Multilevel (mixed effects) logistic regression
16 May 2010 Chapter 5-26 Trend tests
8 Jan 2011 Chapter 5-27 Propensity Scores
8 May 2011 Chapter 5-28 replaced by 5-98
8 May 2011 Chapter 5-29 replaced by 5-99
15 Feb 2012 Chapter 5-98 Homework problems
15 Feb 2012 Chapter 5-99 Homework problem solutions
Section 6. Diagnostic Tests
16 May 2010 Chapter 6-1 Test characteristics
16 May 2010 Chapter 6-1 Comparing diagnostic tests
16 May 2010 Chapter 6-1 Imperfect reference tests
16 May 2010 Chapter 6-1 Sampling with verification bias
Appendices
12 Jul 2011 Appendix 1 Dataset Descriptions
9 Jan 2010 Appendix 2 Bibliography
16 May 2010 Appendix 3 List of cross references
Continuation from page 1 (list of specific changes to book)
8 Jan 2012 Chapter 5-99 Replaces Chapter 5-29 “Homework problem solutions”
8 Jan 2012 Chapter 5-98 Replaces Chapter 5-28 “Homework problems”
8 Jan 2012 Chapter 4-99 Added Chapter 4-99 “Homework problem solutions”
8 Jan 2012 Chapter 4-98 Added Chapter 4-98 “Homework problems”
8 Jan 2012 Chapter 3-99 Added Chapter 3-99 “Homework problem solutions”
8 Jan 2012 Chapter 3-98 replaces Chapter 3-15 “Homework problems”
8 Jan 2012 Chapter 2-99 replaces Chapter 2-20 “Homework problem solutions”
8 Jan 2012 Chapter 2-98 replaces Chapter 2-19 “Homework problems”
8 Jan 2012 Chapter 1-99 replaces Chapter 1-14 “Homework problem solutions”
8 Jan 2012 Chapter 1-98 replaces Chapter 1-13 “Homework problems” to allow for book
expansion
8 Jan 2012 Chapter 1-1 Updated to make consistent with Stata version 12.
22 Dec 2011 Chapter 2-5 Added another journal that requires a power analysis be reported
when statistical significance is not achieved for a primary outcome
5 Nov 2011 Preface Changed “it is not appropriate to cite” to “it is” and directed the
reader to the Title page for citation instructions.
5 Nov 2011 Title Page Added a second page to show detailed correct citation format for
this internet-based textbook.
17 Oct 2011 Chapter 2-4 Quoted two additional works further defining the concept of
a confidence interval
14 Oct 2011 Chapter 1-12 revised this chapter to take advantage of the Windows mouse right
click options for moving Stata output into Microsoft Word and
Excel. What was in the chapter before was very klutzy.
6 Oct 2011 Chapter 2-13 finished the section on making claims of “marginally significant”
when 0.05 < p value < 0.10.
30 Aug 2011 Chapter 2-15 expanded discussion of noninferiority testing, recommending the
use of one-sided tests using a two-sided 95% confidence interval
(so alpha is 0.025).
19 Aug 2011 Chapter 3-4 added how to apply Stoddard’s aphorism, listed some additional
articles discussing false conclusions in the medical literature,
and added a summary paragraph tying together some ideas
27 Jul 2011 Chapter 2-4 made the definition of a confidence interval more rigorous, added
reporting style for p values (number of decimals)
14 Jul 2011 Chapter 3-7 added examples of researchers who used the random permuted blocks
approach and a suggested citation.
12 Jul 2011 Appendix 1 added internet source of data or data itself, for some more of the
datasets, to the Dataset Descriptions chapter
27 Jun 2011 Chapter 1-9 added publication quality ROC curve graph and Kaplan-Meier
curve graph
16 May 2011 Chapter 2-14 fixed a bug in the program betweencorr so it now gives the correct
sample size when missing data are present
8 May 2011 Chapter 5-28 added more homework problem solutions to the Regression
Models section of the book
8 May 2011 Chapter 5-29 added the chapter Homework problem solutions to the Regression
Models section of the book
8 May 2011 Chapter 2-20 added more homework problem solutions to the Biostatistics
section of the book
8 May 2011 Chapter 2-19 added more homework problems to the Biostatistics section of the
book
8 May 2011 Chapter 2-16 added section explaining why anomalous values can occur with
the ICC, where low ICCs occur even though the agreement looks
tight
8 May 2011 Chapter 2-8 added mcpi and fdri programs as an appendix; added how to
convert between adjusted p and adjusted alpha formulas for
Bonferroni and Finner procedures
15 Mar 2011 Chapter 2-4 toned down discussion of checking assumptions for t-test, pointing
out robustness of t-test to normality and homogeneity assumptions
15 Mar 2011 Chapter 2-2 added a very large section on choosing between standard
deviations, standard errors, and 95% CIs, for reporting study
outcomes and for error bars on graphs.
15 Mar 2011 Chapter 1-6 added how to change text size in Stata version 10.
9 Mar 2011 Chapter 1-14 added this new chapter of solutions to homework problems for the
Stata section
9 Mar 2011 Chapter 1-13 greatly expanded this chapter of homework problems for the Stata
section
4 Mar 2011 Chapter 1-7 added more on looping structures
3 Mar 2011 Chapter 5-9 updated with “mi ice” so that more imputed values are obtained for
multiple imputation than when using “mi impute regress”; added how to
automate imputed categorical variables using most frequent
category
23 Feb 2011 Chapter 1-5 reformatted so the discussion is much easier to follow
22 Feb 2011 Chapter 1-4 updated to Stata version 11 merge syntax that uses 1:1, 1:m, etc.
15 Feb 2011 Chapter 1-3 expanded it, added the inlist command
13 Feb 2011 Chapter 1-2 added how to import an Excel file when the variable names are not
on the first row
13 Feb 2011 Chapter 1-1 added that “run as administrator” should be used when installing
with Windows 7 in order to be able to create license file
29 Jan 2011 Chapter 2-5 added null and alternative hypothesis notation and added a power
function graph created with Stata
9 Jan 2011 Chapter 5-28 reformatted the homework problems and gave it a new chapter
number
9 Jan 2011 Chapter 2-17 added confidence interval for the limits of agreement and
added a protocol suggestion; renamed chapter from “methods
comparison analysis” to “Bland-Altman analysis”
9 Jan 2011 Chapter 2-8 added a quote by Zolman explaining conservativeness of ANOVA;
added a quote by Scott justifying use of false discovery rate (FDR)
in this research article.
13 Feb 2011 Chapter 2-10 added Rothman’s explanation of why restriction is more important
than representativeness
9 Jan 2011 Chapter 2-6 added a page discussing the number of categories in an ordinal
scale that are required to analyze it as an interval scale, giving a
citation. Added a citation for justifying the treatment of a visual
analog scale as an interval scale.
8 Jan 2011 Chapter 5-27 began development of new chapter on propensity scores
8 Jan 2011 Chapter 2-1 added box plot for two grouping variables
8 Jan 2011 Title & copyright page added a suggested citation
8 Sep 2010 Chapter 5-9 added “set seed” preceding “hotdeckvar” command.
6 Sep 2010 Chapter 2-20 new chapter—solutions to homework problems
6 Sep 2010 Chapter 2-19 added more homework problems
6 Sep 2010 Chapter 1-2 added setting up file association in Windows so clicking on file
correctly opens Stata and reads in the data.
6 Sep 2010 Chapter 2-1 added simulation of standard error and when to use it, graph
showing varying SDs, description of degrees of freedom
4 Sep 2010 Chapter 2-13 added quote by Altman in favor of not interpreting a p=0.04 and
p=0.06 differently.
17 Aug 2010 Chapter 5-21 added some additional examples of post-estimation tests:
comparison of regression coefficients within the same model,
comparison of regression coefficients from separate models, and
comparison of two correlation coefficients.
17 Aug 2010 Chapter 2-12 shortened the chapter, taking out tests of assumptions for Cox
regression (these are still available in Chapter 5-23) and taking out
the advanced formulas (still available in Chapter 5-7). Added
interpretation exercise of Kaplan-Meier probabilities.
6 Aug 2010 title & copyright page added the two websites from which this book is available
1 Aug 2010 Chapter 3-3 improved the introduction to the Cheskin article
1 Aug 2010 Chapter 1-1 added more clarification and removed installation question
responses specific to the author’s institution
29 Jul 2010 Chapter 1-6 expanded the chapter to include: change size of symbols, lines, and
text by multiplying the default size; mention of graphics editor;
logarithm y-axis to odds ratio graph; finer details, such as 300 dpi,
of preparing graph for publication
19 Jul 2010 Chapter 2-2 added presentation of role of sampling distribution and how to
simulate it
12 Jul 2010 Chapter 2-1 expanded the chapter to include: graph demonstrating
relationship of mean, median, and mode for symmetrical and
skewed distributions; graph demonstrating percent of scores within
1,2, and 3 standard deviations; explanation of degrees of freedom
for standard deviation formula; Stata commands table and tabstat
for descriptive statistics
23 Jun 2010 Chapter 4-1 added section, “Interrater Reliability (Precision of Confidence
Interval Around Intraclass Correlation Coefficient)” which
provides a Stata program to compute sample size for interrater
reliability