Measures of central tendency
Measures of central tendency provide a single value representation for the middle of a group of data.
Arithmetic mean or average
The arithmetic mean, or average, is a measure of central tendency that weights all values equally; of the three measures, it is the most affected by outliers.
Median
The median is the middle value of the ordered data set: half of the data points lie above it and half lie below it. For an even number of data points, it is the mean of the two middle values.
Mode
The mode is the data point that appears most often; there may be multiple (or zero) modes in a data set.
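The three measures above can be compared on a small data set. This is a minimal sketch using Python's standard statistics module; the values in data are made up for illustration and include one deliberate outlier.

```python
import statistics

# Hypothetical data set with one large outlier (illustrative values only).
data = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(data)        # every value contributes equally
median = statistics.median(data)    # middle value of the sorted data
modes = statistics.multimode(data)  # list of all most-frequent values

print(mean, median, modes)
```

The single outlier pulls the mean well above the median, while the median and mode stay near the bulk of the data.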
Distributions can be classified by _____ and _____.
measures of central tendency
measures of distribution
The normal distribution is _____.
symmetrical
The _____ are all the same in the normal distribution.
mean, median, and mode
Standard distribution
The standard distribution (also called the standard normal distribution) is a normal distribution with a mean of zero and a standard deviation of one; it is used for most calculations.
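One common way the standard distribution is used in calculations is to convert a raw value to a z-score, the number of standard deviations it lies from the mean. The sketch below assumes a hypothetical normally distributed measurement with a mean of 100 and a standard deviation of 15; z_score is an illustrative helper, not a library function.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical values: mean 100, standard deviation 15, observation 130.
z = z_score(130, mean=100, sd=15)

# NormalDist() defaults to mean 0 and standard deviation 1.
fraction_below = NormalDist().cdf(z)

print(z, round(fraction_below, 4))
```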
In a normal distribution, approximately _____ of data points occur within one standard deviation of the mean, _____ within two, and _____ within three.
68%
95%
99.7%
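These percentages can be checked numerically from the cumulative distribution function of the normal distribution. A minimal sketch using statistics.NormalDist; share_within is a hypothetical helper name.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The printed fractions are close to 0.68, 0.95, and 0.997, which is why the rounded figures are quoted as a rule of thumb.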
Skewed distributions
In a skewed distribution, the mean, median, and mode differ, with the mean typically pulled furthest toward the tail; the direction of skew is the direction of the tail, not of the peak.
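The separation of the three measures can be seen in a small right-skewed (positively skewed) sample. The values below are invented for illustration; the long tail of large values pulls the mean above the median and mode.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 10, 18]

skew_mean = statistics.mean(right_skewed)
skew_median = statistics.median(right_skewed)
skew_mode = statistics.mode(right_skewed)

# For this sample the mean exceeds the median, which equals the mode.
print(skew_mean, skew_median, skew_mode)
```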
Bimodal (multimodal) distributions
Bimodal distributions have two peaks (multimodal distributions have more), although not necessarily multiple modes, strictly speaking. It may be useful to perform data analysis on the groups separately.
Range
Range is the difference between the largest and smallest values in a data set.
Interquartile range
Interquartile range (IQR) is the difference between the value of the third quartile and the first quartile; it can be used to determine outliers, commonly defined as values more than 1.5 × IQR below the first quartile or above the third quartile.
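Range and interquartile range can be sketched as follows. The data are hypothetical, and the 1.5 × IQR cutoff is a common convention rather than a fixed rule. Quartile conventions also differ between textbooks and software, so the quartiles returned by statistics.quantiles (default "exclusive" method) may not match a hand calculation exactly.

```python
import statistics

# Hypothetical data set with one suspiciously large value.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

value_range = max(data) - min(data)

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Assumed convention: flag values beyond 1.5 * IQR from the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(value_range, iqr, outliers)
```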
Standard deviation
Standard deviation is a measurement of variability about the mean; it can also be used to determine outliers, commonly defined as values more than three standard deviations from the mean.
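A minimal sketch of the standard-deviation approach to outliers, assuming a cutoff of three standard deviations (a common choice, not a universal rule). The data are invented. Note that in very small samples no point can lie three sample standard deviations from the mean, so the example uses 20 values.

```python
import statistics

# Hypothetical sample: 19 values near 10 plus one extreme value.
data = [9] * 5 + [10] * 9 + [11] * 5 + [50]

mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)

# Assumed cutoff: more than 3 standard deviations from the mean.
outliers = [x for x in data if abs(x - mean) > 3 * sd]

print(mean, round(sd, 3), outliers)
```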
Outliers may be a result of _____, _____, or _____.
true population variability
measurement error
a non-normal distribution
Procedures for handling outliers should be ...
... formulated before the beginning of a study.
Independent events
The probability of independent events does not change based on the outcomes of other events.
Dependent events
The probability of dependent events changes depending on the outcomes of other events.
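The difference between independent and dependent events shows up when probabilities are multiplied. The sketch below uses two textbook scenarios chosen for illustration (fair coin flips, and drawing cards from a standard 52-card deck without replacement) and exact fractions to avoid rounding.

```python
from fractions import Fraction

# Independent: the second coin flip is unaffected by the first.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads

# Dependent: after one ace is drawn and not replaced,
# only 3 aces remain among 51 cards.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first

print(p_two_heads, p_two_aces)
```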
Mutually exclusive outcomes
Mutually exclusive outcomes cannot occur simultaneously.
Exhaustiveness
When a set of outcomes is exhaustive, there are no other possible outcomes.
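Both properties can be illustrated with a single roll of a fair six-sided die, an assumed example: the faces are mutually exclusive, so their probabilities add, and together they are exhaustive, so the probabilities sum to one.

```python
from fractions import Fraction

# One roll of a fair six-sided die (assumed for illustration).
outcomes = {face: Fraction(1, 6) for face in range(1, 7)}

# Mutually exclusive: a single roll cannot be both 1 and 2, so probabilities add.
p_one_or_two = outcomes[1] + outcomes[2]

# Exhaustive: the six faces cover every possible result, so they sum to 1.
total = sum(outcomes.values())

print(p_one_or_two, total)
```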
Hypothesis tests
Hypothesis tests use a known distribution to determine whether a hypothesis of no difference (the null hypothesis) can be rejected.
Whether or not a finding is statistically significant is determined by the comparison of a _____ to the selected _____.
p-value
significance level (α)
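The comparison of a p-value with α can be sketched with a two-tailed one-sample z-test, which is only one of many possible hypothesis tests and assumes the population standard deviation is known. All numbers below are hypothetical.

```python
import math
from statistics import NormalDist

# Hypothetical study: null hypothesis says the population mean is 100.
null_mean = 100
population_sd = 15   # assumed known, as a z-test requires
sample_mean = 103
n = 100
alpha = 0.05         # significance level chosen before the study

# Test statistic: distance of the sample mean from the null mean in standard errors.
standard_error = population_sd / math.sqrt(n)
z = (sample_mean - null_mean) / standard_error

# Two-tailed p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_null = p_value < alpha
print(round(z, 2), round(p_value, 4), reject_null)
```

Here the p-value is slightly below 0.05, so the null hypothesis would be rejected at α = 0.05 but not at α = 0.01.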
Confidence intervals
A confidence interval is a range of values about a sample mean that is used to estimate the population mean. A higher confidence level (95% is common) requires a wider interval.
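The link between confidence level and interval width can be sketched with a z-based interval, assuming a known population standard deviation and a hypothetical sample. The helper confidence_interval is illustrative only.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sd, n, level):
    """z-based interval for the population mean (assumes sd is known)."""
    # Critical value: leaves (1 - level) / 2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sd / math.sqrt(n)
    return (sample_mean - margin, sample_mean + margin)

# Hypothetical sample: mean 103, standard deviation 15, 100 observations.
ci95 = confidence_interval(103, 15, 100, 0.95)
ci99 = confidence_interval(103, 15, 100, 0.99)

print([round(v, 2) for v in ci95], [round(v, 2) for v in ci99])
```

The 99% interval is wider than the 95% interval computed from the same sample.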
_____ and _____ are both used to compare categorical data.
Pie charts (circle charts) and bar charts
_____ and _____ are both used to compare numerical data.
Histograms and box plots (box-and-whisker plots)
Maps are used to compare ...
... up to two demographic indicators.
Linear, semilog, and log–log plots can be distinguished by their _____.
axes
Slope can be calculated most easily from _____.
linear plots
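Slopes can still be read from semilog and log-log plots, but only after taking logarithms of the plotted values, which is why linear plots are the easiest case. The sketch below uses two made-up functions: an exponential, which appears as a straight line on a semilog plot, and a power law, which appears as a straight line on a log-log plot.

```python
import math

def slope(x1, y1, x2, y2):
    """Rise over run between two points."""
    return (y2 - y1) / (x2 - x1)

# Linear plot: read the slope directly from two points on y = 2x + 1.
linear_slope = slope(0, 1, 2, 5)

# Semilog plot (log y versus x): y = 3 * 2**x has slope log10(2).
def exponential(x):
    return 3 * 2 ** x

semilog_slope = slope(1, math.log10(exponential(1)),
                      4, math.log10(exponential(4)))

# Log-log plot (log y versus log x): y = 5 * x**3 has slope 3, the exponent.
def power_law(x):
    return 5 * x ** 3

loglog_slope = slope(math.log10(2), math.log10(power_law(2)),
                     math.log10(8), math.log10(power_law(8)))

print(linear_slope, round(semilog_slope, 4), round(loglog_slope, 4))
```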
Correlation and causation are separate concepts that are linked by _____.
Hill’s criteria
Data must be interpreted in the context of _____ and _____.
the current hypothesis
existing scientific knowledge
_____ and _____ significance are distinct.
Statistical and practical significance