Uploaded by empfehlenswertwarmerpiranha

Statistics, Notes

Statistics Class
2 Semester
Chapter 1 – The Nature of Statistics
Section 1.1 Statistic Basics
Definition 1.1
Descriptive Statistics: Consists of methods for organizing and summarizing information. It
includes the construction of graphs, charts and tables and the calculation of various
descriptive measures, such as averages, measures of variation, and percentiles.
No inferences are made in descriptive statistics.
Population: Collection of all individuals or items under consideration in a statistical study.
Sample: That part of the population from which information is obtained.
Inferential Statistics: Methods for drawing and measuring the reliability of conclusions
about a population based on information obtained from a sample of the population.
The information obtained from a sample of the population lets one make inferences (draw
conclusions) about characteristics of the entire population.
Section 1.2 Simple Random Sampling
Definition 1.4
Simple random sampling: Sampling procedure for which each possible sample of a given
size is equally likely to be the one obtained.
Simple random sample: Sample obtained by simple random sampling.
SRSWR (Simple Random Sampling with replacement): A member of a population can be
selected more than once.
SRS (Simple Random Sampling without replacement): A member of the population can be
selected at most once.
Random-Number Tables: Procedure to obtain a random sample, involving a table of random
numbers.
Random-Number Generators: Software to get random numbers.
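As a minimal sketch of how a random-number generator can produce both kinds of sample, the following uses Python's standard random module. The function name simple_random_sample, the population numbered 1 to 20, and the seed are illustrative assumptions, not part of the notes.

```python
import random

def simple_random_sample(population, n, replacement=False, seed=None):
    """Draw a simple random sample of size n from a list-like population."""
    rng = random.Random(seed)  # the seed only makes the example reproducible
    if replacement:
        # SRSWR: a member can be selected more than once
        return rng.choices(population, k=n)
    # SRS: a member can be selected at most once
    return rng.sample(population, n)

population = list(range(1, 21))  # hypothetical population numbered 1..20
srs = simple_random_sample(population, 5, seed=1)
srswr = simple_random_sample(population, 5, replacement=True, seed=1)
print(srs, srswr)
```

Without replacement the five selected members are always distinct; with replacement, repeats are possible.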
Section 1.3 Other Sampling Designs
Systematic Random Sampling
Step 1: Divide the population size by the sample size and round the result down to the
nearest whole number, 𝑚.
Step 2: Use a random-number generator or a similar device to obtain a number, 𝑘, between
1 and 𝑚.
Step 3: Select for the sample those members of the population that are numbered 𝑘, 𝑘 +
𝑚, 𝑘 + 2𝑚, …
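The three steps above can be sketched in Python as follows. This is an illustration only: the function name, the population size of 728, the sample size of 15, and the seed are assumptions, and the sketch stops after exactly sample_size members, which the notes do not state explicitly.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return the numbers of the population members selected."""
    rng = random.Random(seed)
    m = population_size // sample_size   # Step 1: divide and round down
    k = rng.randint(1, m)                # Step 2: random k with 1 <= k <= m
    # Step 3: k, k + m, k + 2m, ... (assumption: stop after sample_size members)
    return [k + i * m for i in range(sample_size)]

numbers = systematic_sample(population_size=728, sample_size=15, seed=0)
print(numbers)
```

Because 𝑘 is at most 𝑚, the largest selected number never exceeds the population size.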
Cluster Sampling
Step 1: Divide the population into groups (clusters).
Step 2: Obtain a random sample of the clusters.
Step 3: Use all members of the clusters obtained in Step 2 as the sample.
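A minimal sketch of cluster sampling, assuming the population has already been divided into named clusters (Step 1). The cluster names, members, and seed below are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """clusters maps a cluster name to the list of its members."""
    rng = random.Random(seed)
    # Step 2: random sample of the clusters themselves
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Step 3: every member of each chosen cluster goes into the sample
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Step 1 (hypothetical): a population already divided into four clusters
clusters = {
    "block A": ["a1", "a2", "a3"],
    "block B": ["b1", "b2"],
    "block C": ["c1", "c2", "c3", "c4"],
    "block D": ["d1"],
}
chosen, sample = cluster_sample(clusters, 2, seed=3)
print(chosen, sample)
```

Note that the final sample size depends on which clusters happen to be chosen.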
Stratified Random Sampling with Proportional Allocation
Step 1: Divide the population into subpopulations (strata).
Step 2: From each stratum, take a simple random sample of size proportional to the size
of the stratum. (That is, the sample size for a stratum = the total sample size × the stratum
size / the population size.)
Step 3: Use all members obtained in Step 2 as the sample.
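The proportional-allocation formula in Step 2 can be sketched as below. The strata, their sizes, and the seed are hypothetical; rounding each stratum's sample size to the nearest whole number is an assumption of this sketch, and with less convenient sizes the rounded values may not add up exactly to the total sample size.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """strata maps a stratum name to the list of its members."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = []
    for name, members in strata.items():
        # Step 2: stratum sample size = total sample size * stratum size / population size
        # (rounding is an assumption made for this sketch)
        n_stratum = round(total_sample_size * len(members) / population_size)
        sample.extend(rng.sample(members, n_stratum))
    return sample  # Step 3: all members obtained form the sample

# Step 1 (hypothetical): a population of 100 students in three strata
strata = {
    "freshmen": [f"F{i}" for i in range(50)],
    "sophomores": [f"S{i}" for i in range(30)],
    "juniors": [f"J{i}" for i in range(20)],
}
sample = stratified_sample(strata, 10, seed=2)
print(sample)
```

Here the strata of sizes 50, 30, and 20 contribute 5, 3, and 2 members to a sample of size 10.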
Section 1.4 Experimental Designs
Definition 1.5
Experimental Units; Subjects: The individuals or items on which a designed experiment is
performed are called experimental units. When the experimental units are humans, the
term subject is often used instead.
Principles of Experimental Design:
The following principles help ensure that differences in the results of an experiment that
are not reasonably attributable to chance are likely caused by the treatments:
• Control: Two or more treatments should be compared.
• Randomization: Experimental units should be randomly divided into groups, to avoid
unintentional bias.
• Replication: Sufficient numbers of experimental units should be used to ensure that
the randomization creates groups that resemble each other closely and also to
increase the chances of detecting differences among the treatments.
Experimental situations often involve a treatment group and a control group, where the
control group receives a placebo.
Definition 1.6
Response variable: The characteristic of the experimental outcome that is to be measured
or observed.
Factor: A variable whose effect on the response variable is of interest in the experiment.
Levels: The possible values of a factor.
Treatment: Each experimental condition. For one-factor experiments, the
treatments are the levels of the single factor. For multifactor experiments, each treatment is
a combination of levels of the factors.
Definition 1.7
Completely Randomized Design: All experimental units are assigned randomly among all the
treatments. Once treatments have been chosen, one must decide on how the experimental
units are to be assigned to the treatments (or vice versa).
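One way to carry out such a random assignment is sketched below. The experimental units (twelve plants), the three treatments, and the seed are hypothetical, and making the groups equal in size is an assumption of this sketch, not a requirement stated in the notes.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Randomly assign all experimental units among all treatments."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # random order of all experimental units
    assignment = {treatment: [] for treatment in treatments}
    for i, unit in enumerate(shuffled):
        # deal the shuffled units out to the treatments in turn
        assignment[treatments[i % len(treatments)]].append(unit)
    return assignment

units = [f"plant{i}" for i in range(1, 13)]           # hypothetical units
treatments = ["fertilizer A", "fertilizer B", "none"]  # hypothetical treatments
assignment = completely_randomized_design(units, treatments, seed=4)
print(assignment)
```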
Definition 1.
Randomized Block Design: Experimental units are assigned randomly among all the
treatments separately within each block. That is, experimental units that are similar
in ways that are expected to affect the response variable are grouped in blocks; then the
random assignment of experimental units to the treatments is made block by block.
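A block-by-block assignment might look like the following sketch. The blocks (two age groups), the treatments, and the seed are hypothetical, and equal group sizes within each block are an assumption.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """blocks maps a block name to the list of experimental units in it."""
    rng = random.Random(seed)
    design = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomize separately within this block
        design[block_name] = {
            treatment: shuffled[i::len(treatments)]
            for i, treatment in enumerate(treatments)
        }
    return design

# Hypothetical blocks of units that are similar in a way expected to affect the response
blocks = {
    "under 40": ["u1", "u2", "u3", "u4"],
    "40 and over": ["o1", "o2", "o3", "o4"],
}
design = randomized_block_design(blocks, ["drug", "placebo"], seed=5)
print(design)
```

Every treatment appears in every block, so a difference between blocks cannot be mistaken for a difference between treatments.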
Chapter 2 – Organizing Data
Definition 2.1 - Variables
Variable: A characteristic that varies from one person or thing to another.
Qualitative Variable: A non-numerically valued variable.
Quantitative Variable: A numerically valued variable.
Discrete Variable: A quantitative variable whose possible values can be listed. In particular,
a quantitative variable with only a finite number of possible values is a discrete variable.
Continuous Variable: A quantitative variable whose possible values form some interval of
numbers.
Definition 2.2 - Data
Data: Values of a variable.
Qualitative Data: Values of a qualitative variable.
Quantitative Data: Values of a quantitative variable.
Discrete Data: Values of a discrete variable.
Continuous Data: Values of a continuous variable.
Section 2.2 Organizing Qualitative Data
Definition 2.3 – Frequency Distribution of Qualitative Data
Frequency Distribution of Qualitative Data: A frequency distribution of qualitative data is a
listing of the distinct values and their frequencies.
Procedure 2.1
Construct a Frequency Distribution of Qualitative Data:
Step 1: List the distinct values of the observations in the data set in the first column of a
table.
Step 2: For each observation, place a tally mark in the second column of the table,
in the row of the appropriate distinct value.
Step 3: Count the tallies for each distinct value and record the totals in the third
column of the table.
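The tallying procedure can be sketched with Python's collections.Counter, which counts how often each distinct value occurs. The observations below are hypothetical.

```python
from collections import Counter

# Hypothetical observations of a qualitative variable (political party)
observations = [
    "Democratic", "Republican", "Other", "Republican",
    "Democratic", "Republican", "Other", "Republican",
]

# Steps 1-3: distinct values, tallies, and totals in one pass
frequency = Counter(observations)

# Print the three columns: distinct value, tally marks, frequency
for value, count in sorted(frequency.items()):
    print(f"{value:<12}{'|' * count:<6}{count}")
```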
Definition 2.4 – Relative-Frequency Distribution of Qualitative Data
Relative-Frequency Distribution of Qualitative data: A relative-frequency distribution of
qualitative data is a listing of the distinct values and their relative frequencies.
Procedure 2.2
Step 1: Obtain a frequency distribution of the data.
Step 2: Divide each frequency by the total number of observations:
relative frequency = frequency / number of observations.
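A short sketch of the two steps, using hypothetical observations; the function name relative_frequencies is an assumption.

```python
from collections import Counter

def relative_frequencies(observations):
    counts = Counter(observations)   # Step 1: frequency distribution
    n = len(observations)            # total number of observations
    # Step 2: relative frequency = frequency / number of observations
    return {value: count / n for value, count in counts.items()}

rel = relative_frequencies(["A", "B", "A", "C", "A", "B", "A", "C"])
print(rel)
```

The relative frequencies of a data set always sum to 1 (up to rounding).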
Definition 2.5
Pie Chart: A disk divided into wedge-shaped pieces, proportional to the relative frequencies
of the qualitative data.
Procedure 2.3 – To Construct a Pie Chart
Step 1: Obtain a relative-frequency distribution of the data by applying Procedure 2.2.
Step 2: Divide a disk into wedge-shaped pieces proportional to the relative frequencies.
Step 3: Label the pieces with the distinct values and their relative frequencies.
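For Step 2, a wedge that is proportional to a relative frequency has a central angle of 360 degrees times that relative frequency. A minimal sketch, with hypothetical relative frequencies and an assumed function name:

```python
def wedge_angles(relative_frequencies):
    """Central angle in degrees of each wedge of the pie chart."""
    return {value: 360 * rf for value, rf in relative_frequencies.items()}

# Hypothetical relative-frequency distribution
angles = wedge_angles({"A": 0.5, "B": 0.25, "C": 0.25})
print(angles)
```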
Definition 2.6
Bar Chart: Displays the distinct values of the qualitative data on a horizontal axis and the
relative frequencies (or frequencies or percents) of those values on a vertical axis. Each
distinct value is represented by a bar whose height is equal to the relative frequency of that
value. The bars do not touch each other.
Procedure 2.4 – To Construct a Bar Chart
Step 1: Obtain a relative-frequency distribution of the data by applying Procedure 2.2.
Step 2: Draw a horizontal axis on which to place the bars and a vertical axis on which to
display the relative frequencies.
Step 3: For each distinct value, construct a vertical bar whose height equals the relative
frequency of that value.
Step 4: Label the bars with the distinct values, the horizontal axis with the name of the
variable, and the vertical axis with “Relative frequency”.
Section 2.3 Organizing Quantitative Data
Step 1: Collect the data (e.g., the number of TV sets in each of 50 households).
Step 2: Sort and group the data, then compute the relative frequencies.
Definition 2.7 – Terms Used in Limit Grouping
Lower Class Limit: Smallest value that could go in a class.
Upper Class Limit: Largest value that could go in a class.
Class Width: Difference between the lower limit of a class and the lower limit of the
next-higher class.
Class Mark: Average of the two class limits of a class.
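The last two terms can be illustrated with a small worked example. The classes 30 to 39 and 40 to 49 are hypothetical, as are the function names.

```python
def class_width(lower_limit, next_lower_limit):
    """Lower limit of the next-higher class minus lower limit of this class."""
    return next_lower_limit - lower_limit

def class_mark(lower_limit, upper_limit):
    """Average of the two class limits of a class."""
    return (lower_limit + upper_limit) / 2

# Hypothetical classes 30-39 and 40-49 under limit grouping
width = class_width(30, 40)
mark = class_mark(30, 39)
print(width, mark)
```

Note that the class width (10) is not the difference between the upper and lower limits of the class (9).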
Definition 2. – Terms Used in Cutpoint Grouping
Lower Class Cutpoint: Smallest value that could go in a class.
Upper Class Cutpoint: Smallest value that could go in the next-higher class (= lower cutpoint
of the next-higher class).
Class Width: Difference between the cutpoints of a class.
Class Midpoint: Average of the two cutpoints of a class.
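A sketch of cutpoint grouping under these definitions: each class contains the values from its lower cutpoint up to, but not including, its upper cutpoint. The data, the first cutpoint of 120, the class width of 20, and the function name are all hypothetical.

```python
def cutpoint_grouping(data, first_cutpoint, width, n_classes):
    """Group data into classes [lower cutpoint, upper cutpoint)."""
    classes = []
    for i in range(n_classes):
        lower = first_cutpoint + i * width
        upper = lower + width  # equals the lower cutpoint of the next-higher class
        classes.append({
            "lower": lower,
            "upper": upper,
            "midpoint": (lower + upper) / 2,  # average of the two cutpoints
            "frequency": 0,
        })
    for x in data:
        for c in classes:
            if c["lower"] <= x < c["upper"]:
                c["frequency"] += 1
                break
        else:
            raise ValueError(f"{x} does not fall in any class")
    return classes

# Hypothetical continuous data (weights), classes 120-under 140, ..., 180-under 200
weights = [129.2, 140.0, 155.2, 159.9, 160.0, 185.3]
classes = cutpoint_grouping(weights, first_cutpoint=120, width=20, n_classes=4)
for c in classes:
    print(f"{c['lower']}-under {c['upper']}: {c['frequency']}")
```

The value 140.0 falls in the class 140-under 160, not in 120-under 140, because the upper cutpoint belongs to the next-higher class.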
Choosing the Grouping Method (grouping method: when to use)
Single-value Grouping: Used with discrete data when there are only a small number of
distinct values.
Limit Grouping: Used when the data are expressed as whole numbers and there are too many
distinct values to employ single-value grouping.
Cutpoint Grouping: Used when the data are continuous and are expressed with decimals.
Definition 2.9 – Histograms
Histogram: Displays the classes of the quantitative data on a horizontal axis and the
frequencies of those classes on a vertical axis. The frequency of each class is represented by
a vertical bar whose height is equal to the frequency of that class. The bars should be
positioned so that they touch each other.
• Single-value grouping: The distinct values are used to label the bars, with each value
centered under its bar.
• Limit grouping or cutpoint grouping: The lower class limits (or, equivalently, the lower
class cutpoints) are used to label the bars. (Note: Sometimes class marks or class
midpoints are used instead, centered under the bars.)
Procedure 2.5 – To Construct a Histogram
Step 1: Obtain a frequency (relative-frequency, percent) distribution of the data.
Step 2: Draw a horizontal axis on which to place the bars and a vertical axis on which to
display the frequencies.
Step 3: For each class, construct a vertical bar whose height equals the frequency of that class.
Step 4: Label the bars with the classes, the horizontal axis with the name of the variable, and
the vertical axis with “Frequency”.
Definition 2.10 – Dotplot
Dotplot: A graph in which each observation is plotted as a dot at an appropriate place above
a horizontal axis. Observations having equal values are stacked vertically.
Procedure 2.6 – To Construct a Dotplot
Step 1: Draw a horizontal axis that displays the possible values of the quantitative data.
Step 2: Record each observation by placing a dot over the appropriate value on the
horizontal axis.
Step 3: Label the horizontal axis with the name of the variable.
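A rough text-based sketch of a dotplot is given below. It assumes the observations are single-digit whole numbers so that the columns line up, uses the letter "o" as the dot, and omits the axis label from Step 3; the data and the function name are hypothetical.

```python
from collections import Counter

def dotplot(data):
    """Return a text dotplot; assumes single-digit whole-number data."""
    counts = Counter(data)
    values = range(min(data), max(data) + 1)  # Step 1: horizontal axis
    height = max(counts.values())
    lines = []
    # Step 2: equal observations are stacked vertically, drawn from the top row down
    for level in range(height, 0, -1):
        row = " ".join("o" if counts[v] >= level else " " for v in values)
        lines.append(row.rstrip())
    lines.append(" ".join(str(v) for v in values))
    return "\n".join(lines)

plot = dotplot([1, 2, 2, 3, 3, 3, 5])
print(plot)
```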
Definition 2.11 – Stem-and-Leaf Diagrams
Stem-and-Leaf Diagram (also called a stemplot): A diagram in which each observation is
separated into two parts, namely, a stem (consisting of all but the rightmost digit) and a
leaf (the rightmost digit).
Procedure 2.7 – To Construct a Stem-and-Leaf Diagram
Step 1: Think of each observation as a stem (consisting of all but the rightmost digit) and a
leaf (the rightmost digit).
Step 2: Write the stems from smallest to largest in a vertical column to the left of a vertical
rule.
Step 3: Write each leaf to the right of the vertical rule in the row that contains the
appropriate stem.
Step 4: Arrange the leaves in each row in ascending order.
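The four steps can be sketched as follows, assuming the observations are nonnegative whole numbers. The data and the function name are hypothetical, and the character "|" stands in for the vertical rule.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return the rows of a stem-and-leaf diagram for nonnegative whole numbers."""
    rows = defaultdict(list)
    for x in data:
        stem, leaf = divmod(x, 10)  # Step 1: all but the rightmost digit, rightmost digit
        rows[stem].append(leaf)     # Step 3: leaf goes in the row of its stem
    lines = []
    for stem in sorted(rows):       # Step 2: stems from smallest to largest
        leaves = "".join(str(leaf) for leaf in sorted(rows[stem]))  # Step 4
        lines.append(f"{stem} | {leaves}")
    return lines

diagram = stem_and_leaf([52, 47, 61, 55, 49, 63, 58, 47])
print("\n".join(diagram))
```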
Section 2.4 Distribution Shapes
Definition 2.12 – Distribution of a Data Set
Distribution of a Data Set: A table, graph, or formula that provides the values of the
observations and how often they occur.
Different kinds of distribution