Chapter 1
Introduction to Statistics
1-1 Overview
1- 2 Types of Data
1- 3 Abuses of Statistics
1- 4 Design of Experiments
1
1-1
(Definition)
A collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data
2
Definitions
Population
The complete collection of all data to be studied.
Sample
The subcollection data drawn from the population.
3
Example
Identify the population and sample in the study
A quality-control manager randomly selects 50 bottles of Coca-Cola to assess the calibration of the filing machine.
4
Definitions
Broken into 2 areas
Descriptive Statistics
Inferencial Statistics
5
Definitions
Descriptive Statistics
Describes data usually through the use of graphs, charts and pictures. Simple calculations like mean, range, mode, etc., may also be used.
Inferencial Statistics
Uses sample data to make inferences (draw conclusions) about an entire population
Test Question
6
Types of Data
Parameter vs. Statistic
Quantitative Data vs. Qualitative Data
Discrete Data vs. Continuous Data
7
Definitions
a numerical measurement describing some characteristic of a population population parameter
8
Definitions
a numerical measurement describing some characteristic of a sample sample statistic
9
Examples
Parameter
51% of the entire population of the US is
Female
Statistic
Based on a sample from the US population is was determined that 35% consider themselves overweight.
10
Definitions
Quantitative data
Numbers representing counts or measurements
Qualitative (or categorical or attribute) data
Can be separated into different categories that are distinguished by some nonnumeric characteristics
11
Examples
Quantitative data
The number of FLC students with blue eyes
Qualitative (or categorical or attribute) data
The eye color of FLC students
12
Definitions
We further describe quantitative data by distinguishing between discrete and continuous data
Discrete
Quantitative
Data
Continuous
13
Definitions
Discrete data result when the number of possible values is either a finite number or a ‘countable’ number of possible values
0, 1, 2, 3, . . .
Continuous
(numerical) data result from infinitely many possible values that correspond to some continuous scale or interval that covers a range of values without gaps, interruptions, or jumps
2 3
14
Examples
Discrete
The number of eggs that hens lay; for example, 3 eggs a day.
Continuous
The amounts of milk that cows produce; for example, 2.343115 gallons a day.
15
Definitions
Univariate Data
» Involves the use of one variable (X)
» Does not deal with causes and relationship
Bivariate Data
» Involves the use of two variables (X and Y)
» Deals with causes and relationships
16
Example
Univariate Data
How many first year students attend FLC?
Bivariate Data
Is there a relationship between then number of females in Computer Programming and their scores in Mathematics?
17
Important Characteristics of Data
1. Center : A representative or average value that indicates where the middle of the data set is located
2. Variation : A measure of the amount that the values vary among themselves or how data is dispersed
3. Distribution : The nature or shape of the distribution of data
(such as bell-shaped, uniform, or skewed)
4. Outliers : Sample values that lie very far away from the vast majority of other sample values
5. Time : Changing characteristics of the data over time
18
Almost all fields of study benefit from the application of statistical methods
Sociology, Genetics, Insurance, Biology, Polling,
Retirement Planning, automobile fatality rates, and many more too numerous to mention.
19
1-3 Abuses of Statistics
Bad Samples
Small Samples
Loaded Questions
Misleading Graphs
Pictographs
Precise Numbers
Distorted Percentages
Partial Pictures
Deliberate Distortions
20
Abuses of Statistics
Bad Samples
Inappropriate methods to collect data. BIAS
(on test)
Example: using phone books to sample data.
Small Samples
(will have example on exam)
We will talk about same size later in the course.
Even large samples can be bad samples.
Loaded Questions
Survey questions can be worked to elicit a desired response
21
Abuses of Statistics
Bad Samples
Small Samples
Loaded Questions
Misleading Graphs
Pictographs
Precise Numbers
Distorted Percentages
Partial Pictures
Deliberate Distortions
22
Salaries of People with Bachelor’s Degrees and with High School
Diplomas
$40,000
$40,500
$40,000
$40,500
35,000 30,000
$24,400
30,000 20,000
$24,400
25,000 10,000
20,000
Bachelor High School
Degree Diploma
(a) (test question)
0
Bachelor High School
Degree Diploma
(b)
23
We should analyze the numerical information given in the graph instead of being mislead by its general shape.
24
Abuses of Statistics
Bad Samples
Small Samples
Loaded Questions
Misleading Graphs
Pictographs
Precise Numbers
Distorted Percentages
Partial Pictures
Deliberate Distortions
25
Double the length, width, and height of a cube, and the volume increases by a factor of eight
What is actually intended here? 2 times or 8 times?
26
Abuses of Statistics
Bad Samples
Small Samples
Loaded Questions
Misleading Graphs
Pictographs
Precise Numbers
Distorted Percentages
Partial Pictures
Deliberate Distortions
27
Abuses of Statistics
Precise Numbers
There are 103,215,027 households in the
US. This is actually an estimate and it would be best to say there are about 103 million households.
Distorted Percentages
100% improvement doesn’t mean perfect.
Deliberate Distortions
Lies, Lies, all Lies
28
Abuses of Statistics
Bad Samples
Small Samples
Loaded Questions
Misleading Graphs
Pictographs
Precise Numbers
Distorted Percentages
Partial Pictures
Deliberate Distortions
29
Abuses of Statistics
Partial Pictures
“Ninety percent of all our cars sold in this country in the last 10 years are still on the road.
”
Problem: What if the 90% were sold in the last 3 years?
30
1-4
Design of Experiments
31
Definition
Experiment apply some treatment (Action)
Event observe its effects on the subject(s) (Observe)
Example: Experiment: Toss a coin
Event: Observe a tail
32
Designing an Experiment
Identify your objective
Collect sample data
Use a random procedure that avoids bias
Analyze the data and form conclusions
33
Methods of Sampling
Random (type discussed in this class)
Systematic
Convenience
Stratified
Cluster
34
Definitions
Random Sample members of the population are selected in such a way that each has an equal chance of being selected (if not then sample is biased)
Simple Random Sample
(of size n ) subjects selected in such a way that every possible sample of size n has the same chance of being chosen
35
Random Sampling selection so that each has an equal chance of being selected
36
Systematic Sampling
Select some starting point and then select every K th element in the population
37
Convenience Sampling use results that are easy to get
38
Stratified Sampling subdivide the population into at least two different subgroups that share the same characteristics, then draw a sample from each subgroup (or stratum)
39
Cluster Sampling divide the population into sections (or clusters); randomly select some of those clusters; choose all members from selected clusters
40
Definitions
Sampling Error the difference between a sample result and the true population result; such an error results from chance sample fluctuations.
Nonsampling Error sample data that are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly).
41
Using Formulas
Factorial Notation
8! = 8x7x6x5x4x3x2x1
Order of Operations
1. ( )
2. POWERS
3. MULT. & DIV.
4. ADD & SUBT.
5. READ LIKE A BOOK
Keep number in calculator as long a possible
42