Business Statistics:
A Decision-Making Approach
7th Edition
Chapter 1
The Where, Why, and How of
Data Collection
What is Statistics?

Statistics is the development and application of methods
to collect, analyze and interpret data.


Modern statistical methods involve the design and analysis of
experiments and surveys, the quantification of biological, social
and scientific phenomena, and the application of statistical
principles to understand more about the world around us.
Statistics is a discipline which is concerned with:




designing experiments and other data collection,
summarizing information to aid understanding,
drawing conclusions from data, and
estimating the present or predicting the future.
Population vs. Sample
[Figure: a Population of items labeled a through z, and a Sample drawn from it containing b, c, g, i, n, o, r, u, y]
Populations and Samples

A Population is the set of all items or individuals
of interest


Examples:
All likely voters in the next election
All parts produced today
All sales receipts for November
A Sample is a subset of the population

Examples:
1000 voters selected at random for interview
A few parts selected for destructive testing
Every 100th receipt selected for audit
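The receipt audit example can be sketched in Python. This is only an illustration: the receipt numbers and the `every_kth` helper are hypothetical, and starting the selection at the 100th receipt is an assumption.

```python
def every_kth(items, k):
    """Return every k-th item: the k-th, 2k-th, 3k-th, and so on."""
    return items[k - 1::k]

# Hypothetical population: receipt numbers 1 through 1000 for November.
receipts = list(range(1, 1001))

# Sample: every 100th receipt, selected for audit.
audit_sample = every_kth(receipts, 100)

print(len(audit_sample))   # number of receipts selected for audit
print(audit_sample[:3])    # first few selected receipt numbers
```

The sample is a subset of the population: every selected receipt number also appears in `receipts`.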
Why Sample?

Less time consuming than a census

Less costly to administer than a census

It is possible to obtain statistical results of a
sufficiently high precision based on samples.
Sampling Techniques
    Nonstatistical Sampling: Convenience, Judgment
    Statistical Sampling: Simple Random, Systematic, Stratified, Cluster
Not interested in……
Statistical Sampling

Items of the sample are chosen based on
known or calculable probabilities
Statistical Sampling (Probability Sampling)
    Simple Random
    Stratified
    Systematic
    Cluster
[Video Clip]
Please read the book
Example of Random Sampling

Suggest how statistical sampling techniques can be used to
gather data on employees' preferences for scheduling vacation
times.

Simple random sampling could be used by assigning
each employee a number and then using a random
number generator to select employees.
Tools: a table of random numbers or the Excel Analysis ToolPak

Simple Random Sampling

Every possible sample of a given size has an equal
chance of being selected

Selection may be with replacement or without
replacement

The sample can be obtained using a table of random
numbers or computer random number generator
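As a minimal sketch of selection with and without replacement, the following uses Python's standard `random` module as the computer random number generator. The employee numbers, the sample size of 5, and the fixed seed are all assumptions made for illustration.

```python
import random

# Hypothetical population: employee numbers 1 through 50.
employee_ids = list(range(1, 51))

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Without replacement: no employee can be selected twice.
without_replacement = rng.sample(employee_ids, k=5)

# With replacement: the same employee may be selected more than once.
with_replacement = rng.choices(employee_ids, k=5)

print(without_replacement)
print(with_replacement)
```

`sample` never repeats an employee, while `choices` draws each selection independently, so repeats are possible.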
Types of Statistics

Descriptive statistics


Mathematical methods (such as mean, median, standard
deviation) that summarize and interpret some of the
properties of a set of data (sample) but do not infer the
properties of the population from which the sample was drawn.
Inferential statistics

Mathematical methods (such as hypothesis development) that
employ probability theory for deducing (inferring) the
properties of a population from the analysis of the properties of
a set of data (sample) drawn from it. It is concerned also with
the precision and reliability of the inferences it helps draw.
Descriptive Statistics

Collect data

e.g., Survey, Observation,
Experiments

Present data


e.g., Charts and graphs
Characterize data

e.g., Sample mean: $\bar{x} = \frac{\sum x_i}{n}$
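The sample mean, together with the median and standard deviation mentioned earlier as descriptive measures, can be computed with Python's standard `statistics` module. The five data values are hypothetical, chosen only to keep the arithmetic easy to check by hand.

```python
import statistics

# Hypothetical sample of n = 5 observations.
sample = [4, 8, 15, 16, 23]

n = len(sample)
sample_mean = sum(sample) / n              # sum of the x_i divided by n
sample_median = statistics.median(sample)  # middle value of the sorted data
sample_sd = statistics.stdev(sample)       # sample standard deviation (n - 1 divisor)

print(sample_mean, sample_median, sample_sd)
```

These numbers only characterize the sample itself; they say nothing yet about the population it came from.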
Inferential Statistics

Making statements about a population by
examining sample results
Sample: sample statistics (known)
Inference: from the sample to the population
Population: population parameters (unknown, but can be estimated from sample evidence)
Inferential Statistics
Drawing conclusions and/or making decisions
concerning a population based on sample results.

Estimation


e.g., Estimate the population mean
weight using the sample mean
weight
Hypothesis Testing

e.g., Use sample evidence to test
the claim that the population mean
weight is 120 pounds
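Both ideas can be illustrated with a small Python sketch. The ten weights are made-up data, and the t statistic is one common way (assumed here, not prescribed by the slides) to measure how far the sample mean lies from the claimed 120 pounds.

```python
import math
import statistics

# Hypothetical sample of weights in pounds (made up for illustration).
weights = [118, 125, 121, 130, 117, 124, 128, 119, 122, 126]

n = len(weights)

# Estimation: the sample mean is the point estimate of the population mean.
x_bar = statistics.mean(weights)

# Hypothesis testing: distance between the sample mean and the claimed
# population mean, measured in estimated standard errors.
claimed_mean = 120
s = statistics.stdev(weights)   # sample standard deviation
std_error = s / math.sqrt(n)    # estimated standard error of the mean
t_statistic = (x_bar - claimed_mean) / std_error

print(f"point estimate: {x_bar}")
print(f"t statistic: {t_statistic:.2f}")
```

The sketch stops at the test statistic. Deciding whether to reject the claim would still require comparing it with a critical value from the t distribution with n - 1 degrees of freedom.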
Tools for Collecting Data
Data Collection Methods
    Experiments
    Telephone surveys
    Written questionnaires
    Direct observation and personal interview
Survey Design Steps

Define the issue

what are the purpose and objectives of the survey?

Define the population of interest

Develop survey questions

make questions clear and unambiguous

use universally-accepted definitions

limit the number of questions
Survey Design Steps
(continued)

Pre-test the survey

pilot test with a small group of participants

assess clarity and length

Determine the sample size and sampling
method

Select sample and administer the survey
Types of Questions

Closed-end Questions


Select from a short list of defined choices
Example: Major: __business __liberal arts
__science __other
Open-end Questions

Respondents are free to respond with any value, words, or
statement
Example: What did you like best about this course?

Demographic Questions

Questions about the respondents’ personal characteristics
Example: Gender: __Female __ Male
Data (variable) Types
Data
    Qualitative (Categorical)
        Examples: Marital Status, Political Party, Eye Color
        (Defined categories)
    Quantitative (Numerical)
        Discrete
            Examples: Number of Children, Defects per hour
            (Counted items)
        Continuous
            Examples: Weight, Voltage
            (Measured characteristics)
Qualitative vs. Quantitative Variables (Data)

Qualitative variables (data) take on values that are
names or labels.


Example: the color of a ball (e.g., red, green, blue) or the breed
of a dog (e.g., collie, shepherd, terrier)
Quantitative variables are numerical. They represent a
measurable quantity.

Example: the number of students at CSUB or the number of people in Bakersfield
Discrete vs. Continuous Variables (Data)

Quantitative variables can be further classified
as discrete or continuous.


If a variable can take on any value between its minimum value
and its maximum value, it is called a continuous variable;
otherwise, it is called a discrete variable.
Example: The fire department mandates that all fire fighters
must weigh between 150 and 250 pounds. The weight of a fire
fighter is an example of a continuous variable, since a
fire fighter's weight could take on any value between 150 and
250 pounds.
Discrete vs. Continuous Variables (Data)

Example: Suppose we flip a coin repeatedly and count the number
of heads. The number of heads could be any integer value between
0 and plus infinity. However, it could not be just any number
between 0 and plus infinity; we could not, for example, get 2.3
heads. Therefore, the number of heads must be a discrete variable.
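The two examples can be contrasted in a short Python sketch. The ten flips, the uniform draw for the weight, and the fixed seed are assumptions for illustration only.

```python
import random

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Discrete: flip a coin 10 times and count the heads.
flips = [rng.choice(["H", "T"]) for _ in range(10)]
num_heads = flips.count("H")   # always a whole number, never 2.3

# Continuous: a hypothetical fire fighter weight, which may be any
# value between 150 and 250 pounds.
weight = rng.uniform(150, 250)

print(num_heads, weight)
```

The count is stored as an integer, while the weight is a floating-point value that can fall anywhere in the allowed range.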
Data Measurement Levels
Ratio/Interval Data: Measurements (Highest Level, Complete Analysis)
Ordinal Data: Rankings, Ordered Categories (Higher Level, Mid-level Analysis)
Nominal Data: Categorical Codes, ID Numbers, Category Names (Lowest Level, Basic Analysis)
Nominal

Nominal basically refers to categorically discrete data
such as name of your school, type of car you drive or
name of a book. This one is easy to remember
because nominal sounds like name (they have the
same Latin root).
Ordinal

Ordinal refers to quantities that have a natural ordering.
Examples include the ranking of favorite sports, the order of
people's place in a line, the order of runners finishing a race or,
more often, the choice on a rating scale from 1 to 5. With
ordinal data you cannot state with certainty whether the
intervals between each value are equal. For example,
we often use rating scales (Likert-scale questions).
On a 10 point scale, the difference between a 9 and a
10 is not necessarily the same as the
difference between a 6 and a 7. This is also an easy
one to remember: ordinal sounds like order.
Interval

Interval data is like ordinal data, except we can say the
intervals between values are equal. The most
common example is temperature in degrees Fahrenheit.
The difference between 29 and 30 degrees is the same
magnitude as the difference between 78 and 79
(although I know I prefer the latter). The attitudinal
scales and Likert questions you usually see on a
survey are rarely interval, although many points
on the scale are likely of equal intervals.
Ratio

Ratio data is interval data with a natural zero point. For
example, time is ratio since 0 time is meaningful.
The Kelvin temperature scale also has a true 0 point (absolute
zero), and the steps in both of these scales have the same
magnitude.
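A short worked example may help separate interval from ratio data. The sketch below uses the standard Fahrenheit-to-Kelvin conversion; the temperatures 40 and 80 and the helper name `fahrenheit_to_kelvin` are chosen only for illustration.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvins."""
    return (deg_f - 32) * 5 / 9 + 273.15

# Interval property: equal steps mean the same thing anywhere on the scale.
print(30 - 29 == 79 - 78)

# Ratios of Fahrenheit readings are not meaningful, because 0 degrees
# Fahrenheit is an arbitrary point on the scale, not a natural zero.
naive_ratio = 80 / 40

# On the Kelvin scale, which starts at absolute zero, the ratio is meaningful.
kelvin_ratio = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)

print(naive_ratio, round(kelvin_ratio, 3))
```

The Fahrenheit readings suggest "twice as hot", but the ratio on the Kelvin scale is only slightly above 1, which is why ratios are reserved for scales with a natural zero.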
Data Types

Time Series Data


Ordered data values observed over time
Cross Section Data

Data values observed at a fixed point in time
Data Types
Sales (in $1000’s)
             2003   2004   2005   2006
Atlanta       435    460    475    490
Boston        320    345    375    395
Cleveland     405    390    410    395
Denver        260    270    285    280
Cross Section Data: a single column (all cities in one year)
Time Series Data: a single row (one city across all years)
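The distinction can be made concrete by slicing the sales figures two ways. This is a minimal sketch: the dictionary layout and variable names are assumptions, while the numbers are the ones given in the sales table.

```python
years = [2003, 2004, 2005, 2006]

# Sales in $1000's, one list of yearly values per city.
sales = {
    "Atlanta":   [435, 460, 475, 490],
    "Boston":    [320, 345, 375, 395],
    "Cleveland": [405, 390, 410, 395],
    "Denver":    [260, 270, 285, 280],
}

# Time series: one city observed over time.
atlanta_series = dict(zip(years, sales["Atlanta"]))

# Cross section: all cities observed at one fixed point in time.
col = years.index(2005)
cross_section_2005 = {city: values[col] for city, values in sales.items()}

print(atlanta_series)
print(cross_section_2005)
```

Reading along a city gives ordered values over time, while reading down a single year compares the cities at the same moment.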