Uploaded by Rae Stanwood

Chapter 1:
Why Study Statistics?
Statistical techniques are used to make many decisions that involve data.
An understanding of statistical methods will help us make decisions more effectively.
Statistics turns data into information.
What is Meant by Statistics?
Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting
numerical data which can then be used as a basis for inference to assist in making better
decisions.
Who Uses Statistics?
Statistical techniques are used extensively by Marketers, Investors, Accountants,
Consumers, Professional sports people, Hospital administrators, Educators, Politicians,
Physicians. What examples can you think of?
See examples: Operations: quality control, reliability; Advertising: household surveys, TV viewing habits; Strategists: forecasting, planning, risk minimization.
Data
Raw facts or measurements of interest; values assigned to observations or measurements.
Information
Data that are transformed into useful facts that can be used for a specific purpose, such as making a decision.
Low Temperature in Celsius for NY first week of January, 2018
Jan 1: -14
Jan 2: -16
Jan 3: -9
Jan 4: -7
Jan 5: -13
Jan 6: -14
Jan 7: -15
www.accuweather.com
Average: ______
Meaningful: Lowest Jan. temperature historically for NY
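The blank average above can be worked out directly from the seven daily lows. The following is a minimal Python sketch that uses only the values listed in the table; the variable names are illustrative and not part of the original notes.

```python
from statistics import mean

# Daily low temperatures (°C) for NY, Jan 1-7, 2018, as listed above
lows_c = [-14, -16, -9, -7, -13, -14, -15]

average_low = mean(lows_c)  # sum of the values divided by their count
coldest = min(lows_c)       # lowest value in the data set

print(f"Average low: {average_low:.1f} °C")
print(f"Coldest day: {coldest} °C")
```

The seven raw values (data) are reduced to a single summary of about -12.6 °C (information), which is easier to compare against a historical figure.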
Data Set: A collection of data points
Database: A collection of data points that contains many rows (records) and columns
(fields)
Data Sources:
Primary: Collected for your own use.
Secondary: Data collected by someone else.
Which kind of data was the temperature data?
Primary Data
Advantages: collected by the person or organization that uses the data
Disadvantages: can be expensive and time-consuming to gather
Secondary Data
Disadvantages: no control over how the data was collected, less reliable unless collected
and recorded accurately
Advantages: readily available, less expensive to use
Data Sources
Existing Sources
Data needed for a particular application might already exist within a firm, for example,
detailed information on customers, suppliers, and employees.
Government agencies are another important source of data.
Substantial amounts of business and economic data are available from organizations
that specialize in collecting and maintaining data.
Data are also available from a variety of industry associations and special-interest
organizations.
Statistical Studies
In experimental studies the variables of interest are first identified. Then one or more
factors are controlled so that data can be obtained about how the factors influence the
variables.
In observational (non-experimental) studies no attempt is made to control or influence
the variables of interest. Surveys are perhaps the most common type. If a survey is conducted
properly, no attempt is made to control or influence the variable of interest.
See example of bias:
Bias can occur when a question is worded in a way that encourages particular answers.
Ways of collecting data: Observation; Experiments: treatments are applied in a controlled
environment; Surveys: subjects are asked questions.
Data Acquisition: Cost-Benefit Analysis
Time Requirement: Information might no longer be useful by the time it is available.
Cost of Acquisition: The cost of acquiring the information must be worth the
information it provides.
Data Errors: Part of the cost of acquiring worthwhile information is taking care in
choosing how the data is measured, so that the study measures what it is supposed to
measure.
Qualitative vs Quantitative Data
Qualitative: descriptive, e.g. eye color, political party
Quantitative: counted or measured, e.g. number of children, weight, voltage
EG:
Quantitative Data/Variables
• Ordinary arithmetic operations are meaningful only with quantitative data.
• Discrete variables assume only certain counted values.
• Discrete data show gaps when graphed on a number line.
• A continuous variable can assume any value within a specified range.
• Continuous data appear as a solid line on the number line, with no gaps.
There are four levels of measurement for data:
Nominal: Qualitative
• Labels or names used to identify an attribute.
• May be a nonnumeric label or a numeric code.
• Examples? Postal codes, hair color
Ordinal: Qualitative
• Like nominal data, but the order or rank of the data is meaningful.
• The differences between data values cannot be determined or are meaningless.
• A nonnumeric label or a numeric code may be used.
• Examples? Education level (master's and doctorate)
Interval: Quantitative
• Similar to the ordinal level, but meaningful differences between data values can be determined.
• There is no natural zero point: the “zero” is assigned by the scale and does not mean the absence of the quantity being measured.
• Interval data is always numeric.
• Examples? Calendar year, temperature in Celsius or Fahrenheit
Ratio: Quantitative
• Interval level with an inherent zero starting point (a true zero).
• Differences and ratios are meaningful for this level of measurement.
• Ratio data is always numeric.
• Examples? Income
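The difference between the interval and ratio levels can be made concrete with a little arithmetic. The sketch below is illustrative only: the two temperatures and the two incomes are hypothetical values, and the Kelvin conversion is brought in as an assumption to show what a true zero changes; it is not part of the original notes.

```python
# Interval scale: Celsius has an assigned zero, so differences are
# meaningful but ratios are not.
t1_c, t2_c = 10.0, 20.0        # hypothetical temperatures in °C
diff_c = t2_c - t1_c           # 10 degrees warmer: a meaningful statement
ratio_c = t2_c / t1_c          # 2.0, but 20 °C is NOT "twice as hot" as 10 °C

# Kelvin has a true zero (a ratio scale), so its ratio is meaningful.
t1_k, t2_k = t1_c + 273.15, t2_c + 273.15
ratio_k = t2_k / t1_k          # roughly 1.035, not 2

# Ratio scale: income has a true zero, so ratios are meaningful.
income_a, income_b = 40_000, 80_000   # hypothetical incomes
income_ratio = income_b / income_a    # 2.0: B earns twice as much as A

print(f"Celsius ratio: {ratio_c:.3f}, Kelvin ratio: {ratio_k:.3f}")
print(f"Income ratio: {income_ratio:.1f}")
```

The Celsius ratio changes when the zero point is moved, which is why ratios are not meaningful at the interval level; the income ratio has no such dependence.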
Cross-Sectional and Time Series Data
Time series data is collected over a range of time periods.
Cross-sectional data is collected at the same or approximately the same point in time.
See Table 1.4, 1.5, 1.6 in text
Branches of Statistics
Descriptive Statistics:
Methods of organizing, summarizing, and presenting data in an informative way.
Examples: a census of a population; the individual responses of registered voters
regarding their choice of PM of Canada
Inferential Statistics
Making claims or conclusions about a population by examining sample results
Predictive statistics
Analyzing past data to predict future values and make decisions
Population vs Sample
Population
• represents all possible subjects that are of interest in a particular study.
Sample
• refers to a portion of the population that is representative of the
population from which it was selected
Parameter vs Statistic
Parameter –
• a described characteristic about a population
Statistic –
• a described characteristic about a sample
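The distinction between a parameter and a statistic can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the population is simulated (10,000 hypothetical weights in kg), and the sample size of 100 is arbitrary.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated weights (kg), for illustration only
population = [random.gauss(70, 10) for _ in range(10_000)]

# Parameter: a characteristic of the whole population
population_mean = mean(population)

# Statistic: the same characteristic computed from a sample
sample = random.sample(population, 100)
sample_mean = mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean; inferential statistics is about reasoning from the former to the latter.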
Why Sample the Population?
Contacting the whole population would often be too time-consuming.
The cost of studying all the items in a population may be prohibitive.
The sample results are adequate.
The destructive nature of some tests, or the physical impossibility of checking all items in
the population, makes it impossible to study the entire population.
Inferential Statistics
Figure 1.8