Unit 1: Statistics and Statistical Thinking

• Statistics is the science of data
• Statistics involves collecting, classifying, summarizing,
organizing, analyzing and interpreting numerical information
• Statistics is used in several different disciplines (scientific and
non-scientific) to make decisions and draw conclusions based
on data.
For instance:
• In the pharmaceutical industry, it is impossible to
test every drug on every person who may require
it, so the industry needs statisticians.
• In business, managers must often decide whom
to offer their company’s products to. For example, a
credit card company must assess how risky a
potential customer is.
• An individual who needs to lose weight for an
upcoming film needs to see data on
successful diets.
Average weight loss on various diets across 8 weeks

Weight by week
Diet      1     2     3     4     5     6     7     8
Diet 1   310   310   304   300   290   285   280   284
Diet 2   310   312   308   304   300   295   290   289
Diet 3   310   307   306   303   301   299   297   295
Diet 4   310   308   305   303   297   294   290   287

Based on these numbers, which
diet should he/she adopt?
two types of statistics
• Descriptive statistics: utilizes numerical and graphical methods to look for
patterns in a data set, to summarize the information revealed in a data set,
and to present the information in a convenient form that individuals can
use to make decisions. The main goal of descriptive statistics is to describe
a data set. The class of descriptive statistics includes both numerical
measures (e.g. mean, median) and graphical displays of data (e.g. charts or
graphs).
• Inferential statistics: utilizes sample data to make estimates, decisions,
predictions, or other generalizations about a larger set of data.
descriptive statistics
• Look at the example table of various diets
• What information does the table provide?
 Diet 1 produces the largest overall weight loss
 However, Diet 1 is not stable (see weeks 7 & 8)
 Diet 4 shows a steady decline in weight
 One can make an educated decision suited to his/her
personal weight loss goals.
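The observations above can be checked with a short script. This is only a sketch: the weights are copied from the diet table, while the summary measures (total loss, mean weight, and a "never gains" check) are one possible choice of descriptive statistics, not the only one.

```python
from statistics import mean

# Weekly weights for weeks 1..8, copied from the diet table.
weights = {
    "Diet 1": [310, 310, 304, 300, 290, 285, 280, 284],
    "Diet 2": [310, 312, 308, 304, 300, 295, 290, 289],
    "Diet 3": [310, 307, 306, 303, 301, 299, 297, 295],
    "Diet 4": [310, 308, 305, 303, 297, 294, 290, 287],
}

def summarize(series):
    """Return a few descriptive measures for one diet."""
    # Change from each week to the next (negative means weight went down).
    weekly_changes = [later - earlier for earlier, later in zip(series, series[1:])]
    return {
        "total_loss": series[0] - series[-1],
        "mean_weight": mean(series),
        "never_gains": all(change <= 0 for change in weekly_changes),
    }

summary = {name: summarize(series) for name, series in weights.items()}

for name, stats in summary.items():
    print(name, stats)
```

Diet 1 has the largest total loss but fails the "never gains" check because the weight rises in week 8, while Diet 4 passes it.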
inferential statistics
• The main goal is to draw a conclusion about a population based on a sample of
that population.
• Inferential statistics mostly uses hypothesis testing.
• Key definitions:
• Experimental unit (an object upon which data is collected)
• Population (a set of units that is of interest to study)
• Variable (a characteristic or property of an individual experimental unit)
• Sample (a subset of the units of a population)
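To make the definitions concrete, here is a minimal sketch of estimating a population mean from a sample. The population is synthetic: the salary values, the population size, and the sample size are all assumptions made up for illustration.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: yearly salaries of 10,000 people (synthetic values).
# Each person is an experimental unit; salary is the variable.
population = [random.gauss(50_000, 10_000) for _ in range(10_000)]

# Sample: a randomly chosen subset of 100 units from the population.
sample = random.sample(population, k=100)

population_mean = mean(population)  # usually unknown in practice
sample_mean = mean(sample)          # the estimate we can actually compute

print(f"Sample mean:     {sample_mean:,.0f}")
print(f"Population mean: {population_mean:,.0f}")
```

In a real study only the sample mean would be available; the population mean is printed here only to show how close the estimate can be.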
statistical hypothesis
• An educated guess about the relationship between two (or
more) variables.
• Two main variables:
 Independent variable (the variable that represents the inputs
to the dependent variable, or the variable that can be
manipulated to see whether it is the cause)
 Dependent variable (the variable that represents the effect
that is being tested)
a case of statistical hypothesis
• A literature teacher hypothesizes that requiring students to
read a novel a week over 16 meetings makes them more self-motivated
in their reading habits than students who are accustomed to
a lecture in every meeting.
• Ind. variable: reading a novel per week
• Dep. variable: self-motivation
Since it is impossible to study all students, the teacher takes
a sample and generalizes the result to the entire population.
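One way such a hypothesis could be tested is with a permutation test on the difference in group means. The sketch below is an assumption throughout: the slides do not say which test the teacher uses, and the self-motivation scores are invented for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical self-motivation scores (invented for illustration only).
novel_group = [72, 80, 77, 85, 79, 83, 76, 81]    # read one novel per week
lecture_group = [70, 74, 69, 78, 73, 75, 71, 72]  # lecture in every meeting

observed_diff = mean(novel_group) - mean(lecture_group)

def permutation_p_value(a, b, n_permutations=10_000):
    """One-sided p-value: share of random relabelings whose mean
    difference is at least as large as the observed one."""
    pooled = a + b  # new list, so the inputs are not modified
    observed = mean(a) - mean(b)
    count = 0
    for _ in range(n_permutations):
        random.shuffle(pooled)
        diff = mean(pooled[:len(a)]) - mean(pooled[len(a):])
        if diff >= observed:
            count += 1
    return count / n_permutations

p_value = permutation_p_value(novel_group, lecture_group)
print(f"Observed difference in means: {observed_diff}")
print(f"Approximate one-sided p-value: {p_value}")
```

A small p-value would suggest that a difference this large is unlikely if the teaching method (the independent variable) had no effect on self-motivation (the dependent variable).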
key steps of problem
Descriptive
• Define the population (or
sample) of interest
• Select the variables that are
going to be investigated
• Select the tables, graphs, or
numerical summary tools
• Identify patterns in the data
Inferential
• Define the population of
interest
• Select the variables that are
going to be investigated
• Select a sample of population
units
• Run the statistical tests on
the sample
• Generalize the result to your
population and draw
conclusions
types of data
Qualitative Data
• Measurements that cannot
be recorded on a natural
numerical scale
• Measurements can only be
classified into one of a
group of categories
• Example: brands of shoes
(Nike, Adidas, or K-Swiss),
gender (male or female)
Quantitative Data
• Measurements that can be
recorded on a naturally
occurring numerical scale
• Example: people’s salary in
a year
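The type of data determines which summaries make sense. A minimal sketch, using invented survey responses: qualitative data is summarized by counting categories, quantitative data by numerical measures such as the mean and median.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical survey responses (invented for illustration).
shoe_brands = ["Nike", "Adidas", "Nike", "K-Swiss", "Adidas", "Nike"]  # qualitative
yearly_salaries = [42_000, 55_000, 38_500, 61_000, 47_250, 55_000]     # quantitative

# Qualitative data: count how often each category occurs.
brand_counts = Counter(shoe_brands)
print(brand_counts.most_common())

# Quantitative data: numerical summaries are meaningful.
salary_mean = mean(yearly_salaries)
salary_median = median(yearly_salaries)
print(salary_mean, salary_median)
```

Taking the mean of the shoe brands would be meaningless, which is exactly what distinguishes the two types of data.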
take-home assignment
1. Discuss the difference between descriptive and inferential statistics.
2. Give an example of a research question that would use an inferential
statistics solution.
3. Identify the independent and dependent variables in the following
research question: A production manager is interested in knowing
whether employees are more effective if they work a shorter work week. To
answer his question he proposes the following research question:
Do more widgets get made if employees work 4 days a week or 5
days a week?
4. What is the difference between a population and a sample?
5. Write about a decision you once made in your lifetime using
descriptive statistics!