Lecture Set 1 - SFU Mathematics and Statistics Web Server

Statistics 100
Lecture Set 1
• Course outline and important details about the course
• Chapter 1 … today
• Will be doing chapter 2 in the next lecture set
• Some suggested problems:
– Chapter 1: 1.3, 1.5, 1.11, 1.13, 1.17
Important Stuff
• Statistics and Actuarial Science Stats Lab
(Statistics Workshop)
– What is the Stats Lab for? One-on-one help is available during its
hours of operation.
– Where is it? The Stats Lab is located in K9516 (inside K9510)…
– How does the Stats Lab work?
– The Statistics Workshop opens for regular use from the second
week of classes. The hours will depend on the amount of T.A. time
available and will be posted at the end of the first week of classes.
The Workshop will be open only when there is a T.A. on duty.
• Typically, Mon-Fri: 9:30-16:30
Important Stuff
• Text: Statistics: Concepts and Controversies, 8th
edition, by Moore and Notz
• People have asked about 7th edition …
• Read Chapters 1 and 2 this week (they are short)
Important Stuff
• Course web page can be found at:
www.stat.sfu.ca/~dbingham/stat100
• Download lecture notes day before class
• Will also have announcements (e.g., exam dates)
• Also has my office hours posted (Monday and
Wednesday 1:00-2:00)
Important Stuff
Grading Scheme:
– Assignments: 10%
– Midterm 1: 20%
– Midterm 2: 20%
– Final: 50%
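The weights in the grading scheme combine into a course mark as a weighted average. A minimal sketch of that arithmetic follows; the component scores are hypothetical values chosen only for illustration, and the course may apply rounding or other rules not stated on the slide.

```python
# Weights taken from the grading scheme on the slide.
weights = {
    "assignments": 0.10,
    "midterm_1": 0.20,
    "midterm_2": 0.20,
    "final": 0.50,
}

# Hypothetical scores out of 100 (NOT from the course).
scores = {
    "assignments": 90,
    "midterm_1": 70,
    "midterm_2": 80,
    "final": 75,
}

# Weighted average: each score counts in proportion to its weight.
course_mark = sum(weights[part] * scores[part] for part in weights)
print(f"course mark: {course_mark:.1f}")
```

Because the final is worth half of the mark, a 10-point change on the final moves the course mark by 5 points, while the same change on one midterm moves it by 2.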
Tentative mid-term dates
– Mid-Term 1: Monday, February 17
– Mid-Term 2: Monday, March 17
Important Stuff
• Assignments: 8-10 of them
• Usually will be due Wednesdays, before 4:30 in
boxes outside lab
• The boxes are labeled (by class and alphabetically)
• Note:
1. Late assignments will not be accepted
2. Assignments placed in the wrong box (e.g., stat 270) will not be
accepted
Important Stuff
• The classroom is likely to be full
• Be courteous … when you come in, do not sit in the
aisle seat (unless you are left-handed)
• Do not put your bag down on a seat ….
• Turn off cell phones, do NOT text, …
• People with laptops …
Important Stuff
Other stuff
– Class email list: I will occasionally email the class with
hints and other information.
– If the email is not from me or Robin Insley (lab instructor),
then it is likely spam
What is this course about
• Statistical methods are used everywhere
– Health studies
– Industry
– Economics
– Studying manuscripts
• Most courses I teach are concerned with statistical methods
– How to fit models to data
– How to use statistics to make decisions
• This course is not about those things
• This course is about statistical reasoning
How to do well
• Study and practice
• Ask questions
• Office hours and the drop-in lab
PART I: Producing Data
• Not every product you could buy is well-made
– Cars, phones, clothes, food
– Cheaply & poorly made vs. carefully & properly made
• Data are the same way
– Not all numbers should be viewed as having equal quality
– How they are collected says a lot about the information that they
convey and our degree of belief
• Chapter 1 introduces data collection
Chapter 1: Where do Data Come From?
What are the data?
Example
• Does living near high-voltage power lines cause childhood
leukemia?
• A study was conducted to see if there is evidence that magnetic
fields are related to leukemia
• Researchers compared 628 children who had leukemia
and 620 who did not
• Measured magnetic field in the rooms in their houses
• What are the data?
Example
• What are Data?
Individuals (rows) and Variables (columns):
Name     Leukemia   Magnetic field in bedroom   …
Susan    Yes        0.15 μT
Bobby    No         0.12 μT
.        .          .
.        .          .
.        .          .
Some Definitions
• Interested in something about a population.
• Population is a collection of individuals.
• Individuals are the objects described by the data
• Data sets contain information/facts relating to
individuals.
• Variables are attributes of an individual (e.g., hair
color, pain severity, ...).
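One way to picture these definitions is as a small data structure: each individual is one record, and each variable is one field of that record. The sketch below uses the two rows from the leukemia example. The class and field names are hypothetical, and treating the name as a label rather than a variable is an assumption about how to read the slide.

```python
from dataclasses import dataclass, fields

@dataclass
class Child:
    """One individual, i.e. one row of the data set."""
    name: str                 # label identifying the individual (assumed not a variable)
    leukemia: bool            # variable: does the child have leukemia?
    bedroom_field_ut: float   # variable: magnetic field in bedroom, in microtesla

# Only the two rows shown in the example; the study itself compared
# 628 children with leukemia and 620 without.
data = [
    Child("Susan", True, 0.15),
    Child("Bobby", False, 0.12),
]

individuals = [child.name for child in data]
variables = [f.name for f in fields(Child) if f.name != "name"]

print("individuals:", individuals)
print("variables:", variables)
```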
How are data collected?
• A good deal of effort is spent trying to figure out
what data to collect
– Which individuals are measured?
– What should be measured to answer the questions of
interest?
– What population was the data collected from?
– What is the population of interest?
– Can we afford to conduct the study?
How are data collected?
• Purpose of Study
– Learn something about a group of individuals
– Population = group of individuals that you want to know
about
– Sample = group of individuals that you actually measure
– Examples…
– Why not just measure the entire population (census)?
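The difference between a population, a sample, and a census can be sketched with a small simulation. Everything here is hypothetical: the population of 10,000 individuals, the yes/no variable, and the sample size of 500 are made up for illustration. The point is only that a sample measures far fewer individuals than a census while giving an estimate of the same quantity.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 individuals; 1 = "yes", 0 = "no".
population = [1] * 5200 + [0] * 4800

# Census: measure every individual in the population.
census_share = sum(population) / len(population)

# Sample: measure only 500 individuals, chosen at random.
sample = random.sample(population, 500)
sample_share = sum(sample) / len(sample)

print(f"census (n={len(population)}): {census_share:.3f}")
print(f"sample (n={len(sample)}):   {sample_share:.3f}")
```

The sample share will usually differ a little from the census share. How large that difference tends to be, and how it depends on the way the sample is chosen, is taken up in later chapters.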
Observational studies
• Observational study: observes individuals and measures
variables but does not attempt to influence the responses.
• The outcome(s) of interest is called the response variable.
• Observational studies (“you can observe a lot by watching”)
– Identify an individual, watch/measure variables
– Do not interfere, merely observe (collect data)
– Generally inexpensive… very common
Example (back to leukemia study)
• Chapter 1 has a discussion of leukemia and power lines
• Looking for an association between magnetic fields and leukemia
• Measured electromagnetic fields & lots of other variables
• Found no link, despite anecdotes
• Notice that researchers did not interfere in the study (e.g., did
not intentionally expose children to magnetic fields)
Sample Surveys
• Sample survey: a collection of individuals (the sample) is chosen
from the population in a specific, quantifiable manner and then
measured
– Special kind of observational study
– Use a sample carefully chosen from the population to best
represent the population
– Idea is that the sample should be representative of the population,
so that we can learn about the population from the sample
• Examples:
– Political polls: how can we tell who will win an election?
– Government surveys: inform policy
– Market research
– NOT the leukemia study (more on this in chapters 2-4)
Census
• Census: A sample survey where the sample is (ideally) the
entire population
• Example: Statistics Canada conducts the Census of Population
and the Census of Agriculture to develop a statistical portrait of
Canada and its people
Census
• Interesting side note: In the summer of 2010, the Canadian
Federal Government announced that the 2011 long-form census
questionnaire will no longer be mandatory
• What does this mean?
• “I want to take this opportunity to comment on a technical
statistical issue which has become the subject of media
discussion. This relates to the question of whether a voluntary
survey can become a substitute for a mandatory census. It can
not.” — Munir Sheikh, Chief Statistician of Canada
Experiments
• Experiment: a study in which a treatment is deliberately
imposed on individuals in order to observe their responses.
• Why do this?
• Why was this not done in the leukemia study?
• Experiments:
• Clinical Trials
• Agriculture
• Manufacturing
Example (Pain Reduction and Reiki)
• Is Reiki an effective pain management tool?
• Reiki treatment is a touch therapy used as an alternative to pain
medication.
• A pilot study involving 20 volunteers experiencing pain was conducted
• All treatments were provided by a certified Reiki therapist
• Pain was measured before and after the Reiki treatment
• What kind of study is this?
• Is this a good study (more on this later)?
• If the study were repeated, would we see the same results?
Example (Saving for Retirement)
• What are the attitudes of low wage earners about saving for
retirement?
• Americans earning $35,000 or less were asked how they are
likely to accumulate enough money to retire.
• What are the data?
• What is the population?
• What kind of study is this?
Observational study, likely a
sample survey
Chapter 1: Where do Data Come From?
• Which is worse:
– Not knowing the answer to a question
– Thinking you know the answer, but being wrong
"We know he's been absolutely devoted to trying to
acquire nuclear weapons, and we believe he has, in fact,
reconstituted nuclear weapons."
Dick Cheney, March 16, 2003
• There are many ways to collect data
– Some studies provide good information
– Most don’t
– How can you tell which is which?
– Being skeptical about studies and identifying good
sampling techniques is key
Brief Moment of Statistical Relevance
• Two highlights from the commercial:
– Shoe is proven to “…work your hamstrings and calves 11%
harder…”
– Shoe is proven to “…tone your butt up to 25% more than regular
sneakers just by walking…”
• Some other facts not in the commercial:
– The study was based on a sample of 5 women who walked on a
treadmill for 500 steps while wearing the EasyTone, while wearing
another Reebok walking shoe, and while barefoot.
– From the Reebok fine print: “The shoes are designed only for
walking, and because of the instability design, wearers are
discouraged from running, jumping and engaging in other
athletic activities while wearing them.”
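A sample of 5 people raises the question of how far an average from so few individuals can move by chance alone. The sketch below is a simulation under made-up assumptions: every simulated person has a true average difference of 0%, and individual measurements vary with a spread of 15 percentage points. Neither number comes from the Reebok study. The function name and its parameters are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def share_of_studies_at_least(n_people, threshold=11.0, n_studies=2000):
    """Fraction of simulated studies whose average difference is >= threshold.

    Assumptions (made up for illustration): the true average difference
    is 0%, and individual differences follow a normal distribution with
    a standard deviation of 15 percentage points.
    """
    hits = 0
    for _ in range(n_studies):
        diffs = [random.gauss(0.0, 15.0) for _ in range(n_people)]
        if sum(diffs) / len(diffs) >= threshold:
            hits += 1
    return hits / n_studies

small = share_of_studies_at_least(5)     # studies with 5 people each
large = share_of_studies_at_least(100)   # studies with 100 people each

print(f"n=5:   {small:.3f} of studies show an average of 11% or more")
print(f"n=100: {large:.3f} of studies show an average of 11% or more")
```

Under these assumptions, some studies of 5 people show an average difference of 11% or more even though nothing real is going on, while studies of 100 people almost never do. This is one reason to ask how many individuals were measured before trusting a headline number.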