Chapter 1 Stats Starts Here

STAT 200
What is Statistics?
Statistics is a science that involves designing studies, collecting data,
summarizing and analyzing the data, interpreting the results, and drawing
conclusions.
Conclusions are made about specific phenomena on the basis of relatively
limited information.
Data and Variables
Questions of interest:
How big and frequent are earthquakes in Canada?
How do earthquakes that occur in British Columbia compare with those in
the rest of Canada?
Eugenia Yu, UBC Department of Statistics. Not to be copied, used, or revised without explicit written permission from the copyright owner.
Statistical process of investigation:
1. Collect data:
What data to collect?
How and where do we obtain data?
2. Examine the data:
How do we present and obtain useful information from the data?
We will learn the techniques in the next few chapters.
3. Interpret the results and draw conclusions
Solution: We can examine earthquake data from the past year, which are
available on the Natural Resources Canada website at:
http://www.earthquakescanada.nrcan.gc.ca/recent/maps-cartes/index-eng.php
Let’s examine the data on earthquakes that occurred in 2020. The
data are displayed in reverse chronological order.
Date      Season  Depth  Magnitude  Region
20201231  Winter  20     1.3        BC
20201231  Winter  20.11  1.9        NT
20201231  Winter  2.71   4          BC
20201231  Winter  19.86  1.8        BC
20201231  Winter  18     2.1        ON
20201230  Winter  14.14  2.1        BC
20201230  Winter  20.32  2.5        BC
20201230  Winter  25.55  2.4        BC
20201230  Winter  2      2.2        ON
20201230  Winter  18     2.1        QC
20201230  Winter  38.75  1.9        BC
20201229  Winter  1      0.8        BC
20201229  Winter  13.57  1.8        BC
20201229  Winter  5      2          NB
:         :       :      :          :
The full data set can be found on the lecture notes page on the course website.
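As a rough illustration of the second question of interest, the rows displayed above can be typed in by hand and summarized. The sketch below uses Python, which these notes do not otherwise use; the names `ROWS` and `summarize_by_group` are made up for this illustration, and 14 rows are far too few to support any real conclusion about 2020.

```python
from datetime import datetime
from statistics import mean

# The 14 rows displayed above: (date, season, depth, magnitude, region).
ROWS = [
    ("20201231", "Winter", 20.0, 1.3, "BC"),
    ("20201231", "Winter", 20.11, 1.9, "NT"),
    ("20201231", "Winter", 2.71, 4.0, "BC"),
    ("20201231", "Winter", 19.86, 1.8, "BC"),
    ("20201231", "Winter", 18.0, 2.1, "ON"),
    ("20201230", "Winter", 14.14, 2.1, "BC"),
    ("20201230", "Winter", 20.32, 2.5, "BC"),
    ("20201230", "Winter", 25.55, 2.4, "BC"),
    ("20201230", "Winter", 2.0, 2.2, "ON"),
    ("20201230", "Winter", 18.0, 2.1, "QC"),
    ("20201230", "Winter", 38.75, 1.9, "BC"),
    ("20201229", "Winter", 1.0, 0.8, "BC"),
    ("20201229", "Winter", 13.57, 1.8, "BC"),
    ("20201229", "Winter", 5.0, 2.0, "NB"),
]


def summarize_by_group(rows):
    """Count and average magnitude for BC versus all other regions."""
    groups = {"BC": [], "Rest of Canada": []}
    for _date, _season, _depth, magnitude, region in rows:
        key = "BC" if region == "BC" else "Rest of Canada"
        groups[key].append(magnitude)
    return {
        key: {"count": len(mags), "mean_magnitude": round(mean(mags), 2)}
        for key, mags in groups.items()
    }


# Dates are stored as YYYYMMDD codes; parse them before using them as dates.
first_date = datetime.strptime(ROWS[0][0], "%Y%m%d").date()

summary = summarize_by_group(ROWS)
print(first_date)
print(summary)
```

Applied to the full data set, the same grouping would give a first numerical comparison between British Columbia and the rest of Canada.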
In a typical data set, each row contains information corresponding to an
individual or an object or an experimental unit.
A variable refers to a characteristic of interest, e.g. magnitude of
an earthquake. A variable can be:
1. qualitative/categorical – the values are categories. Categorical
variables with categories that can be ordered are called ordinal variables.
An example of a categorical variable is marital status, with four
categories: single, married, divorced, and widowed. These categories have
no natural order.
Another example is severity of pain. Suppose that there are three
categories: mild, moderate, and severe. Since the categories can be
ranked, it is an ordinal variable.
2. quantitative (measured on a numerical scale) – units should be
attached.
Some examples of quantitative variables are height, weight, and
lifetime.
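The difference between ordered and unordered categories can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `PAIN_ORDER`, `MARITAL_STATUS`, and `rank` are hypothetical, and the `responses` list is invented rather than taken from any data set.

```python
# Ordinal variable: the categories have a meaningful order.
PAIN_ORDER = ["mild", "moderate", "severe"]

# Categorical variable without a natural order: a set is enough,
# because no category ranks above another.
MARITAL_STATUS = {"single", "married", "divorced", "widowed"}


def rank(level):
    """Position of a pain level in the ordering (0 = least severe)."""
    return PAIN_ORDER.index(level)


# Hypothetical responses, invented for this illustration only.
responses = ["severe", "mild", "moderate", "mild"]

# Sorting and taking a maximum make sense only because an order exists.
sorted_responses = sorted(responses, key=rank)
worst = max(responses, key=rank)

print(sorted_responses)
print(worst)
```

Sorting or taking a maximum is meaningful for pain severity because the categories can be ranked; no comparable operation is defined for marital status.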
Example: Earthquake data
Variable   Variable type  Unit
Season     Categorical    N/A
Depth      Quantitative   kilometers
Magnitude  Quantitative   dyne-cm (Richter scale)
Region     Categorical    N/A
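The type of a variable determines which summaries are sensible: counts for categorical variables, numerical summaries with units for quantitative ones. A minimal Python sketch of that idea follows. The names `VARIABLES` and `summarize` are hypothetical, the types and units are copied from the earthquake example, and the values are the first five rows of the 2020 data shown earlier in these notes.

```python
from collections import Counter
from statistics import mean

# Variable types and units as listed in the earthquake example.
VARIABLES = {
    "Season": ("categorical", None),
    "Depth": ("quantitative", "kilometers"),
    "Magnitude": ("quantitative", "dyne-cm (Richter scale)"),
    "Region": ("categorical", None),
}


def summarize(name, values):
    """Pick a summary that suits the declared type of the variable."""
    kind, unit = VARIABLES[name]
    if kind == "categorical":
        # Categories can only be counted.
        return dict(Counter(values))
    # Quantitative values support arithmetic; report the unit alongside.
    return {
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
        "unit": unit,
    }


# First five rows of the 2020 data shown earlier in the notes.
regions = ["BC", "NT", "BC", "BC", "ON"]
depths = [20, 20.11, 2.71, 19.86, 18]

region_summary = summarize("Region", regions)
depth_summary = summarize("Depth", depths)

print(region_summary)
print(depth_summary)
```

Declaring the type explicitly, rather than guessing it from the values, avoids treating a numeric code such as the Date column as a quantity to be averaged.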
Understanding the Data
• Who - the subjects we wish to study
• What - the variables of interest
• Where - the location in which the study is conducted / the data are collected
• When - the time point or time period at which the data are collected
• Why - the purpose of doing the study / collecting the data
• How - the method used to collect the data