SAMPLING METHODS
Simple Random – Everyone in the population has an equal chance of selection.
Example: Getting a list of older cardiac patients in Toronto from local GPs, then picking participants from that list with a randomization algorithm.
Voluntary & Convenience – Participants are the most accessible or self-selected.
Example: Recruiting the first 40 patients entering a hospital’s emergency department.
Stratified Random – Population divided into subgroups, then sampled randomly within each.
Example: Studying newcomers’ housing experiences by sampling within nationality groups.
Cluster Random – Population divided into naturally occurring groups (clusters); whole clusters are selected at random, then their members are surveyed.
Example: Surveying patient satisfaction in a few hospital departments chosen at random.
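The three random methods above can be sketched in a few lines of Python. This is only an illustration under assumptions: the patient list, the department names, and the function names are all made up, and the cluster version shown is the one-stage kind (pick whole groups at random and keep every member).

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every member of the population has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Divide the population into subgroups, then sample randomly within each."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }

def cluster_sample(clusters, n_clusters, seed=0):
    """Pick whole clusters at random and keep all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical population: 30 patients spread evenly over 3 departments.
departments = ["ER", "Cardiology", "Oncology"]
patients = [{"id": i, "dept": departments[i % 3]} for i in range(30)]

srs = simple_random_sample(patients, 5)
strat = stratified_sample(patients, lambda p: p["dept"], 2)

by_dept = {}
for p in patients:
    by_dept.setdefault(p["dept"], []).append(p)
clus = cluster_sample(by_dept, 1)
```

Note the difference in what gets randomized: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.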
DATA
CASE=ROW // VARIABLE NAMES=HEADERS // VARIABLE=COLUMN // RELATIONAL DB=links cases
across tables through an id# (like merging by ping_id in Power Query)
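A rough sketch of linking cases across two tables through a shared id, in plain Python. The tables, their contents, and the `inner_merge` helper are hypothetical; only the `ping_id` key name comes from the note above. It behaves like an inner merge: rows without a match on the key are dropped.

```python
# Two hypothetical tables: each dict is a case (row), each key a variable (column).
customers = [
    {"ping_id": 1, "name": "Ana"},
    {"ping_id": 2, "name": "Ben"},
]
orders = [
    {"ping_id": 1, "total": 20.0},
    {"ping_id": 1, "total": 5.5},
    {"ping_id": 3, "total": 9.0},  # no matching customer, so it is dropped
]

def inner_merge(left, right, key):
    """Attach the matching left row to every right row that shares the key."""
    # Assumes the key is unique in the left table (an identifier variable).
    index = {row[key]: row for row in left}
    return [{**index[row[key]], **row} for row in right if row[key] in index]

merged = inner_merge(customers, orders, "ping_id")
```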
Primary Data = Data we collect ourselves EX: from a survey
Secondary Data = Data collected by another party EX: Statistics Canada
VARIABLES
Qualitative Var / Categorical Var = names categories; answers questions about how cases fall into those categories
Quantitative Var = Measures numerical values with units
Some variables can be both categorical and quantitative
How data are classified depends on Why we are collecting the data
Identifier Variable = unique identifier linked to an individual or item in a group (ID# or PRODUCT#)
Nominal Variable = Qualitative vars used only to name categories
Ordinal Variable = When data values can be ordered EX: employees can be ranked according to the
number of days worked in the company.
Time Series = Variables that are measured at regular intervals over time EX: the share price of the Royal
Bank of Canada at the end of each day for the past year.
Cross-Sectional Data – Multiple variables measured at the same point in time. EX: Tracking sales
revenue, customer count, and expenses for each Starbucks location in the past month.
UNITS
The units of a quantitative variable indicate how each value was measured, the corresponding scale of
measurement, how much of something we have, and how far apart two values are
Other quantitative variables have no units, such as: Number of visits to a web site, or the number of
shares of a company traded on the Toronto Stock Exchange
The same variable can be treated as qualitative or quantitative, depending on the purpose of data collection.
For example:
- Age as a quantitative variable – Measured in years to calculate the average age of customers.
- Age as a categorical variable – Grouped (Child, Teen, Adult, Senior) to tailor music offers
(folk, jazz, hip-hop, reggae).
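As an illustration of the age example, the same list of ages can be averaged (quantitative use) or grouped and counted (categorical use). The ages and the group cut-offs below are invented for this sketch; the notes do not define where Child, Teen, Adult, and Senior begin.

```python
from collections import Counter
from statistics import mean

# Hypothetical customer ages, in years.
ages = [8, 15, 34, 70, 41, 16]

def age_group(age):
    """Map an age in years to a category (cut-offs are assumptions)."""
    if age < 13:
        return "Child"
    if age < 20:
        return "Teen"
    if age < 65:
        return "Adult"
    return "Senior"

# Quantitative use: the values have units (years), so an average makes sense.
average_age = mean(ages)

# Categorical use: only group membership matters, so we summarize with counts.
group_counts = Counter(age_group(a) for a in ages)
```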
Counts = used 1. To summarize a qualitative var 2. To measure the amount of something
Quantitative variables differ based on whether zero has a defined value:
Interval Scale = No true zero (e.g., 80°F isn’t twice as hot as 40°F).
Ratio Scale = Has a true zero, so ratios are meaningful (e.g., 80 kg is twice as heavy as 40 kg).
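A quick way to see why ratios fail on an interval scale: convert the Fahrenheit values to Kelvin, which does have a true zero, and compare the ratios. The function name is made up for this sketch; the conversion formula is the standard one.

```python
def fahrenheit_to_kelvin(f):
    """Convert Fahrenheit (interval scale) to Kelvin (ratio scale)."""
    return (f - 32) * 5 / 9 + 273.15

# Dividing the raw Fahrenheit values suggests "twice as hot"...
naive_ratio = 80 / 40

# ...but on a scale with a true zero the two temperatures are much closer.
true_ratio = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)

print(f"Fahrenheit ratio: {naive_ratio:.2f}")
print(f"Kelvin ratio:     {true_ratio:.2f}")
```

Differences still make sense on an interval scale (80°F is 40 degrees warmer than 40°F); only the ratio is misleading.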
Statistics Process = 0. Determine analysis objectives/goals 1. Data Collection 2. Data Organizing 3. Data
Analysis 4. Results Interpretation 5. Results Presentation
Data Collection Process = 1. Review the analysis 2. Data Identification (what question type) 3. Data
Sources (primary or secondary) 4. Tools 5. Budget + Time
Population = whole group
Sample = selected group to study
Subgroup = subsection of a group
Sample Unit = A member of the population from which measurements are taken during sampling
Creating Data Structure and Data Input Template = 1. Developing surveys / forms 2. Determining
sample size 3. Collecting a pilot sample 4. Finalizing surveys / forms / tools 5. Distributing surveys / forms
/ tools 6. Receiving data, inputting it, and starting to organize
Survey Development = 1. Intro (purpose, who is conducting it, confidentiality assurance, etc.) 2. Question
section (multiple choice, open-ended questions, closed-ended questions) 3. Ending section (thank you /
incentives?)
Sampling Errors = Sample does not represent population (wrong respondent)
Non-Sampling Errors = Right respondent, but wrong question / time / organization