Introduction to statistics
Abdiweli Hassan
Master of Research and statistics
Lecturer, Vision International College

WHAT IS STATISTICS?
Data Science
• Collection
• Summarization
• Presentation
• Interpretation
Use
• Testing
• Estimating
• Predicting
Associations
Variation
Account
• Random errors
• Systematic errors

BRANCHES
1 Descriptive statistics
2 Inferential statistics

ROLES IN RESEARCH
1 Choice of appropriate design
2 Conduct of research
3 Analysis of results

BASIC CONCEPTS
1 Scales of measurement
2 Populations vs samples
3 Variables
4 Statistic vs parameter

SCALES OF MEASUREMENT
1 Nominal e.g. ethnicity, nationality, gender
2 Ordinal e.g. disease severity (mild, moderate, severe)
3 Interval e.g. temperature in Celsius
4 Ratio e.g. weight, height

POPULATIONS AND SAMPLES
1 Population
- Entire collection of objects
- Subset of a collection without the intent to generalize to the whole
2 Sample
- Subset of a whole with the intention to generalize the results to the whole
[Diagram: POPULATION and SAMPLE]

VARIABLES
1 Types of variables
- Numerical
- Categorical

UNDERSTANDING DATA
1 Components of data
- Observations
- Variables
2 Types of data
- Numerical
  - Continuous
  - Discrete
- Categorical
  - Nominal
  - Ordinal
4 Analysis
- Descriptive
- Predictive
- Prescriptive
5 Interpretation
- Conclusions
- Decision-making

SUMMARIZING DATA
1 Measures of central tendency
- Mean
- Median
- Mode
2 Measures of spread
- Quantiles
- Percentiles
- Standard deviation
- Interquartile range
- Range
- Variance
- Coefficient of variation

DATA PRESENTATION
1 Tables
- Frequency
- Frequency distributions
- Cumulative frequency distributions
- Cross tabulations
2 Graphs
- Histograms
- Frequency polygon
- Box and whisker plot
- Scatter plot
- Stem-leaf plot
- Pie chart
- Bar chart
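
The measures listed under SUMMARIZING DATA can be computed with Python's standard `statistics` module. The following is a minimal sketch, not part of the original slides: the function name `summarize` and the sample weights are hypothetical, and the sample (n - 1) formulas are assumed for variance and standard deviation.

```python
import statistics

def summarize(values):
    """Return common measures of central tendency and spread for a list of numbers."""
    # Quartiles using the module's default ('exclusive') method; other
    # software may use slightly different quartile conventions.
    q1, _, q3 = statistics.quantiles(values, n=4)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance
        "sd": sd,
        "iqr": q3 - q1,                 # interquartile range
        "cv_percent": 100 * sd / mean,  # coefficient of variation, in percent
    }

# Hypothetical weights in kg, made up for illustration only
weights = [52, 55, 55, 58, 60, 61, 63, 70]
for name, value in summarize(weights).items():
    print(f"{name}: {value:.2f}")
```

The coefficient of variation is only meaningful for ratio-scale variables such as weight or height, where zero is a true zero.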

WHICH PLOT TO USE WHEN
• Categorical (C) versus categorical (C): bar charts
• Quantitative (Q) versus quantitative (Q): scatterplots
• Quantitative (Q) versus categorical (C): boxplots
• Categorical (C) versus quantitative (Q): boxplots

CATEGORICAL DATA
Example: The housing variable with three categories (for free, own and rent).

NUMERICAL DATA
Example: Comparing two groups of children: one reporting respiratory symptoms and the other not.

THANKS

Questions
1. What is the difference between data and information?
- Data = facts and figures
- Information = processed data which is meaningful
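
The rules under WHICH PLOT TO USE WHEN, together with the categorical and numerical examples, can be sketched in Python. This is an illustration under stated assumptions, not part of the original slides: the function names, the housing responses and the two groups' measurements are all hypothetical. It computes the numbers behind each plot (category counts for a bar chart, a five-number summary for a boxplot) rather than drawing them.

```python
from collections import Counter
import statistics

def choose_plot(x_type, y_type):
    """Map a pair of variable types ('C' or 'Q') to the suggested plot."""
    if x_type == "C" and y_type == "C":
        return "bar chart"
    if x_type == "Q" and y_type == "Q":
        return "scatterplot"
    return "boxplot"  # one quantitative and one categorical variable

def frequency_table(categories):
    """Return (category, count, percent) rows: the numbers behind a bar chart."""
    counts = Counter(categories)
    total = sum(counts.values())
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]

def five_number_summary(values):
    """Return min, Q1, median, Q3, max: the values a boxplot displays."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

# Hypothetical housing responses (categorical data)
housing = ["own", "rent", "own", "for free", "own", "rent"]
for cat, n, pct in frequency_table(housing):
    print(f"{cat}: {n} ({pct:.1f}%)")

# Hypothetical measurements for two groups of children (numerical data)
groups = {
    "symptoms": [1.8, 2.0, 2.1, 2.3, 2.6],
    "no symptoms": [2.2, 2.4, 2.5, 2.7, 3.0],
}
print(choose_plot("Q", "C"))
for label, values in groups.items():
    print(label, five_number_summary(values))
```

Comparing the two five-number summaries side by side is what a pair of boxplots shows visually.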