Course 02402 Introduction to Statistics
Lecture 1: Introduction to Statistics

Per Bruun Brockhoff
DTU Informatics
Building 305 - room 110
Danish Technical University
2800 Lyngby – Denmark
e-mail: pbb@imm.dtu.dk
Spring 2013

Agenda

1 Practical Information
2 Introduction to Statistics
3 Descriptive Statistics: Summary Statistics
4 Software: R

Practical Information

Teaching module: Tuesdays 13.00-17.00

Homepage: 02402.imm.dtu.dk
- Note about software R
- Syllabus, lecture plan
- Exercises & solutions
- Slides
- Podcasts of lectures (in English AND Danish)
- Quizzes

Campusnet: www.campusnet.dtu.dk
- Messages and (certain) file sharing

Generic weekly agenda:
- BEFORE teaching module: Read the announced material
- 2-hour lecture (curriculum of the week)
- 2 hours of exercises (mix of: book, Rnote, online quiz questions)
- AFTER teaching module: Test yourself with the online exam quiz

Exam: 4-hour multiple choice

Introduction to Statistics

Statistics and Engineers
- How to treat (or analyse) data?
- What is random variation?
- Statistics is a tool for making decisions:
  - How many computers did we sell last year?
  - What is the expected price of a share?
  - Is machine A more effective than machine B?

Statistics can be used
- Statistics can be used in most disciplines and is therefore a very important tool
- Statistics is an important tool in problem solving:
  - Data analysis
  - Quality improvement
  - Design of experiments
  - Predictions of future values
  - ... and much more!

Statistics

Modern statistics
- Modern statistics is based on probability theory and descriptive statistics.
- Statistics is often about analyzing a sample that is taken from a population.
- Based on the sample, we try to generalize to (or comment on) the population.
- Therefore it is important that the sample is representative of the population.

Descriptive Statistics: Summary Statistics

Chapter 2: Summary statistics

We use a number of summary statistics to summarize and describe data (stochastic variables):
- Mean x̄
- Median
- Variance s²
- Standard deviation s
- Percentiles

Mean

The mean value is a key number that indicates the centre of gravity or centering of the data.

The mean:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

We say that x̄ is an estimate of the mean value.

Median

The median is also a key number, indicating the center of the data. In some cases, for example when the data contain extreme values, the median is preferable to the mean.

Median: The observation in the middle (in sorted order).

Variance and standard deviation

The variance (or the standard deviation) indicates the spread of the data.

Variance:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Standard deviation:

$$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

The coefficient of variation

The standard deviation and the variance are key numbers for absolute variation. If it is of interest to compare variation between different data sets, it might be a good idea to use a relative key number, the coefficient of variation:

$$V = \frac{s}{\bar{x}} \cdot 100$$

Percentiles

The median is the point that divides the data into two halves. It is of course possible to find other points that divide the data into other parts; they are called percentiles.

Often calculated percentiles are:
- 0, 25, 50, 75, 100 % percentiles (quartiles)
- 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 % percentiles

Note: the 50% percentile is the median.

Figures/Tables

Quantitative data:
- Scatter plot (xy plot)
- Histogram
- Cumulative distribution
- Boxplots

Count data:
- Bar charts (pareto diagram)
- Pie charts

Software: R

- Appendix C in the textbook (7th and 8th edition): Description of R.
- R Commander: a graphical user interface.
- R-exercise today.
- You can run R from the G-bar at home via Thinlinc.
- R can (easily) be installed on your own computer. (See Rnote)

Next week:
- Discrete distributions - chapter 4.
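Recap sketch of the summary statistics from this lecture

The summary statistics defined in this lecture (mean, median, variance, standard deviation, coefficient of variation, and percentiles) can be sketched in a few lines of code. The course itself uses R. The sketch below uses Python and its standard `statistics` module purely as an assumption for illustration, and the sample values are made up. Percentile conventions differ between software packages, so the "inclusive" method chosen here is one possible definition and may not match the textbook's.

```python
import math
import statistics

# Hypothetical sample (made-up values, not taken from the course material).
x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]
n = len(x)

# Mean: x_bar = (1/n) * sum of x_i
x_bar = sum(x) / n

# Median: the observation in the middle of the sorted data.
# For an even number of observations, a common convention (assumed here)
# is the average of the two middle observations.
xs = sorted(x)
mid = n // 2
median = xs[mid] if n % 2 == 1 else (xs[mid - 1] + xs[mid]) / 2

# Sample variance with the n - 1 denominator, and its square root.
s2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s = math.sqrt(s2)

# Coefficient of variation: V = s / x_bar * 100 (a relative measure, in percent).
cv = s / x_bar * 100

# 25, 50 and 75 % percentiles (quartiles). The "inclusive" method is an
# assumption; other definitions can give slightly different values.
q1, q2, q3 = statistics.quantiles(x, n=4, method="inclusive")

print("mean:", x_bar)
print("median:", median)
print("variance:", s2)
print("standard deviation:", round(s, 4))
print("coefficient of variation (%):", round(cv, 2))
print("quartiles:", q1, q2, q3)
```

For this made-up sample the mean is 6, the median is 5, and the variance is 10. The 50% percentile equals the median, as noted on the percentiles slide.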