Statistics & Probability
Class Lecture Notes
Aqeel Rafique

Statistics: Statistics is the science of the collection, presentation, analysis and interpretation of numerical data, as well as of drawing valid conclusions and making reasonable decisions on the basis of such analysis.

Branches of statistics: There are two main branches of statistics.

Descriptive / deductive statistics: This branch of statistics is concerned with the description (collection, presentation, analysis and interpretation) of numerical data.

Inferential / inductive statistics: This branch of statistics is concerned with drawing conclusions about the population on the basis of sample information.

Population: The whole aggregate of material about which we need to get some information is called the population.

Sample (model): A representative part of the population, collected to get the required information about the population, is called a sample.

Parameter: Any numerical value or characteristic calculated from population data is called a parameter.

Statistic: Any numerical value or characteristic calculated from sample data is called a statistic.

Inference: Obtaining a result for the population after testing a sample is called inference.

Data: Raw facts and figures, or the collection of information about any problem under study, are called data.

Types of data: There are two types of statistical data.

Primary data (initial): Information that is collected initially from its source and has not gone through any sort of statistical treatment. It is also called raw data, ungrouped data or first-hand information.

Secondary data: Data that have already been collected and compiled, and have gone through some sort of statistical treatment at least once. It is also called grouped data or second-hand information.

Presentation of data: The process of summarizing (in tabular form) and arranging raw data into a meaningful form is called presentation of data.
The presentation of data involves the following steps.

Classification: The arrangement of raw data into different classes, groups or categories according to some common characteristic present in the data is called classification. If the data are classified according to a single common characteristic, it is called one-way classification; if according to two common characteristics, two-way classification; and if more than two criteria of classification are used, multi-way classification.

Tabulation: The process of arranging the data into vertical columns and horizontal rows in a systematic manner is called tabulation. A statistical table has at least four parts.
i. Title
ii. Box head / column caption
iii. Stub / row caption
iv. Body of the table

Frequency distribution: The classification of the data into tabular form, along with the number of objects in each class, is called a frequency distribution.

Variable: A variable is a characteristic that takes different values, or changes from individual to individual.

Qualitative variable: A qualitative variable is a variable whose value can never be measured numerically, but can be expressed as categories or classes. e.g.: religion, gender etc.

Quantitative variable: A variable that can assume numerical values or measurements is called a quantitative variable. There are two types of quantitative variable.

Discrete variable: A variable that can take only isolated or countable values. e.g.: no. of students in a class, no. of families in a house etc.

Continuous variable: A variable which can assume any possible value within an interval, or can take measurable values, is called a continuous variable. e.g.: temperature, height, weight etc.

Interval: The width or size of each class is called the interval, denoted by h or I. It is obtained by taking the difference between any two consecutive lower limits or any two consecutive upper limits.

Class boundaries: These are the actual class limits, in which the upper limit of one class and the lower limit of the next class are the same.
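As a small illustration of the interval and class-boundary definitions, the sketch below takes class limits such as 30 - 39, 40 - 49, 50 - 59 and computes the width h and the class boundaries. It assumes the usual convention of moving each limit by half the gap between the upper limit of one class and the lower limit of the next; the function names class_width and class_boundaries are hypothetical, chosen only for this example.

```python
def class_width(limits):
    """Width h: difference between two consecutive lower limits."""
    return limits[1][0] - limits[0][0]


def class_boundaries(limits):
    """Turn class limits into class boundaries.

    Assumption: each limit is shifted by half the gap between the
    upper limit of one class and the lower limit of the next class.
    """
    gap = limits[1][0] - limits[0][1]
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in limits]


# Class limits written as (lower limit, upper limit)
limits = [(30, 39), (40, 49), (50, 59)]

h = class_width(limits)
bounds = class_boundaries(limits)

print(h)       # 10
print(bounds)  # [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5)]
```

Note that the upper boundary of each class equals the lower boundary of the next, which is exactly the property stated in the definition of class boundaries.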
Class marks: Class marks are the representative values of the classes; each lies in the middle of its class. Denoted by X.

Cumulative frequency: It is the sum of the frequencies of all classes preceding a particular class, including that class's own frequency. These frequencies are used to represent the arrangement of the observations in the distribution. With them we can easily trace the position of a particular observation in the distribution.

Graphical representation of data: Another way to present raw data in a meaningful form is by diagrams or graphs. It is a visual or pictorial representation of the data. There are many reasons for drawing graphs, the most compelling being that one simple graph can say more than twenty pages of prose. Many graphs just represent a summary of the data that has been collected to support a particular theory. It is usually suggested that the graphic representation of the data should be looked at before proceeding to formal statistical analysis. There are two ways to present the data in visual or pictorial form.
i. Diagrams or charts
ii. Graphs

Class interval   Frequency (f)   Class boundaries   Class marks (X)   Cumulative frequency (c f)
30 - 39           2              29.5 - 39.5        34.5                2
40 - 49           3              39.5 - 49.5        44.5                5
50 - 59          11              49.5 - 59.5        54.5               16
60 - 69          20              59.5 - 69.5        64.5               36
70 - 79          32              69.5 - 79.5        74.5               68
80 - 89          25              79.5 - 89.5        84.5               93
90 - 99           7              89.5 - 99.5        94.5              100

Q. What is the basic difference between a diagram and a graph?
Q. Discuss in detail the different types of diagrams and graphs, with a single example of each.
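The class marks and cumulative frequencies in the frequency table above can be checked with a few lines of Python. This is only a sketch: it takes the class mark to be the midpoint of the two class limits, as the definition suggests, and the variable names are chosen for this example.

```python
from itertools import accumulate

# Class limits and frequencies copied from the table
classes = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
freqs = [2, 3, 11, 20, 32, 25, 7]

# Class mark X: the middle value of each class
marks = [(lo + hi) / 2 for lo, hi in classes]

# Cumulative frequency: running total of the frequencies
cum_freqs = list(accumulate(freqs))

for (lo, hi), f, x, cf in zip(classes, freqs, marks, cum_freqs):
    print(f"{lo} - {hi}  f = {f:2d}  X = {x}  c f = {cf}")
```

The last cumulative frequency equals the total number of observations (100 here), which is a quick check that no class has been missed.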