Business Statistics 1

L1

Data and data set:
- Data are the facts and figures collected, summarized, analyzed, and interpreted.
- The data collected in a particular study are referred to as the data set.

Elements, variables, and observations:
- The elements are the entities on which data are collected. Examples: individuals, firms, countries.
- A variable is a characteristic of interest for the elements. Examples: shoe size, GDP, number of employees.
- An observation is the set of measurements obtained for a particular element.

Scales of measurement:

Nominal scale
- No natural ranking of categories
- E.g. genders, names, colors
- The variable values can only be described in words, not numbers
- Mathematical operations used: = and ≠ (to see whether the categories are the same or not)

Ordinal scale
- Natural ranking of categories
- E.g. grades, education level, the Likert scale
- Mathematical operations used: =, ≠, < and >

Interval scale
- Always numeric
- Natural ranking
- Variables have fixed measurement units
- Arbitrary zero point
- Mathematical operations used: =, ≠, <, >, + and −

Ratio scale
- Always numeric
- Natural ranking
- E.g. length, weight, age
- Variables have fixed measurement units
- Mathematical operations used: =, ≠, <, >, +, −, × and ÷

L2

Sample space (S): the collection of all possible outcomes.
Examples:
- Toss a coin: S = {head, tail}
- Roll a die: S = {1, 2, 3, 4, 5, 6}
- Play a football game: S = {win, draw, lose}

Sample point: an experimental outcome is called a sample point or an element.
Properties:
- One outcome in the sample space must occur.
- Two outcomes cannot occur at the same time.
Example: "Roll a die" experiment. When you roll a die, it is only possible to get one outcome.
Event: a set consisting of a specific collection of sample points.

Complement of an event (Ē): all sample points in S that are not in E.
Example:
- S = All companies in Sweden
- E = All manufacturing companies in Sweden
- Ē = All non-manufacturing companies in Sweden

Intersection: the intersection between two events, A and B:
- Outcomes in both events A and B
- "AND" statement
- Denoted A ∩ B
Example:
- S = All students in Sweden
- A = All female students in Sweden
- B = All students in Sweden owning a car
- A ∩ B = All female students in Sweden owning a car

Union: the union between two events, A and B:
- Outcomes in either event A or B or both
- "OR" statement
- Denoted A ∪ B
Example:
- S = All cars produced by Volvo in 2020
- A = All cars in S with a defective gearbox
- B = All cars in S with defective brakes
- A ∪ B = All cars in S with a defective gearbox or defective brakes or both

Mutually exclusive events: two events, A and B, are mutually exclusive if:
- The two events cannot occur simultaneously
- I.e. A ∩ B does not contain any sample points (written A ∩ B = ∅)
Example:
- S = All animals in Sweden
- A = All cats in Sweden
- B = All dogs in Sweden
- A ∩ B = All cats that are dogs in Sweden = ∅
- Cats cannot also be dogs. Thus, A and B are mutually exclusive events.
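The event operations above map directly onto Python's built-in set type. The following is a minimal sketch, assuming the "roll a die" sample space from the notes; the events A, B and C are illustrative choices and do not come from the lecture.

```python
# Sample space for one roll of a die (from the notes).
S = {1, 2, 3, 4, 5, 6}

# Illustrative events (assumptions, not from the lecture).
A = {2, 4, 6}  # "roll an even number"
B = {4, 5, 6}  # "roll more than 3"
C = {1, 3, 5}  # "roll an odd number"

# Complement of A: every sample point in S that is not in A.
complement_A = S - A

# Intersection ("AND"): outcomes in both A and B.
A_and_B = A & B

# Union ("OR"): outcomes in A or B or both.
A_or_B = A | B

# Mutually exclusive: the intersection contains no sample points.
A_C_exclusive = A.isdisjoint(C)
A_B_exclusive = A.isdisjoint(B)

print(complement_A)   # {1, 3, 5}
print(A_and_B)        # {4, 6}
print(A_or_B)         # {2, 4, 5, 6}
print(A_C_exclusive)  # True
print(A_B_exclusive)  # False
```

Here A and C are mutually exclusive because an outcome cannot be both even and odd, while A and B share the outcomes 4 and 6.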
Probability of an event:
If S has a finite number of sample points and all outcomes are equally likely, then the probability of an event A is:
P(A) = n(A) / n(S)
where n(A) is the number of sample points in A and n(S) is the number of sample points in S.
Example: Roll a die
- S = {1, 2, 3, 4, 5, 6}
- Roll a 4: A = {4}
- P(A) = n(A) / n(S) = 1/6

Probability of the complement of any event:
P(Ā) = 1 − P(A)
Example: Roll a die
- S = {1, 2, 3, 4, 5, 6}
- A = {4}
- P(Ā) = 1 − 1/6 = 5/6

The probability that event A or B or both occur is calculated as:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Example:
- A = the event that a student owns a bike
- B = the event that a student owns a car
- P(A ∪ B) = the probability that a student owns a bike, a car, or both

Conditional probability:
The probability of an event given that another event has occurred is called conditional probability.
Denoted P(A | B) and calculated as:
P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0

Bayes' theorem:
P(A | B) = P(B | A) P(A) / P(B)

Combinations:
- Combinations are arrangements without respect to order
- Number of ways to choose k objects out of n: n! / (k! (n − k)!)

Permutations:
- Permutations are arrangements with respect to order
- Number of ways to arrange k objects out of n: n! / (n − k)!

L3:

A random variable is a variable that associates a numerical value with each possible outcome of an experiment.
There are two types of random variables:
- Continuous random variables
- Discrete random variables, which can assume a countable number of values, finite or infinite

Discrete random variables
Example: Let X be the number of heads when tossing two coins. The possible outcomes are HH, HT, TH and TT, so X can take the values 0, 1 and 2.

Expected value of a discrete random variable:
E(X) = µ = Σ x p(x)
Example (two coins, assuming both are fair): p(0) = 0.25, p(1) = 0.5, p(2) = 0.25, so
E(X) = 0 * 0.25 + 1 * 0.5 + 2 * 0.25 = 1

If the probability distribution is known, it is also possible to calculate the population variance and standard deviation as:
σ^2 = V(X) = Σ (x − µ)^2 p(x)
and
σ = sqrt(σ^2)
Now that we have the p(x) values and µ, we can plug them into the formula.
Example (two coins, continued):
σ^2 = (0 − 1)^2 * 0.25 + (1 − 1)^2 * 0.5 + (2 − 1)^2 * 0.25 = 0.5
σ = sqrt(0.5) ≈ 0.71

The Bernoulli distribution:
- A Bernoulli random variable takes only the values 0 and 1.
- It models a yes/no situation with exactly one of two possible answers, e.g. winning or losing a football match.
- The distribution is used when the experiment has only two possible outcomes.
Let X be Bernoulli (Ber) distributed with parameter π, denoted X ~ Ber(π).
- Expected value: E(X) = π
- Variance: V(X) = π(1 − π)
Example (coin toss): Let X = 1 if the toss results in "head" and X = 0 if the toss results in "tail".
Then X ~ Ber(0.5)
- Expected value: E(X) = π = 0.5
- Variance: V(X) = π(1 − π) = 0.5 * (1 − 0.5) = 0.25

Binomial distribution:
If X is the number of successes in n independent Bernoulli trials, each with success probability π, then X ~ Bin(n, π).
- P(X = x) = [n! / (x! (n − x)!)] * π^x * (1 − π)^(n − x), for x = 0, 1, ..., n
- Expected value: E(X) = nπ
- Variance: V(X) = nπ(1 − π)

L4

Continuous variable:
- Definition: A continuous random variable can assume any value in a particular interval on the real line or in a collection of intervals.
- Characteristics: There are infinitely many possible values, so the probability that a continuous random variable assumes a single specified value is zero, i.e. P(X = x) = 0. Probabilities are therefore assigned to intervals. To get a higher/smaller probability you can increase/decrease the width of the interval.
- Formula: P(a ≤ X ≤ b) = area under the probability density function f(x) between a and b

The normal distribution:
- A descriptive model of many real-world situations
- X can take any real value
The probability density function of the normal distribution, where
- µ = the expected value of X = the mean
- σ^2 = the variance of X (σ = the standard deviation)
is:
f(x) = [1 / (σ * sqrt(2π))] * exp(−(x − µ)^2 / (2σ^2))

Standard normal distribution:
The standard normal variable Z is normally distributed with µ = 0 and σ = 1.
To find a probability, look up the corresponding area in the standard normal table on Canvas.
Two rules help when using the table:
- Symmetry rule. Ex. P(Z > 2.67) = P(Z < −2.67). Answer: 0.0038
- Complement rule. Ex. P(Z > 2.67) = 1 − P(Z < 2.67) = 1 − 0.9962. Answer: 0.0038

P(Z < z)
- Z = random variable
- z = number

Example: Find P(Z < 1.70)
- Answer: 0.9554

Example 2: Find P(−2.13 < Z < 0)
- Answer: P(−2.13 < Z < 0) = P(−∞ < Z < 0) − P(−∞ < Z < −2.13) = 0.5 − 0.0166 = 0.4834

Standardization theorem:
If X is normally distributed with expected value E(X) = µ and variance σ^2, then:
Z = (X − µ) / σ is standard normal, so P(X < x) = P(Z < (x − µ) / σ)

The inverse transformation:
X = µ + σZ

Exercise Ch4, Ex 19
Answer:
I: P(A) = 0.05 + 0.20 = 0.25
   P(B) = 0.20 + 0.25 = 0.45
   P(C) = 0.20 + 0.20 + 0.15 = 0.55
III: A ∩ B = ∅ because A and B have no sample points in common, so P(A ∩ B) = P(∅) = 0
IV: Is A ∩ C = ∅ or not? A ∩ C = {E2} ≠ ∅, so A and C are not mutually exclusive
V: The complement of B is S \ B = {E1, E2, E5, E6, E7}
   P(complement of B) = 0.05 + 0.20 + 0.15 + 0.10 + 0.05 = 0.55

Extra exercises lecture 2 with answers (∪ = union, ∩ = intersection):
I. P(regular ∪ over 40) = (160 + 545 + 745 + 200) / 2000 = 1650 / 2000 = 0.825
II. P(regular ∩ under 20) = 160 / 2000 = 0.08
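The standard normal table lookups from L4 can be cross-checked in code. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later). The values µ = 50, σ = 10 and x = 67 in the standardization part are hypothetical numbers chosen for illustration; they do not come from the lecture.

```python
from statistics import NormalDist

# Standard normal variable Z with mean 0 and standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

# P(Z < 1.70): a direct table lookup.
p_less = Z.cdf(1.70)

# P(Z > 2.67) via the complement rule and via the symmetry rule.
p_upper_complement = 1 - Z.cdf(2.67)
p_upper_symmetry = Z.cdf(-2.67)

# P(-2.13 < Z < 0) = P(Z < 0) - P(Z < -2.13).
p_between = Z.cdf(0) - Z.cdf(-2.13)

print(round(p_less, 4))              # 0.9554
print(round(p_upper_complement, 4))  # 0.0038
print(round(p_upper_symmetry, 4))    # 0.0038
print(round(p_between, 4))           # 0.4834

# Standardization with hypothetical values (assumptions, not from the notes).
mu, sigma = 50.0, 10.0
x = 67.0
z = (x - mu) / sigma        # standardize: z = (x - mu) / sigma
p_x = Z.cdf(z)              # P(X < 67) = P(Z < 1.7)

# Inverse transformation: recover x from z.
x_back = mu + sigma * z

print(round(p_x, 4))        # 0.9554
```

The complement rule and the symmetry rule give the same answer, which is a quick sanity check when reading values off the table.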