Bulacan State University
City of Malolos
College of Science

Categorical Data Analysis (MAS 401)
Contingency Tables
12 October 2021
Prepared by: Carla M. Clemente

Schedule:
October 19, 2021: Asynchronous session
October 26, 2021: Online session, 2:30 PM to 3:30 PM
October 15, 2021: Submission of Activity 1

Objectives:
● Understand the probability structure of contingency tables
● Describe the associations between categorical variables and conduct inferences

Contingency Table
Table 1. Cross Classification of Belief in Afterlife by Gender

Definition
Suppose there are two categorical variables, denoted by X and Y. Let I denote the number of categories of X and J the number of categories of Y. A rectangular table with I rows for the categories of X and J columns for the categories of Y has cells that display the IJ possible combinations of outcomes. A table of this form that displays counts of outcomes in the cells is called a contingency table.

Joint Distribution of X and Y
Let $\pi_{ij} = P(X = i, Y = j)$ denote the probability that (X, Y) falls in the cell in row i and column j. The probabilities $\{\pi_{ij}\}$ form the joint distribution of X and Y.

Marginal Distribution
The marginal distributions are the row and column totals of the joint probabilities. We denote these by $\{\pi_{i+}\}$ for the row variable and $\{\pi_{+j}\}$ for the column variable, where

$$\pi_{i+} = \sum_{j} \pi_{ij}, \qquad \pi_{+j} = \sum_{i} \pi_{ij}.$$

For 2x2 tables, the joint and marginal distributions satisfy $\pi_{1+} = \pi_{11} + \pi_{12}$ and $\pi_{+1} = \pi_{11} + \pi_{21}$.

Notation for Cell Counts
Cell counts are denoted by $\{n_{ij}\}$, with total sample size $n = \sum_{i,j} n_{ij}$. The sample proportions $p_{ij} = n_{ij}/n$ estimate the joint probabilities $\pi_{ij}$.

Conditional Distributions
A conditional distribution is the probability distribution of Y at a fixed level of X: $\pi_{j|i} = \pi_{ij}/\pi_{i+}$.

Independence
Two variables are said to be statistically independent if the population conditional distributions of Y are identical at each level of X.
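The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the counts are those reported for the belief-in-afterlife table in Agresti (2007, Table 2.1), which the slide text does not reproduce, so treat them as an assumption. All variable names are illustrative.

```python
# Cell counts n_ij. Assumption: values as reported in Agresti (2007),
# Table 2.1; they do not appear in the slide text itself.
counts = {
    "Female": {"Yes": 509, "No or undecided": 116},
    "Male": {"Yes": 398, "No or undecided": 104},
}
columns = list(next(iter(counts.values())))

# Total sample size n
n = sum(sum(row.values()) for row in counts.values())

# Sample joint distribution: p_ij = n_ij / n
joint = {x: {y: c / n for y, c in row.items()} for x, row in counts.items()}

# Sample marginal distributions: row totals p_i+ and column totals p_+j
row_marginal = {x: sum(row.values()) for x, row in joint.items()}
col_marginal = {y: sum(joint[x][y] for x in joint) for y in columns}

# Sample conditional distribution of Y at each level of X: n_ij / n_i+
conditional = {
    x: {y: c / sum(row.values()) for y, c in row.items()}
    for x, row in counts.items()
}

# Independence would mean these rows are identical in the population
for x in counts:
    print(x, {y: round(p, 3) for y, p in conditional[x].items()})

# Compare each joint proportion with the product of its marginals
for x in counts:
    for y in columns:
        product = row_marginal[x] * col_marginal[y]
        print(x, y, round(joint[x][y], 3), round(product, 3))
```

With these counts, about 81% of females and 79% of males answered yes, so the two sample conditional distributions are close but not identical. Whether that gap is compatible with independence in the population is a question for inference, which this sketch does not address.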
Statistical independence is the property that all joint probabilities equal the product of their marginal probabilities:

$$\pi_{ij} = \pi_{i+}\,\pi_{+j} \quad \text{for } i = 1, \dots, I \text{ and } j = 1, \dots, J.$$

COMPARING PROPORTIONS IN 2x2 TABLES
● Difference of Proportions
● Relative Risk

Difference of Proportions
Let $\pi_1$ and $\pi_2$ denote the probabilities of success in row 1 and row 2. The difference of proportions $\pi_1 - \pi_2$ compares the success probabilities in the two rows. This difference falls between −1 and +1. Let $p_1$ and $p_2$ denote the sample proportions of successes, based on $n_1$ and $n_2$ observations in the two rows. The sample difference $p_1 - p_2$ estimates $\pi_1 - \pi_2$. When the counts in the two rows are independent binomial samples, the estimated standard error of $p_1 - p_2$ is

$$SE = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}.$$

A large-sample $100(1 - \alpha)\%$ confidence interval for $\pi_1 - \pi_2$ is

$$(p_1 - p_2) \pm z_{\alpha/2}\,SE.$$

Example
Table 2. Cross Classification of Aspirin Use and Myocardial Infarction
This was a five-year randomized study testing whether regular intake of aspirin reduces mortality from cardiovascular disease. Every other day, the male physicians participating in the study took either one aspirin tablet or a placebo.

Relative Risk
For 2x2 tables, the relative risk is the ratio

$$\text{relative risk} = \frac{\pi_1}{\pi_2}.$$

Two groups with sample proportions $p_1$ and $p_2$ have a sample relative risk of $p_1/p_2$. The relative risk can be any nonnegative real number, and a value of 1.0 corresponds to independence.

Relative risk is the ratio of the probability of an event occurring with an exposure to the probability of the event occurring without the exposure. One must know the exposure status of all individuals (either exposed or not exposed) to calculate the relative risk. This implies that relative risk is appropriate only when the exposure status and the incidence of disease can be accurately determined, as in prospective cohort studies.
Tenny, S. & Hoffman, M. Relative risk. https://www.ncbi.nlm.nih.gov/books/NBK430824/.

References
Agresti, A. (2007).
An Introduction to Categorical Data Analysis (2nd ed.). New York: Wiley.
Analysis of Discrete Data. https://online.stat.psu.edu/stat504/
Bluman, A. G. (2014). Elementary Statistics: A Step by Step Approach (9th ed.). McGraw-Hill Education.
Gallistel, C. R. Bayes for Beginners: Probability and Likelihood. https://ruccs.rutgers.edu/images/personal-charles-r-gallistel/publications/2015-APS-Bayes-for-Beginners-1-Probability-and-Likelihood---Association-for-Psychological-Science.pdf
Tenny, S. & Hoffman, M. Relative risk. https://www.ncbi.nlm.nih.gov/books/NBK430824/
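Worked sketch for the aspirin example (Table 2). The difference of proportions, its standard error, the large-sample confidence interval, and the sample relative risk can be computed as follows. This is a minimal illustration under stated assumptions: the counts are those reported in Agresti (2007, Table 2.3) for the Physicians' Health Study and do not appear in the slide text, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Assumption: counts as reported in Agresti (2007), Table 2.3;
# the slide text gives only the table caption.
n1, mi1 = 11034, 189   # placebo group: size, myocardial infarctions
n2, mi2 = 11037, 104   # aspirin group: size, myocardial infarctions

# Sample proportions of myocardial infarction in each row
p1 = mi1 / n1
p2 = mi2 / n2

# Difference of proportions and its estimated standard error,
# treating the two rows as independent binomial samples
diff = p1 - p2
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Large-sample 95% confidence interval: (p1 - p2) +/- z_{alpha/2} * SE
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)
ci = (diff - z * se, diff + z * se)

# Sample relative risk p1 / p2
rr = p1 / p2

print(f"p1 = {p1:.4f}, p2 = {p2:.4f}")
print(f"difference = {diff:.4f}, SE = {se:.4f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"relative risk = {rr:.2f}")
```

With these counts the difference is about 0.008, the interval is roughly (0.005, 0.011) and excludes 0, and the sample relative risk is about 1.82. The difference looks small in absolute terms, while the ratio indicates that the sample proportion of myocardial infarction was about 82% higher in the placebo group. This is why both measures are worth reporting when proportions are close to 0.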