Statistics Lab Assignment: Hypothesis Testing Practice

Lab Assignment 1 Practice Problems

A particular brand of tires claims that its deluxe tire averages at least 50,000 miles before it needs to be replaced. From past studies of this tire, the standard deviation is known to be 8,000 miles. A survey of owners of that tire design is conducted. From the 28 tires surveyed, the average lifespan was 46,500 miles with a standard deviation of 9,800 miles. Do the data support the claim at the 5% level?

# Given data
sample_mean <- 46500      # average lifespan from the survey
population_mean <- 50000  # claimed minimum average lifespan
sample_size <- 28
population_sd <- 8000     # known standard deviation from past studies
# Null hypothesis: mu >= 50000 (the claim)
# Alternative hypothesis: mu < 50000 (lower-tailed test)
# The population standard deviation is known, so use a z-test
standard_error <- population_sd / sqrt(sample_size)
# Calculate the z-statistic
z_statistic <- (sample_mean - population_mean) / standard_error
# Critical z-value for a lower-tailed test at alpha = 0.05
critical_z <- qnorm(0.05)
# Print the z-statistic and critical z-value
cat("z-statistic:", z_statistic, "\n")
## z-statistic: -2.315032
cat("Critical z-value:", critical_z, "\n")
## Critical z-value: -1.644854
# Compare the statistic with the critical value and print the result
if (z_statistic < critical_z) {
  cat("Reject the null hypothesis. The data do not support the claim.\n")
} else {
  cat("Fail to reject the null hypothesis. The data are consistent with the claim.\n")
}
## Reject the null hypothesis. The data do not support the claim.

In large city A, 20 percent of a random sample of 900 school children had defective eyesight. In large city B, 15 percent of a random sample of 1600 school children had the same defect. Is the difference between the two proportions significant? Obtain 95% confidence limits for the difference in the population proportions.
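The significance part of this question can be approached with a pooled two-proportion z-test. The following is a minimal sketch under that assumption; the variable names (`p_pooled`, `se_pooled`, `z_stat`) are illustrative and not part of the assignment.

```r
# Sketch: pooled two-proportion z-test (assumed approach; names are illustrative)
n_A <- 900;  p_A <- 0.20   # City A: sample size and sample proportion
n_B <- 1600; p_B <- 0.15   # City B: sample size and sample proportion

# Pooled proportion under H0: p_A = p_B
p_pooled <- (n_A * p_A + n_B * p_B) / (n_A + n_B)

# Standard error of the difference under H0
se_pooled <- sqrt(p_pooled * (1 - p_pooled) * (1 / n_A + 1 / n_B))

# Test statistic and two-sided p-value
z_stat <- (p_A - p_B) / se_pooled
p_value <- 2 * pnorm(-abs(z_stat))

cat("z:", round(z_stat, 2), " significant at 5%:", p_value < 0.05, "\n")
```

With these inputs the statistic is about 3.21, beyond the two-sided 5% critical value of 1.96, so the difference would be judged significant under this test.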
# Given data for City A
sample_size_A <- 900
defective_A <- 0.20 * sample_size_A  # Number of children with defective eyesight in City A
# Given data for City B
sample_size_B <- 1600
defective_B <- 0.15 * sample_size_B  # Number of children with defective eyesight in City B
# Proportions
p_A <- defective_A / sample_size_A   # Proportion of defective eyesight in City A
p_B <- defective_B / sample_size_B   # Proportion of defective eyesight in City B
# Standard error of the difference in proportions
SE_diff <- sqrt((p_A * (1 - p_A)) / sample_size_A + (p_B * (1 - p_B)) / sample_size_B)
# Z-score for 95% confidence level
z <- qnorm(0.975)  # Two-tailed
# Calculate the difference in proportions
diff_proportions <- p_A - p_B
# Confidence interval for the difference in proportions
lower_limit <- diff_proportions - z * SE_diff
upper_limit <- diff_proportions + z * SE_diff
# Print the results
cat("Difference in proportions:", diff_proportions, "\n")
## Difference in proportions: 0.05
cat("95% Confidence Interval for the difference in proportions:", lower_limit, "to", upper_limit, "\n")
## 95% Confidence Interval for the difference in proportions: 0.01855096 to 0.08144904
# The interval does not contain 0, so the difference is significant at the 5% level

A cigarette manufacturing firm claims that its brand A cigarettes outsell its brand B by 8%. If it is found that 42 out of a sample of 200 smokers prefer brand A and 18 out of another random sample of 100 smokers prefer brand B, test whether the 8% difference is a valid claim.
# Given data
n_A <- 200  # Sample size for brand A
n_B <- 100  # Sample size for brand B
x_A <- 42   # Number of smokers preferring brand A
x_B <- 18   # Number of smokers preferring brand B
# Proportions
p_A <- x_A / n_A  # Proportion of smokers preferring brand A
p_B <- x_B / n_B  # Proportion of smokers preferring brand B
# Null hypothesis: The difference in proportions is 8% (p_A - p_B = 0.08)
# Alternative hypothesis: The difference in proportions is not 8% (p_A - p_B ≠ 0.08)
# Standard error of the difference in proportions
SE_diff <- sqrt((p_A * (1 - p_A)) / n_A + (p_B * (1 - p_B)) / n_B)
# Test statistic (z-score)
z <- ((p_A - p_B) - 0.08) / SE_diff
# p-value for two-tailed test
p_value <- 2 * pnorm(-abs(z))
# Print the test statistic and p-value
cat("Test Statistic (z):", z, "\n")
## Test Statistic (z): -1.041328
cat("p-value:", p_value, "\n")
## p-value: 0.2977235
# Test the hypothesis at a significance level of 0.05
if (p_value < 0.05) {
  cat("Reject the null hypothesis. There is evidence to suggest that the claim is not valid.\n")
} else {
  cat("Fail to reject the null hypothesis. There is not enough evidence to reject the claim.\n")
}
## Fail to reject the null hypothesis. There is not enough evidence to reject the claim.

The average number of sick days an employee takes per year is believed to be about 10. Members of a personnel department do not believe this figure. They randomly survey 8 employees. The numbers of sick days they took for the past year are as follows: 12; 4; 15; 3; 11; 8; 6; 8. Let X = the number of sick days they took for the past year. Should the personnel team believe that the average number is about 10?
# Given data
sick_days <- c(12, 4, 15, 3, 11, 8, 6, 8)  # Number of sick days taken by 8 employees
# Calculate the sample mean
sample_mean <- mean(sick_days)
# Null hypothesis: The average number of sick days is 10
# Alternative hypothesis: The average number of sick days is not 10
# Conduct a one-sample t-test
t_test_result <- t.test(sick_days, mu = 10)
# Print the test result
print(t_test_result)
##
##  One Sample t-test
##
## data:  sick_days
## t = -1.12, df = 7, p-value = 0.2996
## alternative hypothesis: true mean is not equal to 10
## 95 percent confidence interval:
##   4.94433 11.80567
## sample estimates:
## mean of x
##     8.375
# The p-value (0.2996) exceeds 0.05, so the data do not contradict an average of about 10

The mean lifetime of a sample of 400 fluorescent light bulbs produced by a company is found to be 1,570 hours with a standard deviation of 150 hours. Test the hypothesis that the mean lifetime of the bulbs is 1,600 hours against the alternative hypothesis that it is greater than 1,600 hours at the 1% and 5% levels of significance.

# Given data
sample_mean <- 1570      # Sample mean
sample_sd <- 150         # Sample standard deviation
sample_size <- 400       # Sample size
population_mean <- 1600  # Hypothesized population mean
# Null hypothesis: mu = 1600
# Alternative hypothesis: mu > 1600 (upper-tailed test)
# Only summary statistics are given, so compute the test statistic directly
# rather than simulating data (simulated data would not reproduce the given mean)
standard_error <- sample_sd / sqrt(sample_size)
t_statistic <- (sample_mean - population_mean) / standard_error
df <- sample_size - 1
# p-value for the upper-tailed test
p_value <- pt(t_statistic, df, lower.tail = FALSE)
cat("t-statistic:", t_statistic, "\n")
## t-statistic: -4
cat("Reject H0 at the 1% level:", p_value < 0.01, "\n")
## Reject H0 at the 1% level: FALSE
cat("Reject H0 at the 5% level:", p_value < 0.05, "\n")
## Reject H0 at the 5% level: FALSE
# The sample mean is below 1600, so there is no evidence at either level that the mean exceeds 1600

A certain stimulus administered to each of 13 patients resulted in the following increases in blood pressure: 5, 2, 8, -1, 3, 0, -2, 1, 5, 0, 4, 6, 8. Can it be concluded that the stimulus will, in general, be accompanied by an increase in blood pressure?

# Given data
increase <- c(5, 2, 8, -1, 3, 0, -2, 1, 5, 0, 4, 6, 8)  # Increase in blood pressure for each patient
# Null hypothesis: The mean increase in blood pressure is zero (mu = 0)
# Alternative hypothesis: The mean increase in blood pressure is greater than zero (mu > 0)
# Conduct a one-sample t-test
t_test_result <- t.test(increase, alternative = "greater")
# Print the test result
print(t_test_result)
##
##  One Sample t-test
##
## data:  increase
## t = 3.2613, df = 12, p-value = 0.003406
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
##  1.360534      Inf
## sample estimates:
## mean of x
##         3
# The p-value (0.003406) is below 0.05, so the stimulus is, in general, accompanied by an increase

The manufacturer of a certain make of electric bulbs claims that his bulbs have a mean life of 25 months with a standard deviation of 5 months. A random sample of 6 such bulbs gave the following values. Life of bulbs in months: 24, 20, 30, 20, 20, and 18.
Can you regard the producer's claim to be valid at the 1% level of significance?

# Given data
sample <- c(24, 20, 30, 20, 20, 18)  # Life of bulbs in months
sample_mean <- mean(sample)          # Sample mean
sample_sd <- sd(sample)              # Sample standard deviation
sample_size <- length(sample)        # Sample size
population_mean <- 25                # Claimed population mean
# Null hypothesis: The mean life of bulbs is 25 months (mu = 25)
# Alternative hypothesis: The mean life of bulbs is not 25 months (two-tailed test)
# Conduct a one-sample t-test
t_test_result <- t.test(sample, mu = population_mean, alternative = "two.sided", conf.level = 0.99)
# Print the test result
print(t_test_result)
##
##  One Sample t-test
##
## data:  sample
## t = -1.6771, df = 5, p-value = 0.1544
## alternative hypothesis: true mean is not equal to 25
## 99 percent confidence interval:
##  14.78708 29.21292
## sample estimates:
## mean of x
##        22
# The p-value (0.1544) exceeds 0.01, so the claim is not rejected at the 1% level

The lifetimes of electric bulbs for a random sample of 10 from a large consignment gave the following data: 4.2, 4.6, 3.9, 4.1, 5.2, 3.8, 3.9, 4.3, 4.4, 5.6 (in '000 hours).
Can we accept the hypothesis that the average lifetime of the bulbs is 4,000 hours?

# Given data
sample <- c(4.2, 4.6, 3.9, 4.1, 5.2, 3.8, 3.9, 4.3, 4.4, 5.6)  # Lifetime of bulbs in '000 hours
# Calculate sample statistics
sample_mean <- mean(sample)    # Sample mean
sample_sd <- sd(sample)        # Sample standard deviation
sample_size <- length(sample)  # Sample size
population_mean <- 4           # Hypothesized population mean (in '000 hours)
# Null hypothesis: The average lifetime of bulbs is 4,000 hours (mu = 4)
# Alternative hypothesis: The average lifetime of bulbs is not 4,000 hours (two-tailed test)
# Conduct a one-sample t-test
t_test_result <- t.test(sample, mu = population_mean, alternative = "two.sided")
# Print the test result
print(t_test_result)
##
##  One Sample t-test
##
## data:  sample
## t = 2.1483, df = 9, p-value = 0.0602
## alternative hypothesis: true mean is not equal to 4
## 95 percent confidence interval:
##  3.978809 4.821191
## sample estimates:
## mean of x
##       4.4
# The p-value (0.0602) exceeds 0.05, so the hypothesis is not rejected at the 5% level

# F test to compare the variances of two samples
a <- c(2, 2.7, 2.9, 1.9, 2.1, 2.6, 2.7, 2.9, 3.0, 2.6, 2.6, 2.7)
b <- c(3.2, 3.6, 3.7, 3.5, 2.9, 2.6, 2.5, 2.7)
u <- var.test(a, b)
u
##
##  F test to compare two variances
##
## data:  a and b
## F = 0.58045, num df = 11, denom df = 7, p-value = 0.4033
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.1232526 2.1817180
## sample estimates:
## ratio of variances
##          0.5804544

The following data come from a hypothetical survey of 920 people (men, women) that asks for their preference among three ice cream flavors (chocolate, vanilla, strawberry). Is there any association between gender and preference for ice cream flavor?
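A chi-square test of independence compares the observed counts with the counts expected if gender and flavor were unrelated. The sketch below works through that comparison by hand, using the counts that appear in the solution code that follows; the object names (`observed`, `expected`, `chi_sq`) are illustrative assumptions.

```r
# Sketch: expected counts under independence (names are illustrative)
observed <- rbind(men = c(100, 120, 20), women = c(350, 320, 150))
colnames(observed) <- c("chocolate", "vanilla", "strawberry")

# Expected count for each cell = row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic and its degrees of freedom
chi_sq <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

print(round(expected, 1))
cat("X-squared:", round(chi_sq, 3), "df:", df, "\n")
```

The hand-computed statistic should agree with the value reported by `chisq.test()` on the same matrix, since no continuity correction is applied to tables larger than 2 x 2.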
# Given data
men <- c(100, 120, 20)     # Men's preference for chocolate, vanilla, strawberry
women <- c(350, 320, 150)  # Women's preference for chocolate, vanilla, strawberry
# Combine the data into a matrix
ice_cream_data <- rbind(men, women)
# Perform chi-square test for independence
chi_square_test <- chisq.test(ice_cream_data)
# Print the test result
print(chi_square_test)
##
##  Pearson's Chi-squared test
##
## data:  ice_cream_data
## X-squared = 16.916, df = 2, p-value = 0.0002122
# The p-value (0.0002122) is below 0.05, so there is evidence of an association between gender and flavor preference

As part of a quality improvement project focused on the delivery of mail at a department office within a large company, data were gathered on the number of different addresses that had to be changed so that the mail could be redirected to the correct mail stop. The table shows the frequency distribution. Fit a binomial distribution and test the goodness of fit.

# Given data
x <- 0:4                            # Number of addresses changed
f_observed <- c(5, 20, 45, 20, 10)  # Observed frequencies
# Total number of observations
n <- sum(f_observed)
# Number of binomial trials per observation (largest value of x)
size <- max(x)
# Estimate the probability of success: sample mean divided by the number of trials
p_estimate <- sum(x * f_observed) / (n * size)
# Expected probabilities under the fitted binomial distribution (they sum to 1)
p_expected <- dbinom(x, size = size, prob = p_estimate)
# Perform goodness-of-fit test
goodness_of_fit_test <- chisq.test(f_observed, p = p_expected)
chi_square <- unname(goodness_of_fit_test$statistic)
# One parameter (p) was estimated from the data, so df = 5 - 1 - 1 = 3
df <- length(x) - 1 - 1
p_value <- pchisq(chi_square, df, lower.tail = FALSE)
# Print the results
cat("Estimated p:", p_estimate, "\n")
## Estimated p: 0.525
cat("X-squared:", round(chi_square, 2), "df:", df, "\n")
## X-squared: 4.67 df: 3
cat("Reject the binomial fit at the 5% level:", p_value < 0.05, "\n")
## Reject the binomial fit at the 5% level: FALSE
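To see where a fitted binomial model agrees or disagrees with the data, the observed and expected frequencies can be laid side by side. This is a minimal sketch, assuming the number of trials is 4 (the largest observed value) and that p is estimated as the sample mean divided by 4; the names `p_hat`, `fit_table`, and `contribution` are illustrative.

```r
# Sketch: observed vs. expected frequencies for a fitted binomial (names are illustrative)
x <- 0:4
f_observed <- c(5, 20, 45, 20, 10)
n <- sum(f_observed)
size <- max(x)  # assumed number of trials

# Moment estimate of p: sample mean divided by the number of trials
p_hat <- sum(x * f_observed) / (n * size)

# Expected frequencies under Binomial(size, p_hat)
f_expected <- n * dbinom(x, size = size, prob = p_hat)

# Per-cell contribution to the chi-square statistic
fit_table <- data.frame(
  x = x,
  observed = f_observed,
  expected = round(f_expected, 2),
  contribution = round((f_observed - f_expected)^2 / f_expected, 3)
)
print(fit_table)
```

In this table the largest contributions come from x = 2 and x = 3, where the observed counts are respectively above and below what the fitted model expects.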
