Statistical Distributions Practical

1. QQ Plots and Quantile Normalisation, Fitting a distribution to data

This exercise uses the data in the file mt.txt to explore QQ plots and quantile normalisation. The file contains two columns, giving case-control status and mitochondrial levels for a set of subjects.

(i) Generate and plot empirical densities and cumulative distribution functions for the cases and the controls, using the R functions density() and ecdf().

(ii) Test whether the distributions differ using the Kolmogorov-Smirnov test and the t-test, and compare the variances using the F-test.

(iii) Use the R function fitdistr() (from the MASS package) to try to find parametric distributions that model the distributions for cases and controls. Note: I don't know the answer to this. I have tried fitting the Gamma distribution without much success, but you should try others such as the log-normal and the extreme value distributions (using the R package gev).

2. The Central Limit Theorem

The CLT says that the distribution of the sample mean $\sum_{i=1}^{N} x_i / N$ converges to $N(\mu, \sigma^2/N)$, where $\mu$ and $\sigma^2$ are the mean and variance of the observations. Write an R program to demonstrate this. For $N = 1, 5, 10, 25, 50, 60, 75, 100$, the program should generate 1000 samples of size $N$ from a given distribution (try sampling from $N(10, 2)$, $\chi^2_k$, $Po(\lambda)$ and $T_k$, where $k = 1, 2, 10$ and $\lambda = 1, 10$).

There are two types of convergence to investigate:

(a) Convergence of the sample mean to the true mean $\mu$. This can be explored by computing the standard deviation of the sample means, and their quantiles.

(b) Convergence of the distribution of the sample means to $N(\mu, \sigma^2/N)$. This can be visualised by a series of QQ plots, comparing the distribution of the sample means to a Normal, using qqnorm().

What do you notice about the behaviour of data sampled from the $T_1$ distribution?
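As a starting point for the program described in Section 2, the following is a minimal sketch rather than a model answer. The helper names (samplers, clt_sim, means, sd_by_N) are illustrative and not part of the practical. The code reads $N(10, 2)$ as mean 10 and variance 2, which is an assumption, and the seed is arbitrary. Only a few of the suggested distributions are listed; the others can be added to the list in the same way.

```r
# Sketch of the CLT simulation; all helper names here are illustrative.
set.seed(1)  # arbitrary seed, used only to make the run reproducible

sizes  <- c(1, 5, 10, 25, 50, 60, 75, 100)  # sample sizes N from the practical
n_sets <- 1000                              # number of samples per N

# One sampler per distribution; each takes the number of draws required.
# N(10, 2) is read as mean 10, variance 2 (an assumption about the notation).
samplers <- list(
  normal = function(n) rnorm(n, mean = 10, sd = sqrt(2)),
  chisq2 = function(n) rchisq(n, df = 2),
  pois1  = function(n) rpois(n, lambda = 1),
  t10    = function(n) rt(n, df = 10)
)

# Return n_sets sample means, each computed from one sample of size N.
clt_sim <- function(sampler, N, n_sets = 1000) {
  draws <- matrix(sampler(N * n_sets), nrow = n_sets, ncol = N)
  rowMeans(draws)
}

# Choose one distribution and simulate the sample means for every N.
sampler <- samplers$chisq2
means <- lapply(sizes, function(N) clt_sim(sampler, N, n_sets))
names(means) <- paste0("N=", sizes)

# (a) Spread of the sample means: standard deviation and selected quantiles.
sd_by_N <- sapply(means, sd)
print(round(sd_by_N, 3))
print(round(sapply(means, quantile, probs = c(0.025, 0.5, 0.975)), 3))

# (b) One normal QQ plot per sample size, arranged in a 2 x 4 grid.
op <- par(mfrow = c(2, 4))
for (i in seq_along(sizes)) {
  qqnorm(means[[i]], main = names(means)[i])
  qqline(means[[i]])
}
par(op)
```

To examine another distribution, change the line that sets sampler (for example to samplers$pois1), or add a new entry such as function(n) rt(n, df = 1) to the list and rerun.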