Mathematics 241

Populations, parameters, samples, statistics

Populations and parameters

Remember that a population can be a fixed, finite collection of objects (a tangible population) or it can be an infinite, conceptual population.

A parameter is a numerical characteristic of the population, usually one that can be computed from a particular variable defined on the population. Parameters are usually (but not always) named by Greek letters. For example, if x is a variable defined on the population, the population mean µ_x and the population standard deviation σ_x are both parameters.

Samples and statistics

If x is a variable defined on some population and we choose a sample of size n, we denote the elements of the sample by x_1, ..., x_n.

A statistic is a numerical characteristic of the sample computed from x_1, ..., x_n. For example, the sample mean x̄ and the sample standard deviation s_x are statistics.

Random samples

In order to make a connection between the population and the sample, we usually assume that the sample is a simple random sample. What this means, ideally, is that each object in the population has an equal chance of being any given element of the sample.

With random sampling, each value of the sample can be considered to be the result of a random variable. Thinking of the sample this way, we have n random variables X_1, ..., X_n. In other words, X_i is the random experiment of choosing the ith value of the sample, and x_i is the actual value chosen in this sample.

Distribution of X_i

When we think of the sample in this way, we can see that each random variable X_i has the same distribution (it is just the same as the distribution of the variable x in the population).

We will also usually assume that the variables X_i are independent. This is not really true if the population is finite and we sample without replacement, but if the population is very large relative to n, it is practically true.
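
The remark about independence can be checked with a small exact calculation. The sketch below is only an illustration under assumed numbers: it supposes a finite population of size N in which k objects share some value a (here, hypothetically, half of them), and it compares the chance that the second draw equals a with the chance that it equals a given that the first draw did. The function name and the population sizes are made up for this example.

```python
from fractions import Fraction


def second_draw_probabilities(population_size, count_with_value):
    """Return (P(X_2 = a), P(X_2 = a | X_1 = a)) for draws without replacement.

    population_size  -- N, the number of objects in the finite population
    count_with_value -- k, how many of those objects have the value a
    """
    N, k = population_size, count_with_value
    # By symmetry, each draw on its own has the population distribution.
    unconditional = Fraction(k, N)
    # Once one object with value a is removed, k - 1 of the N - 1 remain.
    conditional = Fraction(k - 1, N - 1)
    return unconditional, conditional


# Hypothetical populations: one small, one very large relative to the sample.
for N in (10, 1_000_000):
    k = N // 2  # assumption for illustration: half the population has value a
    p, p_given = second_draw_probabilities(N, k)
    print(f"N = {N}: P(X_2 = a) = {float(p)}, P(X_2 = a | X_1 = a) = {float(p_given)}")
```

If X_1 and X_2 were exactly independent, the two probabilities would be equal. They differ visibly for the small population and only negligibly for the large one.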

Of course, we don’t know the distribution of the variable x in the population. (If we did, we wouldn’t be taking samples.) Sometimes, however, we make assumptions about the shape of that distribution. For example, we often assume that it is approximately normal.

Sampling distribution of a statistic

A statistic such as X̄ is itself a random variable. As such, every statistic has a distribution. In order to understand how to make inferences about the population parameter from a statistic, we have to know something about the sampling distribution of the statistic.

The big question: If we choose a simple random sample of size n from a population, how close is our statistic (e.g., x̄) likely to be to our parameter (e.g., µ)?
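
One way to get a feel for the big question is to simulate it. The sketch below is an illustration, not a method from these notes: it assumes a conceptual population in which x is normal with made-up parameters µ = 50 and σ = 10, repeatedly draws simple random samples of size n, and records x̄ for each one. The collection of recorded means approximates the sampling distribution of X̄. The function name, the seed, and the cutoff of 2 units are all hypothetical choices.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples independent samples of size n and return their means.

    The population is assumed (for illustration only) to be normal with
    mean mu and standard deviation sigma.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


MU, SIGMA = 50.0, 10.0  # hypothetical population parameters

for n in (5, 50):
    means = simulate_sample_means(MU, SIGMA, n, num_samples=2000)
    # How often does the statistic land within 2 units of the parameter?
    within_2 = sum(abs(m - MU) <= 2 for m in means) / len(means)
    print(
        f"n = {n}: spread of x-bar = {statistics.stdev(means):.2f}, "
        f"fraction within 2 of mu = {within_2:.2f}"
    )
```

Under these assumptions, the simulated means cluster around µ for both sample sizes, and they cluster more tightly for the larger n.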