Week 1 Vocabulary

Section 1.1

Data: consist of information coming from observations, counts, measurements, or responses.
Statistics: the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

Data Set:
Population: the collection of all outcomes, responses, measurements, or counts that are of interest. (This includes all potential responses, measurements, etc.)
  key words:
Sample: a subset of a population.
  key words: A study of, A survey of (and their respective quantitative or qualitative groups)

Parameter: a numerical description of a population characteristic.
Statistic: a numerical description of a sample characteristic.

Descriptive statistics: the branch of statistics that involves the organization, summarization, and display of data. (This includes finding percentages.)
Inferential statistics: the branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability. (Interpretation of what is found)

Section 1.2

Types of Data:
Qualitative data: consist of attributes, labels, or nonnumerical entries.
Quantitative data: consist of numerical measurements or counts.

Level of Measurement:
Nominal level: data at the nominal level of measurement are qualitative only. Data at this level are categorized using names, labels, or qualities. No mathematical computations can be made at this level.
Ordinal level: data at the ordinal level of measurement are qualitative or quantitative. Data at this level can be arranged in order, or ranked, but differences between data entries are not meaningful.
Interval level: data at the interval level of measurement can be ordered, and you can calculate meaningful differences between data entries. At the interval level, a zero entry simply represents a position on a scale; the entry is not an inherent zero.
(An inherent zero implies "none"; for example, 0°C is not an inherent zero.)
Ratio level: data at the ratio level of measurement are similar to data at the interval level, with the added property that a zero entry is an inherent zero. A ratio of two data values can be formed so that one data value can be meaningfully expressed as a multiple of another.

VERY IMPORTANT CHART and examples ON PAGE 14 OF TEXT
In the example on page 14, it is hard to tell the difference between the interval level and the ratio level unless you apply the zero test.

Type of data                 Level of      Put data in   Arrange data   Subtract      Determine if one data  Zero means
                             measurement   categories    in order       data values   value is a multiple    none
                                                                                      of another
Qualitative                  Nominal       Yes           No             No            No                     N/A
Qualitative or Quantitative  Ordinal       Yes           Yes            No            No                     N/A
Quantitative                 Interval      Yes           Yes            Yes           No                     No
Quantitative                 Ratio         Yes           Yes            Yes           Yes                    Yes

Characteristics:
  Nominal: no mathematical calculation can be made (just tally amounts of categories).
  Ordinal: the difference between data entries is not meaningful.
  Interval: you can calculate meaningful differences between data entries, and a zero entry simply represents a position on a scale.
  Ratio: a zero entry is an inherent zero, and one data value can be meaningfully expressed as a multiple of another.
(I added three of the columns in this chart.)

Section 1.3

Reason for knowing how to design a statistical study: Before you interpret the results of a study, you should determine whether the results are valid.

Designing a Statistical Study:
1. Identify the variable(s) of interest (the focus) and the population of the study.
2. Develop a detailed plan for collecting data. If you use a sample, make sure the sample is representative of the population.
3. Collect the data.
   a. Observational study: a researcher observes and measures characteristics of interest but does NOT change existing conditions.
   b. Experiment: a treatment is applied to part of a population (control group vs.
      experimental units)
      Elements of a well-designed experiment:
        Control: influential factors
          Confounding variable: occurs when an experimenter cannot tell the difference between the effects of different factors on a variable.
          Placebo effect: occurs when a subject reacts favorably to a placebo.
            (Blinding technique: the subject does not know whether he or she is receiving a treatment or a placebo.)
            (Double blind: neither the experimenter nor the subject knows.)
        Randomization: the process of randomly assigning subjects to different treatment groups.
          Completely randomized, blocks, randomized block design
        Replication: the repetition of an experiment using a large group of subjects.
   c. Simulation: uses a mathematical or physical model to reproduce the conditions.
   d. Survey: an investigation of one or more characteristics of a population.
4. Describe the data, using descriptive statistics techniques.

Sampling techniques
Census: a count of an entire population.
Sampling: a count or measure of part of the population.
Random sample: every member of the population has an equal chance of being selected.
Simple random sample: every possible sample of the same size has the same chance of being selected. (Assign a number to each member and use a random number table.)
Stratified sample: group the population by “strata,” or segments of the population, then randomly pick from each. This sample represents ALL the population. The key to stratified samples is that after the population has been divided into subgroups, a random sample is selected from EACH and polled. (The random sample from one subgroup does not have to be the same size as the one from another.)
Cluster sample: the population falls into naturally occurring subgroups with similar characteristics. All members in each selected cluster are queried.
  In a cluster sample, after the population is divided into subgroups, one or more of the subgroups are randomly selected, and ALL the elements of those subgroups are polled.
Systematic sample: each member is assigned a number and ordered in some way; a starting number is randomly selected, and then every 3rd, 5th, or 100th member is selected.
Convenience sample: not recommended; consists only of the available members of a population.
5. Interpret the data and make decisions about the population using inferential statistics.
6. Identify any possible errors.

Technology
Generating a random number
Excel: RAND() generates a random number between 0 and 1.
  To make a whole number out of it: =INT([Beginning number]+RAND()*[Ending number])
  Ex: =INT(1+RAND()*10) generates a random whole number between 1 and 10.
  (This form gives the stated range only when the beginning number is 1. For other ranges, use =INT([Beginning number]+RAND()*([Ending number]-[Beginning number]+1)).)
TI-83: randInt(_)
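
The sampling techniques and the random whole number formula in these notes can also be tried out in code. The following is only a minimal Python sketch, not something from the textbook: the function names (random_whole_number, simple_random_sample, stratified_sample, cluster_sample, systematic_sample) and the example data are made up for illustration, and it assumes the population is already stored as a list (or, for strata and clusters, as a dictionary of lists).

```python
import math
import random


def random_whole_number(low, high):
    """Whole number from low to high, same idea as INT(low + RAND()*(high-low+1))."""
    return math.floor(low + random.random() * (high - low + 1))


def simple_random_sample(population, n):
    """Every possible sample of size n has the same chance of being selected."""
    return random.sample(population, n)


def stratified_sample(strata, sizes):
    """Randomly pick from EACH stratum; sizes may differ between strata.

    strata: dict mapping stratum name -> list of members
    sizes:  dict mapping stratum name -> how many members to pick
    """
    return {name: random.sample(members, sizes[name])
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters):
    """Randomly pick whole clusters, then keep ALL members of those clusters."""
    chosen = random.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


def systematic_sample(population, k):
    """Random starting position, then every k-th member after it."""
    start = random.randrange(k)
    return population[start::k]


if __name__ == "__main__":
    # Hypothetical population: members numbered 1 through 40.
    members = list(range(1, 41))
    print(random_whole_number(1, 10))
    print(simple_random_sample(members, 5))
    print(stratified_sample({"first half": members[:20], "second half": members[20:]},
                            {"first half": 2, "second half": 3}))
    print(cluster_sample({"A": members[:10], "B": members[10:20],
                          "C": members[20:30], "D": members[30:]}, 1))
    print(systematic_sample(members, 5))
```

Because every function draws at random, the printed samples change from run to run. The stratified version returns a pick from every subgroup, while the cluster version returns every member of only the chosen subgroups, which is the distinction the notes emphasize.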