
ES 25 Quantitative Thinking: Homework 4

Due in Class on Tuesday, May 1st

Your Workbook will be COLLECTED. Neatness counts (particularly because it leads to correct answers). Go through your workbook and make sure it is complete and that your mathematical process is clear (can you understand it?). This will be a good way to review for the Midterm Exam on Thursday, May 3rd.

If you need to redo any problems, please feel free to revise problems on a blank sheet of paper and insert into the workbook (behind the original page).

1. Complete all workbook problems through page 54. (If you have attended all classes and finished past workbook assignments, the only new problems should be on pages 36, 46, and 50-54.) See the partial solution for pages 50-51, below.

o Page 36 solution:

The consumer ratings for Greenies-R-Us have just been released, as well as the standard deviations for each category.

z-score = (observation - mean)/(standard deviation)

The z-score is the number of standard deviations an observation is away from the mean (+ indicates above the mean, - indicates below the mean).

Category            Mean    Standard deviation    Greenies-R-Us score    Greenies-R-Us z-score
Reliability         6.2     0.8                   5.0                    -1.5
Eco-Friendliness    4.5     2.2                   6.5                     0.9
Ergonomics          5.4     1.5                   6.2                     0.5
Customer Service    7.1     0.5                   6.0                    -2.2

a) Compute Greenies’ standardized test score for each category. Show your work!

b) Based on the standard scores, on which test did Greenies score highest? Lowest?

**Lowest = customer service (note that reliability “appears” lower using the raw score).

**Highest = eco-friendliness (note that there was a large standard deviation in this category; this means that vehicles vary quite a bit in their “eco-friendliness,” while vehicles seem to vary only a little bit in their “customer service” rating).
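The z-scores in the table can be checked with a few lines of Python. This is only a sketch of the arithmetic: the numbers are the ones given in the problem, while the names `ratings` and `z_score` are made up for this illustration.

```python
# Mean, standard deviation, and Greenies-R-Us score for each category,
# copied from the problem statement.
ratings = {
    "Reliability":      (6.2, 0.8, 5.0),
    "Eco-Friendliness": (4.5, 2.2, 6.5),
    "Ergonomics":       (5.4, 1.5, 6.2),
    "Customer Service": (7.1, 0.5, 6.0),
}


def z_score(observation, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (observation - mean) / sd


z_scores = {
    category: z_score(score, mean, sd)
    for category, (mean, sd, score) in ratings.items()
}

for category, z in z_scores.items():
    print(f"{category:18s} z = {z:+.1f}")

# Compare categories by z-score, not by raw score.
highest = max(z_scores, key=z_scores.get)
lowest = min(z_scores, key=z_scores.get)
print("Highest:", highest)
print("Lowest:", lowest)
```

Ranking by z-score rather than by raw score is what makes customer service (not reliability) the lowest category.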

o Page 46 solution:

Based on the 1990 census, the number of hours per day adults spend watching television varies with a mean of 5 hours and a standard deviation of 2 hours. Based on these results alone, can you conclude that about 95% of the adult population spends between 1 and 9 hours per day watching television?

IF the number of “hours per day adults watch TV” is normally distributed (with the mean and standard deviation given), then we can conclude that about 95% of the population watches between 1 and 9 hours per day.

Be sure to draw a good picture that shows the 68/95/99.7 rule.

z(x = 9) = (9 - 5)/2 = 2 (so “9 hours” is two standard deviations above the mean)

z(x = 1) = (1 - 5)/2 = -2 (so “1 hour” is two standard deviations below the mean)

Since about 95% of the data (in this case, each observation is a person’s reported daily hours watching TV) is contained within 2 standard deviations of the mean, we can conclude that about 95% of the population watches between 1 and 9 hours of television per day.

o Second problem, page 46:

The NAS and EPA also have concluded that 32 parts per billion of mercury in the blood of pregnant women corresponds to approximately a doubling in the risk of abnormal performance on a range of neurodevelopmental tests. If mercury concentrations in pregnant women follow the distribution N(20, 7) ppb, what percent of children should we expect to have at least double risk of neurodevelopment problems?

Note: we can NOT compute a probability (or percent) for an EXACT z-score; it must be for a range of values (see #2, below). I revised the question in yellow highlighter to reflect this fact.

z(32) = (32 - 20)/7 = 12/7 ≈ 1.7 (so 32 ppb is about 1.7 standard deviations above the mean)

Since I know my 1-2-3, 68/95/99.7 rule, I know that the percentage is between 2.5% (if it were 2 standard deviations) and 16% (if it were 1 standard deviation).

To approximate the percentage, look at the area that you have shaded on your sketch of the distribution, and guess what percentage of the total area under the curve the shaded area represents. Using the website (http://faculty.vassar.edu/lowry/tabs.html#z), I find that P(z >= 1.7) = 0.0446.

We should expect about 4.5% of children to be at least doubly at risk of neurodevelopmental problems due to mercury levels in pregnant women.
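Both page 46 answers can be checked without a table. The sketch below assumes the distributions really are normal and reads N(20, 7) as mean 20 and standard deviation 7; the helper `normal_cdf` is written for this illustration using only Python's standard `math.erf`.

```python
import math


def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


# TV problem: mean 5 hours, sd 2 hours, ASSUMING a normal distribution.
p_tv = normal_cdf(9, 5, 2) - normal_cdf(1, 5, 2)

# Mercury problem: mean 20 ppb, sd 7 ppb (our reading of N(20, 7)).
z_mercury = (32 - 20) / 7
p_mercury = 1.0 - normal_cdf(32, 20, 7)

# The same tail area when z is first rounded to 1.7, as in the solution.
p_rounded_z = 1.0 - normal_cdf(1.7, 0, 1)

print(f"P(1 <= hours <= 9)        = {p_tv:.4f}")
print(f"z for 32 ppb              = {z_mercury:.3f}")
print(f"P(mercury >= 32 ppb)      = {p_mercury:.4f}")
print(f"P(z >= 1.7), z rounded    = {p_rounded_z:.4f}")
```

Rounding z to 1.7 gives the 4.5% reported above; keeping z = 12/7 unrounded gives a slightly smaller tail, roughly 4.3%. Either way the answer sits between the 2.5% and 16% bounds from the 68/95/99.7 rule.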

2. Read pages 37-42. Think about how the first sentence at the top of page 38 relates to the difficulty of assigning a probability to observing a pine needle of length 5.42 (hunter problem, page 44). Remember, we had to find the probability of 5.42, or more extreme, to get an area under the curve (for the South Forest). (For the North Forest, it seems to make the most sense to report the z-score for the needle length, which would show just how close to the mean the needle length was.) No written answer is required, but put a star/note to yourself next to the key sentence.

The key here is that the AREA under the curve represents the proportion of values that are in the range you are investigating. To get an area, you have to multiply the height of the density curve by a “width” (range) on the x-axis. Thus, you cannot evaluate a probability at an exact point.
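One way to see this numerically is to shrink the width of the range and watch the area shrink with it. The sketch below uses a standard normal curve (mean 0, standard deviation 1) purely as an assumption, since the pine needle distribution is not restated here; the function names are illustrative.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def interval_probability(center, width, mean=0.0, sd=1.0):
    """Area under the normal curve over a range of the given width."""
    half = width / 2.0
    return normal_cdf(center + half, mean, sd) - normal_cdf(center - half, mean, sd)


# Ranges centered one standard deviation above the mean, getting narrower.
widths = [1.0, 0.1, 0.01, 0.001]
areas = [interval_probability(1.0, w) for w in widths]

for w, a in zip(widths, areas):
    print(f"width {w:<6} area {a:.6f}")
```

As the width goes to zero the area goes to zero, which is why we ask for “5.42 or more extreme” rather than “exactly 5.42.”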

3. IF you need more practice, repeat the problem on page 41, using a value of 70 mm instead of 66 mm for parts 1-5. For parts 6-8, recalculate the answers using a value of 80 mm instead of 77.5 mm. You may do this problem on page 61 of your workbook, in pencil, to keep all of your work together. Note: this is not a required problem, but it is highly recommended if you need extra practice.

o 1. z = 1
o 2. .16 or 16% (by the 68/95/99.7 rule, (1 - .68)/2 tails = .16)
o 3. 320 bones
o 4. 84%
o 5. .68/2 = .34 or 34%
o 6. p(z > 2) = .025 or 2.5%
o 7. 50 bones
o 8. 47.5%
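The percentages in these answers all come from slicing up the 68/95/99.7 rule. Here is a small sketch of that bookkeeping. The total of 2000 bones is an assumption, inferred from the answers (320 is 16% of 2000 and 50 is 2.5% of 2000) because the workbook problem itself is not reproduced here.

```python
# Fraction of a normal distribution within k standard deviations of the
# mean, according to the 68/95/99.7 rule.
WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}


def upper_tail(k):
    """Fraction more than k standard deviations ABOVE the mean."""
    return (1 - WITHIN[k]) / 2


def mean_to_k(k):
    """Fraction between the mean and k standard deviations above it."""
    return WITHIN[k] / 2


def below(k):
    """Fraction below k standard deviations above the mean."""
    return 1 - upper_tail(k)


TOTAL_BONES = 2000  # ASSUMPTION: inferred from the answers, not stated here

print("Part 2:", round(upper_tail(1), 3))                # z > 1
print("Part 3:", round(upper_tail(1) * TOTAL_BONES))     # bones with z > 1
print("Part 4:", round(below(1), 3))                     # z < 1
print("Part 5:", round(mean_to_k(1), 3))                 # 0 < z < 1
print("Part 6:", round(upper_tail(2), 3))                # z > 2
print("Part 7:", round(upper_tail(2) * TOTAL_BONES))     # bones with z > 2
print("Part 8:", round(mean_to_k(2), 3))                 # 0 < z < 2
```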

Sample solution for pages 50-51. I have attempted to explain my answers clearly, for easy understanding (since we went through it quickly in class). Please use your own words (and you can be much more concise) when you fill in your workbook.

Proper article citation:

Fried, Peter, B Watkinson, D James, and R Gray, 2002. Current and former marijuana use: preliminary findings of a longitudinal study of effects on IQ in young adults. Canadian Medical Association Journal. Online: http://www.ecmaj.ca/cgi/content/abstract/166/7/887 (click on Full Text.pdf for entire article)

1. Interpret the p-value of < 0.001 for “prenatal substance exposure (mean), Marijuana, joints/wk.”

The p-value always tells you the probability of getting the observed value(s), or more extreme, if the null (no-effect) hypothesis is true. But what is the null hypothesis?

For each of the “characteristics” listed in bold on the left hand side of Table 1, the null hypothesis is, “Characteristic has no effect on current level of marijuana use.”

For this question, the Null Hypothesis is: “average prenatal exposure to marijuana has no effect on current level of marijuana use.”

o An associated p-value of 0.001 means, “We would expect to get these values, or values even more extreme, 1/1000 times, just by random chance, if prenatal exposure to marijuana has no effect on current level of marijuana use.” That is SO unlikely (improbable) that we don’t believe it (we reject the null hypothesis at a .05 significance level). There is support for the alternative hypothesis (“prenatal exposure to marijuana has an effect on current level of marijuana use”).

o Notice that the statement is not specific as to “which” (of 1.4, 1.4, 11.6, 1.5) are too extreme, though I expect that smoking 11.6 joints/week, while pregnant, affects an unborn baby. This value makes me suspicious of the study. Does the sample of “heavy smokers, > 5 joints/wk” represent the entire population of heavy pot smokers adequately? I doubt that, worldwide, the average heavy smoker’s mother smoked 11.6 joints/wk while pregnant.

o Interestingly, this p-value says nothing about the effect of prenatal marijuana exposure on IQ (only on current smoking level). In the full report, the authors say, “although some characteristics did differ across the 4 groups (such as father’s and mother’s education), none of these was associated with the IQ difference score; therefore, they were not used as covariates.”

2. Which of the “family characteristics” measured had a “significant” effect on current users’ marijuana smoking?

Both “mother’s education” (p-value = .013) and “father’s education” (p-value = .0009) had a statistically significant effect on current levels of marijuana use. Try writing a sentence, “An associated p-value of 0.001 means…” for this problem. Think: what are the null and alternative hypotheses?

3. Do you believe that light marijuana smoking actually leads to an increase in IQ score (between preteen and young adult ages), compared to non-users?

o According to the chart, light smokers had a “within subject difference score of 5.8,” which is the largest increase among all groups (and the highest overall score). This suggests that smoking < 5 joints/wk improves IQ more than not smoking (non-users had a 2.6 average gain in IQ). I found this hard to believe, so I read more in the full document, which says, “For analyses in which number of joints smoked per week was treated as a categorical variable, ANOVA with Dunnett’s procedure indicated that the mean IQ difference score for the heavy current user group was significantly different from that for non-users (-4.0 v 2.6, p < 0.05) whereas no significant difference were evident in comparisons with the light current users and former users (5.8 vs. 2.6 and 3.5 vs 2.6).” Thus, it seems that the p-values in Table 1 only tell us that (at least) two of the four values presented are different enough from each other to make the difference statistically significant.

o A larger issue is that there are only 9 people in the light user group. This small sample size makes me question the entire study.

o Before I believe that marijuana either does, or does not, affect IQ, I would want to know if a causal mechanism has been suggested to explain how the drug affects IQ. This reasoning is similar to the controversy about low frequency electromagnetic fields and the possibility of exposure causing cancer. Though the two have been correlated, a causal mechanism has not been explained (or accepted by the scientific community). (http://www.who.int/mediacentre/factsheets/fs263/en/)

4. What other factors (confounding variables) might explain the observed relationship between marijuana use and increase, or decrease, in IQ score?

o As stated above, the only “statistically significant” effect of marijuana use on IQ (change from preteen to young adult) appeared to be between the non-users and heavy users (> 5 joints/wk).

o Perhaps the heavy smoking teens stopped attending class, thereby lowering their IQ. “Not going to class” is the confounding variable.

o Perhaps these heavy smoking adolescents are social deviants, who do not place high value on IQ tests, and thus do not try as hard. Thus, effort confounds the relationship between “current use” and “mean IQ score difference.”

Sample Solution for page 54:

Childhood leukemia and parents' occupational and home exposures.

Lowengart RA, Peters JM, Cicioni C, Buckley J, Bernstein L, Preston-Martin S, Rappaport E.

A case-control study of children of ages 10 years and under in Los Angeles County was conducted to investigate the causes of leukemia. The mothers and fathers of acute leukemia cases and their individually matched controls were interviewed regarding specific occupational and home exposures as well as other potential risk factors associated with leukemia. Analysis of the information from the 123 matched pairs showed an increased risk of leukemia for children whose fathers had occupational exposure after the birth of the child to chlorinated solvents [odds ratio (OR) = 3.5, P = .01], spray paint (OR = 2.0, P = .02), dyes or pigments (OR = 4.5, P = .03), methyl ethyl ketone (CAS: 78-93-3; OR = 3.0, P = .05), and cutting oil (OR = 1.7, P = .05) or whose fathers were exposed during the mother's pregnancy with the child to spray paint (OR = 2.2, P = .03). For all of these, the risk associated with frequent use was greater than for infrequent use. There was an increased risk of leukemia for the child if the father worked in industries manufacturing transportation equipment (mostly aircraft) (OR = 2.5, P = .03) or machinery (OR = 3.0, P = .02). An increased risk was found for children whose parents used pesticides in the home (OR = 3.8, P = .004) or garden (OR = 6.5, P = .007) or who burned incense in the home (surprising to me) (OR = 2.7, P = .007). The risk was greater for frequent use. Risk of leukemia was related to mothers' employment in personal service industries (OR = 2.7, P = .04) but not to specified occupational exposures. Risk related to fathers' exposure to chlorinated solvents, employment in the transportation equipment-manufacturing industry, and parents' exposure to household or garden pesticides and incense remains statistically significant after adjusting for the other significant findings.

This implies that the other reported risks (such as spray paint) are NOT significant once adjusted (presumably for other confounding factors).

How the Studies Are Done (Cancer in Children and Pesticide Exposure Summary - M. Moses, M.D.)

Epidemiology is the study of diseases and their causes in human populations. It compares groups of people with an exposure to those without it, or people with a disease to those without it. In the studies in this table, groups of children with cancer or with pesticide exposure are the “cases”. Groups of children without cancer or without exposure to pesticides are the “controls”.

The aim is to find out if the children with cancer (the cases) are more likely to have exposure to pesticides than the children without cancer (the controls). Or to find out if the children with pesticide exposure (the cases) are more likely to have cancer than children without pesticide exposure (the controls).

How Study Results are Reported

Study results are reported as risk ratios. These ratios indicate whether the children with cancer were more likely to be exposed to pesticides (at increased risk), equally likely to be exposed to pesticides (no difference in risk), or less likely to be exposed to pesticides (at decreased risk) than the children without cancer. Or whether the children with pesticide exposure were more likely to have cancer (at increased risk), equally likely to have cancer (no difference in risk), or less likely to have cancer (at decreased risk) than the children without pesticide exposure.

For example : In a study of leukemia**, the cases would be children with leukemia, and the controls children without it. There are three possible outcomes. The children with leukemia could be more likely, equally likely, or less likely to have exposure to pesticides.

1. More likely: If the ratio is greater than 1 (> 1), this means that the children with leukemia were more likely to have exposure to pesticides; that is, pesticide exposure increases the risk of leukemia. The size of the ratio indicates how much the risk is increased. A ratio of 1.4 means a 40% increase in risk. A ratio of 2.0 means a doubling of the risk, or a 100% increase. At least a doubling of the risk is considered more important than ratios less than 2.

2. Equally likely - If the ratio is equal to one ( = 1) this means that there was no difference in pesticide exposure found in the children with or without leukemia – pesticides did not increase the risk of leukemia in the study.

3. Less likely - If the ratio is less than one (< 1), this means that children with leukemia were less likely to be exposed to pesticides than children without it, or the risk was decreased. The smaller the number, the lower the risk. A ratio of 0.80 means that children with leukemia are 20% less likely to have been exposed to pesticides. A ratio of 0.40, that they are 60% less likely.
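The percentages quoted for each ratio follow from (ratio - 1) × 100. A minimal sketch of that conversion, with function names invented for this illustration:

```python
def percent_change_in_risk(ratio):
    """Percent change relative to a ratio of 1 (no difference in risk)."""
    return (ratio - 1.0) * 100.0


def describe(ratio):
    """Plain-language reading of a risk ratio."""
    change = round(percent_change_in_risk(ratio))
    if change > 0:
        return f"ratio {ratio}: {change}% increase in risk"
    if change < 0:
        return f"ratio {ratio}: {-change}% decrease in risk"
    return f"ratio {ratio}: no difference in risk"


# The ratios used as examples in the list above.
for r in (1.4, 2.0, 1.0, 0.80, 0.40):
    print(describe(r))
```

Under this formula a ratio of 2.0 (a doubling) is a 100% increase, and a ratio of 3.0 would be a 200% increase.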

When studying humans, it is impossible to determine every factor that might influence the results of a study. It might have occurred anyway, by chance (p-value). It is possible that any increase in risk was not from pesticides, but something else (if true cause is correlated with pesticide use, pesticide use would be a confounding factor). This could be something the researcher didn’t think of, or didn’t even ask about. Or it could be from pesticide exposure in combination (interesting.. pesticides + shaving cream?? ;) with other unknown or unstudied factors.

Therefore, finding an increase in risk does not mean that pesticides “cause” leukemia.

This is why it is common to report an increase in risk by stating that “pesticide exposure increases the risk of leukemia in children”, or “pesticide exposure is a risk factor for leukemia in children”, and not that pesticides “cause” leukemia.

Are the Study Results “Significant”?

There are methods to determine how strong the link or association between leukemia and pesticides is, and whether it could have occurred by chance (p-value!). They are called tests of statistical significance. The statistical part is usually left out, and the results reported as “significant” or “not significant”.

“p” value: This tests whether the findings could have occurred by chance 5% of the time or less. The 5% is converted to a decimal and written as 0.05. For example, you will see the results as “p = 0.05” (read as p equals point 0 5), or “p < 0.05” (read as p less than point 0 5), or “p ≤ 0.05” (read as p less than or equal to point 0 5).

If the “p” value is less than or equal to 0.05, the findings are considered to be statistically significant (very arbitrary); that is, they are unlikely to have occurred by chance. The smaller the “p” value, the more significant the findings. For example, “p ≤ 0.01” (read as p less than or equal to point 0 1) means that it could have occurred by chance 1% of the time or less.

• Which of the “treatments” was correlated with the largest increase in risk of childhood leukemia?

Children whose parents use pesticides in the garden [OR = 6.5, p = .007]. It is a little misleading; this actually means, “Children with leukemia are about 6.5 times as likely (in terms of odds) to have parents who use pesticides in the garden as kids without leukemia.” IT IS NOT THE SAME as “Parents who use pesticides in the garden are 6.5 times as likely to have kids with leukemia.”
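To double-check which exposure has the largest odds ratio, the values reported in the abstract can be listed and compared directly. The short labels in `odds_ratios` are paraphrases made up for this sketch; the numbers are copied from the abstract.

```python
# Odds ratios as reported in the abstract (labels are shortened paraphrases).
odds_ratios = {
    "father: chlorinated solvents": 3.5,
    "father: spray paint (after birth)": 2.0,
    "father: dyes or pigments": 4.5,
    "father: methyl ethyl ketone": 3.0,
    "father: cutting oil": 1.7,
    "father: spray paint (during pregnancy)": 2.2,
    "father: transportation equipment industry": 2.5,
    "father: machinery industry": 3.0,
    "parents: pesticides in home": 3.8,
    "parents: pesticides in garden": 6.5,
    "parents: incense in home": 2.7,
    "mother: personal service industries": 2.7,
}

# Sort from largest to smallest odds ratio.
ranked = sorted(odds_ratios.items(), key=lambda item: item[1], reverse=True)

for exposure, oratio in ranked:
    print(f"OR = {oratio:3.1f}  {exposure}")

largest_exposure, largest_or = ranked[0]
print("Largest:", largest_exposure, largest_or)
```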

• The study reports, “…increased risk of leukemia for children whose fathers had occupational exposure after the birth of the child to methyl ethyl ketone (CAS: 78-93-3 (chemical reference number); OR = 3.0, P = .05).”

o Write a sentence interpreting the OR and p-value for someone who has not taken a statistics class.

Kids with leukemia were about 3 times as likely (in terms of odds) to have fathers who worked with MEK as kids without leukemia. If MEK exposure had no real connection to leukemia, we would expect to get this odds ratio, or an odds ratio even more extreme, about 5% of the time (by chance alone).

o Search www.wikipedia.org for methyl ethyl ketone. Do you feel that this chemical is “safe” or “dangerous”? Explain.

It sounds pretty harmless. Before I start adding my 2 cents to Wiki, I think I will review a broader literature on links between leukemia and MEK (or other common solvents). In general, it is always good to look at multiple sources of information before deciding what to believe.

I would be much more easily convinced that MEK actually causes leukemia (the study just reports the risk associated with correlation) if a mechanism was proposed to explain how MEK leads to leukemia.

I also try to look at who the author is, their sponsoring organization, reputation, etc.

• What “significance level” (α) was used for this study? Offer an argument for a higher or lower significance level (you choose).

They used α (significance) = 0.05. Thus, if an odds ratio has less than a 5% chance of occurring just by chance, we believe that the effect is real. If I want to ‘take action sooner,’ and don’t need quite as compelling evidence, I might choose .10. Thus, if the effect (odds ratio, in this case… but think of “pine needle length” in the hunter problem) has a 10% chance of occurring, or less, I will take action (“reject the null hypothesis that there is no effect”). However, if the costs of “reacting” are high (or the costs of “not reacting” are low), we could set α (significance) = 0.01. In this case, we won’t reject the hypothesis that there was “no effect” beyond random chance until the probability of getting the effect by chance is 1/100 or less.

In general, “statistically significant” means “reject null,” “investigate further,” or “take action.”

If you want action to be taken sooner, choose a larger value for the significance level.
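To see how the choice of significance level changes which findings call for action, the p-values reported in the leukemia abstract can be compared against several cutoffs. This is only a sketch: the labels are shortened paraphrases, and the rule “significant if p is less than or equal to the cutoff” follows the definition quoted earlier.

```python
# P-values as reported in the abstract (labels are shortened paraphrases).
p_values = {
    "father: chlorinated solvents": 0.01,
    "father: spray paint (after birth)": 0.02,
    "father: dyes or pigments": 0.03,
    "father: methyl ethyl ketone": 0.05,
    "father: cutting oil": 0.05,
    "father: spray paint (during pregnancy)": 0.03,
    "father: transportation equipment industry": 0.03,
    "father: machinery industry": 0.02,
    "parents: pesticides in home": 0.004,
    "parents: pesticides in garden": 0.007,
    "parents: incense in home": 0.007,
    "mother: personal service industries": 0.04,
}


def significant_findings(alpha):
    """Exposures whose p-value is at or below the significance level."""
    return [name for name, p in p_values.items() if p <= alpha]


# Count how many findings pass at each significance level.
counts = {alpha: len(significant_findings(alpha)) for alpha in (0.10, 0.05, 0.01)}

for alpha, n in counts.items():
    print(f"alpha = {alpha:.2f}: {n} of {len(p_values)} findings significant")
```

A stricter level such as 0.01 leaves far fewer findings to act on, while 0.05 and 0.10 flag every finding in this abstract.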
