Voirin Final Report

Client’s Report:
Chase Voirin, Wildlife Management and Conservation
By Haomiao Yang
Background:
Your study compares two diet assessment methods: next-generation sequencing
and microhistology. Two separate mule deer populations, Chuska and Carrizo, are
studied to evaluate these two methods. For each mule deer population, 20
individual samples are collected and tested in each of two seasons: winter and
summer. Each method therefore yields 2 populations × 20 samples × 2 seasons =
80 outcomes.
You intend to compare the following factors between these two different diet
assessment methods:
1. Diet richness (the presence of unique plant items).
2. Taxonomic diet resolution (how many of those unique plant items can be
identified to the lowest level of taxonomy).
3. Frequency of occurrence of plant items across samples.
4. Proportions of each plant item in a diet.
All the factors above are called “diet composition”.
After the scat piles for each sample are collected, diet composition data are
obtained for both methods. Each method produces a percentage of “hits” for each
plant item at a given level of taxonomy (or sometimes morphology, as expressed in
the microhistology data: “shrub”, “flower”, “berry”, etc.). The mean percentages,
with their associated standard deviations, represent the mean proportion of “hits”
for that plant item across the 20 individual samples.
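As a rough illustration of how these summaries relate to the per-sample data, the following sketch computes richness, the mean proportion with its standard deviation, and the frequency of occurrence for each plant item. The plant names and proportions are hypothetical placeholders, not values from the study, and treating an undetected item as a proportion of 0 is an assumption.

```python
from statistics import mean, stdev

# Hypothetical data: proportion of "hits" per plant item in each scat sample.
# Three samples are shown for brevity; the study has 20 per population and season.
samples = [
    {"Quercus": 0.50, "Artemisia": 0.50},
    {"Quercus": 0.25, "Pinus": 0.75},
    {"Quercus": 0.75, "Artemisia": 0.25},
]

def summarize(samples):
    """Return richness and, per plant item, (mean, sd, frequency of occurrence)."""
    items = sorted({item for s in samples for item in s})
    summary = {}
    for item in items:
        # Assumption: an item not detected in a sample contributes a proportion of 0.
        props = [s.get(item, 0.0) for s in samples]
        # Frequency of occurrence: share of samples in which the item was detected.
        freq = sum(p > 0 for p in props) / len(samples)
        summary[item] = (mean(props), stdev(props), freq)
    return len(items), summary

richness, summary = summarize(samples)
```

Running the same function on the microhistology and the next-generation sequencing data for one population and season would give two summaries that can be compared item by item.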
Problems:
1. What type of diversity index (e.g. Shannon-Wiener, Simpson, etc.) would
be appropriate to show the differences in richness between the methods, while
also taking into account the discrepancies in the plants each method
identifies (e.g. some plant species, genera, and families were discovered by one
method but not the other, and vice versa)?
2. What would make a good statistical representation of differences in
proportions of diet data between each method?
3. You would like to use software to produce figures that show the
differences between the methods’ results. What kinds of figures can be used to
show the results?
Recommendations:
1. For each deer population, a Venn diagram can be used to show the results.
Each plant is placed in the circle of the method that detects it: plants
detected by microhistology go in the microhistology circle, and plants detected
by next-generation sequencing go in the next-generation sequencing circle. The
intersection contains the plants detected by both methods.
2. Only three factors determine the diet composition result: season (summer or
winter), deer population (Chuska or Carrizo), and detection method (mic or
ngs). All three are binary variables, so a three-dimensional cube plot can be
used to show the results, with the axes coded as follows:
x-axis: $x = \begin{cases} 0 & \text{Chuska} \\ 1 & \text{Carrizo} \end{cases}$;
y-axis: $y = \begin{cases} 0 & \text{mic} \\ 1 & \text{ngs} \end{cases}$;
z-axis: $z = \begin{cases} 0 & \text{Winter} \\ 1 & \text{Summer} \end{cases}$.
Each plant is then placed at the vertex of the cube that corresponds to the
conditions under which it is detected. For example, plants detected in Chuska
deer in summer by the ngs method are placed at the point (0,1,1). The advantage
of this figure is that the differences between conditions can be seen clearly.
3. A Bland-Altman plot can be used to show the results.
Consider a set of 20 samples in summer (or winter). Both methods are
performed on each sample, resulting in 40 data points. Each of the 20 samples
is then represented on the graph by assigning the mean of the two
measurements as the x-axis value and the difference between the two
measurements as the y-axis value. Hence, a sample S with values $S_1$ and $S_2$
determined by the two methods is plotted at the point:
$S(x, y) = \left( \frac{S_1 + S_2}{2},\ S_1 - S_2 \right)$
4. MA plot
An MA plot is an application of the Bland-Altman plot in which the two sets of
measurements (mic and ngs) are transformed onto the M (log ratio) and A (mean
average) scales before plotting.
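The recommendations above can be sketched in a few lines. This is a minimal illustration, not a finished analysis: the function names, plant names, and proportions are hypothetical, and the MA transform follows the usual convention M = log2(S1) − log2(S2) and A = (log2(S1) + log2(S2)) / 2, which is assumed here rather than stated in the report.

```python
import math

def venn_regions(mic_items, ngs_items):
    """Split plant items into the three regions of a two-set Venn diagram."""
    mic, ngs = set(mic_items), set(ngs_items)
    return {"mic_only": mic - ngs, "both": mic & ngs, "ngs_only": ngs - mic}

def bland_altman_points(s1, s2):
    """Per sample: x = mean of the two measurements, y = their difference."""
    return [((a + b) / 2, a - b) for a, b in zip(s1, s2)]

def ma_points(s1, s2):
    """Per sample: x = A (mean of the log2 values), y = M (log2 ratio).

    Both measurements must be positive; zero proportions would need
    separate handling (for example a small offset), which is not shown.
    """
    points = []
    for a, b in zip(s1, s2):
        la, lb = math.log2(a), math.log2(b)
        points.append(((la + lb) / 2, la - lb))
    return points

# Hypothetical proportions of one plant item in three samples, by method.
mic = [0.5, 0.25, 0.125]
ngs = [0.25, 0.25, 0.5]

regions = venn_regions({"Quercus", "shrub"}, {"Quercus", "Pinus"})
ba = bland_altman_points(mic, ngs)
ma = ma_points(mic, ngs)
```

The resulting (x, y) pairs can be passed to any plotting tool as a scatter plot, and the three Venn regions give the item counts for each part of the diagram.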