One of the crucial issues facing Allied military planners during World War II was the challenge of estimating the number of German tanks, half-tracks and aircraft. Intelligence officers used a variety of methods to compile reasonable estimates. Because prewar and ongoing propaganda emphasized the strength of the German forces and because prisoners of war gave exaggerated accounts, the Allied military leaders received highly inflated estimates.
As the war progressed, Allied forces gathered information about serial numbers of captured tanks. In time, cryptographers concluded that all Mark I tanks were numbered sequentially. By looking at the serial numbers of the sample, that is, the captured tanks, mathematicians could use statistical techniques to revise the estimates prepared by intelligence officers.
1. Commit to an outcome
Assume that the Allies had captured tanks numbered 58, 20, 74 and 8 during one week early in the war. Estimate the number of tanks in service at that time.
2. Expose beliefs
Share your ideas in your small group. Select a member of your group to share all of your group members’ estimates and explanations with the class.
3. Confront beliefs
In your group, decide on a strategy to make an estimate.
Share your method(s) with other groups.
4. Accommodate the concept
In your own words, what have you learned about effective strategies to make such estimates?
5. Extend the concept
Besides those you have heard already from other classmates, what are
some other examples of strategies that can be used to estimate
the number of German tanks in service at that time?
What are some other real-world things that would be estimated the
same way?
6. Go beyond
What if the next captured tank had the number 190? How would that
additional information alter the predictions generated already?
Simulation Using a Random Number Generator
To test the methods further, an experiment is suggested in which one student chooses a number to represent the total number of German tanks. That student then records the number without revealing it to anyone. By using a graphing calculator to select random numbers from 1 to the selected N, the student supplies “data” for the class to use.
The class is divided into five groups. Group 1 doubles the mean, group 2 doubles the median, group 3 doubles the midrange, group 4 adds S – 1 to L (where S is the smallest observed tank number and L is the largest), and group 5 adds the mean interval length to L, using the formula $L + \frac{L - 1}{n}$ (where L is the largest observed tank number and n is the number of tanks observed).
After the student reveals the chosen number, classmates compare estimates to learn which group’s estimate is closest. Repeat this experiment with a different student choosing a number and with individual students choosing their preferred method of making the estimate. After comparing the various estimates, focus a discussion on the characteristics of a good estimate and the ways to choose from among several possible estimates.
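For classes with access to a computer, the comparison of the five groups can also be repeated many times in software. The following Python sketch is one possible way to do so, not part of the original activity. The function names, the hidden total of 300, the sample size of 5 and the number of trials are all illustrative assumptions. Group 5's rule is taken to be L + (L – 1)/n, a reading that matches the mean interval length of 18.25 reported under Method 5.

```python
import random
import statistics


def estimates(sample):
    """Return the five groups' estimates of N for one sample of serial numbers."""
    n = len(sample)
    smallest, largest = min(sample), max(sample)
    return {
        "double mean": 2 * statistics.mean(sample),
        "double median": 2 * statistics.median(sample),
        "double midrange": smallest + largest,        # 2 * (S + L) / 2
        "S - 1 + L": smallest - 1 + largest,
        "L + (L - 1)/n": largest + (largest - 1) / n,  # assumed group 5 rule
    }


def mean_abs_error(true_n, sample_size, trials, seed=0):
    """Average distance between each estimate and the hidden total."""
    rng = random.Random(seed)
    totals = {}
    for _ in range(trials):
        # rng.sample draws distinct numbers, i.e. without replacement.
        sample = rng.sample(range(1, true_n + 1), sample_size)
        for name, est in estimates(sample).items():
            totals[name] = totals.get(name, 0.0) + abs(est - true_n)
    return {name: total / trials for name, total in totals.items()}


# Hypothetical settings: 300 tanks in service, 5 captured, 2000 repetitions.
errors = mean_abs_error(true_n=300, sample_size=5, trials=2000)
for name, err in sorted(errors.items(), key=lambda kv: kv[1]):
    print(f"{name:16s} {err:6.1f}")
```

The printed list ranks the methods by how far, on average, their estimates fall from the hidden number. Changing `true_n` or `sample_size` gives the class further cases to discuss.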
During World War II, sampling German tanks was done without replacement; that is, no chance existed of observing the same serial number a second time. To make the simulation as accurate as possible, the experiment should also be done without replacement; if the calculator happens to choose the same random number a second time, that piece of data should be eliminated and another random number should be generated. If the sample size is small compared with N, say, less than 5 percent, the difference between sampling with replacement and sampling without replacement is negligible.
(Students who are interested in programming may wish to accomplish this task by using a ‘While’ command in a program on their calculator.)
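The same redraw-on-repeat idea can be sketched in Python for students who prefer a computer to a calculator. This is a minimal illustration rather than a prescribed program. The function name `draw_without_replacement`, the total of 250 and the sample size of 5 are assumptions made for the example.

```python
import random


def draw_without_replacement(n_total, sample_size, rng=None):
    """Draw distinct serial numbers from 1..n_total, redrawing on repeats."""
    rng = rng or random.Random()
    if sample_size > n_total:
        raise ValueError("sample_size cannot exceed n_total")
    seen = []
    while len(seen) < sample_size:
        candidate = rng.randint(1, n_total)  # inclusive at both ends
        if candidate not in seen:            # a repeated number is discarded
            seen.append(candidate)
    return seen


# Hypothetical run: 250 tanks in service, 5 captured.
sample = draw_without_replacement(n_total=250, sample_size=5,
                                  rng=random.Random(1))
print(sample)
```

Python's built-in `random.sample(range(1, n_total + 1), sample_size)` produces a sample without replacement in a single call; the explicit `while` loop is shown only because it mirrors the calculator procedure.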
[The tank-estimation problem is easy to simulate another way. Students can draw objects without replacement from a bag of items numbered from 1 to N. Imitating the real-world problem is possible by using numbered toy tanks as items for the drawing. Such a method, however, is not realistic because the instructor is estimating a known quantity. Students are rightfully skeptical and believe that the situation is contrived. They need to be convinced that estimation techniques are used in the real world only when the actual value is not known.]
Methods that may or may not have been covered by a particular class of students:
Method 1: One group calculated the sample mean, that is, $\frac{58 + 20 + 74 + 8}{4}$, or 40, and doubled that number to obtain 80 as the estimate.
Method 2: One group found the sample median, that is, $\frac{20 + 58}{2}$, or 39, and doubled that value for an estimate of 78.
Method 3: One group suggested doubling the midrange (even though they did not know that terminology). Estimate: $2 \times \frac{8 + 74}{2} = 82$.
Method 4: One group finally said, “It’s seven from ‘1’ to the first serial number, so it’s probably about 7 from ‘74’ to the last number,” and suggested 81 as an estimate. [This could be thought of as S – 1 + L, where L is the largest observed tank number and S is the smallest.]
Method 5: One group found each interval length and then calculated the mean interval length to be $\frac{7 + 12 + 38 + 16}{4} = 18.25$, resulting in an estimate of 74 + 18, or 92, which they noted was somewhat higher than the estimates from the other methods. When pressed to summarize the steps in method 5, one group noticed that the mean interval length could be found more efficiently by calculating $\frac{L - 1}{n}$, where L is the largest observed tank number and n is the number of tank numbers observed.
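The arithmetic behind the five methods can be checked with a short Python sketch applied to the captured serial numbers 58, 20, 74 and 8. The variable names are illustrative. The interval lengths are assumed to run from 1 to the first serial number and then between consecutive serial numbers, which is the reading that reproduces the mean of 18.25.

```python
import statistics

serials = sorted([58, 20, 74, 8])   # captured tanks from the activity
n = len(serials)
smallest, largest = serials[0], serials[-1]

# Interval lengths: from 1 to the first serial, then between neighbours.
starts = [1] + serials[:-1]
gaps = [b - a for a, b in zip(starts, serials)]   # [7, 12, 38, 16]
mean_gap = sum(gaps) / n                          # 18.25

method_1 = 2 * statistics.mean(serials)     # double the mean: 80
method_2 = 2 * statistics.median(serials)   # double the median: 78
method_3 = smallest + largest               # double the midrange: 82
method_4 = smallest - 1 + largest           # S - 1 + L: 81
method_5 = largest + mean_gap               # L + mean interval: 92.25

# The shortcut (L - 1)/n gives the same mean interval length.
assert mean_gap == (largest - 1) / n

for label, value in [("Method 1", method_1), ("Method 2", method_2),
                     ("Method 3", method_3), ("Method 4", method_4),
                     ("Method 5", method_5)]:
    print(label, value)
```

Method 5 gives 92.25 before rounding; the group's figure of 92 comes from rounding the mean interval length to 18 before adding it to 74.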