Materials for Learning Mathematical Statistics

Peter Petocz & Narelle Smith
University of Technology, Sydney
Broadway, NSW 2007
peterp@maths.uts.edu.au ; narelle@maths.uts.edu.au
In this paper we discuss the preparation and use of a set of worksheets to assist students
studying a second statistics course in mathematical statistics. We talk about the learning
principles on which these worksheets are based, and give examples from them. We also
include comments from lecturers and students who have used the worksheets.
Introduction
There has been much discussion in statistics education concerning the best approach to the first
statistics course. In Australia three workshops have focused on this topic (July, 1992, Perth;
July, 1994, Melbourne; July, 1996, Sydney) while internationally many papers in the ICOTS
series have also taken introductory statistics education as their topic ([1],[2]). Papers in the
electronic Journal of Statistics Education routinely discuss some aspect of the first statistics
course, [3],[4]. In contrast, the literature on further statistics education is more sparse and
tends to focus on applied statistics, [5],[6].
Most educators believe that a first statistics course should be based on statistical thinking and
ideas, at the expense of mathematical detail in areas such as probability and random variables.
For some students, this is the only statistics course that they take, but for students studying
courses majoring in statistics there is of course a need for the mathematical background: the
study of combinatorics, probability, random variables, expectation, generating functions
and limit theorems. For such students, the problem of coming to grips with the necessary
mathematics still remains, and may even be exacerbated by a first course which focuses on
statistical ideas rather than the mathematical principles.
This is the position in our degree at UTS. Students take a degree in mathematical sciences
(majoring in statistics or operations research), mathematics and computing, or mathematics
and finance. All take an introductory statistics course in first year, followed by a course in
mathematical statistics in second year. This course uses a standard textbook, [7]. They take
further statistics courses in third year depending on their specialisation.
The Problem
Historically, students have found the mathematical statistics course a difficult one. They have
trouble applying the mathematical techniques that they have learnt in their other courses to the
area of statistics (eg summing the series for the mean of a Poisson distribution, integrating to
find the moment generating function of an exponential distribution). Comments from students
show that many of them believe that they can "cram" for the examination in a few days,
and so they do not seriously engage with the material during the classes. Other students are
put off by a lack of relevance to the "real world." Despite the fact that they have chosen to
study for a degree in mathematical sciences, they are not very interested in or knowledgeable
Petocz & Smith { Mathematical Statistics
about mathematics. They seem to have a particularly weak understanding of the nature of
mathematical proof.
Solutions
We have been debating our approach to the pedagogy of this subject for many years. One
particular focus of our attention has been the development of appropriate learning materials
to assist students of mathematical statistics. A summary of the course has been written,
containing an outline of topics and worked solutions to textbook questions, [8], and this has
attracted positive comment from students. A video-based package of materials for learning
combinatorics has also been prepared, [9], [10]. This consists of a drama-based exploration
of combinatoric ideas and applications, and a booklet of questions that stress language and
interpretation problems, historical background and unusual practical applications. This is also
a popular learning resource with students, although some of our colleagues find the concept of
a jailed professor discussing combinatorics with his fellow prisoners (including ways of escaping
from their cell) rather bizarre, and so don't use it when they are teaching the course!
Our latest effort along these lines is the preparation of a series of 10 worksheets covering the
topics in the mathematical statistics course. These build on the success of the Minitab-based
statistical laboratories, [11], that we developed for use in our introductory statistics course.
We have found that these statistical computer-based and data-driven exercises give students
an opportunity to actually practise statistics, rather than just listening to us talking about the
subject. Students concur, and in our course appraisals we always get many positive comments
about these laboratories.
Our mathematical statistics worksheets use many of the features of the introductory statistics
laboratories. They are designed to be carried out in small groups of two or three students
working together in a tutorial. The worksheets are mostly based on applied situations, and
make particular use of the idea of modelling data with probability distributions. They tend
to be grouped into 5 or 10 questions on the same problem, and ask students to plan, discuss
and carry out a strategy, and to explain the reasons for their approach. We believe that the
inclusion of reading, explaining and writing, rather than just carrying out mathematical manipulations, enables students to obtain a fuller appreciation of the techniques and applications
of mathematical statistics.
Some of the groups of questions end with an open-ended problem, allowing and encouraging
further exploration. Other questions tackle the problem of understanding the nature of proof,
using some of the principles expounded in Smith and Petocz, [12]. Rather than just asking
students to supply a proof of a result, we ask them to comment on a particular step in the
proof, give a reason for a step, fill in a missing step, put proof steps in logical order, or construct
a variation on a proof for an alternative case.
Although originally the worksheets were optional, we now include them, or some subset of
them, as part of the assessment for the subject, so students can get some credit for the regular
work they are doing. A simple marking scale of 0 to 3 is used for each worksheet to allow tutors
to quickly check over the students' work, yet at the same time to reward their effort.
Learning Principles
We believe that learning in statistics takes place only if students actively engage with material,
individually constructing their understanding of mathematical and statistical ideas, linking it
The Challenge of Diversity { 0 99
to their previous knowledge. This constructivist idea is summarised in more detail in Garfield,
[13]. Our aim as teachers is to encourage this 'deep learning', characterised by an intention
to understand, rather than 'shallow learning', characterised by an intention to complete task
requirements: the notions of deep and shallow learning are discussed by Ramsden, [14].
Garfield, [13], lists principles of learning statistics, several of which are relevant in the context
of examining the effectiveness of these worksheet materials. These include the notions that
students learn by active involvement - preferably in small groups - in learning activities, learn
to do well only what they practise, and benefit from using technology that enables them to
visualise and explore data in a variety of ways. Garfield's summary of research in statistics and
mathematics education also notes that working in small groups seems to lead to more positive
attitudes, and that activities involving written discussion and explanation lead to improved
understanding.
We aim to provide a student-focused learning environment, wherein students are free to use a
variety of learning tools that enhance their own view of learning. Giving students a variety of
ways to learn, or to move through a course of study, can
enhance their learning. This variety is particularly important for some groups of learners. Race,
[15], writes that "As the proportion of mature and non-traditional entry learners increases,
we need to complement traditional teaching and learning approaches by creating additional
flexible learning pathways, and to replace entirely some traditional approaches disliked by
mature learners". The materials that we are developing for learning mathematical statistics
have this as one of their goals.
Another goal is to build up students' confidence with small, early successes, and this is why
the worksheet questions are broken down into smaller blocks than typical textbook questions.
This avoids the problem of students never succeeding with typical tutorial questions in class
or on their own.
Examples
Some examples of questions from the worksheets are included here to indicate the approach
that we have taken. (In the actual worksheets, spaces are left for students' answers.)
Sharing Jobs: How many ways are there of sharing 6 jobs between 3 contractors?
In this form, there are several interpretations of the question, leading to several different answers.
Sharing can be interpreted in three different ways: each contractor gets the same number
of jobs, at least one job, or any number of jobs. The jobs can be considered as distinguishable
or indistinguishable (depending on whether you can or want to tell the jobs apart).
The contractors can be considered as distinguishable or indistinguishable (depending on
whether you want to tell them apart). How many different interpretations of the original
question does this lead to?
If each job is different, each contractor is identified and each contractor can get any
number of jobs, what is the answer to the original question?
At the other extreme, what is the answer if the jobs are all the same, contractors are not
considered identifiable, and each contractor must get the same number of jobs?
Assume that the jobs are the same but the contractors are distinguished and each can
get any number of jobs. How many ways are there then of sharing the jobs?
Describe any other interpretation and then give an answer to the original question.
(We have identified and acknowledged a linguistic ambiguity, asked for solutions under several
assumptions and then given an open-ended task at the end.)
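Several of the interpretations listed above can be checked by brute-force enumeration. The following Python sketch is an illustration only and does not come from the worksheets; the variable names are assumptions made for this example. It lists every way of assigning 6 labelled jobs to 3 labelled contractors and then collapses the list according to what is treated as indistinguishable.

```python
from itertools import product
from math import comb

JOBS, CONTRACTORS = 6, 3

# Three meanings of "sharing" x jobs distinguishable or not x contractors distinguishable or not.
interpretations = 3 * 2 * 2

# Distinct jobs, distinct contractors: each tuple records which contractor gets each job.
assignments = list(product(range(CONTRACTORS), repeat=JOBS))
distinct_any = len(assignments)

# As above, but every contractor must receive at least one job.
distinct_at_least_one = sum(1 for a in assignments if len(set(a)) == CONTRACTORS)

# As above, but every contractor must receive the same number of jobs (two each).
share = JOBS // CONTRACTORS
distinct_equal = sum(
    1 for a in assignments if all(a.count(c) == share for c in range(CONTRACTORS))
)

# Identical jobs, distinct contractors: only the number of jobs per contractor matters.
count_vectors = {tuple(a.count(c) for c in range(CONTRACTORS)) for a in assignments}
identical_any = len(count_vectors)

# Identical jobs, indistinguishable contractors: only the sorted counts matter.
unlabelled_any = len({tuple(sorted(v)) for v in count_vectors})

# Identical jobs, indistinguishable contractors, equal shares: only (2, 2, 2) qualifies.
unlabelled_equal = sum(1 for v in {tuple(sorted(v)) for v in count_vectors} if len(set(v)) == 1)

print(interpretations, distinct_any, distinct_at_least_one, distinct_equal)
print(identical_any, comb(JOBS + CONTRACTORS - 1, CONTRACTORS - 1), unlabelled_any, unlabelled_equal)
```

Enumeration is only feasible because the numbers are small. Its value here is as a check on the counting arguments (powers, multinomial coefficients, "stars and bars") that students are expected to find by hand.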
Airline Reservations: During business hours, the number of calls per minute X on a computer
reservation system follows a Poisson distribution with probability function
$p(x) = P(X = x) = \frac{e^{-3.5}\, 3.5^x}{x!}, \quad x = 0, 1, 2, \ldots$
Show that this is a proper probability function (in other words, that all the probabilities
are $\geq 0$ and that they sum to 1).
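Examine the following steps in a proof:
- If $X \sim \mathrm{Bin}(n, p)$, then $p(x) = \binom{n}{x} p^x (1 - p)^{n-x}$, $x = 0, 1, 2, \ldots, n$
- Let $n \to \infty$ and $p \to 0$ in such a way that $np = \lambda$
- Then $p(x) = \binom{n}{x} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x}$
  $= \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^n \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{x-1}{n}\right)$
- As $n \to \infty$, $\left(1 - \frac{\lambda}{n}\right)^n \to e^{-\lambda}$ and each of the following terms tends to 1.
- So $\lim_{n \to \infty} p(x) = \frac{\lambda^x e^{-\lambda}}{x!}$, and hence $X \sim \mathrm{Poisson}(\lambda)$.
What theorem is being proved here? Write a brief explanation of each of the steps and
fill in any missing mathematical details.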
(First, a standard question reinforcing basic ideas about probability functions. Then, standard
theory with non-standard questions involving ideas of proof.)
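Both parts of this example can also be explored numerically. The Python sketch below is an illustration under stated assumptions rather than worksheet material: it takes the rate 3.5 from the probability function above, truncates the infinite sum at 60 terms, and compares binomial and Poisson probabilities only for x from 0 to 15. The function names are hypothetical.

```python
from math import comb, exp, factorial

RATE = 3.5  # calls per minute, from the worksheet's probability function


def poisson_pmf(x, lam=RATE):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Partial sum of the Poisson probabilities; the neglected tail beyond x = 59 is negligible.
total = sum(poisson_pmf(x) for x in range(60))


def max_gap(n, lam=RATE, upto=15):
    """Largest pointwise gap between Bin(n, lam/n) and Poisson(lam) for x = 0..upto."""
    p = lam / n
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(upto + 1))


# Hold np = 3.5 fixed while n grows, as in the second step of the proof.
gaps = [max_gap(n) for n in (20, 200, 2000, 20000)]

print(f"partial sum of probabilities: {total:.12f}")
for n, gap in zip((20, 200, 2000, 20000), gaps):
    print(f"n = {n:>6}: largest gap = {gap:.2e}")
```

A numerical check of this kind shows the gap shrinking as n grows, but it does not replace the proof: the limiting argument in the steps above is what establishes the result for every x.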
Queuing Distributions: The Erlang distribution was named after A.K. Erlang, a Danish
telecommunications engineer who studied queuing theory early in the twentieth century. If S represents
the service time in a queue where there are k independent stages of service, each lasting an
average of m minutes, then $S \sim \mathrm{Erlang}(k, m)$ with
$f(s) = \frac{s^{k-1} e^{-s/m}}{m^k (k-1)!} \quad (s > 0)$
Write down the integral that represents $M_S(t)$, the moment generating function of the
service time. Which variable will be the variable of integration, and which variable will
be left?
You can evaluate this integral without doing any integration! Describe your general plan
for doing this, and explain why you will not need to perform an integration.
Evaluate the integral to show that $M_S(t) = (1 - mt)^{-k}$ for $t < \frac{1}{m}$.
Differentiate this moment generating function to find the first two moments of S.
Now expand the moment generating function to get the first two moments of S. You may
remember that Isaac Newton showed that the binomial theorem
$(1 + ax)^n = 1 + n(ax) + \frac{1}{2} n(n-1)(ax)^2 + \frac{1}{3!} n(n-1)(n-2)(ax)^3 + \ldots + (ax)^n$
holds also for negative values of n, although then, of course, the series is infinite.
(In an applied setting, we focus on a common problem in setting up the moment generating
function, help students to plan out an approach and to carry out their plan. Then we ask them
to demonstrate how the moment generating function can be used to get moments, with a bit
of historical background thrown in.)
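One way the plan in this example might be carried out is sketched below. It is a worked outline under the density given above, not the worksheet's own model answer, and the rescaled parameter $m'$ is a symbol introduced here for illustration.

```latex
% Set up the integral: s is integrated out and t is left.
M_S(t) = E\left[e^{tS}\right]
       = \int_0^\infty e^{ts}\, \frac{s^{k-1} e^{-s/m}}{m^k (k-1)!}\, ds
       = \int_0^\infty \frac{s^{k-1} e^{-s(1/m - t)}}{m^k (k-1)!}\, ds .

% Assumption: t < 1/m, so that m' = m / (1 - mt) is positive and 1/m - t = 1/m'.
M_S(t) = \left(\frac{m'}{m}\right)^{k}
         \int_0^\infty \frac{s^{k-1} e^{-s/m'}}{(m')^k (k-1)!}\, ds
       = \left(\frac{m'}{m}\right)^{k} \cdot 1
       = (1 - mt)^{-k} .

% Moments by differentiation.
M_S'(t)  = km\,(1 - mt)^{-k-1}            \;\Rightarrow\; E[S]   = M_S'(0)  = km ,
M_S''(t) = k(k+1)m^2\,(1 - mt)^{-k-2}     \;\Rightarrow\; E[S^2] = M_S''(0) = k(k+1)m^2 .

% Moments by expansion, using the binomial series with n = -k and ax = -mt.
(1 - mt)^{-k} = 1 + km\,t + \frac{1}{2}\,k(k+1)m^2\,t^2 + \ldots
```

The remaining integral is the total probability of an Erlang(k, m') density, so it equals 1 and no integration is needed. In the expansion, the coefficient of $t^r / r!$ is the r-th moment, which reproduces $E[S] = km$ and $E[S^2] = k(k+1)m^2$.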
Reactions from Lecturers and Students
The worksheets have been well received by lecturers running the course in mathematical statistics. They have generally used them in weekly tutorials and sometimes as assignment questions.
Here are comments from two lecturers:
I found the worksheets to be a useful tool to help students gain an understanding
of the theory discussed in lectures. The problems in the worksheets are different
from the questions given in the textbook and often require the students to think about
what they are doing, rather than blindly using a formula without understanding the
rationale behind it. However, there needs to be an incentive for the students to do
the worksheets, such as handing them in to be included in their assessment for the
subject, otherwise a lot of students do not complete all the worksheets.
The mathematical statistics worksheets are a very useful tool to help students understand the fundamentals of some of the basic ideas in a setting that relates these
concepts to real-life problems.
When we have taught the course ourselves, we have found the worksheets to be a useful alternative to traditional lecturing. For instance, we have asked students to go through the worksheet
on order statistics and bivariate transformations working in small groups, rather than give a
lecture on the topic.
Many of the students have also expressed positive opinions. The following comments are taken
from final subject appraisals and telephone conversations with students:
I liked how the questions in the assignments were asked relating to practical events,
for example, fat and protein in the diet. [referring to one of the problem sets on bivariate
probability functions]
Worksheets - a great idea! Will have a lot of applications in real life.
Some of the worksheet examples were very good. They took it from a different perspective. They built up so you had to do the first bit before you did the next one,
and so on. Answers would be good so that you could check that you were going on
the right track.
We have a worksheet for each topic - that's good. I like that because I have understood
the work I have done.
However, some students were less impressed:
I didn't use the worksheets at all because they didn't have answers. Given the shortness of time, I focussed on questions from the textbook with worked solutions in the
blue course outline.
The course was a bit disjointed. The questions in the textbook, the worksheets and
the exams were quite different.
On the basis of these and other comments, we would in future supply students with answers
to the numerical questions so that they could check their progress. Of course, many questions
ask for discussion or explanation, so answers are not so appropriate. We also recommend to
lecturers that at least some of the worksheets should be handed in for credit, either as solo or
small-group efforts.
Comments and Conclusions
We don't believe that we have solved the problems associated with teaching a second course
in mathematical statistics, although we find that the course we are giving now has been significantly improved over the last few years. We are planning to extend the set of worksheets
to cover other aspects of the course, with more worksheets pitched at a simpler level than the
present ones. We aim to have a set of materials which can be used successfully for individual
study of the whole course.
In further worksheets, we would also strengthen the links between statistics and mathematics,
using topics such as re-sampling as an alternative to carrying out a t-test, and checking whether
a particular set of data was adequately described by (say) an exponential distribution. We have
also been investigating the use of interactive web-based laboratories in mathematical statistics:
a preliminary version of the combinatorics worksheet has already been prepared.
We have discussed whether to integrate the worksheets more with a statistics package such as
Minitab or a computer algebra system such as Mathematica. Both have an important place
in mathematical statistics: for instance, Minitab can be used to check whether an exponential
distribution is an appropriate model for a set of data, and Mathematica can be used to set up
and evaluate integrals representing joint probabilities. However, we feel that it is important
for students to understand the basic 'tricks' of mathematical statistics (eg that 'almost every
integral equals 1', as in the queuing example above). Students need some time with paper and
pencil to fully appreciate these techniques.
The biggest problem remains the following. How can we motivate students to find the mathematical side of statistics interesting and important? We believe that appropriate learning
contexts and materials are an important part of the solution.
References
[1] Dansie, B. (1998). Using Collaborative Learning Packages to Teach Introductory Statistics
at the Post-Secondary Level, Proc. 5th Int. Conf. on Teaching Statistics, IASE.
[2] Witmer, J.A. (1998). Using Activities in Stats 101, Proc. 5th Int. Conf. on Teaching
Statistics, IASE.
[3] Steinhorst, R.K. & Keeler, C.M. (1995). Developing Material for Introductory Statistics Courses from a Conceptual, Active Learning Viewpoint, Journal of Statistics Education, 3(3).
[Online at http://www.amstat.org/publications/jse/toc.html]
[4] Roiter, K. & Petocz, P. (1996). Introductory Statistics Courses - A New Way of Thinking,
Journal of Statistics Education, 4(2).
[Online at http://www.amstat.org/publications/jse/toc.html]
[5] Simonoff, J.S. (1997). The 'Unusual Episode' and a Second Statistics Course, Journal of
Statistics Education, 5(1).
[Online at http://www.amstat.org/publications/jse/toc.html]
[6] Love, T.E. (1998). A Project-Driven Second Course, Journal of Statistics Education, 6(1).
[Online at http://www.amstat.org/publications/jse/toc.html]
[7] Wackerly, D.D., Mendenhall, W. & Scheaffer, R.L. (1996). Mathematical Statistics with
Applications, 5th ed., Duxbury Press.
[8] Petocz, P. (1996). 35252 Statistics 2 Course notes and worked solutions to exercises, School
of Mathematical Sciences, University of Technology, Sydney.
[9] Petocz, P. & Wood, L.N. (1993). Count Me In - Combinatorics ... The Art of Counting,
(Video, 24 mins, book of exercises), University of Technology, Sydney and Open Training
and Education Network.
[10] Petocz, P. & Petocz, D. (1994). The art of counting - Materials for teaching and learning
combinatorics, Proc. 4th Int. Conf. on Teaching Statistics (2), Marrakech, 485.
[11] Petocz, P. (1998). Statistical Laboratory Exercises Using Minitab, John Wiley & Sons,
New York.
[12] Smith, G. & Petocz, P. (1994). Proofs: Teaching and Testing - A Tragedy in Three Acts,
Int. J. Mathematics Education in Science and Technology, 25(5), 139-158.
[13] Garfield, J.B. (1995). How Students Learn Statistics, Int. Statistical Review, 63(1), 25-34.
[14] Ramsden, P. (1992). Learning to Teach in Higher Education, Routledge, London.
[15] Race, P. (1999). Practical Pointers to Flexible Learning.
[Online at http://www.lgu.ac.uk/deliberations/flex.learning/race_fr.html]