(INCOMPLETE) Statistics Key Points

1.1.1
Discrete data can only take on certain values.
Continuous data can take on any value, possibly within a limited
range.
1.1.2
For ungrouped data, the mean is
$$\bar{x} = \frac{\Sigma x}{n}$$
For grouped data,
$$\bar{x} = \frac{\Sigma xf}{\Sigma f}$$
1.1.3
For ungrouped coded data, the mean is
$$\bar{x} = \frac{1}{a}\left(\frac{\Sigma(ax - b)}{n} + b\right).$$
For grouped coded data,
$$\bar{x} = \frac{1}{a}\left(\frac{\Sigma(ax - b)f}{\Sigma f} + b\right).$$
These formulae can be summarised by writing
$$\bar{x} = \frac{1}{a} \cdot \left[\text{mean}(ax - b) + b\right]$$
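As a small illustration of decoding a coded mean, the sketch below codes some data as y = ax − b and then recovers the mean of x from the mean of y. The data values, the constants `a` and `b`, and the helper name `decode_mean` are hypothetical and chosen only so that the arithmetic is exact.

```python
from fractions import Fraction
from statistics import mean

def decode_mean(coded_values, a, b):
    """Recover the mean of x from values coded as y = a*x - b."""
    return (mean(coded_values) + b) / a

# Hypothetical raw data and coding constants, chosen only for illustration.
x = [1010, 1020, 1030, 1040, 1050]
a, b = Fraction(1, 10), 100          # coding: y = x/10 - 100

y = [a * xi - b for xi in x]         # coded values: 1, 2, 3, 4, 5
x_bar = decode_mean(y, a, b)         # (mean(y) + b) / a

# The decoded mean should agree with the mean computed directly from x.
print(x_bar, mean(x))
```

Using `Fraction` for `a` keeps the coded values exact, so the comparison with the direct mean is not affected by floating-point rounding.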
1.1.4
Interquartile range = upper quartile – lower quartile
or
$$IQR = Q_3 - Q_1$$
1.1.5
Standard deviation = √Variance
For ungrouped data,
$$\sigma = \sqrt{\frac{\Sigma(x - \bar{x})^2}{n}} = \sqrt{\frac{\Sigma x^2}{n} - \bar{x}^2} \quad \text{where } \bar{x} = \frac{\Sigma x}{n}$$
For grouped data,
$$\sigma = \sqrt{\frac{\Sigma(x - \bar{x})^2 f}{\Sigma f}} = \sqrt{\frac{\Sigma x^2 f}{\Sigma f} - \bar{x}^2} \quad \text{where } \bar{x} = \frac{\Sigma xf}{\Sigma f}$$
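The two forms of each standard deviation formula should always agree. The sketch below checks this on a small made-up data set, once as a raw list and once as an equivalent frequency table. The function names and the data are assumptions for illustration only.

```python
from math import sqrt

def sd_ungrouped(xs):
    """Return (definition form, shortcut form) of the standard deviation."""
    n = len(xs)
    x_bar = sum(xs) / n
    by_definition = sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    by_shortcut = sqrt(sum(x * x for x in xs) / n - x_bar ** 2)
    return by_definition, by_shortcut

def sd_grouped(xs, fs):
    """Same two forms for values xs with frequencies fs."""
    total_f = sum(fs)
    x_bar = sum(x * f for x, f in zip(xs, fs)) / total_f
    by_definition = sqrt(
        sum((x - x_bar) ** 2 * f for x, f in zip(xs, fs)) / total_f
    )
    by_shortcut = sqrt(
        sum(x * x * f for x, f in zip(xs, fs)) / total_f - x_bar ** 2
    )
    return by_definition, by_shortcut

# Hypothetical data: mean 5, variance 4, so standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
values, freqs = [2, 4, 5, 7, 9], [1, 3, 2, 1, 1]   # same data, grouped

print(sd_ungrouped(data))
print(sd_grouped(values, freqs))
```

The shortcut form needs only the running totals Σx and Σx² (or Σxf and Σx²f), which is why it is usually quicker by hand.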
1.2.1
A probability distribution shows all the possible values of a
variable together with their probabilities, and the sum of the
probabilities is
$$\Sigma p = 1$$
1.2.2
The expectation of a discrete random variable (DRV) is
$$E(X) = \Sigma xp = \Sigma[x \cdot P(X = x)]$$
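A minimal sketch of the expectation formula, assuming a hypothetical distribution (the score on a fair six-sided die) stored as a dictionary of value to probability. The function name `expectation` is illustrative; it also checks that the probabilities sum to 1 before using them.

```python
from fractions import Fraction

# Hypothetical distribution: score on a fair six-sided die.
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(dist):
    """E(X) = sum of x * P(X = x), after checking that the p values sum to 1."""
    if sum(dist.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in dist.items())

e_x = expectation(distribution)
print(e_x)
```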
1.3.1
If $X \sim B(n, p)$ then the probability of $r$ successes is
$$p_r = \binom{n}{r} p^r (1 - p)^{n - r}$$
1.3.2
The mean and variance of $X \sim B(n, p)$ are given, respectively, by
$$\mu = np \quad \text{and} \quad \sigma^2 = np(1 - p) = npq$$
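The binomial results can be checked numerically. This sketch, with hypothetical parameters n = 10 and p = 0.3, builds the full distribution from the probability formula and then computes its mean and variance directly, for comparison with np and np(1 − p).

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 10, 0.3   # hypothetical parameters for illustration

probs = [binomial_pmf(r, n, p) for r in range(n + 1)]
mean = sum(r * pr for r, pr in enumerate(probs))
variance = sum(r * r * pr for r, pr in enumerate(probs)) - mean ** 2

print(sum(probs))                   # total probability
print(mean, n * p)                  # direct mean vs np
print(variance, n * p * (1 - p))    # direct variance vs np(1 - p)
```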
1.4.1
A random variable $X$ that has a geometric distribution is denoted
by $X \sim Geo(p)$, and the probability that the first success occurs on
the $r$th trial is
$$p_r = p(1 - p)^{r - 1} \quad \text{for } r = 1, 2, 3, \ldots$$
1.4.2
When $X \sim Geo(p)$ and $q = 1 - p$, then
• $P(X \le r) = 1 - q^r$
• $P(X > r) = q^r$
1.4.3
The mode of all geometric distributions is 1.
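The geometric results above can be confirmed by summing individual probabilities. The sketch below assumes a hypothetical p = 0.2 and r = 5. It compares the summed probabilities with 1 − q^r and looks for the most likely value among the first few trials.

```python
def geometric_pmf(r, p):
    """P(first success occurs on trial r) for X ~ Geo(p)."""
    return p * (1 - p) ** (r - 1)

p = 0.2          # hypothetical probability of success
q = 1 - p
r = 5

# P(X <= r) by direct summation, and its complement P(X > r).
cumulative = sum(geometric_pmf(k, p) for k in range(1, r + 1))
tail = 1 - cumulative

# The probabilities decrease as r increases, so the mode is the first trial.
mode = max(range(1, 50), key=lambda k: geometric_pmf(k, p))

print(cumulative, 1 - q ** r)
print(tail, q ** r)
print(mode)
```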
1.5.1
$X \sim N(\mu, \sigma^2)$ describes a normally distributed random variable.
We read this as “$X$ has a normal distribution with mean $\mu$ and
variance $\sigma^2$”.
1.5.2
The standard normal variable is $Z \sim N(0, 1)$
1.5.3
When $X \sim N(\mu, \sigma^2)$ then
$$Z = \frac{X - \mu}{\sigma}$$
has a normal distribution. [Refer to 2.2.3]
A standardised value
$$z = \frac{x - \mu}{\sigma}$$
tells us how many standard deviations $x$ is from the mean.
1.5.4
$X \sim B(n, p)$ can be approximated by $N(\mu, \sigma^2)$, where $\mu = np$ and
$\sigma^2 = npq$, provided that $n$ is large enough to ensure that $np > 5$
and $nq > 5$.
1.5.5
Continuity corrections must be made when a discrete distribution is
approximated by a continuous distribution.
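To see what the continuity correction does in practice, the sketch below approximates P(X ≤ 12) for a hypothetical X ~ B(40, 0.4), where np = 16 and nq = 24 are both greater than 5. It compares the exact binomial probability with the normal approximation evaluated at 12.5 (corrected) and at 12 (uncorrected), and shows the standardised value used for the corrected boundary. `NormalDist` is from Python's standard `statistics` module.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 40, 0.4           # hypothetical parameters; np = 16, nq = 24
q = 1 - p
mu, sigma = n * p, sqrt(n * p * q)

# Exact P(X <= 12) from the binomial probabilities.
exact = sum(comb(n, r) * p ** r * q ** (n - r) for r in range(13))

# Normal approximation: X <= 12 (discrete) becomes X < 12.5 (continuous).
model = NormalDist(mu=mu, sigma=sigma)
corrected = model.cdf(12.5)
uncorrected = model.cdf(12)

# The same corrected probability via the standardised value z.
z = (12.5 - mu) / sigma
via_z = NormalDist().cdf(z)

print(exact, corrected, uncorrected)
print(z, via_z)
```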
2.1.1
A Poisson distribution can be used to model a discrete probability
distribution in which the events occur singly, at random and
independently, in a given interval of space or time. The mean and
variance of a Poisson distribution are equal; hence, a Poisson
distribution has only one parameter.
2.1.2
When modelling data using a Poisson distribution:
• Work out the mean and variance and check if they are
approximately equal.
• If mean and variance are not approximately equal, the
Poisson distribution is not a suitable model to use with the
data.
• Use the mean to calculate probabilities and expected
frequencies.
• Compare expected frequencies with observed frequencies.
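The modelling steps above can be sketched as follows. The observed frequency table is invented for illustration, and the probability function uses the standard Poisson formula with the sample mean as the parameter.

```python
from math import exp, factorial

# Hypothetical observed data: events per interval -> number of intervals.
observed = {0: 22, 1: 33, 2: 25, 3: 13, 4: 5, 5: 2}

# Step 1: work out the mean and variance of the data.
total = sum(observed.values())
mean = sum(r * f for r, f in observed.items()) / total
variance = sum(r * r * f for r, f in observed.items()) / total - mean ** 2

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam ** r / factorial(r)

# Step 2: use the mean as the parameter to get expected frequencies.
expected = {r: total * poisson_pmf(r, mean) for r in observed}

# Step 3: compare expected with observed frequencies.
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
for r in observed:
    print(r, observed[r], round(expected[r], 1))
```

How close the mean and variance must be to count as "approximately equal" is a judgement call that this sketch does not settle.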
2.1.3
If the random variable $X$ has a Poisson distribution with parameter
$\lambda$, where $\lambda > 0$, we write $X \sim Po(\lambda)$ and:
• $P(X = r) = e^{-\lambda} \cdot \frac{\lambda^r}{r!}$ where $r = 0, 1, 2, \ldots$
• $E(X) = \lambda$
• $Var(X) = \lambda$
2.1.4
In a Poisson distribution, events occur at a constant rate; the mean
number of events in a given interval is proportional to the size of
that interval.
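Because the mean is proportional to the interval, the parameter can be rescaled before calculating probabilities. As a sketch, assume a hypothetical rate of 3 events per hour and ask for the probability of no events in 20 minutes.

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam ** r / factorial(r)

rate_per_hour = 3.0                        # hypothetical mean rate
lam_20_min = rate_per_hour * (20 / 60)     # mean for a 20-minute interval

p_none = poisson_pmf(0, lam_20_min)        # P(no events in 20 minutes)
p_at_most_two = sum(poisson_pmf(r, lam_20_min) for r in range(3))

print(lam_20_min, p_none, p_at_most_two)
```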
2.2.1
For a random variable $X$ and constants $a$ and $b$:
$$E(aX \pm b) = aE(X) \pm b \quad \text{and} \quad \text{Var}(aX + b) = a^2\,\text{Var}(X)$$
2.2.2
For two independent random variables $X$ and $Y$ and constants $a$
and $b$:
$$E(aX \pm bY) = aE(X) \pm bE(Y)$$
and
$$\text{Var}(aX \pm bY) = a^2\,\text{Var}(X) + b^2\,\text{Var}(Y)$$
These results can be extended to any number of independent
random variables.
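One way to convince yourself that the variances add even when the variables are subtracted is to build the distribution of aX − bY by enumeration. The two small distributions and the constants below are hypothetical. Independence is used when the joint probability is taken as the product of the individual probabilities.

```python
from fractions import Fraction
from itertools import product

def mean_var(dist):
    """Return (E, Var) for a dict mapping value -> probability."""
    m = sum(x * p for x, p in dist.items())
    v = sum(x * x * p for x, p in dist.items()) - m * m
    return m, v

# Hypothetical independent discrete random variables.
X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
Y = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
a, b = 3, 2

# Distribution of aX - bY; independence gives P(x and y) = P(x) * P(y).
combo = {}
for (x, px), (y, py) in product(X.items(), Y.items()):
    value = a * x - b * y
    combo[value] = combo.get(value, 0) + px * py

mx, vx = mean_var(X)
my, vy = mean_var(Y)
m, v = mean_var(combo)

print(m, a * mx - b * my)            # E(aX - bY) vs aE(X) - bE(Y)
print(v, a * a * vx + b * b * vy)    # Var(aX - bY) vs a^2 Var(X) + b^2 Var(Y)
```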
2.2.3
If a continuous random variable $X$ has a normal distribution, then
$aX + b$, where $a$ and $b$ are constants, also has a normal
distribution.
If continuous random variables $X$ and $Y$ have independent normal
distributions, then $aX + bY$, where $a$ and $b$ are constants, has a
normal distribution.
2.3.1
The function $f(x)$ whose graph represents the distribution of a
continuous random variable is the probability density function (PDF).
The PDF has the following properties:
• It cannot be negative since you cannot have a negative
probability; $f(x) \ge 0$.
• Total probability of all outcomes = 1; hence,
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
In many situations, the data are defined across a specified interval
or across specified intervals, outside of which $f(x) = 0$.
2.3.2
• With continuous random variables, each individual value has
zero probability of occurring.
For a continuous random variable with PDF $f(x)$,
$$P(X = a) = 0.$$
• Because the probability of any exact value is zero,
when finding the probability in a given interval it does not
matter whether you use < or ≤.
$$P(a < X < b) = P(a \le X < b) = P(a < X \le b) = P(a \le X \le b)$$
Note that this does not imply that $X$ cannot take the value $a$; it just
means the probability of the exact value $a$ is zero.
2.3.3
The probability of $X$ lying in the interval $(a, b)$ is given by the area
under the graph between $a$ and $b$.
That is:
$$P(a < X < b) = \int_a^b f(x)\,dx$$
2.3.4
The median, $m$, of a continuous random variable is that value for
which
$$P(X < m) = \int_{-\infty}^{m} f(x)\,dx = \frac{1}{2}$$
2.3.5
The $r$th percentile, $k$, of a continuous random variable is that value
for which
$$P(X < k) = \int_{-\infty}^{k} f(x)\,dx = \frac{r}{100} \quad \text{where } 0 < r < 100$$
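The interval probability, median and percentile definitions can be tried out numerically. The sketch below assumes a hypothetical PDF, f(x) = 2x on [0, 1] and zero elsewhere. It uses a simple midpoint rule for the integrals and bisection to find the value below which a given percentage of the probability lies. The helper names are illustrative.

```python
def f(x):
    """Hypothetical PDF: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(func, lo, hi, steps=1000):
    """Midpoint-rule estimate of the integral of func from lo to hi."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

def percentile(r, lo=0.0, hi=1.0, tol=1e-9):
    """Find k with P(X < k) = r/100 by bisection on [lo, hi]."""
    target = r / 100
    left, right = lo, hi
    while right - left > tol:
        mid = (left + right) / 2
        if integrate(f, lo, mid) < target:
            left = mid
        else:
            right = mid
    return (left + right) / 2

total = integrate(f, 0, 1)        # total probability, should be 1
prob = integrate(f, 0.2, 0.5)     # P(0.2 < X < 0.5)
median = percentile(50)           # value m with P(X < m) = 1/2
p90 = percentile(90)              # 90th percentile

print(total, prob, median, p90)
```

For this particular PDF the integrals can also be done by hand, since P(X < k) = k² on [0, 1], which gives a direct check on the numerical answers.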
2.3.6
For continuous random variables with PDF $f(x)$:
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
and
$$\text{Var}(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \left(\int_{-\infty}^{\infty} x f(x)\,dx\right)^2$$
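As a rough numerical check of the expectation and variance formulas, the sketch below again assumes the hypothetical PDF f(x) = 2x on [0, 1], for which integrating by hand gives E(X) = 2/3 and Var(X) = 1/18. The midpoint rule is only an approximation, so the printed values should be close to these rather than exact.

```python
def f(x):
    """Hypothetical PDF: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(func, lo, hi, steps=10_000):
    """Midpoint-rule estimate of the integral of func from lo to hi."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

# f is zero outside [0, 1], so the infinite limits reduce to 0 and 1.
e_x = integrate(lambda x: x * f(x), 0, 1)          # E(X)
e_x2 = integrate(lambda x: x * x * f(x), 0, 1)     # E(X^2)
var_x = e_x2 - e_x ** 2                            # Var(X)

print(e_x, var_x)
```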