Further Probability & Statistics Student's Book

Cambridge International AS & A Level Further Mathematics
Further Probability & Statistics
STUDENT’S BOOK: Worked solutions
Yimeng Gu, Dr Patrick Wallace
Series Editor: Dr Adam Boddison

Worked solutions

1 Continuous random variables

Please note: Full worked solutions are provided as an aid to learning, and represent one approach to answering the question. In some cases, alternative methods are shown for contrast.

All sample answers have been written by the authors. Cambridge Assessment International Education bears no responsibility for the example answers to questions taken from its past question papers, which are contained in this publication.

Non-exact numerical answers should be given correct to 3 significant figures, or 1 decimal place for angles in degrees, unless a different level of accuracy is specified in the question.
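As a small illustration of the 3 significant figure convention stated in the note, the sketch below rounds a value in Python. The helper name `round_sig` is hypothetical and is not part of the book; it assumes ordinary floating-point inputs.

```python
from math import floor, log10


def round_sig(x: float, sig: int = 3) -> float:
    """Round x to `sig` significant figures (hypothetical helper, not from the book)."""
    if x == 0:
        return 0.0
    # The position of the leading digit decides how many decimal places to keep.
    exponent = floor(log10(abs(x)))
    return round(x, sig - 1 - exponent)


print(round_sig(1153 / 3456))  # 0.334
print(round_sig(45.0295))      # 45.0
```

A fraction such as 1153/3456 would therefore be quoted as 0.334 (3 s.f.).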
Prerequisite knowledge 1 1 y − 5 2 x=  2  6 ∫1 k(x − 2)dx = 1 a  2  k   x − 2x   = 1 2  1  (( ) ( )) Exercise 1.1A 1 b P (X = 2.5) = 0 a P(X < 2) = 0 1 b P(−0.5 < X < 3) = x  41 ∫1 7 x dx =  14 1 = 3 u = x2 − 4 (X 0.8) = 1 − ∫ ∫2 x3 x2 − 4 dx = 2 −4=0 1 5 − 12 u (u + 4) du 2 ∫0 32 a 1 1 3 3 −4=5 1 ( 41 dx = 1 − 1 = 14 15 15 1 ∫1 6 dx = 2 ) 2 ⌠  5 m 2 + a dm + ⌡−1 2 ∫1 a dm = 1  2 3 1 2  15 m + am  + [am ]1 = 1 −1 a= 11 45 f(m) b 5 1   2 u 23 + 8u 21   =  2 3  −0.8 1 −1 2 22 31 P(X > −0.8 ) = 1 − P(X  −0.8) (1614 ) − (141 ) = 1141 du = 2x, x 2 = u − 4 dx 3 c d P(X > 1) = x u 1 0 ∫−0.5 3 dx + ∫0 6 dx = 6 + 2 =2 3 6 2  x3 44 2 2 ∫1 x 15 (x − 2)dx = 15   3 − x 1  = 9 6 2 2 = 1 x  + 1 x  = 1 + 1 = 2  3  −1  6 0 3 3 3 = 2 (4) = 8 15 15 2 4 21 ∫−1 3 dx + ∫0 6 dx 0 5 2 (x − 2) dx = 2   x 2 − 2x   ∫3 15   15   2 3  5 d 1 2 Range: f –1(x)  0 k 36 − 2(6) − 1 − 2 = 1 2 2 2 k= 15 c ( ) Therefore f −1( x ) = x − 5 2 Domain: x  5 6 29 45   0  3 1  = 1  2 × 52 + 8 × 52  23  = 5 5+4 5 3 17 = 5 3 4 y = 2x2 + 5 x2 = y−5 2 11 45 –1 0 1 2 m 1 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 CONTINUOUS RANDOM VARIABLES 1 c 3 2 1 a ) c 0 k cos x −π 3 ∫ P(−0.2 y 1.2) =∫ 0 =∫ 0 1 × 8 (2 − y ) dy + 1 1 × 8 dy ∫0 2 15 15 1 .2 1 × 8 y dy +∫ 15 1 2 −0.2 4 1 dx + ∫ (1 − x ) dx = 1 0 1 [k sin x ]0− π3 +  x − 12 x 2 0 = 1 = 7 + 4 + 22 = 143 125 15 375 375 ( )       0 −  − 3 k   +  1 − 1 − 0 = 1 2    2    5 a 3 1 k + =1 2 2 25 ∫0 1 a dt + ∞ at − 23dt = 1 ∫25 50 ∞ 25 1  1 at  +  −2at − 2  = 1    50 0   25 10 a= 9 3 1 k= 2 2 3 3 k= 1 m(2 − y ) dy + 1 1 m dy + 1.2 1 my dy ∫0 2 ∫1 2 −0.2 4 11 dm 1 = 45 2 +∫ 3 ( ⌠ 2 2 11 P(0 < M  3 ) =  5 m + 45 dm + ⌡0 2 b b f(t) 1 45 f(x) 1 √3 3 0 –π 3 c . 
0.008 ( ) ⌠ P (x < π ) =  3 ⌡ 1 x π P x= 4 =0 i ii 0 −π 3 ( 3 cosx dx + 1(1 − x ) dx = 1 ∫0 3 ) π 0 iii P − π x π = ∫ π 3 cosx dx + ∫ 6 (1 − x )dx 6 6 3 0 − 6 = 0.675 (3 s.f.) 4 a f(y) ) a ∫0 3 t dt + ∫1 k −1 1 11 3 2 2 2 2 2 2 1 1 2 1 1 1 1 1 1 + k − + k2 − k2 + k + k2 − k + = 1 6 3 3 6 3 3 6 3 6 0 1 2 y − 1 1 1 + k + =1 2 3 6 1 1 1 k =1− + 3 6 2 1 0  1  2 my − 1 my 2  +  1 my  +  1 my 2  = 1 8  2  −1  2 0  4 1 8 m = 15 k=4 1 1 (k − t ) dt = 1 k −1 3  1 2  1  1  k −1  1 k t + t +  kt − 1 t 2  =1 6  k −1  6  0  3  1 3 ∫−1 4 m(2 − y )dy + ∫0 2 m dy + ∫1 2 my dy = 1 0 k dt + ∫ ( 16 − 0) + ( 13 (k − 1) − 13 ) +  ( 13 k − 16 k ) − ( 13 k (k − 1) − 16 (k − 1) ( 16 − 0) + ( 13 (k − 1) − 13 ) +  ( 13 k − 16 k ) − ( 13 k (k − 1) − 16 (k − 1) ) = 1 1 1 2 1 1 1 1 1 1 + k − ) +  k − ( k − k − k + k − ) = 1 6 (3 3 6 3 3 6 3 6  1m 2 b ( 3 ⌠ 1 × 10 dt + 30 10 t − 2dt = 0.150 (3 s.f.)  ∫ 25 9 ⌡20 50 9 2 3m 4 –1 6 25 c 2 m t 25 11 21 2 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Worked solutions b P(T > 2) = 1 − P (T  2) =1− 3 2 t dt + ∫ 1 dt 1 3 ( 1 11 0 =1− 1 + 1 6 3 1 = 2 c (∫ 1 2. 
$\int_{0.5}^{1} \frac{1}{3}t\,dt + \int_{1}^{2.5} \frac{1}{3}\,dt = \frac{5}{8}$

7 a [Graph of f(t) against t]

b i $P(2 \le T < 7) = \int_{2}^{4} \frac{1}{128}t^{2}\,dt + (7 - 4) \times \frac{1}{6} = \frac{7}{48} + \frac{1}{2} = \frac{31}{48}$

ii $P(T > 7) = \frac{1}{2} \times \frac{1}{6} \times (11 - 7) = \frac{1}{3}$

c $P(T < 2) = \int_{0}^{2} \frac{1}{128}t^{2}\,dt = \frac{1}{48}$

Alternative method, using the previous answers:
$P(T < 2) = 1 - P(2 \le T < 7) - P(T > 7) = 1 - \frac{31}{48} - \frac{1}{3} = \frac{1}{48}$

Therefore, $\frac{1}{48} \times 100\% = 2.08\%$ of appointments were delayed by less than 2 minutes.

8 a [Graph of f(t) against t, with 7k and 16k marked on the vertical axis and 2 and 5 on the horizontal axis]

b Using the graph,
$2 \times 7k + \frac{1}{2}(5 - 2)(7k + 16k) = 1$
$14k + \frac{69}{2}k = 1$
$k = \frac{2}{97}$

c i $P(T \le 3) = \frac{14}{97} \times 2 + \int_{2}^{3} \frac{2}{97}(3t + 1)\,dt = \frac{28}{97} + \frac{17}{97} = \frac{45}{97}$

Or
$P(T \le 3) = 1 - P(T > 3) = 1 - \int_{3}^{5} \frac{2}{97}(3t + 1)\,dt = 1 - \left[\frac{3}{97}t^{2} + \frac{2}{97}t\right]_{3}^{5}$
$= 1 - \left(\left(\frac{75}{97} + \frac{10}{97}\right) - \left(\frac{27}{97} + \frac{6}{97}\right)\right) = 1 - \frac{52}{97} = \frac{45}{97}$

Alternative method, using the graph to work out the area:
$P(T \le 3) = \frac{14}{97} \times 2 + \frac{1}{2} \times 1 \times \left(\frac{14}{97} + \frac{20}{97}\right) = \frac{28}{97} + \frac{17}{97} = \frac{45}{97}$

ii $P(2 \le T \le 5) = \int_{2}^{5} \frac{2}{97}(3t + 1)\,dt = \frac{69}{97}$

Alternative method, using the graph:
$P(2 \le T \le 5) = \frac{1}{2} \times 3 \times \left(\frac{14}{97} + \frac{32}{97}\right) = \frac{3}{2} \times \frac{46}{97} = \frac{69}{97}$

9 a [Graph of f(x) against x, with p − 2q, p − 4q and q marked on the vertical axis and 2, 4 and 10 on the horizontal axis]

b $P(X > 5) = 0.5$
From the graph, $(10 - 5) \times q = 0.5$, so $q = 0.1$,
or q can be found using integration:
$\int_{5}^{10} q\,dx = 0.5$
$[qx]_{5}^{10} = 0.5$
$10q - 5q = 0.5$
$q = 0.1$

Total area under PDF = 1
$\int_{2}^{4} (p - 0.1x)\,dx + 6 \times 0.1 = 1$
$\int_{2}^{4} (p - 0.1x)\,dx = 0.4$
$\left[px - 0.05x^{2}\right]_{2}^{4} = 0.4$
$(4p - 0.8) - (2p - 0.2) = 0.4$
$2p = 0.4 + 0.8 - 0.2 = 1$
$p = 0.5$

10 a $\int_{0}^{0.5} kx\,dx + \int_{0.5}^{1} 1\,dx + \int_{1}^{1.5} (3 - kx)\,dx = 1$
$\left[\frac{kx^{2}}{2}\right]_{0}^{0.5} + \left[x\right]_{0.5}^{1} + \left[3x - \frac{kx^{2}}{2}\right]_{1}^{1.5} = 1$
$\left(\frac{k}{8} - 0\right) + (1 - 0.5) + \left(\left(\frac{9}{2} - \frac{9k}{8}\right) - \left(3 - \frac{k}{2}\right)\right) = 1$
$\frac{4k}{8} = 1$
$k = 2$

b $P(X < 1.2) = \int_{0}^{0.5} 2x\,dx + \int_{0.5}^{1} 1\,dx + \int_{1}^{1.2} (3 - 2x)\,dx$
$= \frac{1}{4} + 0.5 + \left((3.6 - 1.44) - (3 - 1)\right) = 0.91$

Alternatively:
$P(X < 1.2) = 1 - P(X > 1.2) = 1 - \int_{1.2}^{1.5} (3 - 2x)\,dx = 1 - \frac{9}{100} = 0.91$

Exercise 1.2A

1 a For $x \in (1, 6)$, $F(x) = \int_{1}^{x} \frac{1}{10}\,dx = \frac{x}{10} - \frac{1}{10}$

For $x \in (6, 16)$, $F(x) = F(6) + \int_{6}^{x} \left(\frac{4}{25} - \frac{1}{100}x\right) dx = \frac{1}{2} + \left[\frac{4x}{25} - \frac{x^{2}}{200}\right]_{6}^{x} = \frac{4x}{25} - \frac{x^{2}}{200} - \frac{7}{25}$

Therefore
$F(x) = \begin{cases} 0 & x < 1, \\ \frac{x}{10} - \frac{1}{10} & 1 \le x < 6, \\ \frac{4x}{25} - \frac{x^{2}}{200} - \frac{7}{25} & 6 \le x \le 16, \\ 1 & x > 16. \end{cases}$

b For $x \in (0, 2)$, $M(x) = \int_{0}^{x} \frac{1}{8}x\,dx = \frac{x^{2}}{16}$

For $x \in (2, 8)$, $M(x) = M(2) + \int_{2}^{x} \frac{1}{24}(8 - x)\,dx = \frac{1}{4} + \left[\frac{1}{3}x - \frac{x^{2}}{48}\right]_{2}^{x} = -\frac{x^{2}}{48} + \frac{1}{3}x - \frac{1}{3}$

Therefore
$M(x) = \begin{cases} 0 & x < 0 \\ \frac{x^{2}}{16} & 0 \le x < 2 \\ -\frac{x^{2}}{48} + \frac{1}{3}x - \frac{1}{3} & 2 \le x \le 8 \\ 1 & x > 8. \end{cases}$

2 a For $x \in (0, 1)$, $F(x) = \int_{0}^{x} x^{3}\,dx = \frac{x^{4}}{4}$

For $x \in (1, 1.5)$, $F(x) = F(1) + \int_{1}^{x} 1\,dx = \frac{1}{4} + [x]_{1}^{x} = x - \frac{3}{4}$

For $x \in (1.5, 2.5)$, $F(x) = F(1.5) + \int_{1.5}^{x} -(x - 2.5)^{3}\,dx = \frac{3}{4} + \left[-\frac{(x - 2.5)^{4}}{4}\right]_{1.5}^{x} = 1 - \frac{(x - 2.5)^{4}}{4}$

Therefore
$F(x) = \begin{cases} 0 & x < 0 \\ \frac{x^{4}}{4} & 0 \le x < 1 \\ x - \frac{3}{4} & 1 \le x < 1.5 \\ 1 - \frac{(x - 2.5)^{4}}{4} & 1.5 \le x < 2.5 \\ 1 & x \ge 2.5. \end{cases}$

b $P(X > 1.2) = 1 - P(X \le 1.2) = 1 - F(1.2) = 1 - \left(1.2 - \frac{3}{4}\right) = 0.55$

c $P(0.5 < X < 2) = F(2) - F(0.5) = 0.969$

3 a $\int_{-1}^{k} \frac{1}{5}(x + 1)\,dx = 1$
$\left[\frac{1}{10}(x + 1)^{2}\right]_{-1}^{k} = 1$
$\frac{1}{10}(k + 1)^{2} - 0 = 1$
$k = \sqrt{10} - 1$, reject $k = -\sqrt{10} - 1$

b $F(x) = \begin{cases} 0 & x < -1 \\ \frac{(x + 1)^{2}}{10} & -1 \le x \le \sqrt{10} - 1 \\ 1 & x > \sqrt{10} - 1. \end{cases}$

4 a $\int_{0}^{2} a(3 - z)^{2}\,dz + \int_{2}^{6} \frac{1}{80}(3z + 2)\,dz = 1$
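$\left[9az - 3az^{2} + \frac{a}{3}z^{3}\right]_{0}^{2} + \frac{1}{80}\left[\frac{3}{2}z^{2} + 2z\right]_{2}^{6} = 1$
$\left(18a - 12a + \frac{8}{3}a\right) + \frac{7}{10} = 1$
$\frac{26}{3}a = \frac{3}{10}$
$a = \frac{9}{260}$

b For $0 \le z < 2$,
$\int_{-\infty}^{z} a(3 - z)^{2}\,dz = F(0) + \int_{0}^{z} a(3 - z)^{2}\,dz = 0 + \left[-\frac{1}{3}a(3 - z)^{3}\right]_{0}^{z} = \frac{81}{260} - \frac{3}{260}(3 - z)^{3}$

For $2 \le z < 6$,
$\int_{-\infty}^{z} \frac{1}{80}(3z + 2)\,dz = F(2) + \int_{2}^{z} \frac{1}{80}(3z + 2)\,dz = \frac{3}{10} + \left[\frac{1}{480}(3z + 2)^{2}\right]_{2}^{z} = \frac{1}{6} + \frac{(3z + 2)^{2}}{480}$

Therefore,
$F(z) = \begin{cases} 0 & z < 0 \\ \frac{81}{260} - \frac{3}{260}(3 - z)^{3} & 0 \le z < 2 \\ \frac{1}{6} + \frac{(3z + 2)^{2}}{480} & 2 \le z < 6 \\ 1 & z \ge 6. \end{cases}$

c $P(1 < Z \le 3) = F(3) - F(1) = \frac{83}{416}$

5 a $P(X > 8) = 1 - P(X \le 8) = 1 - \frac{1}{50}(8 - 5)^{2} = \frac{41}{50}$

b $f(x) = \begin{cases} \frac{1}{25}(x - 5) & 5 \le x < 10 \\ \frac{1}{4} & 10 \le x \le 12 \\ 0 & \text{otherwise.} \end{cases}$

Therefore the graph of f(x) is:
[Graph of f(x) against x, with 5, 10 and 12 marked on the horizontal axis]

6 a $F(a) = 1$, therefore $\frac{1}{3}a - \frac{1}{3} = 1$
$a = 4$

b $P(X > 1.5) = 1 - F(1.5) = 1 - \frac{1}{12}(1.5)^{2} = \frac{13}{16}$

c $F(2) = \frac{1}{3}$, therefore $y > 2$
$F(y) = \frac{2}{3}$
$\frac{1}{3}y - \frac{1}{3} = \frac{2}{3}$
$y = 3$

7 a $f(t) = \begin{cases} \cos t & 0 \le t \le \frac{\pi}{2}, \\ 0 & \text{otherwise.} \end{cases}$

b $P\left(\frac{\pi}{6} < t < \frac{\pi}{4}\right) = F\left(\frac{\pi}{4}\right) - F\left(\frac{\pi}{6}\right) = \frac{\sqrt{2}}{2} - \frac{1}{2} = \frac{\sqrt{2} - 1}{2} = 0.207$ (3 s.f.)

8 a $\int_{0}^{2} \frac{1}{4}s\,ds + \int_{2}^{8} a(s - 8)^{2}\,ds = 1$
$\left[\frac{1}{8}s^{2}\right]_{0}^{2} + \left[\frac{1}{3}a(s - 8)^{3}\right]_{2}^{8} = 1$
$\frac{1}{2} + 72a = 1$
$72a = \frac{1}{2}$
$a = \frac{1}{144}$

b For $0 \le s < 2$, $F(0) + \int_{0}^{s} \frac{1}{4}s\,ds = 0 + \left[\frac{1}{8}s^{2}\right]_{0}^{s} = \frac{s^{2}}{8}$

For $2 \le s < 8$, $F(2) + \int_{2}^{s} \frac{1}{144}(s - 8)^{2}\,ds = \frac{1}{2} + \left[\frac{1}{432}(s - 8)^{3}\right]_{2}^{s} = \frac{(s - 8)^{3}}{432} + 1$

Therefore,
$F(s) = \begin{cases} 0 & s < 0 \\ \frac{s^{2}}{8} & 0 \le s < 2 \\ \frac{(s - 8)^{3}}{432} + 1 & 2 \le s < 8 \\ 1 & s \ge 8. \end{cases}$

c $P(1.5 < s \le 2.5) = F(2.5) - F(1.5) = \frac{1153}{3456}$ or 0.334 (3 s.f.)

9 a $\int_{0}^{\ln 2} \frac{1}{3}e^{-3t}\,dt + \int_{\ln 2}^{k} \frac{1}{24}\,dt = 1$
$\frac{7}{72} + \left[\frac{1}{24}t\right]_{\ln 2}^{k} = 1$
$\frac{k}{24} - \frac{\ln 2}{24} = \frac{65}{72}$
$k = \frac{65}{3} + \ln 2$

b For $0 \le t < \ln 2$, $F(0) + \int_{0}^{t} \frac{1}{3}e^{-3t}\,dt = 0 + \left[-\frac{1}{9}e^{-3t}\right]_{0}^{t} = \frac{1}{9}\left(1 - e^{-3t}\right)$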
7 1  + t =1 72  24  ln 2 = 1 (1 − e −3t ) 9 For ln 2 t < ln 2 + 65 , 3 t F(ln 2) + ∫ t 1 dt = 7 +  t  = 7 − 3 ln 2 + t 72  24  ln 2 72 24 ln 2 24 8  1 s 2  +  1 a(s − 8)3  = 1  8 0  3 2 5 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Continuous Random Variables Therefore,  0 1 −3t 9 (1 − e ) F(t) =  7 − 3 ln 2 + t 24  72 1  ln 2 1 −3t e dt 0. 5 3 ∫ c 2 t <0 ln 2 t < ln 2 + 65 3 65 t ln 2 + . 3 + ∫ 1 dt ln 2 24 X 0 10 F(1) − F (0.5) or = 0.0237 (3 s.f.) 10 ( b ( ) ) M 0 5 For m ∈( 0, 5 ) , FM (m ) = P ( ) ( ( 31 ) ) ) ( 3 Y =X3  0   F(x) =  1 x − 3 4 2 1 33  1 3 60 × P (t > 3 ) = 60 ×⌠  − 5 t − 3 3 dt = 60 × 0.03 = 2 members  1 ⌡3 3 13 Change limits: 1 3 ×⌠  − 5 t − 3 3 dt = 60 × 0.03 = 2 members ⌡3 X ( ( ) x ∫0 0.08x dx = 0.04x 2 X 0 5 0x 5 x > 5. 4  For y ∈( 0,25 ) , FY ( y ) = P X 2 y = FX   2  21   21  2 X y = FX  y  = 0.04 ×  y  = 0.04 y      y <0  0 F ( y ) =  0.04y 0 y 25  y > 25 . 1 ) 1  y2 ) 6  Hence, f ( y ) =  0.04  0 0 y 25 otherwise. x > 10. Y = X3 0 0 6 216 10 1000  2 1 − f(y) = 12 y 3 0  Y 0 25 ( 6 x 10 1 1 1 3 3 F(y) = P(X y) = P(X y 3 ) = 4 y 3 − 2 Therefore, x <0 0  Therefore F ( x ) =  0.04x 2 1  Limits of Y : x <6 ) Exercise 1.3A 1 X ) 3 2 3 ⌠ 3 − 3 t − 3 1 dt =  − 3 t − 3 1  = 5 (or 0.208 to 3. s.f.) =  3 3  1 24 ⌡2 1 5  10 2 2 2 c 60 members, therefore ( ( 12 x m) = F (2m) = 0.2(2m) − 0.01(2m) 1 2 m ∈ ( 0, 5 ) , FM (m ) = P x m = FX ( 2m ) = 0.2 ( 2m ) − 0.01( 2m ) = 0.4m − 0.04m 2 31 2 7, ⌠ 3 3 1 8 = − t − 3 dt = ∫  15  3 15 ⌡2 5 m <0 0 7 8 F (m ) = 0.4m − 0.04m 2 0 m 5 + = 1 , therefore a = 2 15 15  1 m > 5.  1 31 2 33 3  3 1 3 1 1  5 P t 2 = ⌠ − t −3 dt =  − t −3 = (or 0.208 to 3. s.f.) 0m5 2  3 10 3  1 24 Hence, f (m ) = 0.4 − 0.08m ⌡2 1 5  2 2 0 otherwise. 
2  21 2 t dt 1 5 1 ( a Substitute a = 2, 2  x <0  0 Therefore F ( x ) =  0.2x − 0.01x 2 0 x 10  x > 10. 1 Limits of M: 0 t < ln 2 1 x ∫0 0.02 (10 − x)dx = 0.2x − 0.01x   = 0.04 ×  a 1 2  y2 10 ∫0 216 y 1000 otherwise. kx dx = 1 10  = 0.04 y 1 kx 2  = 1  2 0 50k = 1 k= b 1 50  1 0 x 10 f ( x ) =  50 x 0 otherwise.  x 1 1 2 ∫0 50 x dx = 100 x ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Worked solutions  0 Therefore F ( x ) =  1 2 x 100 1 Change limits: ) , FY ( y ) = P ( ) Y 0 10 0 2 1 1 = F ( 5y ) = 5y = y ( 15 x y ) 100 ( ) 4 2 X  0 F ( y ) =  1 2 y 4 1 a ) T ( r ∈ (1, 3 ) , FR (r ) = P ) 1 1 1 3 = F (3r ) = + + r 3r = ( 13 t r ) 10 10 ( ) 10 10 T ( 15 x y ) = F (5y ) = 1001 (5y ) = 14 y 2 X 2 2 y <0 0y 2 0 y 2 otherwise. 9 ∫1 0.4 − kt dt + ∫3 k dt = 1 3 0.4t − 1 kt 2  + [ kt ]9 = 1 3 2  1  0  F (r ) =  6 r − 9 r 2 − 7 5 20 20 1 3 10 + 10 r  1 r<1 3 1 r 1 3 1r 3 r > 3.  6 − 9 r 5 10 Hence, f (r ) =  3 10 0  y > 2. 3 R 1 1 1 For r ∈ (1, 3 ) , FR (r ) = P 3 t r = FT (3r ) = 10 + 10 ( 3r ) =  1  y Hence, f ( y ) =  2  0 5 ( ( 13 ,1), F (r ) = P ( 13 t r ) = F (3r ) = 25 (3r ) − 201 (3 1 1 2 1 2 7 6 9 7 r0∈ x,1 ,10 FR (r ) = P t r = FT ( 3r ) = ( 3r ) − ( 3r ) − = r − r2 − 3 3 5 20 20 5 20 20 x > 10. X For y ∈ ( 0, 2 ) , FY ( y ) = P For r ∈ x <0 6 c P ( R < 1.5 ) = F (1.5 ) = a 1 2 a = 1 so a2 = 16 16 1 r 1 3 1r 3 otherwise. 1 3 11 + 1.5 = 10 10 ( ) 20 Therefore, a = 4, a = –4 (reject as a > 0) 0.8 – 4k + 6k = 1 Change limits: k = 0.1 t 2 t X 0 4 ⌠   b For t ∈ (1, 3 ) ,  0.4 − 0.1t dt = 0.4t − 0.1t  = 2 t − 1 t 2 − 7 2  5 20 20  ⌡1 1 t t ⌠  0.1t 2  2 1 2 7  0.4 − 0.1t dt = 0.4t − 2  = 5 t − 20 t − 20  1 ⌡1 ( 1t 3 3 t 9 b t > 9.. 7 R 1 1 3 3 9 1 3 y <0 0 y 16 y > 16. 
 1 Hence, f ( y ) =  16 0  t <1 T )  0 F ( y ) =  1 y 16 1 Therefore Change limits: 2  1 1  1 1 FY ( y ) = P x 2 y = FX  y 2  =  y 2  = y 16   16   t For t ∈ (3, 9), F ( 3 ) + ∫ 0.1t dt = 2 + [0.1t ]t3 = 1 + 1 t 10 10 5 3 t t 1 2 1 + t F ( 3 ) + ∫ 0.1t dt = + [0.1t ]3 = 5 10 10 3  0 2 1 2 7 F (t ) =  5 t − 20 t − 20 1 + 1t 10 10 1  Y 0 16 a 0 y 16 otherwise. 1 15 P (Y > 1) = 1 − F (1) = 1 − = 16 16 x ∫−∞ 2 dx = F(−1) +  2 x −1 = 2 x + 2 x 1  0 F(x ) =  1 x+1 2 2 1 1 1 1 x < −1 −1 x 1 x > 1. 7 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Continuous Random Variables Therefore Change limits: Y = eX X −1 e−1 1 e 9 F(Y ) = P(Y y ) = P(e X y ) = P(x ln y ) = =    F( y ) =    a = −2 y <e 0 1 1 ln y + 2 2 1 1  Hence f(y) =  2 y 0  −1  0 F ( x ) =  1 3 1 x + 2 16 1 Change limits: e −1 y e y > e. e −1 y e otherwise. b P(Y  k) = 0.25 is the same as P(Y  k) = 0.75 1 ln k + 1 = 0.75 2 2 k = e0.5 1 ∫02 c x 3 x < −2 −2 x 2 x > 2. X Y = X2 −2 4 0 0 2 4 1 1 1 3 F(y) = P(X 2 y) = P(− y 2 X y 2 ) = y 2 8 Therefore  1 3 0y 4 f(y) = 16 y 2 0 otherwise.  1 ln k = 0.25 2 ln k = 0.5 8 0 y 17 4 otherwise.  3x 3  2   =1  48  a 1 1 ln y + 2 2 Therefore,  1024 3 f(y) =  83521 y 0  2 3 2 ∫a 16 x dx = 1 dx = 1 x 10 1 4 2   cx  4  =1  0 2 3 3 x ⌠ ( x − 10 ) dx = F ( 0 ) +  ( x − 10 )  = ( x − 10 ) + 1    9000 9000 9 ⌡−∞ 3000   0  0 F ( x ) = ( x − 10 )3 1  9000 + 9 1  c = 64  3  f(x) =  64x 0  0x 1 2 otherwise.  0 F(x ) = 16x 4  1  x 0 0 < x < 30 x 30. Since X + T = 30, T = 30 − X x <0 Change limits: 0x 1 2 1 x> . 
2 Change limits: X Y = 8.5X 0 0 1 2 17 4 y  256 4 F(y) = P(8.5X y) = P  X = y 8.5  83521  X T 0 30 30 0 FT (t ) = P(T < t ) = P(30 − X < t ) = P(X > 30 − t ) = = P(X > 30 − t ) = 1 − P(X 30 − t )  (( 30 − t ) − 10 )3 1  +  = 1 − F X (30 − t) = 1 −  9000 9  (20 − t ) =8− 9000 9 3 8 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Worked solutions Therefore  0 F (t ) =  8 ( 20 − t )3  9 − 9000 1  t 0 0 < t < 30 1 0 ∫−1 3 x b E(Z) = ∫−1 3 x E(P) = ∫−1 3 x c 2 4 33 dx + ∫ 1 x 2dx = 9 0 6 b b 0 1 4 4 dx + ∫ 1 x 4dx = 513 15 0 6 1 f ( x ) =  8 x 0  10 1 4 57 4 a x 3 dx = 544 cm3 4 E( X ) = ∫0 x b Y = X3 b = c 10 1 ∫6 × 3x 6dx = 4 7 290048 7 7 3 4 0 ( ) 4 ( ) 0x 4 otherwise. x  1  ∫ (2x ) 8 x  dx =  16  4 4 2 0 4 = 16 0 31 2 ∫−1k dy +∫2 5 (7 − 2y ) dy = 1 3 2   [ky ] 2−1+  75 y − 210y  = 1  2 5 2 E (Y ) = ∫0 x ( 252 x )dx = 1002 x  = 252 4 5 2k + k + 0 E(2X + Y) = 2E(X) + E(Y) ( ) k= 5 5 = 2∫ x 2 x dx + 25 = 2 ×  2 x 3  + 25 25 2  75 0 2 2 a a 2 25 =2 × ( 18 x ) dx =  24x  = 83 4 ⌠  4 E X 2 =  x 2 1 x dx =  x  = 8 8  32 0 ⌡0 E(Y ) = ∫0 kx dx = 1 5 c otherwise. 1 Y = 2X 2 and f ( x ) =  8 x 0  5  1 kx 2  = 25 k = 1  2 0 2 b 0x 4 Alternatively, E(3Y 2 + Y + 2) = E(3Y 2 ) + E (Y ) + E(2) k= 13 8 +3=5 9 9 E(Y) = E(2X2) = 2E(X2) = 16 7 290048 = + 544 + 2 7 7 293870 = 7 a 18 = 13 5 3 1 1 13 13 ET = × = 3 ( ) 3 3 9 E ( 2R + 3 ) = 2E(R) + 3 = 2 × E(3Y 2 ) = E(3X 6 ) 4 E(R) = 4 dx + ∫ 1 x 3dx = 127 12 0 6 6 11 9 3 3Y 2 = 3X 6 3 3 10 5 − =5 3 3 ∫1 t (0.4 − 0.1t )dt + ∫3 0.1t dt = 15 + 1 ∫6 a E(Y) = a E (T ) = 0 d E(Y + Z) = E(Y) + E(Z) = 2 10 5 15 + = =5 3 3 3 b E ( 2 X − M ) = 2E ( X ) − E ( M ) = 2 × t 30. 
5 a E(Y) = 5 5 Therefore E ( X ) + E ( M ) = Exercise 1.4A 1 ∫0 m (0.4 − 0.08m )dm = 3 E(M ) = b 1 5 f(y) 10 25 115 + = 3 2 6 E( X ) = 10 ∫0 3 5 2 5 0.02x (10 − x )dx = 10 3  f (m ) = 0.4 − 0.08m 0 4 =1 10 0 m 5 otherwise. –1 0 2 3 y 9 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Continuous Random Variables 2 2 2 3 3Therefore, 1 y dy + 3 1 y (7 − 2y ) dy =  y  +  7y − 2y  = 3 + 29 = 19  10   ∫−1 5 ∫2 5  015 15  10 30   −1  10 2  2 3  1 w3 31  7y 2 2y 3  y2  3 29 19 y dy + ∫ y (7 − 2y ) dy =   +  − = + =  5  10  10 15  10 30 15 2 5    27 27 1 2 − F (w ) =  40 w − 80   189 + e −1.9 − e −w d X = 1 Y , therefore X 2 = 1 Y 2 2 4  200  1 2 3  1 1 1 1 E (Y ) = ⌠  4 y 2 5 dy + ⌠  4 y 2 5 ( 7 − 2y ) dy ⌡−1 ⌡−2 Y = 2.2W c E (Y ) = 2 ( )( ) ( )( 2 w<0 0 w < 1.5 1.5 w < 1.9 1.9 w < 2.36 ) 3  x3  7 1 4 =   +  y3 − y 40  2  60  −1  60 w 2.36. y  For y ∈ ( 0, 3.3) , F ( y ) = P (Y y ) = P ( 2.2W y ) = P  W = 2.2   3 y  1 y  25 3 ∈ ( 0, 3.3) , F ( y ) = P (Y y ) = P ( 2.2W y ) = P  W y = = 3 y 71 2.2  5  2.2  1331  = + 20 120 For y ∈(3.3, 4.18), 89 = y  27  y  27 120 F ( y ) = P (Y y ) = P ( 2.2W y ) = P  W = − 2.2  40  2.2  80  21 8 1 2 2 8 ∫ x(x ) dx + ∫ (8 − x )(x ) dx = 82 0 4 2 4 y  27  y  27 27 27 y− F ( y ) = P (Y y ) = P ( 2.2W y ) = P  W = − = 2.2 40 2.2 80 88 80     6 43 1 9 E(R 2) = ∫ r 2 dr = 3 1 5 For y ∈ (4.18, 5.192), −5 y  189 E(A) = πE(R 2) = 43 π cm 2 or 45.0 (3 s.f.) 
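The expectation of a function of a random variable, as in $E(A) = \pi E(R^{2})$ above, can be checked numerically. The sketch below is only an illustration: it assumes, from the worked integral, that R has PDF 1/5 on [1, 6], and the helper `integrate` is a hypothetical name, not something taken from the book.

```python
from math import pi


def integrate(f, a, b, n=10_000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def pdf(r):
    """Assumed PDF of R: constant 1/5 on [1, 6] (read from the worked integral)."""
    return 1 / 5


# E(R^2) is the integral of r^2 f(r) over the range of R.
e_r2 = integrate(lambda r: r**2 * pdf(r), 1, 6)
# E(A) = E(pi R^2) = pi E(R^2), by linearity of expectation.
e_area = pi * e_r2
print(e_r2, e_area)
```

Under that assumption the two values agree with 43/3 and 45.0 (3 s.f.) respectively.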
3 F ( y ) = P (Y y ) = P ( 2.2W y ) = P  W = + e −1.9 − e 1 2.2  200  E 1 ( A + 1)2 = 1 E ( A + 1)2 2 2 −5 y y  189 F ( y ) = P (Y y ) = P ( 2.2W y ) = P  W = + e −1.9 − e 11 2.2 200   1 2 = 2 (E( A ) + 2E( A) + E(1)) = 1059 10 To find the value of k, Therefore, f ( y ) = d F( y) dy k 1.5 1.9 2 −w ∫0 0.6w dw + ∫1.5 0.675 dw + ∫1.9 e dw = 1  75 2 k 1.5 0 < y < 3.3  1331 y 0.2w 3  + [ 0.675w ]1.9 +  −e −w  = 1  0  1.9 1.5  27  3.3 y < 4.18 0.675 + 0.27 + (– e–k + e–1.9) = 1 f ( y ) =  88 5  5 −11 y 4.18 y < 5.192 e–k = e– 1.9 – 0.055  11 e  0 otherwise. –k = ln(e–1.9 – 0.055)  4.18 3. 3 75 2 ⌠ k = 2.36 y × 27 dy + E(Y ) = ∫0 (y × 1331 y ) dy +  88 ⌡3.3 Change limits: 5.192 5y ⌠   5 −11   y × 11 e  d y = 2.94 (3 s.f.) W Y = 2.2 W ⌡4.18 0 0 Alternatively: 1.5 3.3 Y = 2.2W so E(Y) = 2.2E(W) 1.9 4.18 k 1.5 1.9 E(W ) = ∫ 0.6w 3dw + ∫ 0.675w dw + ∫ we −w dw 2.36 5.192 ( ) ( ) ( ) ( ) ( 0 For W ∈(0,1.5), w ∫0 0.6w 2 dw = 1.5 1 3 w 5 1.9 1.5 1.5 = 0.15w 4  0 ) 1.9 1.9 + 0.3375w 2  1.5 +  −e −w (w + 1)  k 1.9 k −w 0.3375 w)4 + w+ 0.675 w 2= 27+w −−e27 (w + 1)  where k = − ln(e −1.9 − 0.055) ≈ 2.36 For W ∈(1.5,1.9),0.15 F (1.5 dw 0 ∫1.5  1.5 1.9 40 80 = 0.759375 + 0.459 + 0.116147 = 1.335 w For W ∈ (1.9,2.36 ), F (1.9) + ∫ e −w dw = 189 + e −1.9 − e −w 200 1.9 E(Y) = 2.2 × 1.335 = 2.94 (3 s.f.) 10 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 where k = − ln( Worked solutions Exercise 1.5A 1 – 0.05x2 + 0.4x – 0.55 = 0 a F(x) = 0.5 1 x 2 = 0.5 25 x = 3.54 x = 1.76, x = 6.24 (reject, as x < 3 from part b) Therefore the 20th percentile is 1.76. 3 x = −3.54, reject a y = 1, 5 (y − 4)2 = 5 72 8 f(y) b Q1: 1 x 2 = 0.25 25 x = 2.5 x = −2.5, reject Q3: 1 x 2 = 0.75 25 x = 4.33 x = −4.33, reject 2 a 1 3 4 5 8 f(x) 0.3 0.1 0 1 9 3 1 x Therefore the median value lies between 3 and 9. 
Area of trapezium + area of rectangle = 0.5 c Area of rectangle = 0.1 (x – 3) × 0.1 = 0.1 Therefore the median is 4. Alternative method: 3 4 a y y <0 0y <1 1 y < 4 y > 4. 3 F(m) = 1 + 5(m − 4) = 0.5 216 m = 1.22 (3 s.f.) 31 3 1 ∫0 7 dw = 7 < 2 Therefore, the median value lies between 3 and 5. Therefore, the median value lies between 3 and 9. 0.4 + ∫ 0.1 dx = 0.5 3 + w 2 dw = 1 7 ∫3 7 2 [0.1x ]3x = 0.1 2w = 1 − 3  7  3 2 7 x 3 w 0.1x – 0.3 = 0.1 w=3 x=4 F(x) = 0.2 x ∫1 0.4 − 0.1x dx = 0.2 x 0.4x − 0.1 x 2  = 0.2  1 2 1 4 Therefore, the median is 3 Therefore the median is 4. c 4 d The 20th percentile is when y is between 0 and 1, 3 y 2 = 0.2 8 y = 0.730, y = −0.730, reject x=4 ∫1 0.4 − 0.1x dx = 0.4 < 0.5 3  0 3 2 8 y b F(y) =  3  5( y − 4 ) 1 +  216  1  b Using the graph, area of trapezium 1 = (0.1 + 0.3) × 2 = 0.4 < 0.5 2 Using calculus, 2 b 1 . 4 3 + w 2 dw = 7 7 ∫3 7 10 w 2w = 7 − 3  7  3 10 7 w=3 19 20 11 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 CONTINUOUS RANDOM VARIABLES 5 a 3 F(Q3) = 0.75 5 ∫1 k ( x − 1)dx + ∫3 k (5 − x )dx = 1 3 1 x − 1 = 0.75 2 2 5  kx   kx   2 − kx  + 5kx − 2  = 1  1  3 2 ( )( ) 25k 9 +  ( 25k − − 15k − k ) = 1 2 ) ( 2    9k − 3k − k − k   2  2 b 7 a f(x) 1− e c 6 a 1 3 8 − 1m 3 = 1 2 x 0. m = 3ln 2 or 2.08 (3 s.f.) −1x 3 = 0.8 x = 3ln 5 or 4.83 (3 s.f.) a y 1 2 a 4 ∫1 k dx + ∫3 2k dx = 1 [kx ] + [2kx ] 3 1 4 3 –1 k= 1 x =1 a=1 2 (3k – k) + (8k – 6k) = 1 1 4 b F(t) = 0 + x1  0 1 1 F ( x ) =  4 x − 4 1 x − 1 2 1  1 ∫3 2 dx = 2 x − 1 For x ∈ (3, 4), F(x) = c x <1 x > 4. 1 1 Q1= −0.5 Q3 = 0.5 IQR = Q3 − Q1 = 1 1 x < 3 3x 4 t ∫−1 2 dt = 2 (t + 1) Alternative method 1 (m + 1) = 0.5 2 m=0 x1 1 1 b For x ∈ (1, 3), F(x) = ∫ 4 dx = 4 x − 4 1 c x <0 F(x) = 0.8 1− e x 5 The graph is an isosceles triangle. It has the line of symmetry x = 3. Therefore, the median is 3. 
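The median and percentile calculations in this exercise all solve F(x) = p for x. A minimal numerical sketch is given below for the PDF used in question 2, read from the worked solution as f(x) = 0.4 − 0.1x on [1, 3] and f(x) = 0.1 on [3, 9]; the function names `cdf` and `percentile` are hypothetical and the bisection approach is an alternative to the algebra, not the book's method.

```python
def cdf(x: float) -> float:
    """CDF for f(x) = 0.4 - 0.1x on [1, 3] and f(x) = 0.1 on [3, 9]."""
    if x < 1:
        return 0.0
    if x <= 3:
        # Integral of 0.4 - 0.1t from 1 to x.
        return 0.4 * x - 0.05 * x**2 - 0.35
    if x <= 9:
        # F(3) = 0.4, then a rectangle of height 0.1.
        return 0.4 + 0.1 * (x - 3)
    return 1.0


def percentile(p: float, lo: float = 1.0, hi: float = 9.0, tol: float = 1e-10) -> float:
    """Solve F(x) = p by bisection; F is continuous and increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


median = percentile(0.5)
p20 = percentile(0.2)
print(round(median, 3), round(p20, 3))  # 4.0 1.764
```

Bisection finds only the root inside the support, so the rejected algebraic solution x = 6.24 for the 20th percentile never appears.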
3 x −1x  −1x  dx =  −e 3  = 1 − e 3  0 1 b F(m) = 2 1 2 c x 1 −1x 3 ∫0 3 e  0 Therefore F ( x ) =  −1x  1 − e 3 1 4 0 7 2 Therefore IQR = 7 − 2 = 1 1 2 2 4k = 1 k= x= 9 a 4 k 1 ∫1 0.25 dr + ∫4 − 8 (r − k ) dr =1 k 2  1 4  1  4 r  +  − 16 (r − k )  = 1 1 4 F(x) (1 − 14 ) +  0 − − 161 ( 4 − k )  = 1 2 1 1 2 0 1 3 4 x d F(Q1) = 0.25. From the graph, LQ value must be between 1 and 3. 1 1 x − = 0.25 4 4 x=2 1 1 (4 − k )2 = 16 4 4 – k = ±2 k = 2 (reject because k > 4) or k = 6 b For r ∈(1, 4), F (r ) = F (1) + ∫ r 1 r 1 1 1 dr = 0 +  r  = (r − 1) 4  4 1 4 12 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Worked solutions 1 Exam-style questions For r ∈ (4, 6), r r ⌠ − 1 ( r − 6 )dr = 3 +  − 1 (r − 6)2  = 1 − 1 (r − 6)2 F( r ) = F( 4 ) +  1  4 16 4  16 ⌡4 8 1 a F(7) = a(7)3 = 343a = 1, so a = 343 r r 1 3 1 (r − 6)2  = 1 − 1 (r − 6)2 t = 0.2 ) = F( 4 ) + ⌠⌡4 − 18 (r − 6 )dr = 34 +  − 16 343  4 16 t = 4.0936…  t = 4 days r <1 0 1  1r < 4  3 2  4 (r − 1) 0 t 7 b f(t) =  343 t F(r) =  2 0  (r − 6 ) otherwise.  4 r 6 1 − 16 1 c F(t) = 0.25 r > 6.  c 1− (r − 6)2 = 0.8 16 r = 4.21 10 −x  a f(x) = x 0  Lower quartile F(t) = 0.25 r = 7.79 reject 1 3 t = 0.25 343 −1 x < 0 0 x 1 otherwise. 0 1 1 2  2 − 2 x F(x) =  1 + 1 x2 2 2 1 For t ∈(0, 7), t 3 1 3 F (t ) = ∫ t 2 dt = t 343 0 343 For the 80th percentile, R is between 4 and 6. t3 = 85.75 t = 4.41 x < −1 2 −1 x 0 a a x > 1. − Change limits: X Y −1 4 0 0 1 4 FY (y) = P(4 X 2 y) = P − 1 y X 1 y 2 2 1 1 1 = FX 2 y − FX − 2 y = 4 y Therefore,  y <0 0  1 F(y) =  y 0y 4 4 1 y > 4. 1  Hence f(y) =  4 0  1 b y = 0.5 4 M: y = 2 c ) 0 y 4 otherwise. 
For Q1, 1 y = 0.25, 0.5 so y = 1 4 For Q3, 1 y = 0.75, 0.5 so y = 3 4 IQR = 2 1  − 1 (4 − x)2  + 1 (5 − a) = 1  2  1 6 0 x 1 ( ( ) ( a1 ∫1 6 (4 − x)dx + 6 (5 − a) = 1 { } 1 1 (4 − a)2 − 9 + (5 − a) = 1 12 6 (4 − a )2 − 9 − 2(5 − a ) = −12 a 2 − 6a + 9 = 0 2 (a − 3) = 0 a=3 ) 3 4. 5 1 1 dx = 11 16 b ∫1.5 6 (4 − x)dx + ∫3 c The 95th percentile lies in the region (3, 5). 1 1 F(x) = 6 + 6 x 6 3x 5 1 + 1 x = 0.95 6 6 x = 4.7 3 a 1  1 x 4  + ( k − 1) x k +  − ( x − 2 )2 2 = 1 1  k  4 0  8k 2 − 24k + 17 = 0 6+ 2 6− 2 or 4 4 13 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Continuous Random Variables b     f(x) =      x3 0x <1 2+ 2 4 1 x < − 2( x − 2) 0     or f ( x ) =      Or change limits: 6+ 2 x < 2 4 otherwise x3 0x <1 2− 2 4 6− 2 1 x < 4 6− 2 x <2 4 otherwise. −2( x − 2) 0 x <0 0 4 x 4 2+ 2 2 +1 x− 4 4 1 − ( x − 2) 2 1  0 x4 4  or F(x) =  2 − 2 x − 1 − 2 4  4  2 1 − ( x − 2 )  1 c Y 0 1 6− 2 4 126 − 55 2 32 2 8  1 1 y 3 3  −2 2 − 2 3 f(y) =  12 y   1  −2 − 2  y 3 − 2  y 3 3    0  Therefore,      F(x) =       X 0 1 6+ 2 4 0x <1 4 6+ 2 1 x < 4 6+ 2 x 2 4 x >2 Q 3 : 1 − e −0.1t = 0.75 c For median: 1 − e −0.1t = 0.5 t = 10 ln 2 1t  1 − 10  e f (t ) = 10  0      f(y) =       126 − 55 2 32 2 8 1 13 y 3 0 y <1 2 + 2 − 23 y 12 1 y < 2 otherwise. ∞ ∞ ∞ ∞ 6+ 2 4 t 0 ⌠   − 1t ∞ − 1t − 1t Mean = E(t ) =  t  1 e 10  dt =  −te 10  − ∫ e 10 dt 0  ⌡0  10  0 Change limits: 1 otherwisse. Q 3 − Q1 = 10 ln 3 1 x < 6 − 2 4 6− 2 x 2 4 x > 2. 1 126 − 55 2 y < 8 32 t = 10 ln 4 3 t = 10 ln 4 b Q1 : 1 − e −0.1t = 0.25 0x <1 Y 0 1 y < 126 − 55 2 32 a F(15) − F(10) = 0.145 x <0 X 0 0y <1 ⌠   − 1t ∞ − 1t − 1t E(t ) =  t  1 e 10  dt =  −te 10  − ∫ e 10 dt 0  ⌡0  10  0 126 + 55 2 32  −3 2 1 −  y 3 − 2 y 3  126 + 55 2 y <8 32 0 otherwise. 
∞  − 1t = 0 − 10e 10  = 0 − ( 0 − 10 ) = 10  0 P(mean  T  median) = F(10 ln 2) – F(10) 5 a = 0.5 − e −1 iTwo points (−2, 0) and (0, 0.2) on the first piece: 0.2 − 0 = 1 a= 0 + 2 10 Two points (1, 0.4) and (4, 0) on the third piece: 0 − 0.4 2 b = 4 − 1 = − 15 14 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Worked solutions 1 1 10 x + 5  1 f ( x ) = 5 − 2 x + 8  15 15 0   9 x − 1 x2  = 3 14 1 8  14 0x <1 4x2 – 36x + 53 = 0 1 x < 4 x = 1.85, x = 7.15 reject otherwise. 0   1 x 2 + 4x + 4 20  ii F(x) = 1 (1 + x ) 5  − 1 x 2 − 8x + 1  15 1  b i Change limits: ( −2 x < 0 ) 1 x < 4 7 2 ( ) = 0.803 919 239 − 1 192 240 2 e−2 1 4 e4 k = 0.25 b F(4) = 0.5, therefore the value of Q1 must lie between 2 and 4; the value of Q3 must lie between 4 and 5. e 0.25(x – 2) = 0.25 x = 3 0.5(x – 3) = 0.75 x = 4.5 c For y ∈ (e, e4), F ( y ) = 1 (ln y 2 − 8ln y + 1) 15 Therefore, e −2 y < 1 E( X ) = =1− F 8 a 3 5 ( 4 otherwise. 4 3 9 ∫2 0.25x dx + ∫4 0.5xdx = 2 + 4 = 3 4 P( X > µ) = P X > ii 1 y <e ey <e IQR = 4.5 – 3 = 1.5  2x 4  0.25 f ( x ) =  0.5 4x 5  otherwise. 0 d i ) ( ) 15 15 =1− P X . 4 4 (154 ) = 1 – 0.4375 = 0.5625 f(x) e ⌠  1  3  ii ⌠  y  10y  d y +  y  5y  d y ⌡e−2 ⌡1 1 or 1 15 k e4  −2  +⌠  y  5y  d y = 21.4 ⌡e a P ( 0.5 < x < 3 ) = 1 1 31 ∫0.5 2 x dx + ∫1 7 ( 4.5 − x )dx 3 1 1 ∫0 2 x dx = 8 , therefore the median value 3 0 3 1 9 1 15 5 745 =  x 4  +  x − x2  = + = 14 1 128 7 896  8 0.5  14 11 ) so k(4 – 2) = 0.5 1 (ln y 2 + 4 ln y + 4) For y ∈ (e–2, 1), F ( y ) = 20 1 For y ∈ (1, e), F ( y ) = (1 + ln y) 5 b 4. 5 1 a F(4) = 0.5(4 – 3) = 0.5 F(y) = P(Y  y) = P(eX  y) = P(X  ln y) 6 5 2k = 0.5 −2 0 1 1 ( x 2 9 − 1 x dx 14 7 1 301 919 = = + 12 64 192 1 0 ( ) x > 4. 
Y   3 10y 1  f(y) =  5y  −2  5y  0 ( ) ∫ 12 x dx + ∫ E X2 = Var ( X ) = E X 2 − ( E ( X )) = 0x <1 X c x < −2 ) ( x −2 x < 0 b x ∫0 15 0 1 dx = 1 x F x =  1 ( ) 15 x 15 15  1 x x <0 0 x 15 x > 15. must lie between 1 and 4.5. 1 + x 1 4 . 5 − x dx = 1 ) 2 8 ∫1 7 ( 15 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Continuous Random Variables c Change limits: X 0 15 (16k − 8k ) − (8k − 2k ) = 53 Y 0 225 k= b 1  1  1  1  1 21   FY ( y ) = P X 2 y = P  X y 2  = FX  y 2  =  y 2  = y   15   15   (   = FX  1  y2 1  y2 1  = 15  ) 1  y2 1  = 15 0 0 y 225 y > 225. 0 y 225 E (T ) = otherwise. 2 ∫0 0.15t  3X 4  E(Z) = E  2  = E(3X 2)  X  = 3 × E(X2) = 3 × 75 = 225 a For X ∈(–∞, 0), x ∫−∞ 2 Mean = E (T ) = c t 4 2 ∫0 0.15t 4 dt + ∫ 3 t ( 4 − t )dt = 3 + 8 = 2.2 hours 5 5 2 10 3 4 d t + ∫ 3 t ( 4 − t )dt = 3 + 8 = 2.2 hours 5 5 2 10 3 d For t ∈ ( 0, 2 ) , F (t ) = d E(Y) = E(X 2) = 75 F(x) = 0.6 y <0  −1 Therefore, f ( y ) =  1 y 2 30  0  9 f(t) 1 y2  0 F ( y ) =  1 21 y  15 1  3 10 t ∫0 0.15t 2 dt = 0.05t 3 t For t ∈ ( 2, 4 ) , F (t ) = F ( 2 ) + ∫ 3 4 − t dt = 0 . 4 +  6 t − 3 t 2  = ( ) 20 2 2 10  5 t t t 3 t 2 + 6t − 7 t ∈ ( 2, 4 ) , F (t ) = F ( 2 ) + ∫ 3 ( 4 − t )dt = 0.4 +  65 t − 230 t 2  = − 20 5 5 2 10  2 Therefore, 0 dx = 0  t <0 0 0.05t 3 For X ∈(0, ∞ ), 0 t 2 F (t ) =  x x 1x  1x 1x 1x 3 6 7  2 − − − − 1 2 t 4 F(x) = F( 0 ) + ⌠  4 e 4 dx = 0 +  − e 4  = − e 4 − ( −1) = 1 − e 4 − 20 t + 5 t − 5 ⌡0  0 t > 4. 1 x 1 1 1 1   1 e − 4 xdx = 0 + − e − 4 x = − e − 4 x − −1 = 1 − e − 4 x 17 3 ( ) e P(T > 3) = 1 – F(3) = 1 − 20 = 20   4  0 Therefore,  0 F( x ) =  1 1 − e − 4 x  b x < 0, 2 ∫0 0.15t dt + ∫ 2 4 2 6 2  1 kx 2  + [ 2kx ]6 = 1 2  2 0 x 0. 
(2k – 0) + (12k – 4k) = 1 −1  e 4 k=  = 0.779 1 = e − 14 x 4 1 1 − x = ln 4 4 x = 5.55 a 2 ∫0 kx dx + ∫2 2k dx = 1 a −1   P ( X > 1) = 1 − P ( X 1) = 1 − F (1) = 1 −  1 − e 4  = 0.779    1) = 1 − P ( X 1) = 1 − F (1) = 1 −  1 −  1 − x 3 c 1−e 4 = 4 10 11 1 10 2 31 1 3 1 7 x dx + ∫ dx = + = 10 20 5 20 2 5 b P (1 < X < 3 ) = ∫ c For x ∈ ( 0, 2 ) , F ( x ) = 1 x 1 1 ∫0 10 x dx = 20 x 2 x x x1 1 1 1 1 For x ∈( 2, 6 ) , F ( x ) = F ( 2 ) + ∫ 5 dx = 5 +  5 x  = 5 x − 5 2  2 x x ∈( 2, 6 ) , F ( x ) = F ( 2 ) + ∫ 1 dx = 1 +  1 x  = 1 x − 1 5  5 2 5 5 2 5 k ( 4 − t ) dt = 1 4 16 2  1 + 4kt − kt 2  = 1 5  2 2 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Worked solutions Therefore,  0 1 2 F ( x ) =  20 x 1 x − 1 5 5 1  d x <0 b 0x 2 2x 6 x > 4. For a ∈ ( 4, 9 ) , F (a ) = 1 F ( 2 ) = , therefore the median value must lie 5 between 2 and 6. a 9 a ( ) b i =− ( ) 0x π otherwise. 5 1 2 c 1 2 0 –5 π 5 a 1 1 4 ∫4 10 da = 10 a − 10 ( ) 1 2 19 161 a + a− 200 100 200 Therefore,  0 1 4 F (a ) = 10 a − 10 − 1 a 2 + 19 a − 161 100 200  200 1  f(x) ii otherwise. ( 1 1 1 P − π < x < π = 0 + F π = 0.383 4 4 4  1 1 f ( x ) =  2 cos 2 x 0  9 a < 19 1 19 − a da = 1 + 19 a − 1 a 2 − 261 ) 100 ( 2 100 200 200 x = 3.5 12 4a < 9 a 1 1 19 For a ∈ ( 9, 19 ) , F (a ) = F ( 9 ) + ∫ 100 (19 − a ) da = 2 + 10 9 a ∈ ( 9, 19 ) , F (a ) = F ( 9 ) + ∫ 1 1 1 x− = 5 5 2  1 10 f (a ) =  1 100 (19 − a ) 0  a<4 4 a 9 9 a 19 a > 19. 
1 S = A2 Limits: 1 S = A2 2 3 A 4 9 x 19 –5 19 ( 12 x ) dx   1 s For s ∈ ( 2,3) , F ( s ) = P  A s  = P ( A s ) = 10   1 1 1 = 2x sin ( x )  + ∫ 2sin ( x ) dx = 2π −  −4cos ( x )  = 2.28 2  2 s ∈ ( 2,3) , F (s ) = P 2A   s = P ( A s ) = 1 s − 4   10 10 1 1 1 sin ( x )  + ∫ 2sin ( x ) dx = 2π −  −4cos ( x )  = 2.28 P (S < 2.5) = F ( 2.5) = 1 (2.5) − 4 = 9 2  2 2   10 10 40 c E ( 2X ) = i π 1 ∫0 2x  2 cos π π 0 0 1 2 2 π 1 2 π π π 0 0 0 2 0 2 2 ii 13 a E( X ) = 9 19 ∫4 k da + ∫9 or 0.225 1 E 2 X = 1.14 2 ( ) 14 1 19 − a da = 1 ) 100 ( 19 19 1 2 a− a [ka ]94 +  100 200  9 =1  192 192  19 × 9 81   5k +  − − − =1 200    100 200  100 5k + k= a x ∫0 cx 2 dx = 1 2 c x3 = 1  3 0 8c −0=1 3 c= 3 8 1 =1 2 1 10 17 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 2 − 1 1 Continuous Random Variables b 0  F ( x ) =  1 x 3 8 1 0  F ( y ) =  1 y 4 − 1 80  327680 1 x <0 0x 2 x > 2. Q1 : 1 3 1 x = 8 4 x = 1.260 Q3 : 1 3 3 x = 8 4 x = 1.817 E (10Y ) = ∫ c Limits of Y: X 0 2 8 y 24 y > 24.  1 y 3 8 y 24 Therefore, f ( y ) =  81920 0 otherwise.  b P(Y > 6) = 1 – F(6) = 1 – 0 = 1 IQR = 1.817 – 1.260 = 0.557 c y <8 24 8 1 1 y 3  d y =  y5 (10y )  81920 40960    24 8 = 193.6 Y 0 16 16 ( ) ( )( ) a 4k ∫−k i 1 3 2 x + k dx = 1 5k  1 21  4k 2 1  1 1 3 FY ( y ) = F 4 X 2 y = FX  y  =  4 y  = 64 y 2  x + 1 x  = 1 4 8      10k 5    −k 1 1 3 1 2 1  1 2 1 23 y = y  = y  64 4 8  4    8 4 1 1  k + k − k − k  =1 5 10 5   5 0 y <0  3 F ( y ) =  1 y 2 12 1 0 y 16 k + k = 1 64 5 10  y > 16. 1  5 k = 1 2  1 3 2 0 y 16 Therefore, f ( y ) = 128 y 2 k = 5 0 otherwise.   3 16 x < −2 0 5 d E ( y ) = ∫ 3 y 2 dy = 9.60  128 0 ( 5x + 2 )2 ii F(x) =  −2 x < 8 5 5  100 x x 1 3 8 1 1 1 1 4 4  x . 
15 a ∫ x dx = x = x −  5 80  80 1 80 1 20 ( ) ( )( ) ( ) ( 0  F ( x ) =  1 x 4 − 1 80 80 1 x > 3. x = 1.44 (3 s.f) −2.24, reject b i Limits of Y: 1 1 = y4 − 327680 80 (5x + 2)2 = 0.85 100 1 x 3 FY ( y ) = F ( 8X y ) = FX ) iii P(X  p) = 0.85 x <1 X 1 3 )( Change limits: X Y 8 24 − ( 18 y ) = ( 801 )( 18 y ) − 801 4 Y 8 125 2 5 − 8 5 512 125 1 P (Y y) = P (X 3 y) = P (X y 3 ) =  13   5y + 2  = 100 2 18 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 T) = 1 Worked solutions  0  2   13  5 2 y + F(y) =     100  1  0  2 t  24 F(t) =  2  1 − (12 − t ) 192  1  y <− 8 125 − 8 y < 512 125 125 y 512 . 125  1 2  5y 3 + 2    = 0.5 ii 100 y = 1.01 −1.81, reject 17 t −0.1t ) 1 a E(T) = ⌠  t 10 e −0.1t dt = 10 minutes ⌡0 b i ∫0 0.1e ( ∞ t F (T ) = ∫0 0.1e −0.1t ( t t ) ) 1− F(m) = P(2T  m) = P(T  0.5m) = 1 – e–0.1(0.5m) = 1 – e– 0.05m  −0.05m  m 0 Therefore, f (m ) =  0.05 e otherwise. 0 – 0.05m = 0.5 To find median, 1 – e ii – 0.05m = ln 0.5 m = 13.9 minutes = P − n T n n 0.1 n n  0 FN ( n ) =  0.1  e n 18 n ) x x 2 2 t n<0 −e 2 t t  1 ∫0 12t dt =  24  −0.1 n t 4 = 0 = x  x2  1 ∫0 11 x dx =  22  x = 0 x2 22 x ∫2 11 dx = 11 + 11 x  2 = 11 + x 2 2 2 2 (121 x − 5 n 0. =1− 2 t 24 ( 8 − x )2 33 Change limits: 2 t  (12 − t )  1 (12 − t ) dt = 32 +  − 192  96  4  2 2  (12 − t ) 1 + − +  3  192 3 =1− ottherwise. 
2 8  (8 − x ) 3 = + − +  11  33 11  For t ∈ [4, 12], F (t ) = F ( 4 ) + ∫ 5x < 8 For x ∈ (5, 8) x x 2 8  (8 − x)2  F ( x ) = F ( 5) + ∫ 8 − x ) dx = + − (  11  33  5 33 a For t ∈ (0 , 4) F (t ) = 2x <5 2 x = 2 + 2 x − 4 = 2 x −1 ( ) ) − (1 − e F(x ))=) F(2) + ∫ 112 dx = 112 +  11  11 ( 11 11 ) 11 − e −0.1 0.1 = e F( x ) = For x ∈ (2, 5) F ( x ) = F ( 2) +  n) 0x < 2 For x ∈ (0 , 2) (( t = 5.07, 18.9, reject  1 x 11 2 f(x) =  11 2  33 ( 8 − x )  0 b i = 1 − e −0.1 (12 − 5)2 − 32 = 71 192 24 192 1 × 2 ×k + 5 − 2 × k + 1 × 8 − 5 × k = 1 ( ) ) 2 2 ( 2 k = 11 e– 0.05m = 0.5 ( ii 1 − (12 − t )2 = 0.75 192 M = 2T ii FN(n) = t 12. dt =  − e −0.1t  = − e −0.1t − ( −1)19= 1 −ae −0i.1t 0 dt =  − e −0.1t  = − e −0.1t − ( −1) = 1 − e −0.1t 0 P(T 2 4 t < 12 Q 3 when 4 t < 12, ( 0t < 4 2.52 = 25 24 96 b i c t <0 X 0 2 5 8 Y = X2 0 4 25 64 (12 − t )2 192 19 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 1 Continuous Random Variables 2 For y ∈ (0, 4) 1 1 1.5  1 c y 2P( 0.5 < x < 1.5 ) = ∫ 0.2x 2 dx + ∫ 26 xdx = 0.08619... + 0.36111 1 1 45 . 0 5 1     1 F ( y ) = P (Y y ) = P X 2 y = P  − y 2 X y 2 = y −0= 1 22 22 1 1.5  P ( 0.5 < x < 1.5 ) = ∫ 0.2x 2 dx + ∫ 26 xdx = 0.08619... + 0.36111... = 0.447 0.5 1 45 1 2  2  y  1 1   1 , Mathematics in life and work Y y ) = P X 2 y = P  −y 2 X y 2  = y −0= 22 22   ( ( ) ) because X cannot take negative values. 1 a f(a) For y ∈ (4, 25) 1 1 1 F ( y ) = 2  y 2 − 1 − 2  − y 2 − 1 = 4 y 2 11   11   11 For y ∈ (25, 64)   F( y ) = 1 −    f (y) = 1 ∫0       1 22 1 11 y 1 0.2x 2 25 y 64 0 otherwise. 2 26 c 21 26 a Mean is calculated from the integral of x f(x). Mean is 22.1 years, so in the age group 21  a  26. d Players between 21 and 26. 2 a The probability will be zero since t is the continuous random variable measuring the number of hours in one day. 
c dx + ∫ 1 45 x dx = 1 1 Median > mean, negative skew. Reduce the difficulty level in order to reduce the median of the play time.  1 2 23   13 2  b  5 3 x  +  45 x  = 1 1  0 Put in extra help functions to support players to complete each level, so that the mean of play time will increase. 2 13 2 13 + b − =1 15 45 45 d Let daily playing time on the weekend be Y, Y = 2X E(Y) = 2 E(X) b = 2, b = –2 reject 3 16 b Any increasing function, i.e. x 2, 0.2x 3.e x b2 = 4 1 10 b A ge is a non-negative continuous random variable. However, we are not expecting a baby to play computer games. The model can be modified to take this into account. 4 y 25 8− y 33 y b 26 ( )( ) b 0 0 y 4 383 22 ii a 2 d F y , therefore: dy ( )      f(y) =       20 1   2  8 − y  33 E(x ) = 3 1 0.2x 2 0 ∫ 2 M(Y) = 2 M(X) 2 dx + ∫ 26 x 2dx = 2 + 182 = 1.443 25 135 1 45 182 x dx = + = 1.443 ∫0 0.2x 2 dx + ∫ 25 135 1 45 2 20 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Worked solutions 2 2 Inference using normal and t -distributions Please note: Full worked solutions are provided as an aid to learning, and represent one approach to answering the question. In some cases, alternative methods are shown for contrast. All sample answers have been written by the authors. Cambridge Assessment International Education bears no responsibility for the example answers to questions taken from its past question papers, which are contained in this publication. Non-exact numerical answers should be given correct to 3 significant figures, or 1 decimal place for angles in degrees, unless a different level of accuracy is specified in the question. Where values from the Cambridge International Education statistical tables are used, the same level of accuracy has been used in workings unless stated otherwise. 
Prerequisite knowledge

1 P(W < 53) = P(Z < 1.5) = 0.9332

2 a P(W ≤ 3.0) = P(Z ≤ −0.3529) = 1 − P(Z < 0.3529) = 0.3621
    Expected number = 100 × 0.3621 = 36
  b P(3.2 ≤ W ≤ 3.5) = P(−0.1176 ≤ Z ≤ 0.2353) = 0.1398
    Expected number = 100 × 0.1398 = 14
  c 3.3 ± 1.96 × 0.85/√100 gives the interval [3.13, 3.47]

3 a 46.3 ± 1.645 × 15.2/√50 gives the interval [42.8, 49.8]
  b You are 90% confident that, on average, applicants score between 42.8 and 49.8. Since the lower limit of the confidence interval is greater than 42, you are 90% confident that applicants could achieve a mean score of 42 or higher.

4 H0: μ = 1.26   H1: μ ≠ 1.26
  x̄ = 13.35/10 = 1.335
  The test statistic z = (1.335 − 1.26)/(0.24/√10) = 0.988
  Two-tailed test, with p = 0.975; the critical values are ±1.96.
  As 0.988 < 1.96, the test statistic Z lies inside the acceptance region, so you should accept H0. There is no evidence to suggest a change in mean growth of tomato plants.

Exercise 2.1A

1 H0: μ = 9   H1: μ < 9
  x̄ = 57.6/8 = 7.2
  s² = (1/7)(423.7 − 57.6²/8) = 1.2829
  The test statistic T = (7.2 − 9)/√(1.2829/8) = −4.495
  One-tailed test to the left, with p = 0.95 and ν = 7; the critical value of t = −1.895.
  As −4.495 < −1.895, the test statistic T does not lie in the acceptance region, so you should reject H0. There is significant evidence to suggest that the mean value of the random variable has decreased from 9.

2 a H0: μ = 10 cm. The average length of leaves is 10 cm.
  b H1: μ > 10 cm. The average length of leaves is greater than 10 cm.
  c Gemma should use the test statistic T, because the population standard deviation is unknown (and the sample size is small).
    x̄ = 96/10 = 9.6
    s² = (1/9)(1080 − 96²/10) = 17.6
    The test statistic T = (9.6 − 10)/√(17.6/10) = −0.302
    A one-tailed test to the right, with p = 0.95 and ν = 9. The critical value of t is 1.833.
    The test statistic, −0.302 < 1.833, lies within the acceptance region, so you should accept H0. There is no evidence to suggest that the average length of leaves in Gemma's garden is greater than 10 cm.
  d Assume that the length of leaves is normally distributed.

3 a H0: μ = 165 cm. The average height of students is 165 cm.
    H1: μ > 165 cm. The average height of students is greater than 165 cm.
  b One-tailed test to the right.
  c Assume the height of students is normally distributed.
    x̄ = 972/6 = 162
    s² = (1/5)(157754.6 − 972²/6) = 58.12
    The test statistic T = (162 − 165)/√(58.12/6) = −0.964
    One-tailed test to the right, with p = 0.99 and ν = 5. The critical value of t is 3.365.
    The test statistic, −0.964 < 3.365, lies within the acceptance region, so you should accept H0. There is no evidence to suggest that the average height of the students is greater than 165 cm.

4 Assume the sample is drawn from a normal distribution.
  H0: μ = 10.5   H1: μ > 10.5
  x̄ = 127.3/10 = 12.73
  s² = (1/9)(2219.6 − 127.3²/10) = 66.563
  The test statistic T = (12.73 − 10.5)/√(66.563/10) = 0.864
  One-tailed test to the right, with p = 0.90 and ν = 9; the critical value of t = 1.383.
  As 0.864 < 1.383, the test statistic T lies inside the acceptance region, so you should accept H0. There is no evidence to suggest that the new technology has increased the television lifetime.

5 a x̄ = 207.7/10 = 20.77
    s² = (1/9)(4361.33 − 207.7²/10) = 5.267
  b H0: μ = 20   H1: μ > 20
    The test statistic T = (20.77 − 20)/√(5.267/10) = 1.061
    One-tailed test to the right, with p = 0.975 and ν = 9. The critical value of t is 2.262.
    The test statistic, 1.061 < 2.262, lies within the acceptance region, so you should accept H0. There is no evidence to suggest that the mean is greater than 20.
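The calculations in this exercise all follow the same pattern: from n, Σx and Σx², find the sample mean, the unbiased variance estimate and then the test statistic T. As a minimal sketch, assuming Python is available for checking answers (the function name `one_sample_t` is illustrative and not part of the book), the pattern can be written as:

```python
import math


def one_sample_t(n, sum_x, sum_x2, mu0):
    """Return (mean, unbiased variance estimate, T) from summary totals.

    n      : sample size
    sum_x  : sum of the observations
    sum_x2 : sum of the squared observations
    mu0    : mean stated in the null hypothesis
    """
    mean = sum_x / n
    # Unbiased estimate of the population variance.
    s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    # Test statistic for a one-sample t-test.
    t = (mean - mu0) / math.sqrt(s2 / n)
    return mean, s2, t


# Question 3: n = 6, sum of x = 972, sum of x squared = 157754.6, H0: mean = 165
mean, s2, t = one_sample_t(6, 972, 157754.6, 165)
print(round(mean, 2), round(s2, 2), round(t, 3))
```

With the totals from question 3 this should reproduce x̄ = 162, s² = 58.12 and T = −0.964. The critical value still has to be read from the t-distribution table.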
6 H0: μ = 138 ml   H1: μ ≠ 138 ml
  Σx = 682 → x̄ = 136.4
  Σx² = 93 246 → s² = 55.3
  The test statistic T = (136.4 − 138)/√(55.3/5) = −0.481
  Two-tailed test at the 5% significance level, with p = 0.975 and ν = 4; the critical values of t = ±2.776.
  As −2.776 < −0.481 < 2.776, the test statistic T lies inside the acceptance region, so you should accept H0. There is no evidence to suggest that the mean volume is not as expected.

7 H0: μ = 20 g   H1: μ < 20 g
  Σx = 115 → x̄ = 19.17
  Σx² = 2231.26 → s² = 5.419
  The test statistic T = (19.17 − 20)/√(5.419/6) = −0.8734
  One-tailed test to the left, with p = 0.05 and ν = 5; the critical value of t = −2.015.
  As −0.8734 > −2.015, the test statistic T lies inside the acceptance region, so you should accept H0. There is no evidence to suggest that the mean weight of a pack of Brand A's raisins is less than 20 g.

8 H0: μ = 12 minutes   H1: μ < 12 minutes
  Σx = 26 → x̄ = 3.25
  Σx² = 312 → s² = 32.5
  The test statistic T = (3.25 − 12)/√(32.5/8) = −4.341
  One-tailed test at the 1% significance level, with p = 0.01 and ν = 7; the critical value of t is −2.998.
  As −4.341 < −2.998, the test statistic T does not lie in the acceptance region, so you should reject H0. There is evidence to suggest that the new mean is less than 12 minutes. That means the new schedule introduced by the control room works better than the previous one during the peak time.

9 H0: μ = 1 kg   H1: μ ≠ 1 kg
  Σx = 10.54 → x̄ = 1.054
  Σx² = 11.2202 → s² = 0.012 34
  The test statistic T = (1.054 − 1)/√(0.012 34/10) = 1.537
  Two-tailed test at the 10% significance level, with p = 0.95 and ν = 9; the critical values of t = ±1.833.
  As 1.537 < 1.833, the test statistic T lies within the acceptance region, so you accept H0. There is no evidence to suggest that the mean weight of bags of vegetables has changed from 1 kg.
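Each solution above ends by comparing the test statistic with a table value and deciding whether it lies in the acceptance region. The comparison depends only on the direction of the test. A small sketch of that rule follows; the function name `in_acceptance_region` and the `tail` labels are illustrative assumptions, and the critical value is entered as the positive magnitude read from the tables.

```python
def in_acceptance_region(statistic, critical, tail):
    """Return True if H0 is accepted, False if it is rejected.

    statistic : calculated value of T (or Z)
    critical  : positive critical value read from the tables
    tail      : 'left', 'right' or 'two'
    """
    if tail == "left":
        # Reject only when the statistic is below -critical.
        return statistic >= -critical
    if tail == "right":
        # Reject only when the statistic is above +critical.
        return statistic <= critical
    if tail == "two":
        # Reject when the statistic falls outside (-critical, +critical).
        return -critical <= statistic <= critical
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Question 8: T = -4.341, one-tailed test to the left, critical value 2.998
print(in_acceptance_region(-4.341, 2.998, "left"))
# Question 9: T = 1.537, two-tailed test, critical value 1.833
print(in_acceptance_region(1.537, 1.833, "two"))
```

For question 8 the function returns False (reject H0) and for question 9 it returns True (accept H0), matching the conclusions above.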
10 H0: μ = 50 grams   H1: μ ≠ 50 grams
   x̄ = 597/12 = 49.75
   s² = (1/11)(29 767 − 597²/12) = 6.023
   The test statistic T = (49.75 − 50)/√(6.023/12) = −0.353
   A two-tailed test, with p = 0.975 or p = 0.025; ν = 11. The critical values of t are ±2.201.
   The test statistic −0.353 is between −2.201 and 2.201, and lies within the acceptance region, so you should accept H0. There is no evidence to suggest that the mean weight differs from 50 grams.

Exercise 2.2A

1 a A normal distribution test is suitable as the population variances are known.
  b H0: μ1 − μ2 = 0   H1: μ1 − μ2 ≠ 0
    The test statistic Z = (4.06 − 3.91)/√(0.062/10 + 0.047/10) = 1.437
    Two-tailed test with p = 0.95; the critical values of z = ±1.645.
    As 1.437 < 1.645, the test statistic Z lies in the acceptance region, so accept H0. There is no significant evidence to suggest that the mean amounts of milk dispensed by the two machines are different.

2 Assume each accident that happened was independent.
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 > 0
  x̄1 = 2859/35 = 81.69   s1² = (1/34)(235 425 − 2859²/35) = 55.46
  x̄2 = 3052/35 = 87.20   s2² = (1/34)(268 450 − 3052²/35) = 68.11
  The test statistic Z = (81.69 − 87.20)/√(55.46/35 + 68.11/35) = −2.93
  A one-tailed test to the right with p = 0.95. The critical value from the normal distribution table is 1.645.
  The test statistic, −2.93 < 1.645, lies within the acceptance region, so accept H0. There is no evidence to show that the warning signs reduce road accidents.

3 a H0: μX − μY = 0   H1: μX − μY ≠ 0
    The test statistic Z = (305 − 309)/√(82.35/55 + 123.61/65) = −2.170
    Two-tailed test with p = 0.975; the critical values of z = ±1.960.
    As −2.170 < −1.960, the test statistic Z does not lie in the acceptance region, so you reject H0. There is evidence to suggest that the mean lifespans of candles made at the two factories are different. As the sample mean of factory X is smaller than the sample mean from factory Y, it also suggests that the mean lifespan of candles produced by factory X is shorter.
  b It is not necessary, as both samples are larger than 30. The Central Limit Theorem can be applied and the unbiased estimates can be used as the population variances.

4 H0: μ1 − μ2 = 0   H1: μ1 − μ2 ≠ 0
  The test statistic Z = (23.4 − 19.8)/√(12²/6 + 17²/6) = 0.4238
  Two-tailed test with p = 0.975; the critical values of z = ±1.960.
  As 0.4238 < 1.960, the test statistic Z lies within the acceptance region, so you accept H0. There is no evidence to suggest that the two random variables have different means.

5 H0: μM − μF = 0   H1: μM − μF > 0
  x̄M = 256/80 = 3.2   sM² = (1/79)(975 − 256²/80) = 1.972
  x̄F = 208/80 = 2.6   sF² = (1/79)(731 − 208²/80) = 2.408
  The test statistic Z = (3.2 − 2.6)/√(1.972/80 + 2.408/80) = 2.564
  One-tailed test to the right with p = 0.95; the critical value of z = 1.645.
  As 2.564 > 1.645, the test statistic Z does not lie in the acceptance region, so you reject H0. There is significant evidence to suggest that the mean number of sales of the new cereal to men is higher than to women on that day.

6 Let the time spent on the internet by families with children be X1 and the time spent by families without children be X2.
  H0: μ1 − μ2 ≥ 60   H1: μ1 − μ2 < 60
  x̄1 = 728   s1² = 11 970.59
  x̄2 = 635   s2² = 49 929.66
  The test statistic Z = ((728 − 635) − 60)/√(11 970.59/35 + 49 929.66/30) = 0.7367
  One-tailed test to the left with p = 0.05; the critical value of z = −1.645.
  As 0.7367 > −1.645, the test statistic Z lies within the acceptance region, so you accept H0. The evidence supports the claim that families with children spend at least sixty more hours on the internet in a year than those without children.

7 Let the time taken by the old system be O. The time taken by the new system is denoted by N.
  H0: μO − μN = 2   H1: μO − μN < 2
  n̄ = 4.3   sN² = 2.56
  The test statistic Z = ((7 − 4.3) − 2)/√(2.5²/45 + 2.56/45) = 1.582
  One-tailed test to the left with p = 0.05; the critical value of z = −1.645.
  As 1.582 > −1.645, the test statistic Z lies in the acceptance region, so accept H0. The hotel manager's claim is supported; there is at least 2 minutes improvement from the new computer system.

8 Let X1 be the size of flowers grown in high nitrogen compost, and X2 be the size of flowers grown in normal compost.
  H0: μ1 − μ2 = 1   H1: μ1 − μ2 < 1
  x̄1 = 285.9/35 = 8.169   s1² = (1/34)(2357.75 − 285.9²/35) = 0.6575
  x̄2 = 244.9/35 = 6.997   s2² = (1/34)(1751.61 − 244.9²/35) = 1.118
  The test statistic Z = ((8.169 − 6.997) − 1)/√(0.6575/35 + 1.118/35) = 0.764
  A one-tailed test to the left with p = 0.10. The critical value from the normal distribution table is −1.282.
  The test statistic, 0.764 > −1.282, lies within the acceptance region, so accept H0. The evidence supports the garden centre's claim that flowers grown in the high nitrogen compost are at least 1 cm bigger.

9 H0: μB − μG = 2   H1: μB − μG < 2
  x̄B = 2474/30 = 82.47   sB² = (1/29)(206 044 − 2474²/30) = 69.71
  x̄G = 2380/30 = 79.33   sG² = (1/29)(191 094 − 2380²/30) = 78.64
  The test statistic Z = ((82.47 − 79.33) − 2)/√(69.71/30 + 78.64/30) = 0.510
  A one-tailed test to the left with p = 0.05. The critical value from the normal distribution table is −1.645.
  The test statistic, 0.510 > −1.645, lies within the acceptance region, so accept H0. The evidence supports the claim that boys scored at least 2 marks more than girls in the mock science exam.

10 a H0: μX − μY = 0   H1: μX − μY ≠ 0
     x̄ = 20.32   ȳ = 19.99
     The test statistic Z = (20.32 − 19.99)/√(0.36²/35 + 0.36²/40) = 3.960
     Two-tailed test with p = 0.95; the critical values of z = ±1.645.
     As 3.960 > 1.645, the test statistic does not lie in the acceptance region, so reject H0. There is significant evidence to suggest that the mean length has changed. After the machine has been serviced, the mean length of flat-pack components has decreased.
   b H0: μX − μY = 0   H1: μX − μY ≠ 0
     ȳ = 19.99   sy² = 0.0784
     The test statistic Z = (20.32 − 19.99)/√(0.36²/35 + 0.0784/40) = 4.39
     Two-tailed test with p = 0.95; the critical values of z = ±1.645.
     As 4.39 > 1.645, the test statistic Z lies outside the acceptance region, so reject H0. There is significant evidence to suggest that the mean length has changed after the service. The second test is more reliable as the sample variance is used.

Exercise 2.2B

1 a sp² = ((8 − 1) × 3.7 + (15 − 1) × 2.9)/(8 + 15 − 2) = 3.167
  b H0: μX − μY = 0   H1: μX − μY ≠ 0
    The test statistic T = (26.1 − 24.8)/√(3.167(1/8 + 1/15)) = 1.669
    Two-tailed test with p = 0.975, ν = 8 + 15 − 2 = 21; the critical values of t = ±2.080.
    As −2.080 < 1.669 < 2.080, the test statistic T lies within the acceptance region, so accept H0. There is not enough evidence to suggest that the two random variables X and Y have different means.

2 a Σx = 62.7   Σx² = 445.61   Σy = 59   Σy² = 393.8
    x̄ = 6.967   sx² = 1.1   ȳ = 6.556   sy² = 0.8778
    sp² = ((9 − 1) × 1.1 + (9 − 1) × 0.8778)/(9 + 9 − 2) = 0.989
    H0: μX − μY = 0   H1: μX − μY > 0
    The test statistic T = (6.967 − 6.556)/√(0.989(1/9 + 1/9)) = 0.877
    One-tailed test with p = 0.95, ν = 9 + 9 − 2 = 16; the critical value of t = 1.746.
    As 0.877 < 1.746, the test statistic T lies within the acceptance region, so you accept H0. There is no evidence to suggest that the sunflowers planted in soil X are larger than those planted in soil Y.
  b Assume that both samples are independent and randomly selected. They have the same population variance.

3 H0: μA − μB = 0   H1: μA − μB > 0
  x̄A = 4.2   sA² = 0.49   x̄B = 3.8   sB² = 1.257
  sp² = ((8 − 1) × 0.49 + (8 − 1) × 1.257)/(8 + 8 − 2) = 0.8735
  The test statistic T = (4.2 − 3.8)/√(0.8735(1/8 + 1/8)) = 0.8560
  One-tailed test with p = 0.95, ν = 8 + 8 − 2 = 14; the critical value of t = 1.761.
  As 0.8560 < 1.761, the test statistic T lies in the acceptance region, so you accept H0. There is no evidence to suggest that trains from manufacturer A have greater fuel efficiency on average than those from manufacturer B.

4 Let X denote the height of young plants receiving the new fertiliser. Y denotes the height of young plants receiving the usual fertiliser.
  a Σx = 45.3   Σx² = 345.11   Σy = 45.2   Σy² = 340.9
    x̄ = 7.55   sx² = 0.619   ȳ = 7.533   sy² = 0.078 67
    sp² = ((6 − 1) × 0.619 + (6 − 1) × 0.078 67)/(6 + 6 − 2) = 0.3488
  b H0: μX − μY = 0   H1: μX − μY > 0
    The test statistic T = (7.55 − 7.533)/√(0.3488(1/6 + 1/6)) = 0.0499
    One-tailed test to the right, with p = 0.75, ν = 10; the critical value of t = 0.700.
    As 0.0499 < 0.700, the test statistic T lies in the acceptance region, so accept H0. There is no evidence to suggest that the new fertiliser gives an increase in growth.

5 The five records are independent between Adam and Bob. The population variances are the same.
  H0: μA − μB = 0   H1: μA − μB ≠ 0
  x̄A = 49.3   sA² = 0.0375   x̄B = 48.7   sB² = 0.0625
  sp² = ((5 − 1) × 0.0375 + (5 − 1) × 0.0625)/(5 + 5 − 2) = 0.05
  The test statistic T = (49.3 − 48.7)/√(0.05(1/5 + 1/5)) = 4.243
  Two-tailed test with p = 0.975, ν = 8; the critical values of t = ±2.306.
  As 4.243 > 2.306, the test statistic T does not lie in the acceptance region, so you reject H0. There is evidence to suggest that the average swimming times of Adam and Bob are different.

6 H0: μd = 0, there is no difference between the population mean leaf widths.
  H1: μd ≠ 0, there is a difference between the population mean leaf widths.
  x̄d = 42/10 = 4.2
  sd² = (1/(10 − 1))(381.96 − 42²/10) = 22.84
  The test statistic T = 4.2/√(22.84/10) = 2.779
  Two-tailed test with p = 0.975, ν = 9; the critical values of t = ±2.262.
  As 2.779 > 2.262, the test statistic does not lie in the acceptance region, so reject H0. There is evidence to suggest that nitrogen affects leaf growth, as the two samples indicate there is a difference in the population mean leaf widths.

7 H0: μd ≥ 1   H1: μd < 1
  pH difference: 0.8, 1.2, 0.6, 1.2, 0.7, 1.8
  x̄d = 1.05   sd² = 0.199
  The test statistic T = (1.05 − 1)/√(0.199/6) = 0.2745
  One-tailed test to the left with p = 0.05, ν = 5; the critical value of t = −2.015.
  As 0.2745 > −2.015, the test statistic T lies within the acceptance region, so you accept H0. The evidence suggests that the chemical reduces pH value by at least 1.

8 The differences are normally distributed.
  H0: μd = 0   H1: μd ≠ 0
  x̄d = −0.4571   sd² = 0.082 19
  The test statistic T = −0.4571/√(0.082 19/7) = −4.218
  Two-tailed test with p = 0.975, ν = 6; the critical values of t = ±2.447.
  As −4.218 < −2.447, the test statistic does not lie in the acceptance region, so you reject H0. There is strong evidence to suggest that the wear on the front and rear tyres is different.

9 The sample is randomly selected; the differences are normally distributed.
  H0: μd = 0   H1: μd > 0
  The test statistic T = 3.5/√(44.1/10) = 1.67
  One-tailed test to the right with p = 0.9, ν = 9; the critical value of t = 1.383.
  As 1.67 > 1.383, the test statistic does not lie in the acceptance region, so you reject H0. There is evidence to suggest that vitamins increase attention span.

10 H0: μd = 0   H1: μd ≠ 0
   x̄d = −2.4   sd² = 50.3
   The test statistic T = −2.4/√(50.3/5) = −0.7567
   Two-tailed test with p = 0.975, ν = 4; the critical values are t = ±2.776.
   As −2.776 < −0.7567 < 2.776, the test statistic lies within the acceptance region, so accept H0. There is not sufficient evidence to suggest that the percentages of bacteria from the two lakes are different.

Exercise 2.3A

1 a A t-distribution with 15 degrees of freedom, critical value = 2.131
  b A t-distribution with 9 degrees of freedom, critical value = 1.833
  c A t-distribution with 22 degrees of freedom, critical value = 2.819
  d A normal distribution, critical value = 2.326

2 a Assume the sample is drawn from a normal distribution.
    x̄ = 5.3   s² = 2.161
    A t-distribution with p = 0.975, ν = 14, critical value = 2.145
    5.3 ± 2.145 × √(2.161/15)
    4.49 minutes ≤ mean time ≤ 6.11 minutes
  b You are 95% confident that the mean time of completing the puzzle is between 4.49 and 6.11 minutes. That means the students can complete the puzzle more quickly than the time that the manufacturer suggested.

3 a (176 − 166) ± 1.645 × √(3.24/40 + 1.87/32)
    9.39 cm ≤ height difference ≤ 10.6 cm
  b (80.3 − 67.6) ± 1.96 × √(2.5/40 + 2.9/32)
    11.9 kg ≤ mass difference ≤ 13.5 kg
  c (28.8 − 27.6) ± 2.326 × √(4.6/40 + 5.9/32)
    −0.0727 ≤ BMI difference ≤ 2.47
  d (63 − 68) ± 2.576 × √(2.37/40 + 5.78/32)
    −6.26 bpm ≤ heart rate difference ≤ −3.74 bpm

4 Assume sugar content is normally distributed.
  x̄ = 36   s = 2.280
  A t-distribution with p = 0.99, ν = 5, critical value = 3.365
  36 ± 3.365 × 2.28/√6
  32.9 g ≤ mean sugar content ≤ 39.1 g

5 a x̄ = 3.13   s = 0.029 15
    A t-distribution with p = 0.975, ν = 4, critical value = 2.776
    3.13 ± 2.776 × 0.029 15/√5
    3.094 ≤ calculated value of π ≤ 3.166
  b The confidence interval calculated is valid because neither the distribution nor the parameters used are approximated.

6 a x̄ = 501.7   s = 7.421
    A t-distribution with p = 0.975, ν = 5, critical value = 2.571
    501.7 ± 2.571 × 7.421/√6
    493.9 g ≤ mean weight ≤ 509.5 g
  b There is a 5% chance that the confidence interval will not contain the population mean.

7 Let d = μM − μW
  a m̄ = 8.6   sm² = 7.575   w̄ = 5.5   sw² = 1.692
    (8.6 − 5.5) ± 1.96 × √(7.575/600 + 1.692/800)
    2.86 ≤ d ≤ 3.34
  b You are 95% confident that men spend more money on breakfast at a café than women do, as both lower and upper limits are above the value of zero.

8 a The pooled estimate sp² = (11 × 5.9² + 14 × 4.1²)/(12 + 15 − 2) = 24.73
  b (63 − 57) ± 1.708 × √(24.73 × (1/12 + 1/15))
    2.71 ≤ difference in mean scores ≤ 9.29

9 a Assume population variances of the times spent watching TV by boys and girls are equal.
  b x̄B = 7.657   sB = 2.219   x̄G = 7.556   sG = 1.014
    The pooled estimate sp² = (6 × 2.219² + 8 × 1.014²)/(7 + 9 − 2) = 2.698
    (7.657 − 7.556) ± 2.145 × √(2.698 × (1/7 + 1/9))
    −1.67 ≤ μB − μG ≤ 1.88
  c You are 95% confident that, on average, the difference between the times spent by boys and girls watching TV each week lies in this interval. As the interval contains the value zero, there is no significant difference between boys and girls.

10 Assume population variances of the study times of first-year students and final-year students are the same.
   Difference = mean first-year study time − mean final-year study time
   The pooled estimate sp² = (4 × 1.3 + 4 × 0.98)/(5 + 5 − 2) = 1.14
   (3 − 2) ± 1.86 × √(1.14 × (1/5 + 1/5))
   −0.256 ≤ difference ≤ 2.26

11 a Assume the pH values are normally distributed.
     x̄ = 6.99   s = 0.4408
     A t-distribution with p = 0.975, ν = 9, critical value = 2.262
     6.99 ± 2.262 × 0.4408/√10
     6.67 ≤ mean pH value ≤ 7.31
   b 0.95 × 60 = 57

12 Let Year 7 level of progress be the random variable X, and Year 8 level of progress be the random variable Y.
   a x̄ = 2.28   sx² = 0.907   ȳ = 2.9   sy² = 0.185
     The pooled estimate sp² = (4 × 0.907 + 4 × 0.185)/(5 + 5 − 2) = 0.546
     (2.28 − 2.9) ± 2.306 × √(0.546 × (1/5 + 1/5))
     −1.70 ≤ μX − μY ≤ 0.458
   b You are 95% confident that the difference in mean levels of progress between Year 7 and Year 8 is between −1.70 and 0.458. As the interval contains the value 0, there is no significant difference.
   c A normal distribution, as populations are used.
     x̄7 = 2.6   s7² = 0.850   x̄8 = 3.1   s8² = 0.128
     (2.6 − 3.1) ± 1.96 × √(0.850/120 + 0.128/87)
     −0.681 ≤ μ7 − μ8 ≤ −0.319
   d The 95% confidence interval found from the population suggests that Year 7 students are making less progress than Year 8, as the upper and lower limits of the interval are both negative. The interval calculated in part a suggests that there is no significant difference between Year 7 and Year 8 level of progress. However, since the upper limit of that interval is only just above 0, it is consistent with the conclusion from part c.

Exam-style questions

1 H0: μ = 4.27   H1: μ < 4.27
  x̄m = 4.023   sm² = 0.025 60
  The test statistic T = (4.023 − 4.27)/√(0.0256/12) = −5.347
  One-tailed test to the left, with p = 0.01 and ν = 11; the critical value of t = −2.718.
  As −5.347 < −2.718, the test statistic T lies outside the acceptance region, so reject H0. There is significant evidence to suggest that the mean is less than 4.27.

2 Use a two-sample t-test, assuming both white and brown bread weights are normally distributed with the same variance.
  Let X be the weight of white bread and Y the weight of brown bread sold by Bakery A.
  H0: μX − μY = 0   H1: μX − μY ≠ 0
  x̄ = 503.5   sx² = 17.1   ȳ = 498   sy² = 30.83
  sp² = (9 × 17.1 + 6 × 30.8)/(10 + 7 − 2) = 22.58
  The test statistic T = (503.5 − 498)/√(22.58(1/10 + 1/7)) = 2.349
  Two-tailed test with p = 0.975, ν = 15; the critical values of t = ±2.131.
  As 2.349 > 2.131, the test statistic T does not lie in the acceptance region, so reject H0. There is evidence to suggest that the population mean weights of Bakery A's white bread and brown bread are different.
  To test whether the mean weight of Bakery A's white bread is higher than the mean weight, 505 g, of Bakery B's bread:
  H0: μX = 505   H1: μX > 505
  x̄ = 503.5   sx² = 17.1
  The test statistic T = (503.5 − 505)/√(17.1/10) = −1.147
  One-tailed test to the right with p = 0.95, ν = 9; the critical value is t = 1.833.
  Since −1.147 < 1.833, the test statistic T lies in the acceptance region, so accept H0; there is no evidence to suggest that Bakery A's claim is justified.

3 a Assume the mobile phone signal strength is normally distributed in city X and city Y.
    x̄ = −113.34   sx² = 0.18   ȳ = −112.61   sy² = 0.052
    (−113.34 − (−112.61)) ± 1.645 × √(0.18/60 + 0.052/50)
    −0.835 ≤ μX − μY ≤ −0.625
  b H0: μX − μY = 0   H1: μX − μY ≠ 0
    The test statistic Z = (−113.34 − (−112.61))/√(0.18/60 + 0.052/50) = −11.49
    Two-tailed test with p = 0.99; the critical values of z = ±2.326.
    As −11.49 < −2.326, the test statistic Z does not lie in the acceptance region, so reject H0. There is significant evidence to suggest that the mean signal strengths in the two cities are different.

4 A paired sample t-test.
  H0: μd = 0   H1: μd ≠ 0
  The differences, course 1 − course 2, are:
  Runner       1     2     3     4     5
  Difference   −3    0.4   −1    −1.5  1.8
  x̄d = −0.66   sd = 1.835
  The test statistic T = −0.66/(1.835/√5) = −0.804
  Two-tailed test with p = 0.95, ν = 4; the critical values of t = ±2.132.
  As −2.132 < −0.804 < 2.132, the test statistic T lies within the acceptance region, so accept H0. There is not sufficient evidence to suggest that the mean running time over the two courses is different.

5 a Population variances are equal.
    x̄m = 7.934   sm = 0.2130   x̄w = 5.795   sw = 0.439
    The pooled estimate sp² = (9 × 0.2130² + 9 × 0.439²)/(10 + 10 − 2) = 0.119
  b (7.934 − 5.795) ± 2.101 × √(0.119 × (1/10 + 1/10))
    1.81 ≤ μm − μw ≤ 2.46
  c H0: μm − μw = 2.20   H1: μm − μw ≠ 2.20
    The test statistic T = ((7.934 − 5.795) − 2.2)/√(0.119(1/10 + 1/10)) = −0.395
    Two-tailed test with p = 0.995, ν = 18; the critical values of t = ±2.878.
    As −2.878 < −0.395 < 2.878, the test statistic lies within the acceptance region, so accept H0. The evidence is consistent with the mean long-jump distance of men being 2.20 m greater than that of women.

6 a Forest A (n = 50):
    Σx = 0 × 8 + 2 × 5 + 7 × 24 + 9 × 7 + 13 × 6 = 319
    Σx² = 0² × 8 + 2² × 5 + 7² × 24 + 9² × 7 + 13² × 6 = 2777
    x̄ = 6.38   sx = 3.891
    Forest B (n = 40):
    Σy = 0 × 4 + 2 × 7 + 7 × 25 + 9 × 0 + 13 × 4 = 241
    Σy² = 0² × 4 + 2² × 7 + 7² × 25 + 9² × 0 + 13² × 4 = 1929
    ȳ = 6.025   sy = 3.497
    (6.38 − 6.025) ± 1.96 × √(3.891²/50 + 3.497²/40)
    −1.17 ≤ μX − μY ≤ 1.88
  b No ladybird with nine spots was found in forest B. However, a few of them were found in forest A. Therefore, forest A is more likely to be located in the north-eastern United States.

7 a Σ(x − x̄)² = 22 − 8²/n
    Σ(y − ȳ)² = 94 − 28.8²/(3n)
    ((22 − 8²/n) + (94 − 28.8²/(3n)))/(n + 3n − 2) = 94/375
    116 − 1021.44/(3n) = 94(4n − 2)/375
    1128n² − 131 064n + 383 040 = 0
    n = 3 or n = 113.2 (reject)
  b x̄ = 2.67   ȳ = 3.2
    Σ(x − x̄)² = 0.67   Σ(y − ȳ)² = 1.84
    A t-distribution with p = 0.95, ν = 10, critical value = 1.812
    (2.67 − 3.2) ± 1.812 × √((94/375) × (1/3 + 1/9))
    −1.13 ≤ μX − μY ≤ 0.0748

8 a H0: μr = 36.5   H1: μr ≠ 36.5
    x̄r = 34.8   sr² = 15.46
    The test statistic T = (34.8 − 36.5)/√(15.46/15) = −1.675
    Two-tailed test with p = 0.975, ν = 14; critical values = ±2.145.
    As −2.145 < −1.675 < 2.145, the test statistic T lies within the acceptance region, so accept H0. There is no evidence to suggest that the mean is not 36.5.
  b A t-distribution with p = 0.975, ν = 14, critical value = 2.145
    34.8 ± 2.145 × √(15.46/15)
    32.6 ≤ μr ≤ 37.0

9 a A paired sample t-test.
    H0: μd = 0   H1: μd > 0
    The differences, before − after, are:
    Staff        A   B   C   D    E   F    G    H
    Difference   4   1   4   −2   0   10   18   −7
    x̄d = 3.5   sd = 7.672
    The test statistic T = 3.5/(7.672/√8) = 1.29
    One-tailed test with p = 0.90, ν = 7; critical value t = 1.415.
    As 1.29 < 1.415, the test statistic T lies within the acceptance region, so accept H0. There is not sufficient evidence to suggest that the holiday kids' club reduced the absence rate.
  b The mean number of absent hours is reduced by 3.5 hours across the eight staff. However, at the 10% significance level, the test statistic is not significant enough to suggest that the number of hours of absence is reduced. As it costs the agency money to run a kids' club, it is recommended not to do it in the coming year.

10 a H0: μ = 15   H1: μ > 15
     x̄ = 15.95   s = 1.629
     The test statistic T = (15.95 − 15)/(1.629/√6) = 1.428
     One-tailed test to the right with p = 0.95, ν = 5; critical value t = 2.015.
     As 1.428 < 2.015, the test statistic lies in the acceptance region, so accept H0. There is no evidence to suggest that the mean tail length of new-born mice is greater than 15 mm.
   b 15.95 ± 2.571 × 1.629/√6
     14.2 mm ≤ μ ≤ 17.7 mm

11 a H0: μA − μB = 0   H1: μA − μB < 0
     t̄A = 5.7   sA² = 0.7838   t̄B = 5.9   sB² = 0.2669
     The test statistic Z = (5.7 − 5.9)/√(0.7838/30 + 0.2669/40) = −1.104
     One-tailed test to the left with p = 0.05; the critical value of z = −1.645.
     As −1.104 > −1.645, the test statistic Z lies within the acceptance region, so accept H0. There is no evidence to suggest that college A students took less time than college B students.
   b Calculate the value of Z = ((5.9 − 5.7) − 0.1)/√(0.7838/30 + 0.2669/40) = 0.5522, which gives probability 0.7095.
     1 − 0.7095 = 0.2905, therefore β > 29.05%

12 a H0: μ = 28   H1: μ < 28
     x̄ = 27.44   s = 1.502
     The test statistic T = (27.44 − 28)/(1.502/√6) = −0.9133
     One-tailed test to the left with p = 0.05, ν = 5; critical value t = −2.015.
     As −0.9133 > −2.015, the test statistic lies in the acceptance region, so accept H0. There is no evidence to suggest that the mean completion time for the swimmers is less than 28 seconds.
   b p = 0.975, ν = 9; the critical value of t = 2.262
     m − 2.262 × s/√10 = 26.07
     m + 2.262 × s/√10 = 28.17
     m = 27.12   s = 1.468

13 Assume the numbers of people coming to the gym on different days are independent. Population variances before and after extended opening hours are the same.
   Let X1 be the number of people using the gym each day when the gym is open for 12 hours, and X2 be the number of people when the gym is open for 18 hours.
   H0: μ1 − μ2 = 0   H1: μ1 − μ2 < 0
   x̄1 = 407/10 = 40.7   s1² = (1/9)(18 125 − 407²/10) = 173.34
   x̄2 = 511/10 = 51.1   s2² = (1/9)(28 109 − 511²/10) = 221.88
   sp² = (9 × 173.34 + 9 × 221.88)/18 = 197.61
   The test statistic T = (40.7 − 51.1)/√(197.61(1/10 + 1/10)) = −1.65
   A one-tailed test to the left with p = 0.05, ν = 18. The critical value is −1.734.
   The test statistic, −1.65 > −1.734, lies within the acceptance region, so accept H0. There is not sufficient evidence to suggest that more people use the gym when the gym is open for 18 hours.

14 a Let X1 be the time needed to solve the puzzle without any training and X2 be the time needed with training.
     H0: μ1 − μ2 = 0   H1: μ1 − μ2 > 0
     x̄1 = 19/5 = 3.8   s1² = (1/4)(75.34 − 19²/5) = 0.785
     x̄2 = 17.4/5 = 3.48   s2² = 0.112
     sp² = (4 × 0.785 + 4 × 0.112)/8 = 0.4485
     The test statistic T = (3.8 − 3.48)/√(0.4485(1/5 + 1/5)) = 0.7555
   c sp² = (4 × 0.785 + 4 × 0.112)/8 = 0.4485
     For a 90% confidence interval, p = 0.95 and ν = 8. The critical values are ±1.860.

15 Assume the sales made on different days are independent.
   Let X be the random variable of new menu sales and Y be the random variable of old menu sales.
   H0: μX − μY = 0
   A one-tailed test to the right with p = 0.90. The critical value is 1.282.
   The test statistic, 2.070 > 1.282, lies outside the acceptance region, so reject H0. There is significant evidence to show that the new menu increases sales.

16 a A suitable test would be a two-sample t-test. Assume that the height of each group of children is independent. Population variances are the same.
   b Let X1 be the height of the boys and X2 be the height of the girls.
     H0: μ1 − μ2 = 2   H1: μ1 − μ2 < 2
     The test statistic T = ((112.8 − 111) − 2)/√(6.35(1/5 + 1/5)) = −0.1255

17 a Assume the weight of each pack of potatoes is independent and normally distributed.
Population variances are the same. Let X1 be the weight of King Edward potatoes, and X2 be the weight of salad potatoes. H0 : m1 – m2 = 0 H1 : m1 – m2 ≠ 0 H1 : mx – my > 0 311 X2 = = 10.37 30 8 –1.16 cm  μ1 – μ2  4.76 cm – 0.468 minutes  μ1 – μ2  1.11 minutes 343.8 = 11.46 30 ( 4 )(5.2 ) × ( 4 )( 7.5) = 6.35 (112.8 − 111) ± 1.860 × 6.35 15 + 15 ( 3.8 − 3.48) ± 1.860 × 0.4485 15 + 15 X= 2  s 2 2 = 1  61 635 − 555  = 7.5 5  4 The test statistic, –0.1255 < –1.860, lies within the acceptance region, so accept H0. There is sufficient evidence to suggest that the boys are taller than the girls by at least 2 cm. 8 The confidence interval for the difference in means: 15 = 0.002139 A one-tailed test to the left with p = 0.05, v = 8. The critical value is –1.860. 19 17.4 = 3.8 x2 = = 3.48 5 5 ( ) 14 The test statistic T = b Assume the times spent completing the puzzle are independent. Population variances with and without training are the same. From part a, x1 = (7 ) (0.003 343) × (7 ) (0.000 9357 ) X 2 = 555 = 111 5 A one-tailed test to the right with p = 0.95, v = 8. The critical value is 1.860. The test statistic, 0.7555 < 1.860, lies within the acceptance region, so accept H0. There is not sufficient evidence to suggest that the training improves the time taken to solve the puzzle. 
c 2  s12 = 1  63 640 − 564  = 5.2 4 5  X 1 = 564 = 112.8 5 1 17.42  =  61 − = 0.112 4 5  sx2 = sy2 The test statistic Z = 2 1 343.8 = 4.451 4069.04 − 29  30  1 3112  =  3336.29 − = 3.871 29  30  11.46 − 10.37 = 2.070 4.451 + 3.871 30 30 x1 = 8.4 = 1.05 8 2  s12 = 1  8.843 4 − 8.4  = 0.003 343 8  7 2  x 2 = 8.18 = 1.0225 s 2 2 = 1  8.3706 − 8.18  = 0.0009357 8  7 8 The test statistic 1.05 − 1.0225 = 1.189 0.002139 1 + 1 8 8 T = ( ) 31 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 2 Inference using normal and t-distributions b Let X be the length of a piece before and Y the length after the machine is serviced. A two-tailed test with p = 0.05 and p = 0.95. v = 14. The critical values are ± 1.761. H0 : µ X − µY = 0 The test statistic –1.761 < 1.189 < 1.761 lies within the acceptance region, so accept H0. There is not sufficient evidence to suggest that the weights of the two types of potatoes are different. H1 : µ X − µY ≠ 0 ∑ x = 6.04 ∑ x 2 = 6.0864 b For a 95% confidence interval, p = 0.975 and 6.04 1 6.04 2  x= = 1.006667 s x2 =  6.0864 − = 0.0012267 6 5 6  v = 14. The critical values are ± 2.145. 2 6.04 1 6.04  2 = 0.0012267 x= = 1.006667 1 + 1s x = 5  6.0864 − 6 6  . − . . . 1 05 1 0225 ± 2 145 × 0 002139 ( ) 8 8 ∑ y = 6.01 ∑ y 2 = 6.0223 ( ) –0.0221 kg  μ1 – μ2  0.0771 kg 18 x= 6.01 1 6.012  = 1.001667 s x2 =  6.0223 − = 0.0004567 6 5 6  a A paired sample t-test. Assume that the 6.01 1 6.012  difference betweenx the of pain 1.001667 = 0.0004567 = two=types s x2 relief =  6.0223 − 6 5 6  tablets is normally distributed. 
5 × 0.0012267 + 5 × 0.0004567 s p2 = = 0.00084167 b Let d = Tablet A time – Tablet B time H0 : md = 0 6+6−2 H1 : md < 0 6.04 − 6.01 = 1.791 0.00084167 16 + 16 Test statistic T = ( Differences: 4 1.5 0 –1.5 1.5 –2.5 xd = 3 = 0.5 6 1 32  sd 2 =  29 −  = 5.5 5 6 Two-tailed test with p = 0.025, 0.975 and v = 10 Critical values are t = ±2.228 0.5 − 0 = 0.5222 The test statistic T = 5.5 6 Since −2.228 < 1.791 < 2.228, the test statistic lies within the acceptance region, so accept H0. There is no significant difference between the mean lengths before and after the machine’s service. A one-tailed test to the left with p = 0.95. v = 5. The critical value is –2.015. The test statistic, 0.5225 > –0.2015, lies within the acceptance region, so accept H0. There is not sufficient evidence to suggest that Tablet A is more efficient than Tablet B. c 19 20 a L et X1 be the 11-year-olds’ progress and X2 be the 12-year-olds’ progress. x1 = 0.84 s12 = 0.148 Tablet A, x A = 52.5 = 8.75 6 s A = 2.44 x2 = 1.16 s22 = 1.528 Tablet B, x B = 49.5 = 8.25 6 s B = 0.524 sp2 = The test statistic T in part b can be used to show that Tablet B is more efficient than Tablet A at the 5% significance level. Also, tablet B has a smaller sample mean and standard deviation. Tablet B would be recommended. aA two-sample t-test. Assume that the samples of length measurements before and after the machine’s service are independent, that each is taken from a normally distributed population, and that the two populations have the same variance. ) ( 4 )(0.148) × ( 4 )(1.528) = 0.838 8 The confidence interval: ( ) (0.84 − 1.16 ) ± 2.306 × 0.838 15 + 15 –1.655  μ1 – μ2  1.015 b The confidence interval contains the value 0, so there is no significant difference between the two groups of children. 
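The pooled confidence interval in question 20 can be checked numerically. The sketch below is a minimal illustration, not the authors' method: the function name `pooled_interval` is hypothetical, both samples are assumed to have size 5 (consistent with v = 8), and the critical value 2.306 is taken from the solution rather than computed. Note that the pooled variance is a weighted sum of the two sample variances, so the terms in the numerator are added.

```python
import math

def pooled_interval(mean1, var1, n1, mean2, var2, n2, t_crit):
    """Confidence interval for mu1 - mu2, assuming equal population variances."""
    # Pooled variance: weighted sum of the two unbiased sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    half_width = t_crit * math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return pooled_var, diff - half_width, diff + half_width

# Question 20a values; n = 5 per group is an assumption, t = 2.306 is the tabulated value for v = 8.
pooled_var, lower, upper = pooled_interval(0.84, 0.148, 5, 1.16, 1.528, 5, 2.306)
print(round(pooled_var, 3), round(lower, 3), round(upper, 3))
```

The interval contains 0, which is the basis for the conclusion in part b.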
Mathematics in life and work
1 A two-sample t-test. Let H denote the goals scored in home matches and A denote the goals scored in away matches.
H0: $\mu_H - \mu_A = 0$
H1: $\mu_H - \mu_A > 0$
$\bar{x}_H = 2.33$, $s_H^2 = 3.87$, $\bar{x}_A = 1.83$, $s_A^2 = 1.37$
$s_p^2 = \frac{5 \times 3.87 + 5 \times 1.37}{6 + 6 - 2} = 2.62$
The test statistic $T = \frac{2.33 - 1.83}{\sqrt{2.62\left(\frac{1}{6} + \frac{1}{6}\right)}} = 0.535$
One-tailed test to the right, p = 0.95, v = 10; the critical value of t = 1.812.
0.535 < 1.812. The test statistic lies in the acceptance region, so accept H0. There is not enough evidence to show that Liverpool plays better at a home match than at an away match.
2 For home matches, $\sum x = 14 + 19 = 33$, $\bar{x}_H = 2.06$
Pooled variance estimate of home matches $= \frac{5 \times 1.97 + \left(49 - \frac{19^2}{10}\right)}{6 + 10 - 2} = 1.625$
For away matches, $\sum x = 11 + 11 = 22$, $\bar{x}_A = 1.375$
Pooled variance estimate of away matches $= \frac{5 \times 1.17 + \left(21 - \frac{11^2}{10}\right)}{6 + 10 - 2} = 1.054$
Pooled estimate of home and away matches: $s_p^2 = \frac{15 \times 1.625 + 15 \times 1.054}{16 + 16 - 2} = 1.3395$
95% confidence interval: p = 0.975, v = 30, t = 2.042
$(2.06 - 1.375) \pm 2.042 \times \sqrt{1.3395 \times \left(\frac{1}{16} + \frac{1}{16}\right)}$
$-0.15 \le \mu_H - \mu_A \le 1.52$
From part b, the confidence interval contains 0, so at the 5% significance level there is no evidence of a difference between how Liverpool play at home and away; both parts suggest that there is not enough statistical evidence to support the claim that Liverpool plays better at home.

3 χ2-tests
Please note: Full worked solutions are provided as an aid to learning, and represent one approach to answering the question. In some cases, alternative methods are shown for contrast. All sample answers have been written by the authors.
Cambridge Assessment International Education bears no responsibility for the example answers to questions taken from its past question papers, which are contained in this publication. Non-exact numerical answers should be given correct to 3 significant figures, or 1 decimal place for angles in degrees, unless a different level of accuracy is specified in the question.

Prerequisite knowledge
1 a The total frequency is 25 + 31 + ⋅⋅⋅ + 1 = 100
$\bar{x} = \frac{1}{100}(1 \times 25 + 2 \times 31 + \cdots + 8 \times 1) = 2.5$
$s^2 = \frac{1}{99}\left\{\left(1^2 \times 25 + 2^2 \times 31 + \cdots + 8^2 \times 1\right) - 100 \times 2.5^2\right\} = \frac{189}{99} = 1.91$
b The total frequency is 13 + 25 + 32 + 5 = 75
$\bar{x} = \frac{1}{75}(10 \times 13 + 30 \times 25 + 50 \times 32 + 70 \times 5) = \frac{566}{15} = 37.7$ (3 s.f.)
$s^2 = \frac{1}{74}\left\{\left(10^2 \times 13 + 30^2 \times 25 + 50^2 \times 32 + 70^2 \times 5\right) - 75 \times \left(\frac{566}{15}\right)^2\right\} = 291$
2 a i $P(X = 2) = \binom{10}{2} 0.3^2 (1 - 0.3)^{10 - 2} = 0.233$
ii E(X) = 10 × 0.3 = 3
b i $P(150 < Y \le 200) = \Phi\left(\frac{200 - 260}{50}\right) - \Phi\left(\frac{150 - 260}{50}\right) = \Phi(-1.2) - \Phi(-2.2) = (1 - \Phi(1.2)) - (1 - \Phi(2.2)) = (1 - 0.8849) - (1 - 0.9861) = 0.101$
ii Var(Y) = $50^2$ = 2500
3 Performing a paired t-test, assume the differences between sample A and sample B are normally distributed. Take differences to be sample A minus sample B.
Differences: 2, −7, 6, 5, 3, 7, 4, 0
$\bar{d} = \frac{1}{8}(2 - 7 + \cdots + 0) = 2.5$
$s_d^2 = \frac{1}{7}\left(\left(2^2 + (-7)^2 + \cdots + 0^2\right) - 8 \times 2.5^2\right) = \frac{138}{7} = 19.71$
H0: $\mu_d = 0$
H1: $\mu_d \ne 0$
5% significance level, two-tailed test.
The test statistic $T = \frac{2.5 - 0}{\sqrt{\frac{19.71}{8}}} = 1.593$. Critical value: $t_7(2.5\%) = 2.365$.
As 1.593 < 2.365 there is no reason to doubt H0. There is insufficient evidence at the 5% significance level to suggest the means of sample A and sample B are different.

Exercise 3.2A
1 H0: Distribution of flights is as Yusuf claims.
H1: Distribution of flights is not as Yusuf claims.
5% significance level, degrees of freedom: 4 − 1 = 3, critical value 7.815

| | On time | Under 30 mins | Over 30 mins | Cancelled |
|---|---|---|---|---|
| Probability | 0.5 | 0.2 | 0.2 | 0.1 |
| Observed | 35 | 10 | 3 | 2 |
| Expected | 25 | 10 | 10 | 5 |
| $(O_k - E_k)^2 / E_k$ | 4 | 0 | 4.9 | 1.8 |

$X^2 = 10.7 > 7.815$. Therefore reject H0; the distribution of flight departure times is not as Yusuf claims.
2 H0: Number rolled on the dice can be modelled by a uniform distribution.
H1: Number rolled on the dice cannot be modelled by a uniform distribution.
10% significance level, degrees of freedom: 6 − 1 = 5, critical value 9.236

| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Probability | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
| Observed | 29 | 44 | 38 | 34 | 48 | 47 |
| Expected | 40 | 40 | 40 | 40 | 40 | 40 |
| $(O_k - E_k)^2 / E_k$ | 3.025 | 0.4 | 0.1 | 0.9 | 1.6 | 1.225 |

$X^2 = 7.25 < 9.236$. Therefore, no reason to doubt H0, the dice are fair.
3 a H0: The distribution of ‘shiny’ stickers in packs can be modelled by B(8, 0.2).
H1: The distribution of ‘shiny’ stickers in packs cannot be modelled by B(8, 0.2).
5% significance level.

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Probability | 0.1678 | 0.3355 | 0.2936 | 0.1468 | 0.04588 | 0.009175 | 0.001147 | $8.192 \times 10^{-5}$ | $2.560 \times 10^{-6}$ |
| Observed | 32 | 43 | 40 | 21 | 10 | 3 | 1 | 0 | 0 |
| Expected | 25.17 | 50.33 | 44.04 | 22.02 | 6.881 | 1.376 | 0.1720 | 0.01229 | 0.000384 |

Combining columns 4 to 8:

| | 0 | 1 | 2 | 3 | 4 or more |
|---|---|---|---|---|---|
| Observed | 32 | 43 | 40 | 21 | 14 |
| Expected | 25.17 | 50.33 | 44.04 | 22.02 | 8.442 |
| $(O_k - E_k)^2 / E_k$ | 1.856 | 1.068 | 0.3706 | 0.04726 | 3.659 |

Degrees of freedom 5 − 1 = 4, critical value 9.488
$X^2 = 7.001 < 9.488$. Therefore no reason to doubt H0, the distribution of shiny stickers in packs can be modelled by B(8, 0.2).
b For the test statistic to be approximately described by the χ2-distribution, expected values must be greater than 5.
4 a A geometric distribution would be suitable if the probability of success is fixed and if successes occur independently.
b Expected frequencies, under Geo(0.4)

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more |
|---|---|---|---|---|---|---|---|
| Probability | 0.4 | 0.24 | 0.144 | 0.0864 | 0.05184 | 0.03110 | 0.04666 |
| Expected | 24 | 14.4 | 8.64 | 5.184 | 3.110 | 1.866 | 2.799 |

c H0: First sale of the day can be modelled by Geo(0.4).
H1: First sale of the day cannot be modelled by Geo(0.4).
2.5% significance level. Combine the classes 5, 6 and 7 or more.

| | 1 | 2 | 3 | 4 | 5 or more |
|---|---|---|---|---|---|
| Observed | 10 | 20 | 16 | 7 | 7 |
| Expected | 24 | 14.4 | 8.64 | 5.184 | 7.776 |
| $(O_k - E_k)^2 / E_k$ | 8.167 | 2.178 | 6.270 | 0.6362 | 0.07744 |

Degrees of freedom 5 − 1 = 4, critical value 11.14
$X^2 = 17.33 > 11.14$. Therefore reject H0; the distribution for the first sale of the day cannot be modelled by Geo(0.4).
5 a H0: Defective parts can be modelled by Po(2.5).
H1: Defective parts cannot be modelled by Po(2.5).
1% significance level

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more |
|---|---|---|---|---|---|---|---|
| Prob | 0.08208 | 0.2052 | 0.2565 | 0.2138 | 0.1336 | 0.06680 | 0.04202 |
| Obs | 28 | 49 | 50 | 44 | 16 | 8 | 5 |
| Exp | 16.42 | 41.04 | 51.30 | 42.75 | 26.72 | 13.36 | 8.404 |
| $(O_k - E_k)^2 / E_k$ | 8.172 | 1.543 | 0.03310 | 0.03640 | 4.301 | 2.151 | 1.379 |

Degrees of freedom 7 − 1 = 6, critical value 16.81
$X^2 = 17.62 > 16.81$. Therefore reject H0; defective parts cannot be modelled by Po(2.5).
b Defects should occur singly and randomly.
6 H0: Sarah’s cat’s food preference can be modelled by a uniform distribution.
H1: Sarah’s cat’s food preference cannot be modelled by a uniform distribution.
5% significance level, degrees of freedom: 5 − 1 = 4, critical value 9.488

| | Turkey | Fish | Chicken | Lamb | Beef |
|---|---|---|---|---|---|
| Prob | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| Obs | 4 | 7 | 3 | 6 | 10 |
| Exp | 6 | 6 | 6 | 6 | 6 |
| $(O_k - E_k)^2 / E_k$ | 2/3 | 1/6 | 3/2 | 0 | 8/3 |

$X^2 = 5 < 9.488$. Therefore, no reason to doubt H0, the cat does not have a preference.
7 H0: The number of goals scored in a penalty shootout can be modelled by B(5, 0.7).
H1: The number of goals scored in a penalty shootout cannot be modelled by B(5, 0.7).
5% significance level

| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Prob | 0.00243 | 0.02835 | 0.1323 | 0.3087 | 0.3602 | 0.1681 |
| Exp | 0.243 | 2.835 | 13.23 | 30.87 | 36.02 | 16.81 |

Combine 0, 1 and 2:

| | 2 or fewer | 3 | 4 | 5 |
|---|---|---|---|---|
| Obs | 22 | 40 | 27 | 11 |
| Exp | 16.31 | 30.87 | 36.02 | 16.81 |
| $(O_k - E_k)^2 / E_k$ | 1.987 | 2.700 | 2.257 | 2.006 |

Degrees of freedom 4 − 1 = 3, critical value 7.815
$X^2 = 8.950 > 7.815$. Therefore reject H0; goals scored in a penalty shootout cannot be modelled by B(5, 0.7).
8 a The sum of the probabilities is 1.
$k\left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}\right) = 1$, therefore $\frac{25}{12}k = 1$, so $k = \frac{12}{25}$.
b Expected frequencies for sample of size 50.

| r | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| P(X = r) | 0.48 | 0.24 | 0.16 | 0.12 |
| $E_r$ | 24 | 12 | 8 | 6 |

c H0: The data can be modelled by the random variable X.
H1: The data cannot be modelled by the random variable X.
10% significance level, degrees of freedom 4 − 1 = 3, critical value 6.251
$X^2 = \frac{3}{2} + \frac{1}{12} + \frac{1}{2} + \frac{3}{2} = \frac{43}{12}$
$X^2 = 3.583 < 6.251$. Therefore, no reason to doubt H0, the random variable X is a good model for the data.
9 a $P(X < 150) = \Phi\left(\frac{150 - 260}{50}\right) = \Phi(-2.2) = 1 - 0.9861 = 0.0139$
b $P(150 \le X < 200) = \Phi\left(\frac{200 - 260}{50}\right) - 0.0139 = \Phi(-1.2) - 0.0139 = 0.1012$
Alternatively, 0.1012 can be obtained using the sum of probabilities.
c H0: The finishing times can be modelled by $N(260, 50^2)$.
H1: The finishing times cannot be modelled by $N(260, 50^2)$.

| Time | under 150 | 150–200 | 200–250 | 250–300 | over 300 |
|---|---|---|---|---|---|
| Prob | 0.0139 | 0.1012 | 0.3057 | 0.3674 | 0.2119 |
| Obs | 20 | 83 | 373 | 476 | 298 |
| Exp | 17.38 | 126.5 | 382.1 | 459.3 | 264.82 |
| $(O_k - E_k)^2 / E_k$ | 0.3952 | 14.93 | 0.2162 | 0.6105 | 4.158 |

5% significance level, degrees of freedom 5 − 1 = 4, critical value 9.488
$X^2 = 20.31 > 9.488$. Therefore reject H0; there is evidence that the finishing times cannot be modelled by $N(260, 50^2)$.

Exercise 3.2B
1 a H0: The number of buses arriving before Nury’s can be modelled by a Poisson distribution.
H1: The number of buses arriving before Nury’s cannot be modelled by a Poisson distribution.
b Sample mean: $\bar{x} = \frac{0 \times 4 + 1 \times 13 + 2 \times 10 + 3 \times 3}{30} = 1.4$, hence λ = 1.4
c Expected frequencies under Po(1.4)

| | 0 | 1 | 2 | 3 | 4 or more |
|---|---|---|---|---|---|
| Probability | 0.2466 | 0.3452 | 0.2417 | 0.1128 | 0.05373 |
| Expected | 7.398 | 10.36 | 7.250 | 3.383 | 1.612 |

d Groups 2, 3 and 4 must be combined in order to get expected frequencies greater than 5.
e Three (combined) groups, two constraints, degrees of freedom 3 − 2 = 1.
f 5% significance level, critical value 3.841

| | 0 | 1 | 2 or more |
|---|---|---|---|
| Obs | 4 | 13 | 13 |
| Exp | 7.398 | 10.36 | 12.25 |
| $(O_k - E_k)^2 / E_k$ | 1.561 | 0.6744 | 0.04655 |

$X^2 = 2.282 < 3.841$. Therefore, no reason to doubt H0, the number of buses arriving before Nury’s bus can be modelled by a Poisson distribution.
2 a Sample mean: $\bar{x} = \frac{1 \times 51 + 2 \times 46 + 3 \times 29 + \cdots + 8 \times 1}{150} = \frac{7}{3}$
b $p = 1 \div \frac{7}{3} = \frac{3}{7}$
c Probabilities

| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or more |
|---|---|---|---|---|---|---|---|---|
| 0.4286 | 0.2449 | 0.1399 | 0.07997 | 0.04569 | 0.02611 | 0.01492 | 0.008526 | 0.01137 |

d Expected frequencies

| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or more |
|---|---|---|---|---|---|---|---|---|
| 64.29 | 36.73 | 20.99 | 12.00 | 6.854 | 3.917 | 2.238 | 1.279 | 1.705 |

e Combine groups 6, 7, 8 and 9 or more. Degrees of freedom: 6 combined groups minus 2 constraints, so 4.
f H0: The number of darts taken to hit a double can be modelled by a geometric distribution.
H1: The number of darts taken to hit a double cannot be modelled by a geometric distribution.
5% significance level, critical value 9.488
$X^2 = 2.746 + 2.337 + 3.056 + 0.08254 + 0.1065 + 0.501 = 8.83$
$X^2 = 8.83 < 9.488$. Therefore accept H0, the number of darts required to hit a double can be modelled by a geometric distribution.
g The geometric distribution requires the probability of success to remain constant. As the dart player throws more darts, they are probably more likely to hit a double through practice. Therefore, it is unlikely this condition would be met.
3 a Sample mean: $\bar{x} = \frac{0 \times 49 + 1 \times 64 + 2 \times 34 + 3 \times 3}{150} = 0.94$, hence λ = 0.94
b H0: The visits by patients can be modelled by a Poisson distribution.
H1: The visits by patients cannot be modelled by a Poisson distribution.
c Expected frequencies

| | 0 | 1 | 2 | 3 | 4 or more |
|---|---|---|---|---|---|
| Prob | 0.3906 | 0.3672 | 0.1726 | 0.05407 | 0.01553 |
| Exp | 58.59 | 55.08 | 25.89 | 8.111 | 2.329 |

d Combine groups 3 and 4 or more.
1% significance level, degrees of freedom 4 − 2 = 2, critical value 9.210
$X^2 = 1.571 + 1.445 + 2.543 + 5.302 = 10.86$
$X^2 = 10.86 > 9.210$. Therefore reject H0; the visits by patients cannot be modelled by a Poisson distribution.
As a generic Poisson distribution is not a good fit, it is not surprising that in the example the null hypothesis is rejected as well.
4 H0: The number of goals scored in a penalty shootout can be modelled by a binomial distribution.
H1: The number of goals scored in a penalty shootout cannot be modelled by a binomial distribution.
Sample mean: $\bar{x} = \frac{0 \times 3 + 1 \times 6 + \cdots + 5 \times 11}{100} = 3.15$, so an estimate for p is 3.15 ÷ 5 = 0.63

| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Prob | 0.006934 | 0.05904 | 0.2010 | 0.3423 | 0.2914 | 0.09924 |
| Exp | 0.6934 | 5.904 | 20.10 | 34.23 | 29.14 | 9.924 |

Combine 0 and 1:

| | 1 or fewer | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Obs | 9 | 13 | 40 | 27 | 11 |
| Exp | 6.597 | 20.10 | 34.23 | 29.14 | 9.924 |
| $(O_k - E_k)^2 / E_k$ | 0.8753 | 2.510 | 0.9721 | 0.1576 | 0.1166 |

5% significance level, degrees of freedom 5 − 2 = 3, critical value 7.815
$X^2 = 4.632 < 7.815$. Therefore, no reason to doubt H0, goals scored in a penalty shootout can be modelled by a binomial distribution.
This reverses the result from the previous question; a 70% chance of scoring was too high, with this sample suggesting a 63% chance would be more appropriate.
5 a $c + (c + d) + (c + 2d) + (c + 3d) + (c + 4d) = 1$
$5c + 10d = 1$
$c + 2d = 0.2$
b $E(X) = 0 \times c + 1 \times (c + d) + \cdots + 4 \times (c + 4d) = 10c + 30d$
c $\bar{x} = \frac{0 \times 12 + 1 \times 20 + \cdots + 4 \times 3}{60} = 1.5$
d $c + 2d = 0.2$ and $10c + 30d = 1.5$ ⇒ c = 0.3, d = −0.05
e H0: The data can be modelled by the random variable X.
H1: The data cannot be modelled by the random variable X.
5% significance level, degrees of freedom 5 − 2 = 3, critical value 7.815

| r | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| P(X = r) | 0.3 | 0.25 | 0.2 | 0.15 | 0.1 |
| $O_r$ | 12 | 20 | 17 | 8 | 3 |
| $E_r$ | 18 | 15 | 12 | 9 | 6 |
| $(O_r - E_r)^2 / E_r$ | 2 | 5/3 | 25/12 | 1/9 | 3/2 |

$X^2 = 7.361 < 7.815$. Therefore, no reason to doubt H0, the random variable X is a good fit for the data.
6 a x = 38.68 (by symmetry), y = 160 − 2 × (0.9935 + 9.696 + 38.68) = 61.26
Alternative method: $y = 160 \times \left[\Phi\left(\frac{50 - 45}{10}\right) - \Phi\left(\frac{40 - 45}{10}\right)\right] = 160 \times (2\Phi(0.5) - 1) = 61.28$. Actual value using spreadsheet, y = 61.27.
(You get slightly different answers, due to the fact that normal tables round probability values to 4 d.p.)
b H0: Plant growth can be modelled by $N(45, 10^2)$.
H1: Plant growth cannot be modelled by $N(45, 10^2)$.
5% significance level, combine ‘20 or less’ and ‘20–30’, and combine ‘more than 70’ and ‘60–70’; degrees of freedom 5 − 1 = 4, critical value 9.488
$X^2 = 0.009040 + 2.421 + 0.08396 + 0.002701 + 11.97 = 14.49$
$X^2 = 14.49 > 9.488$. Therefore reject H0; plant growth cannot be modelled by $N(45, 10^2)$.
c $\bar{x} = \frac{25 \times 11 + 35 \times 29 + 45 \times 59 + 55 \times 39 + 65 \times 22}{160} = 47$ mm
d z = 160 − (0.5547 + 6.576 + … + 1.716) = 13.77
Alternative method: $z = 160 \times \left[\Phi\left(\frac{70 - 47}{10}\right) - \Phi\left(\frac{60 - 47}{10}\right)\right] = 160 \times (\Phi(2.3) - \Phi(1.3)) = 13.78$
e H0: Plant growth can be modelled by $N(\mu, 10^2)$.
H1: Plant growth cannot be modelled by $N(\mu, 10^2)$.
5% significance level, combine ‘20 or less’ and ‘20–30’, and combine ‘more than 70’ and ‘60–70’; degrees of freedom 5 − 2 = 3, critical value 7.815
$X^2 = 2.102 + 0.2108 + 0.02199 + 0.9677 + 2.738 = 6.0404$
$X^2 = 6.0404 < 7.815$. Therefore, no reason to doubt H0, plant growth can be modelled by $N(\mu, 10^2)$ with μ estimated to be 47.
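The test statistic in question 6e can be reproduced with a short script. This is a sketch under stated assumptions rather than the authors' working: the helper names `phi` and `chi_squared_normal` are hypothetical, the observed frequencies 11, 29, 59, 39, 22 are read from the sample-mean calculation in part c and assumed to be the combined class counts, and the critical value 7.815 is copied from the solution.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def chi_squared_normal(observed, boundaries, mu, sigma):
    """X^2 statistic for classes (-inf, b1], (b1, b2], ..., (bk, inf)."""
    n = sum(observed)
    # Cumulative probabilities at each class boundary, padded with 0 and 1.
    cdf = [0.0] + [phi((b - mu) / sigma) for b in boundaries] + [1.0]
    expected = [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Combined classes: 30 or less, 30-40, 40-50, 50-60, more than 60 (assumed from part c).
observed = [11, 29, 59, 39, 22]
boundaries = [30, 40, 50, 60]
x2 = chi_squared_normal(observed, boundaries, mu=47, sigma=10)
critical_value = 7.815  # chi-squared table value, 3 degrees of freedom, 5% level
print(round(x2, 2), x2 < critical_value)
```

The result agrees with the tabulated working to about two decimal places; small differences come from the 4 d.p. rounding of normal tables.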
7 a At x = 3, F(3) = 3a + 9b = 1
b $f(x) = F'(x) = a + bx^2$ for $0 \le x \le 3$, 0 otherwise
$E(X) = \int_0^3 x f(x)\,dx = \int_0^3 (ax + bx^3)\,dx = \left[\frac{ax^2}{2} + \frac{bx^4}{4}\right]_0^3 = \frac{9a}{2} + \frac{81b}{4}$
c Average time $= \frac{1}{40}(0.5 \times 3 + 1.5 \times 16 + 2.5 \times 21) = 1.95$
d $3a + 9b = 1$ and $\frac{9a}{2} + \frac{81b}{4} = \frac{39}{20}$ ⇒ $a = \frac{2}{15}$, $b = \frac{1}{15}$
e H0: The data can be modelled by the random variable X.
H1: The data cannot be modelled by the random variable X.
1% significance level, degrees of freedom 3 − 2 = 1, critical value 6.635

| | $0 \le t \le 1$ | $1 < t \le 2$ | $2 < t \le 3$ |
|---|---|---|---|
| Prob | 7/45 | 13/45 | 5/9 |
| $O_t$ | 3 | 16 | 21 |
| $E_t$ | 56/9 | 104/9 | 200/9 |
| $(O_t - E_t)^2 / E_t$ | 1.669 | 1.709 | 0.06722 |

$X^2 = 3.445 < 6.635$. Therefore, no reason to doubt H0, the random variable X is a good fit for the data.
8 a $\bar{x} = \frac{5 \times 0 + 15 \times 12 + 25 \times 48 + \cdots + 55 \times 7}{200} = 35$
$s = \sqrt{\frac{200}{199}\left(\frac{5^2 \times 0 + 15^2 \times 12 + 25^2 \times 48 + \cdots + 55^2 \times 7}{200} - 35^2\right)} = 9.563$ (4 s.f.)
b H0: Income distribution can be modelled by $N(\mu, \sigma^2)$.
H1: Income distribution cannot be modelled by $N(\mu, \sigma^2)$.

| | Less than 10 | 10–20 | 20–30 | 30–40 | 40–50 | 50–60 | 60 or more |
|---|---|---|---|---|---|---|---|
| Prob | 0.004472 | 0.05391 | 0.2422 | 0.3989 | 0.2422 | 0.05391 | 0.004472 |
| Exp | 0.8943 | 10.78 | 48.43 | 79.78 | 48.43 | 10.78 | 0.8943 |

Combine ‘less than 10’ and ‘10–20’ and combine ‘60 or more’ and ‘50–60’:

| | Less than 20 | 20–30 | 30–40 | 40–50 | 50 or more |
|---|---|---|---|---|---|
| Obs | 12 | 48 | 75 | 58 | 7 |
| Exp | 11.68 | 48.43 | 79.78 | 48.43 | 11.68 |
| $(O_m - E_m)^2 / E_m$ | 0.009026 | 0.003864 | 0.2869 | 1.890 | 1.872 |

5% significance level, degrees of freedom 5 − 3 = 2, critical value 5.991
$X^2 = 4.062 < 5.991$. Therefore no reason to doubt H0, income distribution can be modelled by $N(\mu, \sigma^2)$.
c $P(M \le 22.5) = \Phi\left(\frac{22.5 - 35}{9.563}\right) = \Phi(-1.307) = 0.09559$
200 × 0.09559 = 19.12, therefore approximately 19 families.

Exercise 3.3A
1 a $\frac{60 \times 80}{150} = 32$; the expected frequency is the row total multiplied by the column total, divided by the grand total.
$x = \frac{70 \times 60}{150} = 28$
b $\frac{(O_k - E_k)^2}{E_k} = \frac{(34 - 32)^2}{32} = \frac{1}{8} = 0.125$
$y = \frac{(26 - 28)^2}{28} = \frac{1}{7} = 0.1429$ (4 s.f.)
c H0: Age group and clinic time attendance are independent.
H1: Age group and clinic time attendance are not independent.
5% significance level, degrees of freedom (4 − 1) × (2 − 1) = 3, critical value 7.815
$X^2 = 12.84 > 7.815$. Therefore reject H0; age group and clinic time attendance are not independent.
2 a $\frac{365 \times 460}{1460} = 115$; the expected frequency is the row total multiplied by the column total, divided by the grand total. The column totals (365) are identical for each region, so expected frequency will be equal in each row.
b Expected frequencies: 115, 187.5, 62.5.
$X^2_{\text{England}} = \frac{(120 - 115)^2}{115} + \frac{(199 - 187.5)^2}{187.5} + \frac{(46 - 62.5)^2}{62.5} = 5.279$
$X^2_{\text{Scotland}} = \frac{(103 - 115)^2}{115} + \frac{(215 - 187.5)^2}{187.5} + \frac{(47 - 62.5)^2}{62.5} = 9.130$
$X^2_{\text{Wales}} = \frac{(103 - 115)^2}{115} + \frac{(184 - 187.5)^2}{187.5} + \frac{(78 - 62.5)^2}{62.5} = 5.162$
$X^2_{\text{N. Ireland}} = \frac{(134 - 115)^2}{115} + \frac{(152 - 187.5)^2}{187.5} + \frac{(79 - 62.5)^2}{62.5} = 14.22$
Hence $X^2 = 33.79$.
c Degrees of freedom (4 − 1) × (3 − 1) = 6
d H0: Region and rainfall are independent.
H1: Region and rainfall are not independent.
0.1% significance level, 6 degrees of freedom, critical value 22.46
$X^2 = 33.79 > 22.46$. Therefore reject H0; region and rainfall are not independent. There is strong evidence to justify this conclusion.
3 H0: Gender and political preference are independent.
H1: Gender and political preference are not independent.
5% significance level, degrees of freedom (5 − 1) × (2 − 1) = 4, critical value 9.488
Key: each cell shows Observed, $(O_k - E_k)^2 / E_k$ in brackets, Expected.

| | A | B | C | D | E | Total |
|---|---|---|---|---|---|---|
| Male | 27 (0.4016) 30.5 | 32 (0.4298) 28.5 | 24 (0.5976) 20.5 | 12 (0.2857) 14 | 5 (0.3462) 6.5 | 100 |
| Female | 34 (0.4016) 30.5 | 25 (0.4298) 28.5 | 17 (0.5976) 20.5 | 16 (0.2857) 14 | 8 (0.3462) 6.5 | 100 |
| Total | 61 | 57 | 41 | 28 | 13 | 200 |

$X^2 = 4.122 < 9.488$.
Therefore, no reason to doubt H0; there is no relationship between gender and political preference. 4 H0: Consumption type and taste change are independent. H1: Consumption type and taste change are not independent. 1% significance level, degrees of freedom (3 − 1) × (3 − 1) = 4, critical value 13.28 Key Observed  (Ok − E k )2    Ek  Better High Medium 20.625 (2.133) 16.5 (1.5) 55 30 (0.8776) 24 (2.885) (0.1538) 80 Total 75 24.375 12 28 17.875 Worse 29 30 13 (1.330) Total 22 18 (0.1364) Low No Change 24 (0.5523) Expected 60 19.5 24 26 (0.3913) 65 65 21.125 200 X 2 = 9.959 < 13.28. Therefore, no reason to doubt H0, volume of consumption is not related to taste change. 5 a 34 × 25 = 6.8, row total multiplied by column total divided by grand total is expected frequency. 125 42 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Worked solutions 3 b The expected frequencies are less than 5, and in order for the test statistic to be approximated by the χ2-distribution, this cannot be the case. c H0: Advertising and sales are independent. H1: Advertising and sales are not independent. 10% significance level, degrees of freedom (4 − 1) × (3 − 1) = 6 , critical value 10.64 Key Observed  (Ok − E k )2    Ek  0a<5 None Low & High 6.8 (0.4765) 8 6.8 (0.5718) 8 (0.2042) 12 (0.3176) 10 8.16 (0.4099) 11 (0) 10.2 15  a  20 6 8 7 (1.004) 10  a < 15 5 5 (1.125) Medium 5  a < 10 13 (5.653) Expected 16 9.6 (0.1778) 13 10.2 (0.04719) 12.24 14.4 19 12.24 (0.02231) 18.36 X 2 = 10.01 < 10.64. Therefore, no reason to doubt H0; there is insufficient evidence at the 10% significance level to state that advertising and sales are linked. d H0: Advertising and sales are independent. H1: Advertising and sales are not independent. 
10% significance level, degrees of freedom (4 − 1) × (3 − 1) = 6, critical value 10.64 Key Observed  (Ok − E k )2    Ek  None 0  a < 10 10  a < 15 18 6 (1.424) Low 16 10 8.16 (0.2042) 11.6 (2.345) 9.6 (2.038) 12.24 (0.1778) 14.4 7 6.96 2 8.8 (0.4099) 16 11 8 (0.072 73) (0.5718) 15  a  20 11 11 (0.031 03) High 13.6 13 (0.5625) Medium Expected (1.133) 10.44 12 5.28 (2.102) 7.92 X 2 = 11.07 > 10.64. Therefore reject H0; there is sufficient evidence at the 10% significance level to state that advertising and sales are linked. e By combining different rows or columns, opposite conclusions are reached. 43 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 2 3 6 -Tests a H0: Gender and subject preference are independent. H1: Gender and subject preference are not independent. 5% significance level, degrees of freedom (3 − 1) × (2 − 1) = 2, critical value 5.991 Key Observed  (Ok − E k )2    Ek  Maths & History Male Female Science 14 (0.050 78) Geography 5 13.18 6 (0.4848) 15 (0.042 32) Expected 6.818 (0.2) 10 15.82 5 5 (0.4040) 8.182 (0.1667) 6 X 2 = 1.349 < 5.991. No reason to doubt H0, there is no relationship between gender and subject preference. b H0: Gender and subject preference are independent. H1: Gender and subject preference are not independent. 5% significance level, degrees of freedom (3 − 1) × (2 − 1) = 2, critical value 5.991 Key Observed  (Ok − E k )2    Ek  History Male Female Science 12 (0.9309) Maths & Geography 5 9.091 8 (0.7758) Expected (0.4848) 8 6.818 (0.1309) 10 10.91 (0.4040) 9.091 12 8.182 (0.1091) 10.91 X 2 = 2.836 < 5.991. No reason to doubt H0, there is no relationship between gender and subject preference. Both tests return the same result, despite different column groupings. 
However, the second test has a p-value of approximately 24%, whereas the first has a p-value of approximately 51%, meaning the grouping of columns (which here is arbitrary) could affect conclusions at weaker significance levels (for example 25%).

Exam-style questions
1 a Geometric distribution
b H0: The first time a head is tossed can be modelled by Geo(0.4).
H1: The first time a head is tossed cannot be modelled by Geo(0.4).
5% significance level

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9+ |
|---|---|---|---|---|---|---|---|---|---|
| Prob | 0.4 | 0.24 | 0.144 | 0.0864 | 0.05184 | 0.03110 | 0.01866 | 0.01120 | 0.01680 |
| Obs | 70 | 42 | 33 | 21 | 20 | 7 | 5 | 2 | 0 |
| Exp | 80 | 48 | 28.8 | 17.28 | 10.37 | 6.221 | 3.732 | 2.239 | 3.359 |

Combine 7, 8 and 9+:

| | 1 | 2 | 3 | 4 | 5 | 6 | 7+ |
|---|---|---|---|---|---|---|---|
| Obs | 70 | 42 | 33 | 21 | 20 | 7 | 7 |
| Exp | 80 | 48 | 28.8 | 17.28 | 10.37 | 6.221 | 9.331 |
| $(O_k - E_k)^2 / E_k$ | 1.25 | 0.75 | 0.6125 | 0.8008 | 8.948 | 0.09760 | 0.5824 |

Degrees of freedom 7 − 1 = 6, critical value 12.59
$X^2 = 13.04 > 12.59$, therefore reject H0; there is evidence that the first time a head is tossed cannot be modelled by Geo(0.4).
2 H0: The number of broken teacups in a pack of four can be modelled by B(4, 0.15).
H1: The number of broken teacups in a pack of four cannot be modelled by B(4, 0.15).
5% significance level

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Prob | 0.5220 | 0.3685 | 0.0975 | 0.01148 | 0.0005063 |
| Obs | 42 | 16 | 5 | 2 | 0 |
| Exp | 33.93 | 23.95 | 6.340 | 0.7459 | 0.03291 |

Combine 2, 3 and 4:

| | 0 | 1 | 2 or more |
|---|---|---|---|
| Obs | 42 | 16 | 7 |
| Exp | 33.93 | 23.95 | 7.119 |
| $(O_k - E_k)^2 / E_k$ | 1.919 | 2.639 | 0.001980 |

Degrees of freedom 3 − 1 = 2, critical value 5.991
$X^2 = 4.561 < 5.991$, therefore no reason to doubt H0, the number of broken teacups in a pack of four can be modelled by B(4, 0.15).
3 H0: Gender and vegetable preference are independent.
H1: Gender and vegetable preference are not independent.
5% significance level.
Key Observed  (Ok − E k )2    Ek  Male Tomatoes Carrots 24 22 10 (2.701) Mushrooms Female 26 (0.7273) (0.5714) 28 28 16.72 19 (0.4544) Expected (2.122) 21.28 18 16.28 (0.3571) 20.72 45 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 2 3 -Tests Degrees of freedom (3 − 1) × (2 − 1) = 2, critical value 5.991 X 2 = 6.933 > 5.991, therefore reject H0; there is a relationship between gender and vegetable choice. 4 a H0: House type and supermarket shopped at are independent. H1: House type and supermarket shopped at are not independent. b Total columns minus one multiplied by total combined rows minus one: degrees of freedom (3 − 1) × (3 − 1) = 4 5 c Critical values: 2.5% significance level 11.14, 1% significance level 13.28, 0.5% significance level 14.86. 13.28 < 13.95 < 14.86, so 0.5% is the largest significance level for which there would be no reason to doubt H0. (Note: from calculator 0.75% is the solution.) a P(150 < l 170) = Φ (17020− 160 ) − Φ(15020− 160 ) = Φ(0.5) − Φ(−0.5) = 2 × 0.6915 − 1 = 0.3830 Expected value: 100 × 0.3830 = 38.30 100 × P(l 150) = 100 × Φ(−0.5) = 100 × 0.3085 = 30.85 100 × P(170 < l 190) = 100 × [ Φ(1.5) − Φ(0.5) ] = 24.17 100 × P (l > 190 ) = 100 × [1 − Φ(1.5 )] = 6.681 b H0: The length of Aesculapian snakes can be modelled by N(160, 202). H1: The length of Aesculapian snakes cannot be modelled by N(160, 202). 2.5% significance level, degrees of freedom 4 − 1 = 3, critical value 9.348 X2 = 1.997 + 0.075 46 + 0.1386 + 2.794 = 5.005 X2 < 9.348, therefore no reason to doubt H0, the length of Aesculapian snakes can be modelled by N(160, 202). c Confidence interval  20 20  165 − 1.96 × 100 ,165 + 1.96 × 100  [161, 169] The proposed population mean of 160 lies outside the 95% confidence interval, and so it would seem unlikely that it would make a suitable model to measure the lengths of the snakes. 
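The expected frequencies in question 5a can be checked numerically. The following is a minimal sketch, not the authors' method: it builds the standard normal distribution function from `math.erf` and multiplies the class probabilities under N(160, 20²) by the sample size of 100. The function names `phi` and `expected_normal_frequencies` are illustrative choices.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def expected_normal_frequencies(total, mu, sigma, boundaries):
    """Expected class frequencies for N(mu, sigma^2), with classes split at the boundaries."""
    cdf = [0.0] + [phi((b - mu) / sigma) for b in boundaries] + [1.0]
    return [total * (upper - lower) for lower, upper in zip(cdf, cdf[1:])]


# Classes from question 5a: l <= 150, 150 < l <= 170, 170 < l <= 190, l > 190
expected = expected_normal_frequencies(100, 160, 20, [150, 170, 190])
print([round(e, 3) for e in expected])
```

Small differences from the values in the solution (for example in the second class) arise because the solution uses four-figure statistical tables.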
6 a $\int_0^3 ky^2 \, dy + \int_3^4 k(12 - y) \, dy = 1$

$k\left[\frac{y^3}{3}\right]_0^3 + k\left[12y - \frac{y^2}{2}\right]_3^4 = k(9 - 0) + k(40 - 31.5) = \frac{35}{2}k = 1$, therefore $k = \frac{2}{35}$

b $P(Y \le 3) = \int_0^3 \frac{2}{35} y^2 \, dy = \left[\frac{2}{105} y^3\right]_0^3 = \frac{18}{35}$

c From data given in the table, $P(Y \le 2) = \frac{6 + 10}{105} = \frac{16}{105}$
Therefore $P(2 < Y \le 3) = \frac{18}{35} - \frac{16}{105} = \frac{38}{105}$
and $P(3 < Y \le 4) = 1 - P(Y \le 3) = 1 - \frac{18}{35} = \frac{17}{35}$,
so the remaining two expected values are 38 and 51 respectively.
H0: Amount of chemical produced can be modelled by the random variable Y.
H1: Amount of chemical produced cannot be modelled by the random variable Y.
5% significance level, degrees of freedom 4 − 1 = 3, critical value 7.815
$X^2 = 1.5 + 1.6 + 3.789 + 7.078 = 13.97$
$X^2 > 7.815$, therefore reject H0, so the amount of chemical produced cannot be modelled by the random variable Y.

7 a $\bar{x} = \frac{0 \times 29 + 1 \times 38 + 2 \times 33 + 3 \times 13 + 4 \times 8 + 5 \times 4}{125} = 1.56$

$s^2 = \frac{125}{124}\left(\frac{0^2 \times 29 + 1^2 \times 38 + 2^2 \times 33 + 3^2 \times 13 + 4^2 \times 8 + 5^2 \times 4}{125} - 1.56^2\right) = 1.7$

b A Poisson distribution has equal expectation and variance, and the sample mean and variance are close.

c $P(X = 2) = \frac{e^{-1.56} \times 1.56^2}{2!} = 0.2557$, and the expected value is 125 × 0.2557 = 31.96
x = 125 − 26.27 − 40.98 − 31.96 − 16.62 − 6.482 − 2.022 = 0.6705

d Groups 4, 5 and 6 or more must be combined (to form 4 or more) so that expected frequencies are greater than five.

e H0: The number of A* gained can be modelled by a Poisson distribution.
H1: The number of A* gained cannot be modelled by a Poisson distribution.
10% significance level, degrees of freedom 5 − 2 = 3, critical value 6.251
$X^2 = 0.2844 + 0.2162 + 0.033\,73 + 0.7885 + 0.8701 = 2.193$
$X^2 < 6.251$, therefore no reason to doubt H0; the number of A* gained can be modelled by a Poisson distribution.

f Expected frequencies would change. One more degree of freedom.

8
a H0: Opening day and level of demand are independent. H1: Opening day and level of demand are not independent. 5% significance level Key Observed  (Ok − E k )2    Expected Ek  Low Tuesday Wednesday Thursday 15 18 12 (0) Normal 30 (0.6154) High 15 15 25 26 5 (1.778) (0.6) (0.038 46) (0.4444) 15 23 26 7 9 (0.6) (0.3462) 26 15 9 (4) 9 Degrees of freedom (3 − 1) × (3 − 1) = 4, critical value 9.488 X 2 = 8.422 < 9.448, therefore no reason to doubt H0, there is not a relationship between level of demand and the weekday of opening. b The conclusion states there is no relationship between level of demand and opening day, meaning no specific day gains a higher or lower demand. Therefore, it would be difficult to choose which one of the days to open on. Also, the test does not give any information on whether the demand is sufficient for the club to make a profit. 47 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 2 3 9 -Tests a H0: The number of cases assigned can be modelled by a uniform distribution. H1: The number of cases assigned cannot be modelled by a uniform distribution. 5% significance level A B C Observed 43 32 45 Expected 40 40 40 0.225 1.6 0.625  (O − E )2  k  k  Ek   Degrees of freedom 3 − 1 = 2, critical value 5.991 X 2 = 2.45 < 5.991, therefore no reason to doubt H0, the number of cases assigned can be modelled by a uniform distribution. b H0: Cases solved and officer are independent. H1: Cases solved and officer are not independent. 1% significance level Key Observed  (Ok − E k )2    Ek  A Solved B 31 21.5 16 (5.878) 14 (4.198) 21.5 60 22.5 34 (0.25) 43 Total 11 (0.25) 12 Total C 18 (4.198) Unsolved Expected 16 (5.878) 32 45 60 22.5 120 Degrees of freedom (3 − 1) × (2 − 1) = 2, critical value 9.210 X 2 = 20.65 > 9.210, therefore reject H0, so there is a relationship between the proportion of cases solved and assigned officer. 
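The test statistic in question 8a can be reproduced with a short script. This is a sketch under one assumption: the observed frequencies below are read from the flattened table for question 8a (rows Low, Normal, High; columns Tuesday, Wednesday, Thursday), chosen so that each cell reproduces the quoted contribution. The function name `chi_squared_statistic` is illustrative.

```python
def chi_squared_statistic(observed):
    """Return (X^2, degrees of freedom) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency: row total x column total / grand total
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Assumed reading of the question 8a table (Low, Normal, High by Tue, Wed, Thu)
observed = [[15, 18, 12], [30, 25, 23], [5, 7, 15]]
x2, dof = chi_squared_statistic(observed)
print(round(x2, 3), dof)  # compare with the 5% critical value for 4 degrees of freedom, 9.488
```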
10 a ΣP(X = r) = a + 2a + 3a + 4b + 5b + 6b = 1 ⇒ 6a + 15b = 1 b E(X) = ΣrP(X = r) = a + 4a + 9a + 16b + 25b + 36b = 14a + 77b c 1 × 3 + 2 × 13 + 3 × 11 + 4 × 10 + 5 × 16 + 6 × 7 56 = 60 15 6a + 15b = 1   1 1 56  ⇒ a = 12 , b = 30 14a + 77b =  15  x= d Use these values to calculate expected frequencies: 1 Obs Exp (Or − E r ) Er 2 4 5 3 13 2 3 11 10 16 6 7 5 10 15 8 10 12 0.8 0.9 1.067 0.5 3.6 2.083 48 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Worked solutions 3 X2 = 8.95, degrees of freedom 6 − 2 = 4 critical values: 10% significance level 7.779, 5% significance level 9.488 7.779 < X2 < 9.488, therefore 5% significance level is the greatest at which there would be no reason to doubt H0, the data is modelled by the random variable X. 11 a R ow total multiplied by column total divided by grand total gives expected frequency. Here 48 × 70 32 = = 10.67 to 2 d.p. 315 3 b Expected frequencies for the row x > 150 are all less than five, and so must be combined to get expected frequencies of over five in order for the test statistic to be compared to the χ2-distribution. c X F2 = (11 − 9.90)2 + (19 − 27.24 )2 + ( 35 − 27.86 )2 = 4.44 to 2 d.p. 9.90 27.24 27.86 d 1% significance level, degrees of freedom (5 − 1) × (3 − 1) = 8, critical value 20.09. X2 > 20.09, so she would have rejected the null hypothesis under these conditions. 12 a x = 25 × 10 + 35 × 42 + 45 × 31 + 55 × 17 = 40.5 100  100  252 × 10 + 352 × 42 + 452 × 31 + 552 × 17 − 40.52  = 8.92 to 3 s.f. 99  100  60 − 40.5 50 − 40.5   −Φ b 100 × P (50 length < 60) = 100 × Φ 8.92 8.92   s= ( ) ( ) = 100 × [Φ(2.186) − Φ(1.065)] = 100 × (0.9856 − 0.8566) = 12.90 c H0: Algae length can be modelled by a normal distribution. H1: Algae length cannot be modelled by a normal distribution. 
1% significance level, combine first two and last two cells, three constraints, degrees of freedom 4 − 3 = 1, critical value 6.635 X2 = 0.3202 + 1.070 + 1.253 + 0.4938 = 3.137 X2 < 6.635, therefore no reason to doubt H0, algae length can be modelled by a normal distribution. d Expected values will change, and there will be two fewer constraints, so degrees of freedom will increase by two. 13 a H0: Mobile phone signal strength and service provider are independent. H1: Mobile phone signal strength and service provider are not independent. 1% significance level Key Observed  (Ok − E k )2    Ek  A Good Medium 55 56 (0.018 52) B 54 (0.074 07) 36 (0.1111) 35 (0.027 78) Expected Bad 39 54 (0.2143) 36 (0.3214) 34 42 31 28 Degrees of freedom (3 − 1) × (2 − 1) = 2, critical value 9.210 X 2 = 0.7302 < 9.210, therefore no reason to doubt H0, there is no relationship between signal and provider. b Increasing sample size by a factor of n and keeping observed frequencies in the same proportion increases X 2 by a factor of n. 0.7302n > 9.210 ⇒ n > 12.61 therefore n = 13. 49 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 3 2 14 -Tests a H0: Treatment and reaction are independent. H1: Treatment and reaction are not independent. b Column total multiplied by row total divided by grand total gives expected frequency, here 41 × 41 = 11.21 to 2 d.p. 150 c X S2 = (25 − 21.05)2 + (7 − 11.21)2 + (9 − 8.75)2 = 2.33 to 2 d.p. 21.05 11.21 8.75 d Degrees of freedom (3 − 1) × (3 − 1) = 4 Critical value at 5% is 9.488 and at 2.5% is 11.14. Since 9.488 < X 2 < 11.14, the null hypothesis would be rejected at the 5% significance level, but not at the 2.5% significance level. Therefore 5% is the smallest significance level from the tables for which the null hypothesis would be rejected. e H0: Proportions of people given the treatments A, B and C are in the ratio 3:2:1. 
H1: Proportions of people given the treatments A, B and C are not in the ratio 3:2:1. 5% significance level A B C 77 41 32 75 50 25 0.053 33 1.62 1.96 Observed Expected  (Ok − E k )    Ek  2 Degrees of freedom 3 − 1 = 2, critical value 5.991 X 2 = 3.633 < 5.991, therefore no reason to doubt H0, the proportion of people given each treatment is in the ratio 3:2:1. Mathematics in life and work H0: Age and demand are independent H1: Age and demand are not independent 5% significance level, combine 40–49 and 50+ rows. Key Observed  (Ok − E k )2    Ek  Low <20 12 (2) 20–29 40+ 8 7.667 7 (1.093) 10.67 (0.7621) 6.708 (0.3487) 9 (3.350) 8.625 13 9.333 14 13.67 (0.4444) 14 12 16 (0.3983) (0.5714) 7 4 7 (1.262) High 5 5 (0.9277) 30–39 Medium Expected (0.083 33) 12 11 11.96 (1.247) 15.38 Degrees of freedom (4 − 1) × (3 − 1) = 6, critical value 12.59 X 2 = 12.49 < 12.59, therefore just about no reason to doubt H0, there is no relationship between age and demand. 50 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Worked solutions 3 H0: The ages of customers attending the coffee shop can be modelled by a uniform distribution. H1: The ages of customers attending the coffee shop cannot be modelled by a uniform distribution. 5% significance level <20 Observed Expected  (Ok − E k )    Ek  20–29 30–39 40–49 50+ 24 23 32 27 14 24 24V 24 24 24 0 0.041 67 2.667 0.375 4.167 2 Degrees of freedom 5 − 1 = 4, critical value 9.488 X 2 = 7.25 < 9.488, therefore no reason to doubt H0, the ages of customers attending the coffee shop can be modelled by a uniform distribution. 51 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 4 Non-parametric tests 4 Non-parametric tests Please note: Full worked solutions are provided as an aid to learning, and represent one approach to answering the question. 
In some cases, alternative methods are shown for contrast. All sample answers have been written by the authors. Cambridge Assessment International Education bears no responsibility for the example answers to questions taken from its past question papers, which are contained in this publication. Non-exact numerical answers should be given correct to 3 significant figures, or 1 decimal place for angles in degrees, unless a different level of accuracy is specified in the question. Where values from the Cambridge International Education statistical tables are used, the same level of accuracy has been used in workings unless stated otherwise.

Prerequisite knowledge

1 $P(X \ge 12) = P(X = 12) + P(X = 13) + P(X = 14) + P(X = 15) = \left(\frac{1}{2}\right)^{15} \times \{455 + 105 + 15 + 1\} = \frac{576}{32\,768} = 0.0176$

2 $\bar{x} = \frac{1}{10}(156 + 46 + \dots + 175) = 121$

$s^2 = \frac{1}{9}\left(167\,760 - \frac{1210^2}{10}\right) = \frac{21\,350}{9} = 2372.2\ldots$

H0: μ = 150
H1: μ < 150
5% significance level, one-tailed test.
The test statistic $T = \frac{121 - 150}{\sqrt{2372 / 10}} = -1.883$.
Critical value: $t_9(5\%) = -1.833$. As −1.883 < −1.833, reject H0. There is (just) sufficient evidence at the 5% significance level to reject the null hypothesis that the mean is 150 in favour of the alternative hypothesis that the mean is less than 150.

3 Perform a paired t-test, assuming the differences in the calorie intakes in January and June are normally distributed. Take differences to be January minus June.
Differences: 126, 63, 189, −92, −49, 6, 93, 141

$\bar{d} = \frac{1}{8}(126 + 63 + \dots + 141) = 59.625$

$s_d^2 = \frac{1}{7}\left(\left(126^2 + 63^2 + \dots + 141^2\right) - 8 \times 59.625^2\right) = 9508$

H0: $\mu_d = 0$
H1: $\mu_d > 0$
10% significance level, one-tailed test.
Test statistic $T = \frac{59.625 - 0}{\sqrt{9508 / 8}} = 1.730$.
Critical value: $t_7(10\%) = 1.415$. As 1.730 > 1.415, reject H0. There is sufficient evidence at the 10% significance level to support Juliet’s hypothesis that people consume fewer calories in summer than winter.
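The arithmetic in question 3 can be checked with a few lines of code. This is a minimal sketch using only the differences listed above; the variable names are illustrative, and the critical value still has to be read from the t-distribution tables.

```python
from math import sqrt

# Differences (January minus June) from question 3
differences = [126, 63, 189, -92, -49, 6, 93, 141]

n = len(differences)
mean_d = sum(differences) / n
# Unbiased estimate of the variance of the differences
var_d = sum((d - mean_d) ** 2 for d in differences) / (n - 1)
t_statistic = mean_d / sqrt(var_d / n)

print(mean_d, round(var_d), round(t_statistic, 2))
```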
Exercise 4.1A

1 a H0: median revision time is 30 hours a week
H1: median revision time is less than 30 hours a week
X ∼ B(9, 0.5), 5% significance level, one-tailed test
Signed differences: −14, −11, −16, −18, −1, 9, −9, −20, −21
One positive sign.
$P(X \le 1) = 0.019\,53 < 0.05$
Reject H0. Sufficient evidence that students are doing less than 30 hours revision a week.

b H0: median revision time is 20 hours a week
H1: median revision time is not 20 hours a week
X ∼ B(9, 0.5), 5% significance level, two-tailed test
Signed differences: −4, −1, −6, −8, 9, 19, 1, −10, −11
Three positive signs.
$P(X \le 3) = 0.2539 > 0.025$
No reason to doubt H0. Insufficient evidence to refute the claim that median revision time is 20 hours per week.

2 H0: Standard and premium brand yield same total distance.
H1: Premium brand yields higher distance than standard.
X ∼ B(8, 0.5), 5% significance level, one-tailed test
Signed differences: 15, −7, 24, 23, −16, 21, 10, 33
Two negative signs.
$P(X \le 2) = 0.1445 > 0.05$
No reason to doubt H0. Insufficient evidence to show premium is better than standard. The cars with the lowest distance travelled with standard are the ones that travel less far with premium. Perhaps premium works better with more efficient cars.

3 H0: Wheatees and Crunchos are equally popular.
H1: There is a difference in popularity between Wheatees and Crunchos.
X ∼ B(12, 0.5), 5% significance level, two-tailed test
Two signs for Crunchos.
$P(X \le 2) = 0.019\,29 < 0.025$
Reject H0. Sufficient evidence to say there is a difference in preferences for the two breakfast cereals.

4 H0: Chesford and Amerston have the same median crime rate.
H1: Chesford and Amerston do not have the same median crime rate.
X ∼ B(12, 0.5), 5% significance level, two-tailed test
Signed differences: 0.78, 0.27, 0.48, −1.29, −0.15, 0.50, −0.04, 0.07, 2.09, 1.20, 1.04, 0.02
Three negative signs.
$P(X \le 3) = 0.073\,00 > 0.025$
No reason to doubt H0. Insufficient evidence to show that Chesford and Amerston have different median crime rates.

5 H0: Insulation does not affect median heat loss.
H1: Homes with insulation have a lower median heat loss.
X ∼ B(10, 0.5), 2% significance level, one-tailed test
Signed differences: 0.9, 3.2, 1.4, 0.1, −1.3, 1.2, 1.7, 5.1, 0.4, −1.9
Two negative signs.
$P(X \le 2) = 0.054\,69 > 0.02$
No reason to doubt H0. Insufficient evidence to show that insulation reduces median heat loss.

6 a 2.17, 3.50, 2.06, 0.55, 2.05

b 0.17, 1.50, 0.06, −1.45, 0.05

c H0: Derivative A performs two percentage points better than derivative B.
H1: Derivative A performs more than two percentage points better than derivative B.
X ∼ B(5, 0.5), 10% significance level, one-tailed test
One negative sign.
$P(X \le 1) = 0.1875 > 0.1$
No reason to doubt H0. Insufficient evidence to show that derivative A has median performance more than two percentage points better than derivative B.

7 H0: WBC count is normal, with a median of 7 million per 1 ml.
H1: WBC count is abnormal; median is not 7 million per 1 ml.
X ∼ B(25, 0.5), 10% significance level, two-tailed test
Signed differences: −0.27, −0.08, 0.21, −0.06, 0.45, 0.32, −0.49, −0.29, −0.09, −0.13, −0.05, 0.93, 0.35, −0.37, −0.18, −0.39, −0.03, 0.23, 0.02, −0.24, −0.28, −0.10, −0.59, −0.34, −0.16
Seven positive signs.
Normal approximation: X ∼ N(12.5, 6.25)
$P(X \le 7) = \Phi\left(\frac{7.5 - 12.5}{\sqrt{6.25}}\right) = \Phi(-2) = 1 - \Phi(2) = 1 - 0.9772 = 0.0228 < 0.05$
Reject H0. The patient has an abnormal white blood cell count.
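The binomial tail probabilities used in these sign tests can be computed exactly. The sketch below is illustrative (the function name `sign_test_tail` is not from the book) and assumes there are no zero differences; it is applied to the signed differences from question 4.

```python
from math import comb


def sign_test_tail(differences):
    """Return (n, k, P(X <= k)) for X ~ B(n, 0.5), where k is the count of the rarer sign."""
    n = len(differences)
    positives = sum(d > 0 for d in differences)
    k = min(positives, n - positives)
    tail = sum(comb(n, r) for r in range(k + 1)) / 2 ** n
    return n, k, tail


# Signed differences from question 4 (Chesford and Amerston)
crime_differences = [0.78, 0.27, 0.48, -1.29, -0.15, 0.50,
                     -0.04, 0.07, 2.09, 1.20, 1.04, 0.02]
n, k, tail = sign_test_tail(crime_differences)
print(n, k, round(tail, 5))
```

For a two-tailed test at the 5% level the tail probability is compared with 0.025, as in the solution.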
8 X ∼ B(n, 0.5): $P(X = 0) = 0.5^n$
Note: 0 negative signs implies a one-tailed test.
$0.5^n < 0.001$
$n \ln 0.5 < \ln 0.001$
$n > \frac{\ln 0.001}{\ln 0.5} = 9.966$
Therefore n = 10.

X ∼ B(30, 0.5); therefore, by the normal approximation, X ∼ N(15, 7.5)
$P(X \le 8) = \Phi\left(\frac{8.5 - 15}{\sqrt{7.5}}\right) = \Phi(-2.373) = 1 - 0.9912 = 0.0088$
Two-tailed test, therefore significance level is 2 × 0.0088 = 0.0176, or 1.76%.

Exercise 4.2A

1

| Signed diff | 4 | −7 | −5 | −13 | 1 | 3 | −20 | −31 | 24 | 10 | −12 | −23 | 6 | −21 | −8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unsigned diff | 4 | 7 | 5 | 13 | 1 | 3 | 20 | 31 | 24 | 10 | 12 | 23 | 6 | 21 | 8 |
| Unsigned rank | 3 | 6 | 4 | 10 | 1 | 2 | 11 | 15 | 14 | 8 | 9 | 13 | 5 | 12 | 7 |
| Signed rank | 3 | −6 | −4 | −10 | 1 | 2 | −11 | −15 | 14 | 8 | −9 | −13 | 5 | −12 | −7 |

H0: Employees stay with the company for a median period of a year (52 weeks).
H1: Employees stay with the company for a median period of less than a year.
1% significance level, one-tailed test, n = 15, critical value from table: 19
P = 33, Q = 87, therefore T = 33
T > 19, so no reason to doubt H0. Employees stay with the company for a year on average.
Assumption: weeks worked at the company are distributed symmetrically.

2 H0: Median is the same.
H1: Median is different.
10% significance level, two-tailed test, n = 18, critical value from table: 47
P = 133, Q = 38, therefore T = 38
T < 47, so reject H0. The median is different.

3 H0: Chesford and Amerston have the same median crime rate.
H1: Chesford and Amerston do not have the same median crime rate.
5% significance level, two-tailed test, n = 12, critical value from table: 13

| Signed diff | 0.78 | 0.27 | 0.48 | −1.29 | −0.15 | 0.50 | −0.04 | 0.07 | 2.09 | 1.20 | 1.04 | 0.02 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unsigned diff | 0.78 | 0.27 | 0.48 | 1.29 | 0.15 | 0.50 | 0.04 | 0.07 | 2.09 | 1.20 | 1.04 | 0.02 |
| Unsigned rank | 8 | 5 | 6 | 11 | 4 | 7 | 2 | 3 | 12 | 10 | 9 | 1 |
| Signed rank | 8 | 5 | 6 | −11 | −4 | 7 | −2 | 3 | 12 | 10 | 9 | 1 |

P = 61, Q = 17, therefore T = 17
T > 13, no reason to doubt H0. Median crime rates in Chesford and Amerston are equal.
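The ranking step of the Wilcoxon signed-rank test can be sketched in code. This is an illustrative helper (the name `signed_rank_sums` is an assumption, not from the book) that assumes no zero differences and no tied absolute values; it is applied to the signed differences from question 3.

```python
def signed_rank_sums(differences):
    """Return (P, Q, T): sums of positive ranks, negative ranks, and the smaller of the two."""
    # Rank the differences by absolute value, smallest first (assumes no ties or zeros)
    order = sorted(range(len(differences)), key=lambda i: abs(differences[i]))
    p = q = 0
    for rank, index in enumerate(order, start=1):
        if differences[index] > 0:
            p += rank
        else:
            q += rank
    return p, q, min(p, q)


# Signed differences from question 3 (Chesford and Amerston)
crime_differences = [0.78, 0.27, 0.48, -1.29, -0.15, 0.50,
                     -0.04, 0.07, 2.09, 1.20, 1.04, 0.02]
P, Q, T = signed_rank_sums(crime_differences)
print(P, Q, T)
```

The value of T is then compared with the critical value from the tables, as in the solution.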
4 H0: Literacy rates between genders (people aged 35–39) in South America are equal.
H1: Literacy rates between genders (people aged 35–39) in South America are not equal.
5% significance level, two-tailed test, n = 9, critical value from table: 5

| Signed diff | 1.59 | −1.28 | 2.70 | 1.67 | 3.62 | 1.60 | −0.19 | 0.46 | 1.79 |
|---|---|---|---|---|---|---|---|---|---|
| Unsigned diff | 1.59 | 1.28 | 2.70 | 1.67 | 3.62 | 1.60 | 0.19 | 0.46 | 1.79 |
| Unsigned rank | 4 | 3 | 8 | 6 | 9 | 5 | 1 | 2 | 7 |
| Signed rank | 4 | −3 | 8 | 6 | 9 | 5 | −1 | 2 | 7 |

P = 41, Q = 4, therefore T = 4
T < 5, reject H0. There is sufficient evidence at the 5% significance level that literacy rates between genders (people aged 35–39) in South America are not equal.

5 a H0: Median consumption of refined petroleum products is 35 barrels a day.
H1: Median consumption of refined petroleum products is less than 35 barrels a day.
X ∼ B(12, 0.5), 5% significance level, one-tailed test
Signed differences: −9.99, 23.33, −6.24, −7.59, −5.66, −3.54, −11.06, 25.50, −10.40, −5.03, −2.37, −9.38
Two positive signs.
$P(X \le 2) = 0.0193 < 0.05$
Reject H0. Sufficient evidence to demonstrate median consumption of refined petroleum products is less than 35 barrels a day.

b Same hypotheses.
5% significance level, one-tailed test, n = 12, critical value: 17

| Signed diff | −9.99 | 23.33 | −6.24 | −7.59 | −5.66 | −3.54 | −11.06 | 25.5 | −10.4 | −5.03 | −2.37 | −9.38 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unsigned diff | 9.99 | 23.33 | 6.24 | 7.59 | 5.66 | 3.54 | 11.06 | 25.5 | 10.4 | 5.03 | 2.37 | 9.38 |
| Unsigned rank | 8 | 11 | 5 | 6 | 4 | 2 | 10 | 12 | 9 | 3 | 1 | 7 |
| Signed rank | −8 | 11 | −5 | −6 | −4 | −2 | −10 | 12 | −9 | −3 | −1 | −7 |

P = 23, Q = 55, therefore T = 23
T > 17, no reason to doubt H0. Median consumption of refined petroleum products is 35 barrels a day.

c The two biggest consumers are Netherlands and Belgium. They only count for two positive signs in the sign test, but are the two largest deviations from the median in the Wilcoxon signed-rank test, so count more significantly towards this test. Netherlands and Belgium could be considered outliers, so the sign test would be more persuasive.
Or, thinking of ‘average’ consumption, the fact that Netherlands and Belgium consume so much means the Wilcoxon signed-rank test might be more appropriate.

6 H0: Median is as given.
H1: Median is less than the value given (also acceptable: greater than).
4% significance level, one-tailed test. Normal approximation required.
$\mu = \frac{1}{4} \times 50 \times (50 + 1) = 637.5$ and $\sigma^2 = \frac{1}{24} \times 50 \times (50 + 1) \times (2 \times 50 + 1) = 10\,731.25$
T = 423
$P(T \le 423) = \Phi\left(\frac{423.5 - 637.5}{\sqrt{10\,731.25}}\right) = \Phi(-2.066) = 1 - 0.9806 = 0.0194 < 0.04$
Therefore reject H0. The median is not the given value.

7 a H0: WBC count is normal, with a median of 7 million per 1 ml.
H1: WBC count is abnormal; median is not 7 million per 1 ml.
10% significance level, two-tailed test, n = 25. Normal approximation required.
$\mu = \frac{1}{4} \times 25 \times 26 = 162.5$ and $\sigma^2 = \frac{1}{24} \times 25 \times 26 \times 51 = 1381.25$
Signed ranks: −14, −5, 11, −4, 22, 17, −23, −16, −6, −8, −3, 25, 19, −20, −10, −21, −2, 12, 1, −13, −15, −7, −24, −18, −9
P = 107, Q = 218, therefore T = 107
$P(T \le 107) = \Phi\left(\frac{107.5 - 162.5}{\sqrt{1381.25}}\right) = \Phi(-1.480) = 1 - 0.9306 = 0.0694 > 0.05$
Therefore no reason to doubt H0. WBC count is normal.

b The Wilcoxon signed-rank test has a lower probability of a Type II error (failing to reject a false null hypothesis). Given the data does not look asymmetric, the Wilcoxon signed-rank test would be more appropriate here.

8 a H0: Population median is as given.
H1: Population median is not as given.
1% significance level, two-tailed test, n = 30. Normal approximation required.
$\mu = \frac{1}{4} \times 30 \times 31 = 232.5$ and $\sigma^2 = \frac{1}{24} \times 30 \times 31 \times 61 = 2363.75$
T = 105
$P(T \le 105) = \Phi\left(\frac{105.5 - 232.5}{\sqrt{2363.75}}\right) = \Phi(-2.612) = 1 - 0.9955 = 0.0045 < 0.005$
Therefore reject H0. The population median is not as given.
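The normal approximation used in questions 6 to 8 follows one pattern, which can be sketched as below. The helper names `phi` and `wilcoxon_normal_tail` are illustrative; the continuity correction of 0.5 matches the working above, and the example uses the values from question 8a.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def wilcoxon_normal_tail(n, t):
    """Approximate P(T <= t) for the signed-rank statistic with sample size n."""
    mu = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z = (t + 0.5 - mu) / sqrt(variance)  # continuity correction
    return mu, variance, z, phi(z)


# Question 8a: n = 30, T = 105
mu, variance, z, tail = wilcoxon_normal_tail(30, 105)
print(mu, variance, round(z, 3), round(tail, 4))
```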
b $S_N = \frac{1}{2} N(N + 1)$
The maximum number of positive ranks would occur if these were all the lowest ranks (because T is the smaller of P and Q). If the lowest N ranks were all positive, then
$\frac{1}{2} N(N + 1) = 105$
$N^2 + N = 210$
$N^2 + N - 210 = 0$
$(N - 14)(N + 15) = 0$
As N is positive, N = 14, so at most 14 positive ranks.

c X ∼ B(30, 0.5), two-tailed test. Normal approximation required.
$P(X \le 14) = \Phi\left(\frac{14.5 - 15}{\sqrt{7.5}}\right) = \Phi(-0.1826) = 1 - 0.5726 = 0.4274$
Therefore the probability of a Type I error (rejecting a true null hypothesis) is 2 × 0.4274 = 0.855.

9 a

| 1 | 2 | 3 | P | Q | T |
|---|---|---|---|---|---|
| − | − | − | 0 | 6 | 0 |
| + | − | − | 1 | 5 | 1 |
| − | + | − | 2 | 4 | 2 |
| + | + | − | 3 | 3 | 3 |
| − | − | + | 3 | 3 | 3 |
| + | − | + | 4 | 2 | 2 |
| − | + | + | 5 | 1 | 1 |
| + | + | + | 6 | 0 | 0 |

Therefore, for a two-tailed test with rejection region $T \le 2$, $P(T \le 2) = \frac{6}{8} = \frac{6}{2^3}$

b

| 1 | 2 | 3 | 4 | P | Q | T |
|---|---|---|---|---|---|---|
| + | + | + | + | 10 | 0 | 0 |
| − | + | + | + | 9 | 1 | 1 |
| + | − | + | + | 8 | 2 | 2 |
| + | + | − | + | 7 | 3 | 3 |
| − | − | + | + | 7 | 3 | 3 |
| + | + | + | − | 6 | 4 | 4 |
| − | + | − | + | 6 | 4 | 4 |
| + | − | − | + | 5 | 5 | 5 |
| − | + | + | − | 5 | 5 | 5 |
| + | − | + | − | 4 | 6 | 4 |
| − | − | − | + | 4 | 6 | 4 |
| + | + | − | − | 3 | 7 | 3 |
| − | − | + | − | 3 | 7 | 3 |
| − | + | − | − | 2 | 8 | 2 |
| + | − | − | − | 1 | 9 | 1 |
| − | − | − | − | 0 | 10 | 0 |

Therefore, for a two-tailed test with rejection region $T \le 2$, $P(T \le 2) = \frac{6}{16} = \frac{6}{2^4}$

c

| 1 | 2 | 3 | 4 | 5 | P | Q | T |
|---|---|---|---|---|---|---|---|
| − | − | − | − | − | 0 | 15 | 0 |
| + | − | − | − | − | 1 | 14 | 1 |
| − | + | − | − | − | 2 | 13 | 2 |
| ⋮ | | | | | ⋮ | ⋮ | ⋮ |
| + | − | + | + | + | 13 | 2 | 2 |
| − | + | + | + | + | 14 | 1 | 1 |
| + | + | + | + | + | 15 | 0 | 0 |

Each increase in n doubles the total number of possible outcomes; however, the number of sign patterns giving a value of T of two or less remains constant at six.
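The count of six can be confirmed by brute force for small n. The sketch below is illustrative only (the name `count_patterns` is an assumption): it enumerates every assignment of signs to the ranks 1 to n and counts those with T = min(P, Q) no greater than 2.

```python
from itertools import product


def count_patterns(n, t_max=2):
    """Count sign assignments to ranks 1..n for which min(P, Q) <= t_max."""
    total = n * (n + 1) // 2
    count = 0
    for signs in product([False, True], repeat=n):
        # P is the sum of the ranks that carry a positive sign
        p = sum(rank for rank, positive in zip(range(1, n + 1), signs) if positive)
        if min(p, total - p) <= t_max:
            count += 1
    return count


counts = {n: count_patterns(n) for n in range(3, 9)}
print(counts)
```

The six patterns are those with P equal to 0, 1 or 2, together with their mirror images in which Q equals 0, 1 or 2.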
Therefore, for a two-tailed test with rejection region $T \le 2$, $P(T \le 2) = \frac{6}{32} = \frac{6}{2^5}$

d For a sample of size n, $P(T \le 2) = \frac{6}{2^n}$

e $P(T \le 2) = \frac{6}{2^n} < 0.001$
$6 < 0.001 \times 2^n$
$\ln 6 < \ln 0.001 + n \ln 2$
$\ln 6 - \ln 0.001 < n \ln 2$
$\frac{\ln 6 - \ln 0.001}{\ln 2} < n$
$n > 12.55$
Therefore, n = 13.

Exercise 4.3A

1 a β = 81

b γ = 55

c m(n + m + 1) = 7 × (9 + 7 + 1) = 119. 119 − 81 = 38 and 119 − 55 = 64

d There are fewer boys, so should use 119 − 81 = 38 or 81. But as 38 is lower, the test statistic is W = 38.

e Critical value for 5% significance level one-tailed test: 43.

f H0: Girls and boys raise the same amount of sponsorship.
H1: Girls raise more sponsorship than boys.
W < 43, so reject H0. There is sufficient evidence at the 5% significance level that girls raise more sponsorship than boys.

2 a

| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| | 98A | 95B | 93A | 90B | 87B | 84B | 81A | 78A | 77A | 67A |

b $\binom{10}{6} = 210$

c 17

d H0: Quarry A and quarry B have the same purities of iron ore.
H1: Quarry A and quarry B have different purities of iron ore.
10% significance level, two-tailed test, m = 4, n = 6, critical value: 13
m(n + m + 1) − 17 = 27, therefore W = 17
As W > 13, no reason to doubt H0; the quarries have the same purity of iron ore.

e No assumptions on the underlying probability distribution are necessary to perform this test.

3 a H0: Group 1 and Group 2 are drawn from identical populations.
H1: Group 2 has higher values than Group 1.
5% significance level, one-tailed test, m = 6, n = 8, critical value: 31
m is the second group with summed rank $R_m = 62$, m(n + m + 1) − 62 = 28, therefore W = 28
As W < 31, reject H0, so Group 2 has higher values than Group 1.

b Two-tailed test: at the 2% significance level (from tables) the critical value is 27; as W > 27, no reason to doubt H0. At 5%, the critical value is 29, so this would yield a rejection of H0. Answer: 2%.
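The rank-sum statistic in question 2 can be reproduced as follows. This is a minimal sketch with an illustrative function name, assuming no tied values. It ranks from low to high, whereas the table in question 2a ranks from high to low, so the rank sum differs but the statistic W, being the smaller of $R_m$ and m(n + m + 1) − $R_m$, is the same.

```python
def rank_sum_statistic(smaller_sample, larger_sample):
    """Return (R_m, W) for the Wilcoxon rank-sum test, ranking pooled values low to high."""
    m, n = len(smaller_sample), len(larger_sample)
    pooled = sorted(smaller_sample + larger_sample)
    # Rank of each value is its 1-based position in the pooled ordering (assumes no ties)
    r_m = sum(pooled.index(value) + 1 for value in smaller_sample)
    w = min(r_m, m * (n + m + 1) - r_m)
    return r_m, w


# Question 2: purities from quarry B (m = 4) and quarry A (n = 6)
quarry_b = [95, 90, 87, 84]
quarry_a = [98, 93, 81, 78, 77, 67]
r_m, w = rank_sum_statistic(quarry_b, quarry_a)
print(r_m, w)
```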
4 a [Table: ranks 1 to 10 of the pooled distances, each labelled by ball type, C or G]
Sum of ranks C = 17
Sum of ranks G = 38

b The two sample sizes are equal.

c H0: Claxxon and Galway golf balls travel the same distance.
H1: Claxxon and Galway golf balls do not travel the same distance.
10% significance level, two-tailed test, m = 5, n = 5, critical value: 19
For Claxxon balls, $R_m = 17$, m(n + m + 1) − 17 = 38, therefore W = 17
As $W \le 19$, reject H0. There is sufficient evidence at the 10% significance level that the two types of balls do not travel the same distance.

d H0: The balls travel a median distance of 275 metres.
H1: The balls travel a median distance of greater than 275 metres.
X ∼ B(10, 0.5), 8% significance level, one-tailed test
Three negative signs.
$P(X \le 3) = 0.172 > 0.08$
No reason to doubt H0. Insufficient evidence that balls travel further than 275 metres.

5 H0: Younger and older drivers take the same length of time to pass their driving test.
H1: Younger drivers take less time than older drivers.
1% significance level, one-tailed test, m = 6, n = 10, critical value: 29
$R_m = 37$, m(n + m + 1) − 37 = 65, therefore W = 37
As W > 29, no reason to doubt H0; there is insufficient evidence at the 1% significance level that younger drivers take less time than older drivers to pass their driving test.

6 H0: The two doctors have the same waiting time.
H1: The two doctors do not have the same waiting time.
5% significance level, two-tailed test, m = 3, n = 7, critical value: 7
$R_m = 26$, m(n + m + 1) − 26 = 7, therefore W = 7
As $W \le 7$, there is just about reason to doubt H0, so just sufficient evidence at the 5% significance level that waiting times are different.

7 H0: The two samples are drawn from identical distributions.
H1: The two samples are not drawn from identical distributions.
2.5% significance level, two-tailed test, m = 13, n = 14, therefore normal approximation required.
$R_m = 231$, m(n + m + 1) − 231 = 133, therefore W = 133
$\mu = \frac{1}{2} m(n + m + 1) = \frac{1}{2} \times 13 \times (13 + 14 + 1) = 182$
$\sigma^2 = \frac{1}{12} nm(n + m + 1) = \frac{1}{12} \times 14 \times 13 \times (13 + 14 + 1) = \frac{1274}{3}$
$P(W \le 133) = \Phi\left(\frac{133.5 - 182}{\sqrt{1274 / 3}}\right) = \Phi(-2.354) = 1 - 0.9907 = 0.0093 < 0.0125$
Therefore reject H0; there is sufficient evidence at the 2.5% significance level that the two samples are not from identical distributions.

8 a The samples are no longer matched pairs, but just 12 observations from each population (though one might consider they are not randomly drawn).

b The three pairs of tied values occur within each sample, e.g. 4.94 appears in Chesford’s data twice, but not in Amerston’s. So, this means they can be ranked k and k + 1 in either ordering, and it will not affect the test.

c

| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Town | A | A | A | C | A | C | C | A | C | C | A | C |

| Rank | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Town | A | A | C | A | A | A | C | C | C | A | C | C |

H0: Chesford and Amerston have the same crime rate.
H1: Chesford and Amerston do not have the same crime rate.
5% significance level, two-tailed test, m = 12, n = 12, therefore normal approximation required.
Using Amerston as m, and ranking from low to high crime rate:
$R_m = 130$, m(n + m + 1) − 130 = 170, therefore W = 130
$\mu = \frac{1}{2} \times 12 \times 25 = 150$
$\sigma^2 = \frac{1}{12} \times 12 \times 12 \times 25 = 300$
$P(W \le 130) = \Phi\left(\frac{130.5 - 150}{\sqrt{300}}\right) = \Phi(-1.126) = 1 - 0.8698 = 0.1302 > 0.025$
Therefore no reason to doubt H0; the crime rates in Chesford and Amerston are the same.
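The normal approximation for the rank-sum statistic in questions 7 and 8c can be sketched in the same way. The helper names `phi` and `rank_sum_normal_tail` are illustrative; the formulas for the mean and variance and the continuity correction are those used in the working above.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def rank_sum_normal_tail(m, n, w):
    """Approximate P(W <= w) for the rank-sum statistic with sample sizes m and n."""
    mu = m * (n + m + 1) / 2
    variance = n * m * (n + m + 1) / 12
    z = (w + 0.5 - mu) / sqrt(variance)  # continuity correction
    return mu, variance, z, phi(z)


# Question 7: m = 13, n = 14, W = 133
mu, variance, z, tail = rank_sum_normal_tail(13, 14, 133)
print(mu, round(variance, 2), round(z, 3), round(tail, 4))
```

Calling `rank_sum_normal_tail(12, 12, 130)` reproduces the working for question 8c in the same way.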
9 Entries N have been omitted for clarity 1 2 M M M 3 4 5 6 11 3 4 10 4 5 9 5 6 8 6 7 7 7 5 9 5 6 8 6 7 7 7 8 6 6 7 7 7 8 6 6 M 9 5 5 9 5 5 M 10 4 4 M 11 3 3 M M M M M M M M M M M M M M M M M M M M M M W 3 M M Rm 60 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 4 Worked solutions Sampling distribution: w 3 4 5 6 7 P(W = w) 2 15 2 15 4 15 4 15 3 15 Therefore, lowest possible significance level for two-tailed test would be 2 . 15 Exam-style questions 1 a H0: Median ratio of pupils to teachers is the same in 2000 and 2010. H1: Median ratio of pupils to teachers is lower in 2010 than in 2000. X ∼ B(9, 0.5), 10% significance level, one-tailed test, 2 negative signs P ( X 2 ) = 0.08984 < 0.1, therefore reject H0; there is evidence to show that the median ratio of pupils to teachers is lower in 2010, so quality is increasing. b The Wilcoxon matched-pairs signed-rank test has a lower probability of Type II error. 2 H0: Median number of pages set for reading is 40. H1: Median number of pages set for reading is not 40. 2% significance level, two-tailed test, n = 14, critical value 15 Difference 9 20 −8 25 3 −5 −2 −1 38 13 17 7 15 22 |Difference| 9 20 8 25 3 5 2 1 38 13 17 7 15 22 Rank 7 11 6 13 3 4 2 1 14 8 10 5 9 12 Signed rank 7 11 −6 13 3 −4 −2 −1 14 8 10 5 9 12 P = 92, Q = 13, so T = 13 < 15, therefore reject H0; the median number of pages set is not 40. 3 H0: Drinking Blue Stallion does not improve concentration. H1: Drinking Blue Stallion does improve concentration. 5% significance level, one-tailed test, m = 6, n = 7, critical value 29 1 2 3 4 5 6 7 8 9 10 11 12 13 32 (BS) 44 (BS) 51 (Co) 58 (BS) 59 (Co) 60 (BS) 62 (Co) 67 (Co) 68 (BS) 72 (BS) 73 (Co) 74 (Co) 81 (Co) Rm = 32, m ( n + m + 1) − Rm = 52 therefore W = 32. W > 29, so no reason to doubt H0; Blue Stallion drinkers’ reaction times are drawn from an identical distribution to the control groups’ reaction times. 
4 a H0: The median coefficient of friction is the same with both oils.
H1: The median coefficient of friction is not the same with both oils.
X ~ B(15, 0.5), 5% significance level, two-tailed test, 3 negative signs.
P(X ≤ 3) = 0.01758 < 0.025, therefore reject H0; there is a difference in median coefficient of friction between the two oils.
b Critical value for n = 15, 5% significance level, two-tailed test is 25. As T = 33 > 25, there is no reason to doubt the null hypothesis. This changes the conclusion from above.

5 a The data does not appear to be symmetric.
b H0: The median amount of time for pain to be relieved is 30 minutes.
H1: The median amount of time for pain to be relieved is more than 30 minutes.
X ~ B(12, 0.5), 5% significance level, one-tailed test, 3 negative signs.
P(X ≤ 3) = 0.0730 > 0.05, therefore no reason to doubt H0; on average pain is relieved within 30 minutes.

6 H0: There is no preference for one sports kit manufacturer over the other.
H1: There is preference for one sports kit manufacturer over the other.
X ~ B(100, 0.5), therefore use a normal approximation, X ~ N(50, 25).
1% significance level, two-tailed test, 36 'negative' signs.
$P(X \le 36) = \Phi\left(\frac{36.5 - 50}{5}\right) = \Phi(-2.7) = 1 - 0.9965 = 0.0035 < 0.005$,
therefore reject H0; there is evidence of a difference in preference for the two manufacturers of the sports kit.

7 a H0: North African and Central American television ownership rates are drawn from identical distributions.
H1: North African and Central American television ownership rates are not drawn from identical distributions.
5% significance level, two-tailed test, m = 5, n = 10, critical value 23
Rank:    1     2     3     4     5     6      7      8      9      10     11     12     13     14     15
Value:   84.8  93.5  93.9  99.5  99.8  104.8  109.7  110.9  125.9  129.4  134.6  152.0  158.7  158.9  166.1
Region:  NA    NA    NA    CA    CA    NA     CA     NA     CA     CA     CA     CA     CA     CA     CA
$R_m = 20$, $m(n + m + 1) - R_m = 60$, therefore W = 20.
W < 23, reject H0; there is sufficient evidence of a difference between television ownership rates in North Africa and Central America.
b In order to perform a two-sample t-test the samples must be drawn from distributions with identical variance, but clearly the two standard deviations are not close to being the same.

8 a H0: Median score on first paper is the same as on the second paper.
H1: Median score on first paper is lower than on the second paper.
5% significance level, one-tailed test, n = 10, critical value 10
Difference:    10  −19  11  16  20  28  −4  21  8  2
|Difference|:  10   19  11  16  20  28   4  21  8  2
Rank:           4    7   5   6   8  10   2   9  3  1
Signed rank:    4   −7   5   6   8  10  −2   9  3  1
P = 46, Q = 9, so T = 9 < 10, therefore reject H0; there is evidence to support the claim that the median mark on the second paper is higher than on the first paper.
b This converts the test into a two-tailed test, and the critical value is now 8. As T > 8 there is no reason to doubt H0, which is that the median scores on the two papers are the same.
For a given significance level, a one-tailed alternative (one median is lower than the other) places the whole critical region in one tail, so that tail is larger than when testing for a generic difference (either higher or lower). Hence, it is not contradictory that the first test should reject, whilst the second test finds no reason to doubt the null hypothesis.

9 H0: The two samples are drawn from identical distributions.
H1: The two samples are not drawn from identical distributions.
5% significance level, two-tailed test, m = 15, n = 20, normal approximation required.
$R_m = 340$, $m(n + m + 1) - R_m = 200$, therefore W = 200.
$W \sim N(\mu, \sigma^2)$: $\mu = \frac{1}{2}m(n + m + 1) = 270$, $\sigma^2 = \frac{1}{12}nm(n + m + 1) = 900$
$P(W \le 200) = \Phi\left(\frac{200.5 - 270}{30}\right) = \Phi(-2.317) = 1 - 0.9898 = 0.0102 < 0.025$
Therefore, reject H0; the two samples are not drawn from identical distributions.

10 a H0: Median birth rates have not changed from 2000 to 2005.
H1: Median birth rates have decreased from 2000 to 2005.
5% significance level, one-tailed test, n = 7, critical value 3
Difference:    −2.55  −0.45  0.73  −0.28  −0.17  −0.29  0.06
|Difference|:   2.55   0.45  0.73   0.28   0.17   0.29  0.06
Rank:           7      5     6      3      2      4     1
Signed rank:   −7     −5     6     −3     −2     −4     1
P = 7, Q = 21, so T = 7 > 3, therefore no reason to doubt H0; there has been no change in median birth rate from 2000 to 2005.
b Wilcoxon rank-sum test
c 5% significance level, one-tailed test, m = n = 7, critical value 39. As W = 50 > 39, this does not change the conclusion that there is no reason to doubt the null hypothesis.

11 H0: The median weight of the first-born twin is the same as that of the second-born.
H1: The median weight of the first-born twin is greater than that of the second-born.
8% significance level, one-tailed test, n = 45, normal approximation required.
$T \sim N(\mu, \sigma^2)$: $\mu = \frac{1}{4}n(n + 1) = 517.5$, $\sigma^2 = \frac{1}{24}n(n + 1)(2n + 1) = 7848.75$
T = 437
$P(T \le 437) = \Phi\left(\frac{437.5 - 517.5}{\sqrt{7848.75}}\right) = \Phi(-0.9030) = 1 - 0.8167 = 0.1833 > 0.08$
Therefore, no reason to doubt H0; the median weight of the two twins is the same.

12 a H0: Winston and Jamal have the same median time.
H1: Winston has a lower median time than Jamal.
X ~ B(5, 0.5), 5% significance level, one-tailed test, no negative signs.
P(X = 0) = 0.03125 < 0.05, therefore reject H0; there is sufficient evidence that Winston has a lower median time than Jamal.
b The paired-sample sign test can only be used if the samples are drawn under the same conditions for each point.
In this case the races being different will have different underlying conditions, so the test is not valid.
c Use a Wilcoxon rank-sum test.
H0: Winston and Jamal's times are drawn from identical distributions.
H1: Winston's times are lower than Jamal's.
5% significance level, one-tailed test, m = n = 5, critical value 19.
Using Jamal as m: $R_m = 34$, $m(n + m + 1) - R_m = 21$, therefore W = 21.
W > 19, therefore, no reason to doubt H0; Winston and Jamal's times are drawn from identical distributions.
As only the second non-parametric test is valid, this suggests that there is insufficient evidence to pick Winston ahead of Jamal. For example, Winston's times may have come in races where the wind was behind, whereas Jamal's may have all been into a headwind. If the times had come from the same races (so under the same conditions) there might have been sufficient evidence to pick Winston ahead of Jamal, as he would have beaten him five times out of five.

13 a H0: Median black-fly damage is identical on crops treated by organic or chemical pesticides.
H1: Median black-fly damage is not identical on crops treated by organic or chemical pesticides.
5% significance level, two-tailed test, n = 9, critical value 5
Difference:    −4.6  1.2  −7.6  −4.0  3.4  −1.1  −3.2  −5.8  −3.3
|Difference|:   4.6  1.2   7.6   4.0  3.4   1.1   3.2   5.8   3.3
Rank:           7    2     9     6    5     1     3     8     4
Signed rank:   −7    2    −9    −6    5    −1    −3    −8    −4
P = 7, Q = 38, so T = 7 > 5, therefore no reason to doubt H0; type of pesticide makes no difference to prevalence of black-fly damage.
b The paired-sample t-test could be used to test whether the mean damage is the same.
c H0: Mean difference of black-fly damage between crops treated by organic or chemical pesticides is zero.
H1: Mean difference of black-fly damage between crops treated by organic or chemical pesticides is not zero.
5% significance level, two-tailed test
Estimated standard deviation of differences:
$s = \sqrt{\frac{9}{8}\left[\frac{(-4.6)^2 + 1.2^2 + \dots + (-3.3)^2}{9} - (6.4 - 3.6)^2\right]} = 3.416$
Test statistic $T = \frac{6.4 - 3.6}{3.416/\sqrt{9}} = 2.459$
Critical value from t-distribution with 8 degrees of freedom is 2.306.
As T > 2.306, reject H0; there is sufficient evidence of a difference in the mean black-fly damage between the two pesticides.
d Paired-sample t-test, as the Wilcoxon signed-rank test doesn't take into account the magnitude of the differences. Different regions may respond better to different pesticides.

14 a H0: Ship-building times in Guangnan and Jiangzhou are drawn from identical distributions.
H1: Ship-building times in Guangnan and Jiangzhou are not drawn from identical distributions.
10% significance level, two-tailed test, m = 3, n = 4, critical value 6
Ranking: 23J, 27J, 32G, 34J, 40J, 43G, 54G
$R_m = 16$, $m(n + m + 1) - R_m = 8$, therefore W = 8.
W > 6, so no reason to doubt H0; the ship-building times are the same.
b H0: The mean ship-building times in Guangnan and Jiangzhou are equal.
H1: The mean ship-building times in Guangnan and Jiangzhou are not equal.
10% significance level, two-tailed test, degrees of freedom 3 + 4 − 2 = 5
Critical value from t-distribution is 2.015.
Sample means $\bar{x}_G = 43$ and $\bar{x}_J = 31$
Estimate of shared variance $s^2 = \frac{2 \times 121 + 3 \times \frac{170}{3}}{5} = 82.4$
Test statistic $T = \frac{43 - 31}{\sqrt{82.4\left(\frac{1}{3} + \frac{1}{4}\right)}} = 1.731$
As T < 2.015, there is no reason to doubt H0; the mean ship-building times are the same for both companies.
c Advantage is that it does not rely on the underlying populations being drawn from a normal distribution. Disadvantage is that it does not use all the information (it only uses the relative sizes and not the exact values) of the data points to test the hypothesis.

15 a $R_m = \frac{1}{2} \times 10 \times (10 + 1) = 55$, assuming the first ten ranks were all from the sample of size m.
b Let ω be the maximum value at which the null hypothesis would be just rejected.
Using a normal approximation:
$W \sim N(\mu, \sigma^2)$: $\mu = \frac{1}{2}m(n + m + 1) = 125$, $\sigma^2 = \frac{1}{12}nm(n + m + 1) = \frac{875}{3}$
$P(W \le \omega) = \Phi\left(\frac{\omega + 0.5 - 125}{\sqrt{875/3}}\right) \Rightarrow 0.01 = \Phi\left(\frac{\omega + 0.5 - 125}{\sqrt{875/3}}\right)$
$\frac{\omega + 0.5 - 125}{\sqrt{875/3}} = -2.326$; this yields ω = 84.78.
Therefore W < 84.8.

Mathematics in life and work
Use a matched-pairs Wilcoxon signed-rank test, as the nesting locations are the same throughout.
H0: Median number of eggs from earlier year to later year is unchanged.
H1: Median number of eggs from earlier year to later year has decreased.
5% significance level, one-tailed test, n = 12, critical value 17
For 2000 to 2005, T = 19 and for 2005 to 2010, T = 18. In both cases T > 17, so there is no reason to doubt H0; the median number of eggs is unchanged.
However, for 2000 to 2010, T = 9 and T < 17, so the null hypothesis is rejected in favour of H1; the median number of eggs has decreased.
This demonstrates that even if over shorter periods there is no evidence to show egg numbers are decreasing, in the longer run the evidence supports this hypothesis.
2000 to 2005
Location:        A    B    C    D    E    F    G    H    I    J    K    L
2000:            154  239  107  167  130  245  280  68   179  294  273  249
2005:            99   201  121  162  129  258  254  90   162  278  242  252
Difference:      55   38   −14  5    1    −13  26   −22  17   16   31   −3
|Difference|:    55   38   14   5    1    13   26   22   17   16   31   3
Rank:            12   11   5    3    1    4    9    8    7    6    10   2
Negative ranks:            5              4         8                   2

2005 to 2010
Location:        A    B    C    D    E    F    G    H    I    J    K    L
2005:            99   201  121  162  129  258  254  90   162  278  242  252
2010:            109  193  76   140  118  246  230  108  153  203  269  212
Difference:      −10  8    45   22   11   12   24   −18  9    75   −27  40
|Difference|:    10   8    45   22   11   12   24   18   9    75   27   40
Rank:            3    1    11   7    4    5    8    6    2    12   9    10
Negative ranks:  3                                  6              9

2000 to 2010
Location:        A    B    C    D    E    F    G    H    I    J    K    L
2000:            154  239  107  167  130  245  280  68   179  294  273  249
2010:            109  193  76   140  118  246  230  108  153  203  269  212
Difference:      45   46   31   27   12   −1   50   −40  26   91   4    37
|Difference|:    45   46   31   27   12   1    50   40   26   91   4    37
Rank:            9    10   6    5    3    1    11   8    4    12   2    7
Negative ranks:                           1         8

5 Probability generating functions
Prerequisite knowledge
1 Mean, E(X) = 2 × 0.2 + 5 × 0.5 + 7 × 0.3 = 5
Variance, Var(X) = 2² × 0.2 + 5² × 0.5 + 7² × 0.3 − 5² = 3

2 Expectations, E(X) = 2, E(Y) = 5 × 0.4 = 2
Variances, Var(X) = 2, Var(Y) = 5 × 0.4 × (1 − 0.4) = 1.2

3 $\frac{9}{(4 - x)^2} = \frac{9}{16}\left(1 - \frac{1}{4}x\right)^{-2}$
$= \frac{9}{16}\left[1 + (-2)\left(-\frac{1}{4}x\right) + \frac{(-2)(-3)}{2}\left(-\frac{1}{4}x\right)^2 + \frac{(-2)(-3)(-4)}{6}\left(-\frac{1}{4}x\right)^3 + \dots\right]$
$= \frac{9}{16}\left[1 + \frac{1}{2}x + \frac{3}{16}x^2 + \frac{1}{16}x^3 + \dots\right]$
$= \frac{9}{16} + \frac{9}{32}x + \frac{27}{256}x^2 + \frac{9}{256}x^3 + \dots$

Exercise 5.1A
1 $G_X(t) = \frac{1}{2} + \frac{1}{5}t + \frac{3}{20}t^2 + \frac{1}{10}t^3 + \frac{1}{20}t^4$

2 $G_H(t) = \binom{10}{0}(0.7)^{10} + \binom{10}{1}(0.7)^9(0.3)t + \binom{10}{2}(0.7)^8(0.3)^2t^2 + \dots + \binom{10}{10}(0.3)^{10}t^{10} = (0.7 + 0.3t)^{10}$

3 a i $(1 - \alpha t)^{-1} = 1 + \alpha t + (\alpha t)^2 + (\alpha t)^3 + \dots$
$k\alpha t(1 - \alpha t)^{-1} = k\alpha t + k(\alpha t)^2 + k(\alpha t)^3 + k(\alpha t)^4 + \dots \Rightarrow P(X = 1) = k\alpha$
ii From the expansion, the coefficient of $t^r$ is $P(X = r) = k\alpha^r$.
b $G_X(1) = \frac{k\alpha}{1 - \alpha} = 1 \Rightarrow \alpha = \frac{1}{k + 1}$

4 a Geometric distribution requires a set of trials that have two outcomes (success or failure, equivalently, pass or fail his test) with these outcomes being independent from trial to trial and with a fixed probability of success. These assumptions are stated in the question. Once Sudhir passes his test, he does not retake it, which is as in the geometric distribution: once there has been a success the trials end.
b $G_X(t) = \frac{1}{3}t + \frac{2}{3} \times \frac{1}{3}t^2 + \left(\frac{2}{3}\right)^2 \times \frac{1}{3}t^3 + \left(\frac{2}{3}\right)^3 \times \frac{1}{3}t^4 + \dots$
$G_X(t) = \frac{1}{3}t\left[1 + \frac{2}{3}t + \left(\frac{2}{3}\right)^2t^2 + \left(\frac{2}{3}\right)^3t^3 + \dots\right]$
This is an infinite geometric series with a common ratio of $\frac{2}{3}t$ and initial term $\frac{1}{3}t$. Therefore
$G_X(t) = \frac{\frac{1}{3}t}{1 - \frac{2}{3}t} = \frac{t}{3 - 2t}$
$G_X(1) = \frac{1}{3 - 2} = 1$ as required.
c $G_X(t) = \frac{pt}{1 - (1 - p)t}$

5 $G_X(t) = e^{-\lambda} + e^{-\lambda}\lambda t + \frac{e^{-\lambda}\lambda^2}{2}t^2 + \frac{e^{-\lambda}\lambda^3}{6}t^3 + \dots + \frac{e^{-\lambda}\lambda^r}{r!}t^r + \dots$
$= e^{-\lambda}\left[1 + \lambda t + \frac{(\lambda t)^2}{2!} + \frac{(\lambda t)^3}{3!} + \dots\right]$
Using the Maclaurin series result this equals $e^{-\lambda}e^{\lambda t} = e^{\lambda t - \lambda} = e^{\lambda(t - 1)}$ as required.

Exercise 5.2A
1 $G_X(t) = \frac{1}{5} + \frac{2}{5}t + \frac{3}{10}t^3 + \frac{1}{10}t^{10}$
$G'_X(t) = \frac{2}{5} + \frac{9}{10}t^2 + t^9$
$G'_X(1) = \frac{2}{5} + \frac{9}{10} + 1 = 2.3 = E(X)$
$G''_X(t) = \frac{9}{5}t + 9t^8$
$G''_X(1) = \frac{9}{5} + 9 = 10.8$
$\mathrm{Var}(X) = G''_X(1) + G'_X(1) - [G'_X(1)]^2 = 10.8 + 2.3 - 2.3^2 = 7.81$

2 a $G_X(1) = k(3 + 2 \times 1)^3 = 125k = 1 \Rightarrow k = \frac{1}{125}$
b $G'_X(t) = \frac{6}{125}(3 + 2t)^2$ and $G''_X(t) = \frac{24}{125}(3 + 2t)$
Therefore $E(X) = G'_X(1) = \frac{6}{125} \times 25 = 1.2$
$\mathrm{Var}(X) = G''_X(1) + G'_X(1) - [G'_X(1)]^2 = \frac{24}{125} \times 5 + 1.2 - 1.2^2 = 0.72$

3 a Expanding brackets: $G_Z(t) = \frac{4}{25}t^{-1} + \frac{4m}{25} + \frac{m^2}{25}t$
Therefore Z can take values {−1, 0, 1}.
b $G_Z(1) = \frac{(2 + m)^2}{25} = 1 \Rightarrow (m + 2)^2 = 25$; as m > 0, m + 2 = 5, m = 3
c $G'_Z(t) = \frac{6(2 + 3t) \times 25t - 25 \times (2 + 3t)^2}{625t^2} = \frac{(2 + 3t)(3t - 2)}{25t^2}$
Therefore the mean (expectation) is $G'_Z(1) = \frac{(2 + 3)(3 - 2)}{25} = 0.2$
$G''_Z(t) = \frac{18t \times 25t^2 - 50t(9t^2 - 4)}{625t^4} = \frac{8}{25t^3}$
$G''_Z(1) = \frac{8}{25} = 0.32$
Standard deviation is $\sqrt{G''_Z(1) + G'_Z(1) - [G'_Z(1)]^2} = \sqrt{0.32 + 0.2 - 0.04} = \frac{2}{5}\sqrt{3}$

4 $G_X(t) = \frac{pt}{1 - (1 - p)t}$, therefore
$G'_X(t) = \frac{p \times (1 - (1 - p)t) - pt \times (-(1 - p))}{(1 - (1 - p)t)^2} = \frac{p}{(1 - (1 - p)t)^2}$
$E(X) = G'_X(1) = \frac{p}{(1 - (1 - p))^2} = \frac{p}{p^2} = \frac{1}{p}$ as required
$G''_X(t) = \frac{2(1 - p)p}{(1 - (1 - p)t)^3}$ so $G''_X(1) = \frac{2(1 - p)p}{(1 - (1 - p))^3} = \frac{2(1 - p)}{p^2}$
$\mathrm{Var}(X) = \frac{2(1 - p)}{p^2} + \frac{1}{p} - \frac{1}{p^2} = \frac{1 - p}{p^2}$ as required

5 a $G_Y(1) = \frac{1 + a}{5 - b} = 1 \Rightarrow a = 4 - b$
b $G'_Y(t) = \frac{a(5 - bt) + b(1 + at)}{(5 - bt)^2}$
Mean is expectation, so equal to 1, so $G'_Y(1) = \frac{a(5 - b) + b(1 + a)}{(5 - b)^2} = 1$
Substituting a = 4 − b gives $\frac{(4 - b)(5 - b) + b(1 + 4 - b)}{(5 - b)^2} = \frac{4}{5 - b} = 1 \Rightarrow b = 1$ and a = 3
c Expanding: $(1 + 3t)(5 - t)^{-1} = \frac{1}{5}(1 + 3t)\left(1 + \frac{1}{5}t + \frac{1}{25}t^2 + \dots\right)$
Therefore P(Y = 1) is the coefficient for t from the expansion: $\frac{1}{5}\left(\frac{1}{5}t + 3t\right)$, so $P(Y = 1) = \frac{16}{25}$

Exercise 5.3A
1 $G_{X+Y}(t) = G_X(t) \times G_Y(t)$
$G_{X+Y}(t) = 0.1t^2(6 + 3t + t^2) \times \frac{0.05(8 + 12t^4)}{t^2} = 0.005(6 + 3t + t^2)(8 + 12t^4)$
$G_{X+Y}(t) = 0.005(48 + 24t + 8t^2 + 72t^4 + 36t^5 + 12t^6)$
P(X + Y is odd) is the sum of coefficients of $t^r$ where r is odd.
P(X + Y is odd) = 0.005(24 + 36) = 0.3

2 Let $Y_i$ be the score on an individual dice, then
$G_{Y_i}(t) = \frac{1}{6}t + \frac{1}{6}t^2 + \frac{1}{6}t^3 + \frac{1}{6}t^4 + \frac{1}{6}t^5 + \frac{1}{6}t^6 = \frac{1}{6}t(1 + t + t^2 + \dots + t^5)$
Using the formula for a geometric sum:
$G_{Y_i}(t) = \frac{1}{6}t \times \frac{1 - t^6}{1 - t} = \frac{t(1 - t^6)}{6(1 - t)}$
So for $X = Y_1 + Y_2 + Y_3$:
$G_X(t) = \left(\frac{t(1 - t^6)}{6(1 - t)}\right)^3$ as required.
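The PGF of the total of three dice can also be expanded numerically by multiplying polynomials, which gives a direct check on tail probabilities such as P(X ≥ 16) in this question. The sketch below is illustrative: `poly_mul` is a hypothetical helper written for this example, and a PGF is stored as a list whose index r holds the coefficient of $t^r$.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of t)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# PGF of one fair die: (t + t^2 + ... + t^6) / 6
die = [Fraction(0)] + [Fraction(1, 6)] * 6

# PGF of the sum of three independent dice is the cube of the single-die PGF
total = poly_mul(poly_mul(die, die), die)

# P(X >= 16) is the sum of the coefficients of t^16, t^17 and t^18
p_at_least_16 = sum(total[16:])
print(p_at_least_16)  # 5/108
```

Using `Fraction` keeps the coefficients exact, so the result can be compared directly with a hand-derived fraction.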
Writing the PGF as $G_X(t) = \frac{t^3}{216}\left(\frac{1 - t^6}{1 - t}\right)^3$, P(X ≥ 16) is the sum of the coefficients of $t^{16}$, $t^{17}$ and $t^{18}$.
Begin by noting $\frac{1 - t^6}{1 - t} = 1 + t + t^2 + t^3 + t^4 + t^5$
so $\left(\frac{1 - t^6}{1 - t}\right)^3 = \dots + 6t^{13} + 3t^{14} + t^{15}$
therefore $P(X \ge 16) = \frac{6 + 3 + 1}{216} = \frac{5}{108}$

3 $G_{X_1}(t) = 0.75t^{-2} + 0.25t^5$ and $X = X_1 + X_2 + \dots + X_{25}$
$G_X(t) = (0.75t^{-2} + 0.25t^5)^{25}$
To find expectation:
$G'_X(t) = 25 \times (-1.5t^{-3} + 1.25t^4) \times (0.75t^{-2} + 0.25t^5)^{24}$
$E(X) = G'_X(1) = 25 \times (-1.5 + 1.25) \times (0.75 + 0.25)^{24} = -6.25$

4 a $G'_X(t) = 0.3 + 0.74t + 0.6t^2 + 0.16t^3$
$E(X) = G'_X(1) = 0.3 + 0.74 + 0.6 + 0.16 = 1.8$
$G''_X(t) = 0.74 + 1.2t + 0.48t^2$
$\mathrm{Var}(X) = G''_X(1) + G'_X(1) - [G'_X(1)]^2 = (0.74 + 1.2 + 0.48) + 1.8 - 1.8^2 = 0.98$
b i $G_Y(t) = 0.3 + 0.5t + 0.2t^2$ as $[G_Y(t)]^2 = G_X(t)$
ii P(Y = 1) = 0.5
iii 2E(Y) = E(X) ⇒ E(Y) = 0.9 and 2Var(Y) = Var(X) ⇒ Var(Y) = 0.49

5 The number of heads Alberta gets, A, is a geometric sequence, but if she succeeds (gets a tail) on her first throw there are no heads. This can be thought of as a geometric distribution with the values starting at zero. Hence $G_A(t) = \frac{0.5}{1 - 0.5t}$
The number of heads Bruno gets, B, is simple binomial, so $G_B(t) = (0.5 + 0.5t)^5$
$G_{A+B}(t) = G_A(t) \times G_B(t) = \frac{0.5(0.5 + 0.5t)^5}{1 - 0.5t}$
$0.5(0.5 + 0.5t)^5 = 0.5(0.03125 + 0.15625t + 0.3125t^2 + \dots) = 0.015625 + 0.078125t + 0.15625t^2 + \dots$
$(1 - 0.5t)^{-1} = 1 + 0.5t + 0.25t^2 + \dots$
$0.5(0.5 + 0.5t)^5 \times (1 - 0.5t)^{-1} = 0.015625 + 0.0859375t + 0.19921875t^2 + \dots$
Therefore P(A + B < 3) = P(A + B = 0) + P(A + B = 1) + P(A + B = 2)
= 0.015625 + 0.0859375 + 0.19921875 = 0.301 (3 s.f.)

6 a This is a case of the sum of two geometric distributions, each with a PGF of $G_{X_i}(t) = \frac{0.25t}{1 - 0.75t}$
So $G_{X_1 + X_2}(t) = G_{X_1}(t) \times G_{X_2}(t) = \left(\frac{0.25t}{1 - 0.75t}\right)^2$
b $G'_X(t) = \frac{0.125t(1 - 0.75t)^2 + 1.5(1 - 0.75t) \times 0.0625t^2}{(1 - 0.75t)^4}$
$E(X) = G'_X(1) = \frac{0.125(1 - 0.75)^2 + 1.5(1 - 0.75) \times 0.0625}{(1 - 0.75)^4} = 8$
c $G_X(t) = (0.25t)^2(1 - 0.75t)^{-2} = (0.25t)^2(1 + 1.5t + 1.6875t^2 + \dots)$
$P(X < 5) = 0.25^2(1 + 1.5 + 1.6875) = 0.262$ to 3 s.f.

Exam-style questions
1 a $G_X(t) = \binom{n}{0}(1 - p)^n + \binom{n}{1}(1 - p)^{n-1}pt + \binom{n}{2}(1 - p)^{n-2}p^2t^2 + \binom{n}{3}(1 - p)^{n-3}p^3t^3 + \dots + \binom{n}{n}p^nt^n = ((1 - p) + pt)^n$
b $G'_X(t) = np((1 - p) + pt)^{n-1}$ and $G''_X(t) = n(n - 1)p^2((1 - p) + pt)^{n-2}$
Therefore $E(X) = G'_X(1) = np((1 - p) + p)^{n-1} = np$
and $\mathrm{Var}(X) = G''_X(1) + G'_X(1) - [G'_X(1)]^2 = n(n - 1)p^2 + np - n^2p^2 = np(1 - p)$

2 a $G_X(1) = k\left(1(3 + 4) + (1 + 1)^2\right) = 11k = 1 \Rightarrow k = \frac{1}{11}$
b The coefficient of the $t^2$ term is P(X = 2). Therefore $P(X = 2) = \frac{4}{11}$
c $G'_X(t) = \frac{1}{11}(2 + 8t + 12t^2)$ and $G''_X(t) = \frac{1}{11}(8 + 24t)$
Therefore $E(X) = G'_X(1) = 2$ and
$\mathrm{Var}(X) = G''_X(1) + G'_X(1) - [G'_X(1)]^2 = \frac{32}{11} + 2 - 4 = \frac{10}{11}$

3 $G_Y(t) = pt + (1 - p)pt^2 + (1 - p)^2pt^3 + (1 - p)^3pt^4 + \dots$
$= pt\left(1 + (1 - p)t + (1 - p)^2t^2 + (1 - p)^3t^3 + \dots\right)$
This is an infinite geometric series with common ratio (1 − p)t and first term pt.
Therefore $G_Y(t) = \frac{pt}{1 - (1 - p)t}$
To find expectation and variance:
$G'_Y(t) = \frac{p \times (1 - (1 - p)t) - pt \times (-(1 - p))}{(1 - (1 - p)t)^2} = \frac{p}{(1 - (1 - p)t)^2}$
$E(Y) = G'_Y(1) = \frac{p}{(1 - (1 - p))^2} = \frac{p}{p^2} = \frac{1}{p}$
$G''_Y(t) = \frac{2(1 - p)p}{(1 - (1 - p)t)^3}$
$G''_Y(1) = \frac{2(1 - p)p}{(1 - (1 - p))^3} = \frac{2(1 - p)}{p^2}$
$\mathrm{Var}(Y) = \frac{2(1 - p)}{p^2} + \frac{1}{p} - \frac{1}{p^2} = \frac{1 - p}{p^2}$

4 a $G_Q(1) = c\left((1 + 1)^4 + 2(2 + 3)^2\right) = 66c = 1 \Rightarrow c = \frac{1}{66}$
b $G'_Q(t) = \frac{1}{66}\left(4(1 + t)^3 + 12(2 + 3t)\right)$
Therefore, $E(Q) = G'_Q(1) = \frac{1}{66}\left(4(1 + 1)^3 + 12(2 + 3)\right) = \frac{46}{33}$
c $G''_Q(t) = \frac{1}{66}\left(12(1 + t)^2 + 36\right)$
so $\mathrm{Var}(Q) = G''_Q(1) + G'_Q(1) - [G'_Q(1)]^2 = \frac{1}{66}\left(12(1 + 1)^2 + 36\right) + \frac{46}{33} - \left(\frac{46}{33}\right)^2 = 0.724$
d $G_{P+Q}(t) = G_P(t) \times G_Q(t) = \frac{1}{4356}\left((1 + t)^4 + 2(2 + 3t)^2\right)^2$

5 a $G_X(1) = \frac{a}{b - 1} = 1 \Rightarrow b = a + 1$
b $G'_X(t) = \frac{a(b - t) + at}{(b - t)^2}$
$E(X) = G'_X(1) = \frac{a(b - 1) + a}{(b - 1)^2} = \frac{a^2 + a}{a^2} = \frac{a + 1}{a}$
c $G_X(t) = \frac{a}{b}t\left(1 - \frac{t}{b}\right)^{-1} = \frac{a}{b}t\left(1 + \frac{t}{b} + \frac{t^2}{b^2} + \dots\right) = \frac{a}{a + 1}t + \frac{a}{(a + 1)^2}t^2 + \frac{a}{(a + 1)^3}t^3 + \dots$
d $p = \frac{a}{a + 1}$ and $1 - p = \frac{1}{a + 1}$

6 a $G'_X(t) = 4te^{2t^2 - 2}$ and $G''_X(t) = 4(1 + 4t^2)e^{2t^2 - 2}$
so $E(X) = G'_X(1) = 4$
and $\mathrm{Var}(X) = G''_X(1) + G'_X(1) - [G'_X(1)]^2 = 20 + 4 - 4^2 = 8$
b $G_{X+Y}(t) = G_X(t) \times G_Y(t) = e^{2t^2 - 2} \times e^{t - 1} = e^{2t^2 + t - 3} = e^{(t - 1)(2t + 3)}$

7 a $G_X(1) = \frac{3 + a}{5 - 1} = 1 \Rightarrow a = 1$
b The coefficient of $t^2$ represents P(X = 2).
$G_X(t) = \frac{1}{5}(3 + t^2)\left(1 - \frac{t}{5}\right)^{-1} = \frac{1}{5}(3 + t^2)\left(1 + \frac{t}{5} + \frac{t^2}{25} + \dots\right)$
Coefficient is: $\frac{1}{5}\left(1 + 3 \times \frac{1}{25}\right) = \frac{28}{125}$
c $G'_X(t) = \frac{2t(5 - t) + (3 + t^2)}{(5 - t)^2}$ so $E(X) = G'_X(1) = \frac{3}{4}$
d $H_Z(t) = [G_X(t)]^2 \Rightarrow H_Z(t) = \left(\frac{3 + t^2}{5 - t}\right)^2$
$H'_Z(t) = \frac{4t(3 + t^2)(5 - t)^2 + 2(5 - t)(3 + t^2)^2}{(5 - t)^4}$
so $E(Z) = H'_Z(1) = \frac{256 + 128}{256} = \frac{3}{2} = 2 \times E(X)$ as required

8 a $G_{D_1}(t) = k(t + 2t^2 + 3t^3 + 4t^4 + 5t^5 + 6t^6)$
When t = 1, 21k = 1, therefore $k = \frac{1}{21}$
b $G'_{D_1}(t) = \frac{1}{21}(1 + 4t + 9t^2 + 16t^3 + 25t^4 + 36t^5)$, $G'_{D_1}(1) = \frac{13}{3}$
and $G''_{D_1}(t) = \frac{1}{21}(4 + 18t + 48t^2 + 100t^3 + 180t^4)$, $G''_{D_1}(1) = \frac{50}{3}$
$E(X) = G'_{D_1}(1) = \frac{13}{3}$ and
$\mathrm{Var}(X) = G''_{D_1}(1) + G'_{D_1}(1) - [G'_{D_1}(1)]^2 = \frac{20}{9}$
c Let D be the sum of the scores on the three dice, then
$G_D(t) = \frac{1}{9261}(t + 2t^2 + 3t^3 + 4t^4 + 5t^5 + 6t^6)^3$
P(D ≥ 16) is given by the coefficients of the $t^{16}$, $t^{17}$ and $t^{18}$ terms.
$P(D \ge 16) = \frac{1}{9261}\left((432 + 450) + 540 + 216\right) = \frac{26}{147}$

9 a $G_V(t) = \frac{a}{b}t^2 + \frac{2}{b}t^{-2}$
$G_V(1) = \frac{a + 2}{b} = 1 \Rightarrow b = a + 2$
$G'_V(t) = \frac{2a}{b}t - \frac{4}{b}t^{-3}$
$E(V) = G'_V(1) = \frac{2a - 4}{b} = -\frac{2}{3} \Rightarrow b = 6 - 3a$
Solving simultaneously yields a = 1 and b = 3.
b Let V be the sum of six independent observations, then
$H_V(t) = [G_V(t)]^6 = \left(\frac{1}{3}t^2 + \frac{2}{3}t^{-2}\right)^6$
P(V = 0) is the coefficient of the constant term. Using the binomial expansion, this coefficient is $\binom{6}{3}\left(\frac{1}{3}\right)^3\left(\frac{2}{3}\right)^3 = \frac{160}{729}$

10 a $G_Y(1) = k(1 + a) = 1 \Rightarrow k = \frac{1}{1 + a}$
$G'_Y(t) = 3akt^2$ and $G''_Y(t) = 6akt$
$\mathrm{Var}(X) = G''_Y(1) + G'_Y(1) - [G'_Y(1)]^2 = 6ak + 3ak - (3ak)^2 = 9ak(1 - ak) = 2$
Substituting $k = \frac{1}{1 + a}$ gives
$\frac{9a}{1 + a}\left(1 - \frac{a}{1 + a}\right) = \frac{9a}{(1 + a)^2} = 2$
$2a^2 - 5a + 2 = 0 \Rightarrow a = \frac{1}{2}$ or 2 and $k = \frac{2}{3}$ or $\frac{1}{3}$
b $H_Z(t) = k^{10}(1 + at^3)^{10}$
P(Z ≤ 3) is given by the sum of coefficients of the constant term, t, $t^2$ and $t^3$.
Using a binomial expansion: $H_Z(t) = k^{10}(1 + 10at^3 + \dots)$
For $a = \frac{1}{2}$, $k = \frac{2}{3}$: P(Z ≤ 3) = 0.104
For a = 2, $k = \frac{1}{3}$: P(Z ≤ 3) = 0.000356

11 a E(X) = 0 × (1 − p) + 1 × p = p and
Var(X) = [0² × (1 − p) + 1² × p] − p² = p(1 − p)
b $G_X(t) = P(X = 0) \times t^0 + P(X = 1) \times t^1 = (1 - p) + pt$
c $G_X(t) = e^{-\lambda} + e^{-\lambda}\lambda t + \frac{e^{-\lambda}\lambda^2}{2}t^2 + \frac{e^{-\lambda}\lambda^3}{6}t^3 + \dots + \frac{e^{-\lambda}\lambda^r}{r!}t^r + \dots$
$= e^{-\lambda}\left[1 + \lambda t + \frac{(\lambda t)^2}{2!} + \frac{(\lambda t)^3}{3!} + \dots\right]$
Using the Maclaurin series result, $= e^{-\lambda}e^{\lambda t} = e^{\lambda t - \lambda} = e^{-\lambda(1 - t)}$ as required.
d $H'_Y(t) = \lambda e^{-\lambda(1 - t)}$ and $H''_Y(t) = \lambda^2e^{-\lambda(1 - t)}$
$H'_Y(1) = \lambda$ and $H''_Y(1) = \lambda^2$ so
$E(X) = H'_Y(1) = \lambda$ and
$\mathrm{Var}(X) = H''_Y(1) + H'_Y(1) - [H'_Y(1)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$ as required.
e $K_Z(t) = H_Y(G_X(t)) = e^{-\lambda(1 - ((1 - p) + pt))} = e^{\lambda p(t - 1)}$ as required
f $K'_Z(t) = \lambda pe^{\lambda p(t - 1)}$ and $K''_Z(t) = (\lambda p)^2e^{\lambda p(t - 1)}$
$K'_Z(1) = \lambda p$ and $K''_Z(1) = (\lambda p)^2$ so
$E(Z) = K'_Z(1) = \lambda p$ and $\mathrm{Var}(Z) = (\lambda p)^2 + \lambda p - (\lambda p)^2 = \lambda p$
g Po(λp)

Mathematics in life and work
Let X be the score of the voter: $G_X(t) = 0.4t + 0.25 + 0.35t^{-1}$
If the sample is random, then scores for each voter will be independent, so let Y be the total score of three voters, hence
$G_Y(t) = (0.4t + 0.25 + 0.35t^{-1})^3$
In the sample of three, these provisos imply that Y ≥ 2 (2 yes, 1 non-vote or 3 yes votes).
Therefore, P(Y ≥ 2) is the sum of the coefficients of the $t^2$ and $t^3$ terms.
$P(Y \ge 2) = 0.4^3 + 3 \times 0.4^2 \times 0.25 = 0.184$

Summary Review
Warm-up Questions
1 Width of the confidence interval is $\frac{2\sigma z}{\sqrt{n}} < 0.2$
σ = 0.17 and for a 99% confidence interval z = 2.576
$\frac{2 \times 0.17 \times 2.576}{\sqrt{n}} < 0.2$
$\sqrt{n} > 4.3792\dots$
$n > 19.17\dots$
$n_{MIN} = 20$

2 H0: μ = 17
H1: μ ≠ 17
This is a two-tailed test with 2.5% in each tail ⇒ z = ±1.96
$\bar{x} = \frac{17.8 + 22.4 + 16.3 + 23.1 + 11.4}{5} = 18.2$
$\frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{18.2 - 17}{2.4/\sqrt{5}} = 1.12$
−1.96 < 1.12 < 1.96 ⇒ not in the critical region.
Accept H0: μ = 17. Accept the lecturer's claim.

3 C ~ N(91, 3.2²) and S ~ N(72, 2.6²)
$X = C_1 + \dots + C_6 + S_1 + \dots + S_6 + 550$
E(X) = 6 × 91 + 6 × 72 + 550 = 1528
Var(X) = 6 × 3.2² + 6 × 2.6² + 0² = 102
X ~ N(1528, 102)
$P(X > 1550) = P\left(Z > \frac{1550 - 1528}{\sqrt{102}}\right) = P(Z > 2.178) = 1 - P(Z \le 2.178) = 1 - 0.9853 = 0.0147$

A Level Questions
1 $\sum x = 5$ and $\sum x^2 = 11$ for N observations, so $\bar{x} = \frac{5}{N}$
$s_x^2 = \frac{1}{N - 1}\left(11 - \frac{5^2}{N}\right)$
$\sum y = 10$ and $\sum y^2 = 160$ for 10 observations, so $\bar{y} = \frac{10}{10} = 1$
$s_y^2 = \frac{1}{9}\left(160 - \frac{10^2}{10}\right) = \frac{150}{9}$
So, the pooled estimate is:
$s_p^2 = \frac{(N - 1) \times \frac{1}{N - 1}\left(11 - \frac{25}{N}\right) + 9\left(\frac{150}{9}\right)}{N + 10 - 2}$
$s_p^2 = \frac{11 - \frac{25}{N} + 150}{N + 8}$
Given that $s_p^2 = 12$:
$12(N + 8) = 11 - \frac{25}{N} + 150$
$12N + 96 = 161 - \frac{25}{N}$
$12N^2 - 65N + 25 = 0$
$(12N - 5)(N - 5) = 0$
$N = \frac{5}{12}$ or N = 5
We know that N must be an integer, so N = 5.

2 H0: median = 400 ml
H1: median < 400 ml
The deviations from the median are: −10, −3, −15, 10, −8, −30, 30, −3, −12, −9, −25, 42, −4, −28, −19, 4.
There are 4 positives and 12 negatives.
Under H0, X ~ B(16, 0.5)
P(X ≥ 12) = 0.0384
0.0384 < 0.05 ⇒ this result is in the critical region.
Reject H0 and accept H1. The customers' complaints are justified.
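The sign-test tail probability in A Level question 2 above can be verified directly from the binomial distribution. This is a small sketch under stated assumptions: the helper name `binomial_upper_tail` is invented for this example, and X is taken to be the number of negative deviations, as in the solution.

```python
from math import comb


def binomial_upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k, n + 1))


# Deviations from the hypothesised median of 400 ml, as listed in the solution
deviations = [-10, -3, -15, 10, -8, -30, 30, -3, -12, -9, -25, 42, -4, -28, -19, 4]
negatives = sum(1 for d in deviations if d < 0)

p_value = binomial_upper_tail(len(deviations), negatives)
print(negatives, round(p_value, 4))  # 12 0.0384
```

By symmetry of B(16, 0.5), the same value is obtained from the lower tail P(X ≤ 4) when X counts the positive deviations instead.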
3 $F(x) = \begin{cases} 0 & x < 1 \\ \frac{1}{63}(x^3 - 1) & 1 \le x \le 4 \\ 1 & x > 4 \end{cases}$
$Y = X^2 \Rightarrow X = \sqrt{Y}$
New limits are: 1 → 1, 4 → 16
For 1 ≤ y ≤ 16, $G(y) = \frac{1}{63}\left((\sqrt{y})^3 - 1\right) = \frac{1}{63}\left(y^{3/2} - 1\right)$
$G(y) = \begin{cases} 0 & y < 1 \\ \frac{1}{63}\left(y^{3/2} - 1\right) & 1 \le y \le 16 \\ 1 & y > 16 \end{cases}$
Differentiating:
$g(y) = \begin{cases} \frac{1}{42}y^{1/2} & 1 \le y \le 16 \\ 0 & \text{otherwise} \end{cases}$
i At the median, G(y) = 0.5 and y = m
$\frac{1}{63}\left(m^{3/2} - 1\right) = 0.5$
$m^{3/2} = 32.5$
m = 10.18
ii $E(Y) = \int_1^{16} y\,g(y)\,dy = \frac{1}{42}\int_1^{16} y^{3/2}\,dy = \frac{1}{105}\left[y^{5/2}\right]_1^{16} = \frac{1}{105}[1024 - 1] = 9.74$

4 i $G_X(1) = 1 \Rightarrow \frac{2 + a}{5} = 1 \Rightarrow a = 3$
ii For P(X = 2), we need the coefficient of $t^2$.
$\frac{2 + 3t^3}{7 - 2t} = (2 + 3t^3)(7 - 2t)^{-1} = \frac{1}{7}(2 + 3t^3)\left(1 - \frac{2}{7}t\right)^{-1}$
$= \frac{1}{7}(2 + 3t^3)\left[1 + (-1)\left(-\frac{2}{7}t\right) + \frac{(-1)(-2)}{2!}\left(-\frac{2}{7}t\right)^2 + \dots\right]$
The term in $t^2$ is:
$\frac{1}{7} \times 2 \times \frac{(-1)(-2)}{2!}\left(-\frac{2}{7}t\right)^2 = \frac{2}{7} \times \frac{4}{49}t^2 = \frac{8}{343}t^2$
So $P(X = 2) = \frac{8}{343}$
iii $G'_X(t) = \frac{(7 - 2t)9t^2 - (2 + 3t^3)(-2)}{(7 - 2t)^2}$
$E(X) = G'_X(1) = \frac{5 \times 9 - 5 \times (-2)}{25} = \frac{11}{5}$

5 Let d = score before eating fruit − score after eating fruit
H0: $\mu_d = 0$; there is no difference between the two sets of results
H1: $\mu_d < 0$; there is an increase in the results
This is a one-tailed test with p = 0.05, ν = 13, so the critical value is −1.771
Calculating the differences and squares:
Student:      1   2   3   4   5   6   7   8   9   10  11  12  13  14
Before:       15  10  7   12  18  16  15  13  10  5   19  20  14  15
After:        16  7   11  14  19  15  12  15  11  7   18  19  19  18
Differences:  −1  3   −4  −2  −1  1   3   −2  −1  −2  1   1   −5  −3
Squared:      1   9   16  4   1   1   9   4   1   4   1   1   25  9
$\sum d = -12$ and $\sum d^2 = 86$
$\bar{x}_d = \frac{-12}{14} = -\frac{6}{7}$
$s_d^2 = \frac{1}{13}\left(86 - \frac{(-12)^2}{14}\right) = \frac{1}{13} \times \frac{530}{7} = \frac{530}{91}$
The test statistic is:
$\frac{-\frac{6}{7}}{\sqrt{\frac{530}{91} \div 14}} = -1.329$
−1.329 > −1.771
⇒ not in the critical region. Accept H0.
There is insufficient evidence to claim that eating fruit improves mathematical ability. The claim is not justified.

6 H0: gender and preferred brand are independent
H1: gender and preferred brand are not independent
Table of expected frequencies:
         A      B      C
Male     28.57  37.71  13.71
Female   21.43  28.29  10.29
$X^2 = \sum\frac{(O - E)^2}{E}$
$= \frac{(32 - 28.57)^2}{28.57} + \frac{(36 - 37.71)^2}{37.71} + \frac{(12 - 13.71)^2}{13.71} + \frac{(18 - 21.43)^2}{21.43} + \frac{(30 - 28.29)^2}{28.29} + \frac{(12 - 10.29)^2}{10.29}$
= 0.4118 + 0.07754 + 0.2133 + 0.5490 + 0.1034 + 0.2842 = 1.639
For a 5% test with ν = 1 × 2 = 2, the critical value is 5.991.
1.639 < 5.991 ⇒ not in the critical region.
Accept H0. Gender and brand are independent. There is no difference in the preferences between males and females.
If the sample is n times larger, then χ² will also be n times larger. For χ² to be in the critical region it must be greater than 5.991.
1.639n > 5.991
n > 3.655
Since n is an integer, $n_{MIN} = 4$.

7 $\int_2^x \frac{x}{6}\,dx = \left[\frac{x^2}{12}\right]_2^x = \frac{x^2}{12} - \frac{4}{12} = \frac{1}{12}(x^2 - 4)$
$F(x) = \begin{cases} 0 & x < 2 \\ \frac{1}{12}(x^2 - 4) & 2 \le x \le 4 \\ 1 & x > 4 \end{cases}$
$Y = X^3 \Rightarrow X = \sqrt[3]{Y}$
New limits are: 2 → 8, 4 → 64
For 8 ≤ y ≤ 64, $G(y) = \frac{1}{12}\left((\sqrt[3]{y})^2 - 4\right) = \frac{1}{12}\left(y^{2/3} - 4\right)$
So, the CDF is:
$G(y) = \begin{cases} 0 & y < 8 \\ \frac{1}{12}\left(y^{2/3} - 4\right) & 8 \le y \le 64 \\ 1 & y > 64 \end{cases}$
Differentiating:
$g(y) = \begin{cases} \frac{1}{18}y^{-1/3} & 8 \le y \le 64 \\ 0 & \text{otherwise} \end{cases}$
$E(Y) = \int_8^{64} y\,g(y)\,dy = \frac{1}{18}\int_8^{64} y^{2/3}\,dy = \frac{1}{30}\left[y^{5/3}\right]_8^{64} = \frac{1}{30}\left[4^5 - 2^5\right] = 33.1$

8 i H0: median difference = 0. There is no change in the amount of litter in the street.
H1: median difference < 0. There has been a reduction in the amount of litter in the street.
Let the difference be after minus before.
Calculating the differences and ranks gives:

Site                     A    B    C    D    E    F    G    H    I    J
Before poster campaign   85   146  137  120  79   95   153  144  108  127
After poster campaign    78   120  110  128  61   65   121  131  88   104
Difference               −7   −26  −27  8    −18  −30  −32  −13  −20  −23
Rank                     −1   −7   −8   2    −4   −9   −10  −3   −5   −6

Sum of the positive ranks: P = 2
Absolute sum of the negative ranks: Q = 53
Therefore T = 2.
For a one-tail test at the 1% level, T ≤ 5 to reject H0.
Since T = 2 ≤ 5 we can reject H0 and accept H1. There has been a reduction in the amount of litter in the street.

ii The test tells us if there has been a significant change, but it does not establish cause and effect. In this case, the reduced amount of litter may not be as a result of the poster campaign.

9 $\lambda = \dfrac{(0 \times 7) + (1 \times 20) + (2 \times 39) + (3 \times 16) + (4 \times 14) + (5 \times 2) + (6 \times 1) + (7 \times 1)}{100}$
$\lambda = \dfrac{225}{100} = 2.25$
H0: data can be modelled by Po(2.25)
H1: data cannot be modelled by Po(2.25)
The expected values are calculated using $100 \times \dfrac{e^{-2.25} \times 2.25^{r}}{r!}$, which gives:
10.540, 23.715, 26.679, 20.009, 11.255, 5.065, 1.899, 0.6105, 0.2275
The last three expected values are too small as they must be greater than 5, so the final four categories are combined to get an observed value of 4 and an expected value of 7.802.
$X^2 = \sum \dfrac{(O - E)^2}{E} = 1.189 + 0.5820 + 5.690 + 0.803 + 0.6695 + 1.853 = 10.8$
At the 2.5% level with v = 4, the critical value is 11.14.
10.8 < 11.14 ⇒ Accept H0. The data can be modelled by Po(2.25).
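The goodness-of-fit arithmetic in question 9 can be checked numerically. The sketch below is only an illustration, not part of the authors' solution: the variable names are hypothetical, and it assumes the classes r ≥ 5 are pooled exactly as described above, with the pooled expected value taken as the remainder of the 100 observations.

```python
import math

# Observed frequencies for r = 0, 1, ..., 7, read from the lambda calculation in question 9.
observed = [7, 20, 39, 16, 14, 2, 1, 1]
n = sum(observed)
lam = sum(r * f for r, f in enumerate(observed)) / n  # sample mean used as lambda


def poisson_pmf(r, rate):
    """P(X = r) for X ~ Po(rate)."""
    return math.exp(-rate) * rate**r / math.factorial(r)


# Expected frequencies for r = 0..4, then one pooled class for r >= 5.
expected = [n * poisson_pmf(r, lam) for r in range(5)]
expected.append(n - sum(expected))
pooled_observed = observed[:5] + [sum(observed[5:])]

chi_squared = sum((o - e) ** 2 / e for o, e in zip(pooled_observed, expected))
# Six classes, one constraint for the total and one for the estimated mean.
degrees_of_freedom = len(expected) - 2

print(lam, round(expected[-1], 3), round(chi_squared, 2), degrees_of_freedom)
```

The critical value 11.14 still has to come from tables; the standard library has no chi-squared quantile function.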
10 i $G_Y(t) = k(5t - at^4)$
$G_Y(1) = 1 \Rightarrow 1 = k(5 - a)$  (1)
$G_Y'(t) = k(5 - 4at^3)$
$E(Y) = 2 \Rightarrow 2 = k(5 - 4a)$  (2)
(2) ÷ (1): $2 = \dfrac{5 - 4a}{5 - a}$
$10 - 2a = 5 - 4a \Rightarrow a = -\dfrac{5}{2}$
Substituting in (1): $1 = k\left(5 + \dfrac{5}{2}\right) \Rightarrow k = \dfrac{2}{15}$

ii $G_Y(t) = \dfrac{2}{15}\left(5t + \dfrac{5}{2}t^4\right) = \dfrac{1}{3}(2t + t^4)$
$G_Y'(t) = \dfrac{1}{3}(2 + 4t^3)$
$G_Y''(t) = \dfrac{1}{3}(12t^2) = 4t^2$
$\mathrm{Var}(Y) = G_Y''(1) + G_Y'(1) - [G_Y'(1)]^2$
$G_Y'(1) = E(Y) = 2$
$G_Y''(1) = 4 \times 1^2 = 4$
$\mathrm{Var}(Y) = 4 + 2 - 2^2 = 2$

iii $H_Z(t) = \left[\dfrac{1}{3}(2t + t^4)\right]^3$
$= \dfrac{1}{27}\left[(2t)^3 + 3(2t)^2(t^4) + 3(2t)(t^4)^2 + (t^4)^3\right]$
$= \dfrac{1}{27}\left[8t^3 + 12t^6 + 6t^9 + t^{12}\right]$
P(Z ≤ 6) is the sum of the coefficients of t with powers ≤ 6.
$P(Z \le 6) = \dfrac{8}{27} + \dfrac{12}{27} = \dfrac{20}{27}$

11 i $\bar{x} = \dfrac{2478}{55} = 45.05$, $s_x^2 = \dfrac{343.75}{55} = 6.25$
$\bar{y} = \dfrac{3981}{70} = 56.87$, $s_y^2 = \dfrac{857.5}{70} = 12.25$
For a 90% confidence interval, we need p = 0.95 ⇒ z = 1.645
$(45.05 - 56.87) \pm 1.645 \times \sqrt{\dfrac{6.25}{55} + \dfrac{12.25}{70}}$
$-12.7 \le \mu_x - \mu_y \le -10.9$

ii H0: $\mu_x - \mu_y = 0$
H1: $\mu_x - \mu_y \ne 0$
The test statistic is $\dfrac{45.05 - 56.87}{\sqrt{\dfrac{6.25}{55} + \dfrac{12.25}{70}}} = -22.0$
For a two-tail test at the 10% significance level, z = ±1.645
−22.0 < −1.645 ⇒ it is in the critical region. Reject H0. μx is not the same as μy.
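As a numerical cross-check of question 11, the interval and test statistic can be recomputed from the summary totals. This is a sketch under the same assumptions as the solution (variances as given, normal critical value 1.645 taken from tables); the variable names are illustrative only.

```python
import math

# Summary data quoted in the solution to question 11.
n_x, mean_x, var_x = 55, 2478 / 55, 343.75 / 55
n_y, mean_y, var_y = 70, 3981 / 70, 857.5 / 70

difference = mean_x - mean_y
standard_error = math.sqrt(var_x / n_x + var_y / n_y)

z_critical = 1.645  # from normal tables, 5% in each tail
lower = difference - z_critical * standard_error
upper = difference + z_critical * standard_error

z_statistic = difference / standard_error

print(round(lower, 1), round(upper, 1), round(z_statistic, 1))
```

Working with unrounded means gives the same figures to 3 significant figures as the rounded values 45.05 and 56.87 used above.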
12 For 1 ≤ x ≤ 3, $F(x) = \int \dfrac{1}{2}\,dx = \dfrac{x}{2} + c$
When x = 1, $F(x) = 0 \Rightarrow \dfrac{1}{2} + c = 0 \Rightarrow c = -\dfrac{1}{2}$
$F(x) = \dfrac{x}{2} - \dfrac{1}{2} = \dfrac{1}{2}(x - 1)$
$F(x) = \begin{cases} 0 & x < 1 \\ \dfrac{1}{2}(x - 1) & 1 \le x \le 3 \\ 1 & x > 3 \end{cases}$
G(y) = P(Y ≤ y), Y = X³
$\Rightarrow G(y) = P(X^3 \le y) = P\left(X \le y^{\frac{1}{3}}\right) = F\left(y^{\frac{1}{3}}\right)$
$G(y) = \begin{cases} 0 & y < 1 \\ \dfrac{1}{2}\left(y^{\frac{1}{3}} - 1\right) & 1 \le y \le 27 \\ 1 & y > 27 \end{cases}$
For 1 ≤ y ≤ 27, $G(y) = \dfrac{1}{2}\left(y^{\frac{1}{3}} - 1\right) \Rightarrow g(y) = \dfrac{1}{6}y^{-\frac{2}{3}} = \dfrac{1}{6y^{\frac{2}{3}}}$
$g(y) = \begin{cases} \dfrac{1}{6y^{\frac{2}{3}}} & 1 \le y \le 27 \\ 0 & \text{otherwise} \end{cases}$
[Graph of g(y) against y for 1 ≤ y ≤ 27]
$E(Y) = \int_1^{27} y\,g(y)\,dy = \int_1^{27} \dfrac{1}{6}y^{\frac{1}{3}}\,dy = \left[\dfrac{3}{4} \times \dfrac{1}{6}y^{\frac{4}{3}}\right]_1^{27} = \dfrac{3}{24}\left[y^{\frac{4}{3}}\right]_1^{27} = \dfrac{3}{24}(81 - 1) = 10$
P(median ≤ Y ≤ mean) = |P(Y < 10) − 0.5| = |G(10) − 0.5|
$= \left|\dfrac{1}{2}\left(10^{\frac{1}{3}} - 1\right) - 0.5\right| = 0.0772$ (3 s.f.)

13 $\sum x = 2623$, $\sum x^2 = 1\,376\,081$, $\bar{x} = \dfrac{2623}{5} = 524.6$
$s^2 = \dfrac{1}{4}\left(1\,376\,081 - \dfrac{2623^2}{5}\right) = 13.8$
95% confidence interval ⇒ 2.5% in each tail ⇒ p = 0.975
There are 4 degrees of freedom ⇒ $t_{4,\,0.975} = 2.776$ (from tables)
Therefore, the confidence interval is:
$524.6 \pm 2.776\sqrt{\dfrac{13.8}{5}} = 524.6 \pm 4.6118$
[520, 529] to 3 s.f.
Let the first sample be sample A and the second sample be sample B.
H0: μA = μB
H1: μA ≠ μB
For sample B: $\sum x = 5216$, $\sum x^2 = 2\,720\,780$, $\bar{x} = \dfrac{5216}{10} = 521.6$
$s_B^2 = \dfrac{1}{9}\left(2\,720\,780 - \dfrac{5216^2}{10}\right) = 12.71$
For the combined sample:
$s_p^2 = \dfrac{4 \times 13.8 + 9 \times 12.71}{13} = 13.05$
$T = \dfrac{524.6 - 521.6}{\sqrt{13.05\left(\dfrac{1}{5} + \dfrac{1}{10}\right)}} = 1.516$
10% significance level and 2-tail test ⇒ p = 0.95
There are 13 degrees of freedom ⇒ $t_{13,\,0.95} = 1.771$ (from tables)
1.5164 < 1.771 ⇒ not in the critical region ⇒ accept H0.
There is no significant evidence of a difference in the population means before and after the adjustments.
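The pooled two-sample t statistic in question 13 can be reproduced from the four totals alone. The following is a minimal sketch, assuming equal population variances as the solution does; the helper `sample_variance` and the variable names are hypothetical.

```python
import math


def sample_variance(n, sum_x, sum_x2):
    """Unbiased variance estimate from n, the sum of x and the sum of x squared."""
    return (sum_x2 - sum_x**2 / n) / (n - 1)


# Totals quoted in the solution to question 13.
n_a, sum_a, sum_a2 = 5, 2623, 1376081
n_b, sum_b, sum_b2 = 10, 5216, 2720780

mean_a, mean_b = sum_a / n_a, sum_b / n_b
var_a = sample_variance(n_a, sum_a, sum_a2)
var_b = sample_variance(n_b, sum_b, sum_b2)

# Pooled variance weights each estimate by its degrees of freedom.
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
t_statistic = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

print(round(var_a, 2), round(var_b, 2), round(pooled_var, 2), round(t_statistic, 3))
```

The statistic is then compared with the tabulated value 1.771 for 13 degrees of freedom.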
14 i $\int_0^{\infty} Ae^{-\lambda t}\,dt = 1$
$\left[-\dfrac{A}{\lambda}e^{-\lambda t}\right]_0^{\infty} = 1$
$[0] - \left[-\dfrac{A}{\lambda}\right] = 1$
$\dfrac{A}{\lambda} = 1$
$A = \lambda$

ii $\int_0^1 \lambda e^{-\lambda t}\,dt \approx \dfrac{16}{100}$
$\left[-e^{-\lambda t}\right]_0^1 \approx \dfrac{16}{100}$
$\left[-e^{-\lambda}\right] - [-1] \approx \dfrac{16}{100}$
$e^{-\lambda} \approx 1 - \dfrac{16}{100}$
$-\lambda \approx \ln\left(1 - \dfrac{16}{100}\right)$
$\lambda \approx -\ln\left(1 - \dfrac{16}{100}\right) = 0.174$
For the median:
$\int_0^T 0.174e^{-0.174t}\,dt = 0.5$
$\left[-e^{-0.174t}\right]_0^T = 0.5$
$\left[-e^{-0.174T}\right] - [-1] = 0.5$
$e^{-0.174T} = 0.5$
$-0.174T = \ln 0.5$
T = 3.98 years (3 s.f.)

15 G(y) = P(Y ≤ y), Y = X³
$\Rightarrow G(y) = P(X^3 \le y) = P\left(X \le y^{\frac{1}{3}}\right) = F\left(y^{\frac{1}{3}}\right)$
For 1 ≤ x ≤ 4, $F(x) = \int \dfrac{2}{15}x\,dx = \dfrac{1}{15}x^2 + c$
When x = 1, $F(x) = 0 \Rightarrow \dfrac{1}{15} + c = 0 \Rightarrow c = -\dfrac{1}{15}$
$F(x) = \dfrac{x^2}{15} - \dfrac{1}{15} = \dfrac{1}{15}(x^2 - 1)$
$G(y) = \begin{cases} 0 & y < 1 \\ \dfrac{1}{15}\left(y^{\frac{2}{3}} - 1\right) & 1 \le y \le 64 \\ 1 & y > 64 \end{cases}$

i Let m be the median value of Y.
G(m) = 0.5
$\dfrac{1}{15}\left(m^{\frac{2}{3}} - 1\right) = 0.5$
$m^{\frac{2}{3}} - 1 = 7.5$
$m^{\frac{2}{3}} = 8.5$
m = 24.8 (3 s.f.)

ii For 1 ≤ y ≤ 64, $g(y) = \dfrac{1}{15}\left(\dfrac{2}{3}y^{-\frac{1}{3}}\right) = \dfrac{2}{45}y^{-\frac{1}{3}}$
$E(Y) = \int_1^{64} y\,g(y)\,dy = \dfrac{2}{45}\int_1^{64} y^{\frac{2}{3}}\,dy = \dfrac{2}{45}\left[\dfrac{3}{5}y^{\frac{5}{3}}\right]_1^{64} = \dfrac{2}{75}\left[y^{\frac{5}{3}}\right]_1^{64} = \dfrac{2}{75}(1024 - 1)$
= 27.28 = 27.3 (3 s.f.)

16 H0: μO − μI = 0
H1: μO − μI ≠ 0
Outdoor times − Indoor times: 0.1, 2.1, −0.1, 0.2, 2.4, 0.5, 2.8, −2.6
$\sum x = 5.4$, $\sum x^2 = 25.08$, $\bar{x} = \dfrac{5.4}{8} = 0.675$
$s = \sqrt{\dfrac{1}{7}\left(25.08 - \dfrac{5.4^2}{8}\right)} = 1.750$
For the paired differences:
$t = \dfrac{0.675}{\dfrac{1.750}{\sqrt{8}}} = 1.091$
5% significance level and 2-tail test ⇒ p = 0.975
There are 7 degrees of freedom ⇒ $t_{7,\,0.975} = 2.365$ (from tables)
1.091 < 2.365 ⇒ not in the critical region ⇒ accept H0.
There is no significant evidence that there is a difference between the indoor and outdoor swimming times.
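The paired-sample statistic in question 16 depends only on the eight differences, so it is easy to verify. The sketch below uses Python's standard `statistics` module; the variable names are illustrative and the critical value 2.365 is the tabulated one quoted in the solution.

```python
import math
import statistics

# Outdoor time minus indoor time for each swimmer, as listed in question 16.
differences = [0.1, 2.1, -0.1, 0.2, 2.4, 0.5, 2.8, -2.6]

n = len(differences)
mean_d = statistics.mean(differences)
s_d = statistics.stdev(differences)  # uses the n - 1 divisor

t_statistic = mean_d / (s_d / math.sqrt(n))
t_critical = 2.365  # t with 7 degrees of freedom, p = 0.975, from tables

reject_h0 = abs(t_statistic) > t_critical
print(round(mean_d, 3), round(s_d, 3), round(t_statistic, 3), reject_h0)
```

Because the test is two-tailed, the comparison uses the absolute value of the statistic.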
17 $\bar{x} = \dfrac{222.8}{10} = 22.28$
$s = \sqrt{\dfrac{4.12}{9}} = 0.6766$
95% confidence interval ⇒ 2.5% in each tail ⇒ p = 0.975
There are 9 degrees of freedom ⇒ $t_{9,\,0.975} = 2.262$ (from tables)
Therefore, the confidence interval is:
$22.28 \pm 2.262\sqrt{\dfrac{0.6766^2}{10}} = 22.28 \pm 0.4840$
[21.8, 22.8] to 3 s.f.

18 $E_{2 \le x < 3} = 80\int_2^3 \dfrac{3}{x^2}\,dx = 80\left[-\dfrac{3}{x}\right]_2^3 = 80\left([-1] - \left[-\dfrac{3}{2}\right]\right) = 80 \times 0.5 = 40$
$E_{3 \le x < 4} = 80\int_3^4 \dfrac{3}{x^2}\,dx = 80\left[-\dfrac{3}{x}\right]_3^4 = 80\left(\left[-\dfrac{3}{4}\right] - [-1]\right) = 80 \times 0.25 = 20$
$E_{4 \le x < 5} = 80\int_4^5 \dfrac{3}{x^2}\,dx = 80\left[-\dfrac{3}{x}\right]_4^5 = 80\left(\left[-\dfrac{3}{5}\right] - \left[-\dfrac{3}{4}\right]\right) = 80 \times 0.15 = 12$
$E_{5 \le x < 6} = 80\int_5^6 \dfrac{3}{x^2}\,dx = 80\left[-\dfrac{3}{x}\right]_5^6 = 80\left(\left[-\dfrac{3}{6}\right] - \left[-\dfrac{3}{5}\right]\right) = 80 \times 0.1 = 8$
H0: $f(x) = \dfrac{3}{x^2}$ fits the data.
H1: $f(x) = \dfrac{3}{x^2}$ does not fit the data.
10% significance level ⇒ p = 0.9
There are 3 degrees of freedom ⇒ $\chi^2_{3,\,0.9} = 6.251$ (from tables)
$X^2 = \dfrac{(36 - 40)^2}{40} + \dfrac{(29 - 20)^2}{20} + \dfrac{(9 - 12)^2}{12} + \dfrac{(6 - 8)^2}{8} = 0.4 + 4.05 + 0.75 + 0.5 = 5.7$
5.7 < 6.251 ⇒ accept H0
$f(x) = \dfrac{3}{x^2}$ fits the data.

19 $\bar{x} = \dfrac{42.5}{8} = 5.3125$
$s = \sqrt{\dfrac{15.519}{7}} = 1.4890$
H0: μ = 4.5
H1: μ > 4.5
$T = \dfrac{5.3125 - 4.5}{\dfrac{1.4890}{\sqrt{8}}} = 1.5434$
5% significance level and 1-tail test ⇒ p = 0.95
There are 7 degrees of freedom ⇒ $t_{7,\,0.95} = 1.895$ (from tables)
1.5434 < 1.895 ⇒ not in the critical region ⇒ accept H0.
There is not significant evidence that μ is greater than 4.5.
95% confidence interval ⇒ 2.5% in each tail ⇒ p = 0.975
There are 7 degrees of freedom ⇒ $t_{7,\,0.975} = 2.365$ (from tables)
Therefore, the confidence interval is:
$5.3125 \pm 2.365\sqrt{\dfrac{1.489^2}{8}} = 5.3125 \pm 1.2450$
[4.07, 6.56] to 3 s.f.
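The expected frequencies in question 18 come from integrating the proposed density over each class, which is the same as differencing its CDF. The sketch below assumes the density 3/x² on 2 ≤ x ≤ 6 and the observed counts 36, 29, 9, 6 that appear in the X² calculation; the function `cdf` and the other names are hypothetical helpers, not the authors' notation.

```python
def cdf(x):
    """F(x) for f(x) = 3 / x**2 on 2 <= x <= 6, i.e. the integral of f from 2 to x."""
    return 1.5 - 3.0 / x


total = 80
class_edges = [2, 3, 4, 5, 6]
observed = [36, 29, 9, 6]

# Expected count in each class is total * (F(upper) - F(lower)).
expected = [
    total * (cdf(upper) - cdf(lower))
    for lower, upper in zip(class_edges, class_edges[1:])
]

chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print([round(e, 6) for e in expected], round(chi_squared, 6))
```

No parameters are estimated from the data here, so the degrees of freedom are the number of classes minus one, which gives the 3 used above.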
20 For 1  x  3, F ( x ) = ∫ 1 dx = x + c 2 2 When x = 1, 1 1 +c =0 ⇒ c =− 2 2 F(x ) = 0 ⇒ F ( x ) = x − 1 = 1 (x − 1) 2 2 2  0  1 F ( x ) =  ( x − 1) 2  1  i 1 x 3 x>3 G(y) = P(Y  y) Y = X3 1    1 G( y ) = P X 3 y = P  X y 3  = F y 3      ( ⇒  0    1  G ( y ) =  1  y 3 − 1 2     1   1  2 g (y ) =  6 y 3   0 y <1 1 y 27 y > 27 ii ⇒ −2 g (y ) = 1 y 3 = 1 2 6 6y 3 1 y 27 otherwise 27 27 1 1 1 E (Y ) = ∫ y g ( y ) d y =∫ 6 27 ) 1 1  For 1  y  27, G(y ) = 2  y 3 − 1   x <1 1 y 3 dy 27 4    4 =  3 × 1 y 3  = 3  y 3  = 3 ( 81 − 1) = 10 24   24  4 6 1 1 ( ) 27 27 1 1 1 E Y 2 = ∫ y 2 g ( y ) d y =∫ 6 4 y 3 dy 81 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Summary REVIEW 27 27 7    7 = 3 × 1 y 3 = 1  y 3 7 6 14  1  1 1 = ( 2187 − 1) = 156.14 14 Var(Y) = E(Y 2) – E 2(Y) = 156.14 – 102 = 56.1 (3 s.f.) 23.2 + 27.8 = 0.96298 50 60 21 s = x − y = 25.4 − 23.6 = 1.8 Z= 1.8 = 1.8692 0.96298 Using the normal tables in reverse: z = 1.8692 ⇒ P(Z  z) = 0.9692 Two-tail test at α % significance level ⇒ α % in each tail. 2 α = (1 − 0.9692) × 100 2 α = 3.08 2 α  6.16% () 22 Total number of goals scored = (0 × 12) + (1 × 16) + (2 × 31) + (3 × 25) + (4 × 13) + (5 × 3) = 220 Therefore, the average number of goals scored/match is 220 = 2.2 ⇒ λ = 2.2 100 H0: Total number of goals scored can be modelled by Po(2.2) H1: Total number of goals scored cannot be modelled by Po(2.2) The expected numbers of goals are:  2.2 0 × e −2.2  E 0 = 100 ×   = 11.080 0!   2.21 × e −2.2  E1 = 100 ×   = 24.377 1!   2.2 2 × e −2.2  E 2 = 100 ×   = 26.814 2!   2.2 3 × e −2.2  E 3 = 100 ×   = 19.664 3!   2.2 4 × e −2.2  E 4 = 100 ×   = 10.815 4!   2.2 5 × e −2.2  E 5 = 100 ×   = 4.7587 5! 
 E6+ = 100 – (E0 + E1 + E2 + E3 + E4 + E5) = 100 – 97.509 = 2.491 E5 < 5 and E6+ < 5 O5+ = 3 (from the table in the question) ⇒ combine E5+ = 4.7587 + 2.491 = 7.2497 X2 = (12 − 11.080 )2 + (16 − 24.377 )2 + ( 31 − 26.814 )2 + (25 − 19.664 )2 + (13 − 10.815)2 + ( 3 − 7.2497 )2 X2= 7.99 5% significance level There are 4 degrees of freedom 7.99 < 9.488 Total number of goals scored can be modelled by Po(2.2). 11.080 ⇒ 24.377 ⇒ 26.814 19.644 10.815 7.2497 p = 0.95 ⇒ χ 42, 0.95 not in the critical region = 9.488 ⇒ (from tables) accept H0 82 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 WORKED SOLUTIONS 23 H0: μ = 5.2 H1: μ > 5.2 ∑x = 61, ∑x 2 = 384, x = 61 = 6.1 10 1  384 − 612  = 1.1499 10  9  sx = 6.1 − 5.2 = 2.4751 1.1499 10 t= 5% significance level and 1-tail test There are 9 degrees of freedom 2.4751 > 1.833 ⇒ in the critical region ⇒ reject H0. There is significant evidence that the new type of tree produces a greater mass of fruit on average. H0: μy = μx H1: μy > μx ∑y = 70, ∑y 2 = 500.6, sy = ⇒ ⇒ p = 0.95 t9, 0.95 = 1.833 (from tables) y = 70 = 7 10 1  500.6 − 702  = 1.0853 10  9  Estimate of the common variance: 2 2 s = 1.1499 + 1.0853 = 0.25 10 T = 7.1 − 6 = 1.8 0.25 5% significance level and 1-tail test There are 18 degrees of freedom 1.8 > 1.734 ⇒ in the critical region ⇒ reject H0. There is significant evidence that the mean mass of fruit produced by gardener Q's trees is greater than the mean mass of fruit produced by gardener P's trees. 
⇒ p = 0.95 ⇒ t18, 0.95 = 1.734 (from tables) 24 H0: coffee preferences are independent of company H1: coffee preferences are not independent of company Observed Latte Ground Total 60 52 32 144 Company B 35 40 31 106 Total 95 92 63 250 Expected Cappuccino Company A Cappuccino Latte Ground Total Company A 54.72 52.992 36.288 144 Company B 40.28 39.008 26.712 106 Total 95 92 63 250 X2 = (60 − 54.72 )2 + (52 − 52.992 )2 + ( 32 − 36.288)2 + ( 35 − 40.28)2 + ( 40 − 39.008)2 + ( 31 − 26.712 )2 54.72 52.992 36.288 40.28 39.008 26.712 = 0.5095 + 0.0186 + 0.5067 + 0.6921 + 0.0252 + 0.6883 = 2.4404 5% significance level v=2×1=2 ⇒ ⇒ p = 0.95 χ 22, 0.95 = 5.991 (from tables) 83 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Summary REVIEW ⇒ ⇒ accept H0 2.4404 < 5.991 Preferences are independent of company. For the larger sample, the value of v is the same ⇒ not in the critical region ⇒ p = 0.99 ⇒ 1% significance level To be in the critical region, we require: 2.44N > 9.21 N > 3.774 N must be an integer ⇒ 6 ∫0 kx 25 i 2 χ 22, 0.99 v=2 = 9.21 (from tables) Nmin = 4 dx = 1 k  x 3 6 = 1 3  0 72k = 1 k= 1 72  1 2  x 0x 6 f ( x ) =  72  0 otherwise 3 3 3 4 5 5 E 2 x < 3 = 3∫ x 2 dx =  x 3  = ( 27 − 8 ) = 19 2 2 E 3x < 4 = 3∫ x 2 dx =  x 3  = ( 64 − 27 ) = 37 3 4 E 4 x < 5 = 3∫ x 2 dx =  x 3  = (125 − 64 ) = 61 4 4 ii ⇒ ⇒ ⇒ a = 19 b = 37 c = 61 H0: f(x) fits the data H1: f(x) does not fit the data X2 = ( 4 − 8)2 + (15 − 19)2 + ( 31 − 37 )2 + (59 − 61)2 + (107 − 91)2 8 19 37 61 91 = 2 + 0.842 10 + 0.97297 + 0.065573 + 2.8132 = 6.6938 10% significance level v=4 ⇒ χ 42, 0.9 = 7.779 6.6938 < 7.779 ⇒ ⇒ p = 0.9 (from tables) accept H0 f(x) fits the data 26 H0: area and preference are independent H1: area and preference are not independent Observed Area 1 Area 2 Area 3 Local bus service Total 73 36 30 139 Road surfaces 47 44 20 111 Total 120 80 50 250 Expected Area 1 Area 2 Area 3 Total Local 
bus service 66.72 44.48 27.8 139 Road surfaces 53.28 35.52 22.2 111 Total 120 80 50 250 84 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 WORKED SOLUTIONS X2 = (73 − 66.72 )2 + ( 36 − 44.48)2 + ( 30 − 27.8)2 + ( 47 − 53.28)2 + ( 44 − 35.52 )2 + (20 − 22.2 )2 66.72 44.48 27.8 53.28 35.52 = 5.3646 5% significance level ⇒ p = 0.95 v=2×1=2 5.3646 < 5.991 Area and preference are independent. There is no association between them. χ 22, ⇒ ⇒ 0.95 22.2 = 5.991 (from tables) not in the critical region ⇒ accept H0 27 H0: μ = 1.2 H1: μ > 1.2 Assume the masses are normally distributed. = 1.211 ∑x = 12.11, ∑x 2 = 14.6745, x = 12.11 10 s= T = 1 12.112  14.6745 − = 0.032128 9  10  1.211 − 1.2 = 1.0827 0.032128 10 10% significance level and 1-tail test ⇒ p = 0.9 There are 9 degrees of freedom ⇒ 1.0827 < 1.383 There is no significant evidence that the mean mass of the greengrocer’s cabbages is greater than 1.2 kg. t9, 0.9 = 1.383 ⇒ not in the critical region ⇒ (from tables) accept H0. 28 H0: μ = 7.5 H1: μ < 7.5 x = 70.4 = 7.04 10 s= T = 8.48 = 0.970 68 9 7.04 − 7.5 = −1.4986 0.970 68 10 The tables are based on the upper tail, so we need to use the positive value of t. 10% significance level and 1-tailed test ⇒ p = 0.9 There are 9 degrees of freedom 1.4986 > 1.383 ⇒ in the critical region ⇒ mean is less than 7.5. ⇒ t9, 0.9 = 1.383 (from tables) reject H0. There is significant evidence that the population 29 For A: s= x = 57.4 = 8.2 7 ∑x = 57.4, ∑x 2 = 481.1, 1  481.1 − 57.42  = 1.3178 7  6  95% confidence interval ⇒ 2.5% in each tail ⇒ ⇒ t6, 0.975 = 2.447 There are 6 degrees of freedom Therefore, the confidence interval is: p = 0.975 (from tables) 2 8.2 ± 2.447 1.3178 = 8.2 ± 1.2188 7 [6.98, 9.42] to 3 s.f. 
85 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Summary REVIEW Assume that for B, the population is also normally distributed and has the same variance as for A. H0: μA = μB H1: μA > μB For B: ∑x = 37, ∑x 2 = 278.74, s= x = 37 = 7.4 5 1  278.74 − 37 2  = 1.1113 5  4  For the combined sample: s= T = 6 × 1.31782 + 4 × 1.11132 = 1.536 = 1.2394 10 8.2 − 7.4 0.8 = = 1.1024 0.725 69 1 1 1.2394 × 7 + 5 5% significance level and one-tailed test There are 10 degrees of freedom 1.1024 < 1.812 μA is not greater than μB. ⇒ p = 0.95 ⇒ t10, 0.95 = 1.812 ⇒ not in the critical region ⇒ (from tables) accept H0. Extension Questions 1 i G X′ (t ) = λ e λ(t −1) G ″X (t ) = λ 2e λ(t −1) ⇒ G X′ (1) = λ ⇒ G X″ (1) = λ 2 E ( X ) = λ 2 2 Var ( X ) = λ + λ − λ = λ So, E ( X ) = Var ( X ) ii Poisson distribution 2 i For 0 x π , 2 ( ) x I = ∫ x cos x 2 dx Using the substitution u = x2 0 I = ∫ 1 cosu du 2  x I =  1 sin x 2  = 1 sin x 2 2 0 2 ( ) ( ) π , 1 sin x 2 = 1 2 2 2 When x = So the CDF is:  0  1  sin x 2  2 F(x ) =  1 1  4 + x 8π   1  ( ) ( ) x <0 π 2 0x π π < x3 2 2 x>3 π 2 86 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 WORKED SOLUTIONS ii ( )  π   P < X < π  = P X < π − P X <  6   π 6  () 1 1  1 π  = + π − sin 8π   2 6  4 () 1 1 1 = + − = 0.354 8  4 4 3 Let the difference be after minus before. H0: Median difference = 0 There is no change in the number of flowers. H1: Median difference > 0 There is an increase in the number of flowers. 
Calculating the ranks and signed ranks, we get: Plant A B C D E F G H I J K L Number of flowers before spraying 3 7 1 5 2 8 4 4 5 9 1 6 Number of flowers 1 week after spraying 5 8 5 5 2 2 7 4 0 20 9 15 Difference 2 1 4 0 0 −6 3 0 −5 11 8 9 Rank 2 1 4 −6 3 −5 9 7 8 Notice that three plants have a difference of zero, so we ignore them and reduce n by 3. P = 34 and Q = 11 ⇒ T = 11 This is a one-tail test at the 5% level with n = 9 T>8 There has been no significant change in the number of flowers. 4 i ⇒ P(X = 1) = (k – 5) × 1! = k – 5 P(X = 2) = (k – 5) × 2! = 2(k – 5) ∴Gx(t) = (k – 5) + (k – 5)t + 2(k – 5)t2 Gx(t) = (k – 5)(1 + t + 2t2) Gx(1) = 1 ⇒ 1 = (k – 5)(1 + 1 + 2) ⇒ ( ) 1 =k −5 4 ⇒ k = 21 4 G X (t ) = 14 1 + t + 2t 2 G X (t ) = 14 + 14 t + 12 t 2 T  8 to reject H0 P(X = 0) = (k – 5) × 0! = k – 5 ⇒ Accept H0 ii G'X (t ) = 14 + t µ = G'X (1) = 54 G″X(t) = 1 Var ( X ) = G '' X (1) + G ' X (1) − ( G ' X (1)) 2 ( ) = 1611 = 1+ 5 − 5 4 4 ( ) + 161 11 2 5 16 = 5 4 2 2 ⇒ 1 Var( X ) = 25 µ 2 + 16 87 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Summary REVIEW 5 i π I = ∫ ekxsin x dx 0 By parts: I = −ekxcosx + k ∫ekxcosx dx ( By parts again: I = −ekxcos x + k ekxsin x − k ∫ekxsin x dx Notice the integral is equal to I. I = – ekx cos x + kekx sin x – k2 I (1 + k2)I = kekx sin x – ekx cos x  kekxsin x − ekxcos x  I=  1 + k2  0 ) π  ekπ   −1  I= 2 − 2  k + 1   k + 1  I= ekπ + 1 k2 + 1 The integral must sum to 1 (total probability). Therefore: ekπ + 1 = k2 + 1 ekπ = k2 ii ekπ + 1 =1 k2 + 1 y y = ekπ 4 y = k2 2 0 –1 1 2 k The only solution is when k < 0. 
6 H0: hair colour and eye colour are independent
H1: hair colour and eye colour are not independent
The table of expected values is:

                 Eye colour
Hair colour   Blue    Green   Brown   Total
Blonde        4       7.25    13.75   25
Brown         3.84    6.96    13.2    24
Black         5.92    10.73   20.35   37
Red           2.24    4.06    7.7     14
Total         16      29      55      100

We need all expected values to be greater than 5 to apply the χ² test, so merge the blue eye and green eye columns to get the following table of observed and (expected) values.

                 Eye colour
Hair colour   Blue or green   Brown        Total
Blonde        21 (11.25)      4 (13.75)    25
Brown         10 (10.8)       14 (13.2)    24
Black         7 (16.65)       30 (20.35)   37
Red           7 (6.3)         7 (7.7)      14
Total         45              55           100

$\sum \dfrac{(O_k - E_k)^2}{E_k} = \dfrac{(21 - 11.25)^2}{11.25} + \dfrac{(10 - 10.8)^2}{10.8} + \dfrac{(7 - 16.65)^2}{16.65} + \dfrac{(7 - 6.3)^2}{6.3} + \dfrac{(4 - 13.75)^2}{13.75} + \dfrac{(14 - 13.2)^2}{13.2} + \dfrac{(30 - 20.35)^2}{20.35} + \dfrac{(7 - 7.7)^2}{7.7}$
$\sum \dfrac{(O_k - E_k)^2}{E_k} = 25.781\ldots$
There are 3 degrees of freedom. So the critical value of $\chi^2_3$ at the 0.1% level is 16.27.
Since 25.782 > 16.27, there is sufficient evidence to reject H0. Therefore, you can conclude that hair colour and eye colour are not independent.
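The merged contingency table in extension question 6 can be checked by rebuilding the expected values from the row and column totals. This is a sketch only: the nested list layout and variable names are assumptions made for illustration, with the observed counts taken from the merged table in the solution.

```python
# Observed counts after merging blue and green eyes: [blue or green, brown].
observed = [
    [21, 4],   # Blonde
    [10, 14],  # Brown
    [7, 30],   # Black
    [7, 7],    # Red
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Under independence, expected = row total * column total / grand total.
expected = [[r * c / grand_total for c in column_totals] for r in row_totals]

chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(column_totals, round(chi_squared, 3), degrees_of_freedom)
```

The column totals printed by the sketch confirm that the merged blue-or-green column sums to 45.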
7 i The sample space for the difference between the two dice is: Difference Dice 1 1 Dice 2 2 3 4 5 1 0 1 2 3 4 5 2 1 0 1 2 3 4 3 2 1 0 1 2 3 4 3 2 1 0 1 2 5 4 3 2 1 0 1 6 5 4 3 2 1 0 Therefore: x 0 1 2 3 4 5 P(X = x) 1 6 5 18 2 9 1 6 1 9 1 18 Therefore, the PGF is: 5 t + 2t 2 + 1t 3 + 1t 4 + 1 t 5 G X (t ) = 16 + 18 9 6 9 18 5 4 3 4 5 G'X (t ) = 18 + 9 t + 6 t 2 + 9 t 3 + 18 t 4 35 E( X ) = G X (1) = 18 20 4 2 G''X (t ) = 94 + t + 12 9 t + 18 t 20 35 G''X (1) = 94 + 1 + 12 9 + 18 = 9 35 35 Var( X ) = 35 9 + 18 − 18 ii 6 ( ) = 665 324 aE(X) = Var(X) 35 = 665 ⇒ a 18 324 2 a = 19 18 89 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 Summary REVIEW 8 Let the difference be after minus before. H0: Difference = 0 There is no change in the test scores. H1: Difference > 0 Test scores have increased. Calculating the ranks and signed ranks, we get: A B C D E F G H I J Test scores before tuition 20 15 17 18 8 15 19 24 6 23 Test scores after tuition 22 16 14 18 18 17 25 20 18 18 Difference 2 1 −3 0 10 2 6 −4 12 −5 2.5 1 −4 8 2.5 7 −5 9 −6 Rank Notice that 1 person has a difference of zero, so we ignore this and reduce n by 1. Notice also that two of the differences are equal, so the ranks (2 and 3) are averaged. P = 30 and Q = 15 This is a 1-tail test at the 2.5% level with n = 9 ⇒ T>5 There has been no significant increase in test scores. 
9 i ⇒ ii T  5 to reject H0 Accept H0 π When x = 2 ∴k = ⇒ T = 15 ⇒ k −π π2 π2 e =1 4 4e 2 π2 Let y = (kx2ex) sin x Using the product rule for the expression within the brackets and for the overall expression: dy = kx 2e x cos x + (kx 2e x + 2kx e x )sin x = kx 2e xcos x + kx 2e xsin x + 2kxe xsin x dx ( ) (kx 2e x + 2kx e x )sin x = kx 2e xcos x + kx 2e xsin x + 2kxe xsin x = kx e x(xcos x + xsin x + 2sin x) −π dy 4e 2 x = 2 xe (xcos x + xsin x + 2sin x) dx π  −π  4e 2 x e x(xcos x + xsin x + 2sin x) Therefore the pdf is f ( x ) =  π2  0  10 Since a, b, c forms a geometric progression: GX(t) = a + art + ar2t2 GX(1) = 1 ⇒ 1 = a + ar + ar2 GX′(t) = ar + 2ar2t ⇒ π 2 otherwise ⇒ 1 = a(1 + r + r2) G′X (1) = E ( X ) = ar + 2ar 2 = a(r + 2r 2) Simultaneous equations: 1 = a(1 + r + r2) 1 24 2 19 = a(r + 2r ) 2 2 ÷ 1 a(r + 2r 2) 24 19 = a(1 + r + r 2) 24(1 + r + r2) = 19(r + 2r2) 24 + 24r + 24r2 = 19r + 38r2 0 = 14r2 – 5r – 24 0 = (7r + 8)(2r – 3) r = − 78 or r = 23 0x 90 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886 WORKED SOLUTIONS 3 Since the geometric progression is increasing ⇒ r = 2 1 = 4 Therefore a = 1 + 23 + 94 19 4 + 6 t + 9 t2 So G X (t ) = 19 19 19 G X ′ (t ) = 6 + 18 t 19 19 G′′X(t ) = 18 19 24 24 Var( X ) = 18 19 + 19 − 19 ⇒ G′′X(1) = 18 19 ( ) = 222 361 2 11 Let the difference be new minus original. H0: Difference = 0 There is no change in the median number of customers per hour. H1: Difference ≠ 0 There is a change in the median number of customers per hour. 
Calculating the ranks and signed ranks, we get: A B C D E I J K L M N O 251 700 632 348 372 366 571 336 515 324 198 337 380 837 632 485 395 237 258 465 714 523 69 337 −129 −313 129 199 199 −129 Original location (median number 224 108 of customers per hour) 613 New location (median number of customers per hour) 361 202 484 Difference 137 94 −129 129 137 9 2 Rank −5 5 F 0 G H 137 23 9 1 9 −5 −13 Notice that zero ranks have been ignored and tied ranks have been averaged. P = 63 and Q = 28 ⇒ 5 11.5 11.5 −5 T = 28 This is a 2-tail test at the 2% level with n = 13 ⇒ T  12 to reject H0 T > 12 There has been no significant change in the median number of customers per hour. The market research was correct. 12 i ⇒ accept H0 GX(t) = q + pt iiGY(t) = [GX(t)]n = (q + pt)n This represents the binomial distribution. iii GY′(t) = np(q + pt)n – 1 ⇒ GY′(1) = np(q + p)n – 1 We know that q + p = 1 ⇒ GY′ (1) = E ( X ) = np 0 iv GY″(t) = (n – 1) np2(q + pt)n – 2 ⇒ GY″(1) = (n – 1) np2(q + p)n – 2 We know that q + p = 1 ⇒ GY″(1) = (n – 1) np2 Var(X) = (n – 1)np2 + np – (np)2 = n2p2 – np2 + np – n2p2 = np – np2 = np(1 – p) We know that q + p = 1 Var(X) = npq ⇒ q=1–p 91 ©HarperCollinsPublishers 2018 Cambridge International AS & A Level Mathematics: Further Probability & Statistics 9780008271886
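The result of extension question 12, that the PGF (q + pt)ⁿ has mean np and variance npq, can be illustrated numerically by expanding the polynomial and evaluating G′(1) and G″(1) from its coefficients. The values n = 5 and p = 0.3 below are arbitrary choices for the illustration, not taken from the question, and the function names are hypothetical.

```python
def pgf_coefficients(n, p):
    """Coefficients of (q + p*t)**n, lowest power first, where q = 1 - p."""
    q = 1 - p
    coefficients = [1.0]
    for _ in range(n):
        # Multiply the current polynomial by (q + p*t).
        product = [0.0] * (len(coefficients) + 1)
        for power, value in enumerate(coefficients):
            product[power] += q * value
            product[power + 1] += p * value
        coefficients = product
    return coefficients


def mean_and_variance(coefficients):
    """Use E(X) = G'(1) and Var(X) = G''(1) + G'(1) - G'(1)**2."""
    g1 = sum(k * c for k, c in enumerate(coefficients))
    g2 = sum(k * (k - 1) * c for k, c in enumerate(coefficients))
    return g1, g2 + g1 - g1**2


n, p = 5, 0.3  # illustrative values only
mean, variance = mean_and_variance(pgf_coefficients(n, p))
print(round(mean, 6), round(variance, 6), n * p, n * p * (1 - p))
```

Any other n and p with 0 < p < 1 should give the same agreement, up to floating-point rounding.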
