www.capankajgoel.com
CORRELATION AND REGRESSION ANALYSIS

Ex. 1. The sum of the products of the deviations of X and Y = 3044. The number of pairs of observations is 10. The sum of the deviations of X = –170. The sum of the deviations of Y = –20. The sum of the squares of the deviations of X = 8288. The sum of the squares of the deviations of Y = 2264. Find the coefficient of correlation when the arbitrary means of X and Y are 82 and 68 respectively.
(Ans.) +0.78

Ex. 2. The value of Spearman's rank correlation coefficient for a certain number of observations was found to be 2/3. The sum of the squares of the differences between the corresponding ranks was 55. Find the number of pairs.
(Ans.) 10

Ex. 3. The coefficient of rank correlation of the marks obtained by 10 students in Mathematics and Statistics was found to be 0.5. It was then detected that the difference in ranks in the two subjects for one particular student was wrongly taken to be 3 in place of 7. What should be the correct rank correlation coefficient?
(Ans.) 0.2576

Ex. 4. (a) The lines of regression of Y on X and X on Y are respectively Y = X + 5 and 16X – 9Y = 94. Find the variance of X if the variance of Y is 16. Also find the covariance of X and Y.
(b) If r = 0.4, covariance = 10 and Var X = 9, find the second moment of Y.

Ex. 5. Consider the two regression lines: 3x + 2y = 26 and 6x + y = 31.
(1) Find the mean values and the correlation coefficient between x and y.
(2) Find the ratio of the variances of the variables.

Ex. 6. Point out the inconsistency in the statements:
(i) The regression of y on x is 2y + 3x = 4 and the correlation coefficient between x and y is 0.8.
(ii) The correlation coefficient of 3x and –2y is the same as the correlation coefficient of x and y.

Ex. 7. For a set of 10 pairs of values of x and y, the regression line of x on y is x – 2y + 12 = 0, the mean and standard deviation of y being 8 and 2 respectively. Later it is known that a pair (x = 3, y = 8) was wrongly recorded and the correct pair detected is (x = 8, y = 3).
Find the correct regression line of x on y. What will r be if the wrong pair is weeded out?
(Ans.) x = y – 3

Ex. 8. From the given information, find the coefficient of correlation by a suitable method.
(a) Given: N = 5, mean x = 20, mean y = 10, ∑(X–20)² = 100, ∑(Y–10)² = 60, ∑(X–20)(Y–10) = 40
(b) Given: N = 5, mean x = 20, mean y = 10, ∑(X–15) = 3, ∑(Y–8) = 2, ∑(X–15)² = 62, ∑(Y–8)² = 50, ∑(X–15)(Y–8) = 35
(c) Given: N = 5, ∑(X–8) = 4, ∑(Y–7) = 5, ∑(X–8)² = 100, ∑(Y–7)² = 80, ∑(X–8)(Y–7) = 41
(d) Given: N = 5, mean x = 3, mean y = 2, ∑(X–4)² = 40, ∑(Y–3)² = 60, ∑(X–4)(Y–3) = 32
(Ans.) (a) 0.516 (b) 0.62 (c) 0.434 (d) 0.615

Ex. 9. You are given the following information relating to a frequency distribution of 10 observations: mean x = 5.5, ∑X² = 385, mean y = 4, ∑Y² = 192, ∑(X+Y)² = 947. Find rxy.
(Ans.) –0.68

Ex. 10. The rank coefficient of correlation of 5 observations is given as 0.5. By mistake the ranks of two values are taken as:
X  Y
2  1
4  3
instead of:
X  Y
1  3
3  5
Find the correct rank coefficient of correlation.
(Ans.) 0.2

Ex. 11. In a partially destroyed record the following data are available:
Variance of X = 25
Regression equation of X on Y: 5X – Y = 22
Regression equation of Y on X: 64X – 45Y = 24
Find the:
(a) Mean values of X and Y.
(b) Standard deviation of Y.
(c) Coefficient of correlation between X and Y.

Ex. 12. (a) Given that Y = a + (1/m)X, find the value of 'm' when r = –0.5 and σ²X = (1/4) σ²Y.
(b) Find the regression coefficients and the coefficient of correlation, given that: N = 5, mean x = 10, mean y = 20, ∑(X–4)² = 100, ∑(Y–8)² = 120, ∑(X–4)(Y–8) = 160
(c) Let X = 4Y + 5 and Y = kX + 4 be the regression lines of X on Y and Y on X respectively. If k is positive, prove that it cannot exceed 1/4. If k = 1/16, find r and the means of X and Y.
(Ans.) m = 1, 0.86

Ex. 13.
With the help of the given information, find the regression coefficients and the coefficient of correlation:
(a) Given: N = 5, mean x = 20, mean y = 10, ∑(X–20)² = 100, ∑(Y–10)² = 60, ∑(X–20)(Y–10) = 40
(b) Given: N = 5, mean x = 20, mean y = 10, ∑(X–15) = 3, ∑(Y–8) = 2, ∑(X–15)² = 62, ∑(Y–8)² = 50, ∑(X–15)(Y–8) = 35
(c) Given: N = 5, mean x = 3, mean y = 2, ∑(X–4)² = 40, ∑(Y–3)² = 60, ∑(X–4)(Y–3) = 32
(d) Given: N = 5, ∑X = 10, ∑Y = 15, ∑X² = 100, ∑Y² = 120, ∑XY = 80
(Ans.) (a) 0.529 (b) 0.56, 0.62 (c) 0.615 (d) 0.229

Ex. 14. Find the regression coefficient of y on x from the following regression equations:
5x = 22 + y, 64x = 24 + 45y
Is it possible to calculate the standard deviation of y from the given information? Answer with reason.
(Ans.) 64/225

Ex. 15. The lines of regression of y on x and x on y are respectively y = x + 5 and 16x – 9y = 94. Find the variance of x if the variance of y is 16. Also find the covariance of x and y.
(Ans.) ¾, Cov(x, y) = 9

Ex. 16. The two regression lines are given by 3x + 2y = 6 and 7x + 5y = 12.
(i) Estimate y when x = 10.
(ii) Calculate the value of the correlation coefficient between x and y.
(iii) What percentage of the total variation remains unexplained by the regression equation of y on x?

Ex. 17. Calculate the values of y = (x – 6)⁵ corresponding to x = 1, 2, 3, 4 and 5, and obtain the coefficient of correlation between x and y. Why does the correlation differ from unity?
(Ans.) +0.8679

Ex. 18. A student calculates the value of r as 0.7 when the number of items (n) is 25. Find the limits within which r lies for another sample from the same universe, and calculate the probable error (PE). Does r imply that 70% of the data is explained?
(Ans.) 0.767 and 0.633

Ex. 19. A firm, not sure of the response to its product in ten different colour shades, decides to produce them in those colour shades if the rankings of these colour shades by two typical consumer judges are highly correlated. The two judges rank the 10 colours in the following order:
Colour No.:            1   2   3   4   5   6   7   8   9  10
Ranking by Judge I:    6   4   3   1   2   7   9   8  10   5
Ranking by Judge II:   4   1   6   7   5   8  10   9   3   2
Is there any agreement between the two judges, to allow the introduction of the product by the firm in the market?
(Ans.) Since the rank correlation is low, there is not much agreement between the two judges, and the firm may not produce the product to introduce it in the market.

Ex. 20. Find the correlation coefficient between age and playing habits of the following students:
Age:               15   16   17   18   19   20
No. of students:  250  200  150  120  100   80
Regular players:  200  150   90   48   30   12
[Hint: Take age as X and the percentage of regular players as Y.]

Ex. 21. Ten students obtained the following marks in Statistics (X) and Accountancy (Y):
Student:   A   B   C   D   E   F   G   H   I   J
X:        92  89  89  87  83  71  77  63  53  50
Y:        86  83  77  91  68  52  86  86  57  57
Find the rank correlation coefficient, the coefficient of correlation by the concurrent deviation method, and the covariance.

Ex. 22. (i) ∑X = 30, ∑Y = 5, ∑X² = 670, ∑Y² = 285, ∑XY = 334, N = 12. It was found that the pair X = 11, Y = 4 was wrongly copied; the correct values are X = 10, Y = 14. Find r if
(a) the wrong observation is excluded;
(b) the wrong observation is excluded and the correct one is included.
(c) Find the correct combined mean and SD.
(d) Find the correct regression equation.

Ex. 23. Find the regression equation and find r.
Marks in Eco:    0-10  10-20  20-30  30-40  40-50
Marks in Maths:
0-10, 10-20:     6  3  33  16  10
20-30:           10  15  7
30-40, 40-50:    7  10  4  4  5
(ii) If R = 0.8 and the sum of the squares of the differences in ranks is 33, find N.

Ex. 24.
Monthly Salary (Rs.):  600-700  700-800  800-900  900-1000
Age (Years)
20-30:            16  6
30-40:            4  10  4
40-50:            4  18
50-60:            10
900-1000 column:  4  12  12
Predict the value of salary if age is 10 years, and also calculate the coefficient of correlation between age and salary.
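The answer to Ex. 19 states that the rank correlation is low without giving the figure. A minimal sketch of the check, assuming the untied-rank Spearman formula R = 1 – 6∑d²/(n(n² – 1)); the function and variable names are illustrative and not part of the original exercise:

```python
# Ranks given in Ex. 19 for colour shades 1 to 10.
judge_1 = [6, 4, 3, 1, 2, 7, 9, 8, 10, 5]
judge_2 = [4, 1, 6, 7, 5, 8, 10, 9, 3, 2]


def spearman_rank_correlation(ranks_a, ranks_b):
    """Spearman's R = 1 - 6*sum(d^2) / (n*(n^2 - 1)); assumes no tied ranks."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


r_judges = spearman_rank_correlation(judge_1, judge_2)
print(round(r_judges, 4))
```

Here ∑d² = 128, so R = 1 – 768/990, roughly 0.22, which agrees with the stated conclusion that the two judges show little agreement.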
Ex. 25. Comment on the nature of correlation and state true/false:
Production of pig iron and soot content at the Durgapur plant. (Positive)
Unemployment index and the purchasing power of the common man. (Negative)
The line of regression does not exist if r = 0. (F)
The line of regression is based on the principle of least squares. (T)
The regression lines are perpendicular if x and y are independent. (F)
The regression lines are perpendicular if r = ±1. (F)

Ex. 26. If n is 9 and the explained and unexplained variations are 24 and 36 respectively, find the coefficient of determination and Syx.
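For Ex. 26, the arithmetic can be sketched as follows. The coefficient of determination is the explained variation divided by the total variation. The exercise does not say which divisor to use for the standard error of estimate Syx, so both common conventions (dividing the unexplained variation by n, or by n – 2) are shown as assumptions; the variable names are illustrative.

```python
import math

n = 9
explained = 24.0
unexplained = 36.0

# Total variation = explained + unexplained.
total = explained + unexplained

# Coefficient of determination r^2 = explained / total.
r_squared = explained / total

# Standard error of estimate under two conventions (assumption: the
# exercise does not state which divisor is intended).
syx_divisor_n = math.sqrt(unexplained / n)
syx_divisor_n_minus_2 = math.sqrt(unexplained / (n - 2))

print(r_squared, syx_divisor_n, round(syx_divisor_n_minus_2, 4))
```

With these figures r² = 24/60 = 0.4, and Syx is 2 with divisor n or about 2.27 with divisor n – 2.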