GG 313 Geological Data Analysis
Lecture 13: Solution of Simultaneous Equations
October 4, 2005

Homework discussion.

People are having problems with the null hypothesis and with what the result of a hypothesis test means. Rejecting the null hypothesis is the positive result of our tests. We need to understand what this means, and the easiest way is by example.

Consider the Mann-Whitney test and example 2-7 (page 45). Comparing grain sizes from two different locations on the moon, we want to see if the mean grain sizes differ in the two samples. Of course they won’t be exactly the same, but is the difference statistically significant? Our hypothesis is that the mean grain size is different, implying the two samples come from different populations. Our null hypothesis is thus that the mean grain sizes are the same, tested at some statistical confidence level (95%).

In this test we combine the two samples and rank each element of the combined sample. If the two samples were identical, then they would have the same W1 and W2, which would be given by eqn. 2.33, and U1 and U2 would equal zero. U=0 is certainly less than the critical value of U obtained from the table, and as U increases the two means are getting farther apart. We cannot disprove our null hypothesis if U is smaller than the critical value. In the notes for this example, U=24 and the critical value is 20. So we can conclude, with 95% certainty, that the two means cannot be the same.

IN THE NOTES, Paul says “This (U=24) is larger than the critical value of 20, suggesting we cannot reject the null hypothesis.” He’s wrong. Don’t believe everything you read.

Another example: Homework 5, problem 1. It states: “… do these data support the claim that on average higher concentrations were obtained before cleaning versus after cleaning?” What is the hypothesis? What is the null hypothesis? These data are in PAIRS. We are trying to figure out if the “before” number is statistically larger than the “after” number. Using the “sign” test, what do you do first?
- Subtract the second number of each pair from the first. If the first is larger, the result will be “+”; if the first is smaller, the result will be “-”.

How many “+” are there? If there were as many + as -, what would that say about the null hypothesis? If the number of + is much larger than the number of -, what would that say? You can calculate the probability of having n + using the binomial coefficients, but that’s a lot of work. Since, for this problem, np>5 and n(1-p)>5, you can use z-statistics (eqn. 2.32). If your z-value from the data is >2, what does that mean?

These tests are not difficult, but you do need to think the logic through. You cannot expect to blindly use the notes and formulas and come up with the correct answer. Since there’s a 50% probability that you’ll get the answer right, the method is everything.

Linear algebra provides us with an easy method for solving systems of simultaneous equations. Consider the following set of four equations with four unknowns $x_1$, $x_2$, $x_3$, and $x_4$:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + a_{24}x_4 &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + a_{34}x_4 &= b_3 \\
a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + a_{44}x_4 &= b_4
\end{aligned}
\tag{3.77}
$$

In matrix form, the above is just:

$$
A x = b \tag{3.78}
$$

where

$$
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix},
\quad
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix},
\quad \text{and} \quad
b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}
\tag{3.79-3.81}
$$

The solution to the equations is obtained by multiplying both sides of eqn. 3.78 by $A^{-1}$, to obtain:

$$
A^{-1} A x = A^{-1} b \tag{3.82}
$$

$$
I x = x = A^{-1} b \tag{3.83}
$$

We have thus solved for x and obtained the solution.

EXAMPLE: Consider a simple example. We have three planes that are defined below. Three planes cross at a single point whenever A has an inverse, that is, when its determinant is non-zero. At what point do they cross each other?

$$
\begin{aligned}
x - y - 2z &= 2 \\
x + y + 2z &= 10 \\
-2x - 2y - z &= 3
\end{aligned}
$$

Setting up the matrix:

$$
A = \begin{bmatrix} 1 & -1 & -2 \\ 1 & 1 & 2 \\ -2 & -2 & -1 \end{bmatrix},
\quad
x = \begin{bmatrix} x \\ y \\ z \end{bmatrix},
\quad
b = \begin{bmatrix} 2 \\ 10 \\ 3 \end{bmatrix}
$$

In Matlab (b is entered as B):

    >> A=[1 -1 -2; 1 1 2; -2 -2 -1];
    >> B=[2; 10; 3];
    >> Ainv=inv(A)
    Ainv =
        0.5000    0.5000         0
       -0.5000   -0.8333   -0.6667
             0    0.6667    0.3333

Multiplying by B,

    >> X=Ainv*B
    X =
        6.0000     =x
      -11.3333     =y
        7.6667     =z

Thus the planes cross at the point (x, y, z) = (6.0000, -11.3333, 7.6667).
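The three-plane example can be collected into a short script, given here as a minimal sketch rather than the method of the notes. The variable names X_inv, X_bs, and the residual variables are chosen for illustration. The backslash operator is an alternative to inv that is not used in the notes; it solves A*X = B without forming the inverse explicitly.

```matlab
% Coefficient matrix and right-hand side for the three planes
A = [ 1 -1 -2;
      1  1  2;
     -2 -2 -1];
B = [2; 10; 3];

% Method used in the notes: multiply by the explicit inverse
X_inv = inv(A) * B;

% Alternative (not in the notes): backslash solves A*X = B directly
X_bs = A \ B;

% Check: substituting each solution back should reproduce B,
% so both residuals should be zero to within round-off
residual_inv = A * X_inv - B;
residual_bs  = A * X_bs  - B;

% Show the two solutions side by side
disp([X_inv X_bs]);
```

Substituting the solution back into the original equations is a cheap check that the matrix was typed in correctly, which is where most mistakes in this kind of problem happen.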
I think this is an easy way to solve such sets of equations, particularly as the number of equations and unknowns increases. Other methods may be computationally more efficient, but the above method is easy to set up and solve.

_______________

Try an example where you know the answer: Try the x-z plane at y=2 (0x+1y+0z=2), the x-y plane at z=0 (0x+0y+1z=0), and the y-z plane at x=-2 (1x+0y+0z=-2). Where do these 3 planes cross? Solve graphically and using the matrix solution.

Solutions of this sort are relatively simple. We have no options: there is only one correct answer and no freedom to choose between possible answers. A more interesting case is where we have more data than we need to fit a model. For example, as we’ve said before, two points define a line. But what if we want to define a line with three points?

[Figure: points A, B, and C]

A and B define a line, but added data (point C) sheds doubt on our original interpretation. We would like to define a new line that is somehow the “best” line given the data we have. In addition, we would like to know just how likely it is that the new line, which is an estimate based on a sample of all points in the population, reflects the population.

We must be careful! We are assuming that our MODEL (a straight line) reflects the shape of reality. A “best fit” does not validate the model.

A good bet for finding the “best” fit to a model curve is to minimize the sum of the squares of the errors of each point with respect to the model curve. Thus, we want to find the curve that minimizes these errors:

[Figure: errors in y between each data point and the line]

This figure works well, but what if the line is nearly vertical? The figure above shows the errors in the y-value (regression of y on x). Similarly, we could find the errors in x:

[Figure: errors in x between each data point and the line]

Do these two methods yield the same answers? Consider the case below:

[Figure: case in which the errors in y are far smaller than the errors in x]

In this case the errors in y are far smaller than the errors in x, and utilizing errors in y will likely yield a better result. If the curve had a steep slope, the opposite would be the case.
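To see whether the two choices of error give the same line, here is a minimal Matlab sketch that fits both. The data are made-up values for illustration only, and the built-in polyfit function stands in for the least-squares formulas.

```matlab
% Hypothetical data (illustration only): roughly linear with some scatter
x = [1 2 3 4 5];
y = [1.2 1.9 3.2 3.8 5.1];

% Regression of y on x: minimizes the squared errors measured in y
p_yx = polyfit(x, y, 1);        % y = p_yx(1)*x + p_yx(2)
slope_yx = p_yx(1);

% Regression of x on y: minimizes the squared errors measured in x
p_xy = polyfit(y, x, 1);        % x = p_xy(1)*y + p_xy(2)
slope_xy = 1 / p_xy(1);         % re-expressed as dy/dx for comparison

fprintf('slope using errors in y: %.4f\n', slope_yx);
fprintf('slope using errors in x: %.4f\n', slope_xy);
```

The two slopes coincide only when the points fall exactly on a line. Otherwise the slope from the regression of y on x is the smaller of the two in magnitude, so the answer to the question above is no.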
In any case, we can use a method that does not vary with the slope of the curve by measuring perpendiculars to the curve at each point:

[Figure: perpendicular distances from each data point to the line]

This method, called orthogonal regression, is most useful when the slope of the line is unknown and it can be in any direction.

We wish to find a line of the form:

$$
y(x) = a_1 + a_2 (x - x_0) \tag{3.89}
$$

Why does Paul use $x_0$? Note that $y(x) = a_1 + a_2 x - a_2 x_0$, so $y(x) = (a_1 - a_2 x_0) + a_2 x = a_3 + a_2 x$. So why bother with $x_0$??? Let’s ignore it…

We have two unknowns, $a_1$ and $a_2$, and we want to find values of these unknowns that minimize the sum of the squares of the errors:

$$
\text{minimize:} \quad \sum_{i=1}^{n} \left( y_{\text{observed}} - y_{\text{theoretical}} \right)^2 \tag{3.90}
$$

We have one equation for each data point:

$$
\begin{aligned}
a_1 + a_2 x_1 &= y_1 \\
a_1 + a_2 x_2 &= y_2 \\
&\;\;\vdots \\
a_1 + a_2 x_n &= y_n
\end{aligned}
\tag{3.91}
$$

We only need two equations to solve for $a_1$ and $a_2$, but we have n equations. This situation is called over-determined. The only way these equations can all be satisfied exactly is for n to equal 2 or for all the points to lie exactly on a straight line. We can write the equations in matrix notation, A x = b (shown here for n = 4):

$$
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}
\tag{3.92}
$$

Why can’t we just invert as we did before, and solve for $x = A^{-1} b$? Unfortunately, this isn’t possible since A isn’t square, and it thus has no inverse.

But all is not lost… Consider the equations:

$$
\begin{aligned}
a_1 + a_2 x_1 &= y_1 + e_1 \\
a_1 + a_2 x_2 &= y_2 + e_2 \\
&\;\;\vdots \\
a_1 + a_2 x_n &= y_n + e_n
\end{aligned}
\tag{3.93}
$$

$e_i$ is the error, the difference between the theoretical value and the observed $y_i$, and we want to obtain the values of $a_1$ and $a_2$ that minimize the sum of the squares of the $e_i$:

$$
E(a_1, a_2) = \sum_{i=1}^{n} e_i^2 = e^T e \tag{3.94}
$$

Recall the definition of variance. Minimization of E will minimize the variance of the errors. Recall that if E is a minimum then the slope of E must be equal to zero.
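E is a function of two variables, and the slope must be zero for each, which says:

$$
\frac{\partial E(a_1, a_2)}{\partial a_1} = \frac{\partial E(a_1, a_2)}{\partial a_2} = 0 \tag{3.95}
$$

Evaluating:

$$
\frac{\partial E}{\partial a_1}
= \frac{\partial}{\partial a_1} \sum_{i=1}^{n} e_i^2
= \frac{\partial}{\partial a_1} \sum_{i=1}^{n} \left( a_1 + a_2 x_i - y_i \right)^2
= 2 \sum_{i=1}^{n} \left( a_1 + a_2 x_i - y_i \right) = 0
\tag{3.96}
$$

and

$$
\frac{\partial E}{\partial a_2}
= \frac{\partial}{\partial a_2} \sum_{i=1}^{n} e_i^2
= \frac{\partial}{\partial a_2} \sum_{i=1}^{n} \left( a_1 + a_2 x_i - y_i \right)^2
= 2 \sum_{i=1}^{n} \left( a_1 + a_2 x_i - y_i \right) x_i = 0
\tag{3.97}
$$

With the results from eqns. 3.96 and 3.97 we have two equations and two unknowns ($a_1$ and $a_2$), and we can solve. Re-arranging:

$$
n a_1 + a_2 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \tag{3.98}
$$

$$
a_1 \sum_{i=1}^{n} x_i + a_2 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i x_i \tag{3.99}
$$

Note that the unknowns are $a_1$ and $a_2$, and that the x and y values are known, so the summations in the equations above are all of known constants. We form the sums as follows (notation is poor here; S stands for sum, not covariance):

$$
S_x = \sum_{i=1}^{n} x_i, \quad
S_y = \sum_{i=1}^{n} y_i, \quad
S_{xy} = \sum_{i=1}^{n} y_i x_i, \quad
S_{xx} = \sum_{i=1}^{n} x_i^2
\tag{3.100}
$$

Substituting:

$$
n a_1 + a_2 S_x = S_y \tag{3.101}
$$

$$
a_1 S_x + a_2 S_{xx} = S_{xy} \tag{3.102}
$$

Solving for the y-intercept, $a_1$, in the first equation,

$$
a_1 = \frac{1}{n} S_y - \frac{a_2}{n} S_x \tag{3.103}
$$

Substituting this expression for $a_1$ into the second equation (3.102) and solving for the slope yields:

$$
a_2 = \frac{S_{xy} - \frac{1}{n} S_x S_y}{S_{xx} - \frac{1}{n} S_x^2} \tag{3.107}
$$

In matrix notation:

$$
\begin{bmatrix} n & S_x \\ S_x & S_{xx} \end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} S_y \\ S_{xy} \end{bmatrix}
\tag{3.109}
$$

This matrix equation is of the form N x = B, which can be solved for x by $x = N^{-1} B$.

In-class problem: Use the following data and calculate the least-squares fit to a line using eqn. 3.109.

Data points:

     x      y
    33    -33
    10     -5
     4     -2
    50    -44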
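The sums in eqn. 3.100 and the matrix form in eqn. 3.109 can be sketched in Matlab as follows. This is a minimal sketch under stated assumptions: the data are made-up values for illustration (not the in-class data), the variable names are chosen here, and the backslash operator is used in place of an explicit inverse (inv(N)*B gives the same result).

```matlab
% Hypothetical data (illustration only; not the in-class data set)
x = [0; 1; 2; 3; 4];
y = [1.1; 2.9; 5.2; 6.8; 9.1];
n = length(x);

% Sums from eqn. 3.100
Sx  = sum(x);
Sy  = sum(y);
Sxy = sum(x .* y);
Sxx = sum(x .^ 2);

% Normal equations, eqn. 3.109: N * a = B
N = [n   Sx;
     Sx  Sxx];
B = [Sy; Sxy];
a = N \ B;          % a(1) = intercept a1, a(2) = slope a2

% Same result from the closed-form expressions (eqns. 3.107 and 3.103)
a2 = (Sxy - Sx*Sy/n) / (Sxx - Sx^2/n);
a1 = Sy/n - a2*Sx/n;

% Cross-check against the over-determined system of eqn. 3.92:
% for a non-square A, backslash returns the least-squares solution
A = [ones(n,1) x];
a_check = A \ y;

fprintf('intercept a1 = %.4f, slope a2 = %.4f\n', a(1), a(2));
```

All three routes should agree to within round-off, which is a useful check when working the in-class problem by hand.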