LeastSquaresNotes.doc

Bear with me for a bit while I review some algebra. To solve any system of two linear equations in two unknowns, we need to solve the general system

$$a_1 x + b_1 y = c_1 \qquad (1)$$
$$a_2 x + b_2 y = c_2 \qquad (2)$$

To solve for $x$, we can eliminate $y$ by multiplying each equation by the coefficient of $y$ in the other equation and then taking the difference:

$$\begin{array}{rcl}
b_2 a_1 x + b_2 b_1 y &=& b_2 c_1 \\
-\,b_1 a_2 x - b_1 b_2 y &=& -\,b_1 c_2 \\
\hline
b_2 a_1 x - b_1 a_2 x &=& b_2 c_1 - b_1 c_2
\end{array}$$

Then, as long as $a_1 b_2 - a_2 b_1 \neq 0$, $x$ can be found as:

$$x = \frac{c_1 b_2 - c_2 b_1}{a_1 b_2 - a_2 b_1}$$

Similarly, $y$ can be found as:

$$y = \frac{a_1 c_2 - a_2 c_1}{a_1 b_2 - a_2 b_1}$$

We might note a few things here. First, the denominator is the same in both cases. If we write the linear equations as a matrix:

$$\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix}$$

we note that the denominator is just the determinant of its first two columns, the coefficient matrix (recall that the determinant of a 2 x 2 matrix is just the difference of the cross multiplication of its terms):

$$D = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1 b_2 - a_2 b_1$$

With additional examination we see that the numerator of $x$ is just

$$D_x = \begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix} = c_1 b_2 - c_2 b_1$$

where this matrix is created by replacing the $x$ coefficients with the constants and taking the resulting determinant. The numerator of $y$ is built the same way, by replacing the $y$ coefficients with the constants:

$$D_y = \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix} = a_1 c_2 - a_2 c_1$$

Therefore the solutions can be written as $x = D_x / D$ and $y = D_y / D$.

This seems like a ridiculous amount of work, but the result generalizes to larger systems of equations and is extremely easy to implement computationally! Let's use the following system of equations:

$$\begin{aligned}
2x + y + z &= 3 \\
x - y - z &= 0 \\
x + 2y + z &= 0
\end{aligned}$$

We have the left-hand side of the system with the variables (the "coefficient matrix") and the right-hand side with the answer values.
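Before continuing with the three-variable system, here is a minimal Python sketch of the two-variable formulas derived above. The function name `cramer_2x2` and the sample system (2x + 3y = 8, x - y = -1) are made up for illustration and are not part of the original notes.

```python
def cramer_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 with Cramer's Rule.

    Hypothetical helper written for these notes.
    """
    D = a1 * b2 - a2 * b1          # determinant of the coefficient matrix
    if D == 0:
        raise ValueError("D = 0: the system has no unique solution")
    Dx = c1 * b2 - c2 * b1         # x-column replaced by the constants
    Dy = a1 * c2 - a2 * c1         # y-column replaced by the constants
    return Dx / D, Dy / D


# Illustrative system (an assumption, not from the notes):
#   2x + 3y =  8
#    x -  y = -1
x, y = cramer_2x2(2, 3, 8, 1, -1, -1)
print(x, y)  # prints 1.0 2.0
```

The check for D = 0 mirrors the condition stated with the formulas: when the determinant vanishes, the two equations are either inconsistent or describe the same line.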
Let $D$ be the determinant of the coefficient matrix of the above system, and let $D_x$ be the determinant formed by replacing the x-column values with the answer-column values (adapted from analyzemath.com).

System of equations, with every coefficient written out:

$$\begin{aligned}
2x + 1y + 1z &= 3 \\
1x - 1y - 1z &= 0 \\
1x + 2y + 1z &= 0
\end{aligned}$$

Coefficient matrix's determinant and answer column:

$$D = \begin{vmatrix} 2 & 1 & 1 \\ 1 & -1 & -1 \\ 1 & 2 & 1 \end{vmatrix}, \qquad \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix}$$

$D_x$, the coefficient determinant with the answer-column values in the x-column:

$$D_x = \begin{vmatrix} 3 & 1 & 1 \\ 0 & -1 & -1 \\ 0 & 2 & 1 \end{vmatrix}$$

Similarly, $D_y$ and $D_z$ would then be:

$$D_y = \begin{vmatrix} 2 & 3 & 1 \\ 1 & 0 & -1 \\ 1 & 0 & 1 \end{vmatrix}, \qquad D_z = \begin{vmatrix} 2 & 1 & 3 \\ 1 & -1 & 0 \\ 1 & 2 & 0 \end{vmatrix}$$

(Copyright © Elizabeth Stapel 1999-2009. All Rights Reserved.)

Evaluating each determinant, we get:

$$D = 3, \quad D_x = 3, \quad D_y = -6, \quad D_z = 9$$

Cramer's Rule says that $x = D_x / D$, $y = D_y / D$, and $z = D_z / D$. That is:

$$x = 3/3 = 1, \quad y = -6/3 = -2, \quad z = 9/3 = 3$$

That's all there is to Cramer's Rule. To find whichever variable you want, just evaluate the determinant quotient $D_i / D$.

LSPR (Least Squares Plate Reduction): basic idea

Given:
• RA and DEC for 6 reference stars
• x, y for the 6 reference stars, measured from the plate
• x, y for the variable star, measured from the plate

We want the RA and DEC of the variable star with MINIMAL error.

Define a plate origin as the average of the RAs and DECs of the six stars. Since the six stars should be grouped around the variable star, this should be reasonable. Call this origin (A, D), for RA average and DEC average.

Convert the RA and DEC $(\alpha, \delta)$ of each star to standard coordinates (Xi and Eta, $\xi$ and $\eta$) using spherical trig à la Dr. Ran's lecture:

$$\xi = \frac{\sin(\alpha - A)}{\sin D \tan\delta + \cos D \cos(\alpha - A)}$$

$$\eta = \frac{\tan\delta - \tan D \cos(\alpha - A)}{\tan D \tan\delta + \cos(\alpha - A)}$$

Point C refers to the center of the plate, which has RA and DEC (A, D). The star Q has coordinates $(\alpha, \delta)$ on the celestial sphere and, when projected onto the flat plate, has coordinates $(\xi, \eta)$.

Now you should have six pairs of standard coordinates, Xis and Etas, corresponding to the original RAs and DECs. The measured coordinates $(x, y)$ are displaced from the standard coordinates $(\xi, \eta)$ by an unknown translation and an unknown rotation.
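The conversion to standard coordinates can be sketched in Python as follows. This is only an illustration under stated assumptions: all angles are taken to be in radians, the function names `radec_to_standard` and `standard_to_radec` are invented here, and the inverse function is a rearrangement worked out for this sketch rather than a formula quoted from the notes. It is included so that the round trip can be checked, since RA and DEC have to be recovered from Xi and Eta later on.

```python
import math


def radec_to_standard(alpha, delta, A, D):
    """Project (alpha, delta) onto the plate centered at (A, D).

    Hypothetical helper; every angle is assumed to be in radians.
    Returns the standard coordinates (xi, eta).
    """
    d_alpha = alpha - A
    xi = math.sin(d_alpha) / (
        math.sin(D) * math.tan(delta) + math.cos(D) * math.cos(d_alpha)
    )
    eta = (math.tan(delta) - math.tan(D) * math.cos(d_alpha)) / (
        math.tan(D) * math.tan(delta) + math.cos(d_alpha)
    )
    return xi, eta


def standard_to_radec(xi, eta, A, D):
    """Invert the projection (assumed rearrangement, not from the notes)."""
    denom = math.cos(D) - eta * math.sin(D)
    alpha = A + math.atan2(xi, denom)
    delta = math.atan2(math.sin(D) + eta * math.cos(D), math.hypot(xi, denom))
    return alpha, delta


# Made-up plate center and star position, in radians.
A, D = math.radians(150.0), math.radians(20.0)
alpha, delta = math.radians(150.3), math.radians(20.1)

xi, eta = radec_to_standard(alpha, delta, A, D)
alpha_back, delta_back = standard_to_radec(xi, eta, A, D)
print(xi, eta)
print(math.degrees(alpha_back), math.degrees(delta_back))
```

A star sitting exactly at the plate center projects to (0, 0), which is a quick sanity check on any implementation. RA given in hours would need converting to radians (multiply by 15 to get degrees first) before calling these functions.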
We will assume that the relation between the measured coordinates and the standard coordinates is linear and thus of the form:

$$\xi - x = ax + by + c$$
$$\eta - y = dx + ey + f$$

The constants a, b, c, d, e and f are called the plate constants; they describe the relationship between standard coordinates and plate coordinates. Since we have $(\xi, \eta)$ for six stars, we can solve for those constants. (Actually we only need 3 stars for this, but if one or more of them is poorly measured or misidentified, the final result will be nonsense.) Instead we will use a least squares technique to find values for a, b, c, d, e and f. Think of finding a best-fit plane (defined by a, b, c) to a set of points in three-space given by $(\xi, x, y)$, and then doing the same for d, e and f with $(\eta, x, y)$.

Finding Plate Constants (a, b, c, d, e and f) for LSPR:

Consider the following equations:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1 &= 0 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2 &= 0 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3 &= 0 \\
a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + b_4 &= 0 \\
a_{51}x_1 + a_{52}x_2 + a_{53}x_3 + b_5 &= 0
\end{aligned}$$

Here we have five equations with only three unknowns, and in general there is no solution that will satisfy all five equations exactly. We refer to these equations as the equations of condition. The problem is to find the set of values of $x_1$, $x_2$, and $x_3$ that, while not satisfying any one of the equations exactly, will come closest to satisfying all of them with as small an error as possible.

In 1801 Gauss was faced with the problem of calculating the orbit of the newly discovered minor planet Ceres. The problem was to calculate the six elements of the planetary orbit, and he was faced with solving more than six equations for six unknowns. In the course of this, he invented the method of least squares.
It is hardly possible to describe the nature of the problem more clearly than Gauss did:

"...as all our observations, on account of the imperfection of the instruments and the senses, are only approximations to the truth, an orbit based only on the six absolutely necessary data may still be liable to considerable errors. In order to diminish these as much as possible, and thus to reach the greatest precision attainable, no other method will be given except to accumulate the greatest number of the most perfect observations, and to adjust the elements, not so as to satisfy this or that set of observations with absolute exactness, but so as to agree with all in the best possible manner."

If we can find some set of values of $x_1$, $x_2$, and $x_3$ that satisfy our five equations fairly closely, but without necessarily satisfying any one of them exactly, we shall find that, when these values are substituted into the left-hand sides of the equations, the right-hand sides will not be exactly zero. Each will instead be a small number known as the residual $R_i$. Thus:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1 &= R_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2 &= R_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3 &= R_3 \\
a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + b_4 &= R_4 \\
a_{51}x_1 + a_{52}x_2 + a_{53}x_3 + b_5 &= R_5
\end{aligned}$$

Gauss proposed that the "best" set of values is the one that, when substituted in the equations, gives rise to a set of residuals such that the sum of the squares of the residuals is least. Let $S$ be the sum of the squares of the residuals for a given set of values $x_1$, $x_2$ and $x_3$:

$$S = R_1^2 + R_2^2 + R_3^2 + R_4^2 + R_5^2$$

$$\begin{aligned}
R_1^2 &= (a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1)^2 \\
&= (a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1)(a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1) \\
&= a_{11}^2x_1^2 + a_{12}^2x_2^2 + a_{13}^2x_3^2 + b_1^2 + 2a_{11}a_{12}x_1x_2 + 2a_{11}a_{13}x_1x_3 + 2a_{12}a_{13}x_2x_3 \\
&\quad + 2a_{11}x_1b_1 + 2a_{12}x_2b_1 + 2a_{13}x_3b_1
\end{aligned}$$

(I know you will zone out right here, but bear with me for the sake of completeness!)
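$$\begin{aligned}
R_2^2 &= (a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2)^2 \\
&= (a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2)(a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2) \\
&= a_{21}^2x_1^2 + a_{22}^2x_2^2 + a_{23}^2x_3^2 + b_2^2 + 2a_{21}a_{22}x_1x_2 + 2a_{21}a_{23}x_1x_3 + 2a_{22}a_{23}x_2x_3 \\
&\quad + 2a_{21}x_1b_2 + 2a_{22}x_2b_2 + 2a_{23}x_3b_2
\end{aligned}$$

$$\begin{aligned}
R_3^2 &= (a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3)^2 \\
&= (a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3)(a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3) \\
&= a_{31}^2x_1^2 + a_{32}^2x_2^2 + a_{33}^2x_3^2 + b_3^2 + 2a_{31}a_{32}x_1x_2 + 2a_{31}a_{33}x_1x_3 + 2a_{32}a_{33}x_2x_3 \\
&\quad + 2a_{31}x_1b_3 + 2a_{32}x_2b_3 + 2a_{33}x_3b_3
\end{aligned}$$

$$\begin{aligned}
R_4^2 &= (a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + b_4)^2 \\
&= (a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + b_4)(a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + b_4) \\
&= a_{41}^2x_1^2 + a_{42}^2x_2^2 + a_{43}^2x_3^2 + b_4^2 + 2a_{41}a_{42}x_1x_2 + 2a_{41}a_{43}x_1x_3 + 2a_{42}a_{43}x_2x_3 \\
&\quad + 2a_{41}x_1b_4 + 2a_{42}x_2b_4 + 2a_{43}x_3b_4
\end{aligned}$$

$$\begin{aligned}
R_5^2 &= (a_{51}x_1 + a_{52}x_2 + a_{53}x_3 + b_5)^2 \\
&= (a_{51}x_1 + a_{52}x_2 + a_{53}x_3 + b_5)(a_{51}x_1 + a_{52}x_2 + a_{53}x_3 + b_5) \\
&= a_{51}^2x_1^2 + a_{52}^2x_2^2 + a_{53}^2x_3^2 + b_5^2 + 2a_{51}a_{52}x_1x_2 + 2a_{51}a_{53}x_1x_3 + 2a_{52}a_{53}x_2x_3 \\
&\quad + 2a_{51}x_1b_5 + 2a_{52}x_2b_5 + 2a_{53}x_3b_5
\end{aligned}$$

If any one of the x-values is changed, S will change, unless S is a minimum, in which case the derivative of S with respect to each variable is zero. The three equations:

$$\frac{\partial S}{\partial x_1} = 0, \qquad \frac{\partial S}{\partial x_2} = 0, \qquad \frac{\partial S}{\partial x_3} = 0$$

express the conditions that the sum of the squares of the residuals is least with respect to each of the variables.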
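The minimum condition above can be checked numerically. The sketch below uses a small set of made-up equations of condition (the arrays `a` and `b` are hypothetical, not data from the notes), lets NumPy's `numpy.linalg.lstsq` find the minimizing values, and then estimates each partial derivative of S by central differences to show that all three are essentially zero there.

```python
import numpy as np

# Hypothetical equations of condition: a[i] . x + b[i] = 0 for i = 1..5.
a = np.array([
    [1.0,  2.0,  1.0],
    [2.0, -1.0,  1.0],
    [1.0,  1.0, -1.0],
    [3.0,  1.0,  2.0],
    [1.0, -2.0,  3.0],
])
b = np.array([-4.1, -2.9, 0.1, -8.0, -5.2])


def sum_of_squares(x):
    """S = sum of R_i^2, where R_i = a_i1*x1 + a_i2*x2 + a_i3*x3 + b_i."""
    residuals = a @ x + b
    return float(residuals @ residuals)


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of (dS/dx1, dS/dx2, dS/dx3)."""
    grad = np.zeros_like(x)
    for k in range(len(x)):
        step = np.zeros_like(x)
        step[k] = h
        grad[k] = (f(x + step) - f(x - step)) / (2 * h)
    return grad


# lstsq minimizes |a x - rhs|^2, so pass rhs = -b to minimize |a x + b|^2.
x_best, *_ = np.linalg.lstsq(a, -b, rcond=None)

print("x at the minimum:", x_best)
print("gradient of S there:", numerical_gradient(sum_of_squares, x_best))
print("S at minimum:", sum_of_squares(x_best))
print("S after nudging x1:", sum_of_squares(x_best + np.array([0.01, 0.0, 0.0])))
```

Nudging any one of the three values away from the minimum makes S larger, which is the statement in the text seen from the other direction.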
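If the reader will write out the value of S in full in terms of the variables $x_1$, $x_2$ and $x_3$, they will find (all sums run over the equations, $i = 1$ to 5):

$$\begin{aligned}
S &= \sum a_{i1}^2 x_1^2 + \sum a_{i2}^2 x_2^2 + \sum a_{i3}^2 x_3^2 + \sum b_i^2 + 2\sum a_{i1}a_{i2}x_1x_2 + 2\sum a_{i1}a_{i3}x_1x_3 + 2\sum a_{i2}a_{i3}x_2x_3 \\
&\quad + 2\sum a_{i1}x_1b_i + 2\sum a_{i2}x_2b_i + 2\sum a_{i3}x_3b_i
\end{aligned}$$

Finding $\partial S / \partial x_1$ gives:

$$\frac{\partial S}{\partial x_1} = 2\sum a_{i1}^2 x_1 + 2\sum a_{i1}a_{i2}x_2 + 2\sum a_{i1}a_{i3}x_3 + 2\sum a_{i1}b_i = 0$$

Cancelling the factor of 2 leaves:

$$\sum a_{i1}^2 x_1 + \sum a_{i1}a_{i2}x_2 + \sum a_{i1}a_{i3}x_3 + \sum a_{i1}b_i = 0$$

There will be similar results for

$$\frac{\partial S}{\partial x_2} = 0, \qquad \frac{\partial S}{\partial x_3} = 0$$

We make the following abbreviations:

$$\begin{array}{llll}
A_{11} = \sum a_{i1}^2 & A_{12} = \sum a_{i1}a_{i2} & A_{13} = \sum a_{i1}a_{i3} & B_1 = \sum a_{i1}b_i \\
 & A_{22} = \sum a_{i2}^2 & A_{23} = \sum a_{i2}a_{i3} & B_2 = \sum a_{i2}b_i \\
 & & A_{33} = \sum a_{i3}^2 & B_3 = \sum a_{i3}b_i
\end{array}$$

with $A_{21} = A_{12}$, $A_{31} = A_{13}$ and $A_{32} = A_{23}$, since the order of the factors in each product does not matter.

We can write the equations as follows:

$$\sum a_{i1}^2 x_1 + \sum a_{i1}a_{i2}x_2 + \sum a_{i1}a_{i3}x_3 + \sum a_{i1}b_i = 0$$

becomes

$$A_{11}x_1 + A_{12}x_2 + A_{13}x_3 + B_1 = 0$$

and the rest of the derivatives can similarly be written as:

$$\begin{aligned}
A_{21}x_1 + A_{22}x_2 + A_{23}x_3 + B_2 &= 0 \\
A_{31}x_1 + A_{32}x_2 + A_{33}x_3 + B_3 &= 0
\end{aligned}$$

or:

$$\begin{aligned}
A_{11}x_1 + A_{12}x_2 + A_{13}x_3 &= -B_1 \\
A_{21}x_1 + A_{22}x_2 + A_{23}x_3 &= -B_2 \\
A_{31}x_1 + A_{32}x_2 + A_{33}x_3 &= -B_3
\end{aligned}$$

We now have three equations with three unknowns that can be solved directly for the unknowns using Cramer's¹ Rule! Recall that for Cramer's Rule we rewrite things in matrix form:

$$\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} =
\begin{pmatrix} -B_1 \\ -B_2 \\ -B_3 \end{pmatrix}$$

Then

$$x_i = \frac{\det(A_i)}{\det(A)}$$

where $A_i$ is the matrix $A$ with its $i$-th column replaced by the right-hand-side column.

Now in the particular case of LSPR, the equations are of the form:

$$\xi - x = ax + by + c$$
$$\eta - y = dx + ey + f$$

where a, b and c represent $x_1$, $x_2$ and $x_3$ in our original formulation for the first equation, and d, e and f play the same role for the second.

Your programming task for the remainder of the period is to write a program that can find the solution via least squares to the following equations.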
$$\begin{aligned}
7x_1 - 6x_2 + 8x_3 - 15 &= 0 \\
3x_1 + 5x_2 - 2x_3 - 27 &= 0 \\
2x_1 - 2x_2 + 7x_3 - 20 &= 0 \\
4x_1 + 2x_2 - 5x_3 - 2 &= 0 \\
9x_1 - 8x_2 + 7x_3 - 5 &= 0
\end{aligned}$$

First solve it by hand. Add the rows: sum the coefficients and the constant of each equation to get $s_1$, $s_2$, $s_3$, $s_4$, $s_5$.

$$\begin{array}{rcl|r}
 & & & s_i \\
\hline
7x_1 - 6x_2 + 8x_3 - 15 &=& 0 & -6 \\
3x_1 + 5x_2 - 2x_3 - 27 &=& 0 & -21 \\
2x_1 - 2x_2 + 7x_3 - 20 &=& 0 & -13 \\
4x_1 + 2x_2 - 5x_3 - 2 &=& 0 & -1 \\
9x_1 - 8x_2 + 7x_3 - 5 &=& 0 & 3
\end{array}$$

Then form the check sums:

$$\sum a_{i1}s_i = -108, \qquad \sum a_{i2}s_i = -69, \qquad \sum a_{i3}s_i = -71$$

Set up the normal equations, using

$$\begin{array}{llll}
A_{11} = \sum a_{i1}^2 & A_{12} = \sum a_{i1}a_{i2} & A_{13} = \sum a_{i1}a_{i3} & B_1 = \sum a_{i1}b_i \\
 & A_{22} = \sum a_{i2}^2 & A_{23} = \sum a_{i2}a_{i3} & B_2 = \sum a_{i2}b_i \\
 & & A_{33} = \sum a_{i3}^2 & B_3 = \sum a_{i3}b_i
\end{array}$$

which gives:

$$\begin{aligned}
159x_1 - 95x_2 + 107x_3 - 279 &= 0 \\
-95x_1 + 133x_2 - 138x_3 + 31 &= 0 \\
107x_1 - 138x_2 + 191x_3 - 231 &= 0
\end{aligned}$$

The check sums catch arithmetic slips: the coefficients and constant of each normal equation must add up to the corresponding check sum (159 - 95 + 107 - 279 = -108, -95 + 133 - 138 + 31 = -69, and 107 - 138 + 191 - 231 = -71).

Use Cramer's Rule to solve:

$$x_1 = 2.474, \qquad x_2 = 5.397, \qquad x_3 = 3.723$$

¹ Cramer was Swiss, so the name is pronounced CRAW-mer.

Your program should be able to take input and return the correct output. If you have time, have your program calculate the residuals for each of the original equations and the standard deviation of the residuals.

residuals [-0.28110831 -0.04074867 0.21272028 0.07598631 0.15117982]
Standard Deviation 0.0527843650619

Once we have the plate constants, we can find the Xi and Eta of the unknown object using the two plate equations:

$$\xi - x = ax + by + c$$
$$\eta - y = dx + ey + f$$

where a, b, c, d, e and f are now known, and x and y are the measured coordinates of your unknown object.

Once the standard coordinates Xi and Eta of the unknown object are known, we can invert the relationship given earlier,

$$\xi = \frac{\sin(\alpha - A)}{\sin D \tan\delta + \cos D \cos(\alpha - A)}$$

$$\eta = \frac{\tan\delta - \tan D \cos(\alpha - A)}{\tan D \tan\delta + \cos(\alpha - A)}$$

solving for $\alpha$ and $\delta$ to obtain RA and DEC!

Next we need to measure the quality of the linear regression. Calculate the residuals for each of the stars by using their measured values of (x, y) and your calculated values of a, b, c, d, e and f to determine the $\xi$ and $\eta$ for each star. Then calculate the RA and DEC for each star. The residual is the difference between the calculated RA and DEC and the RA and DEC you input at the beginning of the program.

Last step!
Calculate the standard deviation of the residuals of the stars' RA and DEC values. This is a measure of the mean error in your variable star position.
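As a rough sketch of the programming task set earlier in these notes, the Python below builds the normal equations for the five-equation example, solves them with Cramer's Rule, and prints the residuals. The function names `normal_equations` and `cramer_solve` are invented for this illustration. The notes do not say which definition of standard deviation they intend, so the line computing `sigma` assumes one common convention (sum of squared residuals divided by the number of equations minus the number of unknowns, then the square root); the number it prints is not expected to reproduce the sample figure quoted in the notes.

```python
import numpy as np


def normal_equations(a, b):
    """Return (A, B) with A[j][k] = sum_i a_ij*a_ik and B[j] = sum_i a_ij*b_i."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return a.T @ a, a.T @ b


def cramer_solve(A, rhs):
    """Solve A x = rhs by Cramer's Rule: x_i = det(A_i) / det(A)."""
    A = np.asarray(A, dtype=float)
    rhs = np.asarray(rhs, dtype=float)
    det_A = np.linalg.det(A)
    if abs(det_A) < 1e-12:
        raise ValueError("det(A) is zero: no unique solution")
    x = np.empty(len(rhs))
    for i in range(len(rhs)):
        A_i = A.copy()
        A_i[:, i] = rhs            # replace column i with the right-hand side
        x[i] = np.linalg.det(A_i) / det_A
    return x


# Equations of condition from the worked example: a[i] . x + b[i] = 0.
a = np.array([
    [7, -6,  8],
    [3,  5, -2],
    [2, -2,  7],
    [4,  2, -5],
    [9, -8,  7],
], dtype=float)
b = np.array([-15, -27, -20, -2, -5], dtype=float)

A, B = normal_equations(a, b)
x = cramer_solve(A, -B)            # normal equations read A x = -B
residuals = a @ x + b

# Assumed convention, not stated in the notes: divide by (equations - unknowns).
sigma = np.sqrt(np.sum(residuals**2) / (len(b) - len(x)))

print("normal matrix:\n", A)
print("solution:", np.round(x, 3))   # approximately [2.474 5.397 3.723]
print("residuals:", residuals)
print("sigma (assumed convention):", sigma)
```

For LSPR the same two functions would be called twice. For the Xi fit, each reference star contributes a coefficient row (x, y, 1) and a constant term x - ξ, and the solution is (a, b, c); for the Eta fit the rows are the same, the constant term is y - η, and the solution is (d, e, f).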
