Bear with me for a bit while I review some algebra:
To solve any system of two linear equations in two unknowns, we need to solve the general system of the form
a1 x + b1 y = c1    (1)
a2 x + b2 y = c2    (2)
To solve for x, we could eliminate y by multiplying each equation by the coefficient of y in the other
equation and then taking the difference.
b2 a1 x + b2 b1 y = b2 c1
- b1 a2 x - b1 b2 y = - b1 c2
____________________________
b2 a1 x - b1 a2 x = b2 c1 - b1 c2
Then, as long as a1 b2 - a2 b1 != 0, x can be found as:
x = ( c1 b2 - c2 b1 ) / ( a1 b2 - a2 b1 )
Similarly, y can be found as:
y = ( a1 c2 - a2 c1 ) / ( a1 b2 - a2 b1 )
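As a quick check, the two closed-form expressions above can be typed straight into a program. This is only a minimal Python sketch: the function name `solve_2x2` and the sample system are made up for illustration.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 with the elimination formulas."""
    denom = a1 * b2 - a2 * b1
    if denom == 0:
        # The formulas only hold when a1*b2 - a2*b1 != 0.
        raise ValueError("a1*b2 - a2*b1 is zero: no unique solution")
    x = (c1 * b2 - c2 * b1) / denom
    y = (a1 * c2 - a2 * c1) / denom
    return x, y


# Illustrative system (not from the handout): 2x + y = 5, x - y = 1
print(solve_2x2(2, 1, 5, 1, -1, 1))
```

Substituting the printed x and y back into both equations is an easy way to confirm the formulas.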
We might note a few things here. First, the denominator is the same in both cases. If we write the linear
equations as a matrix:
a1 b1 c1
a2 b2 c2
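We note that the denominator is just the determinant of the coefficient part of the matrix, that is, its first
two columns (recall that the determinant of a 2x2 matrix is just the difference of the cross multiplication of
its terms):
$$D = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1 b_2 - a_2 b_1$$
With additional examination we see that the numerator of x is just
$$D_x = \begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix} = c_1 b_2 - c_2 b_1$$
where we note that this matrix is created by replacing the x coefficients with the constants and taking the
resulting determinant.
The numerator of y is created the same way, by replacing the y coefficients with the constants:
$$D_y = \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix} = a_1 c_2 - a_2 c_1$$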
Therefore the solutions could be written as
x = Dx / D and y = Dy / D.
This seems like a ridiculous amount of work, but the result generalizes to larger systems of equations
and is extremely easy to implement computationally!
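To see how little code the determinant form needs, here is a small Python sketch of the 2x2 case. The helper names `det2` and `cramer_2x2` are hypothetical, and the sample system is an illustration rather than part of the handout.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as [[p, q], [r, s]]."""
    (p, q), (r, s) = m
    return p * s - q * r


def cramer_2x2(a1, b1, c1, a2, b2, c2):
    """Return (x, y) as (Dx / D, Dy / D)."""
    D = det2([[a1, b1], [a2, b2]])
    if D == 0:
        raise ValueError("D is zero: Cramer's Rule does not apply")
    Dx = det2([[c1, b1], [c2, b2]])  # x column replaced by the constants
    Dy = det2([[a1, c1], [a2, c2]])  # y column replaced by the constants
    return Dx / D, Dy / D


# Illustrative system: 2x + y = 5, x - y = 1
print(cramer_2x2(2, 1, 5, 1, -1, 1))
```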
Let's use the following system of equations
2x + y + z = 3
x – y – z = 0
x + 2y + z = 0
We have the left-hand side of the system with the variables (the "coefficient matrix") and the right-hand side
with the answer values. Let D be the determinant of the coefficient matrix of the above system, and let Dx be the
determinant formed by replacing the x-column values with the answer-column values. (Adapted from analyzemath.com)
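System of equations:
2x + 1y + 1z = 3
1x – 1y – 1z = 0
1x + 2y + 1z = 0
Coefficient matrix's determinant:
$$D = \begin{vmatrix} 2 & 1 & 1 \\ 1 & -1 & -1 \\ 1 & 2 & 1 \end{vmatrix}$$
Answer column: (3, 0, 0)
Dx, the coefficient determinant with the answer-column values in the x-column:
$$D_x = \begin{vmatrix} 3 & 1 & 1 \\ 0 & -1 & -1 \\ 0 & 2 & 1 \end{vmatrix}$$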
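Similarly, Dy and Dz would then be:
$$D_y = \begin{vmatrix} 2 & 3 & 1 \\ 1 & 0 & -1 \\ 1 & 0 & 1 \end{vmatrix}, \qquad D_z = \begin{vmatrix} 2 & 1 & 3 \\ 1 & -1 & 0 \\ 1 & 2 & 0 \end{vmatrix}$$
(Copyright © Elizabeth Stapel 1999-2009 All Rights Reserved)
Evaluating each determinant, we get:
D = 3, Dx = 3, Dy = –6, Dz = 9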
Cramer's Rule says that x = Dx / D, y = Dy / D, and z = Dz / D. That is:
x = 3/3 = 1, y = –6/3 = –2, and z = 9/3 = 3
That's all there is to Cramer's Rule. To find whichever variable you want, just evaluate the determinant
quotient Di ÷ D.
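The same recipe can be sketched in Python for a system of any size. The function names `det` and `cramer` are hypothetical, and the cofactor expansion used for the determinant is just one simple choice that is fine for small matrices; the system solved is the three-equation example above.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: drop row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def cramer(coeffs, consts):
    """Solve coeffs * x = consts, returning [D1 / D, D2 / D, ...]."""
    D = det(coeffs)
    if D == 0:
        raise ValueError("D is zero: Cramer's Rule does not apply")
    solution = []
    for i in range(len(coeffs)):
        # Replace column i with the answer column.
        Di = [row[:i] + [c] + row[i + 1:] for row, c in zip(coeffs, consts)]
        solution.append(det(Di) / D)
    return solution


coeffs = [[2, 1, 1], [1, -1, -1], [1, 2, 1]]
consts = [3, 0, 0]
print(cramer(coeffs, consts))  # x, y, z
```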
LSPR (Least Squares Plate Reduction) - basic idea
Given:
• RA and DEC for 6 reference stars
• x, y for 6 stars measured from the plate
• x, y for the variable star measured from the plate
We want the RA and DEC of the variable star with MINIMAL error.
Define a plate origin as the average of the RAs and DECs of the six stars. Since the six stars should be
grouped around the variable star, this should be reasonable. Call this (A, D) (for RA average and DEC
average).
Convert the RAs and DECs (α, δ) for each star to standard coordinates (Xi and Eta) using spherical
trig à la Dr. Ran's lecture:
$$\xi = \frac{\sin(\alpha - A)}{\sin D \tan\delta + \cos D \cos(\alpha - A)}$$
$$\eta = \frac{\tan\delta - \tan D \cos(\alpha - A)}{\tan D \tan\delta + \cos(\alpha - A)}$$
Point C refers to the center of the plate, which has RA and DEC (A, D).
The star Q has coordinates (α, δ) on the celestial sphere and, when
projected onto the flat plate, has coordinates (ξ, η).
Now you should have six pairs of standard coordinates, Xis and Etas, corresponding to the original RAs
and DECs.
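The conversion to standard coordinates might be sketched in Python as follows. Several things here are assumptions: all angles are taken to be in radians, the function names `plate_center` and `standard_coords` and the sample star positions are hypothetical, and the formulas are written with sines and cosines (the tangent forms multiplied through by cos δ and cos D), which gives the same ξ and η. The plain average used for the plate origin also assumes the stars do not straddle RA = 0.

```python
import math


def plate_center(ras, decs):
    """Plate origin (A, D): the plain average of the reference RAs and DECs."""
    return sum(ras) / len(ras), sum(decs) / len(decs)


def standard_coords(ra, dec, ra0, dec0):
    """Project (alpha, delta) onto the plane tangent at (A, D); radians throughout."""
    dra = ra - ra0
    # Shared denominator: cosine of the angular distance from the plate center.
    denom = (math.sin(dec0) * math.sin(dec)
             + math.cos(dec0) * math.cos(dec) * math.cos(dra))
    xi = math.cos(dec) * math.sin(dra) / denom
    eta = (math.cos(dec0) * math.sin(dec)
           - math.sin(dec0) * math.cos(dec) * math.cos(dra)) / denom
    return xi, eta


# Hypothetical reference-star positions in degrees, converted to radians.
ras = [math.radians(v) for v in (150.0, 150.4, 149.7)]
decs = [math.radians(v) for v in (20.1, 19.8, 20.3)]
A, D = plate_center(ras, decs)
for ra, dec in zip(ras, decs):
    print(standard_coords(ra, dec, A, D))
```

A star exactly at the plate center should come out at (0, 0), which makes a handy sanity check.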
The measured coordinates (x, y) are displaced from the standard coordinates (ξ, η) by an unknown
translation and an unknown rotation.
We will assume that the relation between the measured coordinates and the standard coordinates is linear
and thus of the form:
$$\xi - x = a x + b y + c$$
$$\eta - y = d x + e y + f$$
These constants are the plate constants. Since we have (ξ, η) and the measured (x, y) for six stars, we can
solve for those constants. (Actually we only need 3 stars for this, but if one or more of them is poorly
measured or misidentified, the final result will be nonsense.) Instead we will use a least squares technique to
find values for a, b, c, d, e and f. Think of finding a best-fit plane (defined by a, b, c) to a set of
three-space coordinates defined by ξ, x, y and then doing the same for d, e and f with η, x, y. The values
(a, b, c, d, e and f) are called the plate constants and describe the relationship between standard coordinates
and plate coordinates.
Finding Plate Constants (a,b,c,d,e and f) for LSPR:
Consider the following equations:
a11x1 + a12x2 + a13x3 + b1 = 0
a21x1 + a22x2 + a23x3 + b2 = 0
a31x1 + a32x2 + a33x3 + b3 = 0
a41x1 + a42x2 + a43x3 + b4 = 0
a51x1 + a52x2 + a53x3 + b5 = 0
Here we have five equations with only three unknowns, and there is no solution that will satisfy all five
equations exactly. We refer to these equations as the equations of condition. The problem is to find the
set of values of x1, x2, and x3 that, while not satisfying any one of the equations exactly, will come
closest to satisfying all of them with as small an error as possible.
In 1801 Gauss was faced with the problem of calculating the orbit of the newly discovered minor planet
Ceres. The problem was to calculate the six elements of the planetary orbit, and he was faced with
solving more than six equations for six unknowns. In the course of this, he invented the method of least
squares. It is hardly possible to describe the nature of the problem more clearly than Gauss did:
"...as all our observations, on account of the imperfection of the instruments and the
senses, are only approximations to the truth, an orbit based only on the six absolutely
necessary data may still be liable to considerable errors. In order to diminish these as
much as possible, and thus to reach the greatest precision attainable, no other method will
be given except to accumulate the greatest number of the most perfect observations, and to
adjust the elements, not so as to satisfy this or that set of observations with absolute
exactness, but so as to agree with all in the best possible manner."
If we can find some set of values of x1, x2, and x3 that satisfy our five equations fairly closely, but
without necessarily satisfying any one of them exactly, we shall find that, when these values are
substituted into the left-hand sides of the equations, the result will not be exactly zero, but
will be a small number known as the residual R.
Thus:
a11x1 + a12x2 + a13x3 + b1 = R1
a21x1 + a22x2 + a23x3 + b2 = R2
a31x1 + a32x2 + a33x3 + b3 = R3
a41x1 + a42x2 + a43x3 + b4 = R4
a51x1 + a52x2 + a53x3 + b5 = R5
Gauss proposed that the "best" set of values is the one that, when substituted into the equations, gives rise
to a set of residuals whose sum of squares is least. Let S be the sum of the squares of
the residuals for a given set of values x1, x2 and x3.
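$$S = R_1^2 + R_2^2 + R_3^2 + R_4^2 + R_5^2$$
$$\begin{aligned}
R_1^2 &= (a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1)^2 \\
&= (a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1)(a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1) \\
&= a_{11}^2x_1^2 + a_{12}^2x_2^2 + a_{13}^2x_3^2 + b_1^2 + 2a_{11}a_{12}x_1x_2 + 2a_{11}a_{13}x_1x_3 + 2a_{12}a_{13}x_2x_3 \\
&\quad + 2a_{11}x_1b_1 + 2a_{12}x_2b_1 + 2a_{13}x_3b_1
\end{aligned}$$
(I know you will zone out right here, but bear with me for the sake of completeness!)
$$\begin{aligned}
R_2^2 &= (a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2)^2 \\
&= (a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2)(a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2) \\
&= a_{21}^2x_1^2 + a_{22}^2x_2^2 + a_{23}^2x_3^2 + b_2^2 + 2a_{21}a_{22}x_1x_2 + 2a_{21}a_{23}x_1x_3 + 2a_{22}a_{23}x_2x_3 \\
&\quad + 2a_{21}x_1b_2 + 2a_{22}x_2b_2 + 2a_{23}x_3b_2
\end{aligned}$$
$$\begin{aligned}
R_3^2 &= (a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3)^2 \\
&= (a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3)(a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3) \\
&= a_{31}^2x_1^2 + a_{32}^2x_2^2 + a_{33}^2x_3^2 + b_3^2 + 2a_{31}a_{32}x_1x_2 + 2a_{31}a_{33}x_1x_3 + 2a_{32}a_{33}x_2x_3 \\
&\quad + 2a_{31}x_1b_3 + 2a_{32}x_2b_3 + 2a_{33}x_3b_3
\end{aligned}$$
$$\begin{aligned}
R_4^2 &= (a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + b_4)^2 \\
&= (a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + b_4)(a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + b_4) \\
&= a_{41}^2x_1^2 + a_{42}^2x_2^2 + a_{43}^2x_3^2 + b_4^2 + 2a_{41}a_{42}x_1x_2 + 2a_{41}a_{43}x_1x_3 + 2a_{42}a_{43}x_2x_3 \\
&\quad + 2a_{41}x_1b_4 + 2a_{42}x_2b_4 + 2a_{43}x_3b_4
\end{aligned}$$
$$\begin{aligned}
R_5^2 &= (a_{51}x_1 + a_{52}x_2 + a_{53}x_3 + b_5)^2 \\
&= (a_{51}x_1 + a_{52}x_2 + a_{53}x_3 + b_5)(a_{51}x_1 + a_{52}x_2 + a_{53}x_3 + b_5) \\
&= a_{51}^2x_1^2 + a_{52}^2x_2^2 + a_{53}^2x_3^2 + b_5^2 + 2a_{51}a_{52}x_1x_2 + 2a_{51}a_{53}x_1x_3 + 2a_{52}a_{53}x_2x_3 \\
&\quad + 2a_{51}x_1b_5 + 2a_{52}x_2b_5 + 2a_{53}x_3b_5
\end{aligned}$$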
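Nobody needs to expand these squares by hand to evaluate S for a trial solution. A minimal Python sketch, assuming each equation of condition is stored as a tuple (a_i1, a_i2, a_i3, b_i); the names `residuals` and `sum_sq` and the tiny sample system are hypothetical:

```python
def residuals(rows, x):
    """R_i = a_i1*x1 + a_i2*x2 + a_i3*x3 + b_i for each equation of condition."""
    return [sum(a * v for a, v in zip(row[:-1], x)) + row[-1] for row in rows]


def sum_sq(rows, x):
    """S: the sum of the squares of the residuals."""
    return sum(r * r for r in residuals(rows, x))


# Illustrative system: the first three equations are satisfied exactly by
# x = (1, 2, 3); the fourth is not, so it leaves a residual of -1.
rows = [(1, 0, 0, -1), (0, 1, 0, -2), (0, 0, 1, -3), (1, 1, 1, -7)]
print(residuals(rows, (1, 2, 3)))
print(sum_sq(rows, (1, 2, 3)))
```

Trying a few nearby values of x and watching S change is a cheap way to build intuition for the minimum condition.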
If any one of the x-values is changed, S will change, unless S is a minimum, in
which case the derivative of S with respect to each variable is zero.
The three equations
$$\frac{dS}{dx_1} = 0, \quad \frac{dS}{dx_2} = 0, \quad \frac{dS}{dx_3} = 0$$
express the conditions that the sum of the squares of the residuals is least with
respect to each of the variables. If the reader will write out the value of S in full in
terms of the variables x1, x2 and x3, they will find
$$\begin{aligned}
S &= \sum a_{i1}^2x_1^2 + \sum a_{i2}^2x_2^2 + \sum a_{i3}^2x_3^2 + \sum b_i^2 + 2\sum a_{i1}a_{i2}x_1x_2 + 2\sum a_{i1}a_{i3}x_1x_3 + 2\sum a_{i2}a_{i3}x_2x_3 \\
&\quad + 2\sum a_{i1}x_1b_i + 2\sum a_{i2}x_2b_i + 2\sum a_{i3}x_3b_i
\end{aligned}$$
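Finding dS/dx1 gives:
$$\frac{dS}{dx_1} = 2\sum a_{i1}^2x_1 + 2\sum a_{i1}a_{i2}x_2 + 2\sum a_{i1}a_{i3}x_3 + 2\sum a_{i1}b_i = 0$$
Cancelling the factor of 2 leaves:
$$\sum a_{i1}^2x_1 + \sum a_{i1}a_{i2}x_2 + \sum a_{i1}a_{i3}x_3 + \sum a_{i1}b_i = 0$$
There will be similar results for
$$\frac{dS}{dx_2} = 0, \quad \frac{dS}{dx_3} = 0.$$
We make the following abbreviations:
$$A_{11} = \sum a_{i1}^2, \quad A_{12} = \sum a_{i1}a_{i2}, \quad A_{13} = \sum a_{i1}a_{i3}, \quad B_1 = \sum a_{i1}b_i$$
$$A_{22} = \sum a_{i2}^2, \quad A_{23} = \sum a_{i2}a_{i3}, \quad B_2 = \sum a_{i2}b_i$$
$$A_{33} = \sum a_{i3}^2, \quad B_3 = \sum a_{i3}b_i$$
Because the order of multiplication does not matter, A21 = A12, A31 = A13 and A32 = A23.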
We can write the equations as follows:
$$\sum a_{i1}^2x_1 + \sum a_{i1}a_{i2}x_2 + \sum a_{i1}a_{i3}x_3 + \sum a_{i1}b_i = 0$$
becomes
A11x1 + A12x2 + A13x3 + B1 = 0
and the rest of the derivatives can similarly be written as:
A21x1 + A22x2 + A23x3 + B2 = 0
A31x1 + A32x2 + A33x3 + B3 = 0
or:
A11x1 + A12x2 + A13x3 = -B1
A21x1 + A22x2 + A23x3 = -B2
A31x1 + A32x2 + A33x3 = -B3
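Forming the A and B sums is mechanical, so it is a natural thing to hand to a loop. A minimal Python sketch, assuming each equation of condition is stored as a tuple of its coefficients followed by its constant b_i; the function name `normal_equations` and the small two-unknown sample are hypothetical:

```python
def normal_equations(rows):
    """From rows (a_i1, ..., a_in, b_i) build A[j][k] = sum a_ij*a_ik and B[j] = sum a_ij*b_i."""
    n = len(rows[0]) - 1  # number of unknowns
    A = [[sum(r[j] * r[k] for r in rows) for k in range(n)] for j in range(n)]
    B = [sum(r[j] * r[n] for r in rows) for j in range(n)]
    return A, B


# Illustrative equations of condition in two unknowns:
#   x1 - 1 = 0,  x2 - 2 = 0,  x1 + x2 - 4 = 0
rows = [(1, 0, -1), (0, 1, -2), (1, 1, -4)]
A, B = normal_equations(rows)
print(A)  # symmetric: A[j][k] == A[k][j]
print(B)  # the normal equations are A x = -B
```

Note that the right-hand side handed to the solver is -B, not B, exactly as in the rearranged equations above.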
We now have three equations with three unknowns that can be solved directly for
the unknowns using Cramer's Rule [1]!
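Recall for Cramer's Rule we rewrite things in matrix form:
$$\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -B_1 \\ -B_2 \\ -B_3 \end{pmatrix}$$
Then
$$x_i = \frac{\det(A_i)}{\det(A)}$$
where A_i is the matrix A with its i-th column replaced by the right-hand-side column.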
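Now in the particular case of LSPR, the equations are of the form:
$$\xi - x = a x + b y + c$$
$$\eta - y = d x + e y + f$$
where a, b and c (and likewise d, e and f) represent x1, x2 and x3 in our original formulation, and the
measured x and y of each star supply the coefficients.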
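To connect the two notations: moving everything to one side, the first plate equation for star i reads a·x_i + b·y_i + c·1 - (ξ_i - x_i) = 0, so its equation of condition has coefficients (x_i, y_i, 1) and constant -(ξ_i - x_i). The second equation works the same way with η_i - y_i. A minimal Python sketch of building those rows; the function name `condition_rows` and all of the numbers are hypothetical.

```python
def condition_rows(xs, ys, offsets):
    """One equation of condition (x_i, y_i, 1, -offset_i) per star.

    offsets holds xi_i - x_i for the first plate equation,
    or eta_i - y_i for the second.
    """
    return [(x, y, 1.0, -off) for x, y, off in zip(xs, ys, offsets)]


# Made-up plate measurements and made-up plate constants, used only to
# generate consistent offsets for the demonstration.
xs = [10.0, -5.0, 3.0]
ys = [2.0, 8.0, -7.0]
a, b, c = 0.01, -0.02, 0.5
offsets = [a * x + b * y + c for x, y in zip(xs, ys)]

for row in condition_rows(xs, ys, offsets):
    print(row)
```

These rows can then be fed to whatever least-squares routine you write for the programming task.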
Your programming task for the remainder of the period is to write a program that
can find the solution via least squares to the following equations.
7x1 – 6x2 + 8x3 – 15 = 0
3x1 + 5x2 – 2x3 – 27 = 0
2x1 – 2x2 + 7x3 – 20 = 0
4x1 + 2x2 – 5x3 – 2 = 0
9x1 – 8x2 + 7x3 – 5 = 0
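First solve it by hand. Add the rows to get s1, s2, s3, s4, s5:
7x1 – 6x2 + 8x3 – 15 = 0     s1 = -6
3x1 + 5x2 – 2x3 – 27 = 0     s2 = -21
2x1 – 2x2 + 7x3 – 20 = 0     s3 = -13
4x1 + 2x2 – 5x3 – 2 = 0      s4 = -1
9x1 – 8x2 + 7x3 – 5 = 0      s5 = 3
∑ai1si = -108, ∑ai2si = -69, ∑ai3si = -71
[1] Cramer was Swiss, so his name is pronounced "CRAW-mer".
Set up the normal equations:
$$A_{11} = \sum a_{i1}^2, \quad A_{12} = \sum a_{i1}a_{i2}, \quad A_{13} = \sum a_{i1}a_{i3}, \quad B_1 = \sum a_{i1}b_i$$
$$A_{22} = \sum a_{i2}^2, \quad A_{23} = \sum a_{i2}a_{i3}, \quad B_2 = \sum a_{i2}b_i$$
$$A_{33} = \sum a_{i3}^2, \quad B_3 = \sum a_{i3}b_i$$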
159x1 – 95x2 + 107x3 – 279 = 0
-95x1 + 133x2 – 138x3 + 31 = 0
107x1 – 138x2 + 191x3 – 231 = 0
Use Cramer’s Rule to solve:
x1 = 2.474 x2 = 5.397 x3 = 3.723
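One possible shape for such a program is sketched below in Python. It is only a sketch under stated assumptions: the function names `normal_equations`, `det3` and `cramer3` are hypothetical, the five equations are entered as (a_i1, a_i2, a_i3, b_i) tuples, and the constants are moved to the right-hand side as -B before Cramer's Rule is applied.

```python
def normal_equations(rows):
    """A[j][k] = sum a_ij*a_ik and B[j] = sum a_ij*b_i over all rows."""
    A = [[sum(r[j] * r[k] for r in rows) for k in range(3)] for j in range(3)]
    B = [sum(r[j] * r[3] for r in rows) for j in range(3)]
    return A, B


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(A, rhs):
    """Solve A x = rhs for three unknowns with Cramer's Rule."""
    D = det3(A)
    if D == 0:
        raise ValueError("det(A) is zero: no unique solution")
    solution = []
    for k in range(3):
        # A_k: column k replaced by the right-hand side.
        Ak = [row[:k] + [r] + row[k + 1:] for row, r in zip(A, rhs)]
        solution.append(det3(Ak) / D)
    return solution


# The five equations of condition: a_i1*x1 + a_i2*x2 + a_i3*x3 + b_i = 0
rows = [
    (7, -6, 8, -15),
    (3, 5, -2, -27),
    (2, -2, 7, -20),
    (4, 2, -5, -2),
    (9, -8, 7, -5),
]
A, B = normal_equations(rows)
x = cramer3(A, [-b for b in B])  # A x = -B
print(A)
print(B)
print([round(v, 3) for v in x])
```

The printed A and B can be compared line by line with the three normal equations above before trusting the solution.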
Your program should be able to take input and return the correct output.
If you have time, have your program calculate the residuals for each of the original
equations and the standard deviation of the residuals.
residuals [-0.28110831 -0.04074867 0.21272028 0.07598631 0.15117982]
Standard Deviation 0.0527843650619
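A residual check along these lines might look like the Python sketch below. The solution values are taken as approximately 2.474, 5.397 and 3.723, which reproduce the residuals listed above to about three decimal places. The handout does not say which definition of standard deviation produced the quoted value, so the sketch prints both the population and the sample form; treat either choice as an assumption, and do not expect the numbers to match the quoted one.

```python
import statistics

# The five equations of condition: a_i1*x1 + a_i2*x2 + a_i3*x3 + b_i = 0
rows = [
    (7, -6, 8, -15),
    (3, 5, -2, -27),
    (2, -2, 7, -20),
    (4, 2, -5, -2),
    (9, -8, 7, -5),
]
x = (2.474, 5.397, 3.723)  # least-squares solution, rounded to three decimals

# Residual of each original equation at the fitted solution.
res = [a1 * x[0] + a2 * x[1] + a3 * x[2] + b for a1, a2, a3, b in rows]
print(res)

# Two common definitions; which one to report is an assumption here.
print(statistics.pstdev(res))  # divides by N
print(statistics.stdev(res))   # divides by N - 1
```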
Once we have the plate constants, we can find the Xi and Eta of the unknown object using the two
equations you have just derived:
$$\xi - x = a x + b y + c$$
$$\eta - y = d x + e y + f$$
where a, b, c, d, e and f are now known, and x and y are the measured coordinates of your unknown object.
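Applying the plate constants is a one-liner per coordinate. A minimal Python sketch, in which the function name `plate_to_standard` and the sample constants are hypothetical; note that x and y are added back because the fitted quantities are the offsets ξ - x and η - y.

```python
def plate_to_standard(x, y, consts):
    """Return (xi, eta) for measured (x, y) given plate constants (a, b, c, d, e, f)."""
    a, b, c, d, e, f = consts
    xi = x + a * x + b * y + c
    eta = y + d * x + e * y + f
    return xi, eta


# Made-up plate constants and a made-up measurement, for illustration only.
consts = (0.1, 0.0, 1.0, 0.0, 0.2, -1.0)
print(plate_to_standard(2.0, 3.0, consts))
```

With all six constants set to zero the function should hand back (x, y) unchanged, which is a quick way to test it.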
Once the standard coordinates Xi and Eta of the unknown object are known, we can use the inverse of
the relationship derived earlier, repeated here, to obtain RA and DEC:
$$\xi = \frac{\sin(\alpha - A)}{\sin D \tan\delta + \cos D \cos(\alpha - A)}$$
$$\eta = \frac{\tan\delta - \tan D \cos(\alpha - A)}{\tan D \tan\delta + \cos(\alpha - A)}$$
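The handout leaves the inversion to the reader. One standard way to write it, offered here as an assumption rather than as the form from the lecture, is tan(α - A) = ξ / (cos D - η sin D) and tan δ = (sin D + η cos D) / sqrt(ξ² + (cos D - η sin D)²). The Python sketch below implements that and checks it by a round trip through the forward projection; the function names are hypothetical and all angles are assumed to be in radians.

```python
import math


def standard_coords(ra, dec, ra0, dec0):
    """Forward projection (alpha, delta) -> (xi, eta) about the plate center (A, D)."""
    dra = ra - ra0
    denom = (math.sin(dec0) * math.sin(dec)
             + math.cos(dec0) * math.cos(dec) * math.cos(dra))
    xi = math.cos(dec) * math.sin(dra) / denom
    eta = (math.cos(dec0) * math.sin(dec)
           - math.sin(dec0) * math.cos(dec) * math.cos(dra)) / denom
    return xi, eta


def standard_to_equatorial(xi, eta, ra0, dec0):
    """Inverse projection (xi, eta) -> (alpha, delta)."""
    denom = math.cos(dec0) - eta * math.sin(dec0)
    ra = ra0 + math.atan2(xi, denom)
    dec = math.atan2(math.sin(dec0) + eta * math.cos(dec0),
                     math.hypot(xi, denom))
    return ra, dec


# Round trip with made-up values (radians): the input position should come back.
A, D = 2.618, 0.349
xi, eta = standard_coords(2.62, 0.35, A, D)
print(standard_to_equatorial(xi, eta, A, D))
```

Using `atan2` rather than a plain arctangent keeps the quadrant of α - A correct without extra bookkeeping.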
Next we need to measure the quality of the linear regression. Calculate the residuals for each of the
stars by using their measured values of (x, y) and your calculated values of a, b, c, d, e and f to determine
the ξ and η for each star. Then calculate the RA and DEC for each star. The residual is the difference
between the calculated RA and DEC and the RA and DEC you input at the beginning of the program.
Last STEP!!!! Calculate the standard deviation of the residuals of the stars' RA and DEC values. This
is a measure of the mean error in your variable star position.
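That last step could be sketched in Python as below. The function name `position_scatter` and the sample numbers are hypothetical, the results are in whatever units the inputs use, and the sample standard deviation (dividing by N - 1) is an assumption; swap in `statistics.pstdev` if you prefer the population form.

```python
import statistics


def position_scatter(catalog, fitted):
    """Standard deviation of the RA residuals and of the DEC residuals.

    catalog and fitted are lists of (ra, dec) pairs for the same stars,
    in the same order and the same units.
    """
    d_ra = [f[0] - c[0] for c, f in zip(catalog, fitted)]
    d_dec = [f[1] - c[1] for c, f in zip(catalog, fitted)]
    return statistics.stdev(d_ra), statistics.stdev(d_dec)


# Made-up catalog and fitted positions (degrees), for illustration only.
catalog = [(150.0, 20.1), (150.4, 19.8), (149.7, 20.3)]
fitted = [(150.0002, 20.0999), (150.3999, 19.8002), (149.7001, 20.2998)]
print(position_scatter(catalog, fitted))
```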