W.R. Wilcox, Clarkson University, 4 May 2005

5. Using a semi-empirical equation to correlate data

We saw in Section 4 that the theory-based Clapeyron equation, while giving a far better fit than polynomials, was still deficient. (This indicates that ΔHv/ΔZv varies somewhat with temperature, rather than being constant as assumed in the integration of the Clausius-Clapeyron equation.) Consequently, a variety of semi-empirical modifications have been proposed for the Clapeyron equation.[1] The most widely used of these is the Antoine equation:

ln(P) = A - B/(T+C)    (5.1)

[1] See, for example, chapter 6 in "The Properties of Gases and Liquids," 3rd edition, by R.C. Reid, J.M. Prausnitz and T.K. Sherwood, McGraw-Hill, NY (1977).

Unfortunately, there is no way to use MATLAB's Basic Fitting tool to find the values of the constants A, B and C that best fit the data. Instead, we must use the backslash (\) operator. Read the material at MATLAB Help, Contents, MATLAB, Mathematics, Data Analysis and Statistics, Regression and Curve Fitting. Follow the links at the right bottom of each page.

In order to use the backslash operator, we must first put the Antoine equation into a form that is linear in the unknown constants. Multiplying both sides of equation (5.1) by (T+C)/T gives:

ln(P) + C ln(P)/T = A + (AC - B)/T

Moving C ln(P)/T to the right-hand side gives a linearized form of the Antoine equation:

ln(P) = A + (AC - B)/T - C ln(P)/T    (5.2)

This can be rewritten in the following more familiar form:

Y = a0 + a1 X1 + a2 X2    (5.3)

where Y = ln(P), a0 = A, a1 = AC - B, X1 = 1/T, a2 = -C and X2 = ln(P)/T. Thus, if we are able to determine a0, a1 and a2, then we find from the above definitions that:

A = a0
C = -a2
B = AC - a1 = -(a0 a2 + a1)

Without going into details, the backslash operator finds the values of the constants giving the least-squares fit to Y.

If you have closed and re-opened MATLAB since Section 4, you will need to reload COvapP.dat into MATLAB and extract the values of P and T. Then execute the following:
>> Y = log(P); X1 = 1./T; X2 = log(P)./T;
>> D(:,1) = ones(length(T),1); D(:,2) = X1; D(:,3) = X2;
>> F = D\Y, a0 = F(1), a1 = F(2), a2 = F(3)
>> A = a0, C = -a2, B = - (a0*a2 + a1)
A = 19.082
C = -16.221
B = 496.63

Now use these values in the Antoine equation (5.1) to calculate predicted values of ln(P), a plot of the residuals of ln(P) versus T, and r^2 for ln(P). Also prepare a semilogarithmic plot of P versus T showing the data as points and the Antoine equation as a curve.

>> lnPp = A - B./(T+C);   % predicted ln(P)
>> d = Y - lnPp; bar(T,d)   % residuals; Y = ln(P) from above
>> xlabel('Temperature (T)'), ylabel('Residual for ln(P)')
>> title('Residuals for Antoine fit to CO vapor pressure')
>> r2 = 1 - sum(d.^2) / sum((Y - mean(Y)).^2)
r2 = 0.99905

Figure 5.1 shows the resulting residuals plot and Figure 5.2 the semilog plot.

Figure 5.1. Residuals of ln(P) for a fit of the Antoine equation to CO vapor pressure data.

Figure 5.2. Fit of the Antoine equation to CO vapor pressure data.

If you compare this to the result obtained with the Clapeyron equation in Section 4, you will see that the Antoine equation gives a superior fit. Tabulations of Antoine equation constants (A, B, C) for a variety of chemicals are common.

This completes our treatment of correlation of x-y data. You may be wondering at this point how one goes about selecting an equation with which to fit data. The sources are:

1. Theory (as we saw here).
2. The general behavior of the data combined with a knowledge of how plots of different functions appear.

The second question is how to linearize an equation so that the backslash operator can be used. While there are some common operations, such as converting the power-law equation y = a x^b to ln(y) = ln(a) + b ln(x), this is an art that is mastered only by practice. This entire method is known as "linear regression analysis." Least-squares methods can also be applied to non-linear equations, but this involves more complex minimization methods that are beyond freshman level.
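As a small illustration of the power-law linearization mentioned above, the following sketch applies the backslash operator to ln(y) = ln(a) + b ln(x). The data are invented for this example (generated exactly from y = 2 x^1.5, not taken from COvapP.dat), and the variable names xp, yp, Dp and Fp are hypothetical, chosen so as not to overwrite the variables used in the Antoine fit.

```matlab
% Made-up, noise-free data for illustration only: yp = 2*xp.^1.5
xp = [1; 2; 4; 8; 16];
yp = 2 * xp.^1.5;

% Linearized form: ln(yp) = ln(a) + b*ln(xp)
% Design matrix: a column of ones (intercept) and a column of ln(xp)
Dp = [ones(length(xp),1), log(xp)];

% Backslash gives the least-squares solution of Dp*Fp = ln(yp)
Fp = Dp\log(yp);

a = exp(Fp(1));   % the intercept is ln(a), so take the exponential
b = Fp(2);        % the slope is the exponent b
```

Because these data follow the power law exactly, the fit should recover a = 2 and b = 1.5 to within round-off. With real, scattered data, keep in mind that this procedure minimizes the squared residuals of ln(y), not of y itself, just as the Antoine fit above minimizes the residuals of ln(P).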