Analysis of congested spectra by total spectrum fitting: A statistical study *

Joel Tellinghuisen

Department of Chemistry, Vanderbilt University

Nashville, TN 37235

* J. Mol. Spectrosc. 226, 137 (2004).

Personal Retrospective:

(1) TSF yields surprisingly precise information about population distributions from “poor” ESD spectra of OH [Mendenhall, et al., Nucl. Instrum. and Meth. B33, 834 (1988)].

(2) Impressive performance found for TSF applied to ESD-produced CN spectra [Xu, et al., J. Chem. Phys. 93, 5281 (1990)].

[Figure: panels OH A–X (0–0) and CN B–X (0–0); axis λ (nm), with ticks at 384 and 389.]

In (a), fits yield a 2T Boltzmann rotational distribution, with Ts of 91(3) (19%) and 665(7) K (800 spectral points).

(3) Later work on CN spectra of higher quality yields even more impressive results. [For similar work, see Xu, et al., Phys. Rev. B 48, 8222 (1993).]

[Figure: CN B–X from ESD off NaCl; experimental and calculated spectra vs. λ (Å), 3840–3890.]

CN B–X Fit Model

Quadratic Background: b_0 = 0.056(5); b_1 = 2.0(5) × 10⁻³; b_2 = –1.2(5) × 10⁻⁴

Population Distribution (3T Boltzmann): T_1 = 161(8) K, c_1 = 0.68(4); T_2 = 519(13) K, c_2 = 1.16(4); T_3 = 19 K (fixed), c_3 = 0.056(5)

Line Shape, Gauss-Lorentz (~ Voigt): Δλ_1/2 = 0.335(3) Å; f_G = 0.82(6)

1-1 Band: I(1–1)/I(0–0) = 0.061(6)

Calibration: Δλ (Å) = A sin[k(λ – λ_0)] + B; A = 0.133(8); λ_0 = –2.6(2) Å; B = –1.010(5)

(2nd Harmonic!): Δλ_2 = A_2 sin[2k(λ – λ_2)]; A_2 = –0.030(7); λ_2 = 0.23(38)

Reduced χ² = 1.13

(4) OH A–X spectra recorded at high resolution on a CCD array require inclusion of OD (natural abundance ~ 0.014%) in fit model [J. Chem. Phys. 114, 3465 (2001)].

[Figure: OH A–X emission, from Tesla discharge spectrum of water vapor in Ar, as recorded at ~0.03 Å resolution on a CCD array (800 channels); axis λ (Å) – 3000 Å, 75–100. Normalized LS residuals below.]

[Figure: Blowup of 6-Å region of spectrum (axis λ (Å) – 3000 Å, 70–77), showing residuals with and without OD in fit model. OD abundance fit parameter = 2.1(2) × 10⁻⁴.]
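The Gauss-Lorentz (~ Voigt) line shape listed in the CN fit model can be illustrated with a short Python sketch. The functional form is an assumption: the slide gives only the width (0.335 Å) and a Gaussian fraction f_G (0.82), so the code below takes f_G as the weight of a unit-height Gaussian mixed with a unit-height Lorentzian of the same FWHM. The function name `gauss_lorentz` is hypothetical.

```python
import math

def gauss_lorentz(x, fwhm, f_gauss):
    """Unit-height Gauss-Lorentz mixture at offset x from line center.

    Assumption (not stated in the source): f_gauss weights two
    unit-height components that share a single FWHM.
    """
    u = 2.0 * x / fwhm                        # offset in units of the half-width
    gauss = math.exp(-math.log(2.0) * u * u)  # equals 0.5 at u = 1
    lorentz = 1.0 / (1.0 + u * u)             # equals 0.5 at u = 1
    return f_gauss * gauss + (1.0 - f_gauss) * lorentz

# Values quoted in the CN fit model: FWHM = 0.335 Å, f_G = 0.82
FWHM, F_G = 0.335, 0.82
for dx in (0.0, FWHM / 2, FWHM):
    print(f"offset {dx:.4f} Å -> relative height {gauss_lorentz(dx, FWHM, F_G):.4f}")
```

With both components normalized to unit height, the mixture keeps the common FWHM for any f_G; only the wings change. An area-weighted mixture would behave differently, which is why the convention is flagged as an assumption.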

All examples so far have involved prior knowledge of the spectroscopic constants. Similar methods have worked on congested spectra requiring estimation of the constants.

[Figure (labeled "Calibration"): I₂ D'–A' 22-0 band; experimental and calculated spectra vs. wavenumber (cm⁻¹), 32521–32529.]

Experimental and fitted spectra for D'–A' transition in I₂, as excited in a free jet expansion [Zheng, et al., J. Chem. Phys. 96, 4877 (1992)]. Adjustable fit parameters include B', B'', and ν_0, yielding apparent errors of 0.0008 cm⁻¹, 1.3 × 10⁻⁵ cm⁻¹, and 1.4 × 10⁻⁵ cm⁻¹. The latter two represent relative errors of 0.05%.

Questions:

• Just how good are these derived parameters?

• In the case of the fitted spectroscopic parameters, how do these results compare with those obtained by the traditional Measure, Assign, Fit (MAF) approach?

• The first question is relevant because these are nonlinear LS fits, so there is no guarantee on the V-based statistical errors. However, the relative errors are much less than 10%, so these parameters would appear to satisfy the “10% rule of thumb” [J. Phys. Chem. A 104, 2834 (2000)].

Approach: Monte Carlo calculations.

Test Case: Last example above.

Complication: Peak-finder for MAF method.

[Figure: synthetic spectra vs. ν – ν_0 (cm⁻¹), –7 to 0, for B' = 0.0177, 0.01725, and 0.016816.]

Spectra are synthesized for various B' and fixed B'' (= 0.02804 cm⁻¹) at T_rot = 5 K. P and R lines are fully resolved at top, perfectly overlapped at bottom.
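A minimal Python sketch of how such test spectra might be synthesized. It is not the author's code, and it rests on stated assumptions: rigid-rotor energies (no centrifugal distortion), Hönl-London factors J+1 (R) and J (P), lower-state Boltzmann weights, no nuclear-spin alternation, and Gaussian lines whose width (`fwhm`) is a hypothetical choice. In this rigid-rotor picture R(J+3) and P(J) coincide exactly when 5B' = 3B'', which B' = 0.016816 cm⁻¹ nearly satisfies for B'' = 0.02804 cm⁻¹.

```python
import math

K_CM = 0.6950348  # Boltzmann constant in cm^-1 per K

def line_list(b_up, b_lo, t_rot, j_max=40):
    """Return (offset from band origin in cm^-1, relative intensity, label).

    Assumptions: rigid rotor, Honl-London factors J+1 (R) and J (P),
    Boltzmann weights from the lower state, no spin statistics.
    """
    lines = []
    for j in range(j_max + 1):
        e_lo = b_lo * j * (j + 1)
        boltz = math.exp(-e_lo / (K_CM * t_rot))
        # R(J): upper-state J' = J + 1
        lines.append((b_up * (j + 1) * (j + 2) - e_lo, (j + 1) * boltz, f"R({j})"))
        if j >= 1:
            # P(J): upper-state J' = J - 1
            lines.append((b_up * (j - 1) * j - e_lo, j * boltz, f"P({j})"))
    return lines

def spectrum(grid, lines, fwhm):
    """Sum Gaussians of the given FWHM over the line list."""
    c = 4.0 * math.log(2.0) / fwhm ** 2
    return [sum(s * math.exp(-c * (x - pos) ** 2) for pos, s, _ in lines)
            for x in grid]

B_LO = 0.02804                                  # B'' from the slide, cm^-1
grid = [-7.0 + 0.005 * i for i in range(1401)]  # -7 to 0 cm^-1
for b_up in (0.0177, 0.01725, 0.016816):        # B' values from the slide
    y = spectrum(grid, line_list(b_up, B_LO, t_rot=5.0), fwhm=0.03)
    print(f"B' = {b_up}: peak signal = {max(y):.3f}")
```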

Monte Carlo Review *

• Add random Gaussian error of known σ to the dependent variable (the signal) in the model. [Here proportional error is assumed, e.g., 5% of signal.]

• Repeat N times to generate statistically equivalent spectra. Analyze each using weights w_i = σ_i⁻².

• Do sampling statistics in the usual way, obtaining MC estimated values of each parameter b and its standard error σ_b. The MC precision of the mean of b (needed to evaluate bias) is σ_b/N^(1/2). The MC estimate of σ_b has relative error (2N)^(–1/2) (= 2.2% for N = 1000).

* Recommended pedagogy: J. Chem. Educ. 82, 157 (2005).
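The Monte Carlo recipe above can be sketched in a few lines of Python. To stay short and runnable, this illustration substitutes a weighted straight-line fit for the spectral model; the true parameters, the x grid, and the function names are hypothetical. Because that model is linear and σ is known, the V-based ("exact") standard error should agree with the MC sampling estimate to within the (2N)^(–1/2) precision.

```python
import math
import random
import statistics

def weighted_line_fit(x, y, sigma):
    """Weighted LS fit of y = a + b*x with w_i = sigma_i**-2.

    Returns (a, b, se_a, se_b); standard errors come from the
    variance-covariance matrix V = (X^T W X)^-1 for known sigma.
    """
    w = [s ** -2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / det
    b = (s0 * sxy - sx * sy) / det
    return a, b, math.sqrt(sxx / det), math.sqrt(s0 / det)

def monte_carlo_slope(n_trials=1000, frac=0.05, seed=1):
    """Sampling statistics for the slope over n_trials noisy data sets."""
    rng = random.Random(seed)
    x = [float(i) for i in range(1, 21)]
    b_true = 2.0
    y_true = [10.0 + b_true * xi for xi in x]  # hypothetical true model
    sigma = [frac * yt for yt in y_true]       # proportional error, known a priori
    slopes = []
    for _ in range(n_trials):
        y = [rng.gauss(yt, s) for yt, s in zip(y_true, sigma)]
        slopes.append(weighted_line_fit(x, y, sigma)[1])
    sd_mc = statistics.stdev(slopes)
    return {
        "bias": statistics.mean(slopes) - b_true,
        "bias_precision": sd_mc / math.sqrt(n_trials),  # sigma_b / N^(1/2)
        "sd_mc": sd_mc,
        "sd_mc_rel_error": (2 * n_trials) ** -0.5,      # (2N)^(-1/2)
        "se_exact": weighted_line_fit(x, y_true, sigma)[3],
    }

result = monte_carlo_slope()
for key, value in result.items():
    print(f"{key:16s} {value:.6g}")
```

For a nonlinear model such as a full spectrum fit, the same loop applies with the fit routine swapped in; disagreement between `sd_mc` and `se_exact` beyond the MC precision is then the signal that the V-based errors are unreliable.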

• V-based standard errors for B in TSF approach are valid except where lines are doubled up; there “exact” predictions are conservative, while just off peak they are optimistic.

• For B σs, MAF outperforms TSF in doubled-up regions.

[Figure: σ_B'' (cm⁻¹) vs. B' (cm⁻¹), 0.0160–0.0200, comparing exact, TSF, and MAF results, with TSF bias; inset "(First peak expanded)", B' = 0.0165–0.0171 cm⁻¹.]

• Similar results for ν_0 (points and curve at top), except that now TSF outperforms MAF everywhere in MC sampling.

[Figure: σ (cm⁻¹) vs. B' (cm⁻¹), 0.0160–0.0185, with a trace labeled "(Linewidth parameter)".]

So why are the “exact” (V-based) parameter σs wrong when lines double up?

Examine χ² surfaces for a clue.

1) Fix B'' = 0.02804 cm⁻¹.

2) Compute spectra for indicated values of B'.

3) Fit spectra with B'' frozen at values along the abscissa.

4) Results show χ² surfaces with respect to B'', for indicated B'.

[Figure: χ² vs. B'' (cm⁻¹), 0.0270–0.0290, for B' = 0.01650, 0.01669, 0.016816, and 0.01700; curve annotations "Exact too high", "Too low", and "Correct!".]
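The freeze-and-refit procedure in steps 1) to 4) can be illustrated on a model where the answer is known. The Python sketch below uses a weighted straight-line fit with made-up data, not the spectral model: the slope is frozen at trial values, the intercept is re-optimized, and χ² is recorded. For a linear model the surface is exactly parabolic and rises by 1 at one V-based standard error from the minimum, so a surface that departs from this shape is the warning sign that V-based errors will be off. Function names and data are hypothetical.

```python
import math

def chi2_profile(x, y, sigma, b_frozen):
    """Minimum chi-square of y = a + b*x over a, with b frozen at b_frozen."""
    w = [s ** -2 for s in sigma]
    resid = [yi - b_frozen * xi for xi, yi in zip(x, y)]
    a_best = sum(wi * ri for wi, ri in zip(w, resid)) / sum(w)
    return sum(wi * (ri - a_best) ** 2 for wi, ri in zip(w, resid))

def slope_and_sigma(x, y, sigma):
    """Weighted LS slope and its V-based standard error."""
    w = [s ** -2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * sxx - sx * sx
    return (s0 * sxy - sx * sy) / det, math.sqrt(s0 / det)

# Hypothetical data with constant, known sigma
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
sigma = [0.2] * len(x)

b_hat, se_b = slope_and_sigma(x, y, sigma)
chi2_min = chi2_profile(x, y, sigma, b_hat)
for k in (-2, -1, 0, 1, 2):
    delta = chi2_profile(x, y, sigma, b_hat + k * se_b) - chi2_min
    print(f"b = b_hat {k:+d} sigma: delta chi2 = {delta:.4f}")
```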

Histogrammed results for B'' from MC-generated spectra having various B' confirm this behavior.

[Figure: histograms of B'' (cm⁻¹) for B' = 0.0165, 0.01669, and 0.016816; two panels, B'' = 0.0276–0.0288 and 0.0279–0.0282.]

For MAF analysis of spectra, bias dominates statistical error except for regions where the R and P lines are either well resolved or perfectly coincident.

[Figure: MAF bias vs. B' (cm⁻¹), 0.0160–0.0200.]

Next step: Global analysis with B'' fixed. Now TSF method outperforms MAF everywhere, and exact and MC error estimates agree.

[Figure: σ(B') vs. B' (cm⁻¹), 0.0160–0.0200.]

Kicker: MAF results depend on peak-finding algorithm.

Histogram of MAF B'' values for B' = 0.01725 cm⁻¹

[Figure: counts vs. B'' (cm⁻¹), 0.0272–0.0280; traces labeled 0.5% noise and 5% noise, peak finder (a) and peak finder (b).]

Kicker 2: Good performance of TSF method is partly due to knowledge of the correct model.

Summary

• Total spectrum fitting outperforms the traditional measure-assign-fit approach to the analysis of congested diatomic spectra, except for the special case of perfectly coincident P and R lines, where MAF tacitly incorporates a special relation between the two B values.

• A major advantage of TSF is lack of significant bias.

• However, a major limitation of TSF is the requirement for a correct fit model. Model error can give bias comparable to the nominal statistical error.

• A second significant limitation of TSF is unreliable parameter errors, from violation of the 10% rule of thumb. In the present case such violations are restricted to regions where the two B values are highly correlated, leading to very nonparabolic χ² surfaces.
