Applying Branching Processes Theory for Building a Statistical

Model for Scanning Electron Microscope Signal

Ira Cohen a,b, Rotem Golan a,b and Stanley Rotman b

a Opal Technologies, Nes-Ziona, Israel.

b Department of Electrical & Computer Engineering, Ben-Gurion University, Beer-Sheva, Israel.

Abstract: Branching stochastic processes are used to describe random systems such as nuclear chain reactions, population development and gene propagation. In this work we show that the creation of the SEM (Scanning Electron Microscope) signal can be modeled as a branching stochastic process. A statistical model is described step by step, as a function of the physical parameters of the process. Using the model, we propose a method for determining the unknown probability distribution of the secondary electron (SE) emission. Using this method, a Lognormal distribution is shown to approximate the SE emission well, whereas a Poisson distribution is shown to approximate it poorly.

1 Introduction

SEM image enhancement and noise reduction algorithms rely on some basic assumptions about the statistical characteristics of the SEM image and the noise. For example, Erasmus (1982) [1] assumes that the noise is additive, independent and zero-mean. Podsiadlo et al. (1995) [2] further assume that this noise is Gaussian. However, a full statistical model for the formation of the SEM signal is not available.

The notion of modeling the statistics of electron emission in SEM by the branching process theory was partially introduced earlier by Reimer [3]. Reimer describes the statistics of the SE and the BSE (backscattered electrons) emission in terms of their mean and variance only, using the results of branching processes theory.

In this work, we use the theory of branching processes to analyze the formation of the SEM signal. As a first step, in section 2, we present an overview of the theory of branching stochastic processes. In section 3, we apply the theory to the SEM signal formation process and, subsequently, introduce a full statistical model, including the generating function and the moments of the process. In section 4, we describe examples of a typical simulated process. In section 5, we introduce a method to find an approximate distribution for the SE emission. We use the model to synthesize a SEM signal and statistically compare it to a real SEM signal. The synthesized signal is based on the detection of secondary electrons.

Since the probability distribution of the SE emission is unknown, we can test several assumptions about the SE emission distribution by statistically comparing the synthesized signal with the real signal. We compare two possible distributions, Poisson and Lognormal, and find that the Lognormal distribution is the better fit of the two to the SE emission.

2 Branching Processes- Theoretical Review

A branching process, in general, is a process where an initial random number of objects ‘create’ more objects of the same or a different type, and these objects continue to ‘create’ other objects, with the system developing in accordance with some probability law. An example of such a process is a nuclear chain reaction, where an initial number of neutrons hit nuclei, which split and emit neutrons that have a certain probability of creating other neutrons; the process continues statistically.

Another example is the imaging process of the SEM. An initial number of primary electrons hit the specimen, causing an emission of other electrons, which are partially detected by the system’s detectors, and cause an emission of other particles, electrons or photons, depending on the type of detector used in the particular system.

Figure 1 is a graphic illustration of a general multilevel branching process.

Figure 1: Graphic illustration of a multilevel branching process. Primary particles 1, 2, ..., N create first-level particles, whose total number is $V_1$; these create second-level particles (total $V_2$), which in turn create third-level particles (total $V_3$).

First, we describe a two level branching process. Expanding this to a multilevel branching process will be straightforward.

Let $N$ be a random variable with probability distribution function $g_N(n) = P(N=n)$, mean $\mu_N$ and variance $\sigma_N^2$.

Let $\{X_i\}$ be a series of independent identically distributed (i.i.d.) random variables with a common distribution $f_X(x)$, with $\mu_X$ as the mean and $\sigma_X^2$ as the variance of each element in the series.

The sum of $N$ elements of the series $\{X_i\}$ is denoted by

$$V = X_1 + X_2 + \dots + X_N \tag{1}$$

The mean of $V$ is

$$E[V] = E[N]\,E[X_i] = \mu_N \mu_X \tag{2}$$

And the variance of $V$ is

$$\mathrm{Var}[V] = E[N]\,\mathrm{Var}[X_i] + \mathrm{Var}[N]\,E^2[X_i] = \mu_N \sigma_X^2 + \sigma_N^2 \mu_X^2 \tag{3}$$

Proofs of equations (2) and (3) can be found in [4].
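For readers who want the intermediate step, the mean and variance of the random sum can be obtained by conditioning on $N$. The following is a standard sketch based on the laws of total expectation and total variance, not the proof given in [4]; it assumes that $N$ is independent of the $X_i$.

```latex
\begin{aligned}
E[V \mid N=n] &= n\,E[X_i], \qquad \mathrm{Var}[V \mid N=n] = n\,\mathrm{Var}[X_i], \\
E[V] &= E\big[E[V \mid N]\big] = E[N]\,E[X_i], \\
\mathrm{Var}[V] &= E\big[\mathrm{Var}[V \mid N]\big] + \mathrm{Var}\big[E[V \mid N]\big]
  = E[N]\,\mathrm{Var}[X_i] + \mathrm{Var}[N]\,E^2[X_i].
\end{aligned}
```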

The distribution density function $f_V(v)$ of $V$ can be derived from the basic formula for conditional probabilities

$$f_V(v=k) = P(V=k) = \sum_{n=0}^{\infty} P(N=n)\,P(X_1+\dots+X_n=k) \tag{4}$$

Let us denote by $f_X(x)$ and $\varphi_X(s)$ the distribution density function and generating function of $X_i$, respectively, and by $g_N(n)=P(N=n)$ and $\varphi_N(s)$ the distribution density function and generating function of $N$, respectively. For a fixed $n$, the distribution of the sum $X_1+\dots+X_n$ is expressed by the $n$-fold convolution of $f_X(x)$ with itself, due to the independence of the series $\{X_i\}$. Therefore equation (4) can be written in a more compact form

$$f_V(v=k) = \sum_{n=0}^{\infty} g_N(n)\,\{f_X(x)\}^{n*} \tag{5}$$

where $\{\,\}^{n*}$ symbolizes the $n$-fold convolution.

This formula can be simplified by using the generating functions. Since the $n$-fold convolution becomes an $n$-fold multiplication in this form, we derive from (5) that the generating function $\varphi_V(s)$ of $V$ is

$$\varphi_V(s) = \sum_{k=0}^{\infty} f_V(v=k)\,s^{k} = \sum_{n=0}^{\infty} g_N(n)\,[\varphi_X(s)]^{n} \tag{6}$$

The right side of (6) is the Taylor expansion of $\varphi_N(s)$ with $s$ replaced by $\varphi_X(s)$ [5]. This proves that the generating function of the sum $V$ is the following compound function

$$\varphi_V(s) = \varphi_N(\varphi_X(s)) \tag{7}$$
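The equivalence between the convolution form (5) and the compound function (7) can be checked numerically. The Python sketch below is illustrative only: the Poisson law chosen for N, the three-point distribution chosen for X and the evaluation point s are hypothetical values, and the generating function is taken to be the power series in s.

```python
import math

def poisson_pmf(mean, n_max):
    """P(N = n) for n = 0..n_max under a Poisson law (truncated)."""
    return [math.exp(-mean) * mean**n / math.factorial(n) for n in range(n_max + 1)]

def convolve(a, b, k_max):
    """Discrete convolution of two PMFs, truncated to the support 0..k_max."""
    out = [0.0] * (k_max + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= k_max:
                out[i + j] += ai * bj
    return out

def compound_pmf(g_n, f_x, k_max):
    """Equation (5): f_V(k) = sum_n g_N(n) * (n-fold convolution of f_X)(k)."""
    f_v = [0.0] * (k_max + 1)
    conv = [1.0] + [0.0] * k_max           # 0-fold convolution: point mass at 0
    for g in g_n:
        for k in range(k_max + 1):
            f_v[k] += g * conv[k]
        conv = convolve(conv, f_x, k_max)  # next convolution power
    return f_v

def pgf(pmf, s):
    """Generating function sum_k pmf[k] * s**k."""
    return sum(p * s**k for k, p in enumerate(pmf))

# Hypothetical example values (not taken from the paper).
g_n = poisson_pmf(2.0, 40)         # N ~ Poisson(2), truncated at n = 40
f_x = [0.2, 0.5, 0.3]              # X takes the values 0, 1, 2
f_v = compound_pmf(g_n, f_x, 80)   # the support of V is at most 2 * 40 = 80

s = 0.7
lhs = pgf(f_v, s)                  # generating function of V from its PMF
rhs = pgf(g_n, pgf(f_x, s))        # compound function of equation (7)
print(abs(lhs - rhs))              # difference is at floating-point level
```

The same `compound_pmf` routine can be applied level by level to a multilevel process, using the output PMF of one level as the count distribution of the next.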

For multilevel branching processes, the expansion is simply a recursive use of (2), (3), (5) and (7) with a change of parameters in accordance with the probability law of the previous and new objects in the process.

It should be noted that all of the moments of the process, and not only the first and second moments, can be derived from the generating function described in (7).
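As a worked illustration of this remark, the first two moments can be recovered by differentiating the compound generating function at $s=1$. This is a standard calculation, written here for integer-valued variables and for generating functions $\varphi_V$, $\varphi_N$, $\varphi_X$ taken as power series in $s$ (an assumption on notation):

```latex
\begin{aligned}
\varphi_V'(s) &= \varphi_N'\big(\varphi_X(s)\big)\,\varphi_X'(s), \\
E[V] &= \varphi_V'(1) = \varphi_N'(1)\,\varphi_X'(1) = E[N]\,E[X_i]
  \quad \text{since } \varphi_X(1) = 1, \\
E[V(V-1)] &= \varphi_V''(1) = \varphi_N''(1)\,[\varphi_X'(1)]^2 + \varphi_N'(1)\,\varphi_X''(1), \\
\mathrm{Var}[V] &= \varphi_V''(1) + \varphi_V'(1) - [\varphi_V'(1)]^2 .
\end{aligned}
```

Higher moments follow in the same way from higher derivatives.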

3 The Statistical Model of the SEM signal

The creation of the SEM signal can be divided into three stages. The first stage is the electron beam itself. The electrons in the beam are called primary electrons. The interaction of a primary electron beam with a specimen creates a primary excitation within the specimen in which electrons are scattered. This scattering may be divided into two types: nearly elastic and inelastic.



In nearly elastic interactions, the electrons involved retain virtually all of their energy. The resulting high-energy electrons are termed backscattered electrons if they are emitted back from the specimen.

In inelastic interactions, the electrons involved lose much of their energy and hence are of low energy. Those electrons with an energy of less than 50 electron volts may be termed secondary electrons. Secondary electrons are created throughout the primary excitation. Due to their low energy, most of them are absorbed by adjacent atoms in the specimen. As a result, only those secondary electrons that were created near the surface of the specimen are able to escape, carrying surface topography information. In contrast to secondary electrons, backscattered electrons can escape from greater depths within the specimen because of their higher energy.

The emitted electrons are detected by the detectors, with a certain detection efficiency.

3.1 The Beam Distribution.

It can be assumed that the number of electrons in the beam follows a Poisson distribution: the time $\tau$ for one pixel can be divided into a large number $n$ of time intervals, so that the probability $x$ of observing one electron in one of these time intervals is much less than unity and the probability of observing more than one electron per time interval is negligible. We then expect that the mean value $y$ of the number of electrons in the pixel time will be the mean of a Poisson distributed random variable, i.e. $y=nx$.

Using physical parameters, the mean number of electrons per pixel in the primary beam can be expressed as

$$n_p = \frac{I_p \tau}{e}$$

where $I_p$ is the beam current, $\tau$ is the pixel time and $e$ is the electron charge.
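As a quick numerical illustration of this relation, the following Python sketch evaluates the mean number of primary electrons per pixel. The beam current and pixel time used here are hypothetical values chosen for illustration, not operating parameters reported in this work.

```python
ELECTRON_CHARGE_C = 1.602176634e-19  # elementary charge in coulombs

def mean_primary_electrons(beam_current_a, pixel_time_s):
    """Mean number of primary electrons per pixel: n_p = I_p * tau / e."""
    return beam_current_a * pixel_time_s / ELECTRON_CHARGE_C

# Hypothetical operating point: a 10 pA beam and a 100 ns pixel time.
n_p = mean_primary_electrons(10e-12, 100e-9)
print(f"n_p = {n_p:.2f} electrons per pixel")
```

With these assumed values only a handful of electrons arrive per pixel, which is the regime in which the discrete statistics of the beam matter.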

3.2 The Secondary Electron Emission

The emission distribution of the secondary electrons is unknown. We will describe their statistics by the first and second statistical moments and the distribution function of the resulting process.

For each pixel in the SEM image (or for each ‘pixel area’ in the specimen), an unknown random number of primary electrons (PE) hit the specimen.

Let us define $N \sim \mathrm{Poisson}(n_p)$ as the initial number of PEs in the beam, so that $E[N] = \mathrm{Var}[N] = n_p$. Each one of these PEs causes the emission of a random number $X_i$ of secondary electrons from the specimen, $X_i \sim\ ?(\delta_s,\ b_s\delta_s^2)$, where $\delta_s$ is the mean of the SE emission, $b_s$ is the relative variance ($b_s = \mathrm{Var}(X)/E^2(X)$), and the question mark indicates that $X_i$ is distributed according to some unknown distribution.

If the total number of SE emitted from the specimen is denoted as $V_1$, then, using equations (2) and (3), the mean of $V_1$ is given by

$$E[V_1] = \delta_s n_p \tag{8}$$

and the variance of $V_1$ is given by

$$\mathrm{Var}[V_1] = n_p b_s \delta_s^2 + n_p \delta_s^2 \tag{9}$$

The general form of the generating function of $V_1$ is given by equation (7).

In our case, $N$ follows a Poisson distribution with mean $n_p$ and generating function

$$\varphi_N(s) = \exp(-n_p + n_p s) \tag{10}$$

Therefore the generating function of $V_1$ is given by

$$\varphi_{V_1}(s) = \exp(-n_p + n_p\,\varphi_X(s)) \tag{11}$$

The distribution with this generating function is called the Compound Poisson Distribution.

If the distribution density function of the SE were known, a complete statistical model of the SE emission could be described using (11).
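The moment formulas (8) and (9) hold whatever the SE emission law is, which can be illustrated by simulation. In the Python sketch below, a geometric distribution is used purely as a placeholder for the unknown SE distribution; this choice, the parameter values and the helper names are assumptions made for the example.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sample_geometric(mean, rng):
    """Draw from a geometric law on {0, 1, 2, ...} with the given mean."""
    p = 1.0 / (1.0 + mean)             # success probability
    count = 0
    while rng.random() > p:            # count failures before the first success
        count += 1
    return count

def simulate_se_counts(n_p, delta_s, trials, seed=0):
    """Simulate V1: a Poisson number of PEs, each emitting a random number of SE."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        n = sample_poisson(n_p, rng)
        totals.append(sum(sample_geometric(delta_s, rng) for _ in range(n)))
    return totals

n_p, delta_s = 5.0, 2.0                # hypothetical values
b_s = (1.0 + delta_s) / delta_s        # relative variance of the geometric placeholder
v1 = simulate_se_counts(n_p, delta_s, trials=20000)

sample_mean = sum(v1) / len(v1)
sample_var = sum((v - sample_mean) ** 2 for v in v1) / (len(v1) - 1)
predicted_mean = delta_s * n_p                               # equation (8)
predicted_var = n_p * b_s * delta_s**2 + n_p * delta_s**2    # equation (9)
print(sample_mean, predicted_mean, sample_var, predicted_var)
```

Replacing `sample_geometric` with another sampler changes the shape of the distribution of the simulated counts, while the first two moments keep following (8) and (9) with the corresponding mean and relative variance.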

3.3 Back Scattered Electron emission

In the case of backscattered electrons, each PE from the beam causes the excitation of one or zero backscattered electrons, with a probability of success $p_b$ and a probability of failure $(1-p_b)$. This means that each backscattered electron is a Bernoulli variable with a mean value of $p_b$, variance $p_b(1-p_b)$ and a generating function

$$\varphi_{BSE}(s) = (1-p_b) + p_b s \tag{12}$$

Inserting the previous into equation (11) gives

$$\varphi_{V_2}(s) = \exp(-n_p p_b + n_p p_b s) \tag{13}$$

which is the generating function of the signal resulting from the BSE emission. The form of $\varphi_{V_2}$ is that of a Poisson distributed random variable, which is described entirely by its mean

$$E[V_2] = n_p p_b \tag{14}$$

This result is general for any cascade of a Poisson process with Bernoulli trials.
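This property can be checked directly from the conditional probability formula (4): summing a Poisson law against binomial success counts reproduces a Poisson law with the reduced mean. The Python sketch below does this for hypothetical values of the beam mean and the backscattering probability; the truncation point `n_max` is also an assumption of the example.

```python
import math

def poisson_pmf(k, mean):
    """P(K = k) for a Poisson law with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def binomial_pmf(k, n, p):
    """P(k successes in n Bernoulli trials with success probability p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def thinned_pmf(k, n_p, p_b, n_max=100):
    """P(V2 = k) via equation (4): sum over n of P(N = n) * P(k successes | n)."""
    return sum(poisson_pmf(n, n_p) * binomial_pmf(k, n, p_b)
               for n in range(k, n_max + 1))

n_p, p_b = 5.0, 0.3   # hypothetical values
max_gap = max(abs(thinned_pmf(k, n_p, p_b) - poisson_pmf(k, n_p * p_b))
              for k in range(15))
print(max_gap)        # difference is at floating-point level
```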

3.4 Detection Efficiency.

In this model we assume that each electron which is emitted from the specimen has a probability $p_d$ of being detected. Therefore the detection efficiency can be described by a Bernoulli model.

As in the case of the backscattered electrons, the generating function of this stage is linear and is given by

$$\varphi_d(s) = (1-p_d) + p_d s \tag{15}$$

with mean $p_d$ and variance $p_d(1-p_d)$.

It is reasonable to assume that the detection efficiency $p_d$ is different for SE and BSE. If we denote $p_{d1}$ as the detection efficiency for the SE and $p_{d2}$ as the detection efficiency for the BSE, then, using equations (2), (3), (7) and the results in equations (11), (13) and (15), the generating functions, means and variances of the signals resulting from SE emission and BSE emission can be expressed as follows. For the SE emission:

$$\varphi_{Z_1}(s) = \exp[-n_p + n_p\,\varphi_X(1 - p_{d1} + p_{d1} s)] \tag{16}$$

with $Z_1$ being the number of SE which enter the detector, with mean and variance

$$E[Z_1] = E[V_1]\,p_{d1} = n_p \delta_s p_{d1} \tag{17}$$

$$\mathrm{Var}[Z_1] = E[V_1]\,p_{d1}(1-p_{d1}) + \mathrm{Var}[V_1]\,p_{d1}^2 = n_p \delta_s p_{d1}\,(1 - p_{d1} + \delta_s p_{d1} + b_s \delta_s p_{d1}) \tag{18}$$

For the BSE emission:

$$\varphi_{Z_2}(s) = \exp(-n_p p_b p_{d2} + n_p p_b p_{d2} s) \tag{19}$$

with $Z_2$ being the number of BSE that enter the detector. $Z_2$ follows a Poisson distribution with a mean value

$$E[Z_2] = n_p p_b p_{d2} \tag{20}$$

The total number of electrons that enter the detectors can be written as

$$Z = Z_1 + Z_2 \tag{21}$$

Assuming that BSE emission and SE emission are statistically independent, the probability distribution function of $Z$, its generating function, its mean and its variance are given by

$$\begin{aligned}
f_Z(z) &= f_{Z_1}(z) * f_{Z_2}(z) \\
\varphi_Z(s) &= \varphi_{Z_1}(s)\,\varphi_{Z_2}(s) \\
E[Z] &= E[Z_1] + E[Z_2] \\
\mathrm{Var}[Z] &= \mathrm{Var}[Z_1] + \mathrm{Var}[Z_2]
\end{aligned} \tag{22}$$

3.5 The detection model.

There are two possible approaches which can be used in order to describe the detection model.

The first is through the detector’s gain probability distribution (also known as the pulse height distribution, or PHD). The PHD is the distribution of the output of the detector following excitation by a single electron. The probability distribution function of the detector’s gain can be measured and is usually given by the manufacturer. An example of the gain distribution function of a microchannel plate (MCP) detector is shown in figure 2.

Figure 2: Pulse height distribution of the Opal 7830Si MCP detectors (axis label: Pulse Height (Gain))

For a known detector’s gain probability distribution function, the signal at the output of the detector is also the result of a branching process. The input of this process is the number of electrons that enter the detector, and the output is the signal at the output of the detectors (normally current or voltage).

The second approach is to describe the detection model as a function of the physical processes which occur inside the detector. The emitted electrons enter the detector and cause excitation of some type of particles in the detector. These particles follow some known distribution, depending on the type of detector that is being used. The signal at the output of the detector is the sum of these particles. For example, in MCP detectors the entering electrons cause electron emission in the detectors, and in Everhart-Thornley detectors photons are excited when the electrons hit the detectors. The disadvantage of this approach is that a full and exact knowledge of the processes in the detector is needed, whereas in the first approach this knowledge is not necessary.

We will use the first approach to describe the detection model.

Let us denote by $f_D(d)$ and $\varphi_D(s)$ the distribution density function and generating function of the detectors, and by $\mu_D$ and $\sigma_D^2$ its mean and variance. Using all of the results of the model up to the detectors, and equations (2), (3) and (7), we derive the following expressions for the generating function, mean and variance of the signal $S$ at the output of the detectors:

$$\begin{aligned}
\varphi_S(s) &= \varphi_{Z_1}(\varphi_D(s))\,\varphi_{Z_2}(\varphi_D(s)) \\
E[S] &= E[Z]\,\mu_D = (E[Z_1] + E[Z_2])\,\mu_D \\
\mathrm{Var}[S] &= E[Z]\,\sigma_D^2 + \mathrm{Var}[Z]\,\mu_D^2
\end{aligned} \tag{23}$$
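The chain of moment formulas can be collected into a single routine. The Python sketch below evaluates the mean and variance of the output signal from the physical parameters by applying the random-sum rules (2) and (3) stage by stage; the function name, argument names and the parameter values in the example call are assumptions made for illustration.

```python
def signal_moments(n_p, delta_s, b_s, p_b, p_d1, p_d2, mu_d, var_d):
    """Mean and variance of the detector output S, following equations (8)-(23)."""
    # SE branch: emission, equations (8) and (9)
    mean_v1 = delta_s * n_p
    var_v1 = n_p * b_s * delta_s**2 + n_p * delta_s**2
    # SE branch: Bernoulli detection, equations (17) and (18)
    mean_z1 = mean_v1 * p_d1
    var_z1 = mean_v1 * p_d1 * (1.0 - p_d1) + var_v1 * p_d1**2
    # BSE branch is Poisson, equation (20), so its variance equals its mean
    mean_z2 = var_z2 = n_p * p_b * p_d2
    # Independent sum, equation (22), followed by the detector gain, equation (23)
    mean_z = mean_z1 + mean_z2
    var_z = var_z1 + var_z2
    return mean_z * mu_d, mean_z * var_d + var_z * mu_d**2

# Hypothetical parameter values: SE only, unit mean gain, gain variance 1/3.
mean_s, var_s = signal_moments(n_p=5.0, delta_s=1.0, b_s=1.0, p_b=0.0,
                               p_d1=0.33, p_d2=0.0, mu_d=1.0, var_d=1.0 / 3.0)
print(mean_s, var_s)
```

Setting `p_b` or `p_d2` to zero removes the BSE branch, which corresponds to the simplified case in which one of the two contributions is negligible.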

In systems where the SE emission and/or detection is much greater than that of the BSE, or vice versa, these expressions reduce to simpler forms, since either $Z_1$ or $Z_2$ is negligible compared to the other.

4 Simulation Examples

In order to demonstrate the model, we performed simulations of a system using the following parameters:
a. The mean number of electrons in the beam is 5 (i.e. $n_p = 5$).
b. Only SE participate in the process.
c. The detection efficiency is 33%.
d. The detector gain distribution is normal, with SNR ≈ 3 (5 dB), where $\mathrm{SNR} = \mu_D^2/\sigma_D^2$.

Each simulation included 10000 trials.
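A simulation of this kind can be sketched in a few lines of Python for the Poisson SE assumption. The mean SE emission is not listed among the parameters above, so the value of `delta_s` below is a hypothetical choice, and the mean detector gain is normalized to 1; both are assumptions of the example. For Poisson SE emission, equation (16) evaluated at zero gives the probability that no electron is detected in a pixel, which serves as a check on the simulation.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_signal(n_p, delta_s, p_d, mu_d, snr, trials, seed=1):
    """Simulate the chain beam -> SE emission -> detection -> detector gain."""
    rng = random.Random(seed)
    sigma_d = mu_d / math.sqrt(snr)            # from SNR = mu_D**2 / sigma_D**2
    signals, undetected = [], 0
    for _ in range(trials):
        n = sample_poisson(n_p, rng)                                    # primary electrons
        emitted = sum(sample_poisson(delta_s, rng) for _ in range(n))   # SE, Poisson assumption
        detected = sum(rng.random() < p_d for _ in range(emitted))      # Bernoulli detection
        if detected == 0:
            undetected += 1
        signals.append(sum(rng.gauss(mu_d, sigma_d) for _ in range(detected)))
    return signals, undetected / trials

n_p, delta_s, p_d = 5.0, 1.0, 0.33             # delta_s is a hypothetical choice
signals, p_none_sim = simulate_signal(n_p, delta_s, p_d, mu_d=1.0, snr=3.0, trials=10000)

# Equation (16) at s = 0 with the Poisson generating function for the SE emission.
p_none_theory = math.exp(-n_p + n_p * math.exp(-delta_s * p_d))
print(p_none_sim, p_none_theory)
```

The simulated and theoretical no-detection probabilities should agree to within sampling error; a histogram of `signals` gives the simulated signal probability density.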

We performed the simulations for two possible distributions of the SE emission. The first was the Poisson distribution and the second the Lognormal distribution.

Figure 3 shows the signal probability density of the simulation with the Poisson distribution assumption for the SE. The first bar shows the probability of no detection, i.e. the probability that no electron contributes to the output signal (around 0.25).

Figure 4 shows the signal probability density of the simulation with the Lognormal assumption for the SE. In the figure, the first bar shows the probability of no detection (about 0.35).

Figure 3: Signal Probability Density of results based on the SE Poisson assumption

Figure 4: Signal probability density of results based on the SE Log-Normal assumption

5 Approximation of the SE distribution

In the statistical model presented above, the probability distribution of the SE emission is unknown.

By using simulations based on the model and by measuring real signals, the distribution of the SE can be approximated by trying to statistically fit a simulated signal to a real signal. This test gives reliable results when a large number of observations are available.

In our simulations, we tried to determine the likelihood of two possible distributions for the SE emission: Poisson distribution and Lognormal distribution.

Using the model, we simulated SEM signals. The detectors simulated were MCP detectors with a known gain distribution function (see figure 2). The mean and variance of the simulated signal were taken to be the same as the mean and variance of the true signal. The real signal was obtained from a CD-SEM tool (Opal 7830Si). The specimen was a homogeneous flat metal surface (to avoid charging effects), which was scanned once at a fast TV rate. This procedure resulted in a large number of samples (more than 10000), which can be assumed to be independent identically distributed (i.i.d.) samples.

The statistical fit of the real SEM signal to the synthetic signal was assessed using the method known as the Quantile-Quantile (Q-Q) plot test. The Q-Q plot displays pairs of quantiles from the two populations (the real and the synthetic) that have the same associated cumulative probability. If the data of the synthetic signal arise from the same distribution as the real signal, then the quantiles will be linearly related, and therefore the entire plot will be linear.
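The construction of the quantile pairs can be sketched as follows. This Python example is an illustration under stated assumptions: the samples are synthetic stand-ins rather than SEM data, and linearity is summarized by the correlation coefficient of the Q-Q points, which is a simplification and not the deviation count used in this work.

```python
import random

def quantile(sorted_data, prob):
    """Empirical quantile by linear interpolation between order statistics."""
    pos = prob * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (pos - lo) * (sorted_data[hi] - sorted_data[lo])

def qq_pairs(sample_a, sample_b, n_points=99):
    """Pairs of quantiles of the two samples at the same cumulative probabilities."""
    a, b = sorted(sample_a), sorted(sample_b)
    probs = [(i + 1) / (n_points + 1) for i in range(n_points)]
    return [(quantile(a, p), quantile(b, p)) for p in probs]

def correlation(pairs):
    """Pearson correlation of the Q-Q points; values near 1 indicate a linear plot."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
reference = [rng.lognormvariate(0.0, 0.5) for _ in range(10000)]  # stand-in for a real signal
same_law = [rng.lognormvariate(0.0, 0.5) for _ in range(10000)]   # same distribution
other_law = [rng.expovariate(1.0) for _ in range(10000)]          # different distribution

r_same = correlation(qq_pairs(reference, same_law))
r_other = correlation(qq_pairs(reference, other_law))
print(r_same, r_other)
```

Plotting the pairs returned by `qq_pairs` gives the Q-Q plot itself; points that fall away from a straight line mark the quantiles where the two samples disagree.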

Figures 5a, 5b and 5c show the histogram of the real signal, the synthesized signal using Lognormal SE emission and the synthesized signal using Poisson SE emission, respectively.

Figures 6a and 6b show the results of the Quantile-Quantile plot test of the synthesized signal using Lognormal and Poisson SE emission, respectively. From figure 6a it is clear that the synthesized signal has a very good statistical fit to the real signal. The number of samples which deviate from the straight line is very small (10 out of 10000), i.e. 99.9% of the data show a good fit to the real signal. This suggests that the SE emission is well approximated by a Lognormal distribution.

In contrast to the result in figure 6a, the fit of the real signal to the synthesized signal using Poisson SE emission (Fig. 6b) is very poor. Approximately 18% of the quantiles of the synthesized signal deviate from the straight line. The two signals do not originate from the same distribution.

Figure 5a: Histogram of the true signal (axis label: Quantity)

Figure 5b: Histogram of the synthetic signal (Lognormal SE distribution) (axis label: Quantity)

Figure 5c: Histogram of the synthetic signal (Poisson SE distribution) (axis label: Quantity)

Figure 6a: A Quantile-Quantile plot with Lognormal SE emission

Figure 6b: A Quantile-Quantile plot with Poisson SE emission

6 Summary

In this work, we have introduced a new model for the SEM signal formation process.

We separated the process into four successive stages; each stage was discussed in detail and a model was introduced for it. The beam-specimen interaction was described as a cascade of the beam (a Poisson process), the BSE emission (a binomial process) and the SE emission (a process with an unknown distribution, defined by its mean, variance and generating function). This leads to analytic expressions for the mean, variance and generating function of this stage. The detection process was added in cascade to the previous process, including a binomial process which describes the detection efficiency. The result is a multilevel branching process.

As a result we presented a complete physical-mathematical model of the process, as a function of the physical parameters of the process, e.g. the beam current, the scattering ability of the specimen electrons (which depends on the specimen’s material and topography), and the parameters of the detector.

This model is presented as an analytic expression of the signal’s generating function including expressions for the signal’s mean and variance.

We presented a method for approximating the SE emission distribution; the Lognormal distribution is apparently a good approximation to it.

7 References

1. S. J. Erasmus, “Reduction of Noise In TV Rate Electron Microscope Images By Digital Filtering”, J. Mic., Vol. 127, Pt 1, July 1982, pp. 29-37.

2. P. Podsiadlo and G. W. Stachowiak, “Median-Sigma Filter for SEM Wear Particle Images”, J. Com. Ass. Mic., Vol. 7, No. 2, 1995, pp. 67-82.

3. L. Reimer, Scanning Electron Microscopy - Physics of Image Formation and Microanalysis, Springer-Verlag, Berlin, 1985, pp. 155-158.

4. T. E. Harris, “Branching Processes”, Annals of Mathematical Statistics, Vol. 19, 1948, pp. 474-494.

5. W. Feller, An Introduction to Probability Theory and Its Applications, John Wiley & Sons, Inc., New York, 1957, Vol. 1, p. 268.

8 Acknowledgments

This work was supported by Opal Technologies, a company from the group of Applied Materials.

The authors would like to thank Prof. Eliahu Gertsbach of the Mathematics Department, Ben-Gurion University, Israel, for his assistance and valuable advice.

Thanks also to Alex Goldstein, Alexander Libinson, Mannie Dorfan, Avner Karpol and Steven Rogers of Opal Technologies, for their help.
