Implementation of Naive Bayes Classifiers with Op Amps

Horatiu Moga, Gheorghe Pana
Electronics and Computers Department
Transilvania University, UTBv
Brasov, Romania
horatiu.moga@gmail.com, gheorghe.pana@unitbv.ro

Liliana Miron
Department of Electronics and IT
"Henri Coanda" Air Force Academy, AFAHC
Brasov, Romania
miron_liliana@yahoo.com
Abstract— The purpose of this study is the construction of probabilistic classifiers for signal detection. Applications of such classifiers can be found in various areas such as surveillance, biomedical equipment, machine-vision automation, and military technology. Naïve Bayes classifiers are used in machine learning, and this study sets out to determine the posterior probabilities for the present approach.
Keywords— probabilistic classifier, Bayes, signal detection, op amp

I. INTRODUCTION

We try to build a classifier which will predict whether a ball is red or blue based on its measured size alone. We have two groups of balls, red and blue, and a border between them. A white ball (which could be red or blue) can fall in either of the groups (Fig. 1); its affiliation is evaluated depending on its size.

Figure 1. Red balls and blue balls are separated by size

A. Class Priors
The class variable C will take on two values, so we can encode red balls by the value 1 and blue balls by the value 0 [1]. Within the general population there is an approximately equal number of red balls and blue balls. In that case the probability of the red-ball class occurring will be written simply as P(C=1) and the probability of the blue-ball class occurring as P(C=0). These probabilities are set prior to making any measurements and hence are called the prior probabilities of class membership.

B. Class Conditional Likelihood
There is a ball (plotted white), randomly selected from the population, and we measure its size. There will be a natural distribution of the size of red balls and of blue balls; in other words, there will be a class-conditional distribution of the measured feature, in this case size. We can name these class-conditional distributions p(s|C=1) and p(s|C=0) for the red-ball and blue-ball classes respectively.
C. Class Posterior
The Bayes rule helps us obtain the posterior probability of class membership by noting that

    P(s, C=1) = p(s|C=1) · P(C=1) = P(C=1|s) · p(s)        (1)

and

    P(C=1|s) = p(s|C=1) · P(C=1) / p(s)                    (2)

and the marginal likelihood of our measurement, p(s), is the probability of measuring a size s irrespective of the class, and therefore

    p(s) = p(s|C=1) · P(C=1) + p(s|C=0) · P(C=0)           (3)

which means that the class posteriors will also sum to one,

    P(C=0|s) + P(C=1|s) = 1                                (4)
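As a numerical illustration of (2)-(4), the short Python sketch below evaluates the posterior for two Gaussian class-conditional densities; all parameter values here are illustrative assumptions, not measurements from the paper.

    from math import exp, sqrt, pi

    def gauss(s, mu, sg):
        # Gaussian class-conditional density p(s|C)
        return exp(-0.5 * ((s - mu) / sg) ** 2) / (sg * sqrt(2 * pi))

    mu1, sg1, P1 = 3.0, 0.5, 0.5   # red balls:  p(s|C=1) and P(C=1), assumed
    mu0, sg0, P0 = 2.0, 0.5, 0.5   # blue balls: p(s|C=0) and P(C=0), assumed

    s = 2.6                                                    # measured size
    p_s = gauss(s, mu1, sg1) * P1 + gauss(s, mu0, sg0) * P0    # marginal, eq. (3)
    post1 = gauss(s, mu1, sg1) * P1 / p_s                      # posterior, eq. (2)
    print(post1, 1 - post1)                                    # the two sum to one, eq. (4)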
D. Discriminant Functions [2]
From Fig. 1 we can see the empirical distributions of size for both reds and blues. The first thing to notice is that there is a distinct difference in the location of the two distributions and that they can be separated to a large extent (supposing that reds are typically bigger than blues). However, there is a region where the two distributions overlap, and it is here that classification errors can be made. The point of intersection, where P(C=1|s) = P(C=0|s), is important as it defines our decision boundary. If we make a measurement of a size s above this boundary value s0, then we can see that P(C=1|s) > P(C=0|s), and whilst there is some probability that we have measured a rather big blue ball, to minimize the unavoidable errors that will be made our decision should be based on the largest posterior probability. We can then define a discriminant function based on our posterior probabilities. One such function could be the ratio of the posterior probabilities of the two classes. If we consider the logarithm of this ratio, then the general discriminant function

    f(s) = log [ P(C=1|s) / P(C=0|s) ]                     (5)

would define the rule that s is assigned to C=1 (red) if f(s) > 0, and if f(s) < 0 the assignment is to C=0 (blue).
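A minimal sketch of the log-ratio discriminant (5), under the same illustrative Gaussian assumptions as above; since the marginal p(s) cancels in the ratio, f(s) can be computed directly from the weighted likelihoods.

    from math import exp, sqrt, pi, log

    gauss = lambda s, mu, sg: exp(-0.5 * ((s - mu) / sg) ** 2) / (sg * sqrt(2 * pi))

    def f(s, mu1=3.0, sg1=0.5, P1=0.5, mu0=2.0, sg0=0.5, P0=0.5):
        # log posterior ratio, eq. (5); p(s) cancels out of the ratio
        return log(gauss(s, mu1, sg1) * P1) - log(gauss(s, mu0, sg0) * P0)

    for s in (2.0, 2.5, 3.0):
        print(s, "-> C=1 (red)" if f(s) > 0 else "-> C=0 (blue)")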
E. Discriminative and Generative Classifiers
There are two ways in which we can define our discriminant function [3]. In the first case we can explicitly model the discriminant function, using for example a linear or nonlinear model. This is often referred to as the discriminative approach to defining a classifier, as all effort is placed on defining the overall discriminant function, with no consideration of the class-conditional densities which form the discriminant. The second way is to focus on estimating the class-conditional densities (distributions, if the features are discrete) p(s|C=1) and p(s|C=0), and then to use these estimates to define our posterior probabilities and hence our discriminant function. As the class-conditional densities define the statistical process which generates the features, this approach is often referred to as the generative approach.
II. GENERAL IMPLEMENTATION
The general schematic of the Bayes classifier is presented in Fig. 2. The incoming signal is captured by parallel independent blocks, each of which calculates a class-conditional likelihood. Multiplier blocks then calculate the partial probabilities, and the "Maximum Selector" block selects the maximum probability and the corresponding class.

For a digital approach we could use a DSP or an FPGA, and the general schematic could be executed quickly. The main idea, however, stays in the analog domain. For the conditional-likelihood calculation we need a general reconfigurable analog block that can realize a nonlinear, variable transfer function. The decision for a specific transfer function is taken during the learning process: we have to capture a set of patterns for the distribution functions and decide which of them to use at a specific time.
Figure 2. General schematic of the Bayes classifier: the input signal s feeds k parallel branches computing p(s|C=1)·P(C=1) through p(s|C=k)·P(C=k), followed by a Maximum Selector
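A behavioral sketch of the schematic in Fig. 2, assuming Gaussian likelihood blocks: each branch forms the partial probability p(s|C=k)·P(C=k) and the maximum selector returns the index of the winning class. The branch parameters are placeholders (chosen to match the values used later in Section III).

    from math import exp, sqrt, pi

    gauss = lambda s, mu, sg: exp(-0.5 * ((s - mu) / sg) ** 2) / (sg * sqrt(2 * pi))

    # One (mu, sigma, prior) triple per likelihood block; placeholder values
    branches = [(2.5, 0.88, 0.7), (3.6, 0.91, 0.5)]

    def classify(s):
        # Multiplier blocks: partial probabilities p(s|C=k) * P(C=k)
        partial = [gauss(s, mu, sg) * P for (mu, sg, P) in branches]
        # "Maximum Selector" block: the largest partial probability wins
        return max(range(len(partial)), key=partial.__getitem__)

    print(classify(2.4), classify(3.5))   # small ball -> class 0, big ball -> class 1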
For an analog implementation we could use FPAA circuits, which are large-area, fully reconfigurable analog circuits with a wide range of applications in a small frequency spectrum, from telecommunications and automation to audio signal processing.
III. RESULTS
We consider the generative case with two classes, and a Gaussian distribution is used as detector support for both of them. A ball's size falls in the first class if its radius is smaller than 2.25 and in the second class if it is bigger than 2.75; between these values the two classes overlap. For the first class we calculated the mean µ1 = 2.5 and the standard deviation σ1 = 0.88. For the second class the mean is µ2 = 3.6 and the standard deviation is σ2 = 0.91. The schematic is presented in Fig. 3.
Generally we have to focus on the synthesis and analysis of nonlinear analog electronic circuits. For this we may use a few types of approximation (a piecewise-linear sketch follows the list):

• Smooth approximations

• Polynomials and power series

• Piecewise-linear function fitting [4]
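Piecewise-linear fitting is the option most directly mapped onto diode/op-amp function shapers [4]. A sketch, assuming a Gaussian target; the breakpoint count and the ±3σ range are arbitrary choices.

    from math import exp, sqrt, pi

    gauss = lambda x, mu, sg: exp(-0.5 * ((x - mu) / sg) ** 2) / (sg * sqrt(2 * pi))

    def pwl_gauss(s, mu=2.5, sg=0.88, n=9):
        # Tabulate the Gaussian at n evenly spaced breakpoints over mu +/- 3 sigma...
        xs = [mu - 3 * sg + i * 6 * sg / (n - 1) for i in range(n)]
        ys = [gauss(x, mu, sg) for x in xs]
        if s <= xs[0]:  return ys[0]
        if s >= xs[-1]: return ys[-1]
        # ...and interpolate linearly between the two neighbouring breakpoints
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
            if s <= x1:
                return y0 + (y1 - y0) * (s - x0) / (x1 - x0)

    print(pwl_gauss(2.7), gauss(2.7, 2.5, 0.88))   # approximation vs. exact value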
We suppose constant (uniform) prior weights for both classes: 0.7 for the first and 0.5 for the second.

The results (Fig. 4) show that the initial hypotheses are verified. The transition band, between (2.25, 2.5) and (2.5, 2.75), decides whether the white ball is red or blue, i.e., whether it belongs to one class or the other.
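With the parameters above (µ1 = 2.5, σ1 = 0.88, µ2 = 3.6, σ2 = 0.91, weights 0.7 and 0.5), a coarse numerical scan can locate where the two weighted likelihoods cross, which is where the classifier's decision flips; the printed value can then be compared against the transition band read from Fig. 4.

    from math import exp, sqrt, pi

    gauss = lambda s, mu, sg: exp(-0.5 * ((s - mu) / sg) ** 2) / (sg * sqrt(2 * pi))

    g1 = lambda s: 0.7 * gauss(s, 2.5, 0.88)   # weighted likelihood, class 1
    g2 = lambda s: 0.5 * gauss(s, 3.6, 0.91)   # weighted likelihood, class 2

    # Scan for the sign change of g1 - g2 over the region of interest
    s = 2.0
    while s < 4.5 and (g1(s) - g2(s)) * (g1(s + 0.001) - g2(s + 0.001)) > 0:
        s += 0.001
    print("decision flips near s =", round(s, 2))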
For the multiplication blocks we may use standard analog multiplier circuits, and the "Maximum Selector" block will use comparator integrated circuits.
Figure 3. The schematic of the circuit: a Spice model built around TL082 op amps, AD633 and AD734 analog multipliers, an LM311 comparator, and 2N2222 transistors, with parameter sources Vmiu1 = 2.5, Vsigma1 = -1.76, Vmiu2 = 3.6, Vsigma2 = -1.82, VP1 = 7 V, VP2 = 5 V, and ±15 V supplies

Figure 4. The result of the Spice simulation: V(out) versus V_Vz, swept from 0 V to 6 V
REFERENCES

[1] Christopher M. Bishop, "Pattern Recognition and Machine Learning", Springer Science+Business Media, LLC, 2006, p. 181.

[2] Andrew Webb, "Statistical Pattern Recognition", Second Edition, John Wiley & Sons Ltd., 2000, pp. 123-180.

[3] N. Sebe, Ira Cohen, Ashutosh Garg, Thomas S. Huang, "Machine Learning in Computer Vision", Springer Science+Business Media, LLC, 2005, p. 71.

[4] Daniel Sheingold, "Nonlinear Circuit Handbook", Analog Devices, Inc., 1976.