electronics
Article
FPGA Implementation for the Sigmoid with Piecewise Linear
Fitting Method Based on Curvature Analysis
Zerun Li, Yang Zhang *, Bingcai Sui *, Zuocheng Xing * and Qinglin Wang *
College of Computer, National University of Defense Technology, Changsha 410073, China;
lizerun16@nudt.edu.cn
* Correspondence: zhangyang@nudt.edu.cn (Y.Z.); bingcaisui@nudt.edu.cn (B.S.); zcxing@nudt.edu.cn (Z.X.);
wangqinglin@nudt.edu.cn (Q.W.)
Abstract: The sigmoid activation function is popular in neural networks, but its complexity limits the
hardware implementation and speed. In this paper, we use curvature values to divide the sigmoid
function into different segments and employ the least squares method to solve the expressions of
the piecewise linear fitting function in each segment. We then adopt an optimization method with
maximum absolute errors and average absolute errors to select an appropriate function expression
with a specified number of segments. Finally, we implement the sigmoid function on the field-programmable
gate array (FPGA) development platform and apply parallel operations of arithmetic
(multiplying and adding) and range selection at the same time. The FPGA implementation results
show that the clock frequency of our design is up to 208.3 MHz, while the end-to-end latency is just
9.6 ns. Our piecewise linear fitting method based on curvature analysis (PWLC) achieves recognition
accuracy on the MNIST dataset of 97.51% with a deep neural network (DNN) and 98.65% with a
convolutional neural network (CNN). Experimental results demonstrate that our FPGA design of
the sigmoid function can obtain the lowest latency, reduce absolute errors, and achieve high recognition
accuracies, while the hardware cost is acceptable in practical applications.
Keywords: sigmoid; neural networks; piecewise linear fitting; approximation methods; FPGA; high
speed; hardware acceleration
Citation: Li, Z.; Zhang, Y.; Sui, B.; Xing, Z.; Wang, Q. FPGA Implementation for the Sigmoid with
Piecewise Linear Fitting Method Based on Curvature Analysis. Electronics 2022, 11, 1365.
https://doi.org/10.3390/electronics11091365
Academic Editor: Alexander Barkalov
Received: 23 March 2022
Accepted: 21 April 2022
Published: 25 April 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
The term artificial neural network (ANN) refers to a series of mathematical models
inspired by biology and neuroscience. These models primarily simulate biological neural
networks by abstracting the neural network of the human brain, constructing artificial
neurons, and establishing connections among these artificial neurons according to a certain
topological structure. In the field of artificial intelligence, ANNs are usually referred to as
neural networks or neural models. The basic constitutive unit of a neural network is the
artificial neuron, which mainly simulates the structure and characteristics of biological
neurons, receives a group of input signals, and produces output. The output of a neuron is
usually realized by different activation functions, including sigmoid, ReLU, Softplus, etc. [1].
Among them, the sigmoid function is widely used in various ANN models due to its
simple expression and limited output range. However, the sigmoid function is a nonlinear
function with exponent and division operations that consumes a large amount of hardware
resources. It is therefore necessary to simplify and accelerate the sigmoid function when
deploying neural networks on hardware platforms [2,3]. These hardware platforms include
embedded devices, Internet of Things (IoT) applications, and field-programmable gate
array (FPGA) boards [4,5]. It is also necessary to strike a balance between performance and
functional flexibility when realizing a feasible neural network model design [6,7].
In order to solve the problem of simplifying and deploying sigmoid functions, various
fitting methods, such as look-up table, the coordinate rotation digital computer (CORDIC)
algorithm, Taylor series expansion, polynomial, the piecewise method, and the hybrid
method, have been proposed to implement the sigmoid function on hardware. The look-up
table [8] is a direct method of fitting a sigmoid function according to preset values. It
requires all input and output values to be saved in memory and reads outputs based on
inputs. While the accuracy of the outputs can be extremely high, it consumes too much
storage space to save high-precision values. The CORDIC method [9] converts the sigmoid
function into simple operations, such as addition and shifting through multiple iterations.
Although this method does not involve multiplication, it requires the use of multiple lookup
tables and additions; as a result, its hardware resource consumption is also too high for
many applications. The Taylor series expansion method [10] and polynomial method [11]
fit sigmoid functions with high-order expressions, which also consume a large amount of
hardware resources. Moreover, the referenced hybrid method [12] applies different kinds
of fitting methods together, which requires substantial storage space and many complex
operations.
Compared with the methods above, the piecewise fitting method has clear function
expressions, consumes few hardware resources, and achieves high fitting accuracy [13]. It
has thus become a mainstream sigmoid function fitting method. The basic principle of the
piecewise fitting method is to divide the sigmoid function into several regions in a specific
piecewise manner, then use a different expression for each region to replace the original
function, thereby achieving the purpose of fitting the original function [14]. Piecewise
fitting methods can be divided into piecewise linear fitting and piecewise nonlinear fitting
methods. The latter has higher fitting accuracy but consumes more hardware resources,
while the former can achieve the same fitting accuracy through the use of more segment
numbers without needing to employ high-order operations. Thus, the piecewise linear
fitting method is better suited to achieving high speed with few hardware resources
on FPGA.
Savich divides the sigmoid function into five segments in the range of [−8, 8] and uses
a linear fitting method with both adders and multipliers [15]. Armato uses the area conservation method to divide the sigmoid function into 16 non-uniform segments for linear
fitting [16]. Ngah proposes a fitting method that combines piecewise binomial function
and look-up table approaches. In more detail, this method uses a piecewise second-order
nonlinear function method and look-up table method to perform the fitting for the first
time, then adds and subtracts the output value to improve the fitting accuracy [12]. Campo
divides the sigmoid function into 12 segments and then uses the second-order Taylor expansion formula to fit every segment [10]. Gomar uses the approximate calculation method
for exponent to fit the sigmoid function [17]. Pandit applies Chebyshev’s polynomial
approximation for efficient hardware realization [18]. Mitra proposes a 16-segment linear
fitting algorithm based on adders, multipliers, and logic blocks [14]. Zamanlooy uses a
continuous valued number system to linearly fit the sigmoid and applies the continuous
modular compression operation to reduce the width of the numbers [19]. Nguyen divides
the sigmoid function into 12 segments and chooses the parameters of the linear function
based on the distribution probability of input values [20]. Pan combines the piecewise linear
(PWL) approximation, Taylor series approximation, and Newton–Raphson method-based
approximation methods together to implement the sigmoid function efficiently [21].
The piecewise fitting methods described above have advantages in terms of their
utilization of hardware resources; however, their recognition accuracies and processing
speed on hardware need some improvement [22,23]. This paper accordingly chooses
several abscissas of potential piecewise points based on the curvatures of the sigmoid
function. We solve the function expressions in every segment between piecewise points
using the least squares method. We then compare the absolute errors of different fitting
functions with potential piecewise points and choose a single function expression as the
hardware implementation scheme to achieve higher fitting accuracy and reduce hardware
resource consumption. Finally, we realize the piecewise linear fitting function on the
specified FPGA platform to simulate the inference stage of neural networks. The circuits
for different segments work in parallel to calculate the outputs of these segments, and the
multiplexer selects the result of one segment as the final output of the fitting function based
on the range of input value. The clock frequency of this circuit design and recognition
accuracies on the MNIST dataset show that our PieceWise linear fitting method based on
curvatures (PWLC) has lower time latency and achieves higher accuracies in the specified
neural networks.
The contributions of this paper lie in the following three aspects:
• This paper proposes a new method to select potential piecewise points based on the
curvature values of the sigmoid function. Piecewise points are dynamically selected
in the specified range according to curvature values.
• This paper develops an approximate comparison scheme for PWLC to determine the
proper expression of the piecewise linear function. The comparison is based on the
values of maximum absolute errors and average absolute errors.
• This paper presents a high-speed hardware design for PWLC. The circuit implemented
on the FPGA development platform can achieve the lowest end-to-end latency at
higher clock frequency with the use of additional hardware resources.
The remainder of this paper is organized as follows. Section 2 presents the principle
of solving expressions of the piecewise linear fitting function based on curvature analysis. Section 3 outlines the comparison scheme for the expressions of piecewise function
solutions. Section 4 describes the module design of the piecewise linear function. Section 5
presents the experimental results and draws some comparisons with other papers. Finally,
Section 6 concludes the paper.
2. Piecewise Linear Fitting Method Based on Curvature Values
The sigmoid function is continuous and monotonic over its domain of definition.
Unlike the piecewise-linear ReLU function, the derivative of the sigmoid function changes continuously over the domain of definition and exhibits nonlinear characteristics. The sigmoid function
is close to the saturation values (0 or 1) at both ends and has almost no change of value
there. Between the saturation areas, the function graph is clearly nonlinear and
can be fitted by several linear functions with different slopes. In the middle of
this range, the sigmoid function changes more drastically, which needs to be fitted with
more linear functions with different slopes.
The derivative can describe how fast a function changes. As can be seen from Figure 1,
when x = 0, the derivative of the sigmoid function reaches its maximum value. Although
the value of the sigmoid function changes drastically near 0, the shape is relatively straight
and can be approximated as a linear function. When x = 3, the derivative value of sigmoid
function is small, and it can be observed that the sigmoid function has a greater degree of
curvature near x = 3. In particular, the derivative values of the two functions $f_1(x) = x^2$
and $f_2(x) = x^4$ are the same at x = 0, but the curvature values of the two functions at
zero are obviously different. Thus, the derivatives cannot intuitively describe the deviation
between a curve function and a straight line, especially the degree of curvature at a single
point. Different from the derivative, the curvature value is defined as the rate of change of the tangent
direction angle at a point on the curve with respect to the arc length of the curve. This
indicator can describe the degree to which a curve deviates from a straight line. The larger the
curvature value, the more sharply the curve bends. The expression of
curvature is defined as follows.
$$\kappa = \lim_{\Delta \ell \to 0} \left| \frac{\Delta \theta}{\Delta \ell} \right| = \frac{|y''(x_0)|}{\left(1 + (y'(x_0))^2\right)^{3/2}} \qquad (1)$$
Figure 1. The original, derivative, and curvature graphs of the sigmoid function at the range
of [−8, 8].
In the equation above, θ is the tangent direction angle, while $\ell$ is the arc length
over which the tangent direction angle changes. The original expression of the sigmoid function is
given below.
$$y(x) = \frac{1}{1 + \exp(-x)} \qquad (2)$$
The expression of the first derivative for the sigmoid function is given below.
$$y'(x) = \frac{\exp(-x)}{(1 + \exp(-x))^2} \qquad (3)$$
The expression of the second derivative for the sigmoid function is given below.
$$y''(x) = \frac{\exp(-2x) - \exp(-x)}{(1 + \exp(-x))^3} \qquad (4)$$
Thus, the expression of the curvature value for the sigmoid function is as follows.
$$\kappa = \frac{|\exp(-2x) - \exp(-x)| \, (1 + \exp(-x))^3}{\left(1 + 4\exp(-x) + 7\exp(-2x) + 4\exp(-3x) + \exp(-4x)\right)^{3/2}} \qquad (5)$$
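As a quick sanity check on Equations (2)-(5), the following Python sketch (not part of the original paper) compares the analytic derivatives with central finite differences and confirms that the closed form in Equation (5) agrees with the general definition in Equation (1). The function names and the step size h are illustrative assumptions.

```python
import math

def y(x):
    """Sigmoid, Equation (2)."""
    return 1.0 / (1.0 + math.exp(-x))

def dy(x):
    """First derivative, Equation (3)."""
    u = math.exp(-x)
    return u / (1.0 + u) ** 2

def d2y(x):
    """Second derivative, Equation (4); exp(-2x) is written as u*u."""
    u = math.exp(-x)
    return (u * u - u) / (1.0 + u) ** 3

def curvature_general(x):
    """Curvature from Equation (1): |y''| / (1 + y'^2)^(3/2)."""
    return abs(d2y(x)) / (1.0 + dy(x) ** 2) ** 1.5

def curvature_closed(x):
    """Closed-form curvature of the sigmoid, Equation (5)."""
    u = math.exp(-x)
    poly = 1 + 4 * u + 7 * u**2 + 4 * u**3 + u**4
    return abs(u * u - u) * (1 + u) ** 3 / poly ** 1.5

def central_diff(f, x, h=1e-5):
    """Numerical derivative; h is an assumed step size."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
    # Equations (3) and (4) match numerical differentiation.
    assert abs(central_diff(y, x) - dy(x)) < 1e-8
    assert abs(central_diff(dy, x) - d2y(x)) < 1e-8
    # Equation (5) is Equation (1) specialised to the sigmoid.
    assert abs(curvature_general(x) - curvature_closed(x)) < 1e-12
```

The polynomial in the denominator of Equation (5) arises from expanding $(1 + \exp(-x))^4 + \exp(-2x)$, which is why the two curvature functions agree to rounding error.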
2.1. Selection of Piecewise Points Based on Curvature Analysis
As shown in Figure 1, the curvature graph is symmetric about the y-axis. When x = 0,
the curvature has the minimum value of 0. At this point, the curvature of the sigmoid
function is at its smallest and the shape is closest to a straight line. When x ∈ [−5, −0.5] ∪ [0.5, 5],
the curvature value tends to be large. When |x| is roughly between 1 and 1.5, the curvature reaches its
maximum value (close to 0.1; see Table 1). When x ∈ (−∞, −5] ∪ [5, +∞), the curvature value is close
to 0. In this range, the sigmoid function can be approximately regarded as a linear function.
From Figure 1, the derivative of the sigmoid function exhibits obvious changes in
the range of [−8, 8] and peaks when x = 0, at which point the derivative of the sigmoid
function achieves a maximum value of 0.25. If the range of [−8, 8] is subdivided into
numerous small segments, the derivative change in each of these small segments will tend
to be smooth. In these small areas, the shape of the sigmoid function becomes similar to
the shape of some linear functions. As a result, the nonlinear sigmoid function can be fitted
by numerous linear functions, which will decrease the sigmoid function fitting error. It
is practical that the sigmoid function can be fitted in the specified range according to the
abscissa of the given piecewise point, and that each segment range has an independent
piecewise function expression. In the saturation regions at both ends, the original function
can be approximately equal to 0 or 1 with few fitting errors.
Due to the complexity of the sigmoid curvature function, it is complicated to obtain
the maximum of this function. This paper applies systematic sampling for abscissas to
reflect the curvature values corresponding to different coordinates of segment intervals.
In the range of [−5, 5], the abscissa is equidistant and the scale value is 0.5. Systematic
sampling results in a set of samples according to proportional abscissas. This sampling
method, with its utilization of equal uniform spacing, can intuitively reflect the changes
in curvature value. In the range of [0, 5], there is only one peak value of curvature, and
there is also no periodic variability or monotonous change. The abscissas of the positive
piecewise points and the corresponding curvature values are listed in Table 1. A segment
interval with a larger curvature requires more piecewise points for linear fitting if we are to
reduce fitting errors and improve the numerical accuracy of the fitting function.
Table 1. Curvatures with different abscissas of the sigmoid function.
Abscissa    0      0.5    1      1.5    2      2.5
Curvature   0      0.053  0.086  0.092  0.079  0.059
Abscissa    3      3.5    4      4.5    5      5.5
Curvature   0.041  0.027  0.017  0.011  0.007  0.004
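The systematic sampling behind Table 1 can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: it evaluates the closed-form curvature of Equation (5) at the equidistant abscissas 0, 0.5, ..., 5.5 and reports the abscissa with the largest sampled curvature. The names `curvature`, `STEP`, and `samples` are assumptions.

```python
import math

def curvature(x):
    """Closed-form curvature of the sigmoid, Equation (5)."""
    u = math.exp(-x)
    poly = 1 + 4 * u + 7 * u**2 + 4 * u**3 + u**4
    return abs(u * u - u) * (1 + u) ** 3 / poly ** 1.5

STEP = 0.5                                  # scale value used in the text
abscissas = [STEP * i for i in range(12)]   # 0, 0.5, ..., 5.5
samples = {x: curvature(x) for x in abscissas}

for x, k in samples.items():
    print(f"{x:>4.1f}  {k:.3f}")

# Abscissa with the largest sampled curvature; piecewise points are
# placed more densely around it.
peak = max(samples, key=samples.get)
print("largest sampled curvature at x =", peak)
```

Because the curvature is an even function of x, sampling the positive half of the axis is sufficient; the negative piecewise points mirror the positive ones.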
2.2. Solution for Function Fitting Based on Sample Points and Selected Piecewise Points
We select n points on the sigmoid graph whose abscissas are $x_1, x_2, x_3, \ldots, x_n$,
arranged in ascending order. The corresponding ordinates are $y_1, y_2, y_3, \ldots, y_n$.
Due to the monotonicity of the sigmoid function, the
ordinates are in the same ascending order. To weight all data
pairs equally, we sample the data with uniform spacing. The data pairs are as
presented below.
$$
\begin{bmatrix}
x_1 & y_1 \\
x_2 & y_2 \\
x_3 & y_3 \\
\vdots & \vdots \\
x_n & y_n
\end{bmatrix}
=
\begin{bmatrix}
x_1 & \frac{1}{1 + \exp(-x_1)} \\
x_2 & \frac{1}{1 + \exp(-x_2)} \\
x_3 & \frac{1}{1 + \exp(-x_3)} \\
\vdots & \vdots \\
x_n & \frac{1}{1 + \exp(-x_n)}
\end{bmatrix}
\qquad (6)
$$
Here, n is the number of sample points of the sigmoid function. The discretized sample
points are evenly distributed on the coordinate axis, and the subscripts indicate the ascending
order of the values. This detailed systematic sampling guarantees sample coverage and accurately
describes the data distribution of the original function. In the internal ranges enclosed by
adjacent piecewise points, the sigmoid function can be fitted with linear functions. All m
piecewise points can be selected from the values listed in Table 1, and the expressions of
the piecewise linear fitting functions for the sigmoid function are as follows.
$$
p(x) =
\begin{cases}
0, & x \le b_1 \\
\eta_1 + k_1 (x - b_1), & b_1 < x \le b_2 \\
\eta_2 + k_2 (x - b_2), & b_2 < x \le b_3 \\
\quad \vdots & \quad \vdots \\
\eta_{m-1} + k_{m-1} (x - b_{m-1}), & b_{m-1} < x \le b_m \\
1, & x > b_m
\end{cases}
\qquad (7)
$$
where $b_j$ is the abscissa of the j-th piecewise point chosen in Table 1, $\eta_j$ is the ordinate of
the j-th piecewise point, $k_j$ is the slope of the j-th segment of the linear fitting function,
and the total number of piecewise points is m. This formula also assumes that the order
of the piecewise points is $b_1 < b_2 < \cdots < b_m$, which is similar to the order of abscissas
among sample points.
The expression in Equation (7) presents the general form of the piecewise function.
Notably, due to the continuity of the sigmoid function, we require the piecewise linear
function to be continuous at each piecewise point. Under this premise, the slope and
intercept of each linear region depend on the value of the former segment. The function
value of the former interval plus the increment can generate the function of the latter
interval. The piecewise function on the interval $b_{m-1} < x \le b_m$ can therefore be expressed as the sum of the previous
function expression $\beta_1 + \beta_2 (x - b_1) + \cdots + \beta_{m-1} (x - b_{m-2})$ and the numerical increment
of this interval, $\beta_m (x - b_{m-1})$. In other words, each subsequent interval builds on the former interval.
Accordingly, the expression of the piecewise function is presented below.
$$
p(x) =
\begin{cases}
0, & x \le b_1 \\
\beta_1 + \beta_2 (x - b_1), & b_1 < x \le b_2 \\
\beta_1 + \beta_2 (x - b_1) + \beta_3 (x - b_2), & b_2 < x \le b_3 \\
\quad \vdots & \quad \vdots \\
\beta_1 + \beta_2 (x - b_1) + \beta_3 (x - b_2) + \cdots + \beta_m (x - b_{m-1}), & b_{m-1} < x \le b_m \\
1, & x > b_m
\end{cases}
\qquad (8)
$$
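To make the link between the incremental form of Equation (8) and an ordinary slope-intercept form explicit, the sketch below unfolds the parameters β into one (slope, intercept) pair per segment. It is an illustration under stated assumptions: the helper name `beta_to_segments` and the toy values of β and the piecewise points are hypothetical, not taken from the paper.

```python
def beta_to_segments(beta, knots):
    """Unfold Equation (8) into per-segment slope-intercept form.

    beta  = [beta_1, ..., beta_m]
    knots = [b_1, ..., b_m] in ascending order
    Returns a list of (lower, upper, slope, intercept) so that
    p(x) = slope * x + intercept for lower < x <= upper.
    """
    segments = []
    slope = 0.0
    intercept = beta[0]                      # beta_1 is the value at b_1
    for j in range(1, len(beta)):
        # Entering segment (b_j, b_{j+1}] adds the term beta_{j+1} * (x - b_j).
        slope += beta[j]
        intercept -= beta[j] * knots[j - 1]
        segments.append((knots[j - 1], knots[j], slope, intercept))
    return segments

# Toy example with four piecewise points (values chosen for illustration only).
toy_knots = [-2.0, -1.0, 1.0, 2.0]
toy_beta = [0.1, 0.1, 0.2, -0.2]
for lower, upper, k, c in beta_to_segments(toy_beta, toy_knots):
    print(f"{lower:>5} < x <= {upper:<5}  p(x) = {k:.3f} * x + {c:.3f}")
```

The slope of each segment is the running sum of β_2, β_3, and so on, which is why continuity at every piecewise point holds automatically in this parameterization.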
By taking advantage of the continuity of the linear fitting function, the number of
unknown parameters can be decreased from 3 × (m − 1) to 2 × m − 1. This paper applies
a custom step function $\delta_{x_i > b_j}$ to express the piecewise functions in matrix form. The
step function $\delta_{x_i > b_j}$ takes the values 0 and 1 and is defined as the following
piecewise function.
$$
\delta_{x_i > b_j} =
\begin{cases}
0, & x_i \le b_j \\
1, & x_i > b_j
\end{cases}
\qquad (9)
$$
Here, $i \in \{1, 2, \ldots, n\}$, $j \in \{1, 2, \ldots, m\}$, and $b_j$ is the abscissa of the step point. By
using this step function, the relationship between data pairs can be described in matrix form
rather than the piecewise function expression form. We substitute the values of the n samples and
m piecewise points into the expression in Equation (8) and obtain a system of equations in
the unknowns $\beta_i$ ($i = 1, 2, \ldots, m$). The system for the unknown parameters based on the
specified data pairs is presented in Equation (10).
$$
\begin{bmatrix}
1 & x_1 - b_1 & (x_1 - b_2)\,\delta_{x_1 > b_2} & \cdots & (x_1 - b_{m-1})\,\delta_{x_1 > b_{m-1}} \\
1 & x_2 - b_1 & (x_2 - b_2)\,\delta_{x_2 > b_2} & \cdots & (x_2 - b_{m-1})\,\delta_{x_2 > b_{m-1}} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_n - b_1 & (x_n - b_2)\,\delta_{x_n > b_2} & \cdots & (x_n - b_{m-1})\,\delta_{x_n > b_{m-1}}
\end{bmatrix}
\begin{bmatrix}
\beta_1 \\ \beta_2 \\ \vdots \\ \beta_m
\end{bmatrix}
=
\begin{bmatrix}
y_1 \\ y_2 \\ \vdots \\ y_n
\end{bmatrix}
\qquad (10)
$$
In common cases, the number of sample points n exceeds the number of piecewise points
m, that is, n > m. It is worth noting that the solution for $\beta_i$ ($i = 1, 2, \ldots, m$) will reduce the
fitting errors between the sample points and the corresponding sigmoid function values.
To abbreviate the expression of the matrix form, this paper simplifies the matrix operation
in the expression below.
$$A\beta = Y \qquad (11)$$
where A is the regression matrix of size n × m, β is the vector consisting of m parameters,
and Y is the vector comprising n function values. We use the least squares method to
reduce the sum of squares for residuals in order to solve the unknown vector β. The
solution of β can be expressed using β∗ as follows.
$$\beta^{*} = \left(A^{T} A\right)^{-1} A^{T} Y \qquad (12)$$
After solving β, the expressions of the piecewise function become clear. To accelerate
the processing speed and retain recognition accuracy, the different fitting schemes designed
to replace the original sigmoid function need to be compared before hardware implementation occurs. This paper applies the error vector E to describe the differences between
fitted values in the piecewise linear function and actual values of the original function in
the given data pairs. The error vector E can be expressed as follows.
$$E = A\beta^{*} - Y = [e_1, e_2, \ldots, e_n]^{T} \qquad (13)$$
E is the error vector with n elements, where $e_i = p(x_i) - y_i$, $i = 1, 2, \ldots, n$.
The maximum absolute error $e_{max} = \max(|e_1|, |e_2|, \ldots, |e_n|)$ is the largest
deviation between the fitting function and the original function over the data pairs,
while the average absolute error $e_{avg} = \frac{1}{n} \sum_{i=1}^{n} |e_i|$ is the mean
deviation in the fitting model. Here, |·| denotes the absolute value of each element in vector E.
These two indexes measure the errors of the fitting models.
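Equations (10)-(13) can be sketched end to end in plain Python. The code below is a minimal illustration rather than the authors' implementation: it builds the regression matrix A row by row, solves the normal equations of Equation (12) with a small hand-written Gaussian elimination (an assumption made to keep the example dependency-free; a linear algebra library would normally be used), and reports $e_{max}$ and $e_{avg}$. The ten piecewise points are the ones the paper lists for the 10-point case, and the 0.01 sample spacing follows the text; all function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def design_row(x, knots):
    """One row of A in Equation (10): [1, x-b_1, (x-b_2)*delta, ..., (x-b_{m-1})*delta]."""
    row = [1.0, x - knots[0]]
    for b in knots[1:-1]:
        row.append(x - b if x > b else 0.0)   # step function of Equation (9)
    return row

def solve(M, v):
    """Solve the square system M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [M[i][:] + [v[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = s / aug[i][i]
    return z

def fit_pwl(knots, step=0.01):
    """Least squares fit of Equation (8); returns (beta, e_max, e_avg)."""
    n = round((knots[-1] - knots[0]) / step)
    xs = [knots[0] + step * (i + 1) for i in range(n)]   # samples in (b_1, b_m]
    ys = [sigmoid(x) for x in xs]
    A = [design_row(x, knots) for x in xs]
    m = len(knots)
    AtA = [[sum(r[i] * r[j] for r in A) for j in range(m)] for i in range(m)]
    AtY = [sum(r[i] * y for r, y in zip(A, ys)) for i in range(m)]
    beta = solve(AtA, AtY)                               # Equation (12)
    errs = [sum(a * b for a, b in zip(r, beta)) - y      # Equation (13)
            for r, y in zip(A, ys)]
    e_max = max(abs(e) for e in errs)
    e_avg = sum(abs(e) for e in errs) / len(errs)
    return beta, e_max, e_avg

KNOTS_10 = [-8, -4.5, -3, -2, -1, 1, 2, 3, 4.5, 8]
beta, e_max, e_avg = fit_pwl(KNOTS_10)
print(f"e_max = {e_max:.5f}, e_avg = {e_avg:.5f}")
```

Forming $A^{T}A$ explicitly mirrors Equation (12) directly; for larger or poorly conditioned problems, a QR- or SVD-based least squares routine is usually the more robust choice.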
3. Realization Scheme of PWLC for the Sigmoid Function
Having demonstrated the principle of solving expressions of the linear fitting function
with sample data points, we next present the detailed process used to solve the function
expression with the PWLC method. To determine the abscissas of the piecewise points with
the specified sample points, we need to analyze the characteristics of the sigmoid function.
The sigmoid function is centrally symmetric about the point with the coordinate of
(0, 0.5). There are obvious saturation areas for the sigmoid function at both ends of the
x-coordinate axis. When x = 8, the sigmoid function’s value is equal to 0.9997, which is
very close to its largest value of 1. When x > 8, the value of the fitting function can be set to
1. Correspondingly, when x ≤ −8, the value of the fitting function can be set to 0. From the
curvature graph of the sigmoid function, it can be seen that when x = 0, the curvature is
also 0. Around x = 0, the sigmoid function is almost straight, and x = 0 is in the range of
an approximately straight line. In addition, the ideal linear fitting function is also centrally
symmetric about the point with the coordinate of (0, 0.5). If x = 0 is set as a piecewise
point, the slope and intercept of the front and back two segments remain equal, and the
expressions of the two piecewise functions are the same; thus, it is unnecessary to set x = 0
as a piecewise point.
Due to the central symmetry of the linear fitting function, the coordinates of the
piecewise points on the positive half of the x-axis and the coordinates of the piecewise
points on the negative half of the x-axis generally appear in pairs. Therefore, the number
of piecewise points of the linear fitting function is set to an even number. The number of
piecewise points here is set to 4, 6, 8, 10, 12, and 14, and the total number of segment intervals
including the two ranges of x ≤ −8 and x > 8 is 5, 7, 9, 11, 13, and 15, respectively. The selection of
the abscissa of the piecewise point needs to be symmetric about x = 0 to satisfy the central
symmetry condition of the fitting function. Under the premise of the central symmetry, we
can define the abscissa independently. It should be noted that the piecewise points need to
be set relatively densely in the segment interval with larger curvature values so that more
linear functions can be used to fit the sigmoid function with a greater degree of curvature.
The numbers of abscissas are even, and the selection of abscissas is based on the analysis of
curvatures. This paper selects several representative abscissas in the range of [−8, 8]. The
abscissas of the piecewise points and the corresponding numbers of abscissas are shown in
Table 2, arranged from small numbers to large numbers.
Table 2. Different abscissas with various numbers of piecewise points.
Numbers   Abscissas of Piecewise Points for Different Functions
4         −8, −2, 2, 8
6         −8, −3, −1, 1, 3, 8
8         −8, −4, −2, −1, 1, 2, 4, 8
10        −8, −4.5, −3, −2, −1, 1, 2, 3, 4.5, 8
12        −8, −4.5, −3, −2.5, −2, −1, 1, 2, 2.5, 3, 4.5, 8
14        −8, −4.5, −3, −2.5, −2, −1.5, −1, 1, 1.5, 2, 2.5, 3, 4.5, 8
According to the abscissas selected in Table 2 and the method proposed to solve
the piecewise linear fitting function, we can solve the expressions of the functions with
different numbers of piecewise points based on the analysis of curvature values discussed
above. The piecewise function expressions give the slopes and intercepts as
decimals, valid to five decimal places. In fact, the slopes and intercepts of each piecewise
linear function interval have further decimal digits that are redundant. Due
to the limited storage space available for fixed-point numbers on the FPGA platform, it is
sufficient to present the leading significant digits. These decimal numbers can be converted
to binary numbers with a limited number of fractional bits in the storage space of the FPGA platform.
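The conversion of a decimal coefficient to a fixed-point binary value can be illustrated as follows. This is only a sketch: the number of fractional bits (`FRAC_BITS = 12`) is a hypothetical choice for the example and is not the word format used in the paper, and the helper names are assumptions.

```python
FRAC_BITS = 12   # hypothetical number of fractional bits, for illustration only

def to_fixed(value, frac_bits=FRAC_BITS):
    """Return the integer code of value rounded to a multiple of 2**-frac_bits."""
    return round(value * (1 << frac_bits))

def from_fixed(code, frac_bits=FRAC_BITS):
    """Convert an integer fixed-point code back to a real number."""
    return code / (1 << frac_bits)

# A few slopes and intercepts of the kind produced by the fitting step.
for coeff in (0.2389, 0.14841, 0.00252, 0.98125):
    code = to_fixed(coeff)
    approx = from_fixed(code)
    print(f"{coeff:<8} -> code {code:>5} ({code:b}) -> {approx:.6f}, "
          f"error {abs(approx - coeff):.2e}")
```

With rounding to nearest, the quantization error of each coefficient is at most half of the least significant bit, that is, $2^{-(\text{FRAC\_BITS}+1)}$, so the number of fractional bits directly bounds the extra error added on top of the fitting error.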
The expression of a piecewise linear fitting function with four piecewise points $p_4(x)$
is as follows.
$$
p_4(x) =
\begin{cases}
0, & x \le -8 \\
0.01511 \cdot x + 0.09783, & -8 < x \le -2 \\
0.21619 \cdot x + 0.5, & -2 < x \le 2 \\
0.01511 \cdot x + 0.90217, & 2 < x \le 8 \\
1, & x > 8
\end{cases}
\qquad (14)
$$
The expression of a piecewise linear fitting function with six piecewise points $p_6(x)$ is
as follows.
$$
p_6(x) =
\begin{cases}
0, & x \le -8 \\
0.00634 \cdot x + 0.04392, & -8 < x \le -3 \\
0.11127 \cdot x + 0.35872, & -3 < x \le -1 \\
0.25255 \cdot x + 0.5, & -1 < x \le 1 \\
0.11127 \cdot x + 0.64128, & 1 < x \le 3 \\
0.00634 \cdot x + 0.95608, & 3 < x \le 8 \\
1, & x > 8
\end{cases}
\qquad (15)
$$
The expression of a piecewise linear fitting function with eight piecewise points $p_8(x)$
is as follows.
$$
p_8(x) =
\begin{cases}
0, & x \le -8 \\
0.00261 \cdot x + 0.01947, & -8 < x \le -4 \\
0.04767 \cdot x + 0.19971, & -4 < x \le -2 \\
0.15881 \cdot x + 0.42199, & -2 < x \le -1 \\
0.23682 \cdot x + 0.5, & -1 < x \le 1 \\
0.15881 \cdot x + 0.57801, & 1 < x \le 2 \\
0.04767 \cdot x + 0.80029, & 2 < x \le 4 \\
0.00261 \cdot x + 0.98053, & 4 < x \le 8 \\
1, & x > 8
\end{cases}
\qquad (16)
$$
The expression of a piecewise linear fitting function with 10 piecewise points $p_{10}(x)$ is
as follows.
$$
p_{10}(x) =
\begin{cases}
0, & x \le -8 \\
0.00252 \cdot x + 0.01875, & -8 < x \le -4.5 \\
0.02367 \cdot x + 0.11397, & -4.5 < x \le -3 \\
0.06975 \cdot x + 0.25219, & -3 < x \le -2 \\
0.14841 \cdot x + 0.40951, & -2 < x \le -1 \\
0.2389 \cdot x + 0.5, & -1 < x \le 1 \\
0.14841 \cdot x + 0.59049, & 1 < x \le 2 \\
0.06975 \cdot x + 0.74781, & 2 < x \le 3 \\
0.02367 \cdot x + 0.88603, & 3 < x \le 4.5 \\
0.00252 \cdot x + 0.98125, & 4.5 < x \le 8 \\
1, & x > 8
\end{cases}
\qquad (17)
$$
The expression of a piecewise linear fitting function with 12 piecewise points $p_{12}(x)$ is
as follows.
$$
p_{12}(x) =
\begin{cases}
0, & x \le -8 \\
0.00248 \cdot x + 0.0185, & -8 < x \le -4.5 \\
0.02405 \cdot x + 0.11556, & -4.5 < x \le -3 \\
0.06608 \cdot x + 0.24165, & -3 < x \le -2.5 \\
0.07375 \cdot x + 0.26084, & -2.5 < x \le -2 \\
0.14761 \cdot x + 0.40855, & -2 < x \le -1 \\
0.23906 \cdot x + 0.5, & -1 < x \le 1 \\
0.14761 \cdot x + 0.59145, & 1 < x \le 2 \\
0.07375 \cdot x + 0.73916, & 2 < x \le 2.5 \\
0.06608 \cdot x + 0.75835, & 2.5 < x \le 3 \\
0.02405 \cdot x + 0.88444, & 3 < x \le 4.5 \\
0.00248 \cdot x + 0.9815, & 4.5 < x \le 8 \\
1, & x > 8
\end{cases}
\qquad (18)
$$
The expression of a piecewise linear fitting function with 14 piecewise points $p_{14}(x)$ is
as follows.
$$
p_{14}(x) =
\begin{cases}
0, & x \le -8 \\
0.00247 \cdot x + 0.01843, & -8 < x \le -4.5 \\
0.02415 \cdot x + 0.11599, & -4.5 < x \le -3 \\
0.0639 \cdot x + 0.23525, & -3 < x \le -2.5 \\
0.0831 \cdot x + 0.28325, & -2.5 < x \le -2 \\
0.12891 \cdot x + 0.37487, & -2 < x \le -1.5 \\
0.16351 \cdot x + 0.42677, & -1.5 < x \le -1 \\
0.23674 \cdot x + 0.5, & -1 < x \le 1 \\
0.16351 \cdot x + 0.57323, & 1 < x \le 1.5 \\
0.12891 \cdot x + 0.62513, & 1.5 < x \le 2 \\
0.0831 \cdot x + 0.71675, & 2 < x \le 2.5 \\
0.0639 \cdot x + 0.76475, & 2.5 < x \le 3 \\
0.02415 \cdot x + 0.88401, & 3 < x \le 4.5 \\
0.00247 \cdot x + 0.98157, & 4.5 < x \le 8 \\
1, & x > 8
\end{cases}
\qquad (19)
$$
Based on the fitting function expressions above (the interval length of sample points
is set to 0.01), we obtain the maximum absolute errors $e_{max}$ and the average absolute
errors $e_{avg}$ in Figure 2. As can be seen from the figure, the absolute errors decrease as
the number of piecewise points increases. In more detail, when the number of
piecewise points increases to 10, the maximum error converges to about 0.007, while the
average error converges to about 0.001. The point at which the
number of piecewise points is 10 is therefore the elbow point in Figure 2. When the number of
piecewise points is less than 10, the absolute error of the piecewise function is relatively
large, and the fit to the original sigmoid function is not ideal. When the number
of piecewise points is greater than 10, the absolute errors stay close to these values without
significant change. Notably, a higher
number of piecewise points increases the complexity of the piecewise function, elevating
the power consumption and consuming more hardware resources with little
improvement in absolute fitting errors. Accordingly, considering the absolute error and
the complexity of the function, this paper uses 10 as the number of piecewise points.
[Figure 2 plot: Maximum Absolute Error and Average Absolute Error (vertical axis: Absolute errors, 0 to 0.06) versus Number of Piecewise Points (4, 6, 8, 10, 12, 14).]
Figure 2. Absolute errors between the piecewise linear fitting function and original function with
different numbers of piecewise points based on analysis of curvature values.
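The error figures for the 10-point case can be checked directly against the published coefficients. The sketch below copies the slopes and intercepts of Equation (17) into a table, evaluates $p_{10}(x)$ on a grid with spacing 0.01 over (−8, 8], and computes the maximum and average absolute errors against the exact sigmoid. The table layout and function names are assumptions made for illustration.

```python
import math

# (upper bound, slope, intercept) for each segment of p10(x), from Equation (17).
P10 = [
    (-8.0, 0.0,     0.0),
    (-4.5, 0.00252, 0.01875),
    (-3.0, 0.02367, 0.11397),
    (-2.0, 0.06975, 0.25219),
    (-1.0, 0.14841, 0.40951),
    ( 1.0, 0.2389,  0.5),
    ( 2.0, 0.14841, 0.59049),
    ( 3.0, 0.06975, 0.74781),
    ( 4.5, 0.02367, 0.88603),
    ( 8.0, 0.00252, 0.98125),
]

def p10(x):
    """Evaluate the 10-point piecewise linear fit by range selection."""
    for upper, k, c in P10:
        if x <= upper:
            return k * x + c
    return 1.0                                # saturation region x > 8

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def error_metrics(step=0.01, lo=-8.0, hi=8.0):
    """Maximum and average absolute error on a uniform grid over (lo, hi]."""
    n = round((hi - lo) / step)
    errs = [abs(p10(lo + step * (i + 1)) - sigmoid(lo + step * (i + 1)))
            for i in range(n)]
    return max(errs), sum(errs) / n

e_max, e_avg = error_metrics()
print(f"e_max = {e_max:.5f}, e_avg = {e_avg:.5f}")
```

In this evaluation the largest deviation occurs at the piecewise points x = ±1, where the fit gives 0.7389 against a sigmoid value of about 0.7311. The intercepts of mirrored segments sum to 1, so the fit is centrally symmetric about (0, 0.5).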
Apart from the selection of piecewise points, the interval length of
sample points may also affect the maximum absolute errors and average absolute errors.
This paper sets the interval lengths of sample points to 0.5, 0.1, 0.05, 0.01, and 0.005. The
corresponding numbers of sample points n in the range of [−8, 8] are 32, 160, 320, 1600, and
3200, respectively. The maximum absolute errors and average absolute errors between the
fitting function and the original sigmoid function at different interval lengths are presented
in Table 3.
Table 3. Absolute errors with different sample point interval lengths.
Length   0.5      0.1      0.05     0.01     0.005
e_max    0.00653  0.00773  0.00781  0.00784  0.00784
e_avg    0.00251  0.00166  0.00164  0.00163  0.00163
It can be seen that differences in sample point interval lengths have a significant impact
on the absolute errors of the fitting function. As the sample point interval length is reduced,
the selected sample points become more dense, the maximum error emax tends to be larger,
and the average error eavg becomes smaller. When the sample point interval length is 0.01,
these two absolute errors converge to fixed values at the same time. It is worth noting
that the maximum error does not accurately reflect all of the fitting function’s deviation
from the original function, even when the sample point interval length is small. Due to the
sparsity of the sample points, the maximum error cannot include other non-sample points.
The average error can thus describe the relative deviation between the fitting function and
the original function more generally.
When the sample point interval length is sufficiently small and the selection of sample
points is sufficiently dense, both the maximum error and the average error can reflect the
degree to which the linear fitting function deviates from the sigmoid function. When the
sampling interval length is 0.01 or less in Table 3, the values of the maximum error and
the average error remain unchanged. More sampling points cannot improve the fitting
effect; instead, they make the unknown parameter matrix β larger, increase the
amount of calculation required to solve for the unknown parameters, raise the complexity of
the fitting function expression, and cost a large amount of additional computational time.
Therefore, it is both appropriate and reliable to set the sampling interval to 0.01.
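The effect of the sampling interval can be explored with a short Python sketch. It fits a least-squares line to the sampled sigmoid in each range and reports the maximum and average absolute errors over the sample points. The breakpoints in `BREAKS` are hypothetical placeholders rather than the curvature-based piecewise points of this paper, and the function names are illustrative, so the printed values are not expected to reproduce Table 3.

```python
import bisect
import math

# Hypothetical breakpoints, symmetric about 0; NOT the paper's piecewise points.
BREAKS = [-6.0, -4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0, 6.0]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least-squares line y = k*x + t (needs at least two distinct xs)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    k = sxy / sxx
    return k, my - k * mx


def fit_and_measure(step, lo=-8.0, hi=8.0):
    """Fit each range on samples spaced by `step`; return (e_max, e_avg)."""
    n = round((hi - lo) / step)
    xs = [lo + i * step for i in range(n)]
    # Range index = number of breakpoints strictly below x.
    groups = {}
    for x in xs:
        groups.setdefault(bisect.bisect_left(BREAKS, x), []).append(x)
    lines = {}
    for idx, pts in groups.items():
        if idx == 0:
            lines[idx] = (0.0, 0.0)      # assumed: left end range outputs 0
        elif idx == len(BREAKS):
            lines[idx] = (0.0, 1.0)      # assumed: right end range outputs 1
        else:
            lines[idx] = fit_line(pts, [sigmoid(p) for p in pts])
    errs = []
    for x in xs:
        k, t = lines[bisect.bisect_left(BREAKS, x)]
        errs.append(abs(k * x + t - sigmoid(x)))
    return max(errs), sum(errs) / len(errs)


if __name__ == "__main__":
    for step in (0.5, 0.1, 0.05, 0.01, 0.005):
        e_max, e_avg = fit_and_measure(step)
        print(f"step={step:<6} e_max={e_max:.5f} e_avg={e_avg:.5f}")
```

Running the loop for the five interval lengths used in Table 3 shows how the two error measures change with sample density for this particular (assumed) set of breakpoints.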
4. Hardware Design for the Circuit of PWLC Method
We implement our PWLC method on the Xilinx FPGA (XC7V2000) with the Vivado
design suite [24]. In the specified range of [−8, 8], all the numbers of the sigmoid function,
including input values, slopes, intercepts, and output values, are within the range of [−8, 8].
The circuit uses 16-bit fixed-point numbers to store all values. The fixed-point number
includes a 1-bit sign part, a 3-bit integer part, and a 12-bit fractional part. If the input value
is outside the range of [−8, 8], it is stored as the corresponding saturation value of the 16-bit fixed-point
format. The function output for an input of −8 is taken as 0, while the output for an input of 8 is taken as 1.
All input values have corresponding output values expressed using this 16-bit fixed-point
method. The original decimal numbers are converted to 16-bit fixed-point numbers with
minimal accuracy loss.
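The 16-bit format described above can be illustrated with a minimal Python sketch. It assumes a two's-complement interpretation of the sign bit and round-to-nearest quantization, neither of which is stated in the text, and the names `to_fixed` and `to_float` are hypothetical helpers.

```python
FRAC_BITS = 12                 # 12-bit fractional part
SCALE = 1 << FRAC_BITS         # 4096 steps per unit
RAW_MIN = -(1 << 15)           # raw code for -8.0 (assumes two's complement)
RAW_MAX = (1 << 15) - 1        # raw code for 8 - 2**-12, the positive saturation value


def to_fixed(value: float) -> int:
    """Quantize a real number to a signed 16-bit raw code, saturating at the limits."""
    raw = round(value * SCALE)             # round-to-nearest is an assumption
    return max(RAW_MIN, min(RAW_MAX, raw))


def to_float(raw: int) -> float:
    """Convert a raw 16-bit code back to its real value."""
    return raw / SCALE


if __name__ == "__main__":
    for v in (0.5, -8.0, 8.0, 12.3, 0.3):
        raw = to_fixed(v)
        print(f"{v:>6} -> raw {raw:>6} -> {to_float(raw):.6f}")
```

With 12 fractional bits the quantization step is 2^-12 (about 0.00024), which is well below the fitting errors reported in Table 3, consistent with the statement that the conversion loses little accuracy.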
It is time consuming to calculate different segments of the optimized function in series
and then output the result for the specified segment. This paper accordingly designs a
structure in which the candidate outputs for all 11 input ranges are prepared in parallel, after
which one result is chosen as the output according to the range of the input value. The hardware
computes the arithmetic results of the input value for the nine linear segments and selects one
result based on comparisons between the input value and all piecewise points. The arithmetic
(multiplying and adding) and range selection operations are performed in parallel to decrease
the end-to-end latency. The input value is compared with the 10 piecewise points to determine
the range in which it lies. As some slopes of the piecewise function are equal, we reuse some
multipliers and connect two different adders after each such multiplier. This design reduces the
number of multipliers required by four and therefore saves hardware resources. The hardware
structure of PWLC is shown in Figure 3.
The comparators are combinational: they compare the input value with all piecewise points
without waiting for a clock trigger. When the input value is larger than a piecewise point, the
comparator outputs 1; otherwise, it outputs 0. The expression for the comparator output
is as follows.
$$c_j = \delta_{x > b_j} = \begin{cases} 0, & x \le b_j \\ 1, & x > b_j \end{cases} \tag{20}$$
The relationship among the 10 outputs of the different comparators and the data ports
of the multiplexer is summarized in Table 4. The multiplexer selects the function value that
corresponds to the range of the input value, so its output follows the comparison results.
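[Figure 3 diagram: the 16-bit input (input[15:0]) feeds the multipliers and adders, which take slopes k and intercepts t from a buffer for slopes and intercepts, and the comparators (Com), which take the piecewise points b1 to b10 from a buffer for piecewise points. The comparator outputs c[9:0] drive the selection of a multiplexer (MUX) with data inputs D0 to D10, including the constants 0 and 1, and the multiplexer produces output[15:0].]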
Figure 3. The overall hardware design of a piecewise linear fitting function with 10 piecewise points.
Table 4. Selection method for the 11 input data of the multiplexer.
c[0]  c[1]  c[2]  c[3]  c[4]  c[5]  c[6]  c[7]  c[8]  c[9]  MUX
0     0     0     0     0     0     0     0     0     0     D0
1     0     0     0     0     0     0     0     0     0     D1
1     1     0     0     0     0     0     0     0     0     D3
1     1     1     0     0     0     0     0     0     0     D5
1     1     1     1     0     0     0     0     0     0     D7
1     1     1     1     1     0     0     0     0     0     D9
1     1     1     1     1     1     0     0     0     0     D8
1     1     1     1     1     1     1     0     0     0     D6
1     1     1     1     1     1     1     1     0     0     D4
1     1     1     1     1     1     1     1     1     0     D2
1     1     1     1     1     1     1     1     1     1     D10
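The selection logic of Equation (20) and Table 4 can be modeled behaviorally in a few lines of Python. The port order in `MUX_PORT` is taken from Table 4; the breakpoint values in `BREAKS` are hypothetical placeholders, and the function names are illustrative rather than part of the paper's design.

```python
# MUX data port for each count of asserted comparators (rows of Table 4, top to bottom).
MUX_PORT = ["D0", "D1", "D3", "D5", "D7", "D9", "D8", "D6", "D4", "D2", "D10"]

# Hypothetical piecewise points b1..b10 in ascending order; NOT the paper's values.
BREAKS = [-6.0, -4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0, 6.0]


def comparator_bits(x, breaks=BREAKS):
    """c[j] = 1 if x > b_j, else 0, as in Equation (20)."""
    return [1 if x > b else 0 for b in breaks]


def select_port(bits):
    """Map the comparator outputs to a MUX data port.

    With ascending breakpoints the outputs form a run of ones followed by zeros,
    so the number of ones identifies the row of Table 4.
    """
    ones = sum(bits)
    if bits != [1] * ones + [0] * (len(bits) - ones):
        raise ValueError("comparator outputs do not match any row of Table 4")
    return MUX_PORT[ones]


if __name__ == "__main__":
    for x in (-7.0, -5.0, 0.0, 5.0, 7.0):
        bits = comparator_bits(x)
        print(x, bits, select_port(bits))
```

Because every valid comparator pattern is a prefix of ones, only 11 of the 1024 possible 10-bit patterns can occur, which is why Table 4 needs just 11 rows.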
According to the linear function expression with 10 piecewise points, the operations
in different ranges can be realized by combinations of multipliers and adders. When
multiplying two 16-bit inputs, each multiplier truncates the product and outputs the sign
bit together with the high 15 bits of the data. For the multipliers and adders in different
ranges, the slopes k_i and the intercepts t_i are the inputs of the multiplier and the adder,
respectively. Ranges that are symmetrical about the y-axis are placed next to each other so that
they reuse the multiplier with the same slope value. Five multipliers and nine adders work in
parallel and output nine results corresponding to different ranges. Based on the encoding table,
the multiplexer chooses the specified value as the final output according to its segment selection.
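The multiplier reuse follows from the symmetry of the sigmoid, σ(−x) = 1 − σ(x): if the line k·x + t approximates the sigmoid on [a, b], then k·x + (1 − t) approximates it on [−b, −a] with the same slope. Nine segments therefore form four mirrored pairs plus one central segment, which matches the five multipliers and nine adders. The Python sketch below illustrates one mirrored pair. It uses a chord through the segment end points instead of the least-squares fit of this paper, and the intercept relation 1 − t is derived from the symmetry rather than taken from the paper's coefficient tables; all function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def chord_coefficients(a: float, b: float):
    """Slope and intercept of the line through the sigmoid at a and b.

    A chord stands in for the least-squares fit to keep the sketch short.
    """
    k = (sigmoid(b) - sigmoid(a)) / (b - a)
    return k, sigmoid(a) - k * a


def shared_multiplier_pair(x: float, k: float, t: float):
    """One product k*x feeds two adders, one per mirrored segment."""
    product = k * x                 # single shared multiplier
    y_right = product + t           # adder for the segment on [a, b]
    y_left = product + (1.0 - t)    # adder for the mirrored segment on [-b, -a]
    return y_right, y_left


if __name__ == "__main__":
    k, t = chord_coefficients(1.0, 2.0)
    y_right, _ = shared_multiplier_pair(1.5, k, t)
    _, y_left = shared_multiplier_pair(-1.5, k, t)
    print("error at  1.5:", y_right - sigmoid(1.5))
    print("error at -1.5:", y_left - sigmoid(-1.5))
```

The two printed errors have equal magnitude and opposite sign, so sharing the multiplier does not change the fitting accuracy of either segment; the multiplexer simply picks whichever adder output belongs to the active range.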
The circuit for the PWLC method has input and output registers, so it produces an output value
after two clock cycles. The multipliers, adders, and comparators are combinational circuits
located between these registers. The parallel processing procedure eliminates
the read-after-write dependency between the selection signal and the outputs of the nine
adders. Moreover, the parallel operation scheme across two data paths makes full
use of hardware resources and increases the computational efficiency. The computation
modules thus work together at high speed and take little computation time.
5. Results and Comparisons
We list the timing characteristics and hardware resources from the referenced papers
in Table 5. The FPGA series names in Table 5 are all Virtex. We present the detailed data
of the minimum input arrival time before clock and the maximum output required time
after clock [25,26]. The minimum input arrival time before clock of the proposed circuit is
8.559 ns and the maximum output required time after clock is 8.860 ns. Due to the high
processing speed requirement, the timing characteristics include clock frequency and circuit
latency. The clock frequency of our circuit design is 208.3 MHz, while the whole end-to-end
latency is 9.6 ns. The comparisons of hardware resources include flip-flop (FF), look-up
table (LUT), and digital signal processor (DSP). We find that our design can achieve high
frequency; the primary reason for this relates to our circuit design with two parallel data
paths. This design, with higher numbers of FFs and LUTs to realize parallel processing, can
achieve the lowest latency when implementing our circuit without the use of DSPs. Given
the advantages in terms of processing latency, the LUT overhead of our design
is acceptable in practical scenarios, while the design of Campo [10] may exceed the available DSP
resources and that of Gomar [17] uses more FFs.
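The reported latency of the proposed circuit is consistent with its clock frequency and the two-cycle operation described in Section 4, as the following Python sketch shows. The helper names are hypothetical; only the values 208.3 MHz, two clock cycles, and 9.6 ns come from the text.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def latency_ns(freq_mhz: float, cycles: int) -> float:
    """End-to-end latency of a circuit that needs `cycles` clock cycles."""
    return cycles * clock_period_ns(freq_mhz)


if __name__ == "__main__":
    period = clock_period_ns(208.3)
    print(f"clock period: {period:.2f} ns")
    print(f"latency for two cycles: {latency_ns(208.3, 2):.1f} ns")
```

This is why a design with a lower clock frequency can still have the lowest latency: the number of clock cycles between input and output matters as much as the clock period.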
Table 5. Timing characteristics and hardware resources of different methods.
              Timing Characteristics    Hardware Resources
Method        Freq./MHz    Lat./ns      Platform    LUT    FF    DSP
Campo [10]    373.5        18.7         XC6V2000    232    16    6
Gomar [17]    383.8        13.0         XC4VFX12    123    71    0
Proposed      208.3        9.6          XC7V2000    493    32    0
This paper proposes maximum absolute error and average absolute error to describe
the deviations between the sigmoid function and the piecewise linear fitting function. The
maximum error presents the largest deviation, while the average error gives the overall
deviation of all samples in the domain of definition. The comparisons of maximum absolute
errors and average absolute errors among different methods [20] are presented in Table 6.
As the table shows, the proposed method has the smallest maximum absolute error among
all methods and one of the smallest average absolute errors. Moreover, compared with the
method that achieves the smallest average absolute error (proposed by Armato [16]), our
method has fewer segments in the specified range. This lower number of segments can
decrease the hardware complexity and reduce hardware resource consumption. In short,
our hardware implementation design achieves high fitting accuracies with few FFs and no
usage of DSPs.
Table 6. Absolute errors of different fitting methods for the sigmoid function.
                                              Absolute Errors
Method            Range            Segments   Maximum    Average
Armato [16]       [−8, 8]          16         0.00788    0.00107
Ngah [12]         [−4, 4]          Null       0.022      0.0077
Gomar [17]        [−4, 4]          Null       0.0087     0.0058
Mitra [14]        [−9.35, 9.35]    14         0.0127     0.0015
Zamanlooy [19]    [−8, 8]          6          0.0189     0.0059
Campo [10]        [−4.59, 4.59]    12         0.028      0.0043
Savich [15]       [−8, 8]          5          0.0679     0.0263
Pan [21]          [−5, 5]          7          0.0189     0.00587
Nguyen [20]       [−5, 5]          12         0.0125     0.0042
Proposed          [−8, 8]          9          0.00784    0.0016
This paper further applies the piecewise linear fitting function to recognize different
handwritten numerals in the MNIST dataset with a specified deep neural network (DNN)
and a convolutional neural network (CNN). The structures of the DNN (comprising five
fully connected layers) and the CNN (consisting of two convolutional layers, two pooling
layers, and two fully connected layers) are given in Table 7. Based on the specified DNN and CNN
structures, this paper compares the recognition accuracies of different fitting methods on
the MNIST dataset. The recognition accuracy can intuitively reflect the actual effect of the
proposed linear fitting method; a higher recognition accuracy indicates that the design is
more trustworthy in practical use.
Table 7. Layer names and sizes of DNN and CNN.
DNN                          CNN
Layer Name    Layer Size     Layer Name    Layer Size
Input         784            Input         1 × 32 × 32
Hidden1       576            Conv1         6 × 28 × 28
Hidden2       450            Pool1         6 × 14 × 14
Hidden3       300            Conv2         12 × 10 × 10
Hidden4       120            Pool2         12 × 5 × 5
Hidden5       80             FullyCon1     300
Output        10             FullyCon2     120
-             -              Output        10
According to the contents of Table 8, the hardware implementation of the linear fitting
function proposed in this paper achieves a higher recognition rate than other methods.
Compared with the second-highest recognition accuracy obtained by Nguyen [20], our
PWLC method increases the accuracy with DNN by 0.06% and the accuracy with CNN by
0.23%. Moreover, the recognition rate of the linear fitting function circuit applied in the
deployment of DNN is even higher than that of the original sigmoid function. This may
be because all the middle layers of the DNN network are fully connected layers, and the
linear fitting function is expressed in a piecewise hierarchical form, which facilitates precise
feature extraction with discrete values and results in high recognition rates. For its part, the
original nonlinear sigmoid function may aggravate the error transmission in the inference
process. Thus, the recognition rates of DNN with the linear fitting function are superior to
those of the original nonlinear sigmoid function.
Table 8. Accuracies of DNN and CNN with different fitting methods.
                                              Accuracy/%
Method            Range            Segments   DNN      CNN
Sigmoid           (−∞, +∞)         Null       97.37    98.96
Armato [16]       [−8, 8]          16         97.38    98.26
Ngah [12]         [−4, 4]          Null       97.37    98.35
Gomar [17]        [−4, 4]          Null       97.3     98.24
Mitra [14]        [−9.35, 9.35]    14         97.35    98.29
Zamanlooy [19]    [−8, 8]          6          97.36    98.21
Campo [10]        [−4.59, 4.59]    12         97.34    98.27
Savich [15]       [−8, 8]          5          96.61    97.85
Nguyen [20]       [−5, 5]          12         97.45    98.42
Proposed          [−8, 8]          9          97.51    98.65
6. Conclusions
This paper proposes PWLC to calculate the expression of the piecewise linear fitting
function for the sigmoid function, compare the absolute errors of different expressions for
the fitting functions, and realize a hardware acceleration scheme with high fitting accuracies.
According to the characteristics of the curvature graph and the systematic sampling method,
the abscissas of a given sigmoid function graph are dynamically selected as the candidate
piecewise points. After comparing the maximum absolute error and average absolute
error of the linear fitting function with different numbers of piecewise points, we choose
the elbow point of the absolute error graph for different piecewise points for use in our
hardware implementation circuit design. Moreover, as the hardware resource consumption
is mainly related to the number of ranges of the piecewise function, the circuit design in this
paper achieves low absolute errors while maintaining hardware resource consumption
at a moderate level. This design does not require a DSP composed of multiple LUTs and
FFs. Therefore, this implementation of the sigmoid function will not lead to excessive
usage of DSPs. We further perform the arithmetic and the data comparisons for the
different ranges in parallel, which noticeably increases the processing speed. These parallel
operation paths consist of multipliers, adders, and comparators used to achieve low latency
and high processing speed. Based on this parallelism at the operational level, the fitting
error and circuit latency are very low, albeit at the cost of a slight increase in hardware
resource consumption. In the future, it will be valuable to use the piecewise linear fitting
function algorithm and hardware module design proposed in this paper to implement
other activation functions in other neural networks.
Author Contributions: Conceptualization, Z.L.; funding acquisition, Z.X.; investigation, Z.L.;
methodology, Z.L. and Y.Z.; resources, Y.Z. and Q.W.; software, B.S. and Q.W.; supervision, Y.Z. and
B.S.; writing—original draft, Z.L.; writing—review and editing, Y.Z. All authors have read and agreed
to the published version of the manuscript.
Funding: This work was supported by the National Natural Science Foundation of China under
Grant 61874140 and 62002365.
Data Availability Statement: The MNIST dataset is available at http://yann.lecun.com/exdb/mnist/
(accessed on 22 March 2022).
Acknowledgments: We appreciate our reviewers and editors for their precious time.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Dubey, S.R.; Singh, S.K.; Chaudhuri, B.B. A Comprehensive Survey and Performance Analysis of Activation Functions in Deep Learning. ArXiv 2021, arXiv:2109.14545.
2. Reuther, A.; Michaleas, P.; Jones, M.; Gadepally, V.N.; Samsi, S.; Kepner, J. AI Accelerator Survey and Trends. In Proceedings of the 2021 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 20–24 September 2021; pp. 1–9.
3. Wang, E.; Davis, J.J.; Zhao, R.; Ng, H.C.; Niu, X.; Luk, W.; Cheung, P.Y.K.; Constantinides, G.A. Deep Neural Network Approximation for Custom Hardware. ACM Comput. Surv. 2019, 52, 1–40.
4. Ghimire, D.; Kil, D.; Kim, S.h. A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration. Electronics 2022, 11, 945.
5. Papaphilippou, P.; Luk, W. Accelerating Database Systems Using FPGAs: A Survey. In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL), Dublin, Ireland, 27–31 August 2018; pp. 125–1255.
6. Chiluveru, S.R.; Tripathy, M.; Mohapatra, B. Accuracy controlled iterative method for efficient sigmoid function approximation. IET Electron. Lett. 2020, 56, 914–916.
7. Lin, Z.; Sinha, S.; Liang, H.; Feng, L.; Zhang, W. Scalable Light-Weight Integration of FPGA Based Accelerators with Chip Multi-Processors. IEEE Trans. Multi-Scale Comput. Syst. 2018, 4, 152–162.
8. Namin, A.H.; Leboeuf, K.; Muscedere, R.; Wu, H.; Ahmadi, M. Efficient hardware implementation of the hyperbolic tangent sigmoid function. In Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Taipei, China, 24–27 May 2009; pp. 2117–2120.
9. Chen, H.; Jiang, L.; Yang, H.; Lu, Z.; Fu, Y.; Li, L.; Yu, Z. An Efficient Hardware Architecture with Adjustable Precision and Extensible Range to Implement Sigmoid and Tanh Functions. Electronics 2020, 9, 1739.
10. Campo, I.D.; Finker, R.; Echanobe, J.; Basterretxea, K. Controlled accuracy approximation of sigmoid function for efficient FPGA-based implementation of artificial neurons. Electron. Lett. 2013, 49, 1598–1600.
11. Nascimento, I.; Jardim, R.; Dias, F.M. A new solution to the hyperbolic tangent implementation in hardware: Polynomial modeling of the fractional exponential part. Neural Comput. Appl. 2012, 23, 363–369.
12. Ngah, S.; Bakar, R.B.A. Sigmoid Function Implementation Using the Unequal Segmentation of Differential Lookup Table and Second Order Nonlinear Function. J. Telecommun. Electron. Comput. Eng. 2017, 9, 103–108.
13. Jin, R.; Jiang, J.; Dou, Y. Accuracy Evaluation of Long Short Term Memory Network Based Language Model with Fixed-Point Arithmetic; Springer International Publishing: Cham, Switzerland, 2017; pp. 281–288.
14. Mitra, S.; Chattopadhyay, P. Challenges in implementation of ANN in embedded system. In Proceedings of the International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), Chennai, India, 3–5 March 2016; pp. 1794–1798.
15. Savich, A.W.; Moussa, M.A.; Areibi, S. The Impact of Arithmetic Representation on Implementing MLP-BP on FPGAs: A Study. IEEE Trans. Neural Netw. 2007, 18, 240–252.
16. Armato, A.; Fanucci, L.; Scilingo, E.P.; Rossi, D.D. Low-error digital hardware implementation of artificial neuron activation functions and their derivative. Microprocess. Microsyst. 2011, 35, 557–567.
17. Gomar, S.; Mirhassani, M.; Ahmadi, M. Precise digital implementations of hyperbolic tanh and sigmoid function. In Proceedings of the Asilomar Conference on Signals, Systems and Computers (ASILOMAR), Pacific Grove, CA, USA, 6–9 November 2016; pp. 1586–1589.
18. Pandit, B.K.; Banerjee, A. VLSI Architecture of Sigmoid Activation Function for Rapid Prototyping of Machine Learning Applications. In Proceedings of the 2021 IEEE International Symposium on Smart Electronic Systems (iSES), Jaipur, India, 18–22 December 2021; pp. 117–122.
19. Zamanlooy, B.; Mirhassani, M. An Analog CVNS-Based Sigmoid Neuron for Precise Neurochips. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2017, 25, 894–906.
20. Nguyen, V.T.; Jueping, C.; Linyu, W.; Jie, C. Low complexity probability-based piecewise linear approximation of the sigmoid function. J. Xidian Univ. 2020, 47, 58–65.
21. Pan, Z.; Gu, Z.; Jiang, X.; Zhu, G.; Ma, D. A Modular Approximation Methodology for Efficient Fixed-Point Hardware Implementation of the Sigmoid Function. IEEE Trans. Ind. Electron. 2022.
22. Liang, Y.; Lu, L.; Xiao, Q.; Yan, S. Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2020, 39, 857–870.
23. Wei, X.; Liang, Y.; Li, X.; Yu, C.H.; Zhang, P.; Cong, J. TGPA: Tile-Grained Pipeline Architecture for Low Latency CNN Inference. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Marrakech, Morocco, 19–21 March 2018; pp. 1–8.
24. Xilinx. Vivado Design Suite Tutorial: Design Flows Overview (UG888). 2021. Available online: https://www.xilinx.com/content/dam/xilinx/support/documents/sw_manuals/xilinx2021_1/ug888-vivado-design-flows-overview-tutorial.pdf (accessed on 22 March 2022).
25. Kumar, A.; Sharma, P.; Gupta, M.K.; Kumar, R. Machine Learning Based Resource Utilization and Pre-estimation for Network on Chip (NoC) Communication. Wirel. Pers. Commun. 2018, 102, 2211–2231.
26. Kumar, A.; Verma, G.; Gupta, M.K.; Salauddin, M.; Rehman, B.K.; Kumar, D. 3D Multilayer Mesh NoC Communication and FPGA Synthesis. Wirel. Pers. Commun. 2019, 106, 1855–1873.