TWO-DIMENSIONAL ORTHOGONAL DCT EXPANSION
IN TRIANGULAR AND TRAPEZOID REGIONS
Soo-Chang Pei (貝蘇章), Jian-Jiun Ding (丁建均), Pao-Yen Lin (林保言),
Tzu-Heng Henry Lee (李自恆)
Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan (R.O.C.)
Email address: pei@cc.ee.ntu.edu.tw, djj@cc.ee.ntu.edu.tw, alexndy@gmail.com,
r96942133@ntu.edu.tw
ABSTRACT
It is known that the 2-D DCT basis is complete and orthogonal in a rectangular region. In this paper, we introduce a way to generate a complete and orthogonal 2-D DCT basis in a trapezoid region or a triangular region without using the complicated Gram-Schmidt method. Moreover, since a polygon can be decomposed into several triangular regions, the proposed method is also suitable for polygonal regions. Our algorithm considerably generalizes the JPEG algorithm. Instead of dividing an image into 8 by 8 blocks, we can divide an image into trapezoid or triangular regions and then transform and code each of them. In addition to the DCT basis, our method can also be used for generating the 2-D complete and orthogonal DFT basis, KLT basis, Legendre basis, Hadamard (Walsh) basis, and polynomial basis in the trapezoid and triangular regions.
1. INTRODUCTION
The 2-D discrete cosine transform (DCT) [1] is

$$X[p, q] = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} C_{p,q}[m, n]\, x[m, n], \tag{1}$$

where $m = 0, 1, \ldots, M-1$, $n = 0, 1, \ldots, N-1$, $p = 0, 1, \ldots, M-1$, $q = 0, 1, \ldots, N-1$,

$$C_{p,q}[m, n] = k_p h_q \cos\!\left(\frac{\pi p (m + 1/2)}{M}\right) \cos\!\left(\frac{\pi q (n + 1/2)}{N}\right), \tag{2}$$

$k_p = \sqrt{2/M}$ when $p \neq 0$, $k_p = \sqrt{1/M}$ when $p = 0$,
$h_q = \sqrt{2/N}$ when $q \neq 0$, $h_q = \sqrt{1/N}$ when $q = 0$.
The DCT basis is orthonormal in the $M \times N$ rectangular region:

$$\sum_{m=0}^{M-1} \sum_{n=0}^{N-1} C_{p_1,q_1}[m, n]\, C_{p_2,q_2}[m, n] = \delta[p_2 - p_1]\, \delta[q_2 - q_1]. \tag{3}$$
DCT expansion is often used in signal synthesis and image compression [1][2]. In the JPEG process, an image is first divided into $8 \times 8$ blocks and then the 2-D DCT basis is applied to expand each block [3]. Since the DCT is close to the optimal Karhunen-Loeve transform (KLT) and decorrelates well, most of the energy is concentrated in the low-frequency region after the DCT is performed, which is very helpful for compression.
Although the DCT in (1) is popular in image compression, it has a limitation: it is only guaranteed to be orthogonal in an $M \times N$ rectangular region. For a region of another shape, it may not be orthogonal. For such regions we can use the Gram-Schmidt algorithm to convert the DCT basis into an orthogonal basis, but this is very time-consuming and round-off error may accumulate during the computation.
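The loss of orthogonality outside a rectangle can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `dct_basis` and `gram`, the block size, and the triangular mask are illustrative choices and are not taken from the paper. It builds the basis in (2), confirms (3) on the full rectangle, and shows that the same vectors are no longer orthonormal when the sum is restricted to a triangular subset of pixels.

```python
import numpy as np

def dct_basis(M, N):
    """Return C[p, q, m, n] as defined in (2) for an M x N block."""
    k = np.full(M, np.sqrt(2.0 / M))
    k[0] = np.sqrt(1.0 / M)
    h = np.full(N, np.sqrt(2.0 / N))
    h[0] = np.sqrt(1.0 / N)
    # cm[p, m] = k_p cos(pi p (m + 1/2) / M), and likewise cn[q, n]
    cm = k[:, None] * np.cos(np.pi * np.outer(np.arange(M), np.arange(M) + 0.5) / M)
    cn = h[:, None] * np.cos(np.pi * np.outer(np.arange(N), np.arange(N) + 0.5) / N)
    return np.einsum("pm,qn->pqmn", cm, cn)

def gram(C, mask):
    """Inner products of all basis pairs, summed only over pixels where mask is True."""
    V = C[:, :, mask]                 # shape (M, N, number of pixels in the region)
    V = V.reshape(-1, V.shape[-1])    # one row per (p, q)
    return V @ V.T

M, N = 4, 6                                             # illustrative block size
C = dct_basis(M, N)
rect = np.ones((M, N), dtype=bool)                      # full rectangular region
tri = np.arange(N)[None, :] <= np.arange(M)[:, None]    # hypothetical triangular region

G_rect = gram(C, rect)
G_tri = gram(C, tri)
print(np.allclose(G_rect, np.eye(M * N)))   # orthonormal on the rectangle, as in (3)
print(np.allclose(G_tri, np.eye(M * N)))    # not orthonormal on the triangle
```

On the triangle the Gram matrix cannot be the identity, since 24 vectors are restricted to only 10 pixels; this is the situation that normally calls for Gram-Schmidt orthogonalization.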
In this paper, we find that, with some modification, the DCT basis can also be complete and orthogonal in a triangular region, a trapezoidal region, or their twisted forms.
Furthermore, since a polygon can be viewed as a combination of triangles, the proposed method can also be applied to a polygonal region. We can first divide an n-sided polygon into n-2 triangular regions (instead of $8 \times 8$ blocks) and then perform the DCT expansion for each triangular region. Therefore, with the proposed method, we can perform the DCT expansion for an arbitrary polygonal region, which makes the JPEG algorithm much more flexible.
Moreover, in addition to the DCT basis, the proposed method can also be applied to other discrete orthogonal bases with even and odd symmetries, such as the KLT basis, the DFT basis, the Hadamard (Walsh) basis, the discrete Legendre basis, and other discrete orthogonal polynomial bases. With the proposed method, we can convert them into a complete and orthogonal vector set in the trapezoid and triangular regions.
2. COMPLETE AND ORTHOGONAL DCT BASIS IN
THE TRAPEZOID REGION
Here, we define the trapezoid as a region that has $M$ rows (or columns) such that, if the number of pixels in the $m$th row ($m = 0, 1, \ldots, M-1$) is denoted by $K(m)$, then

$$K(m) + K(M-1-m) \text{ is a constant.} \tag{4}$$

Note that the definition in (4) is more general than the traditional definition of the "trapezoid". In Fig. 1, we show an example of a trapezoid region that satisfies (4) and whose rows all start at the same column.

Fig. 1: A "trapezoid" region that satisfies (4) and whose rows (labelled from the 0th row to the $(M-1)$th row) all start at the same column. Black dots mean the pixels in the trapezoid region.

To derive the DCT bases that are orthogonal in the trapezoid region that satisfies (4), we first use the method in Fig. 2(b) to convert the trapezoid region into an $M \times N$ rectangular region, where

$$N = K(m) + K(M-1-m). \tag{5}$$

The obtained rectangular region is shown in Fig. 2(a). It consists of two parts: Region A (the original trapezoid region) and Region B (the extended region).

Fig. 2: Extending the trapezoid region in Fig. 1 into a rectangular region. (a) The rectangular region, with rows $m = 0, \ldots, M-1$ and columns $n = 0, \ldots, N-1$, which consists of the pixels in the original trapezoid region (Region A) and the pixels in the rectangular region but not in the original trapezoid region (Region B). (b) Region B has the shape of Region A rotated by 180°.

Note that, if (5) is satisfied, Region A and Region B have the same number of points ($MN/2$) (see Fig. 2). Moreover,

[Theorem 1] For the rectangular region in Fig. 2, suppose that Region A satisfies (5) and $m = 0, 1, \ldots, M-1$, $n = 0, 1, \ldots, N-1$. If the pixel $[m, n]$ is in Region A, then $[M-1-m, N-1-n]$ must be in Region B. Similarly, if the pixel $[m, n]$ is in Region B, then the pixel $[M-1-m, N-1-n]$ must be in Region A.

(Proof): From (5),

$$K(M-1-m) = N - K(m). \tag{6}$$

If $[m, n]$ is in Region A, then, in the $m$th row, there should be no less than $n+1$ pixels in Region A. Thus, $K(m) \ge n+1$ must be satisfied. Therefore, from (6),

$$K(M-1-m) \le N - n - 1. \tag{7}$$

Since $[M-1-m, N-1-n]$ is the $(N-n)$th pixel in the $(M-1-m)$th row, from (7), it cannot be in Region A. It must be in Region B.
#

We know that, for the $M \times N$ rectangular region, the DCT bases $C_{p,q}[m, n]$ in (2) form a complete and orthogonal basis set. To derive the orthogonal basis for the trapezoid Region A, we can use the even / odd symmetric property of the DCT basis:

$$C_{p,q}[m, n] = (-1)^{p+q}\, C_{p,q}[M-1-m, N-1-n]. \tag{8}$$

From (8), we can classify the DCT bases into two classes: $p+q$ is even (i.e., $C_{p,q}[m, n]$ is even symmetric about $[(M-1)/2, (N-1)/2]$) and $p+q$ is odd (i.e., odd symmetric about $[(M-1)/2, (N-1)/2]$). If

$$p_1 + q_1 \text{ is even and } p_2 + q_2 \text{ is even}, \tag{9}$$

from (3), we know that

$$\sum_{m=0}^{M-1} \sum_{n=0}^{N-1} C_{p_1,q_1}[m, n]\, C_{p_2,q_2}[m, n] = \delta[p_2 - p_1]\, \delta[q_2 - q_1]. \tag{10}$$

Since in Fig. 2(a) the rectangular region can be divided into Region A and Region B, (10) can be re-expressed as:

$$\sum_{[m,n] \in \text{Region A}} C_{p_1,q_1}[m, n]\, C_{p_2,q_2}[m, n] + \sum_{[m,n] \in \text{Region B}} C_{p_1,q_1}[m, n]\, C_{p_2,q_2}[m, n] = \delta[p_2 - p_1]\, \delta[q_2 - q_1]. \tag{11}$$

From Theorem 1, since if $[m, n] \in$ Region B, then $[M-1-m, N-1-n] \in$ Region A, therefore

$$\begin{aligned}
\sum_{[m,n] \in \text{Region B}} C_{p_1,q_1}[m, n]\, C_{p_2,q_2}[m, n]
&= \sum_{[m,n] \in \text{Region A}} C_{p_1,q_1}[M-1-m, N-1-n]\, C_{p_2,q_2}[M-1-m, N-1-n] \\
&= \sum_{[m,n] \in \text{Region A}} C_{p_1,q_1}[m, n]\, C_{p_2,q_2}[m, n].
\end{aligned} \tag{12}$$

The last line in (12) comes from (8) and (9). Thus, (11) becomes

$$2 \sum_{[m,n] \in \text{Region A}} C_{p_1,q_1}[m, n]\, C_{p_2,q_2}[m, n] = \delta[p_2 - p_1]\, \delta[q_2 - q_1], \tag{13}$$

i.e., the two bases are also orthogonal in Region A. Thus,
[Theorem 2] To derive the orthonormal DCT basis in the trapezoid region that satisfies (5) and whose rows all start at the same column, as in Fig. 1, we can use the following process:
(A) First, use the method in Fig. 2(b) to construct the $M \times N$ rectangular region and obtain the orthonormal DCT basis $C_{p,q}[m, n]$ by (2) for the rectangular region.
(B) Select the DCT bases $C_{p,q}[m, n]$ that satisfy

$$p + q \text{ is even.} \tag{14}$$

(C) Then multiply $C_{p,q}[m, n]$ by a constant:

$$B_{p,q}[m, n] = \sqrt{2}\, C_{p,q}[m, n]. \tag{15}$$

Then, from (13),

$$\sum_{[m,n] \in \text{Region A}} B_{p_1,q_1}[m, n]\, B_{p_2,q_2}[m, n] = \delta[p_2 - p_1]\, \delta[q_2 - q_1]. \tag{16}$$

Therefore, $\{B_{p,q}[m, n] \mid p \in [0, M-1],\ q \in [0, N-1], \text{ and } p+q \text{ is even}\}$ form an orthonormal basis set in the trapezoid Region A.

[Theorem 3] The orthogonal DCT basis derived in Theorem 2 is complete in the trapezoid region.
(Proof): Since there are $MN/2$ points in the trapezoid Region A, a complete set should have $MN/2$ bases. Note that, from (14), the constraint that $p+q$ is even should be satisfied.
Case 1: Both $M$ and $N$ are even: Since $p \in [0, M-1]$ and $q \in [0, N-1]$, the number of pairs $(p, q)$ for which $p+q$ is even is:

$$\underbrace{\frac{M}{2} \cdot \frac{N}{2}}_{\text{both } p \text{ and } q \text{ are even}} + \underbrace{\frac{M}{2} \cdot \frac{N}{2}}_{\text{both } p \text{ and } q \text{ are odd}} = \frac{MN}{2}. \tag{17}$$

Case 2: $M$ is odd and $N$ is even: There are $(M+1)/2$ even $p$ and $(M-1)/2$ odd $p$. Thus, the number of pairs $(p, q)$ that satisfy (14) is:

$$\underbrace{\frac{M+1}{2} \cdot \frac{N}{2}}_{\text{both } p \text{ and } q \text{ are even}} + \underbrace{\frac{M-1}{2} \cdot \frac{N}{2}}_{\text{both } p \text{ and } q \text{ are odd}} = \frac{MN}{2}. \tag{18}$$

Case 3: $M$ is even and $N$ is odd:

$$\underbrace{\frac{M}{2} \cdot \frac{N+1}{2}}_{\text{both } p \text{ and } q \text{ are even}} + \underbrace{\frac{M}{2} \cdot \frac{N-1}{2}}_{\text{both } p \text{ and } q \text{ are odd}} = \frac{MN}{2}. \tag{19}$$

Note that it is impossible that both $M$ and $N$ are odd. In that case, from (5), if $m = (M-1)/2$, then $2K((M-1)/2) = N$ and $K((M-1)/2) = N/2$, which is not an integer.
Therefore, in all the cases, we can obtain $MN/2$ DCT bases from Theorem 2, which is equal to the number of points in the trapezoid Region A. Thus, the DCT bases obtained from Theorem 2 form a complete and orthonormal set in the trapezoid Region A.
#

In Fig. 3, we give an example. Fig. 3(a) is a trapezoid region. We use (14)-(16) to derive its complete and orthonormal DCT set (consisting of 16 bases) and the results are shown in Fig. 3(b).

Fig. 3: The complete and orthonormal 2-D DCT basis in a trapezoid region. (a) The trapezoid region. (b) The 16 bases with $p+q$ even: $C_{0,0}$, $C_{2,0}$, $C_{1,1}$, $C_{3,1}$, $C_{0,2}$, $C_{2,2}$, $C_{1,3}$, $C_{3,3}$, $C_{0,4}$, $C_{2,4}$, $C_{1,5}$, $C_{3,5}$, $C_{0,6}$, $C_{2,6}$, $C_{1,7}$, $C_{3,7}$.
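The construction in Theorem 2 and the counting argument in Theorem 3 can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names and the row lengths `K = [7, 5, 3, 1]` (which give $M = 4$, $N = 8$, and 16 pixels) are illustrative assumptions and are not the region plotted in Fig. 3. Each basis is stored as a vector over the pixels of Region A in row-major order.

```python
import numpy as np

def dct_1d(L):
    """Orthonormal 1-D DCT matrix; row p holds k_p cos(pi p (n + 1/2) / L)."""
    scale = np.full(L, np.sqrt(2.0 / L))
    scale[0] = np.sqrt(1.0 / L)
    return scale[:, None] * np.cos(np.pi * np.outer(np.arange(L), np.arange(L) + 0.5) / L)

def trapezoid_dct_basis(K):
    """K[m] is the number of pixels in row m; rows start at column 0 and satisfy (4)."""
    K = np.asarray(K)
    M = len(K)
    N = int(K[0] + K[-1])                            # Eq. (5)
    if not np.all(K + K[::-1] == N):
        raise ValueError("row lengths violate K(m) + K(M-1-m) = N")
    mask = np.arange(N)[None, :] < K[:, None]        # True on Region A
    Cm, Cn = dct_1d(M), dct_1d(N)
    # Step (B): keep only the pairs (p, q) with p + q even.
    pq = [(p, q) for p in range(M) for q in range(N) if (p + q) % 2 == 0]
    # Step (C): scale by sqrt(2) and keep only the Region A samples.
    B = np.stack([np.sqrt(2.0) * np.outer(Cm[p], Cn[q])[mask] for p, q in pq])
    return pq, mask, B

K = [7, 5, 3, 1]                                     # assumed example region
pq, mask, B = trapezoid_dct_basis(K)

# Expansion and reconstruction of arbitrary data on Region A.
x = np.random.default_rng(0).standard_normal(int(mask.sum()))
coeffs = B @ x
x_rec = B.T @ coeffs
print(len(pq), int(mask.sum()), np.allclose(B @ B.T, np.eye(len(pq))))
```

Because the number of selected bases equals the number of pixels and the Gram matrix is the identity, `B` is a square orthogonal matrix, so `B.T @ coeffs` recovers the data exactly up to round-off.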
3. EXTENDING TO GENERALIZED TRAPEZOID,
TRIANGULAR, AND POLYGONAL REGIONS
We have derived the complete and orthonormal DCT basis for the trapezoid region whose rows all start at the same column, as in Fig. 1. In fact, our results can also be applied to other types of regions.
First, our results can be applied to any trapezoid region that satisfies (4), even if the first pixels of the rows are not aligned at the same column. For a region as in Fig. 4(a), we can first shear it into the form in Fig. 4(b), then use the method in Section 2 to find the complete orthogonal DCT bases, and then shear the bases back.
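The shearing step only changes where each row starts, so it can be sketched as a gather and scatter of row segments. The following is a minimal illustration, assuming NumPy; the helper names, the row offsets `starts`, and the row lengths `K` are hypothetical. The aligned basis is built as in Section 2, the region's pixels are gathered row by row (shearing to the aligned form), and the reconstructed values are written back to their original columns (shearing back).

```python
import numpy as np

def dct_1d(L):
    """Orthonormal 1-D DCT matrix; row p is the p-th basis vector of length L."""
    scale = np.full(L, np.sqrt(2.0 / L))
    scale[0] = np.sqrt(1.0 / L)
    return scale[:, None] * np.cos(np.pi * np.outer(np.arange(L), np.arange(L) + 0.5) / L)

def aligned_basis(K):
    """Basis of Theorem 2 for rows of length K[m] that all start at column 0."""
    K = np.asarray(K)
    M, N = len(K), int(K[0] + K[-1])
    assert np.all(K + K[::-1] == N), "row lengths must satisfy (4)"
    mask = np.arange(N)[None, :] < K[:, None]
    Cm, Cn = dct_1d(M), dct_1d(N)
    return np.stack([np.sqrt(2.0) * np.outer(Cm[p], Cn[q])[mask]
                     for p in range(M) for q in range(N) if (p + q) % 2 == 0])

def gather(image, starts, K):
    """Shear: read row m from columns starts[m] .. starts[m] + K[m] - 1."""
    return np.concatenate([image[m, s:s + k] for m, (s, k) in enumerate(zip(starts, K))])

def scatter(values, starts, K, shape):
    """Shear back: write the row segments to their original columns."""
    out = np.zeros(shape)
    pos = 0
    for m, (s, k) in enumerate(zip(starts, K)):
        out[m, s:s + k] = values[pos:pos + k]
        pos += k
    return out

starts = [0, 1, 2, 3]          # assumed: each row starts one column further right
K = [7, 5, 3, 1]               # assumed row lengths
image = np.arange(28, dtype=float).reshape(4, 7)

B = aligned_basis(K)
coeffs = B @ gather(image, starts, K)
recon = scatter(B.T @ coeffs, starts, K, image.shape)
region = scatter(np.ones(sum(K)), starts, K, image.shape) > 0
print(np.allclose(recon[region], image[region]))
```

A single sheared-back basis image can be obtained in the same way with `scatter(B[i], starts, K, image.shape)`.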
Furthermore, our method can also be applied to trapezoid regions that are rotated forms of Fig. 1 or Fig. 4(a).
Moreover, since the triangular region can be viewed as a special case of the trapezoid region whose number of pixels in the first (or the last) row is 1 (i.e., in (5), $K(0) = 1$ or $K(M-1) = 1$), as in Fig. 5, the method in Theorem 2 can also be used for the triangular region.
Furthermore, since an n-sided polygonal region can be viewed as a combination of n-2 triangular regions, we can also use our method to perform the DCT expansion for a polygonal region.
Fig. 4: Shearing (a) a region that satisfies (5) into (b) the trapezoid region whose first pixels in each row are aligned at the same column.
Fig. 5: A triangular region can be viewed as a special case of the trapezoid region where $K(0)$ or $K(M-1) = 1$ in (4).
However, it is hard to find a trapezoid that matches an arbitrary shape accurately in real image compression. Instead of looking for a perfectly matched trapezoid, we therefore find the approximate trapezoid with the largest area that is contained inside the arbitrary shape.
To obtain a higher compression ratio, we intend to find a trapezoid that is contained inside the shape. Then most of the pixels in the trapezoid region may have similar characteristics (grey-level values). In other words, the energy in this trapezoid region is mostly concentrated in the low-frequency region, which is helpful for image compression. Fig. 6 shows an example of finding an approximate trapezoid in an arbitrary region. Fig. 6(a) is an arbitrary shape and Fig. 6(b) is one way to find the approximate trapezoid region. We can see that the trapezoid cannot exactly match the shape in Fig. 6(a). Therefore, we may find more trapezoids of smaller size in the rest of the region to cover the entire shape. In Section 5, we will show how to deal with this problem. In fact, a small number of missing points is tolerable. They can be easily recovered by pixel interpolation in post-processing.
4. EXTENDING TO OTHER SYMMETRIC
ORTHOGONAL BASIS
In Sections 2 and 3, we discussed how to derive the complete and orthogonal DCT basis in a triangular or a trapezoid region. In fact, our method is also suitable for other types of bases. Since Theorem 2 was derived from (8), if a basis set is complete and orthogonal in a rectangular region and has the even / odd symmetric relation in (8), we can also use Theorem 2 to convert it into a complete and orthogonal basis set in the triangular and the trapezoid regions.
For example, in digital signal processing [5], the basis
sets of the 2-D discrete Fourier transform (DFT), the 2-D
discrete Hartley transform, the 2-D number theoretic
transform (NTT), the 2-D discrete Legendre transform,
the 2-D discrete orthogonal polynomial expansion, and
the 2-D Hadamard (Walsh) transform all have the even /
odd symmetric relation as in (8). Therefore, we can use
Theorem 2 to convert them into complete and orthonormal
basis sets in a triangular or a trapezoid region.
We give an example of deriving the complete orthogonal Hadamard (Walsh) basis set for the triangular region in Fig. 7(a). As in the method in Fig. 2(b), we first convert it into a $4 \times 4$ rectangular region. The 2-D orthogonal Hadamard basis for the $4 \times 4$ rectangular region is [6]:

$$w_{p,q} = w_p^T w_q, \quad \text{where } p = 0, 1, 2, 3,\ q = 0, 1, 2, 3, \tag{20}$$

$$\begin{aligned}
w_0 &= [1\ \ 1\ \ 1\ \ 1], \\
w_1 &= [1\ \ 1\ \ {-1}\ \ {-1}], \\
w_2 &= [1\ \ {-1}\ \ {-1}\ \ 1], \\
w_3 &= [1\ \ {-1}\ \ 1\ \ {-1}].
\end{aligned} \tag{21}$$
Fig. 7: (a) A triangular region, (b)-(i) The complete and orthogonal Hadamard basis set for the region in (a): (b) $w_{0,0}$, (c) $w_{2,0}$, (d) $w_{1,1}$, (e) $w_{3,1}$, (f) $w_{0,2}$, (g) $w_{2,2}$, (h) $w_{1,3}$, (i) $w_{3,3}$.
Fig. 6: Finding (b) an approximate trapezoid region in (a) an arbitrary shape.
Fig. 8: (a) A triangular region, (b)-(i) The complete and orthogonal 2-D discrete Fourier transform basis set for the region in (a): (b) $w_{0,0}$, (c) $w_{2,0}$, (d) $w_{1,1}$, (e) $w_{3,1}$, (f) $w_{0,2}$, (g) $w_{2,2}$, (h) $w_{1,3}$, (i) $w_{3,3}$.
Then, we use (14) and choose the Hadamard bases for which $p+q$ is even. The bases that satisfy the constraint are shown in Fig. 7(b)-(i). From Fig. 7(b)-(i), it is not hard to show that

$$\sum_{[m,n] \in \text{the region in Fig. 7(a)}} w_{p_1,q_1}[m, n]\, w_{p_2,q_2}[m, n] = 8\, \delta[p_2 - p_1]\, \delta[q_2 - q_1]. \tag{22}$$

That is, the bases in Fig. 7(b)-(i) form a complete and orthogonal basis set in the triangular region in Fig. 7(a). Fig. 8 shows another example for the 2-D DFT basis.
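The relation in (22) can be verified with a few lines of code. The following is a minimal sketch, assuming NumPy. Two things are assumptions made for illustration: the rows of `W` are taken in sequency (Walsh) order, so that row $p$ has the parity $(-1)^p$ required by (8), and the triangular region is taken to have row lengths `K = [3, 2, 2, 1]`, which satisfies (4) with $N = 4$ and contains 8 pixels.

```python
import numpy as np

# Assumed 4-point Walsh vectors in sequency order: row p is even-symmetric for
# even p and odd-symmetric for odd p, which is the symmetry needed in (8).
W = np.array([[1,  1,  1,  1],
              [1,  1, -1, -1],
              [1, -1, -1,  1],
              [1, -1,  1, -1]])

K = np.array([3, 2, 2, 1])                      # assumed triangular region, 8 pixels
mask = np.arange(4)[None, :] < K[:, None]

# Keep the bases w_{p,q} = w_p^T w_q with p + q even, sampled on the region.
pq = [(p, q) for p in range(4) for q in range(4) if (p + q) % 2 == 0]
V = np.stack([np.outer(W[p], W[q])[mask] for p, q in pq])

G = V @ V.T                                     # Gram matrix over the region
print(np.array_equal(G, 8 * np.eye(8, dtype=int)))
```

The eight selected bases are mutually orthogonal with squared norm 8 on the eight pixels of the region, so they are also complete there.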
5. APPLICATIONS IN IMAGE COMPRESSION AND
SIGNAL ANALYSIS
This section is divided into three parts. Section 5.1 discusses the proposed method applied to a trapezoid region. Section 5.2 introduces the new segmentation and compression algorithms. Section 5.3 shows the compression procedure for an entire image.
5.1. Proposed method in a specific trapezoid region
The proposed method provides an efficient way to transform and code a trapezoid or triangular shape object. In
Figs. 9 and 10, we show a simulation.
Fig. 10: Normalized partial sums $P[j]$ (see (24)), which measure the performance of energy concentration, plotted against $j$ for (a) the proposed method, (b) the DCT obtained by the Gram-Schmidt method, and (c) the two directional 1-D DCT in MPEG-4.
Although a door has the shape of a rectangle, in a 2-D image it generally appears as a trapezoid, as in Fig. 9(b). We use three methods to transform and code the door region in Fig. 9(b): (a) the proposed method, (b) the DCT basis orthogonalized by the Gram-Schmidt method, and (c) the 1-D DCT applied along the x-axis and the y-axis, as in the method used in MPEG-4 [4]. Their running times are:
(a) proposed: 0.0364 sec
(b) Gram-Schmidt: 1032.87 sec
(c) the 1-D DCT method in MPEG-4: 0.0701 sec.
(23)
Then, in Fig. 10, we show the normalized partial sums of the energies of the largest DCT coefficients of the three methods:
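$$P[j] = \frac{\sum_{i=1}^{j} s^2[i]}{\text{total energy}}, \quad \text{where } s[j] = \underset{\text{from large to small}}{\operatorname{sort}} \bigl\{ |X[p, q]| \bigr\}, \tag{24}$$

$s[1] \ge s[2] \ge s[3] \ge \ldots$, and $X[p, q]$ are the DCT expansion coefficients of the three methods.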
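The measure in (24) is straightforward to compute from any set of expansion coefficients. The following is a minimal sketch, assuming NumPy; the function name and the small coefficient array are illustrative and are not data from the experiment.

```python
import numpy as np

def normalized_partial_sums(X):
    """Return P[1], P[2], ... from (24) for an array of expansion coefficients X."""
    s2 = np.sort(np.abs(np.ravel(X)) ** 2)[::-1]   # squared magnitudes, large to small
    return np.cumsum(s2) / s2.sum()                # partial sums over total energy

X = np.array([[3.0, 0.0],
              [4.0, 0.0]])                         # hypothetical coefficients
P = normalized_partial_sums(X)
print(P)
```

For this example the energies are 16 and 9 out of a total of 25, so the first two values of `P` are 0.64 and 1.0. A curve that approaches 1 quickly indicates good energy concentration.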
From (23), the proposed method is much faster than the Gram-Schmidt method, and its energy concentration is as good as that of the Gram-Schmidt method (see Fig. 10). Moreover, compared with the shape-adaptive DCT method in MPEG-4, since our method performs the DCT with a fixed number of points for each row and column, it has both less computation time and better energy concentration than the 1-D DCT method in MPEG-4.
5.2. New Segmentation and Compression Algorithms
Fig. 9: (a) A laboratory image. (b) In a 2-D image, the door has the shape of a trapezoid.
With the proposed method, the algorithm for image compression can become much more general. In the existing JPEG algorithm, an image is first divided into several $8 \times 8$ blocks, as in Fig. 11(a). Now, with the proposed method, we can divide an image into several trapezoid, rectangular, or triangular blocks instead of $8 \times 8$ rectangular blocks, as in Fig. 11(b).
Fig. 11: (a) The existing JPEG cuts an image into several $8 \times 8$ rectangular blocks. (b) With the proposed method, we can divide an image into rectangular, trapezoid, or triangular blocks.
Compared with the original JPEG algorithm, the method in Fig. 11(b) is more flexible. Since the boundaries between two blocks need not be parallel to the x- and y-axes, we can make them match the edges of the objects. Then the YCbCr values in a block will be more uniform, which is good for compression.
Making a block exactly match the shape of an object, as is done in MPEG-4, requires extra data to record the edges of the objects, which is not good for compression. Using the method in Fig. 11(b) avoids this problem. Since the boundary consists of straight lines, to record the shape of a block we only have to record its corners.
Moreover, from Section 4, since Theorem 2 can also be used for deriving the 2-D complete and orthogonal DFT, NTT, and Hadamard bases in a trapezoid or triangular region, the proposed method is also useful for signal analysis, filter design, CDMA, and other signal processing applications.
5.3. Image Compression with proposed method
Section 5.1 shows the compression in a specific trapezoid region. However, for general images we can hardly find a trapezoid that exactly matches the shape of an object. Therefore, finding appropriate trapezoids is very important in our proposed method.
An image is divided into four kinds of regions: lower frequency regions, higher frequency regions, border regions, and the corner and boundary parts. The lower frequency regions are trapezoids; they are depicted in Fig. 12(b). We divide this image into eight low frequency parts. The lines in Fig. 12(b) denote the boundaries of the trapezoid regions. The trapezoid DCT is used in the lower frequency regions, and the corner and boundary parts are coded by geometric coding techniques. The arbitrary-shape DCT using the Gram-Schmidt method is used in the higher frequency regions and the border regions, because these regions are small enough that it does not cost too much processing time.
Fig. 12: (a) A fruit image. (b) The lower frequency regions found in the fruit image.
We try to find the largest trapezoid that is contained inside each lower frequency region, so that a higher compression ratio can be obtained in the compression process. However, when the objects are divided into many trapezoid regions, the optimal solution is difficult to find.
There are two problems in the dividing procedure: overlapped trapezoids and missing points. Missing points mean that there are gaps between the trapezoids we found. This can be dealt with by pixel interpolation. The overlapped trapezoids problem arises when we divide the region into larger trapezoids. It can be easily removed by taking the average value or by simply dropping one of the points. Missing points may cause larger errors, so we prefer to process more data (the overlapped trapezoids problem) rather than have missing points between the regions.
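Both remedies can be sketched in a few lines. The following is a minimal illustration, assuming NumPy; the function names, the choice of averaging over overlaps, and the use of the mean of the covered 4-neighbours as the interpolation rule are assumptions made for the example and are not specified in the paper.

```python
import numpy as np

def merge_regions(shape, regions):
    """Average the reconstructed values where regions overlap.

    regions: list of (mask, values) pairs; values is a full-size array that is
    only read where mask is True.
    """
    total = np.zeros(shape)
    count = np.zeros(shape, dtype=int)
    for mask, values in regions:
        total[mask] += values[mask]
        count[mask] += 1
    covered = count > 0
    out = np.zeros(shape)
    out[covered] = total[covered] / count[covered]
    return out, covered

def fill_missing(image, covered, missing):
    """Replace each missing pixel by the mean of its covered 4-neighbours, if any."""
    out = image.copy()
    H, W = image.shape
    for m, n in zip(*np.nonzero(missing)):
        neighbours = [(m - 1, n), (m + 1, n), (m, n - 1), (m, n + 1)]
        vals = [image[a, b] for a, b in neighbours
                if 0 <= a < H and 0 <= b < W and covered[a, b]]
        if vals:
            out[m, n] = np.mean(vals)
    return out

# Two hypothetical regions on a 1 x 5 strip: they overlap at column 2 and
# leave column 4 uncovered.
shape = (1, 5)
mask1 = np.array([[True, True, True, False, False]])
mask2 = np.array([[False, False, True, True, False]])
merged, covered = merge_regions(shape, [(mask1, np.full(shape, 10.0)),
                                        (mask2, np.full(shape, 20.0))])
filled = fill_missing(merged, covered, ~covered)
print(filled)
```

In this example the overlapped pixel takes the average 15 of the two reconstructions, and the missing pixel takes the value 20 of its only covered neighbour.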
Fig. 13 is the flowchart of our proposed compression method. An image is divided into four kinds of regions as mentioned before. The trapezoid DCT is applied to the low frequency regions; in other words, the low frequency regions must be divided into trapezoids. The arbitrary-shape DCT using Gram-Schmidt (GS) orthogonalization is applied to the rest of the regions.
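Fig. 13: The flowchart of our proposed image compression method using DCT in trapezoid regions. The input image is split into the lower frequency regions, which are processed by the DCT in trapezoid regions and then coded, and the other regions, which are processed by the arbitrary-shape DCT (ASDCT) using the Gram-Schmidt orthogonalization (GSO) process and then coded.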
As mentioned, it is hard to find the optimal solution for dividing the lower frequency regions into trapezoids. We propose a method to resolve the problem. For each object in the image, the dividing procedure has two main steps: slice the object into several stripes, and find the inscribed trapezoid in each stripe. The dividing procedure is as follows:
Step 1. Find the corners of the object.
Step 2. According to the corners, slice the object into several stripes at the positions of the corners. If the corners are too close, merge the stripes.
Step 3. Find the inscribed trapezoid in each stripe. The endpoints of the trapezoid are initialized to the endpoints of the upper side and the lower side of the stripe.
Step 4. Move the legs inward to obtain the inscribed trapezoid.
Step 5. Record the endpoints of the inscribed trapezoid.
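Steps 3 to 5 can be sketched for a single stripe as follows. This is a minimal illustration, assuming NumPy, and it makes several assumptions that the paper does not state: the stripe is given as a boolean mask with at least two rows, every row of the object is a single non-empty run of pixels, and each leg is moved inward by parallel translation. The function name is hypothetical, and the sketch does not enforce condition (4) on the resulting row lengths.

```python
import numpy as np

def inscribed_trapezoid(stripe):
    """Return per-row left/right columns and the four endpoints of an inscribed trapezoid."""
    M = stripe.shape[0]
    left = np.array([np.flatnonzero(row)[0] for row in stripe], dtype=float)
    right = np.array([np.flatnonzero(row)[-1] for row in stripe], dtype=float)
    t = np.arange(M) / (M - 1)            # 0 at the upper side, 1 at the lower side
    # Step 3: the initial legs join the endpoints of the upper and lower sides.
    left_leg = left[0] + (left[-1] - left[0]) * t
    right_leg = right[0] + (right[-1] - right[0]) * t
    # Step 4: translate each leg inward just far enough to lie inside the object.
    left_leg = left_leg + max(0.0, float(np.max(left - left_leg)))
    right_leg = right_leg - max(0.0, float(np.max(right_leg - right)))
    lcol = np.ceil(left_leg).astype(int)
    rcol = np.floor(right_leg).astype(int)
    # Step 5: record the endpoints as (row, column) pairs.
    corners = [(0, int(lcol[0])), (0, int(rcol[0])),
               (M - 1, int(lcol[-1])), (M - 1, int(rcol[-1]))]
    return lcol, rcol, corners

# Hypothetical stripe whose boundary bends inward in the middle rows.
left_bound = [2, 4, 5, 4, 2]
right_bound = [10, 9, 9, 9, 10]
stripe = np.zeros((5, 12), dtype=bool)
for m, (a, b) in enumerate(zip(left_bound, right_bound)):
    stripe[m, a:b + 1] = True

lcol, rcol, corners = inscribed_trapezoid(stripe)
inside = all(stripe[m, lcol[m]:rcol[m] + 1].all() for m in range(stripe.shape[0]))
print(corners, inside)
```

For a narrow object the legs may cross after translation, in which case some rows are empty (`lcol[m] > rcol[m]`); a practical implementation would need to detect this case, for example by splitting the stripe further.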
Fig. 14 shows an example of finding inscribed trapezoid regions according to this process. Fig. 14(a) shows how we slice the object into stripes. Note that if the corners are too close, we merge the two stripes. Fig. 14(b) shows the process of finding the inscribed trapezoid: we move the legs of the trapezoid until they are entirely inside the object.
Fig. 14: (a) Slicing the object into several stripes according to the corners; stripes are merged where the corners are too close. (b) Finding the inscribed trapezoid by moving the legs of the initial trapezoid inward.
Fig. 15: (a) The fruit image reconstructed using the JPEG compression standard (692 bytes). (b) The fruit image reconstructed using our proposed method (165 bytes).
Fig. 15 shows the fruit image reconstructed using the JPEG compression standard and using our proposed method. We can see some black points inside the apple in Fig. 15(b). These are caused by the missing point problem, which we have not fixed yet. The distortions are mainly in the high frequency regions and the border regions, but they are tolerable for human vision because human eyes are more sensitive to distortion at lower frequencies.
In Fig. 15, our proposed method uses 165 bytes with an RMSE of 4.7286, whereas the JPEG standard costs 692 bytes with an RMSE of 2.1198. The data amount of our proposed method is about one fourth of that of the JPEG standard, while the images look similar.
If we use a smaller quantization step, we obtain an RMSE smaller than that of the JPEG standard; this costs more bytes, but still only two thirds of the data amount of the JPEG compression standard. Furthermore, if we compress the image using the JPEG compression standard with the same amount of data as our proposed method, it causes a severe block effect. So our proposed method can also solve the block effect.
Fig. 16: The fruit image reconstructed using the JPEG compression standard (233 bytes), RMSE = 4.2173.
Fig. 16 is an example of the fruit image compressed using the JPEG compression standard with a data amount of 233 bytes and an RMSE of 4.2173. It is obvious that the block effect becomes severe when fewer bytes are used to encode the image. In comparison, our proposed method in Fig. 15(b) costs only 165 bytes without the block effect.
Moreover, the processing time is much less than that of the arbitrary-shape DCT using the Gram-Schmidt method. Our proposed method takes only 4.930688 seconds, while the Gram-Schmidt method needs much more processing time.
In summary, compared to the conventional JPEG compression standard, our proposed method has the following advantages:
(a) A smaller amount of data.
(b) No block effect.
(c) The image is compressed according to its characteristics.
Compared to the arbitrary-shape DCT using the Gram-Schmidt method, our proposed method has the following advantages:
(a) Much less computation time.
(b) Energy concentration as good as that of the Gram-Schmidt method.
6. CONCLUSION
In this paper, we describe a way to generate the complete and orthonormal DCT basis in a trapezoid or a triangular region efficiently without using the Gram-Schmidt method. With the proposed method, the JPEG compression algorithm can become much more general, and we can divide an image into trapezoid or triangular blocks instead of $8 \times 8$ blocks. Moreover, our method can also be applied to the DFT basis, the Hadamard (Walsh) basis, or any other bases with even and odd symmetric relations. With the new segmentation and compression algorithm we proposed, the block effect problem in the JPEG compression standard can be resolved even when a smaller amount of data is used to compress the image. Moreover, the computation time of our proposed method is much lower than that of the arbitrary-shape DCT using the Gram-Schmidt method.
ACKNOWLEDGEMENT
The authors would like to thank the members of the NTU thesis defense committee for their constructive and valuable comments and suggestions. They also acknowledge the support of the project 98-2918-I-002-012 by the National Science Council of Taiwan.
REFERENCES
[1] R.C. Gonzalez, Digital image processing, 3rd ed., N.J.,
Prentice Hall, 2008.
[2] N. Ahmed, T. Natarajan, and K.R. Rao, “Discrete cosine transform,” IEEE Trans. Comput., vol. C-23, pp. 90-93, Jan. 1974.
[3] W.B. Pennebaker, JPEG Still Image Data Compression
Standard, New York, Van Nostrand Reinhold, 1993.
[4] I.E.G. Richardson, H.264 and MPEG-4 Video Compression, Chichester, Wiley, 2003.
[5] A.V. Oppenheim and R.W. Schafer, Discrete-Time
Signal Processing, 2nd ed., London, Prentice-Hall, 1999.
[6] S.S. Agaian, Hadamard Matrices and Their Applications, New York, Springer-Verlag, 1985.
[7] S.F. Chang and D.G. Messerschmitt, “Transform Coding of Arbitrarily-Shaped Image Segments,” Proceedings
of the first ACM international conference on Multimedia,
pp. 83-90, Aug. 1993.
[8] W.K. Ng and Z. Lin, “A New Shape-Adaptive DCT for
Coding of Arbitrarily Shaped Image Segments,” ICASSP,
vol. 4, pp. 2115-2118, 2000.
[9] M. Gilge, T. Engelhardt, and R. Mehlan, “Coding of
arbitrarily shaped image segments based on a generalized
orthonormal transform,” Signal Process: Image Commun., vol.1, pp. 153-180, Oct. 1989.
[10] T. Sikora, S. Bauer, and B. Makai, “Efficiency of
shape-adaptive 2-D transforms for coding of arbitrarily
shaped image segments,” IEEE Trans. Circuits Syst. Video Technol., vol. 5, pp.254-258, June 1995.
[11] A. Kaup and T. Aach, “Coding of Segmented Images
Using Shape-Independent Basis Functions,” IEEE Trans.
Image Proc., vol. 7, no. 7, pp. 937-947, July 1998.
[12] F. Davoine, M. Antonini, J. Chassery, and M. Barlaud,
“Fractal image compression based on Delaunay triangulation and vector quantization,” IEEE Trans. Image Proc.,
vol. 5, no. 2, pp. 338-346, 1996.
[13] T. Sikora, “Low complexity shape-adaptive DCT for
coding of arbitrarily shaped image segments,” Signal
Processing: Image Commun., vol. 7, pp. 381-395, Nov. 1995.