www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242
Volume 4 Issue 8 Aug 2015, Page No. 13696-13703
A 16-Core Processor with Shared-Memory and Message-Passing Communications
Shaik Mahmed Basha (2nd Year M.Tech VLSI, Department of ECE, Audisankara Institute Of Technology, Gudur)
G. Nageswararao, (Ph.D.) (Associate Professor, Department of ECE, Audisankara Institute Of Technology, Gudur)
Abstract—A 16-core processor with both message-passing and shared-memory inter-core communication mechanisms is implemented in 65 nm CMOS. Message-passing communication is enabled by a 3×6 mesh packet-switched network-on-chip, and shared-memory communication is supported by the shared memory within each cluster. The processor occupies 9.1 mm² and is fully functional at a clock rate of 750 MHz at 1.2 V, and at a maximum of 800 MHz at 1.3 V. Each core dissipates 34 mW under typical conditions at 750 MHz and 1.2 V while executing embedded applications such as an LDPC decoder, a 3780-point FFT module, an H.264 decoder and an LTE channel estimator.

Index Terms—Chip multiprocessor, cluster-based, FFT, H.264 decoder, inter-core communication, inter-core synchronization, LDPC decoder, LTE channel estimator, message-passing, multi-core, network-on-chip, NoC, shared-memory, SIMD.
I. INTRODUCTION
Power budgets of embedded processors are under greater pressure than ever, driven by the massive deployment of mobile computing devices. Advances in communication and multimedia applications further exacerbate the situation. Fortunately, chip multiprocessors have emerged as a promising solution, and many efforts have been made to increase parallelism and optimize the memory hierarchy concurrently, in order to meet stringent power budgets while still enhancing performance [7]–[9]. However, even when performance and power are rebalanced, multi-core architectures introduce new challenges in inter-core communication, which quickly becomes the key to further performance improvement.
In particular, the efficiency of inter-core communication has a direct impact on the performance and power metrics of embedded processors. When an application is mapped onto a multicore system, it is usually pipelined, and the throughput depends on both the computing capability of the cores and the communication efficiency between them. Although various performance-enhancing technologies exist, such as superscalar execution and Very Long Instruction Word (VLIW) architectures, mature solutions for inter-core communication are still lacking. Hence, the next stage of research should focus on improving the efficiency of inter-core communication.
In the history of microprocessors, shared-memory communication has been the most frequently used mechanism due to its simple programming model [3]–[5], [15]. However, it fails to provide sufficient scalability to cater to increasing core counts. Therefore, multicore processor designers have turned to the message-passing communication mechanism, which shows better scalability and greater potential for next-generation embedded multicore processors [6]–[9].
In this paper, we summarize the key features of shared-memory and message-passing communications. We show that different inter-core communication methods match different scenarios, which implies that higher performance and power efficiency can be obtained by integrating both inter-core communication mechanisms.
We propose a 16-core processor adopting a hybrid scheme with both shared-memory and message-passing inter-core communications. A 2D mesh Network-on-Chip (NoC) is adopted to support message-passing communication. Meanwhile, a cluster-based memory hierarchy including shared memory enables shared-memory communication. We also propose a hardware-aided mailbox inter-core synchronization method to support inter-core communication, and a new memory hierarchy to achieve higher energy efficiency. A prototype 16-core processor chip has been fabricated in a TSMC 65 nm Low Power (LP) CMOS process and is fully functional.
This paper is organized as follows. Section II describes the key features of the 16-core processor and related work. Section III details its design and implementation. Section IV presents measured results from the fabricated chip. Section V concludes the paper.
Fig. 1: Motivation: reduce the efficiency gap while maintaining flexibility.
II. MOTIVATION AND KEY FEATURES
The primary motivation of our work is to improve the performance and power efficiency of embedded multicore processors while still maintaining flexibility; in other words, to reduce the efficiency gap between multicore processors and ASICs, as shown in Fig. 1. Several key features are implemented, which are detailed in the following subsections.
A. Chip Multiprocessor and Inter-Core Communications
Inter-core communications are becoming increasingly important in the era of chip multiprocessors. As multicore became the mainstream architecture, processor designers began to place more significance on inter-core communications, since the overall performance of a multiprocessor relies heavily on them [1].
Embedded multicore processors face especially strong pressure to improve their inter-core communication efficiency. First, power and cost budgets limit the possibility of integrating processor cores with strong standalone computing capability, which forces designers to extract extra performance from inter-core cooperation. Second, embedded applications usually have stream-processing characteristics: the data stream flows through several processor cores until the results are obtained, which raises the challenge of massive data pass-through across different cores, so the throughput is highly dependent on the efficiency of inter-core communication. Finally, the core count of embedded multicore processors keeps growing, and larger core counts make efficient inter-core communication harder to achieve. Accordingly, inter-core communication efficiency plays an increasingly critical role in embedded multiprocessors.
B. Hybrid Inter-Core Communication Mechanism
Several inter-core communication schemes have been proposed in prior work. In traditional Symmetric Multi-Processing (SMP) processors, shared cache or memory units support shared-memory inter-core communications; typical examples are MRTP [3], Hydra [4], UltraSPARC T1 [5] and Cortex-A9 [15]. Despite its simple programming model, shared-memory communication faces several challenges that limit its large-scale use in future many-core processors. First, its low scalability does not allow many cores: even with only 8 cores, the interconnect consumes power equivalent to one core and occupies area equivalent to three cores [2]. Second, cache coherence issues are extremely complex, resulting in large hardware overhead and power consumption. Conversely, the message-passing communication mechanism has attracted considerable attention recently because of its better scalability; typical examples are RAW [6], TILE64 [7], the Intel 80-Tile processor [8] and AsAP [9]. These designs adopt a NoC as the channel linking a large number of cores, and it is convenient to add or remove cores under a given topology.
Table I
Comparison of Shared-Memory and Message-Passing Inter-Core Communications

Method          | Usage                     | Pro                | Con               | Medium              | Scenario
Shared-memory   | Large, unsplit data block | Simple programming | Lower scalability | Shared cache/memory | Computation data flow
Message-passing | Frequent, scattered data  | Better scalability | Uncertain channel | Network-on-chip     | Control data flow
However, for message-passing, the benefit of strong scalability is undermined by its complex programming model and the difficulty of guaranteeing QoS (Quality of Service).
In fact, shared-memory and message-passing inter-core communications are suitable for different scenarios, as Table I shows. For typical multicore embedded applications, data flows can be classified into two categories: computation data flow and control data flow. Most computation data flows are continuous blocks, which suit shared-memory communication, while control data flows are usually occasional, scattered data packets, which suit message-passing communication. Since different inter-core communication schemes fit different scenarios, better efficiency is possible by integrating shared-memory and message-passing communications; therefore, a hybrid on-chip inter-core communication scheme is proposed in this paper. The recently proposed 48-core IA-32 processor employs a 2D mesh NoC that supports message-passing communication, with a Message Passing Buffer (MPB) used to improve the performance of the message-passing programming model [30]. The pioneering 'Alewife' work from MIT [38] proposed integrating shared-memory and message-passing communication for multi-board supercomputing in the late 90s, and we believe the time has come to enable both mechanisms for on-chip multi-core communication with state-of-the-art NoC techniques.
C. Cluster-Based Memory Hierarchy
The memory hierarchy of a processor has an enormous impact on its overall performance and power consumption, especially in multicore systems. First, with increasing core counts, competition for memory resources among cores grows, resulting in longer access latency. Second, cache coherence issues become too complex to solve within limited hardware and power budgets. Finally, "Memory Wall" issues become more significant in multicore systems, limiting the growth in the number of cores [10]. Hence it is necessary to optimize the memory hierarchy for multicore processors. Although the traditional SMP hierarchy is widely used in many chip multiprocessors, such as POWER4 [31] and the Core i7-940 [16], it exhibits low efficiency for most embedded applications. Some designers have addressed this problem with cache-free architectures, such as the 167-processor computational platform [32] with a shallower memory hierarchy, and Imagine [33] with its limited application domains. Others have suggested partitioning the cache into layers that include private and shared parts to improve efficiency, such as Merrimac [34] and TRIPS [35].
Memory access operations consume a large proportion of the power budget, and the associated latency also reduces overall performance [11]. Thus, our primary goal in memory hierarchy optimization is to avoid frequent memory accesses and to increase data locality.
III. PROCESSOR WITH HYBRID COMMUNICATIONS
The proposed 16-core processor has a 3×6 2D mesh NoC that links sixteen processor cores (PCores) based on the MIPS 4KE and two memory cores (MCores). A hybrid inter-core communication scheme supports both shared-memory and message-passing communications: shared memory in each MCore enables shared-memory communication within a cluster, and the NoC enables message-passing among all PCores. A cluster-based architecture is employed, with two clusters implemented. Each cluster comprises eight PCores and one MCore, and the shared memory in the MCore can be accessed by the PCores in the same cluster. Data enters the processor through the input First In First Out (FIFO) buffer and exits through the output FIFO. An on-chip Voltage Controlled Oscillator (VCO) generates the system clock, complemented by static and dynamic clock-gating schemes. An external clock can be selected via a mux, and a test mode allows monitoring of internal operation flows. Fig. 2 depicts the architecture overview and key features of the proposed processor [12].
Each PCore includes a typical Reduced Instruction Set Computer (RISC) style processor core with a six-stage pipeline, a 2k-word instruction memory, a 1k-word private data memory, a router and interfaces for inter-core communication. Each MCore includes an 8k-word shared memory with four banks.
A. Design of Key Modules
1) Processor Core: Fig. 3 shows the architecture of the PCore. Two input FIFOs are implemented to support both core-to-core and core-to-memory communications. The mailbox is used for inter-core synchronization, which is detailed later. An arbitrator manages private and shared memory accesses. The processor core has the six-stage pipeline illustrated in Fig. 4. In the IFetch stage, instructions are fetched according to the PC (Program Counter). The Decode stage generates control signals and fetches operands from the register file or the input FIFO. Operations and address calculations are performed in the Execution stage by function blocks including the ALU (Arithmetic Logic Unit), Shifter and MDU (Multiplication Division Unit). The Memory stage performs data memory accesses; typically, a private data memory access takes one clock cycle, while a shared memory access requires 2 cycles when no contention occurs. In the Align stage, data is aligned before being written to the register file or the output FIFO in the Write Back stage.
Fig. 2: Architecture overview of the proposed 16-core processor.
Fig. 4: The six-stage pipeline of the processor core.
Fig. 5: Read and write datapath of the extended register file.
Fig. 3: Architecture overview of the PCore.
Fig. 6: Architecture overview of the MCore.
Fig. 7: Architecture of the proposed voltage controlled oscillator.
Data-Level Parallelism (DLP), configurability and simplicity are the three principles underlying the processor design. DLP is enhanced using a Single Instruction Multiple Data (SIMD) Instruction Set Architecture (ISA) supporting three data widths: 8 b, 16 b and 32 b, since the most commonly used data widths in the embedded domain are 8 b and 16 b. We reconstructed the datapath (ALU, shifter and MDU) with configurable data width. Three computing modes are provided to support the SIMD ISA: scalar-scalar, vector-vector and scalar-vector SIMD operations, which are detailed in our previous work [13].
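
To make the width-configurable datapath concrete, the following Verilog sketch shows a 32-bit adder that performs one 32-bit, two 16-bit or four 8-bit additions by gating the carry chain at the byte boundaries. This is a minimal sketch of the general technique under an assumed mode encoding, not the processor's actual datapath.

// Minimal sketch of a width-configurable SIMD adder (not the actual datapath).
// mode = 2'b00: four 8-bit lanes; 2'b01: two 16-bit lanes; 2'b10: one 32-bit add.
module simd_add (
    input  wire [31:0] a,
    input  wire [31:0] b,
    input  wire [1:0]  mode,
    output wire [31:0] sum
);
    // A carry crosses a byte boundary only when the lanes on both sides are fused.
    wire k8  = (mode != 2'b00);   // fuse at bit 8 in 16 b and 32 b modes
    wire k16 = (mode == 2'b10);   // fuse at bit 16 in 32 b mode only
    wire k24 = (mode != 2'b00);   // fuse at bit 24 in 16 b and 32 b modes

    wire [8:0] s0 = {1'b0, a[7:0]}   + {1'b0, b[7:0]};
    wire [8:0] s1 = {1'b0, a[15:8]}  + {1'b0, b[15:8]}  + (k8  ? s0[8] : 1'b0);
    wire [8:0] s2 = {1'b0, a[23:16]} + {1'b0, b[23:16]} + (k16 ? s1[8] : 1'b0);
    wire [8:0] s3 = {1'b0, a[31:24]} + {1'b0, b[31:24]} + (k24 ? s2[8] : 1'b0);

    assign sum = {s3[7:0], s2[7:0], s1[7:0], s0[7:0]};
endmodule

In 8 b mode the four byte lanes add independently, while in 32 b mode the carry ripples through every boundary; the same gating idea extends to the shifter and MDU.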
Increasing data locality is necessary to reduce power consumption. In the original MIPS 4KE processor, which is also our baseline, the register file has 32 words, limiting data locality. Hence, we extended the register file to 64 words. The processor benefits from the extended register file in three ways. First, more available registers provide more capacity to hold data used by SIMD instructions, improving performance. Second, data locality is enhanced with more registers, resulting in fewer memory accesses and lower power consumption. Third, the extended register file serves as the FIFO mapping ports. As Fig. 5 shows, the FIFO read and write ports are mapped to registers $24 and $25, respectively. A special instruction (regconfig) activates certain parts of the register file and the FIFO ports, as illustrated in Fig. 5, so FIFO ports can be accessed directly with ordinary register-operand instructions. Load/store instructions are therefore unnecessary here, reducing the instruction count. The PCore has no cache, which reduces chip area and avoids complex cache coherence issues; this cache-free design targets the low power required by most embedded applications.
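
As an illustration of register-mapped FIFO ports, the sketch below shows a register-file read port in which, once regconfig has enabled the mapping, reads of $24 return the input-FIFO head and pop it, stalling on an empty FIFO. The signal names are our own assumptions, not the actual RTL.

// Sketch of a read port where register $24 aliases the input-FIFO head
// (illustrative only; interface names are assumptions).
module rf_read_port (
    input  wire [5:0]  raddr,        // 64-entry extended register file
    input  wire        fifo_map_en,  // set by the regconfig instruction
    input  wire [31:0] rf_rdata,     // data read from the register array
    input  wire [31:0] fifo_rdata,   // head of the input FIFO
    input  wire        fifo_empty,
    output wire [31:0] rdata,
    output wire        fifo_pop,     // consume the head when $24 is read
    output wire        stall         // stall the pipeline on an empty FIFO
);
    wire sel_fifo = fifo_map_en && (raddr == 6'd24);

    assign rdata    = sel_fifo ? fifo_rdata : rf_rdata;
    assign fifo_pop = sel_fifo && !fifo_empty;
    assign stall    = sel_fifo &&  fifo_empty;
endmodule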
2) Memory Core: An MCore includes an 8k-word shared data memory partitioned into four banks, providing an aggregate bandwidth of 102.4 Gbps at 800 MHz, together with an input FIFO to receive data from the NoC, as illustrated in Fig. 6. All PCores in the same cluster can access the MCore via direct hardwired links with a fixed priority order, which simplifies the arbitration logic, shortens the critical path, and yields high performance at low cost. On the other hand, software programmers must take this into consideration to avoid live-lock. In theory, live-lock is a possible risk for lower-priority cores; in practice it is rarely observed. From the software perspective, we use the shared memory to transfer data between the different modules of an application mapped to different cores, and a processor stalls when its data in the shared memory is not ready. Thus, even the core with the highest priority will not keep occupying the shared memory, which allows lower-priority cores to gain access. The MCore access latency without contention is 2 cycles. However, since we adopt a fixed-priority shared memory access scheme, when several cores access the MCore simultaneously, cores with lower priority are stalled, causing a latency larger than 2 cycles whose exact value depends on the MCore access pattern. Latency increases dramatically with hardwire length, so we only implemented hardwires inside the cluster to avoid long-distance interconnects.
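
The fixed-priority arbitration described above can be realized with very little logic, which is the point of the scheme; a minimal sketch for one eight-requester cluster follows (our own illustration, with requester 0 standing for the top-left PCore).

// Sketch of an 8-way fixed-priority arbiter (illustrative, not the actual RTL).
// req[0] has the highest priority, req[7] the lowest.
module fixed_prio_arb (
    input  wire [7:0] req,    // one request per PCore in the cluster
    output reg  [7:0] grant   // one-hot grant to the shared-memory bank
);
    integer i;
    always @* begin
        grant = 8'b0;
        for (i = 7; i >= 0; i = i - 1)   // index 0 is evaluated last, so it wins
            if (req[i]) grant = 8'b1 << i;
    end
endmodule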
3) VCO & Clock-Gating: An on-chip VCO is integrated to generate the clock; its architecture is shown in Fig. 7. The VCO includes a typical saturated-type ring oscillator, a level shifter, a duty-cycle correction module and a frequency divider [14]. Test results show that the VCO can generate clocks ranging from 200 MHz to 1.6 GHz.
Two clock-gating schemes are implemented: static and dynamic. Fig. 8 shows the two clock-gating domains. The static clock-gating scheme turns off the clock of a PCore, excluding its router, so the clocks of unused PCores can be selectively shut down. Static clock-gating is configured manually via the clock-gating signal shown in Fig. 8. The dynamic scheme turns off the clocks of certain components, including the extended register file, the MDU, the private data memory and the shared memory banks, when they are idle. This process is autonomous and self-activated: a dynamic clock-gating management unit shuts down the clock of a module when it is idle. Test results show that dynamic clock gating reduces overall power consumption by 28.6% on average when running various applications under the same conditions. Notably, no performance overhead occurs, since static gating is configured at initialization while dynamic gating is controlled by standard clock-gating cells.
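
Dynamic gating of this kind is typically built from a latch-based clock-gating cell driven by an idle detector; the sketch below illustrates that structure with an assumed idle threshold of four cycles, and is not the chip's actual gating unit.

// Generic latch-based clock-gating cell plus a simple idle detector
// (illustrative of the dynamic scheme; threshold and names are assumptions).
module clk_gate (
    input  wire clk,
    input  wire enable,   // 1 = pass the clock through
    output wire gclk
);
    reg en_q;
    always @(clk or enable)
        if (!clk) en_q <= enable;   // latch while the clock is low: glitch-free
    assign gclk = clk & en_q;
endmodule

module idle_gate (
    input  wire clk, rst_n,
    input  wire access,   // the guarded module is accessed this cycle
    output wire gclk
);
    reg [2:0] idle_cnt;   // runs on the free clock, so a new access can rewake
    always @(posedge clk or negedge rst_n)
        if (!rst_n)                idle_cnt <= 3'd0;
        else if (access)           idle_cnt <= 3'd0;
        else if (idle_cnt != 3'd4) idle_cnt <= idle_cnt + 3'd1;

    clk_gate u_cg (.clk(clk), .enable(access || idle_cnt != 3'd4), .gclk(gclk));
endmodule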
Fig. 8: The static and dynamic clock-gating domains in PCore: (a) PCore static clock-gating; (b) PCore dynamic clock-gating.
B. Design of Hybrid Inter-Core Communications
A hybrid inter-core communication mechanism is employed to cater to different communication scenarios, integrating both shared-memory and message-passing schemes. Fig. 9 illustrates the implementation of the hybrid inter-core communications. The shared data memory in the MCore supports shared-memory communication inside the cluster, which is ideal for transferring large blocks of data. Meanwhile, the 2D mesh NoC supports message-passing communication, which is suitable for frequent, scattered data packets. Moreover, message-passing is more extensible than shared-memory, so we scale the processor by multiplying the number of clusters while keeping the number of cores within one cluster fixed.
Fig. 9: Implementation of the hybrid inter-core communications: (a) shared-memory via MCore; (b) message-passing via NoC.
Adding cores within one cluster would increase the shared memory access latency and complicate the shared memory access arbitration, adding hardware cost and access latency, so it is not practical.
1) Shared-Memory Communication: Two features distinguish the proposed shared-memory communication scheme from previous work [15]–[17]. First, only the eight PCores in the same cluster can access the shared memory, with a fixed access priority order for hardware simplicity; i.e., the PCore in the top-left corner has the highest priority, and the PCore in the bottom-right corner has the lowest priority. The reasons for adopting this scheme, and the software mechanism to avoid live-lock, are detailed in the previous section. Second, a hardware-aided mailbox mechanism is used to achieve high inter-core synchronization efficiency.
Typically, three steps complete a shared-memory communication. First, the source PCore stores data into the shared memory. Next, it sends a synchronization signal to the destination PCore, indicating that the data is ready. Third, the destination PCore loads the data after the synchronization signal is confirmed. Fig. 10 illustrates these three steps.
2) Message-Passing Communication: Although shared-memory communication is only allowed within a cluster, message-passing enables all of the PCores to communicate with each other. We implement a 3×6 2D mesh NoC to support message-passing communication, adopting an XY dimension-ordered, deadlock-free wormhole routing algorithm [18].
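
In XY dimension-ordered routing, a packet travels along the X dimension until its column matches the destination and only then turns into the Y dimension, which removes the cyclic dependencies that cause deadlock. The sketch below shows the per-router output-port decision, with a coordinate encoding we assume for a 3×6 mesh.

// Sketch of the XY dimension-ordered route computation in one router
// (illustrative; port encoding and coordinate widths are assumptions).
module xy_route (
    input  wire [2:0] x_here, x_dest,   // column, 0..5 in a 3x6 mesh
    input  wire [1:0] y_here, y_dest,   // row, 0..2
    output reg  [2:0] out_port          // 0:E 1:W 2:N 3:S 4:local
);
    always @* begin
        if      (x_dest > x_here) out_port = 3'd0;  // resolve X first: go East
        else if (x_dest < x_here) out_port = 3'd1;  // go West
        else if (y_dest > y_here) out_port = 3'd2;  // X done: go North
        else if (y_dest < y_here) out_port = 3'd3;  // go South
        else                      out_port = 3'd4;  // arrived: eject locally
    end
endmodule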
Even with better scalability and adaptability for frequent, scattered data transfers, the efficiency of message-passing communication is limited by two bottlenecks. The first is uncertainty in the communication channel: under heavy traffic load, the NoC blocks data packets and increases latency. Fortunately, with the aid of shared-memory communication within each cluster, the traffic load on the NoC is reduced significantly. The second bottleneck lies in the data transfer between the processor core and the router. Usually, FIFO ports are mapped into the data memory address space [19] and accessed with load/store instructions; the extra operations needed to calculate addresses for memory access result in extra power consumption.
We propose two solutions to overcome the second bottleneck. First, the destination of a packet can be either the processor core or the private data memory: the PCore has two input FIFOs, one for the processor core and the other for the private data memory, and the first bit of the packet head determines the destination, as shown in Fig. 11. Second, two FIFO port mapping schemes are provided. One is the traditional method that maps the FIFO ports into the data memory address space, using load/store instructions to access the FIFO. The other maps the FIFO ports into the extended register file address space, so register-operand instructions access the FIFO directly. The latter reduces the number of communication instructions by 50% by eliminating redundant load/store operations (e.g., two instructions, "lw $24, 0($3)" and "add $1, $2, $24", are needed when mapping to memory space, while only "add $1, $2, $24" is needed when mapping to the register file space).
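
The head-bit steering in the first solution amounts to a one-bit demultiplexer in front of the two input FIFOs, with the choice latched for the rest of the packet; a sketch under an assumed flit format follows.

// Sketch of steering incoming flits to one of the two input FIFOs using the
// first bit of the packet head (flit framing is an assumption).
module dest_demux (
    input  wire        clk, rst_n,
    input  wire        flit_valid,
    input  wire        is_head,   // asserted on the head flit of a packet
    input  wire [31:0] flit,      // flit[31] of the head selects the target
    output reg         to_core,   // write enable of the processor-core FIFO
    output reg         to_dmem    // write enable of the private-memory FIFO
);
    reg sel_dmem;   // latched head bit so body flits follow the same path
    always @(posedge clk or negedge rst_n)
        if (!rst_n)                     sel_dmem <= 1'b0;
        else if (flit_valid && is_head) sel_dmem <= flit[31];

    always @* begin
        to_core = 1'b0;
        to_dmem = 1'b0;
        if (flit_valid) begin
            if (is_head ? flit[31] : sel_dmem) to_dmem = 1'b1;
            else                               to_core = 1'b1;
        end
    end
endmodule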
Fig. 10: Three steps in a typical shared-memory communication: (1) the source PCore stores data to the shared memory in the MCore; (2) the source PCore sends a synchronization signal to the destination PCore; (3) the destination PCore loads the data from the shared memory when the synchronization signal is received.
Fig. 11: Datapath of the message-passing communication in PCore.
IV. SIMULATION RESULTS
The proposed design is described in the Verilog HDL and simulated in the Xilinx ISE simulator; the simulation results are shown in Fig. 12.
Message-passing communications are supported by the 3×6 2D mesh NoC, and shared-memory communications are supported by the shared memory units in the memory cores. The proposed cluster-based memory hierarchy makes the processor well-suited to most embedded applications. The chip has 256 KB of on-chip memory in total: each processor core has an 8 KB instruction memory and a 4 KB private data memory, and each memory core has a 32 KB shared memory. The processor is fabricated in TSMC 65 nm LP CMOS with a chip area of 9.1 mm², while each core occupies 0.43 mm². Typically, each processor core runs at 750 MHz at 1.2 V while dissipating 34 mW, with an energy efficiency of 45 pJ/Op for 32-bit operations and 22 pJ/Op for 16-bit operations.
Fig. 12: Simulation results of the proposed design.
V. CONCLUSIONS
A 16-core processor for embedded applications with hybrid inter-core communications is proposed in this paper. The processor has 16 processor cores and 2 memory cores.
REFERENCES
[1] G. Blake, R. G. Dreslinski, and T. Mudge, "A survey of multicore processors: A review of their common attributes," IEEE Signal Process. Mag., pp. 26–37, Nov. 2009.
[2] R. Kumar, V. Zyuban, and D. Tullsen, "Interconnections in multi-core architecture: Understanding mechanisms, overheads and scaling," in Proc. 32nd Int. Symp. Computer Architecture (ISCA'05), 2005, pp. 408–419.
[3] H.-Y. Kim, Y.-J. Kim, J.-H. Oh, and L.-S. Kim, "A reconfigurable SIMT processor for mobile ray tracing with contention reduction in shared memory," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 60, no. 4, pp. 938–950, Apr. 2013.
[4] L. Hammond, B. A. Hubbert, M. Siu, M. K. Prabhu, M. Chen, and K. Olukotun, "The Stanford Hydra CMP," IEEE Micro, vol. 20, no. 2, pp. 71–84, 2000.
[5] A. S. Leon, B. Langley, and J. L. Shin, "The UltraSPARC T1 processor: CMT reliability," in Proc. Custom Integrated Circuits Conf. (CICC'06) Dig. Tech. Papers, 2006, pp. 555–562.
[6] M. B. Taylor, J. Kim, J. Miller, D. Wentzlaff, F. Ghodrat, B. Greenwald, H. Hoffman, P. Johnson, J.-W. Lee, W. Lee, A. Ma, A. Saraf, M. Seneski, N. Shnidman, V. Strumpen, M. Frank, S. Amarasinghe, and A. Agarwal, "The Raw microprocessor: A computational fabric for software circuits and general-purpose programs," IEEE Micro, vol. 22, no. 2, pp. 25–35, Mar./Apr. 2002.
[7] Tilera Corp., "TILEPro64 Processor," Tilera Product Brief, 2008 [Online]. Available: http://www.tilera.com/pdf/ProductBrief_TILEPro64_Web_v2.pdf
[8] S. R. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, A. Singh, T. Jacob, S. Jain, V. Erraguntla, C. Roberts, Y. Hoskote, N. Borkar, and S. Borkar, "An 80-tile sub-100-W teraFLOPS processor in 65-nm CMOS," IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 29–41, Jan. 2008.
[9] Z. Yu, M. J. Meeuwsen, R. W. Apperson, O. Sattari, M. Lai, J. W. Webb, E. W. Work, D. Truong, T. Mohsenin, and B. M. Baas, "AsAP: An asynchronous array of simple processors," IEEE J. Solid-State Circuits, vol. 43, no. 3, pp. 695–705, Mar. 2008.
[10] B. Rogers, A. Krishna, G. Bell, and K. Vu, "Scaling the bandwidth wall: Challenges and avenues for CMP scaling," in Proc. ACM Int. Symp. Computer Architecture (ISCA'09), 2009, pp. 371–382.
AUTHORS

¹ Shaik Mahmed Basha received his B.Tech degree in Electronics and Communication Engineering from Siddhartha Institute Of Science & Technology, Puttur, Chittoor (Dist.), affiliated to JNTU Anantapur. He is currently pursuing M.Tech VLSI at Audisankara Institute Of Technology, Gudur (Autonomous), Nellore (Dist.), affiliated to JNTU Anantapur.

² G. Nageswararao is pursuing a Ph.D. in wireless communications at Nagarjuna University, Guntur. He received his M.Tech in VLSI from Samuel Institute Of Engineering & Technology, Markapur, Prakasam (Dist.). He has 16 years of teaching experience. He is presently working as Associate Professor in the Department of ECE, Audisankara Institute Of Technology, Gudur, affiliated to JNTU Anantapur.