Publications (cont'd) 2012

advertisement
Overview of Center for Domain-Specific Computing
Supported by NSF “Expedition in Computing” Program
www.cdsc.ucla.edu
CDSC Retreat, April 15 – 17, 2012
Jason Cong
CDSC Director
cong@cs.ucla.edu
Focus: New Transformative Approach to Power/Energy
Efficient Computing
♦
♦
Current solution: Parallelization
Next significant opportunity – Customization
Parallelization
Customization
Adapt the architecture to
application domain
Source: Shekhar Borkar, Intel
2
Justification 1 – Potential of Customization
AES 128bit key
128bit data
Throughput
Power
Figure of Merit
(Gb/s/W)
0.18mm CMOS
3.84 Gbits/sec
350 mW
11 (1/1)
1.32 Gbit/sec
490 mW
2.7 (1/4)
ASM StrongARM [2]
31 Mbit/sec
240 mW
0.13 (1/85)
ASM Pentium III [3]
648 Mbits/sec
41.4 W
0.015 (1/800)
C Emb. Sparc [4]
133 Kbits/sec
120 mW
0.0011 (1/10,000)
Java [5] Emb. Sparc
450 bits/sec
120 mW
0.0000037 (1/3,000,000)
FPGA [1]
[1] Amphion CS5230 on Virtex2 + Xilinx Virtex2 Power Estimator
[2] Dag Arne Osvik: 544 cycles AES – ECB on StrongArm SA-1110
[3] Helger Lipmaa PIII assembly handcoded + Intel Pentium III (1.13 GHz) Datasheet
[4] gcc, 1 mW/MHz @ 120 Mhz Sparc – assumes 0.25 u CMOS
[5] Java on KVM (Sun J2ME, non-JIT) on 1 mW/MHz @ 120 MHz Sparc – assumes 0.25 u CMOS
Source: P Schaumont and I Verbauwhede, "Domain specific
codesign for embedded security," IEEE Computer 36(4), 2003
3
Justification 2 – Advance of Civilization
♦
For human brain, Moore’s Law scaling has long stopped
 The number neurons and their firing speed did not change significantly
♦
Remarkable advancement of civilization via specialization
♦
More advanced societies have higher degree of specialization
4
Project Goals
♦
A general, customizable platform for the given domain(s)
 Can be customized to a wide-range of applications in the domain
 Can be massively produced with cost efficiency
 Can be programmed efficiently with novel compilation and runtime systems
♦
Metric of success
 A “supercomputer-in-a-box” with +100x performance/power improvement via
customization for the intended domain(s)
5
Chosen Application Domain: Healthcare
♦
Medical imaging has transformed healthcare



An in vivo method for understanding disease
development and patient condition
Estimated to be $100 billion/year
More powerful & efficient computation can help
• Fewer exposures using compressive sensing
• Better clinical assessment (e.g., for cancer) using
improved registration and segmentation algorithms
♦
Hemodynamic simulation

♦
Very useful for surgical procedures involving
blood flow and vasculature
Magnetic resonance (MR) angiograph of an aneurysm
Both may take hours to days to construct


Clinical requirement: 1-2 minutes
Cloud computing won’t work
• Communication, real-time requirement, privacy

A megawatt-datacenter for each hospital?
Intracranial aneurysm reconstruction with hemodynamics
6
Overview of CDSC Research Program
Customizable Heterogeneous Platform (CHP)
$
$
$
$
DRAM
I/O
CHP
Fixed
Core
Fixed
Core
Fixed
Core
Fixed
Core
DRAM
CHP
CHP
Custom
Core
Custom
Core
Custom
Core
Custom
Core
Prog
Fabric
Prog
Fabric
accelerator
accelerator
Domain-specific-modeling
(healthcare applications)
Reconfigurable RF-I bus
Reconfigurable optical bus
Transceiver/receiver
Optical interface
Architecture
modeling
CHP creation
Customizable computing engines
Customizable interconnects
Design once
Customization
setting
CHP mapping
Source-to-source CHP mapper
Reconfiguring & optimizing backend
Adaptive runtime
Invoke many times
7
Center for Domain-Specific Computing (CDSC)
Organization
A diversified & highly accomplished team: 8 in CS&E; 1 in EE; 3 in medical school; 1 in applied math
Aberle
Baraniuk
Bui
Chang
Chien
UCLA
Rice
Domain-specific modeling
Bui, Reinman, Potkonjak
Sarkar, Baraniuk
CHP creation
Chang, Cong, Reinman
CHP mapping
Cong, Palsberg, Potkonjak
Sarkar
Application modeling
Aberle, Bui, Chien, Vese
Baraniuk
Experimental systems
All (led by Cong & Bui)
All
Palsberg
Potkonjak
Reinman
Cheng
UCSB
Cong (Director)
Ohio State
Sadayappan
Cheng
Sadayappan
Cheng
Sadayappan
All
All
Sarkar
(Associate Dir)
Vese
8
Application Thrust: Highlights
♦ Compressive sensing

EM+TV algorithm for computed
tomography (CT) reconstruction
• Image quality is maintained or improved with
~20% of conventional sampling (radiation)
• Preliminary CPU/GPU/FPGA implementations,
with up to 35x speedup over sequential version
• Initiated clinical evaluation with Siemens Medical
Corp. to utilize real-world data for comparison
 Preliminary work on magnetic resonance (MR)
♦ Medical image processing pipeline

Library of algorithms across all stages of
the pipeline have been implemented
• Denoising/deblurring, registration, segmentation
• Threaded, CnC, GPU, and FPGA
implementations available for several algorithms,
running on experimental platform
• Tumor volumetric analysis application prototyped
 Testbed for further CDSC development,
evaluation on runtime requirements
9
Domain-Specific Modeling Overview
Create
executable
models
(DSCG + step code)
Producer-Consumer edges
(data dependence)
(step 1)
[item]
Parent-Child edges
(control dependence)
(step 2)
(step 1)
<t2>
(step 2)
Static Analysis
Abstract execution
Code generation
Static characteristics
Dynamic characteristics
Mapping tool chain
data types, custom instruction patterns,
accelerator opportunities, vector parallelism, task
parallelism, data access patterns, …
Auto generation of
stub code for CPUs,
GPUs, FPGAs, with
complete step code
generation in some
cases
10
Domain-Specific Modeling: Highlights
♦
♦
♦
♦
Static extraction of custom scalar and vector instructions
Dynamic working set characterization using fast binary
instrumentation
Instruction mix characterization
Domain-specific coordination graph (DSCG)
 Development of Habanero-Java implementation of Intel Concurrent Collections
(CnC) model, and CnC-Babel extension for abstract execution of steps written in
multiple languages
♦
Domain-specific language extensions (DSLEs)
 DSLEs for Stencil Computations
 Characterization of Matlab operations as another DSLE
♦
Dynamic characterization of vector parallelism
11
Customizable Heterogeneous Platform (CHP) Creation
Cache parameters
Cache size & configuration
Cache vs. SPM
…
NoC parameters
 Interconnect topology
 # of virtual channels
 Routing policy
 Link bandwidth
 Router pipeline depth
 Number of RF-I enabled
routers
 RF-I channel and
bandwidth allocation
…
Customizable Heterogeneous Platform (CHP)
Core parameters
$
$
$
$
Frequency & voltage
Datapath bit width
Instruction window size
Issue width
Cache size & configuration
Register file organization
# of thread contexts
…
Fixed
Core
Fixed
Core
Fixed
Core
Fixed
Core
Custom
Core
Custom
Core
Custom
Core
Custom
Core
Custom instructions & accelerators
Prog
Fabric
Prog
Fabric
accelerator
accelerator
Shared vs. private accelerators
Choice of accelerators
Custom instruction selection
Amount of programmable fabric
…
Reconfigurable RF-I bus
Reconfigurable optical bus
Transceiver/receiver
Optical interface
Key questions: Optimal trade-off between efficiency & customizability
Which options to fix at CHP creation? Which to be set by CHP mapper?
12
CHP Creation: Highlights
♦
Hardware accelerator management
 “sea-of-accelerator” architecture
 Accelerators themselves provide an average 168x improvement in performance
(241x in energy) over pure general purpose core implementation
 Hardware management provides an average 51x improvement in performance (17x
in energy) over OS management of accelerators
 Dynamic composition of accelerators from simpler building blocks provides a
further average 2.1x improvement in performance (2.9x in energy)
♦
Adaptive hybrid cache
 Combination of traditional cache and scratchpad memory with dynamically adaptive
distribution of space between the two
 Dynamically adapt size to match application demand
 18% - 33% reduction in energy-runtime product
13
CHP Mapping Overview
Goal: Efficient mapping of domain-specific specification to customizable hardware
Adapt the CHP to a given application so as to optimize performance/power efficiency
Domain-specific applications
Abstract
execution
Application
characteristics
CHP architecture
models
Programmer
Domain-specific programming model
(Domain-specific coordination graph and domain-specific language extensions)
Source-to source CHP Mapper (Rose)
C/C++ code
ROSE SAGE IR
C/C++ front-end
ROSE  LLVM
translator
Reconfiguring and optimizing back-end (LLVM)
Binary code for fixed & customized
cores
GPU kernel
C/SystemC behavioral
spec
CUDA/OpenCL compiler
RTL Synthesizer
(AutoPilot/xPilot)
GPU code
RTL for prog fabric
Performance
feedback
Unified Adaptive Runtime system
(maps tasks across CPU, GPU, and FPGA processor)
CHP architectural prototypes
(CHP hardware testbeds, CHP simulation testbed,
full CHP)
14
CHP Mapping: Highlights
♦
Unified adaptive runtime system


♦
High-level source-to-source mapper


♦
Vectorization: New dynamic programming algorithm for SIMD instruction selection
Matlab-to-HC parallelization and code generation


♦
Habanero-C compiler: Code generation for finish, async, and place constructs
Data layout transformation for stencil computations on SIMD processors
Optimizing back-end

♦
CnC-HC runtime: Data-driven scheduling of CnC steps as Habanero-C (HC) tasks
HC runtime: Heterogeneous work-stealing scheduler for CPUs, GPUs, FPGAs
Type inference removes runtime checks in typical medical imaging Matlab kernels
Spatial parallelization reduces number of intermediate matrices
Leveraging error significance for energy reduction

Assuring application-level correctness against soft errors
15
Experimental Platform Thrust: Highlights
Server-class platform
CPU, GPU, FPGA, etc.
(e.g., Convey HC-1)
High performance; good energy efficiency
Diamondville
Harpertown
C1
C2
C3
C4
C5
L1
cache
L1
cache
L1
cache
L1
cache
L1
cache
L1
cache
L2 cache
L2 cache
FSB
Mobile platform
SOC, etc.
(e.g., Tegra 2)
Low power, good energy efficiency
Mobile components in
server platforms
C0
L2 cache
Server/client model
Computation vs. display
FSB
Memory control
Memory
QuickIA Platform courtesy of Intel
RF Interconnects
High bandwidth, programmable interface
16
Interaction Between the Thrusts
Experimental
Systems
Domain
Specific
Modeling
Application
Driver
CHP
Creation
CHP
Mapping
17
Some Key Statistics of CDSC
♦
People



♦
Publications: 159




♦
♦
2009: 1
2010: 33
2011: 75
2012: 50
Keynote/invited talks: 56




♦
Faculty: 13 (UCLA – 9; Rice – 2; Ohio-State – 1; UC Santa Barbara – 1 )
Graduate students: 42
Postdocs, research scientists, associate faculty: 13
2009: 4
2010: 20
2011: 23
2012: 9
New courses: 11
PhD students graduated from CDSC: 11
18
Supports from Industrial Partners
♦
Very grateful to the financial supports from the
campuses at various levels




Broadcom
Intel
Mentor Graphics
Xilinx
19
CDSC Faculty Awards
♦
2010

Professor Jason Cong
• IEEE Circuits and System (CAS) Society Technical Achievement Award

Professor Richard Baraniuk
• IEEE Signal Processing Society Education Award
♦
2011

Professor Jason Cong
• ACM/IEEE A. Richard Newton Technical Impact Award in Electronic Design Automation
• Best Paper Award, IEEE Symposium on Field-Programmable Custom Computing Machines

Professor Tim Cheng
• Best Student Paper Award at 2011 IEEE International Test Conference (ITC)
♦
2012

Professor Jens Palsberg
• ACM SIGPLAN Distinguished Service Award

Professor Vivek Sarkar
• ACM SIGPLAN Programming Languages Software Award

Professor Miodrag Potkonjak
• Outstanding Masters Graduate Award for Professor Miodrag Potkonjak’s PhD student Saro Meguerdichian

Professor Jason Cong
• 2012 ACM Transactions on Design Automation of Electronic Systems (TODAES) Best Paper Award
20
Publications
2009 (Total 1)
1.
Yonghong Yan, Jisheng Zhao, Yi Guo, and Vivek Sarkar. "Hierarchical Place Trees: A Portable Abstraction for Task
Parallelism and Data Movement", Proceedings of the 22nd Workshop on Languages and Compilers for Parallel
Computing (LCPC), October 2009.
2010 (Total 33)
2.
3.
4.
5.
6.
7.
8.
9.
Jonathan K. Lee, Jens Palsberg, “Featherweight X10: a core calculus for async-finish parallelism”, Principles and
Practice of Parallel Programming (PPoPP) Conference, January 2010.
V. Sarkar, W. Harrod, A.E. Snavely. Software Challenges in Extreme Scale Systems. SciDAC Review Special Issue
on Advanced Computing: The Roadmap to Exascale, pp. 60-65, January 2010.
Baraniuk RG, Cevher V, Duarte MA, Hegde C. "Model-based compressive sensing", IEEE Trans Information Theory,
April 2010.
Yi Guo, Jisheng Zhao, Vincent Cavé, and Vivek Sarkar. SLAW: A Scalable Locality-aware Adaptive Work-stealing
Scheduler. 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2010.
S. Shamshiri and K.-T. Cheng, "Modeling yield, cost, and quality of an NoC with uniformly and non-uniformly
distributed redundancy", IEEE VLSI Test Symposium (VTS), April 2010.
Jun Shirako and Vivek Sarkar. "Hierarchical Phasers for Scalable Synchronization and Reduction", 24th IEEE
International Parallel and Distributed Processing Symposium (IPDPS), April 2010.
Jason Cong, Yi Zou, "A Comparative Study on the Architecture Templates for Dynamic Nested Loops", in Proc.
18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, pp.251-254, May
2010.
K.-T. Cheng, and S. Shamshiri, "Yield, Quality, and Cost of a Reliable NoC", 2010 DAC Workshop on Diagnostic
Services in Network-on-Chips, June 2010.
21
Publications (cont’d)
10. Jason Cong, Chunyue Liu and Glenn Reinman. "ACES: Application-Specific Cycle Elimination and Splitting for
Deadlock-Free Routing on Irregular Network-on-Chip", In Proceedings of Design Automation Conference (DAC),
Anaheim, CA, pp. 443-448, June 2010.
11. Jason Cong, Yi Zou, "A Comparative Study on the Architecture Templates for Dynamic Nested Loops", in Proc.
18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, pp.251-254, May
2010.
12. Yan M. "Convergence analysis of SART by Bregman iteration and dual gradient descent", UCLA Center for Applied
Math (CAM) Report 10-27, June 2010.
13. Zoran Budimlic, Michael Burke, Vincent Cave, Kathleen Knobe, Geoff Lowney, Ryan Newton, Jens Palsberg, David
Peixotto, Vivek Sarkar, Frank Schlimbach, Sagnak Tasirlar. “Concurrent Collections”, Scientific Programming,
18:3–4, pp. 203–217, August 2010.
14. Jisheng Zhao, Jun Shirako, V. Krishna Nandivada, and Vivek Sarkar. "Reducing Task Creation and Termination
Overhead in Explicitly Parallel Programs", The Nineteenth International Conference on Parallel Architectures and
Compilation Techniques (PACT), September 2010.
15. Yi-Chu Wang, Bryan Donyanavard, Kwang-Ting Cheng, "Energy-Aware Real-time Face Recognition System on
Mobile CPU-GPU Platform", International Workshop on Computer Vision on GPU 2010 (CVGPU 2010), September
2010.
16. C. Sankaranarayanan, P. K. Turaga, R. G. Baraniuk, and R. Chellappa, "Compressive Acquisition of Dynamic
Scenes", European Conference on Computer Vision, Heraklion, Crete, Greece, September 2010.
17. Jung M, Resmerita E, Vese LA. "Dual norm-based iterative methods for image restoration", UCLA Center for
Applied Math (CAM) Report 09-88; accepted for conference proceedings presentation and publication, ECCV,
September 2010.
22
Publications (cont’d)
18. Vincent Cavé, Zoran Budimlic, Vivek Sarkar, "Comparing the Usability of Library vs. Language Approaches to Task
Parallelism", Workshop on Evaluation and Usability of Programming Languages and Tools (PLATEAU), co-located
with SPLASH 2010, October 2010.
19. Zoran Budimlic, Alex Bui, Jason Cong, Glenn Reinman, Vivek Sarkar, "Modeling and Mapping for Customizable
Domain-Specific Computing", Workshop on Concurrency for the Application Programmer (CAP), co-located with
SPLASH 2010, October 2010.
20. Max Grossman, Alina Simion, Zoran Budimlic, Vivek Sarkar, "CnC-CUDA: Declarative Programming for GPU’s",
2010 Workshop on Languages and Compilers for Parallel Computing (LCPC), October 2010.
21. H. Noshadi, F. Dabiri, M. Meguerdichian, M. Potkonjak, M. Sarrafzadeh, "Energy Optimization in Wireless Medical
Systems Using Physiological Behavior", ACM/BMES Wireless Health, pp. 1-8, October 2010, San Diego, CA.
22. Yi-Chu Wang, Sydney Pang, Kwang-Ting Cheng, "A GPU-Accelerated Face Annotation System for Smartphones",
ACM International Conference on Multimedia Technical demonstrations, October 2010.
23. M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion", IEEE Transactions
on Image Processing, vol. 19, no. 10, pp. 2580-2594, Oct., 2010.
24. Raghavan Raman, Jisheng Zhao, Vivek Sarkar, Martin Vechev, Eran Yahav, "Efficient Data Race Detection for
Async-Finish Parallelism", Proceedings of the 1st International Conference on Runtime Verification (RV ’10),
November 2010. Recipient of Best Paper Award.
25. M. Potkonjak, S. Meguerdichian, J.L. Wong, "Trusted Sensors and Remote Sensing", IEEE Sensors, pp. 1-4,
November 2010, Honolulu, HW.
23
Publications (cont’d)
26. Vahdatpour, M. Potkonjak, S. Meguerdichian, "A Gate Level Sensor Network for Integrated Circuits Temperature
Monitoring", IEEE Sensors, pp. 1-4, November 2010, Honolulu, HW.
27. S. Meguerdichian, H. Noshadi, F. Dabiri, M. Potkonjak, "Semantic Multimodal Compression for Wearable Sensing
Systems", IEEE Sensors, pp. 1-4, November 2010, Honolulu. HW.
28. Saeed Shamshiri and Kwang-Ting Cheng, "Error-Locality-Aware Linear Coding to Correct Multi-bit Upsets in
SRAMs", IEEE International Test Conference (ITC), November 2010.
29. S. Wei, M. Potkonjak, "Scalable Segmentation-Based Malicious Circuitry Detection and Diagnosis", International
Conference on Computer Aided Design, pp. 1-4, November 2010, Santa Clara, CA.
30. Rajkishore Barik, Jisheng Zhao, Vivek Sarkar, "Efficient Selection of Vector Instructions using Dynamic
Programming", MICRO-43, December 2010.
31. S. Shamshiri and K.-T. Cheng, "Error-Locality-Aware Linear Coding to Correct Multi-bit Upsets in SRAMs", to
appear in 2010 International Test Conference (ITC), October 2010.
32. Stephen Kou, Jens Palsberg, “From OO to FPGA: fitting round objects into square hardware?”, Object Oriented
Programming Systems and Applications Conference (OOPSLA), October 2010.
33. Cevher V, Wakin MB, Baraniuk RG. "Low-dimensional models for dimensionality reduction and signal recovery: A
geometric perspective" Accepted, Proc IEEE (special issue on sparsity in signal processing), 2010.
34. Davenport MA, Boufounos PT, Wakin MB, Baraniuk RG. "Signal processing with compressive measurements",
Accepted, IEEE J Selected Applications in Signal Processing (special issue on applications of compressive
sensing), 2010.
24
Publications (cont’d)
2011 (Total 42)
35.
36.
37.
38.
39.
40.
41.
42.
43.
44.
Jason Cong, Mohammad Ali Ghodrat, Michael Gill, Chunyue Liu, Glenn Reinman, Yi Zou, "AXR-CMP: Architecture Support in
Accelerator-Rich CMPs", 2nd Workshop on SoC Architecture, Accelerators and Workloads (SAW-2), San Antonio, TX, February
2011.
Mohsen Lesani and Jens Palsberg, "Communicating memory transactions", Proceedings of PPOPP'11, 15th ACM SIGPLAN
Annual Symposium on Principles and Practice of Parallel Programming, San Antonio, Texas, February 2011.
Gyung-Su Byun, Yanghyo Kim, Jongsun Kim, Sai-Wang Tam, Jason Cong, Glenn Reinman, M-C. F Chang, "An 8.4Gb/s 2.5pJ/b
Mobile Memory I/O Interface Using Bi-directional and Simultaneous Dual (Base+RF)-Band Signaling", IEEE International SolidState Circuits Conference (ISSCC 2011), Session, 28, Feb. 2011, San Francisco, California.
R. G. Baraniuk, "More is Less? Signal Processing and the Data Deluge," Science, Vol. 331 No. 6018 pp. 717-719, 11 February
2011.
Zoran Budimlic´, Michael Burke, Kathleen Knobe, Ryan Newton, David Peixotto, Vivek Sarkar, Edwin Westbrook. "Deterministic
Reductions in an Asynchronous Parallel Language", The 2nd Workshop on Determinism and Correctness in Parallel
Programming (WoDet), March 2011.
Tom Henretty, Kevin Stock, Louis-Noel Pouchet, Franz Franchetti, J. Ramanujam and P. Sadayappan, "Data Layout
Transformation for Stencil Computations on Short SIMD Architectures", ETAPS International Conference on Compiler
Construction (CC'11), Saarbrucken, Germany, Macrh 2011.
J. Cong, V. Sarkar, G. Reinman and A. Bui, "Customizable Domain-Specific Computing", IEEE Design and Test of Computers,
Volume 28, Issue 2, pp. 5-15, March/April 2011.
C. Hegde and R. G. Baraniuk, "Compressive Sampling of Pulse Streams," IEEE Transactions on Signal Processing, Vol. 59, No.
4, April 2011.
J. Cong, B. Liu, S. Neuendorffer, J. Noguera, K. Vissers and Z. Zhang, "High-Level Synthesis for FPGAs: From Prototyping to
Deployment, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 30, Number 4, pp. 473491, April 2011. (Keynote paper)
Kwang-Ting Cheng and Yi-Chu Wang, "Using Mobile GPU for General-Purpose Computing - A Case Study of Face Recognition
on Smartphones, "International Symposium on VLSI Design, Automation and Test (VLSI-DAT 2011) Hsinchu, Taiwan (invited),
April 2011.
25
Publications (cont’d)
45. Papakonstantinou, Y. Liang, J.A. Stratton, K. Gururaj, D. Chen, W.M. Hwu and J. Cong, "Multilevel Granularity Parallelism
Synthesis on FPGAs", Proceeding of 19th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
(FCCM 2011), Salt Lake City, UT , May 2011. (Best Paper Award)
46. Jason Cong, Hui Huang, Chunyue Liu and Yi Zou, "A Reuse-Aware Prefetching Algorithm for Scratchpad Memory", Proceedings
of 48th Annual Design Automation Conference (DAC 2011), San Diego, CA, June 2011.
47. S. Meguerdichian M. Potkonjak, Device Aging-Based Physically Unclonable Functions, Design Automation Conference (DAC),
accepted for publication, June 2011.
48. M. Potkonjak, S. Meguerdichian, A. Nahapetian, Sheng Wei Differential Public, Physically Unclonable Functions: Architecture
and Applications, Design Automation Conference (DAC), accepted for publication, June 2011.
49. S. Wei, M. Potkonjak, Integrated Circuit Security Techniques Using Variable Supply Voltage, Design Automation Conference
(DAC), accepted for publication, June 2011.
50. Mackale Joyner, Zoran Budimlic, Vivek Sarkar, "Subregion Analysis and Bounds Check Elimination for High Level Arrays",
Proceedings of the 2011 International Conference on Compiler Construction (CC 2011), April 2011.
51. Rajkishore Barik, Jisheng Zhao, David Grove, Igor Peshansky, Zoran Budimlic, Vivek Sarkar, "Communication Optimizations for
Distributed-Memory X10 Programs", 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2011.
52. J. H. Ahnn, M. Potkonjak, What to Read? With Whom to Work? Where to Publish? - Scientific Techniques for Organizing and
Conducting Engineering Research, 2011 IEEE International Conference on Microelectronic Systems Education, accepted for
publication, June 2011.
53. Jun Shirako, Kamal Sharma, Vivek Sarkar. "Unifying Barrier and Point-to-Point Synchronization in OpenMP with Phaser", 7th
International Workshop on OpenMP (IWOMP), June 2011.
54. Yi-Chu Wang and K.-T. Tim Cheng, "Energy-Optimized Mapping of Application to Smartphone Platform - A Case Study of Mobile
Face Recognition," Best Paper Award, IEEE Workshop on Embedded Computer Vision (ECVW 2011, In conjunction with
CVPR2011) Colorado Springs, USA, June 2011.
26
Publications (cont’d)
55. J. Cong, M. Huang and Y. Zou, "3D Recursive Gaussian IIR on GPU and FPGAs, A Case Study for Accelerating BandwidthBounded Applications ", 9th IEEE Symposium on Application Specific Processors (SASP 2011), San Diego, CA, Jun 2011
56. Jianwen Chen, Ming Yan, Luminita A. Vese, John Villasenor, Alex Bui, and Jason Cong: EM+TV for Reconstruction of ConeBeam CT with Curved Detectors using GPU (accepted) Proceedings, (Fully 3D) 11th International Meeting on Fully ThreeDimensional Image Reconstruction in Radiology and Nuclear Medicine July 11 - July 15, 2011, Potsdam, Germany.
57. S. Wei, S. Meguerdichian, M. Potkonjak, Malicious Circuitry Detection, Using Thermal Conditioning, accepted for publication in
IEEE Transactions on Information Forensics and Security, 2011.
58. S. Wei, M. Potkonjak, Scalable Hardware Trojan Diagnosis, accepted for publication in IEEE Transactions on Very Large Scale
Integration, (VLSI) Systems, 2011.
59. S. Meguerdichian, M. Potkonjak, Matched Public PUF: Ultra Low Energy Security Platform, International Symposium on Low
Power Electronics and Design, accepted for publication, August 2011.
60. M. Rofouei, M. Sarrafzadeh, M. Potkonjak, Efficient Collaborative Sensing-based Soft Keyboard, International Symposium on
Low Power Electronics and Design, accepted for publication, August 2011.
61. Cevher V, Indyk P, Carin L, Baraniuk RG. "A tutorial on sparse signal recovery with graphical models", Accepted, IEEE Signal
Processing Magazine, 2011.
62. C. Hegde, and R. G. Baraniuk, "Sampling and Recovery of Pulse Streams", to appear in IEEE Transaction on Signal Processing,
2011.
63. Ming Yan and Luminita Vese, "Expectation maximization and total variation based model for computed tomography
reconstruction from undersampled data", Proceedings SPIE Medical Imaging 2011 (to appear).
64. Jason Cong, Karthik Gururaj, Hui Huang, Chunyue Liu, Glenn Reinman and Yi Zou, "An Energy-Efficient Adaptive Hybrid
Cache," To appear in the Proceedings of International Symposium on Low Power Electronics and Design (ISLPED), Fukuoka,
Japan, August 2011 (Best paper award candidate).
27
Publications (cont’d)
65. S. Shamshiri and K.-T. Tim Cheng, "Modeling yield, cost, and quality of a spare-enhanced multi-core chip," to appear in IEEE
Transactions on Computers, special issue on Concurrent On-Line Testing and Error/Fault Resilience of Digital Systems, 2011.
66. Yonghong Yan, Sanjay Chatterjee, Daniel Orozco, Elkin Garcia, Zoran Budimlic, Jun Shirako, Robert Pavel, Guang R. Gao, and
Vivek Sarkar. "Hardware and Software Tradeoffs for Task Synchronization on Manycore Architectures", Proceedings of EuroPar 2011, August 2011 (to appear).
67. Sagnak Tasirlar, Vivek Sarkar. "Data-Driven Tasks and their Implementation", Proceedings of 2011 International Conference on
Parallel Processing (ICPP), September 2011 (to appear).
68. J. Cong, M. Huang and Y. Zou, " Accelerating Fluid Registration Algorithm on Multi-FPGA Platforms", 21st International
Conference on Field Programmable Logic and Applications (FPL 2011), Chania, Crete, Greece, Sep 2011.
69. J. Cong, B. Grigorian, G. Reinman and M. Vitanza, "Accelerating Vision and Navigation Applications on a Customizable
Platform," to appear in Proceedings of the 22nd IEEE International Conference on Application-specific Systems, Architectures
and Processors (ASAP 2011), Santa Monica, CA, September 2011.
70. J. Cong, K. Gururaj, M. Huang, S. Li, B. Xiao and Y. Zou, "Domain-Specific Processor with 3D Integration for Medical Image
Processing," to appear in Proceedings of the 22nd IEEE International Conference on Application-specific Systems,
Architectures and Processors (ASAP 2011), Santa Monica, CA, September 2011.
71. Y. Chen, J. Cong and G. Reinman, "HC-Sim: A Fast and Exact L1 Cache Simulator with Scratchpad Memory Co-Simulation
Support," to appear in Proceedings of the 9th International Conference on Hardware/Software Codesign and System Synthesis
(CODES+ISSS 2011), Taipei, Taiwan, October 2011.
72. Adrian Tang, G. Virbila, T. LaRocca, M. Chang, "A 75-110 GHz Digitally-Probed Artificial Dielectric Phase Demodulator in 65nm
CMOS" accepted in IEEE International Microwave Symposium 2011.
73. Naser Sedaghati, Louis-Noel Pouchet, Renji Thomas, Radu Teodorescu, and P. Sadayappan, "A Vector Instruction Extension for
High Performance Stencil Computation," International Conference on Parallel Architectures and Compilation Techniques (PACT
2011), To Appear.
74. J. N. Laska, P. Boufounos, M. A. Davenport, and R. G. Baraniuk, "Democracy in Action: Quantization, Saturation, and
Compressive Sensing," Applied and Computational Harmonic Analysis, 2011.
28
Publications (cont’d)
75. L. Carin, R. G. Baraniuk, V. Cevher, D. Dunson, M. Jordan, G. Sapiro, M. B.Wakin, "A Bayesian Approach to Learning LowDimensional Signal Models from Incomplete Measurements," IEEE Signal Processing Magazine (special issue on
Dimensionality Reduction via Subspace and Manifold Learning), 2011.
76. S. Meguerdichian, M. Potkonjak, Matched Public PUF: Ultra Low Energy Security Platform, International Symposium on Low
Power Electronics and Design, pp. 45-50, August 2011.
77. M. Rofouei, M. Sarrafzadeh, M. Potkonjak, Efficient Collaborative Sensing-based Soft Keyboard, International Symposium on
Low Power Electronics and Design, pp. 339-344, August 2011.
78. Jason Cong, Karthik Gururaj, Hui Huang, Chunyue Liu, Glenn Reinman and Yi Zou. An Energy-Efficient Adaptive Hybrid Cache.
International Symposium on Low Power Electronics and Design (ISLPED), Aug 2011.
79. Habanero-Java: the New Adventures of Old X10. Vincent Cave, Jisheng Zhao, Jun Shirako, Vivek Sarkar. 9th International
Conference on the Principles and Practice of Programming in Java (PPPJ), August 2011.
80. DrHJ --- a lightweight pedagogic IDE for Habanero Java. Jarred Payne, Vincent Cave, Raghavan Raman, Mathias Ricken, Robert
Cartwright, Vivek Sarkar. Tool Demonstration paper, 9th International Conference on the Principles and Practice of
Programming in Java (PPPJ), August 2011.
81. S. Wei, M. Potkonjak, Scalable Consistency-based Hardware Trojan Detection and Diagnosis, The 5th International Conference
on Network and System Security, pp. 176-183, September 2011.
82. Beayna Grigorian, Marco Vitanza, Jason Cong, and Glenn Reinman. Accelerating Vision and Navigation Applications on a
Customizable Platform. International Conference on Application-specific Systems, Architectures and Processors (ASAP), Sep
2011.
83. Dynamic Task Parallelism with a GPU Work-Stealing Runtime System. Sanjay Chatterjee, Max Grossman, Alina Sbirlea, Vivek
Sarkar. 2011 Workshop on Languages and Compilers for Parallel Computing (LCPC), September 2011.
84. Permission Regions for Race-Free Parallelism. Edwin Westbrook, Jisheng Zhao, Zoran Budimlic, Vivek Sarkar. Proceedings of
the 2nd International Conference on Runtime Verification (RV '11), September 2011
29
Publications (cont’d)
85. Data-Driven Tasks and their Implementation. Sagnak Tasirlar, Vivek Sarkar. Proceedings of the International Conference on
Parallel Processing (ICPP) 2011, September 2011.
86. S. Wei, F. Koushanfar, M. Potkonjak, Integrated Circuit Digital Rights Management Techniques Using Physical Level
Characterization, ACM Workshop onDigital Rights Management, pp. 3-14, October 2011.
87. M. Rofouei, M. Sarrafzadeh, M. Potkonjak, Detecting Local Events Using Global Sensing, IEEE Sensors, pp. 1165-1168, October
2011.
88. S. Meguerdichian, M. Potkonjak, Security Primitives and Protocols for Ultra Low Power Sensor Systems, IEEE Sensors, pp.
1225-1228, October 2011.
89. J. B. Wendt, M. Potkonjak, Medical Diagnostic-Based Sensor Selection, IEEE Sensors, pp. 1507-1510, October 2011.
90. J. B. Wendt, M. Potkonjak, Nanotechnology-Based Trusted Remote Sensing, IEEE Sensors, pp. 1213-1216, October 2011.
91. Y. Chen, J. Cong and G. Reinman, "HC-Sim: A Fast and Exact L1 Cache Simulator with Scratchpad Memory Co-Simulation
Support", Proceedings of the 9th International Conference on Hardware/Software Codesign and System Synthesis
(CODES+ISSS 2011), Taipei, Taiwan, pp. 295-304, October 2011.
92. K. Therdsteerasukdi, G. Byun, J. Ir, G. Reinman, J. Cong and M.F. Chang, "The DIMM Tree Architecture: A High Bandwidth and
Scalable Memory System", Proceedings of the 2011 IEEE International Conference on Computer Design (ICCD 2011), Amherst,
Massachusetts, pp. 388-395, October 2011.
93. Yu-Ting Chen, Jason Cong and Glenn Reinman. HC-Sim: A Fast and Exact L1 Cache Simulator with Scratchpad Memory Cosimulation Support. International Conference on Hardware/Software Co-Design and System Synthesis (CODES+ISSS), Oct
2011.
94. Intermediate Language Extensions for Parallelism. Jisheng Zhao, Vivek Sarkar. 5th Workshop on Virtual Machine and
Intermediate Languages (VMIL), October 2011.
95. Interfacing Chapel with Traditional HPC Programming Languages. Adrian Prantl, Thomas Epperly, Shams Imam, Vivek Sarkar.
Fifth Conference on Partitioned Global Address Space Programming Models (PGAS), October 2011.
96. SCnC: Efficient Unification of Streaming with Dynamic Task Parallelism. Dragos Sbirlea, Jun Shirako, Ryan Newton, Vivek
Sarkar. Proceeding of the Data-Flow Execution Models for Extreme Scale Computing (DFM 2011), in conjunction with PACT
2011, October 2011.
30
Publications (cont’d)
97. S. Wei, A. Nahapetian, M. Potkonjak, Robust Passive Hardware Metering, International Conference on Computer-Aided Design,
pp. 802-809, November 2011.
98. J. Cong, P. Zhang and Y. Zou, "Combined Loop Transformation and Hierarchy Allocation for Data Reuse Optimization",
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2011), San Jose, CA, pp. 185192, November 2011.
99. J. Cong and K. Gururaj, "Assuring Application-Level Correctness against Soft Errors", Proceedings of the 2011 IEEE/ACM
International Conference on Computer-Aided Design (ICCAD 2011), San Jose, CA, pp. 150-157, November 2011.
100. J. Cong, Y. Huang and B. Yuan, "A Tree-Based Topology Synthesis for On-Chip Network", Proceedings of the 2011 IEEE/ACM
International Conference on Computer-Aided Design (ICCAD 2011), San Jose, CA, pp. 651-658, November 2011.
101. Andrew E. Waters, Aswin C. Sankaranarayanan, Richard G. Baraniuk, "SpaRCS: Recovering Low-Rank and Sparse Matrices
from Compressive Measurements", in Advances in Neural Information Processing Systems (NIPS), Granada, Spain, December
2011.
102. A. Maleki, M. Narayan, and R. G. Baraniuk, "Suboptimality of Nonlocal Means for Images with Sharp Edges", submitted to Appl.
Comput. Harmon. Anal., 2011.
103. A. Maleki, M. Narayan, R. G. Baraniuk, "Anisotropic nonlocal means", submitted to Appl. Comput. Harmon. Anal., 2011.
104. A. Maleki, L. Anitori, Z. Yang, R. G. Baraniuk, "Asymptotic analysis of complex LASSO via complex approximate message
passing (CAMP)," submitted to IEEE Trans. Inf. Theory, 2011.
105. M. F. Duarte and R. G. Baraniuk, "Spectral Compressive Sensing", submitted to Appl. And Comput. Harmon. Anal., 2011.
106. Ming Yan, Jianwen Chen, Luminita A. Vese, John Villasenor, Alex Bui and Jason Cong, EM+TV Based Reconstruction for ConeBeam CT with Reduced Radiation, Advances in Visual Computing, Lecture Notes in Computer Science, 2011, Volume 6938/2011,
1-10
107. Carl Lederman, Luminita Vese and Aichi Chien, Registration for 3D Morphological Comparison of Brain Aneurysm Growth,
Advances in Visual Computing Lecture Notes in Computer Science, 2011, Volume 6938/2011, 392-399.
108. Pascal Getreuer, Melissa Tong and Luminita A. Vese, A Variational Model for the Restoration of MR Images Corrupted by Blur
and Rician Noise, Advances in Visual Computing Lecture Notes in Computer Science, 2011, Volume 6938/2011, 686-698
109. Ming Yan, EM-Type Algorithms for Image Reconstruction with Background Emission and Poisson Noise, Advances in Visual
Computing Lecture Notes in Computer Science, 2011, Volume 6938/2011, 33-42.
31
Publications (cont’d)
2012 (Total 50)
110. J. Cong, M.A. Ghodrat, M. Gill, H. Huang, B. Liu, R. Prabhakar, G. Reinman and M. Vitanza, "Compilation and Architecture
Support for Customized Vector Instruction", Proceedings of the 17th Asia and South Pacific Design Automation Conference
(ASPDAC 2012), Sydney, Australia, pp. 652-657, January 2012.(Invited paper)
111. A. Bui, K. Cheng, J. Cong, L. Vese, Y. Wang, B. Yuan and Y. Zou, "Platform Characterization for Domain-Specific Computing
(invited paper) ", Proc. Conference on Asia South Pacific Design Automation(ASPDAC), Sydney, Australia, Jan 2012
112. K. Stock, L.-N. Pouchet, and P. Sadayappan. Using machine learning to improve automatic vectorization. ACM Trans. Archit.
Code Optim., 8(4):50:1-50:23, Jan. 2012.
113. The Tuning Language for Concurrent Collections. Kathleen Knobe, Michael G. Burke. Proceedings of the 2012 International
Workshop on Compilers for Parallel Computing (CPC 2012), January 2012.
114. M. F. Duarte and R. G. Baraniuk, "Kronecker Compressive Sensing", IEEE Transactions on Image Processing, vol. 21, no. 2, pp.
494-504, Feb., 2012.
115. J. Zheng, M. Potkonjak, Securing Netlist-Level FPGA Design through Exploiting Process Variation and Degradation, pp. 129-139,
FPGA, February 2012.
116. Y Kim, GS Byun, A Tang, CP Jou, HH Hsieh, G Reinman, J Cong, M.C. F Chang, "An 8Gb/s/pin 4pJ/b/pin Single-T-Line Dual
(Base+RF) Band Simultaneous Bidirectional Mobile Memory I/O Interface with Inter-Channel Interference Suppression",
International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, page 50-52, Feb.
2012.
117. J. Chen, J. Cong, M. Yan and Y. Zou, " FPGA-Accelerated 3D Reconstruction using Compressive Sensing", Proc. International
Symposium on Field-Programmable Gate Arrays(FPGA), Monterey CA, Feb 2012
118. J. Cong, M. Huang, B. Liu, P. Zhang and Y. Zou, "Combining Module Selection and Replication for Throughput-Driven Streaming
Programs", Proceedings of Design, Automation and Test in Europe (DATE 2012), Dresden, Germany, March 2012.
119. Y. Chen, J. Cong, H. Huang, B. Liu, C. Liu, M. Potkonjak and G. Reinman, "Dynamically Reconfigurable Hybrid Cache: An
Energy-Efficient Last-Level Cache Design", Proceedings of Design, Automation and Test in Europe (DATE 2012), Dresden,
Germany, March 2012.
120. A. C. Sankaranarayanan, C. Studer, and R. G. Baraniuk, "CS-MUVI: Video Compressive Sensing for Spatial-Multiplexing
Cameras", IEEE International Conference on Computational Photography, Seattle, WA, April, 2012.
32
Publications (cont’d)
121. J. B. Wendt, S. Meguerdichian, H. Noshadi, M. Potkonjak, Energy and Cost Reduction in Localized Multisensory Systems
through Application-Driven Compression", Data Compression Cpnference, page 411, April 2012.
122. Habanero-Scala: Async-Finish Programming in Scala. Shams Imam, Vivek Sarkar. The Third Scala Workshop (Scala Days 2012),
April 2012.
123. Analytical Bounds for Optimal Tile Size Selection. Jun Shirako, Kamal Sharma, Naznin Fauzia, Louis-Noel Pouchet, J.
Ramanujam, P. Sadayappan, Vivek Sarkar. Proceedings of the 2012 International Conference on Compiler Construction (CC
2012), April 2012.
124. A. Mirhoseini, M. Potkonjak, F. Koushanfar, Coding-based energy minimization for phase change memory, Design Automation
Conference, pp, 68-76, June 2012.
125. S. Wei, K. Li, F. Koushanfar, M. Potkonjak, Hardware Trojan horse benchmark via optimal creation and placement of malicious
circuitry, DAC, pp. 90-95, June 2012.
126. Y. Kim, S.-W. Tam, G.-S. Byun, H. Wu, L. Nan, G. Reinman, J. Cong, .and M.-C. F. Chang, "Analysis of Noncoherent ASK
Modulation-Based RF-Interconnect for Memory Interface", IEEE Journal on Emerging and Selected Topics in Circuits and
Systems, Jun. 2012.
127. J. Cong, P. Zhang and Y. Zou, "Optimizing Memory Hierarchy Allocation with Loop Transformations for High-Level Synthesis",
to appear in the Proceedings of the 49th Annual Design Automation Conference (DAC 2012), San Francisco, CA, June 2012.
128. J. Cong, M.A. Ghodrat, M. Gill, B. Grigorian and G. Reinman, "Architecture Support for Accelerator-Rich CMPs", Proceedings of
the 49th Annual Design Automation Conference (DAC 2012), San Francisco, CA, June 2012.
129. A. Sbirlea, Y. Zou, Z. Budimlic, J. Cong and V. Sarkar, Mapping a Data-Flow Programming Model onto Heterogeneous Platforms,
Proc. ACM SIGPLAN/SIGBED Conference on. Languages, Compilers, Tools and Theory for Embedded Systems (LCTES),
Beijing, China, Jun 2012
130. Jason Cong, Mohammad Ali Ghodrat, Michael Gill, Beayna Grigorian, and Glenn Reinman. Accelerator-Rich Architecture for
Power-Constrained CMPs. Dark Silicon Workshop (DaSi - held in conjunction with ISCA), Jun 2012
131. Mapping a Data-Flow Programming Model onto Heterogeneous Platforms. Alina Sbirlea, Yi Zou, Zoran Budimlic ´, Jason Cong,
Vivek Sarkar. Conference on Languages, Compilers, Tools and Theory for Embedded Systems (LCTES), June 2012.
33
Publications (cont’d)
132. Scalable and Precise Dynamic Data Race Detection for Structured Parallelism. Raghavan Raman, Jisheng Zhao, Vivek Sarkar,
Martin Vechev, Eran Yahav. 33rd ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI),
June 2012.
133. Practical Permissions for Race-Free Parallelism. Edwin Westbrook, Jisheng Zhao, Zoran Budimlic, Vivek Sarkar. 26th
European Conference on Object-Oriented Programming (ECOOP), June 2012.
134. J. V. Shi, A. C. Sankaranarayanan, C. Studer, and R. G. Baraniuk, "Video Compressive Sensing for Dynamic MRI", Annual
Computational Neuroscience Meeting (CNS), Atlanta, GA, July 21-26, 2012.
135. C. Hedge and R. G. Baraniuk, "SPIN: Iterative Signal Recovery on Incoherent Manifolds", IEEE International Symposium on
Information Theory (ISIT), Cambridge, MA, Jul., 2012.
136. Yu-Ting Chen, Jason Cong, Hui Huang, Chunyue Liu, Raghu Prabhakar and Glenn Reinman. Static and Dynamic CoOptimizations for Blocks Mapping in Hybrid Caches. International Symposium on Low Power Electronics and Design (ISLPED),
Jul/Aug 2012.
137. Implicit Multicore Parallelism using CnC-Python. Shams Imam, Vivek Sarkar. Eleventh annual Scientific Computing with
Python conference (SciPy 2012), July 2012.
138. Jason Cong, Mohammad Ali Ghodrat, Michael Gill, Beayna Grigorian, Glenn Reinman, "CHARM: A Composable Heterogeneous
Accelerator-Rich Microprocessor", to appear in the Proceedings of the International Symposium on Low Power Electronics and
Design (ISLPED 2012).
139. Jason Cong, Mohammad Ali Ghodrat, Michael Gill, Chunyue Liu, Glenn Reinman, "BiN: A Buffer-in-NUCA Scheme for
Accelerator-Rich CMPs", to appear in the Proceedings of the International Symposium on Low Power Electronics and Design
(ISLPED 2012).
140. Jason Cong and Bo Yuan, "Energy-Efficient Scheduling on Heterogeneous Multi-core Architectures", to appear in the
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED 2012).
141. J. B. Wendt, S. Meguerdichian, H. Noshadi, M. Potkonjak, Semantics-driven Sensor Configuration for Energy Reduction in
Medical Sensor Networks, International Symposium on Low Power Electronics and Design, August 2012.
142. J. Zheng, E. Chen, M. Potkonjak, A Benign Hardware Trojan on FPGA-based Embedded Systems, International Conference on
Field Programmable Logic and Applications, August 2012.
34
Publications (cont’d)
143. Folding of Tagged Single Assignment Values for Memory-Efficient Parallelism. Dragos Sbirlea, Kathleen Knobe, Vivek Sarkar.
International European Conference on Parallel and Distributed Computing (Euro-Par), August 2012.
144. A Practical Approach to DOACROSS Parallelization. Priya Unnikrishnan, Jun Shirako, Kit Barton, Sanjay Chatterjee, Raul
Silvera, Vivek Sarkar. International European Conference on Parallel and Distributed Computing (Euro-Par), August 2012.
145. S. Meguerdichian, M. Potkonjak, Using Standardized Quantization for Multi-Party PPUF Matching: Foundations and
Applications, International Conference on Computer-Aided Design, November 2012.
146. S. Wei, K. Li, F. Koushanfar, M. Potkonjak, Provably Complete Hardware Trojan Detection Using Test Point Insertion
International Conference on Computer-Aided Design, November 2012.
147. A. C. Sankaranarayanan, P. Turaga, R. Chellappa, and R. G. Baraniuk, "Compressive Acquisition of Dynamic Scenes", submitted
to SIAM J. Imaging Sci., 2012.
148. C. Hedge and R. G. Baraniuk, "Signal Recovery on Incoherent Manifolds", submitted to IEEE. Trans. Inf. Theory, 2012.
149. S. Wei, A. Nahapetian, M. Nelson, F. Koushanfar, M. Potkonjak, Gate Characterization Using Singular Value Decomposition:
Foundations and Applications, IEEE Transactions on Information Forensics and Security, 7(2) pp. 765-773, 2012.
150. S. Wei, M. Potkonjak, Scalable Hardware Trojan Diagnosis, IEEE Trans. VLSI Syst. 20(6) pp. 1049-1057, 2012.
151. Y.-C. Wang and K.-T. Cheng, "Energy and Performance Characterization of Mobile Heterogeneous Computing ", submitted to
IEEE Workshop on Signal Processing Systems (SiPS 2012).
152. Y.-C. Wang, Y. Gong, Y. Teng, and K.-T. Cheng, "Acceleration and Energy Optimization of Medical Image Visualization on Mobile
Devices", submitted to IEEE International Symposium on Workload Characterization (IISWC 2012).
153. X. Yang, and K. -T. Tim Cheng, "Accelerating SURF on mobile devices", submitted to ACM Multimedia 2012.
154. A. Ghofrani, R. Parikh, S. Shamshiri, A. DeOrio, K. -T. Cheng, and V. Bertacco, "Comprehensive Online Defect Diagnosis in OnChip Networks", IEEE VLSI Test Symposium.
155. H. Wu, L. Nan, S.-W. Tam, H.-H. Hsieh, C. Jou, G. Reinman, J. Cong, and M.-C. F. Chang, "A 60GHz On-Chip RF-Interconnect with
lambda/4 Coupler for 5Gbps Bi-Directional Communication and Multi-Drop Arbitration", to be presented in IEEE Custom
Integrated Circuits Conference 2012.
156. Jens Palsberg, "Overloading is NP-Complete", Logic and Program Semantics, pages 204-218, LNCS 7230, Springer, 2012.
35
Publications (cont’d)
157. Jonathan K. Lee, Jens Palsberg, Rupak Majumdar, Hong Hong, "Efficient May Happen in Parallel Analysis for Async-Finish
Parallelism", Static Analysis Symposium 2012.
158. J. Holewinski, , L.-N. Pouchet, and P. Sadayappan. High-performance code generation for stencil computations on gpu
architectures. In Proceedings of the 26th International Conference on Supercomputing, ICS ’12. ACM, 2012.
159. J. Holewinski, R. Ramamurthi, M. Ravishankar, N. Fauzia, L.-N. Pouchet, A. Rountev, and P. Sadayappan. Dynamic trace-based
analysis of vectorization potential of applications. In Proceedings of the 33rd ACM SIGPLAN conference on Programming
Language Design and Implementation, PLDI ’12, pages 371-382. ACM, 2012.
36
Download