Hybrid Java Compilation and Optimization for Digital TV

Dong-Heon Jung, Hyeong-Seok Oh, Soo-Mook Moon
smoon@snu.ac.kr
School of EECS
Seoul National University, Korea

Accelerating DTV S/W Platform
DTV allows data-broadcasting
• Sending data as well as picture/sound
Data-broadcasting platform is based on Java
• Java xlets + Java middleware at the set-top box
Java is slow, so use just-in-time compilation (JITC)
Propose using ahead-of-time and idle-time compilation/optimization as well
• Hybrid compilation and optimization

Microprocessor Architecture & System Software Lab

Executing xlet with JITC only
[Figure]

Executing xlet with Hybrid
[Figure]

Outline
Background on digital TV S/W platform
• Xlet lifecycle
• DTV acceleration
Hybrid Java Compilation and Optimization
• JITC for xlet methods
• AOTC for system/middleware methods
• ITC and ITO for xlets
Experimental Results
Summary

Digital Television (DTV)
DTV sends digital signals instead of analog signals
• Higher definition pictures and clearer sounds
Remaining bandwidth can be used for sending data
• General information: traffic, weather, news, stock, …
• Program-specific information (plot, cast, director, …)
• Interaction using a return channel
  – T-commerce, T-banking, T-government, …
Provides the data-broadcasting, interactive TV (iTV)

Java for Interactive TV
One key technology for iTV is Java
• Many open standards are based on Java
  – DVB-MHP (satellite), OCAP (cable), ACAP (terrestrial)
Programmed using xlet applications
• xlet classes + image/text files
• Downloaded to the DTV set-top box
• Interact with middleware/system classes at the set-top
xlet execution starts only when the user initiates it

Sending and Receiving xlet App.
Xlet application is sent via carousel mechanism
• Send a stream of xlet files repeatedly in round-robin order
• Carousel file manager in the set-top handles the receiving
When the DTV is turned on,
• JVM starts and the application manager starts
• Then the xlet application for the current channel starts its lifecycle

The xlet Lifecycle
[State diagram] States: Not Loaded, Loaded, Paused, Started, Destroyed
• Not Loaded: when starting download of xlet application
• Loaded: when loading xlet's main class file
• Transitions: initXlet(), startXlet(), pauseXlet(), destroyXlet()
• Destroyed: when switching to a different channel
At Started state, a red-dot appears on the TV screen

An Example of xlet Execution
[Screenshots]
(a) Display red-dot
(b) Display xlet menu
(c) Select xlet menu
(d) Display selected menu

DTV Java Architecture
Two types of classes in DTV Java Platform
• System/middleware classes statically installed at DTV
• xlet classes dynamically downloaded from TV station
Similarities in other platforms
• Mobile phone Java platform: MIDP middleware + midlet
• Blu-ray Disc Java platform: BD-J middleware + xlet
Both class types are getting more substantial
• E.g., MIDP -> JTWI -> MSA
How to accelerate these substantial, dual-component Java platforms?
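
The xlet lifecycle described on the earlier slide can be read as a small state machine. The following is only a sketch under stated assumptions, not middleware code: the class name XletLifecycle, its enums, and its methods are hypothetical, and the exact transition table (initXlet() from Loaded to Paused, startXlet() from Paused to Started, pauseXlet() from Started to Paused, destroyXlet() from Loaded, Paused, or Started) is assumed from the conventional Java TV xlet lifecycle rather than stated on the slide.

```java
/** Hypothetical model of the xlet lifecycle; names and transitions are assumptions. */
final class XletLifecycle {
    enum State { NOT_LOADED, LOADED, PAUSED, STARTED, DESTROYED }
    enum Event { LOAD_MAIN_CLASS, INIT_XLET, START_XLET, PAUSE_XLET, DESTROY_XLET }

    private State state = State.NOT_LOADED;

    State state() {
        return state;
    }

    /** Per the slide, the red-dot is shown while the xlet is in the Started state. */
    boolean redDotVisible() {
        return state == State.STARTED;
    }

    /** Applies an event, or throws if the event is not allowed in the current state. */
    State fire(Event event) {
        State next = next(state, event);
        if (next == null) {
            throw new IllegalStateException(event + " is not allowed in state " + state);
        }
        state = next;
        return state;
    }

    /** Assumed transition table; returns null for a disallowed transition. */
    static State next(State s, Event e) {
        switch (e) {
            case LOAD_MAIN_CLASS:
                return s == State.NOT_LOADED ? State.LOADED : null;
            case INIT_XLET:
                return s == State.LOADED ? State.PAUSED : null;
            case START_XLET:
                return s == State.PAUSED ? State.STARTED : null;
            case PAUSE_XLET:
                return s == State.STARTED ? State.PAUSED : null;
            case DESTROY_XLET:
                // Assumption: e.g. on a channel switch, from any live state.
                return (s == State.LOADED || s == State.PAUSED || s == State.STARTED)
                        ? State.DESTROYED : null;
            default:
                return null;
        }
    }
}
```

In this model the window between Loaded and Started is where no xlet code is running yet on behalf of the user, which is the kind of idle period the following slides propose to exploit.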

Hybrid Compilation and Optimization
Current wisdom of Java acceleration: JITC
• Compile bytecode to machine code at runtime
• In DTV, apply JITC to both xlets and system/middleware
Our proposal: hybrid compilation and optimization
• Ahead-of-time compilation (AOTC) for system/middleware
• Idle-time compilation (ITC) for xlets
• Idle-time optimization (ITO) for images and text fonts

Hybrid Environment for DTV
[Architecture diagram: xlet applications reach the set-top box through the object carousel file manager; PhoneME Advanced runs on top of persistent storage, OS & hardware]
• Middleware & system methods: AOTC
• Xlet methods: JITC/ITC
• Xlet images and texts: ITO
We actually built a hybrid environment for a DTV based on a PhoneME Advanced (CDC) VM

AOTC for System/Middleware
Employ AOT module in PhoneME Advanced VM
• Compile pre-chosen methods using JITC and save in a file
• When JVM starts officially, use the machine code directly
  – With no interpretation or compilation overhead
Two issues
• Which methods to AOTC in system/middleware?
  – AOTC only those methods compiled at least once by JITC
• Optimization
  – AOT-generated code is worse than JITC-generated code

AOT Enhancements
AOT inlining without runtime behavior
• Implement inlining based on profile-feedback
No code patch optimization
• Translated code for class initialization check, GC-check can be patched
Relocation prohibits some optimizations
• Constant pointer optimization

Idle-Time Compilation (ITC) for xlet
Compile xlet methods in advance (idle-time)
• Saves the JITC and interpretation overhead
• Use our enhanced AOT
• Assign a separate, lowest-priority thread for ITC to reduce the delay of the main thread (displaying red-dot)
• OK even if user executes xlet in the middle of ITC

Idle-Time Optimization for Images
Loading/decoding of xlet images occurs at runtime
• Just-in-time when they are needed
• Their overhead is substantial, taking much of running time
Propose pre-loading/decoding during idle-time
Two issues
• When to start pre-loading/decoding in the xlet lifecycle
  – Started state or Not-loaded state: does not work
  – Loaded state: good
• How to perform pre-loading/decoding transparently
  – Use the ITC thread
Useful even when the user executes xlets early and then becomes idle

Just-in-Time Loading/Decoding
[Flowchart]
Main flow: Start -> Initialize xlet -> Start xlet -> Display red-dot -> User selects the menu -> Display selected menu -> Finish xlet -> End
Display selected menu:
• Run Java code of selected menu
• Request image object
• Image is cached?
  – yes: Load image from cache
  – no: Perform image loading/decoding, then save the image to image cache
• Get image object
• Finish Java code of selected menu

Pre-loading/decoding
[Flowchart]
Main flow: Start -> Initialize xlet -> Start xlet -> Display red-dot -> User selects the menu -> Display selected menu -> Finish xlet -> End
Image-preprocessing thread:
• Start image-preprocessing thread
• New file is received?
  – no: Terminate the thread
  – yes: Get each image file name
• Is preprocessed?
  – no: Perform pre-loading/decoding, then save the image to cache

Idle-Time Optimization for Texts
• Creation of some font objects occurs at runtime
• Pre-create them at idle-time

Experimental Results
Experimented on a commercial DTV platform with real, on-air xlets broadcast in Korea
Experimental Environment
• DTV set-top box: 333 MHz MIPS CPU with 128 MB memory
• Linux with kernel 2.6
• Sun's phoneME Advanced MR2 version
• Advanced common application platform (ACAP)

Benchmarks
xlets of three terrestrial TV stations in Korea
• Designated by A, B, C
• News, weather, traffic, and stock menu items
• Interested in running time of each menu item
• Size of xlet applications (KB)

             class    image    text & etc.    Total
Station A      276    1,348            344    1,968
Station B      360    1,596            372    2,328
Station C      448    1,280            288    2,016

Distribution of Method Calls
[Stacked bar chart, 0% to 100%: share of xlet, system, and middleware method calls for each menu item of Station A and Station B (NEWS, WEATHER, TRAFFIC, STOCK) and Station C (NEWS, WEATHER), plus geomean]

Distribution of JITCed Methods
[Stacked bar chart, 0% to 100%: share of xlet, system, and middleware methods among JITCed methods for each menu item of Stations A, B, C, plus geomean]

Image Loading/Decoding Overhead
[Stacked bar chart, 0% to 100%: image processing runtime portion versus others (Java & native code) for each menu item of Stations A, B, C, plus geomean]

Running Time Impact of AOTC
[Bar chart: running time (ms), 0 to 18,000, for each menu item of Stations A, B, C; series: JITC only, JITC + AOT (original), JITC + AOT (enhanced)]

Performance Impact of AOTC
[Bar chart: speedup, 0% to 180%, for each menu item of Stations A, B, C, plus geomean; series: JITC only, JITC + AOT (original), JITC + AOT (enhanced)]

Impact of Pre-loading/decoding
[Bar chart: running time (ms), 0 to 18,000, for each menu item of Stations A, B, C; series: JITC only, JITC + image pre-loading/decoding]

Impact of Text Font Pre-creation
[Bar chart: running time (ms), 0 to 18,000, for each menu item of Stations A, B, C; series: JITC only, JITC + font pre-creation]

Overall Running Time of Hybrid
An average of 150% reduction (15% by AOTC)
[Bar chart: running time (ms), 0 to 18,000, for each menu item of Stations A, B, C; series: JITC only, JITC + AOT (enhanced) + image/text pre-processing]

Impact on Transparency
[Bar chart: running time (ms), 0 to 90,000, for JITC only versus our optimized VM on Stations A, B, C; series: red-dot, pre-processing completion]

Summary and Future Work
Proposed hybrid compilation/optimization for DTV
• Just-in-time, ahead-of-time, and idle-time
• Improves performance dramatically over JITC-only
  – With little change to other DTV behavior
• Some ideas would work for other dual-component Java platforms
Some future work
• AOTC for system/middleware beyond AOT
  – By performing off-line AOTC with full optimizations enabled
The idea of pre-loading/decoding has been filed for patent application.

Thank you!
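
As a supplementary sketch of the idle-time pre-loading/decoding idea summarized above, the following shows one way a lowest-priority thread could fill an image cache that the just-in-time path later reads. It is an illustration under assumptions, not the authors' implementation: the class IdleTimeImageCache and its methods are hypothetical, a byte array stands in for a decoded image, the decoder function stands in for real loading and decoding, and the code uses modern Java SE library classes rather than the CDC class library of PhoneME Advanced.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical image cache filled by a lowest-priority background thread. */
final class IdleTimeImageCache {
    private final Map<String, byte[]> decoded = new ConcurrentHashMap<>();
    private final Function<String, byte[]> decoder; // stand-in for real load + decode

    IdleTimeImageCache(Function<String, byte[]> decoder) {
        this.decoder = decoder;
    }

    /** Just-in-time path: return the cached image, decoding it first on a miss. */
    byte[] get(String fileName) {
        return decoded.computeIfAbsent(fileName, decoder);
    }

    boolean isPreprocessed(String fileName) {
        return decoded.containsKey(fileName);
    }

    /** Idle-time path: decode every not-yet-cached file on a MIN_PRIORITY thread. */
    Thread startPreprocessing(List<String> receivedFiles) {
        Thread worker = new Thread(() -> {
            for (String name : receivedFiles) {
                if (!isPreprocessed(name)) {
                    get(name);
                }
            }
        }, "image-preprocessing");
        worker.setPriority(Thread.MIN_PRIORITY); // keep out of the main thread's way
        worker.setDaemon(true);
        worker.start();
        return worker;
    }
}
```

Because both paths go through the same get() method, a menu selected before pre-processing finishes simply decodes the missing images just-in-time, which matches the slide's point that the scheme remains valid even when the user starts the xlet early.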