Android* on Intel platforms
And what it means for you, developers.

Agenda
• What Android* on Intel platforms means for you, developers

Powering the next generation of smartphones and feature phones, tablets, wearables, and connected devices with leading-edge technologies

Tablet Product Roadmap …
[Roadmap figure. Product timing aligned to year of introduction.]
• Timeline labels: 2012 "In the game"; 2013 "IA Tablets ramping"; 2014; "Accelerating: ~25 wins & counting"; 14nm "Coming Soon"
• Devices shown: Motorola Razr i w/ Medfield; Windows 8 connected standby tablets TTM; Samsung Galaxy Tab 3 on Clover Trail+; Bay Trail-T
• Platforms shown: "Medfield", "Clover Trail", "Clover Trail+", "Bay Trail"
• Callouts: Intel HD Graphics; 22nm, Quad Core; Excellent Android Performance; 2X Faster CPU & up to 4.8X¹ Faster Graphics than Z2760; Intel XMM 7160 LTE option; Up to 4k x 2k Display; Rich security, context, and imaging capability set built-in; Intel XMM 7260 LTE w/CA; Targeting significant graphics performance increase over Bay Trail
• “Across the board Intel manages a huge advantage … The x86 power myth has been busted.” --Anandtech* 12/24/12
¹ For performance details, see https://www-ssl.intel.com/content/www/us/en/benchmarks/atom/atom-tablet-z3770-infographic.html

Smartphones with Intel Inside - 2012
Platform: Z2460
• Orange* San Diego (UK)
• Orange* avec Intel Inside (FR)
• Lava* Xolo X900
• Motorola* RAZR i
• ZTE* Grand X IN
• Megafon* Mint
• Lenovo* K800

Smartphones with Intel Inside - 2013
Platforms: Z2420, Z2560, Z2580
• Lenovo* K900 – 5.5”
• Intel® Yolo
• ASUS Fonepad™ Note FHD - 6”
• Etisalat E-20*
• ZTE* Geek – 5”
• Acer* Liquid C1
• …
• ZTE* Grand X2 In – 4.5”

Tablets with Intel Inside - 2013
• ASUS* MeMO Pad FHD 10” (Z2560)
• ASUS* Fonepad™ 7” (Z2420/Z2460)
• Samsung* Galaxy™ Tab 3 10.1” (Z2560)
• Dell Venue 7 and 8 (Clover Trail Plus dual-core Atom chip)

Future Android* platforms based on Intel* Silvermont microarchitecture
• New 22nm tri-gate microarchitecture
• ~3X more peak performance or ~5X lower power than previous Atom microarchitecture
• Intel® Atom™ Processor Z3000 Series (Bay Trail): Next Generation Tablets
• Merrifield: Next Generation Smartphones

Our devices are already fully compatible with the established Android* ecosystem
• Android* Dalvik* apps
    • These work directly; Dalvik has been optimized for Intel® platforms.
    [Diagram: Android Runtime, containing Dalvik Virtual Machine and Core Libraries]
• Android NDK apps
    • Most will run without any recompilation on consumer platforms.
    • The Android NDK has provided an x86 toolchain since 2011
    • A simple recompile using the Android NDK yields the best performance
    • If there is processor-specific code, porting may be necessary
Most of the time, it just works!

What we are working on for Android*
• Key AOSP and Kernel Contributor
• Optimized Drivers & Firmware
• Porting and Optimizing Browser and Apps
• NDK Apps Bridging Technology
• Highly Tuned Dalvik Runtime
• 64-bit

How to target multiple platforms (incl. x86) from NDK apps?
INTEL CONFIDENTIAL

Configuring NDK Target ABIs
If you have the source code of your native libraries, you can compile it for several CPU architectures by setting APP_ABI to all in the Makefile “jni/Application.mk”:

    APP_ABI=all

• Put APP_ABI=all inside Application.mk
• Run ndk-build…
• ARM v7a libs are built
• ARM v5 libs are built
• x86 libs are built
• mips libs are built
The NDK will generate optimized code for all target ABIs.
You can also pass the APP_ABI variable directly to ndk-build and specify each ABI:

    ndk-build APP_ABI=x86

Packaging APKs for Multiple CPU Architectures
Two options:
• One package for all (“fat binary”)
    • Embed native libraries for each architecture in one APK
    • Easiest and preferred way to go
• Multiple APKs
    • One APK per architecture
    • If you have good reasons to do so (e.g., your fat binary APK would be larger than 50MB)

Fat Binaries
By default, an APK contains libraries for every supported ABI.
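Because the same native source is compiled once per ABI, processor-specific code paths can be selected at compile time. The following is a minimal sketch, assuming a GCC- or Clang-based NDK toolchain, which predefines architecture macros such as __i386__ and __arm__. The function name targetArch and the returned labels are illustrative only and are not part of the NDK.

```cpp
#include <cstdio>

// Returns a label for the CPU architecture this translation unit was
// compiled for. The macros tested here are predefined by GCC and Clang;
// the function name and the labels are hypothetical, chosen for illustration.
const char* targetArch() {
#if defined(__i386__)
    return "x86";
#elif defined(__x86_64__)
    return "x86_64";
#elif defined(__aarch64__)
    return "arm64";
#elif defined(__arm__)
    return "arm";
#elif defined(__mips__)
    return "mips";
#else
    return "unknown";
#endif
}

// Prints the label; handy as a quick check of which per-ABI build is loaded.
void printTargetArch() {
    std::printf("compiled for: %s\n", targetArch());
}
```

In a library built with APP_ABI=all, each per-ABI copy of the library would return its own label from targetArch(), and the same #if structure can guard NEON-only or SSE-only source files.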
[Diagram: one APK file, with each device using the libraries that match its ABI]
• libs/armeabi: Use lib/armeabi libraries
• libs/armeabi-v7a: Use lib/armeabi-v7a libraries
• libs/x86: Use lib/x86 libraries
• …
The application's libraries are filtered during installation (after download).

Multiple APKs
• Google Play* supports multiple APKs for the same application.
• Which compatible APK is chosen for a device depends entirely on the android:versionCode
• Using this convention, the chosen APK will be the one that runs best on the device: [version code convention figure not captured]

3rd party libraries x86 support
Game engines/libraries with x86 support:
• Havok Anarchy SDK: android x86 target available
• Unreal Engine 3: android x86 target available
• Marmalade: android x86 target available
• Cocos2Dx: set APP_ABI in Application.mk
• FMOD: x86 lib already included, set ABIs in Application.mk
• AppGameKit: x86 lib already included, set ABIs in Application.mk
• libgdx: x86 lib available from nightly builds
• …
No x86 support but works on consumer devices:
• Corona
• Unity
Software and Services Group

Porting processor specific code to x86 on Android*

SIMD Instructions
• NEON* instruction set on ARM* platforms
• MMX™, Intel® SSE, SSE2, SSE3, SSSE3 on Intel® Atom™ processor based platforms
• http://intel.ly/10JjuY4 - NEONvsSSE.h: wraps NEON functions and intrinsics to SSE3 (100% covered)

    //******* definition sample *****************
    int8x8_t vadd_s8(int8x8_t a, int8x8_t b); // VADD.I8 d0,d0,d0
    #ifdef USE_MMX
    #define vadd_s8 _mm_add_pi8 //MMX
    #else
    #define vadd_s8 _mm_add_epi8
    #endif
    //…

[Figure labels: Supplemental Streaming SIMD Extensions (SSSE); Intel® Streaming SIMD Extensions (Intel® SSE)]

Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804

Memory Alignment
By default (inside a struct, long long is 4-byte aligned on 32-bit x86 but 8-byte aligned on ARM, so the layouts differ):

    struct TestStruct {
        int mVar1;
        long long mVar2;
        int mVar3;
    };

Easy fix:

    struct TestStruct {
        int mVar1;
        long long mVar2 __attribute__ ((aligned(8)));
        int mVar3;
    };

Intel® Tools for Android* apps developers
Most of our tools are relevant even if you’re not targeting x86 platforms!

Faster Android* Emulation on Intel® Architecture Based Host PC
Pre-built Intel® Atom™ Processor Images
• The Android* SDK Manager has x86 emulation images built in
• To emulate an Intel Atom processor based Android phone, install the “Intel Atom x86 System Image” available in the Android SDK Manager
Much Faster Emulation
• Intel® Hardware Accelerated Execution Manager (Intel® HAXM) for Mac and Windows uses Intel® Virtualization Technology (Intel® VT) to accelerate the Android emulator
• Intel VT is already supported in Linux* by qemu-kvm
[Screenshot labels: Intel x86 Atom System Image; Intel x86 Emulator Accelerator]

Intel® Threading Building Blocks (TBB)
• Specify tasks instead of manipulating threads
    • Intel® Threading Building Blocks (Intel® TBB) maps your logical tasks onto threads with full support for nested parallelism
• Targets threading for scalable performance
    • Uses proven efficient parallel patterns
    • Uses work stealing to balance load when task execution times are unknown
• Open source and licensed versions available on Linux*, Windows*, Mac OS X*, Android*…
Open Source version available on: threadingbuildingblocks.org
Licensed version available on: software.intel.com/en-us/intel-tbb

Intel® TBB - Example

    #include <tbb/parallel_reduce.h>
    #include <tbb/blocked_range.h>

    double getPi() {
        const int num_steps = 10000000;
        const double step = 1./num_steps;
        double pi = tbb::parallel_reduce(
            tbb::blocked_range<int>(0, num_steps), // Range: one-dimensional iteration space
            0.0,                                   // Value: initial value of the sum
            // Function: lambda taking a sub-range and the running sum;
            // computes the part of Pi that falls within the range
            [&](const tbb::blocked_range<int>& r, double current_sum) -> double {
                for (int i = r.begin(); i != r.end(); ++i) {
                    double x = (i + 0.5) * step;
                    current_sum += 4.0 / (1.0 + x * x);
                }
                return current_sum; // updated value of the accumulator
            },
            // Reduction: combines two partial sums
            [](double s1, double s2) {
                return s1 + s2;
            }
        );
        return pi * step;
    }

Intel® Graphics Performance Analyzers
• Profiles performance and power
• Real-time charts of CPU, GPU and power metrics
• Conduct real-time experiments with OpenGL-ES* (with state overrides) to help narrow down problems
• Triage system-level performance with CPU, GPU and power metrics
Available freely on intel.com/software/gpa

Intel® Graphics Performance Analyzers
1. Install APK, and connect to Host PC via adb
2. Run Intel® GPA System Analyzer on development machine
3.
View Profile

Intel® HTML5 Development Environment (XDK NEW)
• Great tools for free
• Convenient, cloud-based build tool lets you target all popular platforms & app stores
• Write once, run anywhere; code once, debug once, publish everywhere
more on: html5dev-software.intel.com

Other tools and libs for Android*
• Intel IPP Preview
• Intel Compiler

Call to Action

Backup

Optimizing Android for Intel® Atom™ Processor-Based Devices
[Android software stack diagram with Intel optimization callouts]
• Apps: Home, Contacts, Phone, Browser, …
    • User Experience: GPU & Video support for canvas operations; SKIA and OpenGL* optimizations
• Application Framework: Activity Manager, Window Manager, Content Providers, View System, Notification Manager, Package Manager, Telephony Manager, Resource Manager, Location Manager
    • Extensive middleware development in imaging, media and DRM delivers compelling media experiences
• Middleware Libraries: Surface Manager, Media Framework, SQLite, OpenGL* ES, FreeType, WebKit, SGL, SSL, libc
    • Memory optimizations; AVI, DivX*, and ASF container types; WMV/VC-1 decoder; Live Streaming optimizations; HDMI and WiDI Extended Video Modes; Video Playback DRM …
    • We optimize web technologies such as HTML 5, WebKit and JavaScript†
• Android* Runtime: Core Libraries, Dalvik Virtual Machine
    • Apply our extensive experience optimizing Java* to the Dalvik* VM
• Operating System, Linux* Kernel: Display Driver, Camera Driver, Flash Memory Driver, Binder (IPC) Driver, Keypad Driver, WiFi Driver, Audio Drivers, Power Management, …
    • Enhanced debugging and logging
    • IA assembly optimizations
    • Drivers validated & optimized for power & memory footprint
† Based on third party validation and sampling of Android apps using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance and/or results.
Components of TBB (version 4.2)
• Parallel algorithms: parallel_for, parallel_for_each, parallel_reduce, parallel_do, parallel_scan, parallel_pipeline & filters, parallel_sort, parallel_invoke
• Flow Graphs: functional nodes (source, continue, function), buffering nodes (buffer, queue, sequencer), split/join nodes, other nodes
• Ranges and partitioners
• Tasks & Task groups
• Task scheduler
• Synchronization primitives: atomic operations; mutexes: classic, recursive, spin, queuing; rw_mutexes: spin, queuing
• Thread Local Storage: combinable, enumerable_thread_specific, flattened2d
• Concurrent containers: concurrent_hash_map, concurrent_queue, concurrent_bounded_queue, concurrent_priority_queue, concurrent_vector, concurrent_unordered_map, concurrent_unordered_set
• Memory allocators: tbb_allocator, cache_aligned_allocator, scalable_allocator

Intel® TBB – Integration
Android.mk:

    #for including tbb in your project:
    include $(CLEAR_VARS)
    LOCAL_MODULE := tbb
    LOCAL_SRC_FILES := $(TBB_PATH)/lib/android/libtbb.so
    LOCAL_EXPORT_C_INCLUDES := $(TBB_PATH)/include
    include $(PREBUILT_SHARED_LIBRARY)

    #for calling tbb from your lib:
    LOCAL_CPP_FEATURES := rtti exceptions
    LOCAL_SHARED_LIBRARIES += tbb
    LOCAL_CFLAGS += -DTBB_USE_GCC_BUILTINS -std=c++11

Application.mk:

    APP_STL := gnustl_shared

Java:

    System.loadLibrary("gnustl_shared");
    System.loadLibrary("tbb");
    System.loadLibrary("YourLib");

Intel® TBB - Download
GPLv2 with runtime exception, available from: threadingbuildingblocks.org/download
The commercial version with support is here: software.intel.com/en-us/intel-tbb

Beacon Mountain Preview v0.5
Downloader and Installer for a common set of Android* tools from Intel and 3rd parties:
Intel Tools
• Intel® Hardware Accelerated Execution Manager
• Intel® Graphics Performance Analyzers System Analyzer
• Intel® Integrated Performance Primitives Preview
• Intel® Threading Building Blocks
• Intel® Software Manager
Third-Party Tools
• Google Android SDK (ADT Bundle)
• Android NDK
• Eclipse Integrated Development Environment
• Android Design
• Cygwin* (for Windows operating systems)
Free download at http://intel.com/software/beaconmountain

Intel® C++ Compiler for Android*
• Based on Intel® C/C++ Compiler XE 13.0 for Linux*
• Integrates into the Android* NDK as an additional toolchain which can be used from the command line
• Supports Intel® Atom™ processor optimization
• Free for now
Available on http://software.intel.com/c-compiler-android