Advanced Technical Skills (ATS) North America

Oracle Database on AIX
Key Ingredients for a Successful Relationship
... and improved communication between System Administrator, DBA and SAN team

Ralf Schmidt-Dannert
Executive IT Specialist, IBM
9/18/2012
© 2012 IBM Corporation

Special notices

This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area.

Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA.

All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied.

All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions.
IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.

IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.

All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment.

Revised September 26, 2006

Special notices (cont.)

© IBM Corporation 1994-2012 All rights reserved. References in this document to IBM products or services do not imply that IBM intends to make them available in every country. Trademarks of International Business Machines Corporation in the United States, other countries, or both can be found on the World Wide Web at http://www.ibm.com/legal/copytrade.shtml.
Adobe, Acrobat, PostScript and all Adobe-based trademarks are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, other countries, or both. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. UNIX is a registered trademark of The Open Group in the United States and other countries. Cell Broadband Engine and Cell/B.E. are trademarks of Sony Computer Entertainment, Inc., in the United States, other countries, or both and are used under license therefrom. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others. Information is provided "AS IS" without warranty of any kind. The customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer. 
Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products. All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning. Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here. 
Acknowledgements
  – Dale Martin, IBM ATS
  – Rebecca Ballough, IBM ATS
  – Steven Nasypany, IBM ATS
  – Damir Rubic, IBM ATS

Learning Objectives
  – Develop a better understanding of how the Oracle database and AIX interact, especially in the area of memory.
  – Be aware of typical “pitfalls” in the context of storage layout and configuration for Oracle databases on AIX.
  – Know the AIX tuning parameters and their “best practice” values for an Oracle database server.
  – Know what tools are available on AIX to analyze performance problems.

Agenda
AIX - Oracle: Memory Utilization
  – AIX Memory - Overview, kernel tuning parameters, …
  – What memory is Oracle using – file cache / computational, 4KB / 64KB / 16MB pages, pinned / un-pinned?
  – Tuning for expected workload behavior
AIX - Oracle: CPU Utilization
  – How to utilize IBM PowerVM features in an Oracle environment?
  – Oracle license implications
  – Tuning
AIX - Oracle: I/O
  – Storage options, characteristics and comparison
  – Database layout on AIX - Best Practice
  – Setup and tuning
AIX - Oracle: Network
AIX - Oracle: Miscellaneous
My system doesn't behave - What tools are available in AIX to investigate the cause?
Where to find AIX/Oracle documentation?
Power 7 Systems Portfolio fully supports Oracle Databases
[Figure: POWER7 systems portfolio showing Power 710, 720, 730, 740, 750, 755, 770, 775, 780 and 795, BladeCenter PS700 / PS701 / PS702 / PS703 / PS704, and PureSystems.]

AIX – Oracle Memory Utilization

AIX Physical Memory – pools, pages, page lists
[Figure: physical memory divided into Memory Pool 0 and Memory Pool 1, each holding free and used page lists for 4KB, 64KB, 16MB and 16GB pages, backed by paging space on disk. Transitions between the page lists are labeled “Automatic” (psmd process), “Automatic (*1)” or “Manual”.]
This is a simplified view
(*1) Only when large amounts of memory are requested at once and not enough free pages on 4KB / 64KB free lists.
4K - 64K - 16MB Page Dynamics
[Chart: MB used over time (14:02 to 14:18) with the series 4KB_used, 4KB_free, 64KB_used, 64KB_free, 16MB_used and 16MB_free.]

AIX Memory Management Concepts
Two primary categories of memory pages: Computational and File System
AIX tries to utilize all of the physical memory available
  – What is not required to support computational page demand will tend to be used for file system cache
Requests for new memory pages are satisfied from the free page list(s)
  – Small reserve of free pages maintained by “stealing” Computational or File pages
  – AIX uses “demand paging” algorithm – generally not written to paging space until “stolen”
File cache is always 4KB memory pages!
[Chart: % physical memory used over time with the series System%, Process% and FScache%; the remainder is free memory.]
VMM Page Stealing Process (lrud)
Definitions:
  lrud = VMM page stealing process = LRU Daemon (1 per memory pool)
  numperm, numclient = # pages currently used for filesystem buffer cache
  maxperm, maxclient = target maximum # pages to use for filesystem buffer cache
  free pages = # pages immediately available to satisfy new memory requests
vmo Parameters:
  minperm% = target min % real memory for filesystem buffer cache
  maxperm%, maxclient% = target max % real memory for filesystem buffer cache
  minfree = target minimum number of free memory pages
  maxfree = number of free memory pages at which lrud stops stealing pages
When does lrud (for a given memory pool and page size) start?
  – When free pages < minfree (4K and 64K pages)
  – When (maxclient - numclient) < minfree (4K pages only)
When does lrud stop?
  – When free pages > maxfree (4K and 64K pages)
  – When (maxclient - numclient) > maxfree (4K pages only)

VMM Page Stealing Thresholds (AIX 7.1, 6.1, 5.3)
minfree / maxfree values are per memory pool
  – Total system minfree = minfree * # of memory pools
  – Total system maxfree = maxfree * # of memory pools
AIX 7.1, 6.1 and 5.3 defaults are acceptable for most workloads
  – Consider increasing if vmstat ‘fre’ column frequently approaches zero, or if “vmstat -s” shows significantly increasing “free frame waits” over time
Suggested starting points if tuning is required:
  – minfree >= max(960, (120 x # logical CPUs)) / # mem pools
  – maxfree = minfree + ((max(maxpgahead, j2_maxPageReadAhead) x # logical CPUs) / # mem pools)
Example: 10-way LPAR with SMT-2 enabled (20 logical CPUs), with maxpgahead=8 and j2_maxPageReadAhead=128 and 2 memory pools:
  minfree = 1200 = max(960, (120 x 10 x 2)) / 2
  maxfree = 2480 = 1200 + ((max(128, 8) x 10 x 2) / 2)
  vmo -p -o minfree=1200 -o maxfree=2480

AIX System Paging Concepts & Requirements
By default, AIX uses a “demand paging” policy
  – For Oracle DB, the goal is ZERO system paging activity
  – Filesystem pages written back to filesystem disk (if dirty); never to system paging space
  – Unless otherwise specified, computational pages are not written to paging space unless/until they are stolen by lrud. (*1)
Once written to paging space, pages are not removed from paging space until the process associated with those pages terminates
  – For long running processes (e.g.
Oracle DB), even low levels of system paging can result in significant growth in paging space usage over time
  – Paging space should be considered a fail-safe mechanism for providing sufficient time to identify and correct paging issues, not a license to allow ongoing system paging activity
Paging space allocation Rule-of-Thumb:
  – ½ the physical memory + 4 GB, with the following cap:
      Physical Memory lower or equal to | Paging Space Max
      128GB | 60GB
      256GB | 100GB
      512GB | 150GB
      1TB | 200GB
Resolve paging issues quickly:
  – Reduce effective minimum file system cache size (minperm)
  – Reduce Oracle SGA or PGA size
  – Add physical memory

JFS2 inode / metadata caches
JFS2 utilizes two caches - one for inodes and one for metadata
Caches grow in size until maximum size is reached before cache slots are reused
Default values are tuned for a file server!
Each entry in the inode cache requires about 1KB of physical memory
  – 1MB of memory can cache about 1000 files
Configured via ioo parameters:
  – j2_inodeCacheSize (Default: 400 = 10%) *1
  – j2_metadataCacheSize (Default: 400 = 4%) *1
The current memory use can be verified via:
    cat /proc/sys/fs/jfs2/memory_usage
    metadata cache: 31186944
    inode cache: 34209792
    total: 65396736
[Figure: 100% physical memory split into Unused, File cache, Process and “System memory”. Within the AIX “pinned” system memory, the metadata cache (4% *1) and the inode cache (10% *1) can not be paged!]
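The minfree / maxfree starting points and the paging space rule of thumb from the preceding slides are simple arithmetic, so they can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, integer division is assumed for the per-pool values, and because the slide gives no cap for systems above 1TB the sketch refuses to guess one. Any value it produces is a starting point to review, not a setting to apply blindly.

```python
# Sketch of two sizing rules from the slides. Function names are hypothetical.

def vmm_free_list_targets(logical_cpus, mem_pools,
                          maxpgahead=8, j2_max_page_read_ahead=128):
    """Return (minfree, maxfree) starting points per memory pool.

    minfree >= max(960, 120 x logical CPUs) / memory pools
    maxfree  = minfree + (max(read-ahead values) x logical CPUs) / memory pools
    """
    minfree = max(960, 120 * logical_cpus) // mem_pools
    read_ahead = max(maxpgahead, j2_max_page_read_ahead)
    maxfree = minfree + (read_ahead * logical_cpus) // mem_pools
    return minfree, maxfree


# (physical memory upper bound in GB, paging space cap in GB), from the slide.
PAGING_SPACE_CAPS_GB = [(128, 60), (256, 100), (512, 150), (1024, 200)]


def paging_space_gb(physical_memory_gb):
    """Rule of thumb: half the physical memory + 4 GB, limited by the cap table."""
    rule = physical_memory_gb / 2 + 4
    for memory_limit, cap in PAGING_SPACE_CAPS_GB:
        if physical_memory_gb <= memory_limit:
            return min(rule, cap)
    # Assumption: the slide does not define a cap beyond 1TB, so do not invent one.
    raise ValueError("no paging space cap is given for more than 1TB of memory")


if __name__ == "__main__":
    # 10-way LPAR with SMT-2 = 20 logical CPUs, 2 memory pools (slide example).
    print(vmm_free_list_targets(logical_cpus=20, mem_pools=2))  # (1200, 2480)
    print(paging_space_gb(64))    # 36.0 (below the 60GB cap)
    print(paging_space_gb(256))   # 100 (132GB by the rule, capped at 100GB)
```

The first call reproduces the worked example from the thresholds slide, which is a quick way to confirm that the formula was read correctly before using it for other LPAR sizes.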
*1 Note: Default values in AIX 7.1 are 200 (5%), 200 (2%)

Large Segment Aliasing (AKA Terabyte Segment)
Workloads with large memory footprints and low spatial locality may perform poorly due to Segment Lookaside Buffer (SLB) faults
  – May consume up to 20% of total execution time for some workloads
Architectural trend toward smaller SLB sizes can exacerbate SLB related performance issues:
  – POWER6 has 64 SLB entries – 20 for kernel, 44 for user processes – allowing 11GB of accessible memory before incurring SLB faults
  – POWER7 has 32 SLB entries – 20 for kernel, 12 for user processes – allowing 3GB of accessible memory before incurring SLB faults
Oracle SGA sizes are typically in the 10s to 100s of Gigabytes
With Large Segment Aliasing, each SLB entry can address up to 1TB of memory
  – Supports shared memory (Oracle SGA) addressability for up to 12TB on POWER7 and up to 44TB on POWER6 without SLB faults
  – Enabled by default on AIX 7.1
  – Disabled by default on AIX 6.1 TL06+
  – May be enabled by setting “vmo” esid_allocator=1 (Recommended)

Local, Near and Far Memory
Power Systems use a “shared memory” model – any processor has local access to part of memory
High-end Power Systems (e.g.
p770, p780, p795) use multiple building blocks (CECs) to scale capacity
  – Each building block has its own set of processor and memory chips
  – Building blocks are interconnected via a switched communications fabric
The closer the memory is to the processor accessing it, the faster the memory access
  – Local Memory: Directly attached to the chip’s memory controller
  – Near Memory: On an adjacent chip, accessed via intra-node communication paths
  – Far Memory: On a different CEC drawer, accessed via inter-node communication paths

  Model | Local | Near | Far
  Power 710/730 | Same Chip | Other Chip | n/a
  Power 720/740 | Same Chip | Other Chip | n/a
  Power 750 | Same Chip | Other Chip | n/a
  Flex System p260 / p460 | Same Chip | Other Chip | n/a
  Power 770/780 | Same Chip | Other Chip, Same CEC | Different CEC
  Power 795 | Same Chip | Other Chip, Same CEC | Different CEC

Oracle Memory and Memory Affinity
Oracle SGA is “striped” across all the available memory in the LPAR
  – If the LPAR configuration has a combination of near, local and far memory allocated to it, SGA will be (more or less) evenly spread across all of it
  – The greater the number of CECs involved, the greater the likelihood of remote memory accesses
Oracle PGA for a given process tends to be allocated in the near memory of the processor that process was running on when the memory was allocated
  – The AIX dispatcher will attempt to maintain affinity between a given process and the processor that process gets scheduled on
  – rsets may (optionally) be used to force affinity to a subset of available processors (e.g.
those on a given chip, or within a given CEC), although this could potentially cause dispatching delays in heavily loaded environments
“vmo” enhanced_affinity_private (“vmo” restricted parameter)
  – The percentage of application data that is to be allocated local, with the remaining memory to be striped across all available memory in the LPAR
  – Default value is 20% in AIX 6.1 TL5 and 40% in AIX 6.1 TL6+ and AIX 7.1
  – A value of 100% has been used in a number of Oracle performance benchmarks, but before changing a restricted parameter, contact IBM support.

“Socketizing” Workloads (p770, p780, p795)
Don’t over-allocate CPUs
  – If a given workload (LPAR) requires <= 16 processors (single CEC), don’t allocate more than 16 processors (vCPU)
  – If all the LPARs in a given shared pool require (in aggregate) <= 32 processors (2 CECs), don’t allocate more than 32 processors (vCPU) to the shared pool
  – For Shared Processor LPARs, don’t overallocate vCPUs relative to Entitled Capacity
Don’t over-allocate memory
  – May cause processors/memory to be allocated on additional CECs because there wasn’t sufficient free memory available on the optimal CEC
Help the Hypervisor do its job
  – Stay current on Firmware (e.g. AM720_101 or later) to avoid any known CPU/memory allocation or virtual processor dispatching issues
  – Where appropriate, consider LPAR boot order to ensure high priority LPARs get optimal choice of the available CPUs and memory
Consider the use of rsets (Advanced Tuning – Can lead to unexpected results)
  – For example, where heavy application (e.g. java) workload is co-located on DB LPAR
  – Or, could potentially affinitize some of the Oracle processes, e.g.
system processes, or DB connections spawned by a given listener

Oracle Server Architecture – Memory Structures
[Figure: Oracle processes (RVWR, PMON, SMON, ARC0, LGWRn, DBWRn, CHKP, D000, User), each with its own PGA, attached to the System Global Area (SGA) containing the Shared Pool, DB Buffer Cache and Redo Log Buffer; on disk: Flashback Log, Archive Logs, Control Files, DB Files, Redo Logs.]
SGA is shared among processes
PGA is private to an individual server or background process

Memory Usage in an Oracle Database Environment
Computational
  – Some used for AIX kernel processing
  – Some used by Oracle/client executable programs
  – Includes Oracle SGA and PGA memory
File System Cache
  – May be used for caching or pre-fetching of Oracle .dbf files
    • Only for local file system based (non-RAC) environments where Direct I/O (or Concurrent I/O) is not used
  – May be used for other Oracle related files
    • Archive logs, export/import files, backups, binaries, etc.
  – May be used for non-Oracle related files
    • Application files, system files, etc.
Virtual Memory Management Priorities
  – Always want to keep computational pages in memory -- System paging/swapping may degrade Oracle/application performance
  – Allocate enough physical memory to support computational footprint requirement + small file cache
  – When necessary, steal filesystem pages, not computational

Parameter Tuning (AIX 7.1, AIX 6.1)
Most AIX 7.1, AIX 6.1 parameters configured by default to be ‘correct’ for most workloads
As of AIX 6.1, many tunables are now classified as ‘Restricted’
  – Only change if AIX Support requests it
  – Restricted parameters will not be displayed unless the ‘-F’ option is used for “vmo” or other commands
When migrating from AIX 5.3 to AIX 6.1 or AIX 7.1, existing parameter override settings in AIX 5.3 will be transferred to AIX 6.1 or AIX 7.1 environment
  – After migration, review/verify parameter values are properly set

Recommended vmo “Starting Points” - Review

  Parameter | Recommended Value | AIX 7.1 Default | AIX 7.1 Restricted | AIX 6.1 Default | AIX 6.1 Restricted | AIX 5.3 Default
  esid_allocator | 1 | 1 | Yes | 0 | Yes | N/A
  minperm% | 3 | 3 | No | 3 | No | 20
  maxperm% | 90 | 90 | Yes | 90 | Yes | 80
  maxclient% | 90 | 90 | Yes | 90 | Yes | 80
  strict_maxclient | 1 | 1 | Yes | 1 | Yes | 1
  strict_maxperm | 0 | 0 | Yes | 0 | Yes | 0
  lru_file_repage | 0 | N/A | N/A | 0 | Yes | 1 or 0 (*1)
  lru_poll_interval | 10 | 10 | Yes | 10 | Yes | 10
  minfree | 960 | 960 | No | 960 | No | 960
  maxfree | 1088 (*2) | 1088 | No | 1088 | No | 1088
  page_steal_method | 1 | 1 | Yes | 1 | Yes | 0
  memory_affinity | 1 | 1 | Yes | 1 | Yes | 1
  v_pinshm | 0 | 0 | No | 0 | No | 0
  lgpg_regions | 0 | 0 | No | 0 | No | 0
  lgpg_size | 0 | 0 | No | 0 | No | 0
  maxpin% | Leave at Default | 90 (*3) | No | 80 (*3) | No | 80
  vmm_klock_mode | 2 (see notes) | 2 | No | 1 | Yes | N/A

*1 Depending on AIX 5.3 TL level
*2 Do not reduce below default
*3 Depends on LSA use – LSA active 90, otherwise 80

AIX Multiple Page Size Support with Oracle
4K
  – Typically used on older hardware which does not support 64K pages, or with older Oracle versions (< 10.2.0.4)
64K – Preferred
  – Most of the “Large Pages” benefit without the issues
  – In 10.2.0.4+ (*1) and 11g, Oracle will automatically use 64K for SGA if supported by hardware
  – May also be used for program data, text and stack areas:
      # ldedit -btextpsize=64k -bdatapsize=64k -bstackpsize=64k oracle
      # export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K oracle
16M (Large Pages) – Discouraged
  – Limited benefit and potential adverse impacts
  – May be useful if maximum possible performance is required and Oracle SGA changes are tightly coordinated with AIX Sysadmin
  – If improperly configured, can contribute to severe system paging and kernel panics
16G
  – Available with POWER5+ and later & AIX 5.3 TL4+ and later AIX releases
  – Cannot be used with Oracle
(*1) With Oracle patch 7226548. Also see: MOS note # 372157.1

Oracle Memory Structures Allocation
9i : Dynamic memory resizing
  – db_cache_size (dynamic parameter)
  – sga_max_size (static parameter) – maximum size of the SGA for the lifetime of the instance.
  – pga_aggregate_target (dynamic parameter) – specifies the target aggregate PGA memory available to all server processes attached to the instance; note that this is a target and not a hard limit!
  – additional parameter : db_cache_advice (dynamic parameter) – enables or disables statistics gathering used for predicting behavior with different cache sizes.
10g : Automatic Shared Memory Management (ASMM)
  – sga_target (dynamic) – if set, db_cache_size, shared_pool_size, large_pool_size and streams_pool_size are dynamically sized
    • Minimum values for these pools may optionally be specified - recommended
  – Can be dynamically increased up to sga_max_size
  – To use ASMM, sga_target must be >0

Oracle Memory Structures Allocation
11g : Automatic Memory Management (AMM)
  – memory_target (dynamic parameter) – specifies the total memory size to be used by the instance SGA and PGA. Exchanges between SGA and PGA are done according to workload requirements
  – If sga_target and pga_aggregate_target are not set, the policy is to give 60% of memory_target to the SGA and 40% to the PGA
  – memory_max_target (static parameter) – specifies the maximum amount of memory allowed to be used by the Oracle instance
  – To use Automatic Memory Management, memory_target must be >0 and LOCK_SGA=false
See Metalink notes 443746.1 and 452512.1 explaining AMM and these new parameters
AMM dynamic resizing of the shared pool can cause a fair amount of “cursor: pin s” wait time. One strategy to minimize this is to set minimum sizes for memory areas you particularly care about. In addition, you can change how often AMM analyzes and adjusts the memory distribution. See: Metalink note: 742599.1 ( _memory_broker_stat_interval)

SGA_MAX_SIZE and LOCK_SGA implications (11g, 10.2.0.4+)
LOCK_SGA=false – Preferred
  – Oracle dynamically allocates memory for the SGA only as needed up to the size specified by SGA_TARGET
  – SGA_TARGET may be dynamically increased, up to SGA_MAX_SIZE
  – 64K pages automatically used for SGA if supported in the environment. If needed, 4K (or 16M) pages are converted to 64K pages.
    Down-conversion of 16M pages to 64K pages is only triggered at DB startup if needed; after startup additional unused 16M pages are not converted, even if not enough 4K or 64K pages are available!
LOCK_SGA=true – Discouraged
  – Oracle Automatic Memory Management (AMM) cannot be used
  – Oracle pre-allocates all memory as specified by SGA_MAX_SIZE and pins it in memory, even if it’s not all used (i.e. SGA_TARGET < SGA_MAX_SIZE)
  – If sufficient 16M pages are available, those will be used. Otherwise, all the SGA memory will be allocated from 64K (if supported) or 4K pages (if 64K pages are not supported). If needed, 4K or 16M pages will be converted to 64K pages, but 16M pages are never automatically created. Also see comment above re 16M to 64K page conversion!
  – If a value for SGA_MAX_SIZE is specified larger than the amount of available memory for computational pages, the system can become unresponsive due to system paging.
  – If the specified SGA_MAX_SIZE is much larger than the currently available pages on the combined 64K and 16M page free lists, the database startup can fail with error: “IBM AIX RISC System/6000 Error: 12: Not enough space”. In this case re-try to start the database.

Memory - Page Sizes and use with Oracle DB

  Memory page size | 4KB | 64KB | 16MB | 16GB
  Page type | small (*1) | medium | large | huge
  Processor | all | POWER5+™ or later | POWER4™ or later | POWER5+ or later
  AIX™ support | all | AIX 5.3 TL04+, AIX 6.1, AIX 7.1 | AIX 5.3 and later | AIX 5.3 TL04+, AIX 6.1, AIX 7.1
  Kernel | 32bit & 64bit | 64bit | 32bit & 64bit | 64bit
  Restricted | no | no | yes | yes
  Pageable | yes | yes | no | no
  Requires User Configuration | no | no | Yes (OS) | Yes (HMC + OS)
  Automatic conversion | small <-> medium (AIX6.1+, POWER6+) | small <-> medium (AIX6.1+, POWER6+) | (*3) large -> medium (AIX6.1+, POWER6+) | no
  Oracle SGA support | <10.2.0.4 default | >= 10.2.0.4/5 default (*2), 11g default | 10g / 11g | no
  Activate use for Oracle SGA | automatic | >= 10.2.0.4/5 autom. (*2), 11g: automatic | lock_sga=true and user permissions | n/a

*1 – Used system wide if Active Memory Sharing (AMS) is used
*2 – with Oracle patch 7226548 for 10gR2
*3 – This conversion is only triggered at database startup, if needed, and not for later memory allocations

AIX – Oracle CPU Utilization

Physical, Virtual, Logical Layers
[Figure: a 16 CPU SMP server shown in three layers. Physical: dedicated CPUs plus the default shared pool 0 (*) of 13 CPUs, with shared pools 1 and 2. Virtual: virtual processors (V) of micro partitions with 2.1, 0.8 and 1.2 processor units. Logical: logical CPUs (L) seen by AIX 5.2, 5.3, 6.1 and 7.1 LPARs depending on SMT mode (SMT=off, SMT=on, SMT-4).]
Think “PVL”: P=Physical, V=Virtual, L=Logical (SMT)
* All activated, non-dedicated CPUs are automatically placed into the shared processor pool 0. Only 2.1+0.8+1.2 = 4.1 processor units of “desired capacity” have been allocated from the pool of 13 CPUs

Virtual Shared Processor Pools – Licensing Benefits
Multiple shared pools:
  • Can reduce the number of software licenses by putting a limit on the number of processors an uncapped partition can use
  • Up to 64 shared pools
[Figure: server with 16 processor cores (POWER6/7). Partitions n1 and n2 run VIOS, n3 runs AIX (dedicated, Oracle DB), n4 runs Linux. Uncapped AIX partitions n5 to n9 (VP = 5, 4, 4, 6, 3; Ent. = 2.5, 1.70, 2.00, 2.00, 1.00) run Oracle DB and OAS (App 1, App2, QA) workloads in Virtual Shared pool #1 (Max Cap: 5 processors) and Virtual Shared pool #2 (Max Cap: 6 processors), on top of the Physical Shared Pool (9 processor cores). A CUoD block is also shown.]
Oracle DB cores to license:
  • 1 from dedicated partition n3
  • 5 from shared CPU pool 1
  = 6
Oracle DB core – license factors: POWER6: 1.0, POWER7: 1.0
OAS cores to license:
  • 6 from shared CPU pool 2
  = 6

Virtual Processor - Folding
Dynamically adjusts active Virtual Processors (VPs)
  – System consolidates loads onto a minimal number of VPs
    • Scheduler computes utilization of VPs every second
  – If VPs needed to host physical utilization is less than the current active VP count, a VP is put to sleep
  – If VPs needed are greater than the current active VPs, more are enabled
  – On by default in AIX 5.3 TL03 and later
    • vpm_xvcpus tunable
    • vpm_fold_policy tunable
Increases processor utilization and affinity
  – Inactive VPs don’t get dispatched and waste physical CPU cycles
  – Fewer VPs can be more accurately dispatched to physical resources by the Hypervisor with potential for improved processor cache efficiency
When to adjust – Check with IBM support before changing!
– Bursty workloads with short response-time requirements may need sub-second dispatch latency • Disable processor folding or manually tune the number of VPs – # schedo –o vpm_xvcpus=[-1 | N] – Where N specifies the number of VPs to enable in addition to the number of VPs needed to consume physical CPU utilization – A value of “-1” disables CPU folding 32 September 2012 © 2012 IBM Corporation Advanced Technical Skills (ATS) North America CPU Considerations Use Simultaneous Multi-Threading (SMT) with AIX 5.3 (or later) on Power5 (or later) environments Micropartitioning Guidelines – Virtual CPUs for specific LPAR <= physical processors in shared CPU pool CAPPED LAR • Virtual CPUs should be the nearest integer >= capping limit UNCAPPED LPAR • Virtual CPUs should be set to the max peak demand requirement • Preferably, set Entitlement >= Virtual CPUs / 1.5 DLPAR considerations – CPU_COUNT refers to Logical CPU Oracle 9i – Oracle CPU_COUNT does not recognize change in # cpus – AIX scheduler can still use the added CPUs Oracle 10g/11g – Oracle CPU_COUNT recognizes change in number of active CPU Max CPU_COUNT limited to 3x CPU_COUNT at instance startup. This can limit the amount of physical CPU resources utilized if you start out with SMT-off and switch to SMT-4 or dynamically add a large number of VP. Note: Recommended to set PARALLEL_THREADS_PER_CPU=1 if SMT is active. 
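The micro-partition sizing guidelines and the CPU_COUNT ceiling above reduce to simple arithmetic. The following Python sketch only illustrates that arithmetic; it is not an IBM or Oracle tool. The function names and the sample figures are assumptions made for this example, and it assumes that the number of logical CPUs equals virtual processors times SMT threads.

```python
import math


def capped_virtual_cpus(capping_limit: float) -> int:
    """Capped LPAR guideline: nearest integer >= the capping limit."""
    return math.ceil(capping_limit)


def min_entitlement(virtual_cpus: int) -> float:
    """Uncapped LPAR guideline: entitlement >= virtual CPUs / 1.5."""
    return virtual_cpus / 1.5


def usable_cpu_count(startup_cpu_count: int, virtual_cpus: int, smt_threads: int) -> int:
    """Oracle 10g/11g: CPU_COUNT can grow to at most 3x its value at instance startup.

    Assumption for this sketch: logical CPUs = virtual CPUs * SMT threads.
    """
    logical_cpus = virtual_cpus * smt_threads
    return min(logical_cpus, 3 * startup_cpu_count)


if __name__ == "__main__":
    # Hypothetical capped LPAR with a capping limit of 2.1 processor units
    print("VPs for capped LPAR:", capped_virtual_cpus(2.1))
    # Hypothetical uncapped LPAR sized for a peak demand of 6 VPs
    print("Minimum entitlement:", min_entitlement(6))
    # Instance started with 4 VPs and SMT off, then switched to SMT-4
    print("CPU_COUNT Oracle can use:", usable_cpu_count(4, 4, 4))
```

In the last case the LPAR presents 16 logical CPUs after the switch to SMT-4, but an instance started with CPU_COUNT=4 can use at most 12 of them until it is restarted.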
AIX – Oracle IO

The AIX IO stack
[Figure: the IO stack from top to bottom: Application; ASM options; Logical file system (Local FS: JFS / JFS2, NFS, Other), Raw LVs, Raw disks; VMM; LVM (LVM device drivers); Multi-path IO driver (optional); Disk Device Drivers; Adapter Device Drivers; Disk subsystem (optional); Disk; Cache.]
– Application memory area caches data to avoid IO
– NFS caches file attributes
– NFS has a cached filesystem for NFS clients and uses client pages for cache
– JFS and JFS2 cache use extra system RAM
– JFS uses persistent pages for cache
– JFS2 uses client pages for cache
– Queues exist for both adapters and disks
– Adapter device drivers use DMA for IO
– Disk subsystems have read and write cache
– Disks have memory to store commands/data
– Persisted write cache, as well as read cache: the "I/O complete" acknowledgement is sent back to the application before data is written to physical disk
– IOs can be coalesced (good) or split up (bad) as they go through the IO stack

I/O Stack Tuning Options (Device Level)
Disk
– queue_depth - maximum # of concurrent active I/Os for an hdisk / hdiskpower; additional I/O beyond that limit will be queued. Recommended/supported maximum is storage subsystem dependent.
– max_transfer - the maximum allowable I/O transfer size (default is 0x40000 or 256k). Maximum supported value is storage subsystem dependent. All current technology supports a 1 MB I/O size (set to 0x100000).
Fiber Channel Disk Adapter (fcsn)
– num_cmd_elems - maximum number of outstanding I/Os for an adapter.
  Set to 1024 or 2048 (within storage subsystem vendor guidelines).
– max_xfer_size - Increasing the value (to at least 0x200000) also increases the DMA size from 16 MB to 256 MB. Only do this after IBM support has directed you to, because in specific configurations it can lead to system stability issues or AIX not being able to boot.
– dyntrk - when set to yes (recommended), allows for immediate re-routing of I/O requests to an alternative path when a device ID (N_PORT_ID) change has been detected; only applies to multi-path configurations.
– fc_err_recov - when set to "fast_fail" (recommended), if the driver receives an RSCN notification from the switch, the driver will check to see if the device is still on the fabric and will flush back outstanding I/Os if the device is no longer found.
To validate / change current parameter settings use: "lsattr", "chdev"

Data Striping to Avoid I/O Hotspots
Old Wisdom
Isolate files based on function and/or usage
– Manually intensive effort
– Leads to I/O hotspots over time that impact throughput capacity and performance
New Wisdom
Stripe objects across as many physical disks as possible
– Minimal manual intervention
– Evenly balanced I/O across all available physical components
– Good average I/O response time and object throughput capacity with no hotspots
Implementation Options:
– ASM and GPFS do this automatically within a given disk group or file system
– Can be implemented with conventional Volume Managers and file systems

Data Layout for Optimal I/O Performance
Example:
1. Use RAID-5 or RAID-10 to create striped LUNs (hdisks)
– Four 3+P (or 2+2) RAID arrays
– 1 (or more) LUNs per RAID array
– Each LUN is spread across 4 drives
[Figure: Storage, step 1 HW Striping: LUN 1, LUN 2, LUN 3, LUN 4]

RAID-5 vs. RAID-10 Performance Comparison
I/O Profile        RAID-5           RAID-10
Sequential Read    Excellent        Excellent
Sequential Write   Excellent (*1)   Good
Random Read        Excellent        Excellent
Random Write       Fair             Excellent
*1 – Assumes optimizing SAN storage sub-system!

Data Layout for Optimal I/O Performance
Example…
2. Stripe or spread individual objects across multiple LUNs (hdisks) for maximum distribution
– Each object is spread across 4 LUNs, each from a different array (16 drives)
[Figure: on the AIX side, step 2 SW Striping with IBM GPFS, AIX LVM striping with JFS2, or an ASM Disk Group over a Volume (Disk) Group of hdisk 1 to hdisk 4; on the Storage side, HW Striping with hdisk 1 to hdisk 4 mapped to LUN 1 to LUN 4.]
Note: ASM, AIX LVM or GPFS cannot share the same hdisks.

Software Striping with AIX – LV striping & PP spreading
Stripe using Logical Volume (LV)
– Create the Logical Volume with the striping option: mklv -S <strip-size> ...
– Oracle recommends a stripe size that is a multiple of db_block_size * db_file_multiblock_read_count (usually around 1 MB)
– Valid LV strip sizes: AIX 7.1, 6.1, 5.3: 4k to 128M in powers of 2
– For RAW LVs, use AIX Logical Volume 0 offset (9i Release 2 or later)
  • Use Scalable Volume Groups (VGs), or use "mklv -T O" with Big VGs
  • Requires AIX APAR IY36656 and Oracle patch (bug 2620053)
PP striping (AKA spreading)
– Create a Volume Group with an 8M, 16M or 32M PP size (the PP size will be the "strip size")
– Choose a Scalable Volume Group: # mkvg -S -s <PPsize> ...
– Create the LV with the "Maximum range of physical volume" option to spread PPs across different hdisks in a round-robin fashion: # mklv -e x ...
Note: If you create a JFS2 filesystem on a striped (or PP-spread) LV, use the INLINE logging option.
This avoids a "hot spot" by creating the JFS2 log inside the filesystem itself, which is striped, instead of using a single PP stored on 1 hdisk. (# crfs -a logname=INLINE …)

Filesystems
AIX Standard Filesystems (FS):
– JFS – no longer being enhanced
– JFS2 – generally the preferred filesystem. Better for large files/filesystems and for filesystems with large numbers of files.
Mount options:
– Buffer Caching (default) – stage data in FS buffer cache
– Direct I/O (DIO) – no filesystem caching
– Concurrent I/O (CIO) – DIO + no write serialization (JFS2 only)
– Release Behind Read (RBR) – memory pages released (available for stealing) after pages copied to internal buffers
– Release Behind Write (RBW) – memory pages released (available for stealing) after pages written to disk
– Release Behind Read / Write (RBRW) – combination of RBR and RBW
– No Access Time (NOATIME) – do not update last accessed time when file is accessed

Oracle Server Architecture – Files
[Figure: Oracle server architecture. Processes, each with its own PGA: PMON, SMON, RVWR, ARC0, LGWRn, DBWRn, CHKP, D000, User. System Global Area (SGA): Shared Pool, DB Buffer Cache, Redo Log Buffer. Files: Flashback Log, Archive Logs, Control Files, DB Files, Redo Logs, Oracle Binaries.]

Tuning Oracle DB Buffer Cache
Buffer Cache is the primary database I/O avoidance option!
Old Wisdom
If the buffer hit% is > 90% it's good enough
New Wisdom
Depending on workload, a higher hit% may provide significant improvements
– For a given workload with a buffer hit% of 98%, a 1 percentage point increase (to 99%) will reduce physical I/O requests by 50%, because the miss rate drops from 2% to 1%
– Reducing IOPS typically also improves response time for the remaining I/Os
– In many cases, adding server memory may be cheaper than adding I/O subsystem cache memory or short-stroking disks
Evaluate the impact of increasing db_cache_size on physical I/O
Monitor for and address potential impact:
– Increased logical read rates and higher peak CPU demand due to reduced I/O wait time (increase CPU capacity as appropriate)
– System paging due to memory shortage (add physical memory as necessary)

Oracle Options for Data Storage on AIX
[Table: support matrix whose cell markings were not preserved. Columns: (JFS) / JFS2, RAW, GPFS, ASM, ACFS (11.2.0.2). Rows: Database Files, Redo Log Files, Control Files, Archive Log Files, Oracle Binaries. Legend: Unsupported upgrade after 11gR2, or new installs with 11gR2 or later.]

JFS/JFS2 environments - Oracle Database Files
Data Base Files (DBF)
– I/O size ranges from db_block_size to db_block_size * db_file_multiblock_read_count
– Use CIO (or DIO for JFS) or filesystem cache, depending on I/O characteristics
– If block size is >= 4096, use a filesystem block size of 4096, else use 2048
Redo Log/Control Files
– I/O size is always a multiple of 512 bytes
– Use CIO (or DIO for JFS) and set filesystem block size (agblksize) to 512
Archive Log and Backup Files
– Don't use CIO or DIO
– 'rbrw' mount option can be advantageous
Flashback Log Files
– Writes are sequential, sized as a multiple of db_block_size
– By default, dbca will configure a single location for the flash recovery area, used for flashback logs, archive logs, and backup logs
– Flashback Log files should use CIO, DIO, or 'rbrw' mount
Oracle Binaries
– Don't use CIO or DIO
– Use NOATIME to reduce 'getcwd' overhead
System Root (/) Filesystem
– Use NOATIME to reduce 'getcwd' overhead (requires "bosboot", "reboot")

Asynchronous I/O for filesystem environments
AIX parameters
– aio_minservers = minimum # of AIO server processes
– aio_maxservers = maximum # of AIO server processes
– aio_maxreqs = maximum # of concurrent AIO requests
– "enable" at system restart (not required with AIX 6.1 or AIX 7.1)
– aio_server_inactivity = time before idle AIO processes will be terminated (AIX 6.1 and AIX 7.1 only)
AIX 5.3 settings are often too low for Oracle workloads; recommend using the AIX 6.1 defaults
Oracle parameters
– disk_asynch_io = TRUE
– filesystemio_options = {ASYNCH | SETALL}
– db_writer_processes (leave at default)
– db_writer_io_slaves (do not set when using AIO)

Asynchronous I/O for filesystem environments…
Monitor Oracle usage:
• Watch the alert log and *.trc files in the BDUMP directory for the warning message: Warning "lio_listio returned EAGAIN"
  This usually indicates that maxservers and/or maxreqs is set too low, or that the IO sub-system is not able to support the IO workload. Check the AIX system log for errors before increasing the number of AIO servers!
Monitor from AIX:
• "pstat -a | grep aios" (for AIO server processes)
• Use the "-A" option for NMON (interactive or spreadsheet mode)
• iostat -Aq
(AIX 5.3 – ensure AIO is enabled!)
(Note: initial AIX 7.1 and some older AIX 6.1 TL levels do not work correctly)

Asynchronous I/O – Parameter Tuning
AIX 6.1 and AIX 7.1
– Use ioo command to change
– Defaults are good starting points:
  aio_minservers = 3 (per logical CPU)
  aio_maxservers = 30 (per logical CPU) (maybe increase to 50)
  aio_maxreqs = 65536
  aio_server_inactivity = 300
AIX 5.3
– Use aioo (or 'smitty aio') command to change
– Recommended starting points:
  minservers = 3 (Systemwide) (default = 1)
  maxservers = 50 (Per CPU) (default = 10)
  maxreqs = 65536 (default = 4096)
  "enable" at system restart (default = disable)
Note: In AIX 6.1 and AIX 7.1, AIO processes end when they are not used, whereas in AIX 5.3 a started AIO process runs until the system is rebooted.

Asynchronous IO (AIO) fastpath
Raw Devices or ASM (rhdisk) environments use kernelized or "fastpath" AIO and not AIO processes
– Better performance compared to non-fast_path
– The AIO parameters discussed earlier do not apply
– No aio server processes => "pstat -a | grep aios | wc -l" is not relevant, use "iostat -A" instead
[Figure: I/O path between Application, AIX Kernel and Disk, steps 1 to 5]
• Raw Devices / ASM:
  Verify the AIO configuration with:
    AIX 5L: lsattr -El aio0
    AIX 6.1, AIX 7.1: ioo -Fa
  Enable asynchronous IO fast_path:
    AIX 5L: chdev -a fastpath=enable -l aio0 (default since AIX 5.3)
    AIX 6.1, AIX 7.1: ioo -p -o aio_fastpath=1 (default setting)
• JFS2 with CIO and AIX 5.3 TL5+:
  Activate fsfast_path (comparable to fast_path but for JFS2 + CIO):
    AIX 5L: add the following line to /etc/inittab: aioo:2:once:aioo -o fsfastpath=1
    AIX 6.1, AIX 7.1: ioo -p -o aio_fsfastpath=1 (default setting)
Note: Apply APAR IZ74245 for AIX 6.1 or IZ59538 for AIX 5.3 to fix a potential sequential read issue with ASM and "fastpath" IO.
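For filesystem-based databases that do not use fastpath, the per-logical-CPU AIO settings from the parameter tuning slide translate into system-wide AIO server counts. The Python sketch below shows that multiplication. It assumes that on AIX 6.1 and AIX 7.1 both aio_minservers and aio_maxservers scale per logical CPU, as annotated on that slide. The function name and the 16-logical-CPU LPAR are hypothetical.

```python
def aio_server_range(logical_cpus: int,
                     aio_minservers: int = 3,
                     aio_maxservers: int = 30) -> tuple:
    """Return (minimum, maximum) system-wide AIO server counts.

    Assumption: both tunables are per logical CPU (AIX 6.1 / 7.1 defaults
    of 3 and 30 are taken from the slide above).
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return logical_cpus * aio_minservers, logical_cpus * aio_maxservers


if __name__ == "__main__":
    # Hypothetical LPAR: 4 virtual processors with SMT-4 = 16 logical CPUs
    low, high = aio_server_range(16)
    print(f"defaults: between {low} and {high} AIO servers")
    # Same LPAR with aio_maxservers raised to 50, as the slide suggests
    low, high = aio_server_range(16, aio_maxservers=50)
    print(f"maxservers=50: between {low} and {high} AIO servers")
```

Because the ceiling scales with the logical CPU count, enabling SMT or adding virtual processors raises the number of AIO servers that can be started without any change to the tunables.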
Cached vs. non-Cached (Direct) I/O
File system caching tends to benefit heavily sequential workloads with low write content, due to sequential read ahead.
To enable caching for JFS/JFS2:
– Use default filesystem mount options
– Set Oracle filesystemio_options=ASYNCH (default)
DIO tends to benefit heavily random access workloads, and CIO, in addition to the DIO benefits, tends to benefit heavy update workloads.
To disable JFS, JFS2 caching:
– In 9i, set filesystemio_options=ASYNCH and use the dio (JFS) or cio (JFS2) file system mount option
– In 10g/11g:
  • If Oracle files do not need to be concurrently accessed by external utilities, set filesystemio_options=SETALL
  • Otherwise set filesystemio_options=ASYNCH and use the dio (JFS) or cio (JFS2) mount option
– With Oracle 11.2.0.2+ and AIX 6.1+, always use filesystemio_options=SETALL and do not specify the dio / cio FS mount option
When using DIO/CIO, the FS buffer cache is not used. Consider the following Oracle DB changes:
– Increase db_cache_size
– Increase db_file_multiblock_read_count if set in init.ora, but see Notes
Read Metalink Note #s 272520.1, 257338.1, 360287.1, 232935.1

AIX rendev command with ASM
The rendev command is used for renaming devices which are listed in the ODM
Syntax / Description
– rendev -l <original name> -n <new name>
– The device entry under /dev will be renamed corresponding to <new name>
– Certain devices such as /dev/console, /dev/mem, /dev/null, and others that are identified only with /dev special files cannot be renamed
– The command will fail for any device that does not have both a Configure and an Unconfigure method
– Any name that is 15 characters or less and not already used in the system can be used
If used to rename hdisk devices for ASM use, it is recommended that you keep the "hdisk" prefix, as this will allow the default ASM discovery string to match the renamed hdisks. The corresponding rhdisk is renamed as well.
Example:
# rendev -l hdisk10 -n hdiskASM10
# ls /dev/*ASM*
/dev/hdiskASM10 /dev/rhdiskASM10

AIX lkdev command with ASM
The lkdev command locks the specified device. Any attempt to modify device attributes by using the chdev or chpath command is denied. In addition, an attempt to delete the specified device or one of its paths from the ODM by using either the rmdev or rmpath command is denied.
Syntax: lkdev [ -l <Name> -a | -d [ -c <Text> ] ]
  -l <Name>   Name of device to be changed (required)
  -a          Locks the specified device.
  -d          Unlocks the specified device.
  -c <Text>   Specifies a text of up to 64 printable characters with no embedded spaces.
Examples:
– To enable the lock for the hdiskASM10 disk device and create a text label, enter the following command:
  # lkdev -l hdiskASM10 -a -c ASMdisk
– To remove the lock for the hdiskASM10 disk device and remove the text label, enter the following command:
  # lkdev -l hdiskASM10 -d
Note: The text label of a locked device cannot be changed! Instead, the device needs to be unlocked first and then locked again with the new text label specified.
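The naming rules quoted above for rendev (15 characters or less, not already in use, "hdisk" prefix kept for ASM discovery) and for the lkdev label (up to 64 printable characters, no embedded spaces) can be checked before a rename is scripted. The Python sketch below mirrors only those stated rules; the helper names are hypothetical, and the real commands apply their own validation, which may be stricter.

```python
import string


def valid_rendev_name(new_name: str, existing_names: set) -> bool:
    """Rule from the slide: 15 characters or less and not already used."""
    return 0 < len(new_name) <= 15 and new_name not in existing_names


def keeps_hdisk_prefix(new_name: str) -> bool:
    """Recommendation from the slide so the default ASM discovery string still matches."""
    return new_name.startswith("hdisk")


def valid_lkdev_label(text: str) -> bool:
    """Rule from the slide: up to 64 printable characters, no embedded spaces."""
    return 0 < len(text) <= 64 and all(
        ch in string.printable and not ch.isspace() for ch in text
    )


if __name__ == "__main__":
    in_use = {"hdisk0", "hdisk10"}  # hypothetical device names already in the ODM
    name = "hdiskASM10"
    print(name, valid_rendev_name(name, in_use), keeps_hdisk_prefix(name))
    print("ASMdisk", valid_lkdev_label("ASMdisk"))
```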
AIX – Oracle Network

Network parameters (no)
use_isno = 1 means any parameters set at the interface level override parameters set with 'no'
– This is the default value (restricted parameter in AIX 7.1 and AIX 6.1)
If use_isno = 0, any parameters set with 'no' override interface-specific parameters
If use_isno = 1, set parameters for each interface using 'ifconfig' or 'chdev' (check with: lsattr -E -l en0 -H)
Refer to the following URL for a chart on appropriate interface-specific parameters:
– http://publib.boulder.ibm.com/infocenter/systems/topic/com.ibm.aix.prftungd/doc/prftungd/prftungd.pdf
Generally appropriate parameters for Gigabit Ethernet Oracle interfaces:
– tcp_sendspace = 262144
– tcp_recvspace = 262144
– rfc1323 = 1
Examples:
# no -p -o tcp_sendspace=262144
# no -p -o tcp_recvspace=262144
# no -p -o rfc1323=1

Routing Table Entry Locking
There are 2 alternative locking strategies for Routing Table entries (rtentry): simple and complex
– The current default locking strategy is "simple" (*1)
The Simple Performance Lock Analysis Tool (splat) may be used to monitor rtentry lock performance
The complex locking strategy can improve performance when there is a lot of activity on Routing Table entries
– Can be enabled by setting rtentry_lock_complex=1 (default in AIX 7.1)
Example:
# no -p -o rtentry_lock_complex=1
*1: Parameter not available in AIX 5.3; AIX 7.1 default value is "complex" (value of 1)

AIX – Oracle Miscellaneous

AIX - Oracle: Miscellaneous (1/2)
/etc/security/limits
– Set to "-1" for everything except "core" for the Oracle user, but make sure the DBA does not misconfigure the SGA size!
sys0 attribute maxuproc
– Should be >= 16384 (verified by installer!)
– maxuproc > 128 + SUM of PROCESSES + PARALLEL_MAX_SERVERS for all DB instances in LPAR
Oracle environment variables: AIXTHREAD_SCOPE=S
Use 64-bit AIX kernel (32-bit kernel only available in AIX 5.3 and earlier)
Time synchronization – use the "-x" flag with xntpd
– Edit /etc/rc.tcpip, search for xntpd and add "-x" (lower case x) to the line for xntpd:
  # Start up Network Time Protocol (NTP) daemon
  start /usr/sbin/xntpd "$src_running" "-x"

AIX - Oracle: Miscellaneous (2/2)
Oracle hot patching (11.2.0.2 and later)
– Online patches (11gR2+) should only be used when the patch needs to be applied urgently and database downtime cannot be scheduled
– It is strongly recommended to roll back all online patches and replace them with regular (offline) patches at the next instance shutdown
– See MOS note 761111.1 for further details
Oracle 11.2.0.2, 11.2.0.3 require USLA Heap patch:
• 13443029 (requires AIX 6.1 TL07 SP2 or AIX 7.1 TL01 SP2)
OR
• 10190759 (disables hot patching)
For further details / latest updates please check: http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102066

My system doesn't behave
What tools are available in AIX to investigate the cause?
Performance Monitoring and Tuning Tools
Status and Initial Monitor Commands
– CPU: vmstat, topas, nmon, iostat, ps, mpstat, lparstat, sar, time/timex, emstat/alstat
– Memory: vmstat, topas, nmon, ps, lsps, ipcs, lparstat
– I/O Subsystem: vmstat, topas, nmon, iostat, lvmstat, lsps, lsattr/lsdev, lspv/lsvg/lslv
– Network: netstat, topas, nmon, atmstat, entstat, tokstat, fddistat, nfsstat, ifconfig
– Processes & Threads: ps, pstat, topas, nmon, emstat/alstat
Monitor Commands
– CPU: netpmon
– Memory: svmon, netpmon, filemon
– I/O Subsystem: fileplace, filemon
– Network: netpmon, tcpdump
– Processes & Threads: svmon, truss, kdb, dbx, gprof, kdb, fuser, prof
Trace Level Commands
– CPU: tprof, curt, splat, trace, trcrpt
– Memory: trace, trcrpt
– I/O Subsystem: trace, trcrpt
– Network: iptrace, ipreport, trace, trcrpt
– Processes & Threads: truss, pprof, curt, splat, trace, trcrpt
Tuning tools
– CPU: schedo, fdpr, bindprocessor, bindintcpu, nice/renice, setpri
– Memory: vmo, rmss, fdpr, chps/mkps
– I/O Subsystem: ioo, lvmo, chdev, migratepv, chlv, reorgvg
– Network: no, chdev, ifconfig
– Processes & Threads: nfso, chdev, fdpr
Note: In AIX 5.3 TL09+, AIX 6.1 TL02+, AIX 7.1 releases nmon == topas_nmon and is part of AIX base install!

NMON
nmon -M -^ -f -d -T -A -s10 -c9999 (as root user!)
– -M: detailed memory information per page size (4K, 64K, 16M, 16G)
– -^: FC adapter statistics
– -f: spreadsheet mode
– -d: disk service times section
– -T: collect TOP and UARG information
– -A: include AIO statistics
– -s 10: 10 second capture interval
– -c 9999: number of intervals to run
To stop the data collection cleanly:
– kill -USR2 <PID of nmon process>
Creates by default a file in the current directory: <server name>_<date>_<time>.nmon
Note: Try "nmon -h" to get the full list of available options.
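The -s and -c values in the nmon command above determine how long a recording runs: interval length times number of intervals. The Python sketch below works through that arithmetic for the slide's values. The function names are hypothetical, and the calculation ignores any startup or shutdown overhead of nmon itself.

```python
import math
from datetime import timedelta


def recording_duration(interval_seconds: int, count: int) -> timedelta:
    """Total capture time for 'nmon -s <interval_seconds> -c <count>'."""
    return timedelta(seconds=interval_seconds * count)


def intervals_needed(interval_seconds: int, hours: float) -> int:
    """Smallest -c value that covers the requested number of hours."""
    return math.ceil(hours * 3600 / interval_seconds)


if __name__ == "__main__":
    # Values from the slide: -s10 -c9999
    print("Recording length:", recording_duration(10, 9999))
    # Hypothetical requirement: cover exactly 24 hours at 10 second intervals
    print("-c for 24 hours:", intervals_needed(10, 24))
```

With -s10 -c9999 the recording runs for a little under 28 hours unless it is stopped earlier with kill -USR2.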
nmon On Demand Recording (ODR)
New function ideal for benchmarks, proof-of-concepts and problem analysis
Allows "high-resolution" recordings to be made while in monitoring mode
– Records samples at the interactive monitoring rate
– AIX 5.3 TL12, AIX 6.1 TL05 and AIX 7.1
Usage
– Start nmon, use "[" and "]" brackets to start and end a recording
  • Records standard background recording metrics, not just what is on screen
  • You can adjust the recorded sampling interval with -s [seconds] on startup; the interactive options "-" and "+" (<shift> +) do NOT change the ODR interval
– Generates a standard nmon recording of format: <host>_<YYYYMMDD>_<HHMMSS>.nmon
– Tested with nmon Analyser, and works well

NMON Analyser – Which Worksheets to look at
Kernel settings, Disk devices, …:
– BBBx
CPU:
– SYS_SUMM, PROC (RunQueue), TOP; LPAR or CPU_ALL, depending on whether the LPAR is deployed in a shared CPU pool or not
– Try the nmon_analyser option "PIVOT" to see in one graph what processes are using the CPU; requires TOP data!
Memory:
– MEMUSE, MEMPAGES64K, PAGE, TOP
Disk IO:
– DISK_x, EMC_x, ESS_x, PROCAIO, PROC (Swap-In)
Network:
– NET, NETPACKET, NETSIZE
Tip: Try the "PIVOT" option to see CPU utilization by process type.

Where to find AIX / Oracle documentation?
– IBM TechDocs
– My Oracle Support (MOS) notes
– IBM Redbooks (www.ibm.com/redbooks)
– AIX / PowerVM Wiki
– AIX InfoCenter pages on the WEB

IBM TechDocs - Technical Sales Library
http://www.ibm.com/support/techdocs
Oracle Architecture and Tuning on AIX v2.20 (soon to be updated to 2.3)
– http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100883
Configuring IBM TotalStorage for Oracle OLTP Applications
– http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100319
Diagnosing Oracle® Database Performance on AIX® Using IBM® NMON and Oracle Statspack Reports
– http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101720
Breaking the Oracle I/O Performance Bottleneck
– http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS3885
Oracle Technology Essential White Papers (regularly updated!)
– http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101559
There are many more Oracle related white papers, especially covering Oracle RAC with IBM servers and IBM storage.

Oracle Documentation
My Oracle Support: http://support.oracle.com (regularly updated!)
– 282036.1 - Minimum Software Versions and Patches Required to Support Oracle Products on IBM Power Systems
– 756671.1 - Oracle Recommended Patches -- Oracle Database
Oracle Reference Manuals: http://otn.oracle.com/documentation/index.html
Oracle Certification Info (on MOS as well): http://otn.oracle.com/support/metalink/index.html

AIX
AIX 5.3 Product Documentation
– http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.pseries.doc/hardware.htm
AIX 6.1 Product Documentation
– http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp
AIX 7.1 Product Documentation
– http://publib.boulder.ibm.com/infocenter/aix/v7r1/index.jsp
– http://www.redbooks.ibm.com/cgi-bin/searchsite.cgi?query=sg247910 (IBM AIX Version 7.1 Differences Guide)
IBM Wikis
– https://www.ibm.com/developerworks/wikis/dashboard.action
AIX Wiki
– http://www.ibm.com/developerworks/wikis/display/WikiPtype/Home
AIX Performance Tools (nmon, nmon analyser/consolidator, etc)
– http://www.ibm.com/developerworks/wikis/display/WikiPtype/nmon
AIX DeveloperWorks
– http://www.ibm.com/developerworks/aix
AIX multiple page support
– http://www-03.ibm.com/systems/resources/systems_p_os_aix_whitepapers_multiple_page.pdf
Tuning IBM AIX 5L V5.3 and AIX 6.1 for Oracle Database on POWER systems
– http://www-304.ibm.com/partnerworld/wps/servlet/ContentHandler/whitepaper/aix/oracle/performance_analysis
PowerVM Wiki
– https://www.ibm.com/developerworks/wikis/display/virtualization/Home

AIX / POWER
AIXpert Blog on Local, Near and Far Memory
– https://www.ibm.com/developerworks/mydeveloperworks/blogs/aixpert/entry/local_near_far_memory_part_1_large_power7_boxes_more_local_memory26?lang=en
Oracle Database and 1 TB Segment Aliasing (TD105761)
– http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105761
IBM EnergyScale for POWER7 Processor-Based Systems
– ftp://public.dhe.ibm.com/common/ssi/ecm/en/pow03039usen/POW03039USEN.PDF
Active Memory Expansion: Overview and Usage Guide
– ftp://ftp.software.ibm.com/common/ssi/sa/wh/n/pow03037usen/POW03037USEN.PDF
IBM PowerVM Virtualization Active Memory Sharing
– http://www.redbooks.ibm.com/abstracts/redp4470.html?Open
IBM System p Advanced POWER Virtualization (PowerVM) Best Practices
– http://www.redbooks.ibm.com/abstracts/redp4194.html?Open
Power Systems Enterprise Servers with PowerVM Virtualization and RAS
– http://www.redbooks.ibm.com/abstracts/sg247965.html?Open

AIX APARs to check
– IZ88711: BIND64 CORES WITH -BLAZY OPTION ON AIX61
– IZ91983: LOCKL PERFORMANCE ISSUE
– IZ94396: THERE IS A TIMING ISSUE BETWEEN THE SYNC DAEMON AND A MAPPED FIL APPLIES TO AIX 6100-06
– IZ97088: SMT 4 scheduling fix / improvement
– IV10172: WAITPROC IDLE LOOPING CONSUMES CPU
– IV06194: SRAD LOAD BALANCING ISSUES ON SHARED LPARS
– IV10259: MISCELLANEOUS DISPATCHER/SCHEDULING PERFORMANCE FIXES
– IV03903: Address space lock contention issue
– IZ76101: Scale Light Load borrowing (Multi-SRAD Scaling issue)
– IV11857: Slow startup of AIO processes (workaround: use kernelized AIO, or increase aio_minservers and set aio_server_inactivity to 86400)
– IZ71987 (AIX 6.1), IZ67445 (AIX 5.3 TL12): Paging Space Growth May Occur Unexpectedly With 64K (medium) Pages Enabled
– IV11261: SYSTEM CRASH IN AS_FORK_ALIAS IF ESID_ALLOCATOR IS ENABLED
– IV23851 (AIX 6.1), IV23859 (AIX 7.1): APPLICATIONS RUN SLOWLY WITH HIGH SYSTEM TIME
– IV20880: This problem requires use of the shared symbol table and has been seen on AIX 6.1 TL7 and 7.1 TL1 when using Oracle 11gR2 with on-line patching. This includes versions 11.2.0.2 / 11.2.0.3 with Oracle Patch 13443029.
– IV09580 (AIX 6.1 TL07), IV09541 (AIX 7.1 TL01): FILE.ATION OVERFLOW REPORTED IN ERROR WHILE LINKING LARGE binaries
– IV26272: 4K page size working storage page stealing when the 64K page size is in a maxpin condition and there are 4K file cache pages available for stealing.
AIX APARs are available here: http://www-933.ibm.com/support/fixcentral/

Important Oracle BUGs (11gR2) to check in context of AIX
Mutex Wait:
– Bug 10411618: ADD DIFFERENT WAIT SCHEMES FOR MUTEX WAITS
  Note: 11.2.0.2.2 PSU breaks this patch and an additional patch (12431716) is required
– Master Note: WAITEVENT: "library cache: mutex X" [ID 727400.1]
Bug 12740358: DBMS_UTILITY.FORMAT_CALL_STACK is Still Slower Than 10G. This issue can be observed in AIX as a high number of system calls (millions) to functions like "sigaction()".
Bug 9842771: Wrong SREADTIM and MREADTIM statistics in AUX_STATS$
Bug 12596494: Generally Higher CPU Usage in 11.2.0.2 than 10.2.0.4
MOS 1062676.1 - ORAAGENT or ORAROOTAGENT High Resource (CPU, Memory etc) Usage
11.2.0.3 RAC shows high CPU usage in ologgerd and heavy writes to $GRID_HOME/crf/db (9+ MB/s). Patch 11.2.0.3.1 GI+PSU resolves this issue.

Q&A
Presenter email address: dannert <at> us <dot> ibm <dot> com