Checkpoint and Redo Tuning

Copyright © 2006, Oracle. All rights reserved.

Objectives

After completing this lesson, you should be able to do the following:
• Diagnose checkpoint and redo issues
• Implement Fast Start MTTR Target
• Monitor the performance impact of Fast Start MTTR Target
• Tune the redo chain
• Size the redo log file
• Size the redo log buffer

Checkpoint and Redo

• Checkpoint:
  – Transfers changed data to disk
  – Makes buffer space available for more data blocks
  – Controls mean time to recover (MTTR)
• Redo:
  – Recovers committed data not on disk
  – Provides uncommitted data for rollback operation
  – Provides source data for complete recovery

Oracle Database Architecture

[Diagram: the instance and its files.
 SGA: fixed size area, default, keep, recycle, and nK buffer caches, redo log buffer, shared pool, large pool, Java pool, Streams pool, sort extent pool, global context pool, flashback buffer, ASH buffer.
 Background processes: PMON, SMON, RECO, MMON, MMAN, MMNL, QMNC, LGWR, CTWR, ARCn, S000, RVWR, D000, FMON, Qnnn, CKPT, CJQ0, PSP0, DBWn.
 Files: password file, spfile, control files, redo log files, data files (SYSTEM, SYSAUX, temp, undo), change tracking file, flashback logs, archive log files.]

Checkpoint Architecture

The checkpoint architecture provides:
• Checkpoint position: A starting position in the redo logs to begin recovery
• Checkpoint target: A calculated position in the redo logs where the checkpoint position should be
• An estimated mean time to recover
• A high-performance incremental checkpoint
• A full checkpoint when required
Database Writer (DBWn) Process

Background information

[Diagram: DBWn writes from the database buffer cache in the SGA to the data files.]

DBWn writes when one of the following events occurs:
• Checkpoint
• Dirty buffers’ threshold
• No free buffers
• Timeout
• RAC ping request
• Tablespace OFFLINE
• Tablespace READ ONLY
• Table DROP or TRUNCATE
• Tablespace BEGIN BACKUP

Checkpoint (CKPT) Process

Responsible for:
• Signaling DBWn at checkpoints
• Updating data file headers with checkpoint information
• Updating control files with checkpoint information

[Diagram: CKPT, DBWn, and the database buffer cache in the SGA.]

Redo Architecture

Redo is designed for minimum performance impact.
• Server processes write to the redo log buffer:
  – Circular buffer
  – Memory-to-memory writes
  – Small fast writes
• LGWR writes log buffer blocks to log files:
  – Circular files
  – Memory-to-disk write
  – Full blocks if possible
• ARCn copies log files to archive log files:
  – Disk-to-disk writes
  – Multiple archiver processes can be started.

The Redo Log Buffer

[Diagram: a server process with its user global area runs the statement below. The SGA holds the redo log buffer, the database buffer cache, and the shared pool (library cache and data dictionary cache). LGWR and ARCn are shown with the control files, data files, redo log files, and archived log files.]

    SQL> UPDATE employees
      2  SET salary=salary*1.1
      3  WHERE employee_id=736;

Redo Log Files and LogWriter

[Diagram: Log Writer (LGWR) writes from the redo log buffer in the SGA to redo log Groups 1, 2, and 3.]

Redo log files:
• Record changes to the database
• Should be multiplexed to protect against loss

LogWriter writes:
• At commit
• When one-third full
• Every three seconds
• Before DBWn writes
Archiver (ARCn)

• Is an optional background process
• Automatically archives online redo log files when ARCHIVELOG mode is set for the database
• Preserves the record of all changes made to the database

[Diagram: redo log buffer in the SGA, LogWriter (LGWR), and Archiver (ARCn).]

Incremental Checkpointing

[Diagram: the checkpoint queue holds buffers b1 to b4; the redo stream holds changes c1 to c5, with the checkpoint position, the target RBA, and the tail of the redo thread marked. Intervals t1 and t2 are shown. With FAST_START_MTTR_TARGET=T, the incremental checkpoint keeps t1 + t2 < T.]

Incremental Checkpoint and Log File Size

• The maximum checkpoint lag is:
      90% * (SUM(log_size_i) – MAX(log_size_i))
• Checkpoint lag is designed to prevent log switch from blocking.
• A few small log files can result in excess checkpoint writes.

[Diagram: log file #1 and log file #2, 10,000 blocks each; the target checkpoint trails the current tail by 9,000 blocks.]

Adjusting the Checkpoint Rate

The checkpoint rate is determined by the most aggressive of:
• FAST_START_MTTR_TARGET parameter (only on Enterprise Edition)
• Size of the smallest redo log file
• LOG_CHECKPOINT_TIMEOUT parameter (overrides FAST_START_MTTR_TARGET if set)
• LOG_CHECKPOINT_INTERVAL parameter (overrides FAST_START_MTTR_TARGET if set)

Redo Logfile Size Advisor

• This advisor determines the optimal size of your online redo logs:
  – No additional checkpoint writes beyond those caused by FAST_START_MTTR_TARGET.
• FAST_START_MTTR_TARGET must be set.

View name:    V$INSTANCE_RECOVERY
Column name:  OPTIMAL_LOGFILE_SIZE
Description:  This column shows the redo log file size (in megabytes) that is considered as minimal.
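The maximum checkpoint lag formula from the Incremental Checkpoint and Log File Size slide can be checked with a few lines of arithmetic. The following is a minimal sketch, not Oracle code: the function name `max_checkpoint_lag` and the list-of-sizes input are assumptions made for illustration, and sizes are taken in redo blocks as in the slide's diagram.

```python
def max_checkpoint_lag(log_sizes_blocks):
    """Return 90% * (SUM(log_size_i) - MAX(log_size_i)) in redo blocks.

    log_sizes_blocks: sizes of the online redo log files, one entry per file.
    (Hypothetical helper; it only restates the slide's formula.)
    """
    if len(log_sizes_blocks) < 2:
        raise ValueError("at least two redo log files are required")
    total = sum(log_sizes_blocks)
    largest = max(log_sizes_blocks)
    return 0.9 * (total - largest)


if __name__ == "__main__":
    # Two files of 10,000 blocks each, as in the slide's diagram.
    print(max_checkpoint_lag([10_000, 10_000]))
    # Three small files: the allowed lag shrinks, so checkpoints are forced sooner.
    print(max_checkpoint_lag([1_000, 1_000, 1_000]))
```

With two 10,000-block files the result is 9,000 blocks, which matches the diagram. With three 1,000-block files the allowed lag drops to 1,800 blocks, which illustrates why a few small log files can cause excess checkpoint writes.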
Impact of the Checkpoint Rate

From the V$ views:

    SELECT c.value - nc.value
    FROM   V$SYSSTAT c, V$SYSSTAT nc
    WHERE  c.name  = 'physical writes'
    AND    nc.name = 'physical writes non checkpoint';

From the Statspack report:

    Statistic                           Total
    ------------------------------ ----------
    physical writes                    47,308
    physical writes non checkpoint     44,674

Automatic Checkpoint Tuning

• Continuous manual tuning is no longer needed.
• Automatic checkpoint tuning is best-effort checkpointing with little overhead.
• It reduces average recovery time by making use of unused bandwidth.
• Automatic checkpoint tuning is enabled when FAST_START_MTTR_TARGET is not explicitly set to zero.

ADDM Report: Checkpoints

[Screenshot]

ADDM Report: Redo Logs

[Screenshot]

Statspack and AWR Reports

Checkpoint and redo problems show certain symptoms:
• Alert log shows log switch not complete
• I/O symptoms caused by excessive checkpoints
• More than four log switches per hour
• log file and latch: redo events in Top Timed Events

    Top 5 Timed Events                                   Avg  %Total
    ~~~~~~~~~~~~~~~~~~                                  wait    Call
    Event                      Waits     Time (s)       (ms)    Time
    ----------------------- -------- ------------ ---------- -------
    CPU time                                  551               56.8
    log file parallel write    3,899          201         52    20.7
    log file sync                823           58         71     6.0
    latch: redo copy             635           44         70     4.6
    latch: redo allocation     1,109           42         38     4.4
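The V$SYSSTAT query on the Impact of the Checkpoint Rate slide subtracts 'physical writes non checkpoint' from 'physical writes'; the difference is the number of writes attributable to checkpoints. As a rough sketch, the same subtraction can be applied to the Statspack figures from that slide. The function names below are hypothetical, and the share calculation is an added illustration rather than something the slide states.

```python
def checkpoint_writes(physical_writes, non_checkpoint_writes):
    """Writes caused by checkpoints: total writes minus non-checkpoint writes."""
    return physical_writes - non_checkpoint_writes


def checkpoint_share(physical_writes, non_checkpoint_writes):
    """Fraction of all physical writes that were caused by checkpoints."""
    if physical_writes <= 0:
        raise ValueError("physical_writes must be positive")
    return checkpoint_writes(physical_writes, non_checkpoint_writes) / physical_writes


if __name__ == "__main__":
    # Figures taken from the Statspack excerpt on the slide.
    total, non_ckpt = 47_308, 44_674
    print(checkpoint_writes(total, non_ckpt))
    print(f"{checkpoint_share(total, non_ckpt):.1%}")
```

For the slide's figures this gives 2,634 checkpoint writes, or about 5.6% of all physical writes.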
Check Parameters

Review checkpoint parameters for reasonable values:

    Parameter Name               Begin value
    ---------------------------- -----------
    fast_start_mttr_target       25

Use V$MTTR_TARGET_ADVICE for optimum value:

    SQL> SELECT mttr_target_for_estimate,
      2>        estd_total_ios, estd_total_io_factor
      3> FROM V$MTTR_TARGET_ADVICE ORDER BY 1

    MTTR_TARGET_FOR_ESTIMATE ESTD_TOTAL_IOS ESTD_TOTAL_IO_FACTOR
    ------------------------ -------------- --------------------
                          20        2436690               1.0739
                          22        2330674               1.0272
                          25        2268973                    1
                          37        2204817                .9717
                          62        2181841                .9616

Check the Redo Log Size

Review the current size of the redo log files:

[Screenshot]

Check alert log for log switch rate:

[Screenshot]

Redo Log Chain Tuning

Redo tuning starts with the slowest part.
• Reduce the amount of redo generated.
• Check archive logging (waits for archiving needed).
• Check the redo log file size and log switch rate.
• Check the checkpoint parameters.
• Look for log space requests.
• The Redo Buffer Allocation Retries value should be near 0 and should be less than 1% of redo entries.

Reducing Redo Operations

Ways to avoid logging bulk operations in the redo log:
• Direct Path loading without archiving does not generate redo.
• Direct Path loading with archiving can use NOLOGGING mode.
• Direct Load INSERT can use NOLOGGING mode.
• Some SQL statements can use NOLOGGING mode.

Increasing the Performance of Archiving

• Share the archiving work during a temporary increase in workload:
      ALTER SYSTEM ARCHIVE LOG ALL TO <log_archive_dest>
• Increase the number of archiver processes with LOG_ARCHIVE_MAX_PROCESSES.
• Multiplex the redo log files, and add more members.
• Change the number of archive destinations:
  – LOG_ARCHIVE_DEST_n
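In the V$MTTR_TARGET_ADVICE output on the Check Parameters slide, ESTD_TOTAL_IO_FACTOR is the estimated total I/O for a candidate target divided by the estimated total I/O at the current target (25 in the example). The sketch below recomputes the factors from the ESTD_TOTAL_IOS column of that output. The function name `io_factors` and the dictionary layout are assumptions for illustration only.

```python
def io_factors(estd_total_ios, current_target):
    """Return {mttr_target: I/O factor relative to the current target}.

    estd_total_ios: mapping of MTTR_TARGET_FOR_ESTIMATE -> ESTD_TOTAL_IOS.
    current_target: the FAST_START_MTTR_TARGET value in effect.
    """
    baseline = estd_total_ios[current_target]
    return {
        target: round(ios / baseline, 4)
        for target, ios in sorted(estd_total_ios.items())
    }


if __name__ == "__main__":
    # Rows copied from the advice output on the slide.
    advice = {20: 2436690, 22: 2330674, 25: 2268973, 37: 2204817, 62: 2181841}
    for target, factor in io_factors(advice, current_target=25).items():
        print(target, factor)
```

Read this way, raising the target from 25 to 37 is estimated to save just under 3% of total I/O, while lowering it to 20 is estimated to cost about 7% more.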
Diagnostic Tools

• Views: V$ARCHIVE_DEST, V$ARCHIVED_LOG, V$ARCHIVE_PROCESSES
• Parameters: LOG_ARCHIVE_DEST_n, LOG_ARCHIVE_DEST_STATE_n

[Diagram: the views and parameters above, shown with the archived logs.]

Redo Log Groups and Members

[Diagram: LGWR writes to Groups 1, 2, and 3; each group has one member on Disk 1 and one member on Disk 2.]

Online Redo Log File Configuration

• Size redo log files to minimize contention.
• Provide enough groups to prevent waiting.
• Store redo log files on separate, fast devices.
• Monitor the redo log file configuration with:
  – V$LOGFILE
  – V$LOG
  – V$LOG_HISTORY

Monitoring Online Redo Log File I/O

[Screenshot]

Sizing the Redo Log Buffer

The size of the redo log buffer is determined by:
• LOG_BUFFER parameter
• Remaining space in the fixed area granule

Default value: Either 2 MB or 128 KB × the value of CPU_COUNT, whichever is greater

Diagnosing Redo Log Buffer Inefficiency

[Diagram: two server processes run the statements below; LGWR writes to the redo log files and ARCH copies them to the archived log files.]

    SQL> UPDATE employees
      2  SET salary=salary*1.1
      3  WHERE employee_id=736;

    SQL> DELETE FROM employees
      2  WHERE employee_id=7400;

Diagnosing Log Buffer Problems

• V$SESSION_WAIT: Log Buffer Space event
• V$SYSSTAT: Redo Buffer Allocation Retries, Redo Entries

[Diagram: the redo log buffer with the view entries above.]

Log Space Request Waits: Further Investigation

Possible reasons for log space request waits:
• There is disk I/O contention on redo log files.
• LGWR is waiting on DBWn to complete the checkpointing of the required redo log file.
• LGWR is waiting on ARCn to complete archiving of the required redo log file.
• Log buffer is too small.
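The two V$SYSSTAT statistics named on the Diagnosing Log Buffer Problems slide can be combined into the check given under Redo Log Chain Tuning: Redo Buffer Allocation Retries should be near 0 and below 1% of Redo Entries. The following is a minimal sketch of that check. The function names and the sample values are hypothetical and do not come from any report in this lesson.

```python
def retry_ratio(allocation_retries, redo_entries):
    """Redo Buffer Allocation Retries as a fraction of Redo Entries."""
    if redo_entries <= 0:
        raise ValueError("redo_entries must be positive")
    return allocation_retries / redo_entries


def log_buffer_needs_review(allocation_retries, redo_entries, threshold=0.01):
    """True when retries reach the 1% guideline from the lesson (default threshold)."""
    return retry_ratio(allocation_retries, redo_entries) >= threshold


if __name__ == "__main__":
    # Hypothetical sample values, as they might be read from V$SYSSTAT.
    retries, entries = 120, 50_000
    print(f"{retry_ratio(retries, entries):.2%}")
    print(log_buffer_needs_review(retries, entries))
```

A ratio at or above the threshold does not by itself show that the log buffer is too small; the possible reasons listed under Log Space Request Waits still have to be ruled out.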
Practice Overview: Diagnose Checkpoints and Redo

This practice covers the following topics:
• Diagnose checkpoint and redo issues
• Resize log files
• Adjust the checkpoint parameters

Summary

In this lesson, you should have learned how to:
• Diagnose checkpoint and redo issues
• Implement Fast Start MTTR Target
• Monitor the performance impact of Fast Start MTTR Target
• Implement multiple database writers
• Tune the redo chain
• Size the redo log file
• Size the redo log buffer