Best Practices to Optimize Your Domino Server Performance
Andy Pedisich
Technotics
© 2011 Wellesley Information Services. All rights reserved.

In This Session ...
• Performance is a tangible, measurable phenomenon
  • It makes the difference between providing excellent service and having disgruntled users
• Your users won’t mention it when performance is rocking
  • They probably won’t even notice it, which is fine by me
  • But they will be vocal about it when performance is poor
• This session is designed to help administrators optimize the performance of their Domino servers

What We’ll Cover …
• Understanding the issues
• Maxing the speed of mail
• Making the Web swing
• Tuning Domino for dominance
• Wrap-up

1. Don’t Trust Operating System Teams for Statistics
• Some administrators rely on the operating system teams to tell them important statistical information about servers
  • This is a major mistake, since the OS teams can’t provide important information about Domino, such as:
    • Concurrent users per hour
    • Server transactions per hour
    • Domino cluster replication statistics
  • These are all important to know if you are going to tune your environment

Collect Your Own Domino Statistics
• Designate one server in your domain as the collector by creating a Statistics Collection document in EVENTS4.NSF
  • That’s the Monitoring Configuration database
  • Collect stats every 60 minutes

Add the Collect Task to the Server Collecting Stats
• Enter LOAD COLLECT at the console of the collecting server to get things started
• Add the COLLECT task to the SERVERTASKS= parameter of the collecting server
• Enter the following command to start collecting immediately
    TELL COLLECTOR COLLECT
  • This confirms that everything is working correctly because it forces statistics to show in STATREP.NSF

2. Get Off on the Right Foot
• Performance is often based on user perceptions
• Sometimes a performance problem that is perceived as a server problem is actually a problem with the client or mail file
• Enter these parameters into the NOTES.INI of the client
    CLIENT_CLOCK=1
    DEBUG_CONSOLE=1
  • You’ll see a debug console window open when you start the Lotus Notes client

Cryptic, but Useful Information
• You’ll see what the client is doing
  • How long it takes to access servers and perform regular tasks
• Watch for “indexing” that takes more than 20 seconds when deleting mail or moving a document to a folder
  • It’s a sign of having a lot of documents in the mail file’s inbox folder
  • We’ll talk more about that in a few moments
• If you want to capture the debug information to a file, use this parameter in the NOTES.INI on the Notes client
    DEBUG_OUTFILE=(name of file)

What We’ll Cover …
• Understanding the issues
• Maxing the speed of mail
• Making the Web swing
• Tuning Domino for dominance
• Wrap-up

3. Control Threads on Large Mail Messages
• If a big message is sent to a large group of people, all other mail backs up until it’s delivered to all recipients
  • That’s because a bunch of threads will be used to process the message to the group
• This is a parameter you can use in the NOTES.INI of the server that makes the router use one thread for messages larger than the size you set, leaving other threads to deliver other mail
    RouterMaxConcurrentDeliverySize=<size in bytes>
  • Downside: big messages might take longer to deliver
  • Upside: frees up the router for other messages

4. Use Multiple Mail.boxes
• Server tasks need exclusive access to mail.box
  • They lock it to prevent access by other processes
  • Other processes must wait for an unlock before they run
• Multiple mail.boxes remove contention for mail.box
  • Lets multiple concurrent processes act on messages
  • Increases server throughput
• Set this in the server configuration document, on the Router/SMTP, Basics tab

The Challenge
• How many mail.boxes are right for your server?
  • As a general rule, I typically use two by default
  • I have not seen any problems using this rule of thumb
  • But some administrators do not want to make a configuration more complicated than it needs to be
• How does one determine the right number?

Check Stats to See If You Need Additional Mail.boxes
• Check two critical Mail.Mailbox stats to determine if more mail.boxes are needed
  • Mail.Mailbox.Accesses
    • Total number of times threads accessed any mailbox on the server
  • Mail.Mailbox.AccessConflicts
    • Total number of times a thread attempting to access a mailbox had to wait because the number of concurrent threads exceeded the number of mailboxes configured
    • Example: Two mailboxes are configured and three concurrent accesses occur. This results in the conflict stat being incremented.

Calculation to Determine Additional Mail.boxes
• Do the following calculation
    (Mail.Mailbox.AccessConflicts / Mail.Mailbox.Accesses) x 100
  • The result should be less than 2
  • If it is consistently greater than 2, another mailbox should be configured
  • Beyond four mail.boxes, the improvement gained with each additional mail.box is increasingly smaller
• But who wants to calculate these stats every day?
  • There’s a better way!
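
The conflict calculation on the slide above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the session materials: the function names, the threshold constant, and the sample numbers are hypothetical, and in practice the two inputs would be the Mail.Mailbox.Accesses and Mail.Mailbox.AccessConflicts values reported by the server or collected in STATREP.NSF.

```python
# Rule of thumb from the slide: the conflict percentage should stay below 2.
CONFLICT_THRESHOLD_PERCENT = 2.0


def conflict_percent(accesses: int, access_conflicts: int) -> float:
    """Return (AccessConflicts / Accesses) x 100, or 0.0 when there were no accesses."""
    if accesses <= 0:
        return 0.0
    return access_conflicts / accesses * 100


def needs_another_mailbox(accesses: int, access_conflicts: int) -> bool:
    """True when the conflict percentage is above the 2 percent guideline."""
    return conflict_percent(accesses, access_conflicts) > CONFLICT_THRESHOLD_PERCENT


if __name__ == "__main__":
    # Hypothetical sample values, not measurements from a real server.
    samples = {"ServerA": (120_000, 1_200), "ServerB": (80_000, 4_000)}
    for server, (accesses, conflicts) in samples.items():
        pct = conflict_percent(accesses, conflicts)
        if needs_another_mailbox(accesses, conflicts):
            verdict = "consider adding a mail.box"
        else:
            verdict = "ok"
        print(f"{server}: {pct:.2f}% conflicts, {verdict}")
```

A single reading over the threshold is not enough on its own; the guideline is about values that are consistently greater than 2, so a check like this is best run against several collection intervals.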

New Version of STATREP.NSF
• Your conference CD has a new version of the Monitoring Results template
  • “Technotics Statrep 2011”
• You’ll find a view called Mail.box conflicts that shows the critical calculation
  • You can check on all your servers to ensure configuration meets the email demands of your domain

Going from One Mail.box to More
• If a server is configured for only one mail.box and you go to additional ones, Domino keeps mail.box and adds a mail1.box and a mail2.box
  • Domino will create the two legitimate boxes but will leave the mail.box
  • To eliminate confusion, delete the old mail.box
  • Don’t forget to make sure there isn’t mail in the mail.box you are going to delete
  • Use the Admin client’s file listing, filtered to show mailboxes only, to open the old mail.box and make sure it’s empty

Opening Mail.boxes
• Just remember that using multiple mail.boxes causes some unusual behavior
  • If you do a File Open Database MAIL.BOX
    • You might get mail2.box
    • You might get mail1.box
  • Use the Notes Administration client’s Messaging tab to open mail.boxes so that you are sure you are opening what is actually being used by your server

Another Issue with Multiple Mail.boxes
• When multiple mail.boxes are enabled, they are named mail1.box, mail2.box, and so on
  • The router uses only these mail.boxes for routing messages
• However, some third-party applications require the existence of a “mail.box”
  • To force Domino to use mail.box when multiple mail.boxes are enabled, add this parameter to the NOTES.INI file:
      Mail_Enable_Mailbox_Compatibility=1
  • When set, this parameter keeps the “mail.box” file and will still create mail2.box, mail3.box, and so on

Transaction Logging and the Mail.boxes
• If you’re using transaction logging, make sure that transaction logging is disabled for the mail.boxes
  • There are some known issues with servers crashing when transaction logging is enabled on them
• But often new mail.boxes are created and administrators forget to disable this feature
  • Starting in release 6.0.4/6.5.4 there is a parameter to prevent this from happening

Typo in Administration Help
• The parameter is:
    MailBoxDisableTXNLogging=1
  • It works really well
• Be aware that the parameter is spelled incorrectly in both the ND6 and ND7 versions of the Administrator Help database

5. Keep as Few Documents in Inbox as Possible
• We all know large mail files are a problem, right?
  • This is true if only from the perspective of disk space
• But the issue is bigger than just disk space
  • And here’s the proof you can take back to your domain
• IBM/Lotus did a study using Domino on the iSeries called: Sizing Large-Scale Domino Workloads on iSeries
  • They found that reducing the number of documents kept in the inbox
    • Reduces overall CPU usage
    • Improves response time
    • And can dramatically improve startup/recovery performance

It’s Very Logical When You Think About It
• In terms of performance, the Inbox is the most “expensive” container in a mail file
  • The Inbox folder contains all new messages a mail file receives
  • It must be updated each time a user opens the file
  • Or clicks Refresh to see new mail
• The more documents kept in the Inbox folder, the more expensive it is to refresh the view of it
  • Reducing the number of documents in the folder reduces the CPU and main storage required to update the view of it

What Can You Do About It?
• Three things you can do about this problem
  • First, when a user calls and says that Notes is slow, ask this question:
    • How many messages are in your inbox?
    • This should be a standard part of your help desk response
    • Urge them to keep no more than 90 days of mail in the inbox
    • Use CLIENT_CLOCK=1 to demonstrate how indexing the inbox is a major problem

The Art of Archiving
• Second, archive documents in the inbox that are more than 90 days old
  • You can do this with a policy
  • Have it take effect using COMPACT -A

Use Release 8.x Inbox Manager
• Third, control the number of messages in the inbox using settings in the AdminP section of the server document
  • AdminP can start an agent in the user’s mail file to remove messages from the Inbox
  • This can also be controlled from policies
• The messages are not deleted
  • They are still in the All Documents view
• Users need to know where the messages can be found

6. Control User Polling for New Mail
• Some users want to know if they have new mail
  • They configure a user preference to check for new mail every couple of minutes
• If there are a lot of users on a server, a setting like this can really hurt performance

Override the User Configuration for New Mail Polling
• Add this parameter to the mail server’s NOTES.INI to control how often a client can check for new mail
    MinNewMailPoll=(number of minutes)
  • Experiment with this number, but 15 is safe
• This parameter overrides the user’s selection in the Mail Setup dialog box
  • This can prevent frequent polling from affecting server performance
• Parameters like this one should be in every server’s NOTES.INI
  • That’s why they belong in a server configuration document

What We’ll Cover …
• Understanding the issues
• Maxing the speed of mail
• Making the Web swing
• Tuning Domino for dominance
• Wrap-up

7. Allow Concurrent Web Agents
• Agents that are initiated by an HTTP app aren’t managed by the AMGR task
  • They are managed by the HTTP task
  • By default, they do not run concurrently, which produces performance bottlenecks when multiple users hit the server
• Allow them to run concurrently using the server document
  • Internet Protocols/Domino Web Engine

8. Restrict Potential Problems with Long Web Agents
• While you’re editing the server document you might want to control how long a Web agent can run
  • The default is 0 seconds, meaning it can run for as long as it wants to run
  • The key here is to balance the needs of your users against the requirements for keeping the Web server running well for everyone

9. Default to the Lite DWA/iNotes
• If your server is running Release 8.0.1 or higher, consider using iNotes lite as the default setting by using this parameter in the server’s NOTES.INI
    iNotes_WA_DefaultUI=dwa_lite
• It’s slicker, faster, and a little light on features
  • But users can quickly switch to the full version when needed

10. Disable Resumable SSL Sessions
• When a server creates SSL keys for encryption, it’s a processor-intensive task
• It’s best to do it as infrequently as possible
• By default, Domino caches SSL information from the 50 most recently negotiated sessions
• Use this parameter to make Domino keep an unlimited amount of key caches
    SSL_RESUMABLE_SESSIONS=0
  • There is a pretty big improvement in performance with no effect whatsoever on security

11. Disable HTTP Server Logging
• We’ve found many instances where DOMLOG.NSF was well over 2GB
  • And it was nearly impossible to wait for it to open
  • Because it had never actually been opened before
• If you don’t look at the logs, improve performance by disabling the HTTP server logging
  • It’s in the HTTP section of the server document
  • Under Enable Logging, disable both Log files and Domlog.nsf

What We’ll Cover …
• Understanding the issues
• Maxing the speed of mail
• Making the Web swing
• Tuning Domino for dominance
• Wrap-up

12. Use Transaction Logging
• Transaction logging can increase performance significantly
  • Enable transaction logging in the server document
• T-Logs might already be in use in Archive logging style if servers are backed up incrementally
  • Otherwise, use the Circular logging style so that transaction logging reuses space
  • But be careful where you put the logs

Location of Transaction Logs
• Transaction logs work best if placed on RAID 1 disks
  • These are mirrored drives
  • And should be local to the server
• These logs should not be placed
  • On the Wintel system drive C:
  • On the same drive as the Domino data
  • On a SAN drive

13. Change the View Temp File Default Folder
• By default, Domino generates temp files in the server’s temporary folder when it rebuilds a view
  • On Windows servers, the default is usually somewhere on the system drive C:
  • If the system doesn’t have a temp folder, Domino puts the temp files in the Domino data folder
• Because of the disk I/O and disk space required, you should change the location to a different drive
• Use this parameter:
    VIEW_REBUILD_DIR=(drive and folder location)
  • Make sure you have plenty of space available

Make Sure There Is Plenty of Space Available
• If Domino calculates that there isn’t enough space on the temporary folder’s drive, it uses a slower method to rebuild the view
  • You’ll see the message below in the log and console
  • It’s best to remedy this with more disk space or performance will actually drop
    Warning: Unable to use optimized view rebuild for view due to insufficient disk space at directory. Estimate may need x million bytes for this view. Using standard rebuild instead.

14. Disconnect Idle Users
• An idle user stays connected to a server for 4 hours
  • This takes up valuable server resources
• Use this parameter to drop idle users faster
    SERVER_SESSION_TIMEOUT=(number of minutes)
  • Users will not have to re-enter a password if they become active after the time limit
  • The minimum recommended setting is 30-45 minutes
  • A lower setting may negatively impact server performance
• IBM/Lotus says it’s not needed in R8
  • But I like to use the parameter regardless
  • It gives you more realistic user concurrency stats

1,000 Users: Server_session_timeout=60
• Comparison of memory usage on a Domino server

650 Users: Server_session_timeout=30
• Domino server memory comparison with and without the parameter set to 30

650 Users: Server_session_timeout=30 (cont.)
• CPU utilization comparison with and without the parameter
  • There is definitely more CPU utilization when logging idle users off after 30 minutes of inactivity
  • The resource savings and reduction in virtual memory usage are worth it

15. Prevent Simple Search on DBs with No Full Text Index
• Simple search is the type of processing used when a user searches a non-full text indexed application
  • The simple search algorithm does the job, but is not very efficient
  • It can significantly impact performance on a Domino server
  • Simple search is simply awful sometimes
• For some applications, the ability to search documents may not really be necessary
  • However, the default functionality still allows users to do simple searches on applications that are non-full text indexed

Preventing Simple Searches
• Administrators can now prevent simple searches if an application is not full text indexed
  • Enable this by selecting “Don’t allow simple search” on the Advanced tab of Database Properties

Preventing Simple Searches (cont.)
• If users attempt a simple search of a database with this option enabled, they will receive an error message
  • This will probably generate a few help desk calls
  • Be prepared by providing info about this feature if you’re going to deploy it

Property Doesn’t Replicate
• Keep in mind that the “Don’t allow simple search” property does not replicate for existing database replicas
  • This lets you decide selectively whether each replica should have the setting enabled
  • The setting is carried over to new replicas and copies

16. Don’t Maintain Unread Marks on All Databases
• Replication of unread marks was primarily designed for mail databases
  • If you don’t need unread marks, don’t maintain or replicate them, because they can significantly slow database performance
• For example, keep them switched off in Help, LOG.NSF, NAMES.NSF, and any reference application
  • Work with your developers to develop standards for enabling or disabling the feature

Remember to Compact to Activate
• The setting is in the Advanced Properties section of databases
• If you select or deselect the “Don’t maintain unread marks” property, the database must be compacted
  • Otherwise, the changed setting has no effect

17. Manage Agents to Control Resources
• Check the view in the “Technotics Statrep 2011” that exposes how much time the Agent Manager uses
  • Use it to better understand how busy the Agent Manager is
• Mail servers are vulnerable to pressure from agents if users have Designer or higher access

In Large Domains Consider Using an Agent Server
• Some large domains with several application servers host the same application on several servers
  • There is a danger of replication conflicts when two different agents run on two different servers
• Create an agent server where all agents run
  • Replicate all applications requiring agents back to the agent server
• Since the agent server has no real users, you can run 10 concurrent agents 24 hours a day

18. Make Sure Partitioned Servers Have Enough Resources
• One of the servers on a partitioned server can use a large amount of system resources
  • The resources are denied to other partitioned servers on that computer
• Use OS level tools to evaluate server performance
  • If there are issues, you must consider moving the offending server to a different computer
• If it’s strictly a resource problem causing slow disk performance, you can move the data directory to a different disk drive

19. Manage Cluster Synchronization and Load Balancing
• Cluster replication keeps the database on the primary server in sync with the replica on the failover server
  • Cluster replication is an event-driven process that occurs automatically when a change is made to a database
  • It’s vital that these replicas are synchronized
  • But, by default, servers in a cluster only have a single cluster replicator thread between them

Can the Single Cluster Replicator Keep Up?
• Occasionally there is too much data changing to be replicated efficiently by a single cluster replicator
  • If cluster replicators are too busy, replication is queued until more resources are available and databases get out of sync
  • Then a database on a failover server does not have all the data it’s supposed to have
• If users must fail over to a replica on a different server, they think their information is gone forever!
  • All because the replicas do not have the same content
  • Users need their cluster insurance!

How Many Is Enough?
• Adding a cluster replicator will help fix this problem
  • Use this parameter in the NOTES.INI
      CLUSTER_REPLICATORS=#
  • Add one dynamically from the console using this command
      Load clrepl
• The challenge is to have enough cluster replicators without adding too many
  • Adding too many cluster replicators will have a negative effect on server performance
• Here are some important statistics to watch so that you can make a wise decision about how many to add!

Key Stats for Vital Information About Cluster Replication
Statistic                           What It Tells You                                          Acceptable Values
Replica.Cluster.SecondsOnQueue      Total seconds that last DB replicated spent on work queue  < 15 sec (light load), < 30 sec (heavy load)
Replica.Cluster.SecondsOnQueue.Avg  Average seconds a DB spent on work queue                   Use for trending
Replica.Cluster.SecondsOnQueue.Max  Maximum seconds a DB spent on work queue                   Use for trending
Replica.Cluster.WorkQueueDepth      Current number of databases awaiting cluster replication   Usually zero
Replica.Cluster.WorkQueueDepth.Avg  Average work queue depth since the server started          Use for trending
Replica.Cluster.WorkQueueDepth.Max  Maximum work queue depth since the server started          Use for trending

What to Do About Stats Over the Limit
• Acceptable Replica.Cluster.SecondsOnQueue
  • The queue is checked every 15 seconds, so under light load the value should be less than 15
  • Under heavy load, if the number is larger than 30, another cluster replicator should be added
• If the above statistic is low and Replica.Cluster.WorkQueueDepth is constantly higher than ten …
  • Perhaps your network bandwidth is too low
  • Consider setting up a private LAN for cluster replication traffic

Stats That Have Meaning but Have Gone Missing
• The Technotics Statrep 2011 tracks the key statistics you need to help adjust your clusters
  • It also has a column for the Server Availability Index

Balance the Load Between Cluster Members
• Make sure clustered servers don’t run out of resources
• Force failover to another server if a server reaches a predetermined level of resource usage
• Use the following parameter to mark the server BUSY, so that new sessions fail over to another cluster member, when its availability index falls below 90
    SERVER_AVAILABILITY_THRESHOLD=90
  • Adjust this parameter based on your own experiences

20. Plan on a Monthly Restart for Domino Servers
• Consider regular monthly restarts of Domino servers
  • Not just Wintel-based servers, all servers
• Server memory allocation and shared memory fragmentation can occur over time
  • Plus there could be undocumented memory leaks
• Regular restarts will help ensure your Domino servers are running as efficiently as possible

What We’ll Cover …
• Understanding the issues
• Maxing the speed of mail
• Making the Web swing
• Tuning Domino for dominance
• Wrap-up

Resources
• How to limit the number of threads used for sending large messages
  • www-01.ibm.com/support/docview.wss?uid=swg21108351
• Always use the proper directory design for the highest release of Domino in your domain
  • www-01.ibm.com/support/docview.wss?uid=swg21304915
• Notes/Domino Best Practices: Transaction Logging
  • www-01.ibm.com/support/docview.wss?uid=swg27009309
• Lotus Domino server maintenance tips
  • www-01.ibm.com/support/docview.wss?uid=swg21248830

7 Key Points to Take Home
• Make sure you are collecting statistics from all servers hourly so that you can observe trends and remediate problems
• Turn on client debugging when it looks like servers are running fine, but a user is complaining about slow performance
• Keep as few documents as possible in a mail file’s inbox
• Defaulting to iNotes Lite will ensure good performance for Web users
• Use transaction logging even if it is not part of your backup/restore strategy
• Disconnect idle users
• Don’t maintain unread marks on all databases

Your Turn!
How to contact me:
Andy Pedisich
Andyp@technotics.com
www.andypedisich.com