Best Practices to Optimize Your Domino Server Performance
Andy Pedisich
Technotics
© 2011 Wellesley Information Services. All rights reserved.
In This Session ...
• Performance is a tangible, measurable phenomenon
  - It makes the difference between providing excellent service and having disgruntled users
• Your users won’t mention it when performance is rocking
  - They probably won’t even notice it, which is fine by me
  - But they will be vocal about it when performance is poor
• This session is designed to help administrators optimize the performance of their Domino servers
What We’ll Cover …
• Understanding the issues
• Maxing the speed of mail
• Making the Web swing
• Tuning Domino for dominance
• Wrap-up
1. Don’t Trust Operating System Teams for Statistics
• Some administrators rely on the operating system teams to tell them important statistical information about servers
  - This is a major mistake, since the OS teams can’t provide important information about Domino, such as:
    - Concurrent users per hour
    - Server transactions per hour
    - Domino cluster replication statistics
  - These are all important to know if you are going to tune your environment
Collect Your Own Domino Statistics
• Designate one server in your domain to collect statistics by creating a Statistics Collection document in EVENTS4.NSF
  - That’s the Monitoring Configuration database
  - Collect stats every 60 minutes
Add the Collect Task to the Server Collecting Stats
• Enter LOAD COLLECT at the console of the collecting server to get things started
• Add the COLLECT task to the SERVERTASKS= parameter of the collecting server
• Enter the following command to start collecting immediately
  - TELL COLLECTOR COLLECT
  - This will ensure that everything is working correctly because it will force statistics to show in STATREP.NSF
2. Get Off on the Right Foot
• Performance is often based on user perceptions
  - Sometimes a performance problem that is perceived as a server problem is actually a problem with the client or mail file
• Enter these parameters into the NOTES.INI of the client
  - CLIENT_CLOCK=1
  - DEBUG_CONSOLE=1
• You’ll see a debug console window open when you start the Lotus Notes client
Cryptic, but Useful Information
• You’ll see what the client is doing
  - How long it takes to access servers and perform regular tasks
• Watch for “indexing” that takes more than 20 seconds when deleting mail or moving a document to a folder
  - It’s a sign of having a lot of documents in the mail file’s inbox folder
  - We’ll talk more about that in a few moments
• If you want to capture the debug information to a file, use this parameter in the NOTES.INI on the Notes client
  - DEBUG_OUTFILE=(name of file)
3. Control Threads on Large Mail Messages
• If a big message is sent to a large group of people, all other mail backs up until it’s delivered to all recipients
  - That’s because a bunch of threads will be used to process the message to the group
• This is a parameter you can use in the NOTES.INI of the server that will make the router use one thread for messages above the given size, leaving other threads to deliver other mail
  - RouterMaxConcurrentDeliverySize=<size in bytes>
  - Downside: big messages might take longer to deliver
  - Upside: frees up the router for other messages
4. Use Multiple Mail.boxes
• Server tasks need exclusive access to mail.box
  - They lock it to prevent access by other processes
  - Other processes must wait for an unlock before they run
• Multiple mail.boxes reduce contention for mail.box
  - They let multiple concurrent processes act on messages
  - They increase server throughput
• Set this in the server Configuration Settings document, Router/SMTP, Basics tab
The Challenge
• How many mail.boxes are right for your server?
  - As a general rule, I typically use two by default
  - I have not seen any problems using this rule of thumb
• But some administrators do not want to make a configuration more complicated than it needs to be
  - How does one determine the right number?
Check Stats to See If You Need Additional Mail.boxes
• Check two critical Mail.Mailbox stats to determine if more mail.boxes are needed
  - Mail.Mailbox.Accesses
    - Total number of times threads accessed any mailbox on the server
  - Mail.Mailbox.AccessConflicts
    - Total number of times a thread attempting to access a mailbox had to wait because the number of concurrent threads exceeded the number of mailboxes configured
    - Example: Two mailboxes are configured and three concurrent accesses occur. This results in the conflict stat being incremented.
Calculation to Determine Additional Mail.boxes
• Do the following calculation
  - (Mail.Mailbox.AccessConflicts / Mail.Mailbox.Accesses) x 100
  - The result should be less than 2
  - If it is consistently greater than 2, another mailbox should be configured
  - Beyond four mail.boxes, the improvement gained with each additional mail.box is increasingly smaller
• But who wants to calculate these stats every day?
  - There’s a better way!
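As a rough illustration of the calculation above, the following Python sketch computes the conflict percentage from the two stat values and applies the 2 percent rule of thumb. The function names and sample numbers are hypothetical and are not part of Domino; in practice the inputs would be the Mail.Mailbox.AccessConflicts and Mail.Mailbox.Accesses values reported by your server.

```python
def mailbox_conflict_percent(access_conflicts: int, accesses: int) -> float:
    """Return (AccessConflicts / Accesses) x 100, or 0.0 if there were no accesses."""
    if accesses <= 0:
        return 0.0
    return 100 * access_conflicts / accesses


def needs_another_mailbox(access_conflicts: int, accesses: int,
                          limit_percent: float = 2.0) -> bool:
    """Apply the rule of thumb: above 2 percent suggests adding a mail.box."""
    return mailbox_conflict_percent(access_conflicts, accesses) > limit_percent


# Hypothetical sample values, not taken from a real server.
print(mailbox_conflict_percent(150, 5000))  # 3.0
print(needs_another_mailbox(150, 5000))     # True
print(needs_another_mailbox(50, 5000))      # False
```

A single reading above 2 is not enough on its own; the guidance is to act when the result is consistently above 2, so this check is best run against several collection intervals.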
New Version of STATREP.NSF
• Your conference CD has a new version of the Monitoring Results template, “Technotics Statrep 2011”
• You’ll find a view called Mail.box conflicts that shows the critical calculation
  - You can check on all your servers to ensure their configuration meets the email demands of your domain
Going from One Mail.box to More
• If a server is configured for only one mail.box and you go to additional ones, Domino keeps mail.box and adds a mail1.box and a mail2.box
  - Domino will create the two legitimate boxes but will leave the mail.box
  - To eliminate confusion, delete the old mail.box
  - Don’t forget to make sure there isn’t mail in the mail.box you are going to delete
  - In the Admin client, use the file listing filtered to mailboxes only to open the old mail.box and make sure it’s empty
Opening Mail.boxes
• Just remember that using multiple mail.boxes causes some unusual behavior
  - If you do a File > Open Database > MAIL.BOX
    - You might get mail2.box
    - You might get mail1.box
  - Use the Notes Administration client’s messaging tab to open mail.boxes so that you are sure you are opening what is actually being used by your server
Another Issue with Multiple Mail.boxes
• When multiple mail.boxes are enabled, they are named mail1.box, mail2.box, and so on
  - The router uses these mail.boxes only for routing messages
  - However, some third-party applications require the existence of a “mail.box”
• To force Domino to use mail.box when multiple mail.boxes are enabled, add this parameter to the NOTES.INI file:
  - Mail_Enable_Mailbox_Compatibility=1
  - When set, this parameter keeps the “mail.box” file and will still create mail2.box, mail3.box, and so on
Transaction Logging and the Mail.boxes
• If you’re using transaction logging, make sure that transaction logging is disabled for the mail.boxes
  - There are some known issues with servers crashing when transaction logging is enabled on mail.boxes
  - But often new mail.boxes are created and administrators forget to disable this feature
• Starting in release 6.0.4/6.5.4 there is a parameter to prevent this from happening
Typo in Administration Help
• The parameter is:
  - MailBoxDisableTXNLogging=1
  - It works really well
• Be aware that the parameter is spelled incorrectly in both the ND6 and ND7 versions of the Administrator Help database
5. Keep as Few Documents in Inbox as Possible
• We all know large mail files are a problem, right?
  - This is true if only from the perspective of disk space
  - But the issue is bigger than just disk space
  - And here’s the proof you can take back to your domain
• IBM/Lotus did a study using Domino on the iSeries called:
  - Sizing Large-Scale Domino Workloads on iSeries
• They found that reducing the number of documents kept in the inbox
  - Reduces overall CPU usage
  - Improves response time
  - And can dramatically improve startup/recovery performance
It’s Very Logical When You Think About It
• In terms of performance, the Inbox is the most “expensive” container in a mail file
  - The Inbox folder contains all new messages a mail file receives
  - It must be updated each time a user opens the file
  - Or clicks Refresh to see new mail
• The more documents kept in the Inbox folder, the more expensive it is to refresh the view of it
  - Reducing the number of documents in the folder reduces the CPU and main storage required to update the view of it
What Can You Do About It?
• Three things you can do about this problem
  - First, when a user calls and says that Notes is slow, ask this question:
    - How many messages are in your inbox?
    - This should be a standard part of your help desk response
    - Urge them to keep no more than 90 days of mail in the inbox
    - Use CLIENT_CLOCK=1 to demonstrate how indexing the inbox is a major problem
The Art of Archiving
• Second, archive documents in the inbox that are more than 90 days old
  - You can do this with a policy
  - Have it take effect using COMPACT -A
Use Release 8.x Inbox Manager
• Third, control the number of messages in the inbox using settings in the AdminP section of the server document
  - AdminP can start an agent in the user’s mail file to remove messages from the Inbox
  - This can also be controlled from policies
• The messages are not deleted
  - They are still in the All Documents view
• Users need to know where the messages can be found
6. Control User Polling for New Mail
• Some users want to know if they have new mail
• They configure a user preference to check for new mail every couple of minutes
  - If there are a lot of users on a server, a setting like this can really hurt performance
Override the User Configuration for New Mail Polling
• Add this parameter to the mail server’s NOTES.INI to control how often a client can check for new mail
  - MinNewMailPoll=(number of minutes)
  - Experiment with this number, but 15 is safe
• This parameter overrides the user’s selection in the Mail Setup dialog box
  - This can prevent frequent polling from affecting server performance
• Parameters like this one should be in every server’s NOTES.INI
  - That’s why they belong in a server configuration document
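To get a feel for why the polling interval matters, here is a back-of-the-envelope Python sketch. It assumes a deliberately simple model in which every connected user polls at exactly the same interval; the function name and the user count are hypothetical, and real load depends on much more than the raw number of checks.

```python
def polls_per_hour(users: int, poll_minutes: int) -> int:
    """Estimate new-mail checks per hour if every user polls at the same interval."""
    if users < 0 or poll_minutes <= 0:
        raise ValueError("users must be >= 0 and poll_minutes must be > 0")
    return users * 60 // poll_minutes


# Hypothetical server with 1,000 connected users.
print(polls_per_hour(1000, 2))   # 30000 checks per hour at a 2-minute interval
print(polls_per_hour(1000, 15))  # 4000 checks per hour at a 15-minute minimum
```

Under these assumptions, raising the minimum interval from 2 to 15 minutes cuts the number of new-mail checks by a factor of 7.5.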
7. Allow Concurrent Web Agents
• Agents that are initiated by an HTTP app aren’t managed by the AMGR task
  - They are managed by the HTTP task
  - By default, they do not run concurrently, which produces performance bottlenecks when multiple users hit the server
• Allow them to run concurrently using the server document
  - Internet Protocols/Domino Web Engine
8. Restrict Potential Problems with Long Web Agents
• While you’re editing the server document you might want to control how long a Web agent can run
  - The default is 0 seconds, meaning it can run for as long as it wants to run
  - The key here is to balance the needs of your users with the requirements for keeping the Web server running well for everyone
9. Default to the Lite DWA/iNotes
• If your server is running Release 8.0.1 or higher, consider using iNotes Lite as the default setting by using this parameter in the server’s NOTES.INI
  - iNotes_WA_DefaultUI=dwa_lite
• It’s slicker, faster, and a little light on features
  - But users can quickly switch to the full version when needed
10. Disable Resumable SSL Sessions
• When a server creates SSL keys for encryption, it’s a processor-intensive task
  - It’s best to do it as infrequently as possible
• By default, Domino caches SSL information from the 50 most recently negotiated sessions
• Use this parameter to make Domino keep an unlimited number of key caches
  - SSL_RESUMABLE_SESSIONS=0
• There is a pretty big improvement in performance with no effect whatsoever on security
11. Disable HTTP Server Logging
• We’ve found many instances where DOMLOG.NSF was well over 2GB
  - And it was nearly impossible to wait for it to open
  - Because it had never actually been opened before
• If you don’t look at the logs, improve performance by disabling HTTP server logging
  - It’s in the HTTP section of the server document
  - Disable both the Enable Logging and Domlog.nsf settings
12. Use Transaction Logging
• Transaction logging can increase performance significantly
• Enable transaction logging in the server document
  - T-Logs might already be in use in Archive logging style if servers are backed up incrementally
  - Otherwise, use the Circular logging style so that transaction logging reuses space
  - But be careful where you put the logs
Location of Transaction Logs
• Transaction logs work best if placed on RAID 1 disks
  - These are mirrored drives
  - And should be local to the server
• These logs should not be placed
  - On the Wintel system drive C:
  - On the same drive as the Domino data
  - On a SAN drive
13. Change the View Temp File Default Folder
• By default, Domino generates temp files in the server’s temporary folder when it rebuilds a view
  - The default is usually somewhere on the system drive C: when using Windows servers
  - If the system doesn’t have a temp folder, Domino puts the temp files in the Domino data folder
• Because of the disk I/O and disk space required, you should change the location to a different drive
• Use this parameter:
  - VIEW_REBUILD_DIR=(drive and folder location)
  - Make sure you have plenty of space available
Make Sure There Is Plenty of Space Available
• If Domino calculates that there isn’t enough space on the temporary folder’s drive, it uses a slower method to rebuild the view
  - You’ll see the message below in the log and console
  - It’s best to remedy this with more disk space or performance will actually drop
Warning: Unable to use optimized view rebuild for view due to insufficient disk space at directory. Estimate may need x million bytes for this view. Using standard rebuild instead.
14. Disconnect Idle Users
• By default, an idle user stays connected to a server for 4 hours
  - This takes up valuable server resources
• Use this parameter to drop idle users faster
  - SERVER_SESSION_TIMEOUT=(number of minutes)
  - Users will not have to re-enter a password if they become active after the time limit
• The minimum recommended setting is 30-45 minutes
  - A lower setting may negatively impact server performance
  - IBM/Lotus says it’s not needed in R8
  - But I like to use the parameter regardless
    - It gives you more realistic user concurrency stats
1,000 Users — Server_session_timeout=60
• Comparison of memory usage on a Domino server
650 Users — Server_session_timeout=30
• Domino server memory comparison with and without the parameter set to 30
650 Users — Server_session_timeout=30 (cont.)
• CPU utilization comparison with and without the parameter
  - There is definitely more CPU utilization when logging idle users off after 30 minutes of inactivity
  - The resource savings and reduction in virtual memory usage are worth it
15. Prevent Simple Search on DBs with No Full-Text Index
• Simple search is the type of processing used when a user searches an application that is not full-text indexed
  - The simple search algorithm does the job, but is not very efficient
  - It can significantly impact performance on a Domino server
• Simple search is simply awful sometimes
  - For some applications, the ability to search documents may not really be necessary
  - However, the default functionality still allows users to do simple searches on applications that are not full-text indexed
Preventing Simple Searches
• Administrators can now prevent simple searches if an application is not full text indexed
  - Enable this by selecting “Don’t allow simple search” on the Advanced tab of Database Properties
Preventing Simple Searches (cont.)
• If users attempt a simple search on a database with this option enabled, they will receive an error message
  - This will probably generate a few help desk calls
  - Be prepared by providing info about this feature if you’re going to deploy it
Property Doesn’t Replicate
• Keep in mind that the “Don’t allow simple search” property does not replicate for existing database replicas
  - This lets you decide selectively whether each replica should have the setting enabled
  - The setting is carried over to new replicas and copies
16. Don’t Maintain Unread Marks on All Databases
• Replication of unread marks was primarily designed for mail databases
  - If you don’t need them, don’t replicate them because it can significantly slow database performance
• For example, keep them switched off in Help, LOG.NSF, NAMES.NSF, and any reference application
  - Work with your developers to develop standards for enabling or disabling the feature
Remember to Compact to Activate
• The setting is in the Advanced Properties section of databases
• If you select or deselect the “Don’t maintain unread marks” property, the database must be compacted
  - Otherwise, the changed setting has no effect
17. Manage Agents to Control Resources
• Check the view in the “Technotics Statrep 2011” template that shows how much time the Agent Manager uses
  - Use it to better understand how busy the Agent Manager is
• Mail servers are vulnerable to pressure from agents if users have Designer or higher access
In Large Domains Consider Using an Agent Server
• Some large domains with several application servers host the same application on several servers
  - There is a danger of replication conflicts when two different agents run on two different servers
• Create an agent server where all agents run
  - Replicate all applications requiring agents back to the agent server
• Since the agent server has no real users, you can run 10 concurrent agents 24 hours a day
18. Make Sure Partitioned Servers Have Enough Resources
• One of the Domino servers on a partitioned computer can use a large amount of system resources
  - Those resources are then denied to the other partitioned servers on that computer
• Use OS level tools to evaluate server performance
• If there are issues, you must consider moving the offending server to a different computer
  - If it’s strictly a resource problem causing slow disk performance, you can move the data directory to a different disk drive
19. Manage Cluster Synchronization and Load Balancing
• Cluster replication keeps the database on the primary server in sync with the replica on the failover server
  - Cluster replication is an event-driven process that occurs automatically when a change is made to a database
  - It’s vital that these replicas are synchronized
  - But, by default, servers in a cluster only have a single cluster replicator thread between them
Can the Single Cluster Replicator Keep Up?
• Occasionally there is too much data changing to be replicated efficiently by a single cluster replicator
  - If cluster replicators are too busy, replication is queued until more resources are available and databases get out of sync
  - Then a database on a failover server does not have all the data it’s supposed to have
• If users must fail over to a replica on a different server, they think their information is gone forever!
  - All because replicas will not have the same content
  - Users need their cluster insurance!
How Many Is Enough?
• Adding a cluster replicator will help fix this problem
  - Use this parameter in the NOTES.INI
    - CLUSTER_REPLICATORS=#
  - Add one dynamically from the console using this command
    - Load clrepl
• The challenge is to have enough cluster replicators without adding too many
  - Adding too many cluster replicators will have a negative effect on server performance
• Here are some important statistics to watch so that you can make a wise decision about how many to add!
Key Stats for Vital Information About Cluster Replication
Statistic                           What It Tells You                                           Acceptable Values
Replica.Cluster.SecondsOnQueue      Total seconds that last DB replicated spent on work queue   < 15 sec (light load); < 30 sec (heavy load)
Replica.Cluster.SecondsOnQueue.Avg  Average seconds a DB spent on work queue                    Use for trending
Replica.Cluster.SecondsOnQueue.Max  Maximum seconds a DB spent on work queue                    Use for trending
Replica.Cluster.WorkQueueDepth      Current number of databases awaiting cluster replication    Usually zero
Replica.Cluster.WorkQueueDepth.Avg  Average work queue depth since the server started           Use for trending
Replica.Cluster.WorkQueueDepth.Max  Maximum work queue depth since the server started           Use for trending
What to Do About Stats Over the Limit
• Acceptable Replica.Cluster.SecondsOnQueue
  - Queue is checked every 15 seconds, so under light load it should be less than 15
  - Under heavy load, if the number is larger than 30, another cluster replicator should be added
• If the above statistic is low and Replica.Cluster.WorkQueueDepth is constantly higher than ten …
  - Perhaps your network bandwidth is too low
  - Consider setting up a private LAN for cluster replication traffic
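The thresholds above can be turned into a simple check, sketched here in Python. The function name and return strings are hypothetical. Two interpretations are assumptions of this sketch rather than statements from the session: "low" is read as under the 15-second light-load figure, and "constantly higher than ten" is read as every collected sample being above ten.

```python
from typing import Sequence


def cluster_replication_advice(seconds_on_queue: float,
                               depth_samples: Sequence[int]) -> str:
    """Suggest an action from Replica.Cluster.SecondsOnQueue and a series of
    Replica.Cluster.WorkQueueDepth samples, using the thresholds above."""
    # Over 30 seconds on the queue: another cluster replicator is suggested.
    if seconds_on_queue > 30:
        return "add a cluster replicator"
    # Assumption: "constantly" means every sample exceeds ten.
    depth_constantly_high = bool(depth_samples) and all(d > 10 for d in depth_samples)
    # Assumption: "low" means under the 15-second light-load figure.
    if seconds_on_queue < 15 and depth_constantly_high:
        return "check network bandwidth"
    return "no change suggested"


# Hypothetical readings, not taken from a real cluster.
print(cluster_replication_advice(45, [0, 1]))        # add a cluster replicator
print(cluster_replication_advice(5, [12, 15, 11]))   # check network bandwidth
print(cluster_replication_advice(5, [0, 12, 0]))     # no change suggested
```

The depth samples would typically come from successive hourly statistics collections, so a single spike in the work queue does not trigger the bandwidth suggestion.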
Stats That Have Meaning but Have Gone Missing
• The Technotics Statrep 2011 tracks the key statistics you need to help adjust your clusters
  - It also has a column for the Server Availability Index
Balance the Load Between Cluster Members
• Make sure clustered servers don’t run out of resources
  - Force failover to another server if a server reaches a predetermined level of resource usage
• Use the following parameter to cause failover when the server is 90% busy
  - SERVER_AVAILABILITY_THRESHOLD=90
• Adjust this parameter based on your own experiences
20. Plan on a Monthly Restart for Domino Servers
• Consider regular monthly restarts of Domino servers
  - Not just Wintel-based servers, all servers
• Server memory allocation and shared memory fragmentation can occur over time
  - Plus there could be undocumented memory leaks
• Regular restarts will help ensure your Domino servers are running as efficiently as possible
Resources
• How to limit the number of threads used for sending large messages
  - www-01.ibm.com/support/docview.wss?uid=swg21108351
• Always use the proper directory design for the highest release of Domino in your domain
  - www-01.ibm.com/support/docview.wss?uid=swg21304915
• Notes/Domino Best Practices: Transaction Logging
  - www-01.ibm.com/support/docview.wss?uid=swg27009309
• Lotus Domino server maintenance tips
  - www-01.ibm.com/support/docview.wss?uid=swg21248830
7 Key Points to Take Home
• Make sure you are collecting statistics from all servers hourly so that you can observe trends and remediate problems
• Turn on client debugging when it looks like servers are running fine, but a user is complaining about slow performance
• Keep as few documents as possible in a mail file’s inbox
• Defaulting to iNotes Lite helps ensure good performance for Web users
• Use transaction logging even if it is not part of your backup/restore strategy
• Disconnect idle users
• Don’t maintain unread marks on all databases
Your Turn!
How to contact me:
Andy Pedisich
Andyp@technotics.com
www.andypedisich.com