Storage is critical to the performance of Domino. Domino is a storage-intensive application that puts heavy demands on both storage performance and storage capacity, so it's important to optimize Domino storage.
If you're an administrator running a single Domino server and supporting a dozen users, storage optimization won't be much of an issue for you. (Although even a small organization can find its Domino servers increasing in number and its message volumes growing exponentially as users learn to take advantage of Domino's features.) But as the organization grows, so does the need for storage optimization. In fact, the proliferation of messaging and collaborative tools tends to outpace the growth of the organization. Domino storage tends to grow like kudzu in a Southern summer -- 25 to 35 percent a year in installations where storage growth isn't managed. Inside the largest Domino installations, there are tens of thousands of users and millions of messages a day. This means a tremendous demand for storage resources and a huge load on Domino storage systems as well.
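To see what unmanaged growth of 25 to 35 percent a year adds up to, here is a minimal sketch of the compound-growth arithmetic. The 1,000 GB starting point and the five-year horizon are illustrative assumptions, not figures from any particular installation.

```python
def project_storage(initial_gb: float, annual_growth: float, years: int) -> list[float]:
    """Return projected capacity for year 0 through `years` under compound growth."""
    return [initial_gb * (1 + annual_growth) ** year for year in range(years + 1)]


# Assumed starting point: 1,000 GB of Domino data today.
low = project_storage(1000.0, 0.25, 5)   # 25 percent a year
high = project_storage(1000.0, 0.35, 5)  # 35 percent a year

for year, (low_gb, high_gb) in enumerate(zip(low, high)):
    print(f"year {year}: {low_gb:6.0f} GB to {high_gb:6.0f} GB")
```

Under these assumptions, the original terabyte becomes roughly 3 TB to 4.5 TB within five years, without a single user being added.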
There's another reason storage optimization is important -- Domino responds well to it. Or to put it another way: If you're not monitoring your performance metrics as the load on the system grows, Domino performance can degrade radically while your storage requirements go through the roof. Raw computing power and storage capacity can cover up the inadequacies until you reach a certain volume. Then things go south quickly.
Domino is built around databases. So the database and the note (which is contained in a database) are the fundamental units of Domino, in the same way that folders and files are the basic units of storage. But because the typical Domino server supports a lot of small databases, access tends to be much more random than on a conventional database server. So while some people assume that you should treat Domino as a DBMS, the truth is that Domino more closely resembles a file server than a DBMS when it comes to optimization. That requires a different storage optimization strategy than a huge Oracle database does.
The first rule of Domino storage optimization is "horses for courses." For best performance and efficient utilization, you need to choose the appropriate kind of storage for Domino's various functions and tune each one appropriately.
Rule 1. Put the operating system and data storage on different devices.
Put the operating system on its own RAID array, complete with its own controller. Do the same thing with the data.
To keep a Domino server operating at peak performance, it needs fast access to its components, especially the databases and transaction logs. Giving the data its own RAID array and controller maximizes throughput to the databases. However, if you want, you can combine the OS and the Domino program files on the same storage device.
Rule 2. Put the transaction logs on their own disks.
In the event of a problem, transaction logs are important for fast, accurate recovery of files and data, but they exact a performance penalty because every update produces a write to the log. To minimize this penalty, give the transaction logs their own storage device and their own controller. Ideally, this device should use RAID 1 (mirroring), since mirroring combines data protection and fast restores with fast writes.
Rule 3. Use the appropriate RAID levels.
Another advantage of allocating different storage for different operations is that you can assign different RAID levels, depending on the storage requirements.
RAID 01 (striping and mirroring) is a common choice for the OS and Domino program files. It is fast and provides a high level of redundancy. Although mirroring doubles the storage capacity needed, this is less of an issue for the OS and program files than it is for the data, because the OS and program files grow little or not at all as the system is used.
Some Domino installations either use striping alone or simply put the OS and program files on an unprotected disk. The theory is that if there is a failure, these files can be easily restored from backup. The tradeoff is that you can't fall back on a mirror or rebuild automatically (as with RAID 5) in the event of a failure. If you employ this option, it's a good idea to keep the OS and program files on separate disks to speed up restoration in the event of a failure.
For your data, the appropriate RAID level is either RAID 5 or RAID 01, depending on whether you want to optimize storage capacity or performance. RAID 5 is the most common RAID level for storing data, but it is not as fast as RAID 01 for writes, because each small RAID 5 write costs three additional disk operations: the controller must read the old data and the old parity, then write the new data and the new parity. On the other hand, RAID 01 requires twice the storage of RAID 0 (striping without redundancy) and significantly more than RAID 5.
As mentioned above, it is best to use mirroring (RAID 1) for transaction logs.
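The capacity-versus-speed tradeoff among RAID levels can be roughed out with a few lines of arithmetic. The sketch below assumes the usual textbook small-write costs (one physical I/O per write for RAID 0, two for mirrored layouts, four for RAID 5). The disk count, disk size, per-disk IOPS and write mix are all hypothetical inputs, and real controllers with write caches will behave differently.

```python
# Physical I/Os generated by one small random logical write (textbook values).
WRITE_PENALTY = {"RAID 0": 1, "RAID 1": 2, "RAID 01": 2, "RAID 5": 4}


def usable_capacity_gb(level: str, disks: int, disk_gb: float) -> float:
    """Capacity left for data after redundancy overhead."""
    if level == "RAID 0":
        return disks * disk_gb
    if level in ("RAID 1", "RAID 01"):
        return disks * disk_gb / 2      # every block is stored twice
    if level == "RAID 5":
        return (disks - 1) * disk_gb    # one disk's worth of parity
    raise ValueError(f"unknown RAID level: {level}")


def effective_iops(level: str, disks: int, iops_per_disk: float,
                   write_fraction: float) -> float:
    """Logical IOPS the array can sustain for a given read/write mix."""
    raw = disks * iops_per_disk
    cost_per_io = (1 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw / cost_per_io


# Hypothetical array: 8 disks of 146 GB, 150 IOPS each, 30 percent writes.
for level in ("RAID 0", "RAID 01", "RAID 5"):
    capacity = usable_capacity_gb(level, 8, 146)
    iops = effective_iops(level, 8, 150, 0.30)
    print(f"{level:8s} {capacity:6.0f} GB usable, about {iops:5.0f} IOPS")
```

In this model RAID 5 returns most of the raw capacity but sustains the fewest logical I/Os, while RAID 01 gives up half the capacity to stay closer to unprotected striping.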
Rule 4. Use a separate network port for cluster replication on clustered servers.
Domino's extensive use of replication is one of its most powerful features. However, replication can bog down a clustered server by overloading the communications channels. Giving cluster replication its own network port keeps that traffic off the port that serves users.
Rule 5. Use aggressive read-ahead caching appropriately.
Read-ahead caching exacts a penalty when the system guesses wrong and has to flush the cache. Data access in Domino tends toward the random, making read-ahead caching less useful. However, the log files are written and (usually) read sequentially, which makes them ideal candidates for aggressive read-ahead caching. When the transaction log has to be read as a whole or in large chunks, this can improve performance significantly.
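A toy simulation shows why the access pattern matters so much here. This is a deliberately simplified model, not a description of how any particular controller works: it assumes a single read-ahead buffer that, on every miss at block b, is refilled with blocks b through b + window - 1. The window size and block ranges are arbitrary.

```python
import random


def read_ahead_hit_ratio(accesses: list[int], window: int) -> float:
    """Fraction of reads served from a single read-ahead buffer."""
    if not accesses:
        return 0.0
    start, end = 0, 0  # buffer covers blocks [start, end); empty at first
    hits = 0
    for block in accesses:
        if start <= block < end:
            hits += 1
        else:
            # Miss: discard the buffer and prefetch the next `window` blocks.
            start, end = block, block + window
    return hits / len(accesses)


# Sequential stream, like a transaction log being read front to back.
sequential = list(range(10_000))

# Scattered stream, like reads against many small databases.
rng = random.Random(42)
scattered = [rng.randrange(1_000_000) for _ in range(10_000)]

print("sequential:", read_ahead_hit_ratio(sequential, window=8))
print("scattered: ", read_ahead_hit_ratio(scattered, window=8))
```

In this model the sequential stream is served from the buffer seven times out of eight, while the scattered stream almost never is, so nearly every prefetch for it is wasted work.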
Rule 6. Use Shared Templates and (maybe) Shared Mail.
The Shared Templates and Shared Mail features in Domino are examples of single-instancing. In other words, they store a single copy of an e-mail or a template and point to that copy whenever a user opens it. The users don't care since, from their standpoint, the process is invisible. However, storage managers and Domino administrators care a lot because single-instancing saves an enormous amount of storage space. There is a slight performance penalty for Shared Mail and Shared Templates, but the capacity savings are enormous.
Meta Group estimates that Domino message servers can save up to 60 percent of their storage space simply by using the Shared Mail feature. Back in 2003, in a report titled Domino Storage: What Can be Done?, Meta Group recommended steering clear of Shared Mail, pointing to the "disastrous results" that many Domino shops had encountered when they tried Shared Mail in Domino R3/R4. However, the feature has become much more useful in Domino R6 and R7.
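The general idea of single-instancing can be sketched in a few lines. This is an illustration of the concept only and makes no claim about how Domino implements Shared Mail internally. The class name, the use of a SHA-256 digest as the key and the in-memory dictionaries are all assumptions made for the example.

```python
import hashlib


class SingleInstanceStore:
    """Keep one copy of each distinct message body; mailboxes hold references."""

    def __init__(self) -> None:
        self._bodies: dict[str, bytes] = {}          # digest -> single stored copy
        self._mailboxes: dict[str, list[str]] = {}   # user -> list of digests

    def deliver(self, user: str, body: bytes) -> str:
        digest = hashlib.sha256(body).hexdigest()
        self._bodies.setdefault(digest, body)        # stored only the first time
        self._mailboxes.setdefault(user, []).append(digest)
        return digest

    def read(self, user: str, index: int) -> bytes:
        # The user sees the full message; the indirection is invisible.
        return self._bodies[self._mailboxes[user][index]]

    def stored_bytes(self) -> int:
        return sum(len(body) for body in self._bodies.values())

    def logical_bytes(self) -> int:
        return sum(len(self._bodies[digest])
                   for refs in self._mailboxes.values() for digest in refs)


store = SingleInstanceStore()
memo = b"All-hands meeting moved to Thursday. " * 100
for user in ("alice", "bob", "carol"):
    store.deliver(user, memo)

print(store.logical_bytes(), "bytes visible to users")
print(store.stored_bytes(), "bytes actually stored")
```

The extra lookup on every read is the kind of small performance cost that buys the capacity savings.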
Rule 7. Analyze your logs for performance bottlenecks.
Using Microsoft Windows' built-in Performance Monitor utility, you can log and analyze relevant statistics for a variety of Domino-related performance metrics. The key metrics for understanding storage performance are Average Disk Reads/Writes per second, Percent Disk Time and Drive Queue Length. These metrics will give you a picture of how well your storage system is holding up under Domino, and can also help you spot performance-related trends.
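Once the counters have been logged and exported, spotting trouble is mostly a matter of filtering. The sketch below assumes a CSV export with simplified, hypothetical column names (real Performance Monitor exports use longer counter paths as headers), and the thresholds are placeholder rules of thumb that you would tune for your own hardware.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data; column names are simplified for the example.
SAMPLE_LOG = """
timestamp,disk_reads_per_sec,disk_writes_per_sec,pct_disk_time,queue_length
09:00,120,80,45,0.8
09:05,310,240,92,3.5
09:10,330,260,97,4.1
09:15,150,90,50,1.0
"""


def load_samples(text: str) -> list[dict]:
    """Parse the CSV text, converting every counter column to a float."""
    rows = csv.DictReader(io.StringIO(text.strip()))
    return [{key: (value if key == "timestamp" else float(value))
             for key, value in row.items()} for row in rows]


def flag_busy_intervals(samples: list[dict],
                        max_pct_disk_time: float = 90.0,
                        max_queue_length: float = 2.0) -> list[str]:
    """Return timestamps where the disk is both saturated and backing up."""
    return [s["timestamp"] for s in samples
            if s["pct_disk_time"] > max_pct_disk_time
            and s["queue_length"] > max_queue_length]


samples = load_samples(SAMPLE_LOG)
print("busy intervals:", flag_busy_intervals(samples))
print("average queue length:", mean(s["queue_length"] for s in samples))
```

Requiring both a high disk-time percentage and a long queue is a design choice for the example: either counter alone can spike briefly without indicating a real bottleneck.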
Rule 8. Test your configuration.
The best way to test your Domino configuration is to heavily load the system and then carefully analyze what it does. IBM provides a tool with Domino, called Server.Load, that can help you test installations under load. Server.Load collects about 150 performance statistics from the system being tested and can run a variety of scripts to test workload.
A more complex tool, called NotesBench, is available through IBM to members of the NotesBench consortium. NotesBench is more robust than Server.Load, is command-line-based and is aimed at more sophisticated administrators inside larger installations. NotesBench can use a set of built-in dummy workloads to test system performance, and also produces a users-per-second number with the system response.
Rule 9. Consider archiving software.
Domino is assuredly not a tool for long-term data storage. Nor is document management its strength. Several companies, including Sherpa Software and RPR Wyatt, offer document management and archiving software designed to support Domino. These tools can reduce the amount of storage needed to support Domino by as much as 85 percent.
Archiving programs perform their magic by a combination of data compression and single-instancing. Not only do they compress the messages and other documents, but they only store attachments and other material once, instead of on every user's system.
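How compression and store-once behavior combine can be estimated with a short experiment. The sketch below is a generic illustration and does not reflect how any commercial archiving product works. It assumes zlib compression and a SHA-256 digest to recognize repeated attachments, and the sample data is made up.

```python
import hashlib
import zlib


def archive(attachments: list[bytes]) -> dict[str, bytes]:
    """Return digest -> compressed bytes, keeping each distinct attachment once."""
    store: dict[str, bytes] = {}
    for data in attachments:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in store:
            store[digest] = zlib.compress(data, 9)
    return store


def restore(store: dict[str, bytes], digest: str) -> bytes:
    """Recover the original attachment from the archive."""
    return zlib.decompress(store[digest])


# Made-up workload: one report mailed to ten people, plus one unique note.
report = b"Quarterly figures by region and product line. " * 500
attachments = [report] * 10 + [b"One-off note to a single recipient."]

store = archive(attachments)
raw_bytes = sum(len(data) for data in attachments)
archived_bytes = sum(len(blob) for blob in store.values())
print(f"{raw_bytes} bytes in mail files, {archived_bytes} bytes in the archive")
```

The reduction you see from a run like this depends entirely on how repetitive and compressible the input is, so treat it as a way to reason about the mechanism rather than as a forecast.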
Besides reducing the absolute amount of storage, archiving software also lets administrators substitute cheaper storage for the expensive, high-performance storage systems needed to support e-mail.
From the user's standpoint, all this is virtually invisible. Archiving and retrieval take place behind the scenes, and usually the user doesn't know whether a message is stored in an archive or not.
The process of selecting archiving software can be time-intensive because there are so many different products with different approaches. Some are designed primarily to optimize storage; others focus on document management or compliance. Of course, every archiving product will optimize storage, but you should examine your particular archiving needs and the product's features before making a selection.
Furthermore, as compliance and document management grow in importance in the enterprise, they may also become key criteria in your selection process. You may even find your company's legal department and records management people involved in the evaluation.
About the author: Rick Cook is a freelance writer specializing in storage-related
issues. He has been writing about storage since the days of ferrite cores and magnetic
This was first published in November 2005