An ever-increasing message store is forcing us to think about migrating from server disk to a SAN. The current architecture employs Lotus Notes clustering mainly for failover. There are also a few thousand DWA (internal) users. Do you know of a good source for info regarding Notes clustering in a SAN environment, and also how to accommodate DWA users? And does Microsoft clustering make sense in a Notes-focused SAN?
This is a tough question, really, because it is about the heart of the way a company lays out its strategic resources as much as about the products themselves. First let me fire off the short and direct answers, and then go into some other things to consider.
To Domino, storage on a SAN is going to be essentially the same as if it is local. The issue is going to be one of I/O performance. Certainly a high-end SAN solution can be as fast as or faster than local disk storage. On the other hand, entry-level SAN products are notoriously poor in this respect. Your Domino server's performance is very heavily tied to disk I/O performance on that machine. A really good SAN that supports the kind of volume you're talking about is going to get very expensive; in fact, expensive enough to justify looking at some alternatives. (I'll talk about those in a minute.) As for the impact of a SAN on a Domino cluster, you should never (never ever -- and I mean it) try to use a shared file set via SAN or any other mechanism as a way to have a Domino cluster without duplicating the stored data. Each cluster member needs its own copy of the data. Remember the description of what happens when you cross the beams from the movie Ghostbusters? Enough said.
Let me give you a little background (because there are other people listening in) on Domino clustering, and then I'll talk about another way to attack this problem.
When it comes to clustering, there is a very big inherent difference in what a Domino cluster does versus what a Microsoft cluster does. Microsoft clustering is all about making what is to all outward (network) appearances a single, big, strong, tough machine. The cluster incorporates the operating system, and, in theory, if one of the hardware boxes goes down in flames, it should just keep running. This is great for operating system-level tasks like an important PDC; but what happens if Domino running on that one big machine develops a problem? It wouldn't have any more redundancy on a Microsoft cluster than off one. In short, the hardware is redundant, but not the mail server. In addition, all the members of the cluster must be very much the same.
Domino clustering, on the other hand, happens at the Domino server layer (the application layer). That means all the Domino servers must be within a supported version range (in this case, virtually all Domino server versions still supported by IBM and many before that), but beyond that they can be on different hardware platforms, in different buildings, running different operating systems. The advantage to this structure is that they are completely independent. I can take the pick-headed axe off my fire truck and smash one to bits, and the other will not be impacted by anything but the shrapnel. The disadvantage is that it's not a 100% silent failover process. Users who are in the middle of doing something may notice the failover or even have to re-open a resource to begin using it on the other server.
OK, so now you know about clustering. Now think about how the users in your environment are distributed. Do you REALLY want to have everyone on one server or one clustered group of servers? That's part of Exchange's biggest weakness; the single database storage for all that mail means if part of it breaks for one person, everyone is down and stays down until it's all fixed. Sometimes that means fixing or restoring massive amounts of data. Domino doesn't work that way. Within a few seconds, you can move Domino users' mail files from server to server at will. They're distinct files. The Lotus Notes client will even get a notification and switch to the right server. Why not, then, consider an array approach to mail servers in your environment? I blogged about one way back in July.
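To make the array idea concrete, here is a minimal sketch in Python of spreading mail files across several independent servers so that losing one server touches only the users placed on it. It is an illustration only, not necessarily the layout from the blog post mentioned above. The server names, file names, and sizes are all made up, the placement rule (largest file first, onto the least-loaded server) is just one simple choice, and nothing here calls a Domino API.

```python
from collections import defaultdict


def assign_mail_files(mail_files, servers):
    """Place each mail file on the server with the least data so far.

    mail_files: dict mapping a (hypothetical) mail file name to its size in MB.
    servers:    list of (hypothetical) server names.
    Returns (placement, load): files per server and total MB per server.
    """
    load = {server: 0 for server in servers}
    placement = defaultdict(list)
    # Largest files first; ties broken by name so the result is repeatable.
    ordered = sorted(mail_files.items(), key=lambda item: (-item[1], item[0]))
    for name, size in ordered:
        target = min(servers, key=lambda server: (load[server], server))
        placement[target].append(name)
        load[target] += size
    return dict(placement), load


def users_affected(placement, failed_server):
    """Count the mail files (one per user here) that live on the failed server."""
    return len(placement.get(failed_server, []))


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    files = {
        "alice.nsf": 900,
        "bob.nsf": 400,
        "carol.nsf": 300,
        "dave.nsf": 200,
        "erin.nsf": 100,
    }
    placement, load = assign_mail_files(files, ["mail01", "mail02", "mail03"])
    for server in sorted(placement):
        print(server, load[server], "MB", placement[server])
    print("users hit if mail02 fails:", users_affected(placement, "mail02"))
```

With the sample data, a failure of mail02 affects two of the five users; the other three keep working. In a real environment each of those servers could still have a Domino cluster mate holding its own replica of the same mail files.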
Read the second part of this answer.