Be careful! The activity logging functionality of Lotus Notes might cause problems with Lotus Instant Messaging.
Activity logging and activity trends are terrific new features in ND6 that give you tons of great information about your servers. They are both activated in the server configuration document.
At the end of this article, I'll provide some details on how to get activity logging working in your environment. Once in place, you'll have valuable information about server growth rates that will be helpful for capacity planning. But, perhaps best of all, you'll know which databases have not been used in a while.
Wow! Won't that be terrific? You will finally know which databases have not been used, and then you can remove them from your servers, saving disk space and shortening your backup window! But first, fellow Notes admins, read on about my shock and the repercussions when I discovered an issue within the guts of the process.
A few months ago we rolled out activity logging at the client du jour. The process automatically created an Activity.nsf on every server. Each Activity.nsf had the same replica identification, but replication was turned off.
This style of creating replicas with the same rep ID for system databases is actually quite common in Notes. As you know, admin4.nsf and catalog.nsf are similar. Each one has a unique replica ID that is the same throughout your domain.
But unlike catalog and admin4, replication was turned off on Activity.nsf. The valuable information about activity on each server was thus collected and held only on that server. Admins had to visit each server to look at the details.
In my quest for easy analysis of the data, I decided that what we needed was to have all of that data centralized on the administration server. That way, administrators would only have to open the copy on the admin server to analyze any server in the domain, and the activity logging would have even more value.
I went about setting it up like the Domain Catalog model, where each server's replica contains information only from itself, and the central replica on the administration server aggregates the information from all the servers. This was all controlled via access control lists (ACLs) that I set up like the Catalog ACLs. Each of the replicas of Activity.nsf on the individual servers was about 75 MB, and the centralized copy was pretty big, but tolerable, at 2.7 GB.
About a week after implementation, Lotus Instant Messaging (formerly known as Sametime) began acting very strangely. It seemed to take forever to get connected to the LIM server and to get buddy lists. Well, maybe not forever, but a good two minutes, and for an engineer with no patience, it might as well be an eternity. So off I went to have a look around on the LIM server.
I was horrified! The VPUserInfo.nsf file had also grown to a massive 2.7 GB. VPUserInfo holds buddy lists and some other LIM info. Could it be that LIM had become very popular all of a sudden? And wow, what a coincidence -- VPUserInfo.nsf and Activity.nsf were almost exactly the same size.
Unfortunately, it wasn't a coincidence. VPUserInfo.nsf and Activity.nsf had identical replica IDs. They had merged into a single big database. And Activity.nsf had won the battle for design supremacy. VPUserinfo had lost its design as keeper of buddy lists.
The buddy lists still seemed to work, but the sheer size of the database and the bloat of data that had nothing to do with buddy lists was causing the performance issues.
I don't know if you've noticed, but messaging users are known to be pretty religious about their buddy lists. I strongly suspected that blowing away the buddy lists and making users re-create them from scratch would be a career-shortening move. As a matter of fact, I expected a mob of users to mass around my cubicle any minute with pitchforks and torches, like in an old Frankenstein movie, if I didn't fix this and fix it fast.
It only took a few minutes to create new VPUserinfo databases, with new replica IDs of course, and then paste in all the data from the old messed-up database. A quick downing of the LIM servers and a few renamed files later, and I was back in business.
What happened here is super important. The domain had created Activity.nsf and VPUserinfo.nsf with the same replica ID. If I hadn't tried to centralize the data, then perhaps I would never have noticed the problem because replication was turned off on all of the Activity.nsf replicas.
But it wasn't turned off on the VPUserinfo.nsfs. As soon as I turned on replication on the activity logging databases, Sametime was affected. We did report this problem to IBM, which responded with the kBase document #1190856, dated 12/16/04.
The solution IBM proposes is to re-create the replica Activity.nsf with a different replica ID throughout the domain so that it doesn't conflict with VPUserinfo. Well pardon me, but doesn't it make more sense to change the replica ID of VPUserinfo, since there are generally just a few Sametime servers in a domain but many more that will be using activity logging?
IBM also suggested that it should not be a problem, as this response shows: "Note: Since ACTIVITY.NSF should never replicate, no real problems should result from this issue." Maybe it's just me thinking "Enterprise" with a capital "E", but if the developer decided to give all the Activity.nsfs the same replica ID, then maybe it shows that you might indeed find centralized activity logging quite valuable!
Most disappointing was IBM's final note: "There are no plans to address the issue in the Domino 6.x product series."
Lesson learned: If you're going to use Sametime and centralized activity logging in your domain, watch your rep IDs for these databases and make adjustments accordingly.
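One way to watch for this kind of collision is to take an inventory of file names and replica IDs from each server and look for any replica ID shared by differently named databases. The following Python sketch only illustrates that check. The inventory list, the server names and the replica ID values are all made up for the example; how you gather the real values (for instance, from database properties or the catalog) is left open.

```python
from collections import defaultdict


def find_replica_id_collisions(inventory):
    """Return replica IDs that are shared by databases with different file names.

    inventory: iterable of (server, filename, replica_id) tuples.
    The same file name on many servers is normal for system databases,
    so only IDs used by more than one distinct file name are reported.
    """
    names_by_id = defaultdict(set)
    for server, filename, replica_id in inventory:
        # Normalize case so "Activity.nsf" and "activity.nsf" count as one name.
        names_by_id[replica_id.upper()].add(filename.lower())
    return {
        rid: sorted(names)
        for rid, names in names_by_id.items()
        if len(names) > 1
    }


# Hypothetical inventory; none of these values come from a real domain.
inventory = [
    ("Hub01", "activity.nsf", "85256F00:0012AB34"),
    ("Mail01", "Activity.nsf", "85256F00:0012AB34"),
    ("IM01", "vpuserinfo.nsf", "85256F00:0012AB34"),
    ("Hub01", "catalog.nsf", "85256E11:0045CD67"),
    ("Mail01", "catalog.nsf", "85256E11:0045CD67"),
]

collisions = find_replica_id_collisions(inventory)
for rid, names in collisions.items():
    print(f"Replica ID {rid} is shared by: {', '.join(names)}")
```

In this made-up inventory, catalog.nsf is not flagged because every copy has the same file name, while activity.nsf and vpuserinfo.nsf are flagged because they share one replica ID.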
How to start activity logging in your domain
Use a server configuration document to get things going. Use the default one if you want activity logging from all servers in the domain.
Complete the main activity logging page as shown below. Don't forget to enable logging types. Generally I log everything except mail. I have other ways of monitoring that aspect of the environment.
Then complete the activity trending forms as shown below.
Don't be bashful about using defaults for everything. They're good to get you started. In fact you can keep the defaults on Retention and Proxy data as well.
That's it! You're done.
Activity is collected in your log.nsfs, and every night an analysis is performed and added to Activity.nsf. The process uses the catalog as well, so make sure the catalog runs on all of your servers -- including mail servers -- for this to be effective.
As I mentioned, one of the most interesting aspects of Activity.nsf is not the activity, but the inactivity.
Just imagine how many databases you can eliminate when you are armed with this information.
This is an excellent tip, and IBM's response is VERY disappointing. Who other than IBM would think this was okay? And why would they have set it up that way in the first place? Surely they're not running out of unique replica IDs.
First, the identical replica ID -- is that really an "issue," or is it just by mere accident that the replica IDs became the same (i.e., is it reproducible)?
Second, activity logging is, in my opinion, not so well documented, so it's not straightforward to interpret the different tables that are generated (as the weak example of checking the "inactivity" also shows). Activity Trend collection seems very useful, but be prepared to spend time understanding how the different activity collections should be interpreted.
Another thing that would be nice to know more about is that the checkpoint interval could affect system performance if there is a lot of activity.
It's really an issue that the rep ID is the same. When it happened to me, I immediately called my partner, Rob Axelrod, and had him check at his client. To his surprise, he found it there as well. Fortunately it was prior to him turning on replication for Activity.nsf and he was able to do the workaround where you create new replicas for vpuserinfo.
We then checked our own Technotics domain and found it there as well. Since then, a bunch of other administrators have come forward with the same information. As I say, it only really becomes an issue if you turn on replication of Activity.nsf. But that is a step we really want to have to make administration easier and centrally managed.
—Andrew Pedisich, tip author
This is exactly what happened in our environment. Via replication, vpuserinfo.nsf got the activity.nsf design elements and documents, causing the Sametime server to stop working.
Fix: I created a new vpuserinfo.nsf from the template (this time with a unique replica ID). Additionally, I located all documents with Form="StorageAttributes" in the activity.nsf and _moved_ them to the new vpuserinfo.nsf, thereby restoring my users' configuration.
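The selection step in the fix above (pull out the documents whose Form is "StorageAttributes" and leave everything else behind) can be illustrated with a short Python sketch. It is a model of the logic only, not Notes API code: documents are represented as plain dictionaries, and every field other than Form, as well as the sample values, is hypothetical.

```python
def split_by_form(documents, form_name="StorageAttributes"):
    """Split merged documents into (stay, move) lists based on the Form field.

    Documents whose Form equals form_name go into the 'move' list (the ones
    to move to the new vpuserinfo.nsf); all others stay where they are.
    """
    stay, move = [], []
    for doc in documents:
        if doc.get("Form") == form_name:
            move.append(doc)
        else:
            stay.append(doc)
    return stay, move


# Hypothetical contents of a merged database; only "Form" comes from the tip.
merged_documents = [
    {"Form": "StorageAttributes", "Owner": "user-a"},
    {"Form": "ActivityRecord", "Server": "Mail01"},
    {"Form": "StorageAttributes", "Owner": "user-b"},
    {"Server": "Hub01"},  # a document with no Form field stays put
]

stay, move = split_by_form(merged_documents)
print(f"{len(move)} documents to move, {len(stay)} left in place")
```

Moving rather than copying matters here: once the buddy-list documents are out of the merged database, what remains is activity data only.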