Thursday, April 14, 2011

SnTT - Mixed Email Server Environments Monitoring

If you are an IBM Collaboration Software Lotus Domino Administrator and you have a mixed messaging environment that involves gateways, relays or any other type of forwarding, you really should be using DDM (Domino Domain Monitoring) and the Events4.nsf database.

Last year at MWLUG I gave a session to a packed room on the basics of DDM; you can grab it here.

This year I am speaking at The View's Admin Conference in Las Vegas June 22-24 for the 2nd year in a row and one of my 3 sessions is about DDM for the uninitiated.

So what is so special about DDM that you should use it for mixed environments? Ever have mail stop flowing to one of them (an Exchange Connector, say) and go a long time with no messages from it? How would you know unless the executives or the help desk told you? Hopefully few of you reading this find out that way.

The smarter way is of course to use either built-in and FREE tools or a 3rd party tool like RPR Wyatt's Vital Signs or GSX's Monitor. Note: We have a relationship with both of these great products.

But FREE is for me..and you.

DDM is actually a nice GUI that wraps around events4.nsf, but I use DDM as the term for stats and alarms in general. If it wasn't for the probes I set up, I would be constantly putting out fires and taking never-ending phone calls at a time when I need to be focused on the client's problems.

How do you enable DDM probes? Here are the steps:

  1. Open the events4.nsf
  2. Click on DDM probes from the list in the Left column
  3. Select By Type
  4. Click on "New DDM Probe" from the sub menu
  5. Select Messaging from the drop down list
  6. A new window will open with details to be filled in.
  7. Under Probe Subtype, look at the list
  8. Select Mail Flow Statistic Check or whatever else your heart desires
  9. Enter a description if you want, like "This will let us know when Exchange crashes"
  10. Select which servers it should run on. Hint: your mail server or gateways
  11. Select a destination or leave it as all
  12. Select the Specifics tab
  13. Finally, on this page select the limits you can accept. Keep in mind that if you set it to 1 email you will get MANY notifications. I like to set a retry of 3-5 and a message count that exceeds 30 for Fatal, but it is up to you.
  14. Click Save and Close
  15. Done
And now you have a probe to check for problems. You can then go to your DDM and look at the nice GUI Dashboard and see when you have issues. Keep this running on the giant 42" LCD monitor 24x7.
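
To see why the limit in step 13 matters, here is a small Python sketch. It is only an illustration of the idea, not how Domino evaluates a probe: the function name, the sample queue counts and the rule "one event per sample over the limit" are all my own assumptions.

```python
# Illustration only: count_alerts and the sample data are hypothetical,
# not part of Domino or the events4.nsf probe document.

def count_alerts(pending_counts, limit):
    """Count how many samples exceed the limit (each one would raise an event)."""
    return sum(1 for n in pending_counts if n > limit)


# Made-up numbers of messages waiting for the other mail system,
# one value per probe check.
samples = [0, 2, 5, 31, 12, 1, 3, 45]

noisy = count_alerts(samples, 1)    # limit of 1 email
quiet = count_alerts(samples, 30)   # limit of 30 emails

print("limit 1  ->", noisy, "events")
print("limit 30 ->", quiet, "events")
```

With these made-up numbers the low limit fires on nearly every check, while the higher one only fires on the two real backlogs, which is the point of picking a limit you can live with.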

Or you could have run the Setup Wizards from the left column. But where would your education be if everything was simple? How would you ever troubleshoot a problem again? Yes I am talking to the Exchange admins.

But if you read my blog regularly, you know I am a lazy admin. I prefer to be proactive, not reactive, so I spend my time upfront configuring events to notify me when something is wrong. Looking at DDM would take me away from my other work.

How to notify yourself or other admins about the probes:
  1. In the left column, click on Event Handlers
  2. Select By Action
  3. Click on New Event Handler
  4. Select which server to notify about
  5. Leave the Trigger as default
  6. Click on the Event tab
  7. Leave "events can be any type" selected
  8. For the second option, I select "events must be one of these severities" and pick Fatal, Failure and Warning (High). It is up to you, but at least pick the severity you selected above for your mail routing probes.
  9. Then select "events can have any message". Hey, you might get lucky and see other built-in probes. Neat, huh?
  10. Select Action tab
  11. For Method, select Mail
  12. Enter names or servers or mail-in db for the notices
  13. Select when to enable it.
  14. Save and Close
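
If it helps to think of the handler as a filter, the settings above boil down to something like this Python sketch. The Event class, the field names and the should_notify function are hypothetical names of mine, not anything Domino exposes; the severities are the ones I pick in the steps above.

```python
from dataclasses import dataclass


# Hypothetical stand-in for an event; not a real Domino structure.
@dataclass
class Event:
    server: str
    severity: str
    event_type: str
    message: str


# Severities chosen on the Event tab in the steps above.
NOTIFY_SEVERITIES = {"Fatal", "Failure", "Warning (High)"}


def should_notify(event, servers, severities=NOTIFY_SEVERITIES):
    """Match on server and severity only; any type and any message pass."""
    return event.server in servers and event.severity in severities


watched = {"Presto/RUSH"}
fatal = Event("Presto/RUSH", "Fatal", "Mail", "Delivery attempts are excessive")
normal = Event("Presto/RUSH", "Normal", "Mail", "Routing is fine")

print(should_notify(fatal, watched))   # the handler would send a notice
print(should_notify(normal, watched))  # the handler would stay quiet
```

Because type and message are left wide open, anything severe enough on a watched server gets through, which is how you end up seeing the other built-in probes too.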

And you are done! Cool, huh? And none of that wizard stuff. Real admin coolness in under 5 minutes.
Any questions? Speak now so my attendees in Vegas get your knowledge as well as mine.

PS: Here is what your notification will look like in your email:

Originating Server: Presto/RUSH
Event Severity: Fatal
Event Type: Server
Event Time: 04/13/2011 03:09:13 PM

Lotus Entries
  Probable Cause:
     1. The number of attempts to deliver messages to this destination is
excessive.
     2. The named server may be down.
     3. There may be problems with network connection.
     4. The details tab lists any errors encountered when attempting to
access the destination.
  Possible Solution:
     1. Configure a mail routing probe to quickly detect a nonresponding
server.
     2. Verify that the other server is running.
     3. Check connection documents for the correct information.
     4. Verify that the number of messages required before routing occurs
is less than the configured probe limit.
  Corrective Action:
     1. Create a mail routing probe.
     2. Inspect or modify a Connection document on server 'Presto/RUSH'.


To see additional information about this error message, click here -->
(Document link: Error Message Document)

To see the document that triggered this notification, click here -->
(Document link: Event Notification Document)
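
If you send these notices to a mail-in db and want to sort or count them with a script, the four header lines at the top are easy to pick apart. This is a rough Python sketch that assumes the header always looks like the sample above ("Name: value" lines ending at the first blank line, with that date format); parse_header is my own hypothetical helper, not a Domino tool.

```python
from datetime import datetime

# Header lines copied from the sample notification above.
SAMPLE = """Originating Server: Presto/RUSH
Event Severity: Fatal
Event Type: Server
Event Time: 04/13/2011 03:09:13 PM

Lotus Entries
"""


def parse_header(text):
    """Collect 'Name: value' pairs until the first blank line."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break  # the header block ends at the first blank line
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


header = parse_header(SAMPLE)

# Assumes the MM/DD/YYYY, 12-hour format shown in the sample.
event_time = datetime.strptime(header["Event Time"], "%m/%d/%Y %I:%M:%S %p")

print(header["Originating Server"], header["Event Severity"], event_time.isoformat())
```

Splitting on the first colon only is what keeps the time value in one piece, since it has colons of its own.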