Thursday 14 January 2010

Monitoring Services using Kaseya

Monitoring Services on a Server is one of the top tasks for any IT Support company. Getting a stopped service restarted before anybody notices can save you a lot of time and effort, and ensures that the Client experiences minimal downtime. Imagine never having to wait for someone to call to tell you their e-mail isn't working. The Exchange Information Store stops, but because you are monitoring it and have set the correct recovery options, it restarts by itself, without any intervention from you and without the client even noticing.

Kaseya can monitor these services; however, getting the appropriate restart response and alerts can be tricky.

Monitor Sets can be set up for each service on the Server (or indeed on a PC), and can be configured to attempt to restart the Service 3 times at selected intervals. The Monitor set can then send you an e-mail and set up a dashboard alert.

However, what we really need is to be alerted only if the service is not successfully restarted. This requires a little more work, but gives much more satisfying results. No intervention is necessary unless the service fails to restart after 3 attempts. At that point it might be symptomatic of something bigger going on, and you'll need (or want) to intervene anyway.

Monitor Set
Configure your Monitor Set in Kaseya to monitor the Selected Service; in this example I'll use the POP3svc Service. Configure the recovery options to 3 restart attempts at 1 minute intervals.
However, do not set the monitor set to alarm or to e-mail. We only need to know if the Service did not restart. Instead, we want it to call a script to check that the service has restarted.

Script 1


Script Name: 10.1.1.5a) Wait 5 mins
Script Description: [GMC]
Wait 5 mins then check POP3SVC Service is running

IF True
THEN
Schedule Script
Parameter 1 : 10.1.1.5b) MS Exchange POP3
Parameter 2 : 5
Parameter 3 :
OS Type : 0
ELSE

This waits 5 minutes, giving the monitor set enough time to attempt the restart, then calls script 2 to check that it is running:
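The 5 minute wait only makes sense if it outlasts the monitor set's whole restart window. A rough sanity check in Python, assuming the attempts are evenly spaced and ignoring how long each restart itself takes (the function and parameter names are illustrative, not Kaseya settings):

```python
def wait_covers_restarts(attempts: int, interval_min: float, wait_min: float) -> bool:
    """Return True if the wait is longer than the restart attempts need.

    Assumes attempts fire at even intervals. This is a simplification,
    not a description of how Kaseya actually schedules the restarts.
    """
    restart_window = attempts * interval_min
    return wait_min > restart_window


if __name__ == "__main__":
    # The values used in this post: 3 attempts, 1 minute apart, 5 minute wait.
    print(wait_covers_restarts(attempts=3, interval_min=1, wait_min=5))
```

If you lengthen the restart interval in the monitor set, lengthen the wait in Script 1 to match, otherwise Script 2 may run before the last attempt.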

Script 2

Script Name: 10.1.1.5b) MS Exchange POP3
Script Description: [GMC]
Monitor Set - Lynx - Exchange POP3

IF Service is Running
Parameter 1 : pop3svc
THEN
Write Script Log Entry
Parameter 1 : 10.1.1.5 MS Exchange POP3 Service was restarted by Lynx Monitoring
OS Type : 0
ELSE
Execute Shell Command
Parameter 1 : eventcreate /l system /so "LYNX MONITORING" /t warning /id 601 /d "The POP3Svc Service failed to restart after 3 attempts by Lynx Monitoring. An alert was raised."
Parameter 2 : 0
OS Type : 0
Write Script Log Entry
Parameter 1 : 10.1.1.5 The POP3Svc Service failed to restart after 3 attempts by Lynx Monitoring. An alert was raised.
OS Type : 0

This 2nd script checks that the service is running. If it is, it writes a script log entry recording that Lynx Monitoring (that's us!) saw the service stop and that it was restarted.

If the service is not running, the script uses the command line to write a warning to the System event log with the eventcreate command, using Event ID 601 and source Lynx Monitoring.
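For anyone who wants to try the same check outside Kaseya, here is a minimal Python sketch of what Script 2 does. It is not how Kaseya implements its steps: it assumes a Windows machine with English-language `sc query` output, and the function names are made up for this example. The eventcreate switches are the ones used in the script above.

```python
import subprocess
from typing import List


def is_running(sc_output: str) -> bool:
    """Return True if `sc query <service>` output reports a RUNNING state.

    Assumes English output, where the state line looks like
    'STATE              : 4  RUNNING'.
    """
    for line in sc_output.splitlines():
        if line.strip().startswith("STATE"):
            return "RUNNING" in line
    return False


def eventcreate_args(service: str, source: str = "LYNX MONITORING",
                     event_id: int = 601) -> List[str]:
    """Build the eventcreate command used in Script 2 as an argument list."""
    description = (
        f"The {service} Service failed to restart after 3 attempts "
        "by Lynx Monitoring. An alert was raised."
    )
    return ["eventcreate", "/l", "system", "/so", source,
            "/t", "warning", "/id", str(event_id), "/d", description]


def check_service(service: str) -> None:
    """Windows only: raise the warning event if the service is not running."""
    result = subprocess.run(["sc", "query", service],
                            capture_output=True, text=True)
    if not is_running(result.stdout):
        subprocess.run(eventcreate_args(service), check=True)
```

Calling `check_service("pop3svc")` on the server would mirror the ELSE branch of Script 2; the script log entry is left out because that part only exists inside Kaseya.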

Alerts
This is the good bit. You then configure an alert to search the Event Log for Event ID 601 (or whatever you chose as your ID) from source Lynx Monitoring (replace, obviously, with your own company name). If it finds this event, only then will you receive an e-mail and an alert on your monitoring dashboard, indicating that the service is still not running after 5 minutes and 3 attempted restarts.

The best bit about this is that you only need 1 alert for all the services on the server. Each Service needs its own monitor set, and its own small script to check that it has restarted, but the Alert only looks for ID 601 from Lynx Monitoring. When you configure your alert e-mail, it will populate the contents with the details of the service, which you entered using the 'eventcreate' command.
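To show why one alert is enough, here is a small Python sketch of the matching rule. The event records are plain dictionaries with made-up field names ("id", "source", "description"); they are an assumption for this example and not Kaseya's or Windows' actual event format.

```python
def needs_alert(event: dict, source: str = "LYNX MONITORING",
                event_id: int = 601) -> bool:
    """Return True if the event has our ID and our source, whatever the service."""
    return (event.get("id") == event_id
            and event.get("source", "").upper() == source.upper())


# Sample events, invented for illustration only.
events = [
    {"id": 601, "source": "LYNX MONITORING",
     "description": "The POP3Svc Service failed to restart after 3 attempts "
                    "by Lynx Monitoring. An alert was raised."},
    {"id": 601, "source": "LYNX MONITORING",
     "description": "The MSExchangeIS Service failed to restart after 3 attempts "
                    "by Lynx Monitoring. An alert was raised."},
    {"id": 7036, "source": "Service Control Manager",
     "description": "Unrelated event that should not raise an alert."},
]

alerts = [e["description"] for e in events if needs_alert(e)]

if __name__ == "__main__":
    for text in alerts:
        print(text)
```

The rule never mentions a service name. The service only appears in the description, which is why the same alert works for every script that writes ID 601.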

Summary
A little time is needed to set up the monitor sets and create the scripts for each service, but once you have the details set, it should just be a case of changing the service name in each new script. And just think, you'll never have to check the services on startup ever again.

Also, as long as you remember to create the script log entry, you will be able to report fully to the client on every service you restarted over the month.
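As a rough idea of how that report could be tallied, here is a Python sketch that counts restarts per service. It assumes you have the script log entries as a list of plain text lines, worded like the log entry in Script 2; how you get them out of Kaseya is not covered here, and the function name is invented for this example.

```python
from collections import Counter

# Wording taken from the "Write Script Log Entry" step in Script 2.
RESTART_MARKER = "was restarted by Lynx Monitoring"


def count_restarts(log_entries):
    """Count successful restarts, keyed by the text that precedes the marker."""
    counts = Counter()
    for entry in log_entries:
        entry = entry.strip()
        if entry.endswith(RESTART_MARKER):
            service = entry[: -len(RESTART_MARKER)].strip()
            counts[service] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "10.1.1.5 MS Exchange POP3 Service was restarted by Lynx Monitoring",
        "10.1.1.5 MS Exchange POP3 Service was restarted by Lynx Monitoring",
        "10.1.1.5 The POP3Svc Service failed to restart after 3 attempts "
        "by Lynx Monitoring. An alert was raised.",
    ]
    for service, total in count_restarts(sample).items():
        print(service, total)
```

Failed restarts are deliberately left out of the count, since those already reach you through the alert.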

1 comment:

  1. If you were to capture the service name in a variable in the first script, couldn't you call it in the 2nd script so that you only have to have one script to do the alerting? For example, if you add a step to your "wait 5 minutes" script that sets the #servicename# variable to 'MSExchangeIS', you may be able to call that variable in the next script, which checks to see if the service is running. In the 2nd script, where it checks to see if the service is running, you just input the variable, #servicename#. And you could use that variable in the subsequent steps as well to make the 2nd script generic. That way, you would only have to have one script for all services to do the alerting part. You would still have to have a separate script for each service to set the variable name and wait 5 minutes. I hope this makes sense. I'm testing it out right now so I don't know if it will actually work. I'll post again to let you know. Thanks.

    If you have questions or comments for me, email me at aaron at subnetted.com

    - Aaron
