8.5. Configuration of Automated Actions - Remedy

Introduced in SysOrb 4.0, remedy is a feature that lets SysOrb carry out an agent action automatically. A typical use is to restart a service automatically if it stops.

To set up a remedy, first set up an agent action; see Section 8.4 for more on this topic.

Next, configure the check: in the remedy dropdown, choose the action that should run when the check goes into alert. Once this is done, remedy runs as soon as the check goes into alert, triggering the selected action.

Once SysOrb has carried out an automatic action, remedy is disabled for 30 minutes. If a second alert comes up during that time, the check goes into alert state and is handled as normal.

As an example: a server is monitoring a service that must run at all times. Because of some issues on the server, the service shuts down every now and again, so an agent action that restarts the service has been set up. If the service shuts down once, remedy restarts it automatically with the agent action. If the service goes down a second time within 30 minutes (while remedy is disabled), the check goes into alert. This ensures that SysOrb tells you if the service goes down for good and does not recover.
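The 30-minute rule in this example can be sketched as a small model. This is an illustration only, not SysOrb code: the class name RemedyState, the method on_alert, and the choice to re-enable remedy at exactly 30 minutes are all assumptions made for the sketch.

```python
from datetime import datetime, timedelta

# The 30-minute window comes from the text; everything else is hypothetical.
REMEDY_COOLDOWN = timedelta(minutes=30)


class RemedyState:
    """Hypothetical model of remedy for a single check (not SysOrb's API)."""

    def __init__(self):
        # Time of the last automatic action, or None if remedy never ran.
        self.last_remedy_at = None

    def on_alert(self, now):
        """Return "remedy" if the agent action would run, else "alert"."""
        remedy_enabled = (
            self.last_remedy_at is None
            or now - self.last_remedy_at >= REMEDY_COOLDOWN
        )
        if remedy_enabled:
            # Running the action disables remedy for the next 30 minutes.
            self.last_remedy_at = now
            return "remedy"
        # Remedy is still disabled, so the check alerts as normal.
        return "alert"


state = RemedyState()
start = datetime(2024, 1, 1, 12, 0)
first = state.on_alert(start)                           # service stops once
second = state.on_alert(start + timedelta(minutes=10))  # stops again soon after
third = state.on_alert(start + timedelta(minutes=45))   # stops much later
```

In this model the first and third failures trigger the restart action, while the second, which falls inside the 30-minute window, results in a normal alert.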

When remedy is executed, SysOrb also automatically puts the whole node into 5 minutes of unexpected downtime. This is done to avoid alarms from other checks that might be triggered by the action being executed. A common situation could be: a web site starts running poorly, and SysOrb remedy is set to restart IIS if a site's response time rises above a certain limit. Restarting IIS causes all sites on the host to malfunction temporarily, but we do not want alarms from all of them. If restarting the web server has solved the problem, everything will be back to normal when the node comes out of downtime. The incident log on the given node will tell you when SysOrb executed the remedy.
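The downtime window can be pictured in the same way. The sketch below is an assumption-laden illustration, not SysOrb's implementation: the function name alarms_suppressed is hypothetical, and treating the window as ending exactly 5 minutes after the remedy is a choice made for the example.

```python
from datetime import datetime, timedelta

# The 5-minute downtime comes from the text; the function is hypothetical.
REMEDY_DOWNTIME = timedelta(minutes=5)


def alarms_suppressed(remedy_at, now):
    """Return True while the node is inside the downtime that follows a remedy."""
    return remedy_at <= now < remedy_at + REMEDY_DOWNTIME


restart = datetime(2024, 1, 1, 12, 0)  # remedy restarts the web server
during = alarms_suppressed(restart, restart + timedelta(minutes=2))
after = alarms_suppressed(restart, restart + timedelta(minutes=6))
```

Here a check on another site that fails two minutes after the restart falls inside the downtime and raises no alarm, while a failure six minutes after the restart is reported as usual.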