nagiosで障害発生時にリモートサーバーのサービスを自動再起動させる方法

/usr/local/nagios/libexec/eventhandlersに以下のスクリプトを作成

#!/bin/sh
#
# Event handler script for restarting the web server on the local machine
#
# Note: This script will only restart the web server if the service is
#       retried 3 times (in a "soft" state) or if the web service somehow
#       manages to fall into a "hard" error state.
#

# What state is the HTTP service in?
case "$1" in
OK)
        # The service just came back up, so don't do anything...
        ;;
WARNING)
        # We don't really care about warning states, since the service is probably still running...
        ;;
UNKNOWN)
        # We don't know what might be causing an unknown error, so don't do anything...
        ;;
CRITICAL)
        # Aha!  The HTTP service appears to have a problem - perhaps we should restart the server...

        # Is this a "soft" or a "hard" state?
        case "$2" in
                
        # We're in a "soft" state, meaning that Nagios is in the middle of retrying the
        # check before it turns into a "hard" state and contacts get notified...
        SOFT)
                        
                # What check attempt are we on?  We don't want to restart the web server on the first
                # check, because it may just be a fluke!
                case "$3" in
                                
                # Wait until the check has been tried 3 times before restarting the web server.
                # If the check fails on the 4th time (after we restart the web server), the state
                # type will turn to "hard" and contacts will be notified of the problem.
                # Hopefully this will restart the web server successfully, so the 4th check will
                # result in a "soft" recovery.  If that happens no one gets notified because we
                # fixed the problem!
                2)
                        echo -n "Restarting HTTP service (3rd soft critical state)..."
                        # Call the init script to restart the HTTPD server
                        /usr/bin/ssh -i /usr/local/nagios/.ssh/keys/apache_restart root@$4
                        ;;
                        esac
                ;;
                                
        # The HTTP service somehow managed to turn into a hard error without getting fixed.
        # It should have been restarted by the code above, but for some reason it didn't.
        # Let's give it one last try, shall we?  
        # Note: Contacts have already been notified of a problem with the service at this
        # point (unless you disabled notifications for this service)
        HARD)
                echo -n "Restarting HTTP service..."
                # Call the init script to restart the HTTPD server
                /usr/bin/ssh -i /usr/local/nagios/.ssh/keys/apache_restart root@$4
                ;;
        esac
        ;;
esac
exit 0

/usr/local/nagios/.ssh/keys/apache_restartは、通常どおりRSAのキーを作るだけ、但しapacheの再起動専用にファイル名を別にして作成。/root/.ssh/authorized_keys

そして、リモートサーバー側の/root/.ssh/authorized_keysの先頭行に以下を追加

command="service httpd restart",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ....省略

参考

説明がめんどうだったので、参考になりそうなページをぐぐる