Given the need to keep tabs on at least three web sites, my simple scripts for handling a single site were no longer sufficient, so I installed Nagios on my Linux box.
Fedora FC5 was the system, and getting the Nagios bits was pretty easy - use yum to install the following packages:
Nagios is a pretty complex package to install, at least based on all the writeups on the web, so I made a short detour to try Zabbix. That is certainly easier to get up and running, but I am not too comfortable with UI management screens, and got stuck for too long on the configuration part. Add to this the incorrect use of MBytes instead of GBytes in the disk space rows, and I went back to getting Nagios configured.
Nagios is not too bad - in fact, for anyone comfortable with editing text config files, it is downright easy.
Within a few hours, I had all config files set right, and email notifications working.
I mainly followed the documentation that came with Nagios. On Fedora, using the pre-packaged yum kits, here are the additional things I had to do:
- chkconfig nagios on
- use htpasswd to add user accounts for web access
- edited the nagios.conf httpd file, to allow access from specific hosts
- created nagios.cfg and minimal.cfg starting from the given samples, but greatly simplified by following the Nagios doc section that gives tips on using the template mechanism for inheritance of host and service attributes
- changed the notification from using /bin/mail to using /usr/sbin/sendmail.
The Nagios FAQ suggests using sendmail configuration (genericstable) to fix the From address of outgoing notifications, but that felt like too much. It was easier to call sendmail directly and add a -f$ADMINEMAIL$ option to set the from address correctly.
Yes, this does add an Authentication warning header line to the email, but the system admins who receive it know not to worry about it. $ADMINEMAIL$ comes from the admin_email setting in the main nagios.cfg file.
- changed the sample check_http command to the one below. The sites to be monitored could be slow, so I increased the warning/critical timeouts. I also don't consider HTTP error code 403 for a page to be a warning in terms of network monitoring, so I used the -e option of check_http to accept any HTTP/1.x status line:
$USER1$/check_http -w 10 -c 20 -e "HTTP/1." -H $HOSTADDRESS$
- changed the check_ping calls to use larger values for round-trip average time. The minimal.cfg example used 100ms, and this needed to be much higher for my web sites.
- added the local switch and broadband modem to the hosts configuration, added ping and http checks for both, and then used the parents keyword to define host relationships
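
Put together, the command and host changes in the list above might look roughly like the fragment below. This is only a sketch under assumptions, not the actual config: the host names, the 192.0.2.x addresses, the ping thresholds, the generic-service template, and the switch-then-modem topology are made-up placeholders, and the notification line is a cut-down version rather than the stock sample. It is wrapped in a shell heredoc that writes to a scratch file, so nothing under /etc is touched.

```shell
#!/bin/sh
# Scratch location (hypothetical); copy pieces into the real config by hand.
OUT="${OUT:-${TMPDIR:-/tmp}/nagios-sketch.cfg}"

# Quoted 'EOF' keeps the shell from expanding the Nagios $MACRO$ names.
cat > "$OUT" <<'EOF'
# check_http with longer timeouts; -e "HTTP/1." accepts any HTTP/1.x status line
define command{
    command_name    check_http
    command_line    $USER1$/check_http -w 10 -c 20 -e "HTTP/1." -H $HOSTADDRESS$
}

# check_ping taking warning/critical thresholds as arguments
define command{
    command_name    check_ping
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}

# Notification through sendmail, with -f setting the sender address
define command{
    command_name    notify-by-email
    command_line    /usr/bin/printf "%b" "To: $CONTACTEMAIL$\nSubject: $NOTIFICATIONTYPE$: $HOSTNAME$/$SERVICEDESC$ is $SERVICESTATE$\n\n$SERVICEOUTPUT$\n" | /usr/sbin/sendmail -f$ADMINEMAIL$ $CONTACTEMAIL$
}

# Host template; other required directives are left out of this sketch
define host{
    name                generic-host
    register            0
    check_command       check-host-alive
    max_check_attempts  5
}

# Assumed topology: Nagios box -> switch -> modem -> remote web site
define host{
    use        generic-host
    host_name  switch
    address    192.0.2.1
}

define host{
    use        generic-host
    host_name  modem
    address    192.0.2.2
    parents    switch
}

define host{
    use        generic-host
    host_name  website1
    address    192.0.2.10
    parents    modem
}

# Ping check with placeholder thresholds well above the 100ms sample value
define service{
    use                  generic-service
    host_name            website1
    service_description  PING
    check_command        check_ping!500.0,20%!1000.0,60%
}
EOF

echo "wrote $OUT"
```

With parents set this way, an outage of the switch or modem should show the hosts behind it as unreachable rather than down. Running nagios -v against the main config file before restarting is a reasonable way to catch typos in a fragment like this.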
Everything is all set, and working well. Great package.
Next step is to figure out how to execute remote commands over ssh - there may be some conditions where the remote box is accessible, but will need to be rebooted to fix some other issue.