who is getting a rest?
This one is especially interesting for people who manage big networks (more than 10-15 hosts that cannot be allowed to go down). There is a lot of software for network monitoring, like Big Brother or Gkrellm (which can be used with the gkrellmd/gkrellm combo to monitor multiple hosts from one point over the network). Of course those are valuable options, and they check a lot of things on the servers you want to keep an eye on, but this time I only need to know whether my hosts are up or down. Simple, isn't it?
The idea is to get a list of hosts that are up or down, which refreshes itself every X seconds, and a tool that will send us an alert if any host goes down. The software I will use is sntop, which provides a nice ncurses list of hosts (names, addresses and descriptions) as well as an HTML list. This way you can run sntop on your intranet web server and keep an eye on which hosts are up and running from anywhere inside your network.
I will not cover sntop installation here, since it depends on your system, but installing from sources is not difficult (in the case of FreeBSD, there is a port: /usr/ports/net/sntop). The interesting part about sntop compared with other software of this kind is that you only have to install it on one host (your desktop, your intranet web server, etc).
Configuring it is quite easy: just create a file called sntoprc (~/.sntoprc if you plan to run it only as one user, or /etc/sntoprc for a global configuration). Inside this file we set some information about the hosts we are going to monitor:
# simple network top - a top-like console network status tool
#
# http://sntop.sourceforge.net - homepage
# ftp://sntop.sourceforge.net/pub/sntop/ - anon ftp
#
Frey
192.168.23.2
FreeBSD / e-shell main server
tyr
192.168.23.1
OpenBSD / e-shell firewall
webdev
192.168.23.6
Slackware Linux / codigo23 ws
wyrmslayer
10.0.0.2
FreeBSD/OpenBSD / Laptop
abalom
10.0.0.3
Slackware Linux / dolo ws
# EOF
In this example, I have defined five hosts (server, firewall, two workstations and a laptop), each one with a symbolic name, an IP address and a short description.
OK, we are done with the configuration, so now we can run sntop. Looking at the man page SNTOP(1), we get a list of possible parameters, as well as a list of interactive commands to use inside the curses-based interface. Of all the possible parameters, three are interesting here:
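If you have many hosts to add, a small helper can save some typing. The following is only a sketch: the function name sntoprc_entry is made up for this example, and it assumes that each sntoprc entry is three consecutive lines (name, address, description), which is how the example file above is read here.

```shell
#!/bin/sh
# Hypothetical helper (not part of sntop): print one sntoprc entry.
# Assumption: an entry is three lines: name, address, description.

sntoprc_entry() {
    # All three fields are required.
    if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
        echo "usage: sntoprc_entry NAME ADDRESS DESCRIPTION" >&2
        return 1
    fi
    # One field per line, in the same order as the example file.
    printf '%s\n%s\n%s\n' "$1" "$2" "$3"
}
```

For instance, sntoprc_entry webdev 192.168.23.6 "Slackware Linux / codigo23 ws" prints the three lines of the webdev entry, which you can redirect into the sntoprc file you are building.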
...
-d, --daemon
    daemon mode: make sntop capable of running in the background. note, it wont automatically fork into the background.
...
-w, --html
    generate html output of results
...
-a , --alarm=file
    alarm mode: execute when a site first goes DOWN
...
The first one is interesting when we use sntop on, for example, an internal web server, to offer only HTML reports. We need the second one to generate HTML reports, which are always useful even if we use the curses-based tool as well. The third parameter sets a file that will be executed as soon as a host goes down (offline, shut down, etc). The file could be a shell script that sends an email/SMS, or whatever we want the alert to be. Obviously such a script has to be written before running sntop, so let's see a little sample:
#!/bin/sh
#
# Script that sends an email advising that a host is down
#
# $1 = short name of the host
# $2 = IP address of the host
# $3 = status of the host (always down for this script)

echo "The host $1 ($2) is down since `date` " | mail -s "$1 is $3" firstname.lastname@example.org
When sntop calls the script, it provides three parameters: the first is the short name of the host, the second is the IP address and the third is the status of the host. As you have probably noticed, this sample script sends me an email telling me which host is down and the current date (the moment sntop detected that the host changed status).
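If you also want a local record of every alert, the sample script can be extended a little. This is just a sketch under some assumptions: the ALERT_TO and ALERT_LOG variables, the log file location and the two helper functions are names I made up for the example, and it relies only on the three parameters described above.

```shell
#!/bin/sh
# Sketch of an alarm script that mails the alert and also logs it.
# Assumption: sntop passes host name ($1), IP address ($2) and status ($3).

ALERT_TO="${ALERT_TO:-firstname.lastname@example.org}"  # recipient, overridable
ALERT_LOG="${ALERT_LOG:-$HOME/sntop_alerts.log}"        # hypothetical log file

# Build the mail subject from the host name and its status.
alert_subject() {
    printf '%s is %s\n' "$1" "$2"
}

# Build the message text; the timestamp is passed in as the third argument.
alert_body() {
    printf 'The host %s (%s) is down since %s\n' "$1" "$2" "$3"
}

# Only act when called with the three expected parameters.
if [ "$#" -eq 3 ]; then
    now=$(date)
    alert_body "$1" "$2" "$now" >> "$ALERT_LOG"
    alert_body "$1" "$2" "$now" | mail -s "$(alert_subject "$1" "$3")" "$ALERT_TO"
fi
```

Keeping the message in small functions makes it easy to reuse the same text for another channel later (an SMS gateway, for example) without touching the rest of the script.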
[Frey] ~> sntop -w -a sntop_mail.sh
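One thing that is easy to forget is that the alarm script has to exist and, presumably, be executable before sntop can call it. A small wrapper can check that first. This is a sketch only: check_alarm_script and run_sntop are hypothetical names, and the only sntop options used are the -w and -a shown above.

```shell
#!/bin/sh
# Sketch: verify the alarm script before handing it to sntop.

# Return 0 only if the given file exists and has the execute bit set.
check_alarm_script() {
    if [ ! -f "$1" ]; then
        echo "alarm script not found: $1" >&2
        return 1
    fi
    if [ ! -x "$1" ]; then
        echo "alarm script is not executable: $1 (try chmod +x)" >&2
        return 1
    fi
}

# Start sntop with HTML output and the alarm script, but only if the
# script passed the check above.
run_sntop() {
    check_alarm_script "$1" || return 1
    sntop -w -a "$1"
}
```

With these two functions defined, run_sntop sntop_mail.sh does the same as the command above, but refuses to start if the script is missing or not executable.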
As soon as we run sntop this way, the curses-based interface appears on the terminal/console we ran it from:
Nice! All hosts are up and running right now. We can check the HTML file generated by sntop (by default sntop.html, created inside the home directory of the user running sntop):
Even better, this HTML file is designed to be refreshed every 180 seconds (and so is sntop itself), so you can use it to give any point of your network accurate information about which hosts are running and which ones are not.
And what happens if a host goes down? Well, sntop will refresh the curses frontend as well as the HTML file, and will call our script to send us the proper alert:
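Since the report is written to the home directory of the user running sntop, it has to be copied somewhere the web server can serve it if you want to publish it on the intranet. One possible way is sketched here. The function name publish_report and both paths in the usage example are assumptions about your setup, not something sntop provides.

```shell
#!/bin/sh
# Sketch: copy sntop's HTML report into a web server directory.
# The source file and destination directory are supplied by the caller.

publish_report() {
    src=$1
    dest_dir=$2
    # Nothing to publish until sntop has written the report.
    if [ ! -f "$src" ]; then
        echo "no report at $src" >&2
        return 1
    fi
    # Copy under a temporary name first and rename afterwards, so a
    # browser never fetches a half-written file.
    tmp="$dest_dir/.sntop.html.$$"
    cp "$src" "$tmp" && mv "$tmp" "$dest_dir/sntop.html"
}
```

A call such as publish_report "$HOME/sntop.html" /usr/local/www/data (a hypothetical document root) could then be scheduled from cron every three minutes or so, to roughly match the 180-second refresh.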
Notice the different colour (red) for hosts which are down.
And finally, we check our e-mail account to find the e-mail advising us of the problem:
So, some last words about this. There are other options out there when you are looking for network and server monitoring tools, and many of them probably have a lot of cool features (CPU cycles, temperature, RAM usage, etc), but you will need to install the software on all your hosts (which means upgrading/maintaining that software once in a while).
With sntop you can monitor which hosts inside your network are up and notice (quite quickly) which are down. In this example the host that went down was a simple workstation, and we only have 5 hosts. Now imagine a bigger setup, perhaps with 5-10 intranet/extranet hosts providing services for a 50-60 workstation environment. Wouldn't it be nice to check at a glance whether one of your main hosts is down (even for a while)?