Nagios

Ah Shit - check_http string

After updating the themes of www.alpha01.org, www.rubysecurity.org, and www.rubyninja.org, I completely forgot to update the header template files to include their respective Google Analytics tracking code again. This resulted in almost three months of no stats. When I originally set up the Nagios check_http checks on my sites, I didn't set them to also search for the custom Google Analytics string, even though I always use that configuration at work on all HTTP checks.

This can easily be accomplished using the -s|--string option of the check_http plugin.

/usr/local/nagios/libexec/check_http -I www.rubysecurity.org -S -t 10 --string UA-12912270-3

So, the lesson learned: always configure your check_http Nagios service checks to also search for a custom string as part of the check!
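As a rough illustration of what a string check adds on top of a plain "the page loads" check, here is a minimal shell sketch. The function name body_contains_string and the message wording are made up for this example and are not part of check_http. Mapping a missing string to CRITICAL (exit code 2) is an assumption of the sketch.

```shell
# Hypothetical helper (not part of check_http): report whether a page body
# contains a marker string, using Nagios-style return codes.
#   $1 = page body, $2 = string that must be present
body_contains_string() {
    if printf '%s\n' "$1" | grep -qF -- "$2"; then
        echo "OK: string '$2' found"
        return 0
    fi
    # Assumption: a missing string is treated as CRITICAL (2).
    echo "CRITICAL: string '$2' not found"
    return 2
}

# Example with an inline body instead of a live HTTP request.
page='<script>ga("create", "UA-12912270-3");</script>'
body_contains_string "$page" "UA-12912270-3"
```

A page that still returns HTTP 200 but has lost its tracking snippet would pass a plain check and fail this one, which is exactly the situation described above.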


Monitoring TFTPd server

So I just spent the last two hours of my life trying to figure out why PXE booting was not working on my home network. It turned out the root cause was completely my fault: I forgot to add a firewall rule on my DHCP/PXE server to allow incoming UDP connections on port 69.

Fix:

iptables -A INPUT -p udp -m udp --dport 69 -j ACCEPT

As with just about any other service, this one can be monitored using Nagios. Originally, I had problems using the check_tftp.pl and check_tftp plugins that are available on the Nagios Exchange repo, mainly because of the way I have set up my machines.

check_tftp This plugin was useless in my environment because all it does is send a status command to the TFTP server. Since I'm using the BSD tftp client, a status command sent to any host will always show it as connected, regardless.
http://exchange.nagios.org/directory/Plugins/Network-Protocols/TFTP/chec...

check_tftp.pl This plugin did not work in my environment, mainly because it uses Net::TFTP. Unlike the tftp client application, Net::TFTP does not support specifying a custom reverse connection port (or port range). By default, when a client connects to a TFTP server, the server dynamically chooses a random non-standard port to connect back to the client machine and proceed with the TFTP download. My Nagios machine (like all of my machines) is set to drop all incoming packets except those for specific ports and related/established connections.
http://exchange.nagios.org/directory/Plugins/Network-Protocols/TFTP/chec...

I wrote a simple Nagios plugin that monitors TFTP. All it does is download a non-empty file called test.txt.

#!/usr/bin/perl -w

# Tony Baltazar. root[@]rubyninja.org

use strict;
use Getopt::Long;




my %options;
GetOptions(\%options, "host|H:s", "port|p:i", "rport|R:s", "file|f:s", "help");


if ($options{help}) {
	usage();
	exit 0;
} elsif ($options{host} && $options{port} && $options{file}) {
	chdir('/tmp');

	my $cmd_str = ( $options{rport} ?  "/usr/bin/tftp -R $options{rport}:$options{rport} $options{host} $options{port} -c get $options{file}" : "/usr/bin/tftp $options{host} $options{port} -c get $options{file}");

	my $cmd = `$cmd_str`;
	if ($? != 0) {
		print "CRITICAL: $cmd";
		system("rm -f /tmp/$options{file}");
		exit 2;
	} else {
		if (! -z "/tmp/$options{file}" ) {
			print "TFTP is ok.\n$cmd";
			system("rm -f /tmp/$options{file}");
			exit 0;
		} else {
			print "WARNING: $cmd";
			system("rm -f /tmp/$options{file}");
			exit 1;
		}
	}

} else {
	# Missing required options: show the usage text and report UNKNOWN.
	usage();
	exit 3;
}



sub usage {
print <<EOF;
Usage: $0 [--host=<host> --port=<port> --file=<file>]

   --host | -H  : TFTP server.
   --port | -p  : TFTP Port.
   --file | -f  : Test file that will be downloaded.
   --help | -h  : This help message.

Optionally,
   --rport | -R : Explicitly force the reverse originating connection's port.

EOF
}

https://github.com/alpha01/SysAdmin-Scripts/blob/master/nagios-plugins/c...

Seeing the plugin in action:
Assuming we're using UDP port 1069 to allow the TFTP server (192.168.1.2) to connect to the Nagios monitoring machine.

# iptables -L -n | grep "Chain INPUT"
Chain INPUT (policy DROP)
# iptables-save | grep 1069
-A INPUT -s 192.168.1.2/32 -p udp -m udp --dport 1069 -j ACCEPT

Firewall not allowing TFTP to connect back using port 1066.

# su - nagios -c '/usr/local/nagios/libexec/check_tftp.pl -H 192.168.1.2 -p 69 -R 1066 -f test.txt'
CRITICAL: Transfer timed out.

Downloading a non-existent file from the TFTP server.

# su - nagios -c '/usr/local/nagios/libexec/check_tftp.pl -H 192.168.1.2 -p 69 -R 1069 -f test.txtFAKESHIT'
WARNING: Error code 1: File not found

Successful connection and transfer.

# su - nagios -c '/usr/local/nagios/libexec/check_tftp.pl -H 192.168.1.2 -p 69 -R 1069 -f test.txt'
TFTP is ok.


ZFS on Linux: Nagios check_zfs plugin

To monitor my ZFS pool, of course I'm using Nagios, duh. Nagios Exchange provides a check_zfs plugin written in Perl. http://exchange.nagios.org/directory/Plugins/Operating-Systems/Solaris/c...

Although the plugin was originally designed for Solaris and FreeBSD systems, I got it to work under my Linux system with very little modification. The code can be found on my SysAdmin-Scripts git repo on GitHub https://github.com/alpha01/SysAdmin-Scripts/blob/master/nagios-plugins/c...

# su - nagios -c "/usr/local/nagios/libexec/check_zfs backups 3"
OK ZPOOL backups : ONLINE {Size:464G Used:11.1G Avail:453G Cap:2%}


Installing the Nagios Service Check Acceptor

One of the cool things that Nagios supports is the ability to do passive checks. That is, instead of Nagios actively checking a client machine for errors, the client is able to send error notifications to Nagios. This can be accomplished using the Nagios Service Check Acceptor (NSCA).

Installing it is a straightforward process. The following are the steps I took to get it working under CentOS 6 (Nagios server) and CentOS 5 (client).

Install dependencies:

yum install libmcrypt libmcrypt-devel

Download the latest stable version (I tend to stick with stable versions unless it's absolutely necessary to run development versions), then configure and compile it.

wget http://prdownloads.sourceforge.net/sourceforge/nagios/nsca-2.7.2.tar.gz
tar -xvf nsca-2.7.2.tar.gz
cd nsca-2.7.2
./configure
[...]
*** Configuration summary for nsca 2.7.2 07-03-2007 ***:

General Options:
-------------------------
NSCA port: 5667
NSCA user: nagios
NSCA group: nagios

make all

Copy the sample xinetd config file and the nsca.cfg file.

cp sample-config/nsca.cfg /usr/local/nagios/etc/
cp sample-config/nsca.xinetd /etc/xinetd.d/nsca

Update /etc/xinetd.d/nsca (where 10.10.1.20 is the client IP that will be passively checked)

# default: on
# description: NSCA
service nsca
{
        flags           = REUSE
        socket_type     = stream
        wait            = no
        user            = nagios
        group           = nagcmd
        server          = /usr/local/nagios/bin/nsca
        server_args     = -c /usr/local/nagios/etc/nsca.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = 10.10.1.20
}

Restart xinetd

service xinetd restart

Verify that it's running

netstat -anp|grep 5667
tcp 0 0 :::5667 :::* LISTEN 30008/xinetd

Add firewall rule

iptables -A INPUT -p tcp -m tcp --dport 5667 -s 10.10.1.20 -j ACCEPT

Set the password and update the decryption method in /usr/local/nagios/etc/nsca.cfg

Finally, update the permissions so no one can read the password.

chmod 400 /usr/local/nagios/etc/nsca.cfg
chown nagios:nagios /usr/local/nagios/etc/nsca.cfg

Now let's configure the client machine. The same dependencies also need to be installed on the client system. I also went ahead and downloaded and compiled nsca there. (In theory, I could have just copied over the send_nsca binary that was compiled on the Nagios server, since both are x64 Linux systems.)
Once it is compiled, copy the send_nsca binary and update its permissions.

cp src/send_nsca /usr/local/nagios/bin/
chown nagios:nagios /usr/local/nagios/bin/send_nsca
chmod 4710 /usr/local/nagios/bin/send_nsca

Copy the sample send_nsca.cfg config file and update the encryption settings; these must match the settings on the nsca server.

cp sample-config/send_nsca.cfg /usr/local/nagios/etc/

Finally, update the permissions so no one can read the password.

chown nagios:nagios /usr/local/nagios/etc/send_nsca.cfg
chmod 400 /usr/local/nagios/etc/send_nsca.cfg

Now you can use the following test script to test the settings.

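#!/bin/bash
CFG="/usr/local/nagios/etc/send_nsca.cfg"
# Format: host;service;return code;plugin output
CMD="rubyninja;test;3;UNKNOWN - just an nsca test"

# Replace $nagiosserveriphere with the IP address of the Nagios server.
/bin/echo "$CMD" | /usr/local/nagios/bin/send_nsca -H $nagiosserveriphere -d ';' -c "$CFG"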
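The line piped into send_nsca is just four fields joined with the delimiter given via -d. A small sketch of building that line, so the return code and message can come from a real check rather than being hard-coded; the function name nsca_line is hypothetical and only used for illustration:

```shell
# Hypothetical helper: build one passive check result line for send_nsca.
# Fields: host, service description, return code, plugin output,
# joined with the same ';' delimiter that is passed to send_nsca via -d.
nsca_line() {
    printf '%s;%s;%s;%s\n' "$1" "$2" "$3" "$4"
}

# Reproduces the line used in the test script above.
nsca_line "rubyninja" "test" 3 "UNKNOWN - just an nsca test"
```

The output of nsca_line can be piped into send_nsca exactly as the test script does with echo.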

In my case:

# su - nagios -c 'bash /usr/local/nagios/libexec/test_nsca'
1 data packet(s) sent to host successfully.

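The server successfully received the passive check. The warning in the log is expected here, since no 'test' service is defined for host 'rubyninja' on the Nagios server; it still confirms that the packet arrived.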

Feb 22 20:46:39 monitor nagios: Warning:  Passive check result was received for service 'test' on host 'rubyninja', but the service could not be found!

Last words: the only problem I ran into was getting xinetd to load the newly available nsca service properly.

xinetd[3499]: Started working: 0 available services
nsca[3615]: Handling the connection...
nsca[3615]: Could not send init packet to client

Fix: This was because the sample nsca.xinetd file had nagios as the group setting. I simply had to update it to 'nagcmd'.
I suspect this is because of the permissions set on the Nagios command file nagios.cmd, which is the interface for the external commands sent to the Nagios server.


Custom Nagios mdadm monitoring: check_mdadm-raid

Simple Nagios mdadm monitoring plugin.

#!/usr/bin/env ruby

# Tony Baltazar. root[@]rubyninja.org

OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3

# Note to self, mdadm exit status:
#0 The array is functioning normally.
#1 The array has at least one failed device.
#2 The array has multiple failed devices such that it is unusable.
#4 There was an error while trying to get information about the device.

raid_device = '/dev/md0'

get_raid_output = %x[sudo mdadm --detail #{raid_device}].lines.to_a


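# Pull the value out of the "State : ..." line; fall back to an empty
# string when the line is missing (for example, if mdadm itself failed).
state_line = get_raid_output.grep(/\sState\s:\s/).first
raid_state = state_line ? state_line.split(':', 2)[1].strip : ''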



if raid_state.empty?
 print "Unable to get RAID status!"
 exit UNKNOWN
end

if /^(clean(, checking)?|active)$/.match(raid_state) 
 print "RAID OK: #{raid_state}"
 exit OK
elsif /degraded/i.match(raid_state)
 print "WARNING RAID: #{raid_state}"
 exit WARNING
elsif /fail/i.match(raid_state)
 print "CRITICAL RAID: #{raid_state}"
 exit CRITICAL
else
 print "UNKNOWN RAID detected: #{raid_state}"
 exit UNKNOWN
end
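The exit statuses noted in the plugin's comments could also drive the check directly instead of parsing the State line. To my knowledge, mdadm sets these statuses when --detail is combined with --test. The sketch below only shows the mapping to Nagios states; the function name mdadm_status_to_nagios is hypothetical.

```shell
# Hypothetical mapping from an mdadm exit status (as listed in the plugin's
# comments) to a Nagios state code: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
mdadm_status_to_nagios() {
    case "$1" in
        0) echo 0 ;;   # array is functioning normally
        1) echo 1 ;;   # at least one failed device
        2) echo 2 ;;   # multiple failed devices, array unusable
        *) echo 3 ;;   # 4 or anything else: no usable device information
    esac
}

for status in 0 1 2 4; do
    echo "mdadm $status -> nagios $(mdadm_status_to_nagios "$status")"
done
```

Parsing the State line, as the Ruby plugin does, has the advantage of putting the human-readable state into the alert text.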


Monitoring computer's temperature with lm_sensors

One of the primary reasons I use SSDs in both of the Mac Minis that I use as hypervisors (besides speed) is that, compared to regular hard drives, SSDs consume far less power and, more importantly, generate less heat. Before I switched to SSDs, the fan noise both machines made in the middle of summer was much more noticeable than at any other time of the year.

Although at the time I did little research into proactively monitoring the temperature of my machines, thanks to the Nagios book that I'm currently reading I have now learned about lm-sensors, a tool for monitoring hardware temperature in Linux.

Installing lm-sensors on Ubuntu Server 12.04 is really simple.

sudo apt-get install libsensors4 libsensors4-dev lm-sensors

Since lm-sensors requires low-level hooks to monitor hardware temperature, it comes with the sensors-detect utility, which can automatically detect and load the appropriate kernel modules for the hardware in question.

$ sudo sensors-detect
# sensors-detect revision 5984 (2011-07-10 21:22:53 +0200)
# System: Apple Inc. Macmini5,1
# Board: Apple Inc. Mac-8ED6AF5B48C039E1

This program will help you determine which kernel modules you need
to load to use lm_sensors most effectively. It is generally safe
and recommended to accept the default answers to all questions,
unless you know what you're doing.

Some south bridges, CPUs or memory controllers contain embedded sensors.
Do you want to scan for them? This is totally safe. (YES/no): YES
[...]

In the case of my mid 2011 Apple Mac Minis, it was only able to use the coretemp module. File /etc/modules :

# Generated by sensors-detect on Sat Feb  2 21:22:20 2013
# Chip drivers
coretemp

After the module has been added, it's just a matter of loading the newly added modules.

[....]
Do you want to add these lines automatically to /etc/modules? (yes/NO)yes
Successful!

Monitoring programs won't work until the needed modules are
loaded. You may want to run 'service module-init-tools start'
to load them.

Unloading i2c-dev... OK
Unloading i2c-i801... OK
Unloading cpuid... OK

$ sudo service module-init-tools start
module-init-tools stop/waiting

Now that the appropriate kernel modules have been loaded, I have everything needed to check the temperature.

$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +49.0°C (high = +86.0°C, crit = +100.0°C)
Core 0: +48.0°C (high = +86.0°C, crit = +100.0°C)
Core 1: +50.0°C (high = +86.0°C, crit = +100.0°C)

applesmc-isa-0300
Adapter: ISA adapter
Exhaust : 1801 RPM (min = 1800 RPM)
TA0P: +36.0°C
TA0p: +36.0°C
TA1P: +34.8°C
TA1p: +34.8°C
TC0C: +47.0°C
TC0D: +44.8°C
TC0E: +57.5°C
TC0F: +58.5°C
TC0G: +94.0°C
TC0J: +0.8°C
TC0P: +42.5°C
TC0c: +47.0°C
TC0d: +44.8°C
TC0p: +42.5°C
TC1C: +50.0°C
TC1c: +50.0°C
TCFC: +0.2°C
TCGC: +49.0°C
TCGc: +49.0°C
TCPG: +98.0°C
TCSC: +50.0°C
TCSc: +50.0°C
TCTD: +255.5°C
TCXC: +49.5°C

Of course, I just had to write a Nagios plugin to monitor them:

#!/usr/bin/env perl
use strict;
# Tony Baltazar. root[@]rubyninja.org

use constant OK => 0;
use constant WARNING => 1;
use constant CRITICAL => 2;
use constant UNKNOWN => 3;

my %THRESHOLDS = (OK => 70, WARNING => 75, CRITICAL => 86);

# Sample output
#Physical id 0:  +55.0°C  (high = +86.0°C, crit = +100.0°C)
#Core 0:         +54.0°C  (high = +86.0°C, crit = +100.0°C)
#Core 1:         +55.0°C  (high = +86.0°C, crit = +100.0°C)
my @get_current_heat = split "\n", `sensors 2>/dev/null|grep -E -e '(Physical id 0|Core [0-1])'`;


my $counter = 0;
my $output_string;

for my $heat_usage_per_core (@get_current_heat) {
	# Skip any line that does not look like "<label>:  +NN.N°C".
	next unless $heat_usage_per_core =~ /(.*):\s+\+([0-9]{1,3})/;
	my ($core, $temp) = ($1, $2);

	# Check the highest threshold first so every reading lands in a branch.
	if ($temp >= $THRESHOLDS{CRITICAL}) {
		print "CRITICAL! $core temperature: $temp\n";
		exit(CRITICAL);
	} elsif ($temp >= $THRESHOLDS{WARNING}) {
		print "WARNING! $core temperature: $temp\n";
		exit(WARNING);
	} else {
		# Anything below the WARNING threshold counts as OK.
		$output_string .= "$core - temperature : $temp" . 'C | ';
		$counter++;
	}
}

if ($counter == 3 ) {
	print $output_string;
	exit(OK);
} else {
	print "Unable to get all CPU's temperature.\n";
	exit(UNKNOWN);
}
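For a quick look at the two steps the plugin performs (pulling the reading out of a sensors line, then comparing it against thresholds), here is a small shell sketch. The function names are hypothetical, and the 75/86 values are simply the WARNING and CRITICAL thresholds from the plugin above.

```shell
# Hypothetical helpers mirroring the plugin's parsing and threshold logic.
extract_temp() {
    reading=${1#*:}         # drop the label up to the first colon
    reading=${reading#*+}   # drop everything up to the first '+'
    reading=${reading%%.*}  # keep only the digits before the decimal point
    case "$reading" in
        ''|*[!0-9]*) return 1 ;;  # not a plain integer: no usable reading
    esac
    echo "$reading"
}

classify_temp() {
    # Thresholds taken from the plugin: WARNING at 75, CRITICAL at 86.
    if [ "$1" -ge 86 ]; then
        echo CRITICAL
    elif [ "$1" -ge 75 ]; then
        echo WARNING
    else
        echo OK
    fi
}

line='Core 0:         +54.0°C  (high = +86.0°C, crit = +100.0°C)'
temp=$(extract_temp "$line")
echo "Core 0: ${temp}C is $(classify_temp "$temp")"
```

Checking the highest threshold first means every reading falls into exactly one state.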


Monitoring DHCP server with check_dhcp

Setting up Nagios to monitor my DHCP server using the check_dhcp plugin was a little tricky.

First, the check_dhcp documentation indicates that setuid must be set on the check_dhcp binary in order to successfully query the DHCP server and receive a valid DHCP offer.

# su - nagios -c '/usr/local/nagios/libexec/check_dhcp -s 192.168.1.2'
Warning: This plugin must be either run as root or setuid root.
To run as root, you can use a tool like sudo.
To set the setuid permissions, use the command:
chmod u+s yourpluginfile
Error: Could not bind socket to interface eth0. Check your privileges...

Fix:

chown root:root check_dhcp
chmod u+s check_dhcp

Secondly, since all of my machines block all incoming traffic by default, I had to open UDP port 68 so that the Nagios machine could accept the DHCP offer.

iptables -A INPUT -p udp --dport 68 -j ACCEPT


Writing custom Nagios plugins: check_public-ip

Now that I think Nagios is the greatest thing since sliced bread, I'm slowly but surely rewriting all my custom monitoring scripts as Nagios plugins.

The following is a Nagios plugin ready script that I used to replace my old public IP monitoring (See https://www.rubysecurity.org/ip_monitoring ).

#!/bin/bash

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

current_ip="YOUR-IP-ADDRESS-HERE"
ip=$(curl --connect-timeout 30 -s ifconfig.me)

if [ "$current_ip" != "$ip" ] || [ -z "$ip" ]
then
        if [[ "$ip" =~ "Service Unavailable" ]] || [[ "$ip" =~ "html" ]]
        then
                echo "IP service monitoring is unavailable."
                exit $STATE_WARNING
        elif [[ "$ip"  =~ [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} ]]
        then
                echo "ALERT: Public IP has changed. NEW IP: $ip"
                exit $STATE_CRITICAL
        else
                echo "Unknown state detected."
                exit $STATE_UNKNOWN
        fi

else
        echo "Public IP OK: $ip"
        exit $STATE_OK
fi
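The IP pattern in the script is unanchored, so any reply that merely contains something IP-like would be reported as a changed address. A stricter check could look like the sketch below; the function name is_ipv4 is hypothetical, and the addresses in the loop are illustrative values only.

```shell
# Hypothetical helper: succeed only if the whole string is a dotted quad
# with every octet in the range 0-255.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}

for candidate in 203.0.113.7 999.1.1.1 "<html>Service Unavailable</html>"; do
    if is_ipv4 "$candidate"; then
        echo "$candidate: looks like an IPv4 address"
    else
        echo "$candidate: not an IPv4 address"
    fi
done
```

With a check like this, an HTML error page or other unexpected reply would end up in the "unknown state" branch rather than raising a false "IP has changed" alert.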


Cron monitoring plugin for Nagios

#!/bin/bash
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

CRON_CHECK=`ps aux | grep cron|grep -v grep|awk '{print $NF}'|grep -E -e '^(/usr/sbin/cron|crond)$'|wc -l`

case "${CRON_CHECK}" in
        0)  echo "Crond is not running."; exit ${STATE_CRITICAL}
        ;;
        1)  echo "Crond is running."; exit ${STATE_OK}
        ;;
        *)  echo "More than one crond process detected / crond is in an unknown state."; exit ${STATE_WARNING}
        ;;
esac


Installing Nagios Remote Plugin Executor in FreeBSD 9.1

This also installs the Nagios plugins in addition to nrpe. Follow the text-based menu install options. The installer will create and configure the nagios user account, and will install the nagios and nrpe plugins in /usr/local/libexec/nagios .

cd /usr/ports/net-mgmt/nrpe2
make install clean

Update permissions.

chown -R nagios:nagios /usr/local/libexec/nagios

Create nrpe config file.

cd /usr/local/etc
cp nrpe.cfg-sample nrpe.cfg

Add the following entry to /etc/rc.conf .

nrpe2_enable="YES"

Edit nrpe.cfg (Example: 192.168.1.5 is my nagios server)

allowed_hosts=192.168.1.5

Start the nrpe daemon.

/usr/local/etc/rc.d/nrpe2 start
