Nagios Plugin to Check the Status of PRI Lines in Asterisk

I keep an eye on a number of Asterisk implementations that have multiple PRI connections. Knowing if and when a PRI goes down has the obvious benefit of alerting me to a problem in near real time. Beyond that, it allows my customers and me to verify SLAs, track and log issues, etc.

To this end, I have written a Nagios plugin which queries Asterisk’s manager interface and executes the pri show spans CLI command (this is Asterisk 1.4 by the way). The script then parses the output to ascertain whether a PRI is up or not.

The actual code to connect to the manager interface and execute the query is simply:

if( ( $astsock = fsockopen( $host, $port, $errno, $errstr, $timeout ) ) === false )
{
    echo "Could not connect to Asterisk manager: $errstr";
    exit( STATUS_CRITICAL );
}

fputs( $astsock, "Action: Login\r\n");
fputs( $astsock, "UserName: $username\r\n");
fputs( $astsock, "Secret: $password\r\n\r\n"); 

fputs( $astsock, "Action: Command\r\n");
fputs( $astsock, "Command: pri show spans\r\n\r\n");

fputs( $astsock, "Action: Logoff\r\n\r\n");

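// Initialise the buffer so the first .= does not raise an undefined variable notice
$asttext = '';
while( !feof( $astsock ) )
{
    $asttext .= fread( $astsock, 8192 );
}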

fclose( $astsock );

if( strpos( $asttext, "Authentication failed" ) !== false )
{
    echo "Asterisk manager authentication failed.";
    exit( STATUS_CRITICAL );
}

The plugin is hard-coded to English and expects to find Provisioned, Up, Active for a good PRI. For example, the Asterisk implementations I have access to that support the pri show spans command return one of:

  • PRI span 1/0: Provisioned, In Alarm, Down, Active
  • PRI span 3/0: Provisioned, Up, Active
  • PRI span 2/0: Up, Active
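The parsing step can be sketched in shell. This is not the plugin's code (which is PHP), just a minimal illustration assuming one span per line in the format listed above; the function name count_up_spans is made up for the example. A span is counted as good when its line contains "Up, Active", which also covers output that lacks the word "Provisioned".

```bash
#!/bin/bash
# Illustrative sketch only: count the spans that report "Up, Active".
# Reads "pri show spans" output on stdin and prints the count.
count_up_spans() {
    grep -c '^PRI span [0-9]*/[0-9]*: .*Up, Active'
}

# The three sample lines from above: only spans 3/0 and 2/0 are up.
count_up_spans <<'EOF'
PRI span 1/0: Provisioned, In Alarm, Down, Active
PRI span 3/0: Provisioned, Up, Active
PRI span 2/0: Up, Active
EOF
# prints 2
```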

I’m actually running a slightly older version of Nagios at the moment, version 1.3. To integrate the plugin, first add the following command definition to an appropriate existing or new file under /etc/nagios-plugins/config/:

define command{
        command_name    check_asterisk_pri
        command_line    /usr/lib/nagios/plugins/check_asterisk_pri.php \
             -H $HOSTADDRESS$ -U $ARG1$ -P $ARG2$ -w $ARG3$ \
             -c $ARG4$ -n $ARG5$
}

where $ARG1$ is the Asterisk manager username and $ARG2$ is the password. $ARG3$ and $ARG4$ are the warning and critical thresholds respectively: if the number of available PRIs reaches one of these values, the appropriate error condition will be set. Lastly, $ARG5$ is the number of PRIs the plugin should expect to find.

NB: the command_line line above is split for readability but it should all be on the one line.
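As a rough illustration of how the warning and critical thresholds described above map onto Nagios exit codes, here is a minimal shell sketch. It is not taken from the plugin; the function name pri_status is hypothetical, and it assumes a threshold is "reached" when the number of PRIs that are up is at or below it.

```bash
#!/bin/bash
# Illustrative sketch only: map the number of PRIs that are up to a
# Nagios exit status (0 = OK, 1 = WARNING, 2 = CRITICAL).
# Assumption: a threshold is "reached" when the count is at or below it.
pri_status() {
    local up=$1 warn=$2 crit=$3
    if (( up <= crit )); then
        echo 2
    elif (( up <= warn )); then
        echo 1
    else
        echo 0
    fi
}

pri_status 4 2 1   # all four PRIs up: prints 0
pri_status 2 2 1   # two left, warning threshold reached: prints 1
pri_status 1 2 1   # one left, critical threshold reached: prints 2
```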

Now create a test for a host in an appropriate file in /etc/nagios/config/:

define service{
        use                             core-service
        host_name                       hostname.domain.ie
        service_description             Asterisk PRIs
        check_command                   check_asterisk_pri!user!pass!2!1!4
}

Ensure that your Nagios server has permissions to access the Asterisk server via TCP on the Asterisk manager port (5038 by default). If on a public network, this should be done via stunnel or a VPN for security reasons.

Lastly, you’ll need a user with the appropriate permissions and host allow statements in your Asterisk configuration (/etc/asterisk/manager.conf):

[username]
secret = password
deny=0.0.0.0/0.0.0.0
permit=1.2.3.4/255.255.255.255
read = command
write = command

The next version may also include support for BRI and Zap FXO ports. I also plan on a Cacti plugin to show the channels on each PRI (up – on a call, down, etc). In any case, updates will be posted here.

The plugin can be downloaded from: http://www.opensolutions.ie/misc/check_asterisk_pri.php.txt

UPDATED 20/03/2012: Asterisk 1.8.9 takes out the word “Provisioned” in “pri show spans”. Thanks to Shane O’Cain.

Easy PHP Search in Firefox

Niall has created a quick OpenSearch file to add the PHP function search to the search bar of Firefox 2 and IE7. If anyone is interested it’s available here.

For those who don’t know, this feature has existed in KDE in multiple forms for some time. For example, pressing ALT-F2 opens the Run Command dialog and typing, for example:

php:fopen

will bring up PHP.net’s own search page. The same goes for the location bar in Konqueror.

By the way, other nice short cuts in the Run Command dialog include:

  • gg: <keywords> for a quick Google search;
  • wp: <keywords> for a quick Wikipedia search;
  • dict: <keyword> for a quick dictionary look-up;
  • man: <keyword> for a man page look-up;
  • info: <keyword> for an info page look-up;
  • rfc: <number> to be brought to the relevant RFC page.

Of course, entering a command will execute it and just entering a URL will open it in Konqueror.

lft :: Layer Four Trace

Colin pointed out a useful utility called lft in response to a question on IIU. lft looks like a useful alternative traceroute application as it claims to have the ability to identify stateful inspection firewalls and other useful information.

What I found immediately attractive was the -A option, which displays the AS numbers of addresses along the path, and also the -N option, which looks up and displays the network names.

e.g.

# lft -S -A  www.yahoo.com

TTL  LFT trace to f1.us.www.vip.ird.yahoo.com (87.248.113.14):80/tcp
 ...
 ...
 3   [AS35272] lns3.net.imagine.ie (87.232.0.26) 27.3ms
 4   [AS35272] ve5.core.net.imagine.ie (87.232.0.129) 9.0ms
 5   [AS35272] ge0-0.border1.net.imagine.ie (87.232.0.1) 8.6ms
 6   [AS3257] ge-2-0-0-207.dub20.ip.tiscali.net (213.200.67.145) 13.8ms
 7   [AS3257] yahoo-overture-gw2.dub20.ip.tiscali.net (213.200.67.202) 13.9ms
 8   [AS34010] ge-1-4.bas-b1.ird.yahoo.com (87.248.101.13) 10.9ms
 9   [AS34010] [target] f1.us.www.vip.ird.yahoo.com (87.248.113.14):80 12.6ms

and

# lft -S -N www.heanet.ie

TTL  LFT trace to www.heanet.ie (193.1.219.79):80/tcp
 ...
 ...
 3   [87-RIPE/IMAGINE-IRL] lns1.net.imagine.ie (87.232.0.24) 24.0ms
 4   [87-RIPE/IMAGINE-IRL] ve5.core.net.imagine.ie (87.232.0.129) 22.3ms
 5   [87-RIPE/IMAGINE-IRL] ge0-0.border1.net.imagine.ie (87.232.0.1) 60.6ms
 6   [RIPE-CBLK/IE-INEX-IPV4-PI-NETBLK1] gige6-1-cr1-cwt.hea.net (193.242.111.16) 8.7ms
 7   [RIPE-CBLK/HEANET-EXT] gige6-1-ar1-cwt.hea.net (193.1.195.177) 45.4ms
 8   [RIPE-CBLK/HEANET-EXT] blanch-sr1-po1.services.hea.net (193.1.195.139) 25.6ms
 9   [RIPE-CBLK/HEANET-LAN] [target] www.heanet.ie (193.1.219.79):80 9.4ms
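If all you want from the -A output is the sequence of autonomous systems along the path, a small filter will do. This is a minimal sketch that assumes the [AS…] tag format shown in the first trace above; the function name as_path is made up for the example.

```bash
#!/bin/bash
# Illustrative sketch only: print each AS number once, in path order.
# Reads "lft -A" output on stdin; lines without an [AS...] tag are skipped.
as_path() {
    # uniq only collapses adjacent repeats, which is what a path needs.
    sed -n 's/.*\[\(AS[0-9][0-9]*\)\].*/\1/p' | uniq
}
```

Used as lft -S -A www.yahoo.com | as_path, the hops shown in the first trace above would reduce to AS35272, AS3257 and AS34010, one per line.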

 

Nagios Plugin for the Promise VTrak 200i

For a project I was working on, I installed a Promise VTrak M200i disk shelf (i for iSCSI but then that’s a whole other blog post!) and needed to add it into the customer’s management systems.

Unfortunately there didn’t seem to be a lot of information out there on Promise’s SNMP MIBs, so it took a bit of playing about to dig out the OIDs I needed. The Nagios plugin I wrote and am making available here will monitor the shelf via SNMP and alert on the following chassis issues:

  • critical if any of the shelf’s disk states changes from “OK”;
  • warning if the battery state changes from “FullyCharged”;
  • critical if either of the PSU states changes from “Powered On and Functional”;
  • critical if any of the cooling devices (fans) change from “Functional”;
  • critical if any of the temperature sensors’ states change from “normal”;
  • critical if any of the drives go offline or are missing; and
  • warning if any of the drives go into the rebuilding state or have their PFA flag set.

While this is specifically designed for a single M200i, it should be easily customisable for other models.

It can be downloaded from here (http://www.opensolutions.ie/). It will also appear on the development section of this site and Nagios Plugins.

OIDs Used

.1.3.6.1.4.1.7933.1.10.2.1.1.1.8
The table of physical disk statuses.
.1.3.6.1.4.1.7933.2.1.7.1.1.14.1.1
The battery status.
.1.3.6.1.4.1.7933.2.1.4.1.1.2.1
The table of Power Supply Unit statuses.
.1.3.6.1.4.1.7933.2.1.3.1.1.3.1
The table of cooling device/fan statuses.
.1.3.6.1.4.1.7933.2.1.5.1.1.3
The table of temperature sensor statuses.
.1.3.6.1.4.1.7933.1.10.1.2.1.1.22.1
The number of drives that are offline.
.1.3.6.1.4.1.7933.1.10.1.2.1.1.23.1
The number of drives with the PFA flag set.
.1.3.6.1.4.1.7933.1.10.1.2.1.1.24.1
The number of drives in rebuild status.
.1.3.6.1.4.1.7933.1.10.1.2.1.1.25.1
The number of drives that are missing.
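The table checks listed above all follow the same pattern: walk the table and alert if any row differs from the expected state. The following is a minimal shell sketch of that comparison, not the plugin's actual code. The function name all_states_are is hypothetical, and the exact value strings returned depend on the device and on snmpwalk's output options, so treat "OK" as a placeholder.

```bash
#!/bin/bash
# Illustrative sketch only: succeed when every status value read from
# stdin (one per line) equals the expected string. An empty input is
# treated as a failure, since it usually means the device did not answer.
all_states_are() {
    local expected=$1 value seen=0
    while IFS= read -r value; do
        seen=1
        [ "$value" = "$expected" ] || return 1
    done
    [ "$seen" -eq 1 ]
}
```

With net-snmp it could be fed along the lines of snmpwalk -v 2c -c <community> -Oqv <host> .1.3.6.1.4.1.7933.1.10.2.1.1.1.8 | tr -d '"' | all_states_are OK, where -Oqv asks snmpwalk to print values only and the tr strips any quoting around string values.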

Nagios Alerts via SMS with Kapow

I have a client who required a Nagios installation with alerting via SMS (*). They use Kapow as their SMS gateway.

There were two aspects required:

  1. The sending of alerts via the SMS gateway;
  2. The monitoring of available credits on the SMS gateway.

 

1. Send Alerts via SMS Gateway

The sendsms script is:

#! /bin/bash

USERNAME=username
PASSWORD=password
SENDSMSADDRESS="https://www.kapow.co.uk/scripts/sendsms.php"
MAXMSGLENGTH=320

read -n $MAXMSGLENGTH -r MSG

# Pass the message as an argument so quotes or $ in the text cannot break the PHP code
MSG=`php -r 'echo urlencode( $argv[1] );' -- "$MSG"`

wget -q -O - "$SENDSMSADDRESS?username=$USERNAME&password=$PASSWORD&mobile=$1&sms=$MSG"

I use a quick hack with PHP to URL encode the string. I didn’t know a shell command off hand but I’m open to suggestions. This can be tested with:

echo This is a test message | sendsms 353861234567
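On the question of URL encoding without PHP, one possibility is a small pure-bash function. This is a minimal sketch rather than a drop-in replacement: the function name urlencode is my own, it is only intended for ASCII text, and it encodes a space as %20 where PHP's urlencode() produces +. Both forms are valid in a query string, but I have not tested this against Kapow.

```bash
#!/bin/bash
# Illustrative sketch only: percent-encode a string for use in a URL
# query. Letters, digits and . ~ _ - pass through; everything else
# becomes %XX. Intended for ASCII input.
urlencode() {
    local LC_ALL=C          # work on bytes, not multi-byte characters
    local s=$1 out='' c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case $c in
            [a-zA-Z0-9.~_-]) out+=$c ;;
            *) printf -v c '%%%02X' "'$c"   # "'x" yields the code of x
               out+=$c ;;
        esac
    done
    printf '%s\n' "$out"
}

urlencode 'This is a test message'
# prints This%20is%20a%20test%20message
```

In sendsms the PHP line would then become MSG=$(urlencode "$MSG").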

Edit /etc/nagios/misccommands.cfg to include the following:

# 'host-notify-by-sms' command definition
define command{
        command_name    host-notify-by-sms
        command_line    /usr/bin/printf "%b" "Host '$HOSTALIAS$' is $HOSTSTATE$: $OUTPUT$" | /usr/local/bin/sendsms $CONTACTPAGER$
        }

# 'notify-by-sms' command definition
define command{
        command_name    notify-by-sms
        command_line    /usr/bin/printf "%b" "$NOTIFICATIONTYPE$: $SERVICEDESC$@$HOSTNAME$: $SERVICESTATE$ ($OUTPUT$)" | /usr/local/bin/sendsms $CONTACTPAGER$
        }

Ensure your /etc/nagios/contacts.cfg is updated to include notification by SMS with your mobile number:

define contact{
        contact_name                    barryo
        alias                           Barry O'Donovan
        service_notification_period     barryoworkhours
        host_notification_period        barryoworkhours
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-by-email,notify-by-sms
        host_notification_commands      host-notify-by-email,host-notify-by-sms
        email                           joe@bloggs.com
        pager                           353868765432
}

Sin é.

 

2. Monitor SMS Gateway Credits

The plugin code is:

#! /bin/bash

USERNAME=username
PASSWORD=password
CHECKCREDITSADDRES="https://www.kapow.co.uk/scripts/chk_credit.php"

CRIT=$1
WARN=$2

CREDITS=`wget -q -O - "$CHECKCREDITSADDRES?username=$USERNAME&password=$PASSWORD"`

# Anything other than a non-negative integer (empty reply, error text) is UNKNOWN
if [[ ! $CREDITS =~ ^[0-9]+$ ]]; then
        echo -e "$CREDITS\\n";
        exit 3;
elif [[ $CREDITS -le $CRIT ]]; then
        echo -e "$CREDITS SMS credits remaining\\n";
        exit 2;
elif [[ $CREDITS -le $WARN ]]; then
        echo -e "$CREDITS SMS credits remaining\\n";
        exit 1;
else
        echo -e "$CREDITS SMS credits remaining\\n";
        exit 0;
fi

Create a plugin configuration file for Nagios, say /etc/nagios-plugins/config/sms_credits.cfg:

# 'check_sms_credits' command definition
define command{
        command_name    check_sms_credits
        command_line    /usr/local/bin/check_sms_credit $ARG2$ $ARG1$
        }

Where $ARG1$ is the warning threshold and $ARG2$ is the critical threshold.

I add the service to the Nagios monitoring box via /etc/nagios/config/sms_credit.cfg:

#
# check sms credits on Kapow - barryo 20070519
#

define service{
        use                             core-service
        host_name                       noc
        service_description             SMS Credits
        check_command                   check_sms_credits!100!50
}

And I believe that’s it.

*) The monitoring box is in a different country to the servers it monitors so a network failure will not prevent the alert getting out.

Putting /etc Under Subversion (SVN)

It took some Googling to locate the exact recipe I wanted for this. The problem is that one really needs to do an ‘in-place’ import. The solution came from Subversion’s own FAQs (specifically this) and is reproduced here with some changes:

# svn mkdir svn+ssh://user@host/srv/svn-repository/hosts/host1/etc \
         -m "Make a directory in the repository to correspond to /etc for this host"
# cd /etc
# svn checkout svn+ssh://user@host/srv/svn-repository/hosts/host1/etc .
# svn add *
# svn commit -m "Initial version of this host's config files"

 

OpenVPN “Just Works”

When it comes to OSS, it very often happens that I find something I like and stick with it.

OpenVPN is a good example of this.

I have a number of OpenVPN installations for various purposes and today I had need of yet another for a new client.

I often thought about writing a how-to for OpenVPN. But why bother? It’s quick and easy to implement and they already have a brief but comprehensive how-to which always does the job for me – once you’ve set it up once, the next time will take just 30 minutes.

OpenVPN just works. It does what it says on the tin and it’s reliable and robust.

IPMI Sensor Data on Dell 1850s and 2850s via SNMP and Cacti

I use Cacti to monitor a lot of Dell servers, primarily 1850s and 2850s but also the newer models of same (1950s and 2950s). One itch that I’ve meant to scratch for a while is graphing some of the information available through the servers’ IPMI interface; specifically the servers’ various temperatures and fan speeds.

IPMI Details

There are patches available for the Linux kernel to allow the IPMI information to be read via the lm_sensors project but I chose to avoid this (at least for now) as I’d have to schedule downtime to reboot the servers for a new kernel. It’d also ruin their uptime – most of the servers (serving many thousands of users daily) have almost two years of uptime. (The kernels are monolithic.)

Instead, I went with the already compiled in Linux IPMI Driver (see kernel source: Documentation/IPMI.txt) which is available in the ‘Character Devices’ menu. I specifically needed the following options for the Dells:

  • drivers/char/ipmi/ipmi_msghandler
  • drivers/char/ipmi/ipmi_devintf
  • drivers/char/ipmi/ipmi_si

In order to read information from the IPMI, you need the ipmitool utility which is available on most recent Linux distributions or from here.

Lastly, I needed to create a character special file to interface with the IPMI:

mknod /dev/ipmi0 c 254 0

The sensor information was then available via:

# ipmitool sensor
Temp             | 30.000     | degrees C  | ok    | na        | na        | na        | 85.000    | 90.000    | na
Temp             | 34.000     | degrees C  | ok    | na        | na        | na        | 85.000    | 90.000    | na
Ambient Temp     | 16.000     | degrees C  | ok    | na        | 3.000     | 8.000     | 42.000    | 47.000    | na
...

Making IPMI Sensor Information Available via SNMP

I make the IPMI sensor information available over SNMP by adding the following to the snmpd.conf file:

# Monitor IPMI Temperature and Fan stats
exec    .1.3.6.1.4.1.X.1000 ipmitemp        /usr/local/sbin/ipmi-temp-stats
exec    .1.3.6.1.4.1.X.1001 ipmifan         /usr/local/sbin/ipmi-fan-stats

(Replace X above as appropriate.)

The scripts referenced are: /usr/local/sbin/ipmi-temp-stats:

#! /bin/sh

PATH=/usr/bin:/bin
STATS=/tmp/ipmisensor-snmp

printf "%f\n" `cat $STATS | grep Temp | cut -s -d "|" -f 2`

And /usr/local/sbin/ipmi-fan-stats:

#! /bin/sh

PATH=/usr/bin:/bin
STATS=/tmp/ipmisensor-snmp

printf "%f\n" `cat $STATS | grep FAN | cut -s -d "|" -f 2`
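To see what these pipelines produce, here is a self-contained sketch that runs the same grep/cut/printf chain over the three sample lines of ipmitool output shown earlier. The function name extract_values is made up for the illustration; the real scripts read /tmp/ipmisensor-snmp instead of stdin.

```bash
#!/bin/bash
# Illustrative sketch only: print the reading (second column) of every
# sensor whose line matches the given pattern, one value per line.
extract_values() {
    local values
    values=$(grep "$1" | cut -s -d '|' -f 2)
    # Left unquoted on purpose: word splitting strips the column padding
    # and hands printf one argument per sensor.
    LC_ALL=C printf '%f\n' $values
}

extract_values Temp <<'EOF'
Temp             | 30.000     | degrees C  | ok    | na        | na        | na        | 85.000    | 90.000    | na
Temp             | 34.000     | degrees C  | ok    | na        | na        | na        | 85.000    | 90.000    | na
Ambient Temp     | 16.000     | degrees C  | ok    | na        | 3.000     | 8.000     | 42.000    | 47.000    | na
EOF
# prints 30.000000, 34.000000 and 16.000000 on separate lines
```

Note that grep Temp also matches "Ambient Temp", which is why the ambient reading shows up alongside the other temperatures in the SNMP output.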

The file they reference is regenerated every five minutes (the Cacti polling interval) via a cron entry in the file /etc/cron.d/ipmitool:

*/5 * * * * root /usr/bin/ipmitool sensor >/tmp/ipmisensor-snmp

After restarting SNMP and allowing the cron job to execute at least once, you can test the results via:

# snmpwalk -c <community> -v <version> <ip/hostname> .1.3.6.1.4.1.X.1000
SNMPv2-SMI::enterprises.X.1000.1.1 = INTEGER: 1
SNMPv2-SMI::enterprises.X.1000.2.1 = STRING: "ipmitemp"
SNMPv2-SMI::enterprises.X.1000.3.1 = STRING: "/usr/local/sbin/ipmi-temp-stats"
SNMPv2-SMI::enterprises.X.1000.100.1 = INTEGER: 0
SNMPv2-SMI::enterprises.X.1000.101.1 = STRING: "37.000000"
SNMPv2-SMI::enterprises.X.1000.101.2 = STRING: "39.000000"
SNMPv2-SMI::enterprises.X.1000.101.3 = STRING: "23.000000"
SNMPv2-SMI::enterprises.X.1000.101.4 = STRING: "36.000000"
...
SNMPv2-SMI::enterprises.X.1000.102.1 = INTEGER: 0
SNMPv2-SMI::enterprises.X.1000.103.1 = ""

Graphing This Information in Cacti

Finally, I graph this information on Cacti (see end of post for examples).

I am making six templates available here which can be imported into Cacti (these were generated using version 0.8.6j) for graphing the above:

  1. Cacti graph template for Dell 1850 temperatures (see first image below);
  2. Cacti graph template for Dell 2850 temperatures (see second image below);
  3. Cacti graph template for Dell 1850 fan speeds (see third image below);
  4. Cacti graph template for Dell 2850 fan speeds (see fourth image below);
  5. Cacti host template for Dell 1850; and
  6. Cacti host template for Dell 2850.

The last two templates available are host templates for Dell 1850s and 2850s (I’m sure they’ll work fine with 1950s and 2950s also). These templates include:

  • Host MIB – Logged in Users;
  • Host MIB – Processes;
  • IPMI Fan Speeds (Dell x850) (from above);
  • IPMI Temperatures (Cel) (Dell x850) (from above);
  • ucd/net – CPU Usage;
  • ucd/net – Load Average;
  • ucd/net – Memory Usage;
  • SNMP – Get Mounted Partitions (data query); and
  • SNMP – Interface Statistics (data query).

Example graphs are shown below; they’re not the cleanest given the amount of information they contain but they serve my purposes.

[Dell 1850 Temps]

[Dell 2850 Temps]

[Dell 1850 Fan Speeds]

[Dell 2850 Fan Speeds]

© 2007 Barry O’Donovan. All text is licensed under a Creative Commons Attribution 3.0 License. All scripts and Cacti templates are licensed under the MIT License.

Sangoma Inconsistencies with Latest Zaptel-1.4

I’m on a tight deadline and the last thing I need right now is kernel/Asterisk/Zaptel/Sangoma issues… This post may just help someone else save some time:

When running Sangoma’s Setup script from their wanpipe-2.3.4-7 release, the following error occurs when it tries to patch the latest zaptel (version 1.4, checked out from SVN at revision 2399):

Enable TDMV DCHAN Native HDLC Support & Patch Zaptel ? (y/n) y

Did NOT find the seached str:chan->writen\[chan->inwritebuf\] = amnt;
search_and_replace(zaptel-base.c) failed

Applying the following diff to Setup should solve the problem:

1c1
< #!/bin/sh
---
> #!/bin/bash
6128c6128
< ZAPTEL_C_SEARCH_STR="chan->writen\[chan->inwritebuf\] = amnt;"
---
> ZAPTEL_C_SEARCH_STR="chan->writen\[res\] = amnt;"
6134c6134
<                       chan->writen[chan->inwritebuf] = amnt;"
---
>                       chan->writen[res] = amnt;"

The change from sh to bash was to overcome the following error:

./Setup: 1014: Syntax error: Bad substitution
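For anyone wondering why the interpreter matters: on systems where /bin/sh is a strict POSIX shell such as dash, bash-only parameter expansions fail with exactly this kind of "Bad substitution" message. I have not checked which construct line 1014 of Setup actually uses, so the following is only a generic sketch of two such expansions; the variable name and value are illustrative.

```bash
#!/bin/bash
# Illustrative sketch only: two parameter expansions that bash accepts
# but a strict POSIX sh (for example dash) rejects as a bad substitution.
version="wanpipe-2.3.4-7"

replaced=${version//-/_}   # pattern replacement, a bash extension
prefix=${version:0:7}      # substring expansion, a bash extension

echo "$replaced"   # prints wanpipe_2.3.4_7
echo "$prefix"     # prints wanpipe
```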

Finally, it looks like Sangoma’s current wanpipe will not work with linux-2.6.20.x:

/usr/src/wanpipe/kdrvtmp/sdla_xilinx.c:636:62: error: macro "INIT_WORK" passed 3 arguments, but takes just 2
/usr/src/wanpipe/kdrvtmp/sdla_xilinx.c: In function ‘wp_xilinx_init’:
/usr/src/wanpipe/kdrvtmp/sdla_xilinx.c:636: error: ‘INIT_WORK’ undeclared (first use in this function)
/usr/src/wanpipe/kdrvtmp/sdla_xilinx.c:636: error: (Each undeclared identifier is reported only once
/usr/src/wanpipe/kdrvtmp/sdla_xilinx.c:636: error: for each function it appears in.)

It works fine with linux-2.6.19.x.

I have let Sangoma’s support team know of these issues so hopefully they’ll be resolved before anyone has to actually use this post.