Saturday, August 05, 2006

MySQL Replication Check for Nagios

I have often had MySQL replication break because of a network or system problem, and most of the time I found out about it too late. So I wrote a piece of code for Nagios that alerts me whenever replication runs into trouble. Below is a code snippet from the check_repl script I developed to monitor replication on a single-master, single-slave setup.

#!/bin/sh
###################################################################
# The script is a plugin for nagios. The script is used to check
# the replication status between the Master and Slave. The script
# has to be executed from the master.
# Written by : Jithesh M K
# Created on : July, 2006
# Updates on :
###################################################################

# Nagios alert status
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
STATE_DEPENDENT=4

# Script Variables
SLAVEIP_1= # IP Address of the slave
REPLUSERNAME=nagios # User allowed to run 'show master status' and 'show slave status'
REPLPASSWD=nagios
CRITICAL_VALUE=1000
WARNING_VALUE=500

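# Column 7 of 'show slave status' is the master log position read by the slave;
# column 2 of 'show master status' is the master's current log position.
iSlave_1=`mysql -h "$SLAVEIP_1" -u "$REPLUSERNAME" -p"$REPLPASSWD" -e "show slave status" | grep bin | cut -f7`
iMaster=`mysql -u "$REPLUSERNAME" -p"$REPLPASSWD" -e "show master status" | grep bin | cut -f2`

# If either query returned nothing, report UNKNOWN instead of failing in expr
if [ -z "$iSlave_1" ] || [ -z "$iMaster" ]
then
    echo "Unable to read the master or slave log position"
    exit $STATE_UNKNOWN
fi

iDiff_1=`expr "$iMaster" - "$iSlave_1"`
echo "Master Log Position : $iMaster 1st Slave Log Position : $iSlave_1 Difference : $iDiff_1"
if [ "$iDiff_1" -gt "$CRITICAL_VALUE" ]
then
    exit $STATE_CRITICAL
elif [ "$iDiff_1" -gt "$WARNING_VALUE" ]
then
    exit $STATE_WARNING
else
    exit $STATE_OK
fi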
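
One caveat with comparing positions alone: the two numbers are only comparable while the slave is reading the same binary log file that the master is currently writing. Below is a minimal sketch of a guard for that case. It assumes the file names are taken from column 1 of 'show master status' and from the Master_Log_File column of 'show slave status'. The helper name repl_diff and the choice to print UNKNOWN when the files differ are illustrative assumptions, not part of the original check_repl script.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original check_repl script.
# Arguments: master log file, master position, slave log file, slave position.
# Prints the position difference when both sides are on the same binary log
# file. Prints UNKNOWN and returns 1 when the files differ, because positions
# from different log files cannot be subtracted meaningfully.
repl_diff() {
    master_file=$1
    master_pos=$2
    slave_file=$3
    slave_pos=$4

    if [ "$master_file" != "$slave_file" ]; then
        echo "UNKNOWN"
        return 1
    fi

    # POSIX arithmetic expansion; unlike expr, it does not return a
    # non-zero exit status when the result is 0.
    echo $((master_pos - slave_pos))
}

# Example call with made-up values
repl_diff mysql-bin.000003 1500 mysql-bin.000003 1200   # prints 300
```

A plugin using this sketch could map the UNKNOWN result to STATE_UNKNOWN, or to STATE_CRITICAL if a slave that is a whole log file behind should always raise an alert.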

2 comments:

Unknown said...
This comment has been removed by the author.
Unknown said...

It's simple, but it's a much more sophisticated solution than monitoring just the slave status, IO status and "seconds behind".

I've just tuned your script in order to show graphs.

perfData="replication_lag=$iDiff_1;$WARNING_VALUE;$CRITICAL_VALUE;0;"

echo "Master Log Position : $iMaster 1st Slave Log Position : $iSlave_1 Difference : $iDiff_1|$perfData"
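
The performance-data string above follows the Nagios convention of label=value;warn;crit;min; placed after a '|' in the plugin output. As a rough sketch, the same string could be built by a small function so that it is easy to check in isolation. The function name format_perfdata and the sample values are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: builds a Nagios performance-data field of the form
# replication_lag=<lag>;<warn>;<crit>;0; from the lag and the two thresholds.
format_perfdata() {
    lag=$1
    warn=$2
    crit=$3
    echo "replication_lag=${lag};${warn};${crit};0;"
}

# Example with made-up values: human-readable text, then '|', then perfdata
echo "Difference : 120|`format_perfdata 120 500 1000`"
```

Keeping the thresholds in the performance data lets the graphing tool draw the warning and critical levels alongside the lag.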