[LARTC] DGD patch not detecting dead gateway

Manish Kathuria mkathuria at tuxtechnologies.co.in
Sat Jan 27 17:12:57 CET 2007


On 1/27/07, Geoff Dornan <geoff at cmcnetworks.net> wrote:
> Hi
>
> Can you post your script please?
>
> Cheers
> geoff
>
> On 1/20/07, Grant Taylor <gtaylor at riverviewtech.net> wrote:
> > On 01/19/07 12:45, Manish Kathuria wrote:
> > > My experience has been mixed. The patch worked very well in many cases
> > > but in some it worked only if the first hop gateway was down and not
> > > any of the subsequent hops. So, as you mentioned, this happens because
> > > it can ping the switch / modem and so thinks the link is good. You can
> > > make a script which will keep on running in the background and check if
> > > the links are up or not and if any of the links is down, it can change
> > > the default route and provide a failover.
> >
> > I have been tasked with writing such a script.  In my scenario, I'm
> > taking it a bit further though.  I am planning on having my script test
> > the actual service that I'm trying to connect to, i.e. connect to port
> > 80 and request a page.  I'm having to go this route because I've had
> > sporadic MTU issues in one of our (primary) paths.  The provider is
> > supposed to be repairing the problem, however I need a solution before
> > that can happen.
>
> The method I have adopted is to use a shell script which pings a
> popular remote site's IP (for example www.yahoo.com or
> www.google.com) through each of the interfaces every 10 seconds. The
> default multipath route is replaced by a single default gateway if
> no reply is received for 4 consecutive tries from one of the links.
> This is to avoid very frequent failovers. However, the link is treated
> as live as soon as a ping reply is received and the multipath route
> is activated.
>
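
The counting rule described above (4 consecutive misses marks a link dead, a
single reply marks it live again) can be sketched as two small functions. This
is only an illustration: the names next_count and link_is_live are made up
here and do not appear in the script itself.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the appended script.

# Prints the updated consecutive-failure count for one link.
#   $1 = current count, $2 = exit status of the last probe (0 = reply received)
next_count() {
	if [ "$2" -eq 0 ]; then
		echo 0                # any reply resets the counter
	else
		echo $(( $1 + 1 ))    # another miss in a row
	fi
}

# Succeeds (exit 0) while the link should still be treated as live.
#   $1 = consecutive-failure count, $2 = threshold (4 in the script)
link_is_live() {
	[ "$1" -lt "$2" ]
}
```

With a threshold of 4 and a 10 second sleep, a link is declared dead roughly
40 seconds after replies stop, and comes back on the first successful ping.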

The script is appended. It assumes that you have followed the steps as
described in nano.txt with or without applying the patches. Though it
appears to be very simplistic, it's working great at a number of
locations.

#!/bin/bash -x

# Host to ping through each link (a hostname here, so DNS must resolve for the test to pass)
TESTIP=www.yahoo.com
CHECK=0
ISPA=1
ISPB=1
LINKSTATUS=1
COUNTA=0
COUNTB=0
EXTIF1=eth1
EXTIF2=eth2
GW1=172.16.1.1
GW2=192.168.1.1
W1=1
W2=1

while : ; do

	ping -I $EXTIF1 -c 1 $TESTIP > /dev/null  2>&1
	RETVAL=$?
	if [ $RETVAL -ne 0 ]; then
		COUNTA=`expr $COUNTA + 1`
	else
		COUNTA=0
	fi

	if [ $COUNTA -ge 4 ]; then
		ISPA=0
	else
		ISPA=1
	fi

	ping -I $EXTIF2 -c 1 $TESTIP > /dev/null  2>&1
	RETVAL=$?
	if [ $RETVAL -ne 0 ]; then
		COUNTB=`expr $COUNTB + 1`
	else
		COUNTB=0
	fi

	if [ $COUNTB -ge 4 ]; then
		ISPB=0
	else
		ISPB=1
	fi


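	# Map the two link flags to a state:
	#   1 = both links up, 2 = only ISP A up, 3 = only ISP B up.
	# If both links are down, keep the previous state (nothing to fail over to).
	if [ $ISPA -eq 1 ]; then
		if [ $ISPB -eq 1 ]; then
			NEWSTATUS=1
		else
			NEWSTATUS=2
		fi
	elif [ $ISPB -eq 1 ]; then
		NEWSTATUS=3
	else
		NEWSTATUS=$LINKSTATUS
	fi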
	
	case $LINKSTATUS in

	1)	if [ $NEWSTATUS -eq 2 ]; then
			ip route replace default via $GW1 dev $EXTIF1
		elif [ $NEWSTATUS -eq 3 ]; then
			ip route replace default via $GW2 dev $EXTIF2
		fi;;

	2)	if [ $NEWSTATUS -eq 1 ]; then
			ip route del default
			ip route replace default table 222 proto static \
				nexthop via $GW1 dev $EXTIF1 weight $W1 \
				nexthop via $GW2 dev $EXTIF2 weight $W2
		elif [ $NEWSTATUS -eq 3 ]; then
			ip route replace default via $GW2 dev $EXTIF2
		fi;;

	3)	if [ $NEWSTATUS -eq 1 ]; then
			ip route del default
			ip route replace default table 222 proto static \
				nexthop via $GW1 dev $EXTIF1 weight $W1 \
				nexthop via $GW2 dev $EXTIF2 weight $W2
		elif [ $NEWSTATUS -eq 2 ]; then
			ip route replace default via $GW1 dev $EXTIF1
		fi;;

	*)	echo;;
		
	esac

	LINKSTATUS=$NEWSTATUS
	sleep 10
done
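
For the service-level test Grant describes (connect to port 80 and request a
page), the ping calls could be swapped for an HTTP probe. A rough sketch,
assuming curl is installed; the function name probe_http and the 5 second
timeout are placeholders chosen for illustration, not something taken from
the script above.

```shell
#!/bin/bash
# Hypothetical replacement for the ping test: fetch a page through one interface.
# Assumes curl is available; the name and the timeout are illustrative only.
#   $1 = outgoing interface (e.g. eth1), $2 = URL to request
# Exit status is 0 only if the page was fetched without an HTTP error.
probe_http() {
	curl --silent --fail --max-time 5 --interface "$1" --output /dev/null "$2"
}
```

In the loop it would be called as probe_http $EXTIF1 http://$TESTIP/ in place
of the ping line, leaving the RETVAL handling unchanged. Unlike a ping, this
should also fail when packets get through but a full page does not.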

Let me know if you can think of any improvements or modifications.

-- 
Manish Kathuria
Tux Technologies
http://www.tuxtechnologies.co.in/

