[LARTC] DGD patch not detecting dead gateway
Manish Kathuria
mkathuria at tuxtechnologies.co.in
Sat Jan 27 17:12:57 CET 2007
On 1/27/07, Geoff Dornan <geoff at cmcnetworks.net> wrote:
> Hi
>
> Can you post your script please?
>
> Cheers
> geoff
>
>
> On 1/20/07, Grant Taylor <gtaylor at riverviewtech.net> wrote:
> > On 01/19/07 12:45, Manish Kathuria wrote:
> > > My experience has been mixed. The patch worked very well in many
> > > cases, but in some it worked only if the first hop gateway was down
> > > and not any of the subsequent hops. So, as you mentioned, this
> > > happens because it can ping the switch / modem and therefore thinks
> > > the link is good. You can make a script which keeps running in the
> > > background and checks whether the links are up; if any of the links
> > > is down, it can change the default route and provide a failover.
> >
> > I have been tasked with writing such a script. In my scenario, I'm
> > taking it a bit further though. I am planning on having my script test
> > the actual service that I'm trying to connect to, i.e. connect to
> > port 80 and request a page. I'm having to go this route because I've
> > had sporadic MTU issues in one of our (primary) paths. The provider is
> > supposed to be repairing the problem; however, I need a solution
> > before that can happen.
>
> The method I have adopted is to use a shell script which pings a
> popular remote site's IP (for example www.yahoo.com or
> www.google.com) through each of the interfaces every 10 seconds. The
> default multipath route is replaced by a single default gateway if
> no reply is received for 4 consecutive tries from one of the links.
> This is to avoid very frequent failovers. However, the link is treated
> as live again as soon as a ping reply is received, and the multipath
> route is reactivated.
>
The script is appended. It assumes that you have followed the steps
described in nano.txt, with or without applying the patches. Though it
appears very simplistic, it is working well at a number of
locations.

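#!/bin/bash -x
# Host pinged through each link to decide whether the link is up.
TESTIP=www.yahoo.com
# Not used below.
CHECK=0
# Per-link state: 1 = up, 0 = down.
ISPA=1
ISPB=1
# 1 = both links up, 2 = only link A up, 3 = only link B up.
LINKSTATUS=1
# Consecutive failed pings per link.
COUNTA=0
COUNTB=0
EXTIF1=eth1
EXTIF2=eth2
GW1=172.16.1.1
GW2=192.168.1.1
# Weights for the multipath default route.
W1=1
W2=1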
while : ; do
    # Probe link A and count consecutive failures.
    ping -I $EXTIF1 -c 1 $TESTIP > /dev/null 2>&1
    RETVAL=$?
    if [ $RETVAL -ne 0 ]; then
        COUNTA=$((COUNTA + 1))
    else
        COUNTA=0
    fi
    # Treat the link as down only after 4 failures in a row.
    if [ $COUNTA -ge 4 ]; then
        ISPA=0
    else
        ISPA=1
    fi
    # Same check for link B.
    ping -I $EXTIF2 -c 1 $TESTIP > /dev/null 2>&1
    RETVAL=$?
    if [ $RETVAL -ne 0 ]; then
        COUNTB=$((COUNTB + 1))
    else
        COUNTB=0
    fi
    if [ $COUNTB -ge 4 ]; then
        ISPB=0
    else
        ISPB=1
    fi
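    # Map the two link states to a status code. If both links are down,
    # no branch matches and NEWSTATUS keeps its previous value.
    if [ $ISPA -eq 1 ]; then
        if [ $ISPB -eq 1 ]; then
            NEWSTATUS=1
        elif [ $ISPB -eq 0 ]; then
            NEWSTATUS=2
        fi
    elif [ $ISPA -eq 0 ]; then
        if [ $ISPB -eq 1 ]; then
            NEWSTATUS=3
        fi
    fi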
    # Change routes only when the status differs from the previous one.
    case $LINKSTATUS in
        1)  if [ $NEWSTATUS -eq 2 ]; then
                ip route replace default via $GW1 dev $EXTIF1
            elif [ $NEWSTATUS -eq 3 ]; then
                ip route replace default via $GW2 dev $EXTIF2
            fi;;
        2)  if [ $NEWSTATUS -eq 1 ]; then
                # Both links are back: remove the single default route
                # and restore the multipath route.
                ip route del default
                ip route replace default table 222 proto static \
                    nexthop via $GW1 dev $EXTIF1 weight $W1 \
                    nexthop via $GW2 dev $EXTIF2 weight $W2
            elif [ $NEWSTATUS -eq 3 ]; then
                ip route replace default via $GW2 dev $EXTIF2
            fi;;
        3)  if [ $NEWSTATUS -eq 1 ]; then
                ip route del default
                ip route replace default table 222 proto static \
                    nexthop via $GW1 dev $EXTIF1 weight $W1 \
                    nexthop via $GW2 dev $EXTIF2 weight $W2
            elif [ $NEWSTATUS -eq 2 ]; then
                ip route replace default via $GW1 dev $EXTIF1
            fi;;
        *)  echo;;
    esac
    LINKSTATUS=$NEWSTATUS
    sleep 10
done
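
The two per-link blocks in the loop are identical apart from the
variable names, so the counting and the status mapping could be factored
into small functions. The following is only a sketch of that idea; the
names update_failures, link_state, new_status and MAXFAIL are
hypothetical and do not appear in the script above.

```shell
# Hypothetical helpers; MAXFAIL mirrors the threshold of 4 used in the script.
MAXFAIL=4

# update_failures COUNT CMD [ARGS...]
# Runs the probe command and prints the new consecutive-failure count:
# 0 after a success, COUNT+1 after a failure.
update_failures() {
    count=$1
    shift
    if "$@" > /dev/null 2>&1; then
        echo 0
    else
        echo $((count + 1))
    fi
}

# link_state COUNT
# Prints 0 (down) once COUNT reaches MAXFAIL, otherwise 1 (up).
link_state() {
    if [ "$1" -ge "$MAXFAIL" ]; then
        echo 0
    else
        echo 1
    fi
}

# new_status ISPA ISPB PREVIOUS
# Prints 1 (both up), 2 (only A up) or 3 (only B up). When both links
# are down it prints PREVIOUS, which matches the script's behaviour of
# leaving the routes untouched in that case.
new_status() {
    case "$1$2" in
        11) echo 1 ;;
        10) echo 2 ;;
        01) echo 3 ;;
        *)  echo "$3" ;;
    esac
}
```

Inside the loop this could be used roughly as
COUNTA=$(update_failures "$COUNTA" ping -I "$EXTIF1" -c 1 "$TESTIP")
followed by ISPA=$(link_state "$COUNTA"), and likewise for link B.
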
Let me know if you can think of any improvements or modifications.
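
One possible modification, along the lines Grant described (connecting
to port 80 and requesting a page), would be to replace the ping with an
HTTP request sent through each interface. This is an untested sketch: it
assumes curl is installed and that the per-interface routing rules
already in place also apply to traffic bound with curl's --interface
option. The function name check_http and the 5 second timeout are
hypothetical choices.

```shell
# check_http IFACE URL
# Returns 0 if URL answers with a successful HTTP status when the
# request is sent from interface IFACE, non-zero otherwise
# (connection error, timeout, or an HTTP error status).
check_http() {
    curl --interface "$1" --silent --fail --max-time 5 \
         --output /dev/null "$2"
}
```

In the loop, a call such as check_http "$EXTIF1" "http://$TESTIP/" could
then stand in for the ping command, with $? checked in the same way.
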
--
Manish Kathuria
Tux Technologies
http://www.tuxtechnologies.co.in/