[LARTC] More things about /proc / failover of default gateway

Ard van Breemen ard@telegraafnet.nl
Fri, 22 Feb 2002 14:07:19 +0100


Hi,
while testing failover of the default gateways we found the following:
/proc/sys/net/ipv4/route/gc_timeout appears to be a timeout value (in
seconds) after which the kernel declares a route to be dead.
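
For anyone who wants to check the value on their own box first, here is a
minimal sketch. The GC_TIMEOUT_FILE variable and the show_gc_timeout function
are names made up for this example (they are not part of any tool); the
override only exists so the function can be tried against an ordinary file.

```shell
#!/bin/sh
# Path of the tunable discussed above. GC_TIMEOUT_FILE is a hypothetical
# override for this sketch; by default it points at the real /proc entry.
GC_TIMEOUT_FILE="${GC_TIMEOUT_FILE:-/proc/sys/net/ipv4/route/gc_timeout}"

# Print the current value, or complain on stderr if the file is not readable.
show_gc_timeout() {
    if [ -r "$GC_TIMEOUT_FILE" ]; then
        printf 'gc_timeout = %s seconds\n' "$(cat "$GC_TIMEOUT_FILE")"
    else
        printf 'cannot read %s\n' "$GC_TIMEOUT_FILE" >&2
        return 1
    fi
}
```

Calling show_gc_timeout after sourcing this prints one line with the current
setting.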

The setup:
We have a system that is connected to a switch with two NICs.
These NICs are on the same LAN, but carry different IP networks.
A host with two NICs on a switch:
ip link set dev eth0 up
ip link set dev eth1 up
ip addr add 192.168.1.10/24 dev eth0
ip addr add 192.168.2.10/24 dev eth1
ip route add default via 192.168.1.1
ip route add default via 192.168.2.1
echo 0 > /proc/sys/net/ipv4/conf/eth0/rp_filter
echo 0 > /proc/sys/net/ipv4/conf/eth1/rp_filter

A single router configuration (from a failover cluster):
ip link set dev eth0 up
ip addr add 192.168.1.1/24 dev eth0
ip addr add 192.168.2.1/24 dev eth0

As long as everything works, you will reach 192.168.2.10 through eth1,
and 192.168.1.10 through eth0.

If you unplug one of the two devices (simulating a dead NIC), it will take
some time plus gc_timeout (in seconds) for Linux to declare one of the two
default gateways dead and to start using the other default gateway.
As long as your source address is not within 192.168.1.0/24 or
192.168.2.0/24, the kernel must use a default gateway, and therefore
your link will be redundant.
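
A rough way to time such a failover from another machine is to probe the host
once per interval and count how long it stays silent. This is only a sketch
under assumptions: TARGET, INTERVAL, probe and measure_outage are names
invented here, the ping flags are those of iputils ping, and ICMP probes do
not necessarily behave like the existing connections we tested with.

```shell
#!/bin/sh
# Address to probe and seconds between probes; both are assumptions
# for this sketch and can be overridden from the environment.
TARGET="${TARGET:-192.168.1.10}"
INTERVAL="${INTERVAL:-1}"

# Succeeds when the target answers a single ICMP echo within 1 second.
probe() {
    ping -c 1 -W 1 "$TARGET" >/dev/null 2>&1
}

# Wait until the target stops answering, then count the seconds
# until it answers again.
measure_outage() {
    while probe; do sleep "$INTERVAL"; done
    start=$(date +%s)
    until probe; do sleep "$INTERVAL"; done
    end=$(date +%s)
    echo "outage lasted $((end - start)) seconds"
}
```

Start measure_outage while everything still works, then pull the cable. The
resolution is no better than INTERVAL plus the ping timeout.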
Setting gc_timeout to 10 seconds gave us a failover time of about 110
seconds for existing connections.
I did not look at the timers on the router etc., so those may also play a role.
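
For reference, changing the value for a test run could look like the sketch
below. GC_TIMEOUT_FILE and set_gc_timeout are hypothetical names for this
example only; writing to the real /proc entry needs root, and the setting is
lost on reboot.

```shell
#!/bin/sh
# Hypothetical override so the function can be tried on an ordinary file;
# by default it points at the real /proc entry.
GC_TIMEOUT_FILE="${GC_TIMEOUT_FILE:-/proc/sys/net/ipv4/route/gc_timeout}"

# $1: new timeout in seconds.
# Prints the previous value so it can be restored after the test.
set_gc_timeout() {
    old=$(cat "$GC_TIMEOUT_FILE") || return 1
    echo "$1" > "$GC_TIMEOUT_FILE" || return 1
    echo "$old"
}
```

Used as prev=$(set_gc_timeout 10) before the test and set_gc_timeout "$prev"
afterwards, this leaves the box as it was found.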
-- 
<ard@telegraafnet.nl> Telegraaf Elektronische Media  http://wwwijzer.nl
http://leerquoten.monster.org/ http://www.faqs.org/rfcs/rfc1855.html 
Let your government know you value your freedom. Sign the petition:
http://petition.eurolinux.org/