[LARTC] ping failure on dual-home using -I without default route

Ian! D. Allen idallen@idallen.ca
Wed, 22 Sep 2004 13:18:20 -0400


My Linux workstation (Mandrake 10.1 kernel 2.6.8.1) is dual-homed to two
ADSL Internet providers.  Card eth0 (192.168.9.250) is the default route
and leads to an SMC router (192.168.9.254).  Card eth1 (192.168.1.250)
leads to a Linksys router (192.168.1.1).  I'm not doing any NAT or PPPoE
in the workstation - the SMC and Linksys handle it all.

If I remove all general default routes from my configuration and rely
only on the specific default routes configured in separate tables for
eth0 and eth1 (see below), then these commands both work:

   # nc -s 192.168.1.250 foo.bar           # works out eth1
   # mtr -a 192.168.1.250 foo.bar          # works out eth1

but this one fails:

   # ping -I eth1 foo.bar                  # fails out eth1

For example:

   # ping -I eth1 linux.org
   PING linux.org (198.182.196.48) from 192.168.1.250 eth1: 56(84) bytes of data.
   From 192.168.1.250 icmp_seq=1 Destination Host Unreachable
   From 192.168.1.250 icmp_seq=2 Destination Host Unreachable
   From 192.168.1.250 icmp_seq=3 Destination Host Unreachable

Running tcpdump on eth0, I see no packet traffic related to this.
Running tcpdump on eth1, this is what I see when the ping happens:

   12:35:45.263693 arp who-has linux.org tell 192.168.1.250
   12:35:46.263407 arp who-has linux.org tell 192.168.1.250
   12:35:47.263247 arp who-has linux.org tell 192.168.1.250

I can't imagine how my kernel thinks that linux.org (198.182.196.48)
is directly connected to the network on eth1 and that arp will find it!
How is this possible?  What is going on here?  I am so confused.

The above problem happens for any Internet address I care to try.
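
One way to narrow this down might be to ask the kernel directly which
route it selects for each kind of lookup.  The sketch below compares a
lookup where the source address is known (as with "nc -s" and "mtr -a")
against one where only the output device is known.  It assumes, without
confirmation from the ping source, that "ping -I eth1" binds to the
device before any source address is chosen, in which case the
"from 192.168.1.0/24" rule might not match.  The helper name
compare_lookups and the RUN variable are made up for illustration.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Run with RUN= (set but empty) as root to execute the lookups for real.
RUN=${RUN-echo}

# Hypothetical helper: compare a source-bound lookup with a device-bound one.
compare_lookups() {
    dst=$1; src=$2; dev=$3
    $RUN ip route get "$dst" from "$src"   # source known: rule 20 can match
    $RUN ip route get "$dst" oif "$dev"    # only the output device is known
}

compare_lookups 198.182.196.48 192.168.1.250 eth1
```

If the two lookups return different routes (or the second returns a
route with no gateway), that would point at rule selection rather than
at ping itself.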

I added back my usual default route to my default table, and that didn't
change anything (I didn't expect that it would):
 
----- Table default ----------------------------------
default via 192.168.9.254 dev eth0  proto static  src 192.168.9.250

In frustration, I started trying random things and found this work-around:
If I add a second default route out eth1, the ping starts working:
 
----- Table default ----------------------------------
default via 192.168.9.254 dev eth0  proto static  src 192.168.9.250  metric 1 
default via 192.168.1.1 dev eth1  proto static  src 192.168.1.250  metric 2 

With the second default, the ping now works and all the other commands
continue to work correctly.  (The metric 2 appears to make the first
default the only choice for outgoing traffic, so that nothing I originate
goes out eth1 unless I force it to go there, which is what I want.)
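
For reference, the two-default work-around could be set up with
commands along these lines.  This is only a sketch built from the
route listing above; the helper name add_default and the RUN variable
are made up for illustration, and the commands are printed rather than
executed unless RUN is set to an empty value.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run with RUN= (set but empty) as root to apply them.
RUN=${RUN-echo}

# Hypothetical helper: add one default route to the table named "default".
add_default() {
    gw=$1; dev=$2; src=$3; metric=$4
    $RUN ip route add default via "$gw" dev "$dev" src "$src" \
        proto static metric "$metric" table default
}

add_default 192.168.9.254 eth0 192.168.9.250 1   # lower metric: preferred
add_default 192.168.1.1   eth1 192.168.1.250 2   # higher metric: less preferred
$RUN ip route show table default
```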

I have no idea why adding this second default route fixes ping.
As I understand it, the second default should be completely ignored
since the "ping -I eth1" should be operating using my table set up for
eth1 addresses.  Why doesn't ping work the same way as nc and mtr?  Help?

(Aside: The traceroute command is completely unable to function in my
 configuration without any default routes defined, even using the -i and
 -s options to set the source address to either interface: 

    # traceroute -i eth1 linux.org
    traceroute to linux.org (198.182.196.48), 30 hops max, 38 byte packets
    traceroute: sendto: Network is unreachable
     1 traceroute: wrote linux.org 38 chars, ret=-1

    # traceroute -i eth0 linux.org
    traceroute to linux.org (198.182.196.48), 30 hops max, 38 byte packets
    traceroute: sendto: Network is unreachable
     1 traceroute: wrote linux.org 38 chars, ret=-1

 Again, I don't know why traceroute isn't operating using my table set
 up for eth0 or eth1 addresses.  How can nc and mtr both work so nicely
 without a general default route yet the venerable traceroute behaves
 so badly?  Is this my LARTC misconfiguration or a problem with traceroute?
)

Detailed config follows:

----- IP Rules ----------------------------------
0:	from all lookup local 
10:	from 192.168.9.0/24 lookup 1             # this is for eth0
20:	from 192.168.1.0/24 lookup 2             # this is for eth1
32766:	from all lookup main 
32767:	from all lookup default 
 
----- Table local ----------------------------------
broadcast 192.168.1.0 dev eth1  proto kernel  scope link  src 192.168.1.250 
broadcast 127.255.255.255 dev lo  proto kernel  scope link  src 127.0.0.1 
broadcast 192.168.9.0 dev eth0  proto kernel  scope link  src 192.168.9.250 
broadcast 192.168.1.255 dev eth1  proto kernel  scope link  src 192.168.1.250 
broadcast 192.168.9.255 dev eth0  proto kernel  scope link  src 192.168.9.250 
local 192.168.1.250 dev eth1  proto kernel  scope host  src 192.168.1.250 
broadcast 127.0.0.0 dev lo  proto kernel  scope link  src 127.0.0.1 
local 192.168.9.250 dev eth0  proto kernel  scope host  src 192.168.9.250 
local 127.0.0.1 dev lo  proto kernel  scope host  src 127.0.0.1 
local 127.0.0.0/8 dev lo  proto kernel  scope host  src 127.0.0.1 
 
----- Table 1 ----------------------------------
default via 192.168.9.254 dev eth0  proto static  src 192.168.9.250 
 
----- Table 2 ----------------------------------
default via 192.168.1.1 dev eth1  proto static  src 192.168.1.250 
 
----- Table main ----------------------------------
192.168.1.0/24 dev eth1  proto kernel  scope link  src 192.168.1.250 
192.168.9.0/24 dev eth0  proto kernel  scope link  src 192.168.9.250 
127.0.0.0/8 dev lo  scope link 
 
----- Table default ----------------------------------
default via 192.168.9.254 dev eth0  proto static  src 192.168.9.250  metric 1 
default via 192.168.1.1 dev eth1  proto static  src 192.168.1.250  metric 2 
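
The per-interface rules and tables listed above could be recreated
with something like the following.  This is a sketch reconstructed
from the listing, not the exact commands used on the workstation; the
helper name setup_uplink and the RUN variable are made up for
illustration, and nothing is executed unless RUN is set to an empty
value.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run with RUN= (set but empty) as root to apply them.
RUN=${RUN-echo}

# Hypothetical helper: one routing table plus one source-based rule per uplink.
setup_uplink() {
    table=$1; prio=$2; net=$3; gw=$4; dev=$5; src=$6
    # Default route that lives only in this uplink's own table.
    $RUN ip route add default via "$gw" dev "$dev" src "$src" \
        proto static table "$table"
    # Send packets whose source is in this uplink's subnet to that table.
    $RUN ip rule add from "$net" lookup "$table" priority "$prio"
}

setup_uplink 1 10 192.168.9.0/24 192.168.9.254 eth0 192.168.9.250   # eth0
setup_uplink 2 20 192.168.1.0/24 192.168.1.1   eth1 192.168.1.250   # eth1
```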

-- 
-IAN!  Ian! D. Allen   Ottawa, Ontario, Canada
       EMail: idallen@idallen.ca   WWW: http://www.idallen.com/
       College professor (Linux) via: http://teaching.idallen.com/
       Support free and open public digital rights:  http://eff.org/