From ole.reinartz at gmx.de Sun Apr 1 14:44:52 2007
From: ole.reinartz at gmx.de (Ole Reinartz)
Date: Sun Apr 1 14:45:27 2007
Subject: [LARTC] Problem setting shift value in tcindex filter on big endian machine
Message-ID: <460FA944.40902@gmx.de>

Hi all,

I'm trying to get some DiffServ QoS shaping to work on an XScale machine
running big endian. I'm setting it up with tc. Using the tcindex filter, I
found that regardless of what shift value I enter, only '0' is returned when
I list the filters afterwards. The very same rules work fine on my (little
endian) PC.

Looking at the code (iproute2-2.6.18-061002) I found that tc (in
tc/f_tcindex.c, line 72 and after) sends the shift value to the kernel as an
int. The kernel, however, expects it as a 'u16' (net/sched/cls_tcindex.c,
around line 250 depending on the exact kernel version). I checked 2.6 kernel
versions back to 2.6.11.

So... do we have a type mismatch here? As 'shift' is the last parameter in
the buffer, this still works on a little endian machine, where the low-order
bytes of the int come first. On a big endian machine, however, the kernel
always receives 0. To check that, I changed the type of the shift value to
unsigned short in tc, and that fixed it for me.

Is anyone interested in a patch?

Regards
Ole

From linux at arcoscom.com Sun Apr 1 22:43:12 2007
From: linux at arcoscom.com (ArcosCom Linux User)
Date: Sun Apr 1 22:41:22 2007
Subject: [LARTC] Re: [Bridge] Why TTL is changing when sending a ping?
In-Reply-To: <20070401120519.4a070b95@localhost.localdomain>
References: <59452.84.123.233.184.1175286636.squirrel@www.arcoscom.com> <20070401120519.4a070b95@localhost.localdomain>
Message-ID: <37471.84.123.233.184.1175460192.squirrel@www.arcoscom.com>

On Sun, 1 April 2007, 21:05, Stephen Hemminger wrote:
> On Fri, 30 Mar 2007 22:30:36 +0200 (CEST)
> "ArcosCom Linux User" wrote:
>
>> The situation is this:
>>
>> INTERNET -- ROUTER -- ETHERNET+WIFI -- PC's
>>
>> The connection between INTERNET and ROUTER is done with 2 LINKs with
>> static IP's.
>>
>> The connection between ROUTER and PC's is done via an ETHERNET lan with
>> many bridges and ACCESSPOINTS.
>>
>> The PC's are in one IP subnet, the BRIDGES in another IP subnet, the AP's
>> in another IP subnet. The ROUTER has 1 bridge interface (2 real ethernets
>> in the bridge) connected to the LAN.
>>
>> On the router there are then br0, br0:1, br0:2, br0:3 (PCs, APs, BRIDGEs
>> IP subnets) to allow IP connections over the ETHERNET+WIFI between ROUTER
>> and clients.
>>
>> The principal purpose of the ROUTER is to allow internet access to PC's.
>> The BRIDGES and AP's implement the STP protocol, which appears to be
>> working fine (the AP's and bridges are embedded linux boxes).
>>
>> On the router I have enabled rp_filter on all interfaces, default and
>> each one. IP routing is enabled too (obviously).
>>
>> I detected that a normal ping from ROUTER to one PC usually has a
>> TTL=64, but many times that TTL changes to 128.
>>
>> What could be the problem? The "routing" enabled in bridge devices?
>> Some TCP/IP parameter I haven't configured properly?
>> Any idea?
>>
>
> Are you using some form of connection tracking filtering on the bridge?
> If the packet has to be regenerated as part of filtering it might
> restart the TTL hop count.
>

Yes, but not on the bridges themselves. I'm using connection tracking
between wan0 and zlan0, not between the bridge interfaces.
As I described above, the TTL changes on pings from the ROUTER to any PC. My
question is not about pings from the LAN to the internet, and in this case
(local pings from the router to the PCs) I would expect the tracking to have
no effect: it is ICMP traffic (echo requests and replies).

Could you explain a bit how the connection tracking modules (IP layer) can
interfere with ICMP traffic as you suggest?

Any other suggestions?

From bgs at bgs.hu Mon Apr 2 14:02:07 2007
From: bgs at bgs.hu (Bgs)
Date: Mon Apr 2 14:02:38 2007
Subject: [LARTC] mark incoming traffic
Message-ID: <4610F0BF.20304@bgs.hu>

Greetings,

I'd like to mark incoming traffic based on TOS and use the mark for routing
the return traffic. I have two gateways on the same net and incoming traffic
may arrive from either of them. I want the return packets to go back the same
way. My plan is:

Normal traffic goes through the default gw. Traffic from the other gw has
TOS 0x08 set. I'd like to mark traffic by TOS and use fwmark with iproute
for outbound packets.

My problem is that I can only mark based on a property of the incoming
packets, but I need the mark on the outbound packets. How can I do this?

Doing "-A INPUT -p tcp -m tos --tos 0x08 -j MARK --set-mark 1" (in mangle of
course) is no good, as the mark is lost. Doing tests with "-A OUTPUT -p tcp
-d test_client_ip -j MARK --set-mark 1" works ok.

Is there a solution?

Thanks in advance
Bgs

From kaber at trash.net Mon Apr 2 14:13:33 2007
From: kaber at trash.net (Patrick McHardy)
Date: Mon Apr 2 14:13:44 2007
Subject: [LARTC] Problem setting shift value in tcindex filter on big endian machine
In-Reply-To: <460FA944.40902@gmx.de>
References: <460FA944.40902@gmx.de>
Message-ID: <4610F36D.3020108@trash.net>

Please send bug reports to netdev@vger.kernel.org.

Ole Reinartz wrote:
> I'm trying to get some DiffServ QoS shaping to work on an XScale
> machine, running big endian. I'm setting it up with tc. Using the
> tcindex filter I found that regardless what shift value I enter, only
> '0' is returned when I list the filters afterwards.
> The very same rules work fine on my (little endian) PC.
> Looking at the code (iproute2-2.6.18-061002) I found that tc (in
> tc/f_tcindex.c, line 72 and after) sends the shift value to the kernel
> as an int. The kernel, however, expects it as a 'u16'
> (net/sched/cls_tcindex.c, around line 250 depending on the exact kernel
> version). I checked 2.6 kernel versions back until 2.6.11.

It appears this was broken during some ->change operation fixes in 2.6.11.

> So... do we have a type mismatch here? As 'shift' is the last parameter
> in the buffer, this works still very well on a little endian machine,
> however on a big endian machine always 0 is received in the kernel. To
> check that I changed the type of the shift value to unsigned short in
> tc, and that fixed it for me.
> Someone interested in a patch?

Yes, but it's the kernel that needs to be fixed to expect a u32.

From bgs at bgs.hu Mon Apr 2 15:21:22 2007
From: bgs at bgs.hu (Bgs)
Date: Mon Apr 2 15:21:34 2007
Subject: [LARTC] mark incoming traffic
In-Reply-To: <4610F0BF.20304@bgs.hu>
References: <4610F0BF.20304@bgs.hu>
Message-ID: <46110352.6000305@bgs.hu>

Never mind... got it... I will sit down myself :)

From lists at antonello.org Mon Apr 2 20:02:36 2007
From: lists at antonello.org (lists@antonello.org)
Date: Mon Apr 2 20:02:37 2007
Subject: [LARTC] Kernel timer frequency and HTB
Message-ID: <4611453C.4000706@antonello.org>

Hello,

I have a linux box acting as a LAN router towards the internet and doing
traffic shaping. My link is 10Mbit/s full duplex. I have set up some HTB
classes with a rate of 20% (2Mbit/s) and a ceil of 95% (9.5Mbit/s). Is such
a wide range between rate and ceil feasible for HTB to control?

Which timer frequency (kernel menuconfig) is the most suitable for HTB among
250, 300 and 1000Hz? Also, I really don't have any idea how the frequency
could affect network adapter performance. Is a high frequency going to have
bad effects on interrupt handling? Does anybody have any suggestions about
this?

I get some warnings about HTB quantums being too big but, as I understand
it, those should only affect the precision of the shaping without
undermining the shaping completely.

Thank you a lot.
jack

From gsomlo at gmail.com Tue Apr 3 01:38:29 2007
From: gsomlo at gmail.com (Gabriel Somlo)
Date: Tue Apr 3 01:38:34 2007
Subject: [LARTC] Please Help: Can't access bands > 10 on prio qdisc
Message-ID: <2387247e0704021638j497cb0b9nf86099a90d878576@mail.gmail.com>

Hi,

I'm trying to set up 15 different delay intervals for packets leaving on an
interface, using netems hanging off of a 16-band prio.

I'm having trouble adding anything to bands higher than 10. Here's what I
tried:

tc qdisc add dev eth0 root handle 1: prio bands 16 \
    priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

I want all default traffic to go to the highest priority band (0),
regardless of the TOS bits and whatever. I'll add filters to place select
packets in lower priority bands (1..15). So far, so good.
Next:

tc qdisc add dev eth0 parent 1:10 handle 100: netem delay 20ms

This works fine, adding a netem qdisc to band 10. However, when I try this:

tc qdisc add dev eth0 parent 1:11 handle 110: netem delay 30ms

I get an error:

RTNETLINK answers: Invalid argument

The "invalid argument" it's bitching about is "parent 1:11". What am I doing
wrong? Parents 1:1 through 1:10 work fine, but as soon as I go to 11 or
above, I get this error...

Thanks for any pointers or ideas,
Gabriel

From lsharpe at pacificwireless.com.au Tue Apr 3 03:32:31 2007
From: lsharpe at pacificwireless.com.au (Leigh Sharpe)
Date: Tue Apr 3 03:32:58 2007
Subject: [LARTC] invoking ebtables with tc
Message-ID:

Hi all,

Is it possible to invoke an ebtables target from tc? I.e. we can use
'action ipt' to invoke an iptables target, but I want to use an ebtables
target instead. Is this possible?

Regards,
Leigh

Leigh Sharpe
Network Systems Engineer
Pacific Wireless
Ph +61 3 9584 8966
Mob 0408 009 502
Helpdesk 1300 300 616
email lsharpe@pacificwireless.com.au
web www.pacificwireless.com.au

From lists at andyfurniss.entadsl.com Tue Apr 3 03:47:33 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Apr 3 03:47:13 2007
Subject: [LARTC] Divide bandwidth between 4 groups of ip with the same rate
In-Reply-To: <460E3A5C.4010309@telefonica.net>
References: <27625364.1174434090073.JavaMail.root@ctps1> <460C536F.9020101@andyfurniss.entadsl.com> <460E3A5C.4010309@telefonica.net>
Message-ID: <4611B235.5@andyfurniss.entadsl.com>

Carlos Vereda-Alonso wrote:
> Thank you very much for your advice. I should admit that I am a
> beginner with respect to QoS and things related to network protocols, so
> I don't completely understand some of your explanations, but that
> doesn't matter.
It can be tricky and in some ways there are no right answers - every script
may be wrong in some way. My own is now flawed since I recently got more
bandwidth :-)

>
> I have made some modifications to the previous script.
>
> 1) I have decided to remove the traffic shape in eth0 (the one connected
> to my LAN), because I have seen that the download traffic is not
> overhead in the ADSL router that connect the linux box to the ISP. I
> have read in Internet that a correct way to shape the download traffic
> is using the imq device. So I would have to study how can I use the imq
> device and if I should use it. I think that the packets dropping that
> you indicated for the ingress traffic would be made in this device.

No, you only need imq if you have download traffic to the same PC you are
shaping on, you are doing NAT, and you want to separate that traffic from
traffic that is to be forwarded to other LAN PCs. ifb is now in kernel and
can do the other things that people used to use imq for. If you are just
dealing with forwarded traffic then it's just as good to shape on the LAN
facing NIC.

>
> 2) The main problem in my LAN is that two of the IP groups that I have
> defined (two neighbor in the building) use almost all the upload
> bandwidth. They use Bit-torrent and E-donkey with unlimited upload rate
> I ask them to limit the upload without success. So I have followed your
> advices related to the traffic control in eth1 (the one connected to the
> ADSL router).

Limiting bittorrent on a shared connection is not ideal as you don't know
what to set the limit to. In some ways that's true even if you have the
connection all to yourself, because on dsl the acks from downloads eat a lot
of bandwidth.

> 2.1) Now the rates on egress add up to 200 which is equal to the parent
> rate. I decrease the parent rate in order to avoid the queue in the ADSL
> router that you think is building up in the router.
OK - if you are into building your own kernels and want to tweak, there are
ways that you can get things perfect. You also need to know the exact
details of your dsl connection. One day it will be in kernel, but for now
you would need to patch.

> 2.2) I have removed the htb default on eth1, I suppose that the
> unclassified packets will go to the parent class, is ok?. Is my arp
> going to the parent class?.

Unclassified traffic will go totally unshaped, and arp is unclassified in
this case. If the nic is just connected to a router and you are sure that
your marking gets all the IP traffic then that's OK - you don't even really
need the 8/10 mbit class.

tc -s qdisc ls dev eth1

will give direct_packets_stat XXXX; these are the unshaped packets.

> 2.3) Sorry, I don't know how assymetric the dsl line is and I don't
> know how I can get this information. I am a beginner :)

I am English but I can't spell - it's asymmetric :-)

You have roughly 300kbit/s up and 3000kbit/s download, which is 1:10; 1:1
would be symmetric.

Roughly, you can download 250 1500 byte packets/sec. tcp sends an ack packet
upstream - mostly for every other packet received, but sometimes for every
packet. Because dsl nearly always uses atm cells and some sort of ppp, a tcp
ack uses 2 cells, which is 106 bytes per ack. This means that just
downloading at 3meg will use between 106 and 212kbit of your upstream
bandwidth at atm level (125 to 250 acks/sec x 106 bytes x 8 bits).

I have the same problem - I used to share 4/5 way on 288/576, then 288/1156
which wasn't too bad. Now it's 448/7392 and I still haven't decided on a
policy or tested how much I can abuse tcp by dropping acks :-)

300kbit is not a dsl sync rate so I guess your ISP allows a bit for the
overheads - you should be able to get your router to tell you what your atm
bitrate really is.

> 2.4) I have changed the r2q value in the parent qdisc from the default
> value (10) to 1, because I read in the htb manual that for low rates it
> would be better.
It may be slightly better to add quantum 1514 to each line that has a rate
instead (assuming you have 1500 mtu).

> 2.5) Finally, I have read in http://www.etxea.net/docu/qos/qos.html
> (the page is written in Spanish but the script is in English) that a 2
> seconds latency is obtained decreasing the queue size of the eth
> connected to the router for Outbound Shaping. You can see that I have
> added it at the first line of the script that I am using now.

Written by Dan Singletary (8/7/02)

It's a fair, but old example - all scripts have issues and Dan went on to
write a userspace queue so he could allow for the atm overheads - I used to
use it.

>
> It seems the latency problem in our LAN is solved by now, I am going to
> testing it for a couple of weeks and if everything is OK I will tell you.
>

Good - if it works for you it doesn't need fixing. There are lots of things
you can do, but if you're not an avid gamer who notices every ms then there
is no point overcomplicating things.

> Please, forgive my lack of knowledge about all these concerns and thank
> you again for your help. The new script is below...
> ---
> Carlos Vereda
>
> # set queue size to give latency of about 2 seconds on low-prio packets
> ip link set dev eth1 qlen 30

If you didn't have sfq on the leafs then this would be OK - but as you do,
you will get the default 128 packet queue length of sfq. You can make them
shorter with the limit parameter. I use 20 on mine - but then I only send
bulk traffic to sfq, so I don't care what gets dropped.
>
> tc qdisc add dev eth1 root handle 2: htb r2q 1
> tc class add dev eth1 parent 2: classid 2:1 htb rate 10mbit
> tc class add dev eth1 parent 2:1 classid 2:11 htb rate 8mbit ceil 10mbit
> tc class add dev eth1 parent 2:1 classid 2:12 htb rate 200kbit
> tc class add dev eth1 parent 2:12 classid 2:10 htb rate 50kbit ceil 200kbit prio 2
> tc class add dev eth1 parent 2:12 classid 2:20 htb rate 50kbit ceil 200kbit prio 2
> tc class add dev eth1 parent 2:12 classid 2:30 htb rate 50kbit ceil 200kbit prio 2
> tc class add dev eth1 parent 2:12 classid 2:40 htb rate 50kbit ceil 200kbit prio 2
>
> tc qdisc add dev eth1 parent 2:10 handle 210: sfq perturb 5
> tc qdisc add dev eth1 parent 2:20 handle 220: sfq perturb 5
> tc qdisc add dev eth1 parent 2:30 handle 230: sfq perturb 5
> tc qdisc add dev eth1 parent 2:40 handle 240: sfq perturb 5

perturb can cause packet reordering - 5 is a bit low.

>
> iptables -A FORWARD -t mangle -i eth0 -j MARK -s 192.168.0.0/26 --set-mark 1
> iptables -A FORWARD -t mangle -i eth0 -j MARK -s 192.168.0.64/26 --set-mark 2
> iptables -A FORWARD -t mangle -i eth0 -j MARK -s 192.168.0.128/26 --set-mark 3
> iptables -A FORWARD -t mangle -i eth0 -j MARK -s 192.168.0.192/26 --set-mark 4
>
> tc filter add dev eth1 protocol ip parent 2:0 handle 1 fw flowid 2:10
> tc filter add dev eth1 protocol ip parent 2:0 handle 2 fw flowid 2:20
> tc filter add dev eth1 protocol ip parent 2:0 handle 3 fw flowid 2:30
> tc filter add dev eth1 protocol ip parent 2:0 handle 4 fw flowid 2:40

In this case it will not matter, but not using prio on filters can cause
problems with order if you ever do anything more complicated.

Andy.
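
For reference, the suggestions in the reply above (quantum 1514 on each class
with a rate, a shorter sfq queue via limit, a larger perturb, and an explicit
prio on each filter) could be folded into the four leaf groups roughly as
follows. This is only a sketch: the device name and class ids come from the
quoted script, limit 20 is the value Andy says he uses himself, and perturb
10 and prio 1 are illustrative values that the thread does not specify. The
TC variable is a dry-run helper added for this sketch.

```shell
#!/bin/sh
# Dry run by default: the tc commands are printed, not executed.
# Set TC=tc in the environment to apply them (needs root).
TC="${TC:-echo tc}"
DEV=eth1   # the NIC facing the ADSL router in the quoted script

leaf_rules() {
    # Groups 1..4: the fwmark n maps to class 2:n0 and sfq handle 2n0:
    for n in 1 2 3 4; do
        # quantum 1514 is set per class instead of lowering r2q
        # (assumes a 1500 byte MTU, as the reply does).
        $TC class add dev $DEV parent 2:12 classid 2:${n}0 htb \
            rate 50kbit ceil 200kbit prio 2 quantum 1514
        # limit 20 shortens the sfq queue from its default length;
        # perturb 10 is an assumed value, chosen only to be above 5.
        $TC qdisc add dev $DEV parent 2:${n}0 handle 2${n}0: sfq perturb 10 limit 20
        # An explicit prio keeps the filter order predictable.
        $TC filter add dev $DEV protocol ip parent 2:0 prio 1 handle $n fw flowid 2:${n}0
    done
}

leaf_rules
```

These lines are meant to replace the matching class, qdisc and filter lines
of the script, not to be run on top of them, since "add" fails for objects
that already exist.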
From niclas.bentley at bredband.net Tue Apr 3 12:41:15 2007
From: niclas.bentley at bredband.net (Niclas Bentley)
Date: Tue Apr 3 12:41:45 2007
Subject: [LARTC] ipp2p: error loading kernel module
Message-ID: <1175596875.18427.7.camel@dunder>

Hi,

I get this error when trying to insmod the ipp2p kernel module:
"insmod: error inserting 'ipt_ipp2p.ko': -1 Invalid module format"

in the kernel log:
"ipt_ipp2p: disagrees about version of symbol struct_module"

Kernel version: 2.6.20.4
iptables version: 1.3.5
ipp2p version: 0.8.2 (latest)

Has anyone tried ipp2p with kernel 2.6.20?

Best Regards
Niclas Bentley

From lists at andyfurniss.entadsl.com Tue Apr 3 21:27:38 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Apr 3 21:27:32 2007
Subject: [LARTC] Xen and tc problems
In-Reply-To: <08CA2245AFCF444DB3AC415E47CC40AF8120E8@G3W0072.americas.hpqcorp.net>
References: <08CA2245AFCF444DB3AC415E47CC40AF8120E8@G3W0072.americas.hpqcorp.net>
Message-ID: <4612AAAA.4070608@andyfurniss.entadsl.com>

Padala, Pradeep wrote:
> Hi,
>
> I am trying to shape traffic to two VMs hosted in Xen. There seems to be
> very little information regarding this. I found this web page
> http://www.ioncannon.net/system-administration/57/limiting-bandwidth-usage-on-xen-linux-setup/
> and followed the instructions. But, the real
> bandwidth experienced from clients always seems to exceed the set rate.
> Part of the problem may be because of the way Xen bridging is setup.
> There are many interfaces that the packets go through. So, I switched to
> the Xen routed networking, in which dom0 simply sees two virtual
> interfaces for the VM, which are kind of PPP connections to the eth0
> interfaces in VM.
>
> eth0 +---- vif1.0 -- eth0 in VM1
>      |
>      |
>      +---- vif2.0 -- eth1 in VM2
>
> Say, I want to limit the bandwidth to VM1 to 100mbit and VM2 to 500mbit
> (eth0 is a 1gbit interface), I used the following commands.
>
> iptables -t mangle -F POSTROUTING
> tc qdisc add dev eth0 root handle 1: htb r2q 1000
> iptables -t mangle -A POSTROUTING -s $vm1_ip -j CLASSIFY --set-class 1:1
> iptables -t mangle -A POSTROUTING -d $vm1_ip -j CLASSIFY --set-class 1:1
> tc class add dev eth0 parent 1: classid 1:1 htb rate 512mbit
> iptables -t mangle -A POSTROUTING -s $vm2_ip -j CLASSIFY --set-class 1:2
> iptables -t mangle -A POSTROUTING -d $vm2_ip -j CLASSIFY --set-class 1:2
> tc class add dev eth0 parent 1: classid 1:2 htb rate 512mbit
>
> I setup a web server in VM1 and download a 1GB file from another machine
> that is on the same network (actually on the same enclosure). I always
> see wire speeds on the client side. I have tried many configurations
> including adding a sfq, pfifo, tbf class under the leaf classes, but
> either the rate becomes too low (because packets are dropped at the
> leaves) or too high.
>
> Part of the problem lies in the fact the vif1.0 has already received the
> traffic, so it has to be overlimited at eth0, instead of dropping. So,
> I tried a simple tbf within the VM. That doesn't work either with very
> low speeds. Xen VMs don't have very precise clocks, so that might be one
> reason why the reliable tbf is also not performing well.
>
> I also set the burst sizes manually and the speed again becomes
> exceptionally low.
>
> Please let me know if you have any ideas on why this is happening. I can
> paste the stats as well, if required.

I've never used Xen, but IIRC it uses GSO (Generic Segmentation Offload) -
like some gig nics. For nics you need to turn it off with ethtool -K, so you
could try and see if you can do the same for virtual nics.

Andy.
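
A minimal sketch of the ethtool suggestion above. Note that ethtool -k
(lower case) only lists the offload settings, while -K (upper case) changes
them. The interface names are taken from Pradeep's diagram; whether Xen
virtual interfaces accept these settings is an assumption that has to be
checked on the actual host, and turning off tso as well as gso is an extra
step not mentioned in the thread. The ETHTOOL variable is a dry-run helper
added for this sketch.

```shell
#!/bin/sh
# Dry run by default: the commands are printed, not executed.
# Set ETHTOOL=ethtool in the environment to apply them (needs root).
ETHTOOL="${ETHTOOL:-echo ethtool}"

disable_offloads() {
    # Arguments: one or more interface names.
    for dev in "$@"; do
        # Switch off generic and TCP segmentation offload so that tc
        # sees MTU-sized packets instead of large aggregated ones.
        $ETHTOOL -K "$dev" gso off tso off
    done
}

disable_offloads vif1.0 vif2.0
```

Running "ethtool -k vif1.0" afterwards shows whether the setting was
accepted.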
From lists at andyfurniss.entadsl.com Tue Apr 3 21:44:37 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Apr 3 21:44:10 2007
Subject: [LARTC] Help needed with HTB
In-Reply-To: <45EDF348.4040507@gmail.com>
References: <45EDF348.4040507@gmail.com>
Message-ID: <4612AEA5.2090701@andyfurniss.entadsl.com>

Edgar Merino wrote:
> Hello, a few days ago I sent an email asking for help with my tc htb
> rules I've got (a script), but I'm not sure if that email got to you...
> anyway, I'm sending it again along with my htb script and I'll tell you
> the problem once again:
>
> I have a computer with ip 192.168.0.100 which is acting as a p2p server,
> so I want to shape traffic coming out from that ip, I have a linux box
> acting as a router with two NICs, server ip is 192.168.0.1. So I hope
> you can take a look at it and tell me why is it that everytime I have
> mldonkey or any other p2p software running on that computer I experience
> a lot of latency in my whole network with http traffic, maybe someone
> can help me out specify the burst and cburst parameters... and maybe
> even the quantum parameter, and some little explanation on it since I
> haven't been able to understand what the benefits of this parameters are.

The rates on htb child classes should not add up to any more than about 80%
of the link speed. The parent rate and ceils should be equal to about 80% of
link speed.

I guess you already know that for tc bps = bytes/sec.

Read some of my recent posts about htb default on eth.

Check with iptables -L -v -n that the rules are matching as you expect -
without testing I can't recall if the output one will see addresses.

Andy.
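
The 80% rule of thumb above is easy to check with a few lines of shell
arithmetic. The helper below is a hypothetical sketch, not part of tc, and
the numbers in the example call are made up; Edgar's real link speed and
class rates are not given in this message.

```shell
#!/bin/sh
# check_rates LINK_KBIT CHILD_KBIT...
# Prints the 80% budget and whether the summed child rates fit inside it.
# Returns non-zero when the children exceed the budget.
check_rates() {
    link=$1
    shift
    budget=$((link * 80 / 100))   # integer kbit, rounded down
    sum=0
    for r in "$@"; do
        sum=$((sum + r))
    done
    if [ "$sum" -le "$budget" ]; then
        echo "ok: children=${sum}kbit budget=${budget}kbit"
    else
        echo "over: children=${sum}kbit budget=${budget}kbit"
        return 1
    fi
}

# Made-up example: a 512kbit uplink split into three child classes.
check_rates 512 200 150 50
```

The budget value is also what the parent rate and the ceils would be set to.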
From lists at andyfurniss.entadsl.com Tue Apr 3 21:51:26 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Apr 3 21:50:58 2007
Subject: [LARTC] Incoming traffic
In-Reply-To: <519f77360702270831v2dd01e3bu6af6c75de50d96eb@mail.gmail.com>
References: <519f77360702270831v2dd01e3bu6af6c75de50d96eb@mail.gmail.com>
Message-ID: <4612B03E.7060500@andyfurniss.entadsl.com>

mohican 542003 wrote:
> Hello,
>
> with the command :
> tc filter add dev eth0 parent ffff: protocol ip u32 match ip src 192.168.2.6 police rate 10000kbit burst 10000kbit drop flowid :1
> we can limit traffic coming from 192.168.2.6.
>
> I would like:
> for 192.168.1.2, 192.168.1.4 limit to 10mbit
> for 192.168.1.3, 192.168.1.5 limit to 20mbit
> other ip would have no limit.
>
> Is it possible with tc ?

Should be possible - do you mean .2 and .4 share 10mbit or get 10mbit each?

Andy.

From lists at andyfurniss.entadsl.com Tue Apr 3 23:04:48 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Apr 3 23:04:17 2007
Subject: [LARTC] Please Help: Can't access bands > 10 on prio qdisc
In-Reply-To: <2387247e0704021638j497cb0b9nf86099a90d878576@mail.gmail.com>
References: <2387247e0704021638j497cb0b9nf86099a90d878576@mail.gmail.com>
Message-ID: <4612C170.2060901@andyfurniss.entadsl.com>

Gabriel Somlo wrote:
> Hi,
>
> I'm trying to set up 15 different delay intervals for packets leaving
> on an interface, using netems hanging off of a 16-band prio.
>
> I'm having trouble adding anything to bands higher than 10. Here's
> what I tried:
>
> tc qdisc add dev eth0 root handle 1: prio bands 16 \
>     priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> I want all default traffic to go to the highest priority band (0),
> regardless of the TOS bits and whatever. I'll add filters to place
> select packets in lower priority bands (1..15). So far, so good.
> Next:
>
> tc qdisc add dev eth0 parent 1:10 handle 100: netem delay 20ms
>
> This works fine, adding a netem qdisc to band 10.
> However, when I try this:
>
> tc qdisc add dev eth0 parent 1:11 handle 110: netem delay 30ms
>
> I get an error:
>
> RTNETLINK answers: Invalid argument
>
> The "invalid argument" it's bitching about is "parent 1:11". What am I
> doing wrong ? Parents 1:1 through 1:10 work fine, but as soon as I go
> 11 or above, I get this error...
>
> Thanks for any pointers or ideas,

Try using hex: 1:a etc. The class ids are hexadecimal, so 1:10 would be 16
(assuming the bands value is decimal). If that doesn't work you could always
try multiple ifbs.

Andy.

From lsharpe at pacificwireless.com.au Wed Apr 4 03:55:10 2007
From: lsharpe at pacificwireless.com.au (Leigh Sharpe)
Date: Wed Apr 4 03:55:35 2007
Subject: [LARTC] Some advanced filtering questions
Message-ID:

Hi All,
I need to do some tricky filtering stuff. Can anyone tell me if any of the
following are possible?

* match on a combination of firewall mark AND u32 criteria, i.e. handle 6 fw
  AND u32 match ip src 1.2.3.4/32 - to match packets from 1.2.3.4 which have
  been marked elsewhere
OR
* to OR the values of u32 matches. Something like u32 match ip src
  1.2.3.4/32 OR match ip dst 1.2.3.4/32 - to match packets going to or from
  1.2.3.4
OR
* to use a mask on firewall marks as per iptables/ebtables MARK matches.

Regards,
Leigh

Leigh Sharpe
Network Systems Engineer
Pacific Wireless
Ph +61 3 9584 8966
Mob 0408 009 502
Helpdesk 1300 300 616
email lsharpe@pacificwireless.com.au
web www.pacificwireless.com.au

From lsharpe at pacificwireless.com.au Wed Apr 4 04:07:50 2007
From: lsharpe at pacificwireless.com.au (Leigh Sharpe)
Date: Wed Apr 4 04:08:13 2007
Subject: [LARTC] Some advanced filtering questions
In-Reply-To:
Message-ID:

Or, for that matter, how to negate a u32 match.
I.e. match anything NOT from 1.2.3.0/24.

Regards,
Leigh

Leigh Sharpe
Network Systems Engineer
Pacific Wireless
Ph +61 3 9584 8966
Mob 0408 009 502
Helpdesk 1300 300 616
email lsharpe@pacificwireless.com.au
web www.pacificwireless.com.au

From alex at uh.cu Wed Apr 4 06:05:41 2007
From: alex at uh.cu (Alejandro Ramos Encinosa)
Date: Wed Apr 4 06:04:31 2007
Subject: [LARTC] Some advanced filtering questions
In-Reply-To:
References:
Message-ID: <200704040405.41665.alex@uh.cu>

Hi to all of you!!

On Wednesday 04 April 2007 01:55, Leigh Sharpe wrote:
> Hi All,
> I need to do some tricky filtering stuff. Can anyone tell me if any of
> the following are possible?

I am very much a newbie at this, but I think I have some idea of how the
whole thing works, so I want to try to answer this (if any of you think my
answers are wrong, please, correct me!!!
-and of course, if you have better ideas or know how to do this better, just
reply to this thread; I guess Leigh and I will be glad to hear from you)

>
> * match on a combination of firewall mark AND u32 criteria. ie. handle
> 6 fw AND u32 match ip src 1.2.3.4/32 - to match packets from 1.2.3.4
> which have been marked elsewhere

I guess that if you want to combine filters as a conjunction, you can have
two classes (parent and child), redirect packets matching filter number one
to the parent and then, from the parent, redirect packets matching filter
number two to the child. Maybe something like this:

...
# the node where the traffic is classified
tc class add ... classid 1:1 ...
# just to keep first kind of traffic
tc class add ... parent 1:1 classid 1:10 ...
# handling traffic matching both criteria at the same time
tc class add ... parent 1:10 classid 1:100 ...
...
# "handle 6 fw"
tc filter add ... parent 1:1 flowid 1:10
# "u32 match ip src 1.2.3.4/32"
tc filter add ... parent 1:10 flowid 1:100

> OR
> * to OR the values of u32 matches. Something like u32 match ip src
> 1.2.3.4/32 OR match ip dst 1.2.3.4/32 - to match packets going to or
> from 1.2.3.4

If you are looking for a disjunction, you can have one class and two filters
with the same parent and flowid:

...
# the node where the traffic is classified
tc class add ... classid 1:1 ...
# handling traffic that comes or goes to 1.2.3.4
tc class add ... parent 1:1 classid 1:10 ...
...
# "u32 match ip src 1.2.3.4/32"
tc filter add ... parent 1:1 flowid 1:10
# "u32 match ip dst 1.2.3.4/32"
tc filter add ... parent 1:1 flowid 1:10

> OR
> * to use a mask on firewall marks as per iptables/ebtables MARK matches.

???
I need to pass this time :(

>
> Regards,
> Leigh

PS: please, sorry if my English confuses you, I know I still need to study
very hard.

--
Alejandro Ramos Encinosa
Fac. Matemática Computación
Universidad de La Habana

From alex at uh.cu Wed Apr 4 07:02:19 2007
From: alex at uh.cu (Alejandro Ramos Encinosa)
Date: Wed Apr 4 07:00:50 2007
Subject: [LARTC] tc questions
Message-ID: <200704040502.19649.alex@uh.cu>

Hi to all of you!!!

I am a Computer Science student working on my undergraduate thesis. I am
trying to develop a free software tool to help administrators control
traffic. Right now this tool is based on tc and iptables.
I am having some problems understanding tc and the tc examples:

- Why, in almost every list of tc rules based on htb classes, is there
  a "tc qdisc dev ... root ... htb default ..." as the root node?
  Is it mandatory in order to work with htb classes?
- I understood that every class node has its own qdisc attached (fifo by
  default, right?). If that is the case, why does "tc qdisc show ..." show
  me JUST those qdiscs I explicitly attached to classes without any child
  class?
- What should I expect if I run something like this?

tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:1 htb rate 100mbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 90mbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 1kbit
tc class add dev eth0 parent 1:20 classid 1:21 htb rate 10mbit

I guessed the traffic redirected to 1:21 would get a rate of 1kbit at most
(because of its parent 1:20), but when I ran this, I got a higher rate
(because of the 10mbit rate, I guess). Why? Shouldn't parent classes
restrict their children's rate?

Thanks in advance.
Regards,
Ale.

--
Alejandro Ramos Encinosa
Fac.
Matemática Computación
Universidad de La Habana

From christian.benvenuti at libero.it Wed Apr 4 15:13:18 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Wed Apr 4 15:09:42 2007
Subject: [LARTC] Re: tc questions
Message-ID: <1175692398.2649.11.camel@benve-laptop>

Hi Alejandro

>Hi to all of you!!!
>
>I am a Computer Science student trying to do the pre-grade thesis. I am trying
>to develop a free software tool to help administrators to control the
>traffic. Right now this tool is based on tc and iptables.
>I am having some problems trying to understand tc and tc examples:
>- Why in almost every list of tc rules based on htb class, there is
> a "tc qdisc dev ... root ... htb default ..." as a root node?
> Is it mandatory to work with htb class?

It is not mandatory to attach a HTB qdisc to the root. You can attach
it to any classful qdisc's class.
You can only create HTB classes under a HTB qdisc, and you can only
create CBQ classes under a CBQ class. However you can attach any
qdisc to a given class.
What exactly do you find strange?

>- I understood that every class node has its own qdisc attached
> (fifo by default, right?).

Correct.
To be exact, most qdiscs use Packet FIFO (pfifo) by default, but that's
not a rule (there are exceptions).

>If that is the case, why when I do "tc qdisc show ..." it
>JUST shows me those qdisc I explicitly attached to classes without any child
>class?

The default pFIFO qdiscs that get attached to the classes are not
shown by the above command.

>- What should I expect if I run something like this?
>
>tc qdisc add dev eth0 root handle 1: htb default 10
>tc class add dev eth0 parent 1: classid 1:1 htb rate 100mbit
>tc class add dev eth0 parent 1:1 classid 1:10 htb rate 90mbit
>tc class add dev eth0 parent 1:1 classid 1:20 htb rate 1kbit
>tc class add dev eth0 parent 1:20 classid 1:21 htb rate 10mbit
>
>I guessed the traffic redirected to 1:21 should have 1kbit of rate at most
>(because of its parent 1:20), but when I ran this, I got a higher rate
>(because of the 10mbit rate, I guess). Why? Shouldn't parent classes restrict
>children's rate?

I would say that that is a misconfiguration.
Neither the tc command nor the kernel gives you any warning.
You could implement it as part of your project ... :)

You are right. Class 1:20 does not limit the class 1:21's rate to 1kbit.
This is due to the way the kernel schedules the HTB classes.

Note that since you did not use the "ceil" config option, class 1:21
gets by default "ceil" = "rate" = 10mbit, and therefore it can not
borrow from its parent 1:20.
There would be nothing to borrow anyway, since 1:20 is limited to
1kbit (rate=ceil=1kbit).

Regards
/Christian
[http://benve.info]

From christian.benvenuti at libero.it Wed Apr 4 15:53:21 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Wed Apr 4 15:49:43 2007
Subject: [LARTC] Re: Some advanced filtering questions
Message-ID: <1175694801.2649.38.camel@benve-laptop>

Hi Leigh,

>Hi All,
>I need to do some tricky filtering stuff. Can anyone tell me if any of
>the following are possible?
>
>* match on a combination of firewall mark AND u32 criteria. ie. handle
>6 fw AND u32 match ip src 1.2.3.4/32 - to match packets from 1.2.3.4
>which have been marked elsewhere

You can do that in a couple (at least) of different ways.

1) Using netfilter custom chains.
All the conditions you can express with the U32 classifier can be
expressed with iptables too.
U32 allows you to use hash tables and speed up the classification in certain contexts, but if you are not using U32 hash tables you can replace any U32 match with an iptables mark/command. To some extent, you can define a combination of conditions using iptables custom chains: you create a chain and insert into the latter the iptables command that test your conditions. iptables allows you to use the ! (i.e. NOT) operator. This solutions however does not scale, and, depending on what configuration you need to enforce, it may not work always. This is not the solution I would suggest to use, especially if your need to define many filters. 2) Using the (relatively) new Basic classifier. More details below. >OR > >* to OR the values of u32 matches. Something like u32 match ip src >1.2.3.4/32 OR match ip dst 1.2.3.4/32 - to match packets going to or >from 1.2.3.4 U32 does not allow you to explicitly OR different matches. However, you can organize your filters using U32 hash tables in a way such that on a given bucket you insert only those match conditions that must be ORed: After all, a list of matches is nothing but a list of ORed conditions: the first one that matches is used. This solution may not scale and may not be usable in all scenarios (it depends a lot on the config you need to enforce). > >OR > >* to use a mask on firewall marks as per iptables/ebtables MARK matches. You can do that with the Basic classifier. (I believe there is also a patch around that adds this functionality to the fw classifier). 
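A masked mark test is usually read as "(mark AND mask) equals value". The following is only a sketch of that arithmetic in plain shell, so the effect of a mask can be checked by hand before writing a filter. The function name mark_matches is made up for this illustration and is not part of tc or iptables.

```shell
#!/bin/sh
# Illustration only: mark_matches is a hypothetical helper, not a tc or
# iptables command. It succeeds when (mark & mask) equals value, which is
# the usual reading of a masked firewall-mark match.
mark_matches() {
    # $1 = packet mark, $2 = mask, $3 = value to compare against
    [ $(( $1 & $2 )) -eq $(( $3 )) ]
}

# 0x31 carries 1 in its least significant 4 bits, 0x32 carries 2
mark_matches 0x31 0xF 1 && echo "0x31 matches"
mark_matches 0x32 0xF 1 || echo "0x32 does not match"
```

With a 4-bit mask the upper bits of the mark stay free for other uses, which is the usual reason to mask marks at all.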
The Basic classifier allows you to define conditions such as

   match AND (NOT ( OR )

Here are a couple of examples for the conditions above
(see my note at the end of the email):

# match ip src 1.2.3.4/32 OR match ip dst 1.2.3.4/32
tc filter add dev eth2 parent 1:0 prio 5 protocol ip \
   basic match \
   u32\(u32 0x01020304 0xFFFFFFFF at 12\) OR \
   u32\(u32 0x01020304 0xFFFFFFFF at 16\) \
   flowid 1:11

# match anything NOT from 1.2.3.0/24
tc filter add dev eth2 parent 1:0 prio 5 protocol ip \
   basic match \
   NOT u32\(u32 0x01020300 0xFFFFFF00 at 12\) \
   flowid 1:13

# Example of mask on firewall marks
# This filter matches with those pkts whose firewall
# mark has the value 1 set in the least significant 4 bits
# (you can use 0xF instead of 0x0000000F if you prefer)
tc filter add dev eth2 parent 1:0 prio 5 protocol ip \
   basic match \
   meta\(nf_mark mask 0x0000000F eq 1\) \
   flowid 1:12

For more detail on the Basic classifier, see these kernel
configuration options:

Networking
 +->Networking options
   +->QoS and/or fair queueing
     +->Elementary classification (BASIC)
     +->Extended Matches

Note that the Basic classifier and the extended matches are not as
mature and stable as the rest of the Traffic Control code yet.
(I have fixed a few bugs both in IPROUTE2 and in the kernel; next
week I am going to send the patches to the current maintainer.
I can post the patches here too if there is anyone interested)

Regards.
/Christian [http://benve.info] From lists at andyfurniss.entadsl.com Wed Apr 4 22:40:24 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Wed Apr 4 22:40:30 2007 Subject: [LARTC] Incoming traffic In-Reply-To: <519f77360704040009l150a36b2p69b6baa253a34f4b@mail.gmail.com> References: <519f77360702270831v2dd01e3bu6af6c75de50d96eb@mail.gmail.com> <4612B03E.7060500@andyfurniss.entadsl.com> <519f77360704040009l150a36b2p69b6baa253a34f4b@mail.gmail.com> Message-ID: <46140D38.9040902@andyfurniss.entadsl.com> mohican 542003 wrote: > Hello, > > I would like that .2 and .4 share 10mbit and .3 and .5 share 20 mbit. > > I finally use tcindex that works fine. u32 can only be used with one IP, > and iptables cannot mark packet for incoming traffic. > > Do you have another suggestion ? There are things called shared meters - though I think that name is a bit misleading as to their usefullness. You can use them to make policers from different matches behave as one, so it is possible to do as you want - it won't be a fair share though. The iptables issue is because the place policers hook changed - on 2.4s and if you config your kernel the right way (don't select packet action, and then select the old/depreciated policer) it will see packets after iptables prerouting, the default on 2.6s is to hook before netfilter. Andy. 
From lists at andyfurniss.entadsl.com Wed Apr 4 23:14:46 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Wed Apr 4 23:14:47 2007 Subject: [LARTC] Re: Some advanced filtering questions In-Reply-To: <1175694801.2649.38.camel@benve-laptop> References: <1175694801.2649.38.camel@benve-laptop> Message-ID: <46141546.4090200@andyfurniss.entadsl.com> Christian Benvenuti wrote: > For more detail on the Basic classifier, see these kernel > configuration options: > > Networking > +->Networking options > +->QoS and/or fair queueing > +->Elementary classification (BASIC) > +->Extended Matches > > Note that the Basic classifier and the extended matches are not as > mature and stable as the rest of the Traffic Control code yet. > (I have fixed a few bugs both in IPROUTE2 and in the kernel; next > week I am going to send the patches to the current maintainer. > I can post the patches here too if there is anyone interested) Cool. I will see them on netdev, but if you are using them for real some examples would be usefull on here. It seems like ages ago Thomas was doing ematch/meta stuff and until now I've not seen one example. To Leigh - there are the other things you can do with tc actions like pipe/reclassify/continue. They are (partially/minimally) documented in iproute2's doc/actions/*. I do have a problem with actions-general (maybe it's just the way I read it), but the comments in the example in that just seem wrong. They are to do with policers/shared meters more than classification, but the comments and the phrase shared meter hold up the possibility of doing some cool stuff that in reality turns out to be impossible. Andy. 
From alex at uh.cu Wed Apr 4 23:43:17 2007 From: alex at uh.cu (Alejandro Ramos Encinosa) Date: Wed Apr 4 23:41:50 2007 Subject: [LARTC] Re: tc questions In-Reply-To: <1175692398.2649.11.camel@benve-laptop> References: <1175692398.2649.11.camel@benve-laptop> Message-ID: <200704042143.18201.alex@uh.cu> On Wednesday 04 April 2007 13:13, Christian Benvenuti wrote: > Hi Alejandro Hi Christian! > It is not mandatory to attach a HTB qdisc to the root. You can attach > it to any classfull qdisc's cass. Yes, I know. I was trying to ask why to attach htb qdisc instead of htb class to the root. In fact, I really don't understand what means "htb qdisc" since I just know htb as a classfull tc node, and (I guess) qdisc are classless tc nodes (am I wrong?) > You can only create HTB classes under a HTB qdisc, and you can only > create CBQ classes under a CBQ class. However you can attach any > qdisc to a given class. > What is exactly that you find strange? Well, I thought I just could attach qdisc nodes to class nodes, not viceversa, and that's the case when attaching htb qdisc to the root and then, declaring a child of the root as a htb class, doesn't it? > Correct. > To be exact, most qdiscs use Packet FIFO (pfifo) by default, but that's > not a rule (there are exceptions). Haha, well, that's why rules are for: to break them with exceptions ;) ...just kidding, of course! > The default pFIFO qdisc that get attached to the classes are not > shown by the above command. ...and which is the command that will show them?? > I would say that that is a misconfiguration. > Neither the tc command nor the kernel gives you any warning. > You could implement it as part of your project ... :) I agree with you: it is a wrong configuration, and I need to deal with it as part of my project. But I am able to run those lines, and I will get a behavior, and I want to know if there is some kind of logic around it: ...how it works?? > You are right. 
Class 1:20 does not limit the class 1:21's rate to 1kbit.
> This is due to the way the kernel schedules the HTB classes.

Could you (please) tell me more about how the kernel does this?

> Note that since you did not use the "ceil" config option, class 1:21
> gets by default "ceil" = "rate" = 10mbit, and therefore it can not
> borrow from its parent 1:20.
> There would be nothing to borrow anyway, since 1:20 is limited to
> 1kbit (rate=cel=1kbit).
>
> Regards
> /Christian
> [http://benve.info]

Thank you very much!!

--
Alejandro Ramos Encinosa
Fac. Matemática Computación
Universidad de La Habana

From lists at andyfurniss.entadsl.com Wed Apr 4 23:42:10 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Wed Apr 4 23:42:16 2007
Subject: [LARTC] Kernel timer frequency and HTB
In-Reply-To: <4611453C.4000706@antonello.org>
References: <4611453C.4000706@antonello.org>
Message-ID: <46141BB2.4040900@andyfurniss.entadsl.com>

lists@antonello.org wrote:
> Hello,
> i have a linux box which is acting as a lan router towards the internet
> doing traffic shaping.
> My link is 10Mbit/s full duplex.
>
> I have set some HTB classes with a rate of 20% (2Mbit/s) and a ceil of
> 95% (9.5Mbit/s). Is such an excursion of bandwidth in the HTB classes
> feasible for HTB to control?

I would have thought it's OK, 9.5mbit may be a bit close to the limit
depending on what your overheads on the link are.

>
> What Timer frequency (kernel menuconfig) is the most suitable among 250,
> 300 and 1000Hz for HTB? Also, I really don't have any ideas on how the
> frequency could affect the network adapter performance. Is a high
> frequency going to have bad effects on interrupts handling? Has anybody
> any suggestions about this issue?

I would say 1000 is best. An individual class needs a burst/cburst size
big enough to reach its rate/ceil so you can have smaller bursts on
classes. HTB was written when Hz=100 and unless you specify burst/quantum
it will choose them from the rates.
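What "choose them from the rates" means for quantum can be sketched roughly as below. This assumes the usual HTB behaviour as far as I understand it: quantum is the class rate in bytes per second divided by r2q (10 unless set on the qdisc), and results outside 1000..200000 bytes are clamped with a "quantum is small/big" warning. The function name is hypothetical and the script is not kernel code.

```shell
#!/bin/sh
# Rough sketch, not kernel code: htb_default_quantum is a made-up name.
# Assumptions: the rate is given in bit/s, quantum = (rate / 8) / r2q,
# and values outside 1000..200000 bytes are clamped with a warning.
htb_default_quantum() {
    rate_bits=$1
    r2q=${2:-10}                       # assumed default r2q
    quantum=$(( rate_bits / 8 / r2q ))
    if [ "$quantum" -lt 1000 ]; then
        echo "quantum $quantum is small, using 1000" >&2
        quantum=1000
    elif [ "$quantum" -gt 200000 ]; then
        echo "quantum $quantum is big, using 200000" >&2
        quantum=200000
    fi
    echo "$quantum"
}

htb_default_quantum 2000000      # a 2mbit class
htb_default_quantum 100000000    # a 100mbit class, gets clamped
```

If the derived value is not what you want, giving quantum explicitly on the class, or a different r2q on the htb qdisc, should avoid the clamping.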
If you do use 1000 it may be worth having a look what it chose with tc -s -d class ls dev .... > > I have some warnings about HTB quantums being too big, but as i > understand, those should only affect the precision of the shaping > without undermining the shaping completely. Quantum affects the way excess bandwidth is shared - you can specify it on each line that has a rate/ceil if you want - minimum should be 1514 on eth with mtu 1500. If you don't specify it HTB will use the rates to set it, which may not be what you want. Andy. From lists at andyfurniss.entadsl.com Thu Apr 5 00:01:22 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Thu Apr 5 00:01:26 2007 Subject: [LARTC] tc questions In-Reply-To: <200704040502.19649.alex@uh.cu> References: <200704040502.19649.alex@uh.cu> Message-ID: <46142032.4000804@andyfurniss.entadsl.com> Alejandro Ramos Encinosa wrote: > Hi to all of you!!! > > I am a Computer Science student trying to do the pre-grade thesis. I am trying > to develop a free software tool to help administrators to control the > traffic. Right now this tool is based on tc and iptables. > I am having some problems trying to understand tc and tc examples: > - Why in almost every list of tc rules based on htb class, there is a "tc > qdisc dev ... root ... htb default ..." as a root node? Is it mandatory to > work with htb class? > - I understood that every class node has its own qdisc attached (fifo by > default, right?). If that is the case, why when I do "tc qdisc show ..." it > JUST shows me those qdisc I explicitly attached to classes without any child > class? > - What should I expect if I run something like this? 
> > tc qdisc add dev eth0 root handle 1: htb default 10 > tc class add dev eth0 parent 1: classid 1:1 htb rate 100mbit > tc class add dev eth0 parent 1:1 classid 1:10 htb rate 90mbit > tc class add dev eth0 parent 1:1 classid 1:20 htb rate 1kbit > tc class add dev eth0 parent 1:20 classid 1:21 htb rate 10mbit > > I guessed the traffic redirected to 1:21 should have 1kbit of rate at most > (because of its parent 1:20), but when I ran this, I got a higher rate > (because of the 10mbit rate, I guess). Why? Shouldn't parent classes restrict > children's rate? > > Thanks in advance. Regards, Ale. > In addition to what Christian said - have you seen the docs on the htb homepage - http://luxik.cdi.cz/~devik/qos/htb/ and Steph Coene's work - http://www.docum.org Andy. From niclas.bentley at bredband.net Thu Apr 5 16:52:09 2007 From: niclas.bentley at bredband.net (Niclas Bentley) Date: Thu Apr 5 16:52:51 2007 Subject: [LARTC] ipp2p problem with kernel 2.6.20! In-Reply-To: <1175596875.18427.7.camel@dunder> References: <1175596875.18427.7.camel@dunder> Message-ID: <1175784729.22316.3.camel@dunder> Hi again, I wrote this a few days ago....Isn't there anybody using ipp2p and linux kernel 2.6.20? I think ipp2p would be very useful in order to identify bittorrent traffic... On Tue, 2007-04-03 at 12:41 +0200, Niclas Bentley wrote: > Hi, > I get this error when trying to insmod the ipp2p kernel module: > "insmod: error inserting 'ipt_ipp2p.ko': -1 Invalid module format" > > in the kernel log: "ipt_ipp2p: disagrees about version of symbol > struct_module" > > Kernel version 2.6.20.4 > iptables version: 1.3.5 > ipp2p version: 0.8.2 (latest) > > Anyone tried ipp2p with kernel 2.6.20? > > Best Regards Niclas Bentley > From christian.benvenuti at libero.it Thu Apr 5 18:00:52 2007 From: christian.benvenuti at libero.it (Christian Benvenuti) Date: Thu Apr 5 17:56:10 2007 Subject: [LARTC] Re: tc questions Message-ID: <1175788853.3947.6.camel@benve-laptop> Hi Alejandro >Yes, I know. 
I was trying to ask why to attach htb qdisc instead
>of htb class to the root.
>In fact, I really don't understand what means "htb qdisc" since
>I just know htb as a classfull tc node, and (I guess) qdisc are
>classless tc nodes (am I wrong?)

mhmh, I think you are a little confused here. I would recommend
reading both the document on HTB pointed out by Andy and the
general LARTC howto.

Traffic control defines different object types:

- qdisc (queueing disciplines: how packets are enqueued/dequeued)
  HTB is one kind of qdisc.

- classes (a mechanism for organizing packets inside qdiscs)
  Only classful qdiscs allow you to create classes ...
  HTB is a classful qdisc.

- classifiers (used to define filters to map traffic to classes)

- classifier extensions: legacy policers and actions.
  (one of the action types is "police", which replaces the legacy
  policers)

- ...

>> You can only create HTB classes under a HTB qdisc, and you can only
>> create CBQ classes under a CBQ class. However you can attach any
>> qdisc to a given class.
>> What is exactly that you find strange?

>Well, I thought I just could attach qdisc nodes to class nodes, not
>viceversa, and that's the case when attaching htb qdisc to the root
>and then, declaring a child of the root as a htb class, doesn't it?

You can create classes inside classful qdiscs.
The classes you create are of the same type as the parent qdisc,
which explains why you create HTB classes inside HTB qdiscs.
You can attach a qdisc to a class (if you want to replace the default
pFIFO).
Well, you can also attach a qdisc directly to another qdisc if you like,
but it makes sense only in a few cases.

>> The default pFIFO qdisc that get attached to the classes are not
>> shown by the above command.

>...and which is the command that will show them??

There is no command that does that.
If you really want to see them, you can explicitly attach a pFIFO
queue to the classes.

>> I would say that that is a misconfiguration.
>> Neither the tc command nor the kernel gives you any warning.
>> You could implement it as part of your project ... :)

>I agree with you: it is a wrong configuration, and I need to deal with it as
>part of my project. But I am able to run those lines, and I will get a
>behavior, and I want to know if there is some kind of logic around it: ...how
>it works??

There are lots of misconfigurations that neither the tc command nor
the kernel detects or cares about. The one you pointed out is just
one of them.

>> You are right. Class 1:20 does not limit the class 1:21's rate to
>> 1kbit.
>> This is due to the way the kernel schedules the HTB classes.

>Could you (please) tell me more about how the kernel do this?

You can refer to the document pointed out by Andy:

   http://luxik.cdi.cz/~devik/qos/htb/

Devik has documented HTB fairly well.
This is a simplified model:

for each level L (starting from the leaves)
    for each priority P (starting from the highest priority)
        for each class C with priority P at level L
            serve class C

Regards
/Christian
[http://benve.info]

From fernandoblankleder at gmail.com Thu Apr 5 19:01:42 2007
From: fernandoblankleder at gmail.com (Fernando Blankleder)
Date: Thu Apr 5 19:01:58 2007
Subject: [LARTC] Routing Question
Message-ID: <006c01c777a4$18ec7500$0250a8c0@desktop1>

Hi,

Can somebody help me? I have a Linux gateway running IPsec, so if I ping a
host on a remote IPsec network from the gateway, the packet goes out with
the external IP address of the gateway. Is there a way that packets going
from the gateway to a remote network be sourced from the internal gateway IP?

Thanks in advance
Fernando
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070405/69e99925/attachment.html From alex at uh.cu Thu Apr 5 20:36:54 2007 From: alex at uh.cu (Alejandro Ramos Encinosa) Date: Thu Apr 5 20:35:34 2007 Subject: [LARTC] Re: tc questions In-Reply-To: <1175788853.3947.6.camel@benve-laptop> References: <1175788853.3947.6.camel@benve-laptop> Message-ID: <200704051836.54856.alex@uh.cu> First of all, I want to thank to Christian and Andy for answer me. Hi to all! > mhmh, I think you are a little confused here. I would recommend > reading both the document on HTB pointed out by Andy and the > general LARTC howto. I will. > Traffic control defines different object types: > > - qdisc (queueing disciplines: how packets enqueued/dequeued) > HTB is one kind of qdisc. > > - classes (a mechanims for organizing packets inside qdiscs) > Only classfull qdiscs allow you to create classes ... > HTB is a classful qdisc. > > - classifiers (used to define filters to map traffic to classes) > > - classifier extensions: legacy policers and actions. > (one of the action types is "police", which replaces the legacy > policers) Oh!!, now I understand!! > >> The default pFIFO qdisc that get attached to the classes are not > >> shown by the above command. > > > >...and which is the command that will show them?? > > There is no command that does that. > If you really want to see them, you can explicitly attach a pFIFO > queue to the classes. I can do that, but I even have more problems: if I attach a qdisc to a class (lets say, attach an sfq qdisc to an htb class) and the class node is not a leaf, then when I do `tc qdisc show dev eth0` it doesn't show me the qdisc attached. Why? How can I get its statistics? > for each level L (starting from the leafs) > for each priority P (starting from the highest priority) > for each class C with priority P at level L > serve class C hmm, that make sense for me > Regards > /Christian > [http://benve.info] Thanks in advance. Regards, Ale. 
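The simplified model quoted above (leaves first, then highest priority first within a level) amounts to ordering classes on two keys. A toy sketch of that ordering follows. It assumes level 0 means "leaf" and that a lower prio number means higher priority; the function name and the input format are invented for this illustration and say nothing about how the kernel stores its classes.

```shell
#!/bin/sh
# Toy model of the simplified service order: lowest level first, then
# highest priority (lowest prio number) first within a level.
# Input lines: "<classid> <level> <prio>". The name htb_service_order and
# this input format are made up for illustration.
htb_service_order() {
    LC_ALL=C sort -k2,2n -k3,3n -k1,1 |
    while read -r classid level prio; do
        echo "serve $classid (level $level, prio $prio)"
    done
}

printf '%s\n' '1:1 1 0' '1:20 0 1' '1:10 0 0' | htb_service_order
```

The real scheduler also takes rate, ceil and borrowing into account, which this ordering deliberately leaves out.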
-- Alejandro Ramos Encinosa Fac. Matem?tica Computaci?n Universidad de La Habana From dan at 34q.eu Fri Apr 6 16:11:52 2007 From: dan at 34q.eu (Dan) Date: Fri Apr 6 16:12:21 2007 Subject: [LARTC] tc / MARK question Message-ID: <000301c77855$8642b360$92c81a20$@eu> Hi, I have a router, running iptables & tc, with 2 interfaces (eth0 & eth1). I classify http traffic in iptables (prerouting mangle) coming in on eth0 and going out on eth1 with MARK 0x1, and I also classify return http traffic coming from eth1 -> eth0 with MARK 0x1 as well. I then ACCEPT them in the filter/FORWARD chain based on --mark 0x1. However, I want to shape this traffic, and limit based on the 0x1 MARK. I want to limit traffic to 4MBit outgoing on eth1 (incoming http), and 4MBit outgoing on eth0 (return http), *independently*, even though they use the same MARK. If I use HTB, assigned to egress eth0 and another assigned to eth1, and classify packets based on the MARK 0x1 (from above this is two tcp streams in different connections under the same mark), does tc: a) Treat the interfaces separately, giving me 4MBit either way independently b) Treat the interfaces as one (because one MARK is being used), and give me 4MBit total across both streams? Thanks! Dan From frederic at juliana-multimedia.com Fri Apr 6 19:02:46 2007 From: frederic at juliana-multimedia.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Massot?=) Date: Fri Apr 6 19:03:00 2007 Subject: [LARTC] Re: "dst cache overflow" messages and crash In-Reply-To: References: <45D9A86F.2020407@juliana-multimedia.com> <45DC4BAD.3000903@netwlan.net> <460392DC.4010909@juliana-multimedia.com> Message-ID: <46167D36.5050604@juliana-multimedia.com> Julian Anastasov wrote: > Hello, > > On Fri, 23 Mar 2007, Fr?d?ric Massot wrote: > >>>>>> I regularly have errors (kernel: dst cache overflow) and crash of a >>>>>> firewall under Linux 2.6.17 and the route patch from Julian Anastasov. >>> I assume IP_ROUTE_MULTIPATH_CACHED is disabled. 
Do you have >>> BRIDGE_NETFILTER enabled/used? >> - IP_ROUTE_MULTIPATH_CACHED is not set >> - BRIDGE_NETFILTER is set, but I do not use it. > > ok, then can you try the attached patch, it solves dst cache > problem for another user, may be it will help you too. This patch can > be used with or without routes patches. It makes sure we don't leak > dst entry in bridge-netfilter. If the patch does not help let me know > and we can add some printks to catch the problem. Hi, Thank you for your answer, as your patch comes from the kernel 2.6.20, I installed this version of the kernel with the patch (routes-2.6.20-14.diff). That made a little more than one week that I supervise and it cache is well cleaned regularly. All seems to be good. :o) Regards. -- ============================================== | FR?D?RIC MASSOT | | http://www.juliana-multimedia.com | | mailto:frederic@juliana-multimedia.com | ===========================Debian=GNU/Linux=== From christian.benvenuti at libero.it Fri Apr 6 20:35:20 2007 From: christian.benvenuti at libero.it (Christian Benvenuti) Date: Fri Apr 6 20:29:02 2007 Subject: [LARTC] Re: tc / MARK question Message-ID: <1175884520.6221.6.camel@benve-laptop> Hi Dan, >Hi, > >I have a router, running iptables & tc, with 2 interfaces >(eth0 & eth1). > >I classify http traffic in iptables (prerouting mangle) coming in >on eth0 and going out on eth1 with MARK 0x1, and I also classify >return http traffic coming from eth1 -> eth0 with MARK 0x1 as well. >I then ACCEPT them in the filter/FORWARD chain based on --mark 0x1. > >However, I want to shape this traffic, and limit based on the 0x1 MARK. >I want to limit traffic to 4MBit outgoing on eth1 (incoming http), and >4MBit outgoing on eth0 (return http), *independently*, even though they >use the same MARK. 
> >If I use HTB, assigned to egress eth0 and another assigned to eth1, and >classify packets based on the MARK 0x1 (from above this is two tcp >streams in different connections under the same mark), does tc: > >a) Treat the interfaces separately, giving me 4MBit either way > independently This is the correct answer. Traffic Control is applied to each interface independently. Didn't you notice that when you configure a qdisc/class/filter you must always specify the interface name ? :) The fact that you use the same MARK in both directions has no influence at all on the queueing. >b) Treat the interfaces as one (because one MARK is being used), and > give me 4MBit total across both streams? Regards /Christian [http://benve.info] From fernandoblankleder at gmail.com Fri Apr 6 21:56:31 2007 From: fernandoblankleder at gmail.com (Fernando Blankleder) Date: Fri Apr 6 21:56:43 2007 Subject: Fw: [LARTC] Routing Question Message-ID: <001401c77885$ae09cd90$0250a8c0@desktop1> ----- Original Message ----- From: "Fernando Blankleder" To: "Evgeni Gechev" Sent: Friday, April 06, 2007 11:37 AM Subject: Re: [LARTC] Routing Question >I was thinking in a more Permanent Solution :) > > ----- Original Message ----- > From: "Evgeni Gechev" > To: "Fernando Blankleder" > Sent: Thursday, April 05, 2007 2:13 PM > Subject: Re: [LARTC] Routing Question > > >> Fernando Blankleder ??????: >>> Hi, Somebody can help me , i have a linux gateway running ipsec, so if i >>> ping a host on a remote ipsec network from gateway packet goes out with >>> external ip address of gateway , is there a way that packets going from >>> gateway to a remote network be sourced from internal gateway ip ? 
>>> Thanks in advance
>>> Fernando
>>>
>> ping -I Internal_IP Remote_IP
>

From christian.benvenuti at libero.it Sat Apr 7 00:29:50 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Sat Apr 7 00:23:16 2007
Subject: [LARTC] Re: Routing Question
Message-ID: <1175898590.8051.10.camel@benve-laptop>

Hi Fernando,

>Hi, Somebody can help me , i have a linux gateway running ipsec, so if
>i ping a host on a remote ipsec network from gateway packet goes out
>with external ip address of gateway , is there a way that packets going
>from gateway to a remote network be sourced from internal gateway ip ?
>
>Thanks in advance
>Fernando

I do not know what your setup and exact needs are, but have you
tried the "src" option of the "ip route" command?

Example:

    ip route add dev eth1 192.168.1.0/24 src 10.0.1.1
                                         ^^^^^^^^^^^^

The routing code uses the primary IP address of the outgoing
interface, unless you explicitly configure the preferred
source address (as in the example above).
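A small sketch of the "src" suggestion above. It only prints the command so it can be reviewed before anything is changed. The helper name route_with_src is made up, and the prefix, device and address are simply the placeholder values from the example, not Fernando's real setup. The source address is assumed to be one already configured on the box.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) an "ip route add" command
# that sets the preferred source address for a destination prefix.
route_with_src() {
    # $1 = destination prefix, $2 = outgoing device, $3 = preferred source
    echo "ip route add $1 dev $2 src $3"
}

route_with_src 192.168.1.0/24 eth1 10.0.1.1
# To apply it, run the printed command as root. Afterwards
#   ip route get <some address inside the prefix>
# should report which source address the kernel would pick.
```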
Regards /Christian [http://benve.info] From fernandoblankleder at gmail.com Mon Apr 9 18:25:25 2007 From: fernandoblankleder at gmail.com (Fernando Blankleder) Date: Mon Apr 9 18:26:07 2007 Subject: [LARTC] Re: Routing Question References: <1175898590.8051.10.camel@benve-laptop> Message-ID: <004901c77ac3$b16b9e60$0250a8c0@desktop1> Hi, my setup is : 192.168.80.0/24 ---- > ( eth1:192.168.80.254 ) Linux ipsec Router (ppp0/ipsec0) ----> [ internet ] <-----Sonicwall (192.168.1.1) <----- 192.168.1.0/24 When a pc in 192.168.80.0/24 pings anything on 192.168.1.0/24 it works When Linux Ipsec Router pings anything on 192.168.1.0/24 it doesnt works, ping packet goes trough default route because packet originates on eth1 some time ago i made a script using a 2nd route table but i cant find it or remember ----- Original Message ----- From: "Christian Benvenuti" To: Sent: Friday, April 06, 2007 7:29 PM Subject: [LARTC] Re: Routing Question > Hi Fernando, > >>Hi, Somebody can help me , i have a linux gateway running ipsec, so if >>i ping a host on a remote ipsec network from gateway packet goes out >>with external ip address of gateway , is there a way that packets going >>from gateway to a remote network be sourced from internal gateway ip ? >> >>Thanks in advance >>Fernando > > I do not know what your setup and exact needs are, but have you > tried the "src" option of the "ip route" command? > > Example: > > ip route add dev eth1 192.168.1.0/24 src 10.0.1.1 > ^^^^^^^^^^^^ > > The routing code uses the primary IP address of the outgoing > interface, unless you explicitly configure the preferred > source address (as in the example above). 
> > Regards > /Christian > [http://benve.info] > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From fernandes_pablo at yahoo.com.br Mon Apr 9 15:05:02 2007 From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo) Date: Mon Apr 9 19:06:12 2007 Subject: [LARTC] tc (CBQ) and UDP packets Message-ID: <20070409170558.56C7B4B8EB@outpost.ds9a.nl> Hello, I have seen in my site something strange. I use tc-CBQ for bandwidth shaping, and it works properly well. Sometimes i have seen UDP connections spend more bandwidth than i set up to the user. I have 4 sites, and it always happens. Is there a problem with bandwidth shaping and UDP packets? Thanks a lot in advance. Pablo Fernandes -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070409/bc04ab9d/attachment.htm From alex at uh.cu Tue Apr 10 01:49:09 2007 From: alex at uh.cu (Alejandro Ramos Encinosa) Date: Tue Apr 10 01:47:57 2007 Subject: [LARTC] Re: tc questions In-Reply-To: <200704051836.54856.alex@uh.cu> References: <1175788853.3947.6.camel@benve-laptop> <200704051836.54856.alex@uh.cu> Message-ID: <200704092349.09428.alex@uh.cu> Hi to all. >>>> why when I do "tc qdisc show ..." it JUST shows me those qdisc I >>>> explicitly attached to classes without any child class? > >>> The default pFIFO qdisc that get attached to the classes are not >>> shown by the above command. > >>...and which is the command that will show them?? > > There is no command that does that. > If you really want to see them, you can explicitly attach a pFIFO > queue to the classes. I have a little question here: If I understood well, if I want to see a classless qdisc statistics I must explicity attach the qdisc to the classful qdisc. 
However, I have (for example) the following configuration and I still don't get the statistics for 120: (just for 1: and 121:):

----------------------------8<--------------------------------8<-----------------------------
tc qdisc add dev eth1 root handle 1: htb default 10

tc class add dev eth1 parent 1: classid 1:1 htb rate 100mbit

tc class add dev eth1 parent 1:1 classid 1:10 htb rate 2mbit
tc class add dev eth1 parent 1:1 classid 1:20 htb rate 98mbit
tc qdisc add dev eth1 parent 1:20 handle 120: sfq perturb 10

tc class add dev eth1 parent 1:20 classid 1:21 htb rate 49mbit
tc qdisc add dev eth1 parent 1:21 handle 121: sfq perturb 10

tc filter add dev eth1 protocol ip parent 1: prio 1 u32 match ip dst 10.6.70.1 flowid 1:20
tc filter add dev eth1 protocol ip parent 1:20 prio 1 u32 match ip sport 80 0xffff flowid 1:21
---------------------------->8-------------------------------->8-----------------------------

If I run `tc -s qdisc show dev eth1' then I will get something like

----------------------------8<--------------------------------8<-----------------------------
qdisc htb 1: r2q 10 default 10 direct_packets_stat 0
 Sent 2284 bytes 7 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 121: parent 1:21 limit 128p quantum 1514b perturb 10sec
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
---------------------------->8-------------------------------->8-----------------------------

i.e. not 120: at all!!! and I need to get that flow.
Worse than that, if I run `tc -s class show dev eth1' I get this for class 1:20:

----------------------------8<--------------------------------8<-----------------------------
class htb 1:20 parent 1:1 rate 98000Kbit ceil 98000Kbit burst 50580b cburst 50580b
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 4229 ctokens: 4229
---------------------------->8-------------------------------->8-----------------------------

and I am sure I am generating traffic that matches its filter. Can any of you help me?

PS: what I really want is a way to obtain statistics for each qdisc.
--
Alejandro Ramos Encinosa
Fac. Matemática Computación
Universidad de La Habana

From marco.casaroli at gmail.com Tue Apr 10 05:55:10 2007
From: marco.casaroli at gmail.com (Marco Aurelio)
Date: Tue Apr 10 05:55:20 2007
Subject: [LARTC] Re: tc questions
In-Reply-To: <200704092349.09428.alex@uh.cu>
References: <1175788853.3947.6.camel@benve-laptop> <200704051836.54856.alex@uh.cu> <200704092349.09428.alex@uh.cu>
Message-ID: <92ed523b0704092055w3213b484u826d1946abc163e4@mail.gmail.com>

Hello.

I may be misunderstanding what you are trying to do, but I think

tc -s class ls dev eth1

shows the stats you want. Note the word "class".

--
Marco

From andrew.lyon at josims.com Tue Apr 10 14:35:05 2007
From: andrew.lyon at josims.com (Andrew Lyon)
Date: Tue Apr 10 14:35:38 2007
Subject: [LARTC] equalize / ecmp not working as expected in 2.6 vs 2.4
Message-ID: <592F914D209FD942908826DFF2277A2D0329E80D@commsserver.josims.local>

Hi,

With kernel 2.4 I was able to use equalize to send our outgoing packets to two different routers (our ISP supports this setup), like this:

ip route add default src ip.a.dd.rr equalize nexthop via weight 1 nexthop via weight 1

The two routes were used equally on a per-packet basis: not per flow or per cached route, but per packet. Each line has 800k upload, and with that route we could upload to a single remote host at 1.6mbit.

We replaced the server with a newer one and changed to a 2.6 (2.6.20) kernel. I found that equalize no longer works as expected: it does choose a router at random, but once a single packet has been sent to a remote host the same route/router is used for all packets going to that remote host.
Once the cached route expires, a random route is chosen again, but that is not what we want.

I had made no changes to the ip route commands, but then I realised that kernel 2.6.20 has options for multipath (IP: equal cost multipath with caching support). I enabled that, and now our kernel options are:

CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_MULTIPATH_CACHED=y
CONFIG_IP_ROUTE_MULTIPATH_RR=m
CONFIG_IP_ROUTE_MULTIPATH_RANDOM=m
CONFIG_IP_ROUTE_MULTIPATH_WRANDOM=m
CONFIG_IP_ROUTE_MULTIPATH_DRR=m

But even with these options, and the default route set as follows:

ip route add default src ip.a.dd.rr mpath rr nexthop via weight 1 nexthop via weight 1

the result is the same: a single upload to a remote host only uses 800k of bandwidth on one of the lines. It does not send packets to both lines, although two uploads to two different hosts will usually make use of both lines.

It seems to me that the multipath with caching support is broken in 2.6.20?

The exact kernel we use is 2.6.20.4. With that kernel, how would you specify a remote route such that packets going to a remote host are sent in a 50/50 ratio to two different routers?

Thanks
Andy

From e.janz at barceloviajes.com Tue Apr 10 16:16:36 2007
From: e.janz at barceloviajes.com (e.janz@barceloviajes.com)
Date: Tue Apr 10 16:16:52 2007
Subject: [LARTC] equalize / ecmp not working as expected in 2.6 vs 2.4
In-Reply-To: <592F914D209FD942908826DFF2277A2D0329E80D@commsserver.josims.local>
Message-ID:

Hi Andrew,

I would use a combination with iptables. You should mark the packets, for example using average or n-th, and then use ip rules to send half of the packets via one router and the rest to the other router, according to the marks you set with iptables.

Just a question: don't you have problems with your source IP and the returning responses when you are sending packets from one connection over multiple routers? Do you have something like an AS?
Best regards,
Eric Janz

--
ADVERTENCIA LEGAL
El contenido de este correo es confidencial y dirigido unicamente a su destinatario. Para acceder a su clausula de privacidad consulte http://www.barceloviajes.com/privacy

LEGAL ADVISORY
This message is confidential and intended only for the person or entity to which it is addressed. In order to read its privacy policy consult it at http://www.barceloviajes.com/privacy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070410/9b792350/attachment.html

From andrew.lyon at josims.com Tue Apr 10 16:34:52 2007
From: andrew.lyon at josims.com (Andrew Lyon)
Date: Tue Apr 10 16:34:58 2007
Subject: [LARTC] equalize / ecmp not working as expected in 2.6 vs 2.4
Message-ID: <592F914D209FD942908826DFF2277A2D0329E810@commsserver.josims.local>

>________________________________________
>From: e.janz@barceloviajes.com [mailto:e.janz@barceloviajes.com]
>Sent: 10 April 2007 15:17
>To: lartc@mailman.ds9a.nl
>Subject: Re: [LARTC] equalize / ecmp not working as expected in 2.6 vs 2.4
>
>Hi Andrew,
>
>I would use a combination with iptables. You should mark the packets, for example using average or n-th, and then use ip rules to send half of the packets via one router and the rest to the other router according to the marks you set with iptables.
>Just a question: dont you have problems with your source IP and the returning responses when you are sending packets from one connection over multiple routers? do you have something like an AS?
>
>Best regards,
>Eric Janz

Eric,

Could you give me an example of how to do that? With nth if possible...

It is not common for an ISP to support that sort of setup, but they do: http://aaisp.net.uk/aa/aaisp/multiline.html

Each line has two IPs, one for the router and another for the interface on a Linux box or other device. The ISP routes a larger /28 down both lines, and allows packets with a source address in the /28 range to be sent through both lines.

On my Linux server I have a routing table for each line with the necessary routes to make each router IP reachable, and a default route that equalizes over both router IPs. It worked with 2.4, but with 2.6 it seems to be per-flow instead of per-packet.

I can log in to a control page app on the ISP website and configure which lines a given block is routed down, and they also do really good traffic monitoring etc.: http://www.aaisp.net.uk/cqm.html

PS. Please reply below original posting, not above! http://en.wikipedia.org/wiki/Top-posting

Andy

From e.janz at barceloviajes.com Tue Apr 10 19:00:35 2007
From: e.janz at barceloviajes.com (e.janz@barceloviajes.com)
Date: Tue Apr 10 19:00:26 2007
Subject: [LARTC] equalize / ecmp not working as expected in 2.6 vs 2.4
In-Reply-To: <592F914D209FD942908826DFF2277A2D0329E810@commsserver.josims.local>
Message-ID:

Andrew Lyon wrote on 10/04/2007 16:34:52:

> Could you give me a example of how to do that? With nth if possible...

Hi Andy,

thanks for the info. First of all, in order to use the nth match you need to patch your kernel using patch-o-matic. After that, the nth match should be available. Try something like this:

Supposing that the local traffic enters your Linux server via eth0:

1. Mark the packets using iptables before the routing decision is made:

iptables -t mangle -A PREROUTING -i eth0 -m nth --every 2 --packet 0 -j MARK --set-mark 111
iptables -t mangle -A PREROUTING -i eth0 -m nth --every 2 --packet 1 -j MARK --set-mark 222

2. Set up some rules to jump to the correct routing tables. In this case I will suppose that you are using tables 111 and 222 (obviously you can use the ones you like):

ip rule add prio 111 fwmark 111 table 111
ip rule add prio 222 fwmark 222 table 222

(you can also set the priority of the rules at your convenience)

3. Set up your routing tables (in this example 111 and 222) to reach each router as you had with the 2.4 kernel.

[ ... ]
ip route add table 111 default via ROUTER1_IP_ADDRESS
ip route add table 222 default via ROUTER2_IP_ADDRESS

In this case we need no multipath route. Half of all the packets that come into eth0 are routed using table 111 and the rest are routed using table 222, thanks to the marks we set.

The problems you are experiencing with multipath routing arise because the routing decision uses a cache: once a routing decision for a destination has been made, the same gateway is always used to reach that destination until the routing cache entry expires.

I hope this helps,

Regards,
Eric Janz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070410/d4e7a34c/attachment.htm

From lists at andyfurniss.entadsl.com Tue Apr 10 21:29:33 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Apr 10 21:29:29 2007
Subject: [LARTC] equalize / ecmp not working as expected in 2.6 vs 2.4
In-Reply-To:
References:
Message-ID: <461BE59D.4070405@andyfurniss.entadsl.com>

e.janz@barceloviajes.com wrote:

> thanks for the info. First of all, in order to use the nth match you need
> to patch your kernel using patch-o-matic.

I think nth is in the kernel now, as part of the statistic match.

Andy.

From lists at andyfurniss.entadsl.com Wed Apr 11 00:11:06 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Wed Apr 11 00:10:59 2007
Subject: [LARTC] Re: tc questions
In-Reply-To: <200704092349.09428.alex@uh.cu>
References: <1175788853.3947.6.camel@benve-laptop> <200704051836.54856.alex@uh.cu> <200704092349.09428.alex@uh.cu>
Message-ID: <461C0B7A.8050601@andyfurniss.entadsl.com>

Alejandro Ramos Encinosa wrote:

> tc qdisc add dev eth1 parent 1:20 handle 120: sfq perturb 10
>
> tc class add dev eth1 parent 1:20 classid 1:21 htb rate 49mbit

This is a misconfiguration: it doesn't make sense to add both sfq and another htb class to 1:20.

Andy.

From lists at andyfurniss.entadsl.com Wed Apr 11 00:19:36 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Wed Apr 11 00:20:02 2007
Subject: [LARTC] tc (CBQ) and UDP packets
In-Reply-To: <20070409170558.56C7B4B8EB@outpost.ds9a.nl>
References: <20070409170558.56C7B4B8EB@outpost.ds9a.nl>
Message-ID: <461C0D78.5050200@andyfurniss.entadsl.com>

Pablo Fernandes Yahoo wrote:
> Hello,
>
> I have seen in my site something strange. I use tc-CBQ for bandwidth
> shaping, and it works properly well. Sometimes i have seen UDP connections
> spend more bandwidth than i set up to the user. I have 4 sites, and it
> always happens. Is there a problem with bandwidth shaping and UDP packets?

I have never used CBQ.
It could be that you are measuring the bandwidth before the queue - UDP won't usually back off in response to delay/drop like TCP does.

Andy.

From lists at andyfurniss.entadsl.com Wed Apr 11 00:24:13 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Wed Apr 11 00:24:11 2007
Subject: [LARTC] ipp2p problem with kernel 2.6.20!
In-Reply-To: <1175784729.22316.3.camel@dunder>
References: <1175596875.18427.7.camel@dunder> <1175784729.22316.3.camel@dunder>
Message-ID: <461C0E8D.60008@andyfurniss.entadsl.com>

Niclas Bentley wrote:
> Hi again,
> I wrote this a few days ago....Isn't there anybody using ipp2p and linux
> kernel 2.6.20?
> I think ipp2p would be very useful in order to identify bittorrent
> traffic...

I don't - maybe you should cc the maintainer as well.

As for bittorrent - there may be easier (and fairer) ways, like per-user shaping or, if you are doing QoS for yourself, using a client that can bind to a different IP.

Andy.

From fernandes_pablo at yahoo.com.br Tue Apr 10 21:56:09 2007
From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo)
Date: Wed Apr 11 01:57:08 2007
Subject: AW: [LARTC] tc (CBQ) and UDP packets
In-Reply-To: <461C0D78.5050200@andyfurniss.entadsl.com>
Message-ID: <20070410235704.37A9B3FB4@outpost.ds9a.nl>

Hi Andy and All,

Thank you so much for your explanation. I really appreciated it. I was thinking about this UDP question and bandwidth shaping... I saw the same UDP connections pass through the shaping of the cable modem and CBQ on a Linux box.

Anyway, this is an example of my CBQ configuration. Is it OK? This configuration can carry a maximum of 1500 users. Beyond 1500 users, the ksoftirqd process uses 99% of the CPU full-time and the connections get very bad. I mention this only because it could help to see whether my configuration is OK or wrong.
Here it goes:

tc qdisc del dev eth1 root
tc qdisc add dev eth1 root handle 1 cbq bandwidth 100Mbit avpkt 1000 cell 8
tc class change dev eth1 root cbq weight 1Mbit allot 1514

tc qdisc del dev eth0 root
tc qdisc add dev eth0 root handle 1 cbq bandwidth 100Mbit avpkt 1000 cell 8
tc class change dev eth0 root cbq weight 1Mbit allot 1514

tc class add dev eth1 parent 1: classid 1:5 cbq bandwidth 100Mbit rate 135Kbit weight 10Kbit prio 5 allot 1514 cell 8 maxburst 130 avpkt 1000 bounded isolated
tc qdisc add dev eth1 parent 1:5 handle 5 tbf rate 135Kbit buffer 10Kb/8 limit 15Kb mtu 1500
tc filter add dev eth1 parent 1:0 protocol ip prio 100 u32 match ip dst xxx.xxx.xxx.xxx/32 classid 1:5

tc class add dev eth0 parent 1: classid 1:5 cbq bandwidth 100Mbit rate 40Kbit weight 10Kbit prio 5 allot 1514 cell 8 maxburst 130 avpkt 1000 bounded isolated
tc qdisc add dev eth0 parent 1:5 handle 5 tbf rate 40Kbit buffer 10Kb/8 limit 15Kb mtu 1500
tc filter add dev eth0 parent 1:0 protocol ip prio 200 handle 5 fw classid 1:5
tc filter add dev eth0 parent 1:0 protocol ip prio 100 u32 match ip src xxx.xxx.xxx.xxx/32 classid 1:5

I have these last 2 blocks of commands for each user in my ISP. We sell the following speeds: 100Kbps, 150Kbps, 200Kbps, 300Kbps, 450Kbps, 600Kbps, 1000Kbps, 1500Kbps, 2000Kbps. In the example I showed a 135Kbps/40Kbps connection, but it is an exception... we do not have this speed.

I will appreciate any help... really.

Thanks a lot in advance.

Pablo Fernandes

From e.janz at barceloviajes.com Wed Apr 11 10:01:26 2007
From: e.janz at barceloviajes.com (e.janz@barceloviajes.com)
Date: Wed Apr 11 10:01:31 2007
Subject: [LARTC] equalize / ecmp not working as expected in 2.6 vs 2.4
In-Reply-To: <461BE59D.4070405@andyfurniss.entadsl.com>
Message-ID:

Andy Furniss wrote on 10/04/2007 21:29:33:

> e.janz@barceloviajes.com wrote:
>
> > thanks for the info. First of all, in order to use the nth match you need
> > to patch your kernel using patch-o-matic.
>
> I think nth is in kernel now as part of the statistic match.
>
> Andy.

I was searching for this yesterday afternoon and could not verify it.
Today I found it in the 2.6.18 kernel's changelog:

    [NETFILTER]: x_tables: add statistic match

    Add statistic match which is a combination of the nth and random matches.

    Signed-off-by: Patrick McHardy <...>
    Signed-off-by: David S. Miller <...>

commit 62b7743483b402f8fb73545d5d487ca714e82766
Author: Patrick McHardy <...>
Date: Mon May 29 18:20:32 2006 -0700

Does this match help you to solve the problem?

Regards,
Eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070411/940e803f/attachment.html

From bob.beers at gmail.com Wed Apr 11 16:16:37 2007
From: bob.beers at gmail.com (Bob Beers)
Date: Wed Apr 11 16:16:49 2007
Subject: [LARTC] two routes, non-permanent higher proiority
Message-ID: <4f6ba3b0704110716k7b68f183h37f415ddefba3af0@mail.gmail.com>

Hi LARTC experts,

Can you help me with this scenario? I have been lurking on this list for a while, but I am very green wrt advanced routing.

I have a Slackware Linux based server in a vehicle with a high-bandwidth satellite connection which is only available while the vehicle is stationary. This satellite connection comes in via eth0. While the vehicle is moving or stationary, we can also use a CDMA modem via a ppp0 connection coming in on a serial port, but it has much lower bandwidth.

The vehicle operator should only need to park the vehicle and press the "engage the satellite" button; as soon as the link is established, the satellite should be the preferred path to the internet.

There will be traffic coming both from the router and from wired and wireless devices on the private network.

Looking forward to your advice.

-Bob

From andrew.lyon at josims.com Wed Apr 11 17:48:47 2007
From: andrew.lyon at josims.com (Andrew Lyon)
Date: Wed Apr 11 17:49:05 2007
Subject: [LARTC] equalize / ecmp not working as expected in 2.6 vs 2.4
Message-ID: <592F914D209FD942908826DFF2277A2D0329E83A@commsserver.josims.local>

>From: e.janz@barceloviajes.com [mailto:e.janz@barceloviajes.com]
>Sent: 11 April 2007 09:01
>
>I was searching this yesterday afternoon and could not verify it.
>Today I found it in the 2.6.18 kernel's changelog:
>
>    [NETFILTER]: x_tables: add statistic match
>
>    Add statistic match which is a combination of the nth and random matches.
>
>Does this match help you to solve the problem?
>
>Regards,
>Eric

Your suggestion pointed me in the right direction. It is now working with the following setup:

Kernel 2.6.20-gentoo-r4 x86_64
Iptables 1.3.7

Iptables rules:

iptables -t mangle -A OUTPUT -s -m statistic --mode nth --every 2 --packet 0 -j MARK --set-mark 111
iptables -t mangle -A OUTPUT -s -m statistic --mode nth --every 2 --packet 1 -j MARK --set-mark 222

The address given to -s is one of the IP addresses from the /28 range that is routed to both of our lines. In your example you said to add the rules to PREROUTING, but the packets are from the box itself so I changed that to OUTPUT. So far that has not caused any problems... any comments on that?

We only want to do per-packet load balancing for some local and some routed IPs, not all of them: some services cannot cope with the out-of-order packets that arise from sending outgoing traffic through two different links.
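
A minimal sketch of why the two statistic rules above should share the traffic evenly. It is a pure bash simulation and does not touch iptables; it assumes that both rules see every packet (MARK does not end chain traversal) and that each rule's counter starts at zero. The function names nth_mark and simulate are hypothetical, chosen for illustration only.

```shell
#!/usr/bin/env bash
# Simulation only: models two "-m statistic --mode nth --every 2" rules.
# Mark values 111 and 222 are the ones used in the thread; everything
# else (function names, packet counts) is an illustrative assumption.

# nth_mark PACKET_INDEX: print the mark a packet would end up with.
# Both rules see every packet, so both counters equal the packet index.
nth_mark() {
    local idx=$1 mark=0
    # Rule 1: --every 2 --packet 0 matches packets 0, 2, 4, ...
    if (( idx % 2 == 0 )); then mark=111; fi
    # Rule 2: --every 2 --packet 1 matches packets 1, 3, 5, ...
    if (( idx % 2 == 1 )); then mark=222; fi
    echo "$mark"
}

# simulate N: count how many of N packets get each mark.
simulate() {
    local n=$1 i c1=0 c2=0
    for (( i = 0; i < n; i++ )); do
        case "$(nth_mark "$i")" in
            111) c1=$((c1 + 1)) ;;
            222) c2=$((c2 + 1)) ;;
        esac
    done
    echo "$c1 $c2"
}

simulate 10   # prints "5 5"
```

Consecutive packets of one upload alternate between the two marks, which is also why the out-of-order delivery mentioned above can appear at the receiver.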
IP rules:

ip rule add prio 111 fwmark 111 table ADSLLink1
ip rule add prio 222 fwmark 222 table ADSLLink2

Both ADSLLink1 and ADSLLink2 already existed and contain a default route via the router for line 1 or line 2. They also have routes for other subnets so that, for example, I can ping our routers from my workstation, which has a private IP address. I won't show all the routes, as the box has some 10 eth interfaces and the list is very long and confusing, but the important bit is:

ip route show table ADSLLink1 | grep default
default via dev inet0
ip route show table ADSLLink2 | grep default
default via dev inet0

This is currently working in combination with the ECMP routes that were already in place, and that is working very well for us: services that suffer when there are lots of out-of-order packets (OOOPs) still get per-flow/cached-route load balancing over the two lines, and services that can handle a few OOOPs get the full benefit of 2 x upload speed.

Iptables also gives me much more fine-grained control of the setup; when I have more time I will be making more improvements.

A final note: I got very confused for a while last night because whenever I used iptables with -t mangle I got an error like this:

iptables --list -t mangle
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
FATAL: Module ip_tables not found.

It turns out that this is an issue with having ip_tables compiled into the kernel: /sbin/iptables tries to modprobe it regardless and then fails because it is not a module. I believe a fix was posted to the netfilter mailing list. I got rid of the error by making a dummy kernel module with the name ip_tables; not a nice solution, but it does suppress the error.
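
The pieces discussed in this thread (statistic marks, fwmark rules, one routing table per line) can be gathered into a small dry-run sketch. It only prints the commands and executes nothing. The numeric tables 111 and 222 follow the earlier example in the thread rather than the named ADSLLink tables, and the addresses are placeholders from the documentation ranges, since the real ones are not given here. The function name emit_per_packet_setup is hypothetical.

```shell
#!/usr/bin/env bash
# Dry run: prints the commands instead of running them, so it needs no root.
# The source address and both gateways are placeholder values (assumptions).

# emit_per_packet_setup SRC GW1 GW2
emit_per_packet_setup() {
    local src=$1 gw1=$2 gw2=$3
    local -a marks=(111 222) gws=("$gw1" "$gw2")
    local i
    for i in 0 1; do
        # Mark every second locally generated packet from SRC.
        echo "iptables -t mangle -A OUTPUT -s $src -m statistic --mode nth --every 2 --packet $i -j MARK --set-mark ${marks[$i]}"
        # Send marked packets to the routing table with the same number.
        echo "ip rule add prio ${marks[$i]} fwmark ${marks[$i]} table ${marks[$i]}"
        # Each table has a single default route via one of the routers.
        echo "ip route add table ${marks[$i]} default via ${gws[$i]}"
    done
}

emit_per_packet_setup 192.0.2.10 198.51.100.1 203.0.113.1
```

Reviewing the printed commands before piping them to a shell keeps the existing ECMP default route untouched for all traffic that is not marked.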
Many thanks for your help

Andy

From shuveb at gmail.com Wed Apr 11 17:53:07 2007
From: shuveb at gmail.com (Shuveb Hussain)
Date: Wed Apr 11 17:53:14 2007
Subject: [LARTC] Policing based on port numbers
Message-ID:

Hi,

I'm trying to police ingress traffic based on port numbers and IP addresses. The u32 match based on IP addresses seems to work without issues and I am able to police incoming packets. However, the same isn't working with u32 matches based on TCP port numbers. For port numbers, I added exactly one 'u32 match' rule:

common for both:
# tc qdisc add dev eth0 handle ffff: ingress

And then:

# tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match ip src \
   0.0.0.0/0 police rate 128kbit burst 10k drop flowid :1

The rule above works, but the same with a port match does not:

# tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match tcp dport 0xXYZ 0xFFFF police rate 128kbit burst 10k drop flowid :1

Is there anything I am missing?

TIA,
--
Shuveb Hussain.

When you lose, be patient.
When you achieve, be even more patient.

From lists at andyfurniss.entadsl.com Wed Apr 11 19:04:14 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Wed Apr 11 19:04:04 2007
Subject: AW: [LARTC] tc (CBQ) and UDP packets
In-Reply-To: <20070411000740.061F15CAED7@entadsl.viper.enta.net>
References: <20070411000740.061F15CAED7@entadsl.viper.enta.net>
Message-ID: <461D150E.7030501@andyfurniss.entadsl.com>

Pablo Fernandes Yahoo wrote:
> Hi Andy and All,
>
> Thank you so much for your explain. I really apreciated it. I was thiking
> about this UDP question and bandwidth shaping... i saw the same udp
> connections pass through the shaping of the cable modem and CBQ in a linux
> box.

As I said, I've not used CBQ - the LARTC section does say it is hard to configure and may not always work.

>
> Anyway, this is an example of my CBQ configuration.... Is it ok? This
> configuration can carry on maximun of 1500 users.
After 1500 users, the > process ksoftirqd spend 99% of CPU fulltime and the connections get very > crap. I'm telling this just because it could help to see if my > configurations are o kor wrong. Here it goes: I also haven't tried shaping at ISP sized configurations. One thing I notice is you attach TBF to classes, I don't see that in the examples. If the ip addresses are in groups it may be better to read the hashing filters section of the howto and see if the filtering can be optimised. Maybe someone with more experience of big setups can help you more than me. As a customer I quite like being policed rather than shaped, so that could be something to look into. Andy. From lists at andyfurniss.entadsl.com Wed Apr 11 21:28:29 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Wed Apr 11 21:29:17 2007 Subject: [LARTC] Policing based on port numbers In-Reply-To: References: Message-ID: <461D36DD.2010301@andyfurniss.entadsl.com> Shuveb Hussain wrote: > Hi, > > I'm trying to police ingress traffic based on port numbers and IP > addresses. The u32 match based on IP addresses seems to work without > issues and I'm am able to police incoming packets. However, the same > isn't working with u32 matches based on TCP port numbers. For port > numbers, I added exactly one 'u32 match' rule: > > common for both: > # tc qdisc add dev eth0 handle ffff: ingress > > And then: > > # tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match ip > src \ > 0.0.0.0/0 police rate 128kbit burst 10k drop flowid :1 > > The rule above works, but the same with a port match does not: > > # tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match > tcp dport 0xXYZ 0xFFFF police rate 128kbit burst 10k drop flowid :1 > > Is there anything I am missing? I've never managed to find a way to use the word tcp in a filter without getting an illegal match - I know it's in the help. 
If you want to match tcp use the ip protocol match tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match ip dport 0xXYZ 0xFFFF match ip protocol 0x06 0xff police ..... Andy. From christian.benvenuti at libero.it Wed Apr 11 22:09:19 2007 From: christian.benvenuti at libero.it (Christian Benvenuti) Date: Wed Apr 11 21:59:27 2007 Subject: [LARTC] Re: Policing based on port numbers Message-ID: <1176322159.21137.19.camel@benve-laptop> Hi, >> Hi, >> >> I'm trying to police ingress traffic based on port numbers and IP >> addresses. The u32 match based on IP addresses seems to work without >> issues and I'm am able to police incoming packets. However, the same >> isn't working with u32 matches based on TCP port numbers. For port >> numbers, I added exactly one 'u32 match' rule: >> >> common for both: >> # tc qdisc add dev eth0 handle ffff: ingress >> >> And then: >> >> # tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match ip >> src \ >> 0.0.0.0/0 police rate 128kbit burst 10k drop flowid :1 >> >> The rule above works, but the same with a port match does not: >> >> # tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match >> tcp dport 0xXYZ 0xFFFF police rate 128kbit burst 10k drop flowid :1 >> >> Is there anything I am missing? >I've never managed to find a way to use the word tcp in a filter without >getting an illegal match - I know it's in the help. The reason it does not work is that the keywords that refer to the L4 protocols (such as tcp) are supposed to be used when: 1) the u32 filter is inserted into a u32 hash table AND 2) you jump to the above hash table from a u32 filter configured with the "offset" option. (unfortunately the U32 classifier is not well documented) >If you want to match tcp use the ip protocol match > >tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match >ip dport 0xXYZ 0xFFFF match ip protocol 0x06 0xff police ..... 
You should invert the order of the two "match" conditions: first you make sure it is TCP, and then you test the destination port number. This filter works in most cases, but not always: it does not take into account the IP options. IP packets with options in the IP header will not match. The reason is that "ip dport" is equivalent to "offset 22" from the beginning of the IP header. Regards /Christian [http://benve.info] From christian.benvenuti at libero.it Wed Apr 11 23:54:07 2007 From: christian.benvenuti at libero.it (Christian Benvenuti) Date: Wed Apr 11 23:44:18 2007 Subject: [LARTC] Re: tc (CBQ) and UDP packets Message-ID: <1176328447.21137.27.camel@benve-laptop> Hi, >> I have seen in my site something strange. I use tc-CBQ for bandwidth >> shaping, and it works properly well. Sometimes i have seen UDP connections >> spend more bandwidth than i set up to the user. I have 4 sites, and it >> always happens. Is there a problem with bandwidth shaping and UDP packets? >I have never used CBQ. It could be that you are measuring the bandwidth >before the queue - UDP won't usually back off in response to delay/drop >like TCP does. > > Andy Andy is right. Did you verify it? How/where are you measuring the rate? Your config is supposed to work. On which direction do you see the problem? Upload (i.e., eth1 to eth0), or Download (i.e., eth0 to eth1)? Regards /Christian [http://benve.info] From christian.benvenuti at libero.it Thu Apr 12 00:27:36 2007 From: christian.benvenuti at libero.it (Christian Benvenuti) Date: Thu Apr 12 00:17:29 2007 Subject: [LARTC] Re: tc questions Message-ID: <1176330456.21137.39.camel@benve-laptop> Hi, >Hello. > >I may be misunderstanding what you are trying to do, but I think > >tc -s class ls dev eth1 > >shows the stats you want. > >note on the "class" word The above command is good for getting the statistics, but it does not return the current status of the class's queue (i.e., the number of packets in it). 
However, in most cases the statistics is what you want, therefore Marco is right. Regards /Christian [http://benve.info] From lists at andyfurniss.entadsl.com Thu Apr 12 02:57:26 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Thu Apr 12 02:57:15 2007 Subject: [LARTC] Re: Policing based on port numbers In-Reply-To: <1176322159.21137.19.camel@benve-laptop> References: <1176322159.21137.19.camel@benve-laptop> Message-ID: <461D83F6.20608@andyfurniss.entadsl.com> Christian Benvenuti wrote: >> I've never managed to find a way to use the word tcp in a filter without >> getting an illegal match - I know it's in the help. > > The reason it does not work is that the keywords that refer to the L4 > protocols (such as tcp) are supposed to be used when: > 1) the u32 filter is inserted into a u32 hash table > AND > 2) you jump to the above hash table from a u32 filter configured with > the "offset" option. > > (unfortunately the U32 classifier is not well documented) Ahh thanks I'll have to try some hashing tests. Last time I tried I found that going above 512 buckets didn't work so I gave up and still haven't looked into turning things up (I assume that's possible) > >> If you want to match tcp use the ip protocol match >> >> tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match >> ip dport 0xXYZ 0xFFFF match ip protocol 0x06 0xff police ..... > > You should invert the order of the two "match" conditions: first you make > sure it is TCP, and then you test the destination port number. > Hmm - does that really matter - I did it that way because, judging by filter counters the first match that fails stops further matching. It depends on what you match and your traffic patterns, but it seemed like it would be more efficient. > This filter works in most cases, but not always: it does not take into > account the IP options. IP packets with options in the IP header will > not match. 
> The reason is that "ip dport" is equivalent to "offset 22" from the > beginning of the IP header. I wonder how many packets there are like that in the wild - I don't have much traffic to test. It's possible to make a fall through counter by making a match and not specifying a class/flowid, though ISTR seeing a patch recently that made me think that won't work anymore. Andy. From randywallacejr at gmail.com Thu Apr 12 05:34:13 2007 From: randywallacejr at gmail.com (Randy Wallace) Date: Thu Apr 12 05:34:24 2007 Subject: [LARTC] two routes, non-permanent higher proiority Message-ID: <861508be0704112034j53d18877w93afbae4594093b0@mail.gmail.com> I would write a script that would check for connectivity to the internet over the ethernet port for internet. If your slackware router is always connected, and communicating, with the satellite modem, though not always passing real internet traffic, you could leave the interface 'up'. From there, you could write a simple ping-check type script that would ping the satellite gateway, which would only be visible when the dish is communicating to the NOC. For example: if ping -I eth0 xx.xx.xx.xx is successful then ip route del default via xx.xx.xx.xx dev ppp0 ip route add default via xx.xx.xx.xx dev eth0 else ip route del default via xx.xx.xx.xx dev eth0 ip route add default via xx.xx.xx.xx dev ppp0 otherwise, if eth0 is not always talking to the modem (ethernet), then you could use ethtool to check for ethernet connectivity before trying to ping, i.e. if (ethtool eth0 | grep 'Link detected: ') == 'yes' then ip addr add xx.xx.xx.xx/xx brd + dev eth0 and whatever routes you need to get the satellite gateway. then run the ping script above. 
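
The ping check described above could be turned into a small script along these lines. This is only a sketch: the gateway addresses are placeholders from the documentation ranges (the thread elides the real ones as xx.xx.xx.xx), and the function names pick_uplink and set_default_route are made up for illustration. It uses "ip route replace" rather than the del/add pair so the box is never left without a default route.

```shell
#!/bin/sh
# Sketch of a ping-check failover between two default routes.
# SAT_GW and PPP_GW are placeholders (documentation addresses), not the
# real gateways, which are not given in the thread.
SAT_GW="${SAT_GW:-192.0.2.1}"      # satellite gateway, reached via eth0
PPP_GW="${PPP_GW:-198.51.100.1}"   # fallback gateway, reached via ppp0

# Print "eth0" if the satellite gateway answers pings sent out of eth0,
# otherwise print "ppp0".
pick_uplink() {
    if ping -I eth0 -c 3 -W 2 "$SAT_GW" >/dev/null 2>&1; then
        echo eth0
    else
        echo ppp0
    fi
}

# Point the default route at the chosen uplink.
# "ip route replace" adds the route or overwrites an existing default,
# so no separate "ip route del" is needed.
set_default_route() {
    case "$1" in
        eth0) ip route replace default via "$SAT_GW" dev eth0 ;;
        ppp0) ip route replace default via "$PPP_GW" dev ppp0 ;;
        *)    echo "unknown uplink: $1" >&2; return 1 ;;
    esac
}
```

Run from cron (or a loop) as root, the two pieces combine as: set_default_route "$(pick_uplink)". Three pings with a two second wait are arbitrary choices; a flaky satellite link may want more before switching.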
hope i helped a little, -Randy From bob.beers at gmail.com Thu Apr 12 14:45:37 2007 From: bob.beers at gmail.com (Bob Beers) Date: Thu Apr 12 14:46:07 2007 Subject: [LARTC] two routes, non-permanent higher proiority In-Reply-To: <861508be0704112034j53d18877w93afbae4594093b0@mail.gmail.com> References: <861508be0704112034j53d18877w93afbae4594093b0@mail.gmail.com> Message-ID: <4f6ba3b0704120545h28346036md19c3e2a361c1943@mail.gmail.com> On 4/11/07, Randy Wallace wrote: Hi Randy, thanks for the reply, I'll leave it intact, but I have a few comments/questions at the end. > I would write a script that would check for connectivity to the > internet over the ethernet port for internet. > > If your slackware router is always connected, and communicating, with > the satellite modem, though not always passing real internet traffic, > you could leave the interface 'up'. From there, you could write a > simple ping-check type script that would ping the satellite gateway, > which would only be visible when the dish is communicating to the NOC. > > For example: > > if > > ping -I eth0 xx.xx.xx.xx > > is successful then > > ip route del default via xx.xx.xx.xx dev ppp0 > ip route add default via xx.xx.xx.xx dev eth0 > > else > > ip route del default via xx.xx.xx.xx dev eth0 > ip route add default via xx.xx.xx.xx dev ppp0 > > otherwise, if eth0 is not always talking to the modem (ethernet), then > you could use ethtool to check for ethernet connectivity before trying > to ping, i.e. > > if > > (ethtool eth0 | grep 'Link detected: ') == 'yes' > > then > > ip addr add xx.xx.xx.xx/xx brd + dev eth0 > > and whatever routes you need to get the satellite gateway. > then run the ping script above. 
> > hope i helped a little, > > -Randy > This is a good plan, and I'm doing something like this already, but I was really hoping that there was some static way to set up two default routes, maybe two route tables even, and that the kernel would be smart enough to know if the satellite route was reachable, and prefer it based on some priority or metric setting. I think the satellite modem may be powered even if the dish is stowed, so the ethtool check of "Link detected" may be always true, so not much help. I suppose maybe this is what the dynamic routing protocols (OSPF?, or BGP?) was designed to handle? But I suppose a script from a cronjob would adjust the routing with only a -- worst case -- 60 second delay. Thanks, -Bob From tami at disconnected.de Thu Apr 12 19:26:21 2007 From: tami at disconnected.de (Paul Zirnik) Date: Thu Apr 12 19:26:34 2007 Subject: [LARTC] two routes, non-permanent higher proiority In-Reply-To: <4f6ba3b0704120545h28346036md19c3e2a361c1943@mail.gmail.com> References: <861508be0704112034j53d18877w93afbae4594093b0@mail.gmail.com> <4f6ba3b0704120545h28346036md19c3e2a361c1943@mail.gmail.com> Message-ID: <200704121926.21775.tami@disconnected.de> On Thursday 12 April 2007 14:45, Bob Beers wrote: > On 4/11/07, Randy Wallace wrote: > > Hi Randy, thanks for the reply, > I'll leave it intact, but I have a few comments/questions at the end. > > > I would write a script that would check for connectivity to the > > internet over the ethernet port for internet. > > > > If your slackware router is always connected, and communicating, with > > the satellite modem, though not always passing real internet traffic, > > you could leave the interface 'up'. From there, you could write a > > simple ping-check type script that would ping the satellite gateway, > > which would only be visible when the dish is communicating to the NOC. 
> > > > For example: > > > > if > > > > ping -I eth0 xx.xx.xx.xx > > > > is successful then > > > > ip route del default via xx.xx.xx.xx dev ppp0 > > ip route add default via xx.xx.xx.xx dev eth0 > > > > else > > > > ip route del default via xx.xx.xx.xx dev eth0 > > ip route add default via xx.xx.xx.xx dev ppp0 > > > > otherwise, if eth0 is not always talking to the modem (ethernet), then > > you could use ethtool to check for ethernet connectivity before trying > > to ping, i.e. > > > > if > > > > (ethtool eth0 | grep 'Link detected: ') == 'yes' > > > > then > > > > ip addr add xx.xx.xx.xx/xx brd + dev eth0 > > > > and whatever routes you need to get the satellite gateway. > > then run the ping script above. > > > > hope i helped a little, > > > > -Randy > > This is a good plan, and I'm doing something like this already, > but I was really hoping that there was some static way to > set up two default routes, maybe two route tables even, > and that the kernel would be smart enough to know if the > satellite route was reachable, and prefer it based on some > priority or metric setting. I think the satellite modem may be > powered even if the dish is stowed, so the ethtool check of > "Link detected" may be always true, so not much help. > I suppose maybe this is what the dynamic routing protocols > (OSPF?, or BGP?) was designed to handle? But I suppose > a script from a cronjob would adjust the routing with only > a -- worst case -- 60 second delay. You can setup two or more default routes, the kernel automaticaly switches to the next defaultroute after a timeout. 
The timeout can be set via /proc/sys/net/ipv4/inet_peer_gc_maxtime To switch back to the first route IMHO simply flush the routing cache (not tested) regards, Paul From tami at disconnected.de Thu Apr 12 19:37:10 2007 From: tami at disconnected.de (Paul Zirnik) Date: Thu Apr 12 19:37:18 2007 Subject: [LARTC] two routes, non-permanent higher proiority In-Reply-To: <4f6ba3b0704120545h28346036md19c3e2a361c1943@mail.gmail.com> References: <861508be0704112034j53d18877w93afbae4594093b0@mail.gmail.com> <4f6ba3b0704120545h28346036md19c3e2a361c1943@mail.gmail.com> Message-ID: <200704121937.10789.tami@disconnected.de> Ups, sorry wrong /proc value for controling the timeout :( Right one is /proc/sys/net/ipv4/route/gc_timeout regards, Paul From wong_powah at yahoo.ca Thu Apr 12 19:39:46 2007 From: wong_powah at yahoo.ca (PoWah Wong) Date: Thu Apr 12 19:39:53 2007 Subject: [LARTC] two NICs on the same subnet Message-ID: <758445.80815.qm@web56311.mail.re3.yahoo.com> What are the reasons that two NICs on the same computer are set to the same subnet? i.e. eth0 IP addresses is x.y.z.m and eth1 is x.y.z.n. Any websites describing these in details? http://lartc.org/lartc.html#LARTC.RPDB.MULTIPLE-LINKS "4.2. Routing for multiple uplinks/providers" have two cases (Split access and Load balancing) for two or more internet connections on the same computer but do not state explicitly that they can be set to the same subnet. The two cases requires creating two routing tables, one for each interface, and route the packets accordingly (using iproute2), don't they? These kernel options for either 2.4.x or 2.6.y should be enabled: CONFIG_IP_ADVANCED_ROUTER (Networking/IP: Advanced Router) and CONFIG_IP_MULTIPLE_TABLES (Networking/IP: policy routing) CONFIG_IP_ROUTE_MULTIPATH (NetworkingIP: equal cost multipath) Is setting different service (eg. ssh, ftp) for different interface which are on the same subnet a valid reason? 
From bob.beers at gmail.com Thu Apr 12 20:45:03 2007 From: bob.beers at gmail.com (Bob Beers) Date: Thu Apr 12 20:45:08 2007 Subject: [LARTC] two routes, non-permanent higher proiority In-Reply-To: <200704121937.10789.tami@disconnected.de> References: <861508be0704112034j53d18877w93afbae4594093b0@mail.gmail.com> <4f6ba3b0704120545h28346036md19c3e2a361c1943@mail.gmail.com> <200704121937.10789.tami@disconnected.de> Message-ID: <4f6ba3b0704121145u4033da0dqcca59ba02ae4ecfc@mail.gmail.com> > Right one is /proc/sys/net/ipv4/route/gc_timeout Thanks Paul, I'll check this out. From christian.benvenuti at libero.it Thu Apr 12 22:30:02 2007 From: christian.benvenuti at libero.it (Christian Benvenuti) Date: Thu Apr 12 22:18:31 2007 Subject: [LARTC] Re: Policing based on port numbers Message-ID: <1176409802.17972.61.camel@benve-laptop> Hi, >>> If you want to match tcp use the ip protocol match >>> >>> tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match >>> ip dport 0xXYZ 0xFFFF match ip protocol 0x06 0xff police ..... >> >> You should invert the order of the two "match" conditions: first you make >> sure it is TCP, and then you test the destination port number. > >Hmm - does that really matter - I did it that way because, judging by >filter counters the first match that fails stops further matching. It >depends on what you match and your traffic patterns, but it seemed like >it would be more efficient. When performances are critical it can make sense to test the destination port number first, you are right. However, since at offset 22 there is the TCP destination port number only when the transport protocol is TCP (otherwise there is something else), it would not be a bad idea to first test the transport layer protocol. This is just my personal preference. I am sure many people prefer your solution. 
One more reason why I _personally_ would prefer to test the protocol
first is that when you test dport first you also change the way the
filter's statistics are updated.
Here is an example.

# TCP DST PORT = 5000 --> Class 1:12
tc filter add dev eth1 parent 1:0 prio 3 protocol ip \
   u32 \
   match ip protocol 6 0xFF \
   match ip dport 5000 0xFFFF \
   flowid 1:12

The u32 classifier can return the number of successes for each match
condition:

# tc -s -d filter list dev eth1
filter parent 1: protocol ip pref 3 u32 fh 800::801 order 2049 key ht 800 bkt 0 flowid 1:12 (rule hit 0 success 0)
  match 00060000/00ff0000 at 8 (success 0 )   <------- Partial counters
  match 00001388/0000ffff at 20 (success 0 )  <-------    "       "

From the output above I can tell:
1) how many TCP packets have been tested by the filter (1st match)
2) how many TCP packets matched dport (2nd match)
3) how many packets matched both 1) and 2).
(In the simple example above 2) and 3) are the same)

If you invert the order of the match conditions, you can only tell:
1) how many packets matched the filter (i.e., both the protocol and dport)
You can not trust all the partial counters.

>> This filter works in most cases, but not always: it does not take into
>> account the IP options. IP packets with options in the IP header will
>> not match.
>> The reason is that "ip dport" is equivalent to "offset 22" from the
>> beginning of the IP header.
>
>I wonder how many packets there are like that in the wild - I don't have
> much traffic to test. It's possible to make a fall through counter by
>making a match and not specifying a class/flowid, though ISTR seeing a
>patch recently that made me think that won't work anymore.

ISTR?

An easy workaround (for detecting the packets with options) consists of
adding another match condition that first tests the IHL field of the
IP header and makes sure there are no option (i.e., IHL=5).
Another workaround consists of tagging the packets with iptables.
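
A sketch of the IHL workaround, assuming the ingress qdisc and police parameters from the original question. The helper name tcp_dport_filter is made up for illustration, and it only prints the tc command instead of running it. The "ihl" keyword is, as far as I know, accepted by the u32 "match ip" parser; "match u8 0x05 0x0f at 0" should be the equivalent raw form.

```shell
# Print a u32 ingress filter that polices TCP traffic to a given
# destination port, but only for packets whose IP header carries no
# options (IHL = 5), so that "ip dport" (fixed offset 22) really is the
# TCP destination port.
tcp_dport_filter() {
    dev="$1"    # interface carrying the ingress qdisc, e.g. eth0
    port="$2"   # TCP destination port to police
    echo tc filter add dev "$dev" parent ffff: protocol ip prio 50 u32 \
        match ip ihl 0x5 0xf \
        match ip protocol 6 0xff \
        match ip dport "$port" 0xffff \
        police rate 128kbit burst 10k drop flowid :1
}
```

For example, "tcp_dport_filter eth0 5000" prints the command for port 5000; piping the output to sh as root would install it. Packets that do carry IP options simply fail the first match and fall through to whatever filter comes next.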
The percentage of IP packets with options is small. However, the choice here is between a configuration that is always correct (and therefore easier to troubleshoot) and one that is not always correct. For example, an ISP should not take the 2nd option into consideration. Regards /Christian [http://benve.info] From rcook at wyrms.net Fri Apr 13 02:41:45 2007 From: rcook at wyrms.net (Robin Cook) Date: Fri Apr 13 02:42:11 2007 Subject: [LARTC] gre tunnel question In-Reply-To: References: Message-ID: <1176424905.31855.27.camel@localhost> Hello, I am trying to implement a Broadcast GRE Tunnel that is described at this link http://linux-ip.net/gl/ip-tunnels/node9.html but it doesnt seem to be working. I am seeing the GRE packets for both networks on both sniffers but the tcpdump i tun0 doesnt show any of the packets from the remote end getting there. Can you tell me if I have something misconfigured or if what I am trying to do is impossible. Ive listed all the configuration information under the ascii drawing Basically trying to get routing information from network A to network B without knowing what the ip address is for remote eth0 is, hence the multicast address, and keeping the overhead as low as possible. Thanks. 
               Router         1      2         Router
+---------+  +--------+     +---+  +---+     +--------+  +---------+
|network A|--| Net-99 |-----|INE|--|INE|-----| Net-77 |--|network B|
+---------+  +--------+  |  +---+  +---+  |  +--------+  +---------+
           eth1     eth0 | PT   CT CT  PT | eth0     eth1
                         |                |
                     +-------+        +-------+
                     |sniffer|        |sniffer|
                     +-------+        +-------+

Net99 eth0: 172.16.1.240/24
Net99 eth1: 99.99.99.99/8

Net77 eth0: 172.16.2.240/24
Net77 eth1: 77.77.77.77/8

INE - Inline Network Encryptor
INE1 PT: 172.16.1.1/24
INE1 CT: 10.0.0.1/24
INE2 PT: 172.16.2.1/24
INE2 CT: 10.0.0.2/24

Router Net99
============

Create Tunnel
-------------
modprobe ip-gre
ip tunnel add tun0 mode gre remote 239.0.0.1 local 172.16.1.240 dev eth0
ip addr add 10.20.1.1/24 dev tun0
ip link set tun0 up
ip link seg gre0 up
========================================================================
zebra.conf
----------
hostname net99
password itac
enable password itac

interface eth0
 description To KG-175B 789
 ip address 172.16.1.240/24
 multicast

interface eth1
 description To airborne network
 ip address 99.99.99.99/8
 multicast

interface tun0
 description tunnel for ospf
 ip address 10.20.1.1/24
 multicast

ip route 0.0.0.0/0 172.16.1.1
========================================================================
ospfd.conf
----------
router ospf
 ospf router-id 172.16.1.240
 ospf abr-type cisco
 redistribute connected
 network 10.20.1.0/24 area 0
 network 99.0.0.0/8 area 1
 area 1 stub
========================================================================

Router Net77
============

Create Tunnel
-------------
modprobe ip-gre
ip tunnel add tun0 mode gre remote 239.0.0.1 local 172.16.2.240 dev eth0
ip addr add 10.20.1.2/24 dev tun0
ip link set tun0 up
ip link set gre0 up
========================================================================
zebra.conf
----------
hostname net77
password itac
enable password itac

interface eth0
 description To KG-175E 23969
 ip address 172.16.2.240/24
 multicast

interface eth1
 description To airborne network
 ip address 77.77.77.77/8
multicast interface tun0 description tunnel for ospf ip address 10.20.1.2/24 multicast ip route 0.0.0.0/0 172.16.1.1 ======================================================================== ospf.conf --------- router ospf ospf router-id 172.16.2.240 ospf abr-type cisco redistribute connected network 10.20.1.0/24 area 0 network 77.0.0.0/8 area 1 area 1 stub ======================================================================== -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 827 bytes Desc: This is a digitally signed message part Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070412/cf5ba5b8/attachment-0001.pgp From alex at uh.cu Fri Apr 13 02:50:36 2007 From: alex at uh.cu (Alejandro Ramos Encinosa) Date: Fri Apr 13 02:49:07 2007 Subject: [LARTC] Re: tc questions In-Reply-To: <461C0B7A.8050601@andyfurniss.entadsl.com> References: <1175788853.3947.6.camel@benve-laptop> <200704092349.09428.alex@uh.cu> <461C0B7A.8050601@andyfurniss.entadsl.com> Message-ID: <200704130050.36834.alex@uh.cu> On Tuesday 10 April 2007 22:11, Andy Furniss wrote: > Alejandro Ramos Encinosa wrote: > > tc qdisc add dev eth1 parent 1:20 handle 120: sfq perturb 10 > > > > tc class add dev eth1 parent 1:20 classid 1:21 htb rate 49mbit > > This is a misconfiguration, it doesn't make sense to add sfq and another > htb class to 1:20. ...why? The case I am trying to deal with is an scenario where some traffic goes into 1:20 (something like the traffic from/to the subnet 10.6.70.0/24) and then, I want to shape specifically some other traffic type (for example, the ssh connections from/to subnet 10.6.70.0/24). Is there another way to do it? Please, take a in mind that (in my example) I want to enclose the whole traffic from/to the subnet 10.6.70.0/24 and from that traffic I want to give an special treatment to ssh traffic. > > Andy. Regards, Ale. -- Alejandro Ramos Encinosa Fac. 
Matemática Computación
Universidad de La Habana

From orrie at seznam.cz Fri Apr 13 08:12:30 2007
From: orrie at seznam.cz (Ales Klok)
Date: Sun Apr 22 01:07:44 2007
Subject: [LARTC] Re: tc questions
In-Reply-To: <200704130050.36834.alex@uh.cu>
References: <1175788853.3947.6.camel@benve-laptop> <200704092349.09428.alex@uh.cu> <461C0B7A.8050601@andyfurniss.entadsl.com> <200704130050.36834.alex@uh.cu>
Message-ID: <461F1F4E.4010606@seznam.cz>

Alejandro Ramos Encinosa wrote:
> On Tuesday 10 April 2007 22:11, Andy Furniss wrote:
>
>> Alejandro Ramos Encinosa wrote:
>>
>>> tc qdisc add dev eth1 parent 1:20 handle 120: sfq perturb 10
>>>
>>> tc class add dev eth1 parent 1:20 classid 1:21 htb rate 49mbit
>>>
>> This is a misconfiguration, it doesn't make sense to add sfq and another
>> htb class to 1:20.
>>
> ...why? The case I am trying to deal with is an scenario where some traffic
> goes into 1:20 (something like the traffic from/to the subnet 10.6.70.0/24)
> and then, I want to shape specifically some other traffic type (for example,
> the ssh connections from/to subnet 10.6.70.0/24). Is there another way to do
> it? Please, take a in mind that (in my example) I want to enclose the whole
> traffic from/to the subnet 10.6.70.0/24 and from that traffic I want to give
> an special treatment to ssh traffic.
>
>> Andy.
>>
> Regards, Ale.
>

You can't attach a qdisc to an HTB inner class, because only leaf classes
can hold a packet queue. You have to create an inner class with the
bandwidth allocation for 10.6.70.0/24 and attach child classes to it
(for SSH, RDP, ... whatever).
Please check the HTB manual and theory here
http://luxik.cdi.cz/~devik/qos/htb/ (especially section 3.
Sharing hierarchy) /ak From lists at andyfurniss.entadsl.com Fri Apr 13 21:16:26 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Sun Apr 22 01:12:10 2007 Subject: [LARTC] Re: tc questions In-Reply-To: <200704130050.36834.alex@uh.cu> References: <1175788853.3947.6.camel@benve-laptop> <200704092349.09428.alex@uh.cu> <461C0B7A.8050601@andyfurniss.entadsl.com> <200704130050.36834.alex@uh.cu> Message-ID: <461FD70A.9090902@andyfurniss.entadsl.com> Alejandro Ramos Encinosa wrote: > On Tuesday 10 April 2007 22:11, Andy Furniss wrote: >> Alejandro Ramos Encinosa wrote: >>> tc qdisc add dev eth1 parent 1:20 handle 120: sfq perturb 10 >>> >>> tc class add dev eth1 parent 1:20 classid 1:21 htb rate 49mbit >> This is a misconfiguration, it doesn't make sense to add sfq and another >> htb class to 1:20. > ...why? The case I am trying to deal with is an scenario where some traffic > goes into 1:20 (something like the traffic from/to the subnet 10.6.70.0/24) > and then, I want to shape specifically some other traffic type (for example, > the ssh connections from/to subnet 10.6.70.0/24). Is there another way to do > it? Please, take a in mind that (in my example) I want to enclose the whole > traffic from/to the subnet 10.6.70.0/24 and from that traffic I want to give > an special treatment to ssh traffic. >> Andy. > Regards, Ale. > You could add two htb classes under 1:20 and give one higher prio, or you could use the prio qdisc. If you really care about latency and have many bulk classes on a slow link then hfsc is better than htb. Linux hfsc could still be improved. sfq and b/pfifo should be added on leafs, so you could still use them if you created two classes under 1:20. 
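
As a rough sketch of the "two classes under 1:20" layout: the helper below (htb_split, a made-up name) only prints the tc commands. Everything numeric is an assumption, not something from the thread: 1:20 is taken to be an inner htb class of about 49mbit, ssh is taken to be TCP port 22 towards 10.6.70.0/24, and the rates and queue limit are placeholders to tune.

```shell
# Print the tc commands for splitting 1:20 into two leaf classes.
# Assumed, not from the thread: parent 1:20 has about 49mbit, ssh is
# TCP dport 22 towards 10.6.70.0/24, and all rates are placeholders.
htb_split() {
    dev="$1"
    # Two leaves under 1:20; the ssh leaf gets the better (lower) prio
    # when borrowing spare bandwidth from the parent.
    echo "tc class add dev $dev parent 1:20 classid 1:21 htb rate 1mbit ceil 49mbit prio 0"
    echo "tc class add dev $dev parent 1:20 classid 1:22 htb rate 48mbit ceil 49mbit prio 1"
    # Queues go on the leaves, not on 1:20 itself.
    echo "tc qdisc add dev $dev parent 1:21 handle 121: pfifo limit 30"
    echo "tc qdisc add dev $dev parent 1:22 handle 122: sfq perturb 10"
    # Filters point at the leaves: ssh first, then the rest of the subnet.
    echo "tc filter add dev $dev parent 1: protocol ip prio 1 u32 match ip dst 10.6.70.0/24 match ip protocol 6 0xff match ip dport 22 0xffff flowid 1:21"
    echo "tc filter add dev $dev parent 1: protocol ip prio 2 u32 match ip dst 10.6.70.0/24 flowid 1:22"
}
```

"htb_split eth1" prints six commands; any existing sfq on 1:20 would have to be removed first, and the output can then be piped to sh as root. The child rates are chosen to add up to the assumed parent rate, with ceil letting either leaf borrow whatever the other is not using.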
If you don't specify a qdisc on htb leafs you get pfifo - but the queue length will be chosen from the interface that htb is added to - 1000 for eth (possibly too long) or 3 on ppp/vlan (too short), so it's worth thinking about queue lengths, adding a qdisc and using the limit parameter common to b/pfifo and sfq. (default sfq is 128). Andy. From Jon.J.Flechsenhaar at boeing.com Wed Apr 18 00:20:35 2007 From: Jon.J.Flechsenhaar at boeing.com (Flechsenhaar, Jon J) Date: Sun Apr 22 01:26:36 2007 Subject: [LARTC] RSVP questions? Message-ID: <0E24ED2A7F9AA349A8633E6A56A64BE002239830@XCH-SW-2V1.sw.nos.boeing.com> Have a few generic questions about Kom-RSVP 1.) 11:00:05.175 WARNING: timer system overloaded, deviation is 30.080 sec - what timer is this referring to? System time? 2.) What is the relationship between the CBQ class created for RSVP traffic and the CBQ parameter that is in RSVP.conf? - ex. ### cbq qdisc ### tc qdisc add dev eth0 root handle 1: cbq bandwidth 10mbit avpkt 1000 mpu 64 ### cbq root rate ### tc class add dev eth0 parent 1:0 classid :1 est 1sec 8sec cbq bandwidth 10mbit rate 1mbit allot 1514 maxburst 50 avpkt 1000 ### extra class ### tc class add dev eth0 parent 1:1 classid :2 est 1sec 8sec cbq bandwidth 10mbit rate 100kbit allot 1514 weight 500Kbit prio 6 maxburst 50 avpkt 1000 ### rsvp class - classid acts as filter ??? ### tc class add dev eth0 parent 1:1 classid 1:7FFE cbq rate 800kbit bandwidth 10mbit allot 1514b avpkt 1000 maxburst 20 isolated RSVP.conf interface eth0 refresh 10000 tc cbq 1000000 2500 3.) Does the above statement in RSVP.conf reserve 1mbit/s for RSVP traffic? From andras at andras.net Wed Apr 18 07:53:25 2007 From: andras at andras.net (Andras Sarkozy) Date: Sun Apr 22 01:28:45 2007 Subject: [LARTC] ipp2p problem with kernel 2.6.20! 
In-Reply-To: <461C0E8D.60008@andyfurniss.entadsl.com> References: <1175596875.18427.7.camel@dunder> <1175784729.22316.3.camel@dunder> <461C0E8D.60008@andyfurniss.entadsl.com> Message-ID: <4625B255.1010800@andras.net> Hi, I'm using ipp2p v0.8.2 under kernel 2.6.20 - especially to shape my son's bittorrent usage :) I had the same problem first, then it took me several steps/attemps to make it work but it is now. Those are the steps I followed (probably an overkill but I wanted all features available for Shorewall and multiple ISP routing) 1, get the clean kernel source and symlink it to /usr/src/linux (it saves you some mistakes to properly identify the location of your kernel source) 2, get the latest iptables source, symlink to /usr/src/iptables 3, get the ipp2p source 4, get the latest patch-o-matic-ng 5, get Julian Anastasov's latest giant patch for 2.6.20 6, run the patch-o-matic/runme and apply all features all features 7, run the patch-o-matic/runme ipp2p 8, run make menuconfig 9, apply Julian's patch on the kernel 10, run make menuconfig & make 11, compile iptables against this kernel source (don't forget to set the proper PREFIX to your distro) 12, compile the ipp2p code against this kernel source 13, install the new kernel: make modules modules_install install 14, optionally you have to copy the ipt_ipp2p.ko to the proper modules folder and copy libipt_ipp2p.so to iptables's lib folder There could be steps not necessary in this process but it is working now for me now. Hope this helps, Andras Andy Furniss wrote: > Niclas Bentley wrote: >> Hi again, >> I wrote this a few days ago....Isn't there anybody using ipp2p and linux >> kernel 2.6.20? >> I think ipp2p would be very useful in order to identify bittorrent >> traffic... > > I don't - maybe you should cc the maintainer aswell. > > As for bittorrent - there may be easier (and fairer) ways, like per user > shaping or if you are doing QOS for yourself use a client that can bind > to a different IP. > > Andy. 
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>
>

From s.cramatte at wanadoo.fr Wed Apr 18 10:45:54 2007
From: s.cramatte at wanadoo.fr (Sébastien CRAMATTE)
Date: Sun Apr 22 01:29:36 2007
Subject: [LARTC] Can't change ipt_conntrack hashsize under debian sarge ???
Message-ID: <4625DAC2.3020209@wanadoo.fr>

Hello,

I've tried to change the ipt_conntrack hashsize and conntrack_max on my Debian sarge, but it doesn't work!

I've got 2876 MB available for conntrack, so I've calculated (according to some previous mail and http://www.wallfire.org/misc/netfilter_conntrack_perf.txt):

CONNTRACK_MAX = 2876 * 64 = 184064
HASHSIZE = 2876 * 8 = 23008

But the nearest power of 2 is 2^17 = 131072 ...
I'm not sure whether it is better to use 184064 or 131072. It seems that the netfilter algorithm is more efficient with a power-of-2 value?

I can set the CONNTRACK_MAX value but not the HASHSIZE ...
I've tried adding the hashsize= parameter in /etc/modules or in /etc/modprobe.d/arch/i386 and I've run "update-modules" ...
After rebooting the server, the value is still 8192 ???? Any ideas?

Moreover, I've read somewhere that it is better to raise HASHSIZE to a 1:2 ratio ... in my case 65440.
But how can I determine the best value? My computer is a P4 Hyper-Threading 3.6 GHz ...
Maybe I should use 131072 as CONNTRACK_MAX?

This server is a bridge that only does L7 QoS (filtering about 70 Mbit/s for more than 600 customers).
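As a check on the sizing arithmetic in this message, here is a small sketch. It only assumes the per-MB factors 64 and 8 that the post itself uses for a 32-bit machine; the function names are made up for illustration. One further assumption that is worth verifying with modinfo: if memory serves, the linked performance notes pass hashsize to the ip_conntrack module, so an options line naming a different module may simply have no effect.

```shell
#!/bin/sh
# Sizing arithmetic from the post, as a sketch (not an official formula).
RAM_MB=2876   # figure quoted in the post

# Factors 64 and 8 per MB are the ones the poster applied.
conntrack_max() { echo $(( $1 * 64 )); }
hashsize()      { echo $(( $1 * 8 )); }

# Largest power of two that does not exceed $1.
floor_pow2() {
    p=1
    while [ $(( p * 2 )) -le "$1" ]; do
        p=$(( p * 2 ))
    done
    echo "$p"
}

max=$(conntrack_max "$RAM_MB")
pow=$(floor_pow2 "$max")
echo "CONNTRACK_MAX                    = $max"
echo "HASHSIZE (max / 8)               = $(hashsize "$RAM_MB")"
echo "power of two below CONNTRACK_MAX = $pow"
echo "half of that (1:2 ratio)         = $(( pow / 2 ))"
```

Run as is, this prints 184064, 23008, 131072 and 65536, which are the candidate values being weighed in the post.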

# cat /etc/sysctl.conf
net.ipv4.netfilter.ip_conntrack_max = 131072

# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
131072

# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_buckets
8192

# cat /etc/modprobe.d/arch/i386
alias eth0 tg3
alias eth1 tg3
alias eth2 e1000
options ipt_conntrack hashsize=65440

Many thanks for your help

Regards

From e.janz at barceloviajes.com Wed Apr 18 13:06:22 2007
From: e.janz at barceloviajes.com (e.janz@barceloviajes.com)
Date: Sun Apr 22 01:29:41 2007
Subject: [LARTC] The "ip route get" returns wrong interface and gateway in an multipath routing environment
Message-ID:

Hi,

I think I found a problem in iproute or the Ubuntu kernel.

I think that "ip route get" returns the wrong interface and gateway in a multipath routing environment on Ubuntu 6.06 LTS. I also reported it to Launchpad as a bug:
https://bugs.launchpad.net/ubuntu/+source/iproute/+bug/105521

The easiest way to reproduce it is to boot an Ubuntu 6.06 LTS Live CD on a system with three interfaces and set up the environment as follows:

root@ubuntu:~# ifconfig eth0 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255 up
root@ubuntu:~# ifconfig eth1 192.168.1.1 netmask 255.255.255.0 broadcast 192.168.1.255 up
root@ubuntu:~# ifconfig eth2 192.168.2.1 netmask 255.255.255.0 broadcast 192.168.2.255 up
root@ubuntu:~# ip route add default nexthop via 192.168.1.254 dev eth1 weight 1 nexthop via 192.168.2.254 dev eth2 weight 1
root@ubuntu:~# ip rule add prio 111 from 192.168.1.1 table 111
root@ubuntu:~# ip rule add prio 222 from 192.168.2.1 table 222
root@ubuntu:~# ip route add default table 111 via 192.168.1.254
root@ubuntu:~# ip route add default table 222 via 192.168.2.254

root@ubuntu:~# ip route ls
192.168.2.0/24 dev eth2 proto kernel scope link src 192.168.2.1
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.1
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.1
default
        nexthop via 192.168.1.254 dev eth1 weight 1
        nexthop via 192.168.2.254 dev eth2 weight 1

root@ubuntu:~# ip rule ls
0:      from all lookup local
111:    from 192.168.1.1 lookup 111
222:    from 192.168.2.1 lookup 222
32766:  from all lookup main
32767:  from all lookup default

root@ubuntu:~# ip route ls table 111
default via 192.168.1.254 dev eth1

root@ubuntu:~# ip route ls table 222
default via 192.168.2.254 dev eth2

root@ubuntu:~# uname -a
Linux ubuntu 2.6.15-23-386 #1 PREEMPT Tue May 23 13:49:40 UTC 2006 i686 GNU/Linux

root@ubuntu:~# ip -V
ip utility, iproute2-ss041019

root@ubuntu:~# ip route get 1.2.3.1
1.2.3.1 via 192.168.2.254 dev eth2 src 192.168.1.1
    cache mtu 1500 advmss 1460 hoplimit 64

root@ubuntu:~# ip route get 1.2.3.2
1.2.3.2 via 192.168.2.254 dev eth2 src 192.168.2.1
    cache mtu 1500 advmss 1460 hoplimit 64

As you can see, "ip route get" always returns ".. via 192.168.2.254 dev eth2 ..." and only switches the source IP, but not the corresponding interface and gateway.

I saw this behaviour a long time ago on Debian, but by now, on Debian Woody, this works fine, at least on kernels 2.6.14 and 2.6.16 with the same iproute package, and the "ip route get" output also gives the right interface and gateway (I did not test it in more environments).

On an installed Ubuntu 6.06 LTS the behaviour is also wrong (I found this problem on a 2.6.15-28-server kernel with the same iproute package).

Any idea why this is happening?
Is this a regression?
Any suggestions?

Kind Regards,

Eric Janz

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070418/3e77e302/attachment.html

From christian.benvenuti at libero.it Thu Apr 19 22:50:24 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Sun Apr 22 01:35:17 2007
Subject: [LARTC] Re: tc questions
Message-ID: <1177015824.2662.112.camel@benve-laptop>

Hi Alejandro,

>> Alejandro Ramos Encinosa wrote:
>> > tc qdisc add dev eth1 parent 1:20 handle 120: sfq perturb 10
>> >
>> > tc class add dev eth1 parent 1:20 classid 1:21 htb rate 49mbit
>>
>> This is a misconfiguration, it doesn't make sense to add sfq and another
>> htb class to 1:20.
>...why? The case I am trying to deal with is a scenario where some traffic
>goes into 1:20 (something like the traffic from/to the subnet 10.6.70.0/24)
>and then, I want to shape specifically some other traffic type (for example,
>the ssh connections from/to subnet 10.6.70.0/24). Is there another way to do
>it? Please keep in mind that (in my example) I want to enclose the whole
>traffic from/to the subnet 10.6.70.0/24 and from that traffic I want to give
>a special treatment to ssh traffic.
>>
>> Andy.

I hope you already managed to find a solution to the above problem.

I think a question posted on this list does not get any answer in four main cases:

1- It is not formulated well.
2- No one knows the answer.
3- Everyone knows the answer and thinks someone else will reply sooner or later.
4- The same question has already been posted many times, and therefore a simple search in the list archive would be sufficient to find the answer/solution.

If you did find a solution to your problem and you think it could be useful to others too, I would kindly suggest that you share it with the list members (especially in case 2 above).

Anyway, let me try to answer your questions.

Andy is right: qdiscs attached to non-leaf classes are not used by HTB (even though you can configure them). Packets must be queued into the leaf classes' queues.
Non-leaf classes are used only for link-sharing (in the case of HTB).

Here are two examples of solutions to your problem:

1) You define two filters:
   - one for the SSH (to/from 10.6.70.0/24) traffic
   - one for the Not-SSH to/from 10.6.70.0/24 traffic.
   Both filters would map traffic to 1:20. The first filter must be tested first (therefore you should assign it a higher priority).
   The rate/ceil parameters configured on 1:20 would apply to all the traffic that goes to 1:20 (SSH and Not-SSH).
   By assigning a policer to the first filter you would be able to rate-limit the SSH traffic explicitly.

2) Instead of using the same class 1:20 for both SSH and Not-SSH traffic, you can create two classes under 1:20, say 1:21 for SSH and 1:22 for Not-SSH.
   In this case you would not need to attach any policer to the filters, because you can configure two independent rate/ceil parameters for the two classes.

Regards
/Christian
[http://benve.info]

From alex at uh.cu Sun Apr 15 06:25:36 2007
From: alex at uh.cu (Alejandro Ramos Encinosa)
Date: Sun Apr 22 01:43:26 2007
Subject: [LARTC] iptables marks
Message-ID: <200704150425.37045.alex@uh.cu>

Hi all!!

I was trying to figure out how iptables marks work. I thought that a packet could be marked just once in a chain (if the packet matches the criteria, then the action is applied, and that's all for the packet in this chain), but I was wrong. I did

iptables -t mangle -A INPUT -i eth0 -j MARK --set-mark 7
iptables -t mangle -A INPUT -i eth0 -j MARK --set-mark 8

and then I did `iptables -t mangle -L -x -v' and I got

Chain INPUT (policy ACCEPT 9565560 packets, 4954706655 bytes)
    pkts      bytes target     prot opt in     out     source               destination
      45    31630 MARK       0    --  eth0   any     anywhere             anywhere            MARK set 0x7
      45    31630 MARK       0    --  eth0   any     anywhere             anywhere            MARK set 0x8

Can someone tell me how I can be sure one packet will be marked just once in the chain?
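
A note on the counters above: MARK is a non-terminating target, so a packet that matches the first rule still goes on to the second one, and the last MARK it hits is the value it keeps. One common way to get "first match wins" is to put the marking rules in a user-defined chain and RETURN after each mark. The following is only a sketch: the chain name MARK_ONCE and the tcp/22 match are hypothetical, and the script prints the commands instead of running them unless IPT is set.

```shell
#!/bin/sh
# Dry run by default: prints the commands. Use IPT=iptables as root to apply.
IPT="${IPT:-echo iptables}"
CHAIN=MARK_ONCE   # hypothetical chain name, any unused name works

mark_once_rules() {
    $IPT -t mangle -N "$CHAIN"
    # First criterion (hypothetical: ssh): set the mark, then leave the chain
    # so that no later rule in it can overwrite the mark.
    $IPT -t mangle -A "$CHAIN" -p tcp --dport 22 -j MARK --set-mark 7
    $IPT -t mangle -A "$CHAIN" -p tcp --dport 22 -j RETURN
    # Fallback: only reached by packets that did not match above.
    $IPT -t mangle -A "$CHAIN" -j MARK --set-mark 8
    # Send everything arriving on eth0 through the chain.
    $IPT -t mangle -A INPUT -i eth0 -j "$CHAIN"
}

mark_once_rules
```

With this layout the packet counters of the two MARK rules should no longer be identical, since each packet reaches only one of them.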

From s.cramatte at wanadoo.fr Tue Apr 17 20:24:37 2007
From: s.cramatte at wanadoo.fr (Sébastien CRAMATTE)
Date: Sun Apr 22 01:43:48 2007
Subject: [LARTC] Can't change ipt_conntrack hashsize under debian sarge ???
Message-ID: <462510E5.90802@wanadoo.fr>

Hello,

I've tried to change the ipt_conntrack hashsize and conntrack_max on my Debian sarge, but it doesn't work!

I've got 2876 MB available for conntrack, so I've calculated (according to some previous mail and http://www.wallfire.org/misc/netfilter_conntrack_perf.txt):

CONNTRACK_MAX = 2876 * 64 = 184064
HASHSIZE = 2876 * 8 = 23008

But the nearest power of 2 is 2^17 = 131072 ...
I'm not sure whether it is better to use 184064 or 131072. It seems that the netfilter algorithm is more efficient with a power-of-2 value?

I can set the CONNTRACK_MAX value but not the HASHSIZE ...
I've tried adding the hashsize= parameter in /etc/modules or in /etc/modprobe.d/arch/i386 and I've run "update-modules" ...
After rebooting the server, the value is still 8192 ???? Any ideas?

Moreover, I've read somewhere that it is better to raise HASHSIZE to a 1:2 ratio ... in my case 65440.
But how can I determine the best value? My computer is a P4 Hyper-Threading 3.6 GHz ...
Maybe I should use 131072 as CONNTRACK_MAX?

This server is a bridge that only does L7 QoS (filtering about 70 Mbit/s for more than 600 customers).

# cat /etc/sysctl.conf
net.ipv4.netfilter.ip_conntrack_max = 131072

# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
131072

# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_buckets
8192

# cat /etc/modprobe.d/arch/i386
alias eth0 tg3
alias eth1 tg3
alias eth2 e1000
options ipt_conntrack hashsize=65440

Many thanks for your help

Regards

From nelsoneci at gmail.com Sun Apr 22 02:04:28 2007
From: nelsoneci at gmail.com (Nelson Castillo)
Date: Sun Apr 22 02:04:35 2007
Subject: [LARTC] iptables marks
In-Reply-To: <200704150425.37045.alex@uh.cu>
References: <200704150425.37045.alex@uh.cu>
Message-ID: <2accc2ff0704211704v2475f49cx95a25cad8c842ed8@mail.gmail.com>

> iptables -t mangle -A INPUT -i eth0 -j MARK --set-mark 7
> iptables -t mangle -A INPUT -i eth0 -j MARK --set-mark 8
>
> and then I did `iptables -t mangle -L -x -v' and I got
>
> Chain INPUT (policy ACCEPT 9565560 packets, 4954706655 bytes)
> pkts bytes target prot opt in out source destination
> 45 31630 MARK 0 -- eth0 any anywhere anywhere MARK set 0x7
> 45 31630 MARK 0 -- eth0 any anywhere anywhere MARK set 0x8
>
> Can someone tell me how can I be sure one packet will just be marked once into
> the chain?

I would try the following (untested) rules:

iptables -t mangle -A INPUT -i eth0 -j MARK --set-mark 7
iptables -t mangle -A INPUT -i eth0 -j RETURN
iptables -t mangle -A INPUT -i eth0 -j MARK --set-mark 8

I guess you will never get the second mark.

Regards,
Nelson.-

--
http://arhuaco.org
http://emQbit.com

From s.cramatte at wanadoo.fr Mon Apr 23 11:53:35 2007
From: s.cramatte at wanadoo.fr (Sébastien CRAMATTE)
Date: Mon Apr 23 11:54:24 2007
Subject: [LARTC] Debian sarge 2.6.18 Traffic Manager freeze under load ...
Message-ID: <462C821F.7050805@wanadoo.fr>

Hello,

I've got a Debian sarge 2.6.18 Traffic Manager set up as a bridge. This server is a P4 Hyper-Threading with 3 GB of memory.

Yesterday at 10:00pm I started to see in my syslog that ip_conntrack was full, and at 12:00pm the server froze ...
I should mention that I've already changed the values to CONNTRACK_MAX=131072 and HASHSIZE=65536.

I'm not sure that it is directly a conntrack problem ... could it be l7-filter, ipp2p or the ethernet bridge?

Any tips or ideas of what I should check?

Regards

From padam.singh at inventum.cc Mon Apr 23 13:35:56 2007
From: padam.singh at inventum.cc (Padam J Singh)
Date: Mon Apr 23 13:36:48 2007
Subject: [LARTC] iptables marks
In-Reply-To: <200704150425.37045.alex@uh.cu>
References: <200704150425.37045.alex@uh.cu>
Message-ID: <462C9A1C.1000509@inventum.cc>

An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070423/483d76ce/attachment.html

From salatiel.filho at gmail.com Mon Apr 23 15:48:12 2007
From: salatiel.filho at gmail.com (Salatiel Filho)
Date: Mon Apr 23 15:48:16 2007
Subject: [LARTC] Shape own router
In-Reply-To: <46051DF7.5030605@andyfurniss.entadsl.com>
References: <460309CA.8050801@andyfurniss.entadsl.com> <4603336E.5060501@andyfurniss.entadsl.com> <46051DF7.5030605@andyfurniss.entadsl.com>
Message-ID:

On 3/24/07, Andy Furniss wrote:
>
> Salatiel Filho wrote:
>
> >
> > Hi Andy , thanks again , but i am not understanding very well how to
> > do it [still newbie in this]. Let`s try to change to some real code
> > here. This is part of my setup to shape download:
> >
> > eth0 = EXTIF
> > eth1 = LOCALIF
> >
> > # SHAPE DOWNLOAD to LOCALNET NOT COMING FROM THE ROUTER ITSELF [samba
> > for example]
> > iptables -t mangle -s ! 192.168.254.254 -A POSTROUTING -o eth1 -j IMQ
> > --todev 1
>
> If you shape your wan - eth0 using ifb on ingress or imq from prerouting
> then you do not need any rules on eth1, the wan traffic will already be
> shaped.
>
> If you do not plan on separating users or interactive traffic from bulk
> traffic, it would actually be much nicer to use a policer for ingress
> wan traffic. Policing doesn't buffer traffic, it just drops it when a
> virtual buffer is full, so you won't be delaying interactive traffic by
> queuing with bulk.
>
> When you shape ingress wan, however you do it, you will need to
> sacrifice about 20% of your bandwidth, possibly more depending on
> needs/traffic/wan speed. Shaping from the wrong end of the bottleneck is
> better than doing nothing, but you can't do it perfectly.
>
> >
> > tc qdisc add dev imq1 root handle 1: htb default 3 r2q 1 //
> > DOWNLOAD SHAPER ROOT
> > tc class add dev imq1 parent 1: classid 1:1 htb rate 2048kbit quantum
> > 1500 //KNOWN TRAFFIC GOES HERE
> > tc class add dev imq1 parent 1: classid 1:3 htb rate 8kbit quantum
> > 1500 // DEFAULT CLASS VERYYYYY SLOWWWWWWW
>
> If this were eth rather than imq you would be sending arp to a slow
> class - not nice.
>
> Andy.
>
>

I was finally able to shape the router itself :) I changed the IMQ default behaviour to AFTER NAT in PREROUTING and BEFORE NAT in POSTROUTING. I do not know if my setup is common, but I have something like this:

DOWNLOAD LINK [1024K] -> HTB
    PEOPLE [500k-1024ceil]
        guy1 [100k-1000ceil]
        guy2 [100k-1000ceil]
        guy3 [100k-1000ceil]
        guy4 [100k-1000ceil]
        guy5 [100k-1000ceil]
    ROUTER [512k-1000ceil] -> router and P2P box, 24/7
        [Before I was able to shape the router, I had to hard-limit its download speed so it would not eat all the bandwidth; now I can let it borrow when the parent (DOWNLOAD LINK) has bandwidth available]
        -> I really need this 512k rate guaranteed for the router.
    DEFAULT [8k-8k] -> in theory should not be used by anyone :)

But now I have a question: when a packet enters HTB it will be queued, right? If so, is there a way to drop it instead when the class is over its ceil limit? I really do not want packets to be queued, because that will probably delay the interactive traffic.

Apart from that, my setup is working great; this is just a question :)

--
Regards,
Salatiel

"The greatest pleasure of an intelligent person is to play the idiot in front of an idiot who plays the intelligent one."
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070423/4172234b/attachment.htm

From gaio at sv.lnf.it Mon Apr 23 17:54:54 2007
From: gaio at sv.lnf.it (Marco Gaiarin)
Date: Mon Apr 23 17:56:12 2007
Subject: [LARTC] Ubuntu feisty: trouble with vmware-player host-guest access...
Message-ID: <20070423155452.GC6353@sv.lnf.it>

I'm posting this 'user' trouble here because I suppose it comes from some strange kernel (mis)configuration.

I've just upgraded from edgy to feisty, and the communication between guest and host in bridge mode stopped working. And in a way that is, at least to me, rather strange.

I can ping from host to guest:

root@host:~# ping guest
PING guest (10.5.2.120) 56(84) bytes of data.
64 bytes from guest (10.5.2.120): icmp_seq=1 ttl=128 time=3.00 ms
64 bytes from guest (10.5.2.120): icmp_seq=2 ttl=128 time=0.242 ms
64 bytes from guest (10.5.2.120): icmp_seq=3 ttl=128 time=0.224 ms

--- guest ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.224/1.158/3.008/1.308 ms

and also from guest to host (the guest is Windows XP, trust me please ;).

But I cannot access the host's samba share from the guest, nor the host's ssh server from the guest, nor the guest's TightVNC server from the host.

E.g., for putty:

root@host:~# tcpdump -i eth0 host guest and port ssh
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
17:44:19.144412 IP guest.1153 > host.ssh: F 2625691448:2625691448(0) ack 2930621145 win 64512
17:44:19.144932 IP host.ssh > guest.1153: F 39:39(0) ack 1 win 5840
17:44:19.146412 IP guest.1153 > host.ssh: . ack 1 win 64512
17:44:29.171327 IP guest.1154 > host.ssh: S 1530035078:1530035078(0) win 64512
17:44:29.171366 IP host.ssh > guest.1154: S 3017660595:3017660595(0) ack 1530035079 win 5840
17:44:29.171620 IP guest.1154 > host.ssh: . ack 1 win 64512
17:44:29.194460 IP host.ssh > guest.1154: P 1:39(38) ack 1 win 5840
17:44:32.196358 IP host.ssh > guest.1154: P 1:39(38) ack 1 win 5840
17:44:38.196078 IP host.ssh > guest.1154: P 1:39(38) ack 1 win 5840
17:44:50.195331 IP host.ssh > guest.1154: P 1:39(38) ack 1 win 5840

Can someone point me at least to the right place to debug this? ;)

See also the Ubuntu bug:

https://bugs.launchpad.net/ubuntu/+source/vmware-player/+bug/96445

Many thanks.

--
dott. Marco Gaiarin                             GNUPG Key ID: 240A3D66
Associazione ``La Nostra Famiglia''             http://www.sv.lnf.it/
Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
marco.gaiarin(at)sv.lnf.it      tel +39-0434-842711  fax +39-0434-842797

From gsomlo at gmail.com Mon Apr 23 21:57:41 2007
From: gsomlo at gmail.com (Gabriel Somlo)
Date: Mon Apr 23 21:57:48 2007
Subject: [LARTC] Multiple bands with equal priority ?
Message-ID: <2387247e0704231257h4196a31cg91207be9512bfced@mail.gmail.com>

I'm trying to build a wan latency test environment, where packets from different "remote" locations get delayed by different amounts of time, depending on which remote location we're pretending they are from.

Currently, I'm doing this using the 'prio' qdisc to obtain multiple bands, and hanging a different netem qdisc off each of the branches to delay packets, like this (assuming two remote locations):

# create three bands, and place all default traffic in the first one:
tc qdisc add dev eth0 root handle 1: \
    prio bands 3 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# the second and third bands of the prio above will delay packets
# by 50ms for one simulated remote location, and by 70ms for the other:
tc qdisc add dev eth0 parent 1:2 handle 20: netem delay 50ms
tc qdisc add dev eth0 parent 1:3 handle 30: netem delay 70ms

# use filters to place 'remote' traffic into the appropriate band:
tc filter add dev eth0 protocol ip parent 1:0 \
    prio 2 <...match for first remote location...> flowid 10:2
tc filter add dev eth0 protocol ip parent 1:0 \
    prio 3 <...match for second remote location...> flowid 10:3

This works great, but I can only really test one remote location at a time, because otherwise traffic sent to the second band will always starve out traffic sent to the third band. I actually don't mind default traffic having priority over that from my two 'remote' locations... :)

Anyway, I'm looking for a way to allow packets from the two remote locations to compete for bandwidth on an equal footing (after the appropriate delay has been applied, of course). So, instead of a prio multiband qdisc, I'd be interested in a round-robin one.

Can I accomplish this with CBQ? What would the tc commands have to look like? I'm getting slightly confused by the split/defmap syntax, and by trying to figure out when it's a class vs a qdisc I'm supposed to be dealing with... :(

I guess I should be looking at using the WRR qdisc, but I'd like to try everything else I can before falling through to adding an out-of-tree kernel module and patching tc...

Thanks,
Gabriel

From gaio at sv.lnf.it Tue Apr 24 09:35:03 2007
From: gaio at sv.lnf.it (Marco Gaiarin)
Date: Tue Apr 24 09:36:53 2007
Subject: [LARTC] Ubuntu feisty: trouble with vmware-player host-guest access...
In-Reply-To: <20070423154302.7ded3846@localhost.localdomain>
References: <20070423155452.GC6353@sv.lnf.it> <20070423154302.7ded3846@localhost.localdomain>
Message-ID: <20070424073501.GA15957@sv.lnf.it>

Hello! Stephen Hemminger wrote:

> Sorry, vmware is proprietary. Please reproduce problem with a stock kernel.org
> kernel. This is not a vendor support list.

I only supposed that this was not a vmware problem, or at least not entirely, because as you have seen some traffic passes (ping, at least).

Indeed vmware-player is proprietary; I was only looking for some information to debug this better.

Just for the list's reference, I've solved the puzzle.

When I first installed the box I put another PCI ethernet card in it, because the integrated one (RTL Giga) was not supported, and afterwards I completely forgot about that.

Now I've removed the PCI add-on card, enabled the integrated one, reworked the network configuration a bit, and everything works.

Go figure.

--
dott. Marco Gaiarin                             GNUPG Key ID: 240A3D66
Associazione ``La Nostra Famiglia''             http://www.sv.lnf.it/
Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
marco.gaiarin(at)sv.lnf.it      tel +39-0434-842711  fax +39-0434-842797

From Vincent_Gay at inmarsat.com Tue Apr 24 10:53:48 2007
From: Vincent_Gay at inmarsat.com (Vincent Gay)
Date: Tue Apr 24 10:53:58 2007
Subject: [LARTC] OSPF with Netem
Message-ID:

Hi all,

I am currently trying to emulate a satellite link, via Netem, on a testbed which is OSPF-enabled. I'd like to set up a Netem box between two routers. Since all routing between the routers is dynamic, I'm wondering how to set up OSPF on my Netem box.

Could someone tell me whether this is feasible and give me some guidelines on how to do it?

Thanks in advance,
Vincent.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070424/5595400f/attachment-0001.html

From lartc at mm.quex.org Tue Apr 24 11:05:32 2007
From: lartc at mm.quex.org (Michael Alger)
Date: Tue Apr 24 11:05:40 2007
Subject: [LARTC] Prioritizing based on HTTP Content-Type header
Message-ID: <20070424090532.GA29737@morose.quex.org>

I'm setting up a reverse-proxy on a limited-bandwidth pipe. The system is Debian "etch" on Linux 2.6, using squid as the proxy.

As we've only got 5mbit to play with, what I'd really like to do is set up priority levels based on the Content-Type of the (outgoing) response:

1. text/* gets highest priority (along with application/x-javascript).
2. image/* gets middle priority.
3. */* gets lowest priority.

Today I tried just using tc, with netfilter's "string" match module to select matching packets, with limited success: while it does match the packet containing the response header, additional packets in the same stream don't retain the fwmark (unsurprisingly).

Does anyone have any ideas of -- or even better, experience with -- a stack which can achieve this?
squid's built-in rate limiting doesn't have the concept of borrowing bandwidth, so that's out.

I'm open to pretty much anything: userspace proxies (either in front of or replacing squid) are fine.

Another option is simply to "punish" bandwidth hogs: the primary goal is to ensure downloads of large files don't slow down users that are browsing webpages. Possibly just using SFQ will work for this, but I'm not sure.

Any suggestions would be appreciated. I'm even open to changing platform (e.g. FreeBSD), but I'd prefer to stick with Debian as it's what I'm most comfortable with.

From alexandre at ondainternet.com.br Tue Apr 24 11:13:37 2007
From: alexandre at ondainternet.com.br (Alexandre J. Correa - Onda Internet)
Date: Tue Apr 24 11:13:32 2007
Subject: [LARTC] Prioritizing based on HTTP Content-Type header
In-Reply-To: <20070424090532.GA29737@morose.quex.org>
References: <20070424090532.GA29737@morose.quex.org>
Message-ID: <462DCA41.6030307@ondainternet.com.br>

You can use the string match together with CONNMARK (save/restore) to mark the packets!

The string match only hits the first packet that contains the string; the other packets of the same connection aren't marked. CONNMARK can track this by saving the mark on the connection and restoring it on later packets:

-j CONNMARK --restore-mark
-m string --string 'string' --algo bm -j MARK --set-mark 1
-m string --string 'string2' --algo bm -j MARK --set-mark 2
-m mark --mark 1 -j CONNMARK --save-mark
-m mark --mark 2 -j CONNMARK --save-mark

Michael Alger wrote:
> I'm setting up a reverse-proxy on a limited-bandwidth pipe. The
> system is Debian "etch" on Linux 2.6, using squid as the proxy.
>
> As we've only got 5mbit to play with, what I'd really like to do is
> set up priority levels based on the Content-Type of the (outgoing)
> response:
>
> 1. text/* gets highest priority (along with
> application/x-javascript).
> 2. image/* gets middle priority.
> 3. */* gets lowest priority.
>
> Today I tried just using tc, with netfilter's "string" match module
> to select matching packets, with limited success: while it does
> match the packet containing the response header, additional packets
> in the same stream don't retain the fwmark (unsurprisingly).
>
> Does anyone have any ideas of -- or even better, experience with --
> a stack which can achieve this? squid's built-in rate limiting
> doesn't have the concept of borrowing bandwidth, so that's out.
>
> I'm open to pretty much anything: userspace proxies (either in front
> of or replacing squid) are fine.
>
> Another option is simply to "punish" bandwidth hogs: the primary
> goal is to ensure downloads of large files don't slow down users
> that are browsing webpages. Possibly just using SFQ will work for
> this, but I'm not sure.
>
> Any suggestions would be appreciated. I'm even open to changing
> platform (e.g. FreeBSD), but I'd prefer to stick with Debian as it's
> what I'm most comfortable with.
>

--
Regards,
Alexandre J. Correa
Onda Internet
www.ondainternet.com.br
Linux User ID #142329

From s.cramatte at wanadoo.fr Tue Apr 24 15:53:16 2007
From: s.cramatte at wanadoo.fr (Sébastien CRAMATTE)
Date: Tue Apr 24 15:53:56 2007
Subject: [LARTC] IPMark won't compile on a vanilla 2.6.20 kernel
Message-ID: <462E0BCC.8020907@wanadoo.fr>

Hello,

IPMark won't compile on a vanilla 2.6.20 kernel.
I get this error during compilation under Debian sarge 3.1:

  CC [M]  net/ipv4/netfilter/ipt_TTL.o
  CC [M]  net/ipv4/netfilter/ipt_IPMARK.o
net/ipv4/netfilter/ipt_IPMARK.c: In function `target':
net/ipv4/netfilter/ipt_IPMARK.c:37: error: structure has no member named `nfmark'
net/ipv4/netfilter/ipt_IPMARK.c:38: error: structure has no member named `nfmark'
net/ipv4/netfilter/ipt_IPMARK.c: At top level:
net/ipv4/netfilter/ipt_IPMARK.c:77: warning: initialization from incompatible pointer type
net/ipv4/netfilter/ipt_IPMARK.c:81: warning: initialization from incompatible pointer type
make[4]: *** [net/ipv4/netfilter/ipt_IPMARK.o] Error 1
make[3]: *** [net/ipv4/netfilter] Error 2
make[2]: *** [net/ipv4] Error 2
make[1]: *** [net] Error 2
make[1]: Leaving directory `/usr/src/sarge-router-0.3.2/linux/linux-2.6.20.7'

Any ideas?

From s.cramatte at wanadoo.fr Tue Apr 24 16:00:23 2007
From: s.cramatte at wanadoo.fr (Sébastien CRAMATTE)
Date: Tue Apr 24 16:00:54 2007
Subject: [LARTC] Does someone have an nf-hipac patch for 2.6.20 kernel ?
Message-ID: <462E0D77.9060202@wanadoo.fr>

Hello,

Does someone have an nf-hipac patch for the 2.6.20 kernel?

Regards

From marek at piasta.pl Tue Apr 24 16:18:52 2007
From: marek at piasta.pl (Marek Kierdelewicz)
Date: Tue Apr 24 16:18:58 2007
Subject: [LARTC] IPMark won't compile on a vanilla 2.6.20 kernel
In-Reply-To: <462E0BCC.8020907@wanadoo.fr>
References: <462E0BCC.8020907@wanadoo.fr>
Message-ID: <20070424161852.4a27ad42@catlap>

>Hello,

Hi,

>IPMark won't compile on a vanilla 2.6.20 kernel
>I obtain this error during the compilation under debian sarge 3.1
>..
> CC [M] net/ipv4/netfilter/ipt_IPMARK.o
>net/ipv4/netfilter/ipt_IPMARK.c: In function `target':
>net/ipv4/netfilter/ipt_IPMARK.c:37: error: structure has no member

I think IPMARK hasn't been maintained for some time now. If you need IPMARK for shaping upload (the most probable scenario), then use IFB + hashing u32 filters instead. Works for me.

Regards,
Marek Kierdelewicz
KoBa ISP

From radu at securesystems.ro Wed Apr 25 01:47:13 2007
From: radu at securesystems.ro (Radu Oprisan)
Date: Wed Apr 25 01:47:26 2007
Subject: [LARTC] Does someone have an nf-hipac patch for 2.6.20 kernel ?
In-Reply-To: <462E0D77.9060202@wanadoo.fr>
References: <462E0D77.9060202@wanadoo.fr>
Message-ID: <462E9701.1080609@securesystems.ro>

Sébastien CRAMATTE wrote:
> Hello,
>
> Does someone have an nf-hipac patch for 2.6.20 kernel ?
>
> Regards
>
>

The last kernel I patched it against, and on which it worked, was 2.6.15.4. I tried it on 2.6.16.X; it compiled OK but did not work so well. From what I can see, nf-hipac seems to be more or less dead in the water, at least judging by the developers' website.
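
For the IPMARK thread above, the "IFB + hashing u32 filters" suggestion might look roughly like the sketch below. Everything concrete in it is an assumption: that ingress traffic of the LAN interface is already redirected to ifb0, that an HTB qdisc with handle 1: and one class per customer already exists there, and that the customers sit in the hypothetical network 192.168.0.0/24. The hash key is the last byte of the source address, so each host lands in its own bucket. Commands are printed rather than executed unless TC is set.

```shell
#!/bin/sh
# Dry run by default: prints the tc commands. Use TC=tc as root to apply.
TC="${TC:-echo tc}"
DEV=ifb0          # assumed: LAN ingress is redirected to this device
NET=192.168.0     # hypothetical /24 holding the customers

# Bucket (and class minor) for a host: its last octet written in hex.
bucket() { printf '%x' "$1"; }

hash_filters() {
    # Root u32 filter, then a 256-bucket hash table with handle 2:
    $TC filter add dev "$DEV" parent 1:0 prio 5 protocol ip u32
    $TC filter add dev "$DEV" parent 1:0 prio 5 handle 2: protocol ip u32 divisor 256
    # Hash on the last byte of the source address (offset 12 in the IP header).
    $TC filter add dev "$DEV" parent 1:0 prio 5 protocol ip u32 ht 800:: \
        match ip src "$NET.0/24" hashkey mask 0x000000ff at 12 link 2:
    # One rule per host, placed directly in that host's bucket.
    for host in "$@"; do
        b=$(bucket "$host")
        $TC filter add dev "$DEV" parent 1:0 prio 5 protocol ip u32 \
            ht "2:$b:" match ip src "$NET.$host" flowid "1:$b"
    done
}

hash_filters 10 123
```

The point of the hash table is that a packet is compared against one bucket instead of against every customer's rule, which is the same per-address classification IPMARK was being used for.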

From simo at mix4web.de Wed Apr 25 03:12:52 2007
From: simo at mix4web.de (Simo)
Date: Wed Apr 25 03:13:09 2007
Subject: [LARTC] problem with prio qdisc and tcng
Message-ID: <000001c786d6$d97c6500$8c752f00$@de>

Hello mailing list,

I have a problem with the prio qdisc and I don't know what is wrong in my configuration.

This is a sample configuration:

#include "fields.tc"
#include "ports.tc"

#define X16(i) i i i i i i i i i i i i i i i i

dev ppp0 {
    dsmark {
        prio(bands 6, priomap X16($be)) {
            class if ip_proto == IPPROTO_UDP;
            class if tcp_dport == PORT_TELNET;
            class if tcp_dport == PORT_HTTP;
            class if tcp_dport == PORT_SMTP;
            class if ip_dst == 10.0.10.10;
            $be = class(6);
        }
    }
}

The generated tc code looks like this:

# ================================ Device ppp0 ================================
tc qdisc add dev ppp0 handle 1:0 root dsmark indices 64 set_tc_index
tc qdisc add dev ppp0 handle 2:0 parent 1:0 prio bands 6 priomap 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
tc filter add dev ppp0 parent 2:0 protocol all prio 1 u32 match u8 0x11 0xff at 9 classid 2:1
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 1:0:0 u32 divisor 1
tc filter add dev ppp0 parent 2:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 1:0:0
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 1:0:1 u32 ht 1:0:0 match u16 0x17 0xffff at 2 classid 2:2
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 2:0:0 u32 divisor 1
tc filter add dev ppp0 parent 2:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 2:0:0
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 2:0:1 u32 ht 2:0:0 match u16 0x50 0xffff at 2 classid 2:3
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 3:0:0 u32 divisor 1
tc filter add dev ppp0 parent 2:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 3:0:0
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 3:0:1 u32 ht 3:0:0 match u16 0x19 0xffff at 2 classid 2:4
tc filter add dev ppp0 parent 2:0 protocol all prio 1 u32 match u32 0xa000a0a 0xffffffff at 16 classid 2:5
tc filter add dev ppp0 parent 1:0 protocol all prio 1 tcindex mask 0xfc shift 2

But when I execute the script and run "tc -s class show dev ppp0", I don't see any packets that have been enqueued or sent:

pc1:/etc/tcng# tc -s class show dev ppp0
class prio 2:1 parent 2:
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
class prio 2:2 parent 2:
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
class prio 2:3 parent 2:
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
class prio 2:4 parent 2:
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
class prio 2:5 parent 2:
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
class prio 2:6 parent 2:
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)

--
In a world without walls who needs gates and windows?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070425/a27e2b91/attachment.htm

From simo at mix4web.de Wed Apr 25 13:41:39 2007
From: simo at mix4web.de (Simo)
Date: Wed Apr 25 13:41:59 2007
Subject: [LARTC] HFSC with tcng
Message-ID: <000b01c7872e$b0213480$10639d80$@de>

Hello mailing list,

I don't know how to use the HFSC queuing discipline with the tcng configuration language. I always get this error:

syntax error near "hfsc"

Is it possible that tcng provides no support for the classful HFSC queuing discipline?

Please help!

Thanks

--
In a world without walls who needs gates and windows?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070425/df4047d6/attachment.html From diego.giardinetto at amugsiena.it Wed Apr 25 15:39:38 2007 From: diego.giardinetto at amugsiena.it (Diego Giardinetto [@AMUGSiena]) Date: Wed Apr 25 15:39:51 2007 Subject: [LARTC] PPPoE and shaping Message-ID: <21dfacd30704250639t1f5623fes7c786ed8d5fceeba@mail.gmail.com> Hi all, I have a little problem with my home-made slackware linux server. Here is the scenario: 1. I have a local wifi network 2. my server do masquerading and exit in internet via a PPPoE connection Goals: 1. not use SQUID 2. shaping the traffic with classes 3. emule connection must have minimum priority and a band-limit of 10KBytes/s in uplink (server--->internet) Any idea? Thx, Diego -- Diego Giardinetto Skype Name: cpuzorro MSN: cpuoverload@hotmail.it From simo at mix4web.de Wed Apr 25 16:09:32 2007 From: simo at mix4web.de (Simo) Date: Wed Apr 25 16:09:44 2007 Subject: AW: [LARTC] PPPoE and shaping In-Reply-To: <21dfacd30704250639t1f5623fes7c786ed8d5fceeba@mail.gmail.com> References: <21dfacd30704250639t1f5623fes7c786ed8d5fceeba@mail.gmail.com> Message-ID: <001a01c78743$58d71e00$0a855a00$@de> Hi Diego, for shaping, you can use the HTB queuing discipline in the linux traffic control. For the configuration you can use tc or tcng. More Informations you can find hier: http://www.linux-ip.net/ And hier you can find a lot of configuration samples: http://www.linux-ip.net/code/tcng/ And also in the project folder of tcng you can finde a lot of examples. 
For example (I've used the standard emule ports):

#include "fields.tc"
#include "ports.tc"

#define UPLOAD 600kbps

dev ppp0 { /* 1Mbit */
    egress {
        class ( <$emule> ) if tcp_dport == 4661 || udp_dport == 4665 || udp_dport == 4672;
        class ( <$other> ) if 1;
        htb () {
            class ( rate UPLOAD, ceil UPLOAD ) {
                $emule = class ( rate 10kBps, ceil UPLOAD );
                $other = class ( rate 520kbps, ceil UPLOAD ) { sfq(perturb 10s); }
            }
        }
    }
}

---------------------------------------------------------------------------
The tc code looks like this:
---------------------------------------------------------------------------

tc qdisc add dev ppp0 handle 1:0 root dsmark indices 4 default_index 0
tc qdisc add dev ppp0 handle 2:0 parent 1:0 htb
tc class add dev ppp0 parent 2:0 classid 2:1 htb rate 75000bps ceil 75000bps
tc class add dev ppp0 parent 2:1 classid 2:2 htb rate 10000bps ceil 75000bps
tc class add dev ppp0 parent 2:1 classid 2:3 htb rate 65000bps ceil 75000bps
tc qdisc add dev ppp0 handle 3:0 parent 2:3 sfq perturb 10
tc filter add dev ppp0 parent 2:0 protocol all prio 1 tcindex mask 0x3 shift 0
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 2 tcindex classid 2:3
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 1 tcindex classid 2:2
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 1:0:0 u32 divisor 1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 1:0:0
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 1:0:1 u32 ht 1:0:0 match u16 0x1235 0xffff at 2 classid 1:1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 2:0:0 u32 divisor 1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 u32 match u8 0x11 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 2:0:0
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 2:0:1 u32 ht 2:0:0 match u16 0x1239 0xffff at 2 classid 1:1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 3:0:0 u32 divisor 1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 u32 match u8 0x11 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 3:0:0
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 3:0:1 u32 ht 3:0:0 match u16 0x1240 0xffff at 2 classid 1:1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 u32 match u32 0x0 0x0 at 0 classid 1:2

----------------------------------------------------------------------------
In a world without walls who needs gates and windows?

From simo at mix4web.de Wed Apr 25 16:58:38 2007
From: simo at mix4web.de (Simo)
Date: Wed Apr 25 16:58:50 2007
Subject: AW: [LARTC] PPPoE and shaping
In-Reply-To: <21dfacd30704250725n3f8716ccjd2d99bb0edf0ff4f@mail.gmail.com>
References: <21dfacd30704250639t1f5623fes7c786ed8d5fceeba@mail.gmail.com> <001a01c78743$58d71e00$0a855a00$@de> <21dfacd30704250725n3f8716ccjd2d99bb0edf0ff4f@mail.gmail.com>
Message-ID: <001b01c7874a$3519c740$9f4d55c0$@de>

Hi Diego,

The traffic control should be done on your router. Only the router
knows both networks, the internet and the LAN.
That's why I don't think you will have problems with the masquerading.

But how do you use emule? Do you use NAT? Do you have any NAT rules
for emule in your iptables? (Otherwise emule will have a low ID.)

>And, can I create shaping classes based on source IP of the local
>network? I tried, but also here masquerading operation loses
>information about original local IP...

Yes, you can do this. You can create, for example, a class like this:

class if ip_dst == 192.168.0.3;

Simo

From arik.funke at gmx.de Wed Apr 25 17:09:01 2007
From: arik.funke at gmx.de (Arik Raffael Funke)
Date: Wed Apr 25 17:08:55 2007
Subject: [LARTC] Re: PPPoE and shaping
In-Reply-To: <21dfacd30704250639t1f5623fes7c786ed8d5fceeba@mail.gmail.com>
References: <21dfacd30704250639t1f5623fes7c786ed8d5fceeba@mail.gmail.com>
Message-ID:

Diego Giardinetto [@AMUGSiena] wrote:
> 2. my server do masquerading and exit in internet via a PPPoE connection
>
> Goals:
> 1. not use SQUID
> 2. shaping the traffic with classes
> 3. emule connection must have minimum priority and a band-limit of
> 10KBytes/s in uplink (server--->internet)

You say PPPoE: I assume you are talking about an ADSL, not an SDSL,
connection. If so, the "ADSL-optimizer" package by Jesper is the
technologically most advanced solution: http://www.adsl-optimizer.dk/
(at least it was when I last checked).

I have been running it for months now without the slightest hitch
(with the patches from Russell Stuart). It solved all my connection
problems.

Best regards,
Arik

From madhava.rayudu at gmail.com Thu Apr 26 04:14:49 2007
From: madhava.rayudu at gmail.com (Madhava Rayudu)
Date: Thu Apr 26 04:14:55 2007
Subject: [LARTC] Squid (delay pools) for HTTP .. rest is HTB..
Message-ID:

Sir,

I am very new to tc. Kindly help.

I have to distribute 2 MB to 300 users. Browsing should be very good.
Squid and a caching name server visibly improve the browsing
experience.

I need a script in which:

1. All HTTP traffic is redirected to Squid.
   (I will use iptables to redirect, and delay pools to limit the
   per-user and total bandwidth for Squid.)
2. The rest of the bandwidth is shaped per user/IP, up and down.

Kindly help me out.

Regards,
Rayudu.

From lartc at mm.quex.org Thu Apr 26 14:35:09 2007
From: lartc at mm.quex.org (Michael Alger)
Date: Thu Apr 26 14:35:24 2007
Subject: [LARTC] Prioritizing based on HTTP Content-Type header
In-Reply-To: <462DCA41.6030307@ondainternet.com.br>
References: <20070424090532.GA29737@morose.quex.org> <462DCA41.6030307@ondainternet.com.br>
Message-ID: <20070426123509.GA5277@morose.quex.org>

On Tue, Apr 24, 2007 at 06:13:37AM -0300, Alexandre J. Correa - Onda Internet wrote:
> You can use STRING + CONSAVE modules !!
>
> mark packets...
>
> because string match only "starter packet" ... the others packets from
> the same connection isn't marked.. consave can track this..
>
> -j CONNMARK --restore-mark
> -m string --string 'string' --algo bm -j MARK --set-mark 1
> -m string --string 'string2' --algo bm -j MARK --set-mark 2
> -m mark --mark 1 -j CONNMARK --save-mark
> -m mark --mark 2 -j CONNMARK --save-mark

I haven't fully tested the shaping setup, but it appears to be
classifying packets correctly.

One limitation is that it can't cope with SSL; fortunately that's not
a current requirement for us, but I probably will need to find a
solution for that at some point.

Anyway, thanks again.

From thuleau at gmail.com Thu Apr 26 16:26:46 2007
From: thuleau at gmail.com (Edouard Thuleau)
Date: Thu Apr 26 16:26:53 2007
Subject: [LARTC] Library TC
Message-ID: <81c11a560704260726j4ab4234xb6fa5b467c685d52@mail.gmail.com>

Hi all,

I am trying to write a C program (a CAC (Call Admission Control)
module), and I don't want to exec the tc command from my program when
I want to add, modify or delete a QoS rule.
I searched for a TC library and found two projects:

LQL (Linux QoS Library): http://www.coverfire.com/lql/ (C)
LTCM: http://hng.av.it.pt/~ltcmmm/ (C++)

Do you know these APIs, or others? If you know them, can you tell me
which one is better?

thanks,
Edouard.

From drumlesson at gmail.com Thu Apr 26 21:34:03 2007
From: drumlesson at gmail.com (terraja-based)
Date: Thu Apr 26 21:34:09 2007
Subject: [LARTC] HTB+SFQ
Message-ID: <823158cf0704261234p52f72f1brabc948ad4537e1ab@mail.gmail.com>

Hi folks,

I have a problem using HTB and SFQ.
The first script below, a simple configuration, works fine.
The second one does not work; there I added more code to classify the
traffic by protocol, HTTP and FTP in this case.
Can somebody point out the errors?
Thx in advance.

NOTICE: the IMQ device is associated with eth1, my external iface.
Script that works:

############################################
#!/bin/sh

ifconfig imq0 up
tc qdisc add dev imq0 handle 1: root htb default 1
tc class add dev imq0 parent 1: classid 1:1 htb rate 500kbit ceil 2000kbit
tc qdisc add dev imq0 parent 1:1 handle 2 sfq

iptables -t mangle -A PREROUTING -i eth1 -j IMQ --todev 0
tc filter add dev imq0 parent 1: prio 0 protocol ip handle 2 fw flowid 1:1
############################################

Script that does NOT work:

############################################
#!/bin/sh

ifconfig imq0 up
tc qdisc add dev imq0 handle 1: root htb default 1
tc class add dev imq0 parent 1: classid 1:1 htb rate 500kbit ceil 2000kbit

tc class add dev imq0 parent 1:1 classid 1:10 htb rate 100kbit ceil 2000kbit
tc class add dev imq0 parent 1:1 classid 1:20 htb rate 100kbit ceil 2000kbit

tc qdisc add dev imq0 parent 1:10 handle 2 sfq
tc qdisc add dev imq0 parent 1:20 handle 3 sfq

iptables -t mangle -A PREROUTING -i eth1 -j IMQ --todev 0
tc filter add dev imq0 parent 1:1 prio 0 protocol ip handle 2 fw flowid 1:10
tc filter add dev imq0 parent 1:1 prio 1 protocol ip handle 3 fw flowid 1:20
############################################

With the second script I would then still have to add the iptables
MARK rules at the end, but I did not do so because even a SHOW of the
qdiscs (tc qdisc show) does not show the traffic as classified. That
is, I was going to send the traffic of class 1:10 to HTTP and that of
class 1:20 to FTP, which is done with iptables; but again, I did not
do it because I do not see the traffic broken down in the qdisc
beforehand when I generate traffic with both protocols.
That is the issue: I cannot get the traffic classified so that I can
then mark it. That is the crux of the matter, as the old ladies used
to say...
Any ideas?
Needless to say, iptables, iproute and the kernel are correctly
patched and up to date; otherwise the modules would not even load, or
there would be an error.

--
terraja-based

From alex at uh.cu Fri Apr 27 20:42:01 2007
From: alex at uh.cu (Alejandro Ramos Encinosa)
Date: Fri Apr 27 20:42:26 2007
Subject: [LARTC] HTB+SFQ
In-Reply-To: <823158cf0704261234p52f72f1brabc948ad4537e1ab@mail.gmail.com>
References: <823158cf0704261234p52f72f1brabc948ad4537e1ab@mail.gmail.com>
Message-ID: <200704271842.01935.alex@uh.cu>

On Thursday 26 April 2007 19:34, terraja-based wrote:
> Hi folks,

Hi!

> I have a problem using HTB and SFQ.
> The first script below, a simple configuration, works fine.
> The second one does not work; there I added more code to classify the
> traffic by protocol, HTTP and FTP in this case.
> Can somebody point out the errors?
> Thx in advance.
>
> NOTICE: the IMQ device is associated with eth1, my external iface.
>
> Script that works:
>
> ############################################
> #!/bin/sh
>
> ifconfig imq0 up
> tc qdisc add dev imq0 handle 1: root htb default 1
> tc class add dev imq0 parent 1: classid 1:1 htb rate 500kbit ceil 2000kbit
> tc qdisc add dev imq0 parent 1:1 handle 2 sfq
>
> iptables -t mangle -A PREROUTING -i eth1 -j IMQ --todev 0
> tc filter add dev imq0 parent 1: prio 0 protocol ip handle 2 fw flowid 1:1
> ############################################

...could you tell me why you filter by mark 2? Are you trying to match
the packets that iptables left unmatched?
> Script that does NOT work:
>
> ############################################
> #!/bin/sh
>
> ifconfig imq0 up
> tc qdisc add dev imq0 handle 1: root htb default 1
> tc class add dev imq0 parent 1: classid 1:1 htb rate 500kbit ceil 2000kbit
>
> tc class add dev imq0 parent 1:1 classid 1:10 htb rate 100kbit ceil 2000kbit
> tc class add dev imq0 parent 1:1 classid 1:20 htb rate 100kbit ceil 2000kbit
>
> tc qdisc add dev imq0 parent 1:10 handle 2 sfq
> tc qdisc add dev imq0 parent 1:20 handle 3 sfq
>
> iptables -t mangle -A PREROUTING -i eth1 -j IMQ --todev 0
> tc filter add dev imq0 parent 1:1 prio 0 protocol ip handle 2 fw flowid 1:10
> tc filter add dev imq0 parent 1:1 prio 1 protocol ip handle 3 fw flowid 1:20
> ############################################

Hmm, do you really want these filters as children of 1:1 (the root's
child) instead of 1: (the root)? If you attach these filters to 1:1,
the traffic will not go through the tc tree: you need to redirect the
packets arriving at the root to one of the children.

> With the second script I would then still have to add the iptables
> MARK rules at the end, but I did not do so because even a SHOW of the
> qdiscs (tc qdisc show) does not show the traffic as classified. That
> is, I was going to send the traffic of class 1:10 to HTTP and that of
> class 1:20 to FTP, which is done with iptables; but again, I did not
> do it because I do not see the traffic broken down in the qdisc
> beforehand when I generate traffic with both protocols.
> That is the issue: I cannot get the traffic classified so that I can
> then mark it. That is the crux of the matter, as the old ladies used
> to say...
> Any ideas?
For some reason, when packets are sent to a child class via 'default',
they seem to go directly to the qdisc attached to that class, so
filters that have the default class as their parent will not work. I
recommend something like the rules below:

-----------------------------8<-----------------------8<----------------------------------
tc qdisc add dev imq0 handle 1: root htb default 30
tc class add dev imq0 parent 1: classid 1:1 htb rate 500kbit ceil 2000kbit

tc class add dev imq0 parent 1:1 classid 1:10 htb rate 100kbit ceil 2000kbit
tc class add dev imq0 parent 1:1 classid 1:20 htb rate 100kbit ceil 2000kbit
tc class add dev imq0 parent 1:1 classid 1:30 htb rate 300kbit ceil 2000kbit

tc qdisc add dev imq0 parent 1:10 handle 2 sfq
tc qdisc add dev imq0 parent 1:20 handle 3 sfq

iptables -t mangle -A PREROUTING -i eth1 -j IMQ --todev 0

tc filter add dev imq0 parent 1: prio 0 protocol ip handle 2 fw flowid 1:10
tc filter add dev imq0 parent 1: prio 1 protocol ip handle 3 fw flowid 1:20
----------------------------->8----------------------->8----------------------------------

> Needless to say, iptables, iproute and the kernel are correctly
> patched and up to date; otherwise the modules would not even load, or
> there would be an error.

PS: by the way, I guess you need to change the parameters of your 1:1
htb class to match the real bandwidth limits of the device (eth1 in
this case).

--
Alejandro Ramos Encinosa
Fac. Matemática Computación
Universidad de La Habana

From drumlesson at gmail.com Sat Apr 28 21:33:16 2007
From: drumlesson at gmail.com (terraja-based)
Date: Sat Apr 28 21:33:36 2007
Subject: [LARTC] Re: LARTC Digest, Vol 26, Issue 24
In-Reply-To: <20070428100006.2C6AE410C@outpost.ds9a.nl>
References: <20070428100006.2C6AE410C@outpost.ds9a.nl>
Message-ID: <823158cf0704281233v1f4bd80dg719a78eb779021e1@mail.gmail.com>

Alejandro,

I tried the script you gave me, and the problem persists.
Maybe the problem is in the iptables rules; I attach the complete
script below:

#####################
ifconfig imq0 up

tc qdisc add dev imq0 handle 1: root htb default 30
tc class add dev imq0 parent 1: classid 1:1 htb rate 500kbit ceil 2000kbit

tc class add dev imq0 parent 1:1 classid 1:10 htb rate 100kbit ceil 2000kbit
tc class add dev imq0 parent 1:1 classid 1:20 htb rate 100kbit ceil 2000kbit
tc class add dev imq0 parent 1:1 classid 1:30 htb rate 100kbit ceil 2000kbit

tc qdisc add dev imq0 parent 1:10 handle 2 sfq
tc qdisc add dev imq0 parent 1:20 handle 3 sfq

iptables -t mangle -A PREROUTING -i eth1 -j IMQ --todev 0

tc filter add dev imq0 parent 1: prio 0 protocol ip handle 2 fw flowid 1:10
tc filter add dev imq0 parent 1: prio 1 protocol ip handle 3 fw flowid 1:20
iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 80 -j MARK --set-mark 2
iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 20 -j MARK --set-mark 3
iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 21 -j MARK --set-mark 3
#####################

The traffic still goes out through the "default" qdisc (1:30) and is
not classified into the correct qdisc.
I tried an FTP transfer using TCP ports 20 and 21; this should use the
1:20 qdisc associated with "handle 3"... BUT IT DID NOT WORK!
Please help me!

--
terraja-based

From alex at uh.cu Sun Apr 29 00:12:45 2007
From: alex at uh.cu (Alejandro Ramos Encinosa)
Date: Sun Apr 29 00:11:23 2007
Subject: [LARTC] Re: LARTC Digest, Vol 26, Issue 24
In-Reply-To: <823158cf0704281233v1f4bd80dg719a78eb779021e1@mail.gmail.com>
References: <20070428100006.2C6AE410C@outpost.ds9a.nl> <823158cf0704281233v1f4bd80dg719a78eb779021e1@mail.gmail.com>
Message-ID: <200704282212.46731.alex@uh.cu>

On Saturday 28 April 2007 19:33, terraja-based wrote:
> [...]
> iptables -t mangle -A PREROUTING -i eth1 -j IMQ --todev 0
>
> tc filter add dev imq0 parent 1: prio 0 protocol ip handle 2 fw flowid 1:10
> tc filter add dev imq0 parent 1: prio 1 protocol ip handle 3 fw flowid 1:20
> iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 80 -j MARK --set-mark 2
> iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 20 -j MARK --set-mark 3
> iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 21 -j MARK --set-mark 3
> [...]
> The traffic still goes out through the "default" qdisc (1:30) and is
> not classified into the correct qdisc.

Hmm, you are trying to "redirect" all packets from eth1 to imq0, and
only then to mark the packets of HTTP and FTP connections. Well, I
think you need to change your configuration again: if you put
'-j IMQ --todev 0' as the first rule, all packets will match it and
will not continue through the chain, so any rule after that one will
never match a packet. You need to mark the packets first and send them
to the imq device afterwards. Maybe something like this:

--------------------------------8<-------------------------8<-----------------------------------
[...]
iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 80 -j MARK --set-mark 2
iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 20 -j MARK --set-mark 3
iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 21 -j MARK --set-mark 3
iptables -t mangle -A PREROUTING -i eth1 -j IMQ --todev 0

tc filter add dev imq0 parent 1: prio 0 protocol ip handle 2 fw flowid 1:10
tc filter add dev imq0 parent 1: prio 1 protocol ip handle 3 fw flowid 1:20
[...]
--------------------------------8<-------------------------8<-----------------------------------

PS: as far as I know, marks 0, 1 and 2 are iptables marks (reserved
marks), so if I were you, I would start marking with number 3 or
greater.

--
Alejandro Ramos Encinosa
Fac. Matemática Computación
Universidad de La Habana

From andreas at stapelspeicher.org Sun Apr 29 10:48:25 2007
From: andreas at stapelspeicher.org (Andreas Mueller)
Date: Sun Apr 29 10:48:33 2007
Subject: [LARTC] Re: LARTC Digest, Vol 26, Issue 24
In-Reply-To: <823158cf0704281233v1f4bd80dg719a78eb779021e1@mail.gmail.com>
References: <20070428100006.2C6AE410C@outpost.ds9a.nl> <823158cf0704281233v1f4bd80dg719a78eb779021e1@mail.gmail.com>
Message-ID: <20070429084825.GA3557@lintera.stapelspeicher.org>

Hello terraja-based,

terraja-based wrote:
[snip]
> iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 80 -j MARK --set-mark 2
> iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 20 -j MARK --set-mark 3
> iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 21 -j MARK --set-mark 3
[snip]
> The traffic still goes out through the "default" qdisc (1:30) and is
> not classified into the correct qdisc.
[snip]

the marks you set here will be gone as soon as the packet leaves;
connmark could do the trick here.
Still, matching --sport on the imq device should do the job as well,
at least for http at port 80.
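
The two options Andreas mentions might look roughly like the sketch below. It reuses eth1, the IMQ target and marks 2 and 3 from the scripts in this thread; the choice of the POSTROUTING hook for the connmark variant is an assumption, as is the idea that the box forwards or originates the outgoing connections. The script only prints the rules unless APPLY=1 is set:

```shell
#!/bin/sh
# Sketch only: eth1, the IMQ target and marks 2/3 are taken from the thread.
WAN=eth1

# Print each command unless APPLY=1 is set in the environment.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

# Variant A: packets coming back from web/FTP servers carry the well-known
# port as their SOURCE port, so match --sport on the incoming side.
mark_by_sport() {
    run iptables -t mangle -A PREROUTING -i "$WAN" -p tcp --sport 80 -j MARK --set-mark 2
    run iptables -t mangle -A PREROUTING -i "$WAN" -p tcp --sport 20:21 -j MARK --set-mark 3
    run iptables -t mangle -A PREROUTING -i "$WAN" -j IMQ --todev 0
}

# Variant B: tag the connection when it is opened and copy that tag onto
# every reply packet. As written this covers FTP control traffic only.
mark_by_connmark() {
    run iptables -t mangle -A POSTROUTING -o "$WAN" -p tcp --dport 80 -j CONNMARK --set-mark 2
    run iptables -t mangle -A POSTROUTING -o "$WAN" -p tcp --dport 21 -j CONNMARK --set-mark 3
    run iptables -t mangle -A PREROUTING -i "$WAN" -j CONNMARK --restore-mark
    run iptables -t mangle -A PREROUTING -i "$WAN" -j IMQ --todev 0
}

# Pick one variant; A is shown here.
mark_by_sport
```

Either way the fw filters on imq0 (handle 2 and handle 3) stay as they are in the scripts above.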
For ftp, passive-mode (data) connections will go to the default class,
as the server's port is chosen at runtime; to catch them, better use a
level-7 filter (e.g. http://sourceforge.net/projects/l7-filter/).

Bye, Andreas.

From andreas at stapelspeicher.org Sun Apr 29 11:00:30 2007
From: andreas at stapelspeicher.org (Andreas Mueller)
Date: Sun Apr 29 11:00:37 2007
Subject: [LARTC] HFSC with tcng
In-Reply-To: <000b01c7872e$b0213480$10639d80$@de>
References: <000b01c7872e$b0213480$10639d80$@de>
Message-ID: <20070429090030.GB3557@lintera.stapelspeicher.org>

Hi Simo,

Simo wrote:
> [...]
> I don't know how to use the HFSC queuing discipline with the tcng
> configuration language. I always get this error: syntax error near "hfsc"
> [...]
> Is it possible that tcng provides no support for the classful HFSC
> queuing discipline?
> [...]

no, there is no such support, and there might never be, because this
project is no longer under active development.

Andreas

From lsharpe at pacificwireless.com.au Mon Apr 30 05:18:23 2007
From: lsharpe at pacificwireless.com.au (Leigh Sharpe)
Date: Mon Apr 30 05:18:51 2007
Subject: [LARTC] Maximum number of tc handles?
Message-ID:

Hi all,

Can anybody tell me what the maximum number of handles is that I can
use when setting up qdiscs and classes in tc?

Regards,
Leigh

Leigh Sharpe
Network Systems Engineer
Pacific Wireless
Ph +61 3 9584 8966
Mob 0408 009 502
Helpdesk 1300 300 616
email lsharpe@pacificwireless.com.au
web www.pacificwireless.com.au

From alex at uh.cu Mon Apr 30 05:28:34 2007
From: alex at uh.cu (Alejandro Ramos Encinosa)
Date: Mon Apr 30 05:27:33 2007
Subject: [LARTC] Maximum number of tc handles?
In-Reply-To:
References:
Message-ID: <200704300328.34846.alex@uh.cu>

On Monday 30 April 2007 03:18, Leigh Sharpe wrote:
> Hi all,

Hi, Leigh!
> Can anybody tell me what the maximum number of handles is that I can
> use when setting up qdiscs and classes in tc?

Well, for EACH device, you have 2^16 possible values (up to 0xffff).

> Regards,
> Leigh

Regards,
Ale.

> Leigh Sharpe
> Network Systems Engineer
> Pacific Wireless
> Ph +61 3 9584 8966
> Mob 0408 009 502
> Helpdesk 1300 300 616
> email lsharpe@pacificwireless.com.au
> web www.pacificwireless.com.au

--
Alejandro Ramos Encinosa
Fac. Matemática Computación
Universidad de La Habana

From cbergstrom at netsyncro.com Tue May 1 11:39:59 2007
From: cbergstrom at netsyncro.com ("C. Bergström")
Date: Tue May 1 11:40:18 2007
Subject: [LARTC] Forwarding between untagged vlans
Message-ID: <46370AEF.7030305@netsyncro.com>

I'm trying to implement simple untagged VLANs on our switch and have
misconfigured something.

The ISP gw is on the default vlan1 (untagged).
Router eth1 is set up on the switch with default vlan1 and is a member
of vlan4.
eth0 is default vlan4 and connects to the clients, which are all
default members of vlan4.

eth0 is x.x.x.86/28. This is what the clients connect to as their gw
(no NAT).
eth1 is x.x.x.82/26
default route is .65/26 dev eth1

If a client is default vlan4 but also a member of vlan1, it all works.
As soon as I remove the client from vlan1, the router stops
forwarding. Is this to be expected, and how can I correct it?

I've tried adding rules like these for the test client, which is on .87:

# Trying to fix vlan
iptables -A FORWARD -i ${WAN} -d x.x.x.87 -o ${LAN} -j ACCEPT
iptables -A FORWARD -i ${LAN} -s x.x.x.87 -o ${WAN} -j ACCEPT

I see the packets from the LAN trying to get out, but on ingress I
don't see them.

Thanks in advance.
Christopher

From drumlesson at gmail.com Tue May 1 18:08:27 2007
From: drumlesson at gmail.com (terraja-based)
Date: Tue May 1 18:08:41 2007
Subject: [LARTC] Re: LARTC Digest, Vol 26, Issue 25
In-Reply-To: <20070429100006.A921D40AB@outpost.ds9a.nl>
References: <20070429100006.A921D40AB@outpost.ds9a.nl>
Message-ID: <823158cf0705010908u2493b0c1p180be1044a2de554@mail.gmail.com>

Hey Andreas,

how do I catch this traffic using the L7 filter? I've installed
l7-filter now, but I don't know how to use this kind of filter!
Can you help me?

Thx.-
Terraja-based

2007/4/29, lartc-request@mailman.ds9a.nl :
http://sourceforge.net/projects/l7-filter/). > > Bye, Andreas. > > > ------------------------------ > > Message: 4 > Date: Sun, 29 Apr 2007 11:00:30 +0200 > From: Andreas Mueller > Subject: Re: [LARTC] HFSC with tcng > To: lartc@mailman.ds9a.nl > Message-ID: <20070429090030.GB3557@lintera.stapelspeicher.org> > Content-Type: text/plain; charset=us-ascii > > Hi Simo, > > > > Simo wrote: > > [...] > > I don?t know how to use HFSC queuing discipline with tcng configuration > > language. I become always this error: syntax error near "hfsc" > > [...] > > Is it possible, that tcng provides no support for this classful hfcs > queuing > > discipline? > > [...] > > no, there is no such support and might never be, because this project is > no longer under active development. > > Andreas > > > ------------------------------ > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > > End of LARTC Digest, Vol 26, Issue 25 > ************************************* > -- terraja-based -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070501/6458af1d/attachment.html From madhava.rayudu at gmail.com Tue May 1 18:58:09 2007 From: madhava.rayudu at gmail.com (Madhava Rayudu) Date: Tue May 1 18:58:19 2007 Subject: [LARTC] TC & Squid Message-ID: Sir, I am very new Tc ...Kindly help.. I have to distribute 2 MB to 300 users... Browsing should be very good... Squid & Caching Name Server increases the browsing experience visibly. I need a script 1. All HTTP traffic is redirected to Squid. ( I will use Iptables to redirect and delay pools to limit per user bandwidth/total bandwidth for Squid ) 2. Rest of the bandwidth to shaped per user/ip up/down.. Kindly Help me out... Regards, Rayudu. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070501/d7815c4f/attachment.htm From lsharpe at pacificwireless.com.au Wed May 2 08:01:22 2007 From: lsharpe at pacificwireless.com.au (Leigh Sharpe) Date: Wed May 2 08:01:48 2007 Subject: [LARTC] Cbq and max latency Message-ID: Hi All, Is there any way to set the maximum latency on a cbq when it is overloaded? Or, for that matter, to query it? For example, I want to know how long (in seconds) a packet will stay in the queue before it is dropped, and I want to be able to adjust this figure. Regards, Leigh Leigh Sharpe Network Systems Engineer Pacific Wireless Ph +61 3 9584 8966 Mob 0408 009 502 Helpdesk 1300 300 616 email lsharpe@pacificwireless.com.au web www.pacificwireless.com.au -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070502/474b93a8/attachment.html From salatiel.filho at gmail.com Wed May 2 12:36:27 2007 From: salatiel.filho at gmail.com (Salatiel Filho) Date: Wed May 2 12:36:43 2007 Subject: [LARTC] tc u32 match !port Message-ID: How can i redirect all traffic that not come from port 80 to a flow ? i was thing about some like tc filter add dev imq1 parent 1: protocol ip prio 7 u32 match ip sport !80 ...... But this not work. Another doubt, if i have two rules that intersects , for example , one filter with u32 match ip src 10.10.10.10 flowid 1:10 and other with u32 match sport 80 0xffff flowid 1:11 , which one will work in case of a packet to 10.10.10.10 with sport 80 ??? []'s Salatiel "O maior prazer do inteligente ? bancar o idiota diante de um idiota que banca o inteligente". 
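On the "!80" question above: as far as I know, u32 has no negation operator, so one common workaround is to rely on filter priority. Match source port 80 first, then let a lower-priority catch-all take everything else. The sketch below is only an illustration under assumptions that are not from the original post: the class IDs 1:11 and 1:12 and the function name emit_filters are hypothetical, and by default it only prints the tc commands instead of running them.

```shell
#!/bin/sh
# Sketch: emulate "sport != 80" with two u32 filters ordered by prio.
# Assumptions (hypothetical, not from the thread): classes 1:11 and 1:12
# already exist on $DEV. TC defaults to "echo tc", so nothing is applied
# unless the script is run with TC=tc.
emit_filters() {
    tc_cmd="${TC:-echo tc}"
    dev="${DEV:-imq1}"

    # prio 1: TCP packets with source port 80 go to 1:11.
    # Note: "ip sport" assumes an IP header without options.
    $tc_cmd filter add dev "$dev" parent 1: protocol ip prio 1 u32 \
        match ip protocol 6 0xff match ip sport 80 0xffff flowid 1:11

    # prio 2: catch-all, only reached by packets the prio 1 filter missed,
    # i.e. everything that is not TCP from port 80.
    $tc_cmd filter add dev "$dev" parent 1: protocol ip prio 2 u32 \
        match ip dst 0.0.0.0/0 flowid 1:12
}

emit_filters
```

The same ordering idea covers the second question: when two filters overlap, the one with the lower prio number is tried first, so the more specific match should get the lower number.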
From alex at uh.cu Wed May 2 15:51:21 2007
From: alex at uh.cu (Alejandro Ramos Encinosa)
Date: Wed May 2 15:51:29 2007
Subject: [LARTC] Re: LARTC Digest, Vol 26, Issue 25
In-Reply-To: <823158cf0705010908u2493b0c1p180be1044a2de554@mail.gmail.com>
References: <20070429100006.A921D40AB@outpost.ds9a.nl> <823158cf0705010908u2493b0c1p180be1044a2de554@mail.gmail.com>
Message-ID: <200705021351.22056.alex@uh.cu>

On Tuesday 01 May 2007 16:08, terraja-based wrote:
> Hey Andreas, how i catch this traffic using L7 filter?, i've installed l7
> filter now, but i don't kwnow to use the kind of filter...!!!
> Can you help me?

Maybe you will like to visit http://l7-filter.sourceforge.net/
If you want to use layer7 module in kernel mode, then you should go to
http://l7-filter.sourceforge.net/HOWTO#Doing
but if you want to use it in user mode, then go to
http://l7-filter.sourceforge.net/HOWTO-userspace#Doing

> Thx.-
>
> Terraja-based
>
> 2007/4/29, lartc-request@mailman.ds9a.nl :
> > [snip]

--
Alejandro Ramos Encinosa
Fac. Matemática Computación
Universidad de La Habana

From madhava.rayudu at gmail.com Wed May 2 17:28:13 2007
From: madhava.rayudu at gmail.com (Madhava Rayudu)
Date: Wed May 2 17:28:20 2007
Subject: [LARTC] TC & Squid
In-Reply-To:
References:
Message-ID:

Sir, I am very new Tc ...Kindly help.. I have to distribute 2 MB to 300 users... Browsing should be very good...
Squid & Caching Name Server increases the browsing experience visibly.

I need a script:

1. All HTTP traffic is redirected to Squid. (I will use iptables to redirect, and delay pools to limit per-user/total bandwidth for Squid.)
2. The rest of the bandwidth is to be shaped per user/IP, up and down.

Kindly help me out...

Regards,
Rayudu.

From alex at uh.cu Wed May 2 17:20:58 2007
From: alex at uh.cu (Alejandro Ramos Encinosa)
Date: Wed May 2 17:43:28 2007
Subject: [LARTC] tc u32 match !port
In-Reply-To:
References:
Message-ID: <200705021520.58773.alex@uh.cu>

On Wednesday 02 May 2007 10:36, Salatiel Filho wrote:
> How can i redirect all traffic that not come from port 80 to a flow ?
>
> i was thing about some like
>
> tc filter add dev imq1 parent 1: protocol ip prio 7 u32 match ip sport
> !80 ......

Maybe you should try an iptables/tc solution:

    iptables -t ... -A ... -p tcp --sport ! 80 -j MARK --set-mark 5
    tc filter add dev imq1 parent 1: handle 5 fw flowid ...

> But this not work.
>
> Another doubt, if i have two rules that intersects , for example ,
> one filter with u32 match ip src 10.10.10.10 flowid 1:10
> and other with u32 match sport 80 0xffff flowid 1:11 , which one will
> work in case of a packet to 10.10.10.10 with sport 80 ???

Among the filters attached to the same tc node, those with the same priority are tried in the order you declared them. Maybe you want to do something like:

            |-------------|
            | 10.10.10.10 |
            |-------------|
              /         \
             /           \
    |---------|       |----------|
    | default |       | sport 80 |
    |---------|       |----------|

Then the traffic from 10.10.10.10 goes to the subtree root, and the traffic that also has port 80 as its source goes to the right child of the tree. Maybe the rules will look like the following:

    iptables -t mangle -A PREROUTING -s 10.10.10.10 -j MARK --set-mark 4
    ...
    # parent (node 10.10.10.10 in the figure)
    tc class add dev imq1 parent 1:1 classid 1:10 htb rate ...
    # "default" node
    tc class add dev imq1 parent 1:10 classid 1:11 htb rate ...
    # "sport 80" node
    tc class add dev imq1 parent 1:10 classid 1:12 htb rate ...
    ...
    # filter to match the traffic that will go to the "sport 80" node
    tc filter add dev imq1 protocol ip parent 1: prio 1 u32 match ip src 10.10.10.10 match ip sport 80 0xffff flowid 1:12
    # filter to match the rest of the traffic from 10.10.10.10 (going to "default")
    tc filter add dev imq1 protocol ip parent 1: prio 1 u32 match ip src 10.10.10.10 flowid 1:11

--
Alejandro Ramos Encinosa
Fac. Matemática Computación
Universidad de La Habana

From michael at hotplate.co.nz Thu May 3 03:22:52 2007
From: michael at hotplate.co.nz (Michael Fincham)
Date: Thu May 3 03:23:06 2007
Subject: [LARTC] HTB and burst...
Message-ID: <1178155372.3116.12.camel@michael-desktop>

Hey everyone,

For some reason my htb configuration never allows any class to burst up to its ceiling. Even when the link is only being utilised by one class, that class only ever gets its assigned rate, and exactly that assigned rate...

The hierarchy I have is 1: at the root with no default, then 1:2 and 1:3 under that, both with assigned rates, then 2: and 3: under those respectively, with defaults configured. Iptables marks the packets based on incoming interface; they then get filtered to either 1:2 or 1:3 and filtered again, shaped accordingly etc... All classes and qdiscs are HTB.

Any ideas anyone?

--
Michael Fincham

From simo at mix4web.de Thu May 3 09:29:30 2007
From: simo at mix4web.de (Simo)
Date: Thu May 3 09:29:46 2007
Subject: [LARTC] RSVP and TC
Message-ID: <000301c78d54$cace8590$606b90b0$@de>

Hello mailing list,

I'm writing documentation about what the Linux traffic control system can do, and I don't know if TC provides support for the RSVP protocol.
I know there is a filter with the same name, but does it implement the RSVP protocol defined in RFC 2205 and RFC 2209?

Can you help me please?

Thanks
Simo

--------------------------------
In a world without walls who needs gates and windows?

From simo at mix4web.de Thu May 3 09:41:36 2007
From: simo at mix4web.de (Simo)
Date: Thu May 3 09:41:55 2007
Subject: [LARTC] HFSC with tcng
Message-ID: <001101c78d56$7c3042f0$7490c8d0$@de>

Hi,

thanks for your answer! You said that the tcng tool is no longer under active development. What do you think about this tool? Is it a good way to configure the Linux traffic control system? You know, the tcng syntax is very intuitive and more comfortable than the tc configuration language. Do you already have any experience with the tcng tool?

Simo

-----Original Message-----
From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On behalf of Andreas Mueller
Sent: Sunday, 29 April 2007 11:01
To: lartc@mailman.ds9a.nl
Subject: Re: [LARTC] HFSC with tcng

Hi Simo,

Simo wrote:
> [...]
> I don't know how to use HFSC queuing discipline with tcng configuration
> language. I always get this error: syntax error near "hfsc"
> [...]
> Is it possible, that tcng provides no support for this classful hfsc queuing
> discipline?
> [...]

no, there is no such support and might never be, because this project is no longer under active development.
Andreas _______________________________________________ LARTC mailing list LARTC@mailman.ds9a.nl http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From lists at andyfurniss.entadsl.com Thu May 3 14:41:27 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Thu May 3 14:41:31 2007 Subject: [LARTC] Multiple bands with equal priority ? In-Reply-To: <2387247e0704231257h4196a31cg91207be9512bfced@mail.gmail.com> References: <2387247e0704231257h4196a31cg91207be9512bfced@mail.gmail.com> Message-ID: <4639D877.6090803@andyfurniss.entadsl.com> Gabriel Somlo wrote: > I'm trying to build a wan latency test environment, where packets > from different "remote" locations get delayed by different amounts > of time, depending on which remote location we're pretending they > are from. > > Currently, I'm doing this using the 'prio' qdisc to obtain multiple > bands, and hanging a different netem qdisc off each of the branches > to delay packets, like this (assuming two remote locations): > > # create three bands, and place all default traffic in the first one: > tc qdisc add dev eth0 root handle 1: \ > prio bands 3 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > > # the second and third bands of the prio above will delay packets > # by 50ms for one simulated remote location, and by 70ms for the other: > tc qdisc add dev eth0 parent 1:2 handle 20: netem delay 50ms > tc qdisc add dev eth0 parent 1:3 handle 30: netem delay 70ms > > # use filters to place 'remote' traffic into the appropriate band: > tc filter add dev eth0 protocol ip parent 1:0 \ > prio 2 <...match for first remote location...> flowid 10:2 > tc filter add dev eth0 protocol ip parent 1:0 \ > prio 3 <...match for second remote location...> flowid 10:3 > > This works great, but I can only really test one remote location at a > time, because otherwise traffic sent to the second band will always > starve out traffic sent to the third band. 
I actually don't mind > default traffic having priority over that from my two 'remote' > locations... :) > > Anyway, I'm looking for a way to allow packets from the two remote > locations to compete for bandwidth on equal footing (after the > appropriate delay has been applied, of course). So, instead of > a prio multiband qdisc, I'd be interested in a round-robin one. > > Can I accomplish this with CBQ ? What would the tc commands have > to look like -- I'm getting slightly confused by the split/defmap > syntax, and by trying to figure out when it's a clas vs a qdisc I'm > supposed to be dealing with... :( > > I guess I should be looking at using the WRR qdisc, but I'd like to > try everything else I can before falling through to adding an > out-of-tree kernel module and patching tc... Hmm I didn't think that prio would behave this way unless the eth was backlogged - but then I haven't tried this setup. I would use ifb - if the remote wans are real and all you need to do is add latency then just netem on roots, if you need to simulate the low bandwidth as well, then there is an example of using tbf + netem on the netem site - still use an ifb per class. Andy. From lists at andyfurniss.entadsl.com Thu May 3 14:57:20 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Thu May 3 14:57:20 2007 Subject: [LARTC] tc u32 match !port In-Reply-To: References: Message-ID: <4639DC30.4000901@andyfurniss.entadsl.com> Salatiel Filho wrote: > How can i redirect all traffic that not come from port 80 to a flow ? > > i was thing about some like > > tc filter add dev imq1 parent 1: protocol ip prio 7 u32 match ip sport > !80 ...... > > But this not work. > > Another doubt, if i have two rules that intersects , for example , > one filter with u32 match ip src 10.10.10.10 flowid 1:10 > and other with u32 match sport 80 0xffff flowid 1:11 , which one will > work in case of a packet to 10.10.10.10 with sport 80 ??? 
You need to use prio to order the rules - anything after a rule that matches port 80 will be ! 80 - you cannot make a rule that negates matches directly. If the structure of your htb etc is deep you can make filters attach to parents other than root, but you need to filter the traffic to those flowids first. You can match more than one thing with one filter rule so you can match prio X src ip and 80 then follow with prio (X+1) src ip. Andy. From lists at andyfurniss.entadsl.com Thu May 3 14:59:51 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Thu May 3 14:59:52 2007 Subject: [LARTC] HTB and burst... In-Reply-To: <1178155372.3116.12.camel@michael-desktop> References: <1178155372.3116.12.camel@michael-desktop> Message-ID: <4639DCC7.2050407@andyfurniss.entadsl.com> Michael Fincham wrote: > Hey everyone, > > For some reason my htb configuration isn't allowing any class to burst > up to its ceiling ever, even when the link is only being utilised by one > class that class only ever gets its assigned rate and exactly that > assigned rate... > > The hierarchy I have is 1: at the root with no default, then 1:2 and 1:3 > under that, both with assigned rates, then 2: and 3: under those > respectively with defaults configured. Iptables marks the packets based > on incoming interface which then get filtered to either 1:2 or 1:3 and > filtered again, shaped accordingly etc... All classes and qdiscs are HTB > > Any ideas anyone? > It could be to do with clock Hz - your burst needs to be large enough to reach rate/ceil per tick. Andy. From michael at hotplate.co.nz Thu May 3 22:44:17 2007 From: michael at hotplate.co.nz (Michael Fincham) Date: Thu May 3 22:44:26 2007 Subject: [LARTC] HTB and burst... 
In-Reply-To: <0E24ED2A7F9AA349A8633E6A56A64BE0027A82A1@XCH-SW-2V1.sw.nos.boeing.com> References: <1178155372.3116.12.camel@michael-desktop> <0E24ED2A7F9AA349A8633E6A56A64BE0027A82A1@XCH-SW-2V1.sw.nos.boeing.com> Message-ID: <1178225057.13679.0.camel@localhost> It looks as though I may have had the hierarchy wrong... I had a class with a qdisc as a child then all my classes as children of the qdisc... now borrowing allowed as they're all root qdiscs. -Michael On Thu, 2007-05-03 at 08:53 -0700, Flechsenhaar, Jon J wrote: > I would need to see your actual script to say for sure. > > > Jon Flechsenhaar > Boeing WNW Team > Network Services > (714)-762-1231 > 202-E7 > > -----Original Message----- > From: Michael Fincham [mailto:michael@hotplate.co.nz] > Sent: Wednesday, May 02, 2007 6:23 PM > To: lartc@mailman.ds9a.nl > Subject: [LARTC] HTB and burst... > > Hey everyone, > > For some reason my htb configuration isn't allowing any class to burst > up to its ceiling ever, even when the link is only being utilised by one > class that class only ever gets its assigned rate and exactly that > assigned rate... > > The hierarchy I have is 1: at the root with no default, then 1:2 and 1:3 > under that, both with assigned rates, then 2: and 3: under those > respectively with defaults configured. Iptables marks the packets based > on incoming interface which then get filtered to either 1:2 or 1:3 and > filtered again, shaped accordingly etc... All classes and qdiscs are HTB > > Any ideas anyone? > > -- > Michael Fincham > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From forum at crafta.com Thu May 3 23:20:56 2007 From: forum at crafta.com (User of web Forum. Crafta.com) Date: Thu May 3 23:21:12 2007 Subject: [LARTC] what gnu-linux distro do you guys recommend? 
Message-ID: <463A5238.1090001@crafta.com> Hi, what gnu-linux distro do you guys recommend in order to get rock solid runing the L7-filter? or what are probed to runt it? Thanks in advance Aldo -- Live as a tortoise. and rate my mullet: http://www.ratemymullet.com/ From matt at acm.cs.uic.edu Fri May 4 00:36:10 2007 From: matt at acm.cs.uic.edu (matt@acm.cs.uic.edu) Date: Fri May 4 00:35:32 2007 Subject: [LARTC] how to prioritize by client ip instead of protocol Message-ID: <2041.24.192.89.166.1178231770.squirrel@acm.cs.uic.edu> i have been googling and reading the docs on how to set up a fair que that gives each user or client ip equal bandwidth. most, if not all, i find are examples how to prioritize by the type of traffic such as ssh, http, ftp, smtp, and various p2p. is it possible to set up cues for each ip on the subnet where each user gets a fair turn. i would probably have to set the max bandwidth and total que size to avoid latency. stochastic fair queuing seems like it might work, but all i find on that is to mark by protocol. also, the number of ip's on the subnet is dynamic. i would rather not create a static number of que's for each ip. could anyone give me some ideas, so some better search ideas for google, or a link to a manual that might explain what i was looking for. again all i find is how to que or prioritize by protocol, which i do not want to do. thanks matt From alejandro_aero at yahoo.es Fri May 4 01:27:58 2007 From: alejandro_aero at yahoo.es (Alejandro Lorenzo Gallego) Date: Fri May 4 01:28:32 2007 Subject: [LARTC] how to prioritize by client ip instead of protocol In-Reply-To: <2041.24.192.89.166.1178231770.squirrel@acm.cs.uic.edu> References: <2041.24.192.89.166.1178231770.squirrel@acm.cs.uic.edu> Message-ID: <200705040128.11338.alejandro_aero@yahoo.es> El Friday 04 May 2007 00:36:10 matt@acm.cs.uic.edu escribi?: > i have been googling and reading the docs on how to set up a fair que that > gives each user or client ip equal bandwidth. 
most, if not all, i find > are examples how to prioritize by the type of traffic such as ssh, http, > ftp, smtp, and various p2p. is it possible to set up cues for each ip on > the subnet where each user gets a fair turn. i would probably have to > set the max bandwidth and total que size to avoid latency. stochastic > fair queuing seems like it might work, but all i find on that is to mark > by protocol. > > also, the number of ip's on the subnet is dynamic. i would rather not > create a static number of que's for each ip. > > could anyone give me some ideas, so some better search ideas for google, > or a link to a manual that might explain what i was looking for. again > all i find is how to que or prioritize by protocol, which i do not want to > do. > I think you should give ESFQ [http://fatooh.org/esfq-2.6/] a ride looks exactly what you need. I am using it in a setup (~150 clients) and it works well IF you want to give all clients the same amount of bandwidth > thanks > > matt > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070504/59249b86/attachment.pgp From lartc at mm.quex.org Fri May 4 15:45:46 2007 From: lartc at mm.quex.org (Michael Alger) Date: Fri May 4 15:46:01 2007 Subject: [LARTC] Forwarding between untagged vlans In-Reply-To: <46370AEF.7030305@netsyncro.com> References: <46370AEF.7030305@netsyncro.com> Message-ID: <20070504134546.GA22302@morose.quex.org> On Tue, May 01, 2007 at 12:39:59PM +0300, "C. Bergstr?m" wrote: > I'm trying to implement simple untagged vlans on our switch and > have misconfigured something.. > > ISP gw is on the default vlan1 (untagged) > > Router > eth1 is setup on the switch with default vlan1 and member of vlan4. 
> eth0 is default vlan4 which connects to the clients that are all > default members of vlan4 Just to clarify, are the VLANs configured on your switch or are you doing some funny thing on the router? This reply is assuming it's the switch which handles VLANs. > eth0 is x.x.x.86/28 This is what clients are connecting to as their gw.. > (no nat) > eth1 is x.x.x.82/26 > > default route is .65/26 dev eth1 > > If client is default vlan4, but a member of vlan1 then it all works.. > As soon as I remove client from being a member of vlan1.. The router > stops forwarding. Is this to be expected and how can I correct this? > > I've tried adding a rule like this for the test client which is on .87 > # Trying to fix vlan > iptables -A FORWARD -i ${WAN} -d x.x.x.87 -o ${LAN} -j ACCEPT > iptables -A FORWARD -i ${LAN} -s x.x.x.87 -o ${WAN} -j ACCEPT > > I see the packets from the lan trying to get out, but on ingress I don't > see them.. Your WAN interface shouldn't need to be able to see both VLANs; the point of the router is to move packets between two different networks. Are the hosts on the WAN side using your router's eth1 (.82/26) as their gateway to your LAN network (/28)? It sounds like they're directly sending replies to the clients, rather than via the router. Just to clarify, this is what I think you're doing: 1. You have an internal network connected to a switch, along with a router which is their default gateway, also connected to the same switch. 2. This router has a second interface, connected to a different switch, which has some stuff connected to it; in particular, your ISP's default gateway is connected to this switch. (Possibly you have other servers in a DMZ type setup or something?) 3. Since you're using VLANs, they're actually the same physical switch; but the ports used by the internal network belong to one VLAN, and the ports used by eth1 and the upstream gateway are on a different VLAN. Same thing, different technology. 
(VLAN-hopping exploits notwithstanding.) So, check the following to verify your configuration is as above: 1. Clients can ping router eth0 IP. 2. Router has forwarding enabled (/proc/sys/net/ipv4/ip_forward). 3. Router can ping upstream gateway via eth1. 4. Something upstream can ping your router's eth1 IP. 5. Change a client's IP address to put it on the same subnet as your upstream gateway, and verify that it's not able to ping it (or even get an ARP response from it). If it's able to communicate with it, then your VLANs aren't segregating the traffic properly. With all that, you should be set. One question: is the LAN segment known by your upstream, i.e. are they routing traffic to your /28 via .82/26? If not, you'll need to use NAT on your router so upstream only sees its IP address. Also, what kind of switch is it? Someone might be able to provide a simple configuration. Sorry if I've missed something. Your setup sounds pretty straight forward so there's probably something simple that was overlooked. Or, there's more to the situation than I've understood. From cbergstrom at netsyncro.com Fri May 4 17:35:13 2007 From: cbergstrom at netsyncro.com (=?ISO-8859-1?Q?=22C=2E_Bergstr=F6m=22?=) Date: Fri May 4 17:35:36 2007 Subject: [LARTC] Forwarding between untagged vlans In-Reply-To: <20070504134546.GA22302@morose.quex.org> References: <46370AEF.7030305@netsyncro.com> <20070504134546.GA22302@morose.quex.org> Message-ID: <463B52B1.6030109@netsyncro.com> Michael Alger wrote: > On Tue, May 01, 2007 at 12:39:59PM +0300, "C. Bergstr?m" wrote: > >> I'm trying to implement simple untagged vlans on our switch and >> have misconfigured something.. >> >> ISP gw is on the default vlan1 (untagged) >> >> Router >> eth1 is setup on the switch with default vlan1 and member of vlan4. 
>> eth0 is default vlan4 which connects to the clients that are all >> default members of vlan4 >> > > Just to clarify, are the VLANs configured on your switch or are you > doing some funny thing on the router? > Nope all vlans were configured on the switch. > This reply is assuming it's the switch which handles VLANs. > > >> eth0 is x.x.x.86/28 This is what clients are connecting to as their gw.. >> (no nat) >> eth1 is x.x.x.82/26 >> >> default route is .65/26 dev eth1 >> >> If client is default vlan4, but a member of vlan1 then it all works.. >> As soon as I remove client from being a member of vlan1.. The router >> stops forwarding. Is this to be expected and how can I correct this? >> >> I've tried adding a rule like this for the test client which is on .87 >> # Trying to fix vlan >> iptables -A FORWARD -i ${WAN} -d x.x.x.87 -o ${LAN} -j ACCEPT >> iptables -A FORWARD -i ${LAN} -s x.x.x.87 -o ${WAN} -j ACCEPT >> >> I see the packets from the lan trying to get out, but on ingress I don't >> see them.. >> > > Your WAN interface shouldn't need to be able to see both VLANs; the > point of the router is to move packets between two different > networks. > > Are the hosts on the WAN side using your router's eth1 (.82/26) as > their gateway to your LAN network (/28)? It sounds like they're > directly sending replies to the clients, rather than via the router. > Yeah.. they were sending packets directly, but it what was throwing me off was the Cisco gw wasn't in ip show neighbors.. So I assumed it was working and going through my middle-man router. > Just to clarify, this is what I think you're doing: > > 1. You have an internal network connected to a switch, along with a > router which is their default gateway, also connected to the same > switch. > Correct > 2. This router has a second interface, connected to a different > switch, which has some stuff connected to it; in particular, your > ISP's default gateway is connected to this switch. 
(Possibly you > have other servers in a DMZ type setup or something?) > 2nd interface is connected to the same switch. 2nd interface = (WAN) Rest is correct. There will be a slight change next week though in that everything is moving off the default vlan and going behind this router once configured correctly. > 3. Since you're using VLANs, they're actually the same physical > switch; but the ports used by the internal network belong to one > VLAN, and the ports used by eth1 and the upstream gateway are on a > different VLAN. Same thing, different technology. (VLAN-hopping > exploits notwithstanding.) > > So, check the following to verify your configuration is as above: > > 1. Clients can ping router eth0 IP. > yes > 2. Router has forwarding enabled (/proc/sys/net/ipv4/ip_forward). > yes > 3. Router can ping upstream gateway via eth1. > yes > 4. Something upstream can ping your router's eth1 IP > yes > 5. Change a client's IP address to put it on the same subnet as your > upstream gateway, and verify that it's not able to ping it (or > even get an ARP response from it). If it's able to communicate > with it, then your VLANs aren't segregating the traffic properly. > yes.. I wasn't.. and when I started to. that's when it broke > With all that, you should be set. > > One question: is the LAN segment known by your upstream, i.e. are > they routing traffic to your /28 via .82/26? If not, you'll need > to use NAT on your router so upstream only sees its IP address. > I'm getting .65/26 and then trying to break it down into smaller networks (ie .80/28) I remember trying with a 192.168 (rfc1918) ip + with NAT/masquerading and it all worked. (There's a ton of online examples for that online...) , but these servers need world routable IPs and when I was masqurading the packets. Things like SSH stopped working for obvious reasons. I'm doing this all remotely and the pos switch's web interface crashed on me.. So my 'keys' are currently locked in the car. 
I needed a couple of days' break from it and we just bought an HP ProCurve 2650 that should be in the colo next week. I'm pretty sure I can set up the untagged vlans on the switch correctly, but maybe I was missing something simple with the iptables rules.. Am I mistaken, or does NAT not play with non-RFC1918 IPs?

Thanks a lot for your help

Christopher

From Jon.J.Flechsenhaar at boeing.com Sat May 5 00:37:57 2007
From: Jon.J.Flechsenhaar at boeing.com (Flechsenhaar, Jon J)
Date: Sat May 5 00:38:05 2007
Subject: [LARTC] RSVP RESV not seen
Message-ID: <0E24ED2A7F9AA349A8633E6A56A64BE00223983B@XCH-SW-2V1.sw.nos.boeing.com>

all:

I'm just trying to create a simple RSVP session to familiarize myself with the protocol. I never get an RSVP RESV message, only PATH and PATH TEAR messages. There is a timeout but I'm not sure what's causing it exactly. Can anyone shed some light on why? All the configs and output are below. Please let me know if you need more information. Thanks.

Jon

Test Setup (all Linux machines):

host1-192.85.3.2 ---> 192.85.3.254-Router-192.85.4.254 ---> host2-192.85.4.1

I am using KOM RSVP.
I am using TG (Traffic Generator), sender at host1 and receiver at host2.
HOST1-SENDER CONFIG:

alias sender 192.85.3.2
alias receiver 192.85.4.1
estimator sender 1
flow udp sender 2000 receiver 2001 send cbr 64 5 60.0 rsvp 20.0 sync 0

HOST2-RECEIVER CONFIG:

#alias sender 192.85.3.2
#alias receiver 192.85.4.1
flow udp 192.85.3.2 2001 192.85.4.1 2000 recv rate sinkrate rsvp 100

RSVP.CONF: - all nodes have the corresponding settings

api 4000
interface eth1 refresh 10000 tc cbq 1000000 2500

CBQ SCRIPT:

$TC qdisc add dev $DEVICE root handle 1: cbq \
    bandwidth $RATE avpkt 1000 mpu 64
$TC class add dev $DEVICE parent 1:0 classid :1 est 1sec 8sec cbq \
    bandwidth $RATE rate $RATE allot 1514 maxburst 50 avpkt 1000
$TC class add dev $DEVICE parent 1:1 classid :2 est 1sec 8sec cbq \
    bandwidth $RATE rate $BE_RATE allot 1514 weight 500Kbit \
    prio 6 maxburst 50 avpkt 1000 split 1:0 defmap ffff borrow
$TC class add dev $DEVICE parent 1:1 classid :3 est 2sec 16sec cbq \
    bandwidth $RATE rate `echo $2/2|bc`Mbit allot 1514 weight 100Kbit \
    prio 2 maxburst 100 avpkt 1000 split 1:0 defmap c0
$TC class add dev $DEVICE parent 1:1 classid :4 est 1sec 8sec cbq \
    bandwidth $RATE rate 100Kbit allot 1514 weight 10Mbit \
    prio 7 maxburst 10 avpkt 1000 split 1:0 defmap 2
$TC class add dev $DEVICE parent 1:1 classid 1:7FFE cbq \
    rate `echo 0.2*$2|bc`Mbit bandwidth $RATE allot 1514b avpkt 1000 \
    maxburst 20 isolated
$TC class add dev $DEVICE parent 1:7FFE classid 1:7FFF est 4sec 32sec cbq \
    rate `echo 0.15*$2|bc`Mbit bandwidth $RATE allot 1514b avpkt 1000 weight 10Kbit \
    prio 6 maxburst 10 split 1:7FFE defmap ffff

CBQ SCRIPT OUTPUT - (ON ALL NODES):

script ran with ./cbqinit eth1 10

class cbq 1: root rate 10Mbit (bounded,isolated) prio no-transmit
class cbq 1:1 parent 1: rate 10Mbit prio no-transmit
class cbq 1:2 parent 1:1 rate 250Kbit prio 6
class cbq 1:3 parent 1:1 rate 5Mbit prio 2
class cbq 1:4 parent 1:1 rate 100Kbit prio 7
class cbq 1:7fff parent 1:7ffe rate 1500Kbit prio 6
class cbq 1:7ffe parent 1:1 rate 2Mbit (isolated) prio no-transmit

[root@netrat9 RSVP]# tc
filter ls dev eth1 filter parent 1: protocol ip pref 1 rsvp TG SENDER OUPUT: [root@localhost RSVP]# ./tg tgsend.conf 22:33:03.268 measured select duration: 0.000993 sec 22:33:03.366 Timer: 1178317983.366 sec 183366 600.000 sec 0.001 sec 600000 1178317800.000 sec 22:33:03.369 detected 3 interfaces 22:33:03.369 detected 3 real interfaces initializing... starting traffic generator... starting all actions... 22:33:03.373 starting CBR (60.000 sec, 5pkts/sec a 64bytes) Sender 192.85.3.2/2000 <- 17 -> 192.85.4.1/2001 22:33:03.373 signalling RSVP for CBR (60.000 sec, 5pkts/sec a 64bytes) Sender 192.85.3.2/2000 <- 17 -> 192.85.4.1/2001 data start synchronized 22:33:23.372 RSVP timeout: CBR (60.000 sec, 5pkts/sec a 64bytes) Sender 192.85.3.2/2000 <- 17 -> 192.85.4.1/2001 22:33:23.372 stopped CBR (60.000 sec, 5pkts/sec a 64bytes) Sender 192.85.3.2/2000 <- 17 -> 192.85.4.1/2001 All actions finished TG RECEIVER OUTPUT: [root@H13 RSVP]# ./tg tgrec.conf 15:27:16.670 measured select duration: 0.003950 sec 15:27:16.693 Timer: 1178317636.693 sec 109173 600.000 sec 0.004 sec 150000 1178317200.000 sec 15:27:16.694 detected 3 interfaces 15:27:16.694 detected 3 real interfaces initializing... ignoring unknown local address: 192.85.3.2 starting traffic generator... starting all actions... 
RSVP DAEMON OUPUT: [root@localhost RSVP]# RSVPD 22:32:50.966 detected 3 interfaces 22:32:50.966 found interface lo 22:32:50.966 found interface eth0 22:32:50.967 found interface eth1 22:32:50.967 interface eth0 has system index 2 22:32:50.967 interface eth1 has system index 3 22:32:50.967 sending RSRR query 22:32:50.967 Routing: no mrouted found 22:32:50.992 measured select duration: 0.000866 sec 22:32:51.080 Timer: 1178317971.080 sec 171080 600.000 sec 0.001 sec 600000 1178317800.000 sec 22:32:51.083 interface: rsvp-api(0) 0.0.0.0 (0.000 sec) MTU: 8191 [UDP:4000 <-> 0.0.0.0:0] loss:0.000% 22:32:51.084 interface: eth0(1) 10.10.2.10 (30.000 sec) MTU: 1500 bw: 0.000 lat: 0 22:32:51.084 CBQ: enabling on interface eth1 22:32:51.085 Opened RTNetlink socket 22:32:51.085 No. of ticks in a usec = 1.024 22:32:51.085 CBQ: enabling on interface eth1 22:32:51.085 The index of eth1 is: 3 22:32:51.085 qdisc: cbq 0x1 dev eth1 22:32:51.085 root_qdisc_handle_ = 0x1 22:32:51.085 About to dump TC classes 22:32:51.085 in dump_classinfo 22:32:51.085 ** ROOT class ** rate = 1250000bps 22:32:51.085 class cbq 0x1 dev eth1 root 22:32:51.085 in dump_classinfo 22:32:51.085 class cbq 0x1:0x1 dev eth1 parent 0x1
22:32:51.085 in dump_classinfo 22:32:51.085 class cbq 0x1:0x2 dev eth1 parent 0x1:0x1 22:32:51.085 in dump_classinfo 22:32:51.086 class cbq 0x1:0x3 dev eth1 parent 0x1:0x1 22:32:51.086 in dump_classinfo 22:32:51.086 class cbq 0x1:0x4 dev eth1 parent 0x1:0x1 22:32:51.086 in dump_classinfo 22:32:51.086 class cbq 0x1:0x7fff dev eth1 parent 0x1:0x7ffe 22:32:51.086 in dump_classinfo 22:32:51.086 ** RSVP (reserved) class ** rate = 250000bps 22:32:51.086 class cbq 0x1:0x7ffe dev eth1 parent 0x1:0x1 22:32:51.086 Enabling CBQ scheduler for Linux, bandwidth 2000000.000 bps 22:32:51.086 SchedulerCBQ: localAdspec: length:40 hops: 1 bw: 2000000.000 lat: 2500 MTU: 1500 22:32:51.086 interface: eth1(2) 192.85.3.2 (10.000 sec) MTU: 1500 bw: 1000000.000 lat: 2500 22:32:51.087 RSVPD running - Release 3.1pre - build date: Fri Apr 6 17:08:42 PDT 2007 22:33:03.373 **************** new message received **************** 22:33:03.373 rsvp-api received MSG from localhost : InitAPI 1 1 ttl:127 length:40 SESSION:192.85.4.1/2001(17)1 RSVP_HOP:0.0.0.0[32824] TIME_VALUES:120000 22:33:03.374 timer 0x914a188 22:42:03.374 scheduled 22:33:03.374 timer 0x914a1a4 22:35:24.134 scheduled 22:33:03.374 registered 192.85.4.1/2001(17)1 for API at localhost / 32824 22:33:03.374 **************** new message received **************** 22:33:03.374 rsvp-api received MSG from 192.85.3.2 : PATH 1 1 ttl:63 length:88 SESSION:192.85.4.1/2001(17)1 RSVP_HOP:0.0.0.0[32824] TIME_VALUES:0 SENDER_TEMPLATE:192.85.3.2/2000 SENDER_TSPEC:r:530.000 b:106.000 p:530.000 m:106 M:106 22:33:03.374 creating Hop:192.85.3.2 via rsvp-api 22:33:03.374 received PATH for 192.85.4.1/2001,17 22:33:03.374 new Session: 192.85.4.1/2001(17)1 22:33:03.375 requesting unicast route for 192.85.4.1 22:33:03.375 unicast route lookup result: index 3 via 192.85.3.254 reported dest is 192.85.4.1 22:33:03.375 requesting unicast route for 192.85.4.1 22:33:03.375 unicast route lookup result: index 3 via 192.85.3.254 reported dest is 192.85.4.1 22:33:03.375 
routing result after adjustment: eth1 22:33:03.375 creating PSB:192.85.3.2/2000 PHOP not yet set 22:33:03.375 creating PHopSB:192.85.3.2[32824] via rsvp-api 22:33:03.375 setting new PHOP PHopSB:192.85.3.2[32824] via rsvp-api 22:33:03.376 TSpec changed: r:530.000 b:106.000 p:530.000 m:106 M:106 22:33:03.376 creating: OIatPSB:192.85.3.2/2000 -> eth1 22:33:03.376 timer 0x9aa3ef8 22:33:10.974 scheduled 22:33:03.376 eth1 sends MSG to 192.85.4.1 : PATH 1 1 ttl:63 length:132 SESSION:192.85.4.1/2001(17)1 RSVP_HOP:192.85.3.2[2] TIME_VALUES:10000 SENDER_TEMPLATE:192.85.3.2/2000 SENDER_TSPEC:r:530.000 b:106.000 p:530.000 m:106 M:106 ADSPEC:length:40 hops: 1 bw: 2000000.000 lat: 2500 MTU: 1500 22:33:03.376 PSB::updateRoutingInfo done, gateway is 192.85.3.254 , new lif count: 1 22:33:10.974 timer 0x9aa3ef8 22:33:10.974 fired at time 22:33:10.974 22:33:10.974 timer 0x9aa3ef8 22:33:24.964 scheduled 22:33:10.974 eth1 sends MSG to 192.85.4.1 : PATH 1 1 ttl:63 length:132 SESSION:192.85.4.1/2001(17)1 RSVP_HOP:192.85.3.2[2] TIME_VALUES:10000 SENDER_TEMPLATE:192.85.3.2/2000 SENDER_TSPEC:r:530.000 b:106.000 p:530.000 m:106 M:106 ADSPEC:length:40 hops: 1 bw: 2000000.000 lat: 2500 MTU: 1500 22:33:23.373 **************** new message received **************** 22:33:23.373 rsvp-api received MSG from 192.85.3.2 : PTEAR 1 1 ttl:63 length:44 SESSION:192.85.4.1/2001(17)1 RSVP_HOP:0.0.0.0[32824] SENDER_TEMPLATE:192.85.3.2/2000 22:33:23.373 received PTEAR for 192.85.4.1/2001,17 22:33:23.373 found Session: 192.85.4.1/2001(17)1 22:33:23.373 eth1 sends MSG to 192.85.4.1 : PTEAR 1 1 ttl:63 length:44 SESSION:192.85.4.1/2001(17)1 RSVP_HOP:192.85.3.2[2] SENDER_TEMPLATE:192.85.3.2/2000 22:33:23.373 deleting PSB:192.85.3.2/2000 from PHopSB:192.85.3.2[32824] via rsvp-api 22:33:23.374 deleting OIatPSB: OIatPSB:192.85.3.2/2000 -> eth1 22:33:23.374 timer 0x9aa3f14 00:00:00.000 deleted 22:33:23.374 timer 0x9aa3ef8 22:33:24.964 deleted 22:33:23.374 PSB::updateRoutingInfo done, gateway is 192.85.3.254 , new lif 
count: 0 22:33:23.374 timer 0x9aa3da8 00:00:00.000 deleted 22:33:23.374 deleting PHopSB:192.85.3.2[32824] via rsvp-api 22:33:23.374 timer 0x9aa3e3c 00:00:00.000 deleted 22:33:23.374 delete Session: 192.85.4.1/2001(17)1 22:33:23.374 **************** new message received **************** 22:33:23.374 rsvp-api received MSG from localhost : RemoveAPI 1 1 ttl:127 length:32 SESSION:192.85.4.1/2001(17)0 RSVP_HOP:0.0.0.0[32824] 22:33:23.374 removing API: 192.85.4.1/2001(17)1 for API at localhost / 32824 22:33:23.374 timer 0x914a1a4 22:35:24.134 deleted 22:33:23.374 timer 0x914a188 22:42:03.374 deleted From fermin.galan at cttc.es Fri May 4 21:34:42 2007 From: fermin.galan at cttc.es (=?iso-8859-1?Q?Ferm=EDn_Gal=E1n_M=E1rquez?=) Date: Sat May 5 06:27:54 2007 Subject: [LARTC] Multiple SA in the same IPSec tunnel Message-ID: <00ff01c78e83$4669d770$303d5854@cttc.es> Hi, When a IPSec tunnel is established between two peers, I understand that the "normal" situation is to have in a given moment two SAs, one for each direction of the tunnel. However, in one of my tunnels (peer P1 running GNU/Linux with setkey and racoon; peer P2 is a Cisco router) there is a large number (around 19) of SAs established (this has been observed in P1 with 'setkey -D'). I've glooged around and the "multiplicy of SAs" seems to be a pathological situation (as a matter of fact, connectivity trough that tunnel use to fail). Although I'm not an expert in the internals of IKE protocol, I've read that using 'initial_contact on' in the tunnel could help. However, using that parameter in racoon.conf and restarting hasn't solved the problem :( I would like to remark that P1 is running 6 tunnels and this only happens in one of them (the other 5 seems to work fine with just a pair of SAs). Maybe some Cisco-Linux interoperability issue? Any idea or suggestion about what can be happening? Please, tell me about if you need to know any extra information (logs, etc.) Any help is very welcome. Thanks in advance! 
Best regards,

--------------------
Fermín Galán Márquez
CTTC - Centre Tecnològic de Telecomunicacions de Catalunya
Parc Mediterrani de la Tecnologia, Av. del Canal Olímpic s/n, 08860 Castelldefels, Spain
Room 1.02
Tel : +34 93 645 29 12
Fax : +34 93 645 29 01
Email address: fermin dot galan at cttc dot es

From ericr at ipro.net Sat May 5 07:30:36 2007
From: ericr at ipro.net (ericr)
Date: Sat May 5 07:30:42 2007
Subject: [LARTC] Massive filtering
Message-ID: <200705050130.AA2025718096@ipro.net>

I am trying to build a traffic control rule set for a huge NATed network, and I have it working for single known addresses but I need to scale it to 16M potential client addresses. I'm using iptables for NAT. Incoming traffic is simple because I can match the destination address; for outgoing traffic I use iptables IPMARK and then tc match mark, and it works perfectly if I build rules for each client individually. I am worried about performance as the client list increases.

I need to place client IPs into classes like routers, freeloaders, lite-access, premium-access, etc. I have no problem with rewriting rules on the fly. It is easy to pop in a rule change any time a user authenticates or is disconnected for inactivity.

My first thought for scaling up was to use the hash tables, and I am feeling that the last line in lartc's document page "12.4. Hashing filters for very fast massive filtering" which says "Note that this example could be improved to the ideal case where each chain contains 1 filter!" is a little misleading since no divisor above 256 works. On first reading, I'm thinking, yeh, I'll just put a divisor of 16777216 and my problems are solved... nope.. wrong answer. I haven't even gotten to the point where I issue 32 million filter rules to tc and see if it chokes.
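
For reference, the pattern from section 12.4 can be sketched for a single /24 as below. This is only a minimal sketch under assumptions: the device name eth1, the qdisc handle 1:, the 10.1.2.0/24 client range, the flowid 1:10 and the helper names bucket_for and add_client are all made up for illustration. The 256 limit applies to each table's divisor, so covering a larger range would presumably mean linking further tables from the buckets, each keyed on another octet; that part is not shown. By default the script is a dry run that only prints the tc commands.

```shell
#!/bin/sh
# Dry run by default: prints the commands instead of executing them.
# Set TC=tc (as root) to apply them for real.
TC=${TC:-echo tc}
DEV=${DEV:-eth1}      # assumption: shaped device
NET=10.1.2            # assumption: one /24 out of the NATed range

# Bucket for an address = its last octet, written in hex (123 -> 7b).
bucket_for() { printf '%x' "${1##*.}"; }

# Create the u32 root and a 256-bucket hash table with handle 2:.
$TC filter add dev "$DEV" parent 1:0 prio 5 protocol ip u32
$TC filter add dev "$DEV" parent 1:0 prio 5 handle 2: protocol ip u32 divisor 256

# Hash on the last octet of the destination address
# (the destination address starts at byte 16 of the IP header).
$TC filter add dev "$DEV" parent 1:0 protocol ip prio 5 u32 ht 800:: \
    match ip dst "$NET.0/24" hashkey mask 0x000000ff at 16 link 2:

# One rule per client goes into that client's own bucket,
# so each chain holds a single filter.
add_client() {   # usage: add_client <ip> <flowid>
    $TC filter add dev "$DEV" parent 1:0 protocol ip prio 5 u32 \
        ht "2:$(bucket_for "$1"):" match ip dst "$1" flowid "$2"
}

add_client "$NET.123" 1:10   # hypothetical client and class
```

Adding or removing a client on authentication or timeout then touches exactly one bucket, which fits the "rewrite rules on the fly" approach described above.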
I hate to have to ask, I am gratefull for all the work you have done just to get me here, I'm probably missing something important, but I'm trying to scale to 16 million potential clients and the only practical documentation I can find says thinking large is 200 clients. thoughts, comments, ideas? solutions are best. Thanks in advance, Eric. ________________________________________________________________ Sent via the WebMail system at ipro.net From vvitkov at gmail.com Sat May 5 08:15:30 2007 From: vvitkov at gmail.com (Vladimir Vitkov) Date: Sat May 5 08:15:50 2007 Subject: [LARTC] Massive filtering In-Reply-To: <200705050130.AA2025718096@ipro.net> References: <200705050130.AA2025718096@ipro.net> Message-ID: Hi, personally i've never tried so huge thing but ... a guy whom i know has writen 2 papers on this. 1) optimizing iptables and tc rules: http://www.linux-bg.org/cgi-bin/y/index.pl?page=article&id=advices&key=380752598 2) usage of ipset, iptables, ipmark: http://www.linux-bg.org/cgi-bin/y/index.pl?page=article&id=advices&key=386924398 they are in bulgarian but hopefully the code will help. IF you have a problem with understanding them, write me in private so i can make a translation. On 05/05/07, ericr wrote: > I am trying to build a trafic control rule set for a huge NATed network, and I have it working for single known addresses but I need to scale it to 16M potential client addresses. I'm using iptables for NAT. Incoming traffic is simple because I can match destination address, outgoing traffic I use iptables IPMARK then tc match mark and it works perfectly if I build rules for each client individually. I am worried about performance as the client list increases. > > I need to place client IPs into classes like routers, freeloaders, lite-access, premium-access, etc. I have no problem with rewriting rules on the fly. It is easy to pop in a rule change any time a user authenticates or is disconnected for inactivity. 
> > My first thought for scaling up was to use the hash tables, and I am feeling that the last line in lartc's document page "12.4. Hashing filters for very fast massive filtering" which says "Note that this example could be improved to the ideal case where each chain contains 1 filter!" is a little misleading since no divisor above 256 works. On first reading, I 'm thinking, yeh, I'll just put a divisor of 16777216 and my problems are solved... nope.. wrong answer. I haven't even gotten to the point where I issue 32 million filter rules to tc and see if it chokes. > > I hate to have to ask, I am gratefull for all the work you have done just to get me here, I'm probably missing something important, but I'm trying to scale to 16 million potential clients and the only practical documentation I can find says thinking large is 200 clients. > > > thoughts, comments, ideas? solutions are best. > Thanks in advance, > Eric. > > > > > > > > ________________________________________________________________ > Sent via the WebMail system at ipro.net > > > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > -- ? ????????, ???????? ?????? http://www.netsecad.com http://www.supportbg.com From salatiel.filho at gmail.com Sat May 5 19:28:14 2007 From: salatiel.filho at gmail.com (Salatiel Filho) Date: Sat May 5 19:28:34 2007 Subject: [LARTC] tc u32 match !port In-Reply-To: <4639DC30.4000901@andyfurniss.entadsl.com> References: <4639DC30.4000901@andyfurniss.entadsl.com> Message-ID: >On 5/3/07, Andy Furniss wrote: > Salatiel Filho wrote: > > How can i redirect all traffic that not come from port 80 to a flow ? > > > > i was thing about some like > > > > tc filter add dev imq1 parent 1: protocol ip prio 7 u32 match ip sport > > !80 ...... > > > > But this not work. 
> >
> > Another doubt, if i have two rules that intersects , for example ,
> > one filter with u32 match ip src 10.10.10.10 flowid 1:10
> > and other with u32 match sport 80 0xffff flowid 1:11 , which one will
> > work in case of a packet to 10.10.10.10 with sport 80 ???
>
> You need to use prio to order the rules - anything after a rule that
> matches port 80 will be ! 80 - you cannot make a rule that negates
> matches directly. If the structure of your htb etc is deep you can make
> filters attach to parents other than root, but you need to filter the
> traffic to those flowids first. You can match more than one thing with
> one filter rule so you can match prio X src ip and 80 then follow with
> prio (X+1) src ip.
>
> Andy.
>
>

Well, I am having some trouble making this work.
I have something like this in pseudo tc rules :)

Root class
Class 1 parent ROOT prio 0 filter u32 match sport 80 dst 10.0.0.254
Class 2 parent ROOT prio 0 filter u32 match dport 22
Class 3 parent ROOT prio 7 filter u32 match dst 10.0.0.254
default

Shouldn't traffic from source port 80 and destination 10.0.0.254 go through class 1?
I cannot find a way to make this work; traffic to 10.0.0.254 is always falling into class 3 :/
Am I missing something?

--
[]'s
Salatiel

"The greatest pleasure of the intelligent is to play the fool in front of a fool who plays at being intelligent."

From gsmdib at gmail.com Sat May 5 20:25:06 2007
From: gsmdib at gmail.com (Alex Girchenko)
Date: Sat May 5 20:25:12 2007
Subject: [LARTC] julian's patches and custom routing
Message-ID: <567e52cd0705051125s65eff659lbb9d0577a8dc08d3@mail.gmail.com>

I'm using a 2.6.20-15-ubuntu (shipped with feisty) kernel with Julian's patches applied and it's my 3rd day with tc, ip, ifconfig and the rest ;). Got 2 ADSL uplinks. What I need is the ability to manually configure uplink usage, so nothing like bonding by default. Failover is meant to be provided via a shell script at the next step.
Here is my config:

==
# no need for default route for now
ip rule add prio 50 table main
ip route del default table main

# table and default route for gt
ip rule add prio 201 from 101.64.106.28/30 table gt
ip route add default via 101.64.105.29 dev eth2 src 101.64.105.30 proto static table gt
ip route append prohibit default table gt metric 1 proto static

# table and default route for ut
ip rule add prio 202 from 192.168.1.0/30 table ut
ip route add default via 192.168.1.1 dev eth3 src 192.168.1.2 proto static table ut
ip route append prohibit default table ut metric 1 proto static

# no interface specified
ip rule add prio 222 table 222
ip route add default table 222 proto static nexthop via 192.168.1.1 dev eth3 nexthop via 101.64.105.29 dev eth2
==

The problem is that if I set

iptables -t nat -A POSTROUTING -o eth3 -j SNAT --to 192.168.1.2

client machines can access the internet without problems, while

iptables -t nat -A POSTROUTING -o eth2 -j SNAT --to 101.64.105.30

leads to a non-functional connection. Could anyone please give a hint on what I am doing wrong?

TIA.

From lists at andyfurniss.entadsl.com Sat May 5 20:56:16 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Sat May 5 20:56:08 2007
Subject: [LARTC] tc u32 match !port
In-Reply-To:
References: <4639DC30.4000901@andyfurniss.entadsl.com>
Message-ID: <463CD350.3070606@andyfurniss.entadsl.com>

Salatiel Filho wrote:

> Well , i am having a few troubles making this work.
> I have some like this in pseudo tc rulez :)
> Root class
> Class 1 parent ROOT prio 0 filter u32 match sport 80 dst 10.0.0.254
> Class 2 paret ROOT prio 0 filter u32 match dport 22
> Class 3 parent ROOT prio 7 filter u32 match dst 10.0.0.254
> default
>
> Shouldn't traffic from source port 80 and destination 10.0.0.254 go
> through class 1 ?
> I can not make a way to this work, traffic to 10.0.0.254 is always
> falling in to class 3 :/
> Am i missing something ?

prio 1 is the top prio for filters; 0 ends up much lower.
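
As a rough sketch of that ordering, the specific match can be given filter prio 1 and the catch-all for the same host a larger number. The device imq1 and the address come from the thread; the flowids 1:1 and 1:3 are assumed stand-ins for "Class 1" and "Class 3" in the pseudo rules, and add_filters is a hypothetical helper name. By default the script is a dry run that only prints the tc commands.

```shell
#!/bin/sh
# Dry run by default: prints the commands instead of executing them.
# Set TC=tc (as root) to apply them for real.
TC=${TC:-echo tc}
DEV=${DEV:-imq1}      # device name taken from the thread

add_filters() {
    # Most specific rule first, with an explicit filter prio of 1.
    $TC filter add dev "$DEV" parent 1: protocol ip prio 1 u32 \
        match ip dst 10.0.0.254/32 match ip sport 80 0xffff flowid 1:1
    # Catch-all for the same host, checked only after prio 1 fails to match.
    $TC filter add dev "$DEV" parent 1: protocol ip prio 2 u32 \
        match ip dst 10.0.0.254/32 flowid 1:3
}

add_filters
```

Note that the filter prio only orders the classification rules; it is separate from the prio parameter of the htb classes themselves.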
I think two prio 1s should work in order of entry, but I would use 1 and 2 to be sure. I have seen reverse order of entry if you don't use prio at all ... tc -s filter ls dev $DEV parent X:Y should help you see what's going on. Andy. From salatiel.filho at gmail.com Sat May 5 21:21:51 2007 From: salatiel.filho at gmail.com (Salatiel Filho) Date: Sat May 5 21:21:57 2007 Subject: [LARTC] tc u32 match !port In-Reply-To: <463CD350.3070606@andyfurniss.entadsl.com> References: <4639DC30.4000901@andyfurniss.entadsl.com> <463CD350.3070606@andyfurniss.entadsl.com> Message-ID: On 5/5/07, Andy Furniss wrote: > Salatiel Filho wrote: > > > Well , i am having a few troubles making this work. > > I have some like this in pseudo tc rulez :) > > Root class > > Class 1 parent ROOT prio 0 filter u32 match sport 80 dst 10.0.0.254 > > Class 2 paret ROOT prio 0 filter u32 match dport 22 > > Class 3 parent ROOT prio 7 filter u32 match dst 10.0.0.254 > > default > > > > Shouldn't traffic from source port 80 and destination 10.0.0.254 go > > through class 1 ? > > I can not make a way to this work, traffic to 10.0.0.254 is always > > falling in to class 3 :/ > > Am i missing something ? > > prio 1 is the top prio for filters 0 ends up much lower. > > I think two prio 1s should work in order of entry, but I would use 1 and > 2 to be sure. I have seen reverse order of entry if you don't use prio > at all ... > > tc -s filter ls dev $DEV parent X:Y > > should help you see what's going on. > > Andy. 
>
>
>
>

Changed to this:

tc qdisc add dev imq1 root handle 1: htb default 5 r2q 1
tc class add dev imq1 parent 1: classid 1:5 htb rate 8kbit ceil 8kbit prio 7 quantum 1500 # DEFAULT

tc class add dev imq1 parent 1: classid 1:2 htb rate 1024kbit ceil 1024kbit prio 0 quantum 1500
tc filter add dev imq1 parent 1: protocol ip prio 1 u32 match ip dst 192.168.10.1 match ip sport 80 0xffff flowid 1:2 # FROM HTTP DEST TO 192.168.10.1

tc class add dev imq1 parent 1: classid 1:3 htb rate 1024kbit ceil 1024kbit prio 0 quantum 1500
tc class add dev imq1 parent 1:3 classid 1:900 htb rate 1024kbit ceil 1024kbit prio 7 quantum 1500
tc filter add dev imq1 parent 1: protocol ip prio 7 u32 match ip dst 192.168.10.1 flowid 1:900 # ANY OTHER TRAFFIC TO 192.168.10.1

But all traffic is still flowing to 1:900 :/

--
[]'s
Salatiel

"The greatest pleasure of the intelligent is to play the fool in front of a fool who plays at being intelligent."

From lists at andyfurniss.entadsl.com Sat May 5 23:38:49 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Sat May 5 23:38:36 2007
Subject: [LARTC] tc u32 match !port
In-Reply-To:
References: <4639DC30.4000901@andyfurniss.entadsl.com> <463CD350.3070606@andyfurniss.entadsl.com>
Message-ID: <463CF969.6080706@andyfurniss.entadsl.com>

Salatiel Filho wrote:
> On 5/5/07, Andy Furniss wrote:
>> Salatiel Filho wrote:
>>
>> > Well , i am having a few troubles making this work.
>> > I have some like this in pseudo tc rulez :)
>> > Root class
>> > Class 1 parent ROOT prio 0 filter u32 match sport 80 dst
>> 10.0.0.254
>> > Class 2 paret ROOT prio 0 filter u32 match dport 22
>> > Class 3 parent ROOT prio 7 filter u32 match dst 10.0.0.254
>> > default
>> >
>> > Shouldn't traffic from source port 80 and destination 10.0.0.254 go
>> > through class 1 ?
>> > I can not make a way to this work, traffic to 10.0.0.254 is always
>> > falling in to class 3 :/
>> > Am i missing something ?
>>
>> prio 1 is the top prio for filters 0 ends up much lower.
>>
>> I think two prio 1s should work in order of entry, but I would use 1 and
>> 2 to be sure. I have seen reverse order of entry if you don't use prio
>> at all ...
>>
>> tc -s filter ls dev $DEV parent X:Y
>>
>> should help you see what's going on.
>>
>> Andy.
>>
>
> Changed to this:
>
> tc qdisc add dev imq1 root handle 1: htb default 5 r2q 1
> tc class add dev imq1 parent 1: classid 1:5 htb rate 8kbit ceil 8kbit prio 7 quantum 1500 # DEFAULT
>
> tc class add dev imq1 parent 1: classid 1:2 htb rate 1024kbit ceil 1024kbit prio 0 quantum 1500
> tc filter add dev imq1 parent 1: protocol ip prio 1 u32 match ip dst 192.168.10.1 match ip sport 80 0xffff flowid 1:2 # FROM HTTP DEST TO 192.168.10.1
>
> tc class add dev imq1 parent 1: classid 1:3 htb rate 1024kbit ceil 1024kbit prio 0 quantum 1500
> tc class add dev imq1 parent 1:3 classid 1:900 htb rate 1024kbit ceil 1024kbit prio 7 quantum 1500
> tc filter add dev imq1 parent 1: protocol ip prio 7 u32 match ip dst 192.168.10.1 flowid 1:900 # ANY OTHER TRAFFIC TO 192.168.10.1
>
> But all traffic is still flowing to 1:900 :/

Hmm, that should work - as long as imq1 hooks in prerouting and after nat.
If it goes to 1:900 and not 1:5, I suppose it is seeing the address OK.

This is ingress traffic and you are downloading from an http server?

The way you have set up htb the classes won't share bandwidth.

What does tc -s filter ls dev imq1 show?

Andy.

From netsecuredata at gmail.com Sun May 6 04:21:16 2007
From: netsecuredata at gmail.com (Jorge Evangelista)
Date: Sun May 6 04:21:33 2007
Subject: [LARTC] Multiple SA in the same IPSec tunnel
In-Reply-To: <00ff01c78e83$4669d770$303d5854@cttc.es>
References: <00ff01c78e83$4669d770$303d5854@cttc.es>
Message-ID:

Hi,

Two days ago I configured a VPN between Cisco and Linux, and it works
fine. I have heard that incompatibility problems sometimes occur with
some Linux distros with respect to the Diffie-Hellman algorithm.
I have implemented it between a PC running CentOS 4.2 and a Cisco 831.
Here is a miniguide.

IPsec VPN between Cisco and Linux

LINUX

[root@mail ~]# cat /etc/racoon/psk.txt
200.18.25.58 cizc0linux

[root@mail ~]# cat /etc/ipsec.conf
flush;
spdflush;
spdadd 10.0.0.0/24 192.168.111.0/27 any -P out ipsec esp/tunnel/200.58.25.58-200.18.25.58/require;
spdadd 192.168.111.0/27 10.0.0.0/24 any -P in ipsec esp/tunnel/200.18.25.58-200.58.25.58/require;

[root@mail racoon]# cat racoon.conf
path include "/etc/racoon";
path pre_shared_key "/etc/racoon/psk.txt";

listen {
    isakmp 200.58.25.58 [500];
    strict_address;
}

remote 200.18.25.58 {
    exchange_mode main;
    proposal {
        encryption_algorithm 3des;
        hash_algorithm sha1;
        authentication_method pre_shared_key;
        dh_group 2;
    }
}

sainfo address 10.0.0.0/24 any address 192.168.111.0/27 any {
    pfs_group 2;
    lifetime time 80000 sec;
    encryption_algorithm 3des;
    authentication_algorithm hmac_sha1;
    compression_algorithm deflate;
}

# SNAT everything leaving eth0 except traffic for the remote VPN subnet
# (SNAT lives in the nat table; the destination needs an explicit -d)
iptables -t nat -A POSTROUTING -s 10.0.0.0/255.255.255.0 -o eth0 ! -d 192.168.111.0/27 -j SNAT --to-source 200.58.25.58

setkey -f /etc/ipsec.conf
racoon -f /etc/racoon/racoon.conf -F -ddd

CISCO

crypto isakmp policy 10
 encr 3des
 authentication pre-share
 group 2
 lifetime 80000
crypto isakmp key cizc0linux address 200.58.25.58
!
!
crypto ipsec transform-set policy01 esp-3des esp-sha-hmac
!
crypto map vpn-tunnel 10 ipsec-isakmp
 set peer 200.58.25.58
 set security-association lifetime seconds 80000
 set transform-set policy01
 set pfs group2
 match address 100
!
interface Ethernet1
 description INTERFACE WAN
 ip address 200.18.25.58 255.255.255.252
 no ip redirects
 no ip unreachables
 no ip proxy-arp
 ip nat outside
 load-interval 30
 duplex full
 no cdp enable
 crypto map vpn-tunnel
end
!
interface Ethernet0
 description INTERFACE LAN
 ip address 192.168.111.1 255.255.255.224
 ip nat inside
 no cdp enable
end
!
access-list 100 permit ip 192.168.111.0 0.0.0.31 10.0.0.0 0.0.0.255
!
ip nat inside source list 101 interface Ethernet1 overload
!
access-list 101 deny ip 192.168.111.0 0.0.0.31 10.0.0.0 0.0.0.255
access-list 101 permit ip 192.168.111.0 0.0.0.31 any

On 5/4/07, Fermín Galán Márquez wrote:
> Hi,
>
> When an IPSec tunnel is established between two peers, I understand that the
> "normal" situation is to have at any given moment two SAs, one for each
> direction of the tunnel.
>
> However, in one of my tunnels (peer P1 running GNU/Linux with setkey and
> racoon; peer P2 is a Cisco router) there is a large number (around 19) of
> SAs established (this has been observed in P1 with 'setkey -D').
>
> I've googled around and the "multiplicity of SAs" seems to be a pathological
> situation (as a matter of fact, connectivity through that tunnel tends to
> fail). Although I'm not an expert in the internals of the IKE protocol, I've
> read that using 'initial_contact on' in the tunnel could help. However,
> using that parameter in racoon.conf and restarting hasn't solved the problem
> :(
>
> I would like to remark that P1 is running 6 tunnels and this only happens in
> one of them (the other 5 seem to work fine with just a pair of SAs). Maybe
> some Cisco-Linux interoperability issue?
>
> Any idea or suggestion about what can be happening? Please tell me if
> you need any extra information (logs, etc.)
>
> Any help is very welcome. Thanks in advance!
>
> Best regards,
>
> --------------------
> Fermín Galán Márquez
> CTTC - Centre Tecnològic de Telecomunicacions de Catalunya
> Parc Mediterrani de la Tecnologia, Av.
> del Canal Olímpic s/n, 08860 Castelldefels, Spain
> Room 1.02
> Tel : +34 93 645 29 12
> Fax : +34 93 645 29 01
> Email address: fermin dot galan at cttc dot es
>

--
"The network is the computer"

From salatiel.filho at gmail.com Sun May 6 05:29:51 2007
From: salatiel.filho at gmail.com (Salatiel Filho)
Date: Sun May 6 05:30:07 2007
Subject: [LARTC] tc u32 match !port
In-Reply-To: <463CF969.6080706@andyfurniss.entadsl.com>
References: <4639DC30.4000901@andyfurniss.entadsl.com> <463CD350.3070606@andyfurniss.entadsl.com> <463CF969.6080706@andyfurniss.entadsl.com>
Message-ID:

On 5/5/07, Andy Furniss wrote:
> Salatiel Filho wrote:
> > On 5/5/07, Andy Furniss wrote:
> >> Salatiel Filho wrote:
> >>
> >> > Well, I am having a few troubles making this work.
> >> > I have something like this in pseudo tc rulez :)
> >> > Root class
> >> > Class 1 parent ROOT prio 0 filter u32 match sport 80 dst 10.0.0.254
> >> > Class 2 parent ROOT prio 0 filter u32 match dport 22
> >> > Class 3 parent ROOT prio 7 filter u32 match dst 10.0.0.254
> >> > default
> >> >
> >> > Shouldn't traffic from source port 80 and destination 10.0.0.254 go
> >> > through class 1?
> >> > I cannot find a way to make this work; traffic to 10.0.0.254 is always
> >> > falling into class 3 :/
> >> > Am I missing something?
> >>
> >> prio 1 is the top prio for filters; 0 ends up much lower.
> >>
> >> I think two prio 1s should work in order of entry, but I would use 1 and
> >> 2 to be sure. I have seen reverse order of entry if you don't use prio
> >> at all ...
> >>
> >> tc -s filter ls dev $DEV parent X:Y
> >>
> >> should help you see what's going on.
> >>
> >> Andy.
> >>
> >
> > Changed to this:
> >
> > tc qdisc add dev imq1 root handle 1: htb default 5 r2q 1
> > tc class add dev imq1 parent 1: classid 1:5 htb rate 8kbit ceil 8kbit prio 7 quantum 1500 # DEFAULT
> >
> > tc class add dev imq1 parent 1: classid 1:2 htb rate 1024kbit ceil 1024kbit prio 0 quantum 1500
> > tc filter add dev imq1 parent 1: protocol ip prio 1 u32 match ip dst 192.168.10.1 match ip sport 80 0xffff flowid 1:2 # FROM HTTP DEST TO 192.168.10.1
> >
> > tc class add dev imq1 parent 1: classid 1:3 htb rate 1024kbit ceil 1024kbit prio 0 quantum 1500
> > tc class add dev imq1 parent 1:3 classid 1:900 htb rate 1024kbit ceil 1024kbit prio 7 quantum 1500
> > tc filter add dev imq1 parent 1: protocol ip prio 7 u32 match ip dst 192.168.10.1 flowid 1:900 # ANY OTHER TRAFFIC TO 192.168.10.1
> >
> > But all traffic is still flowing to 1:900 :/
>
> Hmm, that should work - as long as imq1 hooks in prerouting and after nat.
> If it goes to 1:900 and not 1:5, I suppose it is seeing the address OK.

Yes, IMQ hooks in prerouting after nat; I have a very odd setup.

>
> This is ingress traffic and you are downloading from an http server?

Yeah :)

>
> The way you have set up htb the classes won't share bandwidth.

I know, I need it that way for this class; like I said, an odd setup :)

>
> What does tc -s filter ls dev imq1 show?

Right now I cannot copy the output here. But when I took a look I had
ZERO packets going through that class :/

>
> Andy.
>

--
[]'s
Salatiel

"The greatest pleasure of the intelligent is to play the idiot in front
of an idiot who plays the intelligent."
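
The ordering advice in this thread (specific filter at the top prio, catch-all at a later one, then read the per-filter counters) can be sketched as a dry run. This is only an illustration: emit_filters is a hypothetical helper, the device, address and class ids are the ones posted above, prio 2 for the catch-all follows the "use 1 and 2" suggestion rather than the prio 7 actually used, and the script prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch: prints tc commands rather than executing them, so it can
# be run without root. DEV, DST and the class ids come from the thread;
# emit_filters is a hypothetical helper name.
DEV=imq1
DST=192.168.10.1

emit_filters() {
    # Most specific rule first, at the top filter prio (1).
    echo "tc filter add dev $DEV parent 1: protocol ip prio 1 u32 match ip dst $DST match ip sport 80 0xffff flowid 1:2"
    # Catch-all for the same destination at a later prio.
    echo "tc filter add dev $DEV parent 1: protocol ip prio 2 u32 match ip dst $DST flowid 1:900"
    # Per-filter statistics show which rule packets actually hit.
    echo "tc -s filter show dev $DEV parent 1:"
}

emit_filters
```

Piping the output to sh as root would apply the rules; comparing the counters reported for the two filters then shows whether the port 80 rule is ever matched.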
From ali.sattari at gmail.com Sun May 6 09:31:23 2007
From: ali.sattari at gmail.com (Ali Sattari)
Date: Sun May 6 09:31:35 2007
Subject: [LARTC] Using multiple network interfaces (internet connections) separately
Message-ID: <8798976b0705060031g7a428c4bjdeec0113f0eb3804@mail.gmail.com>

Hi,

I need a solution for this case:
I have a PC (as server) with 3 (or more) Ethernet ports and 3 (or more)
Internet connections, one through each Ethernet interface (from different
ISPs and with different IPs, of course).

I need to download files (using wget or whatever else) through each
interface (internet line) separately.
For example, I need to download "file1" through eth1 (isp1), "file2"
through eth2 (isp2) and so on.

How can I make this work? Any iptables/iproute rules? Any idea?

Thanks in advance,
--
Ali Sattari (AKA Ali ix)
http://corelist.net

From randywallacejr at gmail.com Sun May 6 14:06:43 2007
From: randywallacejr at gmail.com (Randy Wallace)
Date: Sun May 6 14:06:50 2007
Subject: [LARTC] Re: Using multiple network interfaces (internet connections) separately
Message-ID: <861508be0705060506u5ad0e61cx31d40092d422a8f8@mail.gmail.com>

> Hi,
>
> I need a solution for this case:
> I have a PC (as server) with 3 (or more) Ethernet ports and 3 (or more)
> Internet connections, one through each Ethernet interface (from different
> ISPs and with different IPs, of course).
>
> I need to download files (using wget or whatever else) through each
> interface (internet line) separately.
> For example, I need to download "file1" through eth1 (isp1), "file2"
> through eth2 (isp2) and so on.
>
> How can I make this work? Any iptables/iproute rules? Any idea?
>
> Thanks in advance,
> --
> Ali Sattari (AKA Ali ix)

Ali ix,

This is an application for rules, in the iproute package.
How you select packets for each internet connection is best done by
iptables using firewall marks.

The trick is, you can have only one default gateway, unless you use the
multiple gateway patch, which may not be necessary for what you're
talking about.

The real question is: how do you plan on classifying traffic?
 * different hosts (IPs) per gateway?
 * random selection of gateway, per TCP connection?
 * different types of traffic (ports) per gateway?
 * certain domains (only) available on each gateway?

-Randy

From rangi at ngen.net.nz Sun May 6 22:14:32 2007
From: rangi at ngen.net.nz (Rangi Biddle)
Date: Sun May 6 22:14:58 2007
Subject: [LARTC] Traffic Shaping
Message-ID: <00b801c7901b$2b988100$82c98300$@net.nz>

Dear List,

I am wanting to perform some traffic shaping, as the subject of this email
suggests. What I am wanting to do is this: I would like to have traffic
shaping performed on the following protocols: HTTP, RDP, GRE, PPTP, SIP
and IAX. Obviously I would like to have the highest priority set for voice
packets, so much so that the general HTTP traffic does not impede the
voice packets. I would like to have ample bandwidth available for RDP so
that I am able to connect to a remote site without too much lag, enough
that most tasks can be done. HTTP traffic would possibly have the lowest
priority of all the protocols that I have listed. So, to clarify, the
priority would be something like this:

1. IAX
2. SIP
3. GRE
4. PPTP
5. RDP
6. HTTP

I have a Linux gateway that I will use for performing the traffic shaping,
set up in the following way:

-------------              ------------              ---------
|   ADSL    | <----------> |  LINUX   | <----------> |  LAN  |
-------------              ------------              ---------

I plan to have the ADSL router forward all traffic to the Linux gateway
using something similar to a BIMAP rule, where all incoming and outgoing
traffic is made to appear to come from the public IP address.
I welcome any and all suggestions but would possibly prefer the most
elegant of solutions :)

Many thanks in advance

Rangi

From simo at mix4web.de Mon May 7 10:25:23 2007
From: simo at mix4web.de (Simo)
Date: Mon May 7 10:25:42 2007
Subject: AW: [LARTC] Traffic Shaping
In-Reply-To: <00b801c7901b$2b988100$82c98300$@net.nz>
References: <00b801c7901b$2b988100$82c98300$@net.nz>
Message-ID: <001a01c79081$426366a0$c72a33e0$@de>

Hi Rangi,

Bandwidth is important, but VoIP needs more than that: voice traffic
needs low packet latency. That's why bandwidth shaping alone may not
solve your problem.

In this case an HFSC queuing discipline is used instead of HTB, because
it can separate bandwidth from delay.

You can find more information about this here:

http://linux-ip.net/articles/hfsc.en/

bye
Simo

From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Rangi Biddle
Sent: Sunday, 6 May 2007 22:15
To: lartc@mailman.ds9a.nl
Subject: [LARTC] Traffic Shaping

Dear List,

I am wanting to perform some traffic shaping, as the subject of this email
suggests. What I am wanting to do is this: I would like to have traffic
shaping performed on the following protocols: HTTP, RDP, GRE, PPTP, SIP
and IAX. Obviously I would like to have the highest priority set for voice
packets, so much so that the general HTTP traffic does not impede the
voice packets. I would like to have ample bandwidth available for RDP so
that I am able to connect to a remote site without too much lag, enough
that most tasks can be done. HTTP traffic would possibly have the lowest
priority of all the protocols that I have listed. So, to clarify, the
priority would be something like this:

1. IAX
2. SIP
3. GRE
4. PPTP
5. RDP
6.
HTTP

I have a Linux gateway that I will use for performing the traffic shaping,
set up in the following way:

-------------              ------------              ---------
|   ADSL    | <----------> |  LINUX   | <----------> |  LAN  |
-------------              ------------              ---------

I plan to have the ADSL router forward all traffic to the Linux gateway
using something similar to a BIMAP rule, where all incoming and outgoing
traffic is made to appear to come from the public IP address.

I welcome any and all suggestions but would possibly prefer the most
elegant of solutions :)

Many thanks in advance

Rangi

From b42-ml at srck.net Mon May 7 14:07:34 2007
From: b42-ml at srck.net (Martin Milata)
Date: Mon May 7 14:07:27 2007
Subject: [LARTC] Strange problem with HTB
Message-ID: <20070507120734.GA11512@nyx>

Hi list,

I've got quite a strange problem with htb. I have the following
configuration: dual core Athlon, two Intel e1000 NICs - eth1 is connected
to the LAN and has a private IP, eth0 is connected to our ISP and has a
public IP (so there's NAT on eth0). There is practically the same htb
configuration on both interfaces; only the filters are different. On eth1,
packets are classified by their destination address. However, this does
not work on eth0, because packets are already natted when they reach the
scheduling subsystem -- so their source IP is copied into fwmark by an
"IPMARK" iptables rule and they are classified according to this mark.
Every IP has its own htb class, and for each IP something like this is run:

tc class add dev eth0 parent 1:0011 classid 1:00ab htb rate 96kbit ceil 1000kbit prio 1 quantum 1500
tc qdisc add dev eth0 parent 1:00ab handle 00ab: esfq perturb 5 hash src
tc class add dev eth1 parent 1:0011 classid 1:00ab htb rate 96kbit ceil 2000kbit prio 1 quantum 1500
tc qdisc add dev eth1 parent 1:00ab handle 00ab: esfq perturb 5 hash dst
tc filter add dev eth0 protocol ip prio 5 parent 1:0 u32 ht 800:0: match mark 0x0a9ad002 0xffffffff flowid 1:00ab
tc filter add dev eth1 protocol ip prio 5 parent 1:0 u32 ht 2:02: match ip dst 10.154.208.2 flowid 1:00ab

A few days ago, I noticed that shaping on eth0 does not work. It probably
happened at the same time I recompiled the kernel to support SMP, changed
HZ from 250 to 1000 and changed the packet scheduling subsystem clock
source from CPU to gettimeofday() (because of SMP). However, that doesn't
necessarily have to be the cause of the problem. Shaping on eth1 works OK,
though the only difference is in the rate/ceil values and in the filters,
and the filters work right -- packets reach the correct class (byte/packet
counters are incremented).

Here is a snippet of output from "tc -s class ls dev eth0". It seems
strange to me that the "rate" value is actually higher than the ceil of
the class -- could it be a kernel/tc bug, or did I just misinterpret the
meaning of rate/ceil?

class htb 1:ab parent 1:11 leaf ab: prio 1 rate 97000bit ceil 1000Kbit burst 1611b cburst 1725b
 Sent 184781922 bytes 122597 pkt (dropped 0, overlimits 0 requeues 0)
 rate 2465Kbit 203pps backlog 0b 7p requeues 0
 lended: 96087 borrowed: 26507 giants: 0
 tokens: -238223 ctokens: -19556

Does anyone have a clue what can be wrong?
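
The comparison described above (measured "rate" in the statistics against the configured ceil) can be made mechanically across all classes. A minimal sketch, assuming the output layout shown in this message and decimal units (1 Kbit = 1000 bit); check_ceil is a hypothetical helper name, and Gbit values are not handled.

```shell
#!/bin/sh
# check_ceil: read `tc -s class show` text on stdin and print every HTB
# class whose measured rate is above its configured ceil.
# Hypothetical helper; assumes the layout quoted in the message above.
check_ceil() {
    awk '
    # Convert a tc rate string (97000bit, 1000Kbit, 2Mbit) to bits per second.
    function to_bits(s,    n) {
        n = s + 0
        if (s ~ /[Kk]bit$/) return n * 1000
        if (s ~ /[Mm]bit$/) return n * 1000000
        return n
    }
    # Class header line: remember the class id and its configured ceil.
    $1 == "class" && $2 == "htb" {
        id = $3; ceil = -1
        for (i = 4; i < NF; i++) if ($i == "ceil") ceil = to_bits($(i + 1))
        next
    }
    # Statistics line that starts with the measured rate.
    $1 == "rate" && id != "" && ceil >= 0 {
        if (to_bits($2) > ceil) print id, "measured", $2, "exceeds ceil"
        id = ""
    }'
}

# Sample input copied from the message above.
check_ceil <<'EOF'
class htb 1:ab parent 1:11 leaf ab: prio 1 rate 97000bit ceil 1000Kbit burst 1611b cburst 1725b
 Sent 184781922 bytes 122597 pkt (dropped 0, overlimits 0 requeues 0)
 rate 2465Kbit 203pps backlog 0b 7p requeues 0
EOF
```

On a live system the same function would be fed with "tc -s class show dev eth0 | check_ceil"; classes that stay within their ceil produce no output.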
Thanks in advance and sorry for my English,
-MM

From salatiel.filho at gmail.com Mon May 7 15:32:07 2007
From: salatiel.filho at gmail.com (Salatiel Filho)
Date: Mon May 7 15:32:16 2007
Subject: [LARTC] IMQ KERNEL PANIC 2.6.17.14 AND 2.6.21.1 No chain/target/match by that name
Message-ID:

After starting to shape local traffic I am now getting a lot of kernel
panics in tcp_retransmit, so I decided to update my kernel from 2.6.17.14
to 2.6.21.1. The problem is that after that I get:

# iptables -t mangle -A POSTROUTING -o eth0 -j IMQ --todev 0
iptables: No chain/target/match by that name

so I cannot redirect traffic to the IMQ device, even though the modules
are loaded:

# lsmod
Module                  Size  Used by
ipt_ipp2p               6656  2
ipt_MASQUERADE          2688  1
sch_sfq                 4864  31
cls_u32                 6660  8
sch_htb                14208  2
ipt_IMQ                 1792  0
imq                     3592  0
xt_mac                  1792  19
ipt_LOG                 5504  2
xt_limit                2304  2
xt_multiport            3200  4
xt_state                2176  3
iptable_mangle          2304  1
iptable_nat             6020  1
nf_nat                 13996  2 ipt_MASQUERADE,iptable_nat
nf_conntrack_ipv4      12940  5 iptable_nat
nf_conntrack           46584  5 ipt_MASQUERADE,xt_state,iptable_nat,nf_nat,nf_conntrack_ipv4
nfnetlink               4888  3 nf_nat,nf_conntrack_ipv4,nf_conntrack
iptable_filter          2436  1
ip_tables               9560  3 iptable_mangle,iptable_nat,iptable_filter
usbhid                 19424  0
uhci_hcd               18836  0
via_rhine              18456  0
3c59x                  35820  0

Any help?

[]'s
Salatiel

"The greatest pleasure of the intelligent is to play the idiot in front
of an idiot who plays the intelligent."

From kuolung at ms.kuolung.net Mon May 7 15:55:15 2007
From: kuolung at ms.kuolung.net (Kuolung)
Date: Mon May 7 15:55:50 2007
Subject: [LARTC] Re: Using multiple network interfaces (internetconnections) separately
References: <861508be0705060506u5ad0e61cx31d40092d422a8f8@mail.gmail.com>
Message-ID: <00b301c790af$586ddce0$140aa8c0@AMD3000>

Hi,

I want to use the "random selection of gateway, per TCP connection"
option. I can do it right now, but traffic to the same remote site (IP)
always goes to the same gateway. I think that is an ip_route_cache problem
or something like that. What can I do?
Kuolung

----- Original Message -----
From: "Randy Wallace"
To:
Sent: Sunday, May 06, 2007 8:06 PM
Subject: [LARTC] Re: Using multiple network interfaces (internetconnections) separately

>> Hi,
>>
>> I need a solution for this case:
>> I have a PC (as server) with 3 (or more) Ethernet ports and 3 (or more)
>> Internet connections, one through each Ethernet interface (from different
>> ISPs and with different IPs, of course).
>>
>> I need to download files (using wget or whatever else) through each
>> interface (internet line) separately.
>> For example, I need to download "file1" through eth1 (isp1), "file2"
>> through eth2 (isp2) and so on.
>>
>> How can I make this work? Any iptables/iproute rules? Any idea?
>>
>> Thanks in advance,
>> --
>> Ali Sattari (AKA Ali ix)
>
> Ali ix,
>
> This is an application for rules, in the iproute package. How you select
> packets for each internet connection is best done by iptables using
> firewall marks.
>
> The trick is, you can have only one default gateway, unless you use the
> multiple gateway patch, which may not be necessary for what you're
> talking about.
>
> The real question is: how do you plan on classifying traffic?
>  * different hosts (IPs) per gateway?
>  * random selection of gateway, per TCP connection?
>  * different types of traffic (ports) per gateway?
>  * certain domains (only) available on each gateway?
>
> -Randy

From rangi at ngen.net.nz Mon May 7 20:06:30 2007
From: rangi at ngen.net.nz (Rangi Biddle)
Date: Mon May 7 20:06:58 2007
Subject: [LARTC] Traffic Shaping
In-Reply-To: <001a01c79081$426366a0$c72a33e0$@de>
References: <00b801c7901b$2b988100$82c98300$@net.nz> <001a01c79081$426366a0$c72a33e0$@de>
Message-ID: <000501c790d2$72c2cf10$58486d30$@net.nz>

Hi Simo,

Thanks for the info. Very interesting read.
I forgot to mention in the post that I am still relatively new to traffic
shaping with Linux, but I was still able to more than comprehend the info
in that document. Many thanks again.

One thing that I am slightly uncertain of, though, is that I would prefer
not to divide the bandwidth between x number of people but rather designate
the priority that packets take over each other, which that info doesn't
cover. Is it still possible to accomplish this using HFSC?

Kind regards,

Rangi

From: Simo [mailto:simo@mix4web.de]
Sent: Monday, May 07, 2007 8:25 PM
To: 'Rangi Biddle'; lartc@mailman.ds9a.nl
Subject: AW: [LARTC] Traffic Shaping

Hi Rangi,

Bandwidth is important, but VoIP needs more than that: voice traffic
needs low packet latency. That's why bandwidth shaping alone may not
solve your problem.

In this case an HFSC queuing discipline is used instead of HTB, because
it can separate bandwidth from delay.

You can find more information about this here:

http://linux-ip.net/articles/hfsc.en/

bye
Simo

From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Rangi Biddle
Sent: Sunday, 6 May 2007 22:15
To: lartc@mailman.ds9a.nl
Subject: [LARTC] Traffic Shaping

Dear List,

I am wanting to perform some traffic shaping, as the subject of this email
suggests. What I am wanting to do is this: I would like to have traffic
shaping performed on the following protocols: HTTP, RDP, GRE, PPTP, SIP
and IAX. Obviously I would like to have the highest priority set for voice
packets, so much so that the general HTTP traffic does not impede the
voice packets. I would like to have ample bandwidth available for RDP so
that I am able to connect to a remote site without too much lag, enough
that most tasks can be done. HTTP traffic would possibly have the lowest
priority of all the protocols that I have listed. So, to clarify, the
priority would be something like this:

1. IAX
2. SIP
3. GRE
4. PPTP
5. RDP
6.
HTTP

I have a Linux gateway that I will use for performing the traffic shaping,
set up in the following way:

-------------              ------------              ---------
|   ADSL    | <----------> |  LINUX   | <----------> |  LAN  |
-------------              ------------              ---------

I plan to have the ADSL router forward all traffic to the Linux gateway
using something similar to a BIMAP rule, where all incoming and outgoing
traffic is made to appear to come from the public IP address.

I welcome any and all suggestions but would possibly prefer the most
elegant of solutions :)

Many thanks in advance

Rangi

From nic-lartc at studentergaarden.dk Tue May 8 03:51:12 2007
From: nic-lartc at studentergaarden.dk (nic-lartc@studentergaarden.dk)
Date: Tue May 8 03:51:17 2007
Subject: [LARTC] limit bandwidth per host question
Message-ID: <463FD790.8010709@studentergaarden.dk>

EHLO tc gurus.

New to traffic control. Unfortunately, the politicians here in Denmark
have decided that a PC is the same as a television set - so anyone owning
a PC and an internet connection of over 255 kbit/s must pay DKR 2200/year
= EUR 300 = USD 400 in television licence fees :-( This is a lot of money
for poor students, so we want to offer the students the *option* of
limiting their download speed to 255 kbit/s. The limit must be per
internal IP number (or MAC address, even better).

Situation: dorm rooms, 130 residents. The Internet connection is 100 Mbit
full duplex fiber Ethernet, never over 10% used. The router/firewall is a
Debian/Etch box, 650 MHz, 160 MB RAM, with kernel 2.6, iptables,
netfilter, iproute2 & everything necessary.
eth0 = internet, eth1 = DMZ, eth2 = internal NATted network, 172.16.0.0/16

As far as I can see, this should do the trick?:

# delete all existing queue disciplines
tc qdisc del dev eth2 root

# attach queue discipline HTB to interface eth2 and give it handle 1:0
tc qdisc add dev eth2 root handle 1:0 htb

# host 1
tc class add dev eth2 parent 1:0 classid 1:1 htb rate 255kbit burst 255kbit
tc filter add dev eth2 protocol ip parent 1:0 prio 1 u32 \
   match ip dst 172.16.255.132 flowid 1:1

# host 2
tc class add dev eth2 parent 1:0 classid 1:2 htb rate 255kbit burst 255kbit
tc filter add dev eth2 protocol ip parent 1:0 prio 1 u32 \
   match ip dst 172.16.255.145 flowid 1:2

# etc etc etc

Questions:

1) Is this a good way of doing it?

2) TBF or HTB? I just chose HTB because it seems more flexible and has
sane defaults, so I don't have to think so much. Are there any
disadvantages?

3) Any clever suggestions on how to best implement the stupid law with the
least harm to our users? (For example, maybe we could have a relatively
high burst bandwidth, with the real limiting to 255 kbit/s only kicking in
after several seconds? This might make normal web surfing seem almost
unaffected?)

Thanks,
Nicolas

From rangi at ngen.net.nz Tue May 8 05:16:08 2007
From: rangi at ngen.net.nz (Rangi Biddle)
Date: Tue May 8 05:16:36 2007
Subject: [LARTC] Traffic Shaping
In-Reply-To: <000f01c790e9$19e20110$4da60330$@de>
References: <00b801c7901b$2b988100$82c98300$@net.nz> <001a01c79081$426366a0$c72a33e0$@de> <000501c790d2$72c2cf10$58486d30$@net.nz> <000f01c790e9$19e20110$4da60330$@de>
Message-ID: <02d701c7911f$3add16d0$b0974470$@net.nz>

Hi Simo,

I've just started to take a look into tcng. It looks promising, but I'm
not sure that I have the time to spend fully investigating the tool. Plus
I haven't had much luck getting tcsim to compile, as I am running a 2.6.9
kernel and tcsim is currently targeted at a 2.5.4 kernel.
What would be very helpful is something complete that I can fiddle with
and customize to my needs. I don't believe I mentioned this already, but
it is for a client that has only recently been having issues, since they
began using RDP clients. They are looking at VoIP at a later stage, but I
would like to have something at least in place to prioritize packets.

Kind regards,

Rangi

PS. I am still rather new to tc in Linux.

From: Simo [mailto:simo@mix4web.de]
Sent: Tuesday, May 08, 2007 8:49 AM
To: 'Rangi Biddle'
Subject: AW: [LARTC] Traffic Shaping

Hi Rangi,

If I have understood what you mean, I'd say you need to use the PRIO
queuing discipline. With this qdisc you can define a number of bands
(priority FIFOs) to serve the network packets, and you don't need to
divide the bandwidth.

Here is a link to an illustration:

http://www.linux-ip.net/articles/Traffic-Control-HOWTO/images/pfifo_fast-qdisc.png

The problem with this qdisc is that if too many high-priority packets are
enqueued, the rest of the traffic in the lower-priority bands (FIFOs) will
be ignored and will have a high latency.

That's why you can use the prio qdisc combined with the tbf qdisc. I think
that will solve your problem.

How do you use the Linux traffic control system? Do you use the tcng tool?

If so, I can send you a script for your problem, and we can simulate it
with the tcsim component of the tcng tool before use.

Sorry for my English, I'm from Morocco and I'm studying in Germany ;)

Kind regards
Simo

From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Rangi Biddle
Sent: Monday, 7 May 2007 20:07
To: lartc@mailman.ds9a.nl
Subject: RE: [LARTC] Traffic Shaping

Hi Simo,

Thanks for the info. Very interesting read. I forgot to mention in the
post that I am still relatively new to traffic shaping with Linux, but I
was still able to more than comprehend the info in that document. Many
thanks again.
One thing that I am slightly uncertain of, though, is that I would prefer
not to divide the bandwidth between x number of people but rather designate
the priority that packets take over each other, which that info doesn't
cover. Is it still possible to accomplish this using HFSC?

Kind regards,

Rangi

From: Simo [mailto:simo@mix4web.de]
Sent: Monday, May 07, 2007 8:25 PM
To: 'Rangi Biddle'; lartc@mailman.ds9a.nl
Subject: AW: [LARTC] Traffic Shaping

Hi Rangi,

Bandwidth is important, but VoIP needs more than that: voice traffic
needs low packet latency. That's why bandwidth shaping alone may not
solve your problem.

In this case an HFSC queuing discipline is used instead of HTB, because
it can separate bandwidth from delay.

You can find more information about this here:

http://linux-ip.net/articles/hfsc.en/

bye
Simo

From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Rangi Biddle
Sent: Sunday, 6 May 2007 22:15
To: lartc@mailman.ds9a.nl
Subject: [LARTC] Traffic Shaping

Dear List,

I am wanting to perform some traffic shaping, as the subject of this email
suggests. What I am wanting to do is this: I would like to have traffic
shaping performed on the following protocols: HTTP, RDP, GRE, PPTP, SIP
and IAX. Obviously I would like to have the highest priority set for voice
packets, so much so that the general HTTP traffic does not impede the
voice packets. I would like to have ample bandwidth available for RDP so
that I am able to connect to a remote site without too much lag, enough
that most tasks can be done. HTTP traffic would possibly have the lowest
priority of all the protocols that I have listed. So, to clarify, the
priority would be something like this:

1. IAX
2. SIP
3. GRE
4. PPTP
5. RDP
6.
HTTP

I have a Linux gateway that I will use for performing the traffic shaping,
set up in the following way:

-------------              ------------              ---------
|   ADSL    | <----------> |  LINUX   | <----------> |  LAN  |
-------------              ------------              ---------

I plan to have the ADSL router forward all traffic to the Linux gateway
using something similar to a BIMAP rule, where all incoming and outgoing
traffic is made to appear to come from the public IP address.

I welcome any and all suggestions but would possibly prefer the most
elegant of solutions :)

Many thanks in advance

Rangi

From michael at hotplate.co.nz Tue May 8 05:32:49 2007
From: michael at hotplate.co.nz (Michael Fincham)
Date: Tue May 8 05:32:45 2007
Subject: [LARTC] vconfig + Q in Q or 'vlan stacking'
Message-ID: <1178595169.5360.1.camel@michael-desktop>

Hello again everyone,

Does anyone know the status of Q in Q or vlan stacking in the Linux
kernel? I've tried just adding a vlan on a vlan interface, but the
receiving end seems to just see the inner vlan tag and not the outer one
first.

-Michael

From pwl at 4me.pl Tue May 8 13:44:29 2007
From: pwl at 4me.pl (=?iso-8859-2?Q?Piotr_W=F3jcicki?=)
Date: Tue May 8 13:44:46 2007
Subject: [LARTC] Token Bucket Filter and Dropping
Message-ID: <002a01c79166$3e53a7b0$bafaf710$@pl>

I am trying to create my own Token Bucket Filter. However, I have a
problem with packet dropping.

Scenario:
I have two streams, 20KB/s each.
I have one bucket with a rate of 20KB/s.

I put both streams into this bucket.

When the buffer is full, packets need to be dropped. The problem is that
only every other packet needs to be dropped in this scenario.
The streams are the same, so the queue looks like this:

S1 | S2 | S1 | S2

Packets from both streams alternate one by one.
The result is that all packets from stream S1 are dropped and all packets from stream S2 are sent. Ideally half of the dropped packets would be from S1 and half from S2.

What are possible solutions to this problem?

Piotr Wojcicki

From marco.casaroli at gmail.com Tue May 8 16:03:43 2007
From: marco.casaroli at gmail.com (Marco Aurelio)
Date: Tue May 8 16:03:51 2007
Subject: [LARTC] Token Bucket Filter and Dropping
In-Reply-To: <002a01c79166$3e53a7b0$bafaf710$@pl>
References: <002a01c79166$3e53a7b0$bafaf710$@pl>
Message-ID: <92ed523b0705080703t2540a164t3857af45d60c6e16@mail.gmail.com>

You need a hierarchical token bucket for that. Have you tried HTB?

On 5/8/07, Piotr Wójcicki wrote:
> I am trying to create my own Token Bucket Filter. However, I have a problem
> with packet dropping.
>
> Scenario :
> I got two streams 20KB/s each.
> I got one bucket with rate 20KB/s
>
> I put both streams into this bucket.
>
> When buffer is full packets need to be dropped. The problem is that only
> every other packet needs to be dropped in this scenario.
> Streams are the same so queue looks like that :
>
> S1 | S2 | S1 | S2
>
> Packets from both streams are one by one.
> The result is that all packets from stream S1 are being dropped and all
> packets from Stream S2 are being sent.
> Ideally half of dropped packets would be from S1 and half from S2.
>
> What are possible solutions to this problem ?
>
>
> Piotr Wojcicki
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>

--
Marco

From nic-lartc at studentergaarden.dk Tue May 8 16:44:41 2007
From: nic-lartc at studentergaarden.dk (nic-lartc@studentergaarden.dk)
Date: Tue May 8 16:44:31 2007
Subject: [LARTC] limit bandwidth per host question
In-Reply-To: <00b201c79148$e465d420$0402a8c0@southern>
References: <463FD790.8010709@studentergaarden.dk> <00b201c79148$e465d420$0402a8c0@southern>
Message-ID: <46408CD9.2090101@studentergaarden.dk>

If you mean BrazilFW (http://www.brazilfw.com.br), this is not an option - we have a well functioning firewall with 4 interfaces, VPN, logging, an advanced quota system etc. We do not want a micro-floppy distro - we just need to add traffic control to the existing Debian box.

Nicolas

hareram wrote:
> Hi
>
> look at the BFW does the job of all you need
>
> hare
> ----- Original Message ----- From:
> To:
> Sent: Tuesday, May 08, 2007 7:21 AM
> Subject: [LARTC] limit bandwidth per host question
>
>
>> EHLO tc gurus.
>>
>> New to traffic control. Unfortunately, the politicians here in
>> Denmark have decided that a PC is the same as a television set - so
>> anyone owning a PC and internet connection of over 255 kbit/s must
>> pay DKR 2200/year = EUR 300 = USD 400 in television licence fees :-(
>> This is a lot of money for poor students, so we want to offer the
>> students the *option* of limiting their download speed to 255 kbit/s.
>> Limit must be per internal IP number (or MAC address, even better).
>>
>> Situation: dorm rooms, 130 residents, Internet connection is 100 Mbit
>> full duplex fiber Ethernet, never over 10% used. Router/firewall is a
>> Debian/Etch box 650 MHz, 160 MB RAM, with kernel 2.6, iptables,
>> netfilter iproute2 & everything necessary.
>>
>> eth0 = internet, eth1 = DMZ, eth2 = internal NATted network,
>> 172.16.0.0/16
>>
>> As far as I can see, this should do the trick?:
>>
>> # delete all existing queue disciplines
>> tc qdisc del dev eth2 root
>>
>> # attach queue discipline HTB to interface eth2 and give it handle 1:0
>> tc qdisc add dev eth2 root handle 1:0 htb
>>
>> # host 1
>> tc class add dev eth2 parent 1:0 classid 1:1 htb rate 255kbit burst 255kbit
>> tc filter add dev eth2 protocol ip parent 1:0 prio 1 u32 \
>> match ip dst 172.16.255.132 flowid 1:1
>>
>> # host 2
>> tc class add dev eth2 parent 1:0 classid 1:2 htb rate 255kbit burst 255kbit
>> tc filter add dev eth2 protocol ip parent 1:0 prio 1 u32 \
>> match ip dst 172.16.255.145 flowid 1:2
>>
>> # etc etc etc
>>
>> Questions:
>>
>> 1) Is this a good way of doing it?
>>
>> 2) TBF or HTB? I just chose HTB because it seems more flexible and
>> has sane defaults, so I don't have to think so much. Are there any
>> disadvantages?
>>
>> 3) Any clever suggestions on how to best implement the stupid law
>> with the least harm to our users (for example, maybe we could have a
>> relatively high burst bandwidth, with the real limiting to 255 Kbit/s
>> only kicking in after several seconds? This might make normal web
>> surfing seem almost unaffected?
>>
>> Thanks,
>> Nicolas
>>
>

From pwl at 4me.pl Tue May 8 17:08:27 2007
From: pwl at 4me.pl (Piotr Wójcicki)
Date: Tue May 8 17:08:37 2007
Subject: [LARTC] Token Bucket Filter and Dropping
In-Reply-To: <92ed523b0705080703t2540a164t3857af45d60c6e16@mail.gmail.com>
References: <002a01c79166$3e53a7b0$bafaf710$@pl> <92ed523b0705080703t2540a164t3857af45d60c6e16@mail.gmail.com>
Message-ID: <003201c79182$bba02880$32e07980$@pl>

I am actually writing my own filter. So separate streams shouldn't be put in one bucket?
But I believe it is possible in HTB...

-----Original Message-----
From: Marco Aurelio [mailto:marco.casaroli@gmail.com]
Sent: Tuesday, May 08, 2007 4:04 PM
To: Piotr Wójcicki; lartc@mailman.ds9a.nl
Subject: Re: [LARTC] Token Bucket Filter and Dropping

you need hierarchical token bucket for that
have you tried HTB?

On 5/8/07, Piotr Wójcicki wrote:
> I am trying to create my own Token Bucket Filter. However, I have a problem
> with packet dropping.

From salatiel.filho at gmail.com Tue May 8 17:25:25 2007
From: salatiel.filho at gmail.com (Salatiel Filho)
Date: Tue May 8 17:25:30 2007
Subject: [LARTC] Re: IMQ KERNEL PANIC 2.6.17.14 AND 2.6.21.1 No chain/target/match by that name
In-Reply-To:
References:
Message-ID:

On 5/7/07, Salatiel Filho wrote:
> After starting to shape local traffic now i am getting a lot of kernel
> panics in tcp_retransmit, so i decided to update my kernel from
> 2.6.17.14 to 2.6.21.1 , the problem is that after that i get:
>
> # iptables -t mangle -A POSTROUTING -o eth0 -j IMQ --todev 0
> iptables: No chain/target/match by that name
>
> so i can not redirect traffic to IMQ device.
>
>
> and modules are loaded.
> -
> # lsmod
> Module                  Size  Used by
> ipt_ipp2p               6656  2
> ipt_MASQUERADE          2688  1
> sch_sfq                 4864  31
> cls_u32                 6660  8
> sch_htb                14208  2
> ipt_IMQ                 1792  0
> imq                     3592  0
> xt_mac                  1792  19
> ipt_LOG                 5504  2
> xt_limit                2304  2
> xt_multiport            3200  4
> xt_state                2176  3
> iptable_mangle          2304  1
> iptable_nat             6020  1
> nf_nat                 13996  2 ipt_MASQUERADE,iptable_nat
> nf_conntrack_ipv4      12940  5 iptable_nat
> nf_conntrack           46584  5 ipt_MASQUERADE,xt_state,iptable_nat,nf_nat,nf_conntrack_ipv4
> nfnetlink               4888  3 nf_nat,nf_conntrack_ipv4,nf_conntrack
> iptable_filter          2436  1
> ip_tables               9560  3 iptable_mangle,iptable_nat,iptable_filter
> usbhid                 19424  0
> uhci_hcd               18836  0
> via_rhine              18456  0
> 3c59x                  35820  0
>
> Any help ??
>
>
> Regards,
> Salatiel
>
> "The greatest pleasure of the intelligent is to play the fool
> in front of a fool who plays the intelligent."
>

With 2.6.20.11 the iptables command works, but I still get a kernel panic :/
What is the problem with redirecting local traffic to IMQ? I redirect squid traffic to the IMQ device. [I need this behaviour]

--
Regards,
Salatiel

"The greatest pleasure of the intelligent is to play the fool in front of a fool who plays the intelligent."

From stas at crypt.org.ru Tue May 8 20:10:40 2007
From: stas at crypt.org.ru (Stanislav Kruchinin)
Date: Tue May 8 20:10:10 2007
Subject: [LARTC] Massive filtering
In-Reply-To: <200705050130.AA2025718096@ipro.net>
References: <200705050130.AA2025718096@ipro.net>
Message-ID: <4640BD20.5030306@crypt.org.ru>

ericr wrote:
> My first thought for scaling up was to use the hash tables, and I am
> feeling that the last line in lartc's document page "12.4. Hashing filters
> for very fast massive filtering" which says "Note that this example could
> be improved to the ideal case where each chain contains 1 filter!" is a
> little misleading since no divisor above 256 works. On first reading, I'm
> thinking, yeh, I'll just put a divisor of 16777216 and my problems are
> solved... nope.. wrong answer. I haven't even gotten to the point where I
> issue 32 million filter rules to tc and see if it chokes.

The only solution for thousands of rules is the u32 classifier with hashing filters. Unfortunately, the divisor's upper limit is 256, which is not adequate for practical tasks. On the other hand, hashes with a very large number of buckets (like the 16777216 you mention) can't be implemented, because they would require much more RAM than you can address.

I have a similar task, and some days ago I started to work on patches for tc and the u32 classifier that will allow the use of large hashes (see my recent messages in the linux-net@ mailing list archive). I'm a newbie to the Linux kernel and I can't complete this task quickly. I think we should ask experienced developers for help.
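
For readers following the hashing discussion, the two-level u32 layout from LARTC HOWTO section 12.4 can be sketched roughly as below. This is only a sketch under assumptions: the device eth2, the 172.16.255.0/24 subnet, the prio value and the helper names hash_setup and hash_host are made up for illustration, and the script only prints the tc commands instead of running them. It hashes on the last octet of the destination address, so within a single /24 each of the 256 buckets holds at most one host rule.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) tc commands for a 256-bucket u32 hash
# keyed on the last octet of the destination IP address.
# DEV, NET and the helper names are illustrative assumptions.
DEV=eth2
NET=172.16.255.0/24

# Create the root u32 filter, hash table 2: with 256 buckets, and the rule
# that hashes packets for $NET into table 2: (the destination address sits
# at byte offset 16 of the IP header).
hash_setup() {
    echo "tc filter add dev $DEV parent 1:0 prio 5 protocol ip u32"
    echo "tc filter add dev $DEV parent 1:0 prio 5 handle 2: protocol ip u32 divisor 256"
    echo "tc filter add dev $DEV parent 1:0 prio 5 protocol ip u32 ht 800:: match ip dst $NET hashkey mask 0x000000ff at 16 link 2:"
}

# hash_host <ip> <classid>: put one host rule into the bucket named by the
# hex value of the address's last octet.
hash_host() {
    bucket=$(printf '%x' "${1##*.}")
    echo "tc filter add dev $DEV parent 1:0 prio 5 protocol ip u32 ht 2:$bucket: match ip dst $1 flowid $2"
}

hash_setup
hash_host 172.16.255.132 1:1
hash_host 172.16.255.145 1:2
```

Larger networks would need one such 256-bucket table per /24 plus another hash level in front of them, which is the workaround the divisor limit forces.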
From andy at andybev.com Tue May 8 20:19:38 2007
From: andy at andybev.com (Andrew Beverley)
Date: Tue May 8 20:22:25 2007
Subject: [LARTC] Re: IMQ KERNEL PANIC 2.6.17.14 AND 2.6.21.1 No chain/target/match by that name
In-Reply-To:
References:
Message-ID: <1178648378.10842.1.camel@andybev.localdomain>

> 2.6.20.11 iptables command works , but i still get kernel panic :/
> What is the problem in redirect a local traffic to IMQ ? I redirect
> squid traffic to the IMQ device. [I need this behaviour]

You could try IFB, which is already in the vanilla kernel. However, it is slightly more limited as to where you can hook it.

From fssilva at gmail.com Tue May 8 21:12:59 2007
From: fssilva at gmail.com (Fabio Silva)
Date: Tue May 8 21:13:07 2007
Subject: [LARTC] Squid + iproute2
Message-ID:

Hi all,

I have a problem with this topology:

  192.168.1.7             GW            192.168.2.252
link 1 ------------------------------------ link 2
          |                              |
         eth1          PROXY           eth0
    192.168.1.245                  192.168.2.245

The default gateway of the PROXY is 192.168.1.7, and link 2 is a secondary link that I also need for going out to the internet. My internal network is 192.168.2.0/24.

I'm using this:

#!/bin/bash
#
# Legend:
#   eth0  link2
#   eth1  link1
#

# Reset the firewall
echo -n "Resetting existing rules..."
iptables -F
iptables -Z
iptables -X
iptables -t nat -F
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
echo "[OK]"

# NOTE: this rule is mutually exclusive with the next one (the NAT rule),
# i.e. choose one of the two
echo -n "Enabling masquerading..."
#iptables -t nat -A POSTROUTING -j MASQUERADE
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
echo "[OK]"

# Mark packets
echo -n "Marking packets..."
iptables -A PREROUTING -t mangle -s 192.168.2.0/24 -d 0/0 -j MARK --set-mark 3
echo "[OK]"

# Disable the martian-source packet filter
echo -n "Disabling rp_filter..."
for eee in /proc/sys/net/ipv4/conf/*/rp_filter; do
    echo 0 > "$eee"
done
echo "[OK]"

# Define the link balancing rules
echo -n "Balancing links..."

# link #1
ip route add 192.168.1.0/24 dev eth1 src 192.168.1.245 table link1
#ip route add 192.168.0.0/24 via 192.168.0.1 table link1
ip route add default via 192.168.1.7 table link1

# default link
ip route add 192.168.2.0/24 dev eth0 src 192.168.2.245 table link
#ip route add 192.168.0.0/24 via 192.168.0.1 table link
ip route add default via 192.168.2.252 table link

# main routing table
ip route add 192.168.1.0/24 dev eth1 src 192.168.1.245
ip route add 192.168.2.0/24 dev eth0 src 192.168.2.245

# set the preferred route
ip route add default via 192.168.1.7

# table rules
ip rule add from 192.168.1.245 table link1
ip rule add from 192.168.2.245 table link

# link balancing
ip rule add fwmark 3 lookup link prio 3
ip route add default table link nexthop via 192.168.1.7 dev eth1 weight 1 nexthop via 192.168.2.252 dev eth0 weight 1

# flush the routing cache
ip route flush cache
echo "[OK]"
sleep 2

But if I shut down the link to 192.168.1.7, traffic is not re-routed to the other gateway, 192.168.2.252. Any clue?

Regards,
--
Fabio S. Silva

From etienne.carriere at philips.com Wed May 9 11:53:51 2007
From: etienne.carriere at philips.com (Etienne Carriere)
Date: Wed May 9 11:52:29 2007
Subject: [LARTC] limit bandwidth per host question
Message-ID:

Hi,

Maybe you should look at HTB and its 'ceil' parameter, which limits bandwidth to an upper bound. See the HTB user guide, section "4. Ceiling":
http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#ceiling

etienne

nic-lartc@studentergaarden.dk wrote: (Sent by: lartc-bounces@mailman.ds9a.nl 08/05/2007 03:51)
> EHLO tc gurus.
>
> New to traffic control. Unfortunately, the politicians here in Denmark
> have decided that a PC is the same as a television set - so anyone
> owning a PC and internet connection of over 255 kbit/s must pay DKR
> 2200/year = EUR 300 = USD 400 in television licence fees :-( This is a
> lot of money for poor students, so we want to offer the students the
> *option* of limiting their download speed to 255 kbit/s. Limit must be
> per internal IP number (or MAC address, even better).
>
> Situation: dorm rooms, 130 residents, Internet connection is 100 Mbit
> full duplex fiber Ethernet, never over 10% used. Router/firewall is a
> Debian/Etch box 650 MHz, 160 MB RAM, with kernel 2.6, iptables,
> netfilter iproute2 & everything necessary.
>
> eth0 = internet, eth1 = DMZ, eth2 = internal NATted network, 172.16.0.0/16
>
> As far as I can see, this should do the trick?:
>
> # delete all existing queue disciplines
> tc qdisc del dev eth2 root
>
> # attach queue discipline HTB to interface eth2 and give it handle 1:0
> tc qdisc add dev eth2 root handle 1:0 htb
>
> # host 1
> tc class add dev eth2 parent 1:0 classid 1:1 htb rate 255kbit burst 255kbit
> tc filter add dev eth2 protocol ip parent 1:0 prio 1 u32 \
> match ip dst 172.16.255.132 flowid 1:1
>
> # host 2
> tc class add dev eth2 parent 1:0 classid 1:2 htb rate 255kbit burst 255kbit
> tc filter add dev eth2 protocol ip parent 1:0 prio 1 u32 \
> match ip dst 172.16.255.145 flowid 1:2
>
> # etc etc etc
>
> Questions:
>
> 1) Is this a good way of doing it?
>
> 2) TBF or HTB? I just chose HTB because it seems more flexible and has
> sane defaults, so I don't have to think so much. Are there any
> disadvantages?
>
> 3) Any clever suggestions on how to best implement the stupid law with
> the least harm to our users (for example, maybe we could have a
> relatively high burst bandwidth, with the real limiting to 255 Kbit/s
> only kicking in after several seconds? This might make normal web
> surfing seem almost unaffected?
>
> Thanks,
> Nicolas

From johnnyb at marlboro.edu Wed May 9 14:59:15 2007
From: johnnyb at marlboro.edu (John Baker)
Date: Wed May 9 14:59:28 2007
Subject: [LARTC] snmp, cacti and shaping
Message-ID: <4641C5A3.601@marlboro.edu>

Hi,

I'm trying to move the tracking of the shaping from MRTG to Cacti. My predecessor, who built all this stuff and was far more advanced than I, had a shell script that collected data by running tc -s qdisc show dev on both eth0 and eth1 and then pushing it out via SNMP to another server running MRTG. I'm building a new server with Cacti and having trouble with the custom templates. Does anyone have any templates/scripts that would help?

Thanks

--
John Baker
Network Systems Administrator
Marlboro College
Phone: 451-7551 off campus; 551 on campus

From francis at aspl.es Wed May 9 17:00:06 2007
From: francis at aspl.es (Francis Brosnan Blazquez)
Date: Wed May 9 17:00:00 2007
Subject: [LARTC] Load balancing using connmark
Message-ID: <1178722806.7492.55.camel@vulcan.aspl>

Hi,

I've been implementing a load balancing solution using CONNMARK, based on the solution described by Luciano Ruete at [1]. Thanks for the post and for pointing in the right direction, Luciano!

Once it was implemented, I found that for some reason packets aren't properly marked (or are improperly re-marked) and are sent out on the wrong interface.

My topology is:

[82.123.136.74]: eth1 : mark:0x1 --\
                                    +--[FW BOX] -- eth0: 192.168.0.53
[217.146.74.82]: eth2 : mark:0x2 --/

The conntrack tool shows that, after a while, connections marked 0x2 or 0x1 start to appear that do not belong to the proper source IP.

>> conntrack -L | grep mark=2 | grep '82.123.136.74'; conntrack -L | grep mark=1 | grep '217.146.74.82'
tcp 6 425543 ESTABLISHED src=192.168.0.178 dst=82.216.53.249 sport=1552 dport=443 packets=818 bytes=93471 src=82.216.53.249 dst=82.123.136.74 sport=443 dport=1552 packets=875 bytes=83909 [ASSURED] mark=2 use=1
tcp 6 428681 ESTABLISHED src=192.168.0.177 dst=89.139.122.12 sport=2361 dport=443 packets=122 bytes=29381 src=89.139.122.12 dst=82.123.136.74 sport=443 dport=2361 packets=139 bytes=14120 [ASSURED] mark=2 use=1

This is quite odd, since the solution proposed at [1] looks good. I'll cite it here for clarity (suppose I already have all the ip rule stuff installed):

iptables -t mangle -A POSTROUTING -m mark --mark ! 0 -j ACCEPT
iptables -t mangle -A POSTROUTING -o eth1 -j MARK --set-mark 0x1
iptables -t mangle -A POSTROUTING -o eth2 -j MARK --set-mark 0x2
iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark

After trying this for several days, I found that another firewall solution, shorewall [2], implements built-in load balancing using the following set of instructions:

iptables -t mangle -A PREROUTING -m connmark ! --mark 0/0xFF -j CONNMARK --restore-mark --mask 0xFF
iptables -t mangle -A OUTPUT -m connmark ! --mark 0/0xFF -j CONNMARK --restore-mark --mask 0xFF
iptables -t mangle -N routemark
iptables -t mangle -A PREROUTING -i eth1 -m mark --mark 0/0xFF -j routemark
iptables -t mangle -A routemark -i eth1 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -i eth2 -m mark --mark 0/0xFF -j routemark
iptables -t mangle -A routemark -i eth2 -j MARK --set-mark 2
iptables -t mangle -A routemark -m mark ! --mark 0/0xFF -j CONNMARK --save-mark --mask 0xFF

After a bit of testing, the second solution seems to behave better, doing all the marking in PREROUTING and OUTPUT.

Has anybody else found that some packets don't get properly routed according to the mark with the first solution? What do you think about the second solution?

Cheers!

[1] http://mailman.ds9a.nl/pipermail/lartc/2006q2/018964.html
[2] http://www.shorewall.net

--
Francis Brosnan Blazquez
Advanced Software Production Line, S.L.

From rabbit at rabbit.us Wed May 9 18:33:15 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Wed May 9 18:33:24 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <1178722806.7492.55.camel@vulcan.aspl>
References: <1178722806.7492.55.camel@vulcan.aspl>
Message-ID: <4641F7CB.3000209@rabbit.us>

Francis Brosnan Blazquez wrote:
> Hi,
>
> I've been implementing a load balancing solution using CONNMARK, based
> on solution described by Luciano Ruete at [1]. Thanks for the post and for
> pointing in the right direction, Luciano!
>
> Once implemented, I've found that due to some reason packets aren't
> properly marked (or improperly remarked) and sent out using the wrong
> interface.
>
>
>
> iptables -t mangle -A POSTROUTING -m mark --mark ! 0 -j ACCEPT
> iptables -t mangle -A POSTROUTING -o eth1 -j MARK --set-mark 0x1
> iptables -t mangle -A POSTROUTING -o eth2 -j MARK --set-mark 0x2
> iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark

This is wrong. POSTROUTING is exactly what it says: _POST_ routing. By the time you do your marks and stuff, the kernel has _already_ assigned the packet to an interface, and you cannot alter this anymore.

> After a bit of testing with the second solution, it seems to behave
> better, doing all marking job at the PREROUTING and OUTPUT.

This is flawed too. OUTPUT suffers from the very same problem as POSTROUTING - by the time the packets hit the NF stack, the process has already bound itself to an interface, which you cannot change anymore.

Peter

From default at advaita.sytes.net Wed May 9 22:24:22 2007
From: default at advaita.sytes.net (John Default)
Date: Wed May 9 22:24:28 2007
Subject: [LARTC] Token Bucket Filter and Dropping
In-Reply-To: <002a01c79166$3e53a7b0$bafaf710$@pl>
References: <002a01c79166$3e53a7b0$bafaf710$@pl>
Message-ID: <46422DF6.7000405@advaita.sytes.net>

Hi,

No need for HTB; a simple TBF will do. But if you are creating your own: as I understand the token bucket, you should take each packet from the front of the queue in the order they are there, i.e. S1, S2, S1, S2, not just every other one. When the bucket is full, you drop every packet on input. It is unlikely that space frees up in the buffer exactly when an S2 packet arrives, every time... You would then have a buffer full of S2, not S1|S2|S1... as you say. Are you sure you are dequeuing from the front and doing tail-drop?

As I understand it:

dequeue at constant rate<--que_front_s1,s2,s1,s2,s1,s2_que_tail<--enqueue input or drop

Maybe changing the token size could help mix this up... How big is one token now? (I do not know how your TBF is implemented...) Can you give more details?

By the way: how are you creating those streams? I think it is unusual to see such perfectly interleaved packets from two streams in real life...

If I am completely off, sorry, I'm just a beginner.

Best regards
(default)

Piotr Wójcicki wrote:
> I am trying to create my own Token Bucket Filter. However, I have a problem
> with packet dropping.
>
> Scenario :
> I got two streams 20KB/s each.
> I got one bucket with rate 20KB/s
>
> I put both streams into this bucket.
>
> When buffer is full packets need to be dropped. The problem is that only
> every other packet needs to be dropped in this scenario.
> Streams are the same so queue looks like that :
>
> S1 | S2 | S1 | S2
>
> Packets from both streams are one by one.
> The result is that all packets from stream S1 are being dropped and all
> packets from Stream S2 are being sent.
> Ideally half of dropped packets would be from S1 and half from S2.
>
> What are possible solutions to this problem ?
>
>
> Piotr Wojcicki
>

From lists at andyfurniss.entadsl.com Wed May 9 23:39:24 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Wed May 9 23:39:23 2007
Subject: [LARTC] limit bandwidth per host question
In-Reply-To: <463FD790.8010709@studentergaarden.dk>
References: <463FD790.8010709@studentergaarden.dk>
Message-ID: <46423F8C.1090905@andyfurniss.entadsl.com>

nic-lartc@studentergaarden.dk wrote:
> EHLO tc gurus.
>
> New to traffic control. Unfortunately, the politicians here in Denmark
> have decided that a PC is the same as a television set - so anyone
> owning a PC and internet connection of over 255 kbit/s must pay DKR
> 2200/year = EUR 300 = USD 400 in television licence fees :-( This is a
> lot of money for poor students, so we want to offer the students the
> *option* of limiting their download speed to 255 kbit/s. Limit must be
> per internal IP number (or MAC address, even better).

Eww - nasty. Is the law watertight? E.g. pay one licence fee and run a proxy as a workaround; maybe they thought of that :-)

>
> Situation: dorm rooms, 130 residents, Internet connection is 100 Mbit
> full duplex fiber Ethernet, never over 10% used. Router/firewall is a
> Debian/Etch box 650 MHz, 160 MB RAM, with kernel 2.6, iptables,
> netfilter iproute2 & everything necessary.
>
> eth0 = internet, eth1 = DMZ, eth2 = internal NATted network, 172.16.0.0/16
>

Another thought - do they have link layer access to each other ....

> As far as I can see, this should do the trick?:
>
> # delete all existing queue disciplines
> tc qdisc del dev eth2 root
>
> # attach queue discipline HTB to interface eth2 and give it handle 1:0
> tc qdisc add dev eth2 root handle 1:0 htb
>
> # host 1
> tc class add dev eth2 parent 1:0 classid 1:1 htb rate 255kbit burst 255kbit
> tc filter add dev eth2 protocol ip parent 1:0 prio 1 u32 \
> match ip dst 172.16.255.132 flowid 1:1
>
> # host 2
> tc class add dev eth2 parent 1:0 classid 1:2 htb rate 255kbit burst 255kbit
> tc filter add dev eth2 protocol ip parent 1:0 prio 1 u32 \
> match ip dst 172.16.255.145 flowid 1:2
>
> # etc etc etc
>
> Questions:
>
> 1) Is this a good way of doing it?
>
> 2) TBF or HTB? I just chose HTB because it seems more flexible and has
> sane defaults, so I don't have to think so much. Are there any
> disadvantages?
>
> 3) Any clever suggestions on how to best implement the stupid law with
> the least harm to our users (for example, maybe we could have a
> relatively high burst bandwidth, with the real limiting to 255 Kbit/s
> only kicking in after several seconds? This might make normal web
> surfing seem almost unaffected?

Burst is a good idea. I think HTB with prio on the leafs could be slightly nicer - e.g. with so little bandwidth, users may want to game while downloading, so giving UDP and common TCP game ports prio over other TCP will mean they can do both.

Another way could be to use policers - they are a bit crude but do keep latency low, and they also have a burst/buffer parameter. Maybe set up some tests to see which is nicer at 255kbit.

If you only have to shape downloads to comply, users may find that most of their bandwidth gets used by the ACKs when they upload.

Andy.
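
The burst idea discussed above could be sketched along these lines. It is a dry-run sketch only: the device eth2 and the two host addresses come from the thread, while the 200k burst value and the helper name host_limit are assumptions for illustration and would need testing on the real link. Setting both burst and cburst lets a host send a short run of data at line speed before the 255kbit rate takes over.

```shell
#!/bin/sh
# Dry-run sketch: print per-host HTB classes limited to 255kbit with a
# generous burst allowance. BURST and the helper name host_limit are
# illustrative assumptions, not measured values.
DEV=eth2
RATE=255kbit
BURST=200k      # bytes a host may send quickly before the rate limit applies

# host_limit <ip> <minor>: one HTB class plus one u32 filter per host.
# tc reads the class minor number as hex; decimal digits still give
# unique ids, so a plain counter is good enough for a sketch.
host_limit() {
    echo "tc class add dev $DEV parent 1:0 classid 1:$2 htb rate $RATE ceil $RATE burst $BURST cburst $BURST"
    echo "tc filter add dev $DEV protocol ip parent 1:0 prio 1 u32 match ip dst $1 flowid 1:$2"
}

echo "tc qdisc add dev $DEV root handle 1:0 htb"
minor=1
for ip in 172.16.255.132 172.16.255.145; do
    host_limit "$ip" "$minor"
    minor=$((minor + 1))
done
```

Piping the output to sh on a test box and comparing a long download against ordinary web browsing is one way to see whether the burst size feels right.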
From kristiadi_himawan at dtp.net.id Thu May 10 04:27:10 2007
From: kristiadi_himawan at dtp.net.id (Kristiadi Himawan)
Date: Thu May 10 04:29:18 2007
Subject: [LARTC] snmp, cacti and shaping
In-Reply-To: <4641C5A3.601@marlboro.edu>
Message-ID:

Hi John,

If you still have the script for MRTG, you could also use it for Cacti with a few modifications. Here is a step-by-step tutorial on creating a graph from a script:

http://docs.cacti.net/node/300

Regards,
Kris

On 5/9/2007, "John Baker" wrote:
>Hi
>
>I'm trying to move the tracking of the shaping from MRTG to Cacti. My
>predecessor, who built all this stuff and was far more advanced than I,
>had a shell script that collected data by running tc -s qdisc show dev
>on both eth 0 and 1 and then pushing it out via snmp to another server
>running MRTG. I'm building a new server with cacti and having trouble
>with the custom templates. Does anyone have any templates/scripts that
>would help?
>
>Thanks
>
>--
>John Baker
>Network Systems Administrator
>Marlboro College
>Phone: 451-7551 off campus; 551 on campus

From adm.acacio at digi.com.br Thu May 10 06:11:17 2007
From: adm.acacio at digi.com.br (Acácio Alves dos Santos)
Date: Thu May 10 06:11:26 2007
Subject: [LARTC] Problem with ipp2p 0.8.2
Message-ID: <56ADBD00-2885-4ED4-8BB8-2B2D7892EF02@digi.com.br>

Hello guys (and girls, if there are any here :) ),

I have a box with IPP2P installed on it (Debian Etch, ipp2p 0.8.2 tarball, iptables 1.3.6 and kernel 2.6.21), and I've identified a problem:

I use iptables to apply a mark to the traffic that ipp2p classifies as p2p. In my tc rules, I have granted bandwidth to many traffic classes (http, ssh, streaming, games, p2p, etc.) plus one last class where all the unclassified traffic goes.

The problem is: the accuracy of ipp2p isn't good on my box, so the unclassified class carries a lot of traffic, and some things that I have deliberately sent to this class (like FTP traffic in passive mode, or some games that use a "p2p-like" protocol) suffer from the excess traffic.

Sometimes, when I reboot my box and install ipp2p again, it works, and the unclassified class goes back to "normal".

Does any of you have a suggestion? I'm going crazy with this problem...

--
Acácio Alves dos Santos
Network Administration
Diginet Brasil
adm.acacio@digi.com.br
(+55) 84 4008-9000

This message, including its attachments, may contain confidential and/or privileged information. If you are not the recipient or authorized person to receive this message, you must not use, copy, disclose or take any action based on this message or any information herein. If you received this message by mistake, please advise the sender immediately by replying to the e-mail and deleting this message. Thank you for your cooperation.
From gsmdib at gmail.com Thu May 10 07:53:15 2007
From: gsmdib at gmail.com (Alex Girchenko)
Date: Thu May 10 07:53:20 2007
Subject: [LARTC] gw, lsrc in julian's patches
Message-ID: <567e52cd0705092253l74e72633m18b042a77aae4a8b@mail.gmail.com>

In http://www.ssi.bg/~ja/dgd.txt I read:

--
- key "gw" for ip_route_output used to select the right route for the gateway
- key "lsrc" for ip_route_input used to find the best unicast route between this IP and the destination address (similar to output routing call but still makes the checks needed for input packet).
--

Could someone please provide a couple of examples of this? I was unable to find any info in the LARTC list.

The second question is: do the patches add a way to monitor link state that could be used e.g. from a shell script (I need to adjust netfilter policy according to the link state)? Could you please advise a way to do this?

TIA.

From salim.si at cipherium.com.tw Thu May 10 08:15:23 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Thu May 10 08:15:58 2007
Subject: [LARTC] Load balancing using connmark
Message-ID: <000401c792ca$9ba1cdb0$5964a8c0@SalimSi>

Francis Brosnan Blazquez wrote:
> Hi,
>
> I've been implementing a load balancing solution using CONNMARK, based
> on solution described by Luciano Ruete at [1]. Thanks for the post and for
> pointing in the right direction, Luciano!
>
> Once implemented, I've found that due to some reason packets aren't
> properly marked (or improperly remarked) and sent out using the wrong
> interface.
>
>
>
> iptables -t mangle -A POSTROUTING -m mark --mark ! 0 -j ACCEPT
> iptables -t mangle -A POSTROUTING -o eth1 -j MARK --set-mark 0x1
> iptables -t mangle -A POSTROUTING -o eth2 -j MARK --set-mark 0x2
> iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark

This is wrong. POSTROUTING is exactly what it says: _POST_ routing. By the time you do your marks and stuff, the kernel has _already_ assigned the packet to an interface, and you cannot alter this anymore.

> After a bit of testing with the second solution, it seems to behave
> better, doing all marking job at the PREROUTING and OUTPUT.

This is flawed too. OUTPUT suffers from the very same problem as POSTROUTING - by the time the packets hit the NF stack, the process has already bound itself to an interface, which you cannot change anymore.

Peter

I disagree with Peter. The marking in the POSTROUTING chain is a CONNMARK. It marks the connection, which has already had a route decided for it, so that all packets of the connection pass through this interface. This marking is done for packets in NEW state; see the check for mark==0 in the preceding rule. The restore-mark in PREROUTING will restore the connmark and route the subsequent packets. This approach will work, but you need some sort of statefulness in netfilter.

The second point in Brosnan Blazquez's mail, about shorewall: they seem to be doing policy routing, not real load balancing.

From salim.si at cipherium.com.tw Thu May 10 10:01:00 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Thu May 10 10:01:29 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <000401c792ca$9ba1cdb0$5964a8c0@SalimSi>
Message-ID: <000e01c792d9$5c7071a0$5964a8c0@SalimSi>

On closer look, I was wrong about shorewall. It seems to be a different approach to load balancing. They connmark the incoming packets from the WAN, rather than outgoing packets. I think it should work well, but I wonder why this approach is not popular. There must be some drawback to it. I can't think of one, though.

From diego.giardinetto at amugsiena.it Thu May 10 10:48:02 2007
From: diego.giardinetto at amugsiena.it (Diego Giardinetto [@AMUGSiena])
Date: Thu May 10 10:48:06 2007
Subject: [LARTC] connmark and masquerading
Message-ID: <21dfacd30705100148i5f3abe6cj2b353d659e559328@mail.gmail.com>

Hi all, and thanks for all the previous replies to my questions!

Here is another bit of trouble: is it possible to keep the connmark information after a packet has crossed the FORWARD chain with masquerading?

Best wishes,
Diego

--
Diego Giardinetto
Skype Name: cpuzorro
MSN: cpuoverload@hotmail.it

From francis at aspl.es Thu May 10 11:06:48 2007
From: francis at aspl.es (Francis Brosnan Blazquez)
Date: Thu May 10 11:06:40 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <000e01c792d9$5c7071a0$5964a8c0@SalimSi>
References: <000e01c792d9$5c7071a0$5964a8c0@SalimSi>
Message-ID: <1178788008.4376.13.camel@vulcan.aspl>

On Thu, 10-05-2007 at 16:01 +0800, Salim S I wrote:

Hi Salim, thanks for your reply.

> On closer look, I am wrong about shorewall. It seems to be a different
> approach to load balancing. They connmark the incoming packets from
> WAN, rather than outgoing packets. I think it should work well, but I
> wonder why this approach is not popular. There must be some drawback
> to it. I can't think of one, though.

I think the main advantage of the shorewall solution is that it applies connmark to incoming packets from the WAN, as you point out, leaving the load balancing of outgoing connections to the main table.

In any case, with this second solution I don't see wrongly routed packets on the WAN interfaces using tcpdump, whereas with the first solution I do. More testing is required.
Regarding your previous reply, can you elaborate more on "...This approach will work, but you need some sort of stateful-ness in netfilter..."?

Cheers!
--
Francis Brosnan Blazquez
Advanced Software Production Line, S.L.

From salim.si at cipherium.com.tw Thu May 10 11:22:59 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Thu May 10 11:23:18 2007
Subject: FW: [LARTC] Load balancing using connmark
Message-ID: <001901c792e4$cdc161b0$5964a8c0@SalimSi>

-----Original Message-----
From: Salim S I [mailto:salim.si@cipherium.com.tw]
Sent: Thursday, May 10, 2007 5:22 PM
To: 'Francis Brosnan Blazquez'
Subject: RE: [LARTC] Load balancing using connmark

"I think the main advantage of shorewall solution is that it applies connmark to incoming packets from the wan as you point, leaving load balancing to outgoing connections to the main table"

Actually, the main table/multipath route only routes the first packet of a connection. The subsequent routing for that connection is done based on connmark, for outgoing packets too. Otherwise replies to packets coming from WAN1 may go through WAN2.

The difference between the two solutions is only in where packets are marked and which packets are marked. Routing is the same.

For a detailed discussion of the first approach, you can refer to this thread:
http://mailman.ds9a.nl/pipermail/lartc/2006q2/018964.html

From peter at endian.it Thu May 10 12:25:37 2007
From: peter at endian.it (Peter Warasin)
Date: Thu May 10 12:25:51 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <1178722806.7492.55.camel@vulcan.aspl>
References: <1178722806.7492.55.camel@vulcan.aspl>
Message-ID: <4642F321.4050300@endian.it>

hi people

Francis Brosnan Blazquez wrote:
> I've been implementing a load balancing solution using CONNMARK, based

> After giving a try during several days, I've found that another firewall
> solution, shorewall [2], implements built-in load balancing for free by
> using the following set of instructions:

did somebody try the shorewall solution with centos 4?

with centos 4 and the first solution i always had the problem that it routes correctly only for connections passing through (forwarded). connections starting from the machine or going to the machine (input/output chain) had exactly the same behaviour as you stated before.

i noticed with centos 4 that packets do not pass the prerouting mangle chain if going to the local host (passing the input filter chain thereafter). therefore the mark will certainly not be restored and there will be no influence on the routing decision.

has someone noticed similar behaviour?
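
One thing worth ruling out here: packets generated by the box itself never traverse PREROUTING, they enter netfilter at OUTPUT, so a --restore-mark rule that exists only in PREROUTING cannot affect them. The following is a minimal sketch, assuming the connection marks are already being saved elsewhere. It adds the restore to the mangle OUTPUT chain and lists the per-rule packet counters, so one can see which chains are actually being hit. It prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: assumes connmarks are saved by other rules already in place.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

restore_for_local() {
    # Locally generated packets: restore the connection mark in OUTPUT.
    run iptables -t mangle -A OUTPUT -j CONNMARK --restore-mark
    # Exact packet/byte counters per rule, to see which chain packets pass.
    run iptables -t mangle -L PREROUTING -v -n -x
    run iptables -t mangle -L OUTPUT -v -n -x
}

restore_for_local
```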

peter

--
:: e n d i a n
:: open source - open minds

:: peter warasin
:: http://www.endian.com :: peter@endian.com

From rabbit at rabbit.us Thu May 10 12:51:26 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Thu May 10 12:51:34 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <000401c792ca$9ba1cdb0$5964a8c0@SalimSi>
References: <000401c792ca$9ba1cdb0$5964a8c0@SalimSi>
Message-ID: <4642F92E.50001@rabbit.us>

Salim S I wrote:
> Francis Brosnan Blazquez wrote:
>
>> Hi,
>>
>> I've been implementing a load balancing solution using CONNMARK, based
>> on solution described by Luciano Ruete at [1]. Thanks for the post and for
>> pointing in the right direction, Luciano!
>>
>> Once implemented, I've found that due to some reason packets aren't
>> properly marked (or improperly remarked) and sent out using the wrong
>> interface.
>>
>> iptables -t mangle -A POSTROUTING -m mark --mark ! 0 -j ACCEPT
>> iptables -t mangle -A POSTROUTING -o eth1 -j MARK --set-mark 0x1
>> iptables -t mangle -A POSTROUTING -o eth2 -j MARK --set-mark 0x2
>> iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark
>
> > This is wrong. POSTROUTING is exactly what it is: _POST_ routing. By the
> > time you do your marks and stuff the kernel has _already_ assigned a
> > packet to an interface, and you can not alter this anymore.
>
>> After a bit of testing with the second solution, it seems to behave
>> better, doing all marking job at the PREROUTING and OUTPUT.
>
> > This is flawed too. OUTPUT suffers from the very same problem as
> > POSTROUTING - by the time the packets hit the NF stack the process has
> > already bound itself to an interface, which you can not change anymore.
> >
> > Peter
>
> Disagree with Peter. The marking in postrouting table is CONNMARK. This
> is for marking the connection, which has already had a route decided for
> it, so that all packets of the connection passes through this interface.
> This marking is done for packets with NEW state, see the check for
> mark==0 in the prev. line. The restore mark in PREROUTING will restore
> the connmark and route the subsequent packets.
>
> This approach will work, but you need some sort of stateful-ness in
> netfilter.

Connmark is exactly the statefulness you are talking about. The problem is that the marks by themselves do not mean anything. You mark packets and expect iproute to classify the packet in the correct routing table etc. CONNMARK is invisible to iproute - this is why you have only --save-mark and --restore-mark, and the rest of the rules deal with real MARKs.

Further you (and the OP) seem to be confused by a mix of routing tasks.

In the case of _forwarded_ traffic, you need to make sure that all packets within a connection leave to WAN over the same interface, and are SNATed to the same ip, so that they will come back on the same interface. The SNATting is trivial (as it can be done in POSTROUTING only), but you need to set all marks before the routing takes place (which is anywhere _but_ POSTROUTING). You might mark the connection with the proper CONNMARK, and subsequent packets might get routed correctly, but the _first_ packet (the one that you use to set the mark) is already assigned to an interface, and there is nothing you can do about it.

In the case of _local_ traffic - it becomes even trickier. The problem is that when sockets are created they already have a source IP (the kernel determines that by looking at the default routing table, your marks do not exist yet). So since you can not alter the socket binding, the only way to make it leave on a different interface is by treating it as a forwarded connection and performing NAT on it. It is arguable if NATting locally originating connections is a good idea, but it can be done in OUTPUT just like it is done for forwarded connections in PREROUTING.

I hope this clarifies things a bit, feel free to point out any inconsistencies you may find.
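
The forwarded case described above (choose the link before routing, SNAT after it) can be sketched roughly as follows. Everything named here is an assumption made for illustration: eth0 as the LAN interface, eth1/eth2 as the WAN links, the addresses 192.0.2.1 and 198.51.100.1 (documentation ranges), the tables wan1/wan2, and the use of the statistic match to alternate new connections between the links. The script prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: interfaces, addresses, marks and table names are assumptions.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
LAN_IF="eth0"
WAN1_IF="eth1"; WAN1_IP="192.0.2.1"
WAN2_IF="eth2"; WAN2_IP="198.51.100.1"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

balance_forwarded() {
    # Before routing: restore the mark of an already known connection.
    run iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
    run iptables -t mangle -A PREROUTING -m mark ! --mark 0 -j ACCEPT
    # New connection from the LAN: every second one gets mark 1, the rest mark 2.
    run iptables -t mangle -A PREROUTING -i "$LAN_IF" -m conntrack --ctstate NEW -m statistic --mode nth --every 2 --packet 0 -j MARK --set-mark 1
    run iptables -t mangle -A PREROUTING -i "$LAN_IF" -m conntrack --ctstate NEW -m mark --mark 0 -j MARK --set-mark 2
    run iptables -t mangle -A PREROUTING -j CONNMARK --save-mark
    # Routing: the packet mark selects the per-link table.
    run ip rule add fwmark 1 table wan1
    run ip rule add fwmark 2 table wan2
    # After routing: SNAT to the address of whichever link was chosen.
    run iptables -t nat -A POSTROUTING -o "$WAN1_IF" -j SNAT --to-source "$WAN1_IP"
    run iptables -t nat -A POSTROUTING -o "$WAN2_IF" -j SNAT --to-source "$WAN2_IP"
}

balance_forwarded
```

Because the mark is set in PREROUTING, the first packet of a connection is already routed by the table its mark selects, which is the point made above about POSTROUTING being too late for that packet.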

Peter

From rabbit at rabbit.us Thu May 10 12:59:58 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Thu May 10 13:00:03 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <4642F92E.50001@rabbit.us>
References: <000401c792ca$9ba1cdb0$5964a8c0@SalimSi> <4642F92E.50001@rabbit.us>
Message-ID: <4642FB2E.2080801@rabbit.us>

Peter Rabbitson wrote:
> ...
> In the case of _local_ traffic - it becomes even trickier. The problem
> is that when sockets are created they already have a source IP (the
> kernel determines that by looking at the default routing table, your
> marks do not exist yet).

This is misleading - it will happen only when the application does not request a specific ip/interface to bind to. Only then is the kernel default table consulted, and the best interface is determined based on the destination that is supplied on socket creation.

From salim.si at cipherium.com.tw Thu May 10 13:25:14 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Thu May 10 13:25:32 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <4642F92E.50001@rabbit.us>
Message-ID: <001a01c792f5$e4bec9a0$5964a8c0@SalimSi>

Let me explain why the marking is done in POSTROUTING.

The first packet of any connection gets routed by the multipath routing entry. This happens AFTER PREROUTING, as you know. And this is what we want, letting the kernel decide based on the weights. (Some people do think that we shouldn't let multipath decide routing, but that's a different story.)

So where can this packet be marked? Obviously in POSTROUTING (so that local packets can also be caught). We mark it and save it (connmark). The mark is determined by the chosen interface (e.g. -o WAN1 --set-mark 1, -o WAN2 --set-mark 2).

In PREROUTING, there is a restore-mark. You see iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark. If this packet belongs to a connection that has already sent a packet, this will restore the mark set in POSTROUTING. Then it will be routed by the corresponding routing table (the wan1 table is looked up for mark 1 and the wan2 table for mark 2).

If it is a new packet, it will be routed by the multipath routing statement, since no mark exists.

From david+lartc at blue-labs.org Thu May 10 14:04:10 2007
From: david+lartc at blue-labs.org (David Ford)
Date: Thu May 10 14:04:44 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <001a01c792f5$e4bec9a0$5964a8c0@SalimSi>
References: <001a01c792f5$e4bec9a0$5964a8c0@SalimSi>
Message-ID: <46430A3A.7040203@blue-labs.org>

Is there a good [single?] document explaining all of this and more? What the kernel does in POST vs PRE with respect to iproute2 and netfilter with CONNMARK and so on?

Thank you,
David

Salim S I wrote:
> Let me explain why the marking is done in POSTROUTING.
> [...]

From rabbit at rabbit.us Thu May 10 14:06:54 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Thu May 10 14:06:59 2007
Subject: [LARTC] Load balancing using connmark
In-Reply-To: <001a01c792f5$e4bec9a0$5964a8c0@SalimSi>
References: <001a01c792f5$e4bec9a0$5964a8c0@SalimSi>
Message-ID: <46430ADE.6090804@rabbit.us>

Salim S I wrote:
> Let me explain why the marking is done in POSTROUTING.
>
> want, letting the kernel decide based on the weights. (some people do
> think that we shouldn't let multipath decide routing, but thatz a
> different story).

I apologize, as I am one of these people, and subsequently assumed the OP wanted this. In this light I agree with Salim. On an unrelated note the OP should be aware that letting multipath do the balancing is impractical (i.e. does not work) in real life scenarios, but this indeed is a topic for a separate thread.

From simo at mix4web.de Thu May 10 17:12:58 2007
From: simo at mix4web.de (Simo)
Date: Thu May 10 17:13:17 2007
Subject: [LARTC] PRIO and TBF is much better than HTB??
Message-ID: <001401c79315$b1732c60$14598520$@de>

Hello mailing list,

I stand before a mystery and cannot explain it :-).
I want to do shaping and prioritization, and I have run the following configurations and simulations. I can't explain why the combination of PRIO and TBF is so much better than HTB (with the prio parameter) alone or in combination with SFQ.

Here are my example configurations: 2 traffic classes, http (80 = 0x50) and ssh (22 = 0x16), and in my example I want to prioritize the http traffic.

HTB: the results of the simulation are here:

HTB cumulative: http://simo.mix4web.de/up/htb_cumul.jpg
HTB delay: http://simo.mix4web.de/up/htb_delay.jpg
HTB with prio parameter cumulative: http://simo.mix4web.de/up/htb_cumul_prio_paramter.jpg
HTB with prio parameter delay: http://simo.mix4web.de/up/htb_delay_prio_parameter.jpg

#define UPLOAD 1000kbps

dev eth0 1000 {
    egress {
        class (<$high>) if tcp_dport == 80;
        class (<$low>) if tcp_dport == 22;
        htb () {
            class (rate UPLOAD, ceil UPLOAD) {
                /* with the prio parameter:
                   $high = class (rate 700kbps, ceil UPLOAD, prio 0); */
                $high = class (rate 700kbps, ceil UPLOAD);
                /* with the prio parameter:
                   $low = class (rate 300kbps, ceil UPLOAD, prio 0); */
                $low = class (rate 300kbps, ceil UPLOAD, prio 1);
            }
        }
    }
}

/* 1Mbit 0.0008 = 100*8/10^6 */
every 0.0008s send TCP_PCK($tcp_dport=22) 0 x 60
/* 800kbit/s */
every 0.001s send TCP_PCK($tcp_dport=80) 0 x 60
time 2s

PRIO and TBF:

PRIO and TBF cumulative: http://simo.mix4web.de/up/prio_tbf_cumul.jpg
PRIO and TBF delay: http://simo.mix4web.de/up/prio_tbf_delay.jpg

#define UPLOAD 1000kbps

dev eth0 1000 {
    egress {
        class (<$high>) if tcp_dport == 80;
        class (<$low>) if tcp_dport == 22;
        prio {
            $high = class {
                tbf (rate 700kbps, burst 1510B, mtu 1510B, limit 3000B);
            }
            $low = class {
                tbf (rate 300kbps, burst 1510B, mtu 1510B, limit 3000B);
            }
        }
    }
}

/* 1Mbit 0.0008 = 100*8/10^6 */
every 0.0008s send TCP_PCK($tcp_dport=22) 0 x 60
/* 800kbit/s */
every 0.001s send TCP_PCK($tcp_dport=80) 0 x 60
time 2s

The delay with the combination of PRIO and TBF is much better than with HTB.
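
For readers who use plain tc rather than tcng, the PRIO and TBF setup above corresponds roughly to the sketch below. It is an approximation under stated assumptions, not the output of tcng: it assumes the tcng rates are kilobits per second (written kbit in tc, where kbps would mean kilobytes), it uses the default three-band prio qdisc and only attaches TBF to the first two bands, and it leaves out the mtu parameter. The handles 1:, 10: and 20: are arbitrary. The script prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: device name, handles and the kbit interpretation are assumptions.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
DEV="${DEV:-eth0}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

prio_tbf() {
    # Strict-priority root qdisc; band 1:1 is always dequeued before 1:2.
    run tc qdisc add dev "$DEV" root handle 1: prio
    # One token bucket per band caps its rate (700 kbit/s high, 300 kbit/s low).
    run tc qdisc add dev "$DEV" parent 1:1 handle 10: tbf rate 700kbit burst 1510 limit 3000
    run tc qdisc add dev "$DEV" parent 1:2 handle 20: tbf rate 300kbit burst 1510 limit 3000
    # Classify TCP by destination port: http to the high band, ssh to the low band.
    run tc filter add dev "$DEV" parent 1: protocol ip prio 1 u32 match ip protocol 6 0xff match ip dport 80 0xffff flowid 1:1
    run tc filter add dev "$DEV" parent 1: protocol ip prio 1 u32 match ip protocol 6 0xff match ip dport 22 0xffff flowid 1:2
}

prio_tbf
```

With limit 3000 each TBF can hold only about two full-size packets before it drops, so short queues (and therefore low delay) are to be expected from this setup.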
(Is it possible that packets are being dropped by the combination of PRIO and TBF, and that is why the latency is so good?)

Do you have an idea?

thanks
simo

--
In a world without walls who needs gates and windows?

From salim.si at cipherium.com.tw Fri May 11 04:56:51 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Fri May 11 04:57:08 2007
Subject: [LARTC] DGD patch not detecting dead gateway
Message-ID: <000701c79378$09be3f10$5964a8c0@SalimSi>

I have a doubt. If you use such a script monitoring the link status with ping and then reconfiguring, why do you need the DGD patch? You need to do some reconfiguration (change multipath to a single default route) anyway if you use the script, right?

Also, the DGD patch uses src to look up the routing table entry, but if you have a dynamic IP for the WAN interface (PPPoE, DHCP etc.), this approach is bound to fail, right?

From salim.si at cipherium.com.tw Fri May 11 08:25:52 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Fri May 11 08:26:15 2007
Subject: [LARTC] PRIO and TBF is much better than HTB??
In-Reply-To: <001401c79315$b1732c60$14598520$@de>
Message-ID: <001601c79395$3a0477d0$5964a8c0@SalimSi>

HTB's priority and PRIO qdisc are very different.

PRIO qdisc will definitely give better latency for your high priority traffic, since the qdisc is designed for the purpose of 'priority'. In theory it will even starve the low priority traffic, if high prio traffic is waiting to go out.

HTB's priority is different, it only gives relative priority. The high prio class in a level is de-queued first during the roundrobin/wrr cycle, but lower priority classes will also be fairly serviced, unlike PRIO qdisc.

From simo at mix4web.de Fri May 11 10:36:31 2007
From: simo at mix4web.de (Simo)
Date: Fri May 11 10:37:00 2007
Subject: [LARTC] PRIO and TBF is much better than HTB??
In-Reply-To: <001601c79395$3a0477d0$5964a8c0@SalimSi>
References: <001401c79315$b1732c60$14598520$@de> <001601c79395$3a0477d0$5964a8c0@SalimSi>
Message-ID: <001501c793a7$7c3d4480$74b7cd80$@de>

Hi,

Thanks for your answer. You are right concerning the PRIO qdisc, but what I did not understand is that the combination (PRIO+TBF) shaped nearly exactly the same as HTB, only with better latency. One sees this by comparing the following two illustrations of my simulation:

HTB with prio parameter cumulative: http://simo.mix4web.de/up/htb_cumul_prio_paramter.jpg
PRIO and TBF cumulative: http://simo.mix4web.de/up/prio_tbf_cumul.jpg

> In theory it will even starve the low priority traffic, if high prio traffic is waiting to go out.

In the first illustration you can see that the low priority traffic has also been served (nearly exactly the same as with HTB), because of the use of PRIO in combination with TBF. But the latency is much better, if you compare the following illustrations:

HTB with prio parameter delay: http://simo.mix4web.de/up/htb_delay_prio_parameter.jpg
PRIO and TBF delay: http://simo.mix4web.de/up/prio_tbf_delay.jpg

I think that the overhead of the HTB algorithm is larger and the scheduler keeps the packets a little longer in the queues.

Simo

From salim.si at cipherium.com.tw Fri May 11 11:18:25 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Fri May 11 11:18:41 2007
Subject: [LARTC] PRIO and TBF is much better than HTB??
In-Reply-To: <001501c793a7$7c3d4480$74b7cd80$@de>
Message-ID: <002f01c793ad$574121a0$5964a8c0@SalimSi>

That is why I said 'in theory'. Using PRIO qdisc, I have never been able to achieve starvation of low priority traffic. I have tested with the same rates for both high and low prio traffic, and did not see high priority traffic really dominating. Maybe a high rate of high prio traffic combined with a low rate of low prio traffic will achieve this, I don't know.

The cumulative effect you see is more likely due to the errant behavior, not the intended behavior of PRIO qdisc. I may be wrong here; I am speaking only from my experience. You have to decide whether or not to depend on this unintentional, but very common, behavior.

Another thing is, PRIO qdisc lists a known bug: a high rate of low priority traffic will starve high priority traffic. So if all goes according to the known documentation, 'some' of your traffic will starve under 'some' condition. :-)

But yes, TBF+PRIO is the preferred solution for latency sensitive applications, like Voice/Video. In such cases, one doesn't care if the non-realtime traffic is starved or not.

The PRIO algorithm is designed to 'empty' the high priority queue first. HTB only ensures that the high priority queue is 'serviced' first. HTB has a fair queuing algorithm. It is not really suited for prioritizing traffic, i.e. to give absolute priority. Still, you may take a look at the wondershaper script, which prioritizes some traffic using HTB.

From jeremy.salmon at openlab.ma Fri May 11 11:58:48 2007
From: jeremy.salmon at openlab.ma (Jeremy SALMON)
Date: Fri May 11 12:14:01 2007
Subject: [LARTC] Debian - Vlan - Route problem?
Message-ID: <1EBE4399-1FFE-476C-BEA5-D6E9B8326DB2@openlab.ma>

Hi,

I'm completely lost with the vlan and route configuration on my debian box.

This is my architecture:

eth1.401           eth1.2338          eth2
Voice Vlan         Public IP          Local Network
10.150.11.90       84.16.x.x          192.168.1.1
255.255.255.240    255.255.255.128    255.255.255.0
     |                  |                  |
                       BOX

In this box I use:
- NAT to allow the eth2 clients to connect to the Internet from 84.16.x.x
- Asterisk. Phones are in the eth2 network, SIP providers are in eth1.401

No default gateway is set on the network cards. A simple script creates the routes, enables NAT and other things...

============= SCRIPT ==================
# Activate IP Forward
echo 1 > /proc/sys/net/ipv4/ip_forward
# Init Iptables
iptables -F
iptables -t nat -F
# NAT
iptables -t nat -A POSTROUTING -o eth0.2338 -s 192.168.1.0/24 -d ! 10.0.0.0/8 -j SNAT --to 84.16.x.x
# Add route for Internet Traffic
route add default gw 84.16.x.x
# Add route for my SIP provider. Route all traffic to 10.0.0.0
route add -net 10.0.0.0 netmask 255.0.0.0 gw 10.150.11.1
============= END OF SCRIPT ============

I have a sip phone 192.168.1.200, gateway 192.168.1.1
I have my notebook 192.168.1.100, gateway 192.168.1.1

When I only ping an external IP (for example 212.217.0.1) from my laptop, everything is ok. eth1.2338 is in use.
When I only make a call through the SIP provider 10.x.x.x, everything is ok. eth1.401 is in use.

So it seems the routes are working....

But, for example, when I make a call and during this call I ping 212.217.0.1, ping loses 95% of the packets. And immediately after hanging up the phone, ping starts to work ok....

In IPTRAF I see all the ICMP packets sent through eth1.2338, and all the udp phone traffic sent through eth1.401.
But it seems ping doesn't receive the response, or the response arrives on eth1.401....

When I ping 212.217.0.1 and make a call during the ping, all the incoming UDP traffic is lost...

Can someone help me with this configuration? I'm completely lost.....

Thanks in advance,
Jeremy

From simo at mix4web.de Fri May 11 13:53:55 2007
From: simo at mix4web.de (Simo)
Date: Fri May 11 13:54:08 2007
Subject: [LARTC] PRIO and TBF is much better than HTB??
In-Reply-To: <002f01c793ad$574121a0$5964a8c0@SalimSi>
References: <001501c793a7$7c3d4480$74b7cd80$@de> <002f01c793ad$574121a0$5964a8c0@SalimSi>
Message-ID: <002701c793c3$0dbeeab0$293cc010$@de>

Hi,

Thanks a lot for your explanations. :)

I've looked for an advantage of HTB over the combination PRIO+TBF, because this combination seemed better to me. But I'd forgotten ;) that with HTB the unused tokens can be distributed fairly among the other classes, so that the unused bandwidth is shared fairly among them, and that is not the case with the combination PRIO+TBF. That's why I would prefer to use HTB.

Simo
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070511/64bf9b46/attachment.htm

From Jon.J.Flechsenhaar at boeing.com Fri May 11 17:50:10 2007
From: Jon.J.Flechsenhaar at boeing.com (Flechsenhaar, Jon J)
Date: Fri May 11 17:50:35 2007
Subject: [LARTC] PRIO and TBF is much better than HTB??
In-Reply-To: <001601c79395$3a0477d0$5964a8c0@SalimSi>
References: <001401c79315$b1732c60$14598520$@de> <001601c79395$3a0477d0$5964a8c0@SalimSi>
Message-ID: <0E24ED2A7F9AA349A8633E6A56A64BE0027A82B4@XCH-SW-2V1.sw.nos.boeing.com>

Just to comment. Yes, you will get better latency with prio and tbf. However, they were created for different end goals.

HTB has the ability to create a class structure that can break your link bandwidth up into different classes.
The prio setting in HTB determines which class gets served first if there is additional bandwidth. However, all classes will get their guaranteed rates. This fits well into DiffServ.

Prio is just priority. A higher prio class will starve out a lower prio class. There are no guaranteed rates or class structure, only qdiscs.

TBF is purely a rate limiter. Use it to slow down an interface. Again, no class structure.

http://opalsoft.net/qos/DS.htm

The above link is a must if you're working with QoS on Linux.

Jon Flechsenhaar
Boeing WNW Team
Network Services
(714)-762-1231
202-E7

________________________________

From: Salim S I [mailto:salim.si@cipherium.com.tw]
Sent: Thursday, May 10, 2007 11:26 PM
To: lartc@mailman.ds9a.nl
Subject: RE: [LARTC] PRIO and TBF is much better than HTB??

HTB's priority and PRIO qdisc are very different.

PRIO qdisc will definitely give better latency for your high priority traffic, since the qdisc is designed for the purpose of 'priority'. In theory it will even starve the low priority traffic, if high prio traffic is waiting to go out.

HTB's priority is different, it only gives relative priority. A high prio class in a level is de-queued first during the roundrobin/wrr cycle, but lower priority classes will also be fairly serviced, unlike with the PRIO qdisc.

-----Original Message-----
From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Simo
Sent: Thursday, May 10, 2007 11:13 PM
To: lartc@mailman.ds9a.nl
Subject: [LARTC] PRIO and TBF is much better than HTB??

Hello mailing list,

I stand before a mystery and cannot explain it :-). I want to do shaping and prioritization, and I have done the following configurations and simulations. I can't explain why the combination of PRIO and TBF is much better than HTB (with the prio parameter) alone or in combination with SFQ.
Here are my example configurations: 2 traffic classes, http (80 = 0x50) and ssh (22 = 0x16), and in my example I want to prioritize the http traffic.

HTB: the results of the simulation are here:

HTB cumulative: http://simo.mix4web.de/up/htb_cumul.jpg
HTB delay: http://simo.mix4web.de/up/htb_delay.jpg
HTB with prio parameter cumulative: http://simo.mix4web.de/up/htb_cumul_prio_paramter.jpg
HTB with prio parameter delay: http://simo.mix4web.de/up/htb_delay_prio_parameter.jpg

#define UPLOAD 1000kbps

dev eth0 1000 {
    egress {
        class ( <$high> ) if tcp_dport == 80;
        class(<$low>) if tcp_dport == 22;

        htb () {
            class ( rate UPLOAD, ceil UPLOAD) {
                /* with the prio parameter : $high = class ( rate 700kbps, ceil UPLOAD, prio 0); */
                $high = class ( rate 700kbps, ceil UPLOAD);
                /* with the prio parameter : $low = class ( rate 300kbps, ceil UPLOAD, prio 0); */
                $low = class ( rate 300kbps, ceil UPLOAD, prio 1);
            }
        }
    }
}

/* 1Mbit 0.0008 = 100*8/10^6 */
every 0.0008s send TCP_PCK($tcp_dport=22) 0 x 60

/* 800kbit/s */
every 0.001s send TCP_PCK($tcp_dport=80) 0 x 60

time 2s

PRIO and TBF:

PRIO and TBF cumulative: http://simo.mix4web.de/up/prio_tbf_cumul.jpg
PRIO and TBF delay: http://simo.mix4web.de/up/prio_tbf_delay.jpg

#define UPLOAD 1000kbps

dev eth0 1000 {
    egress {
        class ( <$high> ) if tcp_dport == 80;
        class(<$low>) if tcp_dport == 22;

        prio{
            $high = class{ tbf (rate 700kbps, burst 1510B, mtu 1510B, limit 3000B); }
            $low = class{ tbf (rate 300kbps, burst 1510B, mtu 1510B, limit 3000B); }
        }
    }
}

/* 1Mbit 0.0008 = 100*8/10^6 */
every 0.0008s send TCP_PCK($tcp_dport=22) 0 x 60

/* 800kbit/s */
every 0.001s send TCP_PCK($tcp_dport=80) 0 x 60

time 2s

The delay with the combination of PRIO and TBF is much better than with HTB. (Is it possible that packets are being dropped by the combination of PRIO and TBF, and that's why the latency is so good???)

Do you have any idea???
thanks simo --------------------------------------------------------------------------------------------------------------------------------------------- In a world without walls who needs gates and windows? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070511/592dfd36/attachment-0001.html From fernandes_pablo at yahoo.com.br Fri May 11 17:02:05 2007 From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo) Date: Fri May 11 21:03:53 2007 Subject: [LARTC] HTB and bursts Message-ID: <20070511190348.41B6D450A@outpost.ds9a.nl> Hey, i saw a related question made by another user in this list, but i still do not understanding how to do it or each values put. I have HTB "rules" in a ISP and i control for each customer this way: Flush and 1:0 class tc qdisc del dev eth0 root tc qdisc add dev eth0 root handle 1:0 htb tc class add dev eth0 parent 1:0 classid 1:1 htb rate 100mbit tc qdisc del dev eth1 root tc qdisc add dev eth1 root handle 1:0 htb tc class add dev eth1 parent 1:0 classid 1:1 htb rate 100mbit Upload and Download: user1 tc class add dev eth0 parent 1:1 classid 1:5 htb rate 150kbit ceil 150kbit tc qdisc add dev eth0 parent 1:5 handle 5: sfq perturb 10 tc class add dev eth1 parent 1:1 classid 1:5 htb rate 50kbit ceil 50kbit tc qdisc add dev eth1 parent 1:5 handle 5: sfq perturb 10 iptables -t mangle -A POSTROUTING --dest x.x.x.x -o eth0 -j CLASSIFY --set-class 1:5 iptables -t mangle -A FORWARD --src x.x.x.x -o eth1 -j CLASSIFY --set-class 1:5 Upload and Download: user2 tc class add dev eth0 parent 1:1 classid 1:8 htb rate 150kbit ceil 150kbit tc qdisc add dev eth0 parent 1:8 handle 8: sfq perturb 10 tc class add dev eth1 parent 1:1 classid 1:8 htb rate 50kbit ceil 50kbit tc qdisc add dev eth1 parent 1:8 handle 8: sfq perturb 10 iptables -t mangle -A POSTROUTING --dest y.y.y.y -o eth0 -j CLASSIFY --set-class 1:8 iptables -t mangle -A FORWARD --src y.y.y.y -o eth1 -j CLASSIFY 
--set-class 1:8

(...)

I would like each customer to get a stable 150kbit during a download, but at the beginning of the connection I would like them to have a 200kbit burst. This would help with browsing between web sites: the images and text would download during the burst, and then a long transfer (a big download, for example) would settle at just 150kbit.

Is it possible? We have more products (100kbit, 150kbit, 200kbit, 300kbit, 450kbit, 600kbit, 1mbit, 1.5mbit, 2mbit).

Thanks in advance for any help.

Pablo Fernandes
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070511/1b18f90d/attachment.htm

From lists at andyfurniss.entadsl.com Fri May 11 11:40:28 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Sat May 12 01:03:05 2007
Subject: [LARTC] PRIO and TBF is much better than HTB??
In-Reply-To: <001401c79315$b1732c60$14598520$@de>
References: <001401c79315$b1732c60$14598520$@de>
Message-ID: <46443A0C.8050201@andyfurniss.entadsl.com>

Simo wrote:

> #define UPLOAD 1000kbps

I've never used tcng/tcsim, if that's what this is, but kbps means k bytes to "normal" tc.

> $low = class{ tbf (rate 300kbps, burst 1510B, mtu 1510B, limit
> 3000B); }

limit 3000B - not even enough for two packets (1500 mtu = 1514 to tc on eth); it would hurt performance on a real wan.

> every 0.0008s send TCP_PCK($tcp_dport=22) 0 x 60
>
> /* 800kbit/s */

Testing with a stream is not very representative of real tcp.

> the delay by the combination of PRIO and TBF is much better than by the HTB.
> (is it possible that pakets maybe dropped by the combination of PRIO and
> TBF, that's why the latency is so good???)

Yes. Unless you add leafs with a limit, htb uses the qlen of the nic, default 1000p.

Andy.
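The two numeric points in Andy's reply can be checked with a little shell arithmetic. This is only a rough sketch: the helper names `kbps_to_kbit` and `frames_in_limit` are made up for illustration, and it covers plain tc's reading of the suffixes only. How the simulator's own language reads `kbps` is not settled in this thread.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# kbps_to_kbit <n>: plain tc reads "kbps" as kilobytes per second,
# so the same figure expressed in kbit is eight times larger.
kbps_to_kbit() {
    echo $(( $1 * 8 ))
}

# frames_in_limit <limit_bytes> <frame_bytes>: whole frames that fit
# into a byte-based queue limit such as TBF's "limit".
frames_in_limit() {
    echo $(( $1 / $2 ))
}

echo "rate 1000kbps in plain tc = $(kbps_to_kbit 1000)kbit"
echo "limit 3000B holds $(frames_in_limit 3000 1514) full 1514-byte frame(s)"
```

With a queue that holds a single full-size frame, a second packet arriving before the first has left has nowhere to wait, which fits Andy's guess that drops explain the low delay.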
From lists at andyfurniss.entadsl.com Fri May 11 11:52:48 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Sat May 12 01:07:32 2007
Subject: [LARTC] limit bandwidth per host question
In-Reply-To: <46423F8C.1090905@andyfurniss.entadsl.com>
References: <463FD790.8010709@studentergaarden.dk> <46423F8C.1090905@andyfurniss.entadsl.com>
Message-ID: <46443CF0.9040105@andyfurniss.entadsl.com>

Andy Furniss wrote:

>> tc class add dev eth2 parent 1:0 classid 1:2 htb rate 255kbit burst
>> 255kbit

> Burst is a good idea

Actually you need to specify both burst and cburst for it to work, and I suppose the law doesn't stop you being more generous than 255kbit - I just tried 100k (= 100k byte) and browsing isn't too bad.

Browsing and downloading together with just fifo is horrible though.

I tried htb with the prio qdisc and it was disappointing WRT latency. HTB class prio was far better. In both cases I also had sfq on the leaf of the tcp class, which makes browsing while downloading nicer and, for tcp games, didn't hurt latency too much.

I was only testing with one user, though I scripted two. I'll get round to playing with curl loader one day.

There's bound to be a mistake somewhere, but I paste below what I did. class/flowids are hex and you have 0-ffff after the : (minor) to play with - you'll need a more sensible numbering system than I chose.

Policing was also not too bad.

Andy.
cat htb-255-eth0-prio-htb set -x IP=/sbin/ip TC=/sbin/tc $TC qdisc del dev eth0 root &>/dev/null if [ "$1" = "stop" ] then echo "stopping" exit fi $TC qdisc add dev eth0 root handle 1: htb $TC class add dev eth0 parent 1: classid 1:1 htb rate 255kbit burst 100k cburst 100k $TC class add dev eth0 parent 1:1 classid 1:11 htb prio 0 rate 200kbit ceil 255kbit burst 10k cburst 10k $TC qdisc add dev eth0 parent 1:11 bfifo limit 50k $TC class add dev eth0 parent 1:1 classid 1:12 htb prio 1 rate 55kbit ceil 255kbit burst 90k cburst 90k $TC qdisc add dev eth0 parent 1:12 sfq limit 30 $TC filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dst 192.168.0.3 flowid 1:1 $TC filter add dev eth0 parent 1:1 protocol ip prio 1 u32 match ip protocol 6 0xff flowid 1:12 $TC filter add dev eth0 parent 1:1 protocol ip prio 2 u32 match u32 0 0 flowid 1:11 $TC class add dev eth0 parent 1: classid 1:2 htb rate 255kbit burst 100k cburst 100k $TC class add dev eth0 parent 1:2 classid 1:21 htb prio 0 rate 200kbit ceil 255kbit burst 10k cburst 10k $TC qdisc add dev eth0 parent 1:21 bfifo limit 50k $TC class add dev eth0 parent 1:2 classid 1:22 htb prio 1 rate 55kbit ceil 255kbit burst 90k cburst 90k $TC qdisc add dev eth0 parent 1:22 sfq limit 30 $TC filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dst 192.168.0.99 flowid 1:2 $TC filter add dev eth0 parent 1:2 protocol ip prio 1 u32 match ip protocol 6 0xff flowid 1:22 $TC filter add dev eth0 parent 1:2 protocol ip prio 2 u32 match u32 0 0 flowid 1:21 cat htb-255-eth0-prio set -x IP=/sbin/ip TC=/sbin/tc $TC qdisc del dev eth0 root &>/dev/null if [ "$1" = "stop" ] then echo "stopping" exit fi $TC qdisc add dev eth0 root handle 1: htb $TC class add dev eth0 parent 1: classid 1:1 htb rate 255kbit burst 100k cburst 100k $TC qdisc add dev eth0 parent 1:1 handle 2: prio bands 2 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 $TC qdisc add dev eth0 parent 2:1 bfifo limit 50k $TC qdisc add dev eth0 parent 2:2 sfq limit 30 $TC filter add 
dev eth0 parent 1: protocol ip prio 1 u32 match ip dst 192.168.0.3 flowid 1:1 $TC filter add dev eth0 parent 2: protocol ip prio 1 u32 match ip protocol 6 0xff flowid 2:2 $TC filter add dev eth0 parent 2: protocol ip prio 2 u32 match u32 0 0 flowid 2:1 $TC class add dev eth0 parent 1: classid 1:2 htb rate 255kbit burst 100k cburst 100k $TC qdisc add dev eth0 parent 1:2 handle 3: prio bands 2 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 $TC qdisc add dev eth0 parent 3:1 bfifo limit 50k $TC qdisc add dev eth0 parent 3:2 sfq limit 30 $TC filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dst 192.168.0.99 flowid 1:2 $TC filter add dev eth0 parent 3: protocol ip prio 1 u32 match ip protocol 6 0xff flowid 3:2 $TC filter add dev eth0 parent 3: protocol ip prio 2 u32 match u32 0 0 flowid 3:1
From lists at andyfurniss.entadsl.com Sat May 12 02:41:55 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Sat May 12 02:41:45 2007 Subject: [LARTC] HTB and bursts In-Reply-To: <20070511190348.41B6D450A@outpost.ds9a.nl> References: <20070511190348.41B6D450A@outpost.ds9a.nl> Message-ID: <46450D53.5080702@andyfurniss.entadsl.com> Pablo Fernandes Yahoo wrote: > I would like to have the customer using 150kbit stable in a download. But at > the begining of the conection, i would like to have a 200kbit burst. Depends what you mean - burst is an amount of data not a bitrate. If you want them (using your setup) to have 25k of data unlimited rate then burst 25k cburst 25k should do it. I think that if your class has a different ceil to rate then giving a burst but not cburst will give them burst bytes capped at ceil rate. I haven't tested the exact behavior or read all recent posts yet. Andy.
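Since burst is a byte count, a wish like "200kbit at the start, then 150kbit" has to be turned into bytes first. The sketch below does that with plain token-bucket arithmetic and prints a candidate command rather than running it. It is untested against a live HTB setup: the helper `burst_bytes` and the 4-second fast start are hypothetical, kbit is taken as 1000 bit/s, and it relies on Andy's own untested guess that a burst without cburst is capped at ceil.

```shell
#!/bin/sh
# Sketch only, untested against a live HTB setup. Assumes plain
# token-bucket arithmetic: a bucket of B bytes refilled at RATE and
# drained at CEIL empties after B*8 / (CEIL - RATE) seconds.

# burst_bytes <rate_kbit> <ceil_kbit> <seconds>: bytes of burst needed
# to run at ceil for the given time before falling back to rate.
# Takes kbit as 1000 bit/s (an assumption).
burst_bytes() {
    echo $(( ($2 - $1) * 1000 * $3 / 8 ))
}

RATE=150   # kbit, sustained rate of class 1:5 in the original setup
CEIL=200   # kbit, the wanted initial speed
SECS=4     # hypothetical length of the fast start

B=$(burst_bytes "$RATE" "$CEIL" "$SECS")

# Print the command instead of running it, so this is safe without root.
echo "tc class change dev eth0 parent 1:1 classid 1:5 htb rate ${RATE}kbit ceil ${CEIL}kbit burst ${B}"
```

With these numbers the result is 25000 bytes, close to the 25k Andy mentions. A longer fast start or a bigger gap between rate and ceil scales the burst linearly.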
From bugfood-ml at fatooh.org Sat May 12 23:51:33 2007 From: bugfood-ml at fatooh.org (Corey Hickey) Date: Sat May 12 23:51:51 2007 Subject: [LARTC] Massive filtering In-Reply-To: <200705050130.AA2025718096@ipro.net> References: <200705050130.AA2025718096@ipro.net> Message-ID: <464636E5.5080709@fatooh.org> ericr wrote: > I am trying to build a trafic control rule set for a huge NATed > network, and I have it working for single known addresses but I need > to scale it to 16M potential client addresses. I'm using iptables > for NAT. Incoming traffic is simple because I can match destination > address, outgoing traffic I use iptables IPMARK then tc match mark > and it works perfectly if I build rules for each client individually. > I am worried about performance as the client list increases. > > I need to place client IPs into classes like routers, freeloaders, > lite-access, premium-access, etc. I have no problem with rewriting > rules on the fly. It is easy to pop in a rule change any time a user > authenticates or is disconnected for inactivity. I don't know what exactly it is you're doing, but here's a thought. Do you control the allocation of addresses via DHCP? If so, it might be faster/easier to simply set up IP ranges for your separate classes of user. 10.1.0.0/16 routers 10.2.0.0/16 freeloaders 10.3.0.0/16 ...etc... Then you can use single matches in iptables/tc to identify packets to/from each class. -Corey From sachinutd at gmail.com Sun May 13 14:08:42 2007 From: sachinutd at gmail.com (Sachin K) Date: Sun May 13 14:08:58 2007 Subject: [LARTC] IP address change management Message-ID: Hi How do inetd services handle IP address change? What happens exactly after I change the IP address of an interface? 
Thanks, Sachin From mkathuria at tuxtechnologies.co.in Mon May 14 07:35:52 2007 From: mkathuria at tuxtechnologies.co.in (Manish Kathuria) Date: Mon May 14 07:36:00 2007 Subject: [LARTC] DGD patch not detecting dead gateway In-Reply-To: <000701c79378$09be3f10$5964a8c0@SalimSi> References: <000701c79378$09be3f10$5964a8c0@SalimSi> Message-ID: <1df4abe60705132235n358df5abt40908d972656e63e@mail.gmail.com> On 5/11/07, Salim S I wrote: > > I have a doubt. If you use such a script monitoring the link status with > ping and then reconfiguring, why do you need the DGD patch? You need to do > some reconfiguration (change multipath to a single default route) anyway if > you use the script, right? The patches take care of many other issues also. Please refer to the archives here: http://mailman.ds9a.nl/pipermail/lartc/2007q1/020403.html -- Manish Kathuria Tux Technologies http://www.tuxtechnologies.co.in/ From rabbit at rabbit.us Mon May 14 07:57:04 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Mon May 14 07:57:13 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter Message-ID: <4647FA30.5040401@rabbit.us> Hi, I have searched the archives on the topic, and it seems that the list gurus favor load balancing to be done in the kernel as opposed to other means. I have been using a home-grown approach, which splits traffic based on `-m statistic --mode random --probability X`, then CONNMARKs the individual connections and the kernel happily routes them. I understand that for > 2 links it will become impractical to calculate a correct X. But if we only have 2 gateways to the internet - are there any advantages in letting the kernel multipath scheduler do the balancing (with all the downsides of route caching), as opposed to the pure random approach described above? 
Thanks Peter From salim.si at cipherium.com.tw Mon May 14 07:59:23 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Mon May 14 07:59:39 2007 Subject: [LARTC] DGD patch not detecting dead gateway In-Reply-To: <1df4abe60705132235n358df5abt40908d972656e63e@mail.gmail.com> Message-ID: <000d01c795ed$08c3dcb0$c46bfea9@SalimSi> I had followed that discussion, but my doubt remains. For MASQUERADE too, it depends on the 'src' parameter, which was configured statically. But after the interface comes up with a new address, the initial configuration will be invalid because 'src' is not correct anymore, it seems...Or have I have misunderstood the concept? -----Original Message----- From: Manish Kathuria [mailto:mkathuria@tuxtechnologies.co.in] Sent: Monday, May 14, 2007 1:36 PM To: Salim S I Cc: lartc@mailman.ds9a.nl; tomlobato@gmail.com Subject: Re: [LARTC] DGD patch not detecting dead gateway On 5/11/07, Salim S I wrote: > > I have a doubt. If you use such a script monitoring the link status with > ping and then reconfiguring, why do you need the DGD patch? You need to do > some reconfiguration (change multipath to a single default route) anyway if > you use the script, right? The patches take care of many other issues also. Please refer to the archives here: http://mailman.ds9a.nl/pipermail/lartc/2007q1/020403.html -- Manish Kathuria Tux Technologies http://www.tuxtechnologies.co.in/ From salim.si at cipherium.com.tw Mon May 14 08:07:07 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Mon May 14 08:07:30 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter In-Reply-To: <4647FA30.5040401@rabbit.us> Message-ID: <000e01c795ee$1d6cc090$c46bfea9@SalimSi> I have thought about this approach, but, I think, this approach does not handle failover/dead-gateway-detection well. Because you need to alter all your netfilter routing rules if you find a link down. And then reconfigure again when the link comes up. I am interested to know how you handle that. 
-----Original Message----- From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Peter Rabbitson Sent: Monday, May 14, 2007 1:57 PM To: lartc@mailman.ds9a.nl Subject: [LARTC] Multihome load balancing - kernel vs netfilter Hi, I have searched the archives on the topic, and it seems that the list gurus favor load balancing to be done in the kernel as opposed to other means. I have been using a home-grown approach, which splits traffic based on `-m statistic --mode random --probability X`, then CONNMARKs the individual connections and the kernel happily routes them. I understand that for > 2 links it will become impractical to calculate a correct X. But if we only have 2 gateways to the internet - are there any advantages in letting the kernel multipath scheduler do the balancing (with all the downsides of route caching), as opposed to the pure random approach described above? Thanks Peter _______________________________________________ LARTC mailing list LARTC@mailman.ds9a.nl http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From rabbit at rabbit.us Mon May 14 09:15:56 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Mon May 14 09:16:06 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter In-Reply-To: <000e01c795ee$1d6cc090$c46bfea9@SalimSi> References: <000e01c795ee$1d6cc090$c46bfea9@SalimSi> Message-ID: <46480CAC.6050002@rabbit.us> Salim S I wrote: >> -----Original Message----- >> From: lartc-bounces@mailman.ds9a.nl >> [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Peter Rabbitson >> Sent: Monday, May 14, 2007 1:57 PM >> To: lartc@mailman.ds9a.nl >> Subject: [LARTC] Multihome load balancing - kernel vs netfilter >> >> Hi, >> I have searched the archives on the topic, and it seems that the list >> gurus favor load balancing to be done in the kernel as opposed to other >> means. 
I have been using a home-grown approach, which splits traffic >> based on `-m statistic --mode random --probability X`, then CONNMARKs >> the individual connections and the kernel happily routes them. I >> understand that for > 2 links it will become impractical to calculate a >> correct X. But if we only have 2 gateways to the internet - are there >> any advantages in letting the kernel multipath scheduler do the >> balancing (with all the downsides of route caching), as opposed to the >> pure random approach described above? > > I have thought about this approach, but, I think, this approach does not > handle failover/dead-gateway-detection well. Because you need to alter > all your netfilter routing rules if you find a link down. And then > reconfigure again when the link comes up. I am interested to know how > you handle that. > Certainly. What I am doing is NATing a large company network, which gets load balanced and receives fail over protection. I also have a number of services running on the router which must not be balanced nor failed over, as they are expected to respond on a specific IP only. All remaining traffic on the server itself is not balanced but fails over when the designated primary link goes down. I start with a simple pinger app, that pings several well known remote sites once a minute using a large icmp packet (1k of payload). The rtt times are averaged out and are used to calculate the current "quality" of the link (the large packet makes congestion a visible factor). If one of the interface responses is 0 (meaning not a single one of the pinged hosts has responded) - the link is dead. In iproute I have two separate tables, each using one of the links as default gw, matching a certain mark. 
The default route is set to a single gateway (not a multipath), either by hardcoding, or by using the first input of the pinger (it can run without a default gw set, explanation follows).

In iptables I have two user defined chains:

iptables -t mangle -A ISP1 -j CONNMARK --set-mark 11
iptables -t mangle -A ISP1 -j MARK --set-mark 11
iptables -t mangle -A ISP1 -j ACCEPT

iptables -t mangle -A ISP2 -j CONNMARK --set-mark 12
iptables -t mangle -A ISP2 -j MARK --set-mark 12
iptables -t mangle -A ISP2 -j ACCEPT

The rules that reference those chains are:

For all locally originating traffic:
iptables -t mangle -A OUTPUT -o $I1 -j ISP1
iptables -t mangle -A OUTPUT -o $I2 -j ISP2

For all incoming traffic from the internet:
iptables -t mangle -A PREROUTING -i $I1 -m state --state NEW -j ISP1
iptables -t mangle -A PREROUTING -i $I2 -m state --state NEW -j ISP2

For all other traffic (nat):
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode random --probability $X -j ISP1
iptables -t mangle -A PREROUTING -j ISP2

At the end of the PREROUTING chain I have:
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark

The NATing is trivially solved by:

iptables -t nat -A POSTROUTING -s 10.0.58.0/24 -j SOURCE_NAT
iptables -t nat -A POSTROUTING -s 192.168.58.0/24 -j SOURCE_NAT
iptables -t nat -A POSTROUTING -s 192.168.8.0/24 -j SOURCE_NAT

iptables -t nat -A SOURCE_NAT -o $I1 -j SNAT --to $I1_IP
iptables -t nat -A SOURCE_NAT -o $I2 -j SNAT --to $I2_IP

What does this achieve:

* Local applications that have explicitly requested a specific IP to bind to will be routed over the corresponding interface and will stay that way. Only applications binding to 0.0.0.0 will be routed by consulting the default route.

* Responses to connections from the internet are guaranteed to leave from the same interface they came in.
* All new connections not coming from the external interfaces are load balanced by the weight of $X, and are again guaranteed to stay there for the life of the connection, but another connection to the same host is not guaranteed to go over the same link. This is important in a company environment, since most employees use the same online resources.

On every run of the pinger I do the following:

* If both gateways are alive I replace the -m statistic rule, adjusting the value of $X.

* If one is detected dead, I adjust the probability accordingly (or alternatively remove the statistic match altogether), and change the default gateway if it is the one that failed.

So really the whole exercise revolves around changing a single rule (or two rules, if you want to control the probability in a more fine-grained way).

Last but not least, this setup allowed me to program exception tables for certain IP blocks. For instance, Yahoo has a braindead two tier authentication system for commercial solutions. It remembers the IP which you first used to log in, and it must match the IP used to log in to a more secure area (using another password). Or users from within the lan might want to use one of the ISP's SMTP servers, which keeps a close eye on who is talking to it. So I have a $PREFERRED which is adjusted to either ISP1 or ISP2, depending on the current state of affairs, and rules like:

iptables -t mangle -A PREROUTING -d 66.218.64.0/19 -m state --state NEW -j $PREFERRED
iptables -t mangle -A PREROUTING -d 68.142.192.0/18 -m state --state NEW -j $PREFERRED

This pretty much sums it up. The only downside I can think of is that loss of service can be observed between two runs of the pinger. Let me know if I missed something, be it critical or minor.
Thanks Peter From salim.si at cipherium.com.tw Mon May 14 10:23:17 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Mon May 14 10:23:44 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter In-Reply-To: <46480CAC.6050002@rabbit.us> Message-ID: <000f01c79601$25193090$c46bfea9@SalimSi> iptables -t mangle -A PREROUTING -j ISP2 Doesn't it need to check for state NEW? Or packets will not reach the restore-mark rule. You may have to manually populate the routing tables when an interface comes up, after being down for some time. (Kernel would have removed the routing entries for this interface after it found the interface down. This happens only if its nexthop is down) I tend to favor this approach, because it is more flexible in selecting the interface. You can use different weights/probability depending on different factors. I have seen a variation of this method, used with 'recent' (-m recent) match, instead of CONNMARK. The only downside in using this method, as far as I can see, is the need to reconfigure rules and routing tables, in case of a failure/coming-up. But lately, I have found that even with multipath method, there IS a need for reconfiguration. -----Original Message----- From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Peter Rabbitson Sent: Monday, May 14, 2007 3:16 PM To: lartc@mailman.ds9a.nl Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter Salim S I wrote: >> -----Original Message----- >> From: lartc-bounces@mailman.ds9a.nl >> [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Peter Rabbitson >> Sent: Monday, May 14, 2007 1:57 PM >> To: lartc@mailman.ds9a.nl >> Subject: [LARTC] Multihome load balancing - kernel vs netfilter >> >> Hi, >> I have searched the archives on the topic, and it seems that the list >> gurus favor load balancing to be done in the kernel as opposed to other >> means. 
>> I have been using a home-grown approach, which splits traffic
>> based on `-m statistic --mode random --probability X`, then CONNMARKs
>> the individual connections and the kernel happily routes them. I
>> understand that for > 2 links it will become impractical to calculate a
>> correct X. But if we only have 2 gateways to the internet - are there
>> any advantages in letting the kernel multipath scheduler do the
>> balancing (with all the downsides of route caching), as opposed to the
>> pure random approach described above?
>
> I have thought about this approach, but, I think, this approach does not
> handle failover/dead-gateway-detection well. Because you need to alter
> all your netfilter routing rules if you find a link down. And then
> reconfigure again when the link comes up. I am interested to know how
> you handle that.
>

Certainly. What I am doing is NATing a large company network, which gets
load balanced and receives fail over protection. I also have a number of
services running on the router which must not be balanced nor failed
over, as they are expected to respond on a specific IP only. All
remaining traffic on the server itself is not balanced but fails over
when the designated primary link goes down.

I start with a simple pinger app, that pings several well known remote
sites once a minute using a large icmp packet (1k of payload). The rtt
times are averaged out and are used to calculate the current "quality"
of the link (the large packet makes congestion a visible factor). If one
of the interface responses is 0 (meaning not a single one of the pinged
hosts has responded) - the link is dead.

In iproute I have two separate tables, each using one of the links as
default gw, matching a certain mark.
The default route is set to a single gateway (not a multipath), either by
hardcoding, or by using the first input of the pinger (it can run without
a default gw set, explanation follows).

In iptables I have two user defined chains:
iptables -t mangle -A ISP1 -j CONNMARK --set-mark 11
iptables -t mangle -A ISP1 -j MARK --set-mark 11
iptables -t mangle -A ISP1 -j ACCEPT

iptables -t mangle -A ISP2 -j CONNMARK --set-mark 12
iptables -t mangle -A ISP2 -j MARK --set-mark 12
iptables -t mangle -A ISP2 -j ACCEPT

The rules that reference those chains are:

For all locally originating traffic:
iptables -t mangle -A OUTPUT -o $I1 -j ISP1
iptables -t mangle -A OUTPUT -o $I2 -j ISP2

For all incoming traffic from the internet:
iptables -t mangle -A PREROUTING -i $I1 -m state --state NEW -j ISP1
iptables -t mangle -A PREROUTING -i $I2 -m state --state NEW -j ISP2

For all other traffic (nat):
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode random --probability $X -j ISP1
iptables -t mangle -A PREROUTING -j ISP2

At the end of the PREROUTING chain I have
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark

The NATing is trivially solved by:
iptables -t nat -A POSTROUTING -s 10.0.58.0/24 -j SOURCE_NAT
iptables -t nat -A POSTROUTING -s 192.168.58.0/24 -j SOURCE_NAT
iptables -t nat -A POSTROUTING -s 192.168.8.0/24 -j SOURCE_NAT

iptables -t nat -A SOURCE_NAT -o $I1 -j SNAT --to $I1_IP
iptables -t nat -A SOURCE_NAT -o $I2 -j SNAT --to $I2_IP

What does this achieve:
* Local applications that have explicitly requested a specific IP to
  bind to, will be routed over the corresponding interface and will stay
  that way. Only applications binding to 0.0.0.0 will be routed by
  consulting the default route.
* Responses to connections from the internet are guaranteed to leave
  from the same interface they came in.
* All new connections not coming from the external interfaces are load
  balanced by the weight of $X, and are again guaranteed to stay there for
  the life of the connection, but another connection to the same host is
  not guaranteed to go over the same link. This is important in a company
  environment, since most employees use the same online resources.

On every run of the pinger I do the following:
* If both gateways are alive, I replace the -m statistic rule, adjusting
  the value of $X.
* If one is detected dead, I adjust the probability accordingly (or
  alternatively remove the statistic match altogether), and change the
  default gateway if it is the one that failed.

So really the whole exercise revolves around changing a single rule (or
two rules, if you want to control the probability in a more fine-grained
way).

Last but not least, this setup allowed me to program exception tables for
certain IP blocks. For instance, Yahoo has a braindead two-tier
authentication system for commercial solutions. It remembers the IP
which you used to log in with first, and it must match the IP used to
log in to a more secure area (using another password). Or users from
within the LAN might want to use one of the ISPs' SMTP servers, which
keeps a close eye on who is talking to it. So I have a $PREFERRED which
is adjusted to either ISP1 or ISP2, depending on the current state of
affairs, and rules like:
iptables -t mangle -A PREROUTING -d 66.218.64.0/19 -m state --state NEW -j $PREFERRED
iptables -t mangle -A PREROUTING -d 68.142.192.0/18 -m state --state NEW -j $PREFERRED

This pretty much sums it up. The only downside I can think of is that
loss of service can be observed between two runs of the pinger. Let me
know if I missed something, be it critical or minor.
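
The "two separate tables, each using one of the links as default gw,
matching a certain mark" from the description are not spelled out in the
message. A minimal sketch of what they might look like follows, assuming
marks 11 and 12 as used by the ISP1/ISP2 chains; the table ids, gateway
addresses, interface names, and the helper name are placeholders and not
taken from the original setup. The script only prints the commands.

```shell
#!/bin/sh
# Print (not run) the policy-routing commands for one uplink.
# Arguments: fwmark, routing table id, gateway address, interface.
# Table ids, gateways and interface names in the calls below are placeholders.
policy_route_cmds() {
    mark=$1
    table=$2
    gw=$3
    dev=$4
    # Per-link table whose only default route is this link's gateway.
    echo "ip route replace default via $gw dev $dev table $table"
    # Packets carrying the mark are looked up in that table.
    echo "ip rule add fwmark $mark table $table"
}

policy_route_cmds 11 101 192.0.2.1 eth1
policy_route_cmds 12 102 198.51.100.1 eth2
```

Unmarked traffic falls through to the main table and therefore to the
single default gateway the pinger maintains.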
Thanks

Peter

_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

From rabbit at rabbit.us  Mon May 14 13:24:31 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Mon May 14 13:24:37 2007
Subject: [LARTC] Multihome load balancing - kernel vs netfilter
In-Reply-To: <000f01c79601$25193090$c46bfea9@SalimSi>
References: <000f01c79601$25193090$c46bfea9@SalimSi>
Message-ID: <464846EF.3080109@rabbit.us>

Answer inlined:

Salim S I wrote:
> iptables -t mangle -A PREROUTING -j ISP2
>
> Doesn't it need to check for state NEW? Or packets will not reach the
> restore-mark rule.

Of course, and the real script does check. I typed this line manually
because the copy cut it, and missed the obvious check.

> You may have to manually populate the routing tables when an interface
> comes up, after being down for some time. (Kernel would have removed the
> routing entries for this interface after it found the interface down.
> This happens only if its nexthop is down)

This is what I can't really understand (and it applies to DGD as well) -
how often in real life does someone yank a cable out, so an interface
will go down? In over 7 years of dealing with various ISPs I have never
seen the link go so dead, that the kernel will down the interface and
remove all associated routing information. What I have seen on the other
hand is the link dying at the 2nd or 3rd hop, which (if I understand
correctly) DGD simply can not detect. Correct me if my assumption is
wrong.

> I tend to favor this approach, because it is more flexible in selecting
> the interface. You can use different weights/probability depending on
> different factors. I have seen a variation of this method, used with
> 'recent' (-m recent) match, instead of CONNMARK.

I see.
But recent would have a "caching effect", and from what I
understand is heavier on the kernel, unlike the CONNMARK which hooks
into the conntrack which in turn has to track connections either way.

> The only downside in using this method, as far as I can see, is the need
> to reconfigure rules and routing tables, in case of a failure/coming-up.
> But lately, I have found that even with multipath method, there IS a
> need for reconfiguration.

Got you. This pretty much answers my original question. Thank you for
your time.

From kuolung at ms.kuolung.net  Tue May 15 14:09:40 2007
From: kuolung at ms.kuolung.net (Kuolung)
Date: Tue May 15 14:10:21 2007
Subject: [LARTC] Re: Using multiple network interfaces (internetconnections) separately
Message-ID: <00de01c796e9$ede01320$0201a8c0@AMD3000>

Hi,

I want to use the "random selection of gateway, per TCP connection"
option. I can do that right now, but connections to the same remote
site (IP) always go to the same gateway. I think that is an
ip_route_cache problem or something like it. What can I do?

Kuolung

----- Original Message -----
From: "Randy Wallace"
To:
Sent: Sunday, May 06, 2007 8:06 PM
Subject: [LARTC] Re: Using multiple network interfaces (internetconnections) separately

>> Hi,
>>
>> I need a solution for this case:
>> I have a PC (as server) with 3 (or more) Ethernet ports and 3 (or more)
>> Internet access through each Ethernet interface. (from different ISP's
>> and with different IP's of course)
>>
>> I need to download files (using wget or whatever else) through each
>> interface (internet line) separately.
>> For example i need to download "file1" through eth1 (isp1), "file2"
>> through eth2 (isp2) and so on.
>>
>> How can i make this working? any iptables/iproute rules? any Idea?
>>
>> Thanks in Advance,
>> --
>> Ali Sattari (AKA Ali ix)
>
> Ali ix,
>
> This is an application for rules, in the iproute package. how you
> select packets for which internet connection can, best, be done by
> iptables using firewall marks.
>
> The trick is, you can have only one default gateway, unless you use the
> multiple gateway patch, which may not be necessary for what you're
> talking about.
>
> The real question is: how do you plan on classifying traffic?
> * different hosts (IP's) per gateway?
> * random selection of gateway, per TCP connection?
> * different types of traffic (Ports) per gateway?
> * certain domains (only) available on each gateway?
>
> -Randy

From WBohannan at spidersat.com.gh  Tue May 15 18:21:31 2007
From: WBohannan at spidersat.com.gh (William Bohannan)
Date: Tue May 15 18:22:15 2007
Subject: [LARTC] Brouting on two NICS + 1 virtual NIC
Message-ID: <4D411FB02758FE45915E9724339093F62E4618@intranet.scpl.local>

Currently have a bridge working, would now like to add a third virtual
nic so the machine can do nat as well to local users, however after a
crazy amount of reading I can't seem to get my head around it. Please
help.

Have a working bridge below (/etc/network/interfaces); eth0 is the
internet side interface, so a virtual interface like eth1:0 would be
nice :)

auto lo
iface lo inet loopback

auto br0
iface br0 inet static
    address 193.220.59.77
    netmask 255.255.255.128
    network 193.220.59.0
    broadcast 193.220.59.127
    gateway 193.220.59.126
    pre-up /sbin/ip link set eth0 up
    pre-up /sbin/ip link set eth1 up
    pre-up /usr/sbin/brctl addbr br0
    pre-up /usr/sbin/brctl addif br0 eth0
    pre-up /usr/sbin/brctl addif br0 eth1

Kind Regards

William

From janka.lartc at mailnull.com  Wed May 16 16:04:37 2007
From: janka.lartc at mailnull.com (Sam LARTC)
Date: Wed May 16 16:04:55 2007
Subject: [LARTC] tcng + esfq
Message-ID: <4ba64e700705160704t237a0899rb6c0ed6e60038280@mail.gmail.com>

FYI, I've just created a quick patch adding esfq (Enhanced Stochastic
Fairness queueing discipline) for tcng (Traffic Control Next Generation).
Patch is located at http://devel.dob.sk/tcng+esfq.

Enjoy.

Sam

From avalon at friendofpooh.com  Wed May 16 21:56:19 2007
From: avalon at friendofpooh.com (Aleksandar Kostadinov)
Date: Wed May 16 21:56:26 2007
Subject: [LARTC] Re: drop silently locally generated packets
In-Reply-To:
References:
Message-ID:

Hi. I want to drop silently locally generated packets on a specific
interface. I tried 2 approaches:

tc qdisc del dev eth0 root
tc qdisc add dev eth0 root handle 1: htb
tc filter add dev eth0 parent 1: proto ip u32 match ip dst 10.10.10.1 flowid 1:1 police conform-exceed drop/drop

tc qdisc del dev eth0 root
tc qdisc add dev eth0 root handle 1: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
tc qdisc add dev eth0 parent 1:2 handle 3:0 netem drop 100
tc filter add dev eth0 parent 1: proto ip u32 match ip dst 10.10.10.1 flowid 1:2

Both work (drop the packets to 10.10.10.1 and pass any others) but when
I run "ping 10.10.10.1" I get after some time continuously
"ping: sendmsg: No buffer space available".

Any idea why is this happening? As well how could I drop packets without
application being able to detect it?

Thanks much,
Alexander

From avalon at friendofpooh.com  Thu May 17 00:14:09 2007
From: avalon at friendofpooh.com (Aleksandar Kostadinov)
Date: Thu May 17 00:14:18 2007
Subject: [LARTC] Re: drop silently locally generated packets
In-Reply-To: <464B6A84.7040701@equinox-eng.com>
References: <464B6A84.7040701@equinox-eng.com>
Message-ID:

On 5/16/07, Gustin Johnson wrote:
>
> Is there a reason you are not using iptables to drop these packets?

yes. First it is not invisible for the application (try yourself with
ping). If I use QUEUE though it's really transparent. Ask netfilter guys
why. But I need these packets to be received locally and that's why
iptables can't help.
I mean I give an example using ping but I am actually going to
handle multicast packets that have to be received by other local
processes. I just don't want these to go out of the machine.
Applications are not in my control to change ttl or whatever. The
solutions I propose seem to work fine, but I'm not sure if there aren't
any side effects that could appear depending on how the application has
been written. The only thing returning errors I've found is ping, but how
could I know if any application I'm running will work fine? The other
tool I could try is mrouted but I think there should be an easier way.

> This drops packets originating on the Linux box
> iptables -A OUTPUT -d 10.10.10.1 -j DROP
>
> The following drops packets that originate elsewhere (such as a NAT'd LAN)
> iptables -A FORWARD -d 10.10.10.1 -j DROP
>

From fernandes_pablo at yahoo.com.br  Wed May 16 21:30:39 2007
From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo)
Date: Thu May 17 01:30:57 2007
Subject: [LARTC] statistics and calc bandwidth traffic using tc -s qdisc show
Message-ID: <20070516233053.CFCE93F85@outpost.ds9a.nl>

Hello,

Is there someone here who knows what this means?

The Sent part.

[root@fw ~]# tc -s qdisc show |grep -A 2 "qdisc sfq 140: dev eth0"
qdisc sfq 140: dev eth0 parent 1:140 limit 128p quantum 1514b perturb 10sec
 Sent 3155024 bytes 23249 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

[root@fw ~]# tc -s qdisc show |grep -A 2 "qdisc sfq 140: dev eth1"
qdisc sfq 140: dev eth1 parent 1:140 limit 128p quantum 1514b perturb 10sec
 Sent 41141183 bytes 32560 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

I would also like to know if there is a way to calculate the bandwidth
(in kbit, for example) of this customer using this information.
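
One way to get a kbit/s figure out of these counters is to read the Sent
byte count twice and divide the difference by the time between the two
readings. A minimal sketch, with hypothetical helper names; the first
counter is taken from the eth0 output above, while the second counter and
the 10 second interval are invented for the example:

```shell
#!/bin/sh
# Extract the byte counter from the "Sent N bytes ..." line of tc -s output on stdin.
sent_bytes() {
    awk '$1 == "Sent" && $3 == "bytes" { print $2; exit }'
}

# Average rate in kbit/s between two byte counters taken SECS seconds apart.
# Arguments: first counter, second counter, interval in seconds.
rate_kbit() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (b - a) * 8 / 1000 / s }'
}

# Sample line copied from the eth0 output; the later counter value is made up.
first=$(echo " Sent 3155024 bytes 23249 pkt (dropped 0, overlimits 0 requeues 0)" | sent_bytes)
rate_kbit "$first" 3280024 10
```

On a live system the input to sent_bytes would be the same
tc -s qdisc show | grep -A 2 pipeline as above, run once at the start and
once at the end of the interval.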
Thank you for any help in advance.

Pablo Fernandes

From e.janz at barceloviajes.com  Thu May 17 09:42:26 2007
From: e.janz at barceloviajes.com (e.janz@barceloviajes.com)
Date: Thu May 17 09:42:29 2007
Subject: [LARTC] statistics and calc bandwidth traffic using tc -s qdisc show
In-Reply-To: <20070516233053.CFCE93F85@outpost.ds9a.nl>
Message-ID:

lartc-bounces@mailman.ds9a.nl wrote on 16/05/2007 21:30:39:

> Hello,
>
> Is there someone here who knows what this means?
>
> The Sent part.
>
> [root@fw ~]# tc -s qdisc show |grep -A 2 "qdisc sfq 140: dev eth0"
> qdisc sfq 140: dev eth0 parent 1:140 limit 128p quantum 1514b perturb 10sec
>  Sent 3155024 bytes 23249 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
>
> [root@fw ~]# tc -s qdisc show |grep -A 2 "qdisc sfq 140: dev eth1"
> qdisc sfq 140: dev eth1 parent 1:140 limit 128p quantum 1514b perturb 10sec
>  Sent 41141183 bytes 32560 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
>
> I would also like to know if there is a way to calculate the bandwidth
> (in kbit, for example) of this customer using this information.
>
> Thank you for any help in advance.
>
> Pablo Fernandes

Hi,

the "Sent" parameter shows you the amount of data that has fallen into
this qdisc. You can obtain the instant bandwidth usage that falls into
this qdisc by parsing the "rate" parameter. In your example the rate is
0bit, which means 0 bits per second of bandwidth usage.

I must admit that the output from tc -s is a big pain !!

Best regards,
Eric Janz

--
ADVERTENCIA LEGAL
El contenido de este correo es confidencial y dirigido unicamente a su destinatario.
Para acceder a su clausula de privacidad consulte http://www.barceloviajes.com/privacy

LEGAL ADVISORY
This message is confidential and intended only for the person or entity
to which it is addressed. In order to read its privacy policy consult it
at http://www.barceloviajes.com/privacy

From vladsun at relef.net  Thu May 17 12:56:39 2007
From: vladsun at relef.net (VladSun)
Date: Thu May 17 12:57:10 2007
Subject: [LARTC] IPCLASSIFY - patch based on IPMARK
Message-ID: <464C34E7.7050002@relef.net>

Hello everybody!

Some time ago I decided that using the MARK property of the Linux IP
packet structure for the needs of traffic control is not very useful. So
I wrote an iptables patch called IPCLASSIFY. It is fully based on IPMARK
but it uses the PRIORITY field instead of MARK. The relation between
IPCLASSIFY<->CLASSIFY is the same as IPMARK<->MARK. By using IPCLASSIFY
not a single TC filter is needed any more! Additionally, the MARK field
can be used for something else, more useful.

You can find it here:
http://openfmi.net/frs/download.php/385/IPCLASSIFY.tar.gz

Feel free to report any bugs. :)

Have a nice day!

From salatiel.filho at gmail.com  Thu May 17 14:39:57 2007
From: salatiel.filho at gmail.com (Salatiel Filho)
Date: Thu May 17 14:40:11 2007
Subject: [LARTC] statistics and calc bandwidth traffic using tc -s qdisc show
In-Reply-To: <20070516233053.CFCE93F85@outpost.ds9a.nl>
References: <20070516233053.CFCE93F85@outpost.ds9a.nl>
Message-ID:

I use tc-viewer. It does a great job.
http://snaj.ath.cx/tc-viewer/tc-viewer.html

--
[]'s
Salatiel

"The greatest pleasure of an intelligent person is to play the fool in
front of a fool who plays at being intelligent."

From mk at crc.dk  Thu May 17 15:02:18 2007
From: mk at crc.dk (Mogens Kjaer)
Date: Thu May 17 15:02:20 2007
Subject: [LARTC] Newbie: Route some traffic through a pptp tunnel
Message-ID: <464C525A.2080307@crc.dk>

I have a centos 4 i386 machine that works like a
router (iptables filter, NAT) with two NIC's.

One NIC is connected to my ISP (100 Mbit FTTH),
I get a DHCP assigned public IP that changes
"sometimes". Most incoming ports are blocked
by my ISP.

In order to get a fixed IP and open ports, I
have to set up a PPTP tunnel to the ISP.

The default gw and the NAT'ing goes to this tunnel.
This is the output of ifconfig:

eth0      Link encap:Ethernet  HWaddr 00:80:C8:EA:88:A7
          inet addr:86.48.47.147  Bcast:86.48.47.255  Mask:255.255.254.0
          inet6 addr: fe80::280:c8ff:feea:88a7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8083596 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3408048 errors:22 dropped:0 overruns:22 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1538901914 (1.4 GiB)  TX bytes:519514046 (495.4 MiB)
          Interrupt:169 Base address:0x4000

eth1      Link encap:Ethernet  HWaddr 00:12:79:A0:3D:7E
          inet addr:192.168.4.1  Bcast:192.168.4.255  Mask:255.255.255.0
          inet6 addr: fe80::212:79ff:fea0:3d7e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:126264 errors:0 dropped:0 overruns:0 frame:0
          TX packets:155536 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:23156937 (22.0 MiB)  TX bytes:111015780 (105.8 MiB)
          Interrupt:177

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:912424 errors:0 dropped:0 overruns:0 frame:0
          TX packets:912424 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:103397649 (98.6 MiB)  TX bytes:103397649 (98.6 MiB)

ppp0      Link encap:Point-to-Point Protocol
          inet addr:86.48.43.19  P-t-P:81.19.236.186  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1000  Metric:1
          RX packets:120948 errors:0 dropped:0 overruns:0 frame:0
          TX packets:109043 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:3
          RX bytes:80518167 (76.7 MiB)  TX bytes:37434930 (35.7 MiB)

This works today, my problem is that the tunneled traffic is slower than
going through eth0 directly.

How can I:

1. Use the tunnel for incoming and outgoing mail and incoming http
   requests.
2. NAT traffic from eth1 to eth0, i.e. not through the tunnel
3. Local traffic from the router should access the internet through
   eth0, except for outgoing mails.
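
One possible shape for these three requirements, offered as an untested
sketch rather than a recipe: leave the main default route on eth0 and
send only selected traffic through the tunnel with policy routing. The
mark value 1, the table id 100, and the helper name are made up for
illustration, and the script only prints the commands. It assumes the
mail and web services run on the router itself and that NAT towards both
eth0 and ppp0 is already in place (locally generated mail picks its source
address before the mark reroutes it, so it needs SNAT or MASQUERADE on
ppp0).

```shell
#!/bin/sh
# Print (not run) policy-routing commands that send selected traffic via a tunnel.
# Arguments: fwmark, table id, tunnel device, tunnel-side local address, LAN device.
# Mark 1 and table 100 in the example call are arbitrary placeholders.
tunnel_policy_cmds() {
    mark=$1
    table=$2
    tun_dev=$3
    tun_ip=$4
    lan_dev=$5
    # Separate table whose only default route is the tunnel.
    echo "ip route replace default dev $tun_dev table $table"
    # Replies from services bound to the tunnel address leave through the tunnel.
    echo "ip rule add from $tun_ip table $table"
    # Anything carrying the mark also uses the tunnel table.
    echo "ip rule add fwmark $mark table $table"
    # Mark outgoing SMTP forwarded from the LAN and SMTP generated on the router.
    echo "iptables -t mangle -A PREROUTING -i $lan_dev -p tcp --dport 25 -j MARK --set-mark $mark"
    echo "iptables -t mangle -A OUTPUT -p tcp --dport 25 -j MARK --set-mark $mark"
}

tunnel_policy_cmds 1 100 ppp0 86.48.43.19 eth1
```

Everything that is neither marked nor sourced from the tunnel address keeps
using the main table, so the main default route would point at the eth0
gateway.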
Mogens
--
Mogens Kjaer, Carlsberg A/S, Computer Department
Gamle Carlsberg Vej 10, DK-2500 Valby, Denmark
Phone: +45 33 27 53 25, Fax: +45 33 27 47 08
Email: mk@crc.dk Homepage: http://www.crc.dk

From vladsun at relef.net  Thu May 17 15:07:10 2007
From: vladsun at relef.net (VladSun)
Date: Thu May 17 15:07:38 2007
Subject: [LARTC] Newbie: Route some traffic through a pptp tunnel
In-Reply-To: <464C525A.2080307@crc.dk>
References: <464C525A.2080307@crc.dk>
Message-ID: <464C537E.4040800@relef.net>

Mogens Kjaer wrote:
> This works today, my problem is that the tunneled traffic is slower than
> going through eth0 directly.
>
> How can I:
>
> 1. Use the tunnel for incoming and outgoing mail and incoming http
>    requests.
> 2. NAT traffic from eth1 to eth0, i.e. not through the tunnel
> 3. Local traffic from the router should access the internet through
>    eth0, except for outgoing mails.
>
> Mogens

You may find the ROUTE iptables target useful for this.

From support at isn.net  Fri May 18 07:08:12 2007
From: support at isn.net (ISN Support Staff)
Date: Fri May 18 07:08:21 2007
Subject: [LARTC] High Latency With Tiered Queues
Message-ID: <1179464892.13716.12.camel@eliza>

Hello,

I'm trying to set up what I thought would be a fairly basic tiered
shaping system. I have a 6mbit (768kbps) link coming into my eth1 device,
with my LAN IPs on the eth0 device. I want to limit outgoing traffic so
that certain IPs are limited to 400kbps, with 3 classes under that 400k
so certain machines get prioritized (main servers in 1:21, other servers
in 1:22, workstations in 1:23).

The problem is that when I turn this on, my packet latency jumps up by 50
to 100 times the normal rate. I go from 10-20 ms ping times to
500-1600ms!

I've tried putting SFQ qdiscs under the classes, but that makes no
difference.
I'm sure there is just some tuning parameter I'm not setting correctly, but can somebody clue me in to what I'm doing wrong? Or is HTB just the wrong scheduler to be using here? I tried CBQ, but I can't get the tiers to work (I keep getting RTNETLINK answers: Invalid argument). I'm currently using a single tiered CBQ solution, but it really doesn't fit my needs.

Here's the full script:
-----------------------
qdisc add dev eth1 root handle 1: htb default 10
class add dev eth1 parent 1: classid 1:1 htb rate 768kbps
class add dev eth1 parent 1: classid 1:10 htb rate 250kbps ceil 768kbps burst 6k
class add dev eth1 parent 1: classid 1:20 htb rate 100kbps ceil 400kbps burst 6k

class add dev eth1 parent 1:20 classid 1:21 htb rate 75kbps ceil 375kbps burst 6k
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.35 flowid 1:21
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.29 flowid 1:21
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.28 flowid 1:21
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.16 flowid 1:21

class add dev eth1 parent 1:20 classid 1:22 htb rate 15kbps ceil 350kbps burst 6k
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.30 flowid 1:22
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.31 flowid 1:22
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.32 flowid 1:22
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.36 flowid 1:22
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.39 flowid 1:22
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.40 flowid 1:22
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.47 flowid 1:22

class add dev eth1 parent 1:20 classid 1:23 htb rate 10kbps ceil 200kbps burst 6k
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.33 flowid 1:23
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.34 flowid 1:23
filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip src 192.168.1.37 flowid 1:23

From bugfood-ml at fatooh.org Fri May 18 08:46:22 2007
From: bugfood-ml at fatooh.org (Corey Hickey)
Date: Fri May 18 08:46:46 2007
Subject: [LARTC] tcng + esfq
In-Reply-To: <4ba64e700705160704t237a0899rb6c0ed6e60038280@mail.gmail.com>
References: <4ba64e700705160704t237a0899rb6c0ed6e60038280@mail.gmail.com>
Message-ID: <464D4BBE.6080907@fatooh.org>

Sam LARTC wrote:
> FYI,
>
> i've just created a quick patch adding esfq (Enhanced Stochastic
> Fairness queueing discipline) for tcng (Traffic Control Next
> Generation).
> Patch is located at http://devel.dob.sk/tcng+esfq.
> Enjoy.

I put a link to your patch page on the ESFQ page. Next time I make a release I'll put a note in the README as well.

I don't use tcng, but I had a quick look at your patch and noticed a very minor error:

--------------------------------------------------------------------
diff -urN tcng/tcc/q_esfq.c tcng-sam/tcc/q_esfq.c
--- tcng/tcc/q_esfq.c 1970-01-01 01:00:00.000000000 +0100
+++ tcng-sam/tcc/q_esfq.c 2007-05-06 15:37:32.154594952 +0200
@@ -0,0 +1,78 @@
+/*
+ * q_esfq.c - Enhanced Statistical Fair Queuing qdisc
--------------------------------------------------------------------

ESFQ stands for "Enhanced Stochastic Fairness Queueing". That's all.

-Corey

From hijacker at oldum.net Fri May 18 17:37:22 2007
From: hijacker at oldum.net (Nikolay Kichukov)
Date: Fri May 18 17:38:11 2007
Subject: [LARTC] statistics and calc bandwidth traffic using tc -s qdisc show
In-Reply-To:
References: <20070516233053.CFCE93F85@outpost.ds9a.nl>
Message-ID: <1179502642.16009.13.camel@hanna.taxback.ess.ie>

seems so cool ...

nice find ;-)
thanks for sharing ;-)

-nik

On Thu, 2007-05-17 at 09:39 -0300, Salatiel Filho wrote:
> I use tc-viewer . It does a great job.
> http://snaj.ath.cx/tc-viewer/tc-viewer.html > > > On 5/16/07, Pablo Fernandes Yahoo > wrote: > Hello, > > > > Is there someone here who knows what does it means? > > > > The Sent part. > > > > [root@fw ~]# tc -s qdisc show |grep -A 2 "qdisc sfq 140: dev > eth0" > > qdisc sfq 140: dev eth0 parent 1:140 limit 128p quantum 1514b > perturb 10sec > > Sent 3155024 bytes 23249 pkt (dropped 0, overlimits 0 > requeues 0) > > rate 0bit 0pps backlog 0b 0p requeues 0 > > > > [root@fw ~]# tc -s qdisc show |grep -A 2 "qdisc sfq 140: dev > eth1" > > qdisc sfq 140: dev eth1 parent 1:140 limit 128p quantum 1514b > perturb 10sec > > Sent 41141183 bytes 32560 pkt (dropped 0, overlimits 0 > requeues 0) > > rate 0bit 0pps backlog 0b 0p requeues 0 > > > > > > I also would like to know if there is a way to calc the > bandwidth traffic (in kbit for example) of this > customer using this informations. > > > > Thank you for any help in advance. > > > > Pablo Fernandes > > > > > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > > > > -- > []'s > Salatiel > > "O maior prazer do inteligente ? bancar o idiota > diante de um idiota que banca o inteligente". > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From ethy.brito at inexo.com.br Fri May 18 17:48:04 2007 From: ethy.brito at inexo.com.br (Ethy H. Brito) Date: Fri May 18 17:48:15 2007 Subject: [LARTC] statistics and calc bandwidth traffic using tc -s qdisc show In-Reply-To: <1179502642.16009.13.camel@hanna.taxback.ess.ie> References: <20070516233053.CFCE93F85@outpost.ds9a.nl> <1179502642.16009.13.camel@hanna.taxback.ess.ie> Message-ID: <20070518124804.4cbbd9a2@pulsar.inexo.com.br> On Fri, 18 May 2007 18:37:22 +0300 Nikolay Kichukov wrote: > seems so cool ... 
>
> nice find ;-)
> thanks for sharing ;-)
>
> -nik
>
> On Thu, 2007-05-17 at 09:39 -0300, Salatiel Filho wrote:
> > I use tc-viewer . It does a great job.
> > http://snaj.ath.cx/tc-viewer/tc-viewer.html

Hi All

Is there any output that counts the number of dropped bytes (not packets) just as in "Sent" output?

Any patch perhaps??

Regards

--
Ethy H. Brito         /"\
InterNexo Ltda.       \ /  CAMPANHA DA FITA ASCII - CONTRA MAIL HTML
+55 (12) 3797-6860     X   ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
S.J.Campos - Brasil   / \

From shetravel at gmail.com Sat May 19 23:03:38 2007
From: shetravel at gmail.com (shetravel)
Date: Sat May 19 23:03:55 2007
Subject: [LARTC] ipip/gre tunnel behind NAT environments.
Message-ID: <63d6f13b0705191403y7f9256cbp1bbcd2d9b9575d83@mail.gmail.com>

Hi,

Has anyone tried to get an ipip or gre tunnel working behind NAT? I'm trying to make both sides tunnel with ipip or gre, with a private address on one side, just like below:

A ------------------- FIREWALL ------------------- INET ------------------- B
PRIVATE               PUBLIC                                                PUBLIC
(10.100.0.1)          (211.xxx.xxx.xxx)                                     (211.xxx.xxx.xxx)

Is it possible to connect both sides with IPIP or GRE tunnels?

Thanks in advance.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070520/ab989397/attachment.html

From drumlesson at gmail.com Sun May 20 01:25:51 2007
From: drumlesson at gmail.com (terraja-based)
Date: Sun May 20 01:25:56 2007
Subject: [LARTC] Re: LARTC Digest, Vol 27, Issue 26
In-Reply-To: <20070519100005.F40A34100@outpost.ds9a.nl>
References: <20070519100005.F40A34100@outpost.ds9a.nl>
Message-ID: <823158cf0705191625o14a7ff82n8054365e8eff226e@mail.gmail.com>

Hi folks...!!!

I need to generate qdisc statistics to show my 4 classes (10, 20, 30, 40). I've got it all working with HTB and so on, but I need to graph these results, e.g. with RRDTOOL.

I found a script written in Perl that can graph my 4 classes, but I also need to know which IP addresses on my LAN are using the bandwidth; in other words, I need to classify the traffic by IP for display.

This is the output of my "tc -s qdisc show":

***********************************************************
qdisc htb 1: r2q 10 default 40 direct_packets_stat 0
 Sent 935816543 bytes 791394 pkt (dropped 0, overlimits 117076 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 10: parent 1:10 limit 128p quantum 1500b perturb 10sec
 Sent 2385144 bytes 21890 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 20: parent 1:20 limit 128p quantum 1500b perturb 10sec
 Sent 614622187 bytes 536309 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 30: parent 1:30 limit 128p quantum 1500b perturb 10sec
 Sent 17904922 bytes 14150 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 40: parent 1:40 limit 128p quantum 1500b perturb 10sec
 Sent 300904290 bytes 219045 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
***********************************************************

I'll appreciate any help, thanks a lot

terraja-based
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070519/c4c09806/attachment.htm

From salatiel.filho at gmail.com Sun May 20 03:08:43 2007
From: salatiel.filho at gmail.com (Salatiel Filho)
Date: Sun May 20 03:08:53 2007
Subject: [LARTC] Re: LARTC Digest, Vol 27, Issue 26
In-Reply-To: <823158cf0705191625o14a7ff82n8054365e8eff226e@mail.gmail.com>
References: <20070519100005.F40A34100@outpost.ds9a.nl> <823158cf0705191625o14a7ff82n8054365e8eff226e@mail.gmail.com>
Message-ID:

On 5/19/07, terraja-based wrote:
>
> Hi folks...!!!
>
>
> I need to generate qdisc statistics to show my 4 class (10, 20, 30, 40),
> i`ve all working with HTB and so on, but i need to graph this results e.gwith RRDTOOL.
>
> I found a script made in perl, that can to graph my 4 class, but i need to
> know which IP address on my LAN are using the bandwidth too, in other hand i
> need to classify the traffic by IP to show.
>
> This is an out of my "tc -s qdisc show":
>
> ***********************************************************
> qdisc htb 1: r2q 10 default 40 direct_packets_stat 0
> Sent 935816543 bytes 791394 pkt (dropped 0, overlimits 117076 requeues 0)
>
> rate 0bit 0pps backlog 0b 0p requeues 0
>
> qdisc sfq 10: parent 1:10 limit 128p quantum 1500b perturb 10sec
> Sent 2385144 bytes 21890 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
>
> qdisc sfq 20: parent 1:20 limit 128p quantum 1500b perturb 10sec
> Sent 614622187 bytes 536309 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
>
> qdisc sfq 30: parent 1:30 limit 128p quantum 1500b perturb 10sec
> Sent 17904922 bytes 14150 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
>
> qdisc sfq 40: parent 1:40 limit 128p quantum 1500b perturb 10sec
> Sent 300904290 bytes 219045 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
> ***********************************************************
>
>
> I`ll apreciate any help, thx a lot
>
> terraja-based
>

Use polltc; I do not remember the URL, but it outputs RRD files.

--
[]'s
Salatiel

"The greatest pleasure of an intelligent person is to play the fool in front of a fool who plays the intelligent one."
-------------- next part --------------
An HTML attachment was scrubbed...
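
For the per-class part of the question, the counters can be pulled straight out of the `tc -s qdisc show` text quoted in this thread. What follows is only a minimal sketch, assuming the output keeps the "qdisc ..." line followed by a "Sent ..." line as shown in the message; `qdisc_counters` is a made-up helper name, not part of iproute2, and it does nothing about the per-IP breakdown that was also asked for.

```shell
#!/bin/bash
# Hypothetical helper (not part of iproute2): reads `tc -s qdisc show`
# style text on stdin and prints one line per qdisc:
#   <kind> <handle> <sent_bytes> <sent_packets>
qdisc_counters() {
    awk '
        # "qdisc sfq 10: parent 1:10 ..." -> remember kind and handle
        $1 == "qdisc" { kind = $2; handle = $3 }
        # "Sent 2385144 bytes 21890 pkt (...)" -> emit the counters
        $1 == "Sent"  { print kind, handle, $2, $4 }
    '
}

# Example:
#   tc -s qdisc show dev eth0 | qdisc_counters
```

The byte counts only ever grow, so a graphing tool has to work on the difference between two samples taken a known interval apart rather than on the raw numbers; RRDtool's COUNTER data source type is meant for this kind of ever-increasing value.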
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070519/bdb8037c/attachment.html

From ethy.brito at inexo.com.br Sun May 20 17:17:32 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Sun May 20 17:14:24 2007
Subject: [LARTC] dropped bytes in "tc -s class" output
Message-ID: <20070520121732.3cf9b0e6@babalu.inexo.com.br>

Hi All

Is there any output that counts the number of dropped bytes (not packets) just as in "Sent" in "tc -s class" output?

I have an HTB arrangement here where I can see "dropped" in the father/mother 1:0 qdisc but NOT in each class (they are all zeroes). Are these dropped packets or bytes? Why do these "drops" not show themselves in their own class(es) under the qdisc?

Any patch perhaps??

Regards

--
Ethy H. Brito         /"\
InterNexo Ltda.       \ /  CAMPANHA DA FITA ASCII - CONTRA MAIL HTML
+55 (12) 3797-6860     X   ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
S.J.Campos - Brasil   / \

From christian.benvenuti at libero.it Sun May 20 22:56:23 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Sun May 20 22:49:12 2007
Subject: [LARTC] Re: dropped bytes in "tc -s class" output
Message-ID: <1179694583.7354.9.camel@benve-laptop>

Hi,

>Hi All
>
>Is there any output that counts the number of dropped bytes
>(not packets) just as in "Sent" in "tc -s class" output?

No.
A simple workaround (for simple configurations) consists of redirecting all the traffic you want to drop to a dedicated class and attaching a blackhole qdisc (i.e., drop everything) to it.
Supposing you redirect to class 1:12 all the traffic that is to be dropped, all you need is:

  tc qdisc add dev eth1 parent 1:12 handle 12: blackhole

The counters of class 1:12 are what you are looking for:

class htb 1:12 parent 1:1 leaf 12: prio 0 quantum 1600 rate 128000bit ceil 128000bit burst 1664b/8 mpu 0b overhead 0b cburst 1664b/8 mpu 0b overhead 0b level 0
 Sent 2352 bytes 24 pkt (dropped 0, overlimits 0 requeues 0)
 ^^^^^^^^^^^^^^^^^^^^^^

qdisc blackhole 12: dev eth1 parent 1:12
 Sent 0 bytes 0 pkt (dropped 24, overlimits 0 requeues 0)
                     ^^^^^^^^^^

>I have an HTB arrangement here I can see "dropped" in father/mother 1:0 qdisc
>but NOT in each class (they are all zeroes).
>These dropped are packets or bytes?

Number of packets.

>Why these "drops" do not show thenselves in its own class(es) under its
>qdisc?
>
>Any patch perhaps??
>Regards

Regards
/Christian
[http://benve.info]

From christian.benvenuti at libero.it Sun May 20 23:18:47 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Sun May 20 23:11:22 2007
Subject: [LARTC] Re: statistics and calc bandwidth traffic using tc -s qdisc show
Message-ID: <1179695927.7354.16.camel@benve-laptop>

>> Hello,
>>
>> Is there someone here who knows what does it means?
>>
>> The Sent part.
>>
>> [root@fw ~]# tc -s qdisc show |grep -A 2 "qdisc sfq 140: dev eth0"
>> qdisc sfq 140: dev eth0 parent 1:140 limit 128p quantum 1514b perturb 10sec
>> Sent 3155024 bytes 23249 pkt (dropped 0, overlimits 0 requeues 0)
>> rate 0bit 0pps backlog 0b 0p requeues 0
>>
>> [root@fw ~]# tc -s qdisc show |grep -A 2 "qdisc sfq 140: dev eth1"
>> qdisc sfq 140: dev eth1 parent 1:140 limit 128p quantum 1514b perturb 10sec
>> Sent 41141183 bytes 32560 pkt (dropped 0, overlimits 0 requeues 0)
>> rate 0bit 0pps backlog 0b 0p requeues 0
>>
>>
>> I also would like to know if there is a way to calc the bandwidth
>> traffic (in kbit for example) of this customer using this
>> informations.
>>
>> Thank you for any help in advance.
>>
>> Pablo Fernandes
>
>Hi,
>
>the "Sent" parameter shows you the amount of data that fall into this
>qdisc. You can obtain the instant bandwith usage that falls into this
>qdisc parsing the "rate" parameter. In your example the rate is 0bit, that
>means 0 bits per second bandwith usage.
>I must admit that the output from tc -s is a big pain !!

Hi Pablo, Eric,

the rate parameter is (usually) not initialized unless you ask for it explicitly. HTB by default measures the rate of its classes (not the qdisc), but the other qdiscs do not.
If you want to measure the rates (of a non-HTB class), you need to use the "estimator" option. You also need to make sure your kernel is compiled with support for it:

  Networking
   +-> Networking options
     +-> QoS and/or fair queueing
       +-> Rate estimator

The "Rate estimator" option is selected by default, for example, if you enable the "Actions" option (that is in the same menu).

Here is an example of configuration of an SFQ qdisc with an estimator attached:

  tc qdisc add dev eth1 root estimator 1sec 8sec sfq perturb 10
                             ^^^^^^^^^^^^^^^^^^^

The above example (1sec 8sec) says this: read the counters every "1 second" and give me the EWMA average rate over an interval of "8 seconds".

Here is an example of output:

  qdisc sfq 8001: limit 128p quantum 1514b flows 128/1024 perturb 10sec
   Sent 24010 bytes 245 pkt (dropped 0, overlimits 0 requeues 0)
   rate 7520bit 10pps backlog 0b 0p requeues 0
   ^^^^^^^^^^^^^^^^^^

"rate" is the bit/s rate and "pps" is the pkt/s rate.

Regards
/Christian
[http://benve.info]

From ethy.brito at inexo.com.br Sun May 20 23:28:24 2007
From: ethy.brito at inexo.com.br (Ethy H.
Brito)
Date: Sun May 20 23:25:00 2007
Subject: [LARTC] Re: dropped bytes in "tc -s class" output
In-Reply-To: <1179694583.7354.9.camel@benve-laptop>
References: <1179694583.7354.9.camel@benve-laptop>
Message-ID: <20070520182824.2ff1ae02@babalu.inexo.com.br>

On Sun, 20 May 2007 22:56:23 +0200 Christian Benvenuti wrote:

> Hi,
>
> >Hi All
> >
> >Is there any output that counts the number of dropped bytes
> >(not packets) just as in "Sent" in "tc -s class" output?
>
> No.
> A simple workaround (for simple configurations) consists of redirecting
> all the traffic you want to drop to a dedicated class and attach a
> blackhole qdisc (i.e., drop everything) to it.

Hmmm. I am pretty sure I did not say what I meant.

I assume that at some point an HTB class will drop some packets that are beyond its speed regulation, right? I just need to measure this amount of dropped bytes (not packets). With this measure I can feed it to MRTG and give the clients some feeling that they really need more bandwidth.

How can I do this?

Ethy

From luciano at lugmen.org.ar Mon May 21 01:07:35 2007
From: luciano at lugmen.org.ar (Luciano Ruete)
Date: Mon May 21 01:07:34 2007
Subject: [LARTC] IPCLASSIFY - patch based on IPMARK
In-Reply-To: <464C34E7.7050002@relef.net>
References: <464C34E7.7050002@relef.net>
Message-ID: <200705202007.35544.luciano@lugmen.org.ar>

On Thursday 17 May 2007 07:56, VladSun wrote:
> Hello everybody!
>
> Some time ago I've decided that using the MARK property of the Linux IP
> packet structure for the needs of traffic control is not very useful. So
> I wrote an iptables patch called IPCLASSIFY. It is fully based on IPMARK
> but it uses the PRIORITY field instead of MARK.
>
> The relation between IPCLASSIFY<->CLASSIFY is the same as IPMARK<->MARK.
> By using IPCLASSIFY not a single TC filter is needed any more!
> Additionally, the MARK field can be used for something else, more useful.
>
> You can find it here :
> http://openfmi.net/frs/download.php/385/IPCLASSIFY.tar.gz .
> > Fell free to report any bugs. :) Ok, here is the first, 0bytes in the tar.gz of the above url :) luciano@sarasvati:~/downloads/apps$ wget http://openfmi.net/frs/download.php/385/IPCLASSIFY.tar.gz --20:05:43-- http://openfmi.net/frs/download.php/385/IPCLASSIFY.tar.gz => `IPCLASSIFY.tar.gz' Resolving openfmi.net... 62.44.101.15 Connecting to openfmi.net|62.44.101.15|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] [ <=> ] 0 --.--K/s 20:05:50 (0.00 B/s) - `IPCLASSIFY.tar.gz' saved [0] luciano@sarasvati:~/downloads/apps$ ls -lh IPCLASSIFY.tar.gz -rw-r--r-- 1 luciano luciano 0 2007-05-20 20:05 IPCLASSIFY.tar.gz luciano@sarasvati:~/downloads/apps$ :-) -- Luciano From vladsun at relef.net Mon May 21 01:13:25 2007 From: vladsun at relef.net (VladSun) Date: Mon May 21 01:13:55 2007 Subject: [LARTC] IPCLASSIFY - patch based on IPMARK In-Reply-To: <200705202007.35544.luciano@lugmen.org.ar> References: <464C34E7.7050002@relef.net> <200705202007.35544.luciano@lugmen.org.ar> Message-ID: <4650D615.4070808@relef.net> Sorry! The packet has been modified since I wrote this message. The new URL is: http://openfmi.net/frs/download.php/410/IPCLASSIFY.zip Luciano Ruete ??????: > On Thursday 17 May 2007 07:56, VladSun wrote: > >> Hello everybody! >> >> Some time ago I've decided that using the MARK property of the Linux IP >> packet structure for the needs of traffic control is not very useful. So >> I wrote an iptables patch called IPCLASSIFY. It is fully based on IPMARK >> but it uses the PRIORITY field instead of MARK. >> >> The relation between IPCLASSIFY<->CLASSIFY is the same as IPMARK<->MARK. >> By using IPCLASSIFY not a single TC filter is needed any more! >> Additionally, the MARK field can be used for something else, more useful. >> >> You can find it here : >> http://openfmi.net/frs/download.php/385/IPCLASSIFY.tar.gz . >> >> Fell free to report any bugs. 
:) >> > > Ok, here is the first, 0bytes in the tar.gz of the above url :) > luciano@sarasvati:~/downloads/apps$ wget > http://openfmi.net/frs/download.php/385/IPCLASSIFY.tar.gz > --20:05:43-- http://openfmi.net/frs/download.php/385/IPCLASSIFY.tar.gz > => `IPCLASSIFY.tar.gz' > Resolving openfmi.net... 62.44.101.15 > Connecting to openfmi.net|62.44.101.15|:80... connected. > HTTP request sent, awaiting response... 200 OK > Length: unspecified [text/html] > > [ > <=> ] > 0 --.--K/s > > 20:05:50 (0.00 B/s) - `IPCLASSIFY.tar.gz' saved [0] > > luciano@sarasvati:~/downloads/apps$ ls -lh IPCLASSIFY.tar.gz > -rw-r--r-- 1 luciano luciano 0 2007-05-20 20:05 IPCLASSIFY.tar.gz > luciano@sarasvati:~/downloads/apps$ > > :-) > From ryan.castellucci at gmail.com Mon May 21 21:50:56 2007 From: ryan.castellucci at gmail.com (Ryan Castellucci) Date: Mon May 21 21:51:16 2007 Subject: [LARTC] ipip/gre tunnel behind NAT environments. In-Reply-To: <63d6f13b0705191403y7f9256cbp1bbcd2d9b9575d83@mail.gmail.com> References: <63d6f13b0705191403y7f9256cbp1bbcd2d9b9575d83@mail.gmail.com> Message-ID: <118619310705211250p4033cc2dha28eae80b132cc9b@mail.gmail.com> On 5/19/07, shetravel wrote: > Hi, Does anyone tried to get ipip or gre tunnel behind NAT environments. ? > i'm trying to make both side tunneling with ipip or gre with private address > just like belows.. > > > A -------------------FIRWWAL -------------------INET ------------------- B > PRIVATE PUBLIC > PUBLIC > (10.100.0.1) (211.xxx.xxx.xxx) > (211.xxx.xxx.xxx) > > is it possible to make both side connections with IPIP or GRE tunnels ? > thanks in advance. If the firewall is a linux system, you should be able to easily use DNAT to forward the ipip or gre packets to host 'A'. Something like... 
iptables -t nat -A PREROUTING -i [Firewall's internet facing interface] -s [Host B's IP] -d [Firewall's public IP] -p ipip -j DNAT --to-destination [Host A's IP]

I'm not sure if connection tracking will do any of this automatically, but if it were going to work, A would have to send packets to B over the tunnel first before B could send to A.

--
Ryan Castellucci http://ryanc.org/

From luciano at lugmen.org.ar Tue May 22 05:28:08 2007
From: luciano at lugmen.org.ar (Luciano Ruete)
Date: Tue May 22 05:28:26 2007
Subject: [LARTC] Multihome load balancing - kernel vs netfilter
In-Reply-To: <4647FA30.5040401@rabbit.us>
References: <4647FA30.5040401@rabbit.us>
Message-ID: <200705220028.08859.luciano@lugmen.org.ar>

On Monday 14 May 2007 02:57, Peter Rabbitson wrote:
> Hi,
> I have searched the archives on the topic, and it seems that the list
> gurus favor load balancing to be done in the kernel as opposed to other
> means.

AFAIK there aren't conflicting opinions; there are just two different approaches, and I believe the routing solution is used because it was the first and because it sounds logical to implement multipath with your routing tool. But iptables has become a routing tool too (and much more). Personally I'm using multipath, but I do not dislike the iptables approach.

> I have been using a home-grown approach, which splits traffic
> based on `-m statistic --mode random --probability X`, then CONNMARKs
> the individual connections and the kernel happily routes them. I
> understand that for > 2 links it will become impractical to calculate a
> correct X.

Well, it is not impractical with a little scripting in your firewall...

#!/bin/bash
# your uplinks weight as in kernel multipath
# ie:    link1 link2 link3 link4 link5
weight="   1     2     1     3     5  "
weight_total=
for n in $weight ; do
    let weight_total=weight_total+n
done
for n in $weight ; do
    probability=$((n*100/weight_total))
    echo iptables.. -m statistic --mode random --probability $probability
done

But the problem arises when you have, let's say, 101 links, because mode random takes a 2-digit number, right? This can be changed in the code, though (use the source...).

> But if we only have 2 gateways to the internet - are there
> any advantages in letting the kernel multipath scheduler do the
> balancing (with all the downsides of route caching), as opposed to the
> pure random approach described above?

Well, the disadvantage I see is that you have to move all your routing rules to iptables space, but in the end you always need the routing table; it is a matter of changing old habits...

--
Luciano

From beere at vertis.nl Tue May 22 16:24:29 2007
From: beere at vertis.nl (beere@vertis.nl)
Date: Tue May 22 16:24:45 2007
Subject: [LARTC] lc shaping in- and outbound traffic on same box
References: <50BD9F2BCFC52840934CBADBBA4481B20896C908@aspmail01.ApplicationNet.nl>
Message-ID: <50BD9F2BCFC52840934CBADBBA4481B20975866F@aspmail01.ApplicationNet.nl>

I'm looking for a way to shape all traffic on a virtual vlan interface, say eth1.100, to a max of 100mbit. The box is a quagga router with eth0 on the inside and vlan interfaces on the eth1 card to our upstream partners. Each partner has his own vlan on eth1.

I tried shaping but was only able to shape outbound traffic on the eth1.100 interface. Inbound shaping was also possible, but only on the eth0 interface. As I cannot see on the eth0 interface through which eth1 interface traffic came in, I can't use shaping on that interface this way.

Does anyone have a solution for me? Could this be done using tc's routing filter and route realms?
-------------- next part --------------
An HTML attachment was scrubbed...
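
A side note on the weighted `-m statistic` split from the multihome load balancing message above: iptables documents `--probability` as a value between 0.0 and 1.0, and if each rule in the chain is terminal for the packets it matches, rule N only sees what the earlier rules let through. Under that assumption the per-rule value would be the link's weight divided by the weight still remaining, not by the total. A rough sketch of that arithmetic follows; `chained_probabilities` is a hypothetical helper name and the weights are the ones used in the script above.

```shell
#!/bin/bash
# Hypothetical helper: given integer link weights in rule order, print the
# --probability value for each rule in a chain where a matching rule is
# terminal, so rule N only sees packets that rules 1..N-1 did not match.
chained_probabilities() {
    local remaining=0 w
    # total weight of all links
    for w in "$@"; do
        remaining=$((remaining + w))
    done
    for w in "$@"; do
        # weight of this link divided by the weight not yet handed out
        LC_ALL=C awk -v w="$w" -v r="$remaining" 'BEGIN { printf "%.4f\n", w / r }'
        remaining=$((remaining - w))
    done
}

# Example: chained_probabilities 1 2 1 3 5
```

With weights 1 2 1 3 5 this prints 0.0833, 0.1818, 0.1111, 0.3750 and 1.0000. The last rule catches everything the earlier ones passed on, and the overall shares still work out to 1/12, 2/12, 1/12, 3/12 and 5/12.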
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070522/d5a93bc1/attachment.htm

From konstantin at astafjev.com Tue May 22 16:44:27 2007
From: konstantin at astafjev.com (Konstantin Astafjev)
Date: Tue May 22 16:44:30 2007
Subject: [LARTC] lc shaping in- and outbound traffic on same box
In-Reply-To: <50BD9F2BCFC52840934CBADBBA4481B20975866F@aspmail01.ApplicationNet.nl>
References: <50BD9F2BCFC52840934CBADBBA4481B20896C908@aspmail01.ApplicationNet.nl> <50BD9F2BCFC52840934CBADBBA4481B20975866F@aspmail01.ApplicationNet.nl>
Message-ID: <1955086018.20070522174427@astafjev.com>

Hello beere,

Tuesday, May 22, 2007, 5:24:29 PM, you wrote:

> I'm looking for a way to shape all traffic on a virtual vlan
> interface, say eth1.100 to a max of 100mbit. The box is a quagga
> router with eth0 on the inside and vlan interfaces on the eth1 card
> to our upstream partners. Each partner has his own vlan on eth1.
>
> I tried shaping but was only able to shape outbound traffic on the
> eth1.100 interface. Inbound shaping was also possible, but only on
> the eth0 interface. As I cannot see on the eth0 interface through
> which eth1 interface traffic came in, I can't use shaping on that interface this way.
>
> Does anyone have a solution for me? Could this be done using tc's routing filter and route realms?

If I understand you correctly, http://www.linuximq.net/ is what you need to do shaping on different interfaces.

--
Best regards,
Konstantin

From shetravel at gmail.com Tue May 22 19:52:14 2007
From: shetravel at gmail.com (shetravel)
Date: Tue May 22 19:52:27 2007
Subject: [LARTC] ipip/gre tunnel behind NAT environments.
In-Reply-To: <118619310705211250p4033cc2dha28eae80b132cc9b@mail.gmail.com>
References: <63d6f13b0705191403y7f9256cbp1bbcd2d9b9575d83@mail.gmail.com> <118619310705211250p4033cc2dha28eae80b132cc9b@mail.gmail.com>
Message-ID: <63d6f13b0705221052o31af348fr7396f610dc1af841@mail.gmail.com>

Thank you for the reply, Ryan.
Yes, unfortunately it is not a Linux box, but a D-Link IP sharing box. It only shows me IPSEC/PPTP tunnel pass-through options. So the NAT machine would have to pass the ipip or gre packets through, right?

Thanks in advance.

> 2007/5/22, Ryan Castellucci :
> If the firewall is a linux system, you should be able to easily use
> DNAT to forward the ipip or gre packets to host 'A'.
>
> Something like...
>
> iptables -t nat -A PREROUTING -i [Firewall's internet facing
> interface] -s [Host B's IP] -d [Firewall's public IP] -p ipip -j DNAT
> --to-destination [Host A's IP]
>
> I'm not sure if connection tracking will do any of this automatically,
> but if it were going to work, A would have to send packets to B over
> the tunnel first before B could send to A.
>
> --
> Ryan Castellucci http://ryanc.org/
>
>> On 5/19/07, shetravel wrote:
> > Hi, Does anyone tried to get ipip or gre tunnel behind NAT environments. ?
> > i'm trying to make both side tunneling with ipip or gre with private address
> > just like belows..
> >
> >
> > A -------------------FIRWWAL -------------------INET ------------------- B
> > PRIVATE PUBLIC
> > PUBLIC
> > (10.100.0.1) (211.xxx.xxx.xxx)
> > (211.xxx.xxx.xxx)
> >
> > is it possible to make both side connections with IPIP or GRE tunnels ?
> > thanks in advance.
>

From beere at vertis.nl Wed May 23 12:48:13 2007
From: beere at vertis.nl (beere@vertis.nl)
Date: Wed May 23 12:48:34 2007
Subject: [LARTC] lc shaping in- and outbound traffic on same box
References: <50BD9F2BCFC52840934CBADBBA4481B20896C908@aspmail01.ApplicationNet.nl> <50BD9F2BCFC52840934CBADBBA4481B20975866F@aspmail01.ApplicationNet.nl> <1955086018.20070522174427@astafjev.com>
Message-ID: <50BD9F2BCFC52840934CBADBBA4481B2097586D0@aspmail01.ApplicationNet.nl>

Thanks!
This is very interesting, but I need to patch the kernel for this to work. Would it be possible to mark all traffic that comes in through the eth1.100 interface using iptables and use that mark to shape that traffic leaving the eth0 interface? How would I go about doing this? Erwin -----Original Message----- From: Konstantin Astafjev [mailto:konstantin@astafjev.com] Sent: dinsdag 22 mei 2007 16:44 To: Beer, Erwin de Cc: lartc@mailman.ds9a.nl Subject: Re: [LARTC] lc shaping in- and outbound traffic on same box Hello beere, Tuesday, May 22, 2007, 5:24:29 PM, you wrote: > I'm looking for a way to shape all traffic on a virtual vlan > interface, say eth1.100 to a max of 100mbit. The box is a quagga > router with eth0 on the inside and vlan interfaces on the eth1 card > to our upstream partners. Each partner has his own vlan on eth1. > > I tried shaping but was only able to shape outbound traffic on the > eth1.100 interface. Inbound shaping was also possible, but only on > the eth0 interface. As I cannot see on the eth0 interface through > which eth1 interface traffic came in, I can't use shaping on that interface this way. > > Does anyone have a solution for me? Could this be done using tc's routing filter and route realms? If I understand you correct http://www.linuximq.net/ is what you need to make shaping on different interfaces. -- Best regards, Konstantin From zhukov at gawab.com Wed May 23 15:06:57 2007 From: zhukov at gawab.com (Georgy Zhukov) Date: Wed May 23 15:07:06 2007 Subject: [LARTC] lc shaping in- and outbound traffic on same box In-Reply-To: <50BD9F2BCFC52840934CBADBBA4481B2097586D0@aspmail01.ApplicationNet.nl> References: <50BD9F2BCFC52840934CBADBBA4481B20896C908@aspmail01.ApplicationNet.nl> <50BD9F2BCFC52840934CBADBBA4481B20975866F@aspmail01.ApplicationNet.nl> <1955086018.20070522174427@astafjev.com> <50BD9F2BCFC52840934CBADBBA4481B2097586D0@aspmail01.ApplicationNet.nl> Message-ID: <30a2c22b0705230606l6218578dk90ce00778e9e4df@mail.gmail.com> Hi there! 
I believe it is possible: you can use the mark option in iptables to assign a mark to each packet and then set filters in tc using these marks. Be aware that this mark only exists within the kernel, so the packet itself is not modified. Although I have not read thoroughly through the IMQ stuff, I'm wondering if you can't solve your problem with DSCP.

On 5/23/07, beere@vertis.nl wrote:
>
> Thanks! This is very interesting, but I need to patch the kernel for
> this to work. Would it be possible to mark all traffic that comes in
> through the eth1.100 interface using iptables and use that mark to shape
> that traffic leaving the eth0 interface?
>
> How would I go about doing this?
>
> Erwin
>
> -----Original Message-----
> From: Konstantin Astafjev [mailto:konstantin@astafjev.com]
> Sent: dinsdag 22 mei 2007 16:44
> To: Beer, Erwin de
> Cc: lartc@mailman.ds9a.nl
> Subject: Re: [LARTC] lc shaping in- and outbound traffic on same box
>
> Hello beere,
>
> Tuesday, May 22, 2007, 5:24:29 PM, you wrote:
> > I'm looking for a way to shape all traffic on a virtual vlan
> > interface, say eth1.100 to a max of 100mbit. The box is a quagga
> > router with eth0 on the inside and vlan interfaces on the eth1 card
> > to our upstream partners. Each partner has his own vlan on eth1.
> >
> > I tried shaping but was only able to shape outbound traffic on the
> > eth1.100 interface. Inbound shaping was also possible, but only on
> > the eth0 interface. As I cannot see on the eth0 interface through
> > which eth1 interface traffic came in, I can't use shaping on that
> interface this way.
> >
> > Does anyone have a solution for me? Could this be done using tc's
> routing filter and route realms?
>
> If I understand you correct http://www.linuximq.net/ is what you need
> to make shaping on different interfaces.
> > -- > Best regards, > Konstantin > > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070523/0fc7cf7c/attachment.html From arman.anwar at gmail.com Thu May 24 12:46:17 2007 From: arman.anwar at gmail.com (Arman) Date: Thu May 24 12:46:27 2007 Subject: [LARTC] tc-htb traffic shaping script Message-ID: <13c1e7670705240346y6f1728cfi37409d8f20153d2d@mail.gmail.com> Hi, Is there any tested good HTB script for traffic shaping available like as that of CBQ available at. http://freshmeat.net/projects/cbq.init I am n new bie and need to work on htb. -- Regards, M Arman -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070524/d87af3ee/attachment.htm From beere at vertis.nl Thu May 24 14:24:18 2007 From: beere at vertis.nl (beere@vertis.nl) Date: Thu May 24 14:24:30 2007 Subject: [LARTC] tc-htb traffic shaping script References: <13c1e7670705240346y6f1728cfi37409d8f20153d2d@mail.gmail.com> Message-ID: <50BD9F2BCFC52840934CBADBBA4481B209758789@aspmail01.ApplicationNet.nl> I can send you mine, it's a modified version of one I found somewhere on the net to be able to limit bandwith on a linux router. I did no cleaning up or anything #!/bin/bash # tc uses the following units when passed as a parameter. # kbps: Kilobytes per second # mbps: Megabytes per second # kbit: Kilobits per second # mbit: Megabits per second # bps: Bytes per second # Amounts of data can be specified in: # kb or k: Kilobytes # mb or m: Megabytes # mbit: Megabits # kbit: Kilobits # To get the byte figure from bits, divide the number by 8 bit # # # Name of the traffic control command. TC=/sbin/tc IPTABLES=/sbin/iptables # The network interface we're planning on limiting bandwidth. 
IF1=eth1.106 # Interface IF2=eth0 # Interface # Download limit (in mega bits) DNLD=100mbit # DOWNLOAD Limit # Upload limit (in mega bits) UPLD=100mbit # UPLOAD Limit # IP address of the machine we are controlling #IP=81.18.0.0/24 #Host IP #IP=0.0.0.0/0 #Host IP # Filter options for limiting the intended interface. IN="$TC filter add dev $IF2 protocol ip parent 1:0 prio 1" OUT="$TC filter add dev $IF1 protocol ip parent 2:0 prio 1" start() { # All traffic originating from IF1 gets marked $IPTABLES -t mangle -D PREROUTING -i $IF1 -j MARK --set-mark 106 >/dev/null 2>&1 $IPTABLES -t mangle -A PREROUTING -i $IF1 -j MARK --set-mark 106 # INBOUND matches on fwmark 106 and gets shaped when it leaves the IF2 interface $TC qdisc add dev $IF2 root handle 1: htb default 30 $TC class add dev $IF2 parent 1: classid 1:1 htb rate $DNLD $IN handle 106 fw flowid 1:1 printf "\n" printf "Shaping traffic incoming on $IF1 ==> $IF2 to max. $DNLD" # OUTBOUND matches all traffic heading out IF1 gets shaped, no filter needed $TC qdisc add dev $IF1 root handle 2: htb default 1 $TC class add dev $IF1 parent 2: classid 2:1 htb rate $UPLD # $OUT u32 match ip src $IP flowid 2:1 printf "\n" printf "Shaping traffic incoming on $IF2 ==> $IF1 to max. $UPLD\n" # The first line creates the root qdisc, and the next line # creates a child qdiscs that respectively are used to shape download # and upload bandwidth. The third line defines a filter if required. } stop() { # Stop the bandwidth shaping. $TC qdisc del dev $IF1 root $TC qdisc del dev $IF2 root $IPTABLES -t mangle -D PREROUTING -i $IF1 -j MARK --set-mark 106 } restart() { # Self-explanatory. stop sleep 1 start } show() { # Display status of traffic control status. 
# $TC -s qdisc ls dev $IF1 $TC -s qdisc ls dev $IF2 } case "$1" in start) echo -n "Starting bandwidth shaping: " start echo "done" ;; stop) echo -n "Stopping bandwidth shaping: " stop echo "done" ;; restart) echo -n "Restarting bandwidth shaping: " restart echo "done" ;; show) echo "Bandwidth shaping status for $IF2:" show echo "" ;; *) pwd=$(pwd) echo "Usage: tc.bash {start|stop|restart|show}" ;; esac exit 0 From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Arman Sent: donderdag 24 mei 2007 12:46 To: lartc@mailman.ds9a.nl Subject: [LARTC] tc-htb traffic shaping script Hi, Is there any tested good HTB script for traffic shaping available like as that of CBQ available at. http://freshmeat.net/projects/cbq.init I am n new bie and need to work on htb. -- Regards, M Arman -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070524/278d4037/attachment-0001.html From marco.casaroli at gmail.com Thu May 24 18:57:14 2007 From: marco.casaroli at gmail.com (Marco Aurelio) Date: Thu May 24 18:57:22 2007 Subject: [LARTC] tc-htb traffic shaping script In-Reply-To: <50BD9F2BCFC52840934CBADBBA4481B209758789@aspmail01.ApplicationNet.nl> References: <13c1e7670705240346y6f1728cfi37409d8f20153d2d@mail.gmail.com> <50BD9F2BCFC52840934CBADBBA4481B209758789@aspmail01.ApplicationNet.nl> Message-ID: <92ed523b0705240957j5e7ba417pf2b4bb0ed6bd8504@mail.gmail.com> http://lartc.org/wondershaper/ On 5/24/07, beere@vertis.nl wrote: > > > > > I can send you mine, it's a modified version of one I found somewhere on the > net to be able to limit bandwith on a linux router. I did no cleaning up or > anything > > > > > > #!/bin/bash > > > > # tc uses the following units when passed as a parameter. 
> [...]
--
Marco

From ysu at inf.ethz.ch Sat May 26 10:29:42 2007
From: ysu at inf.ethz.ch (Yang Su)
Date: Sat May 26 10:29:23 2007
Subject: [LARTC] Problem with the tc statistcis
Message-ID: <4657EFF6.2070702@inf.ethz.ch>

I have a Linux wireless router and would like to monitor the queue length of the wireless interface. By default, the wifi0 interface uses the pfifo_fast qdisc, which does not report backlogged packets. I replaced pfifo_fast with pfifo:

'tc qdisc replace dev wifi0 root pfifo'

Then I used iperf to send UDP packets faster than the interface can handle, but when I read the qdisc, the result looks like this:

qdisc pfifo 8007: limit 10p
 Sent 46249560 bytes 30600 pkts (dropped 0, overlimits 0)

This is really strange: the application layer observes over 80% packet loss, but according to the qdisc no packets were dropped at all and there is no backlog. Any hints for this result?

Thank you,
Yang Su

From arman.anwar at gmail.com Sat May 26 11:54:28 2007
From: arman.anwar at gmail.com (Arman)
Date: Sat May 26 11:54:34 2007
Subject: [LARTC] Need cbq or htb optimal solution
Message-ID: <13c1e7670705260254v42c6ea7aqcac6fa411047bf78@mail.gmail.com>

Hi all,

Can anyone on this mailing list answer a few theoretical questions which are confusing me? Here is the scenario.

I have a total bandwidth of 2Mbps for a private LAN I am managing. I am using the standard cbq script available online for controlling bandwidth, together with squid and iptables. I have different packages for clients:

One: access to the server's own services (unlimited bandwidth, meaning no delays or control) + 50Kbps Internet bandwidth
Two: server's own services (unlimited) + 30Kbps Internet bandwidth
Three: server's own services (unlimited) + 15Kbps Internet bandwidth

Now I have to distribute speed between these clients, of which there are around 120.
If I calculate, I should have over 4Mbps of total bandwidth. Currently I am using cbq for controlling bandwidth and have 120 cbq classes, one class per user. Is this good, or should I have 4 categories of classes with users sharing the speed?

Currently I find that bandwidth utilization differs at different times of day: sometimes users use little bandwidth and most of it is wasted, and sometimes the situation is different.

What would be the case if I switch to htb? I thought I could use the ceil parameter to make use of the bandwidth that is wasted in the current one-class-per-user configuration.

Please suggest the optimal solution.

Thanks in advance

--
Regards,
M Arman

From fernandes_pablo at yahoo.com.br Sat May 26 11:54:03 2007
From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo)
Date: Sat May 26 15:54:43 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
Message-ID: <20070526135435.C96F540DB@outpost.ds9a.nl>

Hello,

I have HTB "rules" in 4 different ISPs and I control each customer this way:

Flush and 1:0 class

tc qdisc del dev eth0 root
tc qdisc add dev eth0 root handle 1:0 htb
tc class add dev eth0 parent 1:0 classid 1:1 htb rate 100mbit
tc qdisc del dev eth1 root
tc qdisc add dev eth1 root handle 1:0 htb
tc class add dev eth1 parent 1:0 classid 1:1 htb rate 100mbit

Upload and Download: user1

tc class add dev eth0 parent 1:1 classid 1:5 htb rate 150kbit ceil 150kbit
tc qdisc add dev eth0 parent 1:5 handle 5: sfq perturb 10
tc class add dev eth1 parent 1:1 classid 1:5 htb rate 50kbit ceil 50kbit
tc qdisc add dev eth1 parent 1:5 handle 5: sfq perturb 10
iptables -t mangle -A POSTROUTING --dest x.x.x.x -o eth0 -j CLASSIFY --set-class 1:5
iptables -t mangle -A FORWARD --src x.x.x.x -o eth1 -j CLASSIFY --set-class 1:5

Upload and Download: user2

tc class add dev eth0 parent 1:1 classid 1:8 htb rate 150kbit ceil 150kbit
tc qdisc add dev eth0 parent 1:8 handle 8: sfq perturb 10
tc class add dev eth1 parent 1:1 classid 1:8 htb rate 50kbit ceil 50kbit
tc qdisc add dev eth1 parent 1:8 handle 8: sfq perturb 10
iptables -t mangle -A POSTROUTING --dest y.y.y.y -o eth0 -j CLASSIFY --set-class 1:8
iptables -t mangle -A FORWARD --src y.y.y.y -o eth1 -j CLASSIFY --set-class 1:8

(...)

These rules work fine, but only for fewer than 1.700 customers. With more than 1.700 customers, my load average goes sky-high and the ksoftirqd process (per top) sits at 100% all the time. I don't know why. I used to use CBQ instead of HTB; I had the same problem there, and Ron (a guy on this list) gave me these rules and told me that he uses them for more than 3.000 customers. I tested this on more than 7 different computers (all with the same hardware specifications) and I had the same problem with either CBQ or HTB rules. The computers I have are all DELL PowerEdge 1850. Here is some hardware information:

top
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    3 root      39  19     0    0    0 R  100  0.0  5316:20 ksoftirqd/0

[root@fw ~]# uptime
 10:38:11 up 161 days, 17:21, 3 users, load average: 1.58, 1.65, 1.51

(Unfortunately, when I took this the load average was "pretty good", but minutes earlier it was more than 11.0.)

[root@fw ~]# lspci
00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev 09)
00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A (rev 09)
00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B (rev 09)
00:05.0 PCI bridge: Intel Corporation E7520 PCI Express Port B1 (rev 09)
00:06.0 PCI bridge: Intel Corporation E7520 PCI Express Port C (rev 09)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
01:00.0 PCI bridge: Intel Corporation 80332 [Dobson] I/O processor (A-Segment Bridge) (rev 06)
01:00.2 PCI bridge: Intel Corporation 80332 [Dobson] I/O processor (B-Segment Bridge) (rev 06)
02:0c.0 Ethernet controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev 04)
02:0e.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 4 (rev 06)
03:0b.0 Ethernet controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev 04)
05:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09)
05:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B (rev 09)
06:07.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
07:08.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
09:0d.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE]

[root@fw ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          2021       1479        542          0        400        654
-/+ buffers/cache:        424       1597
Swap:         1027          0       1027

[root@fw ~]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.00GHz
stepping : 3
cpu MHz : 2992.674
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips : 5990.78

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.00GHz
stepping : 3
cpu MHz : 2992.674
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips : 5985.13

Any help/tip/hint will be very welcome.

Thanks in advance!

Pablo Fernandes

From marek at piasta.pl Sat May 26 16:22:10 2007
From: marek at piasta.pl (Marek Kierdelewicz)
Date: Sat May 26 16:23:01 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
In-Reply-To: <20070526135435.C96F540DB@outpost.ds9a.nl>
References: <20070526135435.C96F540DB@outpost.ds9a.nl>
Message-ID: <20070526162210.753a8d69@catlap>

>Hello,

Hi there!

>iptables -t mangle -A POSTROUTING --dest x.x.x.x -o eth0 -j CLASSIFY
>--set-class 1:5
>iptables -t mangle -A FORWARD --src x.x.x.x -o eth1 -j CLASSIFY
>--set-class 1:5

3k iptables rules strike me as something suicidally slow. Try using tc hashing filters for traffic classification as described here:

http://lartc.org/howto/lartc.adv-filter.hashing.html

If you use private addresses and NAT then you'll need IFB (http://linux-net.osdl.org/index.php/IFB) to shape upload per client with u32 hashing filters.

Hope that helps.
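
As a rough illustration of the hashing-filter approach from the howto linked above, here is a sketch that only prints the tc commands instead of running them. The interface eth1, the 10.0.0.0/24 customer subnet, the table handle 2: and the function names are assumptions made for the example, not details of Pablo's setup; the class IDs 1:5 and 1:8 mirror his two sample users.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) tc commands for a u32 hash table
# keyed on the last octet of the source address.
# DEV, NET and the helper names are illustrative assumptions.

DEV=eth1          # assumed upload-side interface
NET=10.0.0        # assumed /24 customer subnet (first three octets)

# Print the setup commands: u32 filter root, a 256-bucket hash table (2:),
# and the rule that hashes on the last byte of the source address.
emit_hash_setup() {
    echo "tc filter add dev $DEV parent 1:0 prio 5 protocol ip u32"
    echo "tc filter add dev $DEV parent 1:0 prio 5 handle 2: protocol ip u32 divisor 256"
    # The source address starts at byte 12 of the IP header; the mask keeps
    # only its last octet, which selects the bucket in table 2:.
    echo "tc filter add dev $DEV protocol ip parent 1:0 prio 5 u32 ht 800:: match ip src $NET.0/24 hashkey mask 0x000000ff at 12 link 2:"
}

# Print one per-customer rule. The bucket is the last octet written in hex.
# $1 = last octet of the customer address (0-255), $2 = HTB class minor id
emit_customer_rule() {
    bucket=$(printf '%x' "$1")
    echo "tc filter add dev $DEV protocol ip parent 1:0 prio 5 u32 ht 2:$bucket: match ip src $NET.$1 flowid 1:$2"
}

emit_hash_setup
emit_customer_rule 5 5
emit_customer_rule 123 8
```

With a table like this, a packet is looked up in a single bucket rather than being compared against every customer rule in turn, which is the point of the suggestion. The printed commands would still need the HTB classes to exist before being applied.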
Regards,
Marek Kierdelewicz
KoBa ISP

From vladsun at relef.net Sat May 26 17:23:16 2007
From: vladsun at relef.net (VladSun)
Date: Sat May 26 17:23:25 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
In-Reply-To: <20070526135435.C96F540DB@outpost.ds9a.nl>
References: <20070526135435.C96F540DB@outpost.ds9a.nl>
Message-ID: <465850E4.4070609@relef.net>

Pablo Fernandes Yahoo wrote:
> Hello,
>
> have HTB "rules" in 4 different ISPs and i control for each customer
> this way:
> [...]

You may find this useful: http://openfmi.net/frs/download.php/410/IPCLASSIFY.zip

From adm.acacio at digi.com.br Sat May 26 18:22:39 2007
From: adm.acacio at digi.com.br (Acácio Alves dos Santos)
Date: Sat May 26 18:22:49 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
In-Reply-To: <20070526135435.C96F540DB@outpost.ds9a.nl>
References: <20070526135435.C96F540DB@outpost.ds9a.nl>
Message-ID: <44437216-8295-4D1D-9DC0-3B83A1D7853A@digi.com.br>

Pablo,

Here we have HTB in use for more than 10.000 customers. The difference is that we use tc and u32 filters to classify the packets. I use the same Dell PE 1850, but I have two Quad-Core Xeons (1.86GHz) in it :)

# uptime
 13:18:08 up 16 days, 12:32, 1 user, load average: 0.02, 0.02, 0.00

mpstat says:

01:19:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
01:19:13 PM  all    0.00    0.00    0.00    0.00    0.57   13.81    0.00   85.61  10568.88

And as you can see, the CPU usage is not that big.

On May 26, 2007, at 6:54 AM, Pablo Fernandes Yahoo wrote:
> Hello,
>
> have HTB "rules"
> [...]
>
> Pablo Fernandes

--
Acácio Alves dos Santos
Network administration
Diginet Brasil
adm.acacio@digi.com.br
(+55) 84 4008-9000

From christian.benvenuti at libero.it Sat May 26 22:20:39 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Sat May 26 22:15:00 2007
Subject: [LARTC] Re: Problem with the tc statistcis
Message-ID: <1180210840.9410.2.camel@benve-laptop>

Hi Yang,

>I had a linux wireless router. I would like to monitor the queue length
>of the wireless interface. By default, the wifi0 interface is with
>pfifo_fast qdisc which does not report backlog packet.

Actually it does:

# tc -s -d qdisc list dev eth1
qdisc pfifo_fast root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 256404346 bytes 246076 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 27092b 26p requeues 0
                ^^^^^^^^^^^^^^^^^^

If the counters are set to 0 it is probably because there is no backlog
(i.e., you transmit faster than you queue).
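
The backlog counters shown above could be pulled out of the tc output with a small helper like the one below. This is only a sketch: the function name parse_backlog is made up for illustration, and it assumes the "backlog <bytes>b <packets>p" layout printed in the example above (other iproute2 versions may print the byte count with a unit suffix such as Kb, which is passed through unchanged apart from the trailing "b").

```shell
#!/bin/sh
# parse_backlog (hypothetical helper): read `tc -s qdisc` output on stdin
# and print "<bytes> <packets>" for every line carrying a backlog counter.
parse_backlog() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "backlog") {
                bytes = $(i + 1)
                pkts  = $(i + 2)
                sub(/b$/, "", bytes)   # strip the trailing "b" (bytes)
                sub(/p$/, "", pkts)    # strip the trailing "p" (packets)
                print bytes, pkts
            }
        }
    }'
}
```

On a live system it would be fed from tc, for example `tc -s qdisc show dev wifi0 | parse_backlog` (wifi0 being the interface named in the question), and could be run in a loop while iperf is sending to see whether the queue ever fills.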
>I replaced
>pfifo_fast with pfifo:
>'tc qdisc replace dev wifi0 root pfifo'
>Then I use iperf to send UDP pkts faster than the interface can handle
>but when I read the qdisc, the result is like:
>
>qdisc pfifo 8007: limit 10p
>Sent 46249560 bytes 30600 pkts (dropped 0, overlimits 0)
>
>It is really strange, because over 80% of packets lost is observed by
>application layer but by qdisc, there is no packet dropped at all
>and no backlog. Any hints for this result?

Are you sure it is not the application that drops the packets?
Did you check the "bytes/pkt" counters on the tx and rx hosts?

Regards
/Christian
(http://benve.info)

From rotoole at gmail.com Sun May 27 01:06:02 2007
From: rotoole at gmail.com (Ryan O'Toole)
Date: Sun May 27 01:06:07 2007
Subject: [LARTC] rate limiting netmask w/ dd-wrt
Message-ID: <81ad6ba10705261606n339e4644id7686881c95dcf6d@mail.gmail.com>

I'm trying to set up a DD-WRT router (www.dd-wrt.com; embedded micro-device
Linux for the uninitiated) to rate limit all the traffic it receives from
its wi-fi interface. I followed the instructions from the cookbook section
on rate limiting (http://lartc.org/howto/lartc.ratelimit.single.html), but
it doesn't appear to be working. I tested by checking my laptop's bandwidth
on http://www.speakeasy.net/speedtest/.

Here are the commands I issued to tc:

# tc qdisc add dev $DEV root handle 1: cbq avpkt 1000 bandwidth 10mbit
# tc class add dev $DEV parent 1: classid 1:1 cbq rate 512kbit allot 1500 prio 5 bounded isolated
# tc filter add dev $DEV parent 1: protocol ip prio 16 u32 match ip src 192.168.1.0/24 flowid 1:1

I replaced $DEV with each interface name in succession, as I wasn't sure
which to use.
Here's the output of ifconfig from the router:

# ifconfig
br0       Link encap:Ethernet  HWaddr 00:16:01:59:EF:00
          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:114200 errors:0 dropped:0 overruns:0 frame:0
          TX packets:125307 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:14568079 (13.8 MiB)  TX bytes:71752923 (68.4 MiB)

eth0      Link encap:Ethernet  HWaddr 00:16:01:59:EF:00
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:64266 errors:0 dropped:0 overruns:0 frame:0
          TX packets:54720 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:65677538 (62.6 MiB)  TX bytes:12736990 (12.1 MiB)
          Interrupt:4

eth1      Link encap:Ethernet  HWaddr 00:16:01:59:EF:02
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:117891 errors:0 dropped:0 overruns:0 frame:253671
          TX packets:133961 errors:3 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:16589572 (15.8 MiB)  TX bytes:75539860 (72.0 MiB)
          Interrupt:2 Base address:0x5000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING MULTICAST  MTU:16436  Metric:1
          RX packets:46 errors:0 dropped:0 overruns:0 frame:0
          TX packets:46 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:5328 (5.2 KiB)  TX bytes:5328 (5.2 KiB)

vlan0     Link encap:Ethernet  HWaddr 00:16:01:59:EF:00
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5570 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:1086810 (1.0 MiB)

vlan1     Link encap:Ethernet  HWaddr 00:16:01:59:EF:01
          inet addr:192.168.10.138  Bcast:192.168.10.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:64281 errors:0 dropped:0 overruns:0 frame:0
          TX packets:49178 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:64522436 (61.5 MiB)  TX bytes:11432967 (10.9 MiB)

Can
anyone tell me what I'm doing wrong here?

Best,
Ryan

From fernandes_pablo at yahoo.com.br Sun May 27 06:21:30 2007
From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo)
Date: Sun May 27 10:22:13 2007
Subject: [LARTC] 2 gateways - routing based in source address
Message-ID: <20070527082158.ED7E14053@outpost.ds9a.nl>

Hello,

I have a Linux router with 2 internet connections, 2 /24 IP ranges (the
source computers) and 3 interface cards.

(internet 2) 192.168.0.1 on eth0  [LINUX COMPUTER]  192.168.1.1 on eth1 (internet 1)
                                  10.1.0.1 on eth2 (customers)

The sources (customers) are:

10.20.0.0/24
10.30.0.0/24

I don't have an IP in these ranges on my Linux box. There is another router
below my Linux box in my topology, but these customers arrive on my eth2
interface. I do SNAT for these networks on my Linux box.

The gateways for internet access are:

192.168.0.254 on eth0
192.168.1.254 on eth1

I would like the customers in 10.20.0.0/24 to go out through internet 2
(eth0) and the customers in 10.30.0.0/24 to go out through internet 1
(eth1).

Thanks for any help in advance.

Pablo Fernandes

From mihai at systelecom.ro Sun May 27 14:15:54 2007
From: mihai at systelecom.ro (Mihai Predescu)
Date: Sun May 27 14:16:10 2007
Subject: [LARTC] Possible problem - class listing
Message-ID: <1189771028.20070527151554@systelecom.ro>

Hello lartc,

I've made a script for shaping a network with about 3000+ users using
hashing tables, and everything seems to go well. I've been testing it for
about 10 minutes and it's doing the job, matching filters and so on. It
counts about 5000+ filters on eth1 (limit download) and about the same on
eth0 (upload).
The actual deployment is tomorrow, so to be really sure about it I have one
question regarding how leaf classes are displayed. 1:20 is a parent with
many child classes, each one having its own sfq qdisc. When I do:

# tc -s class show dev eth1

.. the output is:

class htb 1:4061 parent 1:20 leaf df27: prio 0 rate 128000bit ceil 5000Kbit burst 1759b cburst 7849b
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
 lended: 0 borrowed: 0 giants: 0
 tokens: 90112 ctokens: 10289

class htb 1:4052 parent 1:20 leaf df1e: prio 0 rate 128000bit ceil 5000Kbit burst 1759b cburst 7849b
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
 lended: 0 borrowed: 0 giants: 0
 tokens: 90112 ctokens: 10289

With the previous setup it was listed like this:

class htb 1:1248 parent 1:10 leaf 1248: prio 0 rate 128000bit ceil 128000bit burst 1663b cburst 1663b
 Sent 173255 bytes 645 pkts (dropped 0, overlimits 0)
 lended: 645 borrowed: 0 giants: 0
 tokens: 100864 ctokens: 100864

As you can see, the parts "class htb 1:4061 parent 1:20 leaf df27:" and
"class htb 1:4052 parent 1:20 leaf df1e:" differ from the last one. There it
is "class htb 1:1248 parent 1:10 leaf 1248:", so the classid has the same
number as the handle of the leaf qdisc. In my setup it doesn't.

I wonder if this is a real problem, and whether I should take it into
consideration and search the net for a solution.

Any help will be appreciated. Thanks.

--
Best regards,
Mihai
mailto:mihai@systelecom.ro

From tim at bulsattv.com Mon May 28 00:46:41 2007
From: tim at bulsattv.com (Stoimen Gerenski)
Date: Mon May 28 00:46:59 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
Message-ID: <1180306001.465a0a518d9ce@mail.bulsattv.com>

Hello everybody,

I have a similar problem: 450 customers with 450 HTB classes and their
corresponding 450 filters in a subclass (1:3) of the root qdisc (PRIO) of
the interface.
The problem shows up at times: when I ping a host behind the router from a
host in front of the router, the latency rises to 1-1.5 msec. The machine
also runs an iptables firewall with a bunch of rules for
dropping/accepting/NATting specific traffic, and routes about 30 Mbit/s.
When I remove the HTB qdisc, the latency is back to normal, 0.3-0.4 msec.

Does anyone have an idea what could cause this? Any input much appreciated!

Regards,
Stoimen

--------------

Pablo,

Here we have HTB being used for more than 10.000 customers. The difference
is that we use tc and u32 filters to classify the packets..

I use the same Dell PE 1850, but I have two Quad-Core Xeon (1.86GHz) on it :)

# uptime
 13:18:08 up 16 days, 12:32,  1 user,  load average: 0.02, 0.02, 0.00

mpstat says:

01:19:11 PM  CPU  %user  %nice  %sys  %iowait  %irq  %soft  %steal  %idle    intr/s
01:19:13 PM  all   0.00   0.00  0.00     0.00  0.57  13.81    0.00  85.61  10568.88

And as you can see.. the use of cpu is not that big..

On May 26, 2007, at 6:54 AM, Pablo Fernandes Yahoo wrote:

> Hello,
>
> have HTB "rules"
in 4 different ISPs and i control for each
> customer this way:
>
> Flush and 1:0 class
> tc qdisc del dev eth0 root
> tc qdisc add dev eth0 root handle 1:0 htb
> tc class add dev eth0 parent 1:0 classid 1:1 htb rate 100mbit
> tc qdisc del dev eth1 root
> tc qdisc add dev eth1 root handle 1:0 htb
> tc class add dev eth1 parent 1:0 classid 1:1 htb rate 100mbit
>
> Upload and Download: user1
> tc class add dev eth0 parent 1:1 classid 1:5 htb rate 150kbit ceil 150kbit
> tc qdisc add dev eth0 parent 1:5 handle 5: sfq perturb 10
> tc class add dev eth1 parent 1:1 classid 1:5 htb rate 50kbit ceil 50kbit
> tc qdisc add dev eth1 parent 1:5 handle 5: sfq perturb 10
> iptables -t mangle -A POSTROUTING --dest x.x.x.x -o eth0 -j CLASSIFY --set-class 1:5
> iptables -t mangle -A FORWARD --src x.x.x.x -o eth1 -j CLASSIFY --set-class 1:5
>
> Upload and Download: user2
> tc class add dev eth0 parent 1:1 classid 1:8 htb rate 150kbit ceil 150kbit
> tc qdisc add dev eth0 parent 1:8 handle 8: sfq perturb 10
> tc class add dev eth1 parent 1:1 classid 1:8 htb rate 50kbit ceil 50kbit
> tc qdisc add dev eth1 parent 1:8 handle 8: sfq perturb 10
> iptables -t mangle -A POSTROUTING --dest y.y.y.y -o eth0 -j CLASSIFY --set-class 1:8
> iptables -t mangle -A FORWARD --src y.y.y.y -o eth1 -j CLASSIFY --set-class 1:8
>
> (...)
>
> These rules work fine, but only for fewer than 1.700 customers. With more
> than 1.700 customers, my load average goes sky-high and the ksoftirqd
> process (top information) sits at 100% full time. I don't know
> why. I used to use CBQ instead of HTB because i had the same problem,
> and Ron (a guy on this list) gave me these rules and told me that he
> uses them for more than 3.000 customers. I tested it on more than 7
> different computers (all with the same hardware specifications) and i had
> the same problem with either CBQ or HTB rules. All the computers that i
> have are DELL PowerEdge 1850.
>
> Any help/Tipp/hint will be very welcome.
>
> Thanks in Advance!
>
> Pablo Fernandes

From fernandes_pablo at yahoo.com.br Mon May 28 02:27:29 2007
From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo)
Date: Mon May 28 06:28:10 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
Message-ID: <20070528042801.50FCB40F8@outpost.ds9a.nl>

As I said before, I tried shaping my traffic with CBQ, and I did use u32
filters. The results were exactly the same as with HTB, with or without u32
filters (I had tried that before too).

Does your HTB setup have one class per customer? I've seen different
setups, but all of them shape the traffic by protocol and/or IP range. That
isn't our situation, since we have at least one class for every single IP
in the network.

After some reading, I'm starting to suspect my NIC driver (e1000). Do the
interfaces in your Dell PE 1850 use the e1000 driver? Does all the traffic
pass through this server? I suppose the problem has something to do with
hardware or software interrupts. Are you using the default parameters for
the e1000 kernel module?

Regards

>Pablo,
>
>Here we have HTB being used for more than 10.000 customers. The difference,
>is that we use tc and u32 filters to classify the packets..
>
>I use the same Dell PE 1850, but I have two Quad-Core Xeon (1.86GHz) on it :)
>
># uptime
> 13:18:08 up 16 days, 12:32,  1 user,  load average: 0.02, 0.02, 0.00
>
>mpstat says:
>
>01:19:11 PM  CPU  %user  %nice  %sys  %iowait  %irq  %soft  %steal  %idle    intr/s
>01:19:13 PM  all   0.00   0.00  0.00     0.00  0.57  13.81    0.00  85.61  10568.88
>
>And as you can see.. the use of cpu is not that big..

From adm.acacio at digi.com.br Mon May 28 15:15:23 2007
From: adm.acacio at digi.com.br (Acácio Alves dos Santos)
Date: Mon May 28 15:12:55 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
In-Reply-To: <20070528042801.50FCB40F8@outpost.ds9a.nl>
References: <20070528042801.50FCB40F8@outpost.ds9a.nl>
Message-ID: <718C034E-5101-4747-A296-92453A27078D@digi.com.br>

I have classes configured by protocol..

My NIC is the e1000 too (default parameters), but I'm using off-board cards
(with scalable I/O support, good for multi-core setups).

The main problem I had was the number of interrupts (~ 10.000/s)... With
IRQ balance activated, each NIC was tied to one specific processor core,
and the CPU usage on those cores was always 100%. I solved this problem in
my setup (2 Quad-Cores) by deactivating IRQ balance, which caused the
interrupts to be processed by all 8 cores.

Are you doing p2p control on this server? This is usually what takes the
most CPU..

On May 27, 2007, at 9:27 PM, Pablo Fernandes Yahoo wrote:

> As i told before, i tryed to shape my traffic CBQ and i did use u32
> filters. the results were exactly the same as HTB using u32 filters
> (i tryed before too) or not.
>
> Do you have in your HTB setup a class per customer? I've seen
> different setups, but all of them shaping the traffic either based
> in protocols or/and IP ranges (it isn't our reality whilst we have
> at least a class per each single IP within the network).
>
> After some readings, i'm starting to suspect about my NIC driver
> (e1000). Do you have in your Dell PE 1850 interfaces using the
> e1000 driver? Is the entire traffic passing by this server? I
> suppose that problem is something about the hardware or software
> interruptions. Are you using the default parameters for the e1000
> kernel module?
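
To see how a NIC's interrupts are actually spread over the cores, as discussed in the reply above, the per-CPU counters in /proc/interrupts can be read with a helper along these lines. It is a sketch under assumptions: the function name irq_per_cpu is invented for illustration, and it only matches rows whose last field is exactly the interface name, so shared IRQ lines (several devices listed on one row) or per-queue names are not handled.

```shell
#!/bin/sh
# irq_per_cpu IFACE (hypothetical helper): read /proc/interrupts-style text
# on stdin and print one "CPU<n> <count>" line per core for the given NIC.
irq_per_cpu() {
    awk -v nic="$1" '
        NR == 1 { ncpu = NF; next }      # header row: CPU0 CPU1 ...
        $NF == nic {                     # row whose last field is the NIC name
            for (c = 1; c <= ncpu; c++)
                printf "CPU%d %s\n", c - 1, $(c + 1)
        }'
}
```

Typical use would be `irq_per_cpu eth0 < /proc/interrupts`, run twice a few seconds apart; if only one CPU's counter grows, all of that interface's interrupt load is landing on a single core.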

--
Acácio Alves dos Santos
Network administration
Diginet Brasil
adm.acacio@digi.com.br
(+55) 84 4008-9000

From WBohannan at spidersat.com.gh Mon May 28 15:12:21 2007
From: WBohannan at spidersat.com.gh (William Bohannan)
Date: Mon May 28 15:15:14 2007
Subject: [LARTC] 2 NICs Bridge + Router
Message-ID: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local>

Hi, wondering if anyone can help. I have two NICs on a Debian sarge based
system, currently running as a bridge (br0) which consists of eth0 and
eth1. Is it possible to add a virtual interface to eth1 so I can also do
NAT on the box? I have tried many times and keep coming up with errors.

Kind Regards
William Bohannan

From alex at zoomnet.ro Mon May 28 15:29:45 2007
From: alex at zoomnet.ro (Alexandru Dragoi)
Date: Mon May 28 15:29:48 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
In-Reply-To: <20070526135435.C96F540DB@outpost.ds9a.nl>
References: <20070526135435.C96F540DB@outpost.ds9a.nl>
Message-ID: <465AD949.6020904@zoomnet.ro>

u32 hash filters are the key, as somebody pointed out.
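
As a rough illustration of what such hashed u32 filters look like, the script below prints (it does not execute) tc commands that hash a /24 on the last octet of the address, following the hashing-filter syntax from the LARTC HOWTO. Everything specific here is an assumption made for the example: the function name gen_u32_hash, the table handle 2:, the prio value, and the device, address and class taken from the messages in this thread. A second /24 would need its own table handle (for example 3:), and on a box that does NAT the addresses seen by tc on egress may already be translated.

```shell
#!/bin/sh
# gen_u32_hash DEV DIR NET24 OCTET:CLASSID...   (hypothetical helper)
#   DEV    egress device carrying the HTB tree (root handle 1:)
#   DIR    "src" or "dst": which IP header address to hash on
#   NET24  first three octets of the /24, e.g. 10.30.0
#   then one "last-octet:classid" pair per customer, e.g. 54:1:5
gen_u32_hash() {
    dev=$1; dir=$2; net=$3; shift 3
    case $dir in
        src) off=12 ;;   # source address starts at byte 12 of the IP header
        dst) off=16 ;;   # destination address starts at byte 16
        *) echo "direction must be src or dst" >&2; return 1 ;;
    esac
    # Root u32 filter, then a hash table with 256 buckets and handle 2:
    echo "tc filter add dev $dev parent 1:0 prio 5 protocol ip u32"
    echo "tc filter add dev $dev parent 1:0 prio 5 handle 2: protocol ip u32 divisor 256"
    # One rule per customer, stored in the bucket named by the last octet (hex)
    for pair in "$@"; do
        octet=${pair%%:*}
        class=${pair#*:}
        bucket=$(printf '%x' "$octet")
        echo "tc filter add dev $dev parent 1:0 prio 5 protocol ip u32 ht 2:$bucket: match ip $dir $net.$octet flowid $class"
    done
    # Send the whole /24 into table 2:, keyed on the last byte of the address
    echo "tc filter add dev $dev parent 1:0 prio 5 protocol ip u32 ht 800:: match ip $dir $net.0/24 hashkey mask 0x000000ff at $off link 2:"
}
```

For instance, `gen_u32_hash eth0 dst 10.30.0 54:1:5` prints four commands, the third of which places 10.30.0.54 in bucket 36 (hex for 54) with flowid 1:5. The point of the layout is that each packet needs one hash lookup plus one comparison, instead of walking a list of thousands of filters.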
You can also tune your iptables setup, like this:

#192.168.1.0/24
iptables -t mangle -N 192-168-1-0-24
iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j 192-168-1-0-24
iptables -t mangle -N 192-168-1-0-25
iptables -t mangle -N 192-168-1-128-25
iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.0/25 -j 192-168-1-0-25
iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.128/25 -j 192-168-1-128-25
.
.
and so on, down to the single addresses (for example 192.168.1.11, which is
matched in the chain created for 192.168.1.10/31):

iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.10 -j CLASSIFY --set-class 1:10
iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.11 -j CLASSIFY --set-class 1:11

.. I guess you got the idea. It requires some RAM, which I believe is not
such a big problem. Similar rules should be made for download.

Pablo Fernandes Yahoo wrote:
> Hello,
>
> have HTB "rules" in 4 different ISPs and i control for each customer
> this way:
>
> Flush and 1:0 class
> tc qdisc del dev eth0 root
> tc qdisc add dev eth0 root handle 1:0 htb
> tc class add dev eth0 parent 1:0 classid 1:1 htb rate 100mbit
> tc qdisc del dev eth1 root
> tc qdisc add dev eth1 root handle 1:0 htb
> tc class add dev eth1 parent 1:0 classid 1:1 htb rate 100mbit
>
> Upload and Download: user1
> tc class add dev eth0 parent 1:1 classid 1:5 htb rate 150kbit ceil 150kbit
> tc qdisc add dev eth0 parent 1:5 handle 5: sfq perturb 10
> tc class add dev eth1 parent 1:1 classid 1:5 htb rate 50kbit ceil 50kbit
> tc qdisc add dev eth1 parent 1:5 handle 5: sfq perturb 10
> iptables -t mangle -A POSTROUTING --dest x.x.x.x -o eth0 -j CLASSIFY --set-class 1:5
> iptables -t mangle -A FORWARD --src x.x.x.x -o eth1 -j CLASSIFY --set-class 1:5
>
> Upload and Download: user2
> tc class add dev eth0 parent 1:1 classid 1:8 htb rate 150kbit ceil 150kbit
> tc qdisc add dev eth0 parent 1:8 handle 8: sfq perturb 10
> tc class add dev eth1 parent 1:1 classid 1:8 htb rate 50kbit ceil 50kbit
> tc qdisc add dev eth1 parent 1:8 handle 8: sfq perturb 10
> iptables -t mangle -A POSTROUTING --dest y.y.y.y -o eth0 -j CLASSIFY --set-class 1:8
> iptables -t mangle -A FORWARD --src y.y.y.y -o eth1 -j CLASSIFY --set-class 1:8
>
> (...)
>
> This rules works fine, but just for less than 1.700 customers. More
> than 1.700 customers, i have my load avarage in the sky and Ksoftirqd
> process (top information) in 100% fulltime. I don't know why. I used
> to use CBQ instead HTB because i had the same problem and Ron (a guy
> in this list) gave this rules and told me that he uses this for more
> than 3.000 customers. I tested it in more than 7 different computers
> (but the same hadware specifications) and i had the same problem with
> either CBQ or HTB rules. The computers that i have are all of them
> DELL PowerEdge 1850.
>
> Any help/Tipp/hint will be very welcome.
>
> Thanks in Advance!
>
> Pablo Fernandes

From vladsun at relef.net Mon May 28 15:39:11 2007
From: vladsun at relef.net (VladSun)
Date: Mon May 28 15:39:23 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
In-Reply-To: <465AD949.6020904@zoomnet.ro>
References: <20070526135435.C96F540DB@outpost.ds9a.nl> <465AD949.6020904@zoomnet.ro>
Message-ID: <465ADB7F.1020202@relef.net>

Alexandru Dragoi wrote:
> u32 hash filters is the key, as somebody pointed. You can also tune your
> iptables setup, like this
>
> #192.168.1.0/24
> iptables -t mangle -N 192-168-1-0-24
> iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j 192-168-1-0-24
> iptables -t mangle -N 192-168-1-0-25
> iptables -t mangle -N 192-168-1-128-25
> iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.0/25 -j 192-168-1-0-25
> iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.128/25 -j 192-168-1-128-25
> .
> .
> and so on, until (ip 192.168.1.11, which is called in chain created for
> 192.168.1.10/31)
>
> iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.10 -j CLASSIFY --set-class 1:10
> iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.11 -j CLASSIFY --set-class 1:11
>
> .. I guess you got the ideea, it requires some RAM, which i belive is
> not such a big problem. Similar rules should be made for download.
>

Or you can use my patch, IPCLASSIFY. Then the rules above would be replaced
by a single rule per direction:

iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j IPCLASSIFY --addr=src --and-mask=0xff --or-mask=0x11000
iptables -t mangle -A FORWARD -d 192.168.1.0/24 -j IPCLASSIFY --addr=dst --and-mask=0xff --or-mask=0x12000

This is equivalent to applying the CLASSIFY target to each packet with
--set-class (srcIP & 0xFF | 0x11000) and --set-class (dstIP & 0xFF | 0x12000).
It is very similar to IPMARK, but it uses the skb->priority field instead of
the mark, so no tc filters are needed.

From alex at zoomnet.ro Mon May 28 15:53:55 2007
From: alex at zoomnet.ro (Alexandru Dragoi)
Date: Mon May 28 15:53:55 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
In-Reply-To: <465ADB7F.1020202@relef.net>
References: <20070526135435.C96F540DB@outpost.ds9a.nl> <465AD949.6020904@zoomnet.ro> <465ADB7F.1020202@relef.net>
Message-ID: <465ADEF3.8040507@zoomnet.ro>

VladSun wrote:
> Alexandru Dragoi wrote:
>> u32 hash filters is the key, as somebody pointed. You can also tune your
>> iptables setup, like this
>>
>> #192.168.1.0/24
>> iptables -t mangle -N 192-168-1-0-24
>> iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j 192-168-1-0-24
>> iptables -t mangle -N 192-168-1-0-25
>> iptables -t mangle -N 192-168-1-128-25
>> iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.0/25 -j 192-168-1-0-25
>> iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.128/25 -j 192-168-1-128-25
>> .
>> .
>> and so on, until (ip 192.168.1.11, which is called in chain created for >> 192.168.1.10/31) >> >> iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.10 -j CLASSIFY >> --set-class 1:10 >> iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.11 -j CLASSIFY >> --set-class 1:11 >> >> .. I guess you got the ideea, it requires some RAM, which i belive is >> not such a big problem. Similar rules should be made for download. >> >> > Or you can use my patch - IPCLASSIFY. Then the rules above would be > substituted by a signle rule per direction: > > > iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j IPCLASSIFY > --addr=src --and-mask=0xff --or-mask=0x11000 > iptables -t mangle -A FORWARD -d 192.168.1.0/24 -j IPCLASSIFY > --addr=dst --and-mask=0xff --or-mask=0x12000 > > This is equal to applying CLASSIFY target to each packet with > --set-class (srcIP & 0xFF | 0x1100 ) and --set-class (dstIP & 0xFF | > 0x1200 ). > It is very similar to IPMARK, but it uses skb->priority field instead > mark. So no tc filters are needed. > Cool, I remember I red about this a little while ago. Now, another thing to tune would be some htb paches for massive hashing on classid lookup. I must say I haven't use it so far, I hope I will do it soon. http://www.mail-archive.com/lartc@mailman.ds9a.nl/msg16279.html From fernandes_pablo at yahoo.com.br Mon May 28 12:01:51 2007 From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo) Date: Mon May 28 16:02:27 2007 Subject: AW: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers In-Reply-To: <465ADB7F.1020202@relef.net> Message-ID: <20070528140222.13DEA402D@outpost.ds9a.nl> Hey, I'm definately glad because i can see that someone else knows what is = happening here. Thank for all the help and also i'm here to help anyone = as much as i can. 
So, to recap my current setup, I have these rules for each customer:

tc qdisc del dev eth0 root
tc qdisc add dev eth0 root handle 1:0 htb
tc class add dev eth0 parent 1:0 classid 1:1 htb rate 100mbit

tc qdisc del dev eth1 root
tc qdisc add dev eth1 root handle 1:0 htb
tc class add dev eth1 parent 1:0 classid 1:1 htb rate 100mbit

user 1

tc class add dev eth0 parent 1:1 classid 1:5 htb rate 150kbit ceil 150kbit
tc qdisc add dev eth0 parent 1:5 handle 5: sfq perturb 10
tc class add dev eth1 parent 1:1 classid 1:5 htb rate 50kbit ceil 50kbit
tc qdisc add dev eth1 parent 1:5 handle 5: sfq perturb 10
iptables -t mangle -A POSTROUTING --dest 10.30.0.54 -o eth0 -j CLASSIFY --set-class 1:5
iptables -t mangle -A FORWARD --src 10.30.0.54 -o eth1 -j CLASSIFY --set-class 1:5

user n

tc class add dev eth0 parent 1:1 classid 1:8 htb rate 150kbit ceil 150kbit
tc qdisc add dev eth0 parent 1:8 handle 8: sfq perturb 10
tc class add dev eth1 parent 1:1 classid 1:8 htb rate 50kbit ceil 50kbit
tc qdisc add dev eth1 parent 1:8 handle 8: sfq perturb 10
iptables -t mangle -A POSTROUTING --dest 10.20.0.43 -o eth0 -j CLASSIFY --set-class 1:8
iptables -t mangle -A FORWARD --src 10.20.0.43 -o eth1 -j CLASSIFY --set-class 1:8

What u32 rules could replace these iptables rules? I would like to try u32 filters and see if they solve the problem; if I have no success, I will try the IPCLASSIFY patch.

Thanks again in advance.

Regards
Pablo Fernandes

-----Original Message-----
From: VladSun [mailto:vladsun@relef.net]
Sent: Monday, May 28, 2007 14:39
To: Alexandru Dragoi
Cc: Pablo Fernandes Yahoo; lartc@mailman.ds9a.nl
Subject: Re: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers

Alexandru Dragoi wrote:
> u32 hash filters is the key, as somebody pointed. You can also tune your
> iptables setup, like this
>
> #192.168.1.0/24
> iptables -t mangle -N 192-168-1-0-24
> iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j 192-168-1-0-24
> iptables -t mangle -N 192-168-1-0-25
> iptables -t mangle -N 192-168-1-128-25
> iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.0/25 -j 192-168-1-0-25
> iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.128/25 -j 192-168-1-128-25
> .
> .
> and so on, until (ip 192.168.1.11, which is called in chain created for
> 192.168.1.10/31)
>
> iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.10 -j CLASSIFY --set-class 1:10
> iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.11 -j CLASSIFY --set-class 1:11
>
> .. I guess you got the idea, it requires some RAM, which I believe is
> not such a big problem. Similar rules should be made for download.
>

Or you can use my patch - IPCLASSIFY. Then the rules above would be substituted by a single rule per direction:

iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j IPCLASSIFY --addr=src --and-mask=0xff --or-mask=0x11000
iptables -t mangle -A FORWARD -d 192.168.1.0/24 -j IPCLASSIFY --addr=dst --and-mask=0xff --or-mask=0x12000

This is equal to applying CLASSIFY target to each packet with --set-class (srcIP & 0xFF | 0x1100 ) and --set-class (dstIP & 0xFF | 0x1200 ).
It is very similar to IPMARK, but it uses the skb->priority field instead of the mark. So no tc filters are needed.

From marek at piasta.pl Mon May 28 17:11:32 2007
From: marek at piasta.pl (Marek Kierdelewicz)
Date: Mon May 28 17:12:21 2007
Subject: [LARTC] 2 gateways - routing based in source address
In-Reply-To: <20070527082158.ED7E14053@outpost.ds9a.nl>
References: <20070527082158.ED7E14053@outpost.ds9a.nl>
Message-ID: <20070528171132.3f2cf013@catlap>

>Hello,
>I don't have IP on this ranges in my linux box. There is another router
>under my linux box in my topology. But this customers arrives in my
>eth2 interface. I make SNAT for this networks in my linux box.
>...

What you need is simple source-address-based policy routing, as described here:
http://lartc.org/howto/lartc.rpdb.html

The example in the LARTC howto is pretty straightforward. Many impolite people would even say "RTFM".

Regards,
Marek Kierdelewicz
KoBa ISP

From gtaylor at riverviewtech.net Mon May 28 20:39:13 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Mon May 28 20:39:22 2007
Subject: [LARTC] 2 NICs Bridge + Router
In-Reply-To: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local>
References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local>
Message-ID: <465B21D1.5080300@riverviewtech.net>

On 5/28/2007 8:12 AM, William Bohannan wrote:
> Hi wondering if anyone can help. I have two NICs on a debian sarge based
> system and current running as a bridge (br0) which consists of eth0 and
> eth1. Is it possible to add a virtual interface to the eth1 so I can
> also do NAT on the box as well? I have tried many times and keep coming
> up with errors.

Why not add virtual aliased interfaces to the br0 interface? Do your NATing there.

Grant. . . .
From fernandes_pablo at yahoo.com.br Tue May 29 01:25:43 2007 From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo) Date: Tue May 29 05:26:27 2007 Subject: [LARTC] Re: big problem with HTB/CBQ and CPU for more than 1.700 customers Message-ID: <20070529032617.5CB6A634B3@outpost.ds9a.nl> Hey, I have some links that you could be interested on them: http://www.geocities.com/asimshankar/notes/linux-networking-code.txt http://oss.sgi.com/archives/netdev/2004-06/msg00162.html http://www.spec.org/osg/web99/results/res2003q3/web99-20030818-00245.html http://www.intel.com/support/network/sb/cs-009209.htm All of them talks about the interrupts. So, what do you think should i do with my e1000? What do you think could be the best board for sites as 8.000 customers? My problem is exact these lots of interruptions. Thank you in advance! Regards >I have classes configured by protocol.. > >My NIC is the e1000 too (default parameters), but I'm using off-board cards (with scalable i/o support - good for multi->core setups). The main problem I had was the number of interruptions (~ 10.000/s)... >With IRQ balance activated, each NIC got used with a specific processor core, and the use of CPU on these cores was always >100%. > >I've solved this problem in my setup (2 Quad-Cores), deactivating the IRQ balance, that caused the interruptions to be >processed by all the >8 cores. > >Are you doing p2p control on this server? This is usually what takes more CPU usage.. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070529/3ea52851/attachment.html From netsecuredata at gmail.com Tue May 29 06:32:25 2007 From: netsecuredata at gmail.com (Jorge Evangelista) Date: Tue May 29 06:32:35 2007 Subject: [LARTC] 2 gateways - routing based in source address In-Reply-To: <20070528171132.3f2cf013@catlap> References: <20070527082158.ED7E14053@outpost.ds9a.nl> <20070528171132.3f2cf013@catlap> Message-ID: Hi Pablo, You have to configure your box linux similar to: ip rule add from 10.20.0.0/24 to 0.0.0.0/0 table 100 ip route add default via 192.168.0.254 table 100 ip rule add from 10.30.0.0/24 to 0.0.0.0/0 table 200 ip route add default via 192.168.1.254 table 200 On 5/28/07, Marek Kierdelewicz wrote: > >Hello, > > > >I don't have IP on this ranges in my linux box. There is another router > >under my linux box in my topology. But this customers arrives in my > >eth2 interface. I make SNAT fort his networks in my linux box. > >... > > What you need is simple source address based policy routing as > described here: > http://lartc.org/howto/lartc.rpdb.html > > Example on lartc howto is pretty straightforward. Many inpolite > people would even say "RTFM". 
> > pozdrawiam, > Marek Kierdelewicz > KoBa ISP > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > -- "The network is the computer" From arman.anwar at gmail.com Tue May 29 08:05:59 2007 From: arman.anwar at gmail.com (Arman) Date: Tue May 29 08:06:22 2007 Subject: [LARTC] Re: Need cbq or htb optimal solution In-Reply-To: <13c1e7670705260254v42c6ea7aqcac6fa411047bf78@mail.gmail.com> References: <13c1e7670705260254v42c6ea7aqcac6fa411047bf78@mail.gmail.com> Message-ID: <13c1e7670705282305w532af239q24614b92646e646@mail.gmail.com> Hi, Please someone tell me what is the best in terms of performance when I use one class per client or one class per package I am new to it and dont know how the details. I am getting trouble sometimes when I connect from LAN to server through SSH(putty), the connection starts to hang and speed goes down. I dont know what configuration is wrong in cbq. Previously i was using squid delay pools. Regards, Arman On 5/26/07, Arman wrote: > > Hi all, > > > Can anyone in this mailing list answer a few theoretical question which r > confusing me. > > Here is the scenario > > I have a total Bandwidth of 2Mbps for a private LAN I am managing. I am > using cbq standard script available online and for controlling bandwidth, > squid and iptables. I have diff. packages for client. > > One: server own services access (unlimited bandwidth means no delays or > control) + A Internet bandwidth > > Two: server own services (unlimited) + B Internet bandwidth > > Three: server own services (unlimited) + C Internet bandwidth > > Now I have to distribute speed between these clients, which are around > 120. If I calculate I should have over 4Mbps total bandwidth. Currently I am > using cbq for controlling bandwidth and have 120 cbq classes, one class per > user. Is this good or I should have 4 categories of classes and users > sharing the speed. 
Currently I figure out that in diff times of day > bandwidth utilization is diff. sometimes users utilizing low bandwidth and > most of bandwidth is wasted and sometimes scenario is diff. > > What will be the case if I switch to htb. I thought I could utilize ceil > parameter to utilize wasted bandwidth in current configuration like one > class per user. > Please suggest the optimal solution. > > Thnax is advance > -- > Regards, > M Arman -- Regards, Arman -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070529/e457b0eb/attachment.htm From salim.si at cipherium.com.tw Tue May 29 08:16:47 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Tue May 29 08:17:11 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter In-Reply-To: <200705220028.08859.luciano@lugmen.org.ar> Message-ID: <001201c7a1b8$f3b24730$5964a8c0@SalimSi> None of the load balancing techniques I have come across seems to cover 'IP-Persistence'. For example, a session with several connections (for which no conntrack-helper modules exist), will have problems, as its connections will be routed through different WAN interfaces. Some servers are very particular about the source IP of the packets they receive. I suspect online gaming and instant messengers will have problems with load balancing. How is the experience of other people in here? A rewrite of 'recent' match to include both source and destination may turn out to be a solution, albeit with low performance. Any other ideas? 
-----Original Message-----
From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Luciano Ruete
Sent: Tuesday, May 22, 2007 11:28 AM
To: lartc@mailman.ds9a.nl
Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter

On Monday 14 May 2007 02:57, Peter Rabbitson wrote:
> Hi,
> I have searched the archives on the topic, and it seems that the list
> gurus favor load balancing to be done in the kernel as opposed to other
> means.

AFAIK there aren't conflicting opinions, there are just two different approaches, and I believe the routing solution is used because it was the first and because it sounds logical to implement multipath with your routing tool. But iptables has become a routing tool by now (and much more). Personally I'm using multipath, but I do not dislike the iptables approach.

> I have been using a home-grown approach, which splits traffic
> based on `-m statistic --mode random --probability X`, then CONNMARKs
> the individual connections and the kernel happily routes them. I
> understand that for > 2 links it will become impractical to calculate a
> correct X.

Well, it is not impractical with a little scripting in your firewall...

#!/bin/bash
# your uplinks' weights, as in kernel multipath
# i.e.:   link1 link2 link3 link4 link5
weight="  1     2     1     3     5 "
weight_total=0
for n in $weight ; do
    weight_total=$((weight_total + n))
done
# Prints each link's share of the total as a whole percentage.
# Note: the statistic match takes --probability as a fraction between 0 and 1,
# and when the rules are tried in sequence each rule only sees the packets the
# earlier ones did not match, so the real X must be relative to the weight left.
for n in $weight ; do
    probability=$((n * 100 / weight_total))
    echo iptables.. -m statistic --mode random --probability $probability
done

But the problem arises when you have, let's say, 101 links, because mode random takes a 2-digit number, right? But this can be changed in the code (use the source...)

> But if we only have 2 gateways to the internet - are there
> any advantages in letting the kernel multipath scheduler do the
> balancing (with all the downsides of route caching), as opposed to the
> pure random approach described above?
Well, the disvantage i see is that you have to move all your routing rules to iptables space, but in the end you always need the routing table, but it is a mather of change old habits... -- Luciano _______________________________________________ LARTC mailing list LARTC@mailman.ds9a.nl http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From marek at piasta.pl Tue May 29 08:33:50 2007 From: marek at piasta.pl (Marek Kierdelewicz) Date: Tue May 29 08:34:44 2007 Subject: [LARTC] Re: big problem with HTB/CBQ and CPU for more than 1.700 customers In-Reply-To: <20070529032617.5CB6A634B3@outpost.ds9a.nl> References: <20070529032617.5CB6A634B3@outpost.ds9a.nl> Message-ID: <20070529083350.2a7dcd78@catlap> >So, what do you think should i do with my e1000? What do you think >could be the best board for sites as 8.000 customers? My problem is >exact these lots of interruptions. Plug as many network interfaces (e1000) as cpu cores you have. E1000 multiport nics have separate irq assigned to each "port", so having 2 x Quad-Core Xeon and 2 x 4-port e1000 would allow you to configure static affinity of each port to one core: http://bcr2.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt Sometimes symmetric usage of network interfaces (for symmetric core usage) is the problem. I think you can achieve it by plugging all 8 ports to managed switch and configuring some form of aggregation. The best would be src/dst IP EtherChannel or something similar. For some deployments (where router sees all the clients on OSI layer 2) src/dst MAC EtherChannel would suffice. 
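As a rough illustration of the static affinity mentioned above: each IRQ has a /proc/irq/N/smp_affinity file that holds a hexadecimal CPU bitmask, so pinning one port per core comes down to writing a one-bit mask per IRQ. The sketch below only prints the commands (a dry run). The IRQ numbers are hypothetical and the helper names are made up for this example; the real IRQ numbers would come from /proc/interrupts, and an IRQ balancing daemon, if one is running, may overwrite the values.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would pin one IRQ to each CPU core.
# IRQ numbers passed in are placeholders; look up the real ones in /proc/interrupts.

# affinity_mask CORE -> hex bitmask with only bit CORE set
affinity_mask() {
    printf '%x\n' $((1 << $1))
}

# pin_irqs IRQ... -> assign the listed IRQs to cores 0, 1, 2, ... in order
pin_irqs() {
    core=0
    for irq in "$@"; do
        echo "echo $(affinity_mask "$core") > /proc/irq/$irq/smp_affinity"
        core=$((core + 1))
    done
}

# Example with four hypothetical IRQs, one per e1000 port
pin_irqs 48 49 50 51
```

Running the printed lines as root would apply the masks; keeping it as a dry run makes it easy to check the mapping first.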
On linux side you would have to configure bonding: http://linux-net.osdl.org/index.php/Bonding pozdrawiam, Marek Kierdelewicz KoBa ISP From fernandes_pablo at yahoo.com.br Tue May 29 18:47:01 2007 From: fernandes_pablo at yahoo.com.br (Pablo Fernandes Yahoo) Date: Tue May 29 22:47:38 2007 Subject: [LARTC] 2 gateways - routing based in source address Message-ID: <20070529204734.86B5D4B87A@outpost.ds9a.nl> Hi, First of all, thank you for your help. And how can i put all the traffic comming from anywhere with destination port 80 or 443 to go out by the gateway 192.168.1.254, while all the rest going out by 192.168.0.254 ? Im trying here different ways with ToS but it isn't working. Thank you for any Tip. Regards >Hi Pablo, > >You have to configure your box linux similar to: > >ip rule add from 10.20.0.0/24 to 0.0.0.0/0 table 100 >ip route add default via 192.168.0.254 table 100 > >ip rule add from 10.30.0.0/24 to 0.0.0.0/0 table 200 >ip route add default via 192.168.1.254 table 200 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070529/5fb58144/attachment.html From netsecuredata at gmail.com Wed May 30 03:28:07 2007 From: netsecuredata at gmail.com (Jorge Evangelista) Date: Wed May 30 03:28:15 2007 Subject: [LARTC] 2 gateways - routing based in source address In-Reply-To: <20070529204734.86B5D4B87A@outpost.ds9a.nl> References: <20070529204734.86B5D4B87A@outpost.ds9a.nl> Message-ID: Hi, I have not tried it yet. 
I think you have to make rules similar to these:

ip rule add from 0.0.0.0/0 table 100
ip route add default via 192.168.1.254 table 100 proto static

ip rule add from 0.0.0.0/0 table 200
ip route add default via 192.168.0.254 table 200 proto static

# Mark incoming packets for later routing.
# MARK does not stop rule traversal, so the catch-all mark comes first and the
# port 80/443 rules then overwrite it.
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
iptables -A PREROUTING -t mangle -i eth2 -s 0.0.0.0/0 -j MARK --set-mark 2
iptables -A PREROUTING -t mangle -i eth2 -s 0.0.0.0/0 -p tcp --dport 80 -j MARK --set-mark 1
iptables -A PREROUTING -t mangle -i eth2 -s 0.0.0.0/0 -p tcp --dport 443 -j MARK --set-mark 1

ip rule add from all fwmark 1 table 100
ip rule add from all fwmark 2 table 200

For NAT:

iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to $IP_ETH0
iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to $IP_ETH1

On 5/29/07, Pablo Fernandes Yahoo wrote:
> Hi,
>
> First of all, thank you for your help.
>
> And how can i put all the traffic comming from anywhere with destination
> port 80 or 443 to go out by the gateway 192.168.1.254, while all the rest
> going out by 192.168.0.254 ?
>
> Im trying here different ways with ToS but it isn't working. Thank you for
> any Tip.
> > > > Regards > > > > >Hi Pablo, > > > > > >You have to configure your box linux similar to: > > > > > >ip rule add from 10.20.0.0/24 to 0.0.0.0/0 table 100 > > >ip route add default via 192.168.0.254 table 100 > > > > > >ip rule add from 10.30.0.0/24 to 0.0.0.0/0 table 200 > > >ip route add default via 192.168.1.254 table 200 > > > > > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > -- "The network is the computer" From salim.si at cipherium.com.tw Wed May 30 05:58:18 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Wed May 30 05:58:41 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter In-Reply-To: <200705300045.37139.luciano@lugmen.org.ar> Message-ID: <000501c7a26e$c605a730$5964a8c0@SalimSi> Sorry, but it doesn't work that way. CONNMARK needs helper modules like the ones for FTP or H.323 to really know if connections belong to the same session. To cover all gaming and IM apps with own helper modules is practically impossible. I remember even MSN have had problems (timeout every 5 mins), but it seems to have been fixed at the server level. Could you please point out if I had missed any open discussion in the list which covers these things? -----Original Message----- From: Luciano Ruete [mailto:luciano@lugmen.org.ar] Sent: Wednesday, May 30, 2007 11:46 AM To: Salim S I Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter On Tuesday 29 May 2007 03:16:47 you wrote: > None of the load balancing techniques I have come across seems to cover > 'IP-Persistence'. For example, a session with several connections (for > which no conntrack-helper modules exist), will have problems, as its > connections will be routed through different WAN interfaces. Some > servers are very particular about the source IP of the packets they > receive. I suspect online gaming and instant messengers will have > problems with load balancing. 
How is the experience of other people in > here? > > A rewrite of 'recent' match to include both source and destination may > turn out to be a solution, albeit with low performance. Any other ideas? In this same thread a CONNMARK solution was exposed, and this same CONNMARK solution was openly discused several times in this list. All the cases that you mention (online gamming, instant messenger) and all other that you do not mention are solved having a connection-aware firewall, which is capable to route over the same link packets that belongs to the same logical connection, this is achived perfectly using netfilter CONNMARK. Regards! -- Luciano From rabbit at rabbit.us Wed May 30 06:55:13 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Wed May 30 06:55:20 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter In-Reply-To: <000501c7a26e$c605a730$5964a8c0@SalimSi> References: <000501c7a26e$c605a730$5964a8c0@SalimSi> Message-ID: <465D03B1.3050204@rabbit.us> Salim S I wrote: >> -----Original Message----- >> From: Luciano Ruete [mailto:luciano@lugmen.org.ar] >> Sent: Wednesday, May 30, 2007 11:46 AM >> To: Salim S I >> Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter >> >> On Tuesday 29 May 2007 03:16:47 you wrote: >>> None of the load balancing techniques I have come across seems to >> cover >>> 'IP-Persistence'. For example, a session with several connections (for >>> which no conntrack-helper modules exist), will have problems, as its >>> connections will be routed through different WAN interfaces. Some >>> servers are very particular about the source IP of the packets they >>> receive. I suspect online gaming and instant messengers will have >>> problems with load balancing. How is the experience of other people in >>> here? >>> >>> A rewrite of 'recent' match to include both source and destination may >>> turn out to be a solution, albeit with low performance. Any other >> ideas? 
>> >> In this same thread a CONNMARK solution was exposed, and this same >> CONNMARK >> solution was openly discused several times in this list. >> >> All the cases that you mention (online gamming, instant messenger) and >> all >> other that you do not mention are solved having a connection-aware >> firewall, >> which is capable to route over the same link packets that belongs to the >> same >> logical connection, this is achived perfectly using netfilter CONNMARK. >> >> Regards! > Sorry, but it doesn't work that way. > CONNMARK needs helper modules like the ones for FTP or H.323 to really > know if connections belong to the same session. To cover all gaming and > IM apps with own helper modules is practically impossible. I remember > even MSN have had problems (timeout every 5 mins), but it seems to have > been fixed at the server level. > Could you please point out if I had missed any open discussion in the > list which covers these things? Salim is correct, non-trackable protocols can be a major PITA. Actually I discussed this earlier in the thread. Yes, kernel balancing due to caching will alleviate this to a certain extent, but there will still be surprises down the road, when a cache entry finaly expires. Besides caching blows the entire balancing idea to bits if most users access primarily the same resource over and over again (think of a popular internet radio station). Furthermore neither route balancing nor the netfilter approach will be effective for resources hosted over _multiple_ distinct IPs (AIM is a very good example with separate authentication and data servers). This is where the exception lists come into play, which I also discussed. If one still wants to achieve pseudo balancing on the exempted destinations, it is still possible with the excellent SAME patch which makes a NAT decision based solely on an index derived fom the size of the source pool to be NATted divided by the number of NAT targets provided. 
Also note that as long as a service uses a static range of ports, you do not even have to know all the destination IP ranges in order to exempt it - simple port matching will do. HTH Peter From lists at andyfurniss.entadsl.com Wed May 30 22:34:03 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Wed May 30 22:34:00 2007 Subject: [LARTC] High Latency With Tiered Queues In-Reply-To: <1179464892.13716.12.camel@eliza> References: <1179464892.13716.12.camel@eliza> Message-ID: <465DDFBB.6000201@andyfurniss.entadsl.com> ISN Support Staff wrote: > Hello, > > I'm trying to setup what I thought would be a fairly basic tiered > shaping system. I have a 6mbit (768kbps) link coming into my eth1 > device, with my LAN IPs on the eth0 device. I want to limit outgoing > traffic so that certain IPs are limited to 400kbps, with 3 classes under > that 400k so certain machines get prioritized (main servers in 1:21, > other servers in 1:22, workstations in 1:23) > > The problem is that when I turn this on, my packet latency jumps up by > 50 to 100 times the normal rate. I go from 10-20 ms ping times to > 500-1600ms! I've tried putting SFQ qdiscs under the classes, but that > makes no difference. > > I'm sure there is just some tuning parameter I'm not setting > correctly, but can somebody clue me in to what I'm doing wrong? Or is > HTB just the wrong scheduler to be using here? I tried CBQ, but I can't > get the tiers to work ( I keep getting RTNETLINK answers: Invalid > argument) I'm currently using a single tiered CBQ solution, but it > really doesn't fit my needs. > > Here's the full script: > ----------------------- > qdisc add dev eth1 root handle 1: htb default 10 Using htb default will send arp to a really long (in this case) backlogged queue, which could cause problems. Generally you need to classify low latency traffic to different class than bulk if you want it to stay low latency. 
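To make the two points above concrete, here is a small dry-run sketch, not the original poster's script. It assumes that an HTB root qdisc created without a "default" class lets unclassified packets (such as ARP) bypass the classes, and it sends ICMP to its own class so that it does not sit behind bulk traffic. The device name, rates, and class ids are illustrative assumptions.

```shell
#!/bin/sh
# Dry-run sketch: print tc commands for an HTB tree with no default class.
# DEV, the rates, and the class ids are made-up values for illustration.
DEV=eth1
LINK_KBIT=768   # quoted uplink speed

htb_sketch() {
    # No "default N" here: non-IP traffic such as ARP stays unclassified
    echo "tc qdisc add dev $DEV root handle 1: htb"
    echo "tc class add dev $DEV parent 1: classid 1:1 htb rate ${LINK_KBIT}kbit"
    # Small high-priority class for latency-sensitive traffic
    echo "tc class add dev $DEV parent 1:1 classid 1:5 htb rate 100kbit ceil ${LINK_KBIT}kbit prio 0"
    # Bulk class for everything else
    echo "tc class add dev $DEV parent 1:1 classid 1:10 htb rate 300kbit ceil ${LINK_KBIT}kbit prio 1"
    # ICMP (IP protocol 1) goes to the low-latency class
    echo "tc filter add dev $DEV parent 1: protocol ip prio 1 u32 match ip protocol 1 0xff flowid 1:5"
    # Any other IP packet goes to the bulk class
    echo "tc filter add dev $DEV parent 1: protocol ip prio 2 u32 match u32 0 0 flowid 1:10"
}

htb_sketch
```

The catch-all u32 filter takes over the job of the default class for IP traffic only, which is the reason for leaving "default" off the root qdisc.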
I am suprised sfq didn't help - maybe you also need to back off the ceils to allow for overheads on the quoted link speed. Andy. From lists at andyfurniss.entadsl.com Wed May 30 22:42:08 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Wed May 30 22:41:56 2007 Subject: [LARTC] Re: dropped bytes in "tc -s class" output In-Reply-To: <20070520182824.2ff1ae02@babalu.inexo.com.br> References: <1179694583.7354.9.camel@benve-laptop> <20070520182824.2ff1ae02@babalu.inexo.com.br> Message-ID: <465DE1A0.2000603@andyfurniss.entadsl.com> Ethy H. Brito wrote: > On Sun, 20 May 2007 22:56:23 +0200 > Christian Benvenuti wrote: > >> Hi, >> >>> Hi All >>> >>> Is there any output that counts the number of dropped bytes >>> (not packets) just as in "Sent" in "tc -s class" output? >> No. >> A simple workaround (for simple configurations) consists of redirecting >> all the traffic you want to drop to a dedicated class and attach a >> blackhole qdisc (i.e., drop everything) to it. > > Hmmm. I am pretty sure I did not tell what I meant. > > I assume that at some point an HTB class will drop some packets that are > beyond its speed regulation, right? Not always if it's got a really long queue tcp windows can be absorbed. You can choose queue length. > I just need to measure this amount of dropped bytes (not packets). I don't think you can. With > this measure I can MRTG it and give the clients some felling that they > really need more bandwidth. That would be a bit deceitful - TCP relies to some extent on dropped packets, how many you get depends on buffer lengths and number of connections. You could double the rate with a smaller buffer and see more dropped packets than a slower link with a bigger buffer. Andy. 
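For anyone who still wants a number to graph: the statistics printed by "tc -s class show" include a dropped packet counter per class, but no dropped byte counter. The sketch below reads that counter and multiplies it by the average size of the packets the class did send. That byte figure is only a guess under the assumption that dropped packets resemble sent ones, and the caveats above about what drops actually mean still apply. The function name is made up for this example.

```shell
#!/bin/sh
# Sketch: per-class dropped packets from "tc -s class show" output, plus a
# rough dropped-bytes guess (dropped packets x average size of sent packets).
# tc reports dropped packets only; the byte figure is an estimate.
estimate_dropped() {
    awk '
        $1 == "class" { id = $3 }        # e.g. "class htb 1:5 parent 1:1 ..."
        $1 == "Sent" {                   # "Sent B bytes P pkt (dropped D, ..."
            bytes = $2; pkts = $4
            dropped = $7; sub(/,/, "", dropped)
            avg = (pkts > 0) ? int(bytes / pkts) : 0
            print id, dropped, dropped * avg
        }
    '
}

# Usage (needs a configured qdisc): tc -s class show dev eth0 | estimate_dropped
# Each output line is: classid dropped_packets estimated_dropped_bytes
```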
From lists at andyfurniss.entadsl.com Wed May 30 22:48:06 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Wed May 30 22:47:51 2007 Subject: [LARTC] rate limiting netmask w/ dd-wrt In-Reply-To: <81ad6ba10705261606n339e4644id7686881c95dcf6d@mail.gmail.com> References: <81ad6ba10705261606n339e4644id7686881c95dcf6d@mail.gmail.com> Message-ID: <465DE306.2040000@andyfurniss.entadsl.com> Ryan O'Toole wrote: > I'm trying to setup a DD-WRT router (www.dd-wrt.com; embedded micro-device > linux for the uninitiated) to rate limit all the traffic it receives from > its wi-fi interface. Qdiscs work on traffic leaving the interfaces. If you want to limit incoming traffic have a look at the policer example in LARTC, there are other ways using ifb/imq, but I don't know if dd-wrt will have those. Wireless is a special case - half duplex, so it's going to be even harder to do. Andy. From GregScott at InfraSupportEtc.com Thu May 31 01:46:44 2007 From: GregScott at InfraSupportEtc.com (Greg Scott) Date: Thu May 31 01:46:57 2007 Subject: [LARTC] Proxy ARP with a Coyote Point equalizer Message-ID: <925A849792280C4E80C5461017A4B8A210B7F9@mail733.InfraSupportEtc.com> Here is a puzzle. I have a network with several servers. It's a mess. It's a /24 and pieces and servers are all over the place inside this /24 block, on both sides of the firewall. For example, the router at 1.2.3.1 is outside the firewall and many of the servers at 1.2.3.nnn/24 are behind the firewall. (Obviously, 1.2.3.nnn is a fudged network.) eth0 points outward to the Internet. eth1 points inward to the serers. Both eth0 and eth1 have IP Address 1.2.3.2. I setup proxy ARP like this: echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp echo 1 > /proc/sys/net/ipv4/conf/eth1/proxy_arp And I set up appropriate routes to the systems on both sides of the firewall. This all works - all the systems route the way they are supposed to route. Here is the problem. 
Behind the firewall is a Coyote Point Equalizer at 1.2.3.10, with a high-volume website behind it spread across several servers. Every time I put this proxy ARP firewall in place, that nasty Coyote Point box dies and this breaks the high volume website behind it and makes lots of people mad. I've never seen a Coyote Point Equalizer but I have a hunch it might not get along well with a proxy ARP device in its same network. Here are my questions: Proxy ARP really means proxy ARP - that firewall answers ARP requests for anything and everything it sees, for any network. This also has consequences for new devices that try to be polite when they set IP Addresses for themselves by ARPing to see if anyone else answers at that address. Is there a way to limit proxy ARP to a list of IP Addresses? Or - should I forget proxy ARP and look at bridging instead? Can I do bridging and still access the bridged interfaces remotely? Thanks - Greg Scott GregScott@InfraSupportEtc.com From gtaylor at riverviewtech.net Thu May 31 02:19:27 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu May 31 02:19:34 2007 Subject: [LARTC] Proxy ARP with a Coyote Point equalizer In-Reply-To: <925A849792280C4E80C5461017A4B8A210B7F9@mail733.InfraSupportEtc.com> References: <925A849792280C4E80C5461017A4B8A210B7F9@mail733.InfraSupportEtc.com> Message-ID: <465E148F.3000500@riverviewtech.net> On 5/30/2007 6:46 PM, Greg Scott wrote: > Here is the problem. Behind the firewall is a Coyote Point Equalizer > at 1.2.3.10, with a high-volume website behind it spread across > several servers. Every time I put this proxy ARP firewall in place, > that nasty Coyote Point box dies and this breaks the high volume > website behind it and makes lots of people mad. I've never seen a > Coyote Point Equalizer but I have a hunch it might not get along well > with a proxy ARP device in its same network. Hrm... 
> Proxy ARP really means proxy ARP - that firewall answers ARP requests > for anything and everything it sees, for any network. This also has > consequences for new devices that try to be polite when they set IP > Addresses for themselves by ARPing to see if anyone else answers at > that address. Is there a way to limit proxy ARP to a list of IP > Addresses? This will be Proxy ARP implementation specific. I have no idea whether or not Linux can be configured to behave as you are asking or not. > Or - should I forget proxy ARP and look at bridging instead? Having just (briefly) brushed up on Proxy ARPing, I can see how it would be a problem for a load balancer. Most load balancers work on a couple of different levels, either IP <-> MAC spoofing, or NATing. The former method is probably what is happening and thus having a problem with your Proxy ARP router / firewall. Consider if you will a host out side of the Proxy ARP router / firewall trying to connect to an IP address that is both behind the Proxy ARP router / firewall AND the load balancer. If the load balancer changes the MAC address that the IP address belongs to, the Proxy ARP router / firewall will inevitably end up pointing to the wrong internal MAC. How will the load balancer handle the traffic when it does not start flowing to the alternative MAC like it wants? I can not say. But, I do see a very big potential for a conflict. In said conflict, I can not say any thing to how any of the equipment will fail. Thus, you could end up in the scenario you are in now. I can't say for sure as to whether or not you should forget about proxy ARP or not, but I can say for sure that bridging will do what you are wanting to do very well. Bridging will pass the ARP requests in directly to the load balancer like it is expecting so that it can control things the way that it wants to. 
This means that when the load balancer alters the IP <-> MAC mapping, the upstream device on the other side of the bridge will see the changed MAC address. I think I would go the bridging route. > Can I do bridging and still access the bridged interfaces remotely? Most definitely! Put your IP address on the bridge interface. I.e. eth0 and eth1 are bridged together by br0. ifconfig br0 1.2.3.2 netmask 255.255.255.0 You will be able to access 1.2.3.2 from either side of the bridge. That is presuming that you do not use EBTables / IPTables to filter the bridged traffic. In other words, so long as you are not doing any layer 2 filtering yes. From salim.si at cipherium.com.tw Thu May 31 07:02:16 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Thu May 31 07:02:35 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter In-Reply-To: <200705310125.37531.luciano@lugmen.org.ar> Message-ID: <000a01c7a340$df560720$5964a8c0@SalimSi> Before we get into the "Top-posting" stuff, it would be nice if you follow the normal way of replying (or atleast marking a copy) to the list. I think that is the basic idea behind mailing list. If you had done that, I wouldn't have had to do the "Top-Posting". Take a look at the archives please. -----Original Message----- From: Luciano Ruete [mailto:luciano@lugmen.org.ar] Sent: Thursday, May 31, 2007 12:26 PM To: Salim S I Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter On Wednesday 30 May 2007 00:58:18 you wrote: First of all, learn about basic[1] mailing list rules, mainly your top-posting[2] is breaking all the sense of the thread >> Sorry, but it doesn't work that way >yes it does. Up to you if you refuse to accept, doesn't matter for me if you choose to live in your little world. >> CONNMARK needs helper modules like the ones for FTP or H.323 to really >> know if connections belong to the same session. To cover all gaming and >> IM apps with own helper modules is practically impossible. 
>this helpers are needed because some special protocols have special needs, >all >other protocols are covered in a simpler maner bye flowing the tcp flow >between two ports, you need al least a litle low level knowldge about layer >3-4 protocols to undestand this. Yessir. 3 bags full. If you had read my post c l e a r l y, before you felt obliged to show off your knowledge, you might have understood that I was talking about the so-called 'special-protocols'. Btw, thanks for that bit about "TCP flow between two ports", was quite new to me. >> I remember >> even MSN have had problems (timeout every 5 mins), but it seems to have >> been fixed at the server level. >With CONNMARK solution 99,99% of the things works, i am the sys/net-admin >from >an ISP that proves it, whit load balancing over multiple links. Sorry again! That figure of '99.99' is in YOUR case, but are you aware there are others in this world too, with different scenarios/setups? You did not think Peter and I were dreaming up a scenario,did you? Btw, your being a netadmin doesn't automatically make your statements correct. >For each protocol that are not covered by simple tcp flow a helper module >was written. It must be a well kept secret then! I am sorry to say this, if your knowledge was half the size of your ego, it would have been good for us all. >> Could you please point out if I had missed any open discussion in the >> list which covers these things? >just google(ie): "connmark site:lartc...archive" Thanks for introducing google. But my question still stands. From gypsy at iswest.com Thu May 31 08:49:59 2007 From: gypsy at iswest.com (gypsy) Date: Thu May 31 08:50:41 2007 Subject: [LARTC] Proxy ARP with a Coyote Point equalizer References: <925A849792280C4E80C5461017A4B8A210B7F9@mail733.InfraSupportEtc.com> Message-ID: <465E7017.52BF668A@iswest.com> Greg Scott wrote: > > Here is a puzzle. > > I have a network with several servers. It's a mess. 
It's a /24 and > pieces and servers are all over the place inside this /24 block, on both > sides of the firewall. For example, the router at 1.2.3.1 is outside > the firewall and many of the servers at 1.2.3.nnn/24 are behind the > firewall. (Obviously, 1.2.3.nnn is a fudged network.) > > eth0 points outward to the Internet. > eth1 points inward to the serers. > > Both eth0 and eth1 have IP Address 1.2.3.2. I setup proxy ARP like > this: > > echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp > echo 1 > /proc/sys/net/ipv4/conf/eth1/proxy_arp > > And I set up appropriate routes to the systems on both sides of the > firewall. > > This all works - all the systems route the way they are supposed to > route. > > Here is the problem. Behind the firewall is a Coyote Point Equalizer at > 1.2.3.10, with a high-volume website behind it spread across several > servers. Every time I put this proxy ARP firewall in place, that nasty > Coyote Point box dies and this breaks the high volume website behind it > and makes lots of people mad. I've never seen a Coyote Point Equalizer > but I have a hunch it might not get along well with a proxy ARP device > in its same network. > > Here are my questions: > > Proxy ARP really means proxy ARP - that firewall answers ARP requests > for anything and everything it sees, for any network. This also has > consequences for new devices that try to be polite when they set IP > Addresses for themselves by ARPing to see if anyone else answers at that > address. Is there a way to limit proxy ARP to a list of IP Addresses? > > Or - should I forget proxy ARP and look at bridging instead? Can I do > bridging and still access the bridged interfaces remotely? > > Thanks > > - Greg Scott > GregScott@InfraSupportEtc.com See http://yesican.chsoft.biz/lartc/proxy-arp.conf and http://yesican.chsoft.biz/lartc/proxy-arp.sh to see if that helps. The LAN interface (eth0) uses the /proc-/proxy_arp setting while the WAN (eth1) interface uses the script. 
FWIW, those are my working setups. One computer has a WAN connection (eth1) and all other servers inside connect to its eth0. The above script and config file are on that computer. Note that both eth1 and eth0 have the same IP (66.209.101.198) and netmask. This machine has a third interface (eth2) to the LAN, but that is not material here. If the ISP changes things, which they have done a couple of times, I have to ask them to flush their ARP cache manually because their retention is HUGE (~70 minutes), but except for that, I've never had any problems with this setup. I had no success at all trying to use /proc on eth1. -- gypsy From cla.greco at fastwebnet.it Thu May 31 12:29:15 2007 From: cla.greco at fastwebnet.it (Claudio Greco) Date: Thu May 31 12:30:30 2007 Subject: [LARTC] Watchdog timer and packet enqueuing Message-ID: <465EA37B.4030709@fastwebnet.it> Hello everybody, I'm implementing a qdisc for my MSc thesis and I've experienced some issue maybe you can help me with. The discipline is simply a slotted max-weight in which any user has a Ceil Rate they mustn't trespass on. Being such a discipline non-work-conserving, I've to set a watchdog timer in order to allow further call of the dequeue function. However, I've noticed that such a solution prevents the local generated packets to be enqueued, which is very concerning to me being the priority based on the backlog... Does anyone know any trade-off solution between these needs (non-conserving-work and continuous enqueue flow)? Thanks in advance, Claudio. 
From WBohannan at spidersat.com.gh Thu May 31 14:22:06 2007 From: WBohannan at spidersat.com.gh (William Bohannan) Date: Thu May 31 14:22:13 2007 Subject: [LARTC] 2 NICs Bridge + Router In-Reply-To: <465B21D1.5080300@riverviewtech.net> References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local> <465B21D1.5080300@riverviewtech.net> Message-ID: <4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> Thanks Grant, I am very new to combining NATing and Brigdge. Please can you possibly give an example on how to add the virtual interface. Current /etc/networking/interfaces looks like this: --------------------------------------- auto lo iface lo inet loopback auto br0 iface br0 inet static address xxx.xxx.xxx.xxx netmask 255.255.255.128 network xxx.xxx.xxx.xxx broadcast xxx.xxx.xxx.xxx gateway xxx.xxx.xxx.xxx pre-up /sbin/ip link set eth0 up pre-up /sbin/ip link set eth1 up pre-up /usr/sbin/brctl addbr br0 pre-up /usr/sbin/brctl addif br0 eth0 pre-up /usr/sbin/brctl addif br0 eth1 ----------------------------------------- Kind Regards William Bohannan -----Original Message----- From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Grant Taylor Sent: Monday, May 28, 2007 6:39 PM To: Mail List - Linux Advanced Routing and Traffic Control Subject: Re: [LARTC] 2 NICs Bridge + Router On 5/28/2007 8:12 AM, William Bohannan wrote: > Hi wondering if anyone can help. I have two NICs on a debian sarge based > system and current running as a bridge (br0) which consists of eth0 and > eth1. Is it possible to add a virtual interface to the eth1 so I can > also do NAT on the box as well? I have tried many times and keep coming > up with errors. Why not add virtual aliased interfaces to the br0 interface? Do your NATing there. Grant. . . . 
_______________________________________________ LARTC mailing list LARTC@mailman.ds9a.nl http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
From afshin.tajvidi at free.fr Thu May 31 16:22:55 2007 From: afshin.tajvidi at free.fr (Afshin Tajvidi) Date: Thu May 31 16:23:02 2007 Subject: [LARTC] IFB & 802.1q Message-ID: <1180621375.2571.4.camel@mahler.onetec>
Hello

What I'm looking for is how to configure the Linux QoS module to do global rate limitation for two (or more) 802.1q pseudo network devices. I naturally suppose there is a possibility with IFB. I don't want to use IMQ because it's not integrated into my kernel v2.6.21.1 and I didn't find IMQ patches for it nor for the iptables package I use (v1.3.7). I've found some samples for ingress shaping with IFB. But my goal is to do global "egress" shaping on an IFB device grouping my two 802.1q devices (let's say eth0.10 and eth0.20 redirected to ifb0). I'm using the following commands to create a simple QoS tree:

ip link set up dev ifb0
tc qdisc add dev ifb0 root handle 1: htb default 3
tc class add dev ifb0 parent 1: classid 1:1 htb rate 2000kbit quantum 1514
tc class add dev ifb0 parent 1:1 classid 1:2 htb rate 1000kbit ceil 2000kbit quantum 1514
tc class add dev ifb0 parent 1:1 classid 1:3 htb rate 1000kbit ceil 2000kbit quantum 1514
tc filter add dev ifb0 parent 1: protocol ip priority 10 u32 match ip sport 80 0xffff flowid 1:2

So more precisely my question is: which commands are to be used to redirect flows outgoing from eth0.10 and eth0.20 to ifb0? (I don't want to create separate QoS trees for eth0.10 and eth0.20 because the borrowing feature of HTB interests me.) I've used:

tc filter add dev eth0.10 parent root protocol ip priority 10 u32 match u32 0 0 flowid 1: action mirred egress redirect dev ifb0
tc filter add dev eth0.20 parent root protocol ip priority 10 u32 match u32 0 0 flowid 1: action mirred egress redirect dev ifb0

But this does not work!
(the ifb0 is always empty) Maybe I miss something or simply IFB does not allow to do global limitation as IMQ does. Somebody has already set such a configuration ? Any advice ? Thanks in advance -- Afshin Tajvidi From gtaylor at riverviewtech.net Thu May 31 16:36:18 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu May 31 16:34:30 2007 Subject: [LARTC] 2 NICs Bridge + Router In-Reply-To: <4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local> <465B21D1.5080300@riverviewtech.net> <4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> Message-ID: <465EDD62.3000602@riverviewtech.net> On 05/31/07 07:22, William Bohannan wrote: > Thanks Grant, I am very new to combining NATing and Brigdge. Please > can you possibly give an example on how to add the virtual interface. I'll try. I don't recognize the format of the file below, but I'll take a stab at it. > Current /etc/networking/interfaces looks like this: > --------------------------------------- > auto lo > iface lo inet loopback > > auto br0 > iface br0 inet static > address xxx.xxx.xxx.xxx > netmask 255.255.255.128 > network xxx.xxx.xxx.xxx > broadcast xxx.xxx.xxx.xxx > gateway xxx.xxx.xxx.xxx auto br0:1 iface br0:1 inet static address xxx.xxx.xxx.xxx netmask 255.255.255.128 network xxx.xxx.xxx.xxx broadcast xxx.xxx.xxx.xxx gateway xxx.xxx.xxx.xxx > pre-up /sbin/ip link set eth0 up > pre-up /sbin/ip link set eth1 up > pre-up /usr/sbin/brctl addbr br0 > pre-up /usr/sbin/brctl addif br0 eth0 > pre-up /usr/sbin/brctl addif br0 eth1 > ----------------------------------------- Again this is just a guess and where I would start. You may have better luck seeking support through your distribution. Grant. . . . 
From marek at piasta.pl Thu May 31 16:41:24 2007 From: marek at piasta.pl (Marek Kierdelewicz) Date: Thu May 31 16:42:20 2007 Subject: ***SPAM*** [LARTC] IFB & 802.1q In-Reply-To: <1180621375.2571.4.camel@mahler.onetec> References: <1180621375.2571.4.camel@mahler.onetec> Message-ID: <20070531164124.24927f39@catlap> Hi, >tc filter add dev eth0.10 parent root protocol ip priority 10 u32 match >u32 0 0 flowid 1: action mirred egress redirect dev ifb0 >tc filter add dev eth0.20 parent root protocol ip priority 10 u32 match >u32 0 0 flowid 1: action mirred egress redirect dev ifb0 Try to add htb qdisc and attach your filter to qdisc instead of root. I think I used such configuration some time ago. As for filter rule, something like that worked for me: tc filter add dev ethX.X protocol ip parent 1: prio 4 u32 match ip dst 0.0.0.0/0 flowid :1 action mirred egress redirect dev ifbX cheers, Marek Kierdelewicz KoBa ISP From sa.foroak at gmail.com Thu May 31 16:59:50 2007 From: sa.foroak at gmail.com (Saioa Arrizabalaga) Date: Thu May 31 16:59:59 2007 Subject: [LARTC] IFB & 802.1q In-Reply-To: <1180621375.2571.4.camel@mahler.onetec> References: <1180621375.2571.4.camel@mahler.onetec> Message-ID: <12489d010705310759p249c2108saa812d23883578db@mail.gmail.com> Hi, > > ip link set up dev ifb0 > > tc qdisc add dev ifb0 root handle 1: htb default 3 > tc class add dev ifb0 parent 1: classid 1:1 htb rate 2000kbit quantum > 1514 > tc class add dev ifb0 parent 1:1 classid 1:2 htb rate 1000kbit ceil > 2000kbit quantum 1514 > tc class add dev ifb0 parent 1:1 classid 1:3 htb rate 1000kbit ceil > 2000kbit quantum 1514 > > tc filter add dev ifb0 parent 1: protocol ip priority 10 u32 match ip > sport 80 0xffff flowid 1:2 Try with "protocol 802.1q" instead of "protocol ip" in the filter: tc filter add dev ifb0 parent 1: protocol 802.1q priority 10 u32 match ip sport 80 0xffff flowid 1:2 I had a similar problem and that worked for me. 
These posts may be useful: http://www.mail-archive.com/lartc@mailman.ds9a.nl/msg10132.html http://www.mail-archive.com/lartc@mailman.ds9a.nl/msg15219.html http://www.mail-archive.com/lartc@mailman.ds9a.nl/msg10726.html Regards, -- Saioa Arrizabalaga Telecommunication Engineer CEIT San Sebastian, Spain From GregScott at InfraSupportEtc.com Thu May 31 20:41:56 2007 From: GregScott at InfraSupportEtc.com (Greg Scott) Date: Thu May 31 20:42:06 2007 Subject: [LARTC] Proxy ARP with a Coyote Point equalizer Message-ID: <925A849792280C4E80C5461017A4B8A210B803@mail733.InfraSupportEtc.com> I was thinking about what Grant said in his earlier reply about bridging, and the more I think about it, the more I think Grant is right. I was watching tcpdump and some firewall accounting rules when we tried this the other day and I saw all kinds of traffic that had absolutely nothing to do with my network - and I was blocking it! And I kept wondering, why was I blocking traffic that had a destination IP address nowhere near me that I should not care about? Well, this is a co-location site, so an Ethernet connects this network to the Internet, not a point to point serial or any kind of WAN. No doubt lots of other folks had their networks in racks in the same room on the same extended Ethernet as my stuff. But still, why did my box care about any of this traffic at all? Why would I specifically block it - why didn't the NIC in my box just ignore it, the way most normal systems do in an Ethernet for traffic it doesn't care about? Well, duh! It's proxy ARP. Every time anyone, anywhere on this Ethernet, sends out an ARP request and I see it, I answer - yup, here is my MAC Address and it belongs to the IP Address you just asked about. I don't care what IP Address, I answer ARP requests for ALL IP Addresses! I was essentialy ARP spoofing the whole world! Well, at least the whole world on that Ethernet that morning. 
Holy moley - based on that analysis, proxy ARP should be outlawed from any co-location site, at least anything directly exposed to all the public traffic. For that few minutes, I'll bet I messed up systems belonging to who-knows-how-many customers! Fortunately, it was in the middle of the night in my corner of the world. For Gypsy - it's not my own ARP cache I was messing with, it was everyone else's ARP caches. Anyway, lesson learned. Maybe this writeup will help somebody else out there. Definitely do bridging. - Greg Scott -----Original Message----- From: gypsy [mailto:gypsy@iswest.com] Sent: Thursday, May 31, 2007 1:50 AM To: lartc@mailman.ds9a.nl Cc: Greg Scott Subject: Re: [LARTC] Proxy ARP with a Coyote Point equalizer Greg Scott wrote: > > Here is a puzzle. > > I have a network with several servers. It's a mess. It's a /24 and > pieces and servers are all over the place inside this /24 block, on > both sides of the firewall. For example, the router at 1.2.3.1 is > outside the firewall and many of the servers at 1.2.3.nnn/24 are > behind the firewall. (Obviously, 1.2.3.nnn is a fudged network.) > > eth0 points outward to the Internet. > eth1 points inward to the serers. > > Both eth0 and eth1 have IP Address 1.2.3.2. I setup proxy ARP like > this: > > echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp > echo 1 > /proc/sys/net/ipv4/conf/eth1/proxy_arp > > And I set up appropriate routes to the systems on both sides of the > firewall. > > This all works - all the systems route the way they are supposed to > route. > > Here is the problem. Behind the firewall is a Coyote Point Equalizer > at 1.2.3.10, with a high-volume website behind it spread across > several servers. Every time I put this proxy ARP firewall in place, > that nasty Coyote Point box dies and this breaks the high volume > website behind it and makes lots of people mad. 
I've never seen a > Coyote Point Equalizer but I have a hunch it might not get along well > with a proxy ARP device in its same network. > > Here are my questions: > > Proxy ARP really means proxy ARP - that firewall answers ARP requests > for anything and everything it sees, for any network. This also has > consequences for new devices that try to be polite when they set IP > Addresses for themselves by ARPing to see if anyone else answers at > that address. Is there a way to limit proxy ARP to a list of IP Addresses? > > Or - should I forget proxy ARP and look at bridging instead? Can I do > bridging and still access the bridged interfaces remotely? > > Thanks > > - Greg Scott > GregScott@InfraSupportEtc.com See http://yesican.chsoft.biz/lartc/proxy-arp.conf and http://yesican.chsoft.biz/lartc/proxy-arp.sh to see if that helps. The LAN interface (eth0) uses the /proc-/proxy_arp setting while the WAN (eth1) interface uses the script. FWIW, those are my working setups. One computer has a WAN connection (eth1) and all other servers inside connect to its eth0. The above script and config file are on that computer. Note that both eth1 and eth0 have the same IP (66.209.101.198) and netmask. This machine has a third interface (eth2) to the LAN, but that is not material here. If the ISP changes things, which they have done a couple of times, I have to ask them to flush their ARP cache manually because their retention is HUGE (~70 minutes), but except for that, I've never had any problems with this setup. I had no success at all trying to use /proc on eth1. 
-- gypsy From gtaylor at riverviewtech.net Thu May 31 23:12:12 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu May 31 23:10:35 2007 Subject: [LARTC] Proxy ARP with a Coyote Point equalizer In-Reply-To: <925A849792280C4E80C5461017A4B8A210B803@mail733.InfraSupportEtc.com> References: <925A849792280C4E80C5461017A4B8A210B803@mail733.InfraSupportEtc.com> Message-ID: <465F3A2C.8010503@riverviewtech.net> On 05/31/07 13:41, Greg Scott wrote: > I was thinking about what Grant said in his earlier reply about > bridging, and the more I think about it, the more I think Grant is > right. I was watching tcpdump and some firewall accounting rules > when we tried this the other day and I saw all kinds of traffic that > had absolutely nothing to do with my network - and I was blocking it! > And I kept wondering, why was I blocking traffic that had a > destination IP address nowhere near me that I should not care about? That's not good. > Well, this is a co-location site, so an Ethernet connects this > network to the Internet, not a point to point serial or any kind of > WAN. No doubt lots of other folks had their networks in racks in the > same room on the same extended Ethernet as my stuff. Probably. However I have to ask, why are so many different networks sharing one broadcast domain? Or could it be that traffic was trying to be routed to the target subnet, but your system responded to the ARP for the IP of the target router??? I wonder... Either way, this is not good. This is also why I have seriously considered statically setting some IP to MAC address mappings in some more stringent environments. If the upstream router knows the MAC address of my router, it will not need to ARP for it and thus I do not care if someone else claims to use my IP or not because the router will know which MAC address to use. > But still, why did my box care about any of this traffic at all? 
> Why would I specifically block it - why didn't the NIC in my box just
> ignore it, the way most normal systems do in an Ethernet for traffic
> it doesn't care about? Well, duh! It's proxy ARP. Every time
> anyone, anywhere on this Ethernet, sends out an ARP request and I see
> it, I answer - yup, here is my MAC Address and it belongs to the IP
> Address you just asked about. I don't care what IP Address, I answer
> ARP requests for ALL IP Addresses! I was essentially ARP spoofing the
> whole world! Well, at least the whole world on that Ethernet that
> morning.

Oh NO! That was not a good thing. I wonder how many people you affected while doing that.

> Holy moley - based on that analysis, proxy ARP should be outlawed
> from any co-location site, at least anything directly exposed to all
> the public traffic. For that few minutes, I'll bet I messed up
> systems belonging to who-knows-how-many customers! Fortunately, it
> was in the middle of the night in my corner of the world.

Just because it is the middle of the night for you does not mean that it was not 9:30 in the morning for someone else who potentially has their box co-located there. I think I would be tempted to contact your co-location provider and let them know what happened. Consider if you were their administrator trying to figure out why a bunch of clients' traffic did not route correctly for a short time and then repaired itself without any explanation at all. I know that I would want to know if such a thing happened in one of my data centers. At the very least I would want to know that it was not a bug in the router but rather something that has a valid explanation.

As far as outlawing Proxy ARP, I don't know. I do know that I see no reason to use Proxy ARP when bridging is so much more powerful and safer. Just think, you can do any and all IPTables (layer 3 and above) firewalling ON a layer 2 device. You can be REALLY devious with this.
Of course there are also EBTables and ARPTables.

> For Gypsy - it's not my own ARP cache I was messing with, it was
> everyone else's ARP caches.

Gulp!

> Anyway, lesson learned. Maybe this writeup will help somebody else
> out there. Definitely do bridging.

Hear! HEAR!

Grant. . . .

From lists at andyfurniss.entadsl.com Fri Jun 1 01:53:04 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Fri Jun 1 01:53:02 2007 Subject: [LARTC] IFB & 802.1q In-Reply-To: <1180621375.2571.4.camel@mahler.onetec> References: <1180621375.2571.4.camel@mahler.onetec> Message-ID: <465F5FE0.3040506@andyfurniss.entadsl.com>
Afshin Tajvidi wrote:
> So more precisely my question is which commands are to be used to
> redirect flows outgoing from eth0.10 and eth0.20 to ifb0 ? (I don't want
> to create separate QoS trees for eth0.10 and eth0.20 because the
> borrowing feature of HTB interests me).
>
> I've used :
>
> tc filter add dev eth0.10 parent root protocol ip priority 10 u32 match
> u32 0 0 flowid 1: action mirred egress redirect dev ifb0
> tc filter add dev eth0.20 parent root protocol ip priority 10 u32 match
> u32 0 0 flowid 1: action mirred egress redirect dev ifb0
>
> But this do not work! (the ifb0 is always empty) Maybe I miss something
> or simply IFB does not allow to do global limitation as IMQ does.

You need a classful qdisc on the egress interface to get the redirect to work. If you redirect from eth0.X then protocol ip should be OK. Try -

tc qdisc add dev eth0.10 root handle 1:0 prio
tc filter add dev eth0.10 parent 1:0 protocol ip priority 10 u32 match u32 0 0 flowid 1: action mirred egress redirect dev ifb0

Andy.
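Andy's two commands cover eth0.10 only. A minimal sketch of the same idea for both VLAN devices from the thread follows. The device names come from the original question; the function name redirect_to_ifb and the TC variable are made up for this illustration and are not part of iproute2. By default the script only prints the commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch: print (or run) the tc commands that redirect egress traffic
# from several devices to one IFB device, as suggested in this thread.
# redirect_to_ifb and TC are illustrative names, not part of iproute2.

TC="${TC:-echo tc}"   # default only prints; use TC=tc (as root) to apply

redirect_to_ifb() {
    ifb="$1"; shift
    for dev in "$@"; do
        # A classful root qdisc gives the filter something to attach to.
        $TC qdisc add dev "$dev" root handle 1:0 prio
        # Match every IP packet and redirect it to the IFB device.
        $TC filter add dev "$dev" parent 1:0 protocol ip priority 10 u32 \
            match u32 0 0 flowid 1: \
            action mirred egress redirect dev "$ifb"
    done
}

redirect_to_ifb ifb0 eth0.10 eth0.20
```

The HTB tree itself still lives on ifb0, so both VLANs share one set of classes and can borrow from each other, which is what the original question asked for.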
From luciano at lugmen.org.ar Fri Jun 1 04:43:05 2007 From: luciano at lugmen.org.ar (Luciano Ruete) Date: Fri Jun 1 04:43:16 2007 Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers In-Reply-To: <465ADB7F.1020202@relef.net> References: <20070526135435.C96F540DB@outpost.ds9a.nl> <465AD949.6020904@zoomnet.ro> <465ADB7F.1020202@relef.net> Message-ID: <200705312343.05477.luciano@lugmen.org.ar>
On Monday 28 May 2007 10:39:11 VladSun wrote:
> Alexandru Dragoi wrote:
> > u32 hash filters is the key, as somebody pointed. You can also tune your
> > iptables setup, like this
> >
> > #192.168.1.0/24
> > iptables -t mangle -N 192-168-1-0-24
> > iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j 192-168-1-0-24
> > iptables -t mangle -N 192-168-1-0-25
> > iptables -t mangle -N 192-168-1-128-25
> > iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.0/25 -j 192-168-1-0-25
> > iptables -t mangle -A 192-168-1-0-24 -s 192.168.1.128/25 -j 192-168-1-128-25
> > .
> > .
> > and so on, until (ip 192.168.1.11, which is called in chain created for
> > 192.168.1.10/31)
> >
> > iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.10 -j CLASSIFY --set-class 1:10
> > iptables -t mangle -A 192-168-1-10-31 -s 192.168.1.11 -j CLASSIFY --set-class 1:11
> >
> > .. I guess you got the idea, it requires some RAM, which I believe is
> > not such a big problem. Similar rules should be made for download.
>
> Or you can use my patch - IPCLASSIFY. Then the rules above would be
> substituted by a single rule per direction:
>
> iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j IPCLASSIFY --addr=src --and-mask=0xff --or-mask=0x11000
> iptables -t mangle -A FORWARD -d 192.168.1.0/24 -j IPCLASSIFY --addr=dst --and-mask=0xff --or-mask=0x12000

Wow! Now I get it, this patch is amazing. Now I have a pending hack, which is to merge this with htb-gen. Any chance that this gets into mainline? Have you mailed the netfilter-dev list?
-- Luciano

From luciano at lugmen.org.ar Fri Jun 1 06:46:18 2007 From: luciano at lugmen.org.ar (Luciano Ruete) Date: Fri Jun 1 06:48:53 2007 Subject: [LARTC] htb-gen 9.0beta (htb frontend with web-frontend for home/small/medium ISPs) Message-ID: <200706010146.18511.luciano@lugmen.org.ar>
original at: http://www.praga.org.ar/wacko/DevPraga/htbgen

Htb-gen has evolved a lot since its release in Feb 2006, but I have had no time to make a decent, documented and generalized public release. Right now I think it is better to put the stuff here, so others can enjoy the notable improvements (and maybe someone wants to help out).

Let's go to the hacks: I have made 2 flavors of htb-gen (actually these are two real setups, each one with different needs); config files were touched and some documentation updates were made in place.

* First flavor (htb-gen evolution)
  - htb-gen-9.0b.tar.gz Source tarball
  - Multiple interface support: you can now have multiple LANs and multiple ISPs
  - Per-host p2p percentage of rate assignment
  - Named ISP/LAN and clients in the web front-end
  - Code simplification
  - htb-init support removed (no one found this useful)
  - pfifo_fast for prio class
  - Compatibility with bash v2
  - tc batch mode support: now both iptables and tc are batched, with a huge speed impact on large setups, and yet the tc and iptables commands in the source remain transparently readable

* Second flavor (htb-gen advanced)
  - htb-gen-9.0b-advanced.tar.gz Source tarball
  - All features of htb-gen-9.0b
  - Fine-grained prio/non_prio definition per host; you can set up per client:
    - prio_tcp_ports
    - prio_udp_ports
    - prio_protos (such as esp, gre, igmp, or even udp to include all UDP traffic)
    - prio_helpers (netfilter helpers)
  - Customizable defaults for the variables above
  - A PHP-based web front-end:
    - built with PEAR Quick Form
    - data entry safety checks
    - inline graphs per client

* Bonus
  There is also a per-client graphing development: look at the htb-graph script, which collects data triggered by a cron entry (look at cron.d/htb-graph) and puts it in /var/lib/rrd/; then there is a perl script that displays client graphs in a fashionable manner. The graphs are per client and have different colors (light/dark green) for prio/non_prio traffic. :-)

Good luck, and please mail me any clean-up of this!

-- Luciano

From afshin.tajvidi at free.fr Fri Jun 1 11:12:32 2007 From: afshin.tajvidi at free.fr (Afshin Tajvidi) Date: Fri Jun 1 11:12:48 2007 Subject: [LARTC] IFB & 802.1q In-Reply-To: <12489d010705310759p249c2108saa812d23883578db@mail.gmail.com> References: <1180621375.2571.4.camel@mahler.onetec> <12489d010705310759p249c2108saa812d23883578db@mail.gmail.com> Message-ID: <1180689152.2598.6.camel@mahler.onetec>
Thank you for your response Saioa

I've tried "protocol 802.1q" but this solution does not work...

Regards
Afshin

On Thu, 2007-05-31 at 16:59 +0200, Saioa Arrizabalaga wrote:
> Hi,
>
> > ip link set up dev ifb0
> >
> > tc qdisc add dev ifb0 root handle 1: htb default 3
> > tc class add dev ifb0 parent 1: classid 1:1 htb rate 2000kbit quantum 1514
> > tc class add dev ifb0 parent 1:1 classid 1:2 htb rate 1000kbit ceil 2000kbit quantum 1514
> > tc class add dev ifb0 parent 1:1 classid 1:3 htb rate 1000kbit ceil 2000kbit quantum 1514
> >
> > tc filter add dev ifb0 parent 1: protocol ip priority 10 u32 match ip sport 80 0xffff flowid 1:2
>
> Try with "protocol 802.1q" instead of "protocol ip" in the filter:
>
> tc filter add dev ifb0 parent 1: protocol 802.1q priority 10 u32 match ip sport 80 0xffff flowid 1:2
>
> I had a similar problem and that worked for me.
> These posts may be useful:
> http://www.mail-archive.com/lartc@mailman.ds9a.nl/msg10132.html
> http://www.mail-archive.com/lartc@mailman.ds9a.nl/msg15219.html
> http://www.mail-archive.com/lartc@mailman.ds9a.nl/msg10726.html
>
> Regards,

From afshin.tajvidi at free.fr Fri Jun 1 11:32:39 2007
From: afshin.tajvidi at free.fr (Afshin Tajvidi)
Date: Fri Jun 1 11:32:44 2007
Subject: [LARTC] Re: LARTC] IFB & 802.1q
In-Reply-To: <20070531164124.24927f39@catlap>
References: <1180621375.2571.4.camel@mahler.onetec> <20070531164124.24927f39@catlap>
Message-ID: <1180690359.2598.10.camel@mahler.onetec>

Thank you so much, Marek and Andy.

Your solutions work great! My complete configuration is now set up by the following commands:

ip link set up dev ifb0

tc qdisc add dev ifb0 root handle 1: htb default 3
tc class add dev ifb0 parent 1: classid 1:1 htb rate 2000kbit quantum 1514
tc class add dev ifb0 parent 1:1 classid 1:2 htb rate 1000kbit ceil 2000kbit quantum 1514
tc class add dev ifb0 parent 1:1 classid 1:3 htb rate 1000kbit ceil 2000kbit quantum 1514

tc filter add dev ifb0 parent 1: protocol ip priority 10 u32 match ip sport 80 0xffff flowid 1:2

tc qdisc add dev eth0.10 root handle 1: htb
tc qdisc add dev eth0.20 root handle 1: htb

tc filter add dev eth0.10 parent 1: protocol ip priority 10 u32 match u32 0 0 flowid 1: action mirred egress redirect dev ifb0
tc filter add dev eth0.20 parent 1: protocol ip priority 10 u32 match u32 0 0 flowid 1: action mirred egress redirect dev ifb0

On Thu, 2007-05-31 at 16:41 +0200, Marek Kierdelewicz wrote:
> Hi,
>
> >tc filter add dev eth0.10 parent root protocol ip priority 10 u32 match
> >u32 0 0 flowid 1: action mirred egress redirect dev ifb0
> >tc filter add dev eth0.20 parent root protocol ip priority 10 u32 match
> >u32 0 0 flowid 1: action mirred egress redirect dev ifb0
>
> Try to add an htb qdisc and attach your filter to the qdisc instead of root. I
> think I used such a configuration some time ago.
> As for the filter rule, something like this worked for me:
>
> tc filter add dev ethX.X protocol ip parent 1: prio 4 u32 match ip
> dst 0.0.0.0/0 flowid :1 action mirred egress redirect dev ifbX
>
> cheers,
> Marek Kierdelewicz
> KoBa ISP

-- 
Afshin Tajvidi
IT Technical Architect

From nozo at ziu.info Fri Jun 1 13:24:55 2007
From: nozo at ziu.info (Michal Soltys)
Date: Fri Jun 1 13:24:38 2007
Subject: [LARTC] tc offset & subheader matching clarification / question
Message-ID: <46600207.2090605@ziu.info>

Hello

TC's syntax, particularly the u32 filter, is far richer than what the man page, the howto or the command's help describe. I've been looking for information about the uses of the 'offset' parameter, or a more detailed explanation of a few other relevant options, but what I've found is very brief, to say the least.

So I checked the sources of cls_u32.c and f_u32.c. According to them, the separate 'offset' parameter controls the offset to the subheader (i.e. tcp from the beginning of ip), and it must be supplied explicitly. So, for example, doing something like:

tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 \
   match tcp dst 1234 0xffff flowid 1:5

or its equivalent

tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 \
   match u32 0x00001234 0x0000ffff at nexthdr+0 flowid 1:5

is not enough. Looking at f_u32.c, the only thing that nexthdr+ causes is setting the offset mask (key->offmask), used in the following line of net/sched/cls_u32.c:

if ((*(u32*)(ptr+key->off+(off2&key->offmask))^key->val)&key->mask) {

If I understand it correctly, then e.g. the examples in section 12.1.1 of the lartc howto wouldn't work as intended: off2 would have to be set by means of the 'offset' option on the command line.
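
The kernel comparison quoted above can be modelled in a few lines to see why the extra offset matters. This is only a sketch, assuming off2 is the IP header length obtained the way README.iproute2+tc does it (an offset mask of 0x0F00 with shift 6, applied to the first 16 bits of the IP header). The function names and the hand-built packet are made up for illustration; they are not kernel or iproute2 APIs.

```python
import struct


def ip_header_len(packet: bytes, mask: int = 0x0F00, shift: int = 6) -> int:
    """Model of 'offset at 0 mask 0x0F00 shift 6': header length in bytes."""
    first16 = struct.unpack_from("!H", packet, 0)[0]
    # The masked IHL nibble sits at bits 8..11, so >> 8 would give the IHL in
    # 32-bit words; >> 6 divides by 64 instead and yields IHL * 4 bytes.
    return (first16 & mask) >> shift


def key_matches(packet, off, off2, offmask, val, mask):
    """Model of the cls_u32.c test: the key matches when the masked XOR is 0."""
    pos = off + (off2 & offmask)
    word = struct.unpack_from("!I", packet, pos)[0]
    return ((word ^ val) & mask) == 0


# Hypothetical packet: 20-byte IPv4 header (IHL=5, total length 40)
# followed by a TCP header with source port 40000, destination port 0x1234.
ip_hdr = bytes([0x45, 0x00, 0x00, 0x28]) + bytes(16)
tcp_hdr = struct.pack("!HH", 40000, 0x1234) + bytes(16)
packet = ip_hdr + tcp_hdr

off2 = ip_header_len(packet)  # 20 for a header without options
NEXTHDR = -1                  # offmask as set for 'at nexthdr+N' keys

# With off2 left at 0, the key is compared against the start of the IP header.
no_offset = key_matches(packet, 0, 0, NEXTHDR, 0x00001234, 0x0000FFFF)
# With off2 supplied, the same key lands on the TCP port words.
with_offset = key_matches(packet, 0, off2, NEXTHDR, 0x00001234, 0x0000FFFF)
print(off2, no_offset, with_offset)
```

In this model the key only hits the TCP destination port once off2 carries the IP header length, which is the behaviour described in the message.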

Now, the hash table example in README.iproute2+tc shows how to use the 'offset' option, and after analysing the mentioned sources it's more or less clear to me what happens and how:

offset mask 0x0F00 shift 6 - specifies the ip header size in bytes (calculated as off2 in cls_u32.c)

match tcp dst 0x17 0xffff - with key->offmask == -1, off2 is added in addition to key->off, thus skipping to the actual tcp header if such a need arises.

This is needed for match tcp to work properly in:

$TC filter add dev eth1 parent 1:0 prio 5 u32 ht 1:6: \
   match ip dst 193.233.7.75 \
   match tcp dst 0x17 0xffff \
   flowid 1:4 \
   police rate 32kbit buffer 5kb/8 mpu 64 mtu 1514 index 1

Anyway, as far as I understand, the 'offset' option only works in the context of extra hash tables, as off2 is calculated before the move to the next ht.

Do I understand this correctly, or did I miss something?

From vladsun at relef.net Fri Jun 1 14:00:38 2007
From: vladsun at relef.net (VladSun)
Date: Fri Jun 1 14:00:55 2007
Subject: [LARTC] big problem with HTB/CBQ and CPU for more than 1.700 customers
In-Reply-To: <200705312343.05477.luciano@lugmen.org.ar>
References: <20070526135435.C96F540DB@outpost.ds9a.nl> <465AD949.6020904@zoomnet.ro> <465ADB7F.1020202@relef.net> <200705312343.05477.luciano@lugmen.org.ar>
Message-ID: <46600A66.2000307@relef.net>

Luciano Ruete wrote:
>> Or you can use my patch - IPCLASSIFY. Then the rules above would be
>> substituted by a single rule per direction:
>>
>> iptables -t mangle -A FORWARD -s 192.168.1.0/24 -j IPCLASSIFY --addr=src --and-mask=0xff --or-mask=0x11000
>> iptables -t mangle -A FORWARD -d 192.168.1.0/24 -j IPCLASSIFY --addr=dst --and-mask=0xff --or-mask=0x12000
>
> Wow! Now I get it, this patch is amazing. Now I have a pending hack, which is
> to merge this with htb-gen. Any chance that this gets into mainline? Have you
> mailed the netfilter-dev list?

:) Thank you!
You should thank Grzegorz Janoszka also - he wrote the original IPMARK patch.
My patch is just a slight modification of it.
As far as I know, the netfilter team refused to include IPMARK in the official P-o-M, so I don't think IPCLASSIFY would be accepted either.

Regards,
Vladimir Mirchev.

From russell-tcatm at stuart.id.au Sat Jun 2 01:21:49 2007
From: russell-tcatm at stuart.id.au (Russell Stuart)
Date: Sat Jun 2 01:22:03 2007
Subject: [LARTC] tc offset & subheader matching clarification / question
In-Reply-To: <46600207.2090605@ziu.info>
References: <46600207.2090605@ziu.info>
Message-ID: <1180740109.3928.2.camel@ras.pc.stuart.local>

On Fri, 2007-06-01 at 13:24 +0200, Michal Soltys wrote:
> TC's syntax, particularly the u32 filter, is far richer than what the man page,
> the howto or the command's help describe. I've been looking for information
> about the uses of the 'offset' parameter, or a more detailed explanation of a
> few other relevant options, but what I've found is very brief, to say the
> least.

Look here: http://www.stuart.id.au/russell/files/tc/doc/tc/cls_u32.txt

From luciano at lugmen.org.ar Sat Jun 2 05:27:46 2007
From: luciano at lugmen.org.ar (Luciano Ruete)
Date: Sat Jun 2 05:28:02 2007
Subject: [LARTC] Multihome load balancing - kernel vs netfilter
In-Reply-To: <000a01c7a340$df560720$5964a8c0@SalimSi>
References: <000a01c7a340$df560720$5964a8c0@SalimSi>
Message-ID: <200706020027.47126.luciano@lugmen.org.ar>

On Thursday 31 May 2007 02:02:16 Salim S I wrote:
> Before we get into the "Top-posting" stuff, it would be nice if you
> follow the normal way of replying (or atleast marking a copy) to the
> list. I think that is the basic idea behind mailing list.

Sure! :-) My fault for not looking at the headers; my wish was always to write to the list.

> If you had done that, I wouldn't have had to do the "Top-Posting". Take
> a look at the archives please.

There is no reason to top-post: even if I missed the cc to the list, you can still do a normal inline reply. But all this is getting OT for this list.

> On Wednesday 30 May 2007 00:58:18 you wrote:
[snip]
> Yessir. 3 bags full.
> If you had read my post c l e a r l y, before you felt obliged to show
> off your knowledge, you might have understood that I was talking about
> the so-called 'special-protocols'.

You mention online gaming and IM protocols, and there is nothing special about those. What I'm trying to say is that CONNTRACK+CONNMARK solves that problem for you. You can have IM (msn, jabber, yahoo, aol) connected all day long without problems, or you can do online gaming too, or keep an ssh session open for weeks.

CONNTRACK has the ability to track tcp (among other) flows and to remember an ESTABLISHED connection. You can then use CONNMARK to MARK an ESTABLISHED connection with a unique tag based on the provider that it uses. From then on, every time you see the same MARK on that ESTABLISHED connection, you make sure it is routed over the same original provider.

Full example here:
http://mailman.ds9a.nl/pipermail/lartc/2006q2/018964.html

> Btw, thanks for that bit about "TCP flow between two ports", was quite
> new to me.

> >> I remember
> >> even MSN have had problems (timeout every 5 mins), but it seems to
> >> been fixed at the server level.
> >
> >With the CONNMARK solution 99.99% of things work; I am the sys/net-admin
> >of an ISP that proves it, with load balancing over multiple links.
>
> Sorry again! That figure of '99.99' is in YOUR case, but are you aware
> there are others in this world too, with different scenarios/setups? You
> did not think Peter and I were dreaming up a scenario,did you?

The scenario that you mention is a bad/incomplete setup, so do not expect it to work right.

> Btw, your being a netadmin doesn't automatically make your statements
> correct.

What makes my statement correct is the fact that my networks have none of the problems that you mention in your post.

> >For each protocol that is not covered by a simple tcp flow, a helper
> >module was written.
>
> It must be a well kept secret then!
> I am sorry to say this, if your knowledge was half the size of your ego,
> it would have been good for us all.

It is not about ego. Sorry if you take this personally, that is not my intention. I speak bluntly because this list gets heavily indexed by Google, and it is taken as good advice by many answer seekers.

You affirm that Linux cannot handle load balancing properly, and this is completely WRONG, bad advertising and a lie.

Julian's great patches[1] have been available since the 2.4 series, and then in 2.6.12 CONNMARK got into mainline; with a little setup, all connection problems related to load balancing are solved perfectly.

> >> Could you please point out if I had missed any open discussion in the
> >> list which covers these things?
> >
> >just google(ie): "connmark site:lartc...archive"
>
> Thanks for introducing google. But my question still stands.

I hope it is answered now.

[1] http://www.ssi.bg/~ja/

-- 
Luciano

From drumlesson at gmail.com Sat Jun 2 11:19:51 2007
From: drumlesson at gmail.com (terraja-based)
Date: Sat Jun 2 11:20:09 2007
Subject: [LARTC] u32 classifier
Message-ID: <823158cf0706020219w6c22cbd1yc5b1a72857ece07a@mail.gmail.com>

Hi folks...!!!

I've a problem that I have not been able to solve. I want to limit the DOWNLOAD to my hosts (upstream traffic for the firewall) using IMQ.

If I classify by PORT (source or destination) all seems to be fine, but...BUT...if I want to restrict by IP address (internal IP address) I can't do it, because my hosts go to the Internet through the firewall using NAT, so after NAT my IP address on the Internet is not my internal address, because the NAT action changes my source and internal IP address.

So...so...so...how can I limit the traffic by IP address using TC, IMQ, U32..etc...?????

Can I modify some field in the TCP header with the u32 filter? I did read the TCP RFC and found nothing; I can't guess how to solve it...
Please, HELPPPPPPP ME...!!!

-- 
terraja-based

From afshin.tajvidi at free.fr Sat Jun 2 12:31:50 2007
From: afshin.tajvidi at free.fr (Afshin Tajvidi)
Date: Sat Jun 2 12:31:57 2007
Subject: [LARTC] u32 classifier
In-Reply-To: <823158cf0706020219w6c22cbd1yc5b1a72857ece07a@mail.gmail.com>
References: <823158cf0706020219w6c22cbd1yc5b1a72857ece07a@mail.gmail.com>
Message-ID: <1180780310.2387.2.camel@mahler.onetec>

Hi

Maybe you have to review your IMQ behavior and choose CONFIG_IMQ_BEHAVIOR_AA or CONFIG_IMQ_BEHAVIOR_AB during the kernel compilation (and not CONFIG_IMQ_BEHAVIOR_BA or CONFIG_IMQ_BEHAVIOR_BB).

Regards
Afshin

On Sat, 2007-06-02 at 06:19 -0300, terraja-based wrote:
> Hi folks...!!!
>
> I've a problem that i did not solve it.
> i want to limit the DOWNLOAD to my hosts (upstream traffic for the
> firewall) using IMQ,
>
> If i classify by PORT (source or destination) all seems to be fine,
> but...BUT...if i want to restrict by IP addresss (internal IP address)
> i can't do it, because my hosts go to Internet toward the firewall
> using NAT, so after NAT my IP address in Internet is not my internal
> address, because the NAT acction change my source and internal IP
> address.
>
> So...so...so...how can i limit the traffic by IP address using TC,
> IMQ, U32..etc...?????
>
> Can i modify some field in the TCP header with u32 filter?, i did read
> the TCP RFC and nothing, i can't guess how can solve it...
> Please, HELPPPPPPP ME...!!!
>
> --
> terraja-based

-- 
Afshin Tajvidi
IT Technical Architect

From vladsun at relef.net Sat Jun 2 13:46:38 2007
From: vladsun at relef.net (VladSun)
Date: Sat Jun 2 13:47:00 2007
Subject: [LARTC] u32 classifier
In-Reply-To: <823158cf0706020219w6c22cbd1yc5b1a72857ece07a@mail.gmail.com>
References: <823158cf0706020219w6c22cbd1yc5b1a72857ece07a@mail.gmail.com>
Message-ID: <4661589E.8070100@relef.net>

terraja-based wrote:
> Hi folks...!!!
> I've a problem that i did not solve it.
> i want to limit the DOWNLOAD to my hosts (upstream traffic for the
> firewall) using IMQ,
> If i classify by PORT (source or destination) all seems to be fine,
> but...BUT...if i want to restrict by IP addresss (internal IP address)
> i can't do it, because my hosts go to Internet toward the firewall
> using NAT, so after NAT my IP address in Internet is not my internal
> address, because the NAT acction change my source and internal IP
> address.
> So...so...so...how can i limit the traffic by IP address using TC,
> IMQ, U32..etc...?????
> Can i modify some field in the TCP header with u32 filter?, i did read
> the TCP RFC and nothing, i can't guess how can solve it...
> Please, HELPPPPPPP ME...!!!
>
> --
> terraja-based

Use iptables MARK, and TC fw.
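
One way to read the suggestion above: mark each internal host in the mangle table, at a point where the un-NATed address is still visible, and let tc classify on that mark with the fw filter instead of inspecting IP headers with u32. A rough sketch follows. The host list, the device name imq0, the choice of the FORWARD chain and the class numbering are all assumptions for illustration, not taken from the original posts, and the script only prints the commands. Whether the mark is already set when a packet reaches the IMQ device depends on where IMQ hooks in, so the chain choice is something to verify on your own kernel.

```python
# Hypothetical hosts behind the NAT box: internal IP -> mark / class minor.
HOSTS = {"192.168.1.10": 10, "192.168.1.11": 11}


def build_commands(hosts, dev="imq0", parent="1:"):
    """Return iptables MARK rules and matching tc fw filters as strings."""
    cmds = []
    for ip, mark in sorted(hosts.items()):
        # Download traffic: once NAT has been undone, the destination is
        # the internal address again, so it can be matched directly.
        cmds.append(
            f"iptables -t mangle -A FORWARD -d {ip} -j MARK --set-mark {mark}"
        )
        # The fw classifier picks the filter whose handle equals the mark.
        # tc reads the flowid minor as hex; here it is only a label that
        # has to agree with the class created for this host.
        cmds.append(
            f"tc filter add dev {dev} parent {parent} protocol ip prio 1 "
            f"handle {mark} fw flowid 1:{mark}"
        )
    return cmds


commands = build_commands(HOSTS)
for line in commands:
    print(line)
```

The printed lines can be reviewed and then pasted into a shell script once the classes 1:10 and 1:11 exist on the shaping device.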

From nozo at ziu.info Sat Jun 2 14:17:09 2007
From: nozo at ziu.info (Michal Soltys)
Date: Sat Jun 2 14:17:35 2007
Subject: [LARTC] tc offset & subheader matching clarification / question
In-Reply-To: <1180740109.3928.2.camel@ras.pc.stuart.local>
References: <46600207.2090605@ziu.info> <1180740109.3928.2.camel@ras.pc.stuart.local>
Message-ID: <46615FC5.9090201@ziu.info>

Russell Stuart wrote:
>
> Look here: http://www.stuart.id.au/russell/files/tc/doc/tc/cls_u32.txt
>

Awesome documentation. Thanks.

One tiny detail after an initial reading: shift 6 divides by 64, not 32.

From WBohannan at spidersat.com.gh Mon Jun 4 11:46:54 2007
From: WBohannan at spidersat.com.gh (William Bohannan)
Date: Mon Jun 4 11:47:16 2007
Subject: [LARTC] 2 NICs Bridge + Router
In-Reply-To: <465EDD62.3000602@riverviewtech.net>
References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local> <465B21D1.5080300@riverviewtech.net><4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> <465EDD62.3000602@riverviewtech.net>
Message-ID: <4D411FB02758FE45915E9724339093F6321E55@intranet.scpl.local>

Grant

It didn't work: it fails saying it cannot create the bridge because it already exists, and the current bridge br0 stops working. I am currently using Debian and will try the Debian forums to see if someone can help. Thanks again for the assistance.

# /etc/network/interfaces
auto lo
iface lo inet loopback

# public ip
auto br0
iface br0 inet static
    address xxx.xxx.xxx.xxx
    netmask 255.255.255.128
    network xxx.xxx.xxx.xxx
    broadcast xxx.xxx.xxx.xxx
    gateway xxx.xxx.xxx.xxx

# private ip
auto br0:1
iface br0:1 inet static
    address 10.10.10.254
    netmask 255.255.255.0
    network 10.10.10.0
    broadcast 10.10.10.255
    pre-up /sbin/ip link set eth0 up
    pre-up /sbin/ip link set eth1 up
    pre-up /usr/sbin/brctl addbr br0
    pre-up /usr/sbin/brctl addif br0 eth0
    pre-up /usr/sbin/brctl addif br0 eth1

Kind Regards
William Bohannan

-----Original Message-----
From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Grant Taylor
Sent: Thursday, May 31, 2007 2:36 PM
To: Mail List - Linux Advanced Routing and Traffic Control
Subject: Re: [LARTC] 2 NICs Bridge + Router

On 05/31/07 07:22, William Bohannan wrote:
> Thanks Grant, I am very new to combining NATing and Brigdge. Please
> can you possibly give an example on how to add the virtual interface.

I'll try. I don't recognize the format of the file below, but I'll take a stab at it.

> Current /etc/networking/interfaces looks like this:
> ---------------------------------------
> auto lo
> iface lo inet loopback
>
> auto br0
> iface br0 inet static
> address xxx.xxx.xxx.xxx
> netmask 255.255.255.128
> network xxx.xxx.xxx.xxx
> broadcast xxx.xxx.xxx.xxx
> gateway xxx.xxx.xxx.xxx

auto br0:1
iface br0:1 inet static
address xxx.xxx.xxx.xxx
netmask 255.255.255.128
network xxx.xxx.xxx.xxx
broadcast xxx.xxx.xxx.xxx
gateway xxx.xxx.xxx.xxx

> pre-up /sbin/ip link set eth0 up
> pre-up /sbin/ip link set eth1 up
> pre-up /usr/sbin/brctl addbr br0
> pre-up /usr/sbin/brctl addif br0 eth0
> pre-up /usr/sbin/brctl addif br0 eth1
> -----------------------------------------

Again, this is just a guess and where I would start. You may have better luck seeking support through your distribution.

Grant. . . .

From WBohannan at spidersat.com.gh Mon Jun 4 18:28:31 2007
From: WBohannan at spidersat.com.gh (William Bohannan)
Date: Mon Jun 4 18:28:47 2007
Subject: [LARTC] 2 NICs Bridge + Router
In-Reply-To: <465EDD62.3000602@riverviewtech.net>
References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local> <465B21D1.5080300@riverviewtech.net><4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> <465EDD62.3000602@riverviewtech.net>
Message-ID: <4D411FB02758FE45915E9724339093F6321E95@intranet.scpl.local>

Grant

Works well, except I cannot for the life of me get NAT working. I have the following setup:

### Network Interface script
# /etc/init.d/network/interfaces
auto lo
iface lo inet loopback

auto br0
iface br0 inet static
    address 193.xxx.xxx.77
    netmask 255.255.255.128
    network 193.xxx.xxx.0
    broadcast 193.xxx.xxx.127
    gateway 193.xxx.xxx.126
    pre-up /sbin/ip link set eth0 up
    pre-up /sbin/ip link set eth1 up
    pre-up /usr/sbin/brctl addbr br0
    pre-up /usr/sbin/brctl addif br0 eth0
    pre-up /usr/sbin/brctl addif br0 eth1

### Simple script to start at boot
# /etc/init.d/brouter.init
echo "Bringing up NAT"
ip addr add 10.10.1.254/24 dev br0
iptables -t nat -A POSTROUTING -o br0 -j MASQUERADE
route add -net -n 0.0.0.0 dev br0
#enable forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward

Please advise.

Kind Regards
William Bohannan

From gtaylor at riverviewtech.net Mon Jun 4 18:53:14 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Mon Jun 4 18:51:25 2007
Subject: [LARTC] 2 NICs Bridge + Router
In-Reply-To: <4D411FB02758FE45915E9724339093F6321E95@intranet.scpl.local>
References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local> <465B21D1.5080300@riverviewtech.net><4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> <465EDD62.3000602@riverviewtech.net> <4D411FB02758FE45915E9724339093F6321E95@intranet.scpl.local>
Message-ID: <4664437A.9040601@riverviewtech.net>

On 06/04/07 11:28, William Bohannan wrote:
> Works well except I cannot for the life of me get NAT working. I have
> the following setup:

Good.

> ### Network Interface script
> # /etc/init.d/network/interfaces
> auto lo
> iface lo inet loopback
>
> auto br0
> iface br0 inet static
> address 193.xxx.xxx.77
> netmask 255.255.255.128
> network 193.xxx.xxx.0
> broadcast 193.xxx.xxx.127
> gateway 193.xxx.xxx.126
>
> pre-up /sbin/ip link set eth0 up
> pre-up /sbin/ip link set eth1 up
> pre-up /usr/sbin/brctl addbr br0
> pre-up /usr/sbin/brctl addif br0 eth0
> pre-up /usr/sbin/brctl addif br0 eth1

What would happen if you added additional address, netmask, network, broadcast, and gateway lines? Would that allow you to have aliases defined in this manner, or would it simply override the existing settings?

> ### Simple script to start at boot
> # /etc/init.d/brouter.init
> echo "Bringing up NAT"
> ip addr add 10.10.1.254/24 dev br0
> iptables -t nat -A POSTROUTING -o br0 -j MASQUERADE
> route add -net -n 0.0.0.0 dev br0
> #enable forwarding
> echo 1 > /proc/sys/net/ipv4/ip_forward

Hmm, this looks like you will be MASQUERADEing any and all traffic that leaves br0. I'm betting that you are MASQUERADEing some traffic that you do not want to MASQUERADE.

> Please advise.

You need to selectively MASQUERADE traffic that is leaving your br0 interface, i.e. MASQUERADE any traffic that is leaving your network headed to the world. You can accomplish this in a couple of different ways (possibly more).

1) MASQUERADE any traffic that is not destined to your internal network. In other words, MASQUERADE any traffic that is leaving your network. I.e.

iptables -t nat -A POSTROUTING -o br0 -d ! 10.10.1.0/24 -j MASQUERADE

(If I have that IPTables syntax correct. You get the idea.)

2) MASQUERADE any traffic that is leaving the physical interface that is facing the internet, via the physdev IPTables match extension. (Sorry, I have no experience with this option.)

Personally, I would try to do it based on destination IP address rather than physical interface, for various reasons that are not really pertinent here.

Grant. . . .
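
To make the first option concrete, the effect of the `-d ! 10.10.1.0/24` test can be checked on paper before touching the firewall. The sketch below only models the destination test with Python's ipaddress module; the sample addresses are invented, and it says nothing about the rest of the rule (output interface, connection tracking).

```python
import ipaddress

INTERNAL_NET = ipaddress.ip_network("10.10.1.0/24")


def would_masquerade(dst: str, internal=INTERNAL_NET) -> bool:
    """True when a '-d ! <internal>' test would match this destination."""
    return ipaddress.ip_address(dst) not in internal


# Hypothetical destinations of packets leaving br0.
samples = ["10.10.1.20", "10.10.1.254", "192.0.2.126", "198.51.100.7"]
decisions = {dst: would_masquerade(dst) for dst in samples}
for dst, nat in decisions.items():
    print(f"{dst:15s} {'MASQUERADE' if nat else 'left alone'}")
```

Destinations inside the private subnet are left alone, while anything else leaving br0 would be rewritten, which is the selective behaviour described in option 1.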

From WBohannan at spidersat.com.gh Mon Jun 4 19:25:21 2007
From: WBohannan at spidersat.com.gh (William Bohannan)
Date: Mon Jun 4 19:25:43 2007
Subject: [LARTC] 2 NICs Bridge + Router
In-Reply-To: <4664437A.9040601@riverviewtech.net>
References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local> <465B21D1.5080300@riverviewtech.net><4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> <465EDD62.3000602@riverviewtech.net><4D411FB02758FE45915E9724339093F6321E95@intranet.scpl.local> <4664437A.9040601@riverviewtech.net>
Message-ID: <4D411FB02758FE45915E9724339093F6321E97@intranet.scpl.local>

Grant

Thanks for the quick reply. From the test machine (10.10.1.20) I can ping 193.xxx.xxx.77 and 10.10.1.254 (the brouter), but I still cannot ping the internet gateway 193.xxx.xxx.126. Below is my routing table:

[root:~]$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
localnet        *               255.255.255.128 U     0      0        0 br0
10.10.1.0       *               255.255.255.0   U     0      0        0 br0
default         *               0.0.0.0         U     0      0        0 br0
default         193.xxx.xxx.126 0.0.0.0         UG    0      0        0 br0

## Start up script
#
echo "Bringing up NAT"
ip addr add 10.10.1.254/24 dev br0
iptables -t nat -A POSTROUTING -o br0 -d ! 10.10.1.0/24 -j MASQUERADE
route add -net -n 0.0.0.0 dev br0
#enable forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward
route add default gw 193.220.59.126

## Network interfaces file
# /etc/network/interfaces
auto lo
iface lo inet loopback

auto br0
iface br0 inet static
    address 193.xxx.xxx.77
    netmask 255.255.255.128
    network 193.xxx.xxx.0
    broadcast 193.xxx.xxx.127
    gateway 193.xxx.xxx.126
    pre-up /sbin/ip link set eth0 up
    pre-up /sbin/ip link set eth1 up
    pre-up /usr/sbin/brctl addbr br0
    pre-up /usr/sbin/brctl addif br0 eth0
    pre-up /usr/sbin/brctl addif br0 eth1

Thanks again for all the help so far.

Kind Regards
William Bohannan

From WBohannan at spidersat.com.gh Mon Jun 4 20:26:58 2007
From: WBohannan at spidersat.com.gh (William Bohannan)
Date: Mon Jun 4 20:28:06 2007
Subject: [LARTC] 2 NICs Bridge + Router (working debian)
In-Reply-To: <4664437A.9040601@riverviewtech.net>
References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local> <465B21D1.5080300@riverviewtech.net><4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> <465EDD62.3000602@riverviewtech.net><4D411FB02758FE45915E9724339093F6321E95@intranet.scpl.local> <4664437A.9040601@riverviewtech.net>
Message-ID: <4D411FB02758FE45915E9724339093F6321E99@intranet.scpl.local>

Thank you so much. I have been wanting to do this for ages and finally got it working (had to remove the gw) :)

### /etc/network/interfaces
#
auto lo
iface lo inet loopback

auto br0
iface br0 inet static
    address 193.xxx.xxx.77
    netmask 255.255.255.128
    network 193.xxx.xxx.0
    broadcast 193.xxx.xxx.127
    pre-up /sbin/ip link set eth0 up
    pre-up /sbin/ip link set eth1 up
    pre-up /usr/sbin/brctl addbr br0
    pre-up /usr/sbin/brctl addif br0 eth0
    pre-up /usr/sbin/brctl addif br0 eth1

### /etc/init.d/brouter.sh
#
echo "Bringing up NAT"
ip addr add 192.168.2.101/24 dev br0
iptables -t nat -A POSTROUTING -o br0 -d ! 192.168.2.0/24 -j MASQUERADE
#enable forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward
route add default gw 193.xxx.xxx.126

Kind Regards
William Bohannan

From gtaylor at riverviewtech.net Mon Jun 4 20:38:07 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Mon Jun 4 20:36:17 2007
Subject: [LARTC] 2 NICs Bridge + Router (working debian)
In-Reply-To: <4D411FB02758FE45915E9724339093F6321E99@intranet.scpl.local>
References: <4D411FB02758FE45915E9724339093F62E4DE2@intranet.scpl.local> <465B21D1.5080300@riverviewtech.net><4D411FB02758FE45915E9724339093F6321D0D@intranet.scpl.local> <465EDD62.3000602@riverviewtech.net><4D411FB02758FE45915E9724339093F6321E95@intranet.scpl.local> <4664437A.9040601@riverviewtech.net> <4D411FB02758FE45915E9724339093F6321E99@intranet.scpl.local>
Message-ID: <46645C0F.7020909@riverviewtech.net>

On 06/04/07 13:26, William Bohannan wrote:
> Thank you so much been wanting to do this for ages, finally got it
> working (had to remove the gw) :)

*nod*

I was in the middle of reading your last message when you replied stating that you had fixed your problem. I was just staring at the fact that you had two defaults and wondering if that was not the problem.

You are welcome. I'm glad that I was able to help. :)

Grant. . . .
From salim.si at cipherium.com.tw Tue Jun 5 08:48:01 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Tue Jun 5 08:48:40 2007
Subject: [LARTC] Multihome load balancing - kernel vs netfilter
In-Reply-To: <200706020027.47126.luciano@lugmen.org.ar>
Message-ID: <000101c7a73d$79483f10$5964a8c0@SalimSi>

-----Original Message-----
From: Luciano Ruete [mailto:luciano@lugmen.org.ar]
Sent: Saturday, June 02, 2007 11:28 AM
To: Salim S I
Cc: lartc@mailman.ds9a.nl
Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter

>It is not about ego, sorry if you take this personally, it is not my
>intention. I speak rudely because this list gets heavily indexed by
>Google, and it is taken as good advice by many answer seekers.
>
>You affirm that Linux cannot handle load balancing properly and this is
>completely WRONG and is bad advertising and a lie.
>
>Since the 2.4 series Julian's great patches[1] have been available, and
>then in 2.6.12 CONNMARK got into mainline, and with a little setup all
>connection problems related to load balancing get perfectly solved.

I did not say Linux can't do load balancing (btw, my setup has Julian's
DGD patch as well as CONNMARK). But there are some limitations to the
popular methods currently used.

1. As Peter Rabbitson [rabbit@rabbit.us] mentioned, one issue is the
separate control and data servers. He mentions AIM servers as an
example. This probably can only be solved by having an exception IP
list.

2. The other situation, and the one I am more concerned about, is
different connections which belong to the same session.

Consider Client X and Server Y.

Client X initiates a connection from port a to port b of server Y.

Xa <---> Yb   This connection goes through WAN1.

After some time, X opens another connection to Y from port c to port d.
Xc <---> Yd   This is a perfectly new TCP connection, so it may go
through WAN2.

(Note that the client is NATed, and that no CONNTRACK exists for this
app.)

The server may reject the second and subsequent connections as they
come in with a different source IP than the first.

This situation happens often in IM and gaming scenarios. Some sort of IP
persistence is required to handle this. And I was wondering if the
recent match would solve this to an extent, without affecting
performance. Or if there is some other method available. (Note that I
can't depend much on cache.)

From rgood at crg.ee.uct.ac.za Tue Jun 5 10:41:02 2007
From: rgood at crg.ee.uct.ac.za (Richard Good)
Date: Tue Jun 5 10:41:16 2007
Subject: [LARTC] Using tcng to create a basic DiffServ router
Message-ID: <4665219E.3090507@crg.ee.uct.ac.za>

Hi all,

I am new-ish to Linux tcng and am struggling to find the resources and
know-how to implement what I need.

What I have is a PC with 2 interfaces and a control program on that PC.
In the control program I can specify an IP Flow with source and dest IP
address, source and dest port, and a QoS class (1-5, which can be mapped
to a DiffServ class). In the control program I want to be able to add
these IP Flows to the traffic control rules on the fly, and also to be
able to remove them.

As I see it:

When I enforce IP Flows in the router, both interfaces must mark
(according to the QoS class) and forward packets that fit the criteria
(IP addresses and ports).

When I remove IP Flows from the router, I need to be able to remove
these rules.

I need to set up a basic DiffServ router such that each interface queues
and shapes traffic according to its DiffServ class.

I thought of using IPTables to mangle the dsmark field of the packets as
they come in and to forward the packets from one interface to another. I
can add and remove these iptables rules on the fly. Does this sound
correct, or is there an easier way to do this using the tcng package?
Also does anyone know of a basic tcng configuration script that sets up
a basic DiffServ router with BE, EF, AF1, AF2, AF3 classes? (A big ask,
I know!)

thanks and regards,
Richard

From alchemyx at uznam.net.pl Tue Jun 5 11:13:52 2007
From: alchemyx at uznam.net.pl (=?ISO-8859-2?Q?Micha=B3_Margula?=)
Date: Tue Jun 5 11:13:39 2007
Subject: [LARTC] Multipath routing
Message-ID: <46652950.8000701@uznam.net.pl>

Hello!

I have trouble with multipath routing. These options are enabled in the
kernel:

[*] IP: policy routing
[*] IP: equal cost multipath
[*] IP: equal cost multipath with caching support (EXPERIMENTAL)
<*> MULTIPATH: round robin algorithm

But issuing:

ip r a 1.2.3.0/23 scope global equalize nexthop via 80.245.176.11 \
  dev eth0 weight 1 nexthop via 80.245.176.13 dev eth0 weight 1

and then

# ip r s
[...]
1.2.3.0/24
        nexthop via 80.245.176.11 dev eth0 weight 1
        nexthop via 80.245.176.13 dev eth0 weight 1

As you can see there is no equalize keyword in here.

Also I have trouble using multipath quagga, it simply doesn't put a
multipath route in the routing table.

For example:

faramir# sh ip bgp 10.100.0.1
BGP routing table entry for 10.101.0.0/22
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Not advertised to any peer
  Local
    80.245.176.13 (metric 1) from 80.245.176.13 (80.245.177.4)
      Origin IGP, metric 0, localpref 100, weight 150, valid, internal, best
      Last update: Tue Jun 5 01:59:29 2007

  Local
    80.245.176.10 (metric 1) from 80.245.176.10 (80.245.176.10)
      Origin IGP, metric 0, localpref 100, weight 100, valid, internal
      Last update: Tue Jun 5 01:28:02 2007

# ip r s
[...]
10.100.0.0/22 via 80.245.176.11 dev eth0 proto zebra

But if I manually put something like that in quagga:

faramir(config)# ip route 1.2.3.0/24 80.245.176.13
faramir(config)# ip route 1.2.3.0/24 80.245.176.11

Then:

# ip r s
[...]
1.2.3.0/24
        nexthop via 80.245.176.11 dev eth0 weight 1
        nexthop via 80.245.176.13 dev eth0 weight 1

Please help, I am out of ideas.

-- 
Michał
Margula, alchemyx@uznam.net.pl, http://alchemyx.uznam.net.pl/
"W życiu piękne są tylko chwile" [Ryszard Riedel]

From pch at packetconsulting.pl Tue Jun 5 21:46:01 2007
From: pch at packetconsulting.pl (Piotr Chytla)
Date: Tue Jun 5 21:47:49 2007
Subject: [LARTC] Multipath routing
In-Reply-To: <46652950.8000701@uznam.net.pl>
References: <46652950.8000701@uznam.net.pl>
Message-ID: <20070605194601.GB25580@packetconsulting.pl>

On Tue, Jun 05, 2007 at 11:13:52AM +0200, Michał Margula wrote:
> Hello!
>

Hi

> I have trouble with multipath routing. Those options are enabled in
> kernel:
>
> [*] IP: policy routing
> [*] IP: equal cost multipath
> [*] IP: equal cost multipath with caching support (EXPERIMENTAL)
> <*> MULTIPATH: round robin algorithm
>

First of all, equal cost multipathing is evil ;>. It simply doesn't work
for packets in the forwarding path; besides, the support in the kernel
is not maintained.

Really, if you want to load balance both uplinks, disable
CONFIG_IP_ROUTE_MULTIPATH_CACHED and you will have random traffic
distribution between both links.

More details:
http://lists.openwall.net/netdev/2007/03/14/50
http://lists.openwall.net/netdev/2007/03/12/76
http://lists.quagga.net/pipermail/quagga-users/2007-May/008469.html

> Also I have trouble using multipath quagga, it simply doesn't put
> multipath route in routing table.
>
> For example:
>
> faramir# sh ip bgp 10.100.0.1
> BGP routing table entry for 10.101.0.0/22
> Paths: (2 available, best #1, table Default-IP-Routing-Table)
>   Not advertised to any peer
>   Local
>     80.245.176.13 (metric 1) from 80.245.176.13 (80.245.177.4)
>       Origin IGP, metric 0, localpref 100, weight 150, valid, internal, best
>       Last update: Tue Jun 5 01:59:29 2007
>
>   Local
>     80.245.176.10 (metric 1) from 80.245.176.10 (80.245.176.10)
>       Origin IGP, metric 0, localpref 100, weight 100, valid, internal
>       Last update: Tue Jun 5 01:28:02 2007
>

BGP always has alternative paths in the BGP RIB and mostly doesn't
insert them as a multipath route into the FIB.
Of course there is a patch: http://lebon.org.ua/quagga.html that forces
the route to be inserted into the kernel with multiple gateways - but
really this is some kind of dirty hack.

Check the thread 'Linux and BGP multipath' on quagga-dev, and especially
this mail:

http://lists.quagga.net/pipermail/quagga-dev/2007-April/004700.html

> # ip r s
> [...]
> 10.100.0.0/22 via 80.245.176.11 dev eth0 proto zebra
>
> But if I manually put something like that in quagga:
>
> faramir(config)# ip route 1.2.3.0/24 80.245.176.13
> faramir(config)# ip route 1.2.3.0/24 80.245.176.11
>

Yeah, this is a static route.

/pch

-- 
Dyslexia bug unpatched since 1977 ... exploit has been leaked to the
underground.

From alex at samad.com.au Tue Jun 5 23:09:02 2007
From: alex at samad.com.au (Alex Samad)
Date: Tue Jun 5 23:09:13 2007
Subject: [LARTC] Multihome load balancing - kernel vs netfilter
In-Reply-To: <000101c7a73d$79483f10$5964a8c0@SalimSi>
References: <200706020027.47126.luciano@lugmen.org.ar> <000101c7a73d$79483f10$5964a8c0@SalimSi>
Message-ID: <20070605210902.GF31415@samad.com.au>

On Tue, Jun 05, 2007 at 02:48:01PM +0800, Salim S I wrote:
>
>
> -----Original Message-----
> From: Luciano Ruete [mailto:luciano@lugmen.org.ar]
> Sent: Saturday, June 02, 2007 11:28 AM
> To: Salim S I
> Cc: lartc@mailman.ds9a.nl
> Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter
>
> >It is not about ego, sorry if you take this personally, it is not my
> >intention. I speak rudely because this list gets heavily indexed by
> >Google, and it is taken as good advice by many answer seekers.
> >
> >You affirm that Linux cannot handle load balancing properly and this is
> >completely WRONG and is bad advertising and a lie.
> >
> >Since the 2.4 series Julian's great patches[1] have been available, and
> >then in 2.6.12 CONNMARK got into mainline, and with a little setup all
> >connection problems related to load balancing get perfectly solved.
>
>
> I did not say Linux can't do load balancing (btw, my setup has Julian's
> DGD patch as well as CONNMARK). But there are some limitations to the
> popular methods currently used.
>
> 1. As Peter Rabbitson [rabbit@rabbit.us] mentioned, one issue is the
> separate control and data servers. He mentions AIM servers as an
> example. This probably can only be solved by having an exception IP
> list.
>
> 2. The other situation, and the one I am more concerned about, is
> different connections which belong to the same session.
>
> Consider Client X and Server Y.
>
> Client X initiates a connection from port a to port b of server Y.
>
> Xa <---> Yb   This connection goes through WAN1.
>
> After some time, X opens another connection to Y from port c to port d.
>
> Xc <---> Yd   This is a perfectly new TCP connection, so it may go
> through WAN2.
>
> (Note that the client is NATed, and that no CONNTRACK exists for this
> app.)
>
> The server may reject the second and subsequent connections as they
> come in with a different source IP than the first.
>
> This situation happens often in IM and gaming scenarios. Some sort of IP
> persistence is required to handle this. And I was wondering if the
> recent match would solve this to an extent, without affecting
> performance. Or if there is some other method available. (Note that I
> can't depend much on cache.)

Are all of these idioms of each method documented in the wiki?

So what is the preferred method going forward?
From tb at becket.net Wed Jun 6 06:37:08 2007
From: tb at becket.net (Thomas Bushnell BSG)
Date: Wed Jun 6 06:37:21 2007
Subject: [LARTC] elementary usage clamping
Message-ID: <1181104629.22239.25.camel@localhost>

I'm pretty smart, and was once regarded as pretty network and computer
savvy. But the world has obviously passed me by!

I have a server in a colocation facility, and I was recently hit by a
bill for overage; I used more bandwidth than I expected, and I must
pay.

So now, I want to bother with packet shaping on the server. The *most*
important thing is to clamp bandwidth to the 1Mbps that my contract
allows for. This is well within my ordinary usage; there is no reason
for me to want more. But I must be careful about overage: when I am
transferring large amounts of data, I don't mind waiting for how long it
takes at 1Mbps (minus overhead), but I certainly don't want to pay lots
extra!

This is the most important thing. The next thing is that, once the
bandwidth has been clamped, I want to have the ability to be flexible
about shaping traffic. Obviously such things as ssh need priority, and
then AFS, and then ftp and http. But this is still really only a
single-user case, so even if the shaping is not so great, it's ok.

I cannot, for the life of me, figure out what tcng syntax would get me
what I want. Can someone help me?

Thomas
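
For a sense of what a hard 1 Mbit/s clamp means in practice, the
arithmetic can be sketched as below. It assumes the contract is a flat
rate ceiling (the message does not say how the provider meters overage),
ignores protocol overhead, and uses decimal units. The function names
are made up for the example.

```python
def transfer_seconds(size_bytes: int, rate_bit_per_s: int = 1_000_000) -> float:
    """Time needed to move `size_bytes` at a constant rate, ignoring overhead."""
    return size_bytes * 8 / rate_bit_per_s


def max_bytes(seconds: int, rate_bit_per_s: int = 1_000_000) -> float:
    """Upper bound on bytes sent in `seconds` when the rate is never exceeded."""
    return rate_bit_per_s / 8 * seconds


if __name__ == "__main__":
    gb = 10**9
    hours = transfer_seconds(gb) / 3600
    print(f"1 GB at 1 Mbit/s takes about {hours:.2f} hours")
    month = 30 * 86400  # a 30-day month, in seconds
    print(f"30 days at 1 Mbit/s moves at most {max_bytes(month) / gb:.0f} GB")
```

So a shaper that never lets the link exceed 1 Mbit/s also puts a
ceiling on the volume that can leave the server in any billing period,
whatever the provider's metering turns out to be.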
From alchemyx at uznam.net.pl Wed Jun 6 12:05:58 2007
From: alchemyx at uznam.net.pl (=?ISO-8859-2?Q?Micha=B3_Margula?=)
Date: Wed Jun 6 12:05:47 2007
Subject: [LARTC] Multipath routing
In-Reply-To: <20070605194601.GB25580@packetconsulting.pl>
References: <46652950.8000701@uznam.net.pl> <20070605194601.GB25580@packetconsulting.pl>
Message-ID: <46668706.9080906@uznam.net.pl>

Piotr Chytla wrote:
> First of all equal cost multipathing is evil ;>, It simply doesn't work
> for packets in forwarding path besides support in kernel is not
> maintained
>
> Really if you want load balance both uplinks disable
> CONFIG_IP_ROUTE_MULTIPATH_CACHED and you will have random traffic
> distribution between both links.
>
> More details :
> http://lists.openwall.net/netdev/2007/03/14/50
> http://lists.openwall.net/netdev/2007/03/12/76
> http://lists.quagga.net/pipermail/quagga-users/2007-May/008469.html
>

Oh. I see. Thanks. BTW: Google doesn't show those links when looking for
multipath on Linux :-)

Random load sharing over multiple routes is not a good idea, or maybe it
is? Am I guessing right that with a large enough amount of traffic and
two nexthops it will split 50%/50%?

> BGP always have alternative paths in BGP RIB and mostly don't insert them
> as multipath route to FIB.
>
> Of course there is path : http://lebon.org.ua/quagga.html that force
> route to be inserted to kernel with multiple gateways - but really this
> is some kind of dirty-hack.

I know that site, but I thought that those patches were obsoleted by
the --multipath option when compiling quagga.

> Check thread 'Linux and BGP multipath' on quagga-dev, and especially
> this mail:
>
> http://lists.quagga.net/pipermail/quagga-dev/2007-April/004700.html

I know that also :).
BTW: until now I was quite pleased with Linux networking, whose quality
is amazing. But now, when I need load balancing, I am disappointed,
because it doesn't support things that you take for granted with Cisco
hardware. I mostly miss recursive routes. Something like this:

ip route add 80.245.177.4/32 via 80.245.176.11
ip route add 10.0.0.0/24 via 80.245.177.4

It would solve problems with multipath BGP and load balancing because I
could add/remove additional routes to 80.245.177.4 (or some other
imaginary loopback) and it would work as expected. I hope it will be
added some day :)

Thank you for your help!

-- 
Michał Margula, alchemyx@uznam.net.pl, http://alchemyx.uznam.net.pl/
"W życiu piękne są tylko chwile" [Ryszard Riedel]

From fredi_bieging at yahoo.com.br Wed Jun 6 12:40:57 2007
From: fredi_bieging at yahoo.com.br (Fredi Bieging)
Date: Wed Jun 6 12:41:02 2007
Subject: [LARTC] Controlling FTP in Passive Mode
Message-ID: <945406.27870.qm@web50508.mail.re2.yahoo.com>

I am trying to control traffic on my server and a question came up...
My ftp server is set up in passive mode, so it will randomly choose a
port to transfer data (in my case ports 50000-50100)...
Is there a way of controlling this ftp traffic without marking packets?

Thanks!
Bye...

msn: fredi_bieging@hotmail.com
skype: fredibieging

A mathematician is a machine for converting coffee into theorems.
Windows - reboot, Linux - be root.
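
One way to classify a port range without marking packets is to match
the ports directly with the u32 classifier, which takes a value and a
mask rather than a range. A range such as 50000-50100 therefore has to
be split into aligned power-of-two blocks first. The sketch below does
that split. It is an illustration only: the function name is made up,
and whether to match `sport` or `dport` depends on which direction the
shaper sees (source port is assumed here, for data leaving a passive
FTP server).

```python
def port_range_to_masks(lo: int, hi: int) -> list:
    """Split the inclusive port range [lo, hi] into (value, mask) pairs,
    each covering one aligned power-of-two block of 16-bit port numbers."""
    if not (0 <= lo <= hi <= 0xFFFF):
        raise ValueError("ports must satisfy 0 <= lo <= hi <= 65535")
    blocks = []
    while lo <= hi:
        size = 1
        # Double the block while it stays aligned on `lo` and inside the range.
        while lo % (size * 2) == 0 and lo + size * 2 - 1 <= hi:
            size *= 2
        blocks.append((lo, 0xFFFF & ~(size - 1)))
        lo += size
    return blocks


if __name__ == "__main__":
    # Print one u32 match clause per block for the range from the message.
    for value, mask in port_range_to_masks(50000, 50100):
        print(f"match ip sport {value} 0x{mask:04x}")
```

Each printed clause would go into its own `tc filter ... u32` rule
pointing at the same class, so the 101 ports need six filters instead
of one per port.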
From cata at geniusnet.ro Wed Jun 6 15:19:04 2007
From: cata at geniusnet.ro (Catalin Bucur)
Date: Wed Jun 6 15:19:16 2007
Subject: [LARTC] u32 classifier
In-Reply-To: <4661589E.8070100@relef.net>
References: <823158cf0706020219w6c22cbd1yc5b1a72857ece07a@mail.gmail.com> <4661589E.8070100@relef.net>
Message-ID: <4666B448.90208@geniusnet.ro>

VladSun wrote:
> terraja-based wrote:
>> Hi folks...!!!
>> I've a problem that I did not solve.
>> I want to limit the DOWNLOAD to my hosts (upstream traffic for the
>> firewall) using IMQ.
>> If I classify by PORT (source or destination) all seems to be fine,
>> but...BUT...if I want to restrict by IP address (internal IP address)
>> I can't do it, because my hosts go to the Internet through the firewall
>> using NAT, so after NAT my IP address on the Internet is not my internal
>> address, because the NAT action changes my source and internal IP
>> address.
>> So...so...so...how can I limit the traffic by IP address using TC,
>> IMQ, U32..etc...?????
>> Can I modify some field in the TCP header with a u32 filter? I did read
>> the TCP RFC and nothing, I can't guess how to solve it...
>>
> Use iptables MARK, and TC fw.
SCENARIO
========

tc utility, iproute2-ss061214
kernel 2.6.20-1.2952.fc6

Mark packets:
#iptables -A OUTPUT -t mangle -o eth1 -j MARK --set-mark 1

Shape marked packets with tc fw:
#tc class add dev eth1 parent 11:1 classid 11:2 htb rate 10Mbit ceil 90Mbit prio 6
#tc qdisc add dev eth1 parent 11:2 sfq quantum 1500 perturb 5
#tc filter add dev eth1 parent 11:0 protocol ip handle 1 fw classid 11:2

Result in iptables seems ok:
Chain OUTPUT (policy ACCEPT 8054768 packets, 8122202853 bytes)
    pkts      bytes target  prot opt in  out   source      destination
 3827080 4103809298 MARK    all  --  *   eth1  0.0.0.0/0   0.0.0.0/0    MARK set 0x1

Result in tc:
filter parent 11: protocol ip pref 49152 fw
filter parent 11: protocol ip pref 49152 fw handle 0x1 classid 11:2

So there are no matches in this filter, the other filters work fine (for
example: rule hit 5846685 success 5846685). The class is empty too:
class htb 11:2 parent 11:1 leaf 8003: prio 6 rate 10000Kbit ceil 90000Kbit burst 2850b cburst 12847b
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

What could be the problem?


Cheers,
-- 
Catalin Bucur mailto:cata@geniusnet.ro
NOC @ Genius Network SRL - Galati - Romania

From vladsun at relef.net Wed Jun 6 15:50:05 2007
From: vladsun at relef.net (VladSun)
Date: Wed Jun 6 15:50:47 2007
Subject: [LARTC] u32 classifier
In-Reply-To: <4666B448.90208@geniusnet.ro>
References: <823158cf0706020219w6c22cbd1yc5b1a72857ece07a@mail.gmail.com> <4661589E.8070100@relef.net> <4666B448.90208@geniusnet.ro>
Message-ID: <4666BB8D.3080509@relef.net>

Catalin Bucur wrote:
> VladSun wrote:
>
>> terraja-based wrote:
>>
>>> Hi folks...!!!
>>> I've a problem that I did not solve.
>>> I want to limit the DOWNLOAD to my hosts (upstream traffic for the
>>> firewall) using IMQ.
>>> If I classify by PORT (source or destination) all seems to be fine,
>>> but...BUT...if I want to restrict by IP address (internal IP address)
>>> I can't do it, because my hosts go to the Internet through the firewall
>>> using NAT, so after NAT my IP address on the Internet is not my internal
>>> address, because the NAT action changes my source and internal IP
>>> address.
>>> So...so...so...how can I limit the traffic by IP address using TC,
>>> IMQ, U32..etc...?????
>>> Can I modify some field in the TCP header with a u32 filter? I did read
>>> the TCP RFC and nothing, I can't guess how to solve it...
>>>
>>>
>> Use iptables MARK, and TC fw.
>>
>
> SCENARIO
> ========
>
> tc utility, iproute2-ss061214
> kernel 2.6.20-1.2952.fc6
>
> Mark packets:
> #iptables -A OUTPUT -t mangle -o eth1 -j MARK --set-mark 1
>
> Shape marked packets with tc fw:
> #tc class add dev eth1 parent 11:1 classid 11:2 htb rate 10Mbit ceil 90Mbit prio 6
> #tc qdisc add dev eth1 parent 11:2 sfq quantum 1500 perturb 5
> #tc filter add dev eth1 parent 11:0 protocol ip handle 1 fw classid 11:2
>
> Result in iptables seems ok:
> Chain OUTPUT (policy ACCEPT 8054768 packets, 8122202853 bytes)
>     pkts      bytes target  prot opt in  out   source      destination
>  3827080 4103809298 MARK    all  --  *   eth1  0.0.0.0/0   0.0.0.0/0    MARK set 0x1
>
> Result in tc:
> filter parent 11: protocol ip pref 49152 fw
> filter parent 11: protocol ip pref 49152 fw handle 0x1 classid 11:2
>
> So there are no matches in this filter, the other filters work fine (for
> example: rule hit 5846685 success 5846685). The class is empty too:
> class htb 11:2 parent 11:1 leaf 8003: prio 6 rate 10000Kbit ceil 90000Kbit burst 2850b cburst 12847b
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
>
> What could be the problem?
>
>
> Cheers,
>

11:1 is not your root class, right?
If so, try to apply the filter to the root class - i.e. something like

tc filter add dev eth1 parent 1:0 protocol ip handle 1 fw classid 11:2

From cata at geniusnet.ro Wed Jun 6 16:00:56 2007
From: cata at geniusnet.ro (Catalin Bucur)
Date: Wed Jun 6 16:01:08 2007
Subject: [LARTC] u32 classifier
In-Reply-To: <4666BB8D.3080509@relef.net>
References: <823158cf0706020219w6c22cbd1yc5b1a72857ece07a@mail.gmail.com> <4661589E.8070100@relef.net> <4666B448.90208@geniusnet.ro> <4666BB8D.3080509@relef.net>
Message-ID: <4666BE18.5020500@geniusnet.ro>

VladSun wrote:
> 11:1 is not your root class, right?
>
> If so, try to apply the filter to root class - i.e. something like
>
> tc filter add dev eth1 parent 1:0 protocol ip handle 1 fw classid 11:2

11:0 is my root class, and the line is (as I wrote earlier):
#tc filter add dev eth1 parent 11:0 protocol ip handle 1 fw classid 11:2

-- 
Catalin Bucur mailto:cata@geniusnet.ro
NOC @ Genius Network SRL - Galati - Romania

From ethy.brito at inexo.com.br Wed Jun 6 16:16:19 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Wed Jun 6 16:16:27 2007
Subject: [LARTC] how hierarchical is HTB?
Message-ID: <20070606111619.26ef7781@pulsar.inexo.com.br>

Hi there!

I've been using HTB for a while and now I am faced with a 'problem'.
How hierarchical is HTB?

Let's say I have this 3 layer HTB setup:

root class 1: (rate=100, ceil=100)
1: children classes 1:10 (30,100) and 1:20 (70,100)
1:10 children classes 1:100 (10,100) and 1:101 (20,100)
1:20 children classes 1:200 (30,100) and 1:201 (70,100)

I managed to have the root rate equal to the sum of its children.

But how must the rates of the leaves be signed?
And how will the bandwidth of these leaves be distributed when
borrowing/lending is necessary?

Class 1:10 will/may lend/borrow from class 1:20. I know that.
But how about classes 1:1XX and 1:2XX? Will they borrow/lend from each
other?

Any docs about this?
Thanx

Ethy

From cla.greco at fastwebnet.it Wed Jun 6 16:58:14 2007
From: cla.greco at fastwebnet.it (Claudio Greco)
Date: Wed Jun 6 16:59:34 2007
Subject: [LARTC] how hierarchical is HTB?
In-Reply-To: <20070606111619.26ef7781@pulsar.inexo.com.br>
References: <20070606111619.26ef7781@pulsar.inexo.com.br>
Message-ID: <4666CB86.2070304@fastwebnet.it>

> root class 1: (rate=100, ceil=100)
> 1: children classes 1:10 (30,100) and 1:20 (70,100)
> 1:10 children classes 1:100 (10,100) and 1:101 (20,100)
> 1:20 children classes 1:200 (30,100) and 1:201 (70,100)
>
> I managed to have the root rate equal to the sum of its children.
>
>
Well, it is still true that the total assured rate for classes 1:200 and
1:201 is greater than the assured rate for class 1:20. Still, I don't
think this is a big deal.

> But how must the rates of the leaves be signed?
>
What do you mean with 'signed'?

> And how will the bandwidth of these leaves be distributed when
> borrowing/lending is necessary?
>
>
As far as I know, when a leaf is 'yellow', i.e. its rate is greater than
its assured rate and less than its ceil rate, it can borrow from its
parent provided there's a yellow path to the root and the root is green
(the root can't be yellow, only green or red).

If there's more than one child borrowing from the same class, they're
served according to their priority (argument prio in *tc class add*).

If there's more than one child having the same priority, then they're
served in DRR order (Deficit Round Robin).

You can tune DRR behaviour with arguments r2q in *tc qdisc add* and
quantum in *tc class add*.

> Class 1:10 will/may lend/borrow from class 1:20. I know that.
>
No, it cannot. A class can only borrow from its parent, never from its
siblings.

> But how about classes 1:1XX and 1:2XX? Will they borrow/lend from each
> other?
>
>
ibidem.

> Any docs about this?
>
>
You may see:

http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm
http://luxik.cdi.cz/~devik/qos/htb/manual/theory.htm

From marco.casaroli at gmail.com Wed Jun 6 17:42:03 2007
From: marco.casaroli at gmail.com (Marco Aurelio)
Date: Wed Jun 6 17:42:09 2007
Subject: [LARTC] elementary usage clamping
In-Reply-To: <1181104629.22239.25.camel@localhost>
References: <1181104629.22239.25.camel@localhost>
Message-ID: <92ed523b0706060842i1aa8783dl476aee52ad2e8e64@mail.gmail.com>

use the HTB wondershaper that can be found at lartc.org

On 6/6/07, Thomas Bushnell BSG wrote:
> I'm pretty smart, and was once regarded as pretty network and computer
> savvy. But the world has obviously passed me by!
>
> I have a server in a colocation facility, and I was recently hit by a
> bill for overage; I used more bandwidth than I expected, and I must
> pay.
>
> So now, I want to bother with packet shaping on the server. The *most*
> important thing is to clamp bandwidth to the 1Mbps that my contract
> allows for. This is well within my ordinary usage; there is no reason
> for me to want more. But I must be careful about overage: when I am
> transferring large amounts of data, I don't mind waiting for how long it
> takes at 1Mbps (minus overhead), but I certainly don't want to pay lots
> extra!
>
> This is the most important thing. The next thing is that, once the
> bandwidth has been clamped, I want to have the ability to be flexible
> about shaping traffic. Obviously such things as ssh need priority, and
> then AFS, and then ftp and http. But this is still really only a
> single-user case, so even if the shaping is not so great, it's ok.
>
> I cannot, for the life of me, figure out what tcng syntax would get me
> what I want. Can someone help me?
>
> Thomas
>

-- 
Marco Casaroli
SapucaiNet Telecom
+55 35 34712377 ext 5

From ethy.brito at inexo.com.br Wed Jun 6 18:49:40 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Wed Jun 6 18:49:54 2007
Subject: [LARTC] how hierarchical is HTB?
In-Reply-To: <4666CB86.2070304@fastwebnet.it>
References: <20070606111619.26ef7781@pulsar.inexo.com.br> <4666CB86.2070304@fastwebnet.it>
Message-ID: <20070606134940.0daf614e@pulsar.inexo.com.br>

On Wed, 06 Jun 2007 16:58:14 +0200
Claudio Greco wrote:

>
> > root class 1: (rate=100, ceil=100)
> > 1: children classes 1:10 (30,100) and 1:20 (70,100)
> > 1:10 children classes 1:100 (10,100) and 1:101 (20,100)
> > 1:20 children classes 1:200 (30,100) and 1:201 (70,100)
> >
> > I managed to have the root rate equal to the sum of its children.
> >
> >
> Well, it is still true that the total assured rate for classes 1:200 and
> 1:201 is greater than the assured rate for class 1:20. Still, I don't
> think this is a big deal.

My mistake. I meant 1:20 (40,100) and not (70,100).

>
> > But how must the rates of the leaves be signed?
> >
> What do you mean with 'signed'?

Again!? Please read "assigned".

>
> > And how will the bandwidth of these leaves be distributed when
> > borrowing/lending is necessary?
> >
> >
> As far as I know, when a leaf is 'yellow', i.e. its rate is greater than
> its assured rate and less than its ceil rate, it can borrow from its
> parent provided there's a yellow path to the root and the root is green
> (the root can't be yellow, only green or red).

Ok. Then class 1:200 may "borrow" from 1:100 via the path to the root
class.

>
> If there's more than one child borrowing from the same class, they're
> served according to their priority (argument prio in *tc class add*).
>
> If there's more than one child having the same priority, then they're
> served in DRR order (Deficit Round Robin).

So will the available BW at the root class be assigned to 200 and 100
proportionally to their rates (different amounts), or will they both
(100 and 200) grow by the same amount up to their own ceil?
(confusing?)

>
> You can tune DRR behaviour with arguments r2q in *tc qdisc add* and
> quantum in *tc class add*.

I will research that.

> > Any docs about this?
> >
> >
> You may see:
>
> http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm
> http://luxik.cdi.cz/~devik/qos/htb/manual/theory.htm

I found them.

Thanx

Ethy

From tb at becket.net Wed Jun 6 21:28:21 2007
From: tb at becket.net (Thomas Bushnell BSG)
Date: Wed Jun 6 21:28:27 2007
Subject: [LARTC] elementary usage clamping
In-Reply-To: <92ed523b0706060842i1aa8783dl476aee52ad2e8e64@mail.gmail.com>
References: <1181104629.22239.25.camel@localhost> <92ed523b0706060842i1aa8783dl476aee52ad2e8e64@mail.gmail.com>
Message-ID: <1181158101.24243.14.camel@localhost>

On Wed, 2007-06-06 at 12:42 -0300, Marco Aurelio wrote:
> use the HTB wondershaper that can be found at lartc.org

Thanks for your reply. I looked at wondershaper, and I could not tell
from the documentation whether it actually limited the rate of packets
transmitted, and policed incoming packets, in a reliable fashion.

In other words, all the documentation I see is written as if it is
addressing the case of a residential customer with a bandwidth-limited
connection (cable modem, say), that has large queues, and arranges to
shape on the box instead of on the connection's queues, allowing for
better and more sensitive control.

But it still seemed (from what I read) as if it tries to keep the pipe
as full as possible, merely reordering packets carefully, in which case
I'm sure to lose, because I *don't want* the pipe as full as possible; I
want to dribble bits out the pipe to conform to the pricing I have
agreed with my ISP.

Am I missing something?
Thomas

From Jon.J.Flechsenhaar at boeing.com Wed Jun 6 22:44:53 2007
From: Jon.J.Flechsenhaar at boeing.com (Flechsenhaar, Jon J)
Date: Wed Jun 6 22:45:53 2007
Subject: [LARTC] how hierarchical is HTB?
In-Reply-To: <4666CB86.2070304@fastwebnet.it>
References: <20070606111619.26ef7781@pulsar.inexo.com.br> <4666CB86.2070304@fastwebnet.it>
Message-ID: <0E24ED2A7F9AA349A8633E6A56A64BE0027A8308@XCH-SW-2V1.sw.nos.boeing.com>

Few quick comments:

HTB parent rate should never be less than the sum of its children. This
is referring to the rate parameter, not the ceil.

Class 1:20 needs to equal 1:200+1:201. You will get strange results if
you try and test with any configuration where the sum of all children's
rates is greater than their parent's.

Borrowing occurs from the parent and from classes at the same level. So
if you have 3 leaf classes, 1:1, 1:2, and 1:3, they will get their
assigned rate and borrow up to their ceil if there is extra bandwidth.
If there is no traffic in one of the classes then it can give its
assured bandwidth to the other 2 classes with traffic. Borrowing is
based on the priority assigned to the class.

Jon Flechsenhaar
Boeing WNW Team
Network Services
(714)-762-1231
202-E7

-----Original Message-----
From: Claudio Greco [mailto:cla.greco@fastwebnet.it]
Sent: Wednesday, June 06, 2007 7:58 AM
To: Ethy H. Brito
Cc: lartc@mailman.ds9a.nl
Subject: Re: [LARTC] how hierarchical is HTB?

> root class 1: (rate=100, ceil=100)
> 1: children classes 1:10 (30,100) and 1:20 (70,100)
> 1:10 children classes 1:100 (10,100) and 1:101 (20,100)
> 1:20 children classes 1:200 (30,100) and 1:201 (70,100)
>
> I managed to have the root rate equal to the sum of its children.
> > Well, it is still true that total assured rate for classes 1:200 and 1:201 is greater than assured rate for class 1:20. Still, I don't think this is a big deal. > But how must the rates of the leaves be signed? > What do you mean with 'signed'? > And how the bandwidth of these leaves will be distributed when > borrowing/lending is necessary? > > As far as I know, when a leaf is 'yellow', i.e. its rate is greater than its assured rate and lesser than its ceil rate, it can borrow from its parent providing there's a yellow-path to the root and the root is green (root can't be yellow, only green or red). If there's more than one child borrowing from the same class, they're served according to their priority (argument prio in *tc class add*). If there's more than one child having the same priority, then they're served in DRR order (Deficit Round Robin). You can tune DRR behaviour with arguments r2q in *tc qdisc add* and quantum in *tc class add*. > classs 1:10 will/may lend/borrow from class 1:20. I know that. > No it can not. A class can only borrow from its parent, never from its siblings. > But how about 1:1XX and classes 1:2XX? will the borrow/lend from each > others? > > ibidem. > Any docs about this? 
> > You may see: http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm http://luxik.cdi.cz/~devik/qos/htb/manual/theory.htm _______________________________________________ LARTC mailing list LARTC@mailman.ds9a.nl http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From marco.casaroli at gmail.com Wed Jun 6 22:58:01 2007 From: marco.casaroli at gmail.com (Marco Aurelio) Date: Wed Jun 6 22:58:04 2007 Subject: [LARTC] elementary usage clamping In-Reply-To: <1181158101.24243.14.camel@localhost> References: <1181104629.22239.25.camel@localhost> <92ed523b0706060842i1aa8783dl476aee52ad2e8e64@mail.gmail.com> <1181158101.24243.14.camel@localhost> Message-ID: <92ed523b0706061358o70c84f2bsc7498954c7a2a3e0@mail.gmail.com> On 6/6/07, Thomas Bushnell BSG wrote: > On Wed, 2007-06-06 at 12:42 -0300, Marco Aurelio wrote: > > use the HTB wondershaper that can be found at lartc.org > > Thanks for your reply. I looked at wondershaper, and I could not tell > from the documentation whether it actually limited the rate of packets > transmitted, and policed incoming packets, in a reliable fashion. What do you mean by reliable fashion? The upstream is hard limited by the kernel. So it is absolutely reliable. The data people send you (downstream) you cannot control directly. > > In other words, all the documentation I see is written as if it is > addressing the case of a residential customer with a bandwidth-limited > connection (cable modem, say), that has large queues, and arranges to > shape on the box instead of on the connection's queues, allowing for > better and more sensitive control. You can use it in your environment. The wondershaper limits your traffic a bit less than the link speed, for the packets to be queued in the kernel and not in the modem (hub, switch, etc), so you can reserve some resources for the real time traffic. In your case, the modems or hubs may almost never queue. Please tell me more about the limits of the provider. 
You say that they bill you if you use more than 1Mbps? I mean, this is
strange because they normally define a transfer quota (eg: 100GB per
month) and not a bandwidth limit. And also, what services are you
providing in this server?

>
> But it still seemed (from what I read) as if it tries to keep the pipe
> as full as possible, merely reordering packets carefully, in which case
> I'm sure to lose, because I *don't want* the pipe as full as possible; I
> want to dribble bits out the pipe to conform to the pricing I have
> agreed with my ISP.
>

You don't keep the pipe as full as possible all the time. Only when you
are sending more than the limit rate you specified in the script.

> Am I missing something?
>
> Thomas
>

--
Marco Casaroli
SapucaiNet Telecom
+55 35 34712377 ext 5

From GregScott at InfraSupportEtc.com Thu Jun 7 01:51:11 2007
From: GregScott at InfraSupportEtc.com (Greg Scott)
Date: Thu Jun 7 01:51:15 2007
Subject: [LARTC] What I learned about Linux bridging
Message-ID: <925A849792280C4E80C5461017A4B8A210B83E@mail733.InfraSupportEtc.com>

Here are some notes I have about Linux bridging. I'll try to separate
what I know I know from what I think I know.

Let's say I want to bridge eth0, eth1, and eth2 together, all with an IP
Address of, say, 1.2.3.2. This is how to do it:

echo "Setting up br0 to bridge eth0 with eth1 and eth2"
/usr/sbin/brctl addbr br0
/usr/sbin/brctl addif br0 eth0
/usr/sbin/brctl addif br0 eth1
/usr/sbin/brctl addif br0 eth2
/sbin/ip addr add 1.2.3.2/24 dev br0
/sbin/ip link set br0 up

Continuing with the above example, most of the writeups also say to
remove any IP Addresses from eth0, eth1, and eth2. But I've found this
doesn't seem necessary - well, sort of.

Let's say that eth0 is at IP Address 1.2.3.2, and now I bridge eth0,
eth1, and eth2 together and give bridge br0 the same IP Address of
1.2.3.2. Now I have a mess because both eth0 and br0 have the same IP
Address. Doing this:

ip addr del 1.2.3.2/24 dev eth0

cleans up the mess.
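
The address conflict described above can be spotted mechanically. What follows is only a minimal sketch, not something from the original post: the function name find_duplicate_addrs is made up for illustration, and it assumes the one-line-per-address layout that `ip -o -4 addr show` prints (interface name in the second field, address/prefix in the fourth).

```shell
#!/bin/sh
# Hypothetical helper (name and approach are assumptions, not from the post).
# Reads `ip -o -4 addr show` output on stdin and prints every IPv4 address
# that is configured on more than one interface, followed by the interfaces.
find_duplicate_addrs() {
    awk '
        $3 == "inet" {
            split($4, a, "/")                    # drop the prefix length
            ifaces[a[1]] = ifaces[a[1]] " " $2   # remember who carries it
            count[a[1]]++
        }
        END {
            for (addr in count)
                if (count[addr] > 1)
                    print addr ":" ifaces[addr]
        }
    '
}

# On a live box this would be fed from iproute2:
#   ip -o -4 addr show | find_duplicate_addrs
```

With eth0 and br0 both holding 1.2.3.2/24, the helper would report that address together with both interface names; once the address is deleted from eth0 it should print nothing.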
But let's say that physical interface eth1 has IP Address 10.0.0.1.
From testing, it looks like other systems can ping 10.0.0.1 just fine,
assuming they have a route to it. So I **think** I know that I can
assign an IP Address to a raw interface, as long as it's a different IP
Address than what I assigned to the overall bridge. But I haven't seen
this capability documented anywhere.

Let's say the bridge is up and working at IP Address 1.2.3.2. I have a
system at IP Address 1.2.3.1 connected via eth0. That system can ping
1.2.3.2 easily. If I disconnect the Ethernet cable from eth0 and plug
into eth1 or eth2, after about 30 seconds, that bridged system begins
answering pings again. As indicated in the writeups, that spare PC with
a bunch of NICs is now acting like a managed Ethernet switch. Cool!

Filtering

iptables is a super-sophisticated toolset to filter IP packets.
ebtables is another toolset to filter at the OSI layer 2 (datalink)
layer. iptables concerns itself (mostly) with routing across an IP
network, computer to computer. ebtables concerns itself (mostly) with
filtering packets across physical NIC interfaces in the same computer.

Here is a great writeup on using ebtables and iptables together:
http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html

But - like everything I've been able to find so far, I don't think this
writeup is completely accurate. iptables has a module called physdev.
According to the writeup, I can use the iptables physdev module to
filter among the raw interfaces in a bridge. But a discussion in the
netfilter list essentially says that physdev is being removed because it
creates all kinds of other problems. At least, I think that's what it
says. The relevant discussion took place in early July 2006. Here is a
pointer to the beginning of the discussion:
https://lists.netfilter.org/pipermail/netfilter-devel/2006-July/024896.html

So it looks like when filtering at the network layer, (IP in this case)
use iptables.
When filtering at the data link layer, use ebtables and maybe
arptables. Avoid using -m physdev in iptables because it's going away.
You can add IP Addresses to bridged eth-- interfaces as long as they
don't conflict with the bridge IP Address(es).

Next up will be to try some filtering scenarios with ebtables and
iptables.

- Greg Scott

From Jon.J.Flechsenhaar at boeing.com Thu Jun 7 02:24:16 2007
From: Jon.J.Flechsenhaar at boeing.com (Flechsenhaar, Jon J)
Date: Thu Jun 7 02:24:23 2007
Subject: [LARTC] KOM RSVP
Message-ID: <0E24ED2A7F9AA349A8633E6A56A64BE0027A830F@XCH-SW-2V1.sw.nos.boeing.com>

Does anyone on here have any thoughts on the changes that would be
necessary for KOM RSVP to use HTB rather than just CBQ and HFSC?

Jon Flechsenhaar
Boeing WNW Team
Network Services
(714)-762-1231
202-E7

From diegows at gmail.com Thu Jun 7 03:19:38 2007
From: diegows at gmail.com (Diego Woitasen)
Date: Thu Jun 7 03:19:43 2007
Subject: [LARTC] Wan optimizations with linux
Message-ID:

Hi,

I'm researching WAN optimizations with Linux. My network is an MPLS
network connecting 200 branches to a central site. I use Linux machines
to provide security with IPsec in the branches and at the central site.
Now I'm looking for techniques to optimize the links. My first ideas
were to use IPCOMP and a proxy to cache traffic of HTTP applications.
Does anybody have any other ideas for getting better utilization of WAN
links with Linux? I have seen commercial appliances that do dictionary
compression and other cool things, but I don't want to use them.
regards, diegows -- ------------------- Diego Woitasen ------------------- From gtaylor at riverviewtech.net Thu Jun 7 04:06:52 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 7 04:07:00 2007 Subject: [LARTC] What I learned about Linux bridging In-Reply-To: <925A849792280C4E80C5461017A4B8A210B83E@mail733.InfraSupportEtc.com> References: <925A849792280C4E80C5461017A4B8A210B83E@mail733.InfraSupportEtc.com> Message-ID: <4667683C.1030306@riverviewtech.net> On 6/6/2007 6:51 PM, Greg Scott wrote: > Continuing with the above example, most of the write ups also say to > remove any IP Addresses from eth0, eth1, and eth2. But I've found > this doesn't seem necessary - well, sort of. No, you don't have to remove IP addresses from the bridge member interfaces, though it does make things (IMHO) cleaner and consistent. You could put one IP address on each of the (physical) bridge member interfaces, or you could put the IP addresses (as aliases) on the bridge interface. Although I'm not sure how well EBTables would be able to filter addresses on a raw interface verses on the bridge interface. > Let's say that eth0 is at IP Address 1.2.3.2, and now I bridge eth0, > eth1, and eth2 together and give bridge br0 the same IP Address of > 1.2.3.2. Now I have a mess because both eth0 and br0 have the same > IP Address. Doing this: Yes, I would consider an IP address conflict a mess. > But let's say that physical interface eth1 has IP Address 10.0.0.1. > From testing, it looks like other systems can ping 10.0.0.1 just > fine, assuming they have a route to it. So I **think** I know that > I can assign an IP Address to a raw interface, as long as it's a > different IP Address than what I assigned to the overall bridge. But > I haven't seen this capability documented anywhere. Just because you can does not mean that you should. Most people that are working (read: using in production) bridging would not do so unless they had a need to do so. 
If you do not need bridging, generally it is not used. Thus if you can get by with putting the IP address(es) on the physical interfaces, then usually you will not be using bridging. This does not mean that it can not be done. > Let's say the bridge is up and working at IP Address 1.2.3.2. I have > a system at IP Address 1.2.3.1 connected via eth0. That system can > ping 1.2.3.2 easily. If I disconnect the Ethernet cable from eth0 > and plug into eth1 or eth2, after about 30 seconds, that bridged > system begins answering pings again. As indicated in the write ups, > that spare PC with a bunch of NICs is now acting like a managed > Ethernet switch. Cool! In Spanning Tree Protocol, this is considered the listening / learning delay to prevent loops in the network. This delay should be tunable (I'm not 100% certain of this fact). > iptables is a super-sophisticated toolset to filter IP packets. > ebtables is another toolset to filter at the OSI layer 2 (datalink) > layer. iptables concerns itself (mostly) with routing across an IP > network, computer to computer. ebtables concerns itself (mostly) > with filtering packets across physical NIC interfaces in the same > computer. Close, but not quite. (I'll preface this paragraph by saying that by default, this is how IPTables behaves, as it can be changed.) IPTables is a GREAT filtering framework that operates on OSI Layer 3 and higher (match extensions and what not). EBTables is a VERY GOOD filtering framework that operates on OSI Layer 2. I.e. one filters in the IP stack and the other filters in the bridging code (respectively). Granted both IPTables and EBTables can filter on some things like source / destination IP address, IP protocol, TCP / UDP port, MAC address, and a few other things. However IPTables has access to a LOT more information than EBTables does. Now, for the non default config. You can turn on "Bridge Netfilter" code in the kernel to allow the bridging code to use IPTables in addition to EBTables. 
What this means is that you can use all of IPTables matches extensions to make OSI Layer 3 and higher decisions on OSI Layer 2. I.e. you can have a bridging firewall do just about any thing that you want it to do. > Here is a great writeup on using ebtables and iptables together: > http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html But - like > everything I've been able to find so far, I don't think this writeup > is completely accurate. (It has been too long since I read that. Please refresh my memory.) What about that write up do you think is in-accurate? (I'll re-read the parts you point out.) I'll see if I agree with you or the write up and if so try to help explain why. > iptables has a module called physdev. According to the writeup, I > can use the iptables physdev module to filter among the raw > interfaces in a bridge. But a discussion in the netfilter list > essentially says that physdev is being removed because it creates all > kinds of other problems. At least, I think that's what it says. The > relevant discussion took place in early July 2006. Here is a pointer > to the beginning of the discussion: > https://lists.netfilter.org/pipermail/netfilter-devel/2006-July/024896.html > > So it looks like when filtering at the network layer, (IP in this > case) use iptables. When filtering at the data link layer, use > ebtables and maybe arptables. Avoid using -m physdev in iptables > because it's going away. You can add IP Addresses to bridged eth-- > interfaces as long as they don't conflict with the bridge IP > Address(es). With out reading too far in to it, it looks like the existing physdev match will be deprecated. I however do not think this will be a permanent thing. I believe that either a re-worked physdev, or something like it will re-appear in due time. Also keep in mind that there is a LOT of re-working going on in the later 2.6.x kernels, including renaming variables and changing the internal workings. 
(Too much so if you ask me for a .even release.) > Next up will be to try some filtering scenarios with ebtables and > iptables. ... Grant. . . . From gtaylor at riverviewtech.net Thu Jun 7 04:12:49 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 7 04:12:57 2007 Subject: [LARTC] Wan optimizations with linux In-Reply-To: References: Message-ID: <466769A1.8070703@riverviewtech.net> On 6/6/2007 8:19 PM, Diego Woitasen wrote: > I'm researching for WAN optimizations with linux. My network is > composed for MPLS network connecting 200 branches against a central > site. I use Linux machines to provide security with IPSEC in the > branches and in the central site. Now I'm lookup for techniques for > optimization the link. My first ideas was use IPCOMP and proxy to > cache traffic of HTTP applications. Somebody have any other idea to > get better utilization of WAN links with Linux? I have never been a really big fan of trying to compress traffic on one side of a link and decompress it on the other side. Something else to consider is what you are trying to optimize. Are you trying to get more bandwidth, or better latency, or what. If you are dealing with lots of long distance links, you may want to look in to playing with TCP window size(s) so that you can have more data in transit at one time with out pausing for acknowledgments. I would recommend that you use QoS to make sure that your priority traffic is put through first and that the rest is shared as evenly as possible or as you want it to be. Proxies, local replicating file servers, local store and forward email servers that can directly route messages, are all good things. With out knowing more about your configuration, I can't offer much more than general things to look at. Grant. . . . 
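
The TCP window suggestion above comes down to simple arithmetic: to keep a long link busy without stalling on acknowledgments, the window has to cover roughly bandwidth times round-trip time. Here is a small sketch of that calculation. The function name bdp_bytes and the example figures (a 2 Mbit/s branch link with 200 ms RTT) are assumptions for illustration, not numbers from this thread.

```shell
#!/bin/sh
# bdp_bytes RATE_BITS_PER_SEC RTT_MS
# Prints the bandwidth-delay product in bytes: rate * rtt / 8.
# (Hypothetical helper; integer arithmetic, so results are rounded down.)
bdp_bytes() {
    rate_bps=$1
    rtt_ms=$2
    echo $(( rate_bps * rtt_ms / 1000 / 8 ))
}

# Assumed example: 2 Mbit/s link, 200 ms round trip.
bdp_bytes 2000000 200    # prints 50000
```

A result like this is the sort of figure one would compare against the maximum values in the net.ipv4.tcp_rmem and net.ipv4.tcp_wmem sysctls. Whether raising them actually helps depends on the real link speed and latency, which the thread does not give.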
From GregScott at InfraSupportEtc.com Thu Jun 7 07:12:34 2007
From: GregScott at InfraSupportEtc.com (Greg Scott)
Date: Thu Jun 7 07:12:45 2007
Subject: [LARTC] What I learned about Linux bridging
Message-ID: <925A849792280C4E80C5461017A4B8A210B843@mail733.InfraSupportEtc.com>

> Although I'm not sure how well EBTables would be able
> to filter addresses on a raw interface verses on the
> bridge interface.

I ran some tests and it seemed to work. Here is a little test script:

#!/bin/sh
/firewall-scripts/allow-all
EBTABLES="/usr/local/sbin/ebtables"
IPTABLES="/sbin/iptables"
$EBTABLES -F
$EBTABLES -A INPUT -i eth0 -j DROP
$EBTABLES -A INPUT -i eth1 -j ACCEPT
$EBTABLES -A INPUT -i eth2 -j ACCEPT
#$IPTABLES -A FORWARD -i eth0 -j DROP
#$IPTABLES -A FORWARD -i eth1 -j ACCEPT
#$IPTABLES -A FORWARD -i eth2 -j ACCEPT
#$IPTABLES -A INPUT -i eth0 -j DROP
#$IPTABLES -A INPUT -i eth1 -j ACCEPT
#$IPTABLES -A INPUT -i eth2 -j ACCEPT
[root@IXFactor-fw gregs]#

The IP Address for br0 is 1.2.3.2. I setup a system at 1.2.3.1 and had
it ping 1.2.3.2. Everything behaved as expected. With bridging turned
on, none of the iptables rules made any difference, so I commented them
out. With the cable plugged into eth0, none of the pings replied.
Tcpdump on the 1.2.3.2 box didn't even show anything coming into br0.
Connecting to eth1 or eth2, pings replied and tcpdump showed the ICMP
going back and forth on br0. Setting br0 to 1.2.3.2 and eth1 to
10.0.0.1, my external host could ping both IP Addresses when I fudged in
appropriate routes on my test system.

Here are a couple of my challenges.

I want to block anything coming in from the Internet claiming to come
from 10/8, or 172.16/12 or 192.168/16. But anything from these address
blocks from any trusted eth-- interface should be OK. Easy to do with
pure routing, seems a little more challenging with bridging. I think I
can whip up some rules, but they depend on ebtables filtering by
physical eth-- port.
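
As a rough illustration of rules that filter by physical port, the sketch below only prints candidate ebtables commands instead of running them, so it can be reviewed before anything touches a live bridge. It assumes eth0 is the Internet-facing bridge port and uses the ebtables IPv4 match (-p IPv4 --ip-src). The function name print_antispoof_rules is invented for this example.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does not execute) one ebtables DROP
# rule per RFC 1918 range, for frames arriving on the given bridge port.
print_antispoof_rules() {
    iface=$1
    for chain in INPUT FORWARD; do
        for net in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16; do
            echo "ebtables -A $chain -i $iface -p IPv4 --ip-src $net -j DROP"
        done
    done
}

# Assumption: eth0 faces the Internet; eth1 and eth2 are trusted.
print_antispoof_rules eth0
```

Because the rules match on -i eth0 only, frames with private source addresses arriving on the trusted ports would not be touched. Piping the output to sh as root would apply the rules, but that step is deliberately left out here.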
I have a public /24 block, call it 1.2.3.nnn. Some of these are on eth0, some on eth1, a few others on eth2. Bridging seems to handle this nicely. eth0 faces the Internet. I am a little nervous in packaging all this. Let's say something goes haywire in startup, before I run the brctl stuff. Or let's say something bad happens and bridging goes haywire. If eth0 has no IP Address, I have no remote access to the box. So I'd like to at least give an IP Address to eth0. This seems to work, but like I said, it's not documented to work. It just hit me what's bugging me - it bugs me to give an IP Address to a logical device and cut off any kind of fallback access to the physical device. Maybe I'm looking for trouble that's not there and creating problems I don't need just because bridging is so new in my little world. I wonder what others have done in this situation? And then there's my PPTP clients. I give these guys a 10.0.0.xxx IP Address with a gateway of 10.0.0.1. This was all part of the eth1 LAN when it was a pure router. I suppose in this bridging setup, I could make 10.0.0.1 an alias for br0 and leave eth1 with no IP Address. This just takes a little getting used to I guess. - Greg From GregScott at InfraSupportEtc.com Thu Jun 7 07:18:23 2007 From: GregScott at InfraSupportEtc.com (Greg Scott) Date: Thu Jun 7 07:18:29 2007 Subject: [LARTC] Wan optimizations with linux Message-ID: <925A849792280C4E80C5461017A4B8A210B844@mail733.InfraSupportEtc.com> > I'm researching for WAN optimizations with linux. Are you doing lots of HTTP and SSL traffic to an internal website? If so, maybe it makes sense to put Squid on your remote sites. Squid is an open source web proxy and I know at least some of the commercial stuff depends on it. I've had good luck at a few sites doing Squid. 
- Greg Scott From gtaylor at riverviewtech.net Thu Jun 7 07:50:04 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 7 07:50:28 2007 Subject: [LARTC] What I learned about Linux bridging In-Reply-To: <925A849792280C4E80C5461017A4B8A210B843@mail733.InfraSupportEtc.com> References: <925A849792280C4E80C5461017A4B8A210B843@mail733.InfraSupportEtc.com> Message-ID: <46679C8C.4090202@riverviewtech.net> On 6/7/2007 12:12 AM, Greg Scott wrote: > The IP Address for br0 is 1.2.3.2. I setup a system at 1.2.3.1 and > had it ping 1.2.3.2. Everything behaved as expected. With bridging > turned on, none of the iptables rules made any difference, so I > commented them out. With the cable plugged into eth0, none of the > pings replied. Tcpdump on the 1.2.3.2 box didn't even show anything > coming into br0. Connecting to eth1 or eth2, pings replied and > tcpdump showed the ICMP going back and forth on br0. Setting br0 to > 1.2.3.2 and eth1 to 10.0.0.1, my external host could ping both IP > Addresses when I fudged in appropriate routes on my test system. Ok, good to know. The only other thing that I can think of why you might bind your IP address(es) to the physical interface verses the bridge interface is if you were not wanting to bridge IP, but some other non routeable protocol, say NetBEUI. (But why would you want to do that???) In this scenario, it would be perfectly normal to use a bridging router to bridge any thing but IP and route IP. > Here are a couple of my challenges. > > I want to block anything coming in from the Internet claiming to come > from 10/8, or 172.16/12 or 192.168/16. But anything from these > address blocks from any trusted eth-- interface should be OK. Easy > to do with pure routing, seems a little more challenging with > bridging. I think I can whip up some rules, but they depend on > ebtables filtering by physical eth-- port. You should be able to easily do this with a few EBTables rules. 
Just check the source IP of the packets as they come in eth0.

ebtables -t filter -N bogonsFromNet -P RETURN
ebtables -t filter -A bogonsFromNet -p IPv4 --ip-src 10.0.0.0/8 -j DROP
ebtables -t filter -A bogonsFromNet -p IPv4 --ip-src 172.16.0.0/12 -j DROP
ebtables -t filter -A bogonsFromNet -p IPv4 --ip-src 192.168.0.0/16 -j DROP
ebtables -t filter -A INPUT -i eth0 -j bogonsFromNet
ebtables -t filter -A FORWARD -i eth0 -j bogonsFromNet

(Note: This is untested.)

If I have things correct, this should be a simple chain that checks the
source IP and drops the packets if they match. (In ebtables a bare -s
matches the source MAC address, so the IP source has to be matched with
-p IPv4 --ip-src.) The (new) chain itself has a default policy of
RETURN to return back to the chain that jumped to it.

You might want to expand your bogons list a bit more to include some
more forbidden networks, e.g. Test Net 192.0.2.0/24... RFC 3330 is your
friend in this matter.

> I have a public /24 block, call it 1.2.3.nnn. Some of these are on
> eth0, some on eth1, a few others on eth2. Bridging seems to handle
> this nicely. eth0 faces the Internet. I am a little nervous in
> packaging all this. Let's say something goes haywire in startup,
> before I run the brctl stuff. Or let's say something bad happens and
> bridging goes haywire. If eth0 has no IP Address, I have no remote
> access to the box. So I'd like to at least give an IP Address to
> eth0. This seems to work, but like I said, it's not documented to
> work. It just hit me what's bugging me - it bugs me to give an IP
> Address to a logical device and cut off any kind of fallback access
> to the physical device. Maybe I'm looking for trouble that's not
> there and creating problems I don't need just because bridging is so
> new in my little world. I wonder what others have done in this
> situation?

I don't think you are being too paranoid, but I think you should be
aware of something I have run into in my own testing. If I have eth0 up
with an IP address w.x.y.z and I am connected to it (via ssh) and I
enslave it to a bridge (bri0) (I like 3 letter / 1 number device names)
my connection goes dead.
I have to try reconnecting or log in to console and move the IP to the bridge. Well, at least I think that's what I have to do, it's been too long sense I last did it and I don't remember the details. You may want to play with this. The reason that I bring it up is that I think an IP address on the ether interface is different than an IP address on the enslaved ether interface. Your mileage may vary. What I'm worried about is that if the bridging code is seeing the traffic before the system's IP stack sees it (supported by the fact that you can use EBTables to filter it), you are still passing through the bridging code even with the IP address on the enslaved ether interface. I have personally used bridging extensively for the past 4+ years and I've been ABSOLUTELY THRILLED with it. I have some systems with four interfaces bridged together with all IPs on the bri0 interface. I have another system that is 802.1q trunking to a Layer 2 managed switch with 25+ VLAN interfaces on the trunk. I'm then bridging the 25+ VLANs plus the enslaved ether interface in one bridge with the IP bound to bri0. I then have EBTables dividing up what access each VLAN has. Specifically, each VLAN can communicate with the bri0 interface and the bri0 interface can communicate with all VLANs, but the VLANs can not communicate with each other. I even threw in ARP tables and EBTables to masquerade MAC addresses b/c the managed switch that I was working with could only see any given MAC on one VLAN. Let's just say this was fun to set up. In the end, the system has been up and running with out problems at multi megabit speeds for 3+ years. > And then there's my PPTP clients. I give these guys a 10.0.0.xxx IP > Address with a gateway of 10.0.0.1. This was all part of the eth1 > LAN when it was a pure router. I suppose in this bridging setup, I > could make 10.0.0.1 an alias for br0 and leave eth1 with no IP > Address. This just takes a little getting used to I guess. 
I don't think you should be scared of bridging and / or EBTables. With
them you can do a LOT of things you could not do otherwise.

Grant. . . .

From GregScott at InfraSupportEtc.com Thu Jun 7 08:55:47 2007
From: GregScott at InfraSupportEtc.com (Greg Scott)
Date: Thu Jun 7 08:56:05 2007
Subject: [LARTC] What I learned about Linux bridging
Message-ID: <925A849792280C4E80C5461017A4B8A210B845@mail733.InfraSupportEtc.com>

>> I want to block anything coming in from the Internet claiming to come
>> from 10/8, or 172.16/12 or 192.168/16.

> You should be able to easily do this with a few EBTables rules.

Yup - I was off putting together something similar to what you did and
just now saw your reply. It all tested good. I'll paste it in below:

#
# ebtables rules for bridging
#
echo "ebtables rules"
echo " Directing anything between the Internet and private IP Addresses to the bogus_ip chain"
$EBTABLES -A FORWARD -i $INET_IFACE -p IPv4 --ip-src 10.0.0.0/8 -j bogus_ip
$EBTABLES -A FORWARD -i $INET_IFACE -p IPv4 --ip-src 172.16.0.0/12 -j bogus_ip
$EBTABLES -A FORWARD -i $INET_IFACE -p IPv4 --ip-src 192.168.0.0/16 -j bogus_ip
$EBTABLES -A FORWARD -i $INET_IFACE -p IPv4 --ip-dst 10.0.0.0/8 -j bogus_ip
$EBTABLES -A FORWARD -i $INET_IFACE -p IPv4 --ip-dst 172.16.0.0/12 -j bogus_ip
$EBTABLES -A FORWARD -i $INET_IFACE -p IPv4 --ip-dst 192.168.0.0/16 -j bogus_ip
$EBTABLES -A INPUT -i $INET_IFACE -p IPv4 --ip-src 10.0.0.0/8 -j bogus_ip
$EBTABLES -A INPUT -i $INET_IFACE -p IPv4 --ip-src 172.16.0.0/12 -j bogus_ip
$EBTABLES -A INPUT -i $INET_IFACE -p IPv4 --ip-src 192.168.0.0/16 -j bogus_ip
$EBTABLES -A INPUT -i $INET_IFACE -p IPv4 --ip-dst 10.0.0.0/8 -j bogus_ip
$EBTABLES -A INPUT -i $INET_IFACE -p IPv4 --ip-dst 172.16.0.0/12 -j bogus_ip
$EBTABLES -A INPUT -i $INET_IFACE -p IPv4 --ip-dst 192.168.0.0/16 -j bogus_ip
#
# Set up the bogus_ip chain to log and drop packets to/from private IP addresses
#
echo "Setting up the bogus_ip chain to LOG and DROP spoofed packets"
$EBTABLES -A bogus_ip --log-prefix " spoofed packet"
$EBTABLES -A bogus_ip -j DROP

I might see if I can do this with one set of rules in the PREROUTING
chain.

> I don't think you are being too paranoid, but
> I think you should be aware of something I have
> run in to in my own testing. . . .

Yup - I noticed similar behavior. So this is how I'll handle it. eth0
gets an IP Address of 1.2.3.6 during normal bootup. And then when I do
the brctl stuff, br0 gets 1.2.3.2. That way there's never a conflict
between the physical and logical. All the physical interfaces have
unique addresses so I can route based on the address when it makes
sense, or bridge based on the interface when that makes sense. I'm
feeling lots better about all this. Hopefully this discussion can help
others out there.

- Greg

From alchemyx at uznam.net.pl Thu Jun 7 11:07:01 2007
From: alchemyx at uznam.net.pl (Michał Margula)
Date: Thu Jun 7 11:06:32 2007
Subject: [LARTC] Multipath routing
In-Reply-To: <20070605194601.GB25580@packetconsulting.pl>
References: <46652950.8000701@uznam.net.pl> <20070605194601.GB25580@packetconsulting.pl>
Message-ID: <4667CAB5.5020308@uznam.net.pl>

Piotr Chytla wrote:
> First of all equal cost multipathing is evil ;>, It simply doesn't work for packets in
> forwarding path besides support in kernel is not maintained
>
> Realy if you want load balance both uplinks disable
> CONFIG_IP_ROUTE_MULTIPATH_CACHED and you will have random traffic
> distribiution between both links.
>

Unfortunately it still doesn't work as expected. When I ping some host
it always goes through one nexthop. It does per-destination load
balancing, I am afraid.

--
Michał Margula, alchemyx@uznam.net.pl, http://alchemyx.uznam.net.pl/
"W życiu piękne są tylko chwile" [Ryszard Riedel]

From WBohannan at spidersat.com.gh Thu Jun 7 15:21:40 2007
From: WBohannan at spidersat.com.gh (William Bohannan)
Date: Thu Jun 7 15:22:01 2007
Subject: [LARTC] HTB - Setting up guaranteed minimum rate for a leaf
Message-ID: <4D411FB02758FE45915E9724339093F63220DC@intranet.scpl.local>

Hi

I am currently trying to set up a guaranteed minimum rate for the leaf
(1:1x). Also, would I be correct in saying that the quantum is the
dividing rule (so if I keep it the same "1532" and keep all the leafs in
"1:1x" at prio 3, they should all get the same amount of bandwidth,
shared equally across them)? For example, below, would the "rate" in the
"1:1x" leaf be the minimum rate for that leaf? And what would happen if
there were three leafs "1:10", "1:11", "1:12" all using 300Kbit as their
rate: would the bandwidth be shared equally among them even though the
sum is greater than the "1:1" root rate of 600Kbit?

1:
  1:1 (600Kbit)
    1:10              1:11              1:12              etc...
      1:1001,1002...    1:2001,2002...    1:3001,3002...    etc...

# setting up the main root 1:1 (600Kbit)
/sbin/tc class add dev eth1 parent 1: classid 1:1 htb rate 600Kbit

# setting up leafs 1:1x
/sbin/tc class add dev eth1 parent 1:1 classid 1:1x htb rate xxxxkbit ceil xxxxkbit prio 3 quantum 1532

# setting up leafs 1:xxxx
/sbin/tc class add dev eth1 parent 1:11 classid 1:xxxx htb rate xxxxKbit ceil xxxxKbit prio x quantum 1532
/sbin/tc qdisc add dev eth1 handle xxxx: parent 1:xxxx sfq

Kind Regards
William Bohannan

From gtaylor at riverviewtech.net Thu Jun 7 16:21:31 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Thu Jun 7 16:19:42 2007
Subject: [LARTC] What I learned about Linux bridging
In-Reply-To: <925A849792280C4E80C5461017A4B8A210B845@mail733.InfraSupportEtc.com>
References: <925A849792280C4E80C5461017A4B8A210B845@mail733.InfraSupportEtc.com>
Message-ID: <4668146B.1060403@riverviewtech.net>

On 06/07/07 01:55, Greg Scott wrote:
> I might see if I can do this with one set of rules in the PREROUTING
> chain.
That is exactly why I put the rules in a new chain and jumped to the chain. One place to maintain them. With the ability to jump to the chain and then return to the chain that you jumped from, things act more like a sub-routine in programming. - Create new chains with a default policy of RETURN. - DROP on failures with in the new chain. - Default policy of RETURN will return packet flow to the calling chain. - Jump to the new chain from INPUT and return from it if everything is ok. - Jump to the new chain from FORWARD and return from it if everything is ok. In fact, you could use the new chain as a (sub)chain from any chain with in the filter table. > Yup - I noticed similar behavior. So this is how I'll handle it. > eth0 gets an IP Address of 1.2.3.6 during normal bootup. And then > when I do the brctl stuff, br0 gets 1.2.3.2. That way there's never > a conflict between the physical and logical. All the physical > interfaces have unique addresses so I can route based on the address > when it makes sense, or bridge based on the interface when that makes > sense. I'm feeling lots better about all this. Hopefully this > discussion can help others out there. This is one way to do it. Just be careful of race conditions while you are bringing things up. Grant. . . . From gtaylor at riverviewtech.net Thu Jun 7 22:25:07 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 7 22:23:16 2007 Subject: [LARTC] Wan optimizations with linux In-Reply-To: References: <466769A1.8070703@riverviewtech.net> Message-ID: <466869A3.4030103@riverviewtech.net> On 06/07/07 09:31, Diego Woitasen wrote: > My configuration is very simple, the branches only have HTTP and VOIP > traffc agaisnt central site. My question is about general tools to > optimize WAN traffic. Short of an HTTP proxy that uses compression between it and its parent proxy(s) and proper codecs, I can't think of any thing else off hand to try to help. 
I suppose you could use PPP across the link and try deflate compression... But in my opinion the PPP encapsulation would just make things nasty.

Grant. . . .

From sauloaugustosilva at gmail.com Fri Jun 8 19:26:46 2007
From: sauloaugustosilva at gmail.com (Saulo Silva)
Date: Fri Jun 8 19:27:03 2007
Subject: [LARTC] CBQ + Layer7 x Emule
Message-ID: <3ddff6900706081026h2f2c763al9bc7b272b86cb75c@mail.gmail.com>

Hi All,

This is my first message. I have a little problem with my FC6 box: I am trying to block emule traffic using layer7.

Here is my network:

Internet --------- ADSL Router ------------------- FC6 Box -------------------- Emule Box

external ADSL : Dynamic
Internal ADSL : 192.168.254.1

external FC6 : 192.168.254.3
internal FC6 : 192.168.253.1

Emule Box : 192.168.253.3

I guess that everything is OK with layer7. Here are my mangle rules:

# iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK --set-mark 2
# iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG --log-prefix "PREROUTING MARK : "

iptables -t mangle -A FORWARD -mlayer7 --l7proto edonkey -j MARK --set-mark 2
iptables -t mangle -A FORWARD -m mark --mark 2 -j LOG --log-prefix "FORWARD MARK : "

The output from the log is:

Jun 8 14:18:46 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC=203.91.83.127 DST=192.168.253.3 LEN=180 TOS=0x00 PREC=0x00 TTL=105 ID=18725 PROTO=TCP SPT=51674 DPT=4662 WINDOW=16944 RES=0x00 ACK PSH URGP=0
Jun 8 14:18:48 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 SRC=200.209.170.138 DST=192.168.254.3 LEN=139 TOS=0x00 PREC=0x00 TTL=115 ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH URGP=0
Jun 8 14:18:48 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC=200.209.170.138 DST=192.168.253.3 LEN=139 TOS=0x00 PREC=0x00 TTL=114 ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH URGP=0
Jun 8 14:18:51 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00
SRC=200.244.104.10 DST=192.168.254.3 LEN=40 TOS=0x00 PREC=0x00 TTL=117 ID=7042 PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0
Jun 8 14:18:51 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC=200.244.104.10 DST=192.168.253.3 LEN=40 TOS=0x00 PREC=0x00 TTL=116 ID=7042 PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0

So it looks like the mark is working.

Now I use the cbq.init script with this configuration:

cat /etc/sysconfig/cbq/cbq-0002.emule_in
DEVICE=eth0,100Mbit,10Mbit
RATE=3Kbit
WEIGHT=1Kbit
PRIO=5
BOUNDED=yes
ISOLATED=yes
MARK=2

cat /etc/sysconfig/cbq/cbq-0002.emule_out
DEVICE=eth1,100Mbit,10Mbit
RATE=3Kbit
WEIGHT=1Kbit
PRIO=5
BOUNDED=yes
ISOLATED=yes
MARK=2

which generates these tc commands:

/sbin/tc qdisc add dev eth0 root handle 1 cbq bandwidth 100Mbit avpkt 3000 cell 8
/sbin/tc class change dev eth0 root cbq weight 10Mbit allot 1514

/sbin/tc qdisc del dev eth1 root
/sbin/tc qdisc add dev eth1 root handle 1 cbq bandwidth 100Mbit avpkt 3000 cell 8
/sbin/tc class change dev eth1 root cbq weight 10Mbit allot 1514

/sbin/tc class add dev eth0 parent 1: classid 1:2 cbq bandwidth 100Mbit rate 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 bounded isolated
/sbin/tc qdisc add dev eth0 parent 1:2 handle 2 tbf rate 3Kbit buffer 10Kb/8 limit 15Kb mtu 1500
/sbin/tc filter add dev eth0 parent 1:0 protocol ip prio 200 handle 2 fw classid 1:2

/sbin/tc class add dev eth1 parent 1: classid 1:2 cbq bandwidth 100Mbit rate 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 bounded isolated
/sbin/tc qdisc add dev eth1 parent 1:2 handle 2 tbf rate 3Kbit buffer 10Kb/8 limit 15Kb mtu 1500
/sbin/tc filter add dev eth1 parent 1:0 protocol ip prio 200 handle 2 fw classid 1:2

Can anyone explain to me what is wrong? Why can't I shape this traffic?

Any help will be appreciated.

Best Regards,
Saulo Silva

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070608/475e1129/attachment.html

From marco.casaroli at gmail.com Fri Jun 8 20:50:35 2007
From: marco.casaroli at gmail.com (Marco Aurelio)
Date: Fri Jun 8 20:50:44 2007
Subject: [LARTC] CBQ + Layer7 x Emule
In-Reply-To: <3ddff6900706081026h2f2c763al9bc7b272b86cb75c@mail.gmail.com>
References: <3ddff6900706081026h2f2c763al9bc7b272b86cb75c@mail.gmail.com>
Message-ID: <92ed523b0706081150s4bba1854w4a3d73c78a9dbbf7@mail.gmail.com>

l7's edonkey filter does not match all edonkey traffic: it does not match the data packets (which are what you want to shape). It does, however, match the signaling packets, which can be related to the data connections.

I have never tried L7, but I think these may help you:

iptables -t mangle -A PREROUTING -p tcp -j CONNMARK --restore-mark
iptables -t mangle -A PREROUTING -p tcp -m mark ! --mark 0 -j ACCEPT
iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK --set-mark 2
iptables -t mangle -A PREROUTING -p tcp -m mark --mark 2 -j CONNMARK --save-mark

On 6/8/07, Saulo Silva wrote:
> Hi All ,
>
> My first message and I have a little problem with my FC6 box trying to block
> emule traffic using layer7 .
>
> Here my network :
>
> Internet --------- ADSL Router ------------------- FC6 Box -------------------- Emule Box
>
> external ADSL : Dynamic
> Internal ADSL : 192.168.254.1
>
> external FC6 : 192.168.254.3
> internal FC6 : 192.168.253.1
>
> Emule Box : 192.168.253.3
>
> I guess that everything is ok with layer7 . Here my mangle rules .
> > # iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK > --set-mark 2 > # iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG --log-prefix > "PREROUTING MARK : " > > > iptables -t mangle -A FORWARD -mlayer7 --l7proto edonkey -j MARK --set-mark > 2 > iptables -t mangle -A FORWARD -m mark --mark 2 -j LOG --log-prefix "FORWARD > MARK : " > > The output from log is : > > Jun 8 14:18:46 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 > SRC=203.91.83.127 DST=192.168.253.3 LEN=180 TOS=0x00 PREC=0x00 TTL=105 > ID=18725 PROTO=TCP SPT=51674 DPT=4662 WINDOW=16944 RES=0x00 ACK PSH URGP=0 > > Jun 8 14:18:48 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 > SRC=200.209.170.138 DST=192.168.254.3 LEN=139 TOS=0x00 PREC=0x00 TTL=115 > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH URGP=0 > Jun 8 14:18:48 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC= > 200.209.170.138 DST=192.168.253.3 LEN=139 TOS=0x00 PREC=0x00 TTL=114 > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH URGP=0 > > Jun 8 14:18:51 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 SRC= > 200.244.104.10 DST=192.168.254.3 LEN=40 TOS=0x00 PREC=0x00 TTL=117 ID=7042 > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > Jun 8 14:18:51 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC= > 200.244.104.10 DST=192.168.253.3 LEN=40 TOS=0x00 PREC=0x00 TTL=116 ID=7042 > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > So it's look like mark is working . 
> > So now I use the cbq.init script with that configuration : > > cat /etc/sysconfig/cbq/cbq-0002.emule_in > > DEVICE=eth0,100Mbit,10Mbit > RATE=3Kbit > WEIGHT=1Kbit > PRIO=5 > BOUNDED=yes > ISOLATED=yes > MARK=2 > > cat /etc/sysconfig/cbq/cbq-0002.emule_out > DEVICE=eth1,100Mbit,10Mbit > RATE=3Kbit > WEIGHT=1Kbit > PRIO=5 > BOUNDED=yes > ISOLATED=yes > MARK=2 > > that generate this tc codes . > > /sbin/tc qdisc add dev eth0 root handle 1 cbq bandwidth 100Mbit avpkt 3000 > cell 8 > /sbin/tc class change dev eth0 root cbq weight 10Mbit allot 1514 > > /sbin/tc qdisc del dev eth1 root > /sbin/tc qdisc add dev eth1 root handle 1 cbq bandwidth 100Mbit avpkt 3000 > cell 8 > /sbin/tc class change dev eth1 root cbq weight 10Mbit allot 1514 > > /sbin/tc class add dev eth0 parent 1: classid 1:2 cbq bandwidth 100Mbit rate > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 bounded > isolated > /sbin/tc qdisc add dev eth0 parent 1:2 handle 2 tbf rate 3Kbit buffer 10Kb/8 > limit 15Kb mtu 1500 > /sbin/tc filter add dev eth0 parent 1:0 protocol ip prio 200 handle 2 fw > classid 1:2 > > /sbin/tc class add dev eth1 parent 1: classid 1:2 cbq bandwidth 100Mbit rate > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 bounded > isolated > /sbin/tc qdisc add dev eth1 parent 1:2 handle 2 tbf rate 3Kbit buffer 10Kb/8 > limit 15Kb mtu 1500 > /sbin/tc filter add dev eth1 parent 1:0 protocol ip prio 200 handle 2 fw > classid 1:2 > > Can anyone explain me what is wrong . Why I cannot shape this traffic ???? > > Any help will be appreciated . 
> > Best Regards , > > Saulo Silva > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > -- Marco Casaroli SapucaiNet Telecom +55 35 34712377 ext 5 From sauloaugustosilva at gmail.com Sat Jun 9 02:55:41 2007 From: sauloaugustosilva at gmail.com (Saulo Silva) Date: Sat Jun 9 02:55:49 2007 Subject: [LARTC] CBQ + Layer7 x Emule In-Reply-To: <92ed523b0706081150s4bba1854w4a3d73c78a9dbbf7@mail.gmail.com> References: <3ddff6900706081026h2f2c763al9bc7b272b86cb75c@mail.gmail.com> <92ed523b0706081150s4bba1854w4a3d73c78a9dbbf7@mail.gmail.com> Message-ID: <3ddff6900706081755u70c1632auf6df365879c7655a@mail.gmail.com> HI Marcos , I tried your rules, but without success . Thank for that help . And , how about ip2pp ? Is this application could do that ? Help me to shape edonkey traffic ??? Best Regards, Saulo Silva 2007/6/8, Marco Aurelio : > > l7's edonkey filter does not match all edonkey traffic, it does not > match data packets (that you want to shape). It matches however the > signaling packets that can be related to data connections. > > I never tried L7 but I think these may help you > > iptables -t mangle -A PREROUTING -p tcp -j CONNMARK --restore-mark > iptables -t mangle -A PREROUTING -p tcp -m mark ! --mark 0 -j ACCEPT > iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK > --set-mark 2 > iptables -t mangle -A PREROUTING -p tcp -m mark --mark 2 -j CONNMARK > --save-mark > > > On 6/8/07, Saulo Silva wrote: > > Hi All , > > > > My first message and I have a little problem with my FC6 box trying to > block > > emule traffic using layer7 . 
> > > > Here my network : > > > > Internet --------- ADSL Router ------------------- FC6 Box > > -------------------- Emule Box > > > > external ADSL : Dynamic > > Internal ADSL : 192.168.254.1 > > > > external FC6 : 192.168.254.3 > > internal FC6 : 192.168.253.1 > > > > Emule Box : 192.168.253.3 > > > > I guess that everything is ok with layer7 . Here my mangle rules . > > > > # iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK > > --set-mark 2 > > # iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG --log-prefix > > "PREROUTING MARK : " > > > > > > iptables -t mangle -A FORWARD -mlayer7 --l7proto edonkey -j MARK > --set-mark > > 2 > > iptables -t mangle -A FORWARD -m mark --mark 2 -j LOG --log-prefix > "FORWARD > > MARK : " > > > > The output from log is : > > > > Jun 8 14:18:46 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 > > SRC=203.91.83.127 DST=192.168.253.3 LEN=180 TOS=0x00 PREC=0x00 TTL=105 > > ID=18725 PROTO=TCP SPT=51674 DPT=4662 WINDOW=16944 RES=0x00 ACK PSH > URGP=0 > > > > Jun 8 14:18:48 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 > > SRC=200.209.170.138 DST=192.168.254.3 LEN=139 TOS=0x00 PREC=0x00 TTL=115 > > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH > URGP=0 > > Jun 8 14:18:48 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC= > > 200.209.170.138 DST=192.168.253.3 LEN=139 TOS=0x00 PREC=0x00 TTL=114 > > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH > URGP=0 > > > > Jun 8 14:18:51 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 SRC= > > 200.244.104.10 DST=192.168.254.3 LEN=40 TOS=0x00 PREC=0x00 TTL=117 > ID=7042 > > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > > > Jun 8 14:18:51 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC= > > 200.244.104.10 DST=192.168.253.3 LEN=40 TOS=0x00 PREC=0x00 TTL=116 > ID=7042 > > PROTO=TCP SPT=50675 DPT=4662 
WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > > > So it's look like mark is working . > > > > So now I use the cbq.init script with that configuration : > > > > cat /etc/sysconfig/cbq/cbq-0002.emule_in > > > > DEVICE=eth0,100Mbit,10Mbit > > RATE=3Kbit > > WEIGHT=1Kbit > > PRIO=5 > > BOUNDED=yes > > ISOLATED=yes > > MARK=2 > > > > cat /etc/sysconfig/cbq/cbq-0002.emule_out > > DEVICE=eth1,100Mbit,10Mbit > > RATE=3Kbit > > WEIGHT=1Kbit > > PRIO=5 > > BOUNDED=yes > > ISOLATED=yes > > MARK=2 > > > > that generate this tc codes . > > > > /sbin/tc qdisc add dev eth0 root handle 1 cbq bandwidth 100Mbit avpkt > 3000 > > cell 8 > > /sbin/tc class change dev eth0 root cbq weight 10Mbit allot 1514 > > > > /sbin/tc qdisc del dev eth1 root > > /sbin/tc qdisc add dev eth1 root handle 1 cbq bandwidth 100Mbit avpkt > 3000 > > cell 8 > > /sbin/tc class change dev eth1 root cbq weight 10Mbit allot 1514 > > > > /sbin/tc class add dev eth0 parent 1: classid 1:2 cbq bandwidth 100Mbit > rate > > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 > bounded > > isolated > > /sbin/tc qdisc add dev eth0 parent 1:2 handle 2 tbf rate 3Kbit buffer > 10Kb/8 > > limit 15Kb mtu 1500 > > /sbin/tc filter add dev eth0 parent 1:0 protocol ip prio 200 handle 2 fw > > classid 1:2 > > > > /sbin/tc class add dev eth1 parent 1: classid 1:2 cbq bandwidth 100Mbit > rate > > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 > bounded > > isolated > > /sbin/tc qdisc add dev eth1 parent 1:2 handle 2 tbf rate 3Kbit buffer > 10Kb/8 > > limit 15Kb mtu 1500 > > /sbin/tc filter add dev eth1 parent 1:0 protocol ip prio 200 handle 2 fw > > classid 1:2 > > > > Can anyone explain me what is wrong . Why I cannot shape this traffic > ???? > > > > Any help will be appreciated . 
> > > > Best Regards , > > > > Saulo Silva > > > > _______________________________________________ > > LARTC mailing list > > LARTC@mailman.ds9a.nl > > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > > > > > > -- > Marco Casaroli > SapucaiNet Telecom > +55 35 34712377 ext 5 > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070608/c68a9e48/attachment.htm From salatiel.filho at gmail.com Sat Jun 9 03:04:12 2007 From: salatiel.filho at gmail.com (Salatiel Filho) Date: Sat Jun 9 03:04:19 2007 Subject: [LARTC] CBQ + Layer7 x Emule In-Reply-To: <3ddff6900706081755u70c1632auf6df365879c7655a@mail.gmail.com> References: <3ddff6900706081026h2f2c763al9bc7b272b86cb75c@mail.gmail.com> <92ed523b0706081150s4bba1854w4a3d73c78a9dbbf7@mail.gmail.com> <3ddff6900706081755u70c1632auf6df365879c7655a@mail.gmail.com> Message-ID: On 6/8/07, Saulo Silva wrote: > > HI Marcos , > > I tried your rules, but without success . Thank for that help . > And , how about ip2pp ? Is this application could do that ? Help me to > shape edonkey traffic ??? > > Best Regards, > > Saulo Silva > > 2007/6/8, Marco Aurelio : > > > > l7's edonkey filter does not match all edonkey traffic, it does not > > match data packets (that you want to shape). It matches however the > > signaling packets that can be related to data connections. > > > > I never tried L7 but I think these may help you > > > > iptables -t mangle -A PREROUTING -p tcp -j CONNMARK --restore-mark > > iptables -t mangle -A PREROUTING -p tcp -m mark ! --mark 0 -j ACCEPT > > iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK > > --set-mark 2 > > iptables -t mangle -A PREROUTING -p tcp -m mark --mark 2 -j CONNMARK > > --save-mark > > > > > > On 6/8/07, Saulo Silva wrote: > > > Hi All , > > > > > > My first message and I have a little problem with my FC6 box trying to > > block > > > emule traffic using layer7 . 
> > > > > > Here my network : > > > > > > Internet --------- ADSL Router ------------------- FC6 Box > > > -------------------- Emule Box > > > > > > external ADSL : Dynamic > > > Internal ADSL : 192.168.254.1 > > > > > > external FC6 : 192.168.254.3 > > > internal FC6 : 192.168.253.1 > > > > > > Emule Box : 192.168.253.3 > > > > > > I guess that everything is ok with layer7 . Here my mangle rules . > > > > > > # iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK > > > --set-mark 2 > > > # iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG > > --log-prefix > > > "PREROUTING MARK : " > > > > > > > > > iptables -t mangle -A FORWARD -mlayer7 --l7proto edonkey -j MARK > > --set-mark > > > 2 > > > iptables -t mangle -A FORWARD -m mark --mark 2 -j LOG --log-prefix > > "FORWARD > > > MARK : " > > > > > > The output from log is : > > > > > > Jun 8 14:18:46 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 > > > SRC=203.91.83.127 DST=192.168.253.3 LEN=180 TOS=0x00 PREC=0x00 TTL=105 > > > ID=18725 PROTO=TCP SPT=51674 DPT=4662 WINDOW=16944 RES=0x00 ACK PSH > > URGP=0 > > > > > > Jun 8 14:18:48 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > > > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 > > > SRC=200.209.170.138 DST=192.168.254.3 LEN=139 TOS=0x00 PREC=0x00 > > TTL=115 > > > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH > > URGP=0 > > > Jun 8 14:18:48 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC= > > > 200.209.170.138 DST=192.168.253.3 LEN=139 TOS=0x00 PREC=0x00 TTL=114 > > > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH > > URGP=0 > > > > > > Jun 8 14:18:51 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > > > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 SRC= > > > 200.244.104.10 DST=192.168.254.3 LEN=40 TOS=0x00 PREC=0x00 TTL=117 > > ID=7042 > > > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > > > > > Jun 8 14:18:51 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC= 
> > > 200.244.104.10 DST=192.168.253.3 LEN=40 TOS=0x00 PREC=0x00 TTL=116 > > ID=7042 > > > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > > > > > So it's look like mark is working . > > > > > > So now I use the cbq.init script with that configuration : > > > > > > cat /etc/sysconfig/cbq/cbq- 0002.emule_in > > > > > > DEVICE=eth0,100Mbit,10Mbit > > > RATE=3Kbit > > > WEIGHT=1Kbit > > > PRIO=5 > > > BOUNDED=yes > > > ISOLATED=yes > > > MARK=2 > > > > > > cat /etc/sysconfig/cbq/cbq-0002.emule_out > > > DEVICE=eth1,100Mbit,10Mbit > > > RATE=3Kbit > > > WEIGHT=1Kbit > > > PRIO=5 > > > BOUNDED=yes > > > ISOLATED=yes > > > MARK=2 > > > > > > that generate this tc codes . > > > > > > /sbin/tc qdisc add dev eth0 root handle 1 cbq bandwidth 100Mbit avpkt > > 3000 > > > cell 8 > > > /sbin/tc class change dev eth0 root cbq weight 10Mbit allot 1514 > > > > > > /sbin/tc qdisc del dev eth1 root > > > /sbin/tc qdisc add dev eth1 root handle 1 cbq bandwidth 100Mbit avpkt > > 3000 > > > cell 8 > > > /sbin/tc class change dev eth1 root cbq weight 10Mbit allot 1514 > > > > > > /sbin/tc class add dev eth0 parent 1: classid 1:2 cbq bandwidth > > 100Mbit rate > > > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 > > bounded > > > isolated > > > /sbin/tc qdisc add dev eth0 parent 1:2 handle 2 tbf rate 3Kbit buffer > > 10Kb/8 > > > limit 15Kb mtu 1500 > > > /sbin/tc filter add dev eth0 parent 1:0 protocol ip prio 200 handle 2 > > fw > > > classid 1:2 > > > > > > /sbin/tc class add dev eth1 parent 1: classid 1:2 cbq bandwidth > > 100Mbit rate > > > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 > > bounded > > > isolated > > > /sbin/tc qdisc add dev eth1 parent 1:2 handle 2 tbf rate 3Kbit buffer > > 10Kb/8 > > > limit 15Kb mtu 1500 > > > /sbin/tc filter add dev eth1 parent 1:0 protocol ip prio 200 handle 2 > > fw > > > classid 1:2 > > > > > > Can anyone explain me what is wrong . Why I cannot shape this traffic > > ???? 
> > > > > > Any help will be appreciated . > > > > > > Best Regards , > > > > > > Saulo Silva > > > > > > _______________________________________________ > > > LARTC mailing list > > > LARTC@mailman.ds9a.nl > > > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > > > > > > > > > > > -- > > Marco Casaroli > > SapucaiNet Telecom > > +55 35 34712377 ext 5 > > > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > I block all P2P traffic with ipp2p , it works great. iptables -t mangle -i eth0 -A FORWARD -m ipp2p --ipp2p -j DROP -- []'s Salatiel "O maior prazer do inteligente ? bancar o idiota diante de um idiota que banca o inteligente". -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070608/93484709/attachment.html From marco.casaroli at gmail.com Sat Jun 9 09:17:57 2007 From: marco.casaroli at gmail.com (Marco Aurelio) Date: Sat Jun 9 09:18:56 2007 Subject: [LARTC] CBQ + Layer7 x Emule In-Reply-To: References: <3ddff6900706081026h2f2c763al9bc7b272b86cb75c@mail.gmail.com> <92ed523b0706081150s4bba1854w4a3d73c78a9dbbf7@mail.gmail.com> <3ddff6900706081755u70c1632auf6df365879c7655a@mail.gmail.com> Message-ID: <92ed523b0706090017y5f5db099i28720d7a978389f1@mail.gmail.com> from ipp2p news page ""quote"" I suggest the following tcp and udp for connection tracking (see docu section) 01# iptables -t mangle -A PREROUTING -p tcp -j CONNMARK --restore-mark 02# iptables -t mangle -A PREROUTING -p tcp -m mark ! --mark 0 -j ACCEPT 03# iptables -t mangle -A PREROUTING -p tcp -m ipp2p --ipp2p -j MARK --set-mark 1 04# iptables -t mangle -A PREROUTING -p tcp -m mark --mark 1 -j CONNMARK --save-mark 05# iptables -t mangle -A PREROUTING -p udp -m ipp2p --ipp2p -j MARK --set-mark 1 detect TCP FIRST, SAVE MARK , and detect udp after you saved the mark !! 
You will have now every p2p packet marked, but a dramtic reduce of udp missmatches. ""quote"" On 6/8/07, Salatiel Filho wrote: > > > On 6/8/07, Saulo Silva wrote: > > HI Marcos , > > > > I tried your rules, but without success . Thank for that help . > > And , how about ip2pp ? Is this application could do that ? Help me to > shape edonkey traffic ??? > > > > Best Regards, > > > > Saulo Silva > > > > > > 2007/6/8, Marco Aurelio : > > > > > l7's edonkey filter does not match all edonkey traffic, it does not > > > match data packets (that you want to shape). It matches however the > > > signaling packets that can be related to data connections. > > > > > > I never tried L7 but I think these may help you > > > > > > iptables -t mangle -A PREROUTING -p tcp -j CONNMARK --restore-mark > > > iptables -t mangle -A PREROUTING -p tcp -m mark ! --mark 0 -j ACCEPT > > > iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK > --set-mark 2 > > > iptables -t mangle -A PREROUTING -p tcp -m mark --mark 2 -j CONNMARK > --save-mark > > > > > > > > > On 6/8/07, Saulo Silva < sauloaugustosilva@gmail.com> wrote: > > > > Hi All , > > > > > > > > My first message and I have a little problem with my FC6 box trying to > block > > > > emule traffic using layer7 . > > > > > > > > Here my network : > > > > > > > > Internet --------- ADSL Router ------------------- FC6 Box > > > > -------------------- Emule Box > > > > > > > > external ADSL : Dynamic > > > > Internal ADSL : 192.168.254.1 > > > > > > > > external FC6 : 192.168.254.3 > > > > internal FC6 : 192.168.253.1 > > > > > > > > Emule Box : 192.168.253.3 > > > > > > > > I guess that everything is ok with layer7 . Here my mangle rules . 
> > > > > > > > # iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK > > > > --set-mark 2 > > > > # iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG > --log-prefix > > > > "PREROUTING MARK : " > > > > > > > > > > > > iptables -t mangle -A FORWARD -mlayer7 --l7proto edonkey -j MARK > --set-mark > > > > 2 > > > > iptables -t mangle -A FORWARD -m mark --mark 2 -j LOG --log-prefix > "FORWARD > > > > MARK : " > > > > > > > > The output from log is : > > > > > > > > Jun 8 14:18:46 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 > > > > SRC= 203.91.83.127 DST=192.168.253.3 LEN=180 TOS=0x00 PREC=0x00 > TTL=105 > > > > ID=18725 PROTO=TCP SPT=51674 DPT=4662 WINDOW=16944 RES=0x00 ACK PSH > URGP=0 > > > > > > > > Jun 8 14:18:48 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > > > > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 > > > > SRC=200.209.170.138 DST= 192.168.254.3 LEN=139 TOS=0x00 PREC=0x00 > TTL=115 > > > > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH > URGP=0 > > > > Jun 8 14:18:48 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC= > > > > 200.209.170.138 DST= 192.168.253.3 LEN=139 TOS=0x00 PREC=0x00 TTL=114 > > > > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK PSH > URGP=0 > > > > > > > > Jun 8 14:18:51 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > > > > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 SRC= > > > > 200.244.104.10 DST= 192.168.254.3 LEN=40 TOS=0x00 PREC=0x00 TTL=117 > ID=7042 > > > > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > > > > > > > Jun 8 14:18:51 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 SRC= > > > > 200.244.104.10 DST= 192.168.253.3 LEN=40 TOS=0x00 PREC=0x00 TTL=116 > ID=7042 > > > > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > > > > > > > So it's look like mark is working . 
> > > > > > > > So now I use the cbq.init script with that configuration : > > > > > > > > cat /etc/sysconfig/cbq/cbq- 0002.emule_in > > > > > > > > DEVICE=eth0,100Mbit,10Mbit > > > > RATE=3Kbit > > > > WEIGHT=1Kbit > > > > PRIO=5 > > > > BOUNDED=yes > > > > ISOLATED=yes > > > > MARK=2 > > > > > > > > cat /etc/sysconfig/cbq/cbq-0002.emule_out > > > > DEVICE=eth1,100Mbit,10Mbit > > > > RATE=3Kbit > > > > WEIGHT=1Kbit > > > > PRIO=5 > > > > BOUNDED=yes > > > > ISOLATED=yes > > > > MARK=2 > > > > > > > > that generate this tc codes . > > > > > > > > /sbin/tc qdisc add dev eth0 root handle 1 cbq bandwidth 100Mbit avpkt > 3000 > > > > cell 8 > > > > /sbin/tc class change dev eth0 root cbq weight 10Mbit allot 1514 > > > > > > > > /sbin/tc qdisc del dev eth1 root > > > > /sbin/tc qdisc add dev eth1 root handle 1 cbq bandwidth 100Mbit avpkt > 3000 > > > > cell 8 > > > > /sbin/tc class change dev eth1 root cbq weight 10Mbit allot 1514 > > > > > > > > /sbin/tc class add dev eth0 parent 1: classid 1:2 cbq bandwidth > 100Mbit rate > > > > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 > bounded > > > > isolated > > > > /sbin/tc qdisc add dev eth0 parent 1:2 handle 2 tbf rate 3Kbit buffer > 10Kb/8 > > > > limit 15Kb mtu 1500 > > > > /sbin/tc filter add dev eth0 parent 1:0 protocol ip prio 200 handle 2 > fw > > > > classid 1:2 > > > > > > > > /sbin/tc class add dev eth1 parent 1: classid 1:2 cbq bandwidth > 100Mbit rate > > > > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 > bounded > > > > isolated > > > > /sbin/tc qdisc add dev eth1 parent 1:2 handle 2 tbf rate 3Kbit buffer > 10Kb/8 > > > > limit 15Kb mtu 1500 > > > > /sbin/tc filter add dev eth1 parent 1:0 protocol ip prio 200 handle 2 > fw > > > > classid 1:2 > > > > > > > > Can anyone explain me what is wrong . Why I cannot shape this traffic > ???? > > > > > > > > Any help will be appreciated . 
> > > > > > > > Best Regards , > > > > > > > > Saulo Silva > > > > > > > > _______________________________________________ > > > > LARTC mailing list > > > > LARTC@mailman.ds9a.nl > > > > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > > > > > > > > > > > > > > > > -- > > > Marco Casaroli > > > SapucaiNet Telecom > > > +55 35 34712377 ext 5 > > > > > > > > > _______________________________________________ > > LARTC mailing list > > LARTC@mailman.ds9a.nl > > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > > > > I block all P2P traffic with ipp2p , it works great. > iptables -t mangle -i eth0 -A FORWARD -m ipp2p --ipp2p -j DROP > > > -- > []'s > Salatiel > > "O maior prazer do inteligente ? bancar o idiota > diante de um idiota que banca o inteligente". -- Marco Casaroli SapucaiNet Telecom +55 35 34712377 ext 5 From marco.casaroli at gmail.com Sat Jun 9 09:24:43 2007 From: marco.casaroli at gmail.com (Marco Aurelio) Date: Sat Jun 9 09:24:57 2007 Subject: [LARTC] how hierarchical is HTB? In-Reply-To: <0E24ED2A7F9AA349A8633E6A56A64BE0027A8308@XCH-SW-2V1.sw.nos.boeing.com> References: <20070606111619.26ef7781@pulsar.inexo.com.br> <4666CB86.2070304@fastwebnet.it> <0E24ED2A7F9AA349A8633E6A56A64BE0027A8308@XCH-SW-2V1.sw.nos.boeing.com> Message-ID: <92ed523b0706090024l70254a0al27c4a15799f96e85@mail.gmail.com> What exactly happens if the sum of the children classes rate is bigger than the parent's? What if the majority of these classes are using less than the minimum rate established (eg. 0kbps)? On 6/6/07, Flechsenhaar, Jon J wrote: > Few quick comments: > > HTB parent rate should never be less than the sum of its children. This > is referring to the rate parameter not the ceil. > > Class 1:20 needs to equal 1:200+1:201. You will get strange results if > you try and test with any configuration where the the sum of all > childeren rates are greater than their parent. > > Borrowing occurs from the parent and from classes at the same level. 
So > if you have 3 leaf classes. 1:1, 1:2, and 1:3 they will get their > assigned rate and borrow up their ceil if there is extra bandwidth. If > there is no traffic in one of the classes then it can give its assured > bandwidth to the other 2 classes with traffic. Borrowing is based on > the priority assigned to the class. > > Jon Flechsenhaar > Boeing WNW Team > Network Services > (714)-762-1231 > 202-E7 > > -----Original Message----- > From: Claudio Greco [mailto:cla.greco@fastwebnet.it] > Sent: Wednesday, June 06, 2007 7:58 AM > To: Ethy H. Brito > Cc: lartc@mailman.ds9a.nl > Subject: Re: [LARTC] how hierarchical is HTB? > > > > root class 1: (rate=100, ceil=100) > > 1: children classes 1:10 (30,100) and 1:20 (70,100) 1:10 children > > classes 1:100 (10,100) and 1:101 (20,100) 1:20 children classes 1:200 > > (30,100) and 1:201 (70,100) > > > > I managed to have the root rate equals to the sum of its children. > > > > > Well, it is still true that total assured rate for classes 1:200 and > 1:201 is greater than assured rate for class 1:20. Still, I don't think > this is a big deal. > > > But how must the rates of the leaves be signed? > > > What do you mean with 'signed'? > > > And how the bandwidth of these leaves will be distributed when > > borrowing/lending is necessary? > > > > > As far as I know, when a leaf is 'yellow', i.e. its rate is greater than > its assured rate and lesser than its ceil rate, it can borrow from its > parent providing there's a yellow-path to the root and the root is green > (root can't be yellow, only green or red). > > If there's more than one child borrowing from the same class, they're > served according to their priority (argument prio in *tc class add*). > > If there's more than one child having the same priority, then they're > served in DRR order (Deficit Round Robin). > > You can tune DRR behaviour with arguments r2q in *tc qdisc add* and > quantum in *tc class add*. > > > classs 1:10 will/may lend/borrow from class 1:20. 
I know that.
> >
> No it can not. A class can only borrow from its parent, never from its
> siblings.
>
> > But how about 1:1XX and classes 1:2XX? will they borrow/lend from each
> > others?
> >
> ibidem.
>
> > Any docs about this?
> >
> You may see:
>
> http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm
> http://luxik.cdi.cz/~devik/qos/htb/manual/theory.htm
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>

--
Marco Casaroli
SapucaiNet Telecom
+55 35 34712377 ext 5

From nozo at ziu.info Sat Jun 9 17:09:19 2007
From: nozo at ziu.info (Michal Soltys)
Date: Sat Jun 9 17:09:53 2007
Subject: [LARTC] vlan interfaces and tc
Message-ID: <466AC29F.7030909@ziu.info>

Hello

I have a few questions regarding tc functionality (qdiscs, classes, etc.) when vlans are in use.

For example, consider interface eth0, for which I create an extra vlan with vconfig, let's say eth0.11. Then using tc I can add the usual things (qdiscs, filters, ...) to both eth0 and eth0.11.

The questions are:

- On which interface, virtual or real, should I actually use tc? Or is either of them allowed, depending on what I need? If so:
- What happens if both interfaces, virtual and real, have disciplines / filters? Does a packet traverse both (I'd assume first through eth0.11, then through eth0)?
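
A minimal sketch of how the question above could be checked empirically, rather than an answer to it: attach a qdisc to both eth0 and eth0.11, send some traffic out through the VLAN, and compare the counters that tc reports for each device. The interface name and VLAN id are taken from the message; the choice of sfq, the handles 1: and 11:, and the helper names run and setup_and_inspect are assumptions made for illustration only. The script prints the commands by default; with RUN=1 it executes them, which needs root, iproute2 and vconfig.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands unless RUN=1 is set.
PARENT=${PARENT:-eth0}   # real interface, as in the message above
VID=${VID:-11}           # VLAN id, giving eth0.11

# Hypothetical helper: execute the command if RUN=1, otherwise print it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Hypothetical helper: create the VLAN, put one qdisc on each device,
# then show the per-device statistics for comparison.
setup_and_inspect() {
    vlan="$PARENT.$VID"
    run vconfig add "$PARENT" "$VID"
    run ip link set dev "$vlan" up
    # One qdisc per device; sfq is only a placeholder choice here.
    run tc qdisc add dev "$vlan" root handle 11: sfq perturb 10
    run tc qdisc add dev "$PARENT" root handle 1: sfq perturb 10
    # After sending traffic out via the VLAN, compare the counters.
    run tc -s qdisc show dev "$vlan"
    run tc -s qdisc show dev "$PARENT"
}

setup_and_inspect
```

If the "Sent" counters on both devices grow for traffic that leaves through eth0.11, the packets passed through both qdiscs.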
From sauloaugustosilva at gmail.com Sat Jun 9 18:29:38 2007
From: sauloaugustosilva at gmail.com (Saulo Silva)
Date: Sat Jun 9 18:29:44 2007
Subject: [LARTC] CBQ + Layer7 x Emule
In-Reply-To: <92ed523b0706090017y5f5db099i28720d7a978389f1@mail.gmail.com>
References: <3ddff6900706081026h2f2c763al9bc7b272b86cb75c@mail.gmail.com> <92ed523b0706081150s4bba1854w4a3d73c78a9dbbf7@mail.gmail.com> <3ddff6900706081755u70c1632auf6df365879c7655a@mail.gmail.com> <92ed523b0706090017y5f5db099i28720d7a978389f1@mail.gmail.com>
Message-ID: <3ddff6900706090929n74449871xccd070da9b835c43@mail.gmail.com>

Hi Marcos,

It now works with l7 and these iptables lines. In the first email we had only 4 lines and now we have 5. It's working nicely.

Thanks for the help.

Saulo Silva

2007/6/9, Marco Aurelio :
>
> from ipp2p news page
> ""quote""
>
> I suggest the following tcp and udp for connection tracking (see docu
> section)
>
> 01# iptables -t mangle -A PREROUTING -p tcp -j CONNMARK --restore-mark
> 02# iptables -t mangle -A PREROUTING -p tcp -m mark ! --mark 0 -j ACCEPT
> 03# iptables -t mangle -A PREROUTING -p tcp -m ipp2p --ipp2p -j MARK --set-mark 1
> 04# iptables -t mangle -A PREROUTING -p tcp -m mark --mark 1 -j CONNMARK --save-mark
> 05# iptables -t mangle -A PREROUTING -p udp -m ipp2p --ipp2p -j MARK --set-mark 1
>
> detect TCP FIRST, SAVE MARK, and detect udp after you saved the mark !!
> You will have now every p2p packet marked, but a dramatic reduction of udp
> mismatches.
>
> ""quote""
>
> On 6/8/07, Salatiel Filho wrote:
> >
> > On 6/8/07, Saulo Silva wrote:
> > > Hi Marcos,
> > >
> > > I tried your rules, but without success. Thanks for the help.
> > > And how about ipp2p? Could this application do that? Help me to
> > > shape edonkey traffic?
> > >
> > > Best Regards,
> > >
> > > Saulo Silva
> > >
> > > 2007/6/8, Marco Aurelio :
> > > >
> > > > l7's edonkey filter does not match all edonkey traffic, it does not
> > > > match data packets (that you want to shape).
It matches however the > > > > signaling packets that can be related to data connections. > > > > > > > > I never tried L7 but I think these may help you > > > > > > > > iptables -t mangle -A PREROUTING -p tcp -j CONNMARK --restore-mark > > > > iptables -t mangle -A PREROUTING -p tcp -m mark ! --mark 0 -j ACCEPT > > > > iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j MARK > > --set-mark 2 > > > > iptables -t mangle -A PREROUTING -p tcp -m mark --mark 2 -j CONNMARK > > --save-mark > > > > > > > > > > > > On 6/8/07, Saulo Silva < sauloaugustosilva@gmail.com> wrote: > > > > > Hi All , > > > > > > > > > > My first message and I have a little problem with my FC6 box > trying to > > block > > > > > emule traffic using layer7 . > > > > > > > > > > Here my network : > > > > > > > > > > Internet --------- ADSL Router ------------------- FC6 Box > > > > > -------------------- Emule Box > > > > > > > > > > external ADSL : Dynamic > > > > > Internal ADSL : 192.168.254.1 > > > > > > > > > > external FC6 : 192.168.254.3 > > > > > internal FC6 : 192.168.253.1 > > > > > > > > > > Emule Box : 192.168.253.3 > > > > > > > > > > I guess that everything is ok with layer7 . Here my mangle rules . 
> > > > > > > > > > # iptables -t mangle -A PREROUTING -mlayer7 --l7proto edonkey -j > MARK > > > > > --set-mark 2 > > > > > # iptables -t mangle -A PREROUTING -m mark --mark 2 -j LOG > > --log-prefix > > > > > "PREROUTING MARK : " > > > > > > > > > > > > > > > iptables -t mangle -A FORWARD -mlayer7 --l7proto edonkey -j MARK > > --set-mark > > > > > 2 > > > > > iptables -t mangle -A FORWARD -m mark --mark 2 -j LOG --log-prefix > > "FORWARD > > > > > MARK : " > > > > > > > > > > The output from log is : > > > > > > > > > > Jun 8 14:18:46 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 > > > > > SRC= 203.91.83.127 DST=192.168.253.3 LEN=180 TOS=0x00 PREC=0x00 > > TTL=105 > > > > > ID=18725 PROTO=TCP SPT=51674 DPT=4662 WINDOW=16944 RES=0x00 ACK > PSH > > URGP=0 > > > > > > > > > > Jun 8 14:18:48 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > > > > > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 > > > > > SRC=200.209.170.138 DST= 192.168.254.3 LEN=139 TOS=0x00 PREC=0x00 > > TTL=115 > > > > > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK > PSH > > URGP=0 > > > > > Jun 8 14:18:48 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 > SRC= > > > > > 200.209.170.138 DST= 192.168.253.3 LEN=139 TOS=0x00 PREC=0x00 > TTL=114 > > > > > ID=18002 DF PROTO=TCP SPT=1476 DPT=4662 WINDOW=65535 RES=0x00 ACK > PSH > > URGP=0 > > > > > > > > > > Jun 8 14:18:51 fs-linux kernel: PREROUTING MARK : IN=eth0 OUT= > > > > > MAC=00:06:4f:47:ad:e0:00:0f:3d:cc:29:e0:08:00 SRC= > > > > > 200.244.104.10 DST= 192.168.254.3 LEN=40 TOS=0x00 PREC=0x00 > TTL=117 > > ID=7042 > > > > > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > > > > > > > > > Jun 8 14:18:51 fs-linux kernel: FORWARD MARK : IN=eth0 OUT=eth1 > SRC= > > > > > 200.244.104.10 DST= 192.168.253.3 LEN=40 TOS=0x00 PREC=0x00 > TTL=116 > > ID=7042 > > > > > PROTO=TCP SPT=50675 DPT=4662 WINDOW=64952 RES=0x00 ACK FIN URGP=0 > > > > > > > > > > So it's look like mark is working . 
> > > > > > > > > > So now I use the cbq.init script with that configuration : > > > > > > > > > > cat /etc/sysconfig/cbq/cbq- 0002.emule_in > > > > > > > > > > DEVICE=eth0,100Mbit,10Mbit > > > > > RATE=3Kbit > > > > > WEIGHT=1Kbit > > > > > PRIO=5 > > > > > BOUNDED=yes > > > > > ISOLATED=yes > > > > > MARK=2 > > > > > > > > > > cat /etc/sysconfig/cbq/cbq-0002.emule_out > > > > > DEVICE=eth1,100Mbit,10Mbit > > > > > RATE=3Kbit > > > > > WEIGHT=1Kbit > > > > > PRIO=5 > > > > > BOUNDED=yes > > > > > ISOLATED=yes > > > > > MARK=2 > > > > > > > > > > that generate this tc codes . > > > > > > > > > > /sbin/tc qdisc add dev eth0 root handle 1 cbq bandwidth 100Mbit > avpkt > > 3000 > > > > > cell 8 > > > > > /sbin/tc class change dev eth0 root cbq weight 10Mbit allot 1514 > > > > > > > > > > /sbin/tc qdisc del dev eth1 root > > > > > /sbin/tc qdisc add dev eth1 root handle 1 cbq bandwidth 100Mbit > avpkt > > 3000 > > > > > cell 8 > > > > > /sbin/tc class change dev eth1 root cbq weight 10Mbit allot 1514 > > > > > > > > > > /sbin/tc class add dev eth0 parent 1: classid 1:2 cbq bandwidth > > 100Mbit rate > > > > > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 > > bounded > > > > > isolated > > > > > /sbin/tc qdisc add dev eth0 parent 1:2 handle 2 tbf rate 3Kbit > buffer > > 10Kb/8 > > > > > limit 15Kb mtu 1500 > > > > > /sbin/tc filter add dev eth0 parent 1:0 protocol ip prio 200 > handle 2 > > fw > > > > > classid 1:2 > > > > > > > > > > /sbin/tc class add dev eth1 parent 1: classid 1:2 cbq bandwidth > > 100Mbit rate > > > > > 3Kbit weight 1Kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 3000 > > bounded > > > > > isolated > > > > > /sbin/tc qdisc add dev eth1 parent 1:2 handle 2 tbf rate 3Kbit > buffer > > 10Kb/8 > > > > > limit 15Kb mtu 1500 > > > > > /sbin/tc filter add dev eth1 parent 1:0 protocol ip prio 200 > handle 2 > > fw > > > > > classid 1:2 > > > > > > > > > > Can anyone explain me what is wrong . Why I cannot shape this > traffic > > ???? 
> > > > >
> > > > > Any help will be appreciated.
> > > > >
> > > > > Best Regards,
> > > > >
> > > > > Saulo Silva
> > > >
> > > > --
> > > > Marco Casaroli
> > > > SapucaiNet Telecom
> > > > +55 35 34712377 ext 5
> >
> > I block all P2P traffic with ipp2p, it works great.
> > iptables -t mangle -i eth0 -A FORWARD -m ipp2p --ipp2p -j DROP
> >
> > --
> > []'s
> > Salatiel
> >
> > "The greatest pleasure of the intelligent is to play the fool
> > in front of a fool who plays the intelligent".
>
> --
> Marco Casaroli
> SapucaiNet Telecom
> +55 35 34712377 ext 5

From marco.casaroli at gmail.com Sun Jun 10 18:27:19 2007
From: marco.casaroli at gmail.com (Marco Aurelio)
Date: Sun Jun 10 18:27:39 2007
Subject: [LARTC] HTB
Message-ID: <92ed523b0706100927j1601dbe6p55b5a3de10c89f5e@mail.gmail.com>

What exactly happens if the sum of the children classes' rates is bigger than the parent's?

What if the majority of these classes are using less than the minimum rate established (e.g. 0kbps)?
--
Marco Casaroli
SapucaiNet Telecom
+55 35 34712377 ext 5

From m.innocenti at cineca.it Mon Jun 11 10:09:37 2007
From: m.innocenti at cineca.it (m.innocenti@cineca.it)
Date: Mon Jun 11 10:09:49 2007
Subject: [LARTC] HTB
In-Reply-To: <92ed523b0706100927j1601dbe6p55b5a3de10c89f5e@mail.gmail.com>
References: <92ed523b0706100927j1601dbe6p55b5a3de10c89f5e@mail.gmail.com>
Message-ID: <466D0331.9080509@cineca.it>

Marco Aurelio wrote:
> What exactly happens if the sum of the children classes rate is bigger
> than the parent's?

HTB will assign the configured rate to the leaf regardless of the value of the parent's rate. The parent's rate is used only to compute how much bandwidth must be allocated to the leaf's ceil.

> What if the majority of these classes are using less than the minimum
> rate established (eg. 0kbps)?

--
**********************************************************************
Marco Innocenti
Gruppo Infrastruttura e Sicurezza
CINECA
phone:+39 0516171553 / fax:+39 0516132198
Via Magnanelli 6/3
e-mail: innocenti@cineca.it
40033 Casalecchio di Reno
Bologna (Italia)
**********************************************************************

From christian.benvenuti at libero.it Mon Jun 11 13:30:11 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Mon Jun 11 13:27:24 2007
Subject: [LARTC] Re: HTB
Message-ID: <1181561411.5881.32.camel@benve-laptop>

Hi,

>What exactly happens if the sum of the children classes rate is bigger
>than the parent's?

I would say that in most cases it would be a misconfiguration, especially if you have more layers of HTB classes.
The bandwidth you configure with rate is not going to be reserved properly if you do not respect the rule rate(parent) >= sum of rate(children).
Anyway, the parent node does not throttle the children classes. Parents are there mainly to allow borrowing and sharing between sibling/descendant classes.

>What if the majority of these classes are using less than the minimum
>rate established (eg. 0kbps)?
Why should this be a problem? In this case a class simply uses less than what it has been allocated. Depending on your configuration, other classes would probably be able to borrow more.

Regards
/Christian
[ http://benve.info ]

From christian.benvenuti at libero.it Mon Jun 11 13:30:39 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Mon Jun 11 13:27:48 2007
Subject: [LARTC] Re: u32 classifier
Message-ID: <1181561439.5881.33.camel@benve-laptop>

Hi,

>VladSun wrote:
>
>> 11:1 is not your root class, right?
>>
>> If so, try to apply the filter to root class - i.e. something like
>>
>> tc filter add dev eth1 parent 1:0 protocol ip handle 1 fw classid 11:2
>
>11:0 is my root class, and the line is (as I write below):
>#tc filter add dev eth1 parent 11:0 protocol ip handle 1 fw classid 11:2

Do you mean to say that the handle of the root _qdisc_ is 11:0? (I could not find that configuration command in your email)

Is the traffic that should match the filter going through the qdisc at all? (you can check the qdisc counters)

If it is, then maybe there is another filter (with higher priority?) that catches the packets earlier. Could that be the case?

Regards
/Christian
[ http://benve.info ]

From christian.benvenuti at libero.it Mon Jun 11 13:31:27 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Mon Jun 11 13:28:37 2007
Subject: [LARTC] Re: vlan interfaces and tc
Message-ID: <1181561487.5881.36.camel@benve-laptop>

Hi,

>Hello
>
>I have few questions regarding tc functionality (qdiscs, classes, etc.) when
>vlans are in use. For example, consider interface eth0, for which I create
>and extra vlan with vconfig, let's say eth0.11. Then using tc I can add
>usual things - qdiscs, filters, ... - to both eth0 and eth0.11. The
>questions are:
>
>- on which interface - virtual or real, should I actually use tc ?

It depends on what you want to control.
The QoS you configure on the VLAN interface is only enforced for the traffic that goes through the VLAN interface. (Note that in this case the VLAN interface is a L3 interface).
The QoS you configure on the real interface is enforced for all the traffic that goes through that interface (regardless of whether it is injected through a virtual interface).

> Or are either of them allowed, depending on what I need ?

Yes they are both allowed.
This means, for example, that the traffic that originates from or that is addressed to a VLAN interface can potentially go through two independent QoS configurations.
Depending on what you want to achieve, you may configure QoS only on the VLAN interface, only on the real interface, or on both.

> If so:
>- what happens if both interfaces - virtual and real have disciplines /
>filters ? Does packet traverse both (I'd assume, first through eth0.11 then
>through eth0) ?

Yes, it traverses both.
On the egress path, it traverses first eth0.11 and then eth0.
On the ingress path, it traverses first eth0 and then eth0.11.

Regards
/Christian
[ http://benve.info ]

From nozo at ziu.info Mon Jun 11 14:34:47 2007
From: nozo at ziu.info (Michal Soltys)
Date: Mon Jun 11 14:35:28 2007
Subject: [LARTC] Re: vlan interfaces and tc
In-Reply-To: <1181561487.5881.36.camel@benve-laptop>
References: <1181561487.5881.36.camel@benve-laptop>
Message-ID: <466D4167.4010101@ziu.info>

Christian Benvenuti wrote:
> Hi,
>
> [cut]
>
> Yes they are both allowed.
> This means, for example, that the traffic that originates from
> or that is addressed to a VLAN interface can potentially go through
> two independent QoS configurations.
> Depending on what you want to achieve, you may configure QoS
> only on the VLAN interface, only on the real interface, or
> on both.
>
> [cut]
>

Thanks for the answers. I've made some simple tests and there seems to be one thing that doesn't work on virtual interfaces - classifying.
Whenever I used filters - u32, or fw paired with iptables' mark target, or simply classify target - it was completely ignored on vlan interface, while the same setup on real interface worked fine (if it wasn't going through vlan earlier - see the question below). So maybe queuing, even though it is possible to set it on a vlan, shouldn't be used? (It's a bit weird, especially if someone wanted to have both disciplines at the same time.)

One more question though - I've noticed that marks or direct classify don't survive going through vlan interface (seems logical), so I can't use them later on the real one. In the past someone asked it on the list, and the answer was to use negative offsets with u32 filter, looking for vlan tags in layer 2 header. It seems to work fine, but is it actually safe to use ?

From lartc at ssi.bg Mon Jun 11 16:19:37 2007
From: lartc at ssi.bg (Anton Glinkov)
Date: Mon Jun 11 16:19:42 2007
Subject: [LARTC] EM64T and network performance
Message-ID: <2764.217.79.71.231.1181571577.squirrel@217.79.71.231>

Hello,

Will switching to a 64bit distribution (intel EM64T) provide any performance gain to the networking code: routing (lots of rules and big routing tables), scheduling (htb) and iptables?

Thank you.

--
Anton Glinkov
network administrator

From marek at piasta.pl Mon Jun 11 16:31:51 2007
From: marek at piasta.pl (Marek Kierdelewicz)
Date: Mon Jun 11 16:33:10 2007
Subject: [LARTC] EM64T and network performance
In-Reply-To: <2764.217.79.71.231.1181571577.squirrel@217.79.71.231>
References: <2764.217.79.71.231.1181571577.squirrel@217.79.71.231>
Message-ID: <20070611163151.7ff05c80@catlap>

>Hello,

Hi,

>Will switching to a 64bit distribution (intel EM64T) provide any
>performance gain to the networking code: routing (lots of rules and big
>routing tables), scheduling (htb) and iptables?

You can have 64bit kernel on 32bit distro. Network traffic is processed in kernelspace, so system libraries (32-bit or 64-bit) won't be relevant.
I'm curious myself about 64-bit vs 32-bit network performance comparison.

cheers,
Marek Kierdelewicz
KoBa ISP

From andang76 at gmail.com Mon Jun 11 17:20:21 2007
From: andang76 at gmail.com (Andrea)
Date: Mon Jun 11 17:20:27 2007
Subject: [LARTC] multiple routing tables for internal router programs
Message-ID: <466D6835.3090204@gmail.com>

Maybe a strange request, I'll try to explain this as clearly as I can (forgive my bad english, please :-) ).

I'm setting a linux box as a router. My router uses multiple routing tables, so I can address the traffic from specific ip addresses of my lan to distinct ISPs (specifying a different default gateway for each table), marking packets with iptables (prerouting marks).

This works with the forwarding traffic (lan-ISPs) that crosses my router.

But how can I reach the same result for programs/services that are working INTO the linux box? All I want is that a program (ping, for example, or a VOIP server, better) uses a secondary routing table in the same machine. In this mode, I can manipulate route settings for different classes of program in my router.

Is it possible?

Thanks

From cata at geniusnet.ro Mon Jun 11 17:30:21 2007
From: cata at geniusnet.ro (Catalin Bucur)
Date: Mon Jun 11 17:30:38 2007
Subject: [LARTC] Re: u32 classifier
In-Reply-To: <1181561439.5881.33.camel@benve-laptop>
References: <1181561439.5881.33.camel@benve-laptop>
Message-ID: <466D6A8D.50105@geniusnet.ro>

Christian Benvenuti wrote:
> Do you mean to say that the handle of the root _qdisc_ is 11:0?
> (I could not find that configuration command in your email)
>
> Is the traffic that should match the filter going through the qdisc at all?
> (you can check the qdisc counters)
>
> If it is, then maybe there is another filter (with higher priority?)
> that catches the packets earlier. Can it be the case?

No, none of this was the cause. I think the problem was the fw filter.
I've changed to u32 and everything works fine:

Before:
#tc filter add dev eth1 parent 11:0 protocol ip handle 1 fw classid 11:2

After:
#tc filter add dev eth1 parent 11:0 protocol ip prio 1 u32 match ip dst 192.168.10.0/24 match mark 1 0xffff flowid 11:2

Cheers,
--
Catalin Bucur mailto:cata@geniusnet.ro
NOC @ Genius Network SRL - Galati - Romania

From christian.benvenuti at libero.it Mon Jun 11 18:39:33 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Mon Jun 11 18:36:08 2007
Subject: [LARTC] Re: vlan interfaces and tc
Message-ID: <1181579973.19121.10.camel@benve-laptop>

>Christian Benvenuti wrote:
>> Hi,
>>
>> [cut]
>>
>> Yes they are both allowed.
>> This means, for example, that the traffic that originates from
>> or that is addressed to a VLAN interface can potentially go through
>> two independent QoS configurations.
>> Depending on what you want to achieve, you may configure QoS
>> only on the VLAN interface, only on the real interface, or
>> on both.
>>
>> [cut]
>>
>
>Thanks for the answers. I've made some simple tests and there seems to
>be one thing that doesn't work on virtual interfaces - classifying.
>Whenever I used filters - u32, or fw paired with iptables' mark target,
>or simply classify target - it was completely ignored on vlan interface,
>while the same setup on real interface worked fine (if it wasn't going
>through vlan earlier - look question below). So maybe queuing, despite
>it's possible to set on vlan, shouldn't be used ? (it's weird a bit,
>especially if someone wanted to have both disciplines at the same time).

This is one important detail you probably missed:

>(Note that in this case the VLAN interface is a L3 interface)

If you assign an IP address to the VLAN interface and you transmit IP traffic on that interface, then the traffic goes through the VLAN qdisc config and classification works (*).
#vconfig add eth2 500
#ifconfig eth2.500 10.0.10.1 netmask 255.255.255.0
#tc filter add dev eth2.500 parent 1: protocol ip prio 1 \
      u32 match ip dst 10.0.10.2 flowid 1:12
#ping 10.0.10.2
#tc -s -d filter list dev eth2.500
filter parent 1: protocol ip pref 1 u32
filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1
filter parent 1: protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:12 (rule hit 120 success 120)
  match 0a000a02/ffffffff at 16 (success 120 )
                                ^^^^^^^^^^^

>One more question though - I've noticed that marks or direct classify
>don't survive going through vlan interface (seems logical), so I can't
>use them later on the real one.
>In the past someone asked it on the
>list, and the answer was to use negative offsets with u32 filter,
>looking for vlan tags in layer 2 header. It seems to work fine, but is
>it actually safe to use ?

To me it seems they do survive (I just tested it).
Could it be the same issue as above (*) ?

Regards
/Christian
[ http://benve.info ]

From ethy.brito at inexo.com.br Mon Jun 11 20:58:34 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Mon Jun 11 20:59:04 2007
Subject: [LARTC] shaping using source IP after NAT
Message-ID: <20070611155834.075b7462@pulsar.inexo.com.br>

Hi all

I am using a pass-through router and I need to apply QoS to some clients' output by their IP address. The problem is that QoS is applied after NATing.

Is there some clever way of doing this besides MARKing every packet with some IP hashing in POSTROUTING NAT table?

Regards

Ethy

From javiercharne at speedy.com.ar Mon Jun 11 21:01:42 2007
From: javiercharne at speedy.com.ar (Javier Charne)
Date: Mon Jun 11 21:01:53 2007
Subject: [LARTC] multiple routing tables for internal router programs
In-Reply-To: <466D6835.3090204@gmail.com>
References: <466D6835.3090204@gmail.com>
Message-ID: <466D9C16.3000300@speedy.com.ar>

Andrea wrote:
> Maybe a strange request, I'll try to explain this as clearly as I can
> (forgive my bad english, please :-) ).
>

Is it allowed to reply in Spanish on this list?

> I'm setting a linux box as a router. My router uses multiple routing
> tables, so I can address the traffic from specific ip addresses of my
> lan to distinct ISPs (specifying a different default gateway
> for each table), marking packets with iptables (prerouting marks).
>
> This works with the forwarding traffic (lan-ISPs) that crosses my router.
>
> But how can I reach the same result for programs/services that are
> working INTO the linux box? All I want is that a program (ping, for
> example, or a VOIP server, better) uses a secondary routing table in
> the same machine. In this mode, I can manipulate route settings for
> different classes of program in my router.
>

What you can do is "mark" the packets with iptables -t mangle and then define rules (ip rule) to route each packet according to the mark it carries, through the tables (ip route) you have defined.

For example:
Define a table with its gateway (one of your connections), and add to it the networks that need to be "known" in that table:

ip route add 127.0.0.0/8 dev lo scope link table 100
ip route add $NET_INTERNA dev $IF_INTERNA scope link table 100
ip route add $NET_ADSL1 dev $IF_ADSL2 scope link table 100
ip route add $NET_ADSL2 dev $IF_ADSL2 scope link table 100
ip route add default dev $IF_ADSL2 via $GW_ADSL2 table 100

Define a rule so that every packet marked with 1 uses that routing table (and leaves through that gateway...):

ip rule add fwmark 1 table 100

And also mark with 1 every packet that you want to use that table (for example, web traffic):

iptables -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 1

I hope this helps.
Regards!
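
For traffic generated on the router itself (the case asked about above), packets do not pass through PREROUTING; they traverse the mangle OUTPUT chain instead. A minimal sketch of that variant, assuming the service runs under a dedicated user (the name `asterisk` is purely an illustration) and that table 100 already exists as set up above. `local_policy_cmds` is a made-up helper, not part of iptables or iproute2, and it only prints the commands so they can be reviewed before being run as root:

```shell
#!/bin/sh
# Hypothetical helper: print the two commands that mark locally generated
# packets of one user and route them through a secondary table.
# Nothing is executed here.
local_policy_cmds() {
    uid=$1      # user the service runs as (assumption)
    table=$2    # routing table number, e.g. 100
    mark=$3     # fwmark value, e.g. 1
    # Locally generated packets traverse mangle OUTPUT, not PREROUTING.
    echo "iptables -t mangle -A OUTPUT -m owner --uid-owner $uid -j MARK --set-mark $mark"
    # Send packets carrying that mark to the secondary table.
    echo "ip rule add fwmark $mark table $table"
}

local_policy_cmds asterisk 100 1
```

The source address of a local packet is chosen before the mark is applied, so a MASQUERADE or SNAT rule on the outgoing interface may also be needed.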
Javier.-

From vladsun at relef.net Mon Jun 11 21:02:31 2007
From: vladsun at relef.net (VladSun)
Date: Mon Jun 11 21:03:03 2007
Subject: [LARTC] shaping using source IP after NAT
In-Reply-To: <20070611155834.075b7462@pulsar.inexo.com.br>
References: <20070611155834.075b7462@pulsar.inexo.com.br>
Message-ID: <466D9C47.7070909@relef.net>

Ethy H. Brito wrote:
> Hi all
>
> I am using a pass-through router and I need to QoS some clients output by its
> IP address. The problem is that QoS is due after NATing.
>
> Is there some clever way of doing this besides MARKing every packet with
> some IP hashing in POSTROUTING NAT table?
>
> Regards
>
> Ethy
>

TC is performed after POSTROUTING, so you cannot do any tc filtering on the original (pre-NAT) source IP. You can use CPU-friendly patches for iptables like IPMARK or IPCLASSIFY. Take a look at them.

Regards!

From tdiehl at rogueind.com Mon Jun 11 21:06:39 2007
From: tdiehl at rogueind.com (Tom Diehl)
Date: Mon Jun 11 21:06:49 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <466D9C16.3000300@speedy.com.ar>
References: <466D6835.3090204@gmail.com> <466D9C16.3000300@speedy.com.ar>
Message-ID:

On Mon, 11 Jun 2007, Javier Charne wrote:

> Andrea wrote:
>> Maybe a strange request, I'll try to explain this as clearly as I can
>> (forgive my bad english, please :-) ).
>>
> Is it allowed to reply in Spanish on this list?
>> I'm setting a linux box as a router. My router uses multiple routing
>> tables, so I can address the traffic from specific ip addresses of my
>> lan to distinct ISPs (specifying a different default gateway
>> for each table), marking packets with iptables (prerouting marks).
>>
>> This works with the forwarding traffic (lan-ISPs) that crosses my router.
>>
>> But how can I reach the same result for programs/services that are
>> working INTO the linux box? All I want is that a program (ping, for
>> example, or a VOIP server, better) uses a secondary routing table in
>> the same machine. In this mode, I can manipulate route settings for
>> different classes of program in my router.
>>
>

Any possibility someone could repost this reply in english.

> What you can do is "mark" the packets with iptables -t mangle
> and then define rules (ip rule) to route each packet according to
> the mark it carries, through the tables (ip route) you have defined.
>
> For example:
> Define a table with its gateway (one of your connections), and add to it
> the networks that need to be "known" in that table:
>
> ip route add 127.0.0.0/8 dev lo scope link table 100
> ip route add $NET_INTERNA dev $IF_INTERNA scope link table 100
> ip route add $NET_ADSL1 dev $IF_ADSL2 scope link table 100
> ip route add $NET_ADSL2 dev $IF_ADSL2 scope link table 100
> ip route add default dev $IF_ADSL2 via $GW_ADSL2 table 100
>
> Define a rule so that every packet marked with 1 uses that routing
> table (and leaves through that gateway...):
>
> ip rule add fwmark 1 table 100
>
> And also mark with 1 every packet that you want to use that table
> (for example, web traffic):
>
> iptables -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 1

Regards,

--
Tom Diehl tdiehl@rogueind.com Spamtrap address mtd123@rogueind.com

From javiercharne at speedy.com.ar Mon Jun 11 21:23:57 2007
From: javiercharne at speedy.com.ar (Javier Charne)
Date: Mon Jun 11 21:24:09 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To:
References: <466D6835.3090204@gmail.com> <466D9C16.3000300@speedy.com.ar>
Message-ID: <466DA14D.4060503@speedy.com.ar>

Tom Diehl wrote:
>>
>
> Any possibility someone could repost this reply in english.
>

Sorry, Tom. My English is really awful.
>> What you can do is "mark" the packets with iptables -t mangle
>> and then define rules (ip rule) to route each packet according to
>> the mark it carries, through the tables (ip route) you have defined.
>>
>> For example:
>> Define a table with its gateway (one of your connections), and add to it
>> the networks that need to be "known" in that table:
>>
>> ip route add 127.0.0.0/8 dev lo scope link table 100
>> ip route add $NET_INTERNA dev $IF_INTERNA scope link table 100
>> ip route add $NET_ADSL1 dev $IF_ADSL2 scope link table 100
>> ip route add $NET_ADSL2 dev $IF_ADSL2 scope link table 100
>> ip route add default dev $IF_ADSL2 via $GW_ADSL2 table 100
>>
>> Define a rule so that every packet marked with 1 uses that routing
>> table (and leaves through that gateway...):
>>
>> ip rule add fwmark 1 table 100
>>
>> And also mark with 1 every packet that you want to use that table
>> (for example, web traffic):
>>
>> iptables -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 1
>
> Regards,
>

What I was telling Andrea: try defining a new routing table, add a rule in the mangle table to mark the packets, and add an ip rule that sends those marked packets to the new table.

Again, I'm sorry. I didn't know this is an "English-only" list.

Regards!
Javier.-

From nozo at ziu.info Mon Jun 11 22:11:09 2007
From: nozo at ziu.info (Michal Soltys)
Date: Mon Jun 11 22:11:50 2007
Subject: [LARTC] Re: vlan interfaces and tc
In-Reply-To: <1181579973.19121.10.camel@benve-laptop>
References: <1181579973.19121.10.camel@benve-laptop>
Message-ID: <466DAC5D.6070500@ziu.info>

Christian Benvenuti wrote:
>
> This is one important detail you probably missed:
>
>> (Note that in this case the VLAN interface is a L3 interface)
>
> If you assign an IP address to the VLAN interface and you transmit
> IP traffic on that interface, than the traffic goes through the VLAN
> qdisc config and classification works (*).
>
> [config cut]
>

When I was doing testing with some trivial setup, I did pretty much the same thing as in your config (forward note - also checked htb, smaller mtu, vlan if up and down). In order:

#vconfig add eth0 11
#ip add add 192.168.20.10/24 dev eth0.11 broad +
#ip li set eth0.11 up
#tc qdisc add dev eth0.11 root handle 1:0 hfsc default 1
#tc class add dev eth0.11 parent 1:0 classid 1:1 hfsc sc rate 10mbit
#tc class add dev eth0.11 parent 1:0 classid 1:21 hfsc sc rate 10mbit
#tc filter add dev eth0.11 parent 1:0 proto ip prio 10 u32 flowid 1:21 \
      match ip dst 192.168.20.1

#ip add sh dev eth0.11
12: eth0.11@eth0: mtu 1500 qdisc hfsc
    link/ether 00:0c:f1:da:e9:46 brd ff:ff:ff:ff:ff:ff
    inet 192.168.20.10/24 brd 192.168.20.255 scope global eth0.11

#tc -d filter sh dev eth0.11
filter parent 1: protocol ip pref 10 u32
filter parent 1: protocol ip pref 10 u32 fh 800: ht divisor 1
filter parent 1: protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:21
  match c0a81401/ffffffff at 16

#tc -d class sh dev eth0.11
class hfsc 1: root
class hfsc 1:1 parent 1: sc m1 0bit d 0ns m2 10000Kbit
class hfsc 1:21 parent 1: sc m1 0bit d 0ns m2 10000Kbit

... then I did ping 192.168.20.1 ...
and ended with

#tc -d -s class sh dev eth0.11
class hfsc 1: root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 0 level 1

class hfsc 1:1 parent 1: sc m1 0bit d 0ns m2 10000Kbit
 Sent 348 bytes 9 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 9 work 348 bytes rtwork 348 bytes level 0

class hfsc 1:21 parent 1: sc m1 0bit d 0ns m2 10000Kbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 0 level 0

#tc -d -s filter sh dev eth0.11
filter parent 1: protocol ip pref 10 u32
filter parent 1: protocol ip pref 10 u32 fh 800: ht divisor 1
filter parent 1: protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:21 (rule hit 0 success 0)
  match c0a81401/ffffffff at 16 (success 0 )

... so I'm probably missing / not seeing something simple, or I don't know.

This setup works for real interface, as well as for bonding. During testing, real interface is normally working in 192.168.100/24 subnet.

"Moving" from OBSD I'm checking what I can and cannot do under linux, so my kernel is a bit full atm, with majority of stuff compiled into it. I'm using clean & patched gentoo here.

From christian.benvenuti at libero.it Mon Jun 11 22:40:55 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Mon Jun 11 22:37:25 2007
Subject: [LARTC] Re: vlan interfaces and tc
Message-ID: <1181594455.19634.25.camel@benve-laptop>

Hi,

>.. so I'm probably missing / not seeing something simple, or I don't
>know.
>This setup works for real interface, as well as for bonding. During testing,
>real interface is normally working in 192.168.100/24 subnet.

Is there an interface configured on the same VLAN on the other side of the link?
If there is not, ARP fails (no one replies to the requests) and you never transmit anything to 192.168.20.1 (which is why the filter is not even tested).
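
One way to tell "the filter never saw a packet" apart from "the filter saw packets but did not match" is the `rule hit` counter in the `tc -s filter` listing. A small sketch, assuming the u32 output format shown in this thread; `rule_hits` is a made-up helper name, not a tc feature:

```shell
#!/bin/sh
# Hypothetical helper: read `tc -s filter show` output on stdin and print
# the "rule hit" counter of the first u32 rule found.
rule_hits() {
    sed -n 's/.*(rule hit \([0-9][0-9]*\) success [0-9][0-9]*).*/\1/p' | head -n 1
}

# Sample line copied from the eth0.11 listing in this thread; prints 0.
echo 'bkt 0 flowid 1:21 (rule hit 0 success 0)' | rule_hits
```

On a live system this would be fed with `tc -s filter show dev eth0.11 | rule_hits`. A counter that stays at 0 while pinging suggests that no IP packet reached the filter at all, rather than that the match expression is wrong.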
For a quick test, you can hardcode the IP/MAC mapping with ip neigh add 192.168.20.1 lladdr dev eth1.11 Your exact same config works on my system. Regards /Christian [ http://benve.info ] From marco.casaroli at gmail.com Mon Jun 11 22:44:48 2007 From: marco.casaroli at gmail.com (Marco Aurelio) Date: Mon Jun 11 22:44:55 2007 Subject: [LARTC] shaping using source IP after NAT In-Reply-To: <466D9C47.7070909@relef.net> References: <20070611155834.075b7462@pulsar.inexo.com.br> <466D9C47.7070909@relef.net> Message-ID: <92ed523b0706111344n404db962lc5f88cc570ea0478@mail.gmail.com> Use IFB which seems to be already on kernel 2.6 On 6/11/07, VladSun wrote: > Ethy H. Brito ??????: > > Hi all > > > > I am using a pass trhu router and I need to QoS some clients output by its > > IP address. The problem is that QoS is due after NATing. > > > > Is there some clever way of doing this besides MARKing every packet with > > some IP hashing in POSTROUTING NAT table? > > > > Regards > > > > Ethy > > _______________________________________________ > > LARTC mailing list > > LARTC@mailman.ds9a.nl > > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > > TC is performed after POSTROUTING, so you can not do any IP related TC > filtering. You can use CPU friendly patches for iptables like IPMARK or > IPCLASSIFY. Take a look at them. > > Regards! > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > -- Marco Casaroli SapucaiNet Telecom +55 35 34712377 ext 5 From nozo at ziu.info Mon Jun 11 23:16:37 2007 From: nozo at ziu.info (Michal Soltys) Date: Mon Jun 11 23:17:18 2007 Subject: [LARTC] Re: vlan interfaces and tc [solved] In-Reply-To: <1181594455.19634.25.camel@benve-laptop> References: <1181594455.19634.25.camel@benve-laptop> Message-ID: <466DBBB5.1050509@ziu.info> Christian Benvenuti wrote: > > Is there an interface configured on the same VLAN on the other side > of the link? 
> If there is not, ARP fails (no one replies to the requests) and you > > [cut] Bloody hell. I knew I missed something embarassing. Faked mac solved the "issue". Thanks for help ! From shemminger at linux-foundation.org Tue Jun 12 07:10:09 2007 From: shemminger at linux-foundation.org (Stephen Hemminger) Date: Tue Jun 12 07:10:38 2007 Subject: [LARTC] Re: vlan interfaces and tc In-Reply-To: <466DAC5D.6070500@ziu.info> References: <1181579973.19121.10.camel@benve-laptop> <466DAC5D.6070500@ziu.info> Message-ID: <20070611221009.73521d04@localhost.localdomain> On Mon, 11 Jun 2007 22:11:09 +0200 Michal Soltys wrote: > Christian Benvenuti wrote: > > > > This is one important detail you probably missed: > > > >> (Note that in this case the VLAN interface is a L3 interface) > > > > If you assign an IP address to the VLAN interface and you transmit > > IP traffic on that interface, than the traffic goes through the VLAN > > qdisc config and classification works (*). > > > > [config cut] > > > > When I was doing testing with some trivial setup, I did pretty much the same > thing as in your config (forward note - also checked htb, smaller mtu, vlan > if up and down). 
> > In order: > > #vconfig add eth0 11 > #ip add add 192.168.20.10/24 dev eth0.11 broad + > #ip li set eth0.11 up > > #tc qdisc add dev eth0.11 root handle 1:0 hfsc default 1 > #tc class add dev eth0.11 parent 1:0 classid 1:1 hfsc sc rate 10mbit > #tc class add dev eth0.11 parent 1:0 classid 1:21 hfsc sc rate 10mbit > > #tc filter add dev eth0.11 parent 1:0 proto ip prio 10 u32 flowid 1:21 \ > match ip dst 192.168.20.1 > > #ip add sh dev eth0.11 > > 12: eth0.11@eth0: mtu 1500 qdisc hfsc > link/ether 00:0c:f1:da:e9:46 brd ff:ff:ff:ff:ff:ff > inet 192.168.20.10/24 brd 192.168.20.255 scope global eth0.11 > > #tc -d filter sh dev eth0.11 > > filter parent 1: protocol ip pref 10 u32 > filter parent 1: protocol ip pref 10 u32 fh 800: ht divisor 1 > filter parent 1: protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 > bkt 0 flowid 1:21 match c0a81401/ffffffff at 16 > > #tc -d class sh dev eth0.11 > > class hfsc 1: root > class hfsc 1:1 parent 1: sc m1 0bit d 0ns m2 10000Kbit > class hfsc 1:21 parent 1: sc m1 0bit d 0ns m2 10000Kbit > > ... then I did > > ping 192.168.20.1 > > ... and ended with > > #tc -d -s class sh dev eth0.11 > > class hfsc 1: root > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 0p requeues 0 > period 0 level 1 > > class hfsc 1:1 parent 1: sc m1 0bit d 0ns m2 10000Kbit > Sent 348 bytes 9 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 0p requeues 0 > period 9 work 348 bytes rtwork 348 bytes level 0 > > class hfsc 1:21 parent 1: sc m1 0bit d 0ns m2 10000Kbit > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 0p requeues 0 > period 0 level 0 > > #tc -d -s filter sh dev eth0.11 > > filter parent 1: protocol ip pref 10 u32 > filter parent 1: protocol ip pref 10 u32 fh 800: ht divisor 1 > filter parent 1: protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 > bkt 0 flowid 1:21 (rule hit 0 success 0) > match c0a81401/ffffffff at 16 (success 0 ) > > > ... 
so I'm probably missing / not seeing something simple, or I don't know.
> This setup works for real interface, as well as for bonding. During testing,
> real interface is normally working in 192.168.100/24 subnet.
>
> "Moving" from OBSD I'm checking what I can and cannot do under linux, so my
> kernel is a bit full atm, with majority of stuff compiled into it.
>
> I'm using clean & patched gentoo here.

Doing traffic control on VLANs may not work as expected, because the vlan
pseudo-device does not have any transmit queue.

--
Stephen Hemminger

From andang76 at gmail.com Tue Jun 12 09:59:31 2007
From: andang76 at gmail.com (Andrea)
Date: Tue Jun 12 09:59:42 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <466DA14D.4060503@speedy.com.ar>
References: <466D6835.3090204@gmail.com> <466D9C16.3000300@speedy.com.ar> <466DA14D.4060503@speedy.com.ar>
Message-ID: <466E5263.2040602@gmail.com>

> I was saying Andrea: Try to define a new routing table, add a chain in
> mangle table for tagging packets and add a rule to deliver those packets
> to the new route.
> Again, I'm sorry. I didn't know this is a "english-only" list.

Thanks for the reply.

This is exactly the approach I used for managing traffic from my LAN towards the ISPs. But is it still valid if I want to manage services running directly on the router?

This rule:

iptables -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 1

captures all (web) traffic that crosses my router. Can I capture only the (web) traffic generated by my router and directed to the internet?

Anyway, I no longer need it: I've resolved my problem, a conflict between a "ping script" (which I'm writing for testing multiple gateways) and servers that also run on the router. The first version of my script set a default gateway in order to test it with ping; now I've discovered that I can use a specific route through the gateway without setting the default gateway, a much better solution.
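
A minimal sketch of the "specific route" approach described above, not Andrea's actual script. The gateway addresses, probe targets and interface names are hypothetical placeholders (documentation address ranges), chosen only for illustration. Each probe target gets its own /32 host route through the gateway under test, so the default route is never touched. By default the script only prints the commands; set RUN=1 to execute them, which needs root.

```shell
#!/bin/sh
# Dry-run wrapper: print the command unless RUN=1 (real execution needs root).
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# probe_gateway GW DEV TARGET
# Pin TARGET to one gateway with a /32 host route, ping it, then clean up.
probe_gateway() {
    gw=$1 dev=$2 target=$3
    run ip route replace "$target/32" via "$gw" dev "$dev"
    run ping -c 3 -W 2 "$target"
    status=$?
    run ip route del "$target/32" via "$gw" dev "$dev"
    return $status
}

# Hypothetical values: one distinct probe target per uplink.
probe_gateway 192.0.2.1 eth1 203.0.113.10
probe_gateway 198.51.100.1 eth2 203.0.113.20
```

Using a different target per uplink keeps the two host routes from overwriting each other if the probes ever run in parallel.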
From salim.si at cipherium.com.tw Tue Jun 12 10:09:04 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Tue Jun 12 10:09:28 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <466E5263.2040602@gmail.com> Message-ID: <001701c7acc8$f4ed0840$5964a8c0@SalimSi> You have to capture the local packets in OUTPUT chain, not in PREROUTING. Well, I have a problem with the ping scripts used for dead gateway detection, I will post it in another thread. -----Original Message----- From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Andrea Sent: Tuesday, June 12, 2007 4:00 PM Cc: lartc@mailman.ds9a.nl Subject: Re: [LARTC] Re: multiple routing tables for internal router programs > I was saying Andrea: Try to define a new routing table, add a chain in > mangle table for tagging packets and add a rule to deliver those packets > to the new route. > Again, I'm sorry. I didn't know this is a "english-only" list. Thanks for the reply. This is the exact way that I used for managing traffic of my lan towards ISPs. But is this mode still valid if I want to manage services executed directly in the router? this rule: iptables -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 1 capture all (web) traffic that crosses my router. Can I capture only the (web) traffic generated from my router and directed to internet? Anymore, I don't need it more: I've resolved my problem, the conflict between a "ping script" (that I'm writing for multiple gateway testing)and servers executed in router too: first version of my script sets a default gateway for testing it with ping, now I've discovered that I can use a specific route involving the gateway without setting default gateway, a much better solution. 
_______________________________________________ LARTC mailing list LARTC@mailman.ds9a.nl http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From rabbit at rabbit.us Tue Jun 12 11:01:45 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Tue Jun 12 11:01:53 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <466E5263.2040602@gmail.com> References: <466D6835.3090204@gmail.com> <466D9C16.3000300@speedy.com.ar> <466DA14D.4060503@speedy.com.ar> <466E5263.2040602@gmail.com> Message-ID: <466E60F9.4080606@rabbit.us> Andrea wrote: > This is the exact way that I used for managing traffic of my lan towards > ISPs. But is this mode still valid if I want to manage services > executed directly in the router? > > this rule: > > iptables -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 1 > > capture all (web) traffic that crosses my router. Can I capture only the > (web) traffic generated from my router and directed to internet? > > Anymore, I don't need it more: I've resolved my problem, the conflict > between a "ping script" (that I'm writing for multiple gateway > testing)and servers executed in router too: first version of my script > sets a default gateway for testing it with ping, now I've discovered > that I can use a specific route involving the gateway without setting > default gateway, a much better solution. > It can and can not be done at the same time, depends on what you are doing. 
Normally for bound services you have this:

o Service is bound to a specific IP 1.2.3.4
o Its outgoing packet has SRC of 1.2.3.4
o You mark it in the OUTPUT chain based on that SRC
o The routing (which occurs after OUTPUT) acts on the MARK

Now what happens when there is no specific binding (you send from 0.0.0.0):

o Program requests a socket from the kernel, supplying only a DST
o The kernel consults the _default_ routing table (because it does not
  know any better, there are no marks yet), and _assigns_ a SRC that
  seems the closest to this particular DST
o Everything else happens as in the scenario above

So depending on what you are doing it might help you or it might drive you insane. In your case it plays out nicely - you can request a specific interface (what you would do with the ping script), and you are guaranteed that packets are going this direction. But if you want to _balance_ locally generated traffic - you can not do anything short of NATing local connections (ugly), because the routing sort of happens before netfilter had a chance to play.

From andang76 at gmail.com Tue Jun 12 11:15:39 2007
From: andang76 at gmail.com (Andrea)
Date: Tue Jun 12 11:15:45 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <466E60F9.4080606@rabbit.us>
References: <466D6835.3090204@gmail.com> <466D9C16.3000300@speedy.com.ar> <466DA14D.4060503@speedy.com.ar> <466E5263.2040602@gmail.com> <466E60F9.4080606@rabbit.us>
Message-ID: <466E643B.5060404@gmail.com>

Peter Rabbitson wrote:
> o The routing (which occurs after OUTPUT) acts on the MARK
  ^ This is the focal point I'm searching for

> Now what happens when there is no specific binding (you send from 0.0.0.0):
[snip]

Very very clear. Thanks very much!!!
The only still obscure aspect for me is this: >you can request a specific interface (what you would do with the ping script) From rabbit at rabbit.us Tue Jun 12 11:29:20 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Tue Jun 12 11:29:29 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <466E643B.5060404@gmail.com> References: <466D6835.3090204@gmail.com> <466D9C16.3000300@speedy.com.ar> <466DA14D.4060503@speedy.com.ar> <466E5263.2040602@gmail.com> <466E60F9.4080606@rabbit.us> <466E643B.5060404@gmail.com> Message-ID: <466E6770.5080100@rabbit.us> Andrea wrote: > Very very clear. Thanks very much!!! The only still obscure aspect for > me is this: > > >you can request a specific interface (what you would do with the ping > script) Check the man page of ping, and look for the '-I' option. Most network testing utilities have this capability in one form or another.By the way if you request an _interface_ and not a specific IP, the first IP of the interface is taken as listed by `ip addr` From salim.si at cipherium.com.tw Tue Jun 12 11:49:57 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Tue Jun 12 11:50:24 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <466E6770.5080100@rabbit.us> Message-ID: <001f01c7acd7$0fd76930$5964a8c0@SalimSi> Here is my issue with ping. When I use -I with ping, the DNS queries for that domain is still sent out with wrong source address through the interface, and hence, no reply. This happens in both WAN interfaces. When I add rules in OUTPUT chain to reroute packets with the unmatching source address and output interface, things work fine. When I use IP address instead of URL, everything is fine. I have applied Julian's routes patch. What could be the problem? 
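
Since probing by IP address works while probing by name does not, one option is a link check that refuses hostnames outright, so no DNS lookup is ever involved. The following is only a sketch under that assumption: the function names are made up for illustration, the probe addresses are hypothetical, and eth2/eth3 are simply the WAN interface names mentioned in this thread. It prints the ping commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run wrapper: print the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Succeed only for a dotted-quad IPv4 literal; nothing is resolved.
is_ipv4() {
    case $1 in
        *[!0-9.]*|*..*|.*|*.) return 1 ;;
    esac
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# check_link DEV TARGET: ping a numeric TARGET out of interface DEV.
check_link() {
    dev=$1 target=$2
    if ! is_ipv4 "$target"; then
        echo "refusing '$target': the name lookup is not pinned by -I" >&2
        return 2
    fi
    run ping -I "$dev" -c 3 -W 2 "$target"
}

# Hypothetical probe addresses, one per WAN interface.
check_link eth2 192.0.2.53
check_link eth3 198.51.100.53
```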
-----Original Message----- From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Peter Rabbitson Sent: Tuesday, June 12, 2007 5:29 PM To: Andrea Cc: lartc@mailman.ds9a.nl Subject: Re: [LARTC] Re: multiple routing tables for internal router programs Andrea wrote: > Very very clear. Thanks very much!!! The only still obscure aspect for > me is this: > > >you can request a specific interface (what you would do with the ping > script) Check the man page of ping, and look for the '-I' option. Most network testing utilities have this capability in one form or another.By the way if you request an _interface_ and not a specific IP, the first IP of the interface is taken as listed by `ip addr` _______________________________________________ LARTC mailing list LARTC@mailman.ds9a.nl http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From rabbit at rabbit.us Tue Jun 12 12:02:18 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Tue Jun 12 12:02:25 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <001f01c7acd7$0fd76930$5964a8c0@SalimSi> References: <001f01c7acd7$0fd76930$5964a8c0@SalimSi> Message-ID: <466E6F2A.7090202@rabbit.us> Salim S I wrote: > Here is my issue with ping. > > When I use -I with ping, the DNS queries for that domain is still sent > out with wrong source address through the interface, and hence, no > reply. This happens in both WAN interfaces. > When I add rules in OUTPUT chain to reroute packets with the unmatching > source address and output interface, things work fine. > > When I use IP address instead of URL, everything is fine. > The problem is ping itself, which uses gethostbyname() which in turn does not understand how to bind to specific interfaces etc. Besides specifying IP addresses instead of hostnames is much much better IMHO. 
Here is the ping.c snippet:

    while (argc > 0) {
        target = *argv;

        bzero((char *)&whereto, sizeof(whereto));
        whereto.sin_family = AF_INET;
        if (inet_aton(target, &whereto.sin_addr) == 1) {
            hostname = target;
            if (argc == 1)
                options |= F_NUMERIC;
        } else {
            hp = gethostbyname2(target, AF_INET);
            if (!hp) {
                fprintf(stderr, "ping: unknown host %s\n", target);
                exit(2);
            }
            memcpy(&whereto.sin_addr, hp->h_addr, 4);
            strncpy(hnamebuf, hp->h_name, sizeof(hnamebuf) - 1);
            hnamebuf[sizeof(hnamebuf) - 1] = 0;
            hostname = hnamebuf;
        }
        if (argc > 1)
            route[nroute++] = whereto.sin_addr.s_addr;
        argc--;
        argv++;
    }

From andang76 at gmail.com Tue Jun 12 12:10:16 2007
From: andang76 at gmail.com (Andrea)
Date: Tue Jun 12 12:10:23 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <466E6770.5080100@rabbit.us>
References: <466D6835.3090204@gmail.com> <466D9C16.3000300@speedy.com.ar> <466DA14D.4060503@speedy.com.ar> <466E5263.2040602@gmail.com> <466E60F9.4080606@rabbit.us> <466E643B.5060404@gmail.com> <466E6770.5080100@rabbit.us>
Message-ID: <466E7108.9050304@gmail.com>

Peter Rabbitson wrote:
> Check the man page of ping, and look for the '-I' option. Most network
> testing utilities have this capability in one form or another.By the way
> if you request an _interface_ and not a specific IP, the first IP of the
> interface is taken as listed by `ip addr`

Didn't know about this option. With this, my (old) script should work fine too.
Another lesson learned, thanks :-)

From salim.si at cipherium.com.tw Tue Jun 12 12:20:34 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Tue Jun 12 12:20:55 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <466E6F2A.7090202@rabbit.us>
Message-ID: <002001c7acdb$53c25b10$5964a8c0@SalimSi>

Thanks! I get it now.
But why is the src address for the interface wrong?
In my case eth2 has a.b.c.d and eth3 has p.q.r.s.
DNS queries going through eth2 has p.q.r.s as src address and those going through eth3 has a.b.c.d. Something wrong with routing? I was wondering, how the ping script (to check the lonk status) of others work id domain name is used. -----Original Message----- From: Peter Rabbitson [mailto:rabbit@rabbit.us] Sent: Tuesday, June 12, 2007 6:02 PM To: Salim S I Cc: 'Andrea'; lartc@mailman.ds9a.nl Subject: Re: [LARTC] Re: multiple routing tables for internal router programs Salim S I wrote: > Here is my issue with ping. > > When I use -I with ping, the DNS queries for that domain is still sent > out with wrong source address through the interface, and hence, no > reply. This happens in both WAN interfaces. > When I add rules in OUTPUT chain to reroute packets with the unmatching > source address and output interface, things work fine. > > When I use IP address instead of URL, everything is fine. > The problem is ping itself, which uses gethostbyname() which in turn does not understand how to bind to specific interfaces etc. Besides specifying IP addresses instead of hostnames is much much better IMHO. 
Here is the ping.c snippet:

    while (argc > 0) {
        target = *argv;

        bzero((char *)&whereto, sizeof(whereto));
        whereto.sin_family = AF_INET;
        if (inet_aton(target, &whereto.sin_addr) == 1) {
            hostname = target;
            if (argc == 1)
                options |= F_NUMERIC;
        } else {
            hp = gethostbyname2(target, AF_INET);
            if (!hp) {
                fprintf(stderr, "ping: unknown host %s\n", target);
                exit(2);
            }
            memcpy(&whereto.sin_addr, hp->h_addr, 4);
            strncpy(hnamebuf, hp->h_name, sizeof(hnamebuf) - 1);
            hnamebuf[sizeof(hnamebuf) - 1] = 0;
            hostname = hnamebuf;
        }
        if (argc > 1)
            route[nroute++] = whereto.sin_addr.s_addr;
        argc--;
        argv++;
    }

From rabbit at rabbit.us Tue Jun 12 13:23:42 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Tue Jun 12 13:23:49 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <002001c7acdb$53c25b10$5964a8c0@SalimSi>
References: <002001c7acdb$53c25b10$5964a8c0@SalimSi>
Message-ID: <466E823E.8020509@rabbit.us>

Salim S I wrote:
> Thanks! I get it now.
> But why the src address for the interface is wrong?
> In my case eth2 has a.b.c.d and eth3 has p.q.r.s.
>
> DNS queries going through eth2 has p.q.r.s as src address and those
> going through eth3 has a.b.c.d. Something wrong with routing?

Possible. Post full configuration and someone might be able to help.

> I was wondering, how the ping script (to check the lonk status) of
> others work id domain name is used.
Don't know about others, and I personally use ip addresses :) From luciano at lugmen.org.ar Wed Jun 13 04:52:27 2007 From: luciano at lugmen.org.ar (Luciano Ruete) Date: Wed Jun 13 04:52:55 2007 Subject: [LARTC] Multihome load balancing - kernel vs netfilter In-Reply-To: <000101c7a73d$79483f10$5964a8c0@SalimSi> References: <000101c7a73d$79483f10$5964a8c0@SalimSi> Message-ID: <200706122352.27708.luciano@lugmen.org.ar> On Tuesday 05 June 2007 03:48:01 Salim S I wrote: > -----Original Message----- > From: Luciano Ruete [mailto:luciano@lugmen.org.ar] > Sent: Saturday, June 02, 2007 11:28 AM > To: Salim S I > Cc: lartc@mailman.ds9a.nl > Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter > > >Is not about ego, sorry if you take this personal, it is not my > > intention, >i > > >speak rude because this list get heavly indexed by google, and it is > > taked >as > > >good advice for many answer seekers. > > > >You afirm that Linux cannot handle load balancing properly and this is > >completly WRONG and is bad advertising and a lie. > > > >Since 2.4 series has been avaible the greats julian's patchs[1], and > > then >in > > >2.6.12 CONNMARK has get in mainline, and with a litle of setup all > >connection > >problems related to load balancing get perfectly solved. > > I did not say Linux can't do Load balancing (btw, my setup has Julian's > DGD patch as well as CONNMARK). But there are some limitations to the > popular methods currently used. > > 1.As Peter Rabbitson [rabbit@rabbit.us] mentioned, one issue is the > separate control and data servers. He mentions AIM servers as example. > This probably can only be solved by having exception IP list. Ok, this is one clear example where NAT concept fails in load balancing, this is not much Linux related, there is no magic to be done here but to write special code for that protcol(as a helper, right). But the helper is not needed in a normal setup. 
This AIM "geniality" could be done transparent to the end user, and even doing it at clients code, there is not need to use the IP as a part of the auth mecanism, is for shure insecure and useless. So i will not blame Linux on this one. > 2.The other situation, and the one I am more concerned, is about > different connections which belongs to same session. > > Consider Client X and Server Y. > > Client X initiates a connection from port a to port b of server Y. > > Xa <---> Yb This connection goes through WAN1. > > After sometime, X opens another connection to Y from port c to port d. > > Xc <---> Yd This is a perfectly new TCP connection, so it may go > through WAN2 > > (Note that the client is NATed, and that no CONNTRACK exist for this > app) > > The server may reject the second and subsequent connections as it comes > in with a different source IP than the first. well it is perfectly clear now. > This situation happens often in IM and Gaming scenarios. I really don't know what IM protocol do this..., if you can specify will be better, i have no complains about any IM issue related to load balancing, i personally use MSN and Jabber protocols and have not problems at all. In games i'm not an expert, but i will like to know what is the percentage of this special games. Route cache will help as mentioned in case like this but it is not fail safe. But again, this are things that affect not only a Linux box doing load balancing, it will affect any other solution, and AFAIKSee you need to start to write special per protocol helpers _only_ for load balancing proupouses. This are application exceptions, they are not linux fails, they are not designed taking in account special setups like NATed load balancing, that's it. > Some sort of IP > persistence is required to handle this. And I was wondering if recent > match would solve this to an extent, without affecting performance. Or > if there are some other method available. (Note that I can't depend much > on cache). 
I think "recent" could work: by matching only the special ports (and optionally each client address), the impact on other clients using the same ports with different applications will be mostly nil.

--
Luciano

From salim.si at cipherium.com.tw Wed Jun 13 06:08:22 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Wed Jun 13 06:08:53 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <466E823E.8020509@rabbit.us>
Message-ID: <000201c7ad70$7f4c2eb0$5964a8c0@SalimSi>

My configuration

root@127.0.0.1:~# ip ru
0:      from all lookup local
32150:  from all lookup main
32201:  from all fwmark 0x200/0x200 lookup wan1_route
32202:  from all fwmark 0x400/0x400 lookup wan2_route
32203:  from all lookup catch_all
32766:  from all lookup main
32767:  from all lookup default

root@127.0.0.1:~# ip ro li ta main
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.254
10.20.0.0/24 dev eth2 proto kernel scope link src 10.20.0.137
192.168.1.0/24 dev eth10 proto kernel scope link src 192.168.1.254
10.2.3.0/24 dev eth3 proto kernel scope link src 10.2.3.107
127.0.0.0/8 dev lo scope link

root@127.0.0.1:~# ip ro li ta wan1_route
default via 10.20.0.1 dev eth2 proto static

root@127.0.0.1:~# ip ro li ta wan2_route
default via 10.2.3.254 dev eth3 proto static

root@127.0.0.1:~# ip ro li ta catch_all
default proto static
        nexthop via 10.20.0.1 dev eth2 weight 1
        nexthop via 10.2.3.254 dev eth3 weight 1

The catch_all table comes into play only for local packets. All forwarded packets are marked in mangle PREROUTING, with 0x200 or 0x400.

Even apart from the load-balancing ping script, there may be other apps using domain names instead of IP addresses; they might still fail, right?

The problem happens when one of the links goes down (not the nexthop, but beyond it). Then the kernel will pick an interface and the wrong src IP for local packets.
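
The rule listing above can be read as a small decision procedure for any packet that needs a default route. The sketch below only mimics that reading in plain shell arithmetic; it does not query the kernel, and the function name is made up for illustration. It assumes, as the listing suggests, that the main table holds no default route, so the lookup falls through to the fwmark rules and finally to catch_all.

```shell
#!/bin/sh
# table_for_mark MARK
# Name the table that would supply the default route for a packet with
# the given fwmark, following the rule order 32201, 32202, 32203 above.
# A rule "fwmark V/M" matches when (mark & M) == V.
table_for_mark() {
    mark=$(( $1 ))
    if [ $(( mark & 0x200 )) -eq $(( 0x200 )) ]; then
        echo wan1_route
    elif [ $(( mark & 0x400 )) -eq $(( 0x400 )) ]; then
        echo wan2_route
    else
        # Unmarked traffic, e.g. packets generated on the router itself.
        echo catch_all
    fi
}

for m in 0x200 0x400 0x0; do
    printf '%s -> %s\n' "$m" "$(table_for_mark "$m")"
done
```

Unmarked local packets end up in catch_all, where the multipath default lets the kernel choose either nexthop, which is consistent with the source address trouble described above.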
-----Original Message----- From: Peter Rabbitson [mailto:rabbit@rabbit.us] Sent: Tuesday, June 12, 2007 7:24 PM To: Salim S I Cc: lartc@mailman.ds9a.nl Subject: Re: [LARTC] Re: multiple routing tables for internal router programs Salim S I wrote: > Thanks! I get it now. > But why the src address for the interface is wrong? > In my case eth2 has a.b.c.d and eth3 has p.q.r.s. > > DNS queries going through eth2 has p.q.r.s as src address and those > going through eth3 has a.b.c.d. Something wrong with routing? Possible. Post full configuration and someone might be able to help. > I was wondering, how the ping script (to check the lonk status) of > others work id domain name is used. Don't know about others, and I personally use ip addresses :) From gtaylor at riverviewtech.net Wed Jun 13 07:27:22 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 13 07:27:37 2007 Subject: [LARTC] Will this work, or have I been around too much magic smoke??? Message-ID: <466F803A.1030503@riverviewtech.net> Will this (in theory) work, or have I been around too much magic smoke that has escaped from fried equipment??? I have a system with two different internet connections. One connection is a WISP via an external bridging radio (ethernet to proprietary wireless back haul). The other connection is PPPoE ADSL via the local phone company. (I think) I am wanting to use equal cost multi path routing to try to utilize both of these connections. After reading some other information I'm not entirely sure that I do want to use ECMP routing. However, this is out side of this discussion. To utilize ECMP routing, you need two or more static upstream gateways. The problem is that one of my upstream gateways is dynamic via PPPoE. Thus I do not have two static default routes to add via the "ip route ... nexthop ..." command. So, my proposed theoretical solution. (At least so far it has sounded good in my head.) 
Use socat (http://www.dest-unreach.org/socat/) to create a pair of virtual TUN interfaces that are connected with each other. With these two additional virtual TUN interfaces, I *THINK* I can split the routing into multiple tables. The main routing table would contain lo, eth0 (WISP), tun0, and eth2 (internal LAN), while the virtualRouter routing table would contain tun1 and eth1 (ADSL). If I use ip rule(s) to determine which routing table to use, I think I can get the system to virtually act like two different routers.

The hope is that I can put a common subnet on tun0 and tun1 that exists in both routing tables, but with only one interface local to each routing table. Thus each routing table will (hopefully) think that it has to go across the virtual point to point interface to reach the other end of / IP on the subnet.

*IF*, and this is a big if, I can get this to work like I've tried to explain, I think I can have the virtual (non default / main) router do nothing but translate the PPPoE to raw IP, thus presenting an additional upstream static IP to the main system and allowing the main system to see two static upstream gateways.

Ultimately I see the routing tables as such:

main routing table(s):
   lo:   127.0.0.1/8
   eth0: A.B.C.D/24 (WISP)
   eth2: 192.168.0.254/24 (LAN)
   tun0: 192.168.1.253/24 (virtual point-to-point)

virtualRouter routing table(s):
   lo:   127.0.0.1/8
   eth1: (PPPoE ADSL)
   tun1: 192.168.1.254/24 (virtual point-to-point)
   ppp0: M.N.O.P/24 (ADSL ISP)

Some packet flow might help make it easier to understand. Traffic flowing from the LAN out through the main system out through the PPPoE would pass through the system as such:

1) In the eth2 LAN interface out the tun0 virtual interface.
2)**In the tun0 virtual interface out tun1 virtual interface.**
3) In the tun1 interface out the ppp0 interface.
4) In the ppp0 interface out the eth1 (ADSL) interface.

Returning traffic would take this path:

1) In the eth1 (ADSL) interface out the ppp0 interface.
2) In the ppp0 interface out the tun1 interface.
3)**In the tun1 interface out the tun0 interface.**
4) In the tun0 interface out the eth2 (LAN) interface.

Steps 2 and 3 respectively (*ed lines) are where the traffic would go from one routing table to the other.

So, now that I have tried to explain what I'm wanting to do, and probably thoroughly made a mess of it, do you think that at least in theory this is possible?

Grant. . . .

From ranko at spidernet.net Wed Jun 13 18:40:30 2007
From: ranko at spidernet.net (Ranko Zivojnovic)
Date: Wed Jun 13 18:40:53 2007
Subject: [LARTC] HTB deadlock
Message-ID: <1181752830.9399.66.camel@ranko-fc2.spidernet.net>

Greetings,

I've been experiencing problems with HTB where the whole machine locks up. This usually happens when the whole qdisc is being removed and occasionally when a leaf is being removed. What these cases have in common is that some sort of removal is always in progress. The console output I have captured is at the end of this message.

The same behavior exists in vanilla 2.6.19.7 and above. It is possible that the problem also exists in earlier versions; however, I did not go further back.

I also believe I have found where the actual problem is:

The qdisc_destroy() function is always called with dev->queue_lock locked. The htb_destroy() function, further up the stack, uses del_timer_sync() to deactivate the HTB qdisc timers.

From the comments in the source where del_timer_sync() is defined:

---copy/paste---
/**
 * del_timer_sync - deactivate a timer and wait for the handler to finish.
 * @timer: the timer to be deactivated
 *
 * This function only differs from del_timer() on SMP: besides deactivating
 * the timer it also makes sure the handler has finished executing on other
 * CPUs.
 *
 * Synchronization rules: Callers must prevent restarting of the timer,
 * otherwise this function is meaningless. It must not be called from
 * interrupt contexts. The caller must not hold locks which would prevent
 * completion of the timer's handler.
The timer's handler must not call
 * add_timer_on(). Upon exit the timer is not queued and the handler is
 * not running on any CPU.
 *
 * The function returns whether it has deactivated a pending timer or not.
 */
---copy/paste---

Now, htb_rate_timer() does exactly what appears to be the source of the
problem: it tries to obtain dev->queue_lock, and given the right moment
(the timer fired the handler while qdisc_destroy was holding the lock) the
system locks up: del_timer_sync is waiting for the handler to finish while
the handler is waiting for dev->queue_lock.

Of course I could also be completely wrong here and missing something
not so obvious.

I could also attempt to fix this, but I haven't dealt with this code in the
past, so I was hoping someone with better insight might just have an elegant
solution up his sleeve.

Best regards,

Ranko

PS: If this list is not the right place for this report, please let me know.

-----------CONSOLE (2.6.19.7)-----------
BUG: soft lockup detected on CPU#3!
 [] softlockup_tick+0x93/0xc2
 [] update_process_times+0x26/0x5c
 [] smp_apic_timer_interrupt+0x97/0xb2
 [] apic_timer_interrupt+0x1f/0x24
 [] klist_next+0x4/0x8a
 [] _spin_unlock_irqrestore+0xa/0xc
 [] try_to_del_timer_sync+0x47/0x4f
 [] del_timer_sync+0xe/0x14
 [] htb_destroy+0x20/0x7b [sch_htb]
 [] qdisc_destroy+0x44/0x8d
 [] htb_destroy_class+0xd0/0x12d [sch_htb]
 [] htb_destroy_class+0x52/0x12d [sch_htb]
 [] htb_destroy+0x3f/0x7b [sch_htb]
 [] qdisc_destroy+0x44/0x8d
 [] htb_destroy_class+0xd0/0x12d [sch_htb]
 [] htb_destroy_class+0x52/0x12d [sch_htb]
 [] htb_destroy+0x3f/0x7b [sch_htb]
 [] qdisc_destroy+0x44/0x8d
 [] tc_get_qdisc+0x1a3/0x1ef
 [] tc_get_qdisc+0x0/0x1ef
 [] rtnetlink_rcv_msg+0x158/0x215
 [] rtnetlink_rcv_msg+0x0/0x215
 [] netlink_run_queue+0x88/0x11d
 [] rtnetlink_rcv+0x26/0x42
 [] netlink_data_ready+0x12/0x54
 [] netlink_sendskb+0x1c/0x33
 [] netlink_sendmsg+0x1ee/0x2d7
 [] sock_sendmsg+0xe5/0x100
 [] autoremove_wake_function+0x0/0x37
 [] autoremove_wake_function+0x0/0x37
 [] sock_sendmsg+0xe5/0x100
 [] copy_from_user+0x33/0x69
 [] sys_sendmsg+0x12d/0x243
 [] _read_unlock_irq+0x5/0x7
 [] find_get_page+0x37/0x42
 [] filemap_nopage+0x30c/0x3a3
 [] __handle_mm_fault+0x21c/0x943
 [] _spin_unlock_bh+0x5/0xd
 [] sock_setsockopt+0x63/0x59d
 [] anon_vma_prepare+0x1b/0xcb
 [] sys_socketcall+0x24f/0x271
 [] do_page_fault+0x0/0x600
 [] sysenter_past_esp+0x56/0x79
 =======================
BUG: soft lockup detected on CPU#1!
 [] softlockup_tick+0x93/0xc2
 [] update_process_times+0x26/0x5c
 [] smp_apic_timer_interrupt+0x97/0xb2
 [] apic_timer_interrupt+0x1f/0x24
 [] blk_do_ordered+0x70/0x27e
 [] _raw_spin_lock+0xaa/0x13e
 [] htb_rate_timer+0x18/0xc4 [sch_htb]
 [] run_timer_softirq+0x163/0x189
 [] htb_rate_timer+0x0/0xc4 [sch_htb]
 [] __do_softirq+0x70/0xdb
 [] do_softirq+0x3b/0x42
 [] smp_apic_timer_interrupt+0x9c/0xb2
 [] apic_timer_interrupt+0x1f/0x24
 [] mwait_idle_with_hints+0x3b/0x3f
 [] mwait_idle+0xc/0x1b
 [] cpu_idle+0x63/0x79
 =======================
BUG: soft lockup detected on CPU#2!
 [] softlockup_tick+0x93/0xc2
 [] update_process_times+0x26/0x5c
 [] smp_apic_timer_interrupt+0x97/0xb2
 [] apic_timer_interrupt+0x1f/0x24
 [] blk_do_ordered+0x70/0x27e
 [] _raw_spin_lock+0xaa/0x13e
 [] dev_queue_xmit+0x53/0x2e4
 [] neigh_connected_output+0x80/0xa0
 [] ip_output+0x1b5/0x24b
 [] ip_finish_output+0x0/0x192
 [] ip_forward+0x1c8/0x2b9
 [] ip_forward_finish+0x0/0x37
 [] ip_rcv+0x2a5/0x538
 [] ip_rcv_finish+0x0/0x2aa
 [] __netdev_alloc_skb+0x12/0x2a
 [] ip_rcv+0x0/0x538
 [] netif_receive_skb+0x218/0x318
 [] bitmap_get_counter+0x41/0x1e6
 [] e1000_clean_rx_irq+0x12c/0x4ef [e1000]
 [] e1000_clean_rx_irq+0x0/0x4ef [e1000]
 [] e1000_clean+0xe5/0x130 [e1000]
 [] net_rx_action+0xbc/0x1d5
 [] __do_softirq+0x70/0xdb
 [] do_softirq+0x3b/0x42
 [] do_IRQ+0x6c/0xda
 [] common_interrupt+0x1a/0x20
 [] mwait_idle_with_hints+0x3b/0x3f
 [] mwait_idle+0xc/0x1b
 [] cpu_idle+0x63/0x79
 =======================
BUG: soft lockup detected on CPU#0!
 [] softlockup_tick+0x93/0xc2
 [] update_process_times+0x26/0x5c
 [] smp_apic_timer_interrupt+0x97/0xb2
 [] apic_timer_interrupt+0x1f/0x24
 [] delay_tsc+0x7/0x13
 [] __delay+0x6/0x7
 [] _raw_spin_lock+0xb8/0x13e
 [] dev_queue_xmit+0x53/0x2e4
 [] neigh_connected_output+0x80/0xa0
 [] ip_output+0x1b5/0x24b
 [] ip_finish_output+0x0/0x192
 [] ip_forward+0x1c8/0x2b9
 [] ip_forward_finish+0x0/0x37
 [] ip_rcv+0x2a5/0x538
 [] ip_rcv_finish+0x0/0x2aa
 [] __alloc_skb+0x47/0xf3
 [] ip_rcv+0x0/0x538
 [] netif_receive_skb+0x218/0x318
 [] bitmap_get_counter+0x41/0x1e6
 [] tg3_poll+0x6d3/0x906 [tg3]
 [] net_rx_action+0xbc/0x1d5
 [] __do_softirq+0x70/0xdb
 [] do_softirq+0x3b/0x42
 [] do_IRQ+0x6c/0xda
 [] common_interrupt+0x1a/0x20
 [] mwait_idle_with_hints+0x3b/0x3f
 [] mwait_idle+0xc/0x1b
 [] cpu_idle+0x63/0x79
 [] start_kernel+0x353/0x423
 [] unknown_bootoption+0x0/0x260
 =======================
-----------CONSOLE-----------

From ethy.brito at inexo.com.br Wed Jun 13 20:18:28 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Wed Jun 13 20:18:04 2007
Subject: [LARTC] shaping using source IP after NAT
In-Reply-To: <466D9C47.7070909@relef.net>
References: <20070611155834.075b7462@pulsar.inexo.com.br> <466D9C47.7070909@relef.net>
Message-ID: <20070613151828.2fe236a8@babalu.inexo.com.br>

On Mon, 11 Jun 2007 22:02:31 +0300
VladSun wrote:

> TC is performed after POSTROUTING, so you can not do any IP related TC
> filtering. You can use CPU friendly patches for iptables like IPMARK or
> IPCLASSIFY. Take a look at them.

Ok. Can someone point me in the right direction to add IPMARK kernel support?

I downloaded patch-o-matic today's snapshot and there is no IPMARK there.

I have iptables-1.3.7 and kernel 2.6.21.1 sources (distro is slackware 11.0)

The curious thing is that IPMARK is in the iptables man page but I got an
error when I execute it. It says it could not
find /usr/lib/iptables/libipt_IPMARK.so:

# locate -i IPMARK
# (no output here)

Regards.
Ethy

From vladsun at relef.net Wed Jun 13 22:20:28 2007
From: vladsun at relef.net (VladSun)
Date: Wed Jun 13 22:20:59 2007
Subject: [LARTC] shaping using source IP after NAT
In-Reply-To: <20070613151828.2fe236a8@babalu.inexo.com.br>
References: <20070611155834.075b7462@pulsar.inexo.com.br> <466D9C47.7070909@relef.net> <20070613151828.2fe236a8@babalu.inexo.com.br>
Message-ID: <4670518C.1090806@relef.net>

Ethy H. Brito wrote:
> On Mon, 11 Jun 2007 22:02:31 +0300
> VladSun wrote:
>
>
>
>> TC is performed after POSTROUTING, so you can not do any IP related TC
>> filtering. You can use CPU friendly patches for iptables like IPMARK or
>> IPCLASSIFY. Take a look at them.
>>
>
> Ok. Can someone point me in the right direction to add IPMARK kernel support?
>
> I downloaded patch-o-matic today's snapshot and there is no IPMARK there.
>
> I have iptables-1.3.7 and kernel 2.6.21.1 sources (distro is slackware 11.0)
>
> The curious thing is that IPMARK is in the iptables man page but I got an
> error when I execute it. It says it could not
> find /usr/lib/iptables/libipt_IPMARK.so:
>
> # locate -i IPMARK
> # (no output here)
>
>
> Regards.
>
> Ethy
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>
Try "./runme download" in the PoM directory. It should work if a
download URL for IPMARK is defined in the source.list file in the PoM
directory.
If it doesn't work, try downloading an older version of PoM.
That is because the netfilter team refused to include IPMARK in the
official versions some time ago.

Regards

From salim.si at cipherium.com.tw Thu Jun 14 05:50:30 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Thu Jun 14 05:50:49 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <000201c7ad70$7f4c2eb0$5964a8c0@SalimSi>
Message-ID: <000601c7ae37$276b3480$5964a8c0@SalimSi>

I solved it, though it is a bit ugly.

I now have two more rules in ip ru:

32150:  from all lookup main
32201:  from all fwmark 0x200/0x200 lookup wan1_route
32202:  from all fwmark 0x400/0x400 lookup wan2_route
32203:  from 10.20.0.137 lookup wan1_route
32204:  from 10.2.3.107 lookup wan2_route
32205:  from all lookup catch_all
32766:  from all lookup main

I did not like to include the WAN IP anywhere, because it may be dynamic, but
well, it seems there is no choice.

And then two rules in the OUTPUT chain:
iptables -t mangle -A OUTPUT -o eth2 -j LB1
iptables -t mangle -A OUTPUT -o eth3 -j LB2

-----Original Message-----
From: lartc-bounces@mailman.ds9a.nl
[mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Salim S I
Sent: Wednesday, June 13, 2007 12:08 PM
To: 'Peter Rabbitson'
Cc: lartc@mailman.ds9a.nl
Subject: RE: [LARTC] Re: multiple routing tables for internal router
programs

My configuration

root@127.0.0.1:~# ip ru
0:      from all lookup local
32150:  from all lookup main
32201:  from all fwmark 0x200/0x200 lookup wan1_route
32202:  from all fwmark 0x400/0x400 lookup wan2_route
32203:  from all lookup catch_all
32766:  from all lookup main
32767:  from all lookup default

root@127.0.0.1:~# ip ro li ta main
192.168.100.0/24 dev eth0  proto kernel  scope link  src 192.168.100.254
10.20.0.0/24 dev eth2  proto kernel  scope link  src 10.20.0.137
192.168.1.0/24 dev eth10  proto kernel  scope link  src 192.168.1.254
10.2.3.0/24 dev eth3  proto kernel  scope link  src 10.2.3.107
127.0.0.0/8 dev lo  scope link

root@127.0.0.1:~# ip ro li ta wan1_route
default via 10.20.0.1 dev eth2  proto static
root@127.0.0.1:~# ip ro li ta wan2_route
default via 10.2.3.254 dev eth3  proto static

root@127.0.0.1:~# ip ro li ta catch_all
default  proto static
        nexthop via 10.20.0.1  dev eth2 weight 1
        nexthop via 10.2.3.254  dev eth3 weight 1

The catch_all table comes into play only for local packets. All
forwarded packets are marked in mangle PREROUTING, with 0x200 or 0x400.

Even if it is not the load-balancing ping script, there may be other apps
using domain names instead of IP addresses; they might still fail, right?

The problem happens when one of the links goes down (not the nexthop, but
somewhere after it). Then the kernel will pick an interface and the wrong
src IP for local packets.


-----Original Message-----
From: Peter Rabbitson [mailto:rabbit@rabbit.us]
Sent: Tuesday, June 12, 2007 7:24 PM
To: Salim S I
Cc: lartc@mailman.ds9a.nl
Subject: Re: [LARTC] Re: multiple routing tables for internal router
programs

Salim S I wrote:
> Thanks! I get it now.
> But why is the src address for the interface wrong?
> In my case eth2 has a.b.c.d and eth3 has p.q.r.s.
>
> DNS queries going through eth2 have p.q.r.s as src address and those
> going through eth3 have a.b.c.d. Something wrong with routing?

Possible. Post full configuration and someone might be able to help.

> I was wondering how the ping script (to check the link status) of
> others works if a domain name is used.

Don't know about others, and I personally use ip addresses :)

_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

From alex at samad.com.au Thu Jun 14 06:23:14 2007
From: alex at samad.com.au (Alex Samad)
Date: Thu Jun 14 06:23:26 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <000601c7ae37$276b3480$5964a8c0@SalimSi>
References: <000201c7ad70$7f4c2eb0$5964a8c0@SalimSi> <000601c7ae37$276b3480$5964a8c0@SalimSi>
Message-ID: <20070614042314.GD5364@samad.com.au>

On Thu, Jun 14, 2007 at 11:50:30AM +0800, Salim S I wrote:
> I solved it, though it is a bit ugly.
>
> I now have two more rules in ip ru:
>
> 32150:  from all lookup main
> 32201:  from all fwmark 0x200/0x200 lookup wan1_route
> 32202:  from all fwmark 0x400/0x400 lookup wan2_route
> 32203:  from 10.20.0.137 lookup wan1_route
> 32204:  from 10.2.3.107 lookup wan2_route
> 32205:  from all lookup catch_all
> 32766:  from all lookup main
>
> I did not like to include the WAN IP anywhere, because it may be dynamic, but
> well, it seems there is no choice.
I ran into the same problem. I capture the link information at ip-up time for
ppp/pppoe and at DHCP time for the cable modem, then I fire off a script that
pulls down all the ip ru & ip ro entries and rebuilds them from scratch (along
with the specialised iptables rules). This should only happen when I lose a
connection, so it should be okay.

>
> And then two rules in the OUTPUT chain:
> iptables -t mangle -A OUTPUT -o eth2 -j LB1
> iptables -t mangle -A OUTPUT -o eth3 -j LB2
>
> -----Original Message-----
> From: lartc-bounces@mailman.ds9a.nl
> [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Salim S I
> Sent: Wednesday, June 13, 2007 12:08 PM
> To: 'Peter Rabbitson'
> Cc: lartc@mailman.ds9a.nl
> Subject: RE: [LARTC] Re: multiple routing tables for internal router
> programs
>
> My configuration
>
> root@127.0.0.1:~# ip ru
> 0:      from all lookup local
> 32150:  from all lookup main
> 32201:  from all fwmark 0x200/0x200 lookup wan1_route
> 32202:  from all fwmark 0x400/0x400 lookup wan2_route
> 32203:  from all lookup catch_all
> 32766:  from all lookup main
> 32767:  from all lookup default
>
> root@127.0.0.1:~# ip ro li ta main
> 192.168.100.0/24 dev eth0  proto kernel  scope link  src 192.168.100.254
> 10.20.0.0/24 dev eth2  proto kernel  scope link  src 10.20.0.137
> 192.168.1.0/24 dev eth10  proto kernel  scope link  src 192.168.1.254
> 10.2.3.0/24 dev eth3  proto kernel  scope link  src 10.2.3.107
> 127.0.0.0/8 dev lo  scope link
>
> root@127.0.0.1:~# ip ro li ta wan1_route
> default via 10.20.0.1 dev eth2  proto static
> root@127.0.0.1:~# ip ro li ta wan2_route
> default via 10.2.3.254 dev eth3  proto static
>
> root@127.0.0.1:~# ip ro li ta catch_all
> default  proto static
>         nexthop via 10.20.0.1  dev eth2 weight 1
>         nexthop via 10.2.3.254  dev eth3 weight 1
>
> The catch_all table comes into play only for local packets. All
> forwarded packets are marked in mangle PREROUTING, with 0x200 or 0x400.
>
> Even if it is not the load-balancing ping script, there may be other apps
> using domain names instead of IP addresses; they might still fail, right?
>
> The problem happens when one of the links goes down (not the nexthop, but
> somewhere after it). Then the kernel will pick an interface and the wrong
> src IP for local packets.
>
>
> -----Original Message-----
> From: Peter Rabbitson [mailto:rabbit@rabbit.us]
> Sent: Tuesday, June 12, 2007 7:24 PM
> To: Salim S I
> Cc: lartc@mailman.ds9a.nl
> Subject: Re: [LARTC] Re: multiple routing tables for internal router
> programs
>
> Salim S I wrote:
> > Thanks! I get it now.
> > But why is the src address for the interface wrong?
> > In my case eth2 has a.b.c.d and eth3 has p.q.r.s.
> >
> > DNS queries going through eth2 have p.q.r.s as src address and those
> > going through eth3 have a.b.c.d. Something wrong with routing?
>
> Possible. Post full configuration and someone might be able to help.
>
> > I was wondering how the ping script (to check the link status) of
> > others works if a domain name is used.
>
> Don't know about others, and I personally use ip addresses :)
>
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>

From kristiadi_himawan at dtp.net.id Thu Jun 14 08:32:40 2007
From: kristiadi_himawan at dtp.net.id (Kristiadi Himawan)
Date: Thu Jun 14 08:37:22 2007
Subject: [LARTC] Parent shaping
In-Reply-To: <20060331085417.59533.qmail@web54303.mail.yahoo.com>
Message-ID:

Hi

Is it possible to shape the parent class at the parent ceil even though the
total of the child ceils is more than the parent's?

Thanks.

From rabbit at rabbit.us Thu Jun 14 09:26:55 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Thu Jun 14 09:27:02 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <000601c7ae37$276b3480$5964a8c0@SalimSi>
References: <000601c7ae37$276b3480$5964a8c0@SalimSi>
Message-ID: <4670EDBF.6040208@rabbit.us>

Salim S I wrote:
> I solved it, though it is a bit ugly.
>

Sorry I didn't answer earlier. Can you post your iptables rules too? The
routing alone is not sufficient. If your setup is confidential, at least
show all statements that set MARKs one way or another. What you did is
strange, but it might very well be warranted. Still, it depends on your
existing rules.

From J.Kraaijeveld at Askesis.nl Thu Jun 14 11:06:54 2007
From: J.Kraaijeveld at Askesis.nl (Joost Kraaijeveld)
Date: Thu Jun 14 11:07:03 2007
Subject: [LARTC] GUI or other tools for traffic shaping
Message-ID:

Hi,

Are there GUI (preferable) or scripting tools available somewhere that can help me with traffic shaping? I have found MasterShaper and tcng but they both seem unmaintained. Directly writing scripts is still a bit out of my reach, so I would like to learn by using tools...

TIA

Joost

From gshobowale at nextworksltd.com Thu Jun 14 12:23:38 2007
From: gshobowale at nextworksltd.com (Oluwagbenga Shobowale)
Date: Thu Jun 14 12:22:37 2007
Subject: [LARTC] Priority and regulate http traffice for users
Message-ID: <002001c7ae6e$13dd3c10$6d00a8c0@fly1>

Hi all,

I would like to prioritize http traffic over smtp, that is, give http 70% of
the bandwidth and smtp 30%.

Also, I would like to prevent a few people from browsing during a certain
time window, like 8am to 4pm...

Any idea?

Thanks

From salim.si at cipherium.com.tw Thu Jun 14 12:34:33 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Thu Jun 14 12:34:51 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <4670EDBF.6040208@rabbit.us>
Message-ID: <001601c7ae6f$9cb88610$5964a8c0@SalimSi>

The relevant portions are:

root@127.0.0.1:~# iptables -t mangle -L LOC -v
Chain LOC (1 references)
 pkts bytes target     prot opt in     out     source      destination
10125 1152K CONNMARK   all  --  any    any     anywhere    anywhere    CONNMARK restore
   64 12017 LB1        all  --  any    any     anywhere    anywhere    state NEW MARK match 0x0 random 84%
  174 28502 LB2        all  --  any    any     anywhere    anywhere    state NEW MARK match 0x0

root@127.0.0.1:~# iptables -t mangle -L LB1 -v
Chain LB1 (2 references)
 pkts bytes target     prot opt in     out     source      destination
 2350  257K MARK       all  --  any    any     anywhere    anywhere    MARK or 0x200
 2350  257K CONNMARK   all  --  any    any     anywhere    anywhere    CONNMARK save

root@127.0.0.1:~# iptables -t mangle -L LB2 -v
Chain LB2 (2 references)
 pkts bytes target     prot opt in     out     source      destination
 6931 1196K MARK       all  --  any    any     anywhere    anywhere    MARK or 0x400
 6931 1196K CONNMARK   all  --  any    any     anywhere    anywhere    CONNMARK save

root@127.0.0.1:~# iptables -t mangle -L OUTPUT -v
Chain OUTPUT (policy ACCEPT 8358 packets, 1290K bytes)
 pkts bytes target     prot opt in     out     source      destination
 1551  119K LB1        all  --  any    eth2    anywhere    anywhere
 6788 1170K LB2        all  --  any    eth3    anywhere    anywhere

NATing is done with MASQUERADE, not SNAT. I use another MARK for it, but
in essence it is
-o eth2 -j MASQUERADE
-o eth3 -j MASQUERADE

In addition, there are several other MARKs for policy routing. They have
their own routing tables also. But at present, they are all empty.

-----Original Message-----
From: Peter Rabbitson [mailto:rabbit@rabbit.us]
Sent: Thursday, June 14, 2007 3:27 PM
To: Salim S I
Cc: lartc@mailman.ds9a.nl
Subject: Re: [LARTC] Re: multiple routing tables for internal router
programs

Salim S I wrote:
> I solved it, though it is a bit ugly.
>

Sorry I didn't answer earlier. Can you post your iptables rules too? The
routing alone is not sufficient. If your setup is confidential, at least
show all statements that set MARKs one way or another. What you did is
strange, but it might very well be warranted. Still, it depends on your
existing rules.

From P.Kaagman at atlascollege.nl Thu Jun 14 14:12:38 2007
From: P.Kaagman at atlascollege.nl (Peter Kaagman)
Date: Thu Jun 14 14:12:51 2007
Subject: [LARTC] tc: Trying to understand what I have done
Message-ID: <6BEC0BC0C32DBE4480920A1A1DA06D25A280A2@MERCURIUS.atlas.atlascollege.nl>

Hi list,

Up front: a bit sorry this post turned out a wee bit long.

I work as a system administrator for the Atlas College in the Netherlands. We are what is called a merger school, consisting of 5 (more or less) separate locations and one central administration. The network is a class A network (10.0.0.0/8) in which all locations have their own subnet (i.e. 10.9.0.0/16 for the central administration). Since 2004 the separate units have shared the 6 mbit Internet access.

When we started with central access to the Internet, it was still possible for one of the locations to clog the access to the Internet, giving an unfair situation. For this reason we started to use an HTB bandwidth shaper.

What I tried to achieve was giving each location a fair share of the bandwidth (in relation to its student count), with the 6 mbit maximum as a ceiling. As a complicating factor there is also a DMZ connected at LAN speed (100 mbit).

So what I did was make a root class of 100/100 mbit, subclassing it into an Internet class of 6/6mbit and a DMZ class of 94/100mbit. The default class is the DMZ class. The Internet class is subclassed further to make a class per unit.

I've enclosed the script below; it has worked well for 2 years now... but there are changes on the horizon :D

The 6mbit Internet connection has been full ever since we bought it. Now people are starting to complain about slow connections. So we've decided to upgrade our contract to a 40mbit connection.

This could of course simply be done by changing the numbers. But there are 2 complications:

1) Most locations are connected to our backbone with 8mbit microwaves. This means I will not give them more than 6mbit on the Internet, without a chance to borrow.
Not the reason I write this (long) message.

2) This is the reason: I can no longer explain to myself what I have done in the script.

The classes and sub-classes I understand. I understand the filter rules I've made for the locations. But looking at the filter rules for the DMZ, I think they are wrong.

The first rule I can dig:

/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip src 192.168.0.0/24 flowid 1:20

All traffic coming from 192.168.0.0/24 (the DMZ) belongs to class 1:20 (the DMZ).

But I've got serious doubts about the next 2 rules:

/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip src 10.0.0.99 flowid 1:20
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip dst 10.0.0.99 flowid 1:20

IP 10.0.0.99 is the IP address of eth1 (the LAN interface) of the router. Traffic coming from and going to that IP is put into class 1:20.
The only reason I can imagine for having done that is to put local traffic from the router in the DMZ class, because I do not want it in class 1:10 or one of its sub-classes.

So my question would be:
Does this script do the things I described above?
Would it not be better to leave those DMZ rules out, since 1:20 is the default class anyway?

Kind regards,

Peter Kaagman
System administration, Atlas College
p.kaagman@atlascollege.nl

#!/bin/sh
# eth1: Lan link
#                    root
#                     1:
#                     |
#                    base
#                100/100mbit
#                   _1:1_
#                  /     \
#                 /       \
#                /         \
#          Internet        DMZ
#          6/6mbit      94/100mbit
#           1:10           1:20
#             |
#             |
#             |-- DDK 10.2.0.0/16
#             |   1:12
#             |   438kbit/6mbit 1)
#             |
#             |-- Tit 10.4.0.0/16
#             |   1:14
#             |   1254kbit/6mbit
#             |
#             |-- CSG 10.5.0.0/16
#             |   1:15
#             |   1605kbit/6mbit
#             |
#             |-- OSG 10.6.0.0/16
#             |   1:16
#             |   1605kbit/6mbit
#             |
#             |-- Tri 10.8.0.0/16
#             |   1:18
#             |   730kbit/6mbit
#             |
#             |-- CB 10.9.0.0/16
#                 1:19
#                 512kbit/6mbit
#

# root qdisc
/sbin/tc qdisc add dev eth1 root handle 1: htb default 20
# root class for borrow 100/100mbit
/sbin/tc class add dev eth1 parent 1: classid 1:1 htb rate 100mbit ceil 100mbit
# class for Internet 6/6mbit
/sbin/tc class add dev eth1 parent 1:1 classid 1:10 htb rate 6mbit ceil 6mbit
# class for DMZ 94/100mbit
/sbin/tc class add dev eth1 parent 1:1 classid 1:20 htb rate 94mbit ceil 100mbit

# child classes for divide
/sbin/tc class add dev eth1 parent 1:10 classid 1:12 htb rate 438kbit ceil 6mbit
/sbin/tc class add dev eth1 parent 1:10 classid 1:14 htb rate 1254kbit ceil 6mbit
/sbin/tc class add dev eth1 parent 1:10 classid 1:15 htb rate 1605kbit ceil 6mbit
/sbin/tc class add dev eth1 parent 1:10 classid 1:16 htb rate 1605kbit ceil 6mbit
/sbin/tc class add dev eth1 parent 1:10 classid 1:18 htb rate 730kbit ceil 6mbit
/sbin/tc class add dev eth1 parent 1:10 classid 1:19 htb rate 512kbit ceil 6mbit
# filters
# HTB rules should be attached to the root
# From DMZ to 1:20 rest 1:1*
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip src 192.168.0.0/24 flowid 1:20
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip src 10.0.0.99 flowid 1:20
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip dst 10.0.0.99 flowid 1:20
# Locations
# 10.2.0.0/16 to class 1:12
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip dst 10.2.0.0/16 flowid 1:12
# 10.4.0.0/16 to class 1:14
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip dst 10.4.0.0/16 flowid 1:14
# 10.5.0.0/16 to class 1:15
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip dst 10.5.0.0/16 flowid 1:15
# 10.6.0.0/16 to class 1:16
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip dst 10.6.0.0/16 flowid 1:16
# 10.8.0.0/16 to class 1:18
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip dst 10.8.0.0/16 flowid 1:18
# 10.9.0.0/16 to class 1:19
/sbin/tc filter add dev eth1 protocol ip parent 1: prio 1 u32 \
    match ip dst 10.9.0.0/16 flowid 1:19


# re-init
# /sbin/tc qdisc del dev eth1 root

From gtaylor at riverviewtech.net Thu Jun 14 18:49:24 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Thu Jun 14 18:47:29 2007
Subject: [LARTC] Will this work, or have I been around too much magic smoke???
In-Reply-To: <466F803A.1030503@riverviewtech.net>
References: <466F803A.1030503@riverviewtech.net>
Message-ID: <46717194.5080707@riverviewtech.net>

On 06/13/07 00:27, Grant Taylor wrote:
> So, now that I have tried to explain what I'm wanting to do, and
> probably thoroughly made a mess of it, do you think that at least in
> theory this is possible?

Any takers???



Grant. . . .
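
A side note on the HTB script in Peter Kaagman's message above: when the numbers are changed for the 40mbit upgrade, it may help to check that the child rates still add up to the parent's rate. The sketch below does that arithmetic for the current 6mbit values. It is only an illustration: sum_kbit is a hypothetical helper name, and the two unit bases are an assumption (the current tc man page documents kbit/mbit as multiples of 1000, while some older iproute2 releases reportedly used 1024, so the installed version should be checked).

```shell
#!/bin/sh
# Hypothetical helper: sum a list of HTB rates given in kbit.
sum_kbit() {
    total=0
    for rate in "$@"; do
        total=$((total + rate))
    done
    echo "$total"
}

# Child rates of class 1:10, copied from the script above (kbit).
children=$(sum_kbit 438 1254 1605 1605 730 512)

# Assumption: "6mbit" equals 6 * base kbit, where base is 1000 (SI)
# or 1024 (older behaviour); compare the children against both.
for base in 1000 1024; do
    parent=$((6 * base))
    echo "base=$base parent=${parent}kbit children=${children}kbit diff=$((children - parent))kbit"
done
```

With base 1024 the six child rates match the 6mbit parent exactly; with base 1000 they exceed it by 144kbit, so the guaranteed rates could not all be met at once.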

From marco.casaroli at gmail.com Thu Jun 14 21:25:14 2007
From: marco.casaroli at gmail.com (Marco Aurelio)
Date: Thu Jun 14 21:25:19 2007
Subject: [LARTC] shaping using source IP after NAT
In-Reply-To: <4670518C.1090806@relef.net>
References: <20070611155834.075b7462@pulsar.inexo.com.br> <466D9C47.7070909@relef.net> <20070613151828.2fe236a8@babalu.inexo.com.br> <4670518C.1090806@relef.net>
Message-ID: <92ed523b0706141225l240fe5b8gaa80aba870ae00f2@mail.gmail.com>

I think it is better to use an IFB device and shape the upload traffic
using source IP before the NAT

http://linux-net.osdl.org/index.php/IFB

On 6/13/07, VladSun wrote:
> Ethy H. Brito wrote:
> > On Mon, 11 Jun 2007 22:02:31 +0300
> > VladSun wrote:
> >
> >
> >
> >> TC is performed after POSTROUTING, so you can not do any IP related TC
> >> filtering. You can use CPU friendly patches for iptables like IPMARK or
> >> IPCLASSIFY. Take a look at them.
> >>
> >
> > Ok. Can someone point me in the right direction to add IPMARK kernel support?
> >
> > I downloaded patch-o-matic today's snapshot and there is no IPMARK there.
> >
> > I have iptables-1.3.7 and kernel 2.6.21.1 sources (distro is slackware 11.0)
> >
> > The curious thing is that IPMARK is in the iptables man page but I got an
> > error when I execute it. It says it could not
> > find /usr/lib/iptables/libipt_IPMARK.so:
> >
> > # locate -i IPMARK
> > # (no output here)
> >
> >
> > Regards.
> >
> > Ethy
> >
> > _______________________________________________
> > LARTC mailing list
> > LARTC@mailman.ds9a.nl
> > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
> >
> Try "./runme download" in the PoM directory. It should work if a
> download URL for IPMARK is defined in the source.list file in the PoM
> directory.
> If it doesn't work, try downloading an older version of PoM.
> That is because the netfilter team refused to include IPMARK in the
> official versions some time ago.
>
> Regards
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>


-- 
Marco Casaroli
SapucaiNet Telecom
+55 35 34712377 ext 5

From tenos at ll.mit.edu Thu Jun 14 21:29:17 2007
From: tenos at ll.mit.edu (Tim Enos)
Date: Thu Jun 14 21:29:58 2007
Subject: [LARTC] PQ questions
Message-ID: <200706141929.l5EJTnbG019836@ll.mit.edu>

Hi all,

First, let me say I've been most impressed with how quickly and
professionally people on this list ask and answer questions.

Next, let me say that what I need help with is properly configuring strict
PQ, and gathering certain stats. Specifically:

- I need to create a priority queue with four queues (let's say they are of
high, medium, normal, and low priority)

- I need to use tc filters such that:

  - EF (0xB8) goes to the high priority queue

  - AF21 (0x50) goes to the medium priority queue

  - AF11 (0x28) goes to the normal priority queue, and

  - BE traffic goes to the low priority queue

- For stat collection, I need to see:

  - how many bytes and packets are in each of the four queues

- My configuration thus far is:

tc qdisc add dev eml_test root handle 1: prio bands 4 priomap 0 1 2 3

tc filter add dev eml_test parent 1:0 prio 1 protocol ip u32 match ip tos
0xb8 0xff flowid 1:1

tc filter add dev eml_test parent 1:0 prio 2 protocol ip u32 match ip tos
0x80 0xff flowid 1:2

tc filter add dev eml_test parent 1:0 prio 3 protocol ip u32 match ip tos
0x50 0xff flowid 1:3

tc filter add dev eml_test parent 1:0 prio 4 protocol ip u32 match ip tos
0x00 0xff flowid 1:4
__________

My questions are:

- What, if anything, is missing or requires change in my config given the
stated requirements?

- What command, if any, should I use to view how many bytes and packets are
in each of the four queues?

Any help would be most appreciated.

From ethy.brito at inexo.com.br Wed Jun 13 20:18:28 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Thu Jun 14 22:03:18 2007
Subject: [LARTC] shaping using source IP after NAT
In-Reply-To: <466D9C47.7070909@relef.net>
References: <20070611155834.075b7462@pulsar.inexo.com.br> <466D9C47.7070909@relef.net>
Message-ID: <20070613151828.2fe236a8@babalu.inexo.com.br>

On Mon, 11 Jun 2007 22:02:31 +0300
VladSun wrote:

> TC is performed after POSTROUTING, so you can not do any IP related TC
> filtering. You can use CPU friendly patches for iptables like IPMARK or
> IPCLASSIFY. Take a look at them.

Ok. Can someone point me in the right direction to add IPMARK kernel support?

I downloaded patch-o-matic today's snapshot and there is no IPMARK there.

I have iptables-1.3.7 and kernel 2.6.21.1 sources (distro is slackware 11.0)

The curious thing is that IPMARK is in the iptables man page but I got an
error when I execute it. It says it could not
find /usr/lib/iptables/libipt_IPMARK.so:

# locate -i IPMARK
# (no output here)

Regards.

Ethy

From ethy.brito at inexo.com.br Thu Jun 14 22:11:32 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Thu Jun 14 22:10:57 2007
Subject: [LARTC] shaping using source IP after NAT
In-Reply-To: <20070613151828.2fe236a8@babalu.inexo.com.br>
References: <20070611155834.075b7462@pulsar.inexo.com.br> <466D9C47.7070909@relef.net> <20070613151828.2fe236a8@babalu.inexo.com.br>
Message-ID: <20070614171132.1653e03f@babalu.inexo.com.br>

PLEASE disregard this.
My MUA went crazy and resent a lot of my emails today.

Forgive me.

Ethy

On Wed, 13 Jun 2007 15:18:28 -0300
"Ethy H. Brito" wrote:

> On Mon, 11 Jun 2007 22:02:31 +0300
> VladSun wrote:
>
>
> > TC is performed after POSTROUTING, so you can not do any IP related TC
> > filtering. You can use CPU friendly patches for iptables like IPMARK or
> > IPCLASSIFY. Take a look at them.
>
> Ok. Can someone point me in the right direction to add IPMARK kernel support?
>
> I downloaded patch-o-matic today's snapshot and there is no IPMARK there.
>
> I have iptables-1.3.7 and kernel 2.6.21.1 sources (distro is slackware 11.0)
>
> The curious thing is that IPMARK is in the iptables man page but I got an
> error when I execute it. It says it could not
> find /usr/lib/iptables/libipt_IPMARK.so:
>
> # locate -i IPMARK
> # (no output here)
>
>
> Regards.
>
> Ethy
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


-- 

Ethy H. Brito         /"\
InterNexo Ltda.       \ /  CAMPANHA DA FITA ASCII - CONTRA MAIL HTML
+55 (12) 3797-6860     X   ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
S.J.Campos - Brasil   / \

From christian.benvenuti at libero.it Thu Jun 14 22:43:48 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Thu Jun 14 22:38:45 2007
Subject: [LARTC] Re: PQ questions
Message-ID: <1181853828.10947.15.camel@benve-laptop>

Hi,

>Hi all,
>
>First, let me say I've been most impressed with how quickly and
>professionally people on this list ask and answer questions.
>
>Next, let me say that what I need help with is properly configuring strict
>PQ, and gathering certain stats. Specifically:
>
>- I need to create a priority queue with four queues (let's say they are of
>high, medium, normal, and low priority)
>
>- I need to use tc filters such that:
>
>  - EF (0xB8) goes to the high priority queue
>
>  - AF21 (0x50) goes to the medium priority queue
>
>  - AF11 (0x28) goes to the normal priority queue, and
>
>  - BE traffic goes to the low priority queue
>
>- For stat collection, I need to see:
>
>  - how many bytes and packets are in each of the four queues
>
>- My configuration thus far is:
>
>tc qdisc add dev eml_test root handle 1: prio bands 4 priomap 0 1 2 3
>
>tc filter add dev eml_test parent 1:0 prio 1 protocol ip u32 match ip tos
>0xb8 0xff flowid 1:1
>
>tc filter add dev eml_test parent 1:0 prio 2 protocol ip u32 match ip tos
>0x80 0xff flowid 1:2
>
>tc filter add dev eml_test parent 1:0 prio 3 protocol ip u32 match ip tos
>0x50 0xff flowid 1:3
>
>tc filter add dev eml_test parent 1:0 prio 4 protocol ip u32 match ip tos
>0x00 0xff flowid 1:4
>__________

Here is an article you may find useful:
  http://citeseer.ist.psu.edu/539891.html

Here is the description of the configuration parameters of the PRIO
qdisc:
  http://www.lartc.org/howto/lartc.qdisc.classful.html#AEN903
  (just in case you did not know what the "priomap" option is used for)

>My questions are:
>
>- What, if anything, is missing or requires change in my config given the
>stated requirements?

Your config does not prevent a higher priority class from starving
a lower priority class.
You can prevent it in two different ways (at least):

1) You can assign a TBF qdisc (Token Bucket) to the PRIO classes
   TBF: http://www.lartc.org/howto/lartc.qdisc.classless.html#AEN691

2) You can replace the PRIO qdisc with something like HTB/CBQ
   CBQ: http://www.lartc.org/howto/lartc.qdisc.classful.html#AEN939
   HTB: http://luxik.cdi.cz/~devik/qos/htb/

>- What command, if any, should I use to view how many bytes and packets are
>in each of the four queues?
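
As a side note on the TOS values in the quoted filters: "u32 match ip tos VALUE 0xff" compares the whole TOS byte, which carries the DSCP value in its upper six bits, i.e. the DSCP shifted left by two. The lines below are a minimal sketch of that conversion, assuming the standard DSCP code points (EF = 46, AF21 = 18, AF22 = 20, AF11 = 10); dscp_to_tos is a hypothetical helper name, not part of tc.

```shell
#!/bin/sh
# Hypothetical helper: print the TOS byte that corresponds to a DSCP
# value, in the hex form used with "u32 match ip tos".
dscp_to_tos() {
    printf '0x%02x\n' $(( $1 << 2 ))
}

dscp_to_tos 46   # EF
dscp_to_tos 18   # AF21
dscp_to_tos 20   # AF22
dscp_to_tos 10   # AF11
dscp_to_tos 0    # best effort
```

Under these assumptions, 0xb8 and 0x28 correspond to EF and AF11 as stated, while 0x50 corresponds to AF22 rather than AF21 (0x48), and the quoted filters contain 0x80 but no rule for 0x28, so those values may be worth double-checking against the stated requirements.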
The PRIO qdisc does not return statistics for its classes.
However, a simple workaround consists of explicitly adding
a qdisc to the four classes.
By default the PRIO qdisc assigns a pFIFO (packet FIFO) qdisc to
its classes.
Here is how you can replace the 4 default pFIFO qdisc with 4
explicit pFIFO qdisc:

tc qdisc add dev eml_test parent 1:1 pfifo limit 1000
tc qdisc add dev eml_test parent 1:2 pfifo limit 1000
tc qdisc add dev eml_test parent 1:3 pfifo limit 1000
tc qdisc add dev eml_test parent 1:4 pfifo limit 1000

Now you can get the stats with:
tc -s -d qdisc list dev eml_test

Regards
/Christian
[ http://benve.info ]

From ethy.brito at inexo.com.br Thu Jun 14 22:51:18 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Thu Jun 14 22:50:50 2007
Subject: [LARTC] shaping using source IP after NAT
In-Reply-To: <92ed523b0706141225l240fe5b8gaa80aba870ae00f2@mail.gmail.com>
References: <20070611155834.075b7462@pulsar.inexo.com.br> <466D9C47.7070909@relef.net> <20070613151828.2fe236a8@babalu.inexo.com.br> <4670518C.1090806@relef.net> <92ed523b0706141225l240fe5b8gaa80aba870ae00f2@mail.gmail.com>
Message-ID: <20070614175118.394b24bd@babalu.inexo.com.br>

On Thu, 14 Jun 2007 16:25:14 -0300
"Marco Aurelio" wrote:

> I think it is better to use an IFB device and shape the upload traffic
> using source IP before the NAT
>
> http://linux-net.osdl.org/index.php/IFB

Before NAT?!?!
Where does IFB hook netfilter tables?? Before mangle POSTROUTING?

Ethy

From luciano at lugmen.org.ar Fri Jun 15 04:55:35 2007
From: luciano at lugmen.org.ar (Luciano Ruete)
Date: Fri Jun 15 04:55:50 2007
Subject: [LARTC] GUI or other tools for traffic shaping
In-Reply-To:
References:
Message-ID: <200706142355.35528.luciano@lugmen.org.ar>

On Thursday 14 June 2007 06:06:54 Joost Kraaijeveld wrote:
> Hi,
>
> Are there GUI (preferable) or scripting tools available somewhere that can
> help me with traffic shaping? I have found MasterShaper and tcng but hey
> seem both unmaintained.
Directly writing scripts is still a bit out of my > reach, so I would like to learn by using tools... htb-gen[1] is very easy to setup but yet powerfull. It is meant for internet sharing scenarios, like small/medium ISPs and home/office internet share. You also have an stdout target to see what are the tc/iptables commands executed. And it is still maintained and evolving... ;) [1]http://freshmeat.net/projects/htb-gen/ -- Luciano From salim.si at cipherium.com.tw Fri Jun 15 05:26:56 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Fri Jun 15 05:27:20 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <20070614042314.GD5364@samad.com.au> Message-ID: <001601c7aefd$0a3df520$5964a8c0@SalimSi> I do the same way, from ip-up. But I only change the two concerned rules. Rest of the things are free from IP. -----Original Message----- From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Alex Samad Sent: Thursday, June 14, 2007 12:23 PM To: lartc@mailman.ds9a.nl Subject: Re: [LARTC] Re: multiple routing tables for internal router programs On Thu, Jun 14, 2007 at 11:50:30AM +0800, Salim S I wrote: > I solved it, thought a bit ugly. > > Have two more rules now in ip ru > > 32150: from all lookup main > 32201: from all fwmark 0x200/0x200 lookup wan1_route > 32202: from all fwmark 0x400/0x400 lookup wan2_route > 32203: from 10.20.0.137 lookup wan1_route > 32204: from 10.2.3.107 lookup wan2_route > 32205: from all lookup catch_all > 32766: from all lookup main > > I did not like to include WAN IP anywhere, coz it may be dynamic, but > well, seems like no choice. ran into the same problem, I capture the link information at ip-up time for ppp/pppoe and dhcp time for cable modem, then I fire off a scrip that pulls down all the ip ru & ip ro and builds it from scratch (as well as the specialised iptables rules as well). 
This should only happen when I loose a connection so should be okay > > And then two rules in OUTPUT chain > Iptables -t mangle -A OUTPUT -o eth2 -j LB1 > Iptables -t mangle -A OUTPUT -o eth3 -j LB2 > > -----Original Message----- > From: lartc-bounces@mailman.ds9a.nl > [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Salim S I > Sent: Wednesday, June 13, 2007 12:08 PM > To: 'Peter Rabbitson' > Cc: lartc@mailman.ds9a.nl > Subject: RE: [LARTC] Re: multiple routing tables for internal router > programs > > My configuration > > root@127.0.0.1:~# ip ru > 0: from all lookup local > 32150: from all lookup main > 32201: from all fwmark 0x200/0x200 lookup wan1_route > 32202: from all fwmark 0x400/0x400 lookup wan2_route > 32203: from all lookup catch_all > 32766: from all lookup main > 32767: from all lookup default > > root@127.0.0.1:~# ip ro li ta main > 192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.254 > 10.20.0.0/24 dev eth2 proto kernel scope link src 10.20.0.137 > 192.168.1.0/24 dev eth10 proto kernel scope link src 192.168.1.254 > 10.2.3.0/24 dev eth3 proto kernel scope link src 10.2.3.107 > 127.0.0.0/8 dev lo scope link > > root@127.0.0.1:~# ip ro li ta wan1_route > default via 10.20.0.1 dev eth2 proto static > root@127.0.0.1:~# ip ro li ta wan2_route > default via 10.2.3.254 dev eth3 proto static > > root@127.0.0.1:~# ip ro li ta catch_all > default proto static > nexthop via 10.20.0.1 dev eth2 weight 1 > nexthop via 10.2.3.254 dev eth3 weight 1 > > The catch_all table comes into play only for local packets. All > forwarded packets are marked in mangle PREROUTING, with 0x200 0r 0x400. > > If not loadblancing ping script, there maybe other apps using domain > names instead of IP address, they might still fail, right? > > The problem happens when one of the link goes down (not the nexthop,but > after that). Then the kernel will pick an interface and wrong src IP for > local packets. 
> > > -----Original Message----- > From: Peter Rabbitson [mailto:rabbit@rabbit.us] > Sent: Tuesday, June 12, 2007 7:24 PM > To: Salim S I > Cc: lartc@mailman.ds9a.nl > Subject: Re: [LARTC] Re: multiple routing tables for internal router > programs > > Salim S I wrote: > > Thanks! I get it now. > > But why the src address for the interface is wrong? > > In my case eth2 has a.b.c.d and eth3 has p.q.r.s. > > > > DNS queries going through eth2 has p.q.r.s as src address and those > > going through eth3 has a.b.c.d. Something wrong with routing? > > Possible. Post full configuration and someone might be able to help. > > > I was wondering, how the ping script (to check the lonk status) of > > others work id domain name is used. > > Don't know about others, and I personally use ip addresses :) > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > From J.Kraaijeveld at Askesis.nl Fri Jun 15 06:55:08 2007 From: J.Kraaijeveld at Askesis.nl (Joost Kraaijeveld) Date: Fri Jun 15 06:55:16 2007 Subject: [LARTC] GUI or other tools for traffic shaping In-Reply-To: <200706142355.35528.luciano@lugmen.org.ar> References: <200706142355.35528.luciano@lugmen.org.ar> Message-ID: <1181883308.27344.0.camel@panoramix> Hi Luciano, On Thu, 2007-06-14 at 23:55 -0300, Luciano Ruete wrote: > On Thursday 14 June 2007 06:06:54 Joost Kraaijeveld wrote: > > Hi, > > > > Are there GUI (preferable) or scripting tools available somewhere that can > > help me with traffic shaping? I have found MasterShaper and tcng but hey > > seem both unmaintained. Directly writing scripts is still a bit out of my > > reach, so I would like to learn by using tools... > > htb-gen[1] is very easy to setup but yet powerfull. 
> It is meant for internet sharing scenarios, like small/medium ISPs and > home/office internet share. > You also have an stdout target to see what are the tc/iptables commands > executed. > > And it is still maintained and evolving... ;) > > [1]http://freshmeat.net/projects/htb-gen/ This looks as what I am looking for , thanks. -- Groeten, Joost Kraaijeveld Askesis B.V. Molukkenstraat 14 6524NB Nijmegen tel: 024-3888063 / 06-51855277 fax: 024-3608416 web: www.askesis.nl From rabbit at rabbit.us Fri Jun 15 08:00:57 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Fri Jun 15 08:01:03 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <001601c7ae6f$9cb88610$5964a8c0@SalimSi> References: <001601c7ae6f$9cb88610$5964a8c0@SalimSi> Message-ID: <46722B19.40601@rabbit.us> Salim S I wrote: > > NATing is done with MASQUERADE, not SNAT, I use another MARK for it, but > in essence it is > -o eth2 -j MASQUEARDE > -o eth3 -j MASQUEARDE > > In addition, there are several other MARKs for policy routing. They have > their own routing tables also. But at present, they are all empty. > This is the part I definitely do not like. First of all - wht SNAT/MASQUERADE _all_ traffic? You should do this for forwarder traffic only. Like so: iptables -t nat -A POSTROUTING -s 10.0.58.0/24 -j SOURCE_NAT iptables -t nat -A POSTROUTING -s 192.168.58.0/24 -j SOURCE_NAT iptables -t nat -A POSTROUTING -s 192.168.8.0/24 -j SOURCE_NAT iptables -t nat -A SOURCE_NAT -o $EXTCH -j SNAT --to $EXTCH_IP iptables -t nat -A SOURCE_NAT -o $EXTCB -j SNAT --to $EXTCB_IP Also you mention that there are "other marks" , which means that you might very well be overwriting marks as you go. 
A packet/connection can have only _one_ mark value at any time, no more no less (a 0x0 is still a mark) HTH From rabbit at rabbit.us Fri Jun 15 08:01:07 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Fri Jun 15 08:01:18 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <001601c7ae6f$9cb88610$5964a8c0@SalimSi> References: <001601c7ae6f$9cb88610$5964a8c0@SalimSi> Message-ID: <46722B23.8070503@rabbit.us> Salim S I wrote: > > NATing is done with MASQUERADE, not SNAT, I use another MARK for it, but > in essence it is > -o eth2 -j MASQUEARDE > -o eth3 -j MASQUEARDE > > In addition, there are several other MARKs for policy routing. They have > their own routing tables also. But at present, they are all empty. > This is the part I definitely do not like. First of all - wht SNAT/MASQUERADE _all_ traffic? You should do this for forwarder traffic only. Like so: iptables -t nat -A POSTROUTING -s 10.0.58.0/24 -j SOURCE_NAT iptables -t nat -A POSTROUTING -s 192.168.58.0/24 -j SOURCE_NAT iptables -t nat -A POSTROUTING -s 192.168.8.0/24 -j SOURCE_NAT iptables -t nat -A SOURCE_NAT -o $EXTCH -j SNAT --to $EXTCH_IP iptables -t nat -A SOURCE_NAT -o $EXTCB -j SNAT --to $EXTCB_IP Also you mention that there are "other marks" , which means that you might very well be overwriting marks as you go. A packet/connection can have only _one_ mark value at any time, no more no less (a 0x0 is still a mark) HTH From salim.si at cipherium.com.tw Fri Jun 15 08:21:21 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Fri Jun 15 08:21:42 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <46722B23.8070503@rabbit.us> Message-ID: <001701c7af15$677f2bb0$5964a8c0@SalimSi> > > NATing is done with MASQUERADE, not SNAT, I use another MARK for it, but > > in essence it is > > -o eth2 -j MASQUEARDE > > -o eth3 -j MASQUEARDE > > > > In addition, there are several other MARKs for policy routing. 
They have > > their own routing tables also. But at present, they are all empty. > > > > This is the part I definitely do not like. First of all - wht > SNAT/MASQUERADE _all_ traffic? You should do this for forwarder traffic > only. Like so: Yes, in fact, this is what I do. I mentioned I use MARK for MASQUERADing, but forgot to elaborate. That particular MARK is set for forwarded packets only. > Also you mention that there are "other marks" , which means that you > might very well be overwriting marks as you go. A packet/connection can > have only _one_ mark value at any time, no more no less (a 0x0 is still > a mark) I use --or-mark in iptables, so that I can use bitwise masks. The 'ip' tool supports bit masks too. From rabbit at rabbit.us Fri Jun 15 08:29:43 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Fri Jun 15 08:29:56 2007 Subject: [LARTC] Re: multiple routing tables for internal router programs In-Reply-To: <001701c7af15$677f2bb0$5964a8c0@SalimSi> References: <001701c7af15$677f2bb0$5964a8c0@SalimSi> Message-ID: <467231D7.1070105@rabbit.us> Salim S I wrote: > >>> NATing is done with MASQUERADE, not SNAT, I use another MARK for it, > but >>> in essence it is >>> -o eth2 -j MASQUEARDE >>> -o eth3 -j MASQUEARDE >>> >>> In addition, there are several other MARKs for policy routing. They > have >>> their own routing tables also. But at present, they are all empty. >>> >> This is the part I definitely do not like. First of all - wht >> SNAT/MASQUERADE _all_ traffic? You should do this for forwarder > traffic >> only. Like so: > > Yes, in fact, this is what I do. I mentioned I use MARK for > MASQUERADing, but forgot to elaborate. That particular MARK is set for > forwarded packets only. > > >> Also you mention that there are "other marks" , which means that you >> might very well be overwriting marks as you go. 
A packet/connection > can >> have only _one_ mark value at any time, no more no less (a 0x0 is > still >> a mark) > > > I use --or-mark in iptables, so that I can use bitwise masks. The 'ip' > tool supports bit masks too. > Well then you are certainly ahead of the game. Still I would suggest to avoid the complexity of bit mask marks - it is rather error prone and is pretty hard to maintain, while the same result can usually be achieved by other means (like in my SNAT example). As far as your original problem goes - it seems like a mark is getting eaten away or is not set somewhere in the first place. I have not had any problems like the ones you describe. From tenos at ll.mit.edu Fri Jun 15 08:43:21 2007 From: tenos at ll.mit.edu (Tim Enos) Date: Fri Jun 15 08:48:49 2007 Subject: [LARTC] Re: PQ questions In-Reply-To: <1181853828.10947.15.camel@benve-laptop> Message-ID: <200706150643.l5F6hYeo015579@ll.mit.edu> Hi Christian, Thanks for the help. Please see my in-line comments: > -----Original Message----- > From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] > On Behalf Of Christian Benvenuti > Sent: Thursday, June 14, 2007 4:44 PM > To: lartc@mailman.ds9a.nl > Subject: [LARTC] Re: PQ questions > > Hi, > > >Hi all, > > > >First, let me say I've been most impressed with how quickly and > >professionally people on this list ask and answer questions. > > > >Next, let me say that with which I need help is properly configuring > strict > >PQ, and gathering certain stats. 
Specifically: > > > >- I need to create a priority queue with four queues (let's say they are > of > >high, medium, normal, and low priority) > > > >- I need to use tc filters such that: > > > > - EF (0xB8) goes to the high priority queue > > > > - AF21 (0x50) goes to the medium priority queue > > > > - AF11 (0x28) goes to the normal priority queue, and > > > > - BE traffic goes to the low priority queue > > > >- For stat collection, I need to see: > > > > - how many bytes and packets are in each of the four queues > > > >- My configuration thus far is: > > > >tc qdisc add dev eml_test root handle 1: prio bands 4 priomap 0 1 2 3 > > > >tc filter add dev eml_test parent 1:0 prio 1 protocol ip u32 match ip tos > >0xb8 0xff flowid 1:1 > > > >tc filter add dev eml_test parent 1:0 prio 2 protocol ip u32 match ip tos > >0x80 0xff flowid 1:2 > > > >tc filter add dev eml_test parent 1:0 prio 3 protocol ip u32 match ip tos > >0x50 0xff flowid 1:3 > > > >tc filter add dev eml_test parent 1:0 prio 4 protocol ip u32 match ip tos > >0x00 0xff flowid 1:4 > >__________ > > Here is an article you may find useful: > http://citeseer.ist.psu.edu/539891.html > > Here is the description of the configuration parameters of the > PRIO qdisc: > http://www.lartc.org/howto/lartc.qdisc.classful.html#AEN903 > (just in case you did not know what the "priomap" option is > used for) > > >My questions are: > > > >- What if anything is missing/requiring change in my config given the > stated > >requirements? > > Your config does not prevent an higher priority class from starving > a lower priority class. Exactly. That is requirement. > You can prevent it in two different ways (at > least): Don't want to prevent it right now. 
> > 1) You can assign a TBF qdisc (Token Bucket) to the PRIO classes > TBF: http://www.lartc.org/howto/lartc.qdisc.classless.html#AEN691 > > 2) You can replace the PRIO qdisc with something like HTB/CBQ > CBQ: http://www.lartc.org/howto/lartc.qdisc.classful.html#AEN939 > HTB: http://luxik.cdi.cz/~devik/qos/htb/ > > >- What if any command should I use to view how many bytes and packets are > in > >each of the four queues? > > The PRIO qdisc does not return statistics for its classes. > However, a simple workaround consists of explicitly adding > a qdisc to the four classes. > By default the PRIO qdisc assigns a pFIFO (packet FIFO) qdisc to > its classes. > Here is how you can replace the 4 default pFIFO qdisc with 4 > explicit pFIFO qdisc: > > tc qdisc add dev eml_test parent 1:1 pfifo limit 1000 > tc qdisc add dev eml_test parent 1:2 pfifo limit 1000 > tc qdisc add dev eml_test parent 1:3 pfifo limit 1000 > tc qdisc add dev eml_test parent 1:4 pfifo limit 1000 > > Now you can get the stats with: > tc -s -d qdisc list dev eml_test Those stats are nice to have, but the ones I must have are for how many bytes/packets are enqueued at whatever time I check the queues. > > Regards > /Christian > [ http://benve.info ] > I have tried to configure PQ to have two queues per filter with no success. Is it even possible to have (what I'll call) hierarchical PQ? I have yet to find it. 
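For the "how much is enqueued right now" requirement, here is a minimal sketch that pulls the backlog counters out of `tc -s qdisc` output. It assumes each qdisc's statistics contain a field of the form `backlog <bytes>b <packets>p`; the helper name `backlog_per_qdisc` and the sample text are made up for illustration, and other iproute2 versions may print the byte count differently (for example with a unit suffix), in which case the pattern needs adjusting.

```shell
#!/bin/sh
# Read `tc -s qdisc` output on stdin and print one line per qdisc:
#   <kind> <handle> <backlog bytes> <backlog packets>
# backlog_per_qdisc is an illustrative helper name, not a tc subcommand.
backlog_per_qdisc() {
    awk '
        # A line starting with "qdisc" opens a new qdisc section.
        $1 == "qdisc" { kind = $2; handle = $3 }
        {
            # Assumed field layout: backlog <N>b <M>p
            for (i = 1; i + 2 <= NF; i++)
                if ($i == "backlog") {
                    b = $(i + 1); p = $(i + 2)
                    sub(/b$/, "", b); sub(/p$/, "", p)
                    print kind, handle, b, p
                }
        }'
}

# Made-up sample in the assumed format, so the sketch runs without root.
backlog_per_qdisc <<'EOF'
qdisc pfifo 10: parent 1:1 limit 1000p
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
EOF
```

On a live box the same helper would be fed from tc, e.g. `tc -s qdisc show dev eml_test | backlog_per_qdisc`, once each PRIO band has its own explicit pfifo attached.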
> > > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From christian.benvenuti at libero.it Fri Jun 15 09:32:06 2007 From: christian.benvenuti at libero.it (Christian Benvenuti) Date: Fri Jun 15 09:27:06 2007 Subject: [LARTC] Re: PQ questions In-Reply-To: <200706150643.l5F6hYeo015579@ll.mit.edu> References: <200706150643.l5F6hYeo015579@ll.mit.edu> Message-ID: <1181892726.2702.7.camel@benve-laptop> Hi, > > Your config does not prevent an higher priority class from starving > > a lower priority class. > > Exactly. That is requirement. OK > Those stats are nice to have, but the ones I must have are for how many > bytes/packets are enqueued at whatever time I check the queues. That information is there. Here is an example: (b=bytes p=packets) #tc -s -d qdisc list dev eth1 qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 35p requeues 0 +-> This field is not initialized for this qdisc type qdisc pfifo 10: parent 1:1 limit 1000p Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 ^^^^^^^^^^^^^ qdisc pfifo 20: parent 1:2 limit 1000p Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 50470b 35p requeues 0 ^^^^^^^^^^^^^^^^^^ qdisc pfifo 30: parent 1:3 limit 1000p Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 ^^^^^^^^^^^^^ > I have tried to configure PQ to have two queues per filter with no success. What do you mean? > Is it even possible to have (what I'll call) hierarchical PQ? I have yet to > find it. Something like this? 
tc qdisc add dev eth1 handle 1: root prio tc qdisc add dev eth1 parent 1:1 handle 10 prio tc qdisc add dev eth1 parent 1:2 handle 20 prio tc qdisc add dev eth1 parent 1:3 handle 30 prio Regards /Christian [ http://benve.info ] From salim.si at cipherium.com.tw Fri Jun 15 09:46:31 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Fri Jun 15 09:46:48 2007 Subject: [LARTC] Re: PQ questions In-Reply-To: <1181892726.2702.7.camel@benve-laptop> Message-ID: <001801c7af21$4d9a5510$5964a8c0@SalimSi> Slightly offtopic... Has anyone really experienced starving of low priority traffic with PRIO qdisc? In my setup, I never achieved that, though I also wanted exactly that situation. I gave both the classes same amount of traffic at the same time. High prio got more bandwidth, but no starvation, even after I sent more traffic than the link capacity. > -----Original Message----- > From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] > On Behalf Of Christian Benvenuti > Sent: Friday, June 15, 2007 3:32 PM > To: lartc@mailman.ds9a.nl > Subject: [LARTC] Re: PQ questions > > Hi, > > > > Your config does not prevent an higher priority class from starving > > > a lower priority class. > > > > Exactly. That is requirement. > > OK > > > Those stats are nice to have, but the ones I must have are for how many > > bytes/packets are enqueued at whatever time I check the queues. > > That information is there. 
Here is an example:
> (b=bytes p=packets)
>
> #tc -s -d qdisc list dev eth1
>
> qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 35p requeues 0
> +-> This field is not initialized for this
> qdisc type
> qdisc pfifo 10: parent 1:1 limit 1000p
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
> ^^^^^^^^^^^^^
> qdisc pfifo 20: parent 1:2 limit 1000p
> Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 50470b 35p requeues 0
> ^^^^^^^^^^^^^^^^^^
> qdisc pfifo 30: parent 1:3 limit 1000p
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
> ^^^^^^^^^^^^^
>
> > I have tried to configure PQ to have two queues per filter with no
> success.
>
> What do you mean?
>
> > Is it even possible to have (what I'll call) hierarchical PQ? I have yet
> to
> > find it.
>
> Something like this?
>
> tc qdisc add dev eth1 handle 1: root prio
> tc qdisc add dev eth1 parent 1:1 handle 10 prio
> tc qdisc add dev eth1 parent 1:2 handle 20 prio
> tc qdisc add dev eth1 parent 1:3 handle 30 prio
>
> Regards
> /Christian
> [ http://benve.info ]
>
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

From christian.benvenuti at libero.it Fri Jun 15 10:16:12 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Fri Jun 15 10:10:48 2007
Subject: [LARTC] Re: PQ questions
In-Reply-To: <001801c7af21$4d9a5510$5964a8c0@SalimSi>
References: <001801c7af21$4d9a5510$5964a8c0@SalimSi>
Message-ID: <1181895372.2702.20.camel@benve-laptop>

Hi,
a class is starved only if those with higher priority are
always (or pretty often) backlogged and do not give the lower
priority classes a chance to transmit.
Therefore, if you transmit at a rate smaller than your CPU/s and NIC/s can handle you will not experience any starving. For example, if you generate 50Mbit traffic on a 100Mbit NIC it is likely that you won't see any starving (unless your system is not able to handle 50Mbit traffic because of a complex TC or iptables configuration that consumes lot of CPU). Regards /Christian [ http://benve.info ] On Fri, 2007-06-15 at 15:46 +0800, Salim S I wrote: > Slightly offtopic... Has anyone really experienced starving of low > priority traffic with PRIO qdisc? > In my setup, I never achieved that, though I also wanted exactly that > situation. I gave both the classes same amount of traffic at the same > time. High prio got more bandwidth, but no starvation, even after I sent > more traffic than the link capacity. > > > -----Original Message----- > > From: lartc-bounces@mailman.ds9a.nl > [mailto:lartc-bounces@mailman.ds9a.nl] > > On Behalf Of Christian Benvenuti > > Sent: Friday, June 15, 2007 3:32 PM > > To: lartc@mailman.ds9a.nl > > Subject: [LARTC] Re: PQ questions > > > > Hi, > > > > > > Your config does not prevent an higher priority class from > starving > > > > a lower priority class. > > > > > > Exactly. That is requirement. > > > > OK > > > > > Those stats are nice to have, but the ones I must have are for how > many > > > bytes/packets are enqueued at whatever time I check the queues. > > > > That information is there. 
Here is an example: > > (b=bytes p=packets) > > > > #tc -s -d qdisc list dev eth1 > > > > qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > > Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues 0) > > rate 0bit 0pps backlog 0b 35p requeues 0 > > +-> This field is not initialized for this > > qdisc type > > qdisc pfifo 10: parent 1:1 limit 1000p > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > rate 0bit 0pps backlog 0b 0p requeues 0 > > ^^^^^^^^^^^^^ > > qdisc pfifo 20: parent 1:2 limit 1000p > > Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues 0) > > rate 0bit 0pps backlog 50470b 35p requeues 0 > > ^^^^^^^^^^^^^^^^^^ > > qdisc pfifo 30: parent 1:3 limit 1000p > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > rate 0bit 0pps backlog 0b 0p requeues 0 > > ^^^^^^^^^^^^^ > > > > > I have tried to configure PQ to have two queues per filter with no > > success. > > > > What do you mean? > > > > > Is it even possible to have (what I'll call) hierarchical PQ? I have > yet > > to > > > find it. > > > > Something like this? > > > > tc qdisc add dev eth1 handle 1: root prio > > tc qdisc add dev eth1 parent 1:1 handle 10 prio > > tc qdisc add dev eth1 parent 1:2 handle 20 prio > > tc qdisc add dev eth1 parent 1:3 handle 30 prio > > > > Regards > > /Christian > > [ http://benve.info ] From salim.si at cipherium.com.tw Fri Jun 15 11:13:48 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Fri Jun 15 11:14:02 2007 Subject: [LARTC] Re: PQ questions In-Reply-To: <1181895372.2702.20.camel@benve-laptop> Message-ID: <001901c7af2d$7c6e21d0$5964a8c0@SalimSi> I tested on wireless link. It could give a maximum of 45Mbps. And I sent 30Mbps of both low prio and high prio traffic. Total of 60Mbps. My test was done with UDP, using tcpdump. When I increased the bandwidth to 40Mbps each, the high priority class got lesser bandwidth. 
(maybe the effect of the known issue that large amount of low prio traffic can starve high prio traffic) > -----Original Message----- > From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] > On Behalf Of Christian Benvenuti > Sent: Friday, June 15, 2007 4:16 PM > To: lartc@mailman.ds9a.nl > Subject: [LARTC] Re: PQ questions > > Hi, > a class is starved only if those with higher priority are > always (of pretty often) backlogged and do not give the lower > priority classes a chance to transmit. > Therefore, if you transmit at a rate smaller than your CPU/s and > NIC/s can handle you will not experience any starving. > > For example, if you generate 50Mbit traffic on a 100Mbit NIC > it is likely that you won't see any starving (unless your system is > not able to handle 50Mbit traffic because of a complex TC or > iptables configuration that consumes lot of CPU). > > Regards > /Christian > [ http://benve.info ] > > On Fri, 2007-06-15 at 15:46 +0800, Salim S I wrote: > > Slightly offtopic... Has anyone really experienced starving of low > > priority traffic with PRIO qdisc? > > In my setup, I never achieved that, though I also wanted exactly that > > situation. I gave both the classes same amount of traffic at the same > > time. High prio got more bandwidth, but no starvation, even after I sent > > more traffic than the link capacity. > > > > > -----Original Message----- > > > From: lartc-bounces@mailman.ds9a.nl > > [mailto:lartc-bounces@mailman.ds9a.nl] > > > On Behalf Of Christian Benvenuti > > > Sent: Friday, June 15, 2007 3:32 PM > > > To: lartc@mailman.ds9a.nl > > > Subject: [LARTC] Re: PQ questions > > > > > > Hi, > > > > > > > > Your config does not prevent an higher priority class from > > starving > > > > > a lower priority class. > > > > > > > > Exactly. That is requirement. 
> > > > > > OK > > > > > > > Those stats are nice to have, but the ones I must have are for how > > many > > > > bytes/packets are enqueued at whatever time I check the queues. > > > > > > That information is there. Here is an example: > > > (b=bytes p=packets) > > > > > > #tc -s -d qdisc list dev eth1 > > > > > > qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > > > Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues 0) > > > rate 0bit 0pps backlog 0b 35p requeues 0 > > > +-> This field is not initialized for this > > > qdisc type > > > qdisc pfifo 10: parent 1:1 limit 1000p > > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > > rate 0bit 0pps backlog 0b 0p requeues 0 > > > ^^^^^^^^^^^^^ > > > qdisc pfifo 20: parent 1:2 limit 1000p > > > Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues 0) > > > rate 0bit 0pps backlog 50470b 35p requeues 0 > > > ^^^^^^^^^^^^^^^^^^ > > > qdisc pfifo 30: parent 1:3 limit 1000p > > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > > rate 0bit 0pps backlog 0b 0p requeues 0 > > > ^^^^^^^^^^^^^ > > > > > > > I have tried to configure PQ to have two queues per filter with no > > > success. > > > > > > What do you mean? > > > > > > > Is it even possible to have (what I'll call) hierarchical PQ? I have > > yet > > > to > > > > find it. > > > > > > Something like this? 
> > >
> > > tc qdisc add dev eth1 handle 1: root prio
> > > tc qdisc add dev eth1 parent 1:1 handle 10 prio
> > > tc qdisc add dev eth1 parent 1:2 handle 20 prio
> > > tc qdisc add dev eth1 parent 1:3 handle 30 prio
> > >
> > > Regards
> > > /Christian
> > > [ http://benve.info ]
>
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

From salim.si at cipherium.com.tw Fri Jun 15 11:36:20 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Fri Jun 15 11:36:34 2007
Subject: [LARTC] Re: multiple routing tables for internal router programs
In-Reply-To: <467231D7.1070105@rabbit.us>
Message-ID: <001a01c7af30$a4c23330$5964a8c0@SalimSi>

> -----Original Message-----
> From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl]
> On Behalf Of Peter Rabbitson
> Sent: Friday, June 15, 2007 2:30 PM
> Cc: lartc@mailman.ds9a.nl
> Subject: Re: [LARTC] Re: multiple routing tables for internal router
> programs

> Well then you are certainly ahead of the game. Still I would suggest to
> avoid the complexity of bit mask marks - it is rather error prone and is
> pretty hard to maintain, while the same result can usually be achieved
> by other means (like in my SNAT example). As far as your original
> problem goes - it seems like a mark is getting eaten away or is not set
> somewhere in the first place. I have not had any problems like the ones
> you describe.

Those different MARKs are used for policy routing, load balancing,
firewall, traffic control, virtual server, user-group profiles etc. I
think eventually you may have to use bit masks too, warts and all, or
find some other way of integrating all of those. :-)
I will soon run out of bits, it seems.

I've replaced the multipath default route for local packets (the one in
the catch_all table) with a single route, and I change it on failover.
No balancing for local traffic, but there isn't much local traffic
anyway.
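
The bit-mask marking discussed in this thread can be sketched roughly as below. `fwmark_matches` is an illustrative helper that mimics how an `ip rule` selector written as `fwmark VALUE/MASK` is compared against a packet mark; it is not an iproute2 command. The commands inside `install_example_rules` are never called by the script and need root; they reuse the table names, preference number and gateway quoted earlier in the thread, while treating `eth0` as the LAN side and marking everything from it with 0x200 are assumptions made only to keep the example short (the real setup picks 0x200 or 0x400 per connection).

```shell
#!/bin/sh
# Succeeds when a packet mark would match "fwmark VALUE/MASK",
# i.e. when (mark & MASK) == (VALUE & MASK).
# Arguments: mark value mask. Illustrative helper, not part of iproute2.
fwmark_matches() {
    [ $(( $1 & $3 )) -eq $(( $2 & $3 )) ]
}

# A mark carrying both the 0x200 and the 0x400 bit matches either rule,
# so the rule with the lower preference number is consulted first.
fwmark_matches 0x600 0x200 0x200 && echo "0x600 matches fwmark 0x200/0x200"
fwmark_matches 0x600 0x400 0x400 && echo "0x600 matches fwmark 0x400/0x400"
fwmark_matches 0x400 0x200 0x200 || echo "0x400 does not match fwmark 0x200/0x200"

# Not called here: needs root and a real setup. Shown only to connect the
# helper to the rules discussed in the thread.
install_example_rules() {
    # OR one bit into the mark instead of overwriting the other bits.
    iptables -t mangle -A PREROUTING -i eth0 -j MARK --or-mark 0x200
    # Route on that single bit, whatever the other bits are.
    ip rule add pref 32201 fwmark 0x200/0x200 lookup wan1_route
    # Single default route for local traffic, swapped on failover.
    ip route replace default via 10.20.0.1 dev eth2 table catch_all
}
```

Because every subsystem only ever ORs in and tests its own bits, the marks do not overwrite each other, at the price of having to keep a written allocation of which bit means what.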
From imthiyaz at peopletech.co.in Fri Jun 15 12:30:25 2007 From: imthiyaz at peopletech.co.in (imthiyaz@peopletech.co.in) Date: Fri Jun 15 12:30:35 2007 Subject: [LARTC] sangoma WAN boards with lartc Message-ID: <380-220076515103025735@M2W034.mail2web.com> Hi anyone using sangoma hardware with lartc? pls let me know Thanks Imthiyaz
From tenos at ll.mit.edu Fri Jun 15 18:33:53 2007 From: tenos at ll.mit.edu (Tim Enos) Date: Fri Jun 15 18:43:17 2007 Subject: [LARTC] Re: PQ questions In-Reply-To: <1181892726.2702.7.camel@benve-laptop> Message-ID: <200706151643.l5FGh5eX004201@ll.mit.edu> Hi Christian, > #tc -s -d qdisc list dev eth1 > > qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 35p requeues 0 > +-> This field is not initialized for this > qdisc type > qdisc pfifo 10: parent 1:1 limit 1000p > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 0p requeues 0 > ^^^^^^^^^^^^^ > qdisc pfifo 20: parent 1:2 limit 1000p > Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 50470b 35p requeues 0 > ^^^^^^^^^^^^^^^^^^ > qdisc pfifo 30: parent 1:3 limit 1000p > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 0p requeues 0 > ^^^^^^^^^^^^^ Yes, I can see that from your output.
Here however is my config: tc qdisc add dev eml_test root handle 1: prio bands 4 priomap 0 1 2 3 tc filter add dev eml_test parent 1:0 prio 1 protocol ip u32 match ip tos 0xb8 0xff flowid 1:1 tc filter add dev eml_test parent 1:0 prio 2 protocol ip u32 match ip tos 0x50 0xff flowid 1:2 tc filter add dev eml_test parent 1:0 prio 3 protocol ip u32 match ip tos 0x28 0xff flowid 1:3 tc filter add dev eml_test parent 1:0 prio 4 protocol ip u32 match ip tos 0x00 0xff flowid 1:4 tc qdisc add dev eml_test parent 1:1 handle 10: pfifo limit 2 tc qdisc add dev eml_test parent 1:2 handle 20: pfifo limit 2 tc qdisc add dev eml_test parent 1:3 handle 30: pfifo limit 2 tc qdisc add dev eml_test parent 1:4 handle 40: pfifo limit 2 ___ Here is what I see when issuing the same command: # tc -s -d qdisc list dev eml_test qdisc prio 1: bands 4 priomap 0 1 2 3 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkts (dropped 0, overlimits 0 requeues 0) qdisc pfifo 10: parent 1:1 limit 2p Sent 0 bytes 0 pkts (dropped 0, overlimits 0 requeues 0) qdisc pfifo 20: parent 1:2 limit 2p Sent 0 bytes 0 pkts (dropped 0, overlimits 0 requeues 0) qdisc pfifo 30: parent 1:3 limit 2p Sent 0 bytes 0 pkts (dropped 0, overlimits 0 requeues 0) qdisc pfifo 40: parent 1:4 limit 2p Sent 0 bytes 0 pkts (dropped 0, overlimits 0 requeues 0) > > > I have tried to configure PQ to have two queues per filter with no > success. > > What do you mean? Sorry, let me try to explain it this way (please refer to the above config): - I presently have -a strict PQ scheme which uses four queues - four filters each of which determine what type of traffic gets into which queue (EF, AF21, AF11 and BE respectively in my case) - a specific pFIFO qdisc for each PQ "class" __________ > > > Is it even possible to have (what I'll call) hierarchical PQ? I have yet > to > > find it. > > Something like this? 
> > tc qdisc add dev eth1 handle 1: root prio > tc qdisc add dev eth1 parent 1:1 handle 10 prio > tc qdisc add dev eth1 parent 1:2 handle 20 prio > tc qdisc add dev eth1 parent 1:3 handle 30 prio (see above) I already have something just like this, just with pfifo for each child as opposed to the prio listed in the above config (thanks in great part to your previous help). What I need is one more layer of hierarchy. Specifically, the queues defined by: tc qdisc add dev eth1 parent 1:1 handle 10 prio tc qdisc add dev eth1 parent 1:2 handle 20 prio tc qdisc add dev eth1 parent 1:3 handle 30 prio themselves need to be parents (e.g.): tc qdisc add dev eth1 parent 10:0 handle 11 prio tc qdisc add dev eth1 parent 20:0 handle 21 prio tc qdisc add dev eth1 parent 30:0 handle 31 prio > > Regards > /Christian > [ http://benve.info ] > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From markdv.lartc at asphyx.net Fri Jun 15 20:03:43 2007 From: markdv.lartc at asphyx.net (mark) Date: Fri Jun 15 20:03:52 2007 Subject: [LARTC] HTB question, tokens. Message-ID: <1181930623.3681.48.camel@velocity.nl.tiscali.com> Hi, What exactly are the "tokens"? I thought each token allowed the sending of one byte, that tokens are stored in a bucket that can hold a max of "burst" tokens, and that this bucket is filled with tokens at "rate". But theory does not seem to explain the "tc -s .." output in the examples below. And I can't figure out why or how... 
#tc qdisc del dev eth0 root
#tc qdisc add dev eth0 root handle 1: htb default 1
#tc class add dev eth0 parent 1:0 classid 1:1 htb rate 2mbit
#tc -s -d class show dev eth0
class htb 1:1 root prio 0 quantum 25000 rate 2000Kbit ceil 2000Kbit burst 2599b/8 mpu 0b overhead 0b cburst 2599b/8 mpu 0b overhead 0b level 0
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 10649 ctokens: 10649

#tc qdisc del dev eth0 root
#tc qdisc add dev eth0 root handle 1: htb default 1
#tc class add dev eth0 parent 1:0 classid 1:1 htb rate 1mbit
#tc -s -d class show dev eth0
class htb 1:1 root prio 0 quantum 12500 rate 1000Kbit ceil 1000Kbit burst 2099b/8 mpu 0b overhead 0b cburst 2099b/8 mpu 0b overhead 0b level 0
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 17203 ctokens: 17203

Why does the number of tokens go UP if the configured rate (and burst) is lower? (The commands were run from a script, so these are the numbers of tokens available right after the creation of the class.) If I set the rate to 9mbit the number of tokens is always lower than the burst size. Wouldn't that mean that there are always too few tokens available to actually burst the "burst" amount of data?

Regards,
Mark.

From tenos at ll.mit.edu Fri Jun 15 20:31:14 2007
From: tenos at ll.mit.edu (Tim Enos)
Date: Fri Jun 15 20:32:04 2007
Subject: [LARTC] Re: PQ questions
In-Reply-To: <1181892726.2702.7.camel@benve-laptop>
Message-ID: <200706151831.l5FIVomx025357@ll.mit.edu>

Please send me the exact config by which you got all those params in the output (especially backlog 0b 35p)... I just do not see that in mine.
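For checking what is enqueued at a given moment, the backlog figures can be pulled out of the statistics output with a small filter. This is only a sketch: the function name backlog_per_qdisc is made up for illustration, and it assumes the installed tc prints a "backlog <bytes>b <packets>p" field as in the eth1 output quoted in this thread. If a tc version omits that field, the filter prints nothing for that qdisc.

```shell
#!/bin/sh
# Sketch: print "<kind> <handle> <bytes>b <pkts>p" for every qdisc whose
# statistics contain a backlog field. Reads `tc -s qdisc` output on stdin.
backlog_per_qdisc() {
    awk '
        # Remember which qdisc the following statistics lines belong to.
        $1 == "qdisc" { name = $2 " " $3 }
        {
            # The two words after "backlog" are the byte and packet counts.
            for (i = 1; i < NF; i++)
                if ($i == "backlog")
                    print name, $(i + 1), $(i + 2)
        }
    '
}

# Usage (device name as used in this thread; adjust as needed):
#   tc -s qdisc show dev eml_test | backlog_per_qdisc
```

Run in a loop or under watch while traffic is flowing; with an idle or unsaturated link the counts will normally be zero.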
> -----Original Message----- > From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] > On Behalf Of Christian Benvenuti > Sent: Friday, June 15, 2007 3:32 AM > To: lartc@mailman.ds9a.nl > Subject: [LARTC] Re: PQ questions > > Hi, > > > > Your config does not prevent an higher priority class from starving > > > a lower priority class. > > > > Exactly. That is requirement. > > OK > > > Those stats are nice to have, but the ones I must have are for how many > > bytes/packets are enqueued at whatever time I check the queues. > > That information is there. Here is an example: > (b=bytes p=packets) > > #tc -s -d qdisc list dev eth1 > > qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 35p requeues 0 > +-> This field is not initialized for this > qdisc type > qdisc pfifo 10: parent 1:1 limit 1000p > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 0p requeues 0 > ^^^^^^^^^^^^^ > qdisc pfifo 20: parent 1:2 limit 1000p > Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 50470b 35p requeues 0 > ^^^^^^^^^^^^^^^^^^ > qdisc pfifo 30: parent 1:3 limit 1000p > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > rate 0bit 0pps backlog 0b 0p requeues 0 > ^^^^^^^^^^^^^ > > > I have tried to configure PQ to have two queues per filter with no > success. > > What do you mean? > > > Is it even possible to have (what I'll call) hierarchical PQ? I have yet > to > > find it. > > Something like this? 
>
> tc qdisc add dev eth1 handle 1: root prio
> tc qdisc add dev eth1 parent 1:1 handle 10 prio
> tc qdisc add dev eth1 parent 1:2 handle 20 prio
> tc qdisc add dev eth1 parent 1:3 handle 30 prio
>
> Regards
> /Christian
> [ http://benve.info ]
>
>
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

From christian.benvenuti at libero.it Fri Jun 15 20:49:42 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Fri Jun 15 20:43:44 2007
Subject: [LARTC] Re: PQ questions
In-Reply-To: <001901c7af2d$7c6e21d0$5964a8c0@SalimSi>
References: <001901c7af2d$7c6e21d0$5964a8c0@SalimSi>
Message-ID: <1181933382.6296.4.camel@benve-laptop>

Hi,

On Fri, 2007-06-15 at 17:13 +0800, Salim S I wrote:
> I tested on wireless link. It could give a maximum of 45Mbps. And I sent
> 30Mbps of both low prio and high prio traffic. Total of 60Mbps.

Do you mean to say that your wireless link can transmit at 45Mbps? If that's what you meant, what I meant to say is that if you generate almost (or more than) 45Mbps of high prio traffic then there is nothing (or almost nothing) left for the low prio traffic.

When you forward the above traffic (as opposed to when you generate it locally) there are other factors to take into account that can change the overall behavior. For example, for each CPU there is one ingress queue that is shared by all ingress traffic received on interfaces whose driver does not use NAPI. These CPU queues are traversed before the ingress queueing disciplines and they have nothing to do with Traffic Control. It is therefore possible that under heavy load the low prio traffic fills a significant portion of such CPU queues and reduces the amount of high prio traffic that reaches the egress queueing discipline (thereby leaving the low priority traffic more chances to be scheduled).

> My test was done with UDP, using tcpdump.
When I increased the bandwidth > to 40Mbps each, the high priority class got lesser bandwidth. (maybe the > effect of the known issue that large amount of low prio traffic can > starve high prio traffic) Possible. See my comment above. Regards /Christian [ http://benve.info ] > > -----Original Message----- > > From: lartc-bounces@mailman.ds9a.nl > [mailto:lartc-bounces@mailman.ds9a.nl] > > On Behalf Of Christian Benvenuti > > Sent: Friday, June 15, 2007 4:16 PM > > To: lartc@mailman.ds9a.nl > > Subject: [LARTC] Re: PQ questions > > > > Hi, > > a class is starved only if those with higher priority are > > always (of pretty often) backlogged and do not give the lower > > priority classes a chance to transmit. > > Therefore, if you transmit at a rate smaller than your CPU/s and > > NIC/s can handle you will not experience any starving. > > > > For example, if you generate 50Mbit traffic on a 100Mbit NIC > > it is likely that you won't see any starving (unless your system is > > not able to handle 50Mbit traffic because of a complex TC or > > iptables configuration that consumes lot of CPU). > > > > Regards > > /Christian > > [ http://benve.info ] > > > > On Fri, 2007-06-15 at 15:46 +0800, Salim S I wrote: > > > Slightly offtopic... Has anyone really experienced starving of low > > > priority traffic with PRIO qdisc? > > > In my setup, I never achieved that, though I also wanted exactly > that > > > situation. I gave both the classes same amount of traffic at the > same > > > time. High prio got more bandwidth, but no starvation, even after I > sent > > > more traffic than the link capacity. 
> > > > > > > -----Original Message----- > > > > From: lartc-bounces@mailman.ds9a.nl > > > [mailto:lartc-bounces@mailman.ds9a.nl] > > > > On Behalf Of Christian Benvenuti > > > > Sent: Friday, June 15, 2007 3:32 PM > > > > To: lartc@mailman.ds9a.nl > > > > Subject: [LARTC] Re: PQ questions > > > > > > > > Hi, > > > > > > > > > > Your config does not prevent an higher priority class from > > > starving > > > > > > a lower priority class. > > > > > > > > > > Exactly. That is requirement. > > > > > > > > OK > > > > > > > > > Those stats are nice to have, but the ones I must have are for > how > > > many > > > > > bytes/packets are enqueued at whatever time I check the queues. > > > > > > > > That information is there. Here is an example: > > > > (b=bytes p=packets) > > > > > > > > #tc -s -d qdisc list dev eth1 > > > > > > > > qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 > 1 > > > > Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues > 0) > > > > rate 0bit 0pps backlog 0b 35p requeues 0 > > > > +-> This field is not initialized for > this > > > > qdisc type > > > > qdisc pfifo 10: parent 1:1 limit 1000p > > > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > > > rate 0bit 0pps backlog 0b 0p requeues 0 > > > > ^^^^^^^^^^^^^ > > > > qdisc pfifo 20: parent 1:2 limit 1000p > > > > Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues > 0) > > > > rate 0bit 0pps backlog 50470b 35p requeues 0 > > > > ^^^^^^^^^^^^^^^^^^ > > > > qdisc pfifo 30: parent 1:3 limit 1000p > > > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > > > rate 0bit 0pps backlog 0b 0p requeues 0 > > > > ^^^^^^^^^^^^^ > > > > > > > > > I have tried to configure PQ to have two queues per filter with > no > > > > success. > > > > > > > > What do you mean? > > > > > > > > > Is it even possible to have (what I'll call) hierarchical PQ? I > have > > > yet > > > > to > > > > > find it. > > > > > > > > Something like this? 
> > > > > > > > tc qdisc add dev eth1 handle 1: root prio > > > > tc qdisc add dev eth1 parent 1:1 handle 10 prio > > > > tc qdisc add dev eth1 parent 1:2 handle 20 prio > > > > tc qdisc add dev eth1 parent 1:3 handle 30 prio > > > > > > > > Regards > > > > /Christian > > > > [ http://benve.info ] From christian.benvenuti at libero.it Fri Jun 15 20:57:25 2007 From: christian.benvenuti at libero.it (Christian Benvenuti) Date: Fri Jun 15 20:51:21 2007 Subject: [LARTC] Re: PQ questions In-Reply-To: <200706151831.l5FIVomx025357@ll.mit.edu> References: <200706151831.l5FIVomx025357@ll.mit.edu> Message-ID: <1181933845.6296.13.camel@benve-laptop> Hi, On Fri, 2007-06-15 at 14:31 -0400, Tim Enos wrote: > Please send me the exact config by which you got all those params in the > output (especially backlog 0b 35p)... I just do not see that in mine. The configuration is the same as yours, with the difference that I have eth0 instead of eml_test. I believe your config is OK. I managed to get backlog!=0 by generating a huge amount of traffic with mgen: 10K pkts/s of 1300bytes of size. If you do not saturate your link it is likely you will not see anything sitting in the queue. Regards /Christian [ http://benve.info ] > > -----Original Message----- > > From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] > > On Behalf Of Christian Benvenuti > > Sent: Friday, June 15, 2007 3:32 AM > > To: lartc@mailman.ds9a.nl > > Subject: [LARTC] Re: PQ questions > > > > Hi, > > > > > > Your config does not prevent an higher priority class from starving > > > > a lower priority class. > > > > > > Exactly. That is requirement. > > > > OK > > > > > Those stats are nice to have, but the ones I must have are for how many > > > bytes/packets are enqueued at whatever time I check the queues. > > > > That information is there. 
Here is an example: > > (b=bytes p=packets) > > > > #tc -s -d qdisc list dev eth1 > > > > qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > > Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues 0) > > rate 0bit 0pps backlog 0b 35p requeues 0 > > +-> This field is not initialized for this > > qdisc type > > qdisc pfifo 10: parent 1:1 limit 1000p > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > rate 0bit 0pps backlog 0b 0p requeues 0 > > ^^^^^^^^^^^^^ > > qdisc pfifo 20: parent 1:2 limit 1000p > > Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues 0) > > rate 0bit 0pps backlog 50470b 35p requeues 0 > > ^^^^^^^^^^^^^^^^^^ > > qdisc pfifo 30: parent 1:3 limit 1000p > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > rate 0bit 0pps backlog 0b 0p requeues 0 > > ^^^^^^^^^^^^^ > > > > > I have tried to configure PQ to have two queues per filter with no > > success. > > > > What do you mean? > > > > > Is it even possible to have (what I'll call) hierarchical PQ? I have yet > > to > > > find it. > > > > Something like this? > > > > tc qdisc add dev eth1 handle 1: root prio > > tc qdisc add dev eth1 parent 1:1 handle 10 prio > > tc qdisc add dev eth1 parent 1:2 handle 20 prio > > tc qdisc add dev eth1 parent 1:3 handle 30 prio > > > > Regards > > /Christian > > [ http://benve.info ] > > From tenos at ll.mit.edu Fri Jun 15 22:09:47 2007 From: tenos at ll.mit.edu (Tim Enos) Date: Fri Jun 15 22:10:18 2007 Subject: [LARTC] Re: PQ questions In-Reply-To: <1181933845.6296.13.camel@benve-laptop> Message-ID: <200706152010.l5FKABAE028145@ll.mit.edu> Cool, Thanks Christian! I'm wishing that all of those same params showed up in the output without having to run anything. No problem. Should it matter that I'm using an emulated interface? Also wondering what you think about my "hierarchical PQ" question. Have a good weekend. 
> -----Original Message----- > From: Christian Benvenuti [mailto:christian.benvenuti@libero.it] > Sent: Friday, June 15, 2007 2:57 PM > To: lartc@mailman.ds9a.nl > Cc: Tim Enos > Subject: RE: [LARTC] Re: PQ questions > > Hi, > > On Fri, 2007-06-15 at 14:31 -0400, Tim Enos wrote: > > Please send me the exact config by which you got all those params in the > > output (especially backlog 0b 35p)... I just do not see that in mine. > > The configuration is the same as yours, with the difference that I have > eth0 instead of eml_test. > I believe your config is OK. > I managed to get backlog!=0 by generating a huge amount of traffic with > mgen: 10K pkts/s of 1300bytes of size. > If you do not saturate your link it is likely you will not see anything > sitting in the queue. > > Regards > /Christian > [ http://benve.info ] > > > > > -----Original Message----- > > > From: lartc-bounces@mailman.ds9a.nl [mailto:lartc- > bounces@mailman.ds9a.nl] > > > On Behalf Of Christian Benvenuti > > > Sent: Friday, June 15, 2007 3:32 AM > > > To: lartc@mailman.ds9a.nl > > > Subject: [LARTC] Re: PQ questions > > > > > > Hi, > > > > > > > > Your config does not prevent an higher priority class from > starving > > > > > a lower priority class. > > > > > > > > Exactly. That is requirement. > > > > > > OK > > > > > > > Those stats are nice to have, but the ones I must have are for how > many > > > > bytes/packets are enqueued at whatever time I check the queues. > > > > > > That information is there. 
Here is an example: > > > (b=bytes p=packets) > > > > > > #tc -s -d qdisc list dev eth1 > > > > > > qdisc prio 1: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > > > Sent 85357186 bytes 59299 pkt (dropped 0, overlimits 0 requeues 0) > > > rate 0bit 0pps backlog 0b 35p requeues 0 > > > +-> This field is not initialized for this > > > qdisc type > > > qdisc pfifo 10: parent 1:1 limit 1000p > > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > > rate 0bit 0pps backlog 0b 0p requeues 0 > > > ^^^^^^^^^^^^^ > > > qdisc pfifo 20: parent 1:2 limit 1000p > > > Sent 85357120 bytes 59298 pkt (dropped 0, overlimits 0 requeues 0) > > > rate 0bit 0pps backlog 50470b 35p requeues 0 > > > ^^^^^^^^^^^^^^^^^^ > > > qdisc pfifo 30: parent 1:3 limit 1000p > > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > > rate 0bit 0pps backlog 0b 0p requeues 0 > > > ^^^^^^^^^^^^^ > > > > > > > I have tried to configure PQ to have two queues per filter with no > > > success. > > > > > > What do you mean? > > > > > > > Is it even possible to have (what I'll call) hierarchical PQ? I have > yet > > > to > > > > find it. > > > > > > Something like this? > > > > > > tc qdisc add dev eth1 handle 1: root prio > > > tc qdisc add dev eth1 parent 1:1 handle 10 prio > > > tc qdisc add dev eth1 parent 1:2 handle 20 prio > > > tc qdisc add dev eth1 parent 1:3 handle 30 prio > > > > > > Regards > > > /Christian > > > [ http://benve.info ] > > > From stas.oskin at gmail.com Mon Jun 18 20:10:52 2007 From: stas.oskin at gmail.com (Stas Oskin) Date: Mon Jun 18 20:11:15 2007 Subject: [LARTC] Fwd: police burst is mandatory? In-Reply-To: <77938bc20706181109m3c54525do4b35784535d5f557@mail.gmail.com> References: <77938bc20706181109m3c54525do4b35784535d5f557@mail.gmail.com> Message-ID: <77938bc20706181110ne4a3a76hd26a5bcdb404ab11@mail.gmail.com> Hi. 
I'm using the following filter from lartc "ultimate PPP" example:

tc filter add dev $DEV parent ffff: protocol ip prio 50 u32 match ip src \
   0.0.0.0/0 police rate ${DOWNLINK}kbit burst 10k drop flowid :1

It works fine, but when I remove the "burst 10k", I receive the following
error:

"burst" requires "rate".
Illegal "police"

AFAIK, burst is how many bytes can be transferred over "rate" up to "ceil"
and is an optional parameter, but here it is mandatory? Also, shouldn't the
"ceil" parameter absence make this parameter useless?

Thanks,
Stas.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070618/2e8e15e8/attachment.html

From J.Kraaijeveld at Askesis.nl Tue Jun 19 13:47:48 2007
From: J.Kraaijeveld at Askesis.nl (Joost Kraaijeveld)
Date: Tue Jun 19 13:48:21 2007
Subject: [LARTC] Why does this script noet work (bandwidth, tc en u32)
Message-ID: <1182253668.5130.14.camel@panoramix>

Hi,

Can anyone point out where the script below is wrong?

All I want is that host 172.31.1.1 can only use 10 megabit. If I run
this script on the in-between router nothing happens (the host still
uses the full 100 mbit, tested with iperf), so I assume that something
must be wrong....

#!/bin/sh

# LAN1 NIC
tc qdisc del dev eth0 root
tc qdisc add dev eth0 root handle 1: htb
tc class add dev eth0 parent 1: classid 1:1 htb rate 100mbit

# my machine
tc class add dev eth0 parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit

# filter
tc filter add dev eth0 parent 1:1 protocol ip prio 1 u32 match ip dst 172.31.1.1 flowid 1:2

# LAN2 NIC
tc qdisc del dev eth1 root
tc qdisc add dev eth1 root handle 1: htb
tc class add dev eth1 parent 1: classid 1:1 htb rate 100mbit

# my machine
tc class add dev eth1 parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit

# filter
tc filter add dev eth1 parent 1:1 protocol ip prio 1 u32 match ip src 172.31.1.1 flowid 1:2

TIA

-- 
Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl

From ianbrn at gmail.com Tue Jun 19 14:41:54 2007
From: ianbrn at gmail.com (Ian Brown)
Date: Tue Jun 19 14:42:01 2007
Subject: [LARTC] Routing cache and the missing redirect flag
Message-ID:

Hello,

Should "route -C" show the RTCF_REDIRECTED flag (0x00040000)?
I searched the code and it seems that it should show this flag
as "r". However, I could not get "route -C" to show this flag even
though it should have been there.

I have the following scenario where I have this flag set.
I see it in cat /proc/net/rt_cache but **not** in route -C.
(BTW, "ip route show table cache" does not show flags at all).

Here is what I do:
We have machine A with ip 192.168.0.121.
We have machine B with ip 192.168.0.10.

On machine A (192.168.0.121) I ran:
route add -net 192.168.0.10 netmask 255.255.255.255 gw 192.168.0.189

The 192.168.0.189 machine has forwarding and send_redirects set to 1.
Machine A (192.168.0.121) has accept_redirects set to 1.

Now I run "ping 192.168.0.10".

I get a redirect (as should indeed be the case under these circumstances):

PING 192.168.0.10 (192.168.0.10) 56(84) bytes of data.
64 bytes from 192.168.0.10: icmp_seq=1 ttl=64 time=0.194 ms
From 192.168.0.189: icmp_seq=2 Redirect Host(New nexthop: 192.168.0.10)

Now, as far as I understand from the kernel code, this sets
RTCF_REDIRECTED in the route cache.

And indeed,
cat /proc/net/rt_cache | grep 0A00A8C0
shows:
eth0 0A00A8C0 0A00A8C0 40000 0 1 0 7900A8C0 1500 0 0 00 -1 0 7900A8C0

(0A00A8C0 is 192.168.0.10 in hex.)
We see here the fourth field, which is 40000 (RTCF_REDIRECTED).

**But**,
route -Cn | grep 192.168.0.10
shows:
192.168.0.121 192.168.0.10 192.168.0.189 0 0 2 eth0
192.168.0.10 192.168.0.121 192.168.0.121 l 0 0 1 lo

We don't see the RTCF_REDIRECTED flag here!
(the "l" is for RTCF_LOCAL).
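
The hex fields in the rt_cache line above can be checked by hand. The
following is only a minimal sketch in POSIX shell, assuming a little-endian
host (so the address bytes are stored last byte first, as in the example).
The helper names hex_to_ip and has_redirected are made up for this
illustration; the flag value 0x00040000 is the one quoted in the message.

```shell
#!/bin/sh
# Sketch only: decode fields copied from /proc/net/rt_cache.
# Assumes a little-endian host; helper names are illustrative.

# Turn an 8-digit hex address field into dotted-quad notation.
hex_to_ip() {
    b1=$(printf '%s' "$1" | cut -c7-8)   # last hex byte is the first octet
    b2=$(printf '%s' "$1" | cut -c5-6)
    b3=$(printf '%s' "$1" | cut -c3-4)
    b4=$(printf '%s' "$1" | cut -c1-2)
    printf '%d.%d.%d.%d\n' "0x$b1" "0x$b2" "0x$b3" "0x$b4"
}

# Print "yes" if the RTCF_REDIRECTED bit (0x00040000) is set in a
# hex flags field, "no" otherwise.
has_redirected() {
    if [ $(( 0x$1 & 0x00040000 )) -ne 0 ]; then
        echo yes
    else
        echo no
    fi
}

hex_to_ip 0A00A8C0      # destination and gateway fields of the line above
hex_to_ip 7900A8C0      # source field
has_redirected 40000    # flags field
```

Fed the values from the rt_cache line, the two address fields decode to the
ping target and to machine A, and the flags field has the redirected bit set,
which matches what is described in the message.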
I looked in the sources for the "route" command ("route" belongs to the
net-tools package); parsing of flags is done in lib/netinet_gr.c, in
the rprint_cache() function. According to the code there, this flag
should have been "r":

...
...
if (iflags & RTCF_DOREDIRECT)
    strcat(flags, "r");
...
...

Any ideas?

Regards,
Ian

From rosenrami at gmail.com Tue Jun 19 16:33:07 2007
From: rosenrami at gmail.com (Rami Rosen)
Date: Tue Jun 19 16:33:17 2007
Subject: [LARTC] Re: Routing cache and the missing redirect flag
Message-ID:

Hello,

> if (iflags & RTCF_DOREDIRECT)
>     strcat(flags, "r");

You should see the "r" flag (for route -Cn) on 192.168.0.189 and NOT on
192.168.0.121.
It is 192.168.0.189 that sends the redirect, so its cache has the
RTCF_DOREDIRECT flag.

Regards,
Rami Rosen

From markdv.lartc at asphyx.net Tue Jun 19 19:42:36 2007
From: markdv.lartc at asphyx.net (mark)
Date: Tue Jun 19 19:42:45 2007
Subject: [LARTC] Why does this script noet work (bandwidth, tc en u32)
In-Reply-To: <1182253668.5130.14.camel@panoramix>
References: <1182253668.5130.14.camel@panoramix>
Message-ID: <1182274957.4708.16.camel@velocity.nl.tiscali.com>

On Tue, 2007-06-19 at 13:47 +0200, Joost Kraaijeveld wrote:
> Hi,
>
> Can anyone point out where the script below is wrong?

Maybe. I'm new to this stuff and having trouble getting some things to
work myself. :S

> All I want is that host 172.31.1.1 can only use 10 megabit. If I run
> this script on the in-between router nothing happens (the host still
> uses the full 100 mbit, tested with iperf), so I assume that something
> must be wrong....
>
>
> #!/bin/sh
>
> # LAN1 NIC
> tc qdisc del dev eth0 root
> tc qdisc add dev eth0 root handle 1: htb
> tc class add dev eth0 parent 1: classid 1:1 htb rate 100mbit
>
> # my machine
> tc class add dev eth0 parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit

One thing I find useful (especially when debugging) is to replace the
default fifo qdisc on the leaf with one that _does_ maintain statistics
- which you can see with 'tc -s qdisc show dev ...'. Makes it a bit
easier to see where your traffic is going, and if that matches your
expectations/intentions.

> # filter
> tc filter add dev eth0 parent 1:1 protocol ip prio 1 u32 match ip dst 172.31.1.1 flowid 1:2
>

Try attaching the filter to the root qdisc (parent 1:0).
What I think might be happening is that the root qdisc has no idea what
to do with the packets - there are no filters there, and you did not
specify a "default" class. So it just sends the packets directly to the
interface.

Or you could try adding "default 1" to the root htb qdisc. From there
your filter should do the rest. Only I don't know if "default" can point
to a non-leaf class; if you try it, let me know whether it works or not.

HTH,
Mark.
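
Both suggestions above (filter on the root qdisc, plus a "default" class)
can be combined in one script. What follows is only a sketch, not a tested
configuration: shape_host is a hypothetical helper, class 1:3 and its rates
are invented for illustration, and pointing "default" at a leaf class is an
assumption made here to sidestep the open question about non-leaf defaults.
Devices, the host address, and the 1mbit/10mbit rates come from the script
under discussion. By default the commands are only printed.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set TC=tc (as root) to apply them for real.
TC=${TC:-echo tc}

# shape_host DEV DIRECTION HOST   (hypothetical helper, illustration only)
# DIRECTION is "src" or "dst", as used by the u32 match in the thread.
shape_host() {
    dev=$1; dir=$2; host=$3
    # Root qdisc; unclassified traffic goes to leaf class 1:3 (assumed).
    $TC qdisc add dev "$dev" root handle 1: htb default 3
    $TC class add dev "$dev" parent 1: classid 1:1 htb rate 100mbit
    # The limited host: 1mbit guaranteed, may borrow up to 10mbit.
    $TC class add dev "$dev" parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit
    # Everything else (rates invented for this sketch).
    $TC class add dev "$dev" parent 1:1 classid 1:3 htb rate 90mbit ceil 100mbit
    # Filter attached to the root qdisc (parent 1:), not to class 1:1.
    $TC filter add dev "$dev" parent 1: protocol ip prio 1 u32 \
        match ip "$dir" "$host" flowid 1:2
}

shape_host eth0 dst 172.31.1.1
shape_host eth1 src 172.31.1.1
```

Running it as-is just lists the tc commands, so they can be reviewed before
anything is changed on the router.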
> # LAN2 NIC
> tc qdisc del dev eth1 root
> tc qdisc add dev eth1 root handle 1: htb
> tc class add dev eth1 parent 1: classid 1:1 htb rate 100mbit
>
> # my machine
> tc class add dev eth1 parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit
>
> # filter
> tc filter add dev eth1 parent 1:1 protocol ip prio 1 u32 match ip src 172.31.1.1 flowid 1:2
>
>
> TIA
>

From J.Kraaijeveld at Askesis.nl Tue Jun 19 22:48:31 2007
From: J.Kraaijeveld at Askesis.nl (Joost Kraaijeveld)
Date: Tue Jun 19 22:48:37 2007
Subject: [LARTC] Why does this script noet work (bandwidth, tc en u32)
In-Reply-To: <1182274957.4708.16.camel@velocity.nl.tiscali.com>
References: <1182253668.5130.14.camel@panoramix> <1182274957.4708.16.camel@velocity.nl.tiscali.com>
Message-ID: <1182286111.5130.26.camel@panoramix>

Hi Mark,

After changing the script in this way it seems to work (I think that
this is what you meant by attaching the filter to the root qdisc):

# downlink
tc qdisc del dev eth0 root

tc qdisc add dev eth0 root handle 1: htb
tc class add dev eth0 parent 1: classid 1:1 htb rate 100mbit
tc class add dev eth0 parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dst 172.31.1.1 flowid 1:2

# uplink
tc qdisc del dev eth1 root

tc qdisc add dev eth1 root handle 1: htb
tc class add dev eth1 parent 1: classid 1:1 htb rate 100mbit
tc class add dev eth1 parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit
tc filter add dev eth1 parent 1: protocol ip prio 1 u32 match ip src 172.31.1.1 flowid 1:2

> One thing I find useful (especially when debugging) is to replace the
> default fifo qdisc on the leaf with one that _does_ maintain statistics
> - which you can see with 'tc -s qdisc show dev ...'. Makes it a bit
> easier to see where your traffic is going, and if that matches your
> expectations/intentions.

Could you elaborate on this? Which "other fifo qdisc" maintains
statistics? Any hints on the right syntax?
TIA

-- 
Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl

From sunnyboyfrank at web.de Tue Jun 19 23:41:04 2007
From: sunnyboyfrank at web.de (Frank Remetter)
Date: Tue Jun 19 23:40:42 2007
Subject: [LARTC] Why does this script noet work (bandwidth, tc en u32)
In-Reply-To: <1182286111.5130.26.camel@panoramix>
References: <1182253668.5130.14.camel@panoramix> <1182274957.4708.16.camel@velocity.nl.tiscali.com> <1182286111.5130.26.camel@panoramix>
Message-ID: <20070619234104.05e9fad9@ocean.remetter.homelinux.org>

Hey,

> # uplink
> tc qdisc del dev eth1 root
>
> tc qdisc add dev eth1 root handle 1: htb
> tc class add dev eth1 parent 1: classid 1:1 htb rate 100mbit
> tc class add dev eth1 parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit
> tc filter add dev eth1 parent 1: protocol ip prio 1 u32 match ip src 172.31.1.1 flowid 1:2

> Could you elaborate on this? Which "other fifo qdisc" that maintains
> statistics? Any hints on the right syntax?

i guess he is talking of e.g. sfq:
tc qdisc add dev eth1 parent 1:2 handle 2: sfq perturb 10

-- 
Frank Remetter
http://www.remetter.de/
GPG-FP: 2B07 B7D8 5C27 AB94 7A37 8B0B DEBE DD89 D68B 7BE6

From lists at andyfurniss.entadsl.com Tue Jun 19 23:46:29 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Jun 19 23:46:30 2007
Subject: [LARTC] Fwd: police burst is mandatory?
In-Reply-To: <77938bc20706181110ne4a3a76hd26a5bcdb404ab11@mail.gmail.com>
References: <77938bc20706181109m3c54525do4b35784535d5f557@mail.gmail.com> <77938bc20706181110ne4a3a76hd26a5bcdb404ab11@mail.gmail.com>
Message-ID: <46784EB5.1080202@andyfurniss.entadsl.com>

Stas Oskin wrote:
> Hi.
>
> I'm using the following filter from lartc "ultimate PPP" example:
>
> tc filter add dev $DEV parent ffff: protocol ip prio 50 u32 match ip src \
>    0.0.0.0/0 police rate ${DOWNLINK}kbit burst 10k drop flowid :1
>
> It works fine, but when I remove the "burst 10k", I receive the following
> error:
>
> "burst" requires "rate".
> Illegal "police"
>
> AFAIK, burst is how many bytes can be transferred over "rate" up to "ceil"
> and is an optional parameter, but here it is mandatory? Also, shouldn't the
> "ceil" parameter absence make this parameter useless?
>

You are thinking of htb - for policer burst/buffer is required.

policers don't delay/shape/queue packets, they just drop overrate
packets (when used with the drop param).

The burst is the length of a virtual buffer used to decide when to drop
a packet, when it's "full" everything else gets dropped till enough time
has passed for it to have enough room for the next packet, how much time
depends on the rate.

It needs to be at least MTU (MTU+14 on eth) or the policer won't pass
full size packets at all. If you make it too small, like making a real
buffer too small - it won't be nice for tcp throughput.

Andy.

From lists at andyfurniss.entadsl.com Wed Jun 20 00:03:59 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Wed Jun 20 00:03:55 2007
Subject: [LARTC] Re: PQ questions
In-Reply-To: <001901c7af2d$7c6e21d0$5964a8c0@SalimSi>
References: <001901c7af2d$7c6e21d0$5964a8c0@SalimSi>
Message-ID: <467852CF.6060007@andyfurniss.entadsl.com>

Salim S I wrote:
> I tested on wireless link. It could give a maximum of 45Mbps. And I sent
> 30Mbps of both low prio and high prio traffic. Total of 60Mbps.
> My test was done with UDP, using tcpdump. When I increased the bandwidth
> to 40Mbps each, the high priority class got lesser bandwidth.

Maybe wireless is a special case here - was the driver/device actually
on the prio box?
> (maybe the
> effect of the known issue that large amount of low prio traffic can
> starve high prio traffic)

On eth using tcp I can get prio to behave quite well. You need to
remember to filter arp to a high (best empty) class - it goes to x:2 by
default, which made for a bit of weirdness when I tried last.

If you use tcp on my 100meg eth there is still a 300pkt buffer to fill
before prio gets backlogged, so window scaling needs to be on and both
ends need decent size buffers/scale amounts.

Maybe UDP would be different; I'll have to try sometime.

Andy.

From lists at andyfurniss.entadsl.com Wed Jun 20 00:17:16 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Wed Jun 20 00:17:14 2007
Subject: [LARTC] Re: PQ questions
In-Reply-To: <200706152010.l5FKABAE028145@ll.mit.edu>
References: <200706152010.l5FKABAE028145@ll.mit.edu>
Message-ID: <467855EC.30209@andyfurniss.entadsl.com>

Tim Enos wrote:
> Cool,
>
> Thanks Christian! I'm wishing that all of those same params showed up in the
> output without having to run anything. No problem. Should it matter that I'm
> using an emulated interface?

Quite possibly - using prio on real devices still can appear not to work
until you have filled up any buffer the driver uses. On my 100meg eth it
would take 5/6 unscaled tcp connections to fill enough for prio to do
anything.

You can use prio as a child of hfsc/htb so that they set the rate. It
may be nicer to use htb's own prio though, if you need a slow rate and
care about latency.

Andy.

From GregScott at InfraSupportEtc.com Wed Jun 20 00:54:46 2007
From: GregScott at InfraSupportEtc.com (Greg Scott)
Date: Wed Jun 20 00:54:55 2007
Subject: [LARTC] Linux bridging and cascaded switches
Message-ID: <925A849792280C4E80C5461017A4B8A210B8D8@mail733.InfraSupportEtc.com>

Hi -

Still plugging away at my Linux bridge/firewall and thinking through the
consequences. In a normal firewall situation, the Internet is on one
side, the internal LAN on the other. Duh!
But now, with a Linux bridge
in the middle, the whole thing becomes one big messy LAN. So we have a
scenario that looks like this:

Internal---User---Core-----Firewall---Internet---Internet router
Servers    switch switch   (Bridged)  switch     (and default GW for
                                                 internal servers)

The scenario is a little more complex than I drew above because the
internal side has more than one LAN segment participating in the bridge.
I'm working on a way to simulate all this here - before going into
production - but I have a big question;

That firewall/bridge is no longer a router - it's a bridge. Well, a
bridge that also does a bunch of stateful IP layer 3 filtering. So now,
it will participate in a spanning tree setup with all those switches, on
both sides of it - right? I'm guessing I want to turn off STP in this
case. Am I on the right track?

Thanks

- Greg Scott

From alex at samad.com.au Wed Jun 20 01:03:25 2007
From: alex at samad.com.au (Alex Samad)
Date: Wed Jun 20 01:03:30 2007
Subject: [LARTC] Linux bridging and cascaded switches
In-Reply-To: <925A849792280C4E80C5461017A4B8A210B8D8@mail733.InfraSupportEtc.com>
References: <925A849792280C4E80C5461017A4B8A210B8D8@mail733.InfraSupportEtc.com>
Message-ID: <20070619230325.GR24808@samad.com.au>

On Tue, Jun 19, 2007 at 05:54:46PM -0500, Greg Scott wrote:
> Hi -
>
> Still plugging away at my Linux bridge/firewall and thinking through the
> consequences. In a normal firewall situation, the Internet is on one
> side, the internal LAN on the other. Duh! But now, with a Linux bridge
> in the middle, the whole thing becomes one big messy LAN. So we have a
> scenario that looks like this:
>
> Internal---User---Core-----Firewall---Internet---Internet router
> Servers    switch switch   (Bridged)  switch     (and default GW for
>                                                  internal servers)
>

out of curiosity why would you want to bridge at the firewall.
Is this meant to be a drop in-line firewall appliance?

> The scenario is a little more complex than I drew above because the
> internal side has more than one LAN segment participating in the bridge.
> I'm working on a way to simulate all this here - before going into
> production - but I have a big question;
>
> That firewall/bridge is no longer a router - it's a bridge. Well, a
> bridge that also does a bunch of stateful IP layer 3 filtering. So now,
> it will participate in a spanning tree setup with all those switches, on
> both sides of it - right? I'm guessing I want to turn off STP in this
> case. Am I on the right track?

If there is only 1 way to connect from the corporate (private LAN) to the
public (internet) then I don't think you will need STP - it was meant to
stop loops in ethernet segments. If you have multiple paths you might
still need it.

>
> Thanks
>
> - Greg Scott
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070620/757c4b46/attachment.pgp

From GregScott at InfraSupportEtc.com Wed Jun 20 01:35:46 2007
From: GregScott at InfraSupportEtc.com (Greg Scott)
Date: Wed Jun 20 01:35:51 2007
Subject: [LARTC] Linux bridging and cascaded switches
Message-ID: <925A849792280C4E80C5461017A4B8A210B8D9@mail733.InfraSupportEtc.com>

> out of curiosity why would you want to bridge at the firewall. is
> this meant to be a drop in-line firewall appliance

Long story but yes, it is essentially a drop in-line system. It's a
mess.

So will that Internet router really see 4 "switches" - a switch, a
bridge, and 2 switches - between it and the internal servers? I don't
remember all my LAN rules but that feels way too deep to me.
- Greg

From GregScott at InfraSupportEtc.com Wed Jun 20 05:31:44 2007
From: GregScott at InfraSupportEtc.com (Greg Scott)
Date: Wed Jun 20 05:31:53 2007
Subject: [LARTC] Linux bridging and cascaded switches
Message-ID: <925A849792280C4E80C5461017A4B8A210B8DA@mail733.InfraSupportEtc.com>

> On the bridged firewall - The simplest/ easiest/ well tested
> method would be to run ebtables. A more complex method used
> before the arrival of ebtables involved pseudo-bridging.

Yes - thanks. I've been trying some ebtables experiments. Layer 2
filtering - takes some getting used to!

More fundamentally, can I cascade these switches and my bridge/firewall
this deep? How do the Internet router and internal servers find each
other's MAC addresses when they are 4 "hops" (OSI layer 2 hops)
separated? Or am I making this too complicated?

> Internal---User---Core-----Firewall---Internet---Internet router
> Servers    switch switch   (Bridged)  switch     (and default GW for
>                                                  internal servers)

Thanks

- Greg

-----Original Message-----
From: Mohan Sundaram [mailto:mohan.tux@gmail.com]
Sent: Tuesday, June 19, 2007 9:53 PM
To: Greg Scott
Subject: Re: [LARTC] Linux bridging and cascaded switches

Greg Scott wrote:
> Hi -
>
> Internal---User---Core-----Firewall---Internet---Internet router
> Servers    switch switch   (Bridged)  switch     (and default GW for
>                                                  internal servers)
>
> The scenario is a little more complex than I drew above because the
> internal side has more than one LAN segment participating in the bridge.
> I'm working on a way to simulate all this here - before going into
> production - but I have a big question;
>
> That firewall/bridge is no longer a router - it's a bridge. Well, a
> bridge that also does a bunch of stateful IP layer 3 filtering. So
> now, it will participate in a spanning tree setup with all those
> switches, on both sides of it - right? I'm guessing I want to turn
> off STP in this case. Am I on the right track?
>
> Thanks
>
> - Greg Scott

From what you have drawn, it seems like we will not have multiple paths
in the LAN and so STP will not be needed.

On the bridged firewall - The simplest/ easiest/ well tested method
would be to run ebtables. A more complex method used before the arrival
of ebtables involved pseudo-bridging.

Mohan

From alex at samad.com.au Wed Jun 20 06:07:17 2007
From: alex at samad.com.au (Alex Samad)
Date: Wed Jun 20 06:07:26 2007
Subject: [LARTC] Linux bridging and cascaded switches
In-Reply-To: <925A849792280C4E80C5461017A4B8A210B8D9@mail733.InfraSupportEtc.com>
References: <925A849792280C4E80C5461017A4B8A210B8D9@mail733.InfraSupportEtc.com>
Message-ID: <20070620040717.GT24808@samad.com.au>

On Tue, Jun 19, 2007 at 06:35:46PM -0500, Greg Scott wrote:
> > out of curiosity why would you want to bridge at the firewall. is
> > this meant to be a drop in-line firewall appliance
>
> Long story but yes, it is essentially a drop in-line system. It's a
> mess.
>
> So will that Internet router really see 4 "switches" - a switch, a
> bridge, and 2 switches - between it and the internal servers? I don't
> remember all my LAN rules but that feels way too deep to me.

I think that was the old 5-4-3 or was it 4-3-2 ... I think that was more
in the days of repeaters and broadcast hubs. Modern-day switches, I
believe, allow for a lot more.

>
> - Greg
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070620/134714be/attachment.pgp

From markdv.lartc at asphyx.net Wed Jun 20 08:55:23 2007
From: markdv.lartc at asphyx.net (Mark)
Date: Wed Jun 20 08:56:00 2007
Subject: [LARTC] Why does this script noet work (bandwidth, tc en u32)
In-Reply-To: <20070619234104.05e9fad9@ocean.remetter.homelinux.org>
References: <1182253668.5130.14.camel@panoramix> <1182274957.4708.16.camel@velocity.nl.tiscali.com> <1182286111.5130.26.camel@panoramix> <20070619234104.05e9fad9@ocean.remetter.homelinux.org>
Message-ID:

On Tue, 19 Jun 2007, Frank Remetter wrote:

> Hey,
>
>> # uplink
>> tc qdisc del dev eth1 root
>>
>> tc qdisc add dev eth1 root handle 1: htb
>> tc class add dev eth1 parent 1: classid 1:1 htb rate 100mbit
>> tc class add dev eth1 parent 1:1 classid 1:2 htb rate 1mbit ceil 10mbit
>> tc filter add dev eth1 parent 1: protocol ip prio 1 u32 match ip src 172.31.1.1 flowid 1:2
>
>> Could you elaborate on this? Which "other fifo qdisc" that maintains
>> statistics? Any hints on the right syntax?
>
> i guess he is talking of e.g. sfq:
> tc qdisc add dev eth1 parent 1:2 handle 2: sfq perturb 10

Yeah, that's what I meant. But forget I said it. According to the man page
pfifo_fast "Does not maintain statistics and does not show up in tc qdisc
ls." but I just noticed that it does, so it doesn't make a difference.

Regards,
Mark.

From 1marc1 at gmail.com Wed Jun 20 09:37:01 2007
From: 1marc1 at gmail.com (Marc)
Date: Wed Jun 20 09:37:10 2007
Subject: [LARTC] Why does scp stall on low bandwidth connections?
Message-ID: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com>

Hi,

I am new to tc and have been reading quite a bit on how to set it up etc.
Everything seemed to be working fine, until I started scp-ing a large file
over a low bandwidth connection as part of my testing process.
Here is the setup:

my pc --- bridge running tc/htb --- rest of network

TC is filtering traffic from "my pc" and classifies it as 120kbit (see my
script below). I then scp a 5MB file from a server in "rest of network" to
"my pc". Everything seems to work fine and copies at a speed of around
12KB/s, which is what I would expect from a 120kbit connection.

At some stage scp stalls and eventually disconnects or I get bored and press
Ctrl+C. The stage at which it stalls is different every time. First it was at
76% of the copy progress, then at 32% of the copy progress.

For my testing purposes, there is no other traffic flowing through either
this class or any other class.

My expectation was that it would copy the entire file, just at a low speed.
I expected to be able to copy a 600MB file at 12KB/s, which would of course
be very slow, but eventually arrive.

Here are the rules I specified, note that "my pc" does *not* have the ip
address 10.0.2.42 in the test described above:

#eth0 qdisc
tc qdisc add dev eth0 root handle 1:0 htb default 2
tc class add dev eth0 parent 1:0 classid 1:1 htb rate 10mbit ceil 10mbit
tc class add dev eth0 parent 1:1 classid 1:2 htb rate 120kbit ceil 120kbit
tc class add dev eth0 parent 1:1 classid 1:3 htb rate 200kbit ceil 1mbit

#eth1 qdisc
tc qdisc add dev eth1 root handle 2:0 htb default 2
tc class add dev eth1 parent 2:1 classid 2:2 htb rate 120kbit ceil 120kbit
tc class add dev eth1 parent 2:1 classid 2:3 htb rate 200kbit ceil 1mbit

#eth0 filter
tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip src 10.0.2.42 flowid 1:3

#eth1 filter
tc filter add dev eth1 parent 2:0 protocol ip prio 1 u32 match ip dst 10.0.2.42 flowid 2:3

Thank you for your comments on this situation.

Regards,

/|/|arc.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070620/20f9697d/attachment.htm

From alain.lermoyer at orange-ftgroup.com Wed Jun 20 15:05:39 2007
From: alain.lermoyer at orange-ftgroup.com (LERMOYER Alain RD-RESA-ISS)
Date: Wed Jun 20 15:05:44 2007
Subject: [LARTC] Prio class HTB
Message-ID:

Hello everyone,

We are working on HTB with TC and would like some clarification from you.

Our example is as follows. We have one HTB root class and two HTB classes
attached to it, as in this figure:

                                   1: HTB
                                      |
                                      |
                                      |
    ---------------------------------------------------------------------
            |                          |                        |
++++++++++++++++++++++++++ ++++++++++++++++++++++++++ ++++++++++++++++++++++
+        1:10 HTB        + +        1:20 HTB        + +      1:30 HTB      +
+(parameters, ex: prio 0)+ +(parameters, ex: prio 1)+ +                    +
++++++++++++++++++++++++++ ++++++++++++++++++++++++++ ++++++++++++++++++++++
            |                          |                        |
            |                          |                        |
    ---------------------------------------------------------------------
                                      |
                                      | (dequeue to hardware)
                                      |

The configuration script is:

$ tc class add dev ath0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps burst 2k
$ tc class add dev ath0 parent 1:1 classid 1:10 htb rate 30kbps ceil 60kbps burst 2k prio 0
$ tc class add dev ath0 parent 1:1 classid 1:20 htb rate 10kbps ceil 100kbps burst 2k prio 1
$ tc class add dev ath0 parent 1:1 classid 1:30 htb rate 60kbps ceil 100kbps burst 2k

Our questions are:

1- How is priority between classes defined within HTB? What parameter(s)
do we need to specify?

2- How does the dequeuing algorithm in HTB work? As we understand it, the
"prio" parameter specifies the priority order between the two classes
regarding the token sharing policy. Is this parameter also involved in the
order in which the classes are interleaved at the output (dequeue to
hardware)?

Thank you for your help.

Alain.L
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ds9a.nl/pipermail/lartc/attachments/20070620/912344b7/attachment.html From nfuhriman at gmail.com Wed Jun 20 22:04:21 2007 From: nfuhriman at gmail.com (Nate Fuhriman) Date: Wed Jun 20 22:04:27 2007 Subject: [LARTC] ATM [Cell Tax] Message-ID: <57ca0490706201304p1d7fda13k95be598ecb669e51@mail.gmail.com> I have read the thread at http://mailman.ds9a.nl/pipermail/lartc/2006q1/018287.html and still don't know how to fix this problem. It appears alot of work has gone into it but the HOWTO is so out of date it doesn't even begin to addresses this method. So here are my questions 1. what is the current state of these patches? are they in a specific version? do i have to patch myself? 2. how do i actually use this once patched in? an example script would work great! 3. is there a table for us mere mortals that describes how to figure out which type of adsl/atm i'm using so i can set the appropriate overhead? thanks for all the great work on QOS! nate From default at advaita.sytes.net Wed Jun 20 22:58:53 2007 From: default at advaita.sytes.net (John Default) Date: Wed Jun 20 22:58:55 2007 Subject: [LARTC] Linux bridging and cascaded switches In-Reply-To: <925A849792280C4E80C5461017A4B8A210B8DA@mail733.InfraSupportEtc.com> References: <925A849792280C4E80C5461017A4B8A210B8DA@mail733.InfraSupportEtc.com> Message-ID: <4679950D.8080102@advaita.sytes.net> Hi Greg Scott wrote: > More fundamentally, can I cascade these switches and my bridge/firewall > this deep? How do the Internet router and internal servers find each > others' MAC addresses when they are 4 "hops" (OSI layer 2 hops) > separated? Or am I making this too complicated? > > i was taught that you should have no more than 4 switches between any two LAN nodes, so this should work. STP should be ok, but not needed until you have some redundant links between bridge/fw and core switches (if you have more core switches, you would probably like to use more links for redundancy). 
[you would probably want the core switch to become STP root then.. ] when switch doesn't know destination it works like hub, so at the beginning your network will be flooded with frames and this way all switches will learn mac addresses no matter how many hops. (frame will be broadcasted to all corners of LAN). servers and router will search for each other's MAC using ARP broadcasts, which will get to every node in LAN (if you don't filter them out : ) ). Therefore they will certainly find each other. (( 4 switches add constant delay to your traffic that i think you would like to avoid. if you could make internet router a firewall (you probably can't : ) ), that would remove 2 layers of bridging, would be more simple and allow more control. servers--distribution_switch--core_switch--router/firewall | internet switch )) >> Internal---User---Core-----Firewall---Internet---Internet router >> Servers switch switch (Bridged) switch (and default GW for >> internal servers) >> > > Thanks > > - Greg > ___________________________________ S pozdravom / Best regards John Default __________________________________ From tenos at ll.mit.edu Thu Jun 21 01:07:58 2007 From: tenos at ll.mit.edu (Tim Enos) Date: Thu Jun 21 01:08:29 2007 Subject: [LARTC] Re: PQ questions In-Reply-To: <467855EC.30209@andyfurniss.entadsl.com> Message-ID: <200706202308.l5KN8GSx010923@ll.mit.edu> It's PQ that is required. 
Here is what I have for config so far:

tc qdisc add dev eth0 root handle 1: prio bands 4 priomap 0 1 2 3

tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 match ip tos 0xb8 0xff flowid 1:1
tc filter add dev eth0 parent 1:0 prio 2 protocol ip u32 match ip tos 0x50 0xff flowid 1:2
tc filter add dev eth0 parent 1:0 prio 3 protocol ip u32 match ip tos 0x28 0xff flowid 1:3
tc filter add dev eth0 parent 1:0 prio 4 protocol ip u32 match ip tos 0x00 0xff flowid 1:4

tc qdisc add dev eth0 parent 1:1 handle 10: pfifo limit 2
tc qdisc add dev eth0 parent 1:2 handle 11: pfifo limit 2
tc qdisc add dev eth0 parent 1:3 handle 12: pfifo limit 2
tc qdisc add dev eth0 parent 1:4 handle 13: pfifo limit 2
__________

The above config works fine. The last four qdisc lines (handles 10: - 13: inclusive) also work as prio if you leave out the 'limit' part, of course.

The remaining part is to set children for the last four qdiscs (one for each). Said child qdiscs would have all the same attributes as the parents (limit is something I'd change; the '2' is just an example).

Is this possible?

> -----Original Message-----
> From: Andy Furniss [mailto:lists@andyfurniss.entadsl.com]
> Sent: Tuesday, June 19, 2007 6:17 PM
> To: Tim Enos
> Cc: 'Christian Benvenuti'; lartc@mailman.ds9a.nl
> Subject: Re: [LARTC] Re: PQ questions
>
> Tim Enos wrote:
> > Cool,
> >
> > Thanks Christian! I'm wishing that all of those same params showed up in the
> > output without having to run anything. No problem. Should it matter that I'm
> > using an emulated interface?
>
> Quite possibly - using prio on real devices still can appear not to work
> until you have filled up any buffer the driver uses.
>
> On my 100meg eth it would take 5/6 unscaled tcp connections to fill
> enough for prio to do anything.
>
> You can use prio as a child of hfsc/htb so that they set the rate. It
> may be nicer to use htb's own prio though, if you need a slow rate and
> care about latency.
>
> Andy.
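Andy's suggestion quoted above (prio as a child of htb, so that htb sets the rate) could be sketched roughly as follows. This is only an illustration under assumptions: the device name, the rate and the handle numbers are made up for the example and do not come from the thread. Note that pfifo itself is classless, so nesting has to happen in a classful qdisc such as prio or htb. The function only prints the tc commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: prints tc commands instead of running them.
# The device and rate arguments and the 10x: handles are illustrative assumptions.
emit_prio_under_htb() {
    dev=$1
    rate=$2
    # HTB root with a single class that caps the overall rate.
    echo "tc qdisc add dev $dev root handle 1: htb default 1"
    echo "tc class add dev $dev parent 1: classid 1:1 htb rate $rate ceil $rate"
    # prio hangs off the HTB class, so it only reorders what HTB lets through.
    echo "tc qdisc add dev $dev parent 1:1 handle 10: prio bands 4 priomap 0 1 2 3"
    # One pfifo leaf per prio band (the bands are classes 10:1 .. 10:4).
    band=1
    while [ "$band" -le 4 ]; do
        echo "tc qdisc add dev $dev parent 10:$band handle 10$band: pfifo limit 2"
        band=$((band + 1))
    done
}
```

Calling `emit_prio_under_htb eth0 1mbit` prints seven commands; piping that output to `sh` as root would apply them. In such a layout the tos filters from the config above would attach to `parent 10:` with `flowid 10:1` .. `10:4` instead of `1:1` .. `1:4`.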
From gtaylor at riverviewtech.net Thu Jun 21 03:34:01 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 21 03:34:07 2007 Subject: [LARTC] Linux bridging and cascaded switches In-Reply-To: <20070620040717.GT24808@samad.com.au> References: <925A849792280C4E80C5461017A4B8A210B8D9@mail733.InfraSupportEtc.com> <20070620040717.GT24808@samad.com.au> Message-ID: <4679D589.7090304@riverviewtech.net> On 6/19/2007 11:07 PM, Alex Samad wrote: > I think that was the old 5-4-3 or was it 4-3-2 ... I think that was > more in the days of repeater and broadcast hubs. Modern day switch I > believe allow for a lot more. To the best of my knowledge (including inquiries with colleagues) the proverbial "3,4,5" rule for Ethernet was prior to switches, as in a store and forward, mechanism. I think the rule was mainly to help timing and to prevent signal degradation, which switches help take care of. So, now, at least in theory, you could have a bridged network the world over in one really big broadcast domain. The problem would be that it is one really big broadcast domain which has its own down sides to consider and mitigate. Grant. . . . From gtaylor at riverviewtech.net Thu Jun 21 03:41:53 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 21 03:41:58 2007 Subject: [LARTC] Linux bridging and cascaded switches In-Reply-To: <925A849792280C4E80C5461017A4B8A210B8DA@mail733.InfraSupportEtc.com> References: <925A849792280C4E80C5461017A4B8A210B8DA@mail733.InfraSupportEtc.com> Message-ID: <4679D761.4090101@riverviewtech.net> On 6/19/2007 10:31 PM, Greg Scott wrote: > More fundamentally, can I cascade these switches and my > bridge/firewall this deep? How do the Internet router and internal > servers find each others' MAC addresses when they are 4 "hops" (OSI > layer 2 hops) separated? Or am I making this too complicated? Yes, you probably can cascade the switches like that, though I question is that what you really want to do or not. 
As you have indicated, the switches operate at (OSI) layer 2. Thus they pass (sans filtering) any and all non-broadcast traffic that they do not know the destination for out all ports except for the one that it came in on. At least this is the standard operating procedure of most switches. Seeing as how ARP requests are broadcast they are forwarded out all interfaces except for the one they come in. So, if you ARP on one switch, it will forward it to the next switch, which will in turn forward it on to the next, and so on until there are no more ports to forward the traffic out. ARP replies are unicast from the MAC of the ARPed system back to the ARPing system. This return path is when the intermediary switches learn of the MAC address of the ARPed system. So, subsequent packets to the ARPed system will pass out the switches based on the target MAC address which was previously learned during the ARP. Incidentally, this is why some systems, especially load balancers and the likes, will send out a "Gratuitous ARP" (a.k.a. GARP) packet to pre-populate (if you will) the switches (MAC) address table(s). Hope that helps shed some light on the subject. Grant. . . . From gtaylor at riverviewtech.net Thu Jun 21 03:45:46 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 21 03:45:51 2007 Subject: [LARTC] Linux bridging and cascaded switches In-Reply-To: <4679950D.8080102@advaita.sytes.net> References: <925A849792280C4E80C5461017A4B8A210B8DA@mail733.InfraSupportEtc.com> <4679950D.8080102@advaita.sytes.net> Message-ID: <4679D84A.4040304@riverviewtech.net> On 6/20/2007 3:58 PM, John Default wrote: > when switch doesn't know destination it works like hub, so at the > beginning your network will be flooded with frames and this way all > switches will learn mac addresses no matter how many hops. (frame > will be broadcasted to all corners of LAN). This is also why it is a bad idea to have an overly large broadcast domain. 
This is also why you do not want to bridge across a WAN link if you don't have to. Usually the broadcasts are local to the LAN and need not go across the WAN link. Grant. . . . From yoyao at cisco.com Thu Jun 21 04:41:57 2007 From: yoyao at cisco.com (Yong Yao (yoyao)) Date: Thu Jun 21 04:42:23 2007 Subject: [LARTC] A HTB problem Message-ID: <87F5D18955699142BE79663129C80FA9023F4EA0@xmb-hkg-415.apac.cisco.com> Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/bmp Size: 9254 bytes Desc: ciscologo.bmp Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070621/c3d62ffd/attachment-0001.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 345 bytes Desc: click_to_call.gif Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070621/c3d62ffd/attachment-0001.gif -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 2743 bytes Desc: Glacier Bkgrd.jpg Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070621/c3d62ffd/attachment-0001.jpe From gtaylor at riverviewtech.net Thu Jun 21 09:05:56 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 21 09:06:31 2007 Subject: [LARTC] Redundant internet connections. Message-ID: <467A2354.1070805@riverviewtech.net> (I know that what I'm wanting to do can be done, but for some reason I can not get it to work for the life of me. I think I have been staring at it too long and too closely.) I have two different internet connections from two cooperating ISPs. I also have a small 8 block of IPs that are globally routable that both ISPs will route to me via my world facing globally routable IPs that I have with them. I.e. ISP A has a route to 75.19.28.7/29 via 12.34.56.78 and ISP B has a route to 75.19.28.7/29 via 87.65.43.21. 
I want to use one ISP as the primary default gateway and the other ISP as a backup default gateway. That is to say I want to *NOT* use load balancing, rather just redundancy in this situation.

I do *NOT* need to use NAT because I do have the globally routable IP address on *ALL* interfaces.

I.e.
eth0: 75.19.28.6 (DMZ)
eth1: 12.34.56.78 (ISP A)
eth2: 87.65.43.21 (ISP B)

I want this router to use the default gateway for ISP A of 12.34.56.254 and only use the default gateway of ISP B 78.65.43.1 if the default gateway of ISP A can not be reached.

If I set up the interfaces with their IPs and subnets and set up multiple default routes with varying metrics (for priority) and test by taking an interface down, things work. However, this is not a realistic test because the interface will never physically go down.

For the sake of discussion, let one link be a DSL modem and the other link be a cable modem. Each of the links is an external modem that uses an ethernet cable to connect in to the router. Thus no matter what the state of the link coming in to my facility is, the link on the Linux router will always be up b/c the ethernet between the router and the modems sitting on the next shelf down will always be up.

I need a way for the Linux kernel to try to use a default gateway and switch to another one if it does not see any traffic.

Any help that any one could offer will be greatly appreciated.

Thanks in advance,

Grant. . . .

From salim.si at cipherium.com.tw Thu Jun 21 09:46:53 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Thu Jun 21 09:47:15 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467A2354.1070805@riverviewtech.net>
Message-ID: <001101c7b3d8$5919a830$5964a8c0@SalimSi>

Use a ping script, which pings some IP every minute or so. Ping can bind to a specific interface.

ping -c 1 -w 1 -I eth1 $SOME_IP
ping -c 1 -w 1 -I eth2 $SOME_IP

Check the return values of those pings. Change your default routes based on the ping results.
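The steps above can be sketched as a small shell fragment. It is a rough illustration, not a tested failover daemon: the function names are invented for the example, the interfaces, gateways and probe address are passed in as arguments, and it assumes a route to the probe address exists via each interface so that the interface-bound ping can actually leave through it.

```shell
#!/bin/sh
# Sketch of the ping-based check described above. All names are illustrative.

# Returns 0 if one ping to $2, sent out via interface $1, gets a reply.
link_alive() {
    ping -c 1 -w 1 -I "$1" "$2" >/dev/null 2>&1
}

# Prints the gateway to use: the primary while it is up, otherwise the
# backup if that is up; prints nothing when both are down.
# $1/$2: "up" or "down" for primary/backup, $3/$4: their gateway addresses.
choose_gateway() {
    if [ "$1" = up ]; then
        echo "$3"
    elif [ "$2" = up ]; then
        echo "$4"
    fi
}

# One pass of the check, meant to be called from cron or a sleep loop.
# Arguments: primary_if primary_gw backup_if backup_gw probe_ip
check_once() {
    a=down
    b=down
    link_alive "$1" "$5" && a=up
    link_alive "$3" "$5" && b=up
    gw=$(choose_gateway "$a" "$b" "$2" "$4")
    if [ -n "$gw" ]; then
        # Needs root; swaps the default route in place.
        ip route replace default via "$gw"
    fi
}
```

A cron entry could then call something like `check_once eth1 12.34.56.254 eth2 "$GW_B" "$SOME_IP"`, where `GW_B` and `SOME_IP` are placeholders to fill in. Keeping the decision in `choose_gateway` separate from the probing makes it easy to add more probe addresses or a failure count later.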
This is the basic idea. You can add many other things to this, more IPs, more counts, change time interval... (Better use IPs than domain names, so that DNS queries won't have problem) > -----Original Message----- > From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] > On Behalf Of Grant Taylor > Sent: Thursday, June 21, 2007 3:06 PM > To: Mail List - Linux Advanced Routing and Traffic Control > Subject: [LARTC] Redundant internet connections. > > (I know that what I'm wanting to do can be done, but for some reason I > can not get it to work for the life of me. I think I have been staring > at it too long and too closely.) > > I have two different internet connections from two cooperating ISPs. I > also have a small 8 block of IPs that are globally routable that both > ISPs will route to me via my world facing globally routable IPs that I > have with them. I.e. ISP A has a route to 75.19.28.7/29 via 12.34.56.78 > and ISP B has a route to 75.19.28.7/29 via 87.65.43.21. > > I want to use one ISP as the primary default gateway and the other ISP > as a backup default gateway. That is to say I want to *NOT* use load > balancing rather just redundancy in this situation. > > I do *NOT* need to use NAT because I do have the globally routable IP > address on *ALL* interfaces. > > I.e. > eth0: 75.19.28.6 (DMZ) > eth1: 12.34.56.78 (ISP A) > eth2: 87.65.43.21 (ISP B) > > I want this router to use the default gateway for ISP A of 12.34.56.254 > and only use the default gateway of ISP B 78.65.43.1 if the default > gateway of ISP A can not be reached. > > If I set up the interfaces with their IPs and subnets and set up > multiple default routes with varying metrics (for priority) and test by > taking an interface down, things work. However, this is not a realistic > test because the interface will never physically go down. > > For the sake of discussion, let one link be a DSL modem and the other > link be a cable modem. 
Each of the links is an external modem that uses > an ethernet cable to connect in to the router. Thus no matter what the > state of the link coming in to my facility is, the link on the Linux > router will always be up b/c the ethernet between the router and the > modems sitting on the next shelf down will always be up. > > I need a way for the Linux kernel to try to use a default gateway and > switch to another one if it does not see any traffic. > > Any help that any one could offer will be greatly appreciated. > > > > Thanks in advance, > > Grant. . . . > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From gtaylor at riverviewtech.net Thu Jun 21 16:46:55 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 21 16:44:56 2007 Subject: [LARTC] Redundant internet connections. In-Reply-To: <1182411999.6975.58.camel@ras.pc.brisbane.lube> References: <467A2354.1070805@riverviewtech.net> <1182411999.6975.58.camel@ras.pc.brisbane.lube> Message-ID: <467A8F5F.6000001@riverviewtech.net> On 06/21/07 02:46, Russell Stuart wrote: > Well, it may be that you are connected to the modem by Ethernet, but > that doesn't mean you can't arrange to know if the link is up or > down. If you are familiar with Cisco, there is a physical link, and a protocol link. I'm ending with an (physical link) Up / (protocol down) Down scenario, which can not be detected by Linux's device state. > For DSL, you can run PPPoE on your Linux box. That way you will know > when your link is down because the PPPoE connection dies, taking all > routes with it. I do this. It works. In the case of a cable modem > you can request a short dhcp-lease-time (see the option of that name > in dhcp-options(5)) which achieves the same thing. This is by far > the best solution because it reacts quickly, and altering of the > routing table happens automagically as the links go up and down. Ugh! 
Besides the fact that this is not possible (in my scenario) it is in my opinion, EXTREMELY sub-optimal. Don't even get me started on PPPoE. There is also the fact that the DHCP leases would have to be sub-minute in length to even remotely come close to working for this. > Assuming this isn't possible for some reason the only other way to do > this is manually. Ie, you monitor the link somehow. There are any > number of ways you can do this. One nice way is use Nagios to > monitor the link. This is nice because Nagios can do things when the > link goes down and comes back up again - like altering your routing > table. Nagios is also good because it allows for some hysteresis, ie > waiting for a few failed pings before taking action. And it can > report what happened by SMS or whatever. There are a lot of Nagios > type monitoring systems out there, maybe you use one. Failing a home > baked shell script will work just as well. It would just use say: > ping -n -q -c 1 -w 120 -i 20 -I a.d.d.r next.hop.addr in a continuous > loop to verify the link is up. Double Ugh! Why do I need to implement a daemon to do this when just about every other OS that I work with will purportedly do this its self. Linux can purportedly do this too supposedly with Dead Gateway Detection and / or Equal Cost Multipath Routing or some combination there of. No, I feel like there is a way to do this, I'm just over looking it. If I do need to go back to this method, I'll completely re-design what needs to be done or switch to a different router OS (Free/Net BSD?) to do this. > Finally, be careful in how you set up your routing. You want to > avoid asymmetric routing, and that will happen by default when > someone connects to your backup link unless you take special steps to > avoid it. Actually, asymmetric routes are what I want to use in the event traffic does go to the backup route while the primary is up and running. 
Keep in mind that no one will be connecting to any of the IP addresses assigned to the router (save for router management) but rather the globally routable IP addresses in the DMZ behind said router.

Grant. . . .

From rabbit at rabbit.us Thu Jun 21 17:35:13 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Thu Jun 21 17:35:22 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467A2354.1070805@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net>
Message-ID: <467A9AB1.4090902@rabbit.us>

Grant Taylor wrote:
> I need a way for the Linux kernel to try to use a default gateway and
> switch to another one if it does not see any traffic.

I don't know about any working in-kernel solutions, but you can do it trivially with netfilter and a cronjob:

* In netfilter do this:

-t mangle -N ispA
-t mangle -A ispA -j RETURN
-t mangle -N ispB
-t mangle -A ispB -j RETURN
-t mangle -A PREROUTING -i $ifA -s ! a.a.a.a/aa -j ispA
-t mangle -A PREROUTING -i $ifB -s ! b.b.b.b/bb -j ispB

where a.a.a.a and b.b.b.b are subnets describing your first 1 - 2 hops, so traffic from your upstream router will not count.

* Then make a cron job that runs this every minute:

iptables -t mangle -vnxZL isp[AB]

and looks for the first number on the third line. If it is not 0, the link is alive; otherwise change the routing tables accordingly.

Of course you can have up to 1 minute of downtime, but it does not look so bad IMO.

HTH

Peter

From gtaylor at riverviewtech.net Thu Jun 21 17:52:44 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Thu Jun 21 17:50:43 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467A9AB1.4090902@rabbit.us> References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> Message-ID: <467A9ECC.9080905@riverviewtech.net> On 06/21/07 10:35, Peter Rabbitson wrote: > I don't know about any working in-kernel solutions, but you can do it > trivially with netfilter and a cronjob: If I understand what you are proposing correctly, it looks like you are jumping to a sub-chain used used only for counting traffic. If the counters show traffic, you are saying that traffic is flowing across the link and thus the link must be up and functional. Right? If the link is not up and functional the take action to not use that link. I'm also not clearly understanding how matching the source IP will work on either link considering that both links will have the capability to pass traffic for the same globally routable DMZ subnet. Though I think this could be mitigated by altering the rules to count packets going out or coming in an interface rather than based on source / destination IP. > Of course you can have up to 1 minute of downtime, but it does not look > so bad IMO. One minute may or may not be bad. I know that it is a long time (when you are trying to ssh) but automatic failover is better than manual. And the one minute will probably be much faster than manual failover. Grant. . . . From rabbit at rabbit.us Thu Jun 21 18:00:49 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Thu Jun 21 18:00:58 2007 Subject: [LARTC] Redundant internet connections. 
In-Reply-To: <467A9ECC.9080905@riverviewtech.net> References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <467A9ECC.9080905@riverviewtech.net> Message-ID: <467AA0B1.1070603@rabbit.us> Grant Taylor wrote: > On 06/21/07 10:35, Peter Rabbitson wrote: >> I don't know about any working in-kernel solutions, but you can do it >> trivially with netfilter and a cronjob: > > > > If I understand what you are proposing correctly, it looks like you are > jumping to a sub-chain used used only for counting traffic. If the > counters show traffic, you are saying that traffic is flowing across the > link and thus the link must be up and functional. Right? Almost correct > If the link is not up and functional the take action to not use that link. This is not something I do automatically in netfilter - it is a responsibility of the cron job. > I'm also not clearly understanding how matching the source IP will work > on either link considering that both links will have the capability to > pass traffic for the same globally routable DMZ subnet. Though I think > this could be mitigated by altering the rules to count packets going out > or coming in an interface rather than based on source / destination IP. I am counting only INcomming traffic (the -i flag). The source matching is there only for the following reason: consider You ->1-> Uplink router ->2-> Internet If hop 2 is down, then the uplink router might send you back ICMP messages that whatever destination you are trying to reach is unreachable. This will count as traffic from the internet, whereas in fact it isn't. This is why you need to exclude (thus the _!_ in -s) the immediate uplink hops, and count incomming traffic (whatever it might be) from the "far side" of the internet only. From gtaylor at riverviewtech.net Thu Jun 21 18:23:45 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 21 18:21:44 2007 Subject: [LARTC] Redundant internet connections. 
In-Reply-To: <467AA0B1.1070603@rabbit.us> References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <467A9ECC.9080905@riverviewtech.net> <467AA0B1.1070603@rabbit.us> Message-ID: <467AA611.1090606@riverviewtech.net> On 06/21/07 11:00, Peter Rabbitson wrote: > This is not something I do automatically in netfilter - it is a > responsibility of the cron job. *nod* > I am counting only INcomming traffic (the -i flag). The source matching > is there only for the following reason: consider > > You ->1-> Uplink router ->2-> Internet > > If hop 2 is down, then the uplink router might send you back ICMP > messages that whatever destination you are trying to reach is > unreachable. This will count as traffic from the internet, whereas in > fact it isn't. This is why you need to exclude (thus the _!_ in -s) the > immediate uplink hops, and count incomming traffic (whatever it might > be) from the "far side" of the internet only. Ah, here is part of the problem. ( eth1 ) --- (DSL Modem) / DSL Gateway Server --- (DMZ) --- (Linux Router) ( eth2 ) --- (Cable Modem / Cable Gateway Note: Globally routable DMZ is connected to eth0. Traffic will be to / from servers in the DMZ and clients on the internet at large. My "Linux Router" (above) *IS* the system that would send the ICMP ... unreachable message. So, there is not an upstream router to look for traffic from. I suppose that I could match traffic coming in eth1 or eth2, but I would have to be careful about he source / destination. However the very existence of inbound traffic means that the link is up for at least inbound traffic. However I also need to know that I can send traffic too. I've had situations where the traffic would come in but not go out (Do NOT ask how why!). I suppose such monitoring will work, but I still feel like there is a better solution out there. There is also the fact that I am wanting to use one route unless it is down and then use the backup. 
If the primary route is up and traffic comes in the backup, it is to go back out the primary. Grant. . . . From rabbit at rabbit.us Thu Jun 21 18:47:17 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Thu Jun 21 18:47:23 2007 Subject: [LARTC] Redundant internet connections. In-Reply-To: <467AA611.1090606@riverviewtech.net> References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <467A9ECC.9080905@riverviewtech.net> <467AA0B1.1070603@rabbit.us> <467AA611.1090606@riverviewtech.net> Message-ID: <467AAB95.1000204@rabbit.us> Grant Taylor wrote: > On 06/21/07 11:00, Peter Rabbitson wrote: > Ah, here is part of the problem. > > ( eth1 ) --- (DSL Modem) / DSL Gateway > Server --- (DMZ) --- (Linux Router) > ( eth2 ) --- (Cable Modem / Cable Gateway > > Note: Globally routable DMZ is connected to eth0. > > Traffic will be to / from servers in the DMZ and clients on the internet > at large. > > My "Linux Router" (above) *IS* the system that would send the ICMP ... > unreachable message. So, there is not an upstream router to look for > traffic from. > > I suppose that I could match traffic coming in eth1 or eth2, but I would > have to be careful about he source / destination. However the very > existence of inbound traffic means that the link is up for at least > inbound traffic. However I also need to know that I can send traffic > too. You are misunderstanding how ICMP works. The modems themselves are hops, and the thing they connect to is another hop. Just look at the first several entries of a traceroute to any destination, and you will see what I mean. If you still do not believe me - pull the ISP side cable from the modem, while still having your router connected to it, and try to do a ping to somewhere. Look at the source of the dest. unreachable message - it will come from the modem, not from the linux box. > I've had situations where the traffic would come in but not go out > (Do NOT ask how why!). 
This would be a problem with your router configuration. It is virtually impossible to have an upstream problem that would cause this. It either works both ways or does not at all. > I suppose such monitoring will work, but I still feel like there is a > better solution out there. I thought so too, but it seems that the only thing that comes close (and still does not cut it) are the DGD patches. And (this is my personal opinion) the fact they have not been included in the kernel for such a long time, indicates there is something fishy about them. I myself am using a different approach as I am doing load balancing as well. A script sends icmp ping packets with large payloads to several destinations and computes the mean rtt. Then the ratio of both rtts is used to assign link weights. When no pings come back one of the weights will be 0, and effectively no routing will be performed through this link. > There is also the fact that I am wanting to use one route unless it is > down and then use the backup. If the primary route is up and traffic > comes in the backup, it is to go back out the primary. > Nothing above prevents you from doing this, although it is a bad idea. Of course if you know what you are doing and still want to do it - it's your system :) From gtaylor at riverviewtech.net Thu Jun 21 19:02:50 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Thu Jun 21 19:00:53 2007 Subject: [LARTC] Redundant internet connections. In-Reply-To: <467AAB95.1000204@rabbit.us> References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <467A9ECC.9080905@riverviewtech.net> <467AA0B1.1070603@rabbit.us> <467AA611.1090606@riverviewtech.net> <467AAB95.1000204@rabbit.us> Message-ID: <467AAF3A.8090303@riverviewtech.net> On 06/21/07 11:47, Peter Rabbitson wrote: > You are misunderstanding how ICMP works. The modems themselves are hops, > and the thing they connect to is another hop. 
Just look at the first > several entries of a traceroute to any destination, and you will see > what I mean. If you still do not believe me - pull the ISP side cable > from the modem, while still having your router connected to it, and try > to do a ping to somewhere. Look at the source of the dest. unreachable > message - it will come from the modem, not from the linux box. Um, if you are using bridging modems (like I am) you are incorrect. If you are using modem router combos, yes. Every single install that I have used bridging modems on between the Linux router and the ISP acts the same way. If I have a workstation behind a Linux router (that is doing basic NATing) connected to a bridging DSL / Cable modem and I unplug the phone line or the coax cable from the modem, it is the Linux box that sends the ICMP message, NOT the modem. This is as expected too. The bridging modems bridge the traffic from the ethernet to the DSL / cable modem which is in turn bridged from DSL / cable back to a network interface at the ISP. Thus there is one broadcast domain between the Linux router and the ISPs router. Thus there is not IP device between the Linux router and the ISP router to send an ICMP message back. No, again, if you are dealing with modem router combos, I'll grant you what you say, but not on bridging modems. > This would be a problem with your router configuration. It is virtually > impossible to have an upstream problem that would cause this. It either > works both ways or does not at all. No, it was not a fault with my router. It was a fault radio in an (W)ISPs core network. Completely out of my control. When the ISP replaced the piece of equipment in their core (not even on the link to me) things started working correctly again. > I thought so too, but it seems that the only thing that comes close (and > still does not cut it) are the DGD patches. 
And (this is my personal > opinion) the fact they have not been included in the kernel for such a > long time, indicates there is something fishy about them. I agree that something is not quite right about the DGD patches. Though, I've applied them to 2.6.21.5 and did not have any more luck with them, so I'm not sure that there is much use for them. However I think that the DGD tests and failures there is were related to my config not being right. > I myself am using a different approach as I am doing load balancing as > well. A script sends icmp ping packets with large payloads to several > destinations and computes the mean rtt. Then the ratio of both rtts is > used to assign link weights. When no pings come back one of the weights > will be 0, and effectively no routing will be performed through this link. *nod* I am presently using dual load balanced SDSL circuits with automated (OSPF) failover at my office. This is working out VERY well. However the questions I'm asking have to do with a project for a different client. > Nothing above prevents you from doing this, although it is a bad idea. > Of course if you know what you are doing and still want to do it - it's > your system :) The contracts for the connections dictate that one is only used as a backup. If the primary is up any and all traffic outbound is to go out over it. So, if traffic comes in over the backup, returning out bound traffic is to go out the primary. Seeing as how the DMZ behind this router is globally routable, I'm not worried about issues with asymmetric routes. There are asymmetric routes in the core all the time. In my opinion, it is only at the edge where you NAT that you have to maintain IP addresses and thus have to be very careful and avoide asymmetric routes. 
Also, seeing as how both circuits are an ethernet connection that can
carry a frame size / MTU of 1500 bytes, I don't see the problems that
would be introduced by encapsulated traffic like PPPoE for one link
versus the other link. In short, I'm willing to listen to problems with
the asymmetric routes, but I have yet to hear anything that concerns me
or even chafes me a little.



Grant. . . .

From rabbit at rabbit.us Thu Jun 21 19:37:52 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Thu Jun 21 19:37:58 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467AAF3A.8090303@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <467A9ECC.9080905@riverviewtech.net> <467AA0B1.1070603@rabbit.us> <467AA611.1090606@riverviewtech.net> <467AAB95.1000204@rabbit.us> <467AAF3A.8090303@riverviewtech.net>
Message-ID: <467AB770.8080500@rabbit.us>

Grant Taylor wrote:
> No, again, if you are dealing with modem router combos, I'll grant you
> what you say, but not on bridging modems.

*nod* I had several cases when my ISP had problems like the one you
describe below, so the first 2 hops were pingable but nothing outside.
This is why I suggested the entire ISP subnet exclusion, just to be on
the safe side.

>> This would be a problem with your router configuration. It is
>> virtually impossible to have an upstream problem that would cause
>> this. It either works both ways or does not at all.
>
> No, it was not a fault with my router. It was a faulty radio in a
> (W)ISP's core network. Completely out of my control. When the ISP
> replaced the piece of equipment in their core (not even on the link to
> me) things started working correctly again.

I got to give you this one. Murphy at work.

> *nod* I am presently using dual load balanced SDSL circuits with
> automated (OSPF) failover at my office. This is working out VERY well.
> However the questions I'm asking have to do with a project for a
> different client.

No contest here either. It's just rather rare for a small scale end-user
to be able to get access to IGPs.

> asymmetric routes. Also, seeing as how both circuits are an ethernet
> connection that can carry a frame size / MTU of 1500 bytes, I don't see
> the problems that would be introduced by encapsulated traffic like PPPoE
> for one link versus the other link. In short, I'm willing to listen to
> problems with the asymmetric routes, but I have yet to hear anything
> that concerns me or even chafes me a little.
>

I misread the part about the stuff behind the router being routable.
There is nothing wrong with asymmetric routing in this case. However, you
bring up an interesting point about MTU, only to dismiss it right there.
I think you will have a problem with the default MTU of 1500 being
combined with the effective MTU of PPPoE links being 1492. Too many
systems in this day and age have PMTU discovery enabled, and you know
what the current state of ICMP messaging on the net is.

Peter

From ghartung at photobucket.com Thu Jun 21 19:52:45 2007
From: ghartung at photobucket.com (Greg Hartung)
Date: Thu Jun 21 19:53:12 2007
Subject: [LARTC] GRE tunnel
Message-ID:

I am trying to set up GRE between two CentOS 4.5 boxes. I have tried
several variations of what's listed below, but none of them work.

box1:
modprobe ip_gre
ip link set gre0 up
ip tunnel add gretun mode gre local 66.1.1.161 remote 66.1.2.161 ttl 20 dev eth0
ip addr add dev gretun 10.253.253.1 peer 10.253.253.2/24
ip link set dev gretun up
ip route add 10.2.0.0/16 via 10.253.253.2

box2:
modprobe ip_gre
ip link set gre0 up
ip tunnel add gretun mode gre local 66.1.2.161 remote 66.1.1.161 ttl 20 dev eth0
ip addr add dev gretun 10.253.253.2 peer 10.253.253.1/24
ip link set dev gretun up
ip route add 10.1.0.0/16 via 10.253.253.1

tcpdump shows NO rx or tx traffic from either box that isn't ARP or SSH.
It's as if it's not even trying to bring the tunnel up. I'm a Cisco guy,
so I'm lost with my show commands.
The other variations I've tried consist mostly of trying different
combinations of on-net (in the same subnet as eth0 and even the same
address as eth0) and off-net (various combinations of loopback /24 and
/32 addresses in separate 10 space) on the 'ip addr add dev gretun'
statements. But the above example is what *should* work on a Cisco, I
think. It's been a while.

How do I troubleshoot this? This is all I've got so far:

root@den1tun01:/home/root $ ip link
1: lo: mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: mtu 8800 qdisc pfifo_fast qlen 1000
    link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
3: eth0.2: mtu 8800 qdisc noqueue
    link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
4: gre0: mtu 1476 qdisc noqueue
    link/gre 0.0.0.0 brd 0.0.0.0
5: gretun@eth0: mtu 8776 qdisc noqueue
    link/gre 66.1.1.161 peer 66.1.2.161

root@den1tun01:/home/root $ ip tun
gre0: gre/ip remote any local any ttl inherit nopmtudisc
gretun: gre/ip remote 66.1.2.161 local 66.1.1.161 dev eth0 ttl 20

root@den1tun01:/home/root $ ifconfig
eth0      Link encap:Ethernet HWaddr 00:19:B9:DD:FF:D9
          inet addr:10.1.2.243 Bcast:10.1.3.255 Mask:255.255.254.0
          UP BROADCAST RUNNING MULTICAST MTU:8800 Metric:1
          RX packets:3357 errors:0 dropped:0 overruns:0 frame:0
          TX packets:484 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:230757 (225.3 KiB) TX bytes:63937 (62.4 KiB)
          Interrupt:169 Memory:f8000000-f8011100

eth0.2    Link encap:Ethernet HWaddr 00:19:B9:DD:FF:D9
          inet addr:66.1.1.161 Bcast:66.1.1.191 Mask:255.255.255.192
          UP BROADCAST RUNNING MULTICAST MTU:8800 Metric:1
          RX packets:950 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:43860 (42.8 KiB) TX bytes:1200 (1.1 KiB)

gretun    Link encap:UNSPEC HWaddr 42-0B-33-A1-FF-C0-00-00-00-00-00-00-00-00-00-00
          inet addr:10.253.253.1 P-t-P:10.253.253.2 Mask:255.255.255.0
          UP POINTOPOINT RUNNING NOARP MTU:8776 Metric:1
          RX packets:0
errors:0 dropped:0 overruns:0 frame:0
          TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b) TX bytes:756 (756.0 b)

gre0      Link encap:UNSPEC HWaddr 00-00-00-00-FF-00-00-00-00-00-00-00-00-00-00-00
          UP RUNNING NOARP MTU:1476 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:225 errors:0 dropped:0 overruns:0 frame:0
          TX packets:225 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:13271 (12.9 KiB) TX bytes:13271 (12.9 KiB)

I've also tried changing the destination for the route to the near end
of the private subnet and tried pinging various things on the tunnel
subnet and remote network to create "interesting traffic" to bring the
tunnel up, but tcpdump still shows nothing. Then I noticed that ping
does show an error count:

[root@den1tun01 ~]# ping 10.253.253.2
PING 10.253.253.2 (10.253.253.2) 56(84) bytes of data.
From 10.253.253.1 icmp_seq=0 Destination Host Unreachable
From 10.253.253.1 icmp_seq=1 Destination Host Unreachable

--- 10.253.253.2 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms, pipe 2

I can ping the local end, 10.253.253.1, but the tunnel is still
non-functional.

Thanks!
Greg

From gtaylor at riverviewtech.net Thu Jun 21 20:27:15 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Thu Jun 21 20:25:14 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467AB770.8080500@rabbit.us>
References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <467A9ECC.9080905@riverviewtech.net> <467AA0B1.1070603@rabbit.us> <467AA611.1090606@riverviewtech.net> <467AAB95.1000204@rabbit.us> <467AAF3A.8090303@riverviewtech.net> <467AB770.8080500@rabbit.us>
Message-ID: <467AC303.10702@riverviewtech.net>

On 06/21/07 12:37, Peter Rabbitson wrote:
> *nod* I had several cases when my ISP had problems like the one you
> describe below, so the first 2 hops were pingable but nothing outside.
> This is why I suggested the entire ISP subnet exclusion, just to be on
> the safe side.

*nod*

> I got to give you this one. Murphy at work.

Ya, Murphy and I go back a long way. I can usually tell when I'm on the
right track to solving a problem. If I'm about to beat something, I
start having other little problems, i.e. batteries in equipment going
out, not having the proper patch cord (straight through versus
crossover), not having the proper user name and / or password for
equipment, etc. I've gotten to the point that I rather like seeing such
speed bumps because I have noticed that they are usually an indication
that I'm at least going in the right direction.

> No contest here either. It's just rather rare for a small scale end-user
> to be able to get access to IGPs.

Well, just because OSPF is what is used does not mean that I have access
to the IGP. To make things work, I'm having to have my ISP co-locate a
piece of their equipment at my facility so they are using the IGP within
their administrative domain. I pick up from the single ethernet
interface out of said equipment. It's just a political / administrative
paradigm shift, but it does allow the circuits to do what I want them to
do, and rather nicely at that, I might add.

> I misread the part about the stuff behind the router being routable.
> There is nothing wrong with asymmetric routing in this case.
However you
> bring up an interesting point about MTU, only to dismiss it right there.
> I think you will have a problem with the default MTU of 1500 being
> combined with the effective MTU of PPPoE links being 1492. Too many
> systems in this day and age have PMTU discovery enabled, and you know
> what is the current state of ICMP messaging on the net.

*nod* I figured that the globally routable DMZ IPs were not sinking in,
so I tried re-stating it differently to see if it would make it. ;)

Both of my links use statically assigned IP addresses on the raw
ethernet interfaces. Thus there is no encapsulation (MTU) overhead to
worry about, i.e. no PPPoE. Seeing as how I'm running MTUs of 1500 out
my interfaces to the world and at least that or larger in to the ISP
(the ATM links have 4470, set for something else some time previously),
I don't think MTU issues will be on my end.

Incidentally, this is one of the reasons that I try to avoid PPPoE if I
can. Well, MTU and the fact that our local incumbent phone company as an
ISP likes to tear down the PPPoE connections after less than 60 seconds
of inactivity *WITHOUT* notifying the client end. Thus our only reliable
recourse is to tear down the connection on the client end before the
ILEC does, so that we know the state and can re-establish it on demand
when needed.



Grant. . . .

From christian.benvenuti at libero.it Thu Jun 21 22:22:34 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Thu Jun 21 22:12:47 2007
Subject: [LARTC] Re: PQ questions
In-Reply-To: <200706202308.l5KN8GSx010923@ll.mit.edu>
References: <200706202308.l5KN8GSx010923@ll.mit.edu>
Message-ID: <1182457354.2684.23.camel@benve-laptop>

Hi Tim, Andy,

On Wed, 2007-06-20 at 19:07 -0400, Tim Enos wrote:
> It's PQ that is required. Here is what I have for config so far:
>
> tc qdisc add dev eth0 root handle 1: prio bands 4 priomap 0 1 2 3

Is "priomap 0 1 2 3" what you want/need or just a random mapping?
(this is the default mapping that is used when none of the filters
matches)

> tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 match ip tos 0xb8 0xff flowid 1:1
>
> tc filter add dev eth0 parent 1:0 prio 2 protocol ip u32 match ip tos 0x50 0xff flowid 1:2
>
> tc filter add dev eth0 parent 1:0 prio 3 protocol ip u32 match ip tos 0x28 0xff flowid 1:3
>
> tc filter add dev eth0 parent 1:0 prio 4 protocol ip u32 match ip tos 0x00 0xff flowid 1:4
>
>
> tc qdisc add dev eth0 parent 1:1 handle 10: pfifo limit 2
>
> tc qdisc add dev eth0 parent 1:2 handle 11: pfifo limit 2
>
> tc qdisc add dev eth0 parent 1:3 handle 12: pfifo limit 2
>
> tc qdisc add dev eth0 parent 1:4 handle 13: pfifo limit 2
>
> __________
>
> The above config works fine. The last four qdisc lines (handles 10: - 13:
> inclusive) also work as prio if you leave out the 'limit' part of course.

What do you mean?

> The remaining part is to set children for the last four qdiscs (one for
> each). Said children qdiscs would have all the same attributes (as the
> parents (limit is something I'd change; the '2' is just an example). Is this
> possible?

Do you mean something like this?

tc qdisc add dev eth0 parent 10: handle 100: prio ...
tc qdisc add dev eth0 parent 11: handle 110: prio ...
tc qdisc add dev eth0 parent 12: handle 120: prio ...
tc qdisc add dev eth0 parent 13: handle 130: prio ...

Why would you need to put a pfifo qdisc between the two prio qdiscs?
Wouldn't it be better to have

  prio -> prio
OR
  prio -> prio -> pfifo

instead of

  prio -> pfifo -> prio ?

What criteria are you going to use to assign the right priority to the
packets in the nested (i.e., 2nd level) prio qdisc?
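
For illustration only, here is a rough sketch of the prio -> prio -> pfifo layout mentioned above. It assumes eth0 and reuses the outer handles 10: to 13: from the quoted config; the leaf handles (101:, 102:, ...) and the pfifo limit are made up for the example. The function only prints the tc commands rather than running them, and it does not answer the open question of how packets would be classified inside the inner prio qdiscs.

```shell
#!/bin/sh
# Print (do not run) tc commands for a prio -> prio -> pfifo tree.
# DEV and LIMIT are placeholders; the leaf handles are hypothetical.
DEV="${DEV:-eth0}"
LIMIT="${LIMIT:-2}"

print_nested_prio() {
    # Outer prio qdisc, copied from the quoted config.
    echo "tc qdisc add dev ${DEV} root handle 1: prio bands 4 priomap 0 1 2 3"
    band=1
    while [ "$band" -le 4 ]; do
        inner=$((9 + band))    # 10, 11, 12, 13 as in the thread
        # Inner prio qdisc (default of 3 bands) under each outer band.
        echo "tc qdisc add dev ${DEV} parent 1:${band} handle ${inner}: prio"
        sub=1
        while [ "$sub" -le 3 ]; do
            # One pfifo leaf under each band of the inner prio qdisc.
            echo "tc qdisc add dev ${DEV} parent ${inner}:${sub} handle ${inner}${sub}: pfifo limit ${LIMIT}"
            sub=$((sub + 1))
        done
        band=$((band + 1))
    done
}

print_nested_prio
```

Piping the output to sh as root would apply it, but it is probably worth reading the printed commands first.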

Regards
/Christian
[ http://benve.info ]

> > -----Original Message-----
> > From: Andy Furniss [mailto:lists@andyfurniss.entadsl.com]
> > Sent: Tuesday, June 19, 2007 6:17 PM
> > To: Tim Enos
> > Cc: 'Christian Benvenuti'; lartc@mailman.ds9a.nl
> > Subject: Re: [LARTC] Re: PQ questions
> >
> > Tim Enos wrote:
> > > Cool,
> > >
> > > Thanks Christian! I'm wishing that all of those same params showed up in the
> > > output without having to run anything. No problem. Should it matter that I'm
> > > using an emulated interface?
> >
> > Quite possibly - using prio on real devices still can appear not to work
> > until you have filled up any buffer the driver uses.
> >
> > On my 100meg eth it would take 5/6 unscaled tcp connections to fill
> > enough for prio to do anything.
> >
> > You can use prio as a child of hfsc/htb so that they set the rate. It
> > may be nicer to use htb's own prio though, if you need a slow rate and
> > care about latency.
> >
> > Andy.

From alex at samad.com.au Thu Jun 21 23:01:01 2007
From: alex at samad.com.au (Alex Samad)
Date: Thu Jun 21 23:01:06 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467A9AB1.4090902@rabbit.us>
References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us>
Message-ID: <20070621210101.GB31479@samad.com.au>

On Thu, Jun 21, 2007 at 05:35:13PM +0200, Peter Rabbitson wrote:
> Grant Taylor wrote:
>
> >I need a way for the Linux kernel to try to use a default gateway and
> >switch to another one if it does not see any traffic.

Should something like this work?

default proto static metric 5
        nexthop via 58.173.108.1 dev vlan2 weight 10
        nexthop via 10.20.20.106 dev ppp0 weight 20

and then let the DGD detect dead gateways and drop the relevant route.
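
Written out as a command, the route above would look roughly like the line printed by the sketch below (same gateways, devices and weights as in the message; nothing is actually installed). Assuming the weights act as relative shares of a multipath route, the helper also prints the approximate percentage each next hop would get, which suggests this form balances traffic rather than keeping one link purely as a backup.

```shell
#!/bin/sh
# Print (do not run) the multipath default route from the message above.
print_multipath_route() {
    echo "ip route add default proto static metric 5 nexthop via 58.173.108.1 dev vlan2 weight 10 nexthop via 10.20.20.106 dev ppp0 weight 20"
}

# share_percent WEIGHT TOTAL
# Integer percentage of TOTAL that WEIGHT represents, rounded down.
share_percent() {
    echo $(( $1 * 100 / $2 ))
}

print_multipath_route
echo "vlan2: $(share_percent 10 30)%  ppp0: $(share_percent 20 30)%"
```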
>
> I don't know about any working in-kernel solutions, but you can do it
> trivially with netfilter and a cronjob:
>
> * In netfilter do this:
> -t mangle -N ispA
> -t mangle -A ispA -j RETURN
> -t mangle -N ispB
> -t mangle -A ispB -j RETURN
> -t mangle -A PREROUTING -i $ifA -s ! a.a.a.a/aa -j ispA
> -t mangle -A PREROUTING -i $ifB -s ! b.b.b.b/bb -j ispB
>
> where a.a.a.a and b.b.b.b are subnets describing your first 1 - 2 hops,
> so traffic from your upstream router will not count.
>
> * Then make a cron job that runs this every minute:
> iptables -t mangle -vnxZL isp[AB]
> and will look for the first number on the third line. If it is not 0 -
> the link is alive, otherwise change the routing tables accordingly.
>
> Of course you can have up to 1 minute of downtime, but it does not look
> so bad IMO.
>
> HTH
>
> Peter
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070622/c483913b/attachment.pgp

From christian.benvenuti at libero.it Thu Jun 21 23:22:28 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Thu Jun 21 23:12:38 2007
Subject: [LARTC] Re: HTB question, tokens.
Message-ID: <1182460948.2684.44.camel@benve-laptop>

Hi Mark,

>Hi,
>
>What exactly are the "tokens"?
>
>I thought each token allowed the sending of one byte, that tokens are
>stored in a bucket that can hold a max of "burst" tokens, and that this
>bucket is filled with tokens at "rate".
>
>But theory does not seem to explain the "tc -s .." output in the
>examples below. And I can't figure out why or how...

Tokens normally represent the number of bytes the token bucket algorithm
has accumulated.
However, the numbers you see with tokens/ctokens are not expressed in
bytes: they are expressed in units of time whose size is an
approximation of 1 microsecond (how close a unit of time is to
1 microsecond depends on the kernel config).
For example, the value of "tokens" that you see soon after configuring
the HTB qdisc (and supposing no traffic has gone through the qdisc yet)
is the number of pseudo microseconds that are necessary to transmit
"burst" bytes at the rate "rate" configured on the class.

It may look more complex than it actually is.
Just think of it as the number of (pseudo) microseconds the class can
transmit at rate "rate" without exhausting its tokens.

The last sentence above should answer your questions in the second part
of the email too.

Regards
/Christian
[ http://benve.info ]

>#tc qdisc del dev eth0 root
>#tc qdisc add dev eth0 root handle 1: htb default 1
>#tc class add dev eth0 parent 1:0 classid 1:1 htb rate 2mbit
>#tc -s -d class show dev eth0
>class htb 1:1 root prio 0 quantum 25000 rate 2000Kbit ceil 2000Kbit burst 2599b/8 mpu 0b overhead 0b cburst 2599b/8 mpu 0b overhead 0b level 0
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
> lended: 0 borrowed: 0 giants: 0
> tokens: 10649 ctokens: 10649
>
>#tc qdisc del dev eth0 root
>#tc qdisc add dev eth0 root handle 1: htb default 1
>#tc class add dev eth0 parent 1:0 classid 1:1 htb rate 1mbit
>#tc -s -d class show dev eth0
>class htb 1:1 root prio 0 quantum 12500 rate 1000Kbit ceil 1000Kbit burst 2099b/8 mpu 0b overhead 0b cburst 2099b/8 mpu 0b overhead 0b level 0
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
> lended: 0 borrowed: 0 giants: 0
> tokens: 17203 ctokens: 17203
>
>Why does the amount of tokens go UP if the configured rate (and burst) is
>lower?
>(The commands were run from a script so these are the amounts of tokens
>available right after the creation of the class.)
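
As a rough check of the "tokens are time, not bytes" explanation, the sketch below computes how many microseconds it takes to send the reported burst at the configured rate, using the burst and rate values from the quoted tc output. It assumes that tc's Kbit means 1000 bit/s and it ignores the kernel's tick scaling, so the results land near, not exactly on, the reported token counts.

```shell
#!/bin/sh
# burst_time_us BURST_BYTES RATE_KBIT
# Microseconds needed to send BURST_BYTES at RATE_KBIT,
# assuming 1 Kbit = 1000 bit/s (no kernel tick scaling applied).
burst_time_us() {
    echo $(( $1 * 8 * 1000 / $2 ))
}

echo "2000Kbit, burst 2599b: $(burst_time_us 2599 2000) us (tc reported tokens: 10649)"
echo "1000Kbit, burst 2099b: $(burst_time_us 2099 1000) us (tc reported tokens: 17203)"
```

The lower rate needs more time to send a similar burst, which is consistent with the token count going up when the rate goes down.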
>
>If I set the rate to 9mbit the amount of tokens is always lower than the
>burst size. Wouldn't that mean that there are always too few tokens
>available to actually burst the "burst" amount of data?
>
>Regards,
>Mark.

From gtaylor at riverviewtech.net Thu Jun 21 23:24:19 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Thu Jun 21 23:22:25 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <20070621210101.GB31479@samad.com.au>
References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <20070621210101.GB31479@samad.com.au>
Message-ID: <467AEC83.1070502@riverviewtech.net>

On 06/21/07 16:01, Alex Samad wrote:
> should something like this work
>
> default proto static metric 5
>         nexthop via 58.173.108.1 dev vlan2 weight 10
>         nexthop via 10.20.20.106 dev ppp0 weight 20
>
> and then let the dgd detect dead gateways and drop the relevant route
> about.

Doesn't this use "Equal Cost Multi Path" (ECMP) routing?

If so, how does this take into account that I do not want any of the
traffic to run over the backup connection unless the primary is down?

It is my understanding that the weights of an ECMP route are for a
fraction of the traffic. I.e. 10/30 and 20/30 of the traffic will use
each of the routes.

(Note: I state 10/30 and 20/30 because the man page indicates that
10/30 does not equal 1/3. Namely because the kernel creates an in-memory
route for each weight for each route. Thus if you use a weight of 10,
there will be 10 routes in memory.)



Grant. . . .

From rmartija at telcordia.com Thu Jun 21 23:47:08 2007
From: rmartija at telcordia.com (Martija, Ricardo V)
Date: Thu Jun 21 23:47:18 2007
Subject: [LARTC] How do you delete a filter?
Message-ID:

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 4242 bytes
Desc: not available
Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070621/be6e7aac/smime.bin

From alex at samad.com.au Fri Jun 22 00:18:23 2007
From: alex at samad.com.au (Alex Samad)
Date: Fri Jun 22 00:18:29 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467AEC83.1070502@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <20070621210101.GB31479@samad.com.au> <467AEC83.1070502@riverviewtech.net>
Message-ID: <20070621221823.GE31479@samad.com.au>

On Thu, Jun 21, 2007 at 04:24:19PM -0500, Grant Taylor wrote:
> On 06/21/07 16:01, Alex Samad wrote:
> >should something like this work
> >
> >default proto static metric 5
> >        nexthop via 58.173.108.1 dev vlan2 weight 10
> >        nexthop via 10.20.20.106 dev ppp0 weight 20
> >
> >and then let the dgd detect dead gateways and drop the relevant route
> >about.
>
> Doesn't this use "Equal Cost Multi Path" (ECMP) routing?

Sorry, yep, just woken up, reading and answering whilst eating breakfast.

Okay, then why not

default via preferred path
default via backup path metric 100

>
> If so, how does this take in to account that I do not want any of the
> traffic to run over the backup connection unless the primary is down?
>
> It is my understanding that the weights of an ECMP route are for a
> fraction of the traffic. I.e. 10/30 and 20/30 of the traffic will use
> each of the routes.
>
> (Note: I state 10/30 and 20/30 because the man page indicates that
> 10/30 does not equal 1/3. Namely because the kernel creates an in
> memory route for each weight for each route. Thus if you use a weight
> of 10, there will be 10 routes in memory.)
>
>
>
> Grant. . . .
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070622/5bcca79d/attachment.pgp

From gtaylor at riverviewtech.net Fri Jun 22 00:23:23 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Fri Jun 22 00:21:22 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <20070621221823.GE31479@samad.com.au>
References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <20070621210101.GB31479@samad.com.au> <467AEC83.1070502@riverviewtech.net> <20070621221823.GE31479@samad.com.au>
Message-ID: <467AFA5B.8030601@riverviewtech.net>

On 06/21/07 17:18, Alex Samad wrote:
> sorry yep, just woken up, reading and answering whilst eating breakfast

*nod*

> okay then why not
>
> default via preferred path
> default via backup path metric 100

I've done that with a metric of 0/1, and 1/2. The problem that I'm
seeing is that the system will never try to use the second metric. It's
as if the system will never go to a next higher metric if it does not
receive an error while trying to use a lower metric.



Grant. . . .

From alex at samad.com.au Fri Jun 22 00:30:16 2007
From: alex at samad.com.au (Alex Samad)
Date: Fri Jun 22 00:30:22 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467AFA5B.8030601@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <20070621210101.GB31479@samad.com.au> <467AEC83.1070502@riverviewtech.net> <20070621221823.GE31479@samad.com.au> <467AFA5B.8030601@riverviewtech.net>
Message-ID: <20070621223016.GF31479@samad.com.au>

On Thu, Jun 21, 2007 at 05:23:23PM -0500, Grant Taylor wrote:
> On 06/21/07 17:18, Alex Samad wrote:
> >sorry yep, just woken up, reading and answering whilst eating breakfast
>
> *nod*
>
> >okay then why not
> >
> >default via preferred path
> >default via backup path metric 100
>
> I've done that with a metric of 0/1, and 1/2.
The problem that I'm
> seeing is that the system will never try to use the second metric. It's
> as if the system will never go to a next higher metric if it does not
> receive an error while trying to use a lower metric.

Strange. I am running openwrt on a linksys wr54gs with 1 cable and
1 adsl link. I load balance (and also have Julian's patches applied -
it's 2.4.30). When the routing notices the link is dead (so if I do an
ip li.), it marks the routes as dead and stops using them. Once the
interface is brought down, the routes disappear.

I haven't followed the DGD threads, but I seem to remember it having
some problem with upstream detection.

You talked about getting OSPF routing for this. Is this from the ISPs,
inbound as well as outbound? Wouldn't OSPF handle link state as well?
(It's been a while since I looked at OSPF.)

>
>
>
> Grant. . . .
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mailman.ds9a.nl/pipermail/lartc/attachments/20070622/92f4adc2/attachment.pgp

From gtaylor at riverviewtech.net Fri Jun 22 00:35:14 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Fri Jun 22 00:33:19 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467A2354.1070805@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net>
Message-ID: <467AFD22.1030501@riverviewtech.net>

Ok, after more testing and trying things that others have suggested,
I've made some headway. Or at least what I think is some headway.

This is not an answer, just data that I have gathered along the way to
help others that are trying to help me.

I have determined that either I cannot get the DGD patches
(routes-2.6.21-15.diff) off of Julian's site to work the way that I
think they should, or I'm using the wrong patch from there, or said
patch does not work. I don't know which, and I can't really say one way
or the other.

If I compile a stock 2.6.21.5 kernel (plus a patch to see my VMWare LSI
SCSI card (should make no difference in routing)) without ECMP or any
advanced routing, I can get the system to fail over to the next route
after a period of time if the first is down. I do this by adding the two
alternate routes with the same metric in the reverse of the order that I
want to use them. I.e. if I have the following routes: a.b.c.d
(preferred) and z.y.x.w (backup), and I add the backup route and then
the preferred route, it will fail over after some time. If I set
/proc/sys/net/ipv4/route/gc_timeout to 10 seconds the system will fall
back to the backup route in about 120 seconds. I'm still playing with
numbers in the /proc tree. The problem with this method is that I have
yet to get it to start re-using the primary route when it becomes
available again.

If I use the previously mentioned DGD patch, the system will just try to
cache the route for something like 245 days. I'm still wondering if I
am applying the correct patch. This happens with or without ECMP
compiled in to the kernel.



Grant. . . .

From gtaylor at riverviewtech.net Fri Jun 22 00:39:13 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Fri Jun 22 00:37:11 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <20070621223016.GF31479@samad.com.au>
References: <467A2354.1070805@riverviewtech.net> <467A9AB1.4090902@rabbit.us> <20070621210101.GB31479@samad.com.au> <467AEC83.1070502@riverviewtech.net> <20070621221823.GE31479@samad.com.au> <467AFA5B.8030601@riverviewtech.net> <20070621223016.GF31479@samad.com.au>
Message-ID: <467AFE11.2070805@riverviewtech.net>

On 06/21/07 17:30, Alex Samad wrote:
> Strange I am running openwrt on a linksys wr54gs with 1 cable and 1
> adsl. I load balance, (also have julian patches applied - it's
> 2.4.30), when the routing notices the link is dead, so if I do a ip
> li. then it marks the routes as dead and stops using them, once the
> interface is brought down the routes disappear

I am not wanting load balancing. Rather I want to use one link and only
use the second if the first is down.

> I haven't followed the dgd threads, but I seem to remember it having
> some problem with upstream detection.

*nod* I'm getting that consensus.

> You talked about getting OSPF routing for this, is this from the
> ISP's inbound as well as outbound. Wouldn't OSPF handle link state as
> well ? (it been a while since I looked at OSPF)

The OSPF was for a different project / different installation.



Grant. . . .

From gustavo at angulosolido.pt Fri Jun 22 13:54:00 2007
From: gustavo at angulosolido.pt (Gustavo Homem)
Date: Fri Jun 22 13:54:29 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467AAF3A.8090303@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net> <467AAB95.1000204@rabbit.us> <467AAF3A.8090303@riverviewtech.net>
Message-ID: <200706221254.01339.gustavo@angulosolido.pt>

On Thursday 21 June 2007 18:02, Grant Taylor wrote:
> On 06/21/07 11:47, Peter Rabbitson wrote:
> > You are misunderstanding how ICMP works. The modems themselves are hops,
> > and the thing they connect to is another hop. Just look at the first
> > several entries of a traceroute to any destination, and you will see
> > what I mean.
If you still do not believe me - pull the ISP side cable
> > from the modem, while still having your router connected to it, and try
> > to do a ping to somewhere. Look at the source of the dest. unreachable
> > message - it will come from the modem, not from the linux box.
>
> Um, if you are using bridging modems (like I am) you are incorrect.

This is absolutely the way to do it with ADSL.

Using a modem in bridged mode minimizes the responsibility of the
modem/router, which is a potentially unstable device. Let the stable
Linux box do the work (routing+nat) and get the public IP. And firewall
the Linux box itself with iptables. This is the most flexible and stable
way to go.

Cheers
Gustavo

--
Angulo Sólido - Tecnologias de Informação
http://angulosolido.pt

From gtaylor at riverviewtech.net Fri Jun 22 16:22:39 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Fri Jun 22 16:20:39 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <200706221254.01339.gustavo@angulosolido.pt>
References: <467A2354.1070805@riverviewtech.net> <467AAB95.1000204@rabbit.us> <467AAF3A.8090303@riverviewtech.net> <200706221254.01339.gustavo@angulosolido.pt>
Message-ID: <467BDB2F.8080003@riverviewtech.net>

(Off thread topic.)

On 06/22/07 06:54, Gustavo Homem wrote:
> This is absolutely the way to do it with ADSL.

I could not agree more.

> Using a modem in bridged mode minimizes the responsibility of the
> modem/router which is a potentially unstable device. Let the stable
> Linux box do the work (routing+nat) and get the public IP. And
> firewall the Linux box itself with iptables. This is the most
> flexible and stable way to go.

*nod* About the only thing that I'm looking at doing differently at my
house is to use the Thomson USB SpeedTouch (330) USB ADSL modem to put
the ATM stack on the Linux box itself.
This way the Linux kernel will handle the bridging and buffering, versus
an external device that has arbitrary pauses waiting for buffers to fill
prior to transmitting data.

My preliminary tests with the ATM stack on Linux show a speed increase
over the external bridging modem too. :) My tests show that Linux /
Windows think the raw ATM with bridging circuit will get close to 1.6
Mbps while the bridged devices get closer to 1.5 Mbps. I also see a
lower latency between the device connected to the DSL and the upstream
gateway by a factor of 3 - 5 ms.



Grant. . . .

From gustavo at angulosolido.pt Fri Jun 22 16:57:43 2007
From: gustavo at angulosolido.pt (Gustavo Homem)
Date: Fri Jun 22 16:58:08 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467BDB2F.8080003@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net> <200706221254.01339.gustavo@angulosolido.pt> <467BDB2F.8080003@riverviewtech.net>
Message-ID: <200706221557.43742.gustavo@angulosolido.pt>

On Friday 22 June 2007 15:22, Grant Taylor wrote:
> (Off thread topic.)
>
> On 06/22/07 06:54, Gustavo Homem wrote:
> > This is absolutely the way to do it with ADSL.
>
> I could not agree more.
>
> > Using a modem in bridged mode minimizes the responsibility of the
> > modem/router which is a potentially unstable device. Let the stable
> > Linux box do the work (routing+nat) and get the public IP. And
> > firewall the Linux box itself with iptables. This is the most
> > flexible and stable way to go.
>
> *nod* About the only thing that I'm looking at doing differently at my
> house is to use the Thomson USB SpeedTouch (330) USB ADSL modem to put
> the ATM stack on the Linux box itself.

I've done this, but I think it's unreliable for professional use. The
USB modems are non-standard, so if one burns out you can't exchange it
for a different one without feasible but time consuming tweaking (I
tried more than one USB device...).
Even for Ethernet bridging devices I only use models which are delivered by ISPs (rather than retail shop devices), to guarantee they were tested for stability:

POTS: http://www.huawei.com/products/terminal/products/view.do?id=87

ISDN: http://www.acbs-dsl-store.com/contenu/Articles/Article.asp?PdtNum=DSLGP628LP

These models run forever in bridged mode. The second one accepts multiple PPPoE clients on different ports.

> This way the Linux kernel will
> handle the bridging and buffering versus an external device that has
> arbitrary pauses waiting for buffers to fill prior to transmitting data.
>
> My preliminary tests with the ATM stack on Linux show a speed increase
> over the external bridging modem too. :) My tests show that Linux /

That's to be expected, since using PPPoA instead of PPPoEoA reduces the overhead. But I don't know a standard PPPoA setup.

But if we want QoS working, we can't use the full line capability anyway.

> Windows think the raw ATM with bridging circuit will get close to 1.6
> Mbps while the bridged devices get closer to 1.5 Mbps. I also see a
> lower latency between the device connected to the DSL and the upstream
> gateway, by 3 - 5 ms.

Even if that happens, it would hardly compensate for the risk of lower reliability.

Cheers
Gustavo

--
Angulo Sólido - Tecnologias de Informação
http://angulosolido.pt

From gtaylor at riverviewtech.net Fri Jun 22 17:59:53 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Fri Jun 22 17:57:52 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <200706221557.43742.gustavo@angulosolido.pt>
References: <467A2354.1070805@riverviewtech.net> <200706221254.01339.gustavo@angulosolido.pt> <467BDB2F.8080003@riverviewtech.net> <200706221557.43742.gustavo@angulosolido.pt>
Message-ID: <467BF1F9.9050207@riverviewtech.net>

On 06/22/07 09:57, Gustavo Homem wrote:
> I've done this, but I think it's unreliable for professional use. The
> USB modems are non-standard, so if one burns out you can't exchange it for
> a different one without feasible but time-consuming tweaking (I tried
> more than one USB device...).
>
> Even for Ethernet bridging devices I only use models which are
> delivered by ISPs (rather than retail shop devices), to guarantee they
> were tested for stability:
>
> POTS: http://www.huawei.com/products/terminal/products/view.do?id=87
>
> ISDN: http://www.acbs-dsl-store.com/contenu/Articles/Article.asp?PdtNum=DSLGP628LP
>
>
> These models run forever in bridged mode. The second one accepts
> multiple PPPoE clients on different ports.
>
>
> That's to be expected, since using PPPoA instead of PPPoEoA reduces the
> overhead. But I don't know a standard PPPoA setup.
>
> But if we want QoS working, we can't use the full line capability
> anyway.
>
> Even if that happens, it would hardly compensate for the risk of lower
> reliability.

All very valid points and things to consider. However, for a home / non-critical environment, it provides a lot of potential.

Grant. . . .

From tenos at ll.mit.edu Fri Jun 22 19:27:14 2007
From: tenos at ll.mit.edu (Tim Enos)
Date: Fri Jun 22 19:27:57 2007
Subject: [LARTC] RE: PQ questions
In-Reply-To: <1182457354.2684.23.camel@benve-laptop>
Message-ID: <200706221727.l5MHRbh2018048@ll.mit.edu>

Hi Christian,

Good morning, and thank you for proving me correct about how professional and responsive people on this list are (sincerely). Brief comments in-line:

> -----Original Message-----
> From: Christian Benvenuti [mailto:christian.benvenuti@libero.it]
> Sent: Thursday, June 21, 2007 4:23 PM
> To: Tim Enos
> Cc: lists@andyfurniss.entadsl.com; lartc@mailman.ds9a.nl
> Subject: Re: PQ questions
>
> Hi Tim, Andy,
>
> On Wed, 2007-06-20 at 19:07 -0400, Tim Enos wrote:
> > It's PQ that is required.
Here is what I have for config so far:
> >
> > tc qdisc add dev eth0 root handle 1: prio bands 4 priomap 0 1 2 3
>
> Is "priomap 0 1 2 3" what you want/need or just a random mapping?
> (this is the default mapping that is used when none of the filters
> matches)
>
> > tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 match ip tos 0xb8 0xff flowid 1:1
> >
> > tc filter add dev eth0 parent 1:0 prio 2 protocol ip u32 match ip tos 0x50 0xff flowid 1:2
> >
> > tc filter add dev eth0 parent 1:0 prio 3 protocol ip u32 match ip tos 0x28 0xff flowid 1:3
> >
> > tc filter add dev eth0 parent 1:0 prio 4 protocol ip u32 match ip tos 0x00 0xff flowid 1:4
> >
> >
> > tc qdisc add dev eth0 parent 1:1 handle 10: pfifo limit 2
> >
> > tc qdisc add dev eth0 parent 1:2 handle 11: pfifo limit 2
> >
> > tc qdisc add dev eth0 parent 1:3 handle 12: pfifo limit 2
> >
> > tc qdisc add dev eth0 parent 1:4 handle 13: pfifo limit 2
> >
> > __________
> >
> > The above config works fine. The last four qdisc lines (handles 10: - 13:
> > inclusive) also work as prio if you leave out the 'limit' part of course.
>
> What do you mean?

I mean that when saying something like:

"qdisc add dev eth0 parent 1:1 handle 10: prio limit 2"

you will get the following error (at least I do):

"What is "limit"?
Usage: ... prio bands NUMBER priomap P1 P2..."

Changing the line like so works (and no error messages are generated):

"qdisc add dev eth0 parent 1:1 handle 10: prio"

>
> > The remaining part is to set children for the last four qdiscs (one for
> > each). Said children qdiscs would have all the same attributes as the
> > parents (limit is something I'd change; the '2' is just an example). Is this
> > possible?
>
> Do you mean something like this?
>
> tc qdisc add dev eth0 parent 10: handle 100: prio ...
> tc qdisc add dev eth0 parent 11: handle 110: prio ...
> tc qdisc add dev eth0 parent 12: handle 120: prio ...
> tc qdisc add dev eth0 parent 13: handle 130: prio ...
Yes.

>
> Why would you need to put a pfifo qdisc between the two prio qdiscs?
> Wouldn't it be better to have
>
> prio -> prio
>
> OR
>
> prio -> prio -> pfifo
>
> instead of
>
> prio -> pfifo -> prio ?
>
> What criteria are you going to use to assign the right priority to
> the packets in the nested (i.e., 2nd level) prio qdisc?

The idea is that within each of the four priority classes/queues there would be two queues: one of some very small length (say 2) and another of some larger length (whatever the default is). So the thinking is that the traffic (having been marked by the application, say) hits the top-level queue. If the traffic is marked EF, it will go into the highest priority queue. Once in that queue, it will hit the first pfifo (which in this model is 2 packets long). It will then hit the second pfifo queue before heading out onto the wire.

The ultimate concern is to know how many packets are in each of the priority queues at any given time.

>
> Regards
> /Christian
> [ http://benve.info ]
>
> >
> > > -----Original Message-----
> > > From: Andy Furniss [mailto:lists@andyfurniss.entadsl.com]
> > > Sent: Tuesday, June 19, 2007 6:17 PM
> > > To: Tim Enos
> > > Cc: 'Christian Benvenuti'; lartc@mailman.ds9a.nl
> > > Subject: Re: [LARTC] Re: PQ questions
> > >
> > > Tim Enos wrote:
> > > > Cool,
> > > >
> > > > Thanks Christian! I'm wishing that all of those same params showed up in the
> > > > output without having to run anything. No problem. Should it matter that I'm
> > > > using an emulated interface?
> > >
> > > Quite possibly - using prio on real devices still can appear not to work
> > > until you have filled up any buffer the driver uses.
> > >
> > > On my 100meg eth it would take 5/6 unscaled tcp connections to fill
> > > enough for prio to do anything.
> > >
> > > You can use prio as a child of hfsc/htb so that they set the rate. It
> > > may be nicer to use htb's own prio though, if you need a slow rate and
> > > care about latency.
> > >
> > > Andy.

From gustavo at angulosolido.pt Fri Jun 22 20:23:34 2007
From: gustavo at angulosolido.pt (Gustavo Homem)
Date: Fri Jun 22 20:23:45 2007
Subject: [LARTC] ATM [Cell Tax]
In-Reply-To: <57ca0490706201304p1d7fda13k95be598ecb669e51@mail.gmail.com>
References: <57ca0490706201304p1d7fda13k95be598ecb669e51@mail.gmail.com>
Message-ID: <200706221923.34274.gustavo@angulosolido.pt>

On Wednesday 20 June 2007 21:04, Nate Fuhriman wrote:
> I have read the thread at
> http://mailman.ds9a.nl/pipermail/lartc/2006q1/018287.html
> and still don't know how to fix this problem. It appears a lot of work
> has gone into it but the HOWTO is so out of date it doesn't even begin
> to address this method.
>
> So here are my questions
> 1. what is the current state of these patches? are they in a specific
> version? do i have to patch myself?
> 2. how do i actually use this once patched in? an example script would
> work great!
> 3. is there a table for us mere mortals that describes how to figure
> out which type of adsl/atm i'm using so i can set the appropriate
> overhead?

4. Does someone know if there's a plan for the inclusion of these patches in iproute and the kernel?

Thanks
Gustavo

>
> thanks for all the great work on QOS!
> nate
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

--
Angulo Sólido - Tecnologias de Informação
http://angulosolido.pt

From gtaylor at riverviewtech.net Fri Jun 22 20:57:51 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Fri Jun 22 20:55:50 2007
Subject: [LARTC] Redundant internet connections.
In-Reply-To: <467AFD22.1030501@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net> <467AFD22.1030501@riverviewtech.net>
Message-ID: <467C1BAF.60403@riverviewtech.net>

On 06/21/07 17:35, Grant Taylor wrote:
> The problem with this method is that I have yet to get it to start
> re-using the primary route when it becomes available again.

After doing some more testing and investigation, I think I know why the system appears to not be using the primary route.

My test / lab setup consists of a Linux router with two subnets bound to one interface (eth0 and eth0:1) and my (VMWare) test Linux system with two ethernet interfaces bridged to the local LAN with one subnet on each interface. I have two (as far as Linux is concerned) physical interfaces so that I can have TX / RX counters for each interface to see which way the traffic is going out.

This worked fine to have the system fall from the primary down to the secondary route when the primary route went away. However, I never saw the traffic from the test Linux system go back to the interface for the primary route. After doing some investigation I think this is because the same MAC address is used for both the primary and secondary routes, seeing as how both addresses are on the same physical interface on my Linux router.

So, to test this, I took down the primary route and let the test Linux box fall back to the backup route, which it did. Then I brought the primary route back on line and waited. As expected, the traffic did not start using the primary route, presumably because of MAC addresses for routes being cached with an association to a device. So, while the system was pinging out to the world with the primary route brought back up, I cleared entries from the local test Linux box's ARP cache and all of a sudden, traffic started going out the correct interface.

So, now I think that the method of having two equal cost (metric) routes on the box will work.
I'm now going to test where the two routes are different MAC addresses to see if the traffic does indeed start using the proper route again. (Seeing as how there should not be any confusion with MAC addresses.)

Grant. . . .

From gtaylor at riverviewtech.net Fri Jun 22 23:08:39 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Fri Jun 22 23:06:39 2007
Subject: [LARTC] Redundant internet connections. !!!SOLVED!!!
In-Reply-To: <467C1BAF.60403@riverviewtech.net>
References: <467A2354.1070805@riverviewtech.net> <467AFD22.1030501@riverviewtech.net> <467C1BAF.60403@riverviewtech.net>
Message-ID: <467C3A57.1080909@riverviewtech.net>

On 06/22/07 13:57, Grant Taylor wrote:
> I'm now going to test where the two routes are different MAC
> addresses to see if the traffic does indeed start using the proper
> route again.

Ok, I have done it and it is working.

The short answer is all you need to have backup routes is to enter them in reverse order. You do not need to do any special kernel options, patch the kernel or any thing else, or any special ip rules. All you need to do is to enter the routes in the reverse of the order that you want them to be used.

For example, say I have different internet connections, each with its own default gateway. Obviously the default gateways must not be on the same subnet.

GW1: A.B.C.D
GW2: Z.Y.X.W
GW3: K.L.M.N

route add default gw K.L.M.N
route add default gw Z.Y.X.W
route add default gw A.B.C.D

Note: All the above routes are the same metric (default of 0).

I do not know why you have to add the routes in reverse. I have just noticed that route adds the routes as the highest priority to the routing table. Filled from the top, not the bottom type thing. So, conversely, add them in the reverse order.

In my current test environment I have two identical VMWare virtual machines (literal copy from one to the other) whose configuration I have modified and tested. I'll try to depict it below:

          (        ISP 1 ) --- ... --- ( ISP 1) --- ( Internet )
          (              )                 |
(DMZ) --- (    Router    )         ( Peering Link)
          (              )                 |
          (        ISP 2 ) --- ... --- ( ISP 2) --- ( Internet )

In this scenario, the DMZ IP address space is from ISP 1. ISP 1 has a route to the DMZ via the ISP 1 IP address on my local Linux router. ISP 1 has a secondary route to the DMZ via the IP address on ISP 2's router over the peering link. ISP 2 has a route to the DMZ via the ISP 2 IP address on my local Linux router.

The link between my local Linux router and ISP 1 is a high speed wireless link. The link between my local Linux router and ISP 2 is a lower speed ADSL link. The ADSL link from ISP 2 is *ONLY* used for backup access in case my local Linux router is unable to communicate with ISP 1's router. Thus if for some reason traffic does come in to my ISP 2 IP address it is to go back out the ISP 1 link, thus asymmetric routing.

I appreciate all the suggestions that everyone submitted while trying to help resolve this issue. In the end it turned out that everything that was needed is already in the stock / vanilla kernel.org kernel. All I had to do was be smart enough to use it.

Some points to help others with this issue if they ever need it:
- Equal Cost Multi Path (a.k.a. E.C.M.P.) routing is NOT needed.
- NO ip rule(s) were needed to pull this off.
- NO additional routing tables were needed to pull this off.
- NO patches (i.e. Julian's Dead Gateway Detection patch) were needed to pull this off.
- NO special scripts were needed to monitor and / or modify the routing table(s). (Note: This is applicable to my scenario, see below.)

With regards to the monitoring of routing tables, I did not need to do any thing special, i.e. no ping or arping was needed. I think this was because when my primary route went down I would start using the secondary route and the returning traffic would always try to use the primary and fail back to the secondary route.
When the primary route did come back up, the inbound traffic would come in the primary interface / route, thus incrementing the counters in my kernel and making the kernel aware that the primary route was indeed back up so it could switch back to it.

Note: In my test, I was manually taking the interface down on one VM and subsequently bringing it back up and restoring the route(s) across it. In my opinion, this interface fiddling on the upstream end is not automatic, but is outside of the scope of the client end failing back to a backup route.

If I were trying to do this between two systems where the link in the middle (between intermediary switches) went down, I believe I would have to do some sort of heart beat across the link. In this case, I would probably use (read: try) arping first and then switch to something else if that did not work.

Grant. . . .

From andrew.lyon at josims.com Fri Jun 22 23:31:18 2007
From: andrew.lyon at josims.com (Andrew Lyon)
Date: Fri Jun 22 23:32:13 2007
Subject: [LARTC] Routing NDAS ?
Message-ID: <1741FFAACCEC074AA16B4B57DA95500E01F084@jos-ex1.josims.local>

Hi,

I believe ndas devices (http://www.ximeta.com/web/technology/) use raw Ethernet frames, as they require no tcp/ip configuration, the client finds and authenticates with a code that is different for each device sold, like a network mac address.

My pc is on a different segment to the ndas devices that we have, the two segments are linked by a linux box that is doing routing and proxy arp, can anybody suggest a way that I could access the ndas devices, I can connect to a share on a server that is connected to one of the devices, but that isn't very efficient :(

Andy

From gtaylor at riverviewtech.net Fri Jun 22 23:44:55 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Fri Jun 22 23:42:53 2007
Subject: [LARTC] Routing NDAS ?
In-Reply-To: <1741FFAACCEC074AA16B4B57DA95500E01F084@jos-ex1.josims.local>
References: <1741FFAACCEC074AA16B4B57DA95500E01F084@jos-ex1.josims.local>
Message-ID: <467C42D7.2020703@riverviewtech.net>

On 06/22/07 16:31, Andrew Lyon wrote:
> the two segments are linked by a linux box that is doing routing and
> proxy arp,

Please bridge and do not use Proxy ARP. Or if you really want to use Proxy ARP, make sure that you are only Proxy ARPing for the MAC addresses of the NDAS device(s) and the client(s) that need to connect to it.

> can anybody suggest a way that I could access the ndas devices,

Set up a bridging router (a.k.a. brouter) to bridge all layer 2 traffic except for IP (and a few other select protocols) traffic. You may only want to bridge traffic that is from the NDAS and/or its client(s) and route the rest (DROP in the BROUTING chain of the broute table).

Grant. . . .

From andrew.lyon at josims.com Sat Jun 23 00:22:29 2007
From: andrew.lyon at josims.com (Andrew Lyon)
Date: Sat Jun 23 00:23:22 2007
Subject: [LARTC] Routing NDAS ?
In-Reply-To: <467C42D7.2020703@riverviewtech.net>
References: <1741FFAACCEC074AA16B4B57DA95500E01F084@jos-ex1.josims.local> <467C42D7.2020703@riverviewtech.net>
Message-ID: <1741FFAACCEC074AA16B4B57DA95500E01F085@jos-ex1.josims.local>

>
>-----Original Message-----
>From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Grant Taylor
>Sent: 22 June 2007 22:45
>To: Mail List - Linux Advanced Routing and Traffic Control
>Subject: Re: [LARTC] Routing NDAS ?
>
>On 06/22/07 16:31, Andrew Lyon wrote:
>> the two segments are linked by a linux box that is doing routing and
>> proxy arp,
>
>Please bridge and do not use Proxy ARP. Or if you really want to use
>Proxy ARP make sure that you are only Proxy ARPing for the MAC addresses
>of the NDAS device(s) and the client(s) that need to connect to it.

Are you saying that there is something wrong with proxy arp?
So far it works fine for us, we have 5 segments and approx 150 nodes.

Ndas devices don't work with proxy arp, bridge would, but at the moment we are a 24/7 operation and making the necessary config changes for bridge would be disruptive.

I will probably end up doing it, but I would like to know if there is any alternative..

Andy

>> can anybody suggest a way that I could access the ndas devices,
>Set up a bridging router (a.k.a. brouter) to bridge all layer 2 traffic
>except for IP (and a few other select protocols) traffic. You may only
>want to bridge traffic that is from the NDAS and or its client(s) and
>route the rest (DROP in the BROUTING chain of the broute table).

Grant. . . .
From gtaylor at riverviewtech.net Sat Jun 23 01:16:00 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Sat Jun 23 01:16:06 2007
Subject: [LARTC] Routing NDAS ?
In-Reply-To: <1741FFAACCEC074AA16B4B57DA95500E01F085@jos-ex1.josims.local>
References: <1741FFAACCEC074AA16B4B57DA95500E01F084@jos-ex1.josims.local> <467C42D7.2020703@riverviewtech.net> <1741FFAACCEC074AA16B4B57DA95500E01F085@jos-ex1.josims.local>
Message-ID: <467C5830.6000606@riverviewtech.net>

On 6/22/2007 5:22 PM, Andrew Lyon wrote:
> Are you saying that there is something wrong with proxy arp? So far
> it works fine for us, we have 5 segments and approx 150 nodes.

Is there something wrong with driving a stake into the ground with a rock versus a sledge hammer? No.

I personally see no reason to ever use proxy arp when you can bridge. I also see much finer grained control over bridging than I do of proxy arp. Not to mention that with bridging, devices see the real MAC address versus the MAC of the device doing the proxy arp.

That being said, proxy arp has been around for more decades than bridging has. I'm sure that there are situations where proxy arp is the better solution. However, personally I would have to have a situation where bridging would not work and proxy arp would for me to use proxy arp over bridging.

I guess some of this could be attributed to the fact that I have come in to networking within the last 10 years and to me proxy arp is the old holdover about like NetBEUI is for some networks. (That is not to say that proxy arp has as many problems as NetBEUI does or vice versa.)

> Ndas devices don't work with proxy arp, bridge would, but at the
> moment we are a 24/7 operation and making the necessary config
> changes for bridge would be disruptive.
Do you have another system that you can put in to production that would connect to both broadcast domains and have it bridge just NDAS traffic and let your existing routers do what they are doing?

I can understand and appreciate the inability (technical / political / chronological) to replace or work on production systems. That does not mean that you can not accomplish what is needed another way.

> I will probably end up doing it, but I would like to know if there is
> any alternative..

Will adding a system just to bridge NDAS traffic work?

Grant. . . .

From markdv.lartc at asphyx.net Sat Jun 23 16:10:07 2007
From: markdv.lartc at asphyx.net (mark)
Date: Sat Jun 23 16:10:35 2007
Subject: [LARTC] Re: HTB question, tokens.
In-Reply-To: <1182460948.2684.44.camel@benve-laptop>
References: <1182460948.2684.44.camel@benve-laptop>
Message-ID: <1182585559.14831.40.camel@velocity.nl.tiscali.com>

On Thu, 2007-06-21 at 23:22 +0200, Christian Benvenuti wrote:
> Hi Mark,
>
> >Hi,
> >
> >What exactly are the "tokens"?
> >
> >I thought each token allowed the sending of one byte, that tokens are
> >stored in a bucket that can hold a max of "burst" tokens, and that this
> >bucket is filled with tokens at "rate".
> >
> >But theory does not seem to explain the "tc -s .." output in the
> >examples below. And I can't figure out why or how...
>
> Tokens normally represent the number of bytes the token bucket algorithm has
> accumulated. However, the numbers you see with tokens/ctokens are not expressed
> in bytes: they are expressed in units of time whose size is an approximation of
> 1 microsecond (how close a unit of time is to 1 microsecond depends on the kernel
> config).
> For example, the value of "tokens" that you see soon after configuring the
> HTB qdisc (and supposing no traffic has gone through the qdisc yet) is the
> number of pseudo microseconds that are necessary to transmit "burst" bytes
> at the rate "rate" configured on the class.

Thanks for the explanation.
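
[Editor's sketch, not part of the original message.] As a rough check of that explanation, the two class configurations quoted in this thread can be converted by hand. This is a minimal shell sketch under two stated assumptions: that tc's "mbit" means 1,000,000 bits per second, and that one token unit is exactly one microsecond (the thread says it is only approximately so). The function name usec_for_burst is made up for illustration.

```shell
#!/bin/sh
# usec_for_burst BURST_BYTES RATE_BITS_PER_SEC
# Hypothetical helper: microseconds needed to send BURST_BYTES at
# RATE_BITS_PER_SEC, assuming 1 token unit == exactly 1 microsecond.
usec_for_burst() {
    burst_bytes=$1
    rate_bps=$2
    # bytes -> bits (x8), seconds -> microseconds (x1000000), integer division
    echo $(( burst_bytes * 8 * 1000000 / rate_bps ))
}

usec_for_burst 2599 2000000   # rate 2mbit, burst 2599b: prints 10396
usec_for_burst 2099 1000000   # rate 1mbit, burst 2099b: prints 16792
```

The tc output quoted in this thread reports 10649 and 17203 tokens for the same two classes, roughly 2.4% more than these idealized figures, which would be consistent with a token unit that is only approximately one microsecond. The lower rate gives the larger number simply because a burst of similar size takes longer to send.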
I understand, the tokens as displayed are based on implementation details rather than pure concept/theory.

Guess it also explains why the number of tokens can be negative. If a (c)burst causes a class to exceed its configured rate it will take some time (that many pseudo microseconds) for the rate to drop back to the configured rate. Right?

> It may look more complex than what it actually is. Just think of it as
> the number of (pseudo) microseconds the class can transmit at rate "rate"
> without running out of tokens.
> The last sentence above should answer your questions in the second part of
> the email too.

Indeed.

Thanks,
Mark.

> Regards
> /Christian
> [ http://benve.info ]
>
>
> >#tc qdisc del dev eth0 root
> >#tc qdisc add dev eth0 root handle 1: htb default 1
> >#tc class add dev eth0 parent 1:0 classid 1:1 htb rate 2mbit
> >#tc -s -d class show dev eth0
> >class htb 1:1 root prio 0 quantum 25000 rate 2000Kbit ceil 2000Kbit burst 2599b/8 mpu 0b overhead 0b cburst 2599b/8 mpu 0b overhead 0b level 0
> > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> > rate 0bit 0pps backlog 0b 0p requeues 0
> > lended: 0 borrowed: 0 giants: 0
> > tokens: 10649 ctokens: 10649
> >
> >#tc qdisc del dev eth0 root
> >#tc qdisc add dev eth0 root handle 1: htb default 1
> >#tc class add dev eth0 parent 1:0 classid 1:1 htb rate 1mbit
> >#tc -s -d class show dev eth0
> >class htb 1:1 root prio 0 quantum 12500 rate 1000Kbit ceil 1000Kbit burst 2099b/8 mpu 0b overhead 0b cburst 2099b/8 mpu 0b overhead 0b level 0
> > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> > rate 0bit 0pps backlog 0b 0p requeues 0
> > lended: 0 borrowed: 0 giants: 0
> > tokens: 17203 ctokens: 17203
> >
> >Why does the amount of tokens go UP if the configured rate (and burst) is
> >lower?
> >(The commands were run from a script, so these are the amounts of tokens
> >available right after the creation of the class.)
> >
> >If I set the rate to 9mbit the amount of tokens is always lower than the
> >burst size. Wouldn't that mean that there are always too few tokens
> >available to actually burst the "burst" amount of data?
> >
> >Regards,
> >Mark.
>
>

From ali.sattari at gmail.com Sat Jun 23 22:05:48 2007
From: ali.sattari at gmail.com (Ali Sattari)
Date: Sat Jun 23 22:06:28 2007
Subject: [LARTC] Squid behind a ssh tunnel
Message-ID: <8798976b0706231305q63c79992n3d23812491021d61@mail.gmail.com>

Hi,

I have a small network and the internet connection is shared through a gateway (running ubuntu/linux). The gateway has squid installed as a cache server and other network users connect to gateway:3128 to access the internet.

Now, I need to run squid behind a ssh tunnel to the internet. How can I make squid use a socks5 proxy connection through the ssh tunnel to access the internet (instead of direct connections)?

The simple form is like this:

network client ---> |gateway(squid)| ----ssh tunnel (socks5 proxy)----> |another server out of local network over internet| ---> internet!

Any squid config/trick, or should I try iptables and ...?

Thanks in advance.

--
Ali Sattari
(AKA Ali ix)
http://corelist.net

From covici at ccs.covici.com Sun Jun 24 11:18:31 2007
From: covici at ccs.covici.com (John covici)
Date: Sun Jun 24 11:18:55 2007
Subject: [LARTC] selectors for tc filters
Message-ID: <18046.14055.442505.588642@ccs.covici.com>

Hi. I can't find any documentation on the specific selectors for tc-filters -- what documentation I have says they are in Polish in a file called selectors.html -- is there anything around in English to see those?

Thanks.

--
Your life is like a penny. You're going to lose it. The question is:
How do you spend it?
John Covici
covici@ccs.covici.com

From bogaurd at gmail.com Sun Jun 24 12:46:12 2007
From: bogaurd at gmail.com (Terry Baume)
Date: Sun Jun 24 12:46:20 2007
Subject: [LARTC] Traffic shaping on multiple interfaces
Message-ID: <467E4B74.5030809@gmail.com>

I'm trying to set up traffic shaping on my linux gateway/router.

The system has 3 interfaces:
eth0 - My LAN - with IP address 192.168.0.254
eth1 - The ethernet connection to which my ADSL modem is connected. This has a 10.25.x.x IP, more on this later. The ADSL link has an upstream of ~1.2mbit.
ppp0 - The PPP connection which is my WAN connection, with a real world IP.

The system acts as a router, performing NAT for my LAN. This works perfectly, as does traffic shaping on ppp0 - I get very good results.

The trouble is that my ISP allows me to use another service over my ADSL line, as a bonus. Basically the modem has 2 virtual circuits, one being for my WAN connection, and the other being a private network between other users of the same ISP, on the same telephone exchange - this is where the 10.25.x.x IP on eth1 comes from. To make things clear, low latency on the eth1 interface is not important, this interface is only used for file sharing and such. Latency on ppp0 is obviously important, being my WAN connection.

My IPTables rules provide NAT for both connections, the only thing I cannot get working correctly is traffic shaping.

So far, I have experimented with wondershaper, shaping on the ppp0 interface. This works well to keep latency down when traffic is on the ppp0 interface. If there is traffic on eth1 (the 'private' network of 10.25.x.x), with no traffic on the ppp0 interface, latency on ppp0 remains low, regardless of whether traffic shaping is active. I believe this has something to do with the way my ISP has configured priorities at the telephone exchange. I begin to run in to trouble when I am uploading heavily on eth1 & ppp0 simultaneously.
Once this happens, ping times over ppp0 rise dramatically, to well over 1200ms (normal is around 7ms). I have tried shaping on eth1 instead of ppp0 (as eth1 should contain all the packets for ppp0, I believe), but this does not yield lower latency, though I did note that it did limit the speed of the connection if I set the upstream and downstream values absurdly low.

I think what I need to do is somehow set up a script where traffic directed to 10.25.0.0 on eth1 is somehow counted against the bandwidth specified for ppp0, but I'm really not sure. Could someone offer some advice?

Thanks.

I believe this is due to the fact that as outbound traffic is occurring on both interfaces, tc does not worry about queues etc, as it does not look as though ppp0 is being maxed out (ppp0 would only be using roughly half of 1.2mbit), when the ADSL link itself is being maxed out.

From alex at zoomnet.ro Sun Jun 24 13:58:35 2007
From: alex at zoomnet.ro (Alexandru Dragoi)
Date: Sun Jun 24 13:59:35 2007
Subject: [LARTC] Traffic shaping on multiple interfaces
In-Reply-To: <467E4B74.5030809@gmail.com>
References: <467E4B74.5030809@gmail.com>
Message-ID: <467E5C6B.40607@zoomnet.ro>

Terry Baume wrote:
> I'm trying to set up traffic shaping on my linux gateway/router.
>
> The system has 3 interfaces:
> eth0 - My LAN - with IP address 192.168.0.254
> eth1 - The ethernet connection to which my ADSL modem is connected.
> This has a 10.25.x.x IP, more on this later. The ADSL link has an
> upstream of ~1.2mbit.
> ppp0 - The PPP connection which is my WAN connection, with a real
> world IP.
>
> The system acts as a router, performing NAT for my LAN. This works
> perfectly, as does traffic shaping on ppp0 - I get very good results.
>
> The trouble is that my ISP allows me to use another service over my
> ADSL line, as a bonus.
Basically the modem has 2 virtual circuits, one > being for my WAN connection, and the other being a private network > between other users of the same ISP, on the same telephone exchange - > this is where the 10.25.x.x IP on eth1 comes from. To make things > clear, low latency on the eth1 interface is not important, this > interface is only used for file sharing and such. Latency on ppp0 is > obviously important, being my WAN connection. > > My IPTables rules provide NAT for both connections, the only thing I > cannot get working correctly is traffic shaping. > > So far, I have experimented with wondershaper, shaping on the ppp0 > interface. This works well to keep latency down when traffic is on the > ppp0 interface. If there is traffic on eth1 (the 'private' network of > 10.25.x.x), with no traffic on the ppp0 interface, latency on ppp0 > remains low, regardless of whether traffic shaping is active. I > believe this has something to do with the way my ISP has configured > priorities at the telephone exchange. I begin to run in to trouble > when I am uploading heavily on eth1 & ppp0 simultaneously. Once this > happens, ping times over ppp0 rise dramatically, to well over 1200ms > (normal is around 7ms). I have tried shaping on eth1 instead of ppp0 > (as eth1 should contain all the packets for ppp0, I believe), but this > does not yield lower latency, though I did note that it did limit the > speed of the connection if I set the upstream and downstream values > absurdly low. > > I think what I need to do is somehow setup a script where traffic > directed to 10.25.0.0 on eth1 is somehow counted against the bandwidth > specified for ppp0, but I'm really not sure. Could someone offer some > advice? > > Thanks. 
> > I believe this is due to the fact that as outbound traffic is > occurring on both interfaces, tc does not worry about queues etc, as > it does not look as though ppp0 is being maxed out (ppp0 would only be > using roughly half of 1.2mbit), when the ADSL link itself is being > maxed out. > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc Try using IMQ, from www.linuximq.net From bugfood-ml at fatooh.org Sun Jun 24 21:50:36 2007 From: bugfood-ml at fatooh.org (Corey Hickey) Date: Sun Jun 24 21:50:44 2007 Subject: [LARTC] ESFQ: request for user input Message-ID: <467ECB0C.6020105@fatooh.org> Hello, I haven't been keeping up with sending ESFQ [ANNOUNCE] messages to this list, but I've still been working on the patch. If you're curious about recent changes, take a look at the home page, ChangeLog, and README: http://fatooh.org/esfq-2.6/ http://fatooh.org/esfq-2.6/current/ChangeLog http://fatooh.org/esfq-2.6/current/README Meanwhile, I'm interested in finally getting ESFQ included in the Linux kernel. Before I start sending patches and requesting maintainer review, however, there's one question I want to ask current or potential users of SFQ and ESFQ: Should ESFQ be merged into SFQ or remain as a separate qdisc? Note that I can't promise either is an option, since I haven't queried any maintainers yet; I'd rather have a clear idea of what is more desirable to the users before I propose anything. Of course, if any maintainers read this, I would value their input at this point as well. Here are some advantages and disadvantages of merging ESFQ with SFQ. Please correct me or let me know of any others you think of. ---Advantages--- * There's nothing radically different about ESFQ. A separate sch_esfq.c would duplicate lots of the code in sch_sfq.c. * Current users of SFQ would benefit from the better hashing of using jhash. 
Other than that, the default parameters of ESFQ are the same as SFQ's hardcoded values, so ESFQ would be a drop-in replacement. * Having two similar-looking similarly-functioning qdiscs could be confusing for new users. ---Disadvantages--- * SFQ has been stable for years; it may be undesirable to make changes that could potentially introduce bugs. * ESFQ is marginally slower than SFQ (although I haven't been able to measure a practical difference; if someone has benchmark tips I'll try them). -Corey From lists at andyfurniss.entadsl.com Sun Jun 24 22:30:25 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Sun Jun 24 22:30:30 2007 Subject: [LARTC] ESFQ: request for user input In-Reply-To: <467ECB0C.6020105@fatooh.org> References: <467ECB0C.6020105@fatooh.org> Message-ID: <467ED461.9020908@andyfurniss.entadsl.com> Corey Hickey wrote: > Meanwhile, I'm interested in finally getting ESFQ included in the Linux > kernel. Patrick McHardy recently changed sfq to be more like esfq. http://marc.info/?l=linux-netdev&m=118051806814780&w=2 Andy. From kaber at trash.net Sun Jun 24 23:12:40 2007 From: kaber at trash.net (Patrick McHardy) Date: Sun Jun 24 23:12:56 2007 Subject: [LARTC] ESFQ: request for user input In-Reply-To: <467ECB0C.6020105@fatooh.org> References: <467ECB0C.6020105@fatooh.org> Message-ID: <467EDE48.307@trash.net> Corey Hickey wrote: > Hello, > > I haven't been keeping up with sending ESFQ [ANNOUNCE] messages to this > list, but I've still been working on the patch. If you're curious about > recent changes, take a look at the home page, ChangeLog, and README: > > http://fatooh.org/esfq-2.6/ > http://fatooh.org/esfq-2.6/current/ChangeLog > http://fatooh.org/esfq-2.6/current/README > > Meanwhile, I'm interested in finally getting ESFQ included in the Linux > kernel. 
Before I start sending patches and requesting maintainer review, > however, there's one question I want to ask current or potential users > of SFQ and ESFQ: > > Should ESFQ be merged into SFQ or remain as a separate qdisc? I've CCed netdev. I think merging parts of ESFQ (dynamic depth and flow number) would make a lot of sense, but I'm intending to submit an alternative to the ESFQ hashing scheme for 2.6.23: http://www.mail-archive.com/netdev@vger.kernel.org/msg39156.html I have enough trust in ESFQ's stability that I don't think we need a new qdisc for this and could merge it in SFQ (and the "uses only 1 page" justification isn't true anymore anyway), but I also wouldn't mind adding a new qdisc. > Note that I can't promise either is an option, since I haven't queried > any maintainers yet; I'd rather have a clear idea of what is more > desirable to the users before I propose anything. Of course, if any > maintainers read this, I would value their input at this point as well. > > Here are some advantages and disadvantages of merging ESFQ with SFQ. > Please correct me or let me know of any others you think of. > > ---Advantages--- > * There's nothing radically different about ESFQ. A separate sch_esfq.c > would duplicate lots of the code in sch_sfq.c. > * Current users of SFQ would benefit from the better hashing of using > jhash. Other than that, the default parameters of ESFQ are the same > as SFQ's hardcoded values, so ESFQ would be a drop-in replacement. > * Having two similar-looking similarly-functioning qdiscs could be > confusing for new users. > > ---Disadvantages--- > * SFQ has been stable for years; it may be undesirable to make changes > that could potentially introduce bugs. > * ESFQ is marginally slower than SFQ (although I haven't been able to > measure a practical difference; if someone has benchmark tips I'll try > them). 
From lists at andyfurniss.entadsl.com Sun Jun 24 23:25:53 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Sun Jun 24 23:25:56 2007 Subject: [LARTC] A HTB problem In-Reply-To: <87F5D18955699142BE79663129C80FA9023F4EA0@xmb-hkg-415.apac.cisco.com> References: <87F5D18955699142BE79663129C80FA9023F4EA0@xmb-hkg-415.apac.cisco.com> Message-ID: <467EE161.8020708@andyfurniss.entadsl.com> Yong Yao (yoyao) wrote: > qdisc htb 1: dev vlan2 r2q 10 default 13 direct_packets_stat 0 I am not sure if it is the cause of what you see, but arp will go here unless you filter it elsewhere. If you only want to shape IP traffic it's probably better to not use htb default and just make a catch all ip filter for 1:13. Andy. From bugfood-ml at fatooh.org Mon Jun 25 01:09:42 2007 From: bugfood-ml at fatooh.org (Corey Hickey) Date: Mon Jun 25 01:09:47 2007 Subject: [LARTC] ESFQ: request for user input In-Reply-To: <467EDE48.307@trash.net> References: <467ECB0C.6020105@fatooh.org> <467EDE48.307@trash.net> Message-ID: <467EF9B6.3040801@fatooh.org> Patrick McHardy wrote: > Corey Hickey wrote: >> Hello, >> >> I haven't been keeping up with sending ESFQ [ANNOUNCE] messages to this >> list, but I've still been working on the patch. If you're curious about >> recent changes, take a look at the home page, ChangeLog, and README: >> >> http://fatooh.org/esfq-2.6/ >> http://fatooh.org/esfq-2.6/current/ChangeLog >> http://fatooh.org/esfq-2.6/current/README >> >> Meanwhile, I'm interested in finally getting ESFQ included in the Linux >> kernel. Before I start sending patches and requesting maintainer review, >> however, there's one question I want to ask current or potential users >> of SFQ and ESFQ: >> >> Should ESFQ be merged into SFQ or remain as a separate qdisc? > > I've CCed netdev. 
I think merging parts of ESFQ (dynamic depth and > flow number) would make a lot of sense, but I'm intending to submit > an alternative to the ESFQ hashing scheme for 2.6.23: > > http://www.mail-archive.com/netdev@vger.kernel.org/msg39156.html Nice. I wasn't aware of that. Your patch looks like it supersedes ESFQ's hashing, so, if it gets applied, that already removes a large chunk of the differences between SFQ and ESFQ. If I don't hear any opposition, then I'll keep an eye out for when your patch gets accepted (assuming it does) and then submit patch(es) porting the rest of ESFQ's features to SFQ. I just subscribed myself to netdev. > I have enough trust in ESFQ's stability that I don't think we need > a new qdisc for this and could merge it in SFQ (and the "uses only > 1 page" justification isn't true anymore anyway), but I also > wouldn't mind adding a new qdisc. Thanks for the trust; I'm sure that the patches will have to undergo some cleanup either way, considering my newbieness to kernel development. -Corey From lists at andyfurniss.entadsl.com Mon Jun 25 01:11:08 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Mon Jun 25 01:11:09 2007 Subject: [LARTC] Prio class HTB In-Reply-To: References: Message-ID: <467EFA0C.10801@andyfurniss.entadsl.com> LERMOYER Alain RD-RESA-ISS wrote: > > > Hello everyone, > > We are working on HTB with TC and would like some clarifications from > your part. > Our example is as follows. 
We have one HTB root class and two HTB
> classes attached to it, as in this figure:
>
>                                     1: HTB
>                                        |
>             -----------------------------------------------------
>             |                          |                        |
> ++++++++++++++++++++++++++ ++++++++++++++++++++++++++ ++++++++++++++++++++++
> +        1:10 HTB        + +        1:20 HTB        + +      1:30 HTB      +
> +(parameters, ex: prio 0)+ +(parameters, ex: prio 1)+ +                    +
> ++++++++++++++++++++++++++ ++++++++++++++++++++++++++ ++++++++++++++++++++++
>             |                          |                        |
>             -----------------------------------------------------
>                                        |
>                                        | (dequeue to hardware)
>                                        |
>
>
> The configuration script is :
>
> $ tc class add dev ath0 parent 1: classid 1:1 htb rate 100kbps ceil
> 100kbps burst 2k

kbps means kbytes/sec; use kbit for bits.

HTB uses Hz, which means that at high bitrates a backlogged class will need more than 2k burst to reach its rate/ceil.

> Our questions are :
> 1- How priority between classes are defined within HTB ? What
> parameter(s) do we need to specify ?

prio 0 is top for htb classes, 1 is top for tc filters.

> 2- How does the dequeuing algorithm in HTB work ?

http://luxik.cdi.cz/~devik/qos/htb/manual/theory.htm

>
> As our understanding, the "prio" parameter specifies the priority order
> between the two classes regarding the token sharing policy.
> Is this parameter also involved in the classes mixing-up order at the
> output (dequeue to hardware) ?

Don't really know what you mean here.

Andy.

From ethy.brito at inexo.com.br Mon Jun 25 01:22:22 2007
From: ethy.brito at inexo.com.br (Ethy H. Brito)
Date: Mon Jun 25 01:21:23 2007
Subject: [LARTC] Prio class HTB
In-Reply-To: <467EFA0C.10801@andyfurniss.entadsl.com>
References: <467EFA0C.10801@andyfurniss.entadsl.com>
Message-ID: <20070624202222.7fbdfc4c@babalu.inexo.com.br>

On Mon, 25 Jun 2007 00:11:08 +0100
Andy Furniss wrote:

> > Our questions are :
> > 1- How priority between classes are defined within HTB ? What
> > parameter(s) do we need to specify ?
> > prio 0 is top for htb classes, 1 is top for tc filters. 0 (zero) is the highest, right? What are the lowest for classes and filters? Ethy From lists at andyfurniss.entadsl.com Mon Jun 25 01:26:15 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Mon Jun 25 01:26:11 2007 Subject: [LARTC] Why does scp stall on low bandwidth connections? In-Reply-To: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com> References: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com> Message-ID: <467EFD97.6050101@andyfurniss.entadsl.com> Marc wrote: > Hi, > > I am new to tc and have been reading quite a bit on how to set it up etc. > Everything seems to be working fine, until I started scp-ing a large file > over a low bandwidth connection as part of my testing process. > > Here is the setup: > my pc --- bridge running tc/htb --- rest of network > > TC is filtering traffic from "my pc" and classifies it as 120kbit (see my > script below). I then scp a 5MB file from a server in "rest of network" to > "my pc". Everything seems to work fine and copies at a speed of around > 12KB/s, which is what I would expect from a 120kbit connection. At some > stage scp stalls and eventually disconnects or I get bored and press > +C. The stage at which it stalls is different every time. First it > was > at 76% of the copy progress, then at 32% of the copy progress. > > For my testing purposes, there is no other traffic flowing through either > this class or any other class. My expectation was that it would copy the > entire file, just at a low speed. I expected to be able to copy a 600MB > file > at 12KB/s, which would of course be very slow, but eventually arrive. 
>
> Here are the rules I specified, note that "my pc" does *not* have the ip
> address 10.0.2.42 in the test described above:
>
> #eth0 qdisc
> tc qdisc add dev eth0 root handle 1:0 htb default 2
> tc class add dev eth0 parent 1:0 classid 1:1 htb rate 10mbit ceil 10mbit
> tc class add dev eth0 parent 1:1 classid 1:2 htb rate 120kbit ceil 120kbit
> tc class add dev eth0 parent 1:1 classid 1:3 htb rate 200kbit ceil 1mbit
>
> #eth1 qdisc
> tc qdisc add dev eth1 root handle 2:0 htb default 2
> tc class add dev eth1 parent 2:1 classid 2:2 htb rate 120kbit ceil 120kbit
> tc class add dev eth1 parent 2:1 classid 2:3 htb rate 200kbit ceil 1mbit
>
> #eth0 filter
> tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip src 10.0.2.42 flowid 1:3
>
> #eth1 filter
> tc filter add dev eth1 parent 2:0 protocol ip prio 1 u32 match ip dst 10.0.2.42 flowid 2:3
>
> Thank you for your comments on this situation.

It's probably because arp is being sent to 1:2, which is backlogged.

Try not using the default parameter and instead use a catch-all ip tc filter like -

tc filter add dev eth0 parent 1:0 protocol ip prio 2 u32 match u32 0 0 flowid 1:2

You could also consider adding p/bfifos to the classes and using the limit parameter to make the queues shorter. At low bitrates the default 1000pkts (picked up from the txqueuelen on eth) is too long.

Andy.

From kaber at trash.net Mon Jun 25 01:40:54 2007
From: kaber at trash.net (Patrick McHardy)
Date: Mon Jun 25 01:41:31 2007
Subject: [LARTC] ESFQ: request for user input
In-Reply-To: <467EF9B6.3040801@fatooh.org>
References: <467ECB0C.6020105@fatooh.org> <467EDE48.307@trash.net> <467EF9B6.3040801@fatooh.org>
Message-ID: <467F0106.1070103@trash.net>

Corey Hickey wrote:
> Patrick McHardy wrote:
>>>
>>> Should ESFQ be merged into SFQ or remain as a separate qdisc?
>>>
>> I've CCed netdev.
I think merging parts of ESFQ (dynamic depth and >> flow number) would make a lot of sense, but I'm intending to submit >> an alternative to the ESFQ hashing scheme for 2.6.23: >> >> http://www.mail-archive.com/netdev@vger.kernel.org/msg39156.html >> > > Nice. I wasn't aware of that. Your patch looks like it supersedes ESFQ's > hashing, so, if it gets applied, that already removes a large chunk of > the differences between SFQ and ESFQ. > > If I don't hear any opposition, then I'll keep an eye out for when your > patch gets accepted (assuming it does) and then submit patch(es) porting > the rest of ESFQ's features to SFQ. > I think it would be best if you would start posting patches to add the missing features (without the hash changes) to SFQ, if you're quick this may already go in during the 2.6.23 merge window. My changes are mostly independant of yours, if there are any clashes the one who goes last will just have to rediff their patches :) Since you need to pass additional parameters to SFQ for your changes, have a look at my rtnetlink compat attribute patch: http://article.gmane.org/gmane.linux.network/64851 From bugfood-ml at fatooh.org Mon Jun 25 01:45:28 2007 From: bugfood-ml at fatooh.org (Corey Hickey) Date: Mon Jun 25 01:45:31 2007 Subject: [LARTC] ESFQ: request for user input In-Reply-To: <467F0106.1070103@trash.net> References: <467ECB0C.6020105@fatooh.org> <467EDE48.307@trash.net> <467EF9B6.3040801@fatooh.org> <467F0106.1070103@trash.net> Message-ID: <467F0218.6010400@fatooh.org> Patrick McHardy wrote: > Corey Hickey wrote: >> Patrick McHardy wrote: >>>> Should ESFQ be merged into SFQ or remain as a separate qdisc? >>>> >>> I've CCed netdev. I think merging parts of ESFQ (dynamic depth and >>> flow number) would make a lot of sense, but I'm intending to submit >>> an alternative to the ESFQ hashing scheme for 2.6.23: >>> >>> http://www.mail-archive.com/netdev@vger.kernel.org/msg39156.html >>> >> Nice. I wasn't aware of that. 
Your patch looks like it supersedes ESFQ's >> hashing, so, if it gets applied, that already removes a large chunk of >> the differences between SFQ and ESFQ. >> >> If I don't hear any opposition, then I'll keep an eye out for when your >> patch gets accepted (assuming it does) and then submit patch(es) porting >> the rest of ESFQ's features to SFQ. >> > > I think it would be best if you would start posting patches > to add the missing features (without the hash changes) to SFQ, > if you're quick this may already go in during the 2.6.23 merge > window. My changes are mostly independant of yours, if there > are any clashes the one who goes last will just have to rediff > their patches :) > > Since you need to pass additional parameters to SFQ for your > changes, have a look at my rtnetlink compat attribute patch: > > http://article.gmane.org/gmane.linux.network/64851 Ok, I'll work on it later. Thanks. -Corey From lists at andyfurniss.entadsl.com Mon Jun 25 02:06:19 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Mon Jun 25 02:06:21 2007 Subject: [LARTC] Prio class HTB In-Reply-To: <20070624202222.7fbdfc4c@babalu.inexo.com.br> References: <467EFA0C.10801@andyfurniss.entadsl.com> <20070624202222.7fbdfc4c@babalu.inexo.com.br> Message-ID: <467F06FB.4080301@andyfurniss.entadsl.com> Ethy H. Brito wrote: > On Mon, 25 Jun 2007 00:11:08 +0100 > Andy Furniss wrote: > >>> Our questions are : >>> 1- How priority between classes are defined within HTB ? What >>> parameter(s) do we need to specify ? >> prio 0 is top for htb classes, 1 is top for tc filters. > > 0 (zero) is the highest, right? What are the lowest for classes and filters? > > Ethy > > From a quick test it looks like 64k for filters and 7 for htb even though both will let you enter higher values. Andy. 
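
The two prio scales mentioned above (0 is top for HTB classes, 1 is top for tc filters) can be sketched as below. This is only a minimal illustration, not a script from the thread: the device name, the rates, the port match and the helper names tc_cmd and setup_prio_example are all assumptions made up for the example. By default it only prints the tc commands; set DRY_RUN=0 to actually run them as root.

```shell
#!/bin/sh
# Sketch only: DEV, the rates and the port match are placeholder assumptions,
# not values taken from the thread.
DEV=${DEV:-eth0}
# With DRY_RUN=1 (the default here) the tc commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print or run a tc command depending on DRY_RUN.
tc_cmd() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "tc $*"
    else
        tc "$@"
    fi
}

setup_prio_example() {
    tc_cmd qdisc add dev "$DEV" root handle 1: htb
    # Child rates (400kbit + 400kbit) do not exceed the parent rate.
    tc_cmd class add dev "$DEV" parent 1: classid 1:1 htb rate 800kbit ceil 800kbit
    # HTB class prio: 0 is the top; 7 looked like the lowest in the quick test above.
    tc_cmd class add dev "$DEV" parent 1:1 classid 1:10 htb rate 400kbit ceil 800kbit prio 0
    tc_cmd class add dev "$DEV" parent 1:1 classid 1:20 htb rate 400kbit ceil 800kbit prio 7
    # Filter prio: 1 is the top, so this filter is consulted before the catch-all.
    tc_cmd filter add dev "$DEV" parent 1: protocol ip prio 1 u32 match ip dport 22 0xffff flowid 1:10
    tc_cmd filter add dev "$DEV" parent 1: protocol ip prio 2 u32 match u32 0 0 flowid 1:20
}

setup_prio_example
```

The catch-all filter at prio 2 is used here instead of the htb "default" parameter, so that only IP traffic is classified.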
From lists at andyfurniss.entadsl.com Mon Jun 25 02:29:02 2007 From: lists at andyfurniss.entadsl.com (Andy Furniss) Date: Mon Jun 25 02:29:10 2007 Subject: [LARTC] ATM [Cell Tax] In-Reply-To: <200706221923.34274.gustavo@angulosolido.pt> References: <57ca0490706201304p1d7fda13k95be598ecb669e51@mail.gmail.com> <200706221923.34274.gustavo@angulosolido.pt> Message-ID: <467F0C4E.9020304@andyfurniss.entadsl.com> Gustavo Homem wrote: > > On Wednesday 20 June 2007 21:04, Nate Fuhriman wrote: >> I have read the thread at >> http://mailman.ds9a.nl/pipermail/lartc/2006q1/018287.html >> and still don't know how to fix this problem. It appears alot of work >> has gone into it but the HOWTO is so out of date it doesn't even begin >> to addresses this method. >> >> So here are my questions >> 1. what is the current state of these patches? are they in a specific >> version? do i have to patch myself? Yes - but I think they'll need fixing up a bit for the latest tc/kernels. >> 2. how do i actually use this once patched in? an example script would >> work great! TBH I am not sure as I still use a simpler version. I expect it says somewhere here http://ace-host.stuart.id.au/russell/files/tc/tc-atm/ >> 3. is there a table for us mere mortals that describes how to figure >> out which type of adsl/atm i'm using so i can set the appropriate >> overhead? You may be able to see from your modem/router settings or perhaps your ISP/teleco will have the info. If you have a lowish uprate and get steady ping times you may even be able to work it out from that - if all else fails. Likewise if you can get a cell count from your modem you could deduce it with ping. If you do find what your overhead is, you will need to take 14 from it if you connect to your modem via eth, but not if you shape directly on the ppp. > > 4. Does someone know if there's a plan for the inclusion of these patches on > iproute and the kernel? They almost got in, but a different way was suggested and neither happened. 
Andy.

From lists at andyfurniss.entadsl.com Mon Jun 25 03:01:23 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Mon Jun 25 03:01:21 2007
Subject: [LARTC] Traffic shaping on multiple interfaces
In-Reply-To: <467E4B74.5030809@gmail.com>
References: <467E4B74.5030809@gmail.com>
Message-ID: <467F13E3.3010300@andyfurniss.entadsl.com>

Terry Baume wrote:
> I'm trying to set up traffic shaping on my linux gateway/router.
>
> The system has 3 interfaces:
> eth0 - My LAN - with IP address 192.168.0.254
> eth1 - The ethernet connection to which my ADSL modem is connected. This
> has a 10.25.x.x IP, more on this later. The ADSL link has an upstream of
> ~1.2mbit.
> ppp0 - The PPP connection which is my WAN connection, with a real world IP.
>
> The system acts as a router, performing NAT for my LAN. This works
> perfectly, as does traffic shaping on ppp0 - I get very good results.
>
> The trouble is that my ISP allows me to use another service over my
> ADSL line, as a bonus. Basically the modem has 2 virtual circuits, one
> being for my WAN connection, and the other being a private network
> between other users of the same ISP, on the same telephone exchange -
> this is where the 10.25.x.x IP on eth1 comes from. To make things clear,
> low latency on the eth1 interface is not important, this interface is
> only used for file sharing and such. Latency on ppp0 is obviously
> important, being my WAN connection.
>
> My IPTables rules provide NAT for both connections, the only thing I
> cannot get working correctly is traffic shaping.
>
> So far, I have experimented with wondershaper, shaping on the ppp0
> interface. This works well to keep latency down when traffic is on the
> ppp0 interface.

Wondershaper is slightly flawed, depending on how it's set up. You need to make sure the rates of the children don't add up to more than the rate of the parent class.
Unless you patch for atm overheads (which is going to be tricky for your case) make sure you back off say 20% from the line rate - but then this bit works for you anyway - I just say because it can appear to be OK testing with bulk traffic, but then fail when you have a lot of small packets going out. If there is traffic on eth1 (the 'private' network of > 10.25.x.x), with no traffic on the ppp0 interface, latency on ppp0 > remains low, regardless of whether traffic shaping is active. I believe > this has something to do with the way my ISP has configured priorities > at the telephone exchange. I begin to run in to trouble when I am > uploading heavily on eth1 & ppp0 simultaneously. Once this happens, ping > times over ppp0 rise dramatically, to well over 1200ms (normal is around > 7ms). I have tried shaping on eth1 instead of ppp0 (as eth1 should > contain all the packets for ppp0, I believe), but this does not yield > lower latency, though I did note that it did limit the speed of the > connection if I set the upstream and downstream values absurdly low. You could in theory do it all on eth - but you would have to use the right tc filter ethertypes to get the pppoe and ip. > I think what I need to do is somehow setup a script where traffic > directed to 10.25.0.0 on eth1 is somehow counted against the bandwidth > specified for ppp0, but I'm really not sure. Could someone offer some > advice? I would use ifb it's been in kernel for a while so you don't need to patch as you would with imq. You can redirect all ip traffic going out on ppp0/eth1 to ifb0 and add your htb rules to that. 
Something like -

tc qdisc add dev eth1 handle 1:0 root prio
tc qdisc add dev ppp0 handle 1:0 root prio

modprobe ifb
ip link set up dev ifb0

tc filter add dev ppp0 parent 1:0 protocol ip prio 1 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
tc filter add dev eth1 parent 1:0 protocol ip prio 1 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0

then add your htb rules on dev ifb0

Andy.

From lists at andyfurniss.entadsl.com Mon Jun 25 03:26:12 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Mon Jun 25 03:26:12 2007
Subject: [LARTC] HTB deadlock
In-Reply-To: <1181752830.9399.66.camel@ranko-fc2.spidernet.net>
References: <1181752830.9399.66.camel@ranko-fc2.spidernet.net>
Message-ID: <467F19B4.1080307@andyfurniss.entadsl.com>

Ranko Zivojnovic wrote:
> Greetings,
>
> I've been experiencing problems with HTB where the whole machine locks
> up. This usually happens when the whole qdisc is being removed and
> occasionally when a leaf is being removed.
>
> Common is that it always happens when some sort of removal is in
> progress.
>
> Console output I have captured is at the end of this message. The same
> behavior exists from vanilla 2.6.19.7 and above. It is possible that the
> problem also exists in the earlier versions however I did not go further
> back.
>
> I also believe I have found where the actual problem is:
>
> qdisc_destroy() function is always called with dev->queue_lock locked.
> htb_destroy() function up the stack is using del_timer_sync() call to
> deactivate HTB qdisc timers.
>
> From the comments in the source where del_timer_sync() is defined:
>
> ---copy/paste---
> /**
> * del_timer_sync - deactivate a timer and wait for the handler to finish.
> * @timer: the timer to be deactivated
> *
> * This function only differs from del_timer() on SMP: besides deactivating
> * the timer it also makes sure the handler has finished executing on other
> * CPUs.
> * > * Synchronization rules: Callers must prevent restarting of the timer, > * otherwise this function is meaningless. It must not be called from > * interrupt contexts. The caller must not hold locks which would prevent > * completion of the timer's handler. The timer's handler must not call > * add_timer_on(). Upon exit the timer is not queued and the handler is > * not running on any CPU. > * > * The function returns whether it has deactivated a pending timer or not. > */ > ---copy/paste--- > > Now, htb_rate_timer() does exactly what appears to be the source of the > problem - it tries obtain dev->queue_lock - and given the right moment > (timer fired handler while qdisc_destroy was holding the lock) - system > locks up - del_timer_sync is waiting for handler to finish while the > handler is waiting for the dev->queue_lock. > > Of course I could also be completely wrong here and missing something > not so obvious. > > I could also attempt to fix this but I haven't dealt with this code in > the past so I was hoping someone with better insight might just have an > elegant solution up his sleeve. > > Best regards, > > Ranko > > PS: If this list is not the right place for this report - please let me > know. You should send bug reports to netdev@vger.kernel.org > > -----------CONSOLE (2.6.19.7)----------- > BUG: soft lockup detected on CPU#3! 
> [] softlockup_tick+0x93/0xc2 > [] update_process_times+0x26/0x5c > [] smp_apic_timer_interrupt+0x97/0xb2 > [] apic_timer_interrupt+0x1f/0x24 > [] klist_next+0x4/0x8a > [] _spin_unlock_irqrestore+0xa/0xc > [] try_to_del_timer_sync+0x47/0x4f > [] del_timer_sync+0xe/0x14 > [] htb_destroy+0x20/0x7b [sch_htb] > [] qdisc_destroy+0x44/0x8d > [] htb_destroy_class+0xd0/0x12d [sch_htb] > [] htb_destroy_class+0x52/0x12d [sch_htb] > [] htb_destroy+0x3f/0x7b [sch_htb] > [] qdisc_destroy+0x44/0x8d > [] htb_destroy_class+0xd0/0x12d [sch_htb] > [] htb_destroy_class+0x52/0x12d [sch_htb] > [] htb_destroy+0x3f/0x7b [sch_htb] > [] qdisc_destroy+0x44/0x8d > [] tc_get_qdisc+0x1a3/0x1ef > [] tc_get_qdisc+0x0/0x1ef > [] rtnetlink_rcv_msg+0x158/0x215 > [] rtnetlink_rcv_msg+0x0/0x215 > [] netlink_run_queue+0x88/0x11d > [] rtnetlink_rcv+0x26/0x42 > [] netlink_data_ready+0x12/0x54 > [] netlink_sendskb+0x1c/0x33 > [] netlink_sendmsg+0x1ee/0x2d7 > [] sock_sendmsg+0xe5/0x100 > [] autoremove_wake_function+0x0/0x37 > [] autoremove_wake_function+0x0/0x37 > [] sock_sendmsg+0xe5/0x100 > [] copy_from_user+0x33/0x69 > [] sys_sendmsg+0x12d/0x243 > [] _read_unlock_irq+0x5/0x7 > [] find_get_page+0x37/0x42 > [] filemap_nopage+0x30c/0x3a3 > [] __handle_mm_fault+0x21c/0x943 > [] _spin_unlock_bh+0x5/0xd > [] sock_setsockopt+0x63/0x59d > [] anon_vma_prepare+0x1b/0xcb > [] sys_socketcall+0x24f/0x271 > [] do_page_fault+0x0/0x600 > [] sysenter_past_esp+0x56/0x79 > ======================= > BUG: soft lockup detected on CPU#1! 
> [] softlockup_tick+0x93/0xc2 > [] update_process_times+0x26/0x5c > [] smp_apic_timer_interrupt+0x97/0xb2 > [] apic_timer_interrupt+0x1f/0x24 > [] blk_do_ordered+0x70/0x27e > [] _raw_spin_lock+0xaa/0x13e > [] htb_rate_timer+0x18/0xc4 [sch_htb] > [] run_timer_softirq+0x163/0x189 > [] htb_rate_timer+0x0/0xc4 [sch_htb] > [] __do_softirq+0x70/0xdb > [] do_softirq+0x3b/0x42 > [] smp_apic_timer_interrupt+0x9c/0xb2 > [] apic_timer_interrupt+0x1f/0x24 > [] mwait_idle_with_hints+0x3b/0x3f > [] mwait_idle+0xc/0x1b > [] cpu_idle+0x63/0x79 > ======================= > BUG: soft lockup detected on CPU#2! > [] softlockup_tick+0x93/0xc2 > [] update_process_times+0x26/0x5c > [] smp_apic_timer_interrupt+0x97/0xb2 > [] apic_timer_interrupt+0x1f/0x24 > [] blk_do_ordered+0x70/0x27e > [] _raw_spin_lock+0xaa/0x13e > [] dev_queue_xmit+0x53/0x2e4 > [] neigh_connected_output+0x80/0xa0 > [] ip_output+0x1b5/0x24b > [] ip_finish_output+0x0/0x192 > [] ip_forward+0x1c8/0x2b9 > [] ip_forward_finish+0x0/0x37 > [] ip_rcv+0x2a5/0x538 > [] ip_rcv_finish+0x0/0x2aa > [] __netdev_alloc_skb+0x12/0x2a > [] ip_rcv+0x0/0x538 > [] netif_receive_skb+0x218/0x318 > [] bitmap_get_counter+0x41/0x1e6 > [] e1000_clean_rx_irq+0x12c/0x4ef [e1000] > [] e1000_clean_rx_irq+0x0/0x4ef [e1000] > [] e1000_clean+0xe5/0x130 [e1000] > [] net_rx_action+0xbc/0x1d5 > [] __do_softirq+0x70/0xdb > [] do_softirq+0x3b/0x42 > [] do_IRQ+0x6c/0xda > [] common_interrupt+0x1a/0x20 > [] mwait_idle_with_hints+0x3b/0x3f > [] mwait_idle+0xc/0x1b > [] cpu_idle+0x63/0x79 > ======================= > BUG: soft lockup detected on CPU#0! 
> [] softlockup_tick+0x93/0xc2 > [] update_process_times+0x26/0x5c > [] smp_apic_timer_interrupt+0x97/0xb2 > [] apic_timer_interrupt+0x1f/0x24 > [] delay_tsc+0x7/0x13 > [] __delay+0x6/0x7 > [] _raw_spin_lock+0xb8/0x13e > [] dev_queue_xmit+0x53/0x2e4 > [] neigh_connected_output+0x80/0xa0 > [] ip_output+0x1b5/0x24b > [] ip_finish_output+0x0/0x192 > [] ip_forward+0x1c8/0x2b9 > [] ip_forward_finish+0x0/0x37 > [] ip_rcv+0x2a5/0x538 > [] ip_rcv_finish+0x0/0x2aa > [] __alloc_skb+0x47/0xf3 > [] ip_rcv+0x0/0x538 > [] netif_receive_skb+0x218/0x318 > [] bitmap_get_counter+0x41/0x1e6 > [] tg3_poll+0x6d3/0x906 [tg3] > [] net_rx_action+0xbc/0x1d5 > [] __do_softirq+0x70/0xdb > [] do_softirq+0x3b/0x42 > [] do_IRQ+0x6c/0xda > [] common_interrupt+0x1a/0x20 > [] mwait_idle_with_hints+0x3b/0x3f > [] mwait_idle+0xc/0x1b > [] cpu_idle+0x63/0x79 > [] start_kernel+0x353/0x423 > [] unknown_bootoption+0x0/0x260 > ======================= > -----------CONSOLE----------- > > > > > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc > From mofish at gmail.com Mon Jun 25 05:07:30 2007 From: mofish at gmail.com (John Chang) Date: Mon Jun 25 05:07:37 2007 Subject: [LARTC] Load Balance and SNAT problem. Message-ID: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> I am developing load balancing router, But I have a question about fail over. The follow diagram is my test environment and scripts. 
-------------------------------------------------------------------
Environment Setting

                 PC1(192.168.10.2)
                         |
                       (LAN)
                         |
               PC2-eth2(192.168.10.1)
                +                  +
PC2-eth0(111.111.111.2)     PC2-eth1(222.222.222.2)
           |                           |
        (WAN1)                      (WAN2)
           |                           |
PC3-eth0(111.111.111.1)     PC3-eth1(222.222.222.1)
                +                  +
               PC2-eth2(172.16.0.1)

PC2-Linux Kernel 2.6.21
PC2-Iptables 1.3.7

-------------------------------------------------------------------
Iptables rules:

iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 111.111.111.2
iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to 222.222.222.2

# table 101
ip route flush table 101
ip route add 192.168.10.0/24 dev eth2 table 101
ip route add default via 111.111.111.1 dev eth0 table 101

# table 102
ip route flush table 102
ip route add 192.168.10.0/24 dev eth2 table 102
ip route add default via 222.222.222.1 dev eth1 table 102

ip rule del fwmark 1 table 101
ip rule del fwmark 2 table 102
ip rule add fwmark 1 table 101
ip rule add fwmark 2 table 102

iptables -t mangle -A PREROUTING -t mangle -j CONNMARK --restore-mark
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 1 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 2 -j MARK --set-mark 2
iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark

-----------------------------------------------------------------------------
Test Sequence:
1. Run the command "ping 172.16.0.1 -t" on PC1.
2. I capture packets on WAN1 and WAN2, and it works fine.
   The ICMP requests/responses come out on WAN1 and WAN2 sequentially.
3. I unplug WAN1. Only the packets on WAN1 should be lost, and WAN2 should
   still work, right?
   I should see "ping Time Out" and "ping OK" on PC1 sequentially.
4. But both connections break. PC1 always shows "ping Time Out".
5. After capturing the packets on WAN1 and WAN2, I saw weird behavior.
   The source IP of the packets on WAN2 is 111.111.111.2,
   but it should be 222.222.222.2.
   That is why WAN2 breaks.
-----------------------------------------------------------------------------

Could you give me a suggestion?
Thanks.

From hijacker at oldum.net Mon Jun 25 13:04:36 2007
From: hijacker at oldum.net (Nikolay Kichukov)
Date: Mon Jun 25 13:05:51 2007
Subject: [LARTC] Why does scp stall on low bandwidth connections?
In-Reply-To: <467EFD97.6050101@andyfurniss.entadsl.com>
References: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com> <467EFD97.6050101@andyfurniss.entadsl.com>
Message-ID: <467FA144.4030805@oldum.net>

Hello Andy,
Is that line:
tc filter add dev eth0 parent 1:0 protocol ip prio 2 u32 match u32 0 0 flowid 1:2

not equal to:
tc qdisc add dev eth0 root handle 1:0 htb default 2

in terms of achieved results? If not, what is the difference?

Thanks,
-Nikolay

Andy Furniss wrote:
> Marc wrote:
>> Hi,
>>
>> I am new to tc and have been reading quite a bit on how to set it up etc.
>> Everything seems to be working fine, until I started scp-ing a large file
>> over a low bandwidth connection as part of my testing process.
>>
>> Here is the setup:
>> my pc --- bridge running tc/htb --- rest of network
>>
>> TC is filtering traffic from "my pc" and classifies it as 120kbit (see my
>> script below). I then scp a 5MB file from a server in "rest of network" to
>> "my pc". Everything seems to work fine and copies at a speed of around
>> 12KB/s, which is what I would expect from a 120kbit connection. At some
>> stage scp stalls and eventually disconnects or I get bored and press
>> Ctrl+C. The stage at which it stalls is different every time. First it was
>> at 76% of the copy progress, then at 32% of the copy progress.
>>
>> For my testing purposes, there is no other traffic flowing through either
>> this class or any other class.
>> My expectation was that it would copy the entire file, just at a low
>> speed. I expected to be able to copy a 600MB file at 12KB/s, which would
>> of course be very slow, but eventually arrive.
>>
>> Here are the rules I specified, note that "my pc" does *not* have the ip
>> address 10.0.2.42 in the test described above:
>>
>> #eth0 qdisc
>> tc qdisc add dev eth0 root handle 1:0 htb default 2
>> tc class add dev eth0 parent 1:0 classid 1:1 htb rate 10mbit ceil 10mbit
>> tc class add dev eth0 parent 1:1 classid 1:2 htb rate 120kbit ceil 120kbit
>> tc class add dev eth0 parent 1:1 classid 1:3 htb rate 200kbit ceil 1mbit
>>
>> #eth1 qdisc
>> tc qdisc add dev eth1 root handle 2:0 htb default 2
>> tc class add dev eth1 parent 2:1 classid 2:2 htb rate 120kbit ceil 120kbit
>> tc class add dev eth1 parent 2:1 classid 2:3 htb rate 200kbit ceil 1mbit
>>
>> #eth0 filter
>> tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip src 10.0.2.42 flowid 1:3
>>
>> #eth1 filter
>> tc filter add dev eth1 parent 2:0 protocol ip prio 1 u32 match ip dst 10.0.2.42 flowid 2:3
>>
>> Thank you for your comments on this situation.
>
> It's probably because arp is being sent to 1:2 which is backlogged. Try
> not using the default parameter and instead use a catch all ip tc filter
> like -
>
> tc filter add dev eth0 parent 1:0 protocol ip prio 2 u32 match u32 0 0 flowid 1:2
>
> You could also consider adding p/bfifos to the classes and use the limit
> parameter to make the queues shorter. At low bitrates the default
> 1000pkts (picked up from the queuelen on eth) is too long.
>
> Andy.

From unki at netshadow.at Mon Jun 25 13:20:12 2007
From: unki at netshadow.at (Andreas Unterkircher)
Date: Mon Jun 25 13:20:18 2007
Subject: [LARTC] Why does scp stall on low bandwidth connections?
In-Reply-To: <467FA144.4030805@oldum.net>
References: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com> <467EFD97.6050101@andyfurniss.entadsl.com> <467FA144.4030805@oldum.net>
Message-ID: <20070625132012.vbf3t65gggsk484c@webmail.netshadow.at>

The first one only recognizes IP traffic; the line with default will
match any kind of traffic.

Regards,
Andreas

Quoting Nikolay Kichukov:

> Hello Andy,
> Is that line:
> tc filter add dev eth0 parent 1:0 protocol ip prio 2 u32 match u32 0 0 flowid 1:2
>
> not equal to:
> tc qdisc add dev eth0 root handle 1:0 htb default 2
>
> in terms of achieved results? If not, what is the difference?
>
> Thanks,
> -Nikolay

From hijacker at oldum.net Mon Jun 25 13:39:06 2007
From: hijacker at oldum.net (Nikolay Kichukov)
Date: Mon Jun 25 13:40:03 2007
Subject: [LARTC] Why does scp stall on low bandwidth connections?
In-Reply-To: <20070625132012.vbf3t65gggsk484c@webmail.netshadow.at>
References: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com> <467EFD97.6050101@andyfurniss.entadsl.com> <467FA144.4030805@oldum.net> <20070625132012.vbf3t65gggsk484c@webmail.netshadow.at>
Message-ID: <467FA95A.9060009@oldum.net>

Hello Andreas,
and arp is not ip ... thanks for the clarification.

Where (in which class) would all non-ip traffic go in the filter scenario?

Thanks,
-Nikolay

Andreas Unterkircher wrote:
> The first one only recognize IP traffic, the line with default will
> match any kind of traffic.
>
> Regards,
> Andreas
>
> Quoting Nikolay Kichukov :
>
>> Hello Andy,
>> Is that line:
>> tc filter add dev eth0 parent 1:0 protocol ip prio 2 u32 match u32 0 0 flowid 1:2
>>
>> not equal to:
>> tc qdisc add dev eth0 root handle 1:0 htb default 2
>>
>> in terms of achieved results? If not, what is the difference?
>>
>> Thanks,
>> -Nikolay

From seba at mfdlabs.ro Mon Jun 25 13:47:51 2007
From: seba at mfdlabs.ro (Seba Tiponut)
Date: Mon Jun 25 13:47:46 2007
Subject: [LARTC] Using Julian Anastasov's 'routes' patches on 2.4 kernel in conjunction with IPSec
Message-ID: <200706251447.51518.seba@mfdlabs.ro>

Hello,

I use Julian Anastasov's 'routes' patches (to be more specific: static_routes,
alt_routes and nf_reroute) on a 2.4.32 kernel. On the same host I run
IPSec. After a few hours of networking problems I discovered that, when
IPSec is enabled on that patched kernel and I inspect packets with tcpdump
while arping-ing a host on a network physically connected to this machine,
the arp requests show up on the ipsecX interface instead of the ethX
interface. When IPSec isn't running, Julian's code works fine. I suspect it
has something to do with having two interfaces with the same data (ipsecX
mirroring the configuration from ethX).
Can anyone give me a hint on how I could solve this problem? I've googled for
a long time to no avail and I don't have the necessary skills to debug the
kernel's networking code.

Cheers,
Seba.

From salim.si at cipherium.com.tw Mon Jun 25 13:59:15 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Mon Jun 25 13:59:33 2007
Subject: [LARTC] Using Julian Anastasov's 'routes' patches on 2.4 kernel inconjunction with IPSec
In-Reply-To: <200706251447.51518.seba@mfdlabs.ro>
Message-ID: <005001c7b720$44731e20$5901a8c0@SalimSi>

I had the same problem. I had to disable the ipsec interfaces to make things
work. Though the routing rules were in the correct order, packets went to the
ipsec interface. Finally, I removed the patch.
> -----Original Message-----
> From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl]
> On Behalf Of Seba Tiponut
> Sent: Monday, June 25, 2007 7:48 PM
> To: lartc@mailman.ds9a.nl
> Subject: [LARTC] Using Julian Anastasov's 'routes' patches on 2.4 kernel
> inconjunction with IPSec
>
> Hello,
>
> I use Julian Anastasov 'routes' (to be more specific: static_routes,
> alt_routes and nf_reroute) patches on a 2.4.32 kernel. On the same host I
> run IPSec. I have discovered after a few hours of networking problems that,
> when IPSec is enabled on that patched kernel, inspecting packets with
> tcpdump while arping-ing a host from a network physically connected to this
> machine, the arp requests show up on the ipsecX interface instead of the
> ethX interface. When IPSec isn't running, Julian's code works fine. I
> suspect it has something to do with having two interfaces with the same
> data (ipsecX mirroring the configuration from ethX).
> Can anyone give me a hint on how could I solve this problem? I've googled a
> long time to no avail and I don't have the necessary skills to debug the
> networking code from kernel.
>
> Cheers,
> Seba.

From daniel.schaffrath at mac.com Mon Jun 25 14:45:00 2007
From: daniel.schaffrath at mac.com (Daniel Schaffrath)
Date: Mon Jun 25 14:45:29 2007
Subject: [LARTC] RED to use ECN (or work at all?)
Message-ID:

Dear Community,

Sorry for the somewhat dumb question. Maybe someone has a pointer on
how to set up the RED queue to mark packets with ECN. In particular,
what are appropriate parameter settings for limit, min, max, etc.?

All my trials end up with "RTNETLINK answers: Invalid argument",
although the command line (at least to me) looks fine with regard to
what the man page says.

"tc qdisc replace dev lo red limit 10kb min 2kb max 8kb avpkt 1000
burst 12 ecn" being one such example.

I am using kernel version 2.6.16.29 and the most recent iproute2
package.

Thanks in advance,
Daniel

From gtaylor at riverviewtech.net Mon Jun 25 16:47:46 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Mon Jun 25 16:45:46 2007
Subject: [LARTC] Load Balance and SNAT problem.
In-Reply-To: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com>
References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com>
Message-ID: <467FD592.5010700@riverviewtech.net>

On 06/24/07 22:07, John Chang wrote:
> iptables -t mangle -A PREROUTING -t mangle -j CONNMARK --restore-mark
> iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 1 -j MARK --set-mark 1
> iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 2 -j MARK --set-mark 2
> iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark

I don't think these rules are going to do what you anticipate them to
do. These rules will alternate which route is used based on sequential
entry of packets into the router. Consider any transaction that takes
more than one packet. The connection will be sent out both routes, each
with a different source IP address, so the two packets are no longer
associated with each other, which breaks your connection.

> 2. I capture packets on WAN1 and WAN2, it works fine.
> The ICMP request/response would come out on WAN1 and WAN2 sequentially.

(See the above comment.)

> 3. I unplug WAN1. Only the packets on WAN1 will lost, but WAN2 should
> works, right?
> I should saw "ping Time Out" and "ping OK" on PC1 sequentially.

*IF* the rules do work, yes this should be what you see.

> 4. But the both connections all breaks. It always "ping Time Out" on PC1.

*nod*

> 5. After caputre the packets on WAN1 and WAN2. I saw a weird behavior.
> The source IP of packets on WAN2 is 111.111.111.2
> but it should be 222.222.222.2
> That is why WAN2 breaks.

I don't know what to say here, other than something is not working right.

> Could you give me a suggestion?
> Thanks.

Do not use this method to load balance. Look in to Equal Cost Multi
Path (a.k.a. ECMP) routing and specifying multiple default gateways on
one route command. The kernel should try to load balance across the
multiple default gateways for you while maintaining connections.



Grant. . . .

From christian.benvenuti at libero.it Mon Jun 25 22:21:17 2007
From: christian.benvenuti at libero.it (Christian Benvenuti)
Date: Mon Jun 25 22:20:29 2007
Subject: [LARTC] Re: RED to use ECN (or work at all?)
Message-ID: <1182802877.2691.4.camel@benve-laptop>

Hi Daniel,

>Dear Community,
>
>sorry for the somewhat dumb question. Maybe someone has any pointer
>to how to setup the RED queue to mark pakets with ECN. In particular
>what are appropriate parameter settings for limit, min, max, etc.
>
>All my trials end up with either "RTNETLINK answers: Invalid
>argument", although the command line (at least for me) looks fine in
>regard to what is said on the man page.
>
>"tc qdisc replace dev lo red limit 10kb min 2kb max 8kb avpkt 1000
>burst 12 ecn" being one such example.

You did not specify where to attach the qdisc (see the keyword "root"):

   tc qdisc replace dev lo root \
      red limit 10kb min 2kb max 8kb avpkt 1000 burst 12 ecn

>I am using kernel version 2.6.16.29 and the most recent iproute2
>package.
>
>Thanks in advance,
>Daniel

Regards
/Christian
[ http://benve.info ]

From vladsun at relef.net Mon Jun 25 23:30:25 2007
From: vladsun at relef.net (VladSun)
Date: Mon Jun 25 23:31:00 2007
Subject: [LARTC] Load Balance and SNAT problem.
In-Reply-To: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com>
References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com>
Message-ID: <468033F1.9020408@relef.net>

John Chang wrote:
>
> I am developing load balancing router, But I have a question about
> fail over.
> The follow diagram is my test environment and scripts.
> -------------------------------------------------------------------
> Environment Setting
>
>                  PC1(192.168.10.2)
>                          |
>                        (LAN)
>                          |
>                PC2-eth2(192.168.10.1)
>                 +                  +
> PC2-eth0(111.111.111.2)     PC2-eth1(222.222.222.2)
>            |                           |
>         (WAN1)                      (WAN2)
>            |                           |
> PC3-eth0(111.111.111.1)     PC3-eth1(222.222.222.1)
>                 +                  +
>                PC2-eth2(172.16.0.1)
>
> PC2-Linux Kernel 2.6.21
> PC2-Iptables 1.3.7
>
> -------------------------------------------------------------------
> Iptables rules:
>
> iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 111.111.111.2
> iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to 222.222.222.2
>
> # table 101
> ip route flush table 101
> ip route add 192.168.10.0/24 dev eth2 table 101
> ip route add default via 111.111.111.1 dev eth0 table 101
>
> # table 102
> ip route flush table 102
> ip route add 192.168.10.0/24 dev eth2 table 102
> ip route add default via 222.222.222.1 dev eth1 table 102
>
> ip rule del fwmark 1 table 101
> ip rule del fwmark 2 table 102
> ip rule add fwmark 1 table 101
> ip rule add fwmark 2 table 102
>
> iptables -t mangle -A PREROUTING -t mangle -j CONNMARK --restore-mark
> iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 1 -j MARK --set-mark 1
> iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 2 -j MARK --set-mark 2
> iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark
>
> -----------------------------------------------------------------------------
>

Well ... I am not sure about it but you may try to do it this way:

iptables -t nat -A POSTROUTING -o ! eth2 -m mark --mark 1 -j SNAT --to 111.111.111.2
iptables -t nat -A POSTROUTING -o ! eth2 -m mark --mark 2 -j SNAT --to 222.222.222.2

iptables -t mangle -A PREROUTING -t mangle -j CONNMARK --restore-mark
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 1 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 2 -j MARK --set-mark 2
iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark

This is done without using iproute.

There is another solution, but it works only with kernels up to 2.6.10:

iptables -t nat -A POSTROUTING -o ! eth2 -j SNAT --to 111.111.111.2,222.222.222.2

".... For those kernels, if you specify more than one source address,
either via an address range or multiple --to-source options, a simple
round-robin (one after another in cycle) takes place between these
addresses. Later Kernels (>= 2.6.11-rc1) don't have the ability to NAT
to multiple ranges anymore. ..."

From ja at ssi.bg Mon Jun 25 23:40:39 2007
From: ja at ssi.bg (Julian Anastasov)
Date: Mon Jun 25 23:40:53 2007
Subject: [LARTC] Using Julian Anastasov's 'routes' patches on 2.4 kernel in conjunction with IPSec
In-Reply-To: <200706251447.51518.seba@mfdlabs.ro>
References: <200706251447.51518.seba@mfdlabs.ro>
Message-ID:

Hello,

On Mon, 25 Jun 2007, Seba Tiponut wrote:

> I use Julian Anastasov 'routes' (to be more specific: static_routes,
> alt_routes and nf_reroute) patches on a 2.4.32 kernel. On the same host I run
> IPSec. I have discovered after a few hours of networking problems that,
> when IPSec is enabled on that patched kernel, inspecting packets with tcpdump
> while arping-ing a host from a network physically connected to this machine,
> the arp requests show up on the ipsecX interface instead of the ethX
> interface. When IPSec isn't running, Julian's code works fine.
> I suspect it has something to do with having two interfaces with the same
> data (ipsecX mirroring the configuration from ethX).
> Can anyone give me a hint on how could I solve this problem? I've googled a
> long time to no avail and I don't have the necessary skills to debug the
> networking code from kernel.

Maybe you have to replace your _updown script with one that
supports "ip route" and "ip rule" commands instead of the old
"route" tool. That way you can use "ip rule ... from LNET to RNET"
to properly route traffic for the negotiated subnets. If I remember
correctly, the default _updown script does not consider the negotiated
LNET at all. As for the routes patch, it will prefer NOARP devices when
the neighbours on the ARP device are not marked as reachable in the ARP
cache. So, it is risky to rely on wrong routes, especially after the
routes patch is applied.

Regards

--
Julian Anastasov

From rabbit at rabbit.us Tue Jun 26 08:46:12 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Tue Jun 26 08:47:18 2007
Subject: [LARTC] Load Balance and SNAT problem.
In-Reply-To: <467FD592.5010700@riverviewtech.net>
References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <467FD592.5010700@riverviewtech.net>
Message-ID: <4680B634.9060802@rabbit.us>

Grant Taylor wrote:
>
>> Could you give me a suggestion?
>> Thanks.
>
> Do not use this method to load balance. Look in to Equal Cost Multi
> Path (a.k.a. ECMP) routing and specifying multiple default gateways on
> one route command. The kernel should try to load balance across the
> multiple default gateways for you while maintaining connections.
>

This is bad, bad advice in this day and age. If there are not enough
users, route caching will kill him. Here is a recent discussion of this:
http://marc.info/?l=lartc&m=117912699505681&w=2

HTH

Peter

P.S. I am not insisting that netfilter is superior in this regard, I am
simply expressing common requirements and looking into ways of achieving
them.
If someone can point me to how to do this with kernel routes - I am
all ears, since I recognize that the netfilter solution is not very
elegant, although it works.

From mofish at gmail.com Tue Jun 26 13:36:50 2007
From: mofish at gmail.com (John Chang)
Date: Tue Jun 26 13:36:58 2007
Subject: [LARTC] Load Balance and SNAT problem.
Message-ID: <7e47206b0706260436xa35438cx25253c9614fd9e47@mail.gmail.com>

Thanks for your advice.

Currently my test scripts make both WAN connections break when I unplug
one WAN connection, so I cannot implement the failover mechanism.

My original idea was to mark all packets as 1 when connection WAN2 breaks,
or mark all packets as 2 when connection WAN1 breaks.
But now one broken connection makes both connections break, so I cannot
identify which connection broke. It is weird. ><"

------------------------------------------------------------------------------------------------------
Grant Taylor wrote:
>
>> Could you give me a suggestion?
>> Thanks.
>
> Do not use this method to load balance. Look in to Equal Cost Multi
> Path (a.k.a. ECMP) routing and specifying multiple default gateways on
> one route command. The kernel should try to load balance across the
> multiple default gateways for you while maintaining connections.
>

This is a bad bad advice in this day and age. If there are not enough
users route caching will kill him. Here is a recent discussion of this:
http://marc.info/?l=lartc&m=117912699505681&w=2

HTH

Peter

P.S. I am not insisting that netfilter is superior in this regard, I am
simply expressing common requirements and looking into ways of achieving
them. If someone can point me to how to do this with kernel routes - I am
all ears, since I recognize that the netfilter solution is not very
elegant, although it works.

From gtaylor at riverviewtech.net Tue Jun 26 16:37:26 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Tue Jun 26 16:35:26 2007
Subject: [LARTC] Load Balance and SNAT problem.
In-Reply-To: <4680B634.9060802@rabbit.us>
References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <467FD592.5010700@riverviewtech.net> <4680B634.9060802@rabbit.us>
Message-ID: <468124A6.8000101@riverviewtech.net>

On 06/26/07 01:46, Peter Rabbitson wrote:
> This is a bad bad advice in this day and age.

I think that is a bit of a bold statement. You are free to have your
opinion on what is better for you, as am I.

> If there are not enough users route caching will kill him. Here is a
> recent discussion of this:
> http://marc.info/?l=lartc&m=117912699505681&w=2

Um, I just read this discussion and I have a few issues with it.

First and foremost: It did not cover the reason "... route caching will
kill ..." to my satisfaction like you indicated.

Second: It relies on user space processes to alter and maintain things.
Thus if for some reason these processes do not run or do not do so in a
timely manner, they may not function correctly.

Third: You are altering the way a running kernel is operating from user
space, not letting the kernel maintain itself.

Fourth: Occam's Razor dictates the use of the simpler and equally
effective (equality is debatable) method to achieve the same result.

Though the method you cite has potential, I think there is just as much
room for improvement as there is in the method that I suggested. Each
method has its pros and cons.

> P.S. I am not insisting that netfilter is superior in this regard, I
> am simply expressing common requirements and looking into ways of
> achieving them. If someone can point me to how to do this with
> kernel routes - I am all ears, since I recognize that the netfilter
> solution is not very elegant, although it works.
By your own statement, you are indicating that both methods leave
something to be desired.



Grant. . . .

From contato at patrick.eti.br Tue Jun 26 17:04:56 2007
From: contato at patrick.eti.br (Patrick Brandão)
Date: Tue Jun 26 17:05:10 2007
Subject: [LARTC] Load Balance and SNAT problem.
References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <467FD592.5010700@riverviewtech.net><4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net>
Message-ID: <00dd01c7b803$5d7acc90$c70010ac@notebook>

Try this algorithm:

MANGLE:
1 - restore mark
2 - accept mark 1
    accept mark 2
3 - random mark 1 or 2
4 - save mark

NAT:
5 - SNAT per interface.

Regards,
Patrick Brandão

----- Original Message -----
From: "Grant Taylor"
To: "Mail List - Linux Advanced Routing and Traffic Control"
Sent: Tuesday, June 26, 2007 11:37 AM
Subject: Re: [LARTC] Load Balance and SNAT problem.

> On 06/26/07 01:46, Peter Rabbitson wrote:
>> This is a bad bad advice in this day and age.
>
> I think that is a bit of a bold statement. You are free to have your
> opinion on what is better for you, as am I.
>
>> If there are not enough users route caching will kill him. Here is a
>> recent discussion of this:
>> http://marc.info/?l=lartc&m=117912699505681&w=2
>
> Um, I just read this discussion and I have a few issues with it.
>
> First and foremost: It did not cover the reason "... route caching will
> kill ..." to my satisfaction like you indicated.
>
> Second: It relies on user space processes to alter and maintain things.
> Thus if for some reason these processes do not run or do not do so in a
> timely manner, they may not function correctly.
>
> Third: You are altering the way a running kernel is operating from user
> space, not letting the kernel maintain its self.
>
> Fourth: Occam's Razor dictates the use of the simpler and equally
> effective (equality is debatable) method to achieve the same result.
>
> Though the method you site has potential, I think there is just as much
> room for improvement as there is in the method that I suggested. Each
> method has its pros and cons.
>
>> P.S. I am not insisting that netfilter is superior in this regard, I am
>> simply expressing common requirements and looking into ways of achieving
>> them. If someone can point me to how to do this with kernel routes - I
>> am all ears, since I recognize that the netfilter solution is not very
>> elegant, although it works.
>
> By your own statement, you are indicating that both methods leave
> something to be desired.
>
>
>
> Grant. . . .

From tenos at ll.mit.edu Tue Jun 26 19:16:06 2007
From: tenos at ll.mit.edu (Tim Enos)
Date: Tue Jun 26 19:16:33 2007
Subject: [LARTC] classification of incoming traffic with tc
Message-ID: <200706261716.l5QHGQx3027844@ll.mit.edu>

Hi all,

Another requirement we have is that traffic entering the DS domain be
classified then subsequently assigned a (different?) DSCP based upon its
classification. For illustrative purposes only let's say (for traffic
entering the DS domain on dev eth0):

- WWW traffic would be marked BE
- traffic destined for 10.10.10.10 would be marked AF11
- VoIP traffic from 20.20.20.20 would be marked EF
- packets 500 bytes in length would be marked AF22

I'm looking for the Linux router to classify and mark the incoming traffic,
_not_ the originating host(s). From what I can see, dsmark only works on
egress qdiscs (is this indeed the case?). I need something that works on an
ingress (qdisc?).

From daniel at mks.padinet.com Tue Jun 26 19:24:05 2007
From: daniel at mks.padinet.com (Daniel Harold L.)
Date: Tue Jun 26 19:24:17 2007
Subject: [LARTC] classification of incoming traffic with tc
In-Reply-To: <200706261716.l5QHGQx3027844@ll.mit.edu>
References: <200706261716.l5QHGQx3027844@ll.mit.edu>
Message-ID: <200706270124.06059.daniel@mks.padinet.com>

On Wednesday 27 June 2007 01:16, Tim Enos wrote:

> I'm looking for the Linux router to classify and mark the incoming traffic,
> _not_ the originating host(s). From what I can see, dsmark only works on
> egress qdiscs (is this indeed the case?). I need something that works on an
> ingress (qdisc?).

afaik, you can use IMQ or IFB for ingress qdisc.

Daniel
PadiNet Makassar

From rabbit at rabbit.us Tue Jun 26 19:44:24 2007
From: rabbit at rabbit.us (Peter Rabbitson)
Date: Tue Jun 26 19:44:31 2007
Subject: [LARTC] Load Balance and SNAT problem.
In-Reply-To: <468124A6.8000101@riverviewtech.net>
References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <467FD592.5010700@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net>
Message-ID: <46815078.5060705@rabbit.us>

Grant Taylor wrote:
> First and foremost: It did not cover the reason "... route caching will
> kill ..." to my satisfaction like you indicated.

Can you elaborate on this? My only issue with the kernel route balancing
is that route caching cannot be disabled entirely, so traffic to the
same site will leave via the same channel, regardless of whether the other
channel is empty or not. I know that it is technically possible (kernel
option CONFIG_IP_ROUTE_MULTIPATH_RANDOM), but it will work only for
globally routable addresses, while breaking NAT badly.

The reason I made my bold, as you call it, statement is that 90% of
the time when someone is doing NAT, it is for a tightly joined group
with similar interests - hence a lot of traffic duplication. For instance
if every user listens to the same online radio station - how would you
work around it?
Let me know your thoughts

Peter

From tenos at ll.mit.edu Tue Jun 26 20:38:17 2007
From: tenos at ll.mit.edu (Tim Enos)
Date: Tue Jun 26 20:39:03 2007
Subject: [LARTC] classification of incoming traffic with tc
In-Reply-To: <200706270124.06059.daniel@mks.padinet.com>
Message-ID: <200706261838.l5QIcrRo028487@ll.mit.edu>

Hi Daniel,

Thanks. So how exactly would I use either IMQ or IFB to classify and
subsequently mark incoming traffic with a (new) DSCP? If you have a
configuration example, that would be most helpful.

> -----Original Message-----
> From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl]
> On Behalf Of Daniel Harold L.
> Sent: Tuesday, June 26, 2007 1:24 PM
> To: lartc@mailman.ds9a.nl
> Subject: Re: [LARTC] classification of incoming traffic with tc
>
> On Wednesday 27 June 2007 01:16, Tim Enos wrote:
>
> > I'm looking for the Linux router to classify and mark the incoming traffic,
> > _not_ the originating host(s). From what I can see, dsmark only works on
> > egress qdiscs (is this indeed the case?). I need something that works on an
> > ingress (qdisc?).
>
> afaik, you can use IMQ or IFB for ingress qdisc.
>
> Daniel
> PadiNet Makassar

From ramoni at databras.com.br Tue Jun 26 22:01:33 2007
From: ramoni at databras.com.br (Andre Guimarães)
Date: Tue Jun 26 22:01:38 2007
Subject: [LARTC] Load Balance and SSL
In-Reply-To: <46815078.5060705@rabbit.us>
References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us>
Message-ID: <200706261701.33597.ramoni@databras.com.br>

Hi,

I have load balancing working on a Linux server, balancing between two
providers with, obviously, two different IPs (the customer is not an
Autonomous System).
It works very well except with some sites that establish a session and then redirect the session to another server. These sessions are usually based on information like cookies and the client IP address, and therefore you must reach the destination with the same IP address (that's why the routing cache is there). But when the "session" is redirected to another destination server, with another destination IP, sometimes the connection goes through the other link, and so arrives at the destination with another IP, and then the session becomes invalid.

I can't see anything Linux (or anything else) could do to deal with it, since it's a new destination IP.

Does anyone know something that could solve this kind of problem?

:: Sorry for the bad English.

--
André Guimarães
Databras Informática
Matriz RJ - 55 (21) 2518-2363
Filial ES - 55 (27) 3233-0098
http://www.databras.com.br

From lists at andyfurniss.entadsl.com Tue Jun 26 23:32:23 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Jun 26 23:32:58 2007
Subject: [LARTC] Why does scp stall on low bandwidth connections?
In-Reply-To: <467FA95A.9060009@oldum.net>
References: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com> <467EFD97.6050101@andyfurniss.entadsl.com> <467FA144.4030805@oldum.net> <20070625132012.vbf3t65gggsk484c@webmail.netshadow.at> <467FA95A.9060009@oldum.net>
Message-ID: <468185E7.1030709@andyfurniss.entadsl.com>

Nikolay Kichukov wrote:
> Hello Andreas,
> and arp is not ip ... thanks for clarification.
>
> Where(in which class) would all non-ip traffic go in the filter scenario?

In the case of htb, unclassified packets go unshaped if no default class is set (= default 0); you do get a counter -

andy@noki:~$ /sbin/tc -s qdisc ls dev eth3
qdisc htb 1: r2q 10 default 0 direct_packets_stat 3223

In the case of HFSC, unclassified packets get dropped - so you really need a default class, but not one that gets low prio IP sent to it :-)

Andy.
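
Andy's point about default classes could be sketched roughly as below. This is only an illustration, not anyone's actual setup: the device name eth3 is taken from the output above, while the class ids (1:1, 1:10, 1:30) and all rates are made-up assumptions. The script only prints the tc commands unless it is run with TC=tc.

```shell
#!/bin/sh
# Dry run by default: prints the tc commands instead of running them.
# Run as root with TC=tc to actually apply them.
TC="${TC:-echo tc}"
DEV="${DEV:-eth3}"      # device name from the output above

htb_with_default() {
    # "default 30" sends anything no filter classified (ARP and other
    # non-IP traffic included) to class 1:30 instead of letting it
    # bypass shaping as it does with "default 0".
    $TC qdisc add dev "$DEV" root handle 1: htb default 30
    $TC class add dev "$DEV" parent 1: classid 1:1 htb rate 1mbit
    # Hypothetical class for IP traffic selected by filters (not shown).
    $TC class add dev "$DEV" parent 1:1 classid 1:10 htb rate 800kbit ceil 1mbit
    # Dedicated default class; no filter should send low prio IP here.
    $TC class add dev "$DEV" parent 1:1 classid 1:30 htb rate 200kbit ceil 1mbit
}

htb_with_default
```

With a default class in place, the direct_packets_stat counter shown by "tc -s qdisc ls" should stop growing, since nothing is left unclassified.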
From lists at andyfurniss.entadsl.com Tue Jun 26 23:44:26 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Jun 26 23:44:35 2007
Subject: [LARTC] RED to use ECN (or work at all?)
In-Reply-To: References:
Message-ID: <468188BA.9020901@andyfurniss.entadsl.com>

Daniel Schaffrath wrote:
> Dear Community,
>
> sorry for the somewhat dumb question. Maybe someone has any pointer to
> how to setup the RED queue to mark packets with ECN. In particular what
> are appropriate parameter settings for limit, min, max, etc.

I have never really played with red - but I wouldn't start by putting it on lo - I don't think you will get a backlog. Even on root of an eth there will be quite a big buffer to fill before it gets any backlog.

You are best using it as a child of something that rate limits (tbf/hfsc/htb). There is a rate parameter which I guess you should use - but it's not to actually do the rate limiting.

As for ECN, it's off by default on Linux, so I don't think it will be much use unless you have control over the network.

I just turned it on on two of my PCs -

echo 1 > /proc/sys/net/ipv4/tcp_ecn

and did a test with red and it does work and saves some drops.

Andy.

From lists at andyfurniss.entadsl.com Tue Jun 26 23:47:20 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Tue Jun 26 23:47:45 2007
Subject: [LARTC] selectors for tc filters
In-Reply-To: <18046.14055.442505.588642@ccs.covici.com>
References: <18046.14055.442505.588642@ccs.covici.com>
Message-ID: <46818968.6040703@andyfurniss.entadsl.com>

John covici wrote:
> Hi. I can't find any documentation on the specific selectors for
> tc-filters -- what documentation I have says they are in Polish in a
> file called selectors.html -- is there anything around in English to
> see those?
>
> Thanks.
>

This link to an impressive doc by Russell Stuart was posted recently.

http://www.stuart.id.au/russell/files/tc/doc/tc/cls_u32.txt

Andy.
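
For readers who just want a feel for the u32 selectors before reading that document, a few common forms might look like the sketch below. Everything specific here is an assumption for illustration: the device eth0, an existing classful qdisc with handle 1:, and classes 1:10 and 1:20. The script prints the commands unless run with TC=tc.

```shell
#!/bin/sh
# Dry run by default: prints the tc commands instead of running them.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"      # hypothetical device with a classful qdisc 1:

u32_examples() {
    # Destination port 22: a 16-bit value compared under a 16-bit mask.
    $TC filter add dev "$DEV" parent 1: protocol ip prio 1 u32 \
        match ip dport 22 0xffff flowid 1:10
    # Source network given as address/prefix.
    $TC filter add dev "$DEV" parent 1: protocol ip prio 2 u32 \
        match ip src 192.168.1.0/24 flowid 1:20
    # TOS byte equal to 0x10 (minimize delay).
    $TC filter add dev "$DEV" parent 1: protocol ip prio 3 u32 \
        match ip tos 0x10 0xff flowid 1:10
    # Raw form: the byte at offset 9 of the IP header (protocol) is 1, i.e. ICMP.
    $TC filter add dev "$DEV" parent 1: protocol ip prio 4 u32 \
        match u8 1 0xff at 9 flowid 1:10
}

u32_examples
```

The named selectors (ip dport, ip src, ip tos) are shorthand for raw value/mask/offset matches like the last one, which is the part the linked document explains in detail.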
From covici at ccs.covici.com Wed Jun 27 00:14:18 2007 From: covici at ccs.covici.com (John covici) Date: Wed Jun 27 00:17:10 2007 Subject: [LARTC] selectors for tc filters In-Reply-To: <46818968.6040703@andyfurniss.entadsl.com> References: <18046.14055.442505.588642@ccs.covici.com> <46818968.6040703@andyfurniss.entadsl.com> Message-ID: <18049.36794.838284.183021@ccs.covici.com> Thanks -- looks like just what I was looking for. on Tuesday 06/26/2007 Andy Furniss(lists@andyfurniss.entadsl.com) wrote > John covici wrote: > > Hi. I can't find any documentation on the specific selectors for > > tc-filters -- what documentation I have says they are in Polish in a > > file called selectors.html -- is there anything around in English to > > see those? > > > > Thanks. > > > > This link to an impressive doc by Russel Stuart was posted recently. > > http://www.stuart.id.au/russell/files/tc/doc/tc/cls_u32.txt > > Andy. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici covici@ccs.covici.com From ghartung at photobucket.com Wed Jun 27 01:01:42 2007 From: ghartung at photobucket.com (Greg Hartung) Date: Wed Jun 27 01:01:50 2007 Subject: [LARTC] GRE tunnel In-Reply-To: Message-ID: I'm still stuck on this one and could really use some help. I just finished trying it on an FC3 box too to make sure it wasn't CentOS specific issue but there's still no output from tcpdump. I also spent some time looking over Cisco examples to make sure I wasn't misremembering the concepts. No surprises there. Does anyone have any ideas or can someone suggest a more appropriate forum for the question? Thanks!! On 6/21/07 11:52 AM, "Greg Hartung" wrote: > > I am trying to setup GRE between two CentOS 4.5 boxes. I have tried > several variations of what's listed below, but none of them work. 
>
> box1:
> modprobe ip_gre
> ip link set gre0 up
> ip tunnel add gretun mode gre local 66.1.1.161 remote 66.1.2.161 ttl 20 dev eth0
> ip addr add dev gretun 10.253.253.1 peer 10.253.253.2/24
> ip link set dev gretun up
> ip route add 10.2.0.0/16 via 10.253.253.2
>
> box2:
> modprobe ip_gre
> ip link set gre0 up
> ip tunnel add gretun mode gre local 66.1.2.161 remote 66.1.1.161 ttl 20 dev eth0
> ip addr add dev gretun 10.253.253.2 peer 10.253.253.1/24
> ip link set dev gretun up
> ip route add 10.1.0.0/16 via 10.253.253.1
>
> tcpdump shows NO rx or tx traffic from either box that isn't ARP or SSH.
>
> It's as if it's not even trying to bring the tunnel up. I'm a Cisco guy,
> so I'm lost with my show commands.
>
> The other variations I've tried consist mostly of trying different
> combinations of on-net (in the same subnet as eth0 and even the same address
> as eth0) and off-net (various combinations of loopback /24 and /32 addresses
> in separate 10 space) on the 'ip addr add dev gretun' statements. But the
> above example is what *should* work on a Cisco, I think. It's been a
> while.
>
> How do I troubleshoot this?
> This is all I've got so far:
>
> root@den1tun01:/home/root $ ip link
> 1: lo: mtu 16436 qdisc noqueue
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 2: eth0: mtu 8800 qdisc pfifo_fast qlen 1000
>     link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
> 3: eth0.2: mtu 8800 qdisc noqueue
>     link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
> 4: gre0: mtu 1476 qdisc noqueue
>     link/gre 0.0.0.0 brd 0.0.0.0
> 5: gretun@eth0: mtu 8776 qdisc noqueue
>     link/gre 66.1.1.161 peer 66.1.2.161
>
> root@den1tun01:/home/root $ ip tun
> gre0: gre/ip remote any local any ttl inherit nopmtudisc
> gretun: gre/ip remote 66.1.2.161 local 66.1.1.161 dev eth0 ttl 20
>
> root@den1tun01:/home/root $ ifconfig
> eth0      Link encap:Ethernet HWaddr 00:19:B9:DD:FF:D9
>           inet addr:10.1.2.243 Bcast:10.1.3.255 Mask:255.255.254.0
>           UP BROADCAST RUNNING MULTICAST MTU:8800 Metric:1
>           RX packets:3357 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:484 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:230757 (225.3 KiB) TX bytes:63937 (62.4 KiB)
>           Interrupt:169 Memory:f8000000-f8011100
>
> eth0.2    Link encap:Ethernet HWaddr 00:19:B9:DD:FF:D9
>           inet addr:66.1.1.161 Bcast:66.1.1.191 Mask:255.255.255.192
>           UP BROADCAST RUNNING MULTICAST MTU:8800 Metric:1
>           RX packets:950 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:43860 (42.8 KiB) TX bytes:1200 (1.1 KiB)
>
> gretun    Link encap:UNSPEC HWaddr 42-0B-33-A1-FF-C0-00-00-00-00-00-00-00-00-00-00
>           inet addr:10.253.253.1 P-t-P:10.253.253.2 Mask:255.255.255.0
>           UP POINTOPOINT RUNNING NOARP MTU:8776 Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b) TX bytes:756 (756.0 b)
>
> gre0      Link encap:UNSPEC HWaddr 00-00-00-00-FF-00-00-00-00-00-00-00-00-00-00-00
>           UP RUNNING NOARP MTU:1476 Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1 Mask:255.0.0.0
>           UP LOOPBACK RUNNING MTU:16436 Metric:1
>           RX packets:225 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:225 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:13271 (12.9 KiB) TX bytes:13271 (12.9 KiB)
>
> I've also tried changing the destination for the route to the near end of
> the private subnet and tried pinging various things on the tunnel subnet and
> remote network to create "interesting traffic" to bring the tunnel up but
> tcpdump still shows nothing.
>
> Then I noticed that ping does show an error count:
>
> [root@den1tun01 ~]# ping 10.253.253.2
> PING 10.253.253.2 (10.253.253.2) 56(84) bytes of data.
> From 10.253.253.1 icmp_seq=0 Destination Host Unreachable
> From 10.253.253.1 icmp_seq=1 Destination Host Unreachable
>
> --- 10.253.253.2 ping statistics ---
> 2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms, pipe 2
>
> I can ping the local end: 10.253.253.1, but the tunnel is still
> non-functional.
>
> Thanks!
> Greg

From florin at andrei.myip.org Wed Jun 27 01:41:29 2007
From: florin at andrei.myip.org (Florin Andrei)
Date: Wed Jun 27 01:41:46 2007
Subject: [LARTC] network simulator
Message-ID: <4681A429.9090605@andrei.myip.org>

I want to build a "network simulator", to create scenarios such as delayed packets, lost packets, low bandwidth, or combinations of such.
This document has been helpful for everything except the bandwidth: http://linux-net.osdl.org/index.php/Netem There is some documentation here regarding bandwidth: http://luxik.cdi.cz/~devik/qos/htb/ What's the best documentation to read so as to understand exactly how tc works and to be able to come up with my own list of commands to create the scenarios described above? I have a fairly good understanding of the OS in general, networking, iptables, but I never used the LARTC features until now, so I guess I'm looking for the best starting point. Thank you, -- Florin Andrei http://florin.myip.org/ From florin at andrei.myip.org Wed Jun 27 01:58:06 2007 From: florin at andrei.myip.org (Florin Andrei) Date: Wed Jun 27 01:58:23 2007 Subject: [LARTC] network simulator In-Reply-To: <4681A429.9090605@andrei.myip.org> References: <4681A429.9090605@andrei.myip.org> Message-ID: <4681A80E.40206@andrei.myip.org> Florin Andrei wrote: > I want to build a "network simulator", to create scenarios such as > delayed packets, lost packets, low bandwidth, or combinations of such. I guess I should be more specific: The "simulator" is a dual-homed machine running Linux, sitting between a couple test servers and a bunch of workstations: Servers------Simulator--------Workstations Physically, the network is a mix of GigE and FastE. The delay, loss and bandwidth constraints must be applied to all traffic going through the simulator. For now, there's no need to differentiate between the various protocols, addresses, etc. - everyone is equal. The simulator is a plain router, nothing fancy. At most, it might NAT the addresses of the servers to IP aliases on the interface facing the workstations. -- Florin Andrei http://florin.myip.org/ From gtaylor at riverviewtech.net Wed Jun 27 03:24:37 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 03:25:08 2007 Subject: [LARTC] Load Balance and SNAT problem. 
In-Reply-To: <46815078.5060705@rabbit.us> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <467FD592.5010700@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> Message-ID: <4681BC55.1080301@riverviewtech.net> On 6/26/2007 12:44 PM, Peter Rabbitson wrote: > Can you elaborate on this? My only issue with the kernel route > balancing is that route caching can not be disabled entirely, so > traffic to the same site will leave via the same channel, regardless > if the other channel is empty or not. I know that it is technically > possible (kernel option CONFIG_IP_ROUTE_MULTIPATH_RANDOM), but it > will work only for globally routable addresses, while breaking NAT > badly. This is a very good point that was not made in the referenced message. I do not have any rebuttal to this point. This is the type of point that I was hoping to see before but did not. My response to this is that you have a good point, something that in my opinion should be addressed by the kernel at some point. > The reason I made my bold, as you call it, statement, is because 90% > of the time when someone is doing NAT, it is for a tightly joined > group, with similar interests - hence a lot of traffic duplication. > For instance if every user listens to the same online radiostation - > how would you work around it? I don't know if the 90% as you say is accurate or not. However if you are even remotely in the ball park, you have a good point. I have been around environments with nearly 1000 computers with very little in similarity between all the people. I think this is really based on where NAT is used and how it is used. If you are talking of many to one NAT I would agree with you. However if you are talking about many to many NAT, I'll disagree with you. I think that the scenarios you are thinking of would be best described as a small office / home office (a.k.a. 
SOHO), which would definitely qualify with what you are saying. However there are a LOT of uses of NAT outside of SOHOs. Given the prevalence of SOHOs doing NAT, I am willing to bet that you are correct. But, this is why there are different types of solutions to this problem for them. > Let me know your thoughts With regard to streaming radio, I personally believe that it should be multicast so that it can be streamed in one time and have multiple recipients hear it. Or there should be some sort of proxy that will download it and pass it back to multiple clients. Of course, this is beyond the scope of this discussion and would be used in larger environments out side of the SOHOs that I think you are referring to. Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 03:26:46 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 03:27:17 2007 Subject: [LARTC] Load Balance and SSL In-Reply-To: <200706261701.33597.ramoni@databras.com.br> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <200706261701.33597.ramoni@databras.com.br> Message-ID: <4681BCD6.3050506@riverviewtech.net> On 6/26/2007 3:01 PM, Andre Guimar?es wrote: > Anyone knows something that could solve this kind of problem ? I would like to see some control over how the cache matches, i.e. a netmask for the destination IP. Something like cache for matches on /24 or the likes. Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 03:51:30 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 03:52:04 2007 Subject: [LARTC] Load Balance and SNAT problem. 
In-Reply-To: <46815078.5060705@rabbit.us> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <467FD592.5010700@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> Message-ID: <4681C2A2.2010208@riverviewtech.net> (Sorry, I'm not sure but the answer does impact this discussion.) On 6/26/2007 12:44 PM, Peter Rabbitson wrote: > so traffic to the same site will leave via the same channel, > regardless if the other channel is empty or not. Is the caching per route or per source IP? I'm guessing that it is per route decision such that any and all clients will use the same cached route thus not using additional interfaces. Or is this a clear and concise reason why load balancing via Netfilter would be a better approach? Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 04:07:50 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 04:08:30 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <4681C560.50008@vsnl.com> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <"46 7FD592.5010700"@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <4681C2A2.2010208@riverviewtech.net> <4681C560.50008@vsnl.com> Message-ID: <4681C676.6030407@riverviewtech.net> On 6/26/2007 9:03 PM, Mohan Sundaram wrote: > The caching would be per destination IP - so it is likely all clients > will use the same route and thus interface. This could be a problem. I was taking the caching to be remembering which route was chosen and believing it to be associated with a specific source IP address. I can see this being a very large issue when trying to do load balancing. In light of this information, I think that better could be done in Netfilter. However if there ever was a way to have route selection per source IP in the kernel, I would be more interested in that. 
I wonder if route selection caching would be different in different routing tables. In other words use a different routing table for a different (set of) clients. Thus one cached routing decision per routing table which could differ per routing table. Grant. . . . From salim.si at cipherium.com.tw Wed Jun 27 04:22:44 2007 From: salim.si at cipherium.com.tw (Salim S I) Date: Wed Jun 27 04:23:03 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <4681C676.6030407@riverviewtech.net> Message-ID: <000001c7b862$0f009270$b9021d0a@SalimSi> The caching is per destination and source ip. TOS, fwmark and input interface too, if present. Routing with netfilter does not solve cache problems anyway, cache will still be present, and it will be consulted before routing tables are hit. In my opinion, routing in netfilter gives more flexibility in dynamically choosing weights and such. But multipath routing gives a bit more IP persistence. Both solutions work pretty well; there are die-hard fans for both of the above approaches. Recent archives of lartc have lot of discussions on it. > -----Original Message----- > From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] > On Behalf Of Grant Taylor > Sent: Wednesday, June 27, 2007 10:08 AM > To: Mail List - Linux Advanced Routing and Traffic Control > Subject: Re: [LARTC] Load Balance and SNAT problem. > > On 6/26/2007 9:03 PM, Mohan Sundaram wrote: > > The caching would be per destination IP - so it is likely all clients > > will use the same route and thus interface. > > This could be a problem. I was taking the caching to be remembering > which route was chosen and believing it to be associated with a specific > source IP address. I can see this being a very large issue when trying > to do load balancing. > > In light of this information, I think that better could be done in > Netfilter. 
However if there ever was a way to have route selection per > source IP in the kernel, I would be more interested in that. > > I wonder if route selection caching would be different in different > routing tables. In other words use a different routing table for a > different (set of) clients. Thus one cached routing decision per > routing table which could differ per routing table. > > > > Grant. . . . > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From gtaylor at riverviewtech.net Wed Jun 27 04:34:14 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 04:34:30 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <4681C806.9060703@vsnl.com> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <"4 6 7FD592.5010700"@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <4681C2A2.2010208@riverviewtech.net> <4681C560.50008@vsnl.com> <4681C676.6030407@riverviewtech.net> <4681C806.9060703@vsnl.com> Message-ID: <4681CCA6.9080308@riverviewtech.net> On 6/26/2007 9:14 PM, Mohan Sundaram wrote: > I remember that route balancing has an option to perform per packet > balancing and not per connection. If that were to work, then route > cache would not be used IMHO. Interesting. Do you have any idea where I can get some more information regarding this? > Per packet balancing is normally not done as it would break > connections especially in NAT'ted scenario. Keep in mind that NATing is not the only place that load balancing is used. I call to mind my recent thread "Redundant internet connections" (http://mailman.ds9a.nl/pipermail/lartc/2007q2/021015.html) where I had globally routable IP addresses in side the DMZ. 
I could have used per packet load balancing without a problem except for the fact that I specifically wanted to not use the backup connection unless the primary was down.

Grant. . . .

From gtaylor at riverviewtech.net Wed Jun 27 04:39:05 2007
From: gtaylor at riverviewtech.net (Grant Taylor)
Date: Wed Jun 27 04:39:13 2007
Subject: [LARTC] Load Balance and SNAT problem.
In-Reply-To: <000001c7b862$0f009270$b9021d0a@SalimSi>
References: <000001c7b862$0f009270$b9021d0a@SalimSi>
Message-ID: <4681CDC9.9000104@riverviewtech.net>

On 6/26/2007 9:22 PM, Salim S I wrote:
> The caching is per destination and source ip. TOS, fwmark and input
> interface too, if present.

Is the caching done on the combination of source and destination or singularly source or singularly destination?

If caching is done on the former, then as long as the source IP is different, you could potentially have different cached route choices for different workstations within a company.

Grant. . . .

From salim.si at cipherium.com.tw Wed Jun 27 05:07:33 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Wed Jun 27 05:07:46 2007
Subject: [LARTC] Load Balance and SNAT problem.
In-Reply-To: <4681CDC9.9000104@riverviewtech.net>
Message-ID: <000101c7b868$4f153d10$b9021d0a@SalimSi>

Well, this is the relevant code in my kernel. (2.4.27)

for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) {
        if (rth->key.dst == key->dst &&
            rth->key.src == key->src &&
            rth->key.iif == 0 &&
            rth->key.oif == key->oif &&
#ifdef CONFIG_IP_ROUTE_FWMARK
            rth->key.fwmark == key->fwmark &&
#endif
            !((rth->key.tos ^ key->tos) &
              (IPTOS_RT_MASK | RTO_ONLINK)))

> -----Original Message-----
> From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Grant Taylor
> Sent: Wednesday, June 27, 2007 10:39 AM
> To: Mail List - Linux Advanced Routing and Traffic Control
> Subject: Re: [LARTC] Load Balance and SNAT problem.
> > On 6/26/2007 9:22 PM, Salim S I wrote: > > The caching is per destination and source ip. TOS, fwmark and input > > interface too, if present. > > Is the caching done on the combination of source and destination or > singularly source or singularly destination? > > If caching is done on the former, then as long as the source IP is > different, you could potentially have different cached route choices for > different workstations with in a company. > > > > Grant. . . . > _______________________________________________ > LARTC mailing list > LARTC@mailman.ds9a.nl > http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc From gtaylor at riverviewtech.net Wed Jun 27 05:16:53 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 05:17:07 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <000101c7b868$4f153d10$b9021d0a@SalimSi> References: <000101c7b868$4f153d10$b9021d0a@SalimSi> Message-ID: <4681D6A5.2090700@riverviewtech.net> On 6/26/2007 10:07 PM, Salim S I wrote: > for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) > { > if (rth->key.dst == key->dst && > rth->key.src == key->src && > rth->key.iif == 0 && > rth->key.oif == key->oif && > #ifdef CONFIG_IP_ROUTE_FWMARK > rth->key.fwmark == key->fwmark && > #endif > !((rth->key.tos ^ key->tos) & > (IPTOS_RT_MASK | RTO_ONLINK))) I'm no C programmer, but it looks like the source, destination, in interface, and out interface are all part of the conditional, thus leading us to believe that caching (?) might be per combination of all the above? Grant. . . . From rabbit at rabbit.us Wed Jun 27 07:54:42 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Wed Jun 27 07:54:50 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <000001c7b862$0f009270$b9021d0a@SalimSi> References: <000001c7b862$0f009270$b9021d0a@SalimSi> Message-ID: <4681FBA2.7000709@rabbit.us> Salim S I wrote: > The caching is per destination and source ip. 
TOS, fwmark and input
> interface too, if present.

Interesting... It definitely did not work in my scenario though. I am going to test this again in the near future, and if you are right I will rest my case.

> Routing with netfilter does not solve cache problems anyway, cache will
> still be present, and it will be consulted before routing tables are
> hit.

This is true for locally generated traffic only. Any incoming/forwarded traffic can be controlled in the PREROUTING, thus the cache is never consulted.

> Both solutions work pretty well; there are die-hard fans for both of the
> above approaches. Recent archives of lartc have lot of discussions on
> it.

I am actually simply jealous that some people apparently get it to work in-kernel, and I can't seem to. My requirements are pretty simple:

o As transparent as possible DGD, that can detect 2nd and 3rd hop failures
o Robust load balancing - connections are distributed over all available links, regardless of source and destination, with the possibility of assigning relative channel priorities
o NAT compatible - link hopping is not an option, traffic with a specific SRC/DST must stay where it started.

From salim.si at cipherium.com.tw Wed Jun 27 08:41:12 2007
From: salim.si at cipherium.com.tw (Salim S I)
Date: Wed Jun 27 08:41:49 2007
Subject: [LARTC] Load Balance and SNAT problem.
In-Reply-To: <4681FBA2.7000709@rabbit.us>
Message-ID: <000201c7b886$2ac85910$b9021d0a@SalimSi>

> This is true for locally generated traffic only. Any incoming/forwarded
> traffic can be controlled in the PREROUTING, thus the cache is never
> consulted.

The cache will still be consulted, in ip_route_input. That is for input and forwarded traffic. Only if there is no matching entry, routing tables will be employed.

If you look in the cache, you can see routes cached for same destination through both wan interfaces (well, in my case, I can see...). But their fwmarks are different, as evident from ip_conntrack.
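
The kind of setup Salim describes, where the fwmark is part of the routing decision and the cache can be inspected afterwards, might be sketched as follows. This is a guess at the general shape, not his configuration: the table numbers 101/102, the interface names wan0/wan1 and the gateway addresses (taken from the documentation ranges) are all hypothetical, and the netfilter rules that set the marks are not shown. The script prints the commands unless run with IP=ip on a kernel of that era which still has an IPv4 route cache.

```shell
#!/bin/sh
# Dry run by default: prints the ip commands instead of running them.
IP="${IP:-echo ip}"

fwmark_routing() {
    # One routing table per uplink, selected by the packet's fwmark.
    $IP rule add fwmark 1 table 101
    $IP rule add fwmark 2 table 102
    # Hypothetical gateways and interface names.
    $IP route add default via 192.0.2.1 dev wan0 table 101
    $IP route add default via 198.51.100.1 dev wan1 table 102
}

inspect_cache() {
    # List the cached routes; entries for one destination may appear once
    # per uplink if they were created under different marks.
    $IP route show cache
    # Drop all cached entries so new lookups go through the tables again.
    $IP route flush cache
}

fwmark_routing
inspect_cache
```

Flushing the cache after changing rules or tables is the usual way to make sure stale cached decisions are not reused.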
From gtaylor at riverviewtech.net Wed Jun 27 08:43:00 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 08:43:50 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <4681FBA2.7000709@rabbit.us> References: <000001c7b862$0f009270$b9021d0a@SalimSi> <4681FBA2.7000709@rabbit.us> Message-ID: <468206F4.4090900@riverviewtech.net> On 6/27/2007 12:54 AM, Peter Rabbitson wrote: > I am actually simply jealous that some people apparently get it to > work in-kernel, and I can't seem to. Ah, so the truth comes out. ;) > My requirements are pretty simple: > o As transparrent as possible DGD, that can detect 2nd and 3rd hop > failures Think about what you just asked for. "Dead Gateway Detection" is used to detect dead (upstream) (default) gateway(s). Rather it is not meant to detect dead routes beyond your gateway(s). To do this you will need some sort of utility to monitor things for you. I.e. you will not be able to get the kernel to detect that a gateway is good for some things but not for others. Actually if you stop to think about it, this is beyond the scope of what the kernel should do. This is more the scope of a routing protocol and / or a route management daemon. In short, use something to test reachability to destinations and use ip rules to choose routing tables accordingly. I.e. have a default routing table that will try to use any / all interfaces routes and have alternative routing tables that will try fewer interfaces / routes. > o Robust load balancing - connections are distributed over all > available links, regardless of source and destination, with the > possibility of assigning relative channel priorities I think this is close to being possible depending on your scenario (NAT or not) and a few other things. It was my understanding that equal cost multi path routing was suppose to accomplish this very thing. I.e. 
if you had globally routable IP addresses behind the router, you could send traffic out either link, hopefully in such a fashion as to (hopefully) fully utilize all links. ECMP does include weight options to assign ratios to routes. However, after discussion in this thread, I question if ECMP will do this or not. > o NAT compatible - link hopping is not an option, traffic with a > specific SRC/DST must stay where it started. I think this is the simpler of the above "robust load balancing" as you say. In my opinion, this should be the first of the things to be achieved and then try to extend this to be the above. What you have proposed with load balancing via Netfilter should be able to achieve this with out any problems. Or at least I would think such. Grant. . . . From rabbit at rabbit.us Wed Jun 27 08:58:48 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Wed Jun 27 08:58:55 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <468206F4.4090900@riverviewtech.net> References: <000001c7b862$0f009270$b9021d0a@SalimSi> <4681FBA2.7000709@rabbit.us> <468206F4.4090900@riverviewtech.net> Message-ID: <46820AA8.7020305@rabbit.us> Grant Taylor wrote: > On 6/27/2007 12:54 AM, Peter Rabbitson wrote: >> I am actually simply jealous that some people apparently get it to >> work in-kernel, and I can't seem to. > > Ah, so the truth comes out. ;) Hehe >> My requirements are pretty simple: >> o As transparrent as possible DGD, that can detect 2nd and 3rd hop >> failures > > Think about what you just asked for. "Dead Gateway Detection" is used > to detect dead (upstream) (default) gateway(s). Rather it is not meant > to detect dead routes beyond your gateway(s). To do this you will need > some sort of utility to monitor things for you. I.e. you will not be > able to get the kernel to detect that a gateway is good for some things > but not for others. Actually if you stop to think about it, this is > beyond the scope of what the kernel should do. 
This is more the scope > of a routing protocol and / or a route management daemon. > > In short, use something to test reachability to destinations and use ip > rules to choose routing tables accordingly. I.e. have a default routing > table that will try to use any / all interfaces routes and have > alternative routing tables that will try fewer interfaces / routes. This is the most fragile part of my current setup. And DGD based on packet counts IMO is an extremely simple thing to do, I discussed it recently with you. If something like this was present in-kernel the world would be a better place. >> o Robust load balancing - connections are distributed over all >> available links, regardless of source and destination, with the >> possibility of assigning relative channel priorities > > I think this is close to being possible depending on your scenario (NAT > or not) and a few other things. > > It was my understanding that equal cost multi path routing was suppose > to accomplish this very thing. I.e. if you had globally routable IP > addresses behind the router, you could send traffic out either link, > hopefully in such a fashion as to (hopefully) fully utilize all links. > ECMP does include weight options to assign ratios to routes. For globally routable addresses it doesn't really matter, because you can not usually detect it (things still work). > What you have proposed with load balancing via Netfilter should be able > to achieve this with out any problems. Or at least I would think such. It actually does work in production for quite some time now. But as said before - it is ugly and fragile. I understand that we are coming from different environments, but I still think that my figure of 90% is rather accurate. If you can afford not to do NAT, means that most likely you also have access to the ISPs dynamic routing protocols as well, and the entire discussion becomes pointless. 
On the contrary if you run NAT, most likely you are a poor-mans-ISP or smaller, running off two consumer DSL links, and all of the above applies. Either way I rest my case here, as we are comparing apples to dinosaurs, and went too far OT :) Peter From gtaylor at riverviewtech.net Wed Jun 27 09:28:12 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 09:29:09 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <46820AA8.7020305@rabbit.us> References: <000001c7b862$0f009270$b9021d0a@SalimSi> <4681FBA2.7000709@rabbit.us> <468206F4.4090900@riverviewtech.net> <46820AA8.7020305@rabbit.us> Message-ID: <4682118C.1090106@riverviewtech.net> On 6/27/2007 1:58 AM, Peter Rabbitson wrote: > And DGD based on packet counts IMO is an extremely simple thing to > do, I discussed it recently with you. (If I recall correctly and / or re-read the appropriate thread correctly.) What you were talking about doing was pinging (of sorts, be it ICMP, testing connections, sending layer 7 traffic, etc.) destinations out side of your upstream gateway. Correct? > If something like this was present in-kernel the world would be a > better place. I agree that if there was a way for the kernel to handle this the world would be a better place. However, I think it silly to expect the kernel to do this. Well let me take a moment to be sure we are thinking the same thing. You want the kernel to be able to realize that one route through a given default gateway is no good for a given destination and use a different default gateway even though the kernel can reach other destinations through the first default gateway? In other words, if the kernel can not reach microsoft.com through ISP1 it should use ISP2 despite the fact that it can reach google.com through ISP1? Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 09:37:32 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 09:38:05 2007 Subject: [LARTC] Load Balance and SNAT problem. 
In-Reply-To: <4681C806.9060703@vsnl.com> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <"4 6 7FD592.5010700"@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <4681C2A2.2010208@riverviewtech.net> <4681C560.50008@vsnl.com> <4681C676.6030407@riverviewtech.net> <4681C806.9060703@vsnl.com> Message-ID: <468213BC.3090706@riverviewtech.net> On 6/26/2007 9:14 PM, Mohan Sundaram wrote: > I remember that route balancing has an option to perform per packet > balancing and not per connection. If that were to work, then route cache > would not be used IMHO. Per packet balancing is normally not done as it > would break connections especially in NAT'ted scenario. To quote the man page for ip, it looks like the balancing is not per packet as you indicate, but rather per-flow. """equalize - allow packet by packet randomization on multipath routes. Without this modifier, the route will be frozen to one selected nexthop, so that load splitting will only occur on per-flow base. equalize only works if the kernel is patched.""" Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 09:53:27 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 09:54:04 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <468216D3.1090809@vsnl.com> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <"4 6 7FD592.5010700"@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <4681C2A2.2010208@riverviewtech.net> <4681C560.50008@vsnl.com> <4681C676.6030407@riverviewtech.net> <4681C806.9060703@vsnl.com> <468213BC.3090706@riverviewtech.net> <468216D3.1090809@vsnl.com> Message-ID: <46821777.1000104@riverviewtech.net> On 6/27/2007 2:50 AM, Mohan Sundaram wrote: > """equalize - allow packet by packet randomization on multipath > routes. 
Without this modifier, the route will be frozen to one > selected nexthop, so that load splitting will only occur on per-flow > base. equalize only works if the kernel is patched.""" I think we both pasted the same quote. If you do use the "equalize" keyword, you do get a packet by packet / per-packet effect. Where as if you do not use the "equalize" keyword, you get a per-flow effect, which is what I was trying to state is the apparent default. Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 09:57:36 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 09:58:13 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <46821767.2060302@vsnl.com> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <"4 6 7FD592.5010700"@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <4681C2A2.2010208@riverviewtech.net> <4681C560.50008@vsnl.com> <4681C676.6030407@riverviewtech.net> <4681C806.9060703@vsnl.com> <468213BC.3090706@riverviewtech.net> <46821767.2060302@vsnl.com> Message-ID: <46821870.4000807@riverviewtech.net> On 6/27/2007 2:53 AM, Mohan Sundaram wrote: > Pardon my earlier mail. *nod* Pardon my reply. ;) > This says if equalize patch/keyword is used, packet randomisation > happens. Exactly what we want, is it not? (Referring back to your earlier message...) Yes, I think this is what we want in this scenario. Grant. . . . From rabbit at rabbit.us Wed Jun 27 10:03:01 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Wed Jun 27 10:03:09 2007 Subject: [LARTC] Load Balance and SNAT problem. 
In-Reply-To: <4682118C.1090106@riverviewtech.net> References: <000001c7b862$0f009270$b9021d0a@SalimSi> <4681FBA2.7000709@rabbit.us> <468206F4.4090900@riverviewtech.net> <46820AA8.7020305@rabbit.us> <4682118C.1090106@riverviewtech.net> Message-ID: <468219B5.9060407@rabbit.us> Grant Taylor wrote: > Well let me take a moment to be sure we are thinking the same thing. You > want the kernel to be able to realize that one route through a given > default gateway is no good for a given destination and use a different > default gateway even though the kernel can reach other destinations > through the first default gateway? In other words, if the kernel can > not reach microsoft.com through ISP1 it should use ISP2 despite the fact > that it can reach google.com through ISP1? > No, nothing like this. Basically my idea is that a no-packet-seen timer is maintained for every gateway, excluding any packets with a source within the ISPs netblock. This will reliably detect that no traffic is seen beyond the ISP, and therefore pronounce the gateway dead. The only configuration required from the administrator would be an address/netmask pair for every gateway, to use as an exclusion for the counters, and a no-packets-seen timeout, before a gateway is marked as dead. Any incoming activity on the gateway will immediately change its status back to active. So to answer your exact question - I want the kernel to be able to realize that a gateway is no good for any destinations other than the specified netblock. Peter From gtaylor at riverviewtech.net Wed Jun 27 10:03:20 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 10:04:05 2007 Subject: [LARTC] Load Balance and SNAT problem. 
In-Reply-To: <468218CF.8090405@vsnl.com> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <"4 6 7FD592.5010700"@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <4681C2A2.2010208@riverviewtech.net> <4681C560.50008@vsnl.com> <4681C676.6030407@riverviewtech.net> <4681C806.9060703@vsnl.com> <468213BC.3090706@riverviewtech.net> <468216D3.1090809@vsnl.com> <46821777.1000104@riverviewtech.net> <468218CF.8090405@vsnl.com> Message-ID: <468219C8.5080708@riverviewtech.net> On 6/27/2007 2:59 AM, Mohan Sundaram wrote: > I think that default makes sense. If we want pkt based balancing, we > enable it explicitly. Agreed. We / people just have to be aware that is what it does so that they don't have false expectations. Of course, this is a fairly common problem in unix. Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 10:11:34 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 10:12:12 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <468219B5.9060407@rabbit.us> References: <000001c7b862$0f009270$b9021d0a@SalimSi> <4681FBA2.7000709@rabbit.us> <468206F4.4090900@riverviewtech.net> <46820AA8.7020305@rabbit.us> <4682118C.1090106@riverviewtech.net> <468219B5.9060407@rabbit.us> Message-ID: <46821BB6.6000800@riverviewtech.net> On 6/27/2007 3:03 AM, Peter Rabbitson wrote: > I want the kernel to be able to realize that a gateway is no good for > any destinations other than the specified netblock. Would it be fair to say that you are wanting an administratively configurable "ignore addresses that fall with in this " while deciding if a gateway is dead? Obviously would need to be a bit more than just an ip / netmask combination to make this realistic. 
If this is what you are wanting, it may be possible to augment the kernel code that is used to detect dead gateways and have it check to see if the networks match a list (from somewhere in proc / sysfs / sysctl?) and not increment traffic counters. I am presuming that it is the traffic counters that have to be incremented for the kernel to think that a route is still alive. So, if you purposfully did not increment the counters, you could probably detect that a given gateway is no good. I think you would have to add an additional route that was to the given network(s) that did not use such a feature to provide a way for the routing code to route to those network(s) that it no longer would get to via a default gateway. What do you think? Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 10:24:36 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 10:25:20 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <46821E2D.3010704@vsnl.com> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <"4 6 7FD592.5010700"@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <4681C2A2.2010208@riverviewtech.net> <4681C560.50008@vsnl.com> <4681C676.6030407@riverviewtech.net> <4681C806.9060703@vsnl.com> <468213BC.3090706@riverviewtech.net> <468216D3.1090809@vsnl.com> <46821777.1000104@riverviewtech.net> <468218CF.8090405@vsnl.com> <468219C8.5080708@riverviewtech.net> <46821E2D.3010704@vsnl.com> Message-ID: <46821EC4.2030802@riverviewtech.net> On 6/27/2007 3:22 AM, Mohan Sundaram wrote: > *A word of caution*. My connections went awry more due to out of > order delivery of packets and I had a hell of a time troubleshooting > it as the problem did not appear consistently,:-(. Did not know where > in the whole chain I has the problem. Is like the MTU problem in > PPTP. 
*nod* This is a warning that you see a LOT of places when you start talking about per packet verses per flow load balancing. Cisco is VERY big in to giving this warning. Despite being aware of this problem, I have yet (knock on wood) to run in to this problem my self. Grant. . . . From gtaylor at riverviewtech.net Wed Jun 27 10:26:45 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 10:27:32 2007 Subject: [LARTC] Load Balance and SNAT problem. In-Reply-To: <46821EC4.2030802@riverviewtech.net> References: <7e47206b0706242007q487365d3gb7c12658b9669edd@mail.gmail.com> <"4 6 7FD592.5010700"@riverviewtech.net> <4680B634.9060802@rabbit.us> <468124A6.8000101@riverviewtech.net> <46815078.5060705@rabbit.us> <4681C2A2.2010208@riverviewtech.net> <4681C560.50008@vsnl.com> <4681C676.6030407@riverviewtech.net> <4681C806.9060703@vsnl.com> <468213BC.3090706@riverviewtech.net> <468216D3.1090809@vsnl.com> <46821777.1000104@riverviewtech.net> <468218CF.8090405@vsnl.com> <468219C8.5080708@riverviewtech.net> <46821E2D.3010704@vsnl.com> <46821EC4.2030802@riverviewtech.net> Message-ID: <46821F45.9010308@riverviewtech.net> On 6/27/2007 3:24 AM, Grant Taylor wrote: > This is a warning that you see a LOT of places when you start talking > about per packet verses per flow load balancing. Cisco is VERY big in > to giving this warning. I wonder how much of packet out of order problem would happen with two parallel links verses two asymmetric routes through the internet core where one packet will take 27 hop route while the other will take a 37 hop route. Grant. . . . From rabbit at rabbit.us Wed Jun 27 11:09:21 2007 From: rabbit at rabbit.us (Peter Rabbitson) Date: Wed Jun 27 11:09:28 2007 Subject: [LARTC] Load Balance and SNAT problem. 
In-Reply-To: <46821BB6.6000800@riverviewtech.net> References: <000001c7b862$0f009270$b9021d0a@SalimSi> <4681FBA2.7000709@rabbit.us> <468206F4.4090900@riverviewtech.net> <46820AA8.7020305@rabbit.us> <4682118C.1090106@riverviewtech.net> <468219B5.9060407@rabbit.us> <46821BB6.6000800@riverviewtech.net> Message-ID: <46822941.7020604@rabbit.us> Grant Taylor wrote: > On 6/27/2007 3:03 AM, Peter Rabbitson wrote: >> I want the kernel to be able to realize that a gateway is no good for >> any destinations other than the specified netblock. > > Would it be fair to say that you are wanting an administratively > configurable "ignore addresses that fall with in this " while > deciding if a gateway is dead? > > Obviously would need to be a bit more than just an ip / > netmask combination to make this realistic. > > If this is what you are wanting, it may be possible to augment the > kernel code that is used to detect dead gateways and have it check to > see if the networks match a list (from somewhere in proc / sysfs / > sysctl?) and not increment traffic counters. I am presuming that it is > the traffic counters that have to be incremented for the kernel to think > that a route is still alive. So, if you purposfully did not increment > the counters, you could probably detect that a given gateway is no good. Something along these lines, yes. Except that instead of a packet-counter there is a resettable timer, that gets reset anytime a matching packet comes in. When the timer goes over a specified limit - gateway is dead. > I think you would have to add an additional route that was to the given > network(s) that did not use such a feature to provide a way for the > routing code to route to those network(s) that it no longer would get to > via a default gateway. > This would be a manual task for the administrator, there is no place for this in-kernel. 
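The resettable-timer idea described above can be sketched in user space. This is only a model of the logic, assuming one timer per gateway, one excluded ISP netblock, and explicit timestamps; nothing like it is claimed to exist in the kernel, and the names `GatewayMonitor`, `packet_seen` and `is_dead`, the example addresses, and the 30-second timeout are all hypothetical.

```python
import ipaddress


class GatewayMonitor:
    """Per-gateway no-packet-seen timer (hypothetical user-space model)."""

    def __init__(self, isp_netblock, timeout, now=0.0):
        # Packets whose source lies inside the ISP's own netblock are ignored.
        self.isp_netblock = ipaddress.ip_network(isp_netblock)
        self.timeout = timeout
        self.last_seen = now  # the timer starts when the monitor is created

    def packet_seen(self, src, now):
        """Record an incoming packet; return True if it reset the timer."""
        if ipaddress.ip_address(src) in self.isp_netblock:
            return False
        self.last_seen = now
        return True

    def is_dead(self, now):
        """The gateway is dead once the timer exceeds the timeout."""
        return (now - self.last_seen) > self.timeout


gw = GatewayMonitor("192.0.2.0/24", timeout=30.0, now=0.0)
gw.packet_seen("192.0.2.1", now=25.0)     # from the ISP netblock: ignored
print(gw.is_dead(now=31.0))               # True
gw.packet_seen("198.51.100.7", now=32.0)  # from beyond the ISP: revives it
print(gw.is_dead(now=40.0))               # False
```

Keeping only a last-seen timestamp, rather than a running countdown, means any qualifying packet marks the gateway active again immediately, which matches the behaviour asked for in the thread.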
From hijacker at oldum.net Wed Jun 27 11:28:10 2007 From: hijacker at oldum.net (Nikolay Kichukov) Date: Wed Jun 27 11:29:26 2007 Subject: [LARTC] Why does scp stall on low bandwidth connections? In-Reply-To: <468185E7.1030709@andyfurniss.entadsl.com> References: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com> <467EFD97.6050101@andyfurniss.entadsl.com> <467FA144.4030805@oldum.net> <20070625132012.vbf3t65gggsk484c@webmail.netshadow.at> <467FA95A.9060009@oldum.net> <468185E7.1030709@andyfurniss.entadsl.com> Message-ID: <46822DAA.70109@oldum.net> Hello Andy, unshaped here means with higher priority than the rest of the classes that have filters attached to them? So if an arp packet is sent at the same time an ip packet is sent, the arp packet will go first? And only then the ip packet will be matched by the filters? Regards, -Nikolay Andy Furniss wrote: > Nikolay Kichukov wrote: >> Hello Andreas, >> and arp is not ip ... thanks for clarification. >> >> Where(in which class) would all non-ip traffic go in the filter scenario? > > In the case of htb unclassified go unshaped without a default class set > (=default 0) you do get a counter - > > andy@noki:~$ /sbin/tc -s qdisc ls dev eth3 > qdisc htb 1: r2q 10 default 0 direct_packets_stat 3223 > > In the case of HFSC unclassified get dropped - so you really need a > default class, but not one that gets low prio IP sent to it :-) > > Andy. From gtaylor at riverviewtech.net Wed Jun 27 12:19:38 2007 From: gtaylor at riverviewtech.net (Grant Taylor) Date: Wed Jun 27 12:19:58 2007 Subject: [LARTC] Load Balance and SNAT problem. 
In-Reply-To: <46822941.7020604@rabbit.us> References: <000001c7b862$0f009270$b9021d0a@SalimSi> <4681FBA2.7000709@rabbit.us> <468206F4.4090900@riverviewtech.net> <46820AA8.7020305@rabbit.us> <4682118C.1090106@riverviewtech.net> <468219B5.9060407@rabbit.us> <46821BB6.6000800@riverviewtech.net> <46822941.7020604@rabbit.us> Message-ID: <468239BA.7030406@riverviewtech.net> On 6/27/2007 4:09 AM, Peter Rabbitson wrote: > Something along these lines, yes. Except that instead of a > packet-counter there is a resettable timer, that gets reset anytime a > matching packet comes in. When the timer goes over a specified limit - > gateway is dead. I think this is usually called / treated as a (time until) "Dead (Counter) Time" as in the timer counts down and as soon as it hits zero, the item is considered dead. Any time something passes through and refreshes it, the time to live is placed in the (time until) "Dead (Counter) Timer". > This would be a manual task for the administrator, there is no place for > this in-kernel. Agreed. I will state that I think you are asking for a bit much, but you are free to ask for what ever you want to, or are willing to code your self. ;) Grant. . . . From hannemann at i4.informatik.rwth-aachen.de Wed Jun 27 12:20:31 2007 From: hannemann at i4.informatik.rwth-aachen.de (Arnd Hannemann) Date: Wed Jun 27 12:26:21 2007 Subject: [LARTC] RED to use ECN (or work at all?) In-Reply-To: <468188BA.9020901@andyfurniss.entadsl.com> References: <468188BA.9020901@andyfurniss.entadsl.com> Message-ID: <468239EF.8000300@i4.informatik.rwth-aachen.de> Andy Furniss schrieb: > Daniel Schaffrath wrote: >> Dear Community, >> >> sorry for the somewhat dumb question. Maybe someone has any >> pointer to how to setup the RED queue to mark pakets with ECN. In >> particular what are appropriate parameter settings for limit, >> min, max, etc. > > I have never really played with red - but I wouldn't start by > putting it on lo - I don't think you will get a backlog. 
> Even on root of an eth there will be quite big buffer to fill before it
> gets any backlog.

Could you explain this a bit more in detail, why does it not work on
root of a device? I tried it with various configurations and indeed
it does not work. Even if the incoming interface is much faster than
the outgoing interface I can't get the red queue to drop or mark
packets. Packets are always dropped somewhere else?

>
> You are best using it as a child of something that rate limits
> tbf/hfsc/htb. There is a rate parameter which I guess you should
> use - but it's not to actually do the rate limiting.

Thanks for the hint! It seems to work that way, I used this:

tc qdisc add dev wifi0 root handle 1 tbf rate 36mbit burst 5kb latency 100ms peakrate 54mbit minburst 1540
tc qdisc add dev wifi0 parent 1: red limit 10000 min 2000 max 5000 avpkt 1000 burst 2 probability .2 ecn

Then suddenly I get marked packets:

qdisc red 8003: dev wifi0 parent 1: limit 10000b min 2000b max 5000b ecn
 Sent 5913561 bytes 3938 pkt (dropped 0, overlimits 6 requeues 2611)
 rate 0bit 0pps backlog 0b 0p requeues 2611
  marked 6 early 0 pdrop 0 other 0

But question remains: Why does it not work with root qdisc? And 6
packets are still a bit few? Ingoing interface is 100 mbit, outgoing
link on wifi0 is about 5 mbit...

> As for ECN, it's off by default on Linux, so I don't think it will
> be much use unless you have control over the network. I just turned
> it on on two of my PCs -
>
> echo 1 > /proc/sys/net/ipv4/tcp_ecn
>
> and did a test with red and it does work and save some drops.

Could you show your configuration or is it too complex?
How many packets got marked?

Thank you.
Best regards, Arnd From sunnyboyfrank at web.de Wed Jun 27 13:42:25 2007 From: sunnyboyfrank at web.de (Frank Remetter) Date: Wed Jun 27 13:41:36 2007 Subject: [LARTC] network simulator In-Reply-To: <4681A80E.40206@andrei.myip.org> References: <4681A429.9090605@andrei.myip.org> <4681A80E.40206@andrei.myip.org> Message-ID: <20070627134225.081af0a3@ocean.remetter.homelinux.org> Hey, > > I want to build a "network simulator", to create scenarios such as > > delayed packets, lost packets, low bandwidth, or combinations of > > such. little offtopic here, but FreeBSD provides DummyNet, which can do what you want. Regards -- Frank Remetter http://www.remetter.de/ GPG-FP: 2B07 B7D8 5C27 AB94 7A37 8B0B DEBE DD89 D68B 7BE6 From lartc at manchotnetworks.net Wed Jun 27 15:47:07 2007 From: lartc at manchotnetworks.net (lartc) Date: Wed Jun 27 15:47:20 2007 Subject: [LARTC] network simulator In-Reply-To: <20070627134225.081af0a3@ocean.remetter.homelinux.org> References: <4681A429.9090605@andrei.myip.org> <4681A80E.40206@andrei.myip.org> <20070627134225.081af0a3@ocean.remetter.homelinux.org> Message-ID: <1182952027.15441.0.camel@sumatra.radius.fr> http://linux-net.osdl.org/index.php/Netem as well ... charles On Wed, 2007-06-27 at 13:42 +0200, Frank Remetter wrote: > Hey, > > > > I want to build a "network simulator", to create scenarios such as > > > delayed packets, lost packets, low bandwidth, or combinations of > > > such. > > little offtopic here, but FreeBSD provides DummyNet, which can do what > you want. > > Regards -- "simplified chinese" is not nearly as easy as they would have you believe ... a superlative oxymoron" --anonymous From rmartija at telcordia.com Wed Jun 27 17:58:52 2007 From: rmartija at telcordia.com (Martija, Ricardo V) Date: Wed Jun 27 17:59:22 2007 Subject: [LARTC] Deleting a tc filter rule Message-ID: Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... 

From marco.casaroli at gmail.com Wed Jun 27 18:04:01 2007
From: marco.casaroli at gmail.com (Marco Aurelio)
Date: Wed Jun 27 18:04:06 2007
Subject: [LARTC] Deleting a tc filter rule
In-Reply-To:
References:
Message-ID: <92ed523b0706270904saf4dec6p38b8ca9609f7101e@mail.gmail.com>

On 6/27/07, Martija, Ricardo V wrote:
>
> Hi,
>
> I am very new to tc. I added a filter using the following command:
>
> tc filter add dev eth0 V parent 20:0 protocol ip prio 1 handle ::128 u32
> match ip tos 0x44 0xfc flowid 20:1
>

tc filter add dev eth0 V parent 20:0 protocol ip pref 1234 prio 1 handle ::128 u32 match ip tos 0x44 0xfc flowid 20:1

>
> To check if the filter rule was indeed added, I run
>
> tc filter show dev eth0 parent 20:
>
> This gave me the following output:
>
> filter protocol ip pref 1 u32
> filter protocol ip pref 1 u32 fh 800: ht divisor 1
> filter protocol ip pref 1 u32 fh 800::128 order 296 key ht 800 bkt 0 flowid 20:1
>   match 00440000/00fc0000 at 0
>
> I tried deleting the filter rule that I added using:
>
> tc filter del dev eth0 pref 1 protocol ip handle 800::160
>

tc filter del dev eth0 pref 1234

>
> This gave me the following message:
>
> Must specify filter type when using "handle"
>
> I modified the delete command, as follows:
>
> tc filter del dev eth0 pref 1 protocol ip handle 800::160 u32
>
> This gave the following error message:
>
> RTNETLINK answers: Invalid argument
>
> I am pretty much stumped. Can anyone tell me how I can delete a tc filter
> rule?
>
> Thanks,
>
> Rick
> _______________________________________________
> LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>

--
Marco Casaroli
SapucaiNet Telecom
+55 35 34712377 ext 5

From ghartung at photobucket.com Wed Jun 27 18:29:08 2007
From: ghartung at photobucket.com (Greg Hartung)
Date: Wed Jun 27 18:29:14 2007
Subject: [LARTC] GRE tunnel
In-Reply-To:
Message-ID:

Finally, a hint of light:

The first is a tcpdump while pinging the remote end, 66.1.2.161, and it
looks normal:

10:12:10.441842 > 00:19:b9:dd:ff:d9 ip 100: IP 66.1.1.161 > 66.1.2.161: icmp 64: echo request seq 1
10:12:10.442344 < 00:01:e8:0f:ee:f8 ip 100: IP 66.1.2.161 > 66.1.1.161: icmp 64: echo reply seq 1

This next is a ping of the remote tunnel end, 10.253.253.2

10:12:18.970786 > 00:19:b9:dd:ff:d9 arp 44: arp who-has 66.1.2.161 tell 66.1.1.161

I am *very* confused by this. Somehow, when I try to send traffic thru the
tunnel, it thinks that the remote physical end is directly attached and
should ARP for it even tho it is pingable?!?!!? It is definitely not on-net
- it is many hops away - but it is reachable via a default route.
Routing table before the tunnel is configured:

[root@den1tun01 ~]# netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
66.1.1.128      0.0.0.0         255.255.255.192 U         0 0          0 eth0.2
10.1.2.0        0.0.0.0         255.255.254.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0.2
10.0.0.0        10.1.2.254      255.0.0.0       UG        0 0          0 eth0
0.0.0.0         66.11.51.129    0.0.0.0         UG        0 0          0 eth0.2
[root@den1tun01 ~]#

And while it's configured:

[root@den1tun01 ~]# netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
66.1.1.128      0.0.0.0         255.255.255.192 U         0 0          0 eth0.2
10.253.253.0    0.0.0.0         255.255.255.0   U         0 0          0 gretun
10.1.2.0        0.0.0.0         255.255.254.0   U         0 0          0 eth0
10.50.0.0       0.0.0.0         255.255.0.0     U         0 0          0 gretun
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0.2
10.0.0.0        10.1.2.254      255.0.0.0       UG        0 0          0 eth0
0.0.0.0         66.11.51.129    0.0.0.0         UG        0 0          0 eth0.2

On 6/26/07 5:01 PM, "Greg Hartung" wrote:

>
> I'm still stuck on this one and could really use some help. I just
> finished trying it on an FC3 box too to make sure it wasn't CentOS specific
> issue but there's still no output from tcpdump.
>
> I also spent some time looking over Cisco examples to make sure I wasn't
> misremembering the concepts. No surprises there.
>
> Does anyone have any ideas or can someone suggest a more appropriate
> forum for the question?
>
> Thanks!!
>
> On 6/21/07 11:52 AM, "Greg Hartung" wrote:
>
>>
>> I am trying to setup GRE between two CentOS 4.5 boxes. I have tried
>> several variations of what's listed below, but none of them work.
>>
>> box1:
>> modprobe ip_gre
>> ip link set gre0 up
>> ip tunnel add gretun mode gre local 66.1.1.161 remote 66.1.2.161 ttl 20 dev eth0
>> ip addr add dev gretun 10.253.253.1 peer 10.253.253.2/24
>> ip link set dev gretun up
>> ip route add 10.2.0.0/16 via 10.253.253.2
>>
>> box2:
>> modprobe ip_gre
>> ip link set gre0 up
>> ip tunnel add gretun mode gre local 66.1.2.161 remote 66.1.1.161 ttl 20 dev eth0
>> ip addr add dev gretun 10.253.253.2 peer 10.253.253.1/24
>> ip link set dev gretun up
>> ip route add 10.1.0.0/16 via 10.253.253.1
>>
>> tcpdump shows NO rx or tx traffic from either box that isn't ARP or SSH.
>>
>> It's as if it's not even trying to bring the tunnel up. I'm a Cisco guy,
>> so I'm lost with my show commands.
>>
>> The other variations I've tried consist mostly of trying different
>> combinations of on-net (in the same subnet as eth0 and even the same address
>> as eth0) and off-net (various combinations of loopback /24 and /32 addresses
>> in separate 10 space) on the 'ip addr add dev gretun' statements. But the
>> above example is what *should* work on a Cisco, I think. It's been a
>> while.
>>
>> How do I troubleshoot this?
>> This is all I've got so far:
>>
>> root@den1tun01:/home/root $ ip link
>> 1: lo: mtu 16436 qdisc noqueue
>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>> 2: eth0: mtu 8800 qdisc pfifo_fast qlen 1000
>>     link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
>> 3: eth0.2: mtu 8800 qdisc noqueue
>>     link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
>> 4: gre0: mtu 1476 qdisc noqueue
>>     link/gre 0.0.0.0 brd 0.0.0.0
>> 5: gretun@eth0: mtu 8776 qdisc noqueue
>>     link/gre 66.1.1.161 peer 66.1.2.161
>>
>> root@den1tun01:/home/root $ ip tun
>> gre0: gre/ip remote any local any ttl inherit nopmtudisc
>> gretun: gre/ip remote 66.1.2.161 local 66.1.1.161 dev eth0 ttl 20
>>
>> root@den1tun01:/home/root $ ifconfig
>> eth0      Link encap:Ethernet  HWaddr 00:19:B9:DD:FF:D9
>>           inet addr:10.1.2.243  Bcast:10.1.3.255  Mask:255.255.254.0
>>           UP BROADCAST RUNNING MULTICAST  MTU:8800  Metric:1
>>           RX packets:3357 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:484 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:230757 (225.3 KiB)  TX bytes:63937 (62.4 KiB)
>>           Interrupt:169 Memory:f8000000-f8011100
>>
>> eth0.2    Link encap:Ethernet  HWaddr 00:19:B9:DD:FF:D9
>>           inet addr:66.1.1.161  Bcast:66.1.1.191  Mask:255.255.255.192
>>           UP BROADCAST RUNNING MULTICAST  MTU:8800  Metric:1
>>           RX packets:950 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:43860 (42.8 KiB)  TX bytes:1200 (1.1 KiB)
>>
>> gretun    Link encap:UNSPEC  HWaddr 42-0B-33-A1-FF-C0-00-00-00-00-00-00-00-00-00-00
>>           inet addr:10.253.253.1  P-t-P:10.253.253.2  Mask:255.255.255.0
>>           UP POINTOPOINT RUNNING NOARP  MTU:8776  Metric:1
>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:0 (0.0 b)  TX bytes:756 (756.0 b)
>>
>> gre0      Link encap:UNSPEC  HWaddr 00-00-00-00-FF-00-00-00-00-00-00-00-00-00-00-00
>>           UP RUNNING NOARP  MTU:1476  Metric:1
>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>>
>> lo        Link encap:Local Loopback
>>           inet addr:127.0.0.1  Mask:255.0.0.0
>>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>>           RX packets:225 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:225 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:13271 (12.9 KiB)  TX bytes:13271 (12.9 KiB)
>>
>>
>> I've also tried changing the destination for the route to the near end of
>> the private subnet and tried pinging various things on the tunnel subnet and
>> remote network to create "interesting traffic" to bring the tunnel up but
>> tcpdump still shows nothing.
>>
>> Then I noticed that ping does show an error count:
>>
>> [root@den1tun01 ~]# ping 10.253.253.2
>> PING 10.253.253.2 (10.253.253.2) 56(84) bytes of data.
>> From 10.253.253.1 icmp_seq=0 Destination Host Unreachable
>> From 10.253.253.1 icmp_seq=1 Destination Host Unreachable
>>
>> --- 10.253.253.2 ping statistics ---
>> 2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms, pipe 2
>>
>> I can ping the local end: 10.253.253.1, but the tunnel is still
>> non-functional.
>>
>> Thanks!
>> Greg

From markdv.lartc at asphyx.net Wed Jun 27 22:54:04 2007
From: markdv.lartc at asphyx.net (mark)
Date: Wed Jun 27 22:54:13 2007
Subject: [LARTC] GRE tunnel
In-Reply-To:
References:
Message-ID: <1182977644.3731.5.camel@velocity.nl.tiscali.com>

On Wed, 2007-06-27 at 10:29 -0600, Greg Hartung wrote:
> Finally, a hint of light:
>
> The first is a tcpdump while pinging the remote end, 66.1.2.161, and it
> looks normal:
>
> 10:12:10.441842 > 00:19:b9:dd:ff:d9 ip 100: IP 66.1.1.161 > 66.1.2.161: icmp 64: echo request seq 1
> 10:12:10.442344 < 00:01:e8:0f:ee:f8 ip 100: IP 66.1.2.161 > 66.1.1.161: icmp 64: echo reply seq 1
>
> This next is a ping of the remote tunnel end, 10.253.253.2
>
> 10:12:18.970786 > 00:19:b9:dd:ff:d9 arp 44: arp who-has 66.1.2.161 tell 66.1.1.161
>
> I am *very* confused by this. Somehow, when I try to send traffic thru the
> tunnel, it thinks that the remote physical end is directly attached and
> should ARP for it even tho it is pingable?!?!!? It is definitely not on-net
> - it is many hops away - but it is reachable via a default route.

Hmmm... interesting. What does "ip ro get 66.1.2.161" say?
And for 10.253.253.2?

Regards,
Mark.
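The question above is which route gets chosen for each of the two addresses. As a rough model only, the sketch below applies plain longest-prefix matching to the main table Greg posted for the configured tunnel. It assumes no policy rules and no route cache, and it ignores the fact that the tunnel was created with `dev eth0`; `ROUTES` and `lookup` are hypothetical names, not an iproute2 interface, and the real answer is whatever `ip ro get` prints on the box.

```python
import ipaddress

# Main table as posted (netstat -nr, tunnel configured), rewritten in CIDR form.
# Each entry is (prefix, gateway or None for directly connected, device).
ROUTES = [
    ("66.1.1.128/26", None, "eth0.2"),
    ("10.253.253.0/24", None, "gretun"),
    ("10.1.2.0/23", None, "eth0"),
    ("10.50.0.0/16", None, "gretun"),
    ("169.254.0.0/16", None, "eth0.2"),
    ("10.0.0.0/8", "10.1.2.254", "eth0"),
    ("0.0.0.0/0", "66.11.51.129", "eth0.2"),
]


def lookup(dst):
    """Return (prefix, gateway, device) of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in ROUTES if addr in ipaddress.ip_network(r[0])]
    # The default route always matches, so `matches` is never empty here.
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


for dst in ("66.1.2.161", "10.253.253.2"):
    prefix, gateway, dev = lookup(dst)
    print(dst, "->", prefix, "via", gateway, "dev", dev)
```

Under this model 66.1.2.161 matches only the default route through 66.11.51.129 on eth0.2, and 10.253.253.2 matches the connected /24 on gretun. An ARP request for 66.1.2.161 therefore suggests the real lookup for the encapsulated packet takes a different path than the plain table implies; the thread so far does not establish why.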
> Routing table before the tunnel is configured:
>
> [root@den1tun01 ~]# netstat -nr
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
> 66.1.1.128      0.0.0.0         255.255.255.192 U         0 0          0 eth0.2
> 10.1.2.0        0.0.0.0         255.255.254.0   U         0 0          0 eth0
> 169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0.2
> 10.0.0.0        10.1.2.254      255.0.0.0       UG        0 0          0 eth0
> 0.0.0.0         66.11.51.129    0.0.0.0         UG        0 0          0 eth0.2
> [root@den1tun01 ~]#
>
> And while it's configured:
>
> [root@den1tun01 ~]# netstat -nr
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
> 66.1.1.128      0.0.0.0         255.255.255.192 U         0 0          0 eth0.2
> 10.253.253.0    0.0.0.0         255.255.255.0   U         0 0          0 gretun
> 10.1.2.0        0.0.0.0         255.255.254.0   U         0 0          0 eth0
> 10.50.0.0       0.0.0.0         255.255.0.0     U         0 0          0 gretun
> 169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0.2
> 10.0.0.0        10.1.2.254      255.0.0.0       UG        0 0          0 eth0
> 0.0.0.0         66.11.51.129    0.0.0.0         UG        0 0          0 eth0.2
>
>
>
> On 6/26/07 5:01 PM, "Greg Hartung" wrote:
>
> >
> > I'm still stuck on this one and could really use some help. I just
> > finished trying it on an FC3 box too to make sure it wasn't CentOS specific
> > issue but there's still no output from tcpdump.
> >
> > I also spent some time looking over Cisco examples to make sure I wasn't
> > misremembering the concepts. No surprises there.
> >
> > Does anyone have any ideas or can someone suggest a more appropriate
> > forum for the question?
> >
> > Thanks!!
> >
> > On 6/21/07 11:52 AM, "Greg Hartung" wrote:
> >
> >>
> >> I am trying to setup GRE between two CentOS 4.5 boxes. I have tried
> >> several variations of what's listed below, but none of them work.
> >>
> >> box1:
> >> modprobe ip_gre
> >> ip link set gre0 up
> >> ip tunnel add gretun mode gre local 66.1.1.161 remote 66.1.2.161 ttl 20 dev eth0
> >> ip addr add dev gretun 10.253.253.1 peer 10.253.253.2/24
> >> ip link set dev gretun up
> >> ip route add 10.2.0.0/16 via 10.253.253.2
> >>
> >> box2:
> >> modprobe ip_gre
> >> ip link set gre0 up
> >> ip tunnel add gretun mode gre local 66.1.2.161 remote 66.1.1.161 ttl 20 dev eth0
> >> ip addr add dev gretun 10.253.253.2 peer 10.253.253.1/24
> >> ip link set dev gretun up
> >> ip route add 10.1.0.0/16 via 10.253.253.1
> >>
> >> tcpdump shows NO rx or tx traffic from either box that isn't ARP or SSH.
> >>
> >> It's as if it's not even trying to bring the tunnel up. I'm a Cisco guy,
> >> so I'm lost with my show commands.
> >>
> >> The other variations I've tried consist mostly of trying different
> >> combinations of on-net (in the same subnet as eth0 and even the same address
> >> as eth0) and off-net (various combinations of loopback /24 and /32 addresses
> >> in separate 10 space) on the 'ip addr add dev gretun' statements. But the
> >> above example is what *should* work on a Cisco, I think. It's been a
> >> while.
> >>
> >> How do I troubleshoot this?
This is all I've got so far:
> >>
> >> root@den1tun01:/home/root $ ip link
> >> 1: lo: mtu 16436 qdisc noqueue
> >>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >> 2: eth0: mtu 8800 qdisc pfifo_fast qlen 1000
> >>     link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
> >> 3: eth0.2: mtu 8800 qdisc noqueue
> >>     link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
> >> 4: gre0: mtu 1476 qdisc noqueue
> >>     link/gre 0.0.0.0 brd 0.0.0.0
> >> 5: gretun@eth0: mtu 8776 qdisc noqueue
> >>     link/gre 66.1.1.161 peer 66.1.2.161
> >>
> >> root@den1tun01:/home/root $ ip tun
> >> gre0: gre/ip remote any local any ttl inherit nopmtudisc
> >> gretun: gre/ip remote 66.1.2.161 local 66.1.1.161 dev eth0 ttl 20
> >>
> >> root@den1tun01:/home/root $ ifconfig
> >> eth0      Link encap:Ethernet  HWaddr 00:19:B9:DD:FF:D9
> >>           inet addr:10.1.2.243  Bcast:10.1.3.255  Mask:255.255.254.0
> >>           UP BROADCAST RUNNING MULTICAST  MTU:8800  Metric:1
> >>           RX packets:3357 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:484 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:1000
> >>           RX bytes:230757 (225.3 KiB)  TX bytes:63937 (62.4 KiB)
> >>           Interrupt:169 Memory:f8000000-f8011100
> >>
> >> eth0.2    Link encap:Ethernet  HWaddr 00:19:B9:DD:FF:D9
> >>           inet addr:66.1.1.161  Bcast:66.1.1.191  Mask:255.255.255.192
> >>           UP BROADCAST RUNNING MULTICAST  MTU:8800  Metric:1
> >>           RX packets:950 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:0
> >>           RX bytes:43860 (42.8 KiB)  TX bytes:1200 (1.1 KiB)
> >>
> >> gretun    Link encap:UNSPEC  HWaddr
> >> 42-0B-33-A1-FF-C0-00-00-00-00-00-00-00-00-00-00
> >>           inet addr:10.253.253.1  P-t-P:10.253.253.2  Mask:255.255.255.0
> >>           UP POINTOPOINT RUNNING NOARP  MTU:8776  Metric:1
> >>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:0
> >>           RX bytes:0 (0.0 b)  TX bytes:756 (756.0 b)
> >>
> >> gre0      Link encap:UNSPEC
HWaddr
> >> 00-00-00-00-FF-00-00-00-00-00-00-00-00-00-00-00
> >>           UP RUNNING NOARP  MTU:1476  Metric:1
> >>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:0
> >>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
> >>
> >> lo        Link encap:Local Loopback
> >>           inet addr:127.0.0.1  Mask:255.0.0.0
> >>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >>           RX packets:225 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:225 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:0
> >>           RX bytes:13271 (12.9 KiB)  TX bytes:13271 (12.9 KiB)
> >>
> >>
> >> I've also tried changing the destination for the route to the near end of
> >> the private subnet and tried pinging various things on the tunnel subnet and
> >> remote network to create "interesting traffic" to bring the tunnel up but
> >> tcpdump still shows nothing.
> >>
> >> Then I noticed that ping does show an error count:
> >>
> >> [root@den1tun01 ~]# ping 10.253.253.2
> >> PING 10.253.253.2 (10.253.253.2) 56(84) bytes of data.
> >> From 10.253.253.1 icmp_seq=0 Destination Host Unreachable
> >> From 10.253.253.1 icmp_seq=1 Destination Host Unreachable
> >>
> >> --- 10.253.253.2 ping statistics ---
> >> 2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms, pipe 2
> >>
> >> I can ping the local end: 10.253.253.1, but the tunnel is still
> >> non-functional.
> >>
> >> Thanks!
> >> Greg

From lists at andyfurniss.entadsl.com Thu Jun 28 00:28:55 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Thu Jun 28 00:28:57 2007
Subject: [LARTC] Why does scp stall on low bandwidth connections?
In-Reply-To: <46822DAA.70109@oldum.net>
References: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com>
 <467EFD97.6050101@andyfurniss.entadsl.com>
 <467FA144.4030805@oldum.net>
 <20070625132012.vbf3t65gggsk484c@webmail.netshadow.at>
 <467FA95A.9060009@oldum.net>
 <468185E7.1030709@andyfurniss.entadsl.com>
 <46822DAA.70109@oldum.net>
Message-ID: <4682E4A7.8070408@andyfurniss.entadsl.com>

Nikolay Kichukov wrote:
> Hello Andy,
> unshaped here means with higher priority than the rest of the classes
> that have filters attached to them?

Yes, it will just be passed and not be accounted for by htb (well, apart
from the counter).

>
> So if an arp packet is sent at the same time an ip packet is sent, the
> arp packet will go first? And only then the ip packet will be matched by
> the filters?

I don't know if two packets can arrive at the same time. The arp will
still pass through the filters, fail to match any, and then just pass
through. The ip packet may or may not pass straight through, depending on
the state of the class it gets filtered into.

Andy.
From gustin at echostar.ca Thu Jun 28 01:23:07 2007
From: gustin at echostar.ca (Gustin Johnson)
Date: Thu Jun 28 01:23:27 2007
Subject: [LARTC] GRE tunnel
In-Reply-To:
References:
Message-ID: <4682F15B.7090102@echostar.ca>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I noticed that the private IP is on the same subnet on both sides of the
tunnel. When I have done this in the past there were two separate
subnets (e.g. 10.253.253.0/24 and 10.253.254.0/24). I have never tried it
exactly as you have. I also do not have any GRE tunnels in service any
more, so this is from an old script of mine.

Anyway, the syntax and order that I used is:

Box A
modprobe ip_gre
ip tunnel add gre0 mode gre remote 66.1.2.161 local 66.1.1.161 ttl 255
ip addr add 10.253.253.1 dev gre0
ip link set gre0 up
ip route add 10.253.254.0/24 dev gre0

Box B
modprobe ip_gre
ip tunnel add gre0 mode gre remote 66.1.1.161 local 66.1.2.161 ttl 255
ip addr add 10.253.254.1 dev gre0
ip link set gre0 up
ip route add 10.253.253.0/24 dev gre0

Hope this helps,

Greg Hartung wrote:
> I'm still stuck on this one and could really use some help. I just
> finished trying it on an FC3 box too to make sure it wasn't CentOS specific
> issue but there's still no output from tcpdump.
>
> I also spent some time looking over Cisco examples to make sure I wasn't
> misremembering the concepts. No surprises there.
>
> Does anyone have any ideas or can someone suggest a more appropriate
> forum for the question?
>
> Thanks!!
>
> On 6/21/07 11:52 AM, "Greg Hartung" wrote:
>
>> I am trying to setup GRE between two CentOS 4.5 boxes. I have tried
>> several variations of what's listed below, but none of them work.
>>
>> box1:
>> modprobe ip_gre
>> ip link set gre0 up
>> ip tunnel add gretun mode gre local 66.1.1.161 remote 66.1.2.161 ttl 20 dev eth0
>> ip addr add dev gretun 10.253.253.1 peer 10.253.253.2/24
>> ip link set dev gretun up
>> ip route add 10.2.0.0/16 via 10.253.253.2
>>
>> box2:
>> modprobe ip_gre
>> ip link set gre0 up
>> ip tunnel add gretun mode gre local 66.1.2.161 remote 66.1.1.161 ttl 20 dev eth0
>> ip addr add dev gretun 10.253.253.2 peer 10.253.253.1/24
>> ip link set dev gretun up
>> ip route add 10.1.0.0/16 via 10.253.253.1
>>
>> tcpdump shows NO rx or tx traffic from either box that isn't ARP or SSH.
>>
>> It's as if it's not even trying to bring the tunnel up. I'm a Cisco guy,
>> so I'm lost with my show commands.
>>
>> The other variations I've tried consist mostly of trying different
>> combinations of on-net (in the same subnet as eth0 and even the same address
>> as eth0) and off-net (various combinations of loopback /24 and /32 addresses
>> in separate 10 space) on the 'ip addr add dev gretun' statements. But the
>> above example is what *should* work on a Cisco, I think. It's been a
>> while.
>>
>> How do I troubleshoot this?
This is all I've got so far:
>>
>> root@den1tun01:/home/root $ ip link
>> 1: lo: mtu 16436 qdisc noqueue
>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>> 2: eth0: mtu 8800 qdisc pfifo_fast qlen 1000
>>     link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
>> 3: eth0.2: mtu 8800 qdisc noqueue
>>     link/ether 00:19:b9:dd:ff:d9 brd ff:ff:ff:ff:ff:ff
>> 4: gre0: mtu 1476 qdisc noqueue
>>     link/gre 0.0.0.0 brd 0.0.0.0
>> 5: gretun@eth0: mtu 8776 qdisc noqueue
>>     link/gre 66.1.1.161 peer 66.1.2.161
>>
>> root@den1tun01:/home/root $ ip tun
>> gre0: gre/ip remote any local any ttl inherit nopmtudisc
>> gretun: gre/ip remote 66.1.2.161 local 66.1.1.161 dev eth0 ttl 20
>>
>> root@den1tun01:/home/root $ ifconfig
>> eth0      Link encap:Ethernet  HWaddr 00:19:B9:DD:FF:D9
>>           inet addr:10.1.2.243  Bcast:10.1.3.255  Mask:255.255.254.0
>>           UP BROADCAST RUNNING MULTICAST  MTU:8800  Metric:1
>>           RX packets:3357 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:484 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:230757 (225.3 KiB)  TX bytes:63937 (62.4 KiB)
>>           Interrupt:169 Memory:f8000000-f8011100
>>
>> eth0.2    Link encap:Ethernet  HWaddr 00:19:B9:DD:FF:D9
>>           inet addr:66.1.1.161  Bcast:66.1.1.191  Mask:255.255.255.192
>>           UP BROADCAST RUNNING MULTICAST  MTU:8800  Metric:1
>>           RX packets:950 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:43860 (42.8 KiB)  TX bytes:1200 (1.1 KiB)
>>
>> gretun    Link encap:UNSPEC  HWaddr
>> 42-0B-33-A1-FF-C0-00-00-00-00-00-00-00-00-00-00
>>           inet addr:10.253.253.1  P-t-P:10.253.253.2  Mask:255.255.255.0
>>           UP POINTOPOINT RUNNING NOARP  MTU:8776  Metric:1
>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:0 (0.0 b)  TX bytes:756 (756.0 b)
>>
>> gre0      Link encap:UNSPEC  HWaddr
>> 00-00-00-00-FF-00-00-00-00-00-00-00-00-00-00-00
>>           UP RUNNING NOARP  MTU:1476
Metric:1
>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>>
>> lo        Link encap:Local Loopback
>>           inet addr:127.0.0.1  Mask:255.0.0.0
>>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>>           RX packets:225 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:225 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:13271 (12.9 KiB)  TX bytes:13271 (12.9 KiB)
>>
>>
>> I've also tried changing the destination for the route to the near end of
>> the private subnet and tried pinging various things on the tunnel subnet and
>> remote network to create "interesting traffic" to bring the tunnel up but
>> tcpdump still shows nothing.
>>
>> Then I noticed that ping does show an error count:
>>
>> [root@den1tun01 ~]# ping 10.253.253.2
>> PING 10.253.253.2 (10.253.253.2) 56(84) bytes of data.
>> From 10.253.253.1 icmp_seq=0 Destination Host Unreachable
>> From 10.253.253.1 icmp_seq=1 Destination Host Unreachable
>> --- 10.253.253.2 ping statistics ---
>> 2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms, pipe 2
>>
>> I can ping the local end: 10.253.253.1, but the tunnel is still
>> non-functional.
>>
>> Thanks!
>> Greg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgvFawRXgH3rKGfMRAnXQAJ9FeeexFg7Qy1M8atRipjVpmTpO+gCdG8er
10WWOmM8YDMj0m9XECRlSv8=
=PsPK
-----END PGP SIGNATURE-----

From lists at andyfurniss.entadsl.com Thu Jun 28 02:13:07 2007
From: lists at andyfurniss.entadsl.com (Andy Furniss)
Date: Thu Jun 28 02:13:08 2007
Subject: [LARTC] RED to use ECN (or work at all?)
In-Reply-To: <468239EF.8000300@i4.informatik.rwth-aachen.de>
References: <468188BA.9020901@andyfurniss.entadsl.com>
 <468239EF.8000300@i4.informatik.rwth-aachen.de>
Message-ID: <4682FD13.4040906@andyfurniss.entadsl.com>

Arnd Hannemann wrote:

>
> Could you explain this a bit more in detail, why does it not work on
> root of an device?
> I tried it with various configurations and indeed it does not work.
> Even if the incoming interface is much faster then the outgoing
> interface I can't get the red queue to drop or mark packets. Packets
> are always dropped somewhere else?

I suppose exactly what happens depends on the device drivers and the
type of device. On my 100mbit eth I probably could get RED to work a
bit, but I would have to fill a buffer of about 300 MTU-size packets
first, so it would never be right as such.

I see you have wireless. I don't have any wireless, but I guess the
drivers may drop/shape without ever backlogging the root device - or
maybe you don't generate enough traffic to fill the buffer.

I think shaping on wireless is going to be hard. It is half duplex, so you
can't really know the bandwidth, and there is random loss due to errors etc.
You can probably do better than doing nothing, though - at the expense
of sacrificing bandwidth. It may be possible to use ifb to have ingress
and egress share the same bandwidth.

> Thanks for the hint! It seems to work that way, I used this:
>
> tc qdisc add dev wifi0 root handle 1 tbf rate 36mbit burst 5kb latency
> 100ms peakrate 54mbit minburst 1540
> tc qdisc add dev wifi0 parent 1: red limit 10000 min 2000 max 5000
> avpkt 1000 burst 2 probability .2 ecn
>
> Then suddenly i get marked packets:
> qdisc red 8003: dev wifi0 parent 1: limit 10000b min 2000b max 5000b ecn
> Sent 5913561 bytes 3938 pkt (dropped 0, overlimits 6 requeues 2611)
> rate 0bit 0pps backlog 0b 0p requeues 2611
> marked 6 early 0 pdrop 0 other 0
>
> But question remains: Why does it not work with root qdisc?
> And 6 packets are still a bit few?
> Ingoing interface is 100 mbit, outgoing link on wifi0 is about 5 mbit...

You could try 5 or 4mbit as the tbf rate rather than 36 :-)

> Could you show your configuration or is it too complex? How many
> packets got marked?

When I first did it I just pasted Daniel's config into a script I had
that used ifb/hfsc, limited it to 500kbit and tried with and without the
ecn parameter. The buffer sizes are too small and my hfsc limiter would
fail if not on ifb, so I won't post that one :-)

I tried again using the bandwidth parameter (which I called rate
earlier) and with bigger buffers; at 5mbit using 5 netperfs I can get
nearly no drops. (If you run multiple netperfs from a script, make sure
any addresses are in /etc/hosts, and 0.0.0.0 as well, as it seems to
insist on doing dns lookups - which can delay some of them.)

I guess wifi, like eth, uses arp - so using tbf on root may drop some
arp, but just to test I put it on root of my 100mbit eth.

tc qdisc add dev eth0 root handle 1: tbf rate 5mbit burst 2k limit 100k

tc qdisc add dev eth0 parent 1: red limit 100kb min 10kb max 60kb avpkt 1000 burst 12 probability .2 bandwidth 5mbit ecn

After 100 seconds of traffic -

qdisc tbf 1: rate 5000Kbit burst 2Kb lat 82.7ms
 Sent 62495756 bytes 41328 pkt (dropped 1, overlimits 87189 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

qdisc red 8002: parent 1: limit 100Kb min 10Kb max 60Kb ecn
 Sent 62495756 bytes 41328 pkt (dropped 1, overlimits 2412 requeues 87189)
 rate 0bit 0pps backlog 0b 0p requeues 87189
  marked 2411 early 1 pdrop 0 other 0

I wonder if the drop was arp - I also tried again while pinging and got
7% loss. Pings without ecn gave 13% loss.

The same test but without the ecn parameter -

[root@amd /home/andy/Qos]# tc -s qdisc ls dev eth0

qdisc tbf 1: rate 5000Kbit burst 2Kb lat 82.7ms
 Sent 62473422 bytes 41315 pkt (dropped 2320, overlimits 90861 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

qdisc red 8001: parent 1: limit 100Kb min 10Kb max 60Kb
 Sent 62473422 bytes 41315 pkt (dropped 2320, overlimits 2320 requeues 90861)
 rate 0bit 0pps backlog 0b 0p requeues 90861
  marked 0 early 2320 pdrop 0 other 0

Andy.

From hijacker at oldum.net Thu Jun 28 10:48:43 2007
From: hijacker at oldum.net (Nikolay Kichukov)
Date: Thu Jun 28 10:49:23 2007
Subject: [LARTC] Why does scp stall on low bandwidth connections?
In-Reply-To: <4682E4A7.8070408@andyfurniss.entadsl.com>
References: <6d69d8ac0706200037w4e310612l5b93daae24d84079@mail.gmail.com>
 <467EFD97.6050101@andyfurniss.entadsl.com>
 <467FA144.4030805@oldum.net>
 <20070625132012.vbf3t65gggsk484c@webmail.netshadow.at>
 <467FA95A.9060009@oldum.net>
 <468185E7.1030709@andyfurniss.entadsl.com>
 <46822DAA.70109@oldum.net>
 <4682E4A7.8070408@andyfurniss.entadsl.com>
Message-ID: <468375EB.4020803@oldum.net>

Hello Andy,

Thanks for the explanation one more time ;-)

Cheers,
-Nikolay

Andy Furniss wrote:
> Nikolay Kichukov wrote:
>> Hello Andy,
>> unshaped here means with higher priority than the rest of the classes
>> that have filters attached to them?
>
> Yes it will just be passed and not be accounted for by htb (well apart
> from the counter)
>
>>
>> So if an arp packet is sent at the same time an ip packet is sent, the
>> arp packet will go first? And only then the ip packet will be matched by
>> the filters?
>
> I don't know if two packets can arrive at the same time. The arp will
> still pass through the filters and fail to match any then just pass
> through. The ip packet may or may not pass straight through depending on
> the state of the class it gets filtered into.
>
> Andy.

From thuleau at gmail.com Thu Jun 28 11:33:17 2007
From: thuleau at gmail.com (Edouard Thuleau)
Date: Thu Jun 28 11:33:25 2007
Subject: [LARTC] HTB and ATM patch
Message-ID: <81c11a560706280233o14820c88t8a3aa2d88b65f848@mail.gmail.com>

Hi all,

I patched my kernel (2.6.17) and my tc utility (iproute2-2.6.18-061002)
for accurate packet scheduling on an ATM link.
I configured my HTB hierarchy on the upload side of the link and tried it
with different flows. It works correctly, but in some cases I lose about
50% of my bandwidth.
I use the overhead (42) configuration for my link (PPPoE, VC/LLC)
indicated in the documentation.

My question is: how is this overhead value calculated?
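
As a rough sketch of where the lost bandwidth can go, independent of how the 42 itself is derived: on ATM each packet, plus its per-packet overhead, is padded up to a whole number of 48-byte cell payloads, and each cell occupies 53 bytes on the wire. The function name below is made up for illustration, and 42 is simply the figure quoted above.

```sh
#!/bin/sh
# Sketch: bytes an IP packet occupies on an ATM link.
# Assumption: AAL5 framing, i.e. 48 payload bytes carried in each 53-byte cell.
# The overhead value passed in (42 here) is the figure quoted in the message;
# its breakdown per encapsulation is not derived in this sketch.
atm_wire_bytes() {
    len=$1          # IP packet length in bytes
    overhead=$2     # per-packet encapsulation overhead in bytes
    cells=$(( (len + overhead + 47) / 48 ))   # round up to whole cells
    echo $(( cells * 53 ))
}

atm_wire_bytes 40 42     # small, ACK-sized packet
atm_wire_bytes 1500 42   # full-size packet
```

Under these assumptions a 40-byte packet occupies 106 bytes on the wire and a 1500-byte packet 1749, so flows made of small packets can consume far more link capacity than their IP byte count suggests.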
I tried to separate the streams by packet length into different classes,
with a specific overhead for each one, but I don't know how to calculate
it. Do you think that is a good solution?

Is it necessary to set the atm and nohyst options and configure the
overhead for the parent class?

Thanks,
Edouard.

From seba at mfdlabs.ro Thu Jun 28 14:36:59 2007
From: seba at mfdlabs.ro (Seba Tiponut)
Date: Thu Jun 28 14:36:52 2007
Subject: [LARTC] Using Julian Anastasov's 'routes' patches on 2.4 kernel in conjunction with IPSec
In-Reply-To:
References: <200706251447.51518.seba@mfdlabs.ro>
Message-ID: <200706281536.59960.seba@mfdlabs.ro>

On Tuesday 26 June 2007 00:40, Julian Anastasov wrote:
> May be you have to replace your _updown script with one that
> supports "ip route" and "ip rule" commands instead of the old "route"
> tool. By this way you can use "ip rule ... from LNET to RNET"
> to properly route traffic for the negotiated subnets. If I remember
> correctly, the default _updown script does not consider negotiated
> LNET at all. As for routes patch, it will prefer NOARP devices when
> the neighbours on ARP device are not marked as reachable in ARP cache.
> So, it is risky to rely on wrong routes, especially after routes patch
> is applied.
>
> Regards
>
> --
> Julian Anastasov

The _updown script is only called when a tunnel is brought up or down, but
the problem I am having is not related to a tunnel; it is related to routing
before any tunnel gets established. Even a configuration with only one
tunnel that is merely listening creates problems, because both StrongSWAN
and OpenSWAN add IP addresses on the ipsecN interface that are identical to
the ones on the real interface (ethN).

I think the problem is related to the presence of the ipsecN interface in
KLIPS (linux-2.4).
On 2.6 kernels there is no such interface and consequently there is no
"conflict". Is there any real solution to this problem?

On the other hand, my understanding of the solution you gave me (inserting
a rule "from LNET to RNET") is that it can be applied once the tunnel is up.
However, would you care to elaborate more on this case as well?

Cheers,
Seba.

From rolek at alt001.com Thu Jun 28 18:35:07 2007
From: rolek at alt001.com (Roel van Meer)
Date: Thu Jun 28 18:35:16 2007
Subject: [LARTC] pfifo_fast priomap
Message-ID:

Hi list,

I have a quick question about the priority mapping of tos bits. The manpage
of tc-prio shows a nice table with tos bits and the band they
are mapped to:

TOS     Bits  Means                    Linux Priority    Band
------------------------------------------------------------
0x0     0     Normal Service           0 Best Effort     1
0x2     1     Minimize Monetary Cost   1 Filler          2
0x4     2     Maximize Reliability     0 Best Effort     1
0x6     3     mmc+mr                   0 Best Effort     1
0x8     4     Maximize Throughput      2 Bulk            2
0xa     5     mmc+mt                   2 Bulk            2
0xc     6     mr+mt                    2 Bulk            2
0xe     7     mmc+mr+mt                2 Bulk            2
0x10    8     Minimize Delay           6 Interactive     0
0x12    9     mmc+md                   6 Interactive     0
0x14    10    mr+md                    6 Interactive     0
0x16    11    mmc+mr+md                6 Interactive     0
0x18    12    mt+md                    4 Int. Bulk       1
0x1a    13    mmc+mt+md                4 Int. Bulk       1
0x1c    14    mr+mt+md                 4 Int. Bulk       1
0x1e    15    mmc+mr+mt+md             4 Int. Bulk       1

If I read this correctly, packets with tos 0x0 would be mapped to band 1,
packets with tos 0x2 would be mapped to band 2, etc etc. However, the
default priomap is 1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1.

These numbers do not correspond to the numbers in the table above. Is there
something I have overlooked or is the information in the above table
incorrect?
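
One way to cross-check the two sets of numbers, offered as a sketch rather than a verdict on the table: the priomap is indexed by the Linux priority (the number in the "Linux Priority" column), not by the TOS value or its bit pattern. The helper name below is hypothetical; the priomap string is the default quoted above.

```sh
#!/bin/sh
# Sketch: look up the pfifo_fast band for a Linux priority value.
# The priomap has 16 entries, indexed by Linux priority 0..15.
PRIOMAP="1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1"   # default priomap quoted above

band_for_priority() {
    prio=$1
    set -- $PRIOMAP      # positional parameters now hold the 16 bands
    shift "$prio"        # drop the entries before index $prio
    echo "$1"
}

# Priorities as printed in the table's "Linux Priority" column:
# 0 Best Effort, 1 Filler, 2 Bulk, 4 Int. Bulk, 6 Interactive
for p in 0 1 2 4 6; do
    echo "priority $p -> band $(band_for_priority "$p")"
done
```

Read this way, the five priorities used in the table land on bands 1, 2, 2, 1 and 0, which is what its Band column shows. Whether the TOS to priority column itself is right is a separate question.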

Thanks for any help,

roel

From russell-lartc at stuart.id.au Fri Jun 29 01:27:39 2007
From: russell-lartc at stuart.id.au (Russell Stuart)
Date: Fri Jun 29 01:28:10 2007
Subject: [LARTC] pfifo_fast priomap
In-Reply-To:
References:
Message-ID: <1183073259.7777.2.camel@ras.pc.brisbane.lube>

On Thu, 2007-06-28 at 18:35 +0200, Roel van Meer wrote:
> Hi list,
>
> I have a quick question about the priority mapping of tos bits. The manpage
> of tc-prio shows a nice table with tos bits and the band they
> are mapped to:
>
> TOS     Bits  Means                    Linux Priority    Band
> ------------------------------------------------------------
> 0x0     0     Normal Service           0 Best Effort     1
> 0x2     1     Minimize Monetary Cost   1 Filler          2
> 0x4     2     Maximize Reliability     0 Best Effort     1
> 0x6     3     mmc+mr                   0 Best Effort     1
> 0x8     4     Maximize Throughput      2 Bulk            2
> 0xa     5     mmc+mt                   2 Bulk            2
> 0xc     6     mr+mt                    2 Bulk            2
> 0xe     7     mmc+mr+mt                2 Bulk            2
> 0x10    8     Minimize Delay           6 Interactive     0
> 0x12    9     mmc+md                   6 Interactive     0
> 0x14    10    mr+md                    6 Interactive     0
> 0x16    11    mmc+mr+md                6 Interactive     0
> 0x18    12    mt+md                    4 Int. Bulk       1
> 0x1a    13    mmc+mt+md                4 Int. Bulk       1
> 0x1c    14    mr+mt+md                 4 Int. Bulk       1
> 0x1e    15    mmc+mr+mt+md             4 Int. Bulk       1
>
> If I read this correctly, packets with tos 0x0 would be mapped to band 1,
> packets with tos 0x2 would be mapped to band 2, etc etc. However, the
> default priomap is 1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1.
>
> These numbers do not correspond to the numbers in the table above. Is there
> something I have overlooked or is the information in the above table
> incorrect?

The table is incorrect.
If you have the kernel source handy, look at:

  net/ipv4/route.c:173, and
  include/linux/pkt_sched.h:17

Alternatively, look here:

  http://www.stuart.id.au/russell/files/tc/doc/tc/priority.txt
  http://www.stuart.id.au/russell/files/tc/doc/tc/sch_prio.txt

From hannemann at i4.informatik.rwth-aachen.de Fri Jun 29 17:05:59 2007
From: hannemann at i4.informatik.rwth-aachen.de (Arnd Hannemann)
Date: Fri Jun 29 17:14:47 2007
Subject: [LARTC] RED to use ECN (or work at all?)
In-Reply-To: <4682FD13.4040906@andyfurniss.entadsl.com>
References: <468188BA.9020901@andyfurniss.entadsl.com>
 <468239EF.8000300@i4.informatik.rwth-aachen.de>
 <4682FD13.4040906@andyfurniss.entadsl.com>
Message-ID: <46851FD7.2060208@i4.informatik.rwth-aachen.de>

Hi Andy,

Andy Furniss wrote:
> Arnd Hannemann wrote:
>
>>
>> Could you explain this a bit more in detail, why does it not work on
>> root of an device?
>> I tried it with various configurations and indeed it does not work.
>> Even if the incoming interface is much faster then the outgoing
>> interface I can't get the red queue to drop or mark packets. Packets
>> are always dropped somewhere else?
>
> I suppose exactly what happens depends on the device drivers and the
> type of device. On my 100mbit eth I probably could get RED to work a
> bit, but I would have to fill a buffer of about 300 MTU size packets
> first, so it would never be right as such.

> I see you have wireless, I don't have any wireless, but I guess the
> drivers may drop/shape without ever backlogging the root device - or
> maybe you don't generate enough traffic to fill the buffer.

Ah, I see. I think I now understand better how it works. The kernel hands
the skb down to the device driver, and for RED to work on the root
device, the device driver has to explicitly hand the skb back, or at
least communicate that it is busy.
>
>
> I tried again using the bandwidth parameter (which I called rate
> earlier) and with bigger buffers at 5mbit using 5 netperfs I can get
> nearly no drops. (if you run multiple netperfs from a script make sure
> and addresses are in /etc/hosts and 0.0.0.0 is aswell as it seems to
> insist on doing dns lookups - which can delay some of them)
>
> I guess wifi like eth uses arp - so using tbf on root may drop some
> arp, but just to test I put it on root of my 100mbit eth.
>
> tc qdisc add dev eth0 root handle 1: tbf rate 5mbit burst 2k limit 100k
>
> tc qdisc add dev eth0 parent 1: red limit 100kb min 10kb max 60kb
> avpkt 1000 burst 12 probability .2 bandwidth 5mbit ecn
>
> After 100 seconds of traffic -
>
> qdisc tbf 1: rate 5000Kbit burst 2Kb lat 82.7ms
> Sent 62495756 bytes 41328 pkt (dropped 1, overlimits 87189 requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
>
> qdisc red 8002: parent 1: limit 100Kb min 10Kb max 60Kb ecn
> Sent 62495756 bytes 41328 pkt (dropped 1, overlimits 2412 requeues
> 87189)
> rate 0bit 0pps backlog 0b 0p requeues 87189
> marked 2411 early 1 pdrop 0 other 0
>
> I wonder if the drop was arp - I also tried again while pinging and
> got 7% loss. Pings without ecn gave 13% loss.
>
>
> The same test but without the ecn parameter -
>
> [root@amd /home/andy/Qos]# tc -s qdisc ls dev eth0
>
> qdisc tbf 1: rate 5000Kbit burst 2Kb lat 82.7ms
> Sent 62473422 bytes 41315 pkt (dropped 2320, overlimits 90861
> requeues 0)
> rate 0bit 0pps backlog 0b 0p requeues 0
>
> qdisc red 8001: parent 1: limit 100Kb min 10Kb max 60Kb
> Sent 62473422 bytes 41315 pkt (dropped 2320, overlimits 2320 requeues
> 90861)
> rate 0bit 0pps backlog 0b 0p requeues 90861
> marked 0 early 2320 pdrop 0 other 0
>

Andy, thanks a lot for your efforts.
It's good to see that in principle ECN really works and can save drops ;-)

Best regards,
Arnd

From hcccc.chung at gmail.com Fri Jun 29 17:39:21 2007
From: hcccc.chung at gmail.com (Michael Chung)
Date: Fri Jun 29 17:39:25 2007
Subject: [LARTC] ip route tos not always work
Message-ID: <6c543c540706290839s987f121o6471032ef14b0fc8@mail.gmail.com>

Dear All,

I need to set up different routes for different TOS values. I can use the
following commands to add TOS routes to the routing table:

ip route add 192.168.0.2/32 via 192.168.0.1 tos 0x1c
and
ip route add 192.168.0.2/32 via 192.168.0.1 tos 0x40

I used "ping -Q" to test with different TOS values, and the outgoing
packets are marked correctly.

The problem is that only the TOS values defined in
/etc/iproute2/rt_dsfield are routed according to the TOS route: tos 0x1c
works but 0x40 does not.

Does anyone know why?

Thank you!

--
Regards,
Michael Chung Pui Kei
Computer Engineering Year 3
School of Engineering
Hong Kong University of Science and Technology
Email Address: eg_cpkaa@stu.ust.hk

From dino at webjogger.net Sat Jun 30 00:41:03 2007
From: dino at webjogger.net (Mario Antonio)
Date: Sat Jun 30 00:41:15 2007
Subject: [LARTC] Best kernel Settings for a Bandwidth Management Box
References: <6c543c540706290839s987f121o6471032ef14b0fc8@mail.gmail.com>
Message-ID: <00b301c7ba9e$93432a00$16140a0a@webjogger.net>

Dear List,

In order to address inaccurate results using either TBF or HTB according to:
http://www.docum.org/docum.org/faq/cache/40.html
or
"In order to increase the accuracy of the clock we can modify some
parameters of the kernel. The parameters we have tried to change have been
the PSCHED_CLOCK_SOURCE variable and the HZ variable. The observed effect
has been a small increase in the use of the CPU, but a great improvement
regarding the accuracy of the output bit rate of the flows. The CBQ
scheduler guarantees a minimum bandwidth better and TBF obtains an output
rate more adjusted to what is required."
https://www.tlm.unavarra.es/~eduardo/publicaciones/20010403-alcom-english.pdf

It seems that the best setting for a dedicated box that shapes traffic in
the network is to build a kernel with:

CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CLK_CPU=y

But since NET_SCH_CLK_CPU depends on ((X86_TSC || X86_64) && !SMP), does
this mean that the ideal bandwidth management box has to use only one CPU?

The following link also suggests the use of preemption:

CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y

http://www.sigsegv.cx/qos-2.html

Any thoughts? What are the best kernel settings for a bandwidth management
box in bridge mode on the x86 architecture?

Regards,

Mario Antonio
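
A minimal sketch for checking which scheduler clock source a given kernel was built with, assuming a 2.6-era config that uses the CONFIG_NET_SCH_CLK_JIFFIES, CONFIG_NET_SCH_CLK_GETTIMEOFDAY and CONFIG_NET_SCH_CLK_CPU option names. The function name and the default config path are illustrative guesses and vary between distributions.

```sh
#!/bin/sh
# Sketch: print the packet-scheduler clock source selected in a kernel config.
# Assumption: 2.6-era CONFIG_NET_SCH_CLK_* option names.
# The default path is a guess; pass the config file explicitly if it differs.
psched_clock() {
    config=${1:-/boot/config-$(uname -r)}
    grep -E '^CONFIG_NET_SCH_CLK_(JIFFIES|GETTIMEOFDAY|CPU)=y' "$config" \
        | sed -e 's/^CONFIG_NET_SCH_CLK_//' -e 's/=y$//'
}

# Usage (hypothetical path to a kernel build tree):
#   psched_clock /usr/src/linux/.config
```

The function prints JIFFIES, GETTIMEOFDAY or CPU for whichever option is set to y, and prints nothing if none of them appears in the file.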