aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2006-12-04bridge: fix possible overflow in get_fdb_entries (CVE-2006-5751)Chris Wright
Make sure to properly clamp maxnum to avoid overflow (CVE-2006-5751). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04[EBTABLES]: Prevent wraparounds in checks for entry components' sizes.Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04[EBTABLES]: Deal with the worst-case behaviour in loop checks.Al Viro
No need to revisit a chain we'd already finished with during the check for current hook. It's either instant loop (which we'd just detected) or a duplicate work. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04[EBTABLES]: Verify that ebt_entries have zero ->distinguisher.Al Viro
We need that for iterator to work; existing check had been too weak. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04[EBTABLES]: Fix wraparounds in ebt_entries verification.Al Viro
We need to verify that a) we are not too close to the end of buffer to dereference b) next entry we'll be checking won't be _before_ our While we are at it, don't subtract unrelated pointers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04[NET_SCHED]: policer: restore compatibility with old iproute binariesPatrick McHardy
The tc actions increased the size of struct tc_police, which broke compatibility with old iproute binaries since both the act_police and the old NET_CLS_POLICE code check for an exact size match. Since the new members are not even used, the simple fix is to also accept the size of the old structure. Dumping is not affected since old userspace will receive a bigger structure, which is handled fine. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04[PKT_SCHED] act_gact: division by zeroKim Nordlund
Not returning -EINVAL, because someone might want to use the value zero in some future gact_prob algorithm? Signed-off-by: Kim Nordlund <kim.nordlund@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04[IPV6]: Fix address/interface handling in UDP and DCCP, according to the ↵YOSHIFUJI Hideaki
scoping architecture. TCP and RAW do not have this issue. Closes Bug #7432. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04remove garbage the sneaked into the ext3 fixAdrian Bunk
Spotted by Thomas Voegtle. Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29SCTP: Always linearise packet on inputHerbert Xu
I was looking at a RHEL5 bug report involving Xen and SCTP (https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212550). It turns out that SCTP wasn't written to handle skb fragments at all. The absence of any calls to skb_may_pull is testament to that. It just so happens that Xen creates fragmented packets more often than other scenarios (header & data split when going from domU to dom0). That's what caused this bug to show up. Until someone has the time sits down and audits the entire net/sctp directory, here is a conservative and safe solution that simply linearises all packets on input. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29add forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)Al Viro
sbi->s_group_desc is an array of pointers to buffer_head. memcpy() of buffer size from address of buffer_head is a bad idea - it will generate junk in any case, may oops if buffer_head is close to the end of slab page and next page is not mapped and isn't what was intended there. IOW, ->b_data is missing in that call. Fortunately, result doesn't go into the primary on-disk data structures, so only backup ones get crap written to them; that had allowed this bug to remain unnoticed until now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29[UDP]: Make udp_encap_rcv use pskb_may_pullOlaf Kirch
Make udp_encap_rcv use pskb_may_pull IPsec with NAT-T breaks on some notebooks using the latest e1000 chipset, when header split is enabled. When receiving sufficiently large packets, the driver puts everything up to and including the UDP header into the header portion of the skb, and the rest goes into the paged part. udp_encap_rcv forgets to use pskb_may_pull, and fails to decapsulate it. Instead, it passes it up it to the IKE daemon. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-24[IPX]: Annotate and fix IPX checksumAl Viro
Calculation of IPX checksum got buggered about 2.4.0. The old variant mangled the packet; that got fixed, but calculation itself got buggered. Restored the correct logics, fixed a subtle breakage we used to have even back then: if the sum is 0 mod 0xffff, we want to return 0, not 0xffff. The latter has special meaning for IPX (cheksum disabled). Observation (and obvious fix) nicked from history of FreeBSD ipx_cksum.c... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-24[IPX]: Fix typo, ipxhdr() --> ipx_hdr()David S. Miller
Noticed by Dave Jones. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-24[IPX]: Another nonlinear receive fixStephen Hemminger
Need to check some more cases in IPX receive. If the skb is purely fragments, the IPX header needs to be extracted. The function pskb_may_pull() may in theory invalidate all the pointers in the skb, so references to ipx header must be refreshed. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-24[IPX]: Header length validation neededStephen Hemminger
This patch will linearize and check there is enough data. It handles the pprop case as well as avoiding a whole audit of the routing code. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-24[IPX]: Correct return type of ipx_map_frame_type().Alexey Dobriyan
Casting BE16 to int and back may or may not work. Correct, to be sure. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-20[RTNETLINK]: Fix IFLA_ADDRESS handling.David Miller
The ->set_mac_address handlers expect a pointer to a sockaddr which contains the MAC address, whereas IFLA_ADDRESS provides just the MAC address itself. So whip up a sockaddr to wrap around the netlink attribute for the ->set_mac_address call. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-17Fix timer race in dst GC codeDmitry Mishin
Replace add_timer() by mod_timer() in dst_run_gc in order to avoid BUG message. CPU1 CPU2 dst_run_gc() entered dst_run_gc() entered spin_lock(&dst_lock) ..... del_timer(&dst_gc_timer) fail to get lock .... mod_timer() <--- puts timer back to the list add_timer(&dst_gc_timer) <--- BUG because timer is in list already. Found during OpenVZ internal testing. At first we thought that it is OpenVZ specific as we added dst_run_gc(0) call in dst_dev_event(), but as Alexey pointed to me it is possible to trigger this condition in mainstream kernel. F.e. timer has fired on CPU2, but the handler was preeempted by an irq before dst_lock is tried. Meanwhile, someone on CPU1 adds an entry to gc list and starts the timer. If CPU2 was preempted long enough, this timer can expire simultaneously with resuming timer handler on CPU1, arriving exactly to the situation described. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-11[NET]: Update frag_list in pskb_trimHerbert Xu
When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of the packet, the frag_list is not updated to reflect the trimming. This will usually work fine until you hit something that uses the packet length or tail from the frag_list. Examples include esp_output and ip_fragment. Another problem caused by this is that you can end up with a linear packet with a frag_list attached. It is possible to get away with this if we audit everything to make sure that they always consult skb->len before going down onto frag_list. In fact we can do the samething for the paged part as well to avoid copying the data area of the skb. For now though, let's do the conservative fix and update frag_list. Many thanks to Marco Berizzi for helping me to track down this bug. This 4-year old bug took 3 months to track down. Marco was very patient indeed :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-09[NET]: __alloc_pages() failures reported due to fragmentationLarry Woodman
We have seen a couple of __alloc_pages() failures due to fragmentation, there is plenty of free memory but no large order pages available. I think the problem is in sock_alloc_send_pskb(), the gfp_mask includes __GFP_REPEAT but its never used/passed to the page allocator. Shouldnt the gfp_mask be passed to alloc_skb() ? Signed-off-by: Larry Woodman <lwoodman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-09[NET]: Set truesize in pskb_copyHerbert Xu
Since pskb_copy tacks on the non-linear bits from the original skb, it needs to count them in the truesize field of the new skb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-09[TCP]: Don't use highmem in tcp hash size calculation.John Heffner
This patch removes consideration of high memory when determining TCP hash table sizes. Taking into account high memory results in tcp_mem values that are too large. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-08[IPV4]: Limit rt cache size properly.Kirill Korotaev
During OpenVZ stress testing we found that UDP traffic with random src can generate too much excessive rt hash growing leading finally to OOM and kernel panics. It was found that for 4GB i686 system (having 1048576 total pages and 225280 normal zone pages) kernel allocates the following route hash: syslog: IP route cache hash table entries: 262144 (order: 8, 1048576 bytes) => ip_rt_max_size = 4194304 entries, i.e. max rt size is 4194304 * 256b = 1Gb of RAM > normal_zone Attached the patch which removes HASH_HIGHMEM flag from alloc_large_system_hash() call. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-08[NET]: Add missing UFO initialisationsHerbert Xu
This bug was unknowingly fixed the GSO patches (or rather, its effect was unknown at the time). Thanks to Marco Berizzi's persistence which is documented in the thread "ipsec tunnel asymmetrical mtu", we now know that it can have highly non-obvious symptoms. What happens is that uninitialised uso_size fields can cause packets to be incorrectly identified as UFO, which means that it does not get fragmented even if it's over the MTU. The fix is simple enough. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-07[MAINTAINERS]: Add proper entry for TC classifierStephen Hemminger
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-07[PKT_SCHED]: act_api: Fix module leak while flushing actionsThomas Graf
Module reference needs to be given back if message header construction fails. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-07PKT_SCHED: Return ENOENT if action module is unavailableThomas Graf
Return ENOENT if action module is unavailable Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-07PKT_SCHED: Fix illegal memory dereferences when dumping actionsThomas Graf
The TCA_ACT_KIND attribute is used without checking its availability when dumping actions therefore leading to a value of 0x4 being dereferenced. The use of strcmp() in tc_lookup_action_n() isn't safe when fed with string from an attribute without enforcing proper NUL termination. Both bugs can be triggered with malformed netlink message and don't require any privileges. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-07PKT_SCHED: Fix error handling while dumping actionsThomas Graf
"return -err" and blindly inheriting the error code in the netlink failure exception handler causes errors codes to be returned as positive value therefore making them being ignored by the caller. May lead to sending out incomplete netlink messages. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-05[NETFILTER]: Fix ip6_tables extension header bypass bug (CVE-2006-4572)Patrick McHardy
As reported by Mark Dowd <Mark_Dowd@McAfee.com>, ip6_tables is susceptible to a fragmentation attack causing false negatives on extension header matches. When extension headers occur in the non-first fragment after the fragment header (possibly with an incorrect nexthdr value in the fragment header) a rule looking for this extension header will never match. Drop fragments that are at offset 0 and don't contain the final protocol header regardless of the ruleset, since this should not happen normally. Since all extension headers are before the protocol header this makes sure an extension header is either not present or in the first fragment, where we can properly parse it. With help from Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-05[NETFILTER]: Fix ip6_tables protocol bypass bug (CVE-2006-4572)Patrick McHardy
As reported by Mark Dowd <Mark_Dowd@McAfee.com>, ip6_tables is susceptible to a fragmentation attack causing false negatives on protocol matches. When the protocol header doesn't follow the fragment header immediately, the fragment header contains the protocol number of the next extension header. When the extension header and the protocol header are sent in a second fragment a rule like "ip6tables .. -p udp -j DROP" will never match. Drop fragments that are at offset 0 and don't contain the final protocol header regardless of the ruleset, since this should not happen normally. With help from Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-05knfsd: Fix race that can disable NFS server.Neil Brown
This is a long standing bug that seems to have only recently become apparent, presumably due to increasing use of NFS over TCP - many distros seem to be making it the default. The SK_CONN bit gets set when a listening socket may be ready for an accept, just as SK_DATA is set when data may be available. It is entirely possible for svc_tcp_accept to be called with neither of these set. It doesn't happen often but there is a small race in svc_sock_enqueue as SK_CONN and SK_DATA are tested outside the spin_lock. They could be cleared immediately after the test and before the lock is gained. This normally shouldn't be a problem. The sockets are non-blocking so trying to read() or accept() when ther is nothing to do is not a problem. However: svc_tcp_recvfrom makes the decision "Should I accept() or should I read()" based on whether SK_CONN is set or not. This usually works but is not safe. The decision should be based on whether it is a TCP_LISTEN socket or a TCP_CONNECTED socket. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-05[IPV6]: fix lockup via /proc/net/ip6_flowlabel (CVE-2006-5619)James Morris
There's a bug in the seqfile handling for /proc/net/ip6_flowlabel, where, after finding a flowlabel, the code will loop forever not finding any further flowlabels, first traversing the rest of the hash bucket then just looping. This patch fixes the problem by breaking after the hash bucket has been traversed. Note that this bug can cause lockups and oopses, and is trivially invoked by an unpriveleged user. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-05fix RARP ic_servaddr breakageAl Viro
memcpy 4 bytes to address of auto unsigned long variable followed by comparison with u32 is a bloody bad idea. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14PKT_SCHED: cls_basic: Use unsigned int when generating handleKim Nordlund
Prevents filters from being added if the first generated handle already exists. Signed-off-by: Kim Nordlund <kim.nordlund@nokia.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[ATM] CLIP: Do not refer freed skbuff in clip_mkip() (CVE-2006-4997)YOSHIFUJI Hideaki
In clip_mkip(), skb->dev is dereferenced after clip_push(), which frees up skb. Advisory: AD_LAB-06009 (<adlab@venustech.com.cn>). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-13IPV6: Sum real space for RTAs.YOSHIFUJI Hideaki
This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications. Issue pointed out by Patrick McHardy <kaber@trash.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-06[PKTGEN]: Make sure skb->{nh,h} are initialized in fill_packet_ipv6() too.David S. Miller
Mirror the bug fix from fill_packet_ipv4() Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-06[PKTGEN]: Fix oops when used with balance-tlb bondingChen-Li Tien
Signed-off-by: Chen-Li Tien <cltien@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-06[IPV6]: Fix kernel OOPs when setting sticky socket options.YOSHIFUJI Hideaki
Bug noticed by Remi Denis-Courmont <rdenis@simphalempin.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-05Fix sctp_primitive_ABORT() call in sctp_close()Sridhar Samudrala
With the recent fix, the callers of sctp_primitive_ABORT() need to create an ABORT chunk and pass it as an argument rather than msghdr that was passed earlier. Adrian Bunk: Ported to 2.6.16. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-31ethtool: fix oops in ethtool_set_pauseparam()Willy Tarreau
The function pointers which were checked were for their get_* counterparts. Typically a copy-paste typo. Signed-off-by: Willy Tarreau <w@1wt.eu> Acked-by: Jeff Garzik <jeff@garzik.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-31ETHTOOL: Fix UFO typoHerbert Xu
The function ethtool_get_ufo was referring to ETHTOOL_GTSO instead of ETHTOOL_GUFO. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-30ip_tables: fix table locking in ipt_do_tablePatrick McHardy
table->private might change because of ruleset changes, don't use it without holding the lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-26ulog: fix panic on SMP kernelsMark Huang
Fix kernel panic on various SMP machines. The culprit is a null ub->skb in ulog_send(). If ulog_timer() has already been scheduled on one CPU and is spinning on the lock, and ipt_ulog_packet() flushes the queue on another CPU by calling ulog_send() right before it exits, there will be no skbuff when ulog_timer() acquires the lock and calls ulog_send(). Cancelling the timer in ulog_send() doesn't help because it has already been scheduled and is running on the first CPU. Similar problem exists in ebt_ulog.c and nfnetlink_log.c. Signed-off-by: Mark Huang <mlhuang@cs.princeton.edu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-26SCTP: Send only 1 window update SACK per message.Tsutomu Fujii
Right now, every time we increase our rwnd by more then MTU bytes, we trigger a SACK. When processing large messages, this will generate a SACK for almost every other SCTP fragment. However since we are freeing the entire message at the same time, we might as well collapse the SACK generation to 1. Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-26SCTP: Reset rtt_in_progress for the chunk when processing its sack.Vlad Yasevich
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-26SCTP: Limit association max_retrans setting in setsockopt.Vlad Yasevich
When using ASSOCINFO socket option, we need to limit the number of maximum association retransmissions to be no greater than the sum of all the path retransmissions. This is specified in Section 7.1.2 of the SCTP socket API draft. However, we only do this if the association has multiple paths. If there is only one path, the protocol stack will use the assoc_max_retrans setting when trying to retransmit packets. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-26SCTP: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.Neil Horman
In the event that our entire receive buffer is full with a series of chunks that represent a single gap-ack, and then we accept a chunk (or chunks) that fill in the gap between the ctsn and the first gap, we renege chunks from the end of the buffer, which effectively does nothing but move our gap to the end of our received tsn stream. This does little but move our missing tsns down stream a little, and, if the sender is sending sufficiently large retransmit frames, the result is a perpetual slowdown which can never be recovered from, since the only chunk that can be accepted to allow progress in the tsn stream necessitates that a new gap be created to make room for it. This leads to a constant need for retransmits, and subsequent receiver stalls. The fix I've come up with is to deliver the frame without reneging if we have a full receive buffer and the receiving sockets sk_receive_queue is empty(indicating that the receive buffer is being blocked by a missing tsn). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>