aboutsummaryrefslogtreecommitdiff
path: root/net/ipv4/ip_options.c
AgeCommit message (Collapse)Author
2011-05-31ip_options_compile: properly handle unaligned pointerChris Metcalf
The current code takes an unaligned pointer and does htonl() on it to make it big-endian, then does a memcpy(). The problem is that the compiler decides that since the pointer is to a __be32, it is legal to optimize the copy into a processor word store. However, on an architecture that does not handled unaligned writes in kernel space, this produces an unaligned exception fault. The solution is to track the pointer as a "char *" (which removes a bunch of unpleasant casts in any case), and then just use put_unaligned_be32() to write the value to memory. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: David S. Miller <davem@zippy.davemloft.net>
2011-05-13ipv4: Remove rt->rt_dst reference from ip_forward_options().David S. Miller
At this point iph->daddr equals what rt->rt_dst would hold. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13ipv4: Remove route key identity dependencies in ip_rt_get_source().David S. Miller
Pass in the sk_buff so that we can fetch the necessary keys from the packet header when working with input routes. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13ipv4: Kill spurious write to iph->daddr in ip_forward_options().David S. Miller
This code block executes when opt->srr_is_hit is set. It will be set only by ip_options_rcv_srr(). ip_options_rcv_srr() walks until it hits a matching nexthop in the SRR option addresses, and when it matches one 1) looks up the route for that nexthop and 2) on route lookup success it writes that nexthop value into iph->daddr. ip_forward_options() runs later, and again walks the SRR option addresses looking for the option matching the destination of the route stored in skb_rtable(). This route will be the same exact one looked up for the nexthop by ip_options_rcv_srr(). Therefore "rt->rt_dst == iph->daddr" must be true. All it really needs to do is record the route's source address in the matching SRR option adddress. It need not write iph->daddr again, since that has already been done by ip_options_rcv_srr() as detailed above. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12ipv4: Simplify iph->daddr overwrite in ip_options_rcv_srr().David S. Miller
We already copy the 4-byte nexthop from the options block into local variable "nexthop" for the route lookup. Re-use that variable instead of memcpy()'ing again when assigning to iph->daddr after the route lookup succeeds. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12ipv4: Kill spurious opt->srr check in ip_options_rcv_srr().David S. Miller
All call sites conditionalize the call to ip_options_rcv_srr() with a check of opt->srr, so no need to check it again there. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-28inet: add RCU protection to inet->optEric Dumazet
We lack proper synchronization to manipulate inet->opt ip_options Problem is ip_make_skb() calls ip_setup_cork() and ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options), without any protection against another thread manipulating inet->opt. Another thread can change inet->opt pointer and free old one under us. Use RCU to protect inet->opt (changed to inet->inet_opt). Instead of handling atomic refcounts, just copy ip_options when necessary, to avoid cache line dirtying. We cant insert an rcu_head in struct ip_options since its included in skb->cb[], so this patch is large because I had to introduce a new ip_options_rcu structure. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-14ip: ip_options_compile() resilient to NULL skb routeEric Dumazet
Scot Doyle demonstrated ip_options_compile() could be called with an skb without an attached route, using a setup involving a bridge, netfilter, and forged IP packets. Let's make ip_options_compile() and ip_options_rcv_srr() a bit more robust, instead of changing bridge/netfilter code. With help from Hiroaki SHIMODA. Reported-by: Scot Doyle <lkml@scotdoyle.com> Tested-by: Scot Doyle <lkml@scotdoyle.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in ip_options_echo()Jan Luebbe
The current handling of echoed IP timestamp options with prespecified addresses is rather broken since the 2.2.x kernels. As far as i understand it, it should behave like when originating packets. Currently it will only timestamp the next free slot if: - there is space for *two* timestamps - some random data from the echoed packet taken as an IP is *not* a local IP This first is caused by an off-by-one error. 'soffset' points to the next free slot and so we only need to have 'soffset + 7 <= optlen'. The second bug is using sptr as the start of the option, when it really is set to 'skb_network_header(skb)'. I just use dptr instead which points to the timestamp option. Finally it would only timestamp for non-local IPs, which we shouldn't do. So instead we exclude all unicast destinations, similar to what we do in ip_options_compile(). Signed-off-by: Jan Luebbe <jluebbe@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-19bridge : Sanitize skb before it enters the IP stackBandan Das
Related dicussion here : http://lkml.org/lkml/2010/9/3/16 Introduce a function br_parse_ip_options that will audit the skb and possibly refill IP options before a packet enters the IP stack. If no options are present, the function will zero out the skb cb area so that it is not misinterpreted as options by some unsuspecting IP layer routine. If packet consistency fails, drop it. Signed-off-by: Bandan Das <bandan.das@stratus.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17net: Remove unnecessary returns from void function()sJoe Perches
This patch removes from net/ (but not any netfilter files) all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17net: add a noref bit on skb dstEric Dumazet
Use low order bit of skb->_skb_dst to tell dst is not refcounted. Change _skb_dst to _skb_refdst to make sure all uses are catched. skb_dst() returns the dst, regardless of noref bit set or not, but with a lockdep check to make sure a noref dst is not given if current user is not rcu protected. New skb_dst_set_noref() helper to set an notrefcounted dst on a skb. (with lockdep check) skb_dst_drop() drops a reference only if skb dst was refcounted. skb_dst_force() helper is used to force a refcount on dst, when skb is queued and not anymore RCU protected. Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if !IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in sock_queue_rcv_skb(), in __nf_queue(). Use skb_dst_force() in dev_requeue_skb(). Note: dst_use_noref() still dirties dst, we might transform it later to do one dirtying per jiffies. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2009-06-03net: skb->dst accessorsEric Dumazet
Define three accessors to get/set dst attached to a skb struct dst_entry *skb_dst(const struct sk_buff *skb) void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-03net: skb->rtable accessorEric Dumazet
Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb Delete skb->rtable field Setting rtable is not allowed, just set dst instead as rtable is an alias. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-10cipso: Add support for native local labeling and fixup mapping namesPaul Moore
This patch accomplishes three minor tasks: add a new tag type for local labeling, rename the CIPSO_V4_MAP_STD define to CIPSO_V4_MAP_TRANS and replace some of the CIPSO "magic numbers" with constants from the header file. The first change allows CIPSO to support full LSM labels/contexts, not just MLS attributes. The second change brings the mapping names inline with what userspace is using, compatibility is preserved since we don't actually change the value. The last change is to aid readability and help prevent mistakes. Signed-off-by: Paul Moore <paul.moore@hp.com>
2008-06-11net: remove CVS keywordsAdrian Bunk
This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-21[IPV4]: Convert do_gettimeofday() to getnstimeofday().YOSHIFUJI Hideaki
What do_gettimeofday() does is to call getnstimeofday() and to convert the result from timespec{} to timeval{}. After that, these callers convert the result again to msec. Use getnstimeofday() and convert the units at once. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26[NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.YOSHIFUJI Hideaki
Introduce per-net_device inlines: dev_net(), dev_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-24[NETNS]: Process IP layer in the context of the correct namespace.Denis V. Lunev
Replace all the rest of the init_net with a proper net on the IP layer. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24[NETNS]: Add namespace parameter to ip_options_get(...).Denis V. Lunev
Pass the init_net there for now. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24[NETNS]: Add namespace parameter to ip_options_compile.Denis V. Lunev
ip_options_compile uses inet_addr_type which requires a namespace. The packet argument is optional, so parameter is the only way to obtain it. Pass the init_net there for now. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22[IPV4]: Always pass ip_options pointer into ip_options_compile.Denis V. Lunev
This makes code a bit more uniform and straigthforward. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22[IPV4]: Remove unused ip_options->is_data.Denis V. Lunev
ip_options->is_data is assigned only and never checked. The structure is not a part of kernel interface to the userspace. So, it is safe to remove this field. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22[IPV4]: Remove unnecessary check for opt->is_data in ip_options_compile.Denis V. Lunev
There is the only way to reach ip_options compile with opt != NULL: ip_options_get_finish opt->is_data = 1; ip_options_compile(opt, NULL) So, checking for is_data inside opt != NULL branch is not needed. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05[IPV4]: Add 'rtable' field in struct sk_buff to alias 'dst' and avoid castsEric Dumazet
(Anonymous) unions can help us to avoid ugly casts. A common cast it the (struct rtable *)skb->dst one. Defining an union like : union { struct dst_entry *dst; struct rtable *rtable; }; permits to use skb->rtable in place. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03[IPV4]: skb->dst can't be NULL in ip_options_echo.Denis V. Lunev
ip_options_echo is called on the packet input path after the initial routing. The dst entry on the packet is cleared only in the several very specific places and immidiately assigned back (may be new). Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add netns parameter to inet_(dev_)add_type.Eric W. Biederman
The patch extends the inet_addr_type and inet_dev_addr_type with the network namespace pointer. That allows to access the different tables relatively to the network namespace. The modification of the signature function is reported in all the callers of the inet_addr_type using the pointer to the well known init_net. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-31[IPV4] ip_options.c: kmalloc + memset conversion to kzallocMariusz Kozlowski
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iphArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_network_header()Arnaldo Carvalho de Melo
For the places where we need a pointer to the network header, it is still legal to touch skb->nh.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10[NET] IPV4: Fix whitespace errors.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-30[NetLabel]: protect the CIPSOv4 socket option from setsockopt()Paul Moore
This patch makes two changes to protect applications from either removing or tampering with the CIPSOv4 IP option on a socket. The first is the requirement that applications have the CAP_NET_RAW capability to set an IPOPT_CIPSO option on a socket; this prevents untrusted applications from setting their own CIPSOv4 security attributes on the packets they send. The second change is to SELinux and it prevents applications from setting any IPv4 options when there is an IPOPT_CIPSO option already present on the socket; this prevents applications from removing CIPSOv4 security attributes from the packets they send. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: trivial ip_options.c annotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: struct ip_options annotationsAl Viro
->faddr is net-endian; annotated as such, variables inferred to be net-endian annotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: ip_options_build() annotationsAl Viro
daddr is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: inet_addr_type() annotationsAl Viro
argument and inferred net-endian variables in callers annotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: ip_route_input() annotationsAl Viro
ip_route_input() takes net-endian source and destination address. * Annotated as such. * arguments of its invocations annotated where needed. * local helpers getting the same values passed to by it (ip_route_input_mc(), ip_route_input_slow(), ip_handle_martian_source(), ip_mkroute_input(), ip_mkroute_input_def(), __mkroute_input()) annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[INET]: Remove is_setbyuser patchLouis Nyffenegger
The value is_setbyuser from struct ip_options is never used and set only one time (http://linux-net.osdl.org/index.php/TODO#IPV4). This little patch removes it from the kernel source. Signed-off-by: Louis Nyffenegger <louis.nyffenegger@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NetLabel]: core network changesPaul Moore
Changes to the core network stack to support the NetLabel subsystem. This includes changes to the IPv4 option handling to support CIPSO labels. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21[IPV4]: Get rid of redundant IPCB->opts initialisationHerbert Xu
Now that we always zero the IPCB->opts in ip_rcv, it is no longer necessary to do so before calling netif_rx for tunneled packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09[IPV4]: ip_options_fragment() has no effect on fragmentationWei Yongjun
Fix error point to options in ip_options_fragment(). optptr get a error pointer to the ipv4 header, correct is pointer to ipv4 options. Signed-off-by: Wei Yongjun <weiyj@soft.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11[PATCH] capable/capability.h (net/)Randy Dunlap
net: Use <linux/capability.h> where capable() is used. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-03[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.hArnaldo Carvalho de Melo
To help in reducing the number of include dependencies, several files were touched as they were getting needed headers indirectly for stuff they use. Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had linux/dccp.h include twice. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08[NET]: kfree cleanupJesper Juhl
From: Jesper Juhl <jesper.juhl@gmail.com> This is the net/ part of the big kfree cleanup patch. Remove pointless checks for NULL prior to calling kfree() in net/. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br> Acked-by: Marcel Holtmann <marcel@holtmann.org> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Andrew Morton <akpm@osdl.org>
2005-08-29[IP]: Introduce ip_options_get_from_userArnaldo Carvalho de Melo
This variant is needed to satisfy sparse __user annotations. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[IPV4]: possible cleanupsAdrian Bunk
This patch contains the following possible cleanups: - make needlessly global code static - #if 0 the following unused global function: - xfrm4_state.c: xfrm4_state_fini - remove the following unneeded EXPORT_SYMBOL's: - ip_output.c: ip_finish_output - ip_output.c: sysctl_ip_default_ttl - fib_frontend.c: ip_dev_find - inetpeer.c: inet_peer_idlock - ip_options.c: ip_options_compile - ip_options.c: ip_options_undo - net/core/request_sock.c: sysctl_max_syn_backlog Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-16Linux-2.6.12-rc2v2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!