linux - Linux kernel source tree

Age	Commit message (Collapse)	Author
2014-01-16	net_sched: fix error return code in fw_change_attrs()	Wei Yongjun
	The error code was not set if change indev fail, so the error condition wasn't reflected in the return value. Fix to return a negative error code from this error handling case instead of 0. Fixes: 2519a602c273 ('net_sched: optimize tcf_match_indev()') Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	Merge branch 'tipc'	David S. Miller
	Ying Xue says: ==================== tipc: align TIPC behaviours of waiting for events with other stacks Comparing the current implementations of waiting for events in TIPC socket layer with other stacks, TIPC's behaviour is very different because wait_event_interruptible_timeout()/wait_event_interruptible() are always used by TIPC to wait for events while relevant socket or port variables are fed to them as their arguments. As socket lock has to be released temporarily before the two routines of waiting for events are called, their arguments associated with socket or port structures are out of socket lock protection. This might cause serious issues where the process of calling socket syscall such as sendsmg(), connect(), accept(), and recvmsg(), cannot be waken up at all even if proper event arrives or improperly be woken up although the condition of waking up the process is not satisfied in practice. Therefore, aligning its behaviours with similar functions implemented in other stacks, for instance, sk_stream_wait_connect() and inet_csk_wait_for_connect() etc, can avoid above risks for us. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	tipc: standardize recvmsg routine	Ying Xue
	Standardize the behaviour of waiting for events in TIPC recvmsg() so that all variables of socket or port structures are protected within socket lock, allowing the process of calling recvmsg() to be woken up at appropriate time. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	tipc: standardize sendmsg routine of connected socket	Ying Xue
	Standardize the behaviour of waiting for events in TIPC send_packet() so that all variables of socket or port structures are protected within socket lock, allowing the process of calling sendmsg() to be woken up at appropriate time. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	tipc: standardize sendmsg routine of connectionless socket	Ying Xue
	Comparing the behaviour of how to wait for events in TIPC sendmsg() with other stacks, the TIPC implementation might be perceived as different, and sometimes even incorrect. For instance, sk_sleep() and tport->congested variables associated with socket are exposed without socket lock protection while wait_event_interruptible_timeout() accesses them. So standardizing it with similar implementation in other stacks can help us correct these errors which the process of calling sendmsg() cannot be woken up event if an expected event arrive at socket or improperly woken up although the wake condition doesn't match. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	tipc: standardize accept routine	Ying Xue
	Comparing the behaviour of how to wait for events in TIPC accept() with other stacks, the TIPC implementation might be perceived as different, and sometimes even incorrect. As sk_sleep() and sk->sk_receive_queue variables associated with socket are not protected by socket lock, the process of calling accept() may be woken up improperly or sometimes cannot be woken up at all. After standardizing it with inet_csk_wait_for_connect routine, we can get benefits including: avoiding 'thundering herd' phenomenon, adding a timeout mechanism for accept(), coping with a pending signal, and having sk_sleep() and sk->sk_receive_queue being always protected within socket lock scope and so on. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	tipc: standardize connect routine	Ying Xue
	Comparing the behaviour of how to wait for events in TIPC connect() with other stacks, the TIPC implementation might be perceived as different, and sometimes even incorrect. For instance, as both sock->state and sk_sleep() are directly fed to wait_event_interruptible_timeout() as its arguments, and socket lock has to be released before we call wait_event_interruptible_timeout(), the two variables associated with socket are exposed out of socket lock protection, thereby probably getting stale values so that the process of calling connect() cannot be woken up exactly even if correct event arrives or it is woken up improperly even if the wake condition is not satisfied in practice. Therefore, standardizing its behaviour with sk_stream_wait_connect routine can avoid these risks. Additionally the implementation of connect routine is simplified as a whole, allowing it to return correct values in all different cases. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	sctp: remove the unnecessary assignment	wangweidong
	When go the right path, the status is 0, no need to assign it again. So just remove the assignment. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	virtio-net: drop rq->max and rq->num	Jason Wang
	It looks like there's no need for those two fields: - Unless there's a failure for the first refill try, rq->max should be always equal to the vring size. - rq->num is only used to determine the condition that we need to do the refill, we could check vq->num_free instead. - rq->num was required to be increased or decreased explicitly after each get/put which results a bad API. So this patch removes them both to make the code simpler. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: davinci_mdio: Fix sparse warning	Lad, Prabhakar
	This patch fixes following sparse warning davinci_mdio.c:85:27: warning: symbol 'default_pdata' was not declared. Should it be static? Also makes the default_pdata as a constant. Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	bonding: handle slave's name change with primary_slave logic	Veaceslav Falico
	Currently, if a slave's name change, we just pass it by. However, if the slave is a current primary_slave, then we end up with using a slave, whose name != params.primary, for primary_slave. And vice-versa, if we don't have a primary_slave but have params.primary set - we will not detected a new primary_slave. Fix this by catching the NETDEV_CHANGENAME event and setting primary_slave accordingly. Also, if the primary_slave was changed, issue a reselection of the active slave, cause the priorities have changed. Reported-by: Ding Tianhong <dingtianhong@huawei.com> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net_sched: act: pick a different type for act_xt	WANG Cong
	In tcf_register_action() we check either ->type or ->kind to see if there is an existing action registered, but ipt action registers two actions with same type but different kinds. They should have different types too. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge	David S. Miller
	Included change: - properly format already existing kerneldoc Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net_sched: act: use tcf_hash_release() in net/sched/act_police.c	WANG Cong
	Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	i40e: updates to AdminQ interface	Shannon Nelson
	Refinements to cloud support in the Firmware API. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	i40e: check desc pointer before printing	Shannon Nelson
	Check that the descriptors were allocated before trying to dump them to the logfile. While we're there, de-trick-ify the code so as to be easier to read and not abusing the types and unions. Change-ID: I22898f4b22cecda3582d4d9e4018da9cd540f177 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	team: block mtu change before it happens via NETDEV_PRECHANGEMTU	Veaceslav Falico
	Now it catches the NETDEV_CHANGEMTU notification, which is signaled after the actual change happened on the device, and returns NOTIFY_BAD, so that the change on the device is reverted. This might be quite costly and messy, so use the new NETDEV_PRECHANGEMTU to catch the MTU change before the actual change happens and signal that it's forbidden to do it. CC: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: add NETDEV_PRECHANGEMTU to notify before mtu change happens	Veaceslav Falico
	Currently, if a device changes its mtu, first the change happens (invloving all the side effects), and after that the NETDEV_CHANGEMTU is sent so that other devices can catch up with the new mtu. However, if they return NOTIFY_BAD, then the change is reverted and error returned. This is a really long and costy operation (sometimes). To fix this, add NETDEV_PRECHANGEMTU notification which is called prior to any change actually happening, and if any callee returns NOTIFY_BAD - the change is aborted. This way we're skipping all the playing with apply/revert the mtu. CC: "David S. Miller" <davem@davemloft.net> CC: Jiri Pirko <jiri@resnulli.us> CC: Eric Dumazet <edumazet@google.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Cong Wang <amwang@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	r6040: use ETH_ZLEN instead of MISR for SKB length checking	Florian Fainelli
	Ever since this driver was merged the following code was included: if (skb->len < MISR) skb->len = MISR; MISR is defined to 0x3C which is also equivalent to ETH_ZLEN, but use ETH_ZLEN directly which is exactly what we want to be checking for. Reported-by: Marc Volovic <marcv@ezchip.com> Signed-off-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	r6040: add delays in MDIO read/write polling loops	Florian Fainelli
	On newer and faster machines (Vortex X86DX) using the r6040 driver, it was noticed that the driver was returning an error during probing traced down to being the MDIO bus probing and the inability to complete a MDIO read operation in time. It turns out that the MDIO operations on these faster machines usually complete after ~2140 iterations which is bigger than 2048 (MAC_DEF_TIMEOUT) and results in spurious timeouts depending on the system load. Update r6040_phy_read() and r6040_phy_write() to include a 1 micro second delay in each busy-looping iteration of the loop which is a much safer operation than incrementing MAC_DEF_TIMEOUT. Reported-by: Nils Koehler <nils.koehler@ibt-interfaces.de> Reported-by: Daniel Goertzen <daniel.goertzen@gmail.com> Signed-off-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	xen-netfront: add support for IPv6 offloads	Paul Durrant
	This patch adds support for IPv6 checksum offload and GSO when those features are available in the backend. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: Check skb->rxhash in gro_receive	Tom Herbert
	When initializing a gro_list for a packet, first check the rxhash of the incoming skb against that of the skb's in the list. This should be a very strong inidicator of whether the flow is going to be matched, and potentially allows a lot of other checks to be short circuited. Use skb_hash_raw so that we don't force the hash to be calculated. Tested by running netperf 200 TCP_STREAMs between two machines with GRO, HW rxhash, and 1G. Saw no performance degration, slight reduction of time in dev_gro_receive. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: Add skb_get_hash_raw	Tom Herbert
	Function to just return skb->rxhash without checking to see if it needs to be recomputed. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	vxge: make local functions static	stephen hemminger
	Remove unused function vxge_hw_vpath_vid_get Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	bnad: code cleanup	stephen hemminger
	Use 'make namespacecheck' to code that could be declared static. After that remove code that is not being used. Compile tested only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	dm9000: fix a lot of checkpatch issues	Barry Song
	recently, dm9000 codes have many checkpatch errors and warnings: WARNING: please, no space before tabs 3: FILE: dm9000.c:3: + * ^ICopyright (C) 1997 Sten Wang$ WARNING: please, no space before tabs 5: FILE: dm9000.c:5: + * ^IThis program is free software; you can redistribute it and/or$ WARNING: please, no space before tabs 6: FILE: dm9000.c:6: + * ^Imodify it under the terms of the GNU General Public License$ WARNING: please, no space before tabs 7: FILE: dm9000.c:7: + * ^Ias published by the Free Software Foundation; either version 2$ WARNING: please, no space before tabs 8: FILE: dm9000.c:8: + * ^Iof the License, or (at your option) any later version.$ WARNING: please, no space before tabs 10: FILE: dm9000.c:10: + * ^IThis program is distributed in the hope that it will be useful,$ WARNING: please, no space before tabs 11: FILE: dm9000.c:11: + * ^Ibut WITHOUT ANY WARRANTY; without even the implied warranty of$ WARNING: please, no space before tabs 12: FILE: dm9000.c:12: + * ^IMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the$ WARNING: please, no space before tabs 13: FILE: dm9000.c:13: + * ^IGNU General Public License for more details.$ WARNING: do not add new typedefs 97: FILE: dm9000.c:97: +typedef struct board_info { ERROR: spaces prohibited around that ':' (ctx:WxV) 113: FILE: dm9000.c:113: + unsigned int in_suspend :1; ^ ERROR: spaces prohibited around that ':' (ctx:WxV) 114: FILE: dm9000.c:114: + unsigned int wake_supported :1; ^ This patch fixes important errors in it. Signed-off-by: Barry Song <Baohua.Song@csr.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	packet: use percpu mmap tx frame pending refcount	Daniel Borkmann
	In PF_PACKET's packet mmap(), we can avoid using one atomic_inc() and one atomic_dec() call in skb destructor and use a percpu reference count instead in order to determine if packets are still pending to be sent out. Micro-benchmark with [1] that has been slightly modified (that is, protcol = 0 in socket(2) and bind(2)), example on a rather crappy testing machine; I expect it to scale and have even better results on bigger machines: ./packet_mm_tx -s7000 -m7200 -z700000 em1, avg over 2500 runs: With patch: 4,022,015 cyc Without patch: 4,812,994 cyc time ./packet_mm_tx -s64 -c10000000 em1 > /dev/null, stable: With patch: real 1m32.241s user 0m0.287s sys 1m29.316s Without patch: real 1m38.386s user 0m0.265s sys 1m35.572s In function tpacket_snd(), it is okay to use packet_read_pending() since in fast-path we short-circuit the condition already with ph != NULL, since we have next frames to process. In case we have MSG_DONTWAIT, we also do not execute this path as need_wait is false here anyway, and in case of _no_ MSG_DONTWAIT flag, it is okay to call a packet_read_pending(), because when we ever reach that path, we're done processing outgoing frames anyway and only look if there are skbs still outstanding to be orphaned. We can stay lockless in this percpu counter since it's acceptable when we reach this path for the sum to be imprecise first, but we'll level out at 0 after all pending frames have reached the skb destructor eventually through tx reclaim. When people pin a tx process to particular CPUs, we expect overflows to happen in the reference counter as on one CPU we expect heavy increase; and distributed through ksoftirqd on all CPUs a decrease, for example. As David Laight points out, since the C language doesn't define the result of signed int overflow (i.e. rather than wrap, it is allowed to saturate as a possible outcome), we have to use unsigned int as reference count. The sum over all CPUs when tx is complete will result in 0 again. The BUG_ON() in tpacket_destruct_skb() we can remove as well. It can _only_ be set from inside tpacket_snd() path and we made sure to increase tx_ring.pending in any case before we called po->xmit(skb). So testing for tx_ring.pending == 0 is not too useful. Instead, it would rather have been useful to test if lower layers didn't orphan the skb so that we're missing ring slots being put back to TP_STATUS_AVAILABLE. But such a bug will be caught in user space already as we end up realizing that we do not have any TP_STATUS_AVAILABLE slots left anymore. Therefore, we're all set. Btw, in case of RX_RING path, we do not make use of the pending member, therefore we also don't need to use up any percpu memory here. Also note that __alloc_percpu() already returns a zero-filled percpu area, so initialization is done already. [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	packet: don't unconditionally schedule() in case of MSG_DONTWAIT	Daniel Borkmann
	In tpacket_snd(), when we've discovered a first frame that is not in status TP_STATUS_SEND_REQUEST, and return a NULL buffer, we exit the send routine in case of MSG_DONTWAIT, since we've finished traversing the mmaped send ring buffer and don't care about pending frames. While doing so, we still unconditionally call an expensive schedule() in the packet_current_frame() "error" path, which is unnecessary in this case since it's enough to just quit the function. Also, in case MSG_DONTWAIT is not set, we should rather test for need_resched() first and do schedule() only if necessary since meanwhile pending frames could already have finished processing and called skb destructor. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	packet: improve socket create/bind latency in some cases	Daniel Borkmann
	Most people acquire PF_PACKET sockets with a protocol argument in the socket call, e.g. libpcap does so with htons(ETH_P_ALL) for all its sockets. Most likely, at some point in time a subsequent bind() call will follow, e.g. in libpcap with ... memset(&sll, 0, sizeof(sll)); sll.sll_family = AF_PACKET; sll.sll_ifindex = ifindex; sll.sll_protocol = htons(ETH_P_ALL); ... as arguments. What happens in the kernel is that already in socket() syscall, we install a proto hook via register_prot_hook() if our protocol argument is != 0. Yet, in bind() we're almost doing the same work by doing a unregister_prot_hook() with an expensive synchronize_net() call in case during socket() the proto was != 0, plus follow-up register_prot_hook() with a bound device to it this time, in order to limit traffic we get. In the case when the protocol and user supplied device index (== 0) does not change from socket() to bind(), we can spare us doing the same work twice. Similarly for re-binding to the same device and protocol. For these scenarios, we can decrease create/bind latency from ~7447us (sock-bind-2 case) to ~89us (sock-bind-1 case) with this patch. Alternatively, for the first case, if people care, they should simply create their sockets with proto == 0 argument and define the protocol during bind() as this saves a call to synchronize_net() as well (sock-bind-3 case). In all other cases, we're tied to user space behaviour we must not change, also since a bind() is not strictly required. Thus, we need the synchronize_net() to make sure no asynchronous packet processing paths still refer to the previous elements of po->prot_hook. In case of mmap()ed sockets, the workflow that includes bind() is socket() -> setsockopt(<ring>) -> bind(). In that case, a pair of {__unregister, register}_prot_hook is being called from setsockopt() in order to install the new protocol receive handler. Thus, when we call bind and can skip a re-hook, we have already previously installed the new handler. For fanout, this is handled different entirely, so we should be good. Timings on an i7-3520M machine: * sock-bind-1: 89 us * sock-bind-2: 7447 us * sock-bind-3: 75 us sock-bind-1: socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=all(0), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 sock-bind-2: socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=lo(1), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 sock-bind-3: socket(PF_PACKET, SOCK_RAW, 0) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=lo(1), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	i40e: Remove autogenerated Module.symvers file.	David S. Miller
	Fixes: 9d8bf54 ("i40e: associate VMDq queue with VM type") Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net/ipv4: don't use module_init in non-modular gre_offload	Paul Gortmaker
	Recent commit 438e38fadca2f6e57eeecc08326c8a95758594d4 ("gre_offload: statically build GRE offloading support") added new module_init/module_exit calls to the gre_offload.c file. The file is obj-y and can't be anything other than built-in. Currently it can never be built modular, so using module_init as an alias for __initcall can be somewhat misleading. Fix this up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be a worse thing. We also make the inclusion explicit. Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of device_initcall directly in this change means that the runtime impact is zero -- it will remain at level 6 in initcall ordering. As for the module_exit, rather than replace it with __exitcall, we simply remove it, since it appears only UML does anything with those, and even for UML, there is no relevant cleanup to be done here. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net/mlx4_core: clean up srq_res_start_move_to()	Paul Bolle
	Building resource_tracker.o triggers a GCC warning: drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_HW2SW_SRQ_wrapper': drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3202:17: warning: 'srq' may be used uninitialized in this function [-Wmaybe-uninitialized] atomic_dec(&srq->mtt->ref_count); ^ This is a false positive. But a cleanup of srq_res_start_move_to() can help GCC here. The code currently uses a switch statement where a plain if/else would do, since only two of the switch's four cases can ever occur. Dropping that switch makes the warning go away. While we're at it, add some missing braces, and convert state to the correct type. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net/mlx4_core: clean up cq_res_start_move_to()	Paul Bolle
	Building resource_tracker.o triggers a GCC warning: drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_HW2SW_CQ_wrapper': drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3019:16: warning: 'cq' may be used uninitialized in this function [-Wmaybe-uninitialized] atomic_dec(&cq->mtt->ref_count); ^ This is a false positive. But a cleanup of cq_res_start_move_to() can help GCC here. The code currently uses a switch statement where an if/else construct would do too, since only two of the switch's four cases can ever occur. Dropping that switch makes the warning go away. While we're at it, add some missing braces. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	Merge branch 'ixgbe-next'	David S. Miller
	Aaron Brown says: ==================== Intel Wired LAN Driver Updates This series contains updates to ixgbe and ixgbevf. John adds rtnl lock / unlock semantics for ixgbe_reinit_locked() which was being called without the rtnl lock being held. Jacob corrects an issue where ixgbevf_qv_disable function does not set the disabled bit correctly. From the community, Wei uses a type of struct for pci driver-specific data in ixgbevf_suspend() Don changes the way we store ring arrays in a manner that allows support of multiple queues on multiple nodes and creates new ring initialization functions for work previously done across multiple functions - making the code closer to ixgbe and hopefully more readable. He also fixes incorrect fiber eeprom write logic. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	ixgbe: Fix incorrect logic for fixed fiber eeprom write	Don Skidmore
	In this code we wanted to set the bit in IXGBE_SFF_SOFT_RS_SELECT_MASK to the value in rs. So we really needed a logical or rather than an and, this patch makes that change. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	ixgbevf: create function for all of ring init	Don Skidmore
	This patch creates new functions for ring initialization, ixgbevf_configure_tx_ring() and ixgbevf_configure_rx_ring(). The work done in these function previously was spread between several other functions and this change should hopefully lead to greater readability and make the code more like ixgbe. This patch also moves the placement of some older functions to avoid having to write prototypes. It also promotes a couple of debug messages to errors. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	ixgbevf: Convert ring storage form pointer to an array to array of pointers	Don Skidmore
	This will change how we store rings arrays in the adapter sturct. We use to have a pointer to an array now we will be using an array of pointers. This will allow us to support multiple queues on muliple nodes at some point we would be able to reallocate the rings so that each is on a local node if needed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	ixgbevf: use pci drvdata correctly in ixgbevf_suspend()	Wei Yongjun
	We had set the pci driver-specific data in ixgbevf_probe() as a type of struct net_device, so we should use it as netdev in ixgbevf_suspend(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	ixgbevf: set the disable state when ixgbevf_qv_disable is called	Jacob Keller
	The ixgbevf_qv_disable function used by CONFIG_NET_RX_BUSY_POLL is broken, because it does not properly set the IXGBEVF_QV_STATE_DISABLED bit, indicating that the q_vector should be disabled (and preventing future locks from obtaining the vector). This patch corrects the issue by setting the disable state. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	ixgbe: reinit_locked() should be called with rtnl_lock	John Fastabend
	ixgbe_service_task() is calling ixgbe_reinit_locked() without the rtnl_lock being held. This is because it is being called from a worker thread and not a rtnl netlink or dcbnl path. Add rtnl_{un}lock() semantics. I found this during code review. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: eth_type_trans() should use skb_header_pointer()	Eric Dumazet
	eth_type_trans() can read uninitialized memory as drivers do not necessarily pull more than 14 bytes in skb->head before calling it. As David suggested, we can use skb_header_pointer() to fix this without breaking some drivers that might not expect eth_type_trans() pulling 2 additional bytes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	Merge branch 'stmmac_pm'	David S. Miller
	Srinivas Kandagatla says: ==================== net: stmmac PM related fixes. During PM_SUSPEND_FREEZE testing, I have noticed that PM support in STMMAC is partly broken. I had to re-arrange the code to do PM correctly. There were lot of things I did not like personally and some bits did not work in the first place. I thought this is the nice opportunity to clean the mess up. Here is what I did: any 1> Test PM suspend freeze via pm_test It did not work for following reasons. - If the power to gmac is removed when it enters in low power state. stmmac_resume could not cope up with such behaviour, it was expecting the ip register contents to be still same as before entering low power, This assumption is wrong. So I started to add some code to do Hardware initialization, thats when I started to re-arrange the code. stmmac_open contains both resource and memory allocations and hardware initialization. I had to separate these two things in two different functions. These two patches do that net: stmmac: move dma allocation to new function net: stmmac: move hardware setup for stmmac_open to new function And rest of the other patches are fixing the loose ends, things like mdio reset, which might be necessary in cases likes hibernation(I did not test). In hibernation cases the driver was just unregistering with subsystems and releasing resources which I did not like and its not necessary to do this as part of PM. So using the same stmmac_suspend/resume made more sense for hibernation cases than using stmmac_open/release. Also fixed a NULL pointer dereference bug too. 2> Test WOL via PM_SUSPEND_FREEZE Did get an wakeup interrupt, but could not wakeup a freeze system. So I had to add pm_wakeup_event to the driver. net: stmmac: notify the PM core of a wakeup event. patch. Also few patches like net: stmmac: make stmmac_mdio_reset non-static net: stmmac: restore pinstate in pm resume. helps the resume function to reset the phy and put back the pins in default state. Changes since RFC: - Rebased to net-next on Dave's suggestion. All these patches are Acked by Peppe. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: stmmac: notify the PM core of a wakeup event.	Srinivas Kandagatla
	In PM_SUSPEND_FREEZE and WOL(Wakeup On Lan) case, when the driver gets a wakeup event, either the driver or platform specific PM code should notify the pm core about it, so that the system can wakeup from low power. In cases where there is no involvement of platform specific PM, it becomes driver responsibility to notify the PM core to wakeup the system. Without this WOL with PM_SUSPEND_FREEZE does not work on STi based SOCs. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: stmmac: restore pinstate in pm resume.	Srinivas Kandagatla
	This patch adds code to restore default pinstate of the pins when it comes back from low power state. Without this patch the state of the pins would be unknown and the driver would not work. This patch also adds code to put the pins in to sleep state when the driver enters low power state. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: stmmac: use suspend functions for hibernation	Srinivas Kandagatla
	In hibernation freeze case the driver just releases the resources like dma buffers, irqs, unregisters the drivers and during restore it does register, request the resources. This is not really necessary, as part of power management all the data structures are intact, all the previously allocated resources can be used after coming out of low power. This patch uses the suspend and resume callbacks for freeze and restore which initializes the hardware correctly without unregistering or releasing the resources, this should also help in reducing the time to restore. Also this patch fixes a bug in stmmac_pltfr_restore and stmmac_pltfr_freeze where it tries to get hold of platform data via dev_get_platdata call, which would return NULL in device tree cases and the next if statement would crash as there is no NULL check. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: stmmac: fix power management suspend-resume case	Srinivas Kandagatla
	The driver PM resume assumes that the IP is still powered up and the all the register contents are not disturbed when it comes out of low power suspend case. This assumption is wrong, basically the driver should not consider any state of registers after it comes out of low power. However driver can keep the part of the IP powered up if its a wake up source. But it can not assume the register state of the IP. Also its possible that SOC glue layer can take the power off the IP if its not wake-up source to reduce the power consumption. This patch re initializes hardware by calling stmmac_hw_setup function in resume case. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: stmmac: make stmmac_mdio_reset non-static	Srinivas Kandagatla
	This patch promotes stmmac_mdio_reset function from static to non-static, so that power management functions can decide to reset if the IP comes out from lowe power state specially hibernation cases. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: stmmac: move hardware setup for stmmac_open to new function	Srinivas Kandagatla
	This patch moves hardware setup part of the code in stmmac_open to a new function stmmac_hw_setup, the reason for doing this is to make hw initialization independent function so that PM functions can re-use it to re-initialize the IP after returning from low power state. This will also avoid code duplication across stmmac_resume/restore and stmmac_open. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: stmmac: move dma allocation to new function	Srinivas Kandagatla
	This patch moves dma resource allocation to a new function alloc_dma_desc_resources, the reason for moving this to a new function is to keep the memory allocations in a separate function. One more reason it to get suspend and hibernation cases working without releasing and allocating these resources during suspend-resume and freeze-restore cases. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16	net: stmmac: mdio: remove reset gpio free	Srinivas Kandagatla
	This patch removes gpio_free for reset line of the phy, driver stores the gpio number in its private data-structure to use in future. As the driver uses this pin in future this pin should not be freed. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>