linux - Linux kernel source tree

Age	Commit message (Collapse)	Author
2014-03-11	Linux 3.4.83v3.4.83	Greg Kroah-Hartman

2014-03-11	net: asix: add missing flag to struct driver_info	Emil Goode
	commit d43ff4cd798911736fb39025ec8004284b1b0bc2 upstream. The struct driver_info ax88178_info is assigned the function asix_rx_fixup_common as it's rx_fixup callback. This means that FLAG_MULTI_PACKET must be set as this function is cloning the data and calling usbnet_skb_return. Not setting this flag leads to usbnet_skb_return beeing called a second time from within the rx_process function in the usbnet module. Signed-off-by: Emil Goode <emilgoode@gmail.com> Reported-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	net: asix: handle packets crossing URB boundaries	Lucas Stach
	commit 8b5b6f5413e97c3e8bafcdd67553d508f4f698cd upstream. ASIX AX88772B started to pack data even more tightly. Packets and the ASIX packet header may now cross URB boundaries. To handle this we have to introduce some state between individual calls to asix_rx_fixup(). Signed-off-by: Lucas Stach <dev@lynxeye.de> Signed-off-by: David S. Miller <davem@davemloft.net> [ Emil: backported to 3.4: dropped changes to drivers/net/usb/ax88172a.c ] Signed-off-by: Emil Goode <emilgoode@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	rtlwifi: Fix endian error in extracting packet type	Mark Cave-Ayland
	commit 0c5d63f0ab6728f05ddefa25aff55e31297f95e6 upstream. All of the rtlwifi drivers have an error in the routine that tests if the data is "special". If it is, the subsequant transmission will be at the lowest rate to enhance reliability. The 16-bit quantity is big-endian, but was being extracted in native CPU mode. One of the effects of this bug is to inhibit association under some conditions as the TX rate is too high. Based on suggestions by Joe Perches, the entire routine is rewritten. One of the local headers contained duplicates of some of the ETH_P_XXX definitions. These are deleted. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com> [wujg: Backported to 3.4: - adjust context - remove rtlpriv->enter_ps = false - use schedule_work(&rtlpriv->works.lps_leave_work)] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	iwlwifi: pcie: add SKUs for 6000, 6005 and 6235 series	Emmanuel Grumbach
	commit 08a5dd3842f2ac61c6d69661d2d96022df8ae359 upstream. Add some new PCI IDs to the table for 6000, 6005 and 6235 series. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> [bwh: Backported to 3.2: - Adjust filenames - Drop const from struct iwl_cfg] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [wujg: Backported to 3.4: - Adjust context - Do not drop const from struct iwl_cfg] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	iwlwifi: dvm: fix calling ieee80211_chswitch_done() with NULL	Stanislaw Gruszka
	commit 9186a1fd9ed190739423db84bc344d258ef3e3d7 upstream. If channel switch is pending and we remove interface we can crash like showed below due to passing NULL vif to mac80211: BUG: unable to handle kernel paging request at fffffffffffff8cc IP: [<ffffffff8130924d>] strnlen+0xd/0x40 Call Trace: [<ffffffff8130ad2e>] string.isra.3+0x3e/0xd0 [<ffffffff8130bf99>] vsnprintf+0x219/0x640 [<ffffffff8130c481>] vscnprintf+0x11/0x30 [<ffffffff81061585>] vprintk_emit+0x115/0x4f0 [<ffffffff81657bd5>] printk+0x61/0x63 [<ffffffffa048987f>] ieee80211_chswitch_done+0xaf/0xd0 [mac80211] [<ffffffffa04e7b34>] iwl_chswitch_done+0x34/0x40 [iwldvm] [<ffffffffa04f83c3>] iwlagn_commit_rxon+0x2a3/0xdc0 [iwldvm] [<ffffffffa04ebc50>] ? iwlagn_set_rxon_chain+0x180/0x2c0 [iwldvm] [<ffffffffa04e5e76>] iwl_set_mode+0x36/0x40 [iwldvm] [<ffffffffa04e5f0d>] iwlagn_mac_remove_interface+0x8d/0x1b0 [iwldvm] [<ffffffffa0459b3d>] ieee80211_do_stop+0x29d/0x7f0 [mac80211] This is because we nulify ctx->vif in iwlagn_mac_remove_interface() before calling some other functions that teardown interface. To fix just check ctx->vif on iwl_chswitch_done(). We should not call ieee80211_chswitch_done() as channel switch works were already canceled by mac80211 in ieee80211_do_stop() -> ieee80211_mgd_stop(). Resolve: https://bugzilla.redhat.com/show_bug.cgi?id=979581 Reported-by: Lukasz Jagiello <jagiello.lukasz@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> [bwh: Backported to 3.2: adjust context, filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [wujg: Backported to 3.4: - adjust context] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	iwlwifi: dvm: don't send BT_CONFIG on devices w/o Bluetooth	Johannes Berg
	commit 707aee401d2467baa785a697f40a6e2d9ee79ad5 upstream. The BT_CONFIG command that is sent to the device during startup will enable BT coex unless the module parameter turns it off, but on devices without Bluetooth this may cause problems, as reported in Redhat BZ 885407. Fix this by sending the BT_CONFIG command only when the device has Bluetooth. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> [bwh: Backported to 3.2: - Adjust filename - s/priv->lib/priv->cfg/] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [wujg: Backported to 3.4: - s/priv->cfg/priv->shrd->cfg/] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	iwlwifi: always copy first 16 bytes of commands	Johannes Berg
	commit 8a964f44e01ad3bbc208c3e80d931ba91b9ea786 upstream. The FH hardware will always write back to the scratch field in commands, even host commands not just TX commands, which can overwrite parts of the command. This is problematic if the command is re-used (with IWL_HCMD_DFL_NOCOPY) and can cause calibration issues. Address this problem by always putting at least the first 16 bytes into the buffer we also use for the command header and therefore make the DMA engine write back into this. For commands that are smaller than 16 bytes also always map enough memory for the DMA engine to write back to. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> [bwh: Backported to 3.2: - Adjust context - Drop the IWL_HCMD_DFL_DUP handling - Fix descriptor addresses and lengths for tracepoint, but otherwise leave it unchanged] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [wujg: Backported to 3.4: adjust context] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	iwlwifi: handle DMA mapping failures	Johannes Berg
	commit 7c34158231b2eda8dcbd297be2bb1559e69cb433 upstream. The RX replenish code doesn't handle DMA mapping failures, which will cause issues if there actually is a failure. This was reported by Shuah Khan who found a DMA mapping framework warning ("device driver failed to check map error"). Reported-by: Shuah Khan <shuah.khan@hp.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> [bwh: Backported to 3.2: - Adjust filename, context, indentation - Use bus(trans) instead of trans where necessary - Use hw_params(trans).rx_page_order instead of trans_pcie->rx_page_order] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [wujg: Backported to 3.4: - Adjust context - Use trans instead of bus(trans) - Use hw_params(trans).rx_page_order instead of trans_pcie->rx_page_order] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	iwlwifi: don't handle masked interrupt	Emmanuel Grumbach
	commit 25a172655f837bdb032e451f95441bb4acec51bb upstream. This can lead to a panic if the driver isn't ready to handle them. Since our interrupt line is shared, we can get an interrupt at any time (and CONFIG_DEBUG_SHIRQ checks that even when the interrupt is being freed). If the op_mode has gone away, we musn't call it. To avoid this the transport disables the interrupts when the hw is stopped and the op_mode is leaving. If there is an event that would cause an interrupt the INTA register is updated regardless of the enablement of the interrupts: even if the interrupts are disabled, the INTA will be changed, but the device won't issue an interrupt. But the ISR can be called at any time, so we ought ignore the value in the INTA otherwise we can call the op_mode after it was freed. I found this bug when the op_mode_start failed, and called iwl_trans_stop_hw(trans, true). Then I played with the RFKILL button, and removed the module. While removing the module, the IRQ is freed, and the ISR is called (CONFIG_DEBUG_SHIRQ enabled). Panic. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> [bwh: Backported to 3.2: - Adjust context - Pass bus(trans), not trans, to iwl_{read,write}32()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [wujg: Backported to 3.4: - adjust context - Pass trans to iwl_{read,write}32()}] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	iwlwifi: protect SRAM debugfs	Johannes Berg
	commit 4fc79db178f0a0ede479b4713e00df2d106028b3 upstream. If the device is not started, we can't read its SRAM and attempting to do so will cause issues. Protect the debugfs read. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> [wujg: Backported to 3.4: adjust context] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	iwlwifi: fix flow handler debug code	Johannes Berg
	commit 94543a8d4fb302817014981489f15cb3b92ec3c2 upstream. iwl_dbgfs_fh_reg_read() can cause crashes and/or BUG_ON in slub because the ifdefs are wrong, the code in iwl_dump_fh() should use DEBUGFS, not DEBUG to protect the buffer writing code. Also, while at it, clean up the arguments to the function, some code and make it generally safer. Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> [bwh: Backported to 3.2: adjust filenames and context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [wujg: Backported to 3.4: adjust context] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ALSA: asihpi: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit 60478295d6876619f8f47f6d1a5c25eaade69ee3 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai <tiwai@suse.de> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	staging: line6: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit 86f0b5b86d142b9323432fef078a6cf0fb5dda74 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ASoC: s6000: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit 61be2b9a18ec70f3cbe3deef7a5f77869c71b5ae upstream. snd_pcm_stop() must be called in the PCM substream lock context. Acked-by: Mark Brown <broonie@linaro.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ALSA: pxa2xx: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit 46f6c1aaf790be9ea3c8ddfc8f235a5f677d08e2 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Acked-by: Mark Brown <broonie@linaro.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ALSA: usx2y: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit 5be1efb4c2ed79c3d7c0cbcbecae768377666e84 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ALSA: ua101: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit 9538aa46c2427d6782aa10036c4da4c541605e0e upstream. snd_pcm_stop() must be called in the PCM substream lock context. Acked-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ALSA: 6fire: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit 5b9ab3f7324a1b94a5a5a76d44cf92dfeb3b5e80 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ALSA: atiixp: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit cc7282b8d5abbd48c81d1465925d464d9e3eaa8f upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ASoC: sglt5000: Fix the default value of CHIP_SSS_CTRL	Fabio Estevam
	commit 016fcab8ff46fca29375d484226ec91932aa4a07 upstream. According to the sgtl5000 reference manual, the default value of CHIP_SSS_CTRL is 0x10. Reported-by: Oskar Schirmer <oskar@scara.com> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Mark Brown <broonie@linaro.org> [bwh: Backported to 3.2: format of register defaults array is different] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ASoC: imx-ssi: Fix occasional AC97 reset failure	Sascha Hauer
	commit b6e51600f4e983e757b1b6942becaa1ae7d82e67 upstream. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> [bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	SUNRPC: Prevent an rpc_task wakeup race	Trond Myklebust
	commit a3c3cac5d31879cd9ae2de7874dc6544ca704aec upstream. The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to be careful about ordering the calls to rpc_test_and_set_running(task) and rpc_clear_queued(task). If we get the order wrong, then we may end up testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped and changed the state of the rpc_task. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	sunrpc: clarify comments on rpc_make_runnable	Jeff Layton
	commit 506026c3ec270e18402f0c9d33fee37482c23861 upstream. rpc_make_runnable is not generally called with the queue lock held, unless it's waking up a task that has been sitting on a waitqueue. This is safe when the task has not entered the FSM yet, but the comments don't really spell this out. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen/events: mask events when changing their VCPU binding	David Vrabel
	commit 5e72fdb8d827560893642e85a251d339109a00f4 upstream. commit 4704fe4f03a5ab27e3c36184af85d5000e0f8a48 upstream. When a event is being bound to a VCPU there is a window between the EVTCHNOP_bind_vpcu call and the adjustment of the local per-cpu masks where an event may be lost. The hypervisor upcalls the new VCPU but the kernel thinks that event is still bound to the old VCPU and ignores it. There is even a problem when the event is being bound to the same VCPU as there is a small window beween the clear_bit() and set_bit() calls in bind_evtchn_to_cpu(). When scanning for pending events, the kernel may read the bit when it is momentarily clear and ignore the event. Avoid this by masking the event during the whole bind operation. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> [bwh: Backported to 3.2: remove the BM() cast] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen/blkback: Check for insane amounts of request on the ring (v6).	Konrad Rzeszutek Wilk
	commit 9371cadbbcc7c00c81753b9727b19fb3bc74d458 upstream. commit 8e3f8755545cc4a7f4da8e9ef76d6d32e0dca576 upstream. Check that the ring does not have an insane amount of requests (more than there could fit on the ring). If we detect this case we will stop processing the requests and wait until the XenBus disconnects the ring. The existing check RING_REQUEST_CONS_OVERFLOW which checks for how many responses we have created in the past (rsp_prod_pvt) vs requests consumed (req_cons) and whether said difference is greater or equal to the size of the ring, does not catch this case. Wha the condition does check if there is a need to process more as we still have a backlog of responses to finish. Note that both of those values (rsp_prod_pvt and req_cons) are not exposed on the shared ring. To understand this problem a mini crash course in ring protocol response/request updates is in place. There are four entries: req_prod and rsp_prod; req_event and rsp_event to track the ring entries. We are only concerned about the first two - which set the tone of this bug. The req_prod is a value incremented by frontend for each request put on the ring. Conversely the rsp_prod is a value incremented by the backend for each response put on the ring (rsp_prod gets set by rsp_prod_pvt when pushing the responses on the ring). Both values can wrap and are modulo the size of the ring (in block case that is 32). Please see RING_GET_REQUEST and RING_GET_RESPONSE for the more details. The culprit here is that if the difference between the req_prod and req_cons is greater than the ring size we have a problem. Fortunately for us, the '__do_block_io_op' loop: rc = blk_rings->common.req_cons; rp = blk_rings->common.sring->req_prod; while (rc != rp) { .. blk_rings->common.req_cons = ++rc; /* before make_response() */ } will loop up to the point when rc == rp. The macros inside of the loop (RING_GET_REQUEST) is smart and is indexing based on the modulo of the ring size. If the frontend has provided a bogus req_prod value we will loop until the 'rc == rp' - which means we could be processing already processed requests (or responses) often. The reason the RING_REQUEST_CONS_OVERFLOW is not helping here is b/c it only tracks how many responses we have internally produced and whether we would should process more. The astute reader will notice that the macro RING_REQUEST_CONS_OVERFLOW provides two arguments - more on this later. For example, if we were to enter this function with these values: blk_rings->common.sring->req_prod = X+31415 (X is the value from the last time __do_block_io_op was called). blk_rings->common.req_cons = X blk_rings->common.rsp_prod_pvt = X The RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, blk_rings->common.req_cons) is doing: req_cons - rsp_prod_pvt >= 32 Which is, X - X >= 32 or 0 >= 32 And that is false, so we continue on looping (this bug). If we re-use said macro RING_REQUEST_CONS_OVERFLOW and pass in the rp instead (sring->req_prod) of rc, the this macro can do the check: req_prod - rsp_prov_pvt >= 32 Which is, X + 31415 - X >= 32 , or 31415 >= 32 which is true, so we can error out and break out of the function. Unfortunatly the difference between rsp_prov_pvt and req_prod can be at 32 (which would error out in the macro). This condition exists when the backend is lagging behind with the responses and still has not finished responding to all of them (so make_response has not been called), and the rsp_prov_pvt + 32 == req_cons. This ends up with us not being able to use said macro. Hence introducing a new macro called RING_REQUEST_PROD_OVERFLOW which does a simple check of: req_prod - rsp_prod_pvt > RING_SIZE And with the X values from above: X + 31415 - X > 32 Returns true. Also not that if the ring is full (which is where the RING_REQUEST_CONS_OVERFLOW triggered), we would not hit the same condition: X + 32 - X > 32 Which is false. Lets use that macro. Note that in v5 of this patchset the macro was different - we used an earlier version. [v1: Move the check outside the loop] [v2: Add a pr_warn as suggested by David] [v3: Use RING_REQUEST_CONS_OVERFLOW as suggested by Jan] [v4: Move wake_up after kthread_stop as suggested by Jan] [v5: Use RING_REQUEST_PROD_OVERFLOW instead] [v6: Use RING_REQUEST_PROD_OVERFLOW - Jan's version] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen/io/ring.h: new macro to detect whether there are too many requests on ↵	Jan Beulich
	the ring commit 8d9256906a97c24e97e016482b9be06ea2532b05 upstream. Backends may need to protect themselves against an insane number of produced requests stored by a frontend, in case they iterate over requests until reaching the req_prod value. There can't be more requests on the ring than the difference between produced requests and produced (but possibly not yet published) responses. This is a more strict alternative to a patch previously posted by Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen-netback: don't disconnect frontend when seeing oversize packet	Wei Liu
	commit 03393fd5cc2b6cdeec32b704ecba64dbb0feae3c upstream. Some frontend drivers are sending packets > 64 KiB in length. This length overflows the length field in the first slot making the following slots have an invalid length. Turn this error back into a non-fatal error by dropping the packet. To avoid having the following slots having fatal errors, consume all slots in the packet. This does not reopen the security hole in XSA-39 as if the packet as an invalid number of slots it will still hit fatal error case. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen-netback: coalesce slots in TX path and fix regressions	Wei Liu
	commit 2810e5b9a7731ca5fce22bfbe12c96e16ac44b6f upstream. This patch tries to coalesce tx requests when constructing grant copy structures. It enables netback to deal with situation when frontend's MAX_SKB_FRAGS is larger than backend's MAX_SKB_FRAGS. With the help of coalescing, this patch tries to address two regressions avoid reopening the security hole in XSA-39. Regression 1. The reduction of the number of supported ring entries (slots) per packet (from 18 to 17). This regression has been around for some time but remains unnoticed until XSA-39 security fix. This is fixed by coalescing slots. Regression 2. The XSA-39 security fix turning "too many frags" errors from just dropping the packet to a fatal error and disabling the VIF. This is fixed by coalescing slots (handling 18 slots when backend's MAX_SKB_FRAGS is 17) which rules out false positive (using 18 slots is legit) and dropping packets using 19 to `max_skb_slots` slots. To avoid reopening security hole in XSA-39, frontend sending packet using more than max_skb_slots is considered malicious. The behavior of netback for packet is thus: 1-18 slots: valid 19-max_skb_slots slots: drop and respond with an error max_skb_slots+ slots: fatal error max_skb_slots is configurable by admin, default value is 20. Also change variable name from "frags" to "slots" in netbk_count_requests. Please note that RX path still has dependency on MAX_SKB_FRAGS. This will be fixed with separate patch. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen-netback: fix sparse warning	stephen hemminger
	commit 9eaee8beeeb3bca0d9b14324fd9d467d48db784c upstream. Fix warning about 0 used as NULL. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU ↵	Konrad Rzeszutek Wilk
	online/offline commit 66ff0fe9e7bda8aec99985b24daad03652f7304e upstream. While we don't use the spinlock interrupt line (see for details commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 - xen: disable PV spinlocks on HVM) - we should still do the proper init / deinit sequence. We did not do that correctly and for the CPU init for PVHVM guest we would allocate an interrupt line - but failed to deallocate the old interrupt line. This resulted in leakage of an irq_desc but more importantly this splat as we online an offlined CPU: genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1) Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1 Call Trace: [<ffffffff811156de>] __setup_irq+0x23e/0x4a0 [<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250 [<ffffffff811161bb>] request_threaded_irq+0xfb/0x160 [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20 [<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160 [<ffffffff81303758>] ? kasprintf+0x38/0x40 [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20 [<ffffffff810cad35>] ? update_max_interval+0x15/0x40 [<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78 [<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33 [<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70 [<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10 [<ffffffff8109402b>] __cpu_notify+0x1b/0x30 [<ffffffff8166834a>] _cpu_up+0xa0/0x14b [<ffffffff816684ce>] cpu_up+0xd9/0xec [<ffffffff8165f754>] store_online+0x94/0xd0 [<ffffffff8141d15b>] dev_attr_store+0x1b/0x20 [<ffffffff81218f44>] sysfs_write_file+0xf4/0x170 [<ffffffff811a2864>] vfs_write+0xb4/0x130 [<ffffffff811a302a>] sys_write+0x5a/0xa0 [<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b cpu 1 spinlock event irq -16 smpboot: Booting Node 0 Processor 1 APIC 0x2 And if one looks at the /proc/interrupts right after offlining (CPU1): 70: 0 0 xen-percpu-ipi spinlock0 71: 0 0 xen-percpu-ipi spinlock1 77: 0 0 xen-percpu-ipi spinlock2 There is the oddity of the 'spinlock1' still being present. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen/smp: Fix leakage of timer interrupt line for every CPU online/offline.	Konrad Rzeszutek Wilk
	commit 888b65b4bc5e7fcbbb967023300cd5d44dba1950 upstream. In the PVHVM path when we do CPU online/offline path we would leak the timer%d IRQ line everytime we do a offline event. The online path (xen_hvm_setup_cpu_clockevents via x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt line for the timer%d. But we would still use the old interrupt line leading to: kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261! invalid opcode: 0000 [#1] SMP RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270 .. snip.. <IRQ> [<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0 [<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0 [<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240 [<ffffffff811175b9>] handle_percpu_irq+0x49/0x70 [<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0 [<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40 [<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80 <EOI> [<ffffffff81666d01>] ? start_secondary+0x193/0x1a8 [<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8 There is also the oddity (timer1) in the /proc/interrupts after offlining CPU1: 64: 1121 0 xen-percpu-virq timer0 78: 0 0 xen-percpu-virq timer1 84: 0 2483 xen-percpu-virq timer2 This patch fixes it. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	xen/boot: Disable BIOS SMP MP table search.	Konrad Rzeszutek Wilk
	commit bd49940a35ec7d488ae63bd625639893b3385b97 upstream. As the initial domain we are able to search/map certain regions of memory to harvest configuration data. For all low-level we use ACPI tables - for interrupts we use exclusively ACPI _PRT (so DSDT) and MADT for INT_SRC_OVR. The SMP MP table is not used at all. As a matter of fact we do not even support machines that only have SMP MP but no ACPI tables. Lets follow how Moorestown does it and just disable searching for BIOS SMP tables. This also fixes an issue on HP Proliant BL680c G5 and DL380 G6: 9f->100 for 1:1 PTE Freeing 9f-100 pfn range: 97 pages freed 1-1 mapping on 9f->100 .. snip.. e820: BIOS-provided physical RAM map: Xen: [mem 0x0000000000000000-0x000000000009efff] usable Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable .. snip.. Scan for SMP in [mem 0x00000000-0x000003ff] Scan for SMP in [mem 0x0009fc00-0x0009ffff] Scan for SMP in [mem 0x000f0000-0x000fffff] found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0] (XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0 (XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e() BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81ac07e2>] xen_set_pte_init+0x66/0x71 . snip.. Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5 .. snip.. Call Trace: [<ffffffff81ad31c6>] __early_ioremap+0x18a/0x248 [<ffffffff81624731>] ? printk+0x48/0x4a [<ffffffff81ad32ac>] early_ioremap+0x13/0x15 [<ffffffff81acc140>] get_mpc_size+0x2f/0x67 [<ffffffff81acc284>] smp_scan_config+0x10c/0x136 [<ffffffff81acc2e4>] default_find_smp_config+0x36/0x5a [<ffffffff81ac3085>] setup_arch+0x5b3/0xb5b [<ffffffff81624731>] ? printk+0x48/0x4a [<ffffffff81abca7f>] start_kernel+0x90/0x390 [<ffffffff81abc356>] x86_64_start_reservations+0x131/0x136 [<ffffffff81abfa83>] xen_start_kernel+0x65f/0x661 (XEN) Domain 0 crashed: 'noreboot' set - not rebooting. which is that ioremap would end up mapping 0xff using _PAGE_IOMAP (which is what early_ioremap sticks as a flag) - which meant we would get MFN 0xFF (pte ff461, which is OK), and then it would also map 0x100 (b/c ioremap tries to get page aligned request, and it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page) as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP bypasses the P2M lookup we would happily set the PTE to 1000461. Xen would deny the request since we do not have access to the Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example 0x80140. Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665 Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	saa7134: Fix unlocked snd_pcm_stop() call	Takashi Iwai
	commit e6355ad7b1c6f70e2f48ae159f5658b441ccff95 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai <tiwai@suse.de> [wml: Backported to 3.4: Adjust filename] Signed-off-by: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ext4: return ENOMEM if sb_getblk() fails	Theodore Ts'o
	commit 860d21e2c585f7ee8a4ecc06f474fdc33c9474f4 upstream. The only reason for sb_getblk() failing is if it can't allocate the buffer_head. So ENOMEM is more appropriate than EIO. In addition, make sure that the file system is marked as being inconsistent if sb_getblk() fails. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> [xr: Backported to 3.4: - Drop change to inline.c - Call to ext4_ext_check() from ext4_ext_find_extent() is conditional] Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	block: Don't access request after it might be freed	Roland Dreier
	commit 893d290f1d7496db97c9471bc352ad4a11dc8a25 upstream. After we've done __elv_add_request() and __blk_run_queue() in blk_execute_rq_nowait(), the request might finish and be freed immediately. Therefore checking if the type is REQ_TYPE_PM_RESUME isn't safe afterwards, because if it isn't, rq might be gone. Instead, check beforehand and stash the result in a temporary. This fixes crashes in blk_execute_rq_nowait() I get occasionally when running with lots of memory debugging options enabled -- I think this race is usually harmless because the window for rq to be reallocated is so small. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> [xr: Backported to 3.4: adjust context] Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	nbd: correct disconnect behavior	Paul Clements
	commit c378f70adbc1bbecd9e6db145019f14b2f688c7c upstream. Currently, when a disconnect is requested by the user (via NBD_DISCONNECT ioctl) the return from NBD_DO_IT is undefined (it is usually one of several error codes). This means that nbd-client does not know if a manual disconnect was performed or whether a network error occurred. Because of this, nbd-client's persist mode (which tries to reconnect after error, but not after manual disconnect) does not always work correctly. This change fixes this by causing NBD_DO_IT to always return 0 if a user requests a disconnect. This means that nbd-client can correctly either persist the connection (if an error occurred) or disconnect (if the user requested it). Signed-off-by: Paul Clements <paul.clements@steeleye.com> Acked-by: Rob Landley <rob@landley.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [xr: Backported to 3.4: adjust context] Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	cifs: adjust sequence number downward after signing NT_CANCEL request	Jeff Layton
	commit 31efee60f489c759c341454d755a9fd13de8c03d upstream. When a call goes out, the signing code adjusts the sequence number upward by two to account for the request and the response. An NT_CANCEL however doesn't get a response of its own, it just hurries the server along to get it to respond to the original request more quickly. Therefore, we must adjust the sequence number back down by one after signing a NT_CANCEL request. Reported-by: Tim Perry <tdparmor-sambabugs@yahoo.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> [bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ext4: fix possible use-after-free with AIO	Jan Kara
	commit 091e26dfc156aeb3b73bc5c5f277e433ad39331c upstream. Running AIO is pinning inode in memory using file reference. Once AIO is completed using aio_complete(), file reference is put and inode can be freed from memory. So we have to be sure that calling aio_complete() is the last thing we do with the inode. Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	UBIFS: fix double free of ubifs_orphan objects	Adam Thomas
	commit 8afd500cb52a5d00bab4525dd5a560d199f979b9 upstream. The last orphan in the dnext list has its dnext set to NULL. Because of that, ubifs_delete_orphan assumes that it is not on the dnext list and frees it immediately instead ignoring it as a second delete. The orphan is later freed again by erase_deleted. This change adds an explicit flag to ubifs_orphan indicating whether it is pending delete. Signed-off-by: Adam Thomas <adamthomas1111@gmail.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ext4/jbd2: don't wait (forever) for stale tid caused by wraparound	Theodore Ts'o
	commit d76a3a77113db020d9bb1e894822869410450bd9 upstream. In the case where an inode has a very stale transaction id (tid) in i_datasync_tid or i_sync_tid, it's possible that after a very large (2**31) number of transactions, that the tid number space might wrap, causing tid_geq()'s calculations to fail. Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily", attempted to fix this problem, but it only avoided kjournald spinning forever by fixing the logic in jbd2_log_start_commit(). Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c that might call jbd2_log_start_commit() with a stale tid, those functions will subsequently call jbd2_log_wait_commit() with the same stale tid, and then wait for a very long time. To fix this, we replace the calls to jbd2_log_start_commit() and jbd2_log_wait_commit() with a call to a new function, jbd2_complete_transaction(), which will correctly handle stale tid's. As a bonus, jbd2_complete_transaction() will avoid locking j_state_lock for writing unless a commit needs to be started. This should have a small (but probably not measurable) improvement for ext4's scalability. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Ben Hutchings <ben@decadent.org.uk> Reported-by: George Barnett <gbarnett@atlassian.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	ncpfs: fix rmdir returns Device or resource busy	Dave Chiluk
	commit 698b8223631472bf982ed570b0812faa61955683 upstream. 1d2ef5901483004d74947bbf78d5146c24038fe7 caused a regression in ncpfs such that directories could no longer be removed. This was because ncp_rmdir checked to see if a dentry could be unhashed before allowing it to be removed. Since 1d2ef5901483004d74947bbf78d5146c24038fe7 introduced a change that incremented dentry->d_count causing it to always be greater than 1 unhash would always fail. Thus causing the error path in ncp_rmdir to always be taken. Removing this error path is safe as unhashing is still accomplished by calls to dput from vfs_rmdir. Signed-off-by: Dave Chiluk <chiluk@canonical.com> Signed-off-by: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	cifs: don't instantiate new dentries in readdir for inodes that need to be ↵	Jeff Layton
	revalidated immediately commit 757c4f6260febff982276818bb946df89c1105aa upstream. David reported that commit c2b93e06 (cifs: only set ops for inodes in I_NEW state) caused a regression with mfsymlinks. Prior to that patch, if a mfsymlink dentry was instantiated at readdir time, the inode would get a new set of ops when it was revalidated. After that patch, this did not occur. This patch addresses this by simply skipping instantiating dentries in the readdir codepath when we know that they will need to be immediately revalidated. The next attempt to use that dentry will cause a new lookup to occur (which is basically what we want to happen anyway). Cc: "Stefan (metze) Metzmacher" <metze@samba.org> Cc: Sachin Prabhu <sprabhu@redhat.com> Reported-and-Tested-by: David McBride <dwm37@cam.ac.uk> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> [bwh: Backported to 3.2: need to return NULL] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	libceph: unregister request in __map_request failed and nofail == false	majianpeng
	commit 73d9f7eef3d98c3920e144797cc1894c6b005a1e upstream. For nofail == false request, if __map_request failed, the caller does cleanup work, like releasing the relative pages. It doesn't make any sense to retry this request. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Reviewed-by: Sage Weil <sage@inktank.com> [bwh: Backported to 3.2: adjust indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	fuse: hotfix truncate_pagecache() issue	Maxim Patlasov
	commit 06a7c3c2781409af95000c60a5df743fd4e2f8b4 upstream. The way how fuse calls truncate_pagecache() from fuse_change_attributes() is completely wrong. Because, w/o i_mutex held, we never sure whether 'oldsize' and 'attr->size' are valid by the time of execution of truncate_pagecache(inode, oldsize, attr->size). In fact, as soon as we released fc->lock in the middle of fuse_change_attributes(), we completely loose control of actions which may happen with given inode until we reach truncate_pagecache. The list of potentially dangerous actions includes mmap-ed reads and writes, ftruncate(2) and write(2) extending file size. The typical outcome of doing truncate_pagecache() with outdated arguments is data corruption from user point of view. This is (in some sense) acceptable in cases when the issue is triggered by a change of the file on the server (i.e. externally wrt fuse operation), but it is absolutely intolerable in scenarios when a single fuse client modifies a file without any external intervention. A real life case I discovered by fsx-linux looked like this: 1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends FUSE_SETATTR to the server synchronously, but before getting fc->lock ... 2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP to the server synchronously, then calls fuse_change_attributes(). The latter updates i_size, releases fc->lock, but before comparing oldsize vs attr->size.. 3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and updating attributes and i_size, but now oldsize is equal to outarg.attr.size because i_size has just been updated (step 2). Hence, fuse_do_setattr() returns w/o calling truncate_pagecache(). 4. As soon as ftruncate(2) completes, the user extends file size by write(2) making a hole in the middle of file, then reads data from the hole either by read(2) or mmap-ed read. The user expects to get zero data from the hole, but gets stale data because truncate_pagecache() is not executed yet. The scenario above illustrates one side of the problem: not truncating the page cache even though we should. Another side corresponds to truncating page cache too late, when the state of inode changed significantly. Theoretically, the following is possible: 1. As in the previous scenario fuse_dentry_revalidate() discovered that i_size changed (due to our own fuse_do_setattr()) and is going to call truncate_pagecache() for some 'new_size' it believes valid right now. But by the time that particular truncate_pagecache() is called ... 2. fuse_do_setattr() returns (either having called truncate_pagecache() or not -- it doesn't matter). 3. The file is extended either by write(2) or ftruncate(2) or fallocate(2). 4. mmap-ed write makes a page in the extended region dirty. The result will be the lost of data user wrote on the fourth step. The patch is a hotfix resolving the issue in a simplistic way: let's skip dangerous i_size update and truncate_pagecache if an operation changing file size is in progress. This simplistic approach looks correct for the cases w/o external changes. And to handle them properly, more sophisticated and intrusive techniques (e.g. NFS-like one) would be required. I'd like to postpone it until the issue is well discussed on the mailing list(s). Changed in v2: - improved patch description to cover both sides of the issue. Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> [bwh: Backported to 3.2: add the fuse_inode::state field which we didn't have] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	fuse: readdir: check for slash in names	Miklos Szeredi
	commit efeb9e60d48f7778fdcad4a0f3ad9ea9b19e5dfd upstream. Userspace can add names containing a slash character to the directory listing. Don't allow this as it could cause all sorts of trouble. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> [bwh: Backported to 3.2: drop changes to parse_dirplusfile() which we don't have] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	nilfs2: fix issue with race condition of competition between segments for ↵	Vyacheslav Dubeyko
	dirty blocks commit 7f42ec3941560f0902fe3671e36f2c20ffd3af0a upstream. Many NILFS2 users were reported about strange file system corruption (for example): NILFS: bad btree node (blocknr=185027): level = 0, flags = 0x0, nchildren = 768 NILFS error (device sda4): nilfs_bmap_last_key: broken bmap (inode number=11540) But such error messages are consequence of file system's issue that takes place more earlier. Fortunately, Jerome Poulin <jeromepoulin@gmail.com> and Anton Eliasson <devel@antoneliasson.se> were reported about another issue not so recently. These reports describe the issue with segctor thread's crash: BUG: unable to handle kernel paging request at 0000000000004c83 IP: nilfs_end_page_io+0x12/0xd0 [nilfs2] Call Trace: nilfs_segctor_do_construct+0xf25/0x1b20 [nilfs2] nilfs_segctor_construct+0x17b/0x290 [nilfs2] nilfs_segctor_thread+0x122/0x3b0 [nilfs2] kthread+0xc0/0xd0 ret_from_fork+0x7c/0xb0 These two issues have one reason. This reason can raise third issue too. Third issue results in hanging of segctor thread with eating of 100% CPU. REPRODUCING PATH: One of the possible way or the issue reproducing was described by Jermoe me Poulin <jeromepoulin@gmail.com>: 1. init S to get to single user mode. 2. sysrq+E to make sure only my shell is running 3. start network-manager to get my wifi connection up 4. login as root and launch "screen" 5. cd /boot/log/nilfs which is a ext3 mount point and can log when NILFS dies. 6. lscp \| xz -9e > lscp.txt.xz 7. mount my snapshot using mount -o cp=3360839,ro /dev/vgUbuntu/root /mnt/nilfs 8. start a screen to dump /proc/kmsg to text file since rsyslog is killed 9. start a screen and launch strace -f -o find-cat.log -t find /mnt/nilfs -type f -exec cat {} > /dev/null \; 10. start a screen and launch strace -f -o apt-get.log -t apt-get update 11. launch the last command again as it did not crash the first time 12. apt-get crashes 13. ps aux > ps-aux-crashed.log 13. sysrq+W 14. sysrq+E wait for everything to terminate 15. sysrq+SUSB Simplified way of the issue reproducing is starting kernel compilation task and "apt-get update" in parallel. REPRODUCIBILITY: The issue is reproduced not stable [60% - 80%]. It is very important to have proper environment for the issue reproducing. The critical conditions for successful reproducing: (1) It should have big modified file by mmap() way. (2) This file should have the count of dirty blocks are greater that several segments in size (for example, two or three) from time to time during processing. (3) It should be intensive background activity of files modification in another thread. INVESTIGATION: First of all, it is possible to see that the reason of crash is not valid page address: NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82 NILFS [nilfs_segctor_complete_write]:2101 segbuf->sb_segnum 6783 Moreover, value of b_page (0x1a82) is 6786. This value looks like segment number. And b_blocknr with b_size values look like block numbers. So, buffer_head's pointer points on not proper address value. Detailed investigation of the issue is discovered such picture: [-----------------------------SEGMENT 6783-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111149024, segbuf->sb_segnum 6783 [-----------------------------SEGMENT 6784-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff8802174a6798, bh->b_assoc_buffers.prev ffff880221cffee8 NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6784 NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50 NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111150080, segbuf->sb_segnum 6784, segbuf->sb_nbio 0 [----------] ditto NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111164416, segbuf->sb_segnum 6784, segbuf->sb_nbio 15 [-----------------------------SEGMENT 6785-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff880219277e80, bh->b_assoc_buffers.prev ffff880221cffc88 NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6785 NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8 NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111165440, segbuf->sb_segnum 6785, segbuf->sb_nbio 0 [----------] ditto NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111177728, segbuf->sb_segnum 6785, segbuf->sb_nbio 12 NILFS [nilfs_segctor_do_construct]:2399 nilfs_segctor_wait NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6783 NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6784 NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6785 NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82 BUG: unable to handle kernel paging request at 0000000000001a82 IP: [<ffffffffa024d0f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2] Usually, for every segment we collect dirty files in list. Then, dirty blocks are gathered for every dirty file, prepared for write and submitted by means of nilfs_segbuf_submit_bh() call. Finally, it takes place complete write phase after calling nilfs_end_bio_write() on the block layer. Buffers/pages are marked as not dirty on final phase and processed files removed from the list of dirty files. It is possible to see that we had three prepare_write and submit_bio phases before segbuf_wait and complete_write phase. Moreover, segments compete between each other for dirty blocks because on every iteration of segments processing dirty buffer_heads are added in several lists of payload_buffers: [SEGMENT 6784]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50 [SEGMENT 6785]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8 The next pointer is the same but prev pointer has changed. It means that buffer_head has next pointer from one list but prev pointer from another. Such modification can be made several times. And, finally, it can be resulted in various issues: (1) segctor hanging, (2) segctor crashing, (3) file system metadata corruption. FIX: This patch adds: (1) setting of BH_Async_Write flag in nilfs_segctor_prepare_write() for every proccessed dirty block; (2) checking of BH_Async_Write flag in nilfs_lookup_dirty_data_buffers() and nilfs_lookup_dirty_node_buffers(); (3) clearing of BH_Async_Write flag in nilfs_segctor_complete_write(), nilfs_abort_logs(), nilfs_forget_buffer(), nilfs_clear_dirty_page(). Reported-by: Jerome Poulin <jeromepoulin@gmail.com> Reported-by: Anton Eliasson <devel@antoneliasson.se> Cc: Paul Fertser <fercerpav@gmail.com> Cc: ARAI Shun-ichi <hermes@ceres.dti.ne.jp> Cc: Piotr Szymaniak <szarpaj@grubelek.pl> Cc: Juan Barry Manuel Canham <Linux@riotingpacifist.net> Cc: Zahid Chowdhury <zahid.chowdhury@starsolutions.com> Cc: Elmer Zhang <freeboy6716@gmail.com> Cc: Kenneth Langga <klangga@gmail.com> Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: nilfs_clear_dirty_page() has not been separated from nilfs_clear_dirty_pages()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	perf tools: Fix cache event name generation	Jiri Olsa
	commit 275ef3878f698941353780440fec6926107a320b upstream. If the event name is specified with all 3 components, the last one overwrites the previous one during the name composing within the parse_events_add_cache function. Fixing this by properly adjusting the string index. Reported-by: Joel Uckelman <joel@lightboxtechnologies.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joel Uckelman <joel@lightboxtechnologies.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LPU-Reference: 20120905175133.GA18352@krava.brq.redhat.com [ committer note: Remove the newline fix, done already in 42e1fb7 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Vinson Lee <vlee@twopensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	perf tools: Remove extraneous newline when parsing hardware cache events	Arnaldo Carvalho de Melo
	commit 42e1fb776087713b5482cd7cf6cac998fbdd6544 upstream. Noticed while developing a 'perf test' entry to verify that perf_evsel__name works. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-xz6zgh38mp3cjnd2udh38z8f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Vinson Lee <vlee@twopensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-11	mm/hotplug: correctly add new zone to all other nodes' zone lists	Jiang Liu
	commit 08dff7b7d629807dbb1f398c68dd9cd58dd657a1 upstream. When online_pages() is called to add new memory to an empty zone, it rebuilds all zone lists by calling build_all_zonelists(). But there's a bug which prevents the new zone to be added to other nodes' zone lists. online_pages() { build_all_zonelists() ..... node_set_state(zone_to_nid(zone), N_HIGH_MEMORY) } Here the node of the zone is put into N_HIGH_MEMORY state after calling build_all_zonelists(), but build_all_zonelists() only adds zones from nodes in N_HIGH_MEMORY state to the fallback zone lists. build_all_zonelists() ->__build_all_zonelists() ->build_zonelists() ->find_next_best_node() ->for_each_node_state(n, N_HIGH_MEMORY) So memory in the new zone will never be used by other nodes, and it may cause strange behavor when system is under memory pressure. So put node into N_HIGH_MEMORY state before calling build_all_zonelists(). Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Keping Chen <chenkeping@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>