linux - Linux kernel source tree

Age	Commit message (Collapse)	Author
2009-12-18	fix csum_ipv6_magic()	Jiri Bohac
	commit 5afe18d2f58812f3924edbd215464e5e3e8545e7 upstream. The 32-bit parameters (len and csum) of csum_ipv6_magic() are passed in 64-bit registers in2 and in4. The high order 32 bits of the registers were never cleared, and garbage was sometimes calculated into the checksum. Fix this by clearing the high order 32 bits of these registers. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Dennis Schridde <devurandom@gmx.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-23	Build fix for __early_pfn_to_nid() undefined link error	Tony Luck
	commit 334f85b647bc46ff4d27ace55aa65f44d6a2f4db upstream. ia64 only defines __early_pfn_to_nid() for SPARSEMEM && NUMA configurations, so the recent: commit: f2dbcfa738368c8a40d4a5f0b65dc9879577cb21 mm: clean up for early_pfn_to_nid() ends up with some link problems for certain configuration files. Fix arch/ia64/Kconfig to only define HAVE_ARCH_EARLY_PFN_TO_NID in the cases where we do provide this function. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-16	mm: fix memmap init for handling memory hole	KAMEZAWA Hiroyuki
	commit cc2559bccc72767cb446f79b071d96c30c26439b upstream. Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole. and memmap initialization was not done. This was a trouble for sparc boot. To fix this, the PFN should be initialized and marked as PG_reserved. This patch changes early_pfn_in_nid() return true if PFN is a hole. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reported-by: David Miller <davem@davemlloft.net> Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-16	mm: clean up for early_pfn_to_nid()	KAMEZAWA Hiroyuki
	commit f2dbcfa738368c8a40d4a5f0b65dc9879577cb21 upstream. What's happening is that the assertion in mm/page_alloc.c:move_freepages() is triggering: BUG_ON(page_zone(start_page) != page_zone(end_page)); Once I knew this is what was happening, I added some annotations: if (unlikely(page_zone(start_page) != page_zone(end_page))) { printk(KERN_ERR "move_freepages: Bogus zones: " "start_page[%p] end_page[%p] zone[%p]\n", start_page, end_page, zone); printk(KERN_ERR "move_freepages: " "start_zone[%p] end_zone[%p]\n", page_zone(start_page), page_zone(end_page)); printk(KERN_ERR "move_freepages: " "start_pfn[0x%lx] end_pfn[0x%lx]\n", page_to_pfn(start_page), page_to_pfn(end_page)); printk(KERN_ERR "move_freepages: " "start_nid[%d] end_nid[%d]\n", page_to_nid(start_page), page_to_nid(end_page)); ... And here's what I got: move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00] move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00] move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff] move_freepages: start_nid[1] end_nid[0] My memory layout on this box is: [ 0.000000] Zone PFN ranges: [ 0.000000] Normal 0x00000000 -> 0x0081ff5d [ 0.000000] Movable zone start PFN for each node [ 0.000000] early_node_map[8] active PFN ranges [ 0.000000] 0: 0x00000000 -> 0x00020000 [ 0.000000] 1: 0x00800000 -> 0x0081f7ff [ 0.000000] 1: 0x0081f800 -> 0x0081fe50 [ 0.000000] 1: 0x0081fed1 -> 0x0081fed8 [ 0.000000] 1: 0x0081feda -> 0x0081fedb [ 0.000000] 1: 0x0081fedd -> 0x0081fee5 [ 0.000000] 1: 0x0081fee7 -> 0x0081ff51 [ 0.000000] 1: 0x0081ff59 -> 0x0081ff5d So it's a block move in that 0x81f600-->0x81f7ff region which triggers the problem. This patch: Declaration of early_pfn_to_nid() is scattered over per-arch include files, and it seems it's complicated to know when the declaration is used. I think it makes fix-for-memmap-init not easy. This patch moves all declaration to include/linux/mm.h After this, if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID -> Use static definition in include/linux/mm.h else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID -> Use generic definition in mm/page_alloc.c else -> per-arch back end function will be called. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reported-by: David Miller <davem@davemlloft.net> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-12	PCI: return error on failure to read PCI ROMs	Timothy S. Nelson
	commit 97c44836cdec1ea713a15d84098a1a908157e68f upstream. This patch makes the ROM reading code return an error to user space if the size of the ROM read is equal to 0. The patch also emits a warnings if the contents of the ROM are invalid, and documents the effects of the "enable" file on ROM reading. Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au> Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-06	ACPI: Change acpi_evaluate_integer to support 64-bit on 32-bit kernels	Matthew Wilcox
	commit 27663c5855b10af9ec67bc7dfba001426ba21222 upstream As of version 2.0, ACPI can return 64-bit integers. The current acpi_evaluate_integer only supports 64-bit integers on 64-bit platforms. Change the argument to take a pointer to an acpi_integer so we support 64-bit integers on all platforms. lenb: replaced use of "acpi_integer" with "unsigned long long" lenb: fixed bug in acpi_thermal_trips_update() Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	IA64: Turn on CONFIG_HAVE_UNSTABLE_CLOCK	Tony Luck
	commit 0773a6cf673316440999752e23f8c3d4f85e48b9 upstream. sched_clock() on ia64 is based on ar.itc, so is never completely synchronized between cpus. On some platforms (e.g. certain models of SGI Altix) it may be running at radically different frequencies. Based on a patch from Dimitri Sivanich which set this just for SN2 && GENERIC kernels ... it is needed for all ia64 machines. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-18	Remove __attribute__((weak)) from sys_pipe/sys_pipe2	Heiko Carstens
	commit 1134723e96f6e2abcf8bfd7a2d1c96fcc323ef35 upstream. Remove __attribute__((weak)) from common code sys_pipe implemantation. IA64, ALPHA, SUPERH (32bit) and SPARC (32bit) have own implemantations with the same name. Just rename them. For sys_pipe2 there is no architecture specific implementation. Cc: Richard Henderson <rth@twiddle.net> Cc: David S. Miller <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-12-05	IA64: fix boot panic caused by offline CPUs	Doug Chapman
	commit 62ee0540f5e5a804b79cae8b3c0185a85f02436b upstream. This fixes a regression introduced by 2c6e6db41f01b6b4eb98809350827c9678996698 "Minimize per_cpu reservations." That patch incorrectly used information about what CPUs are possible that was not yet initialized by ACPI. The end result was that per_cpu structures for offline CPUs were not initialized causing a NULL pointer reference. Since we cannot do the full acpi_boot_init() call any earlier, the simplest fix is to just parse the MADT for SAPIC entries early to find the CPU info. This should also allow for some cleanup of the code added by the "Minimize per_cpu reservations". This patch just fixes the regressions, the cleanup will come in a later patch. Signed-off-by: Doug Chapman <doug.chapman@hp.com> Signed-off-by: Alex Chiang <achiang@hp.com> CC: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-09-29	[IA64] Put the space for cpu0 per-cpu area into .data section	Tony Luck
	Initial fix for making sure that we can access percpu variables in all C code (commit: 10617bbe84628eb18ab5f723d3ba35005adde143) inadvertantly allocated the memory in the "percpu" section of the vmlinux ELF executable. This confused kexec/dump. Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-09-22	[IA64] kexec fails on systems with blocks of uncached memory	Jay Lan
	Currently a memory segment in memory map with attribute of EFI_MEMORY_UC is denoted as "System RAM" in /proc/iomem, while memory of attribute (EFI_MEMORY_WB\|EFI_MEMORY_UC) is also labeled the same. The kexec utility then includes uncached memory as part of vmcore. The kdump kernel MCA'ed when it tries to save the vmcore to a disk. A normal "cached" access may cause MCAs. This patch would label memory with attribute of EFI_MEMORY_UC only as "Uncached RAM" so that kexec would know not to include it in the vmcore. I will submit a separate kexec-tools patch to the kexec list. Signed-off-by: Jay Lan <jlan@sgi.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-09-22	[IA64] Ski simulator doesn't need check_sal_cache_flush	Alex Chiang
	Peter Chubb reported that commit 3463a93def55c309f3c0d0a8aaf216be3be42d64 (Update check_sal_cache_flush to use platform_send_ipi()) broke Ski because it does not implement IPIs. Tony Luck suggested we just #ifndef out the call (since the simulator does not have the SAL bug that this code is attempting to detect and workaround) Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-09-19	KVM: ia64: 'struct fdesc' build fix	Jes Sorensen
	Commit 4611a77 ("[IA64] fix compile failure with non modular builds") introduced struct fdesc into asm/elf.h, which duplicates KVM's definition. Remove the latter to avoid the build error. Signed-off-by: Jes Sorensen <jes@sgi.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2008-09-10	[IA64] prevent ia64 from invoking irq handlers on offline CPUs	Paul E. McKenney
	Make ia64 refrain from clearing a given to-be-offlined CPU's bit in the cpu_online_mask until it has processed pending irqs. This change prevents other CPUs from being blindsided by an apparently offline CPU nevertheless changing globally visible state. Also remove the existing redundant cpu_clear(cpu, cpu_online_map). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-09-10	[IA64] arch/ia64/sn/pci/tioca_provider.c: introduce missing kfree	Julia Lawall
	Error handling code following a kmalloc should free the allocated data. Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-09-10	[IA64] fix up bte.h	Robin Holt
	bte.h expects a #define of L1_CACHE_MASK which is currently only in bte.c. This small patch gets bte.h to include cleanly and makes BTE_UNALIGNED_COPY not report errors. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-09-10	[IA64] fix compile failure with non modular builds	James Bottomley
	Broke the non modular builds by moving an essential function into modules.c. Fix this by moving it out again and into asm/sections.h as an inline. To do this, the definitions of struct fdesc and struct got_val have been lifted out of modules.c and put in asm/elf.h where they belong. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-09-09	lib: Correct printk %pF to work on all architectures	James Bottomley
	It was introduced by "vsprintf: add support for '%pS' and '%pF' pointer formats" in commit 0fe1ef24f7bd0020f29ffe287dfdb9ead33ca0b2. However, the current way its coded doesn't work on parisc64. For two reasons: 1) parisc isn't in the #ifdef and 2) parisc has a different format for function descriptors Make dereference_function_descriptor() more accommodating by allowing architecture overrides. I put the three overrides (for parisc64, ppc64 and ia64) in arch/kernel/module.c because that's where the kernel internal linker which knows how to deal with function descriptors sits. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Tony Luck <tony.luck@intel.com> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-25	[IA64] Fix __{in,out}s{w,l} to handle unaligned data	James Bottomley
	Some ia64 systems produce several repeats of kernel messages like this: kernel unaligned access to 0xe000000644220466, ip=0xa000000100516fa1 This was tracked to ide code using the __cmd[] field in "struct request" via the __outsw() function. __cmd[] is a char array, so is not guaranteed to be properly aligned when accessed as words. Tested-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-25	[IA64] Fix ia64 build failure when CONFIG_SFC=m	Robin Holt
	CONFIG_SFC=m uses topology_core_siblings() which, for ia64, expects cpu_core_map to be exported. It is not. This patch exports the needed symbol. Maintainers note: This really looks like the wrong thing to do ... it would be much better for the kernel to export an API to provide drivers like this with data they need (which in the case of this driver seems to be an estimate of the effective parallelism available on the platform). But x86 has exported this forever ... so go with the flow until such an API is defined. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-18	[IA64] use generic compat_old_sys_readdir	Christoph Hellwig
	Switch ia64 to the generic compat_sys_old_readdir which is identical except for slightly better error handling. Also remove sys32_getdents which already isn't wired up to the syscall table anymore in favour of compat_sys_getdents. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-18	[IA64] pci_acpi_scan_root cleanup	Luck, Tony
	The code walks all the acpi _CRS methods to see how many windows to allocate. It then scans them all again to insert_resource() for each even if the first scan found that there were none. Move the second scan inside the "if (windows)" clause. Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-18	[IA64] Shrink shadow_flush_counts to a short array to save 8k of per_cpu area.	Robin Holt
	Making allmodconfig will break the current build. This patch shrinks the per_cpu__shadow_flush_counts from 16k to 8k which frees enough space to allow allmodconfig to successfully complete. Fixes http://bugzilla.kernel.org/show_bug.cgi?id=11338 Signed-off-by: Robin Holt <holt@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-18	[IA64] Remove sn2_defconfig.	Robin Holt
	Not really a patch as much as a remove this file request. Now that generic_defconfig supports all the configurations SGI currently supports and has NR_CPUS and NR_NODES at our largest configurations, we have no reason to maintain the extra defconfig file. Signed-off-by: Robin Holt <holt@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-15	kexec jump: rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE	Huang Ying
	Rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE, because control page is used for not only code on some platform. For example in kexec jump, it is used for data and stack too. [akpm@linux-foundation.org: unbreak powerpc and arm, finish conversion] Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12	[IA64] use bcd2bin/bin2bcd	Adrian Bunk
	This patch changes ia64 to use the new bcd2bin/bin2bcd functions instead of the obsolete BCD2BIN/BIN2BCD macros. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-12	[IA64] Ensure cpu0 can access per-cpu variables in early boot code	Tony Luck
	ia64 handles per-cpu variables a litle differently from other architectures in that it maps the physical memory allocated for each cpu at a constant virtual address (0xffffffffffff0000). This mapping is not enabled until the architecture specific cpu_init() function is run, which causes problems since some generic code is run before this point. In particular when CONFIG_PRINTK_TIME is enabled, the boot cpu will trap on the access to per-cpu memory at the first printk() call so the boot will fail without the kernel printing anything to the console. Fix this by allocating percpu memory for cpu0 in the kernel data section and doing all initialization to enable percpu access in head.S before calling any generic code. Other cpus must take care not to access per-cpu variables too early, but their code path from start_secondary() to cpu_init() is all in arch/ia64 Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-04	[IA64] Update generic config	Tony Luck
	Changes to support a new platform in my lab. Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-04	[IA64] Eliminate trailing backquote in IA64_SGI_UV	Jack Steiner
	Eliminate trailing backquote in IA64_SGI_UV config. Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-04	[IA64] update generic_defconfig to support sn2.	Robin Holt
	This patch changes the generic_defconfig so it works on all sn2 platforms I have access to. There is only one support configuration which was not tested and that configuration is only a combination of two tested configurations. With this patchset applied, a generic kernel can be booted on either a RHEL 5.2, RHEL5.3, or SLES10 SP1 root and operate. All features needed by SGI's ProPack are also working. I have not tested all features of RHEL or SLES, but they do at least boot. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-04	[IA64] update generic_defconfig for 2.6.27-rc1	Robin Holt
	This patch updates the generic_defconfig for 2.6.27-rc1 by simply doing a make oldconfig and holding down the carriage return. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-04	[IA64] Allow ia64 to CONFIG_NR_CPUS up to 4096	Robin Holt
	ia64 has compiled with NR_CPUS=4096 for a couple releases, just forgot to update Kconfig to allow it. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-04	[IA64] Cleanup generated file not ignored by .gitignore	Robin Holt
	arch/ia64/kernel/vmlinux.lds is a generated file. Tell git to ignore it. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-04	[IA64] pv_ops: fix ivt.S paravirtualization	Isaku Yamahata
	Recent kernels are not booting on some HP systems (though it does boot on others). James and Willy reported the problem. James did the bisection to find the commit that caused the problem: 498c5170472ff0c03a29d22dbd33225a0be038f4. [IA64] pvops: paravirtualize ivt.S Two instructions were wrongly paravirtualized such that _FROM_ macro had been used where _TO_ was intended Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: "Wilcox, Matthew R" <matthew.r.wilcox@intel.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-01	[IA64] Move include/asm-ia64 to arch/ia64/include/asm	Tony Luck
	After moving the the include files there were a few clean-ups: 1) Some files used #include <asm-ia64/xyz.h>, changed to <asm/xyz.h> 2) Some comments alerted maintainers to look at various header files to make matching updates if certain code were to be changed. Updated these comments to use the new include paths. 3) Some header files mentioned their own names in initial comments. Just deleted these self references. Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-07-30	GRU Driver: hardware data structures	Jack Steiner
	This series of patches adds a driver for the SGI UV GRU. The driver is still in development but it currently compiles for both x86_64 & IA64. All simple regression tests pass on IA64. Although features remain to be added, I'd like to start the process of getting the driver into the kernel. Additional kernel drivers will depend on services provide by the GRU driver. The GRU is a hardware resource located in the system chipset. The GRU contains memory that is mmaped into the user address space. This memory is used to communicate with the GRU to perform functions such as load/store, scatter/gather, bcopy, AMOs, etc. The GRU is directly accessed by user instructions using user virtual addresses. GRU instructions (ex., bcopy) use user virtual addresses for operands. The GRU contains a large TLB that is functionally very similar to processor TLBs. Because the external contains a TLB with user virtual address, it requires callouts from the core VM system when certain types of changes are made to the process page tables. There are several MMUOPS patches currently being discussed but none has been accepted into the kernel. The GRU driver is built using version V18 from Andrea Arcangeli. This patch: Contains the definitions of the hardware GRU data structures that are used by the driver to manage the GRU. [akpm@linux-foundation;org: export hpage_shift] Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-27	KVM: ia64: Fix irq disabling leak in error handling code	Julia Lawall
	There is a call to local_irq_restore in the normal exit case, so it would seem that there should be one on an error return as well. The semantic patch that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ expression l; expression E,E1,E2; @@ local_irq_save(l); ... when != local_irq_restore(l) when != spin_unlock_irqrestore(E,l) when any when strict ( if (...) { ... when != local_irq_restore(l) when != spin_unlock_irqrestore(E1,l) + local_irq_restore(l); return ...; } \| if (...) + {local_irq_restore(l); return ...; + } \| spin_unlock_irqrestore(E2,l); \| local_irq_restore(l); ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-26	tracehook: wait_task_inactive	Roland McGrath
	This extends wait_task_inactive() with a new argument so it can be used in a "soft" mode where it will check for the task changing state unexpectedly and back off. There is no change to existing callers. This lays the groundwork to allow robust, noninvasive tracing that can try to sample a blocked thread but back off safely if it wakes up. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26	dma-mapping: add the device argument to dma_mapping_error()	FUJITA Tomonori
	Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. [akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25	Merge branch 'release' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Wire up new system calls
2008-07-25	kprobes: improve kretprobe scalability with hashed locking	Srinivasa D S
	Currently list of kretprobe instances are stored in kretprobe object (as used_instances,free_instances) and in kretprobe hash table. We have one global kretprobe lock to serialise the access to these lists. This causes only one kretprobe handler to execute at a time. Hence affects system performance, particularly on SMP systems and when return probe is set on lot of functions (like on all systemcalls). Solution proposed here gives fine-grain locks that performs better on SMP system compared to present kretprobe implementation. Solution: 1) Instead of having one global lock to protect kretprobe instances present in kretprobe object and kretprobe hash table. We will have two locks, one lock for protecting kretprobe hash table and another lock for kretporbe object. 2) We hold lock present in kretprobe object while we modify kretprobe instance in kretprobe object and we hold per-hash-list lock while modifying kretprobe instances present in that hash list. To prevent deadlock, we never grab a per-hash-list lock while holding a kretprobe lock. 3) We can remove used_instances from struct kretprobe, as we can track used instances of kretprobe instances using kretprobe hash table. Time duration for kernel compilation ("make -j 8") on a 8-way ppc64 system with return probes set on all systemcalls looks like this. cacheline non-cacheline Un-patched kernel aligned patch aligned patch =============================================================================== real 9m46.784s 9m54.412s 10m2.450s user 40m5.715s 40m7.142s 40m4.273s sys 2m57.754s 2m58.583s 3m17.430s =========================================================== Time duration for kernel compilation ("make -j 8) on the same system, when kernel is not probed. ========================= real 9m26.389s user 40m8.775s sys 2m7.283s ========================= Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25	[IA64] Wire up new system calls	Tony Luck
	Six new system calls: signalfd4, eventfd2, epoll_create1, dup3, pipe2 and inotify_init1. Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-07-24	flag parameters: pipe	Ulrich Drepper
	This patch introduces the new syscall pipe2 which is like pipe but it also takes an additional parameter which takes a flag value. This patch implements the handling of O_CLOEXEC for the flag. I did not add support for the new syscall for the architectures which have a special sys_pipe implementation. I think the maintainers of those archs have the chance to go with the unified implementation but that's up to them. The implementation introduces do_pipe_flags. I did that instead of changing all callers of do_pipe because some of the callers are written in assembler. I would probably screw up changing the assembly code. To avoid breaking code do_pipe is now a small wrapper around do_pipe_flags. Once all callers are changed over to do_pipe_flags the old do_pipe function can be removed. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fd[2]; if (syscall (__NR_pipe2, fd, 0) != 0) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { printf ("pipe2(0) set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0) { puts ("pipe2(O_CLOEXEC) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	bootmem: replace node_boot_start in struct bootmem_data	Johannes Weiner
	Almost all users of this field need a PFN instead of a physical address, so replace node_boot_start with node_min_pfn. [Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()] Signed-off-by: Johannes Weiner <hannes@saeureba.de> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	hugetlb: introduce pud_huge	Andi Kleen
	Straight forward extensions for huge pages located in the PUD instead of PMDs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	hugetlb: modular state for hugetlb page size	Andi Kleen
	The goal of this patchset is to support multiple hugetlb page sizes. This is achieved by introducing a new struct hstate structure, which encapsulates the important hugetlb state and constants (eg. huge page size, number of huge pages currently allocated, etc). The hstate structure is then passed around the code which requires these fields, they will do the right thing regardless of the exact hstate they are operating on. This patch adds the hstate structure, with a single global instance of it (default_hstate), and does the basic work of converting hugetlb to use the hstate. Future patches will add more hstate structures to allow for different hugetlbfs mounts to have different page sizes. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	mm: remove double indirection on tlb parameter to free_pgd_range() & Co	Jan Beulich
	The double indirection here is not needed anywhere and hence (at least) confusing. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	mm: move bootmem descriptors definition to a single place	Johannes Weiner
	There are a lot of places that define either a single bootmem descriptor or an array of them. Use only one central array with MAX_NUMNODES items instead. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-21	sysdev: Pass the attribute to the low level sysdev show/store function	Andi Kleen
	This allow to dynamically generate attributes and share show/store functions between attributes. Right now most attributes are generated by special macros and lots of duplicated code. With the attribute passed it's instead possible to attach some data to the attribute and then use that in shared low level functions to do different things. I need this for the dynamically generated bank attributes in the x86 machine check code, but it'll allow some further cleanups. I converted all users in tree to the new show/store prototype. It's a single huge patch to avoid unbisectable sections. Runtime tested: x86-32, x86-64 Compiled only: ia64, powerpc Not compile tested/only grep converted: sh, arm, avr32 Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21	Merge branch 'release' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: (25 commits) mmtimer: Push BKL down into the ioctl handler [IA64] Remove experimental status of kdump [IA64] Update ia64 mmr list for SGI uv [IA64] Avoid overflowing ia64_cpu_to_sapicid in acpi_map_lsapic() [IA64] adding parameter check to module_free() [IA64] improper printk format in acpi-cpufreq [IA64] pv_ops: move some functions in ivt.S to avoid lack of space. [IA64] pvops: documentation on ia64/pv_ops [IA64] pvops: add to hooks, pv_time_ops, for steal time accounting. [IA64] pvops: add hooks, pv_irq_ops, to paravirtualized irq related operations. [IA64] pvops: add hooks, pv_iosapic_ops, to paravirtualize iosapic. [IA64] pvops: define initialization hooks, pv_init_ops, for paravirtualized environment. [IA64] pvops: paravirtualize NR_IRQS [IA64] pvops: paravirtualize entry.S [IA64] pvops: paravirtualize ivt.S [IA64] pvops: paravirtualize minstate.h. [IA64] pvops: define paravirtualized instructions for native. [IA64] pvops: preparation for paravirtulization of hand written assembly code. [IA64] pvops: introduce pv_cpu_ops to paravirtualize privileged instructions. [IA64] pvops: add an early setup hook for pv_ops. ...