aboutsummaryrefslogtreecommitdiff
path: root/include/asm-powerpc
AgeCommit message (Collapse)Author
2008-07-01powerpc: Add VSX assembler code macrosMichael Neuling
This adds the macros for the VSX load/store instruction as most binutils are not going to support this for a while. Also add VSX register save/restore macros and vsr[0-63] register definitions. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Add VSX CPU featureMichael Neuling
Add a VSX CPU feature. Also add code to detect if VSX is available from the device tree. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Introduce VSX thread_struct and CONFIG_VSXMichael Neuling
The layout of the new VSR registers and how they overlap on top of the legacy FPR and VR registers is: VSR doubleword 0 VSR doubleword 1 ---------------------------------------------------------------- VSR[0] | FPR[0] | | ---------------------------------------------------------------- VSR[1] | FPR[1] | | ---------------------------------------------------------------- | ... | | | ... | | ---------------------------------------------------------------- VSR[30] | FPR[30] | | ---------------------------------------------------------------- VSR[31] | FPR[31] | | ---------------------------------------------------------------- VSR[32] | VR[0] | ---------------------------------------------------------------- VSR[33] | VR[1] | ---------------------------------------------------------------- | ... | | ... | ---------------------------------------------------------------- VSR[62] | VR[30] | ---------------------------------------------------------------- VSR[63] | VR[31] | ---------------------------------------------------------------- VSX has 64 128bit registers. The first 32 regs overlap with the FP registers and hence extend them with and additional 64 bits. The second 32 regs overlap with the VMX registers. This commit introduces the thread_struct changes required to reflect this register layout. Ptrace and signals code is updated so that the floating point registers are correctly accessed from the thread_struct when CONFIG_VSX is enabled. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Add macros to access floating point registers in thread_struct.Michael Neuling
We are going to change where the floating point registers are stored in the thread_struct, so in preparation add some macros to access the floating point registers. Update all code to use these new macros. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Introduce infrastructure for feature sections with alternativesMichael Ellerman
The current feature section logic only supports nop'ing out code, this means if you want to choose at runtime between instruction sequences, one or both cases will have to execute the nop'ed out contents of the other section, eg: BEGIN_FTR_SECTION or 1,1,1 END_FTR_SECTION_IFSET(FOO) BEGIN_FTR_SECTION or 2,2,2 END_FTR_SECTION_IFCLR(FOO) and the resulting code will be either, or 1,1,1 nop or, nop or 2,2,2 For small code segments this is fine, but for larger code blocks and in performance criticial code segments, it would be nice to avoid the nops. This commit starts to implement logic to allow the following: BEGIN_FTR_SECTION or 1,1,1 FTR_SECTION_ELSE or 2,2,2 ALT_FTR_SECTION_END_IFSET(FOO) and the resulting code will be: or 1,1,1 or, or 2,2,2 We achieve this by extending the existing FTR macros. The current feature section semantic just becomes a special case, ie. if the else case is empty we nop out the default case. The key limitation is that the size of the else case must be less than or equal to the size of the default case. If the else case is smaller the remainder of the section is nop'ed. We let the linker put the else case code in with the rest of the text, so that relative branches from the else case are more likley to link, this has the disadvantage that we can't free the unused else cases. This commit introduces the required macro and linker script changes, but does not enable the patching of the alternative sections. We also need to update two hand-made section entries in reg.h and timex.h Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Consolidate feature fixup macros for 64/32 bitMichael Ellerman
Currently we have three versions of MAKE_FTR_SECTION_ENTRY(), the macro that generates a feature section entry. There is 64bit version, a 32bit version and version for 32bit code built with a 64bit kernel. Rather than triplicating (?) the MAKE_FTR_SECTION_ENTRY() logic, we can move the 64bit/32bit differences into separate macros, and then only have one version of MAKE_FTR_SECTION_ENTRY(). Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Consolidate CPU and firmware feature fixup macrosMichael Ellerman
The CPU and firmware feature fixup macros are currently spread across three files, firmware.h, cputable.h and asm-compat.h. Consolidate them into their own file, feature-fixups.h Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Add PPC_NOP_INSTR, a hash define for the preferred nop instructionMichael Ellerman
A bunch of code has hard-coded the value for a "nop" instruction, it would be nice to have a #define for it. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Add new code patching routinesMichael Ellerman
This commit adds some new routines for patching code, which will be used in a following commit. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Add ppc_function_entry() which gets the entry point for a functionMichael Ellerman
Because function pointers point to different things on 32-bit vs 64-bit, add a macro that deals with dereferencing the OPD on 64-bit. The soon to be merged ftrace wants this, as well as other code I am working on. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Allow create_branch() to return errorsMichael Ellerman
Currently create_branch() creates a branch instruction for you, and patches it into the call site. In some circumstances it would be nice to be able to create the instruction and patch it later, and also some code might want to check for errors in the branch creation before doing the patching. A future commit will change create_branch() to check for errors. For callers that don't care, replace create_branch() with patch_branch(), which just creates the branch and patches it directly. While we're touching all the callers, change to using unsigned int *, as this seems to match usage better. That allows (and requires) us to remove the volatile in the definition of vector in powermac/smp.c and mpc86xx_smp.c, that's correct because now that we're passing vector as an unsigned int * the compiler knows that it's value might change across the patch_branch() call. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Kumar Gala <galak@kernel.crashing.org> Acked-by: Jon Loeliger <jdl@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Move code patching code into arch/powerpc/lib/code-patching.cMichael Ellerman
We currently have a few routines for patching code in asm/system.h, because they didn't fit anywhere else. I'd like to clean them up a little and add some more, so first move them into a dedicated C file - they don't need to be inlined. While we're moving the code, drop create_function_call(), it's intended caller never got merged and will be replaced in future with something different. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: asm/elf.h: Reduce userspace headerAdrian Bunk
This makes asm/elf.h export less non-userspace stuff to userspace. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Don't export asm/asm-compat.h to userspaceAdrian Bunk
asm/asm-compat.h doesn't seem to be intended for userspace usage. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Only demote individual slices rather than whole processPaul Mackerras
At present, if we have a kernel with a 64kB page size, and some process maps something that has to be mapped with 4kB pages (such as a cache-inhibited mapping on POWER5+, or the eHCA infiniband queue-pair pages), we change the process to use 4kB pages everywhere. This hurts the performance of HPC programs that access eHCA from userspace. With this patch, the kernel will only demote the slice(s) containing the eHCA or cache-inhibited mappings, leaving the remaining slices able to use 64kB hardware pages. This also changes the slice_get_unmapped_area code so that it is willing to place a 64k-page mapping into (or across) a 4k-page slice if there is no better alternative, i.e. if the program specified MAP_FIXED or if there is not sufficient space available in slices that are either empty or already have 64k-page mappings in them. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-06-30powerpc: Add cputable entry for POWER7Michael Neuling
Add a cputable entry for the POWER7 processor. Also tell firmware that we know about POWER7. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30powerpc: Fix copy-and-paste error in clrsetbits_le16Scott Wood
This was pointed out by Detlev Zundel when this code was being added to U-boot. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30powerpc: Get rid of bitfields in ppc_bat structBecky Bruce
While working on the 36-bit physical support, I noticed that there was exactly one line of code that actually referenced the bitfields. So I got rid of them and redefined ppc_bat as a struct of 2 u32's: batu and batl. I also got rid of the previous union that held the bitfield structs and a word representation of the batu/l values. This seems like a nicer solution than adding in a bunch of new bitfields to support extended bat addressing that would never get used, and just leaving the struct as-is would have been incomplete in the face of large physical addressing. Signed-off-by: Becky Bruce <becky.bruce@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30powerpc: Change BAT code to use phys_addr_tBecky Bruce
Currently, the physical address is an unsigned long, but it should be phys_addr_t in set_bat, [v/p]_mapped_by_bat. Also, create a macro that can convert a large physical address into the correct format for programming the BAT registers. Signed-off-by: Becky Bruce <becky.bruce@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30powerpc: Silly spelling fix in pgtable-ppc32Becky Bruce
Signed-off-by: Becky Bruce <becky.bruce@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30powerpc: Provide dummy crash_shutdown_registerArnd Bergmann
When kexec is disabled, the crash_shutdown_{un,}register functions are not available in the kernel. This provides dummy inline functions for those so that the callers don't have to worry about it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30powerpc: Free a PTE bit on ppc64 with 64K pagesBenjamin Herrenschmidt
This frees a PTE bit when using 64K pages on ppc64. This is done by getting rid of the separate _PAGE_HASHPTE bit. Instead, we just test if any of the 16 sub-page bits is set. For non-combo pages (ie. real 64K pages), we set SUB0 and the location encoding in that field. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30powerpc: Implement OF PCI address accessors stubs for CONFIG_PCI=nAnton Vorontsov
To avoid "#ifdef CONFIG_PCI" in the drivers we should provide stubs in place of OF PCI address accessors. Without these stubs build breaks for drivers not strictly requiring PCI, for example CONFIG_FB_OF=y without CONFIG_PCI: LD .tmp_vmlinux1 drivers/built-in.o: In function `offb_map_reg': offb.c:(.text+0x6e7c): undefined reference to `of_get_pci_address' OF PCI IRQ accessors require pci_dev argument, so drivers using PCI IRQs should depend on CONFIG_PCI anyway. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30powerpc: Optimise smp_wmb on 64-bit processorsNick Piggin
For 64-bit processors, lwsync is the recommended method of store/store ordering on caching enabled memory. For those subarchs which have lwsync, use it rather than eieio for smp_wmb. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-30Merge branch 'linux-2.6'Paul Mackerras
2008-06-30Merge branch 'next' of ↵Paul Mackerras
master.kernel.org:/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx
2008-06-27kbuild: fix a.out.h export to userspace with O= build.David Woodhouse
We need to check for existence of the a.out.h header in the source tree, not the object tree, if we want it to get the right answer with O=. Signed-off-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-06-26powerpc/QE: use arch_initcall to probe QUICC Engine GPIOsAnton Vorontsov
It was discussed that global arch_initcall() is preferred way to probe QE GPIOs, so let's use it. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-26powerpc/cpm: Remove !CONFIG_PPC_CPM_NEW_BINDING codeKumar Gala
Now that arch/ppc is gone we always define CONFIG_PPC_CPM_NEW_BINDING so we can remove all the code associated with !CONFIG_PPC_CPM_NEW_BINDING. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-26powerpc/e500mc: flush L2 on NAP for e500mcKumar Gala
If we have an L2CSR register (e500mc) we need to flush the L2 before going to nap. We use the HW flush mechanism provided in that register. The code reuses the CPU_FTR_604_PERF_MON bit as it is no longer used by any code in the kernel. Additionally we didn't reuse the exist L2CR feature bit as this is intended for the 7xxx L2CR register and L2CSR is part of the new Freescale "Book-E" registers. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-26powerpc/85xx: add DOZE/NAP support for e500 coreKumar Gala
The e500 core enter DOZE/NAP power-saving modes when the core go to cpu_idle routine. The power management default running mode is DOZE, If the user echo 1 > /proc/sys/kernel/powersave-nap the system will change to NAP running mode. Signed-off-by: Dave Liu <daveliu@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-18powerpc/booke: Add support for new e500mc coreKumar Gala
The new e500mc core from Freescale is based on the e500v2 but with the following changes: * Supports only the Enhanced Debug Architecture (DSRR0/1, etc) * Floating Point * No SPE * Supports lwsync * Doorbell Exceptions * Hypervisor * Cache line size is now 64-bytes (e500v1/v2 have a 32-byte cache line) Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-17powerpc/4xx: Workaround for PPC440EPx/GRx PCI_28 ErrataJosh Boyer
The 440EPx/GRx chips don't support PCI MRM commands. Drivers determine this by looking for a zero value in the PCI cache line size register. However, some drivers write to this register upon initialization. This can cause MRMs to be used on these chips, which may cause deadlocks on PLB4. The workaround implemented here introduces a new indirect_type flag, called PPC_INDIRECT_TYPE_BROKEN_MRM. This is set in the pci_controller structure in the pci fixup function for 4xx PCI bridges by determining if the bridge is compatible with 440EPx/GRx. The flag is checked in the indirect_write_config function, and forces any writes to the PCI_CACHE_LINE_SIZE register to be zero, which will disable MRMs for these chips. A similar workaround has been tested by AMCC on various PCI cards, such as the Silicon Image ATA card and Intel E1000 GIGE card. Hangs were seen with the Silicon Image card, and MRMs were seen on the bus with a PCI analyzer. With the workaround in place, the card functioned properly and only Memory Reads were seen on the bus with the analyzer. Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2008-06-16powerpc/booke: Fix definitions for dbcr[1-2] and dbsr registersJerone Young
This takes values from the PowerPC ISA BookIII-E specifications that are for DBCR0. Many of these values are different from those currently specified, which are for the ppc405. Also added some bookE definitions for DBCR1 & DBCR2. [ galak@kernel.crashing.org: Added aliases to 40x DBCR0 to match Book-E, Added enhanced debug DBCR0/DBSR _CIRPT and _CRET defines and DBSR IRPT and RET. ] Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-16[POWERPC] Build fix for drivers/macintosh/mediabay.cAdrian Bunk
This fixes the following build error with CONFIG_BLK_DEV_IDE_PMAC=n: <-- snip --> ... CC drivers/macintosh/mediabay.o /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/macintosh/mediabay.c: In function 'check_media_bay': /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/macintosh/mediabay.c:428: error: 'struct media_bay_info' has no member named 'cd_index' make[3]: *** [drivers/macintosh/mediabay.o] Error 1 <-- snip --> Reported-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-16[POWERPC] Fix rmb to order cacheable vs. noncacheableNick Piggin
lwsync is explicitly defined not to have any effect on the ordering of accesses to device memory, so it cannot be used for rmb(). sync appears to be the only barrier which fits the bill. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-16Merge branch 'linux-2.6' into mergePaul Mackerras
2008-06-16powerpc/spufs: remove class_0_dsisr from spu exception handlingLuke Browning
According to the CBEA, the SPU dsisr is not updated for class 0 exceptions. spu_stopped() is testing the dsisr that was passed to it from the class 0 exception handler, so we return a false positive here. This patch cleans up the interrupt handler and erroneous tests in spu_stopped. It also removes the fields from the csa since it is not needed to process class 0 events. Signed-off-by: Luke Browning <lukebrowning@us.ibm.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
2008-06-11powerpc/QE: qe_reset should be __initAnton Vorontsov
This patch fixes following section mismatch: WARNING: arch/powerpc/sysdev/built-in.o(.text+0x11d8): Section mismatch in reference from the function qe_reset() to the function .init.text:cpm_muram_init() Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-10powerpc/QE: switch to the cpm_muram implementationAnton Vorontsov
This is very trivial patch. We're transitioning to the cpm_muram_* calls. That's it. Less trivial changes: - BD_SC_* defines were defined in the cpm.h and qe.h, so to avoid redefines we remove BD_SC from the qe.h and use cpm.h along with cpm_muram_* prototypes; - qe_muram_dump was unused and thus removed; - added some code to the cpm_common.c to support legacy QE bindings (data-only node name). - For convenience, define qe_* calls to cpm_*. So drivers need not to be changed. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-10powerpc/QE: implement support for the GPIO LIB APIAnton Vorontsov
This is needed to access QE GPIOs via Linux GPIO API. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-By: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-10powerpc/QE: prepare QE PIO code for GPIO LIB supportAnton Vorontsov
- split and export __par_io_config_pin() out of par_io_config_pin(), so we could use the prefixed version with GPIO LIB API; - rename struct port_regs to qe_pio_regs, and place it into qe.h; - rename #define NUM_OF_PINS to QE_PIO_PINS, and place it into qe.h. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-By: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-10powerpc/QE: add support for QE USB clocks routingAnton Vorontsov
This patch adds a function to the qe_lib to setup QE USB clocks routing. To setup clocks safely, cmxgcr register needs locking, so I just reused ucc_lock since it was used only to protect cmxgcr. The idea behind placing clocks routing functions into the qe_lib is that later we'll hopefully switch to the generic Linux Clock API, thus, for example, FHCI driver may be used for QE and CPM chips without nasty #ifdefs. This patch also fixes QE_USB_RESTART_TX command definition in the qe.h. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-By: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-10powerpc/sysdev: implement FSL GTM supportAnton Vorontsov
GTM stands for General-purpose Timers Module and able to generate timer{1,2,3,4} interrupts. These timers are used by the drivers that need time precise interrupts (like for USB transactions scheduling for the Freescale USB Host controller as found in some QE and CPM chips), or these timers could be used as wakeup events from the CPU deep-sleep mode. Things unimplemented: 1. Cascaded (32 bit) timers (1-2, 3-4). This is straightforward to implement when needed, two timers should be marked as "requested" and configured as appropriate. 2. Super-cascaded (64 bit) timers (1-2-3-4). This is also straightforward to implement when needed, all timers should be marked as "requested" and configured as appropriate. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-06-09powerpc: Improve (in|out)_[bl]eXX() asm codeTrent Piepho
Since commit 4cb3cee03d558fd457cb58f56c80a2a09a66110c the code generated for the in_beXX() and out_beXX() mmio functions has been sub-optimal. The out_leXX() family of functions are created with the macro DEF_MMIO_OUT_LE() while the out_beXX() family are created with DEF_MMIO_OUT_BE(). In what was perhaps a bit too much macro use, both of these macros are in turn created via the macro DEF_MMIO_OUT(). For the LE versions, eventually they boil down to an asm that will look something like this: asm("sync; stwbrx %1,0,%2" : "=m" (*addr) : "r" (val), "r" (addr)); The issue is that the "stwbrx" instruction only comes in an indexed, or 'x', version, in which the address is represented by the sum of two registers (the "0,%2"). Unfortunately, gcc doesn't have a constraint for an indexed memory reference. The "m" constraint allows both indexed and offset, i.e. register plus constant, memory references and there is no "stwbr" version for offset references. "m" also allows updating addresses and there is no 'u' version of "stwbrx" like there is with "stwux". The unused first operand to the asm is just to tell gcc that *addr is an output of the asm. The address used is passed in a single register via the third asm operand, and the index register is just hard coded as 0. This means gcc is forced to put the address in a single register and can't use index addressing, e.g. if one has the data in register 9, a base address in register 3 and an index in register 4, gcc must emit code like "add 11,4,3; stwbrx 9,0,11" instead of just "stwbrx 9,4,3". This costs an extra add instruction and another register. For gcc 4.0 and older, there doesn't appear to be anything that can be done. But for 4.1 and newer, there is a 'Z' constraint. It does not allow "updating" addresses, but does allow both indexed and offset addresses. However, the only allowed constant offset is 0. We can then use the undocumented 'y' operand modifier, which causes gcc to convert "0(reg)" into the equivilient "0,reg" format that can be used with stwbrx. This brings us the to problem with the BE version. In this case, the "stw" instruction does have both indexed and non-indexed versions. The final asm ends up looking like this: asm("sync; stw%U0%X0 %1,%0" : "=m" (*addr) : "r" (val), "r" (addr)); The undocumented codes "%U0" and "%0X" will generate a 'u' if the memory reference should be an auto-updating one, and an 'x' if the memory reference is indexed, respectively. The third operand is unused, it's just there because asm the code is reused from the LE version. However, gcc does not know this, and generates unnecessary code to stick addr in a register! To use the example from the LE version, gcc will generate "add 11,4,3; stwx 9,4,3". It is able to use the indexed address "4,3" for the "stwx", but still thinks it needs to put 4+3 into register 11, which will never be used. This also ends up happening a lot for the offset addressing mode, where common code like this: out_be32(&device_registers->some_register, data); uses an instruction like "stw 9, 42(3)", where register 3 has the pointer device_registers and 42 is the offset of some_register in that structure. gcc will be forced to generate the unnecessary instruction "addi 11, 3, 42" to put the address into a single (unused) register. The in_* versions end up having these exact same problems as well. Signed-off-by: Trent Piepho <tpiepho@freescale.com> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Andreas Schwab <schwab@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-09powerpc: Check that TASK_SIZE does not overlap KERNEL_STARTRune Torgersen
Make sure CONFIG_TASK_SIZE does not overlap CONFIG_KERNEL_START This could happen when overriding settings to get 1GB lowmem, and would lead to userland mysteriousely hanging. This setting is only used by PPC32. Signed-off-by: Rune Torgersen <runet@innovsys.com> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-09Merge branch 'merge'Paul Mackerras
Conflicts: arch/powerpc/sysdev/fsl_soc.c
2008-06-06KVM: ppc: Remove duplicate functionHollis Blanchard
This was left behind from some code movement. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-02[POWERPC] Move to runtime allocated exception stacksKumar Gala
For the additonal exception levels (critical, debug, machine check) on 40x/book-e we were using "static" allocations of the stack in the associated head.S. Move to a runtime allocation to make the code a bit easier to read as we mimic how we handle IRQ stacks. Its also a bit easier to setup the stack with a "dummy" thread_info in C code. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Acked-by: Paul Mackerras <paulus@samba.org>
2008-05-31[POWERPC] Add "memory" clobber to MMIO accessorsBenjamin Herrenschmidt
Gcc might re-order MMIO accessors vs. surrounding consistent memory accesses, which is a "bad thing", and could break drivers. This fixes it by adding a "memory" clobber to the MMIO accessors, which should prevent gcc from doing that reordering. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>