aboutsummaryrefslogtreecommitdiff
path: root/Documentation/powerpc
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/powerpc')
-rw-r--r--Documentation/powerpc/00-INDEX26
-rw-r--r--Documentation/powerpc/SBC8260_memory_mapping.txt197
-rw-r--r--Documentation/powerpc/bootwrapper.txt141
-rw-r--r--Documentation/powerpc/cpu_families.txt221
-rw-r--r--Documentation/powerpc/cpu_features.txt12
-rw-r--r--Documentation/powerpc/eeh-pci-error-recovery.txt23
-rw-r--r--Documentation/powerpc/firmware-assisted-dump.txt270
-rw-r--r--Documentation/powerpc/hvcs.txt8
-rw-r--r--Documentation/powerpc/kvm_440.txt41
-rw-r--r--Documentation/powerpc/mpc52xx.txt12
-rw-r--r--Documentation/powerpc/pmu-ebb.txt137
-rw-r--r--Documentation/powerpc/ppc_htab.txt118
-rw-r--r--Documentation/powerpc/ptrace.txt151
-rw-r--r--Documentation/powerpc/qe_firmware.txt295
-rw-r--r--Documentation/powerpc/smp.txt34
-rw-r--r--Documentation/powerpc/sound.txt81
-rw-r--r--Documentation/powerpc/transactional_memory.txt198
-rw-r--r--Documentation/powerpc/zImage_layout.txt47
18 files changed, 1498 insertions, 514 deletions
diff --git a/Documentation/powerpc/00-INDEX b/Documentation/powerpc/00-INDEX
index d6d65b9bcfe..6db73df0427 100644
--- a/Documentation/powerpc/00-INDEX
+++ b/Documentation/powerpc/00-INDEX
@@ -5,22 +5,28 @@ please mail me.
00-INDEX
- this file
+bootwrapper.txt
+ - Information on how the powerpc kernel is wrapped for boot on various
+ different platforms.
cpu_features.txt
- info on how we support a variety of CPUs with minimal compile-time
options.
eeh-pci-error-recovery.txt
- info on PCI Bus EEH Error Recovery
+firmware-assisted-dump.txt
+ - Documentation on the firmware assisted dump mechanism "fadump".
hvcs.txt
- IBM "Hypervisor Virtual Console Server" Installation Guide
+kvm_440.txt
+ - Various notes on the implementation of KVM for PowerPC 440.
mpc52xx.txt
- Linux 2.6.x on MPC52xx family
-ppc_htab.txt
- - info about the Linux/PPC /proc/ppc_htab entry
-SBC8260_memory_mapping.txt
- - EST SBC8260 board info
-smp.txt
- - use and state info about Linux/PPC on MP machines
-sound.txt
- - info on sound support under Linux/PPC
-zImage_layout.txt
- - info on the kernel images for Linux/PPC
+pmu-ebb.txt
+ - Description of the API for using the PMU with Event Based Branches.
+qe_firmware.txt
+ - describes the layout of firmware binaries for the Freescale QUICC
+ Engine and the code that parses and uploads the microcode therein.
+ptrace.txt
+ - Information on the ptrace interfaces for hardware debug registers.
+transactional_memory.txt
+ - Overview of the Power8 transactional memory support.
diff --git a/Documentation/powerpc/SBC8260_memory_mapping.txt b/Documentation/powerpc/SBC8260_memory_mapping.txt
deleted file mode 100644
index e6e9ee0506c..00000000000
--- a/Documentation/powerpc/SBC8260_memory_mapping.txt
+++ /dev/null
@@ -1,197 +0,0 @@
-Please mail me (Jon Diekema, diekema_jon@si.com or diekema@cideas.com)
-if you have questions, comments or corrections.
-
- * EST SBC8260 Linux memory mapping rules
-
- http://www.estc.com/
- http://www.estc.com/products/boards/SBC8260-8240_ds.html
-
- Initial conditions:
- -------------------
-
- Tasks that need to be perform by the boot ROM before control is
- transferred to zImage (compressed Linux kernel):
-
- - Define the IMMR to 0xf0000000
-
- - Initialize the memory controller so that RAM is available at
- physical address 0x00000000. On the SBC8260 is this 16M (64M)
- SDRAM.
-
- - The boot ROM should only clear the RAM that it is using.
-
- The reason for doing this is to enhances the chances of a
- successful post mortem on a Linux panic. One of the first
- items to examine is the 16k (LOG_BUF_LEN) circular console
- buffer called log_buf which is defined in kernel/printk.c.
-
- - To enhance boot ROM performance, the I-cache can be enabled.
-
- Date: Mon, 22 May 2000 14:21:10 -0700
- From: Neil Russell <caret@c-side.com>
-
- LiMon (LInux MONitor) runs with and starts Linux with MMU
- off, I-cache enabled, D-cache disabled. The I-cache doesn't
- need hints from the MMU to work correctly as the D-cache
- does. No D-cache means no special code to handle devices in
- the presence of cache (no snooping, etc). The use of the
- I-cache means that the monitor can run acceptably fast
- directly from ROM, rather than having to copy it to RAM.
-
- - Build the board information structure (see
- include/asm-ppc/est8260.h for its definition)
-
- - The compressed Linux kernel (zImage) contains a bootstrap loader
- that is position independent; you can load it into any RAM,
- ROM or FLASH memory address >= 0x00500000 (above 5 MB), or
- at its link address of 0x00400000 (4 MB).
-
- Note: If zImage is loaded at its link address of 0x00400000 (4 MB),
- then zImage will skip the step of moving itself to
- its link address.
-
- - Load R3 with the address of the board information structure
-
- - Transfer control to zImage
-
- - The Linux console port is SMC1, and the baud rate is controlled
- from the bi_baudrate field of the board information structure.
- On thing to keep in mind when picking the baud rate, is that
- there is no flow control on the SMC ports. I would stick
- with something safe and standard like 19200.
-
- On the EST SBC8260, the SMC1 port is on the COM1 connector of
- the board.
-
-
- EST SBC8260 defaults:
- ---------------------
-
- Chip
- Memory Sel Bus Use
- --------------------- --- --- ----------------------------------
- 0x00000000-0x03FFFFFF CS2 60x (16M or 64M)/64M SDRAM
- 0x04000000-0x04FFFFFF CS4 local 4M/16M SDRAM (soldered to the board)
- 0x21000000-0x21000000 CS7 60x 1B/64K Flash present detect (from the flash SIMM)
- 0x21000001-0x21000001 CS7 60x 1B/64K Switches (read) and LEDs (write)
- 0x22000000-0x2200FFFF CS5 60x 8K/64K EEPROM
- 0xFC000000-0xFCFFFFFF CS6 60x 2M/16M flash (8 bits wide, soldered to the board)
- 0xFE000000-0xFFFFFFFF CS0 60x 4M/16M flash (SIMM)
-
- Notes:
- ------
-
- - The chip selects can map 32K blocks and up (powers of 2)
-
- - The SDRAM machine can handled up to 128Mbytes per chip select
-
- - Linux uses the 60x bus memory (the SDRAM DIMM) for the
- communications buffers.
-
- - BATs can map 128K-256Mbytes each. There are four data BATs and
- four instruction BATs. Generally the data and instruction BATs
- are mapped the same.
-
- - The IMMR must be set above the kernel virtual memory addresses,
- which start at 0xC0000000. Otherwise, the kernel may crash as
- soon as you start any threads or processes due to VM collisions
- in the kernel or user process space.
-
-
- Details from Dan Malek <dan_malek@mvista.com> on 10/29/1999:
-
- The user application virtual space consumes the first 2 Gbytes
- (0x00000000 to 0x7FFFFFFF). The kernel virtual text starts at
- 0xC0000000, with data following. There is a "protection hole"
- between the end of kernel data and the start of the kernel
- dynamically allocated space, but this space is still within
- 0xCxxxxxxx.
-
- Obviously the kernel can't map any physical addresses 1:1 in
- these ranges.
-
-
- Details from Dan Malek <dan_malek@mvista.com> on 5/19/2000:
-
- During the early kernel initialization, the kernel virtual
- memory allocator is not operational. Prior to this KVM
- initialization, we choose to map virtual to physical addresses
- 1:1. That is, the kernel virtual address exactly matches the
- physical address on the bus. These mappings are typically done
- in arch/ppc/kernel/head.S, or arch/ppc/mm/init.c. Only
- absolutely necessary mappings should be done at this time, for
- example board control registers or a serial uart. Normal device
- driver initialization should map resources later when necessary.
-
- Although platform dependent, and certainly the case for embedded
- 8xx, traditionally memory is mapped at physical address zero,
- and I/O devices above physical address 0x80000000. The lowest
- and highest (above 0xf0000000) I/O addresses are traditionally
- used for devices or registers we need to map during kernel
- initialization and prior to KVM operation. For this reason,
- and since it followed prior PowerPC platform examples, I chose
- to map the embedded 8xx kernel to the 0xc0000000 virtual address.
- This way, we can enable the MMU to map the kernel for proper
- operation, and still map a few windows before the KVM is operational.
-
- On some systems, you could possibly run the kernel at the
- 0x80000000 or any other virtual address. It just depends upon
- mapping that must be done prior to KVM operational. You can never
- map devices or kernel spaces that overlap with the user virtual
- space. This is why default IMMR mapping used by most BDM tools
- won't work. They put the IMMR at something like 0x10000000 or
- 0x02000000 for example. You simply can't map these addresses early
- in the kernel, and continue proper system operation.
-
- The embedded 8xx/82xx kernel is mature enough that all you should
- need to do is map the IMMR someplace at or above 0xf0000000 and it
- should boot far enough to get serial console messages and KGDB
- connected on any platform. There are lots of other subtle memory
- management design features that you simply don't need to worry
- about. If you are changing functions related to MMU initialization,
- you are likely breaking things that are known to work and are
- heading down a path of disaster and frustration. Your changes
- should be to make the flexibility of the processor fit Linux,
- not force arbitrary and non-workable memory mappings into Linux.
-
- - You don't want to change KERNELLOAD or KERNELBASE, otherwise the
- virtual memory and MMU code will get confused.
-
- arch/ppc/Makefile:KERNELLOAD = 0xc0000000
-
- include/asm-ppc/page.h:#define PAGE_OFFSET 0xc0000000
- include/asm-ppc/page.h:#define KERNELBASE PAGE_OFFSET
-
- - RAM is at physical address 0x00000000, and gets mapped to
- virtual address 0xC0000000 for the kernel.
-
-
- Physical addresses used by the Linux kernel:
- --------------------------------------------
-
- 0x00000000-0x3FFFFFFF 1GB reserved for RAM
- 0xF0000000-0xF001FFFF 128K IMMR 64K used for dual port memory,
- 64K for 8260 registers
-
-
- Logical addresses used by the Linux kernel:
- -------------------------------------------
-
- 0xF0000000-0xFFFFFFFF 256M BAT0 (IMMR: dual port RAM, registers)
- 0xE0000000-0xEFFFFFFF 256M BAT1 (I/O space for custom boards)
- 0xC0000000-0xCFFFFFFF 256M BAT2 (RAM)
- 0xD0000000-0xDFFFFFFF 256M BAT3 (if RAM > 256MByte)
-
-
- EST SBC8260 Linux mapping:
- --------------------------
-
- DBAT0, IBAT0, cache inhibited:
-
- Chip
- Memory Sel Use
- --------------------- --- ---------------------------------
- 0xF0000000-0xF001FFFF n/a IMMR: dual port RAM, registers
-
- DBAT1, IBAT1, cache inhibited:
-
diff --git a/Documentation/powerpc/bootwrapper.txt b/Documentation/powerpc/bootwrapper.txt
new file mode 100644
index 00000000000..d60fced5e1c
--- /dev/null
+++ b/Documentation/powerpc/bootwrapper.txt
@@ -0,0 +1,141 @@
+The PowerPC boot wrapper
+------------------------
+Copyright (C) Secret Lab Technologies Ltd.
+
+PowerPC image targets compresses and wraps the kernel image (vmlinux) with
+a boot wrapper to make it usable by the system firmware. There is no
+standard PowerPC firmware interface, so the boot wrapper is designed to
+be adaptable for each kind of image that needs to be built.
+
+The boot wrapper can be found in the arch/powerpc/boot/ directory. The
+Makefile in that directory has targets for all the available image types.
+The different image types are used to support all of the various firmware
+interfaces found on PowerPC platforms. OpenFirmware is the most commonly
+used firmware type on general purpose PowerPC systems from Apple, IBM and
+others. U-Boot is typically found on embedded PowerPC hardware, but there
+are a handful of other firmware implementations which are also popular. Each
+firmware interface requires a different image format.
+
+The boot wrapper is built from the makefile in arch/powerpc/boot/Makefile and
+it uses the wrapper script (arch/powerpc/boot/wrapper) to generate target
+image. The details of the build system is discussed in the next section.
+Currently, the following image format targets exist:
+
+ cuImage.%: Backwards compatible uImage for older version of
+ U-Boot (for versions that don't understand the device
+ tree). This image embeds a device tree blob inside
+ the image. The boot wrapper, kernel and device tree
+ are all embedded inside the U-Boot uImage file format
+ with boot wrapper code that extracts data from the old
+ bd_info structure and loads the data into the device
+ tree before jumping into the kernel.
+ Because of the series of #ifdefs found in the
+ bd_info structure used in the old U-Boot interfaces,
+ cuImages are platform specific. Each specific
+ U-Boot platform has a different platform init file
+ which populates the embedded device tree with data
+ from the platform specific bd_info file. The platform
+ specific cuImage platform init code can be found in
+ arch/powerpc/boot/cuboot.*.c. Selection of the correct
+ cuImage init code for a specific board can be found in
+ the wrapper structure.
+ dtbImage.%: Similar to zImage, except device tree blob is embedded
+ inside the image instead of provided by firmware. The
+ output image file can be either an elf file or a flat
+ binary depending on the platform.
+ dtbImages are used on systems which do not have an
+ interface for passing a device tree directly.
+ dtbImages are similar to simpleImages except that
+ dtbImages have platform specific code for extracting
+ data from the board firmware, but simpleImages do not
+ talk to the firmware at all.
+ PlayStation 3 support uses dtbImage. So do Embedded
+ Planet boards using the PlanetCore firmware. Board
+ specific initialization code is typically found in a
+ file named arch/powerpc/boot/<platform>.c; but this
+ can be overridden by the wrapper script.
+ simpleImage.%: Firmware independent compressed image that does not
+ depend on any particular firmware interface and embeds
+ a device tree blob. This image is a flat binary that
+ can be loaded to any location in RAM and jumped to.
+ Firmware cannot pass any configuration data to the
+ kernel with this image type and it depends entirely on
+ the embedded device tree for all information.
+ The simpleImage is useful for booting systems with
+ an unknown firmware interface or for booting from
+ a debugger when no firmware is present (such as on
+ the Xilinx Virtex platform). The only assumption that
+ simpleImage makes is that RAM is correctly initialized
+ and that the MMU is either off or has RAM mapped to
+ base address 0.
+ simpleImage also supports inserting special platform
+ specific initialization code to the start of the bootup
+ sequence. The virtex405 platform uses this feature to
+ ensure that the cache is invalidated before caching
+ is enabled. Platform specific initialization code is
+ added as part of the wrapper script and is keyed on
+ the image target name. For example, all
+ simpleImage.virtex405-* targets will add the
+ virtex405-head.S initialization code (This also means
+ that the dts file for virtex405 targets should be
+ named (virtex405-<board>.dts). Search the wrapper
+ script for 'virtex405' and see the file
+ arch/powerpc/boot/virtex405-head.S for details.
+ treeImage.%; Image format for used with OpenBIOS firmware found
+ on some ppc4xx hardware. This image embeds a device
+ tree blob inside the image.
+ uImage: Native image format used by U-Boot. The uImage target
+ does not add any boot code. It just wraps a compressed
+ vmlinux in the uImage data structure. This image
+ requires a version of U-Boot that is able to pass
+ a device tree to the kernel at boot. If using an older
+ version of U-Boot, then you need to use a cuImage
+ instead.
+ zImage.%: Image format which does not embed a device tree.
+ Used by OpenFirmware and other firmware interfaces
+ which are able to supply a device tree. This image
+ expects firmware to provide the device tree at boot.
+ Typically, if you have general purpose PowerPC
+ hardware then you want this image format.
+
+Image types which embed a device tree blob (simpleImage, dtbImage, treeImage,
+and cuImage) all generate the device tree blob from a file in the
+arch/powerpc/boot/dts/ directory. The Makefile selects the correct device
+tree source based on the name of the target. Therefore, if the kernel is
+built with 'make treeImage.walnut simpleImage.virtex405-ml403', then the
+build system will use arch/powerpc/boot/dts/walnut.dts to build
+treeImage.walnut and arch/powerpc/boot/dts/virtex405-ml403.dts to build
+the simpleImage.virtex405-ml403.
+
+Two special targets called 'zImage' and 'zImage.initrd' also exist. These
+targets build all the default images as selected by the kernel configuration.
+Default images are selected by the boot wrapper Makefile
+(arch/powerpc/boot/Makefile) by adding targets to the $image-y variable. Look
+at the Makefile to see which default image targets are available.
+
+How it is built
+---------------
+arch/powerpc is designed to support multiplatform kernels, which means
+that a single vmlinux image can be booted on many different target boards.
+It also means that the boot wrapper must be able to wrap for many kinds of
+images on a single build. The design decision was made to not use any
+conditional compilation code (#ifdef, etc) in the boot wrapper source code.
+All of the boot wrapper pieces are buildable at any time regardless of the
+kernel configuration. Building all the wrapper bits on every kernel build
+also ensures that obscure parts of the wrapper are at the very least compile
+tested in a large variety of environments.
+
+The wrapper is adapted for different image types at link time by linking in
+just the wrapper bits that are appropriate for the image type. The 'wrapper
+script' (found in arch/powerpc/boot/wrapper) is called by the Makefile and
+is responsible for selecting the correct wrapper bits for the image type.
+The arguments are well documented in the script's comment block, so they
+are not repeated here. However, it is worth mentioning that the script
+uses the -p (platform) argument as the main method of deciding which wrapper
+bits to compile in. Look for the large 'case "$platform" in' block in the
+middle of the script. This is also the place where platform specific fixups
+can be selected by changing the link order.
+
+In particular, care should be taken when working with cuImages. cuImage
+wrapper bits are very board specific and care should be taken to make sure
+the target you are trying to build is supported by the wrapper bits.
diff --git a/Documentation/powerpc/cpu_families.txt b/Documentation/powerpc/cpu_families.txt
new file mode 100644
index 00000000000..fc08e22feb1
--- /dev/null
+++ b/Documentation/powerpc/cpu_families.txt
@@ -0,0 +1,221 @@
+CPU Families
+============
+
+This document tries to summarise some of the different cpu families that exist
+and are supported by arch/powerpc.
+
+
+Book3S (aka sPAPR)
+------------------
+
+ - Hash MMU
+ - Mix of 32 & 64 bit
+
+ +--------------+ +----------------+
+ | Old POWER | --------------> | RS64 (threads) |
+ +--------------+ +----------------+
+ |
+ |
+ v
+ +--------------+ +----------------+ +------+
+ | 601 | --------------> | 603 | ---> | e300 |
+ +--------------+ +----------------+ +------+
+ | |
+ | |
+ v v
+ +--------------+ +----------------+ +-------+
+ | 604 | | 750 (G3) | ---> | 750CX |
+ +--------------+ +----------------+ +-------+
+ | | |
+ | | |
+ v v v
+ +--------------+ +----------------+ +-------+
+ | 620 (64 bit) | | 7400 | | 750CL |
+ +--------------+ +----------------+ +-------+
+ | | |
+ | | |
+ v v v
+ +--------------+ +----------------+ +-------+
+ | POWER3/630 | | 7410 | | 750FX |
+ +--------------+ +----------------+ +-------+
+ | |
+ | |
+ v v
+ +--------------+ +----------------+
+ | POWER3+ | | 7450 |
+ +--------------+ +----------------+
+ | |
+ | |
+ v v
+ +--------------+ +----------------+
+ | POWER4 | | 7455 |
+ +--------------+ +----------------+
+ | |
+ | |
+ v v
+ +--------------+ +-------+ +----------------+
+ | POWER4+ | --> | 970 | | 7447 |
+ +--------------+ +-------+ +----------------+
+ | | |
+ | | |
+ v v v
+ +--------------+ +-------+ +----------------+
+ | POWER5 | | 970FX | | 7448 |
+ +--------------+ +-------+ +----------------+
+ | | |
+ | | |
+ v v v
+ +--------------+ +-------+ +----------------+
+ | POWER5+ | | 970MP | | e600 |
+ +--------------+ +-------+ +----------------+
+ |
+ |
+ v
+ +--------------+
+ | POWER5++ |
+ +--------------+
+ |
+ |
+ v
+ +--------------+ +-------+
+ | POWER6 | <-?-> | Cell |
+ +--------------+ +-------+
+ |
+ |
+ v
+ +--------------+
+ | POWER7 |
+ +--------------+
+ |
+ |
+ v
+ +--------------+
+ | POWER7+ |
+ +--------------+
+ |
+ |
+ v
+ +--------------+
+ | POWER8 |
+ +--------------+
+
+
+ +---------------+
+ | PA6T (64 bit) |
+ +---------------+
+
+
+IBM BookE
+---------
+
+ - Software loaded TLB.
+ - All 32 bit
+
+ +--------------+
+ | 401 |
+ +--------------+
+ |
+ |
+ v
+ +--------------+
+ | 403 |
+ +--------------+
+ |
+ |
+ v
+ +--------------+
+ | 405 |
+ +--------------+
+ |
+ |
+ v
+ +--------------+
+ | 440 |
+ +--------------+
+ |
+ |
+ v
+ +--------------+ +----------------+
+ | 450 | --> | BG/P |
+ +--------------+ +----------------+
+ |
+ |
+ v
+ +--------------+
+ | 460 |
+ +--------------+
+ |
+ |
+ v
+ +--------------+
+ | 476 |
+ +--------------+
+
+
+Motorola/Freescale 8xx
+----------------------
+
+ - Software loaded with hardware assist.
+ - All 32 bit
+
+ +-------------+
+ | MPC8xx Core |
+ +-------------+
+
+
+Freescale BookE
+---------------
+
+ - Software loaded TLB.
+ - e6500 adds HW loaded indirect TLB entries.
+ - Mix of 32 & 64 bit
+
+ +--------------+
+ | e200 |
+ +--------------+
+
+
+ +--------------------------------+
+ | e500 |
+ +--------------------------------+
+ |
+ |
+ v
+ +--------------------------------+
+ | e500v2 |
+ +--------------------------------+
+ |
+ |
+ v
+ +--------------------------------+
+ | e500mc (Book3e) |
+ +--------------------------------+
+ |
+ |
+ v
+ +--------------------------------+
+ | e5500 (64 bit) |
+ +--------------------------------+
+ |
+ |
+ v
+ +--------------------------------+
+ | e6500 (HW TLB) (Multithreaded) |
+ +--------------------------------+
+
+
+IBM A2 core
+-----------
+
+ - Book3E, software loaded TLB + HW loaded indirect TLB entries.
+ - 64 bit
+
+ +--------------+ +----------------+
+ | A2 core | --> | WSP |
+ +--------------+ +----------------+
+ |
+ |
+ v
+ +--------------+
+ | BG/Q |
+ +--------------+
diff --git a/Documentation/powerpc/cpu_features.txt b/Documentation/powerpc/cpu_features.txt
index 472739880e8..ae09df8722c 100644
--- a/Documentation/powerpc/cpu_features.txt
+++ b/Documentation/powerpc/cpu_features.txt
@@ -11,10 +11,10 @@ split instruction and data caches, and if the CPU supports the DOZE and NAP
sleep modes.
Detection of the feature set is simple. A list of processors can be found in
-arch/ppc/kernel/cputable.c. The PVR register is masked and compared with each
-value in the list. If a match is found, the cpu_features of cur_cpu_spec is
-assigned to the feature bitmask for this processor and a __setup_cpu function
-is called.
+arch/powerpc/kernel/cputable.c. The PVR register is masked and compared with
+each value in the list. If a match is found, the cpu_features of cur_cpu_spec
+is assigned to the feature bitmask for this processor and a __setup_cpu
+function is called.
C code may test 'cur_cpu_spec[smp_processor_id()]->cpu_features' for a
particular feature bit. This is done in quite a few places, for example
@@ -31,7 +31,7 @@ anyways).
After detecting the processor type, the kernel patches out sections of code
that shouldn't be used by writing nop's over it. Using cpufeatures requires
-just 2 macros (found in include/asm-ppc/cputable.h), as seen in head.S
+just 2 macros (found in arch/powerpc/include/asm/cputable.h), as seen in head.S
transfer_to_handler:
#ifdef CONFIG_ALTIVEC
@@ -51,6 +51,6 @@ should be used in the majority of cases.
The END_FTR_SECTION macros are implemented by storing information about this
code in the '__ftr_fixup' ELF section. When do_cpu_ftr_fixups
-(arch/ppc/kernel/misc.S) is invoked, it will iterate over the records in
+(arch/powerpc/kernel/misc.S) is invoked, it will iterate over the records in
__ftr_fixup, and if the required feature is not present it will loop writing
nop's from each BEGIN_FTR_SECTION to END_FTR_SECTION.
diff --git a/Documentation/powerpc/eeh-pci-error-recovery.txt b/Documentation/powerpc/eeh-pci-error-recovery.txt
index 67a11a36270..9d4e33df624 100644
--- a/Documentation/powerpc/eeh-pci-error-recovery.txt
+++ b/Documentation/powerpc/eeh-pci-error-recovery.txt
@@ -36,8 +36,8 @@ Causes of EEH Errors
EEH was originally designed to guard against hardware failure, such
as PCI cards dying from heat, humidity, dust, vibration and bad
electrical connections. The vast majority of EEH errors seen in
-"real life" are due to eithr poorly seated PCI cards, or,
-unfortunately quite commonly, due device driver bugs, device firmware
+"real life" are due to either poorly seated PCI cards, or,
+unfortunately quite commonly, due to device driver bugs, device firmware
bugs, and sometimes PCI card hardware bugs.
The most common software bug, is one that causes the device to
@@ -90,7 +90,7 @@ EEH-isolated, there is a firmware call it can make to determine if
this is the case. If so, then the device driver should put itself
into a consistent state (given that it won't be able to complete any
pending work) and start recovery of the card. Recovery normally
-would consist of reseting the PCI device (holding the PCI #RST
+would consist of resetting the PCI device (holding the PCI #RST
line high for two seconds), followed by setting up the device
config space (the base address registers (BAR's), latency timer,
cache line size, interrupt line, and so on). This is followed by a
@@ -116,12 +116,12 @@ At this time, a generic EEH recovery mechanism has been implemented,
so that individual device drivers do not need to be modified to support
EEH recovery. This generic mechanism piggy-backs on the PCI hotplug
infrastructure, and percolates events up through the userspace/udev
-infrastructure. Followiing is a detailed description of how this is
+infrastructure. Following is a detailed description of how this is
accomplished.
EEH must be enabled in the PHB's very early during the boot process,
and if a PCI slot is hot-plugged. The former is performed by
-eeh_init() in arch/ppc64/kernel/eeh.c, and the later by
+eeh_init() in arch/powerpc/platforms/pseries/eeh.c, and the later by
drivers/pci/hotplug/pSeries_pci.c calling in to the eeh.c code.
EEH must be enabled before a PCI scan of the device can proceed.
Current Power5 hardware will not work unless EEH is enabled;
@@ -133,7 +133,7 @@ error. Given an arbitrary address, the routine
pci_get_device_by_addr() will find the pci device associated
with that address (if any).
-The default include/asm-ppc64/io.h macros readb(), inb(), insb(),
+The default arch/powerpc/include/asm/io.h macros readb(), inb(), insb(),
etc. include a check to see if the i/o read returned all-0xff's.
If so, these make a call to eeh_dn_check_failure(), which in turn
asks the firmware if the all-ff's value is the sign of a true EEH
@@ -143,11 +143,12 @@ seen in /proc/ppc64/eeh (subject to change). Normally, almost
all of these occur during boot, when the PCI bus is scanned, where
a large number of 0xff reads are part of the bus scan procedure.
-If a frozen slot is detected, code in arch/ppc64/kernel/eeh.c will
-print a stack trace to syslog (/var/log/messages). This stack trace
-has proven to be very useful to device-driver authors for finding
-out at what point the EEH error was detected, as the error itself
-usually occurs slightly beforehand.
+If a frozen slot is detected, code in
+arch/powerpc/platforms/pseries/eeh.c will print a stack trace to
+syslog (/var/log/messages). This stack trace has proven to be very
+useful to device-driver authors for finding out at what point the EEH
+error was detected, as the error itself usually occurs slightly
+beforehand.
Next, it uses the Linux kernel notifier chain/work queue mechanism to
allow any interested parties to find out about the failure. Device
diff --git a/Documentation/powerpc/firmware-assisted-dump.txt b/Documentation/powerpc/firmware-assisted-dump.txt
new file mode 100644
index 00000000000..3007bc98af2
--- /dev/null
+++ b/Documentation/powerpc/firmware-assisted-dump.txt
@@ -0,0 +1,270 @@
+
+ Firmware-Assisted Dump
+ ------------------------
+ July 2011
+
+The goal of firmware-assisted dump is to enable the dump of
+a crashed system, and to do so from a fully-reset system, and
+to minimize the total elapsed time until the system is back
+in production use.
+
+- Firmware assisted dump (fadump) infrastructure is intended to replace
+ the existing phyp assisted dump.
+- Fadump uses the same firmware interfaces and memory reservation model
+ as phyp assisted dump.
+- Unlike phyp dump, fadump exports the memory dump through /proc/vmcore
+ in the ELF format in the same way as kdump. This helps us reuse the
+ kdump infrastructure for dump capture and filtering.
+- Unlike phyp dump, userspace tool does not need to refer any sysfs
+ interface while reading /proc/vmcore.
+- Unlike phyp dump, fadump allows user to release all the memory reserved
+ for dump, with a single operation of echo 1 > /sys/kernel/fadump_release_mem.
+- Once enabled through kernel boot parameter, fadump can be
+ started/stopped through /sys/kernel/fadump_registered interface (see
+ sysfs files section below) and can be easily integrated with kdump
+ service start/stop init scripts.
+
+Comparing with kdump or other strategies, firmware-assisted
+dump offers several strong, practical advantages:
+
+-- Unlike kdump, the system has been reset, and loaded
+ with a fresh copy of the kernel. In particular,
+ PCI and I/O devices have been reinitialized and are
+ in a clean, consistent state.
+-- Once the dump is copied out, the memory that held the dump
+ is immediately available to the running kernel. And therefore,
+ unlike kdump, fadump doesn't need a 2nd reboot to get back
+ the system to the production configuration.
+
+The above can only be accomplished by coordination with,
+and assistance from the Power firmware. The procedure is
+as follows:
+
+-- The first kernel registers the sections of memory with the
+ Power firmware for dump preservation during OS initialization.
+ These registered sections of memory are reserved by the first
+ kernel during early boot.
+
+-- When a system crashes, the Power firmware will save
+ the low memory (boot memory of size larger of 5% of system RAM
+ or 256MB) of RAM to the previous registered region. It will
+ also save system registers, and hardware PTE's.
+
+ NOTE: The term 'boot memory' means size of the low memory chunk
+ that is required for a kernel to boot successfully when
+ booted with restricted memory. By default, the boot memory
+ size will be the larger of 5% of system RAM or 256MB.
+ Alternatively, user can also specify boot memory size
+ through boot parameter 'fadump_reserve_mem=' which will
+ override the default calculated size. Use this option
+ if default boot memory size is not sufficient for second
+ kernel to boot successfully.
+
+-- After the low memory (boot memory) area has been saved, the
+ firmware will reset PCI and other hardware state. It will
+ *not* clear the RAM. It will then launch the bootloader, as
+ normal.
+
+-- The freshly booted kernel will notice that there is a new
+ node (ibm,dump-kernel) in the device tree, indicating that
+ there is crash data available from a previous boot. During
+ the early boot OS will reserve rest of the memory above
+ boot memory size effectively booting with restricted memory
+ size. This will make sure that the second kernel will not
+ touch any of the dump memory area.
+
+-- User-space tools will read /proc/vmcore to obtain the contents
+ of memory, which holds the previous crashed kernel dump in ELF
+ format. The userspace tools may copy this info to disk, or
+ network, nas, san, iscsi, etc. as desired.
+
+-- Once the userspace tool is done saving dump, it will echo
+ '1' to /sys/kernel/fadump_release_mem to release the reserved
+ memory back to general use, except the memory required for
+ next firmware-assisted dump registration.
+
+ e.g.
+ # echo 1 > /sys/kernel/fadump_release_mem
+
+Please note that the firmware-assisted dump feature
+is only available on Power6 and above systems with recent
+firmware versions.
+
+Implementation details:
+----------------------
+
+During boot, a check is made to see if firmware supports
+this feature on that particular machine. If it does, then
+we check to see if an active dump is waiting for us. If yes
+then everything but boot memory size of RAM is reserved during
+early boot (See Fig. 2). This area is released once we finish
+collecting the dump from user land scripts (e.g. kdump scripts)
+that are run. If there is dump data, then the
+/sys/kernel/fadump_release_mem file is created, and the reserved
+memory is held.
+
+If there is no waiting dump data, then only the memory required
+to hold CPU state, HPTE region, boot memory dump and elfcore
+header, is reserved at the top of memory (see Fig. 1). This area
+is *not* released: this region will be kept permanently reserved,
+so that it can act as a receptacle for a copy of the boot memory
+content in addition to CPU state and HPTE region, in the case a
+crash does occur.
+
+ o Memory Reservation during first kernel
+
+ Low memory Top of memory
+ 0 boot memory size |
+ | | |<--Reserved dump area -->|
+ V V | Permanent Reservation V
+ +-----------+----------/ /----------+---+----+-----------+----+
+ | | |CPU|HPTE| DUMP |ELF |
+ +-----------+----------/ /----------+---+----+-----------+----+
+ | ^
+ | |
+ \ /
+ -------------------------------------------
+ Boot memory content gets transferred to
+ reserved area by firmware at the time of
+ crash
+ Fig. 1
+
+ o Memory Reservation during second kernel after crash
+
+ Low memory Top of memory
+ 0 boot memory size |
+ | |<------------- Reserved dump area ----------- -->|
+ V V V
+ +-----------+----------/ /----------+---+----+-----------+----+
+ | | |CPU|HPTE| DUMP |ELF |
+ +-----------+----------/ /----------+---+----+-----------+----+
+ | |
+ V V
+ Used by second /proc/vmcore
+ kernel to boot
+ Fig. 2
+
+Currently the dump will be copied from /proc/vmcore to a
+a new file upon user intervention. The dump data available through
+/proc/vmcore will be in ELF format. Hence the existing kdump
+infrastructure (kdump scripts) to save the dump works fine with
+minor modifications.
+
+The tools to examine the dump will be same as the ones
+used for kdump.
+
+How to enable firmware-assisted dump (fadump):
+-------------------------------------
+
+1. Set config option CONFIG_FA_DUMP=y and build kernel.
+2. Boot into linux kernel with 'fadump=on' kernel cmdline option.
+3. Optionally, user can also set 'fadump_reserve_mem=' kernel cmdline
+ to specify size of the memory to reserve for boot memory dump
+ preservation.
+
+NOTE: If firmware-assisted dump fails to reserve memory then it will
+ fallback to existing kdump mechanism if 'crashkernel=' option
+ is set at kernel cmdline.
+
+Sysfs/debugfs files:
+------------
+
+Firmware-assisted dump feature uses sysfs file system to hold
+the control files and debugfs file to display memory reserved region.
+
+Here is the list of files under kernel sysfs:
+
+ /sys/kernel/fadump_enabled
+
+ This is used to display the fadump status.
+ 0 = fadump is disabled
+ 1 = fadump is enabled
+
+ This interface can be used by kdump init scripts to identify if
+ fadump is enabled in the kernel and act accordingly.
+
+ /sys/kernel/fadump_registered
+
+ This is used to display the fadump registration status as well
+ as to control (start/stop) the fadump registration.
+ 0 = fadump is not registered.
+ 1 = fadump is registered and ready to handle system crash.
+
+ To register fadump echo 1 > /sys/kernel/fadump_registered and
+ echo 0 > /sys/kernel/fadump_registered for un-register and stop the
+ fadump. Once the fadump is un-registered, the system crash will not
+ be handled and vmcore will not be captured. This interface can be
+ easily integrated with kdump service start/stop.
+
+ /sys/kernel/fadump_release_mem
+
+ This file is available only when fadump is active during
+ second kernel. This is used to release the reserved memory
+ region that are held for saving crash dump. To release the
+ reserved memory echo 1 to it:
+
+ echo 1 > /sys/kernel/fadump_release_mem
+
+ After echo 1, the content of the /sys/kernel/debug/powerpc/fadump_region
+ file will change to reflect the new memory reservations.
+
+ The existing userspace tools (kdump infrastructure) can be easily
+ enhanced to use this interface to release the memory reserved for
+ dump and continue without 2nd reboot.
+
+Here is the list of files under powerpc debugfs:
+(Assuming debugfs is mounted on /sys/kernel/debug directory.)
+
+ /sys/kernel/debug/powerpc/fadump_region
+
+ This file shows the reserved memory regions if fadump is
+ enabled otherwise this file is empty. The output format
+ is:
+ <region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
+
+ e.g.
+ Contents when fadump is registered during first kernel
+
+ # cat /sys/kernel/debug/powerpc/fadump_region
+ CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x0
+ HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x0
+ DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x0
+
+ Contents when fadump is active during second kernel
+
+ # cat /sys/kernel/debug/powerpc/fadump_region
+ CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x40020
+ HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x1000
+ DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x10000000
+ : [0x00000010000000-0x0000006ffaffff] 0x5ffb0000 bytes, Dumped: 0x5ffb0000
+
+NOTE: Please refer to Documentation/filesystems/debugfs.txt on
+ how to mount the debugfs filesystem.
+
+
+TODO:
+-----
+ o Need to come up with the better approach to find out more
+ accurate boot memory size that is required for a kernel to
+ boot successfully when booted with restricted memory.
+ o The fadump implementation introduces a fadump crash info structure
+ in the scratch area before the ELF core header. The idea of introducing
+ this structure is to pass some important crash info data to the second
+ kernel which will help second kernel to populate ELF core header with
+ correct data before it gets exported through /proc/vmcore. The current
+ design implementation does not address a possibility of introducing
+ additional fields (in future) to this structure without affecting
+ compatibility. Need to come up with the better approach to address this.
+ The possible approaches are:
+ 1. Introduce version field for version tracking, bump up the version
+ whenever a new field is added to the structure in future. The version
+ field can be used to find out what fields are valid for the current
+ version of the structure.
+ 2. Reserve the area of predefined size (say PAGE_SIZE) for this
+ structure and have unused area as reserved (initialized to zero)
+ for future field additions.
+ The advantage of approach 1 over 2 is we don't need to reserve extra space.
+---
+Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
+This document is based on the original documentation written for phyp
+assisted dump by Linas Vepstas and Manish Ahuja.
diff --git a/Documentation/powerpc/hvcs.txt b/Documentation/powerpc/hvcs.txt
index dca75cbda6f..a730ca5a07f 100644
--- a/Documentation/powerpc/hvcs.txt
+++ b/Documentation/powerpc/hvcs.txt
@@ -259,7 +259,7 @@ This index of '2' means that in order to connect to vty-server adapter
It should be noted that due to the system hotplug I/O capabilities of a
system the /dev/hvcs* entry that interacts with a particular vty-server
-adapter is not guarenteed to remain the same across system reboots. Look
+adapter is not guaranteed to remain the same across system reboots. Look
in the Q & A section for more on this issue.
---------------------------------------------------------------------------
@@ -528,7 +528,7 @@ this driver assignment of hotplug added vty-servers may be in a different
order than how they would be exposed on module load. Rebooting or
reloading the module after dynamic addition may result in the /dev/hvcs*
and vty-server coupling changing if a vty-server adapter was added in a
-slot inbetween two other vty-server adapters. Refer to the section above
+slot between two other vty-server adapters. Refer to the section above
on how to determine which vty-server goes with which /dev/hvcs* node.
Hint; look at the sysfs "index" attribute for the vty-server.
@@ -558,9 +558,9 @@ partitions.
The proper channel for reporting bugs is either through the Linux OS
distribution company that provided your OS or by posting issues to the
-ppc64 development mailing list at:
+PowerPC development mailing list at:
-linuxppc64-dev@lists.linuxppc.org
+linuxppc-dev@lists.ozlabs.org
This request is to provide a documented and searchable public exchange
of the problems and solutions surrounding this driver for the benefit of
diff --git a/Documentation/powerpc/kvm_440.txt b/Documentation/powerpc/kvm_440.txt
new file mode 100644
index 00000000000..c02a003fa03
--- /dev/null
+++ b/Documentation/powerpc/kvm_440.txt
@@ -0,0 +1,41 @@
+Hollis Blanchard <hollisb@us.ibm.com>
+15 Apr 2008
+
+Various notes on the implementation of KVM for PowerPC 440:
+
+To enforce isolation, host userspace, guest kernel, and guest userspace all
+run at user privilege level. Only the host kernel runs in supervisor mode.
+Executing privileged instructions in the guest traps into KVM (in the host
+kernel), where we decode and emulate them. Through this technique, unmodified
+440 Linux kernels can be run (slowly) as guests. Future performance work will
+focus on reducing the overhead and frequency of these traps.
+
+The usual code flow is started from userspace invoking an "run" ioctl, which
+causes KVM to switch into guest context. We use IVPR to hijack the host
+interrupt vectors while running the guest, which allows us to direct all
+interrupts to kvmppc_handle_interrupt(). At this point, we could either
+- handle the interrupt completely (e.g. emulate "mtspr SPRG0"), or
+- let the host interrupt handler run (e.g. when the decrementer fires), or
+- return to host userspace (e.g. when the guest performs device MMIO)
+
+Address spaces: We take advantage of the fact that Linux doesn't use the AS=1
+address space (in host or guest), which gives us virtual address space to use
+for guest mappings. While the guest is running, the host kernel remains mapped
+in AS=0, but the guest can only use AS=1 mappings.
+
+TLB entries: The TLB entries covering the host linear mapping remain
+present while running the guest. This reduces the overhead of lightweight
+exits, which are handled by KVM running in the host kernel. We keep three
+copies of the TLB:
+ - guest TLB: contents of the TLB as the guest sees it
+ - shadow TLB: the TLB that is actually in hardware while guest is running
+ - host TLB: to restore TLB state when context switching guest -> host
+When a TLB miss occurs because a mapping was not present in the shadow TLB,
+but was present in the guest TLB, KVM handles the fault without invoking the
+guest. Large guest pages are backed by multiple 4KB shadow pages through this
+mechanism.
+
+IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network
+and block IO, so those drivers must be enabled in the guest. It's possible
+that some qemu device emulation (e.g. e1000 or rtl8139) may also work with
+little effort.
diff --git a/Documentation/powerpc/mpc52xx.txt b/Documentation/powerpc/mpc52xx.txt
index 10dd4ab93b8..0d540a31ea1 100644
--- a/Documentation/powerpc/mpc52xx.txt
+++ b/Documentation/powerpc/mpc52xx.txt
@@ -2,7 +2,7 @@ Linux 2.6.x on MPC52xx family
-----------------------------
For the latest info, go to http://www.246tNt.com/mpc52xx/
-
+
To compile/use :
- U-Boot:
@@ -10,23 +10,23 @@ To compile/use :
if you wish to ).
# make lite5200_defconfig
# make uImage
-
+
then, on U-boot:
=> tftpboot 200000 uImage
=> tftpboot 400000 pRamdisk
=> bootm 200000 400000
-
+
- DBug:
# <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
if you wish to ).
# make lite5200_defconfig
# cp your_initrd.gz arch/ppc/boot/images/ramdisk.image.gz
- # make zImage.initrd
- # make
+ # make zImage.initrd
+ # make
then in DBug:
DBug> dn -i zImage.initrd.lite5200
-
+
Some remarks :
- The port is named mpc52xxx, and config options are PPC_MPC52xx. The MGT5100
diff --git a/Documentation/powerpc/pmu-ebb.txt b/Documentation/powerpc/pmu-ebb.txt
new file mode 100644
index 00000000000..73cd163dbfb
--- /dev/null
+++ b/Documentation/powerpc/pmu-ebb.txt
@@ -0,0 +1,137 @@
+PMU Event Based Branches
+========================
+
+Event Based Branches (EBBs) are a feature which allows the hardware to
+branch directly to a specified user space address when certain events occur.
+
+The full specification is available in Power ISA v2.07:
+
+ https://www.power.org/documentation/power-isa-version-2-07/
+
+One type of event for which EBBs can be configured is PMU exceptions. This
+document describes the API for configuring the Power PMU to generate EBBs,
+using the Linux perf_events API.
+
+
+Terminology
+-----------
+
+Throughout this document we will refer to an "EBB event" or "EBB events". This
+just refers to a struct perf_event which has set the "EBB" flag in its
+attr.config. All events which can be configured on the hardware PMU are
+possible "EBB events".
+
+
+Background
+----------
+
+When a PMU EBB occurs it is delivered to the currently running process. As such
+EBBs can only sensibly be used by programs for self-monitoring.
+
+It is a feature of the perf_events API that events can be created on other
+processes, subject to standard permission checks. This is also true of EBB
+events, however unless the target process enables EBBs (via mtspr(BESCR)) no
+EBBs will ever be delivered.
+
+This makes it possible for a process to enable EBBs for itself, but not
+actually configure any events. At a later time another process can come along
+and attach an EBB event to the process, which will then cause EBBs to be
+delivered to the first process. It's not clear if this is actually useful.
+
+
+When the PMU is configured for EBBs, all PMU interrupts are delivered to the
+user process. This means once an EBB event is scheduled on the PMU, no non-EBB
+events can be configured. This means that EBB events can not be run
+concurrently with regular 'perf' commands, or any other perf events.
+
+It is however safe to run 'perf' commands on a process which is using EBBs. The
+kernel will in general schedule the EBB event, and perf will be notified that
+its events could not run.
+
+The exclusion between EBB events and regular events is implemented using the
+existing "pinned" and "exclusive" attributes of perf_events. This means EBB
+events will be given priority over other events, unless they are also pinned.
+If an EBB event and a regular event are both pinned, then whichever is enabled
+first will be scheduled and the other will be put in error state. See the
+section below titled "Enabling an EBB event" for more information.
+
+
+Creating an EBB event
+---------------------
+
+To request that an event is counted using EBB, the event code should have bit
+63 set.
+
+EBB events must be created with a particular, and restrictive, set of
+attributes - this is so that they interoperate correctly with the rest of the
+perf_events subsystem.
+
+An EBB event must be created with the "pinned" and "exclusive" attributes set.
+Note that if you are creating a group of EBB events, only the leader can have
+these attributes set.
+
+An EBB event must NOT set any of the "inherit", "sample_period", "freq" or
+"enable_on_exec" attributes.
+
+An EBB event must be attached to a task. This is specified to perf_event_open()
+by passing a pid value, typically 0 indicating the current task.
+
+All events in a group must agree on whether they want EBB. That is all events
+must request EBB, or none may request EBB.
+
+EBB events must specify the PMC they are to be counted on. This ensures
+userspace is able to reliably determine which PMC the event is scheduled on.
+
+
+Enabling an EBB event
+---------------------
+
+Once an EBB event has been successfully opened, it must be enabled with the
+perf_events API. This can be achieved either via the ioctl() interface, or the
+prctl() interface.
+
+However, due to the design of the perf_events API, enabling an event does not
+guarantee that it has been scheduled on the PMU. To ensure that the EBB event
+has been scheduled on the PMU, you must perform a read() on the event. If the
+read() returns EOF, then the event has not been scheduled and EBBs are not
+enabled.
+
+This behaviour occurs because the EBB event is pinned and exclusive. When the
+EBB event is enabled it will force all other non-pinned events off the PMU. In
+this case the enable will be successful. However if there is already an event
+pinned on the PMU then the enable will not be successful.
+
+
+Reading an EBB event
+--------------------
+
+It is possible to read() from an EBB event. However the results are
+meaningless. Because interrupts are being delivered to the user process the
+kernel is not able to count the event, and so will return a junk value.
+
+
+Closing an EBB event
+--------------------
+
+When an EBB event is finished with, you can close it using close() as for any
+regular event. If this is the last EBB event the PMU will be deconfigured and
+no further PMU EBBs will be delivered.
+
+
+EBB Handler
+-----------
+
+The EBB handler is just regular userspace code, however it must be written in
+the style of an interrupt handler. When the handler is entered all registers
+are live (possibly) and so must be saved somehow before the handler can invoke
+other code.
+
+It's up to the program how to handle this. For C programs a relatively simple
+option is to create an interrupt frame on the stack and save registers there.
+
+Fork
+----
+
+EBB events are not inherited across fork. If the child process wishes to use
+EBBs it should open a new event for itself. Similarly the EBB state in
+BESCR/EBBHR/EBBRR is cleared across fork().
diff --git a/Documentation/powerpc/ppc_htab.txt b/Documentation/powerpc/ppc_htab.txt
deleted file mode 100644
index 8b8c7df29fa..00000000000
--- a/Documentation/powerpc/ppc_htab.txt
+++ /dev/null
@@ -1,118 +0,0 @@
- Information about /proc/ppc_htab
-=====================================================================
-
-This document and the related code was written by me (Cort Dougan), please
-email me (cort@fsmlabs.com) if you have questions, comments or corrections.
-
-Last Change: 2.16.98
-
-This entry in the proc directory is readable by all users but only
-writable by root.
-
-The ppc_htab interface is a user level way of accessing the
-performance monitoring registers as well as providing information
-about the PTE hash table.
-
-1. Reading
-
- Reading this file will give you information about the memory management
- hash table that serves as an extended tlb for page translation on the
- powerpc. It will also give you information about performance measurement
- specific to the cpu that you are using.
-
- Explanation of the 604 Performance Monitoring Fields:
- MMCR0 - the current value of the MMCR0 register
- PMC1
- PMC2 - the value of the performance counters and a
- description of what events they are counting
- which are based on MMCR0 bit settings.
- Explanation of the PTE Hash Table fields:
-
- Size - hash table size in Kb.
- Buckets - number of buckets in the table.
- Address - the virtual kernel address of the hash table base.
- Entries - the number of ptes that can be stored in the hash table.
- User/Kernel - how many pte's are in use by the kernel or user at that time.
- Overflows - How many of the entries are in their secondary hash location.
- Percent full - ratio of free pte entries to in use entries.
- Reloads - Count of how many hash table misses have occurred
- that were fixed with a reload from the linux tables.
- Should always be 0 on 603 based machines.
- Non-error Misses - Count of how many hash table misses have occurred
- that were completed with the creation of a pte in the linux
- tables with a call to do_page_fault().
- Error Misses - Number of misses due to errors such as bad address
- and permission violations. This includes kernel access of
- bad user addresses that are fixed up by the trap handler.
-
- Note that calculation of the data displayed from /proc/ppc_htab takes
- a long time and spends a great deal of time in the kernel. It would
- be quite hard on performance to read this file constantly. In time
- there may be a counter in the kernel that allows successive reads from
- this file only after a given amount of time has passed to reduce the
- possibility of a user slowing the system by reading this file.
-
-2. Writing
-
- Writing to the ppc_htab allows you to change the characteristics of
- the powerpc PTE hash table and setup performance monitoring.
-
- Resizing the PTE hash table is not enabled right now due to many
- complications with moving the hash table, rehashing the entries
- and many many SMP issues that would have to be dealt with.
-
- Write options to ppc_htab:
-
- - To set the size of the hash table to 64Kb:
-
- echo 'size 64' > /proc/ppc_htab
-
- The size must be a multiple of 64 and must be greater than or equal to
- 64.
-
- - To turn off performance monitoring:
-
- echo 'off' > /proc/ppc_htab
-
- - To reset the counters without changing what they're counting:
-
- echo 'reset' > /proc/ppc_htab
-
- Note that counting will continue after the reset if it is enabled.
-
- - To count only events in user mode or only in kernel mode:
-
- echo 'user' > /proc/ppc_htab
- ...or...
- echo 'kernel' > /proc/ppc_htab
-
- Note that these two options are exclusive of one another and the
- lack of either of these options counts user and kernel.
- Using 'reset' and 'off' reset these flags.
-
- - The 604 has 2 performance counters which can each count events from
- a specific set of events. These sets are disjoint so it is not
- possible to count _any_ combination of 2 events. One event can
- be counted by PMC1 and one by PMC2.
-
- To start counting a particular event use:
-
- echo 'event' > /proc/ppc_htab
-
- and choose from these events:
-
- PMC1
- ----
- 'ic miss' - instruction cache misses
- 'dtlb' - data tlb misses (not hash table misses)
-
- PMC2
- ----
- 'dc miss' - data cache misses
- 'itlb' - instruction tlb misses (not hash table misses)
- 'load miss time' - cycles to complete a load miss
-
-3. Bugs
-
- The PMC1 and PMC2 counters can overflow and give no indication of that
- in /proc/ppc_htab.
diff --git a/Documentation/powerpc/ptrace.txt b/Documentation/powerpc/ptrace.txt
new file mode 100644
index 00000000000..99c5ce88d0f
--- /dev/null
+++ b/Documentation/powerpc/ptrace.txt
@@ -0,0 +1,151 @@
+GDB intends to support the following hardware debug features of BookE
+processors:
+
+4 hardware breakpoints (IAC)
+2 hardware watchpoints (read, write and read-write) (DAC)
+2 value conditions for the hardware watchpoints (DVC)
+
+For that, we need to extend ptrace so that GDB can query and set these
+resources. Since we're extending, we're trying to create an interface
+that's extendable and that covers both BookE and server processors, so
+that GDB doesn't need to special-case each of them. We added the
+following 3 new ptrace requests.
+
+1. PTRACE_PPC_GETHWDEBUGINFO
+
+Query for GDB to discover the hardware debug features. The main info to
+be returned here is the minimum alignment for the hardware watchpoints.
+BookE processors don't have restrictions here, but server processors have
+an 8-byte alignment restriction for hardware watchpoints. We'd like to avoid
+adding special cases to GDB based on what it sees in AUXV.
+
+Since we're at it, we added other useful info that the kernel can return to
+GDB: this query will return the number of hardware breakpoints, hardware
+watchpoints and whether it supports a range of addresses and a condition.
+The query will fill the following structure provided by the requesting process:
+
+struct ppc_debug_info {
+ unit32_t version;
+ unit32_t num_instruction_bps;
+ unit32_t num_data_bps;
+ unit32_t num_condition_regs;
+ unit32_t data_bp_alignment;
+ unit32_t sizeof_condition; /* size of the DVC register */
+ uint64_t features; /* bitmask of the individual flags */
+};
+
+features will have bits indicating whether there is support for:
+
+#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
+#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2
+#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
+#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
+#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
+
+2. PTRACE_SETHWDEBUG
+
+Sets a hardware breakpoint or watchpoint, according to the provided structure:
+
+struct ppc_hw_breakpoint {
+ uint32_t version;
+#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1
+#define PPC_BREAKPOINT_TRIGGER_READ 0x2
+#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4
+ uint32_t trigger_type; /* only some combinations allowed */
+#define PPC_BREAKPOINT_MODE_EXACT 0x0
+#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1
+#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2
+#define PPC_BREAKPOINT_MODE_MASK 0x3
+ uint32_t addr_mode; /* address match mode */
+
+#define PPC_BREAKPOINT_CONDITION_MODE 0x3
+#define PPC_BREAKPOINT_CONDITION_NONE 0x0
+#define PPC_BREAKPOINT_CONDITION_AND 0x1
+#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */
+#define PPC_BREAKPOINT_CONDITION_OR 0x2
+#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
+#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
+#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16))
+ uint32_t condition_mode; /* break/watchpoint condition flags */
+
+ uint64_t addr;
+ uint64_t addr2;
+ uint64_t condition_value;
+};
+
+A request specifies one event, not necessarily just one register to be set.
+For instance, if the request is for a watchpoint with a condition, both the
+DAC and DVC registers will be set in the same request.
+
+With this GDB can ask for all kinds of hardware breakpoints and watchpoints
+that the BookE supports. COMEFROM breakpoints available in server processors
+are not contemplated, but that is out of the scope of this work.
+
+ptrace will return an integer (handle) uniquely identifying the breakpoint or
+watchpoint just created. This integer will be used in the PTRACE_DELHWDEBUG
+request to ask for its removal. Return -ENOSPC if the requested breakpoint
+can't be allocated on the registers.
+
+Some examples of using the structure to:
+
+- set a breakpoint in the first breakpoint register
+
+ p.version = PPC_DEBUG_CURRENT_VERSION;
+ p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
+ p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
+ p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
+ p.addr = (uint64_t) address;
+ p.addr2 = 0;
+ p.condition_value = 0;
+
+- set a watchpoint which triggers on reads in the second watchpoint register
+
+ p.version = PPC_DEBUG_CURRENT_VERSION;
+ p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
+ p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
+ p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
+ p.addr = (uint64_t) address;
+ p.addr2 = 0;
+ p.condition_value = 0;
+
+- set a watchpoint which triggers only with a specific value
+
+ p.version = PPC_DEBUG_CURRENT_VERSION;
+ p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
+ p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
+ p.condition_mode = PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL;
+ p.addr = (uint64_t) address;
+ p.addr2 = 0;
+ p.condition_value = (uint64_t) condition;
+
+- set a ranged hardware breakpoint
+
+ p.version = PPC_DEBUG_CURRENT_VERSION;
+ p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
+ p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
+ p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
+ p.addr = (uint64_t) begin_range;
+ p.addr2 = (uint64_t) end_range;
+ p.condition_value = 0;
+
+- set a watchpoint in server processors (BookS)
+
+ p.version = 1;
+ p.trigger_type = PPC_BREAKPOINT_TRIGGER_RW;
+ p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
+ or
+ p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
+
+ p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
+ p.addr = (uint64_t) begin_range;
+ /* For PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE addr2 needs to be specified, where
+ * addr2 - addr <= 8 Bytes.
+ */
+ p.addr2 = (uint64_t) end_range;
+ p.condition_value = 0;
+
+3. PTRACE_DELHWDEBUG
+
+Takes an integer which identifies an existing breakpoint or watchpoint
+(i.e., the value returned from PTRACE_SETHWDEBUG), and deletes the
+corresponding breakpoint or watchpoint..
diff --git a/Documentation/powerpc/qe_firmware.txt b/Documentation/powerpc/qe_firmware.txt
new file mode 100644
index 00000000000..2031ddb33d0
--- /dev/null
+++ b/Documentation/powerpc/qe_firmware.txt
@@ -0,0 +1,295 @@
+ Freescale QUICC Engine Firmware Uploading
+ -----------------------------------------
+
+(c) 2007 Timur Tabi <timur at freescale.com>,
+ Freescale Semiconductor
+
+Table of Contents
+=================
+
+ I - Software License for Firmware
+
+ II - Microcode Availability
+
+ III - Description and Terminology
+
+ IV - Microcode Programming Details
+
+ V - Firmware Structure Layout
+
+ VI - Sample Code for Creating Firmware Files
+
+Revision Information
+====================
+
+November 30, 2007: Rev 1.0 - Initial version
+
+I - Software License for Firmware
+=================================
+
+Each firmware file comes with its own software license. For information on
+the particular license, please see the license text that is distributed with
+the firmware.
+
+II - Microcode Availability
+===========================
+
+Firmware files are distributed through various channels. Some are available on
+http://opensource.freescale.com. For other firmware files, please contact
+your Freescale representative or your operating system vendor.
+
+III - Description and Terminology
+================================
+
+In this document, the term 'microcode' refers to the sequence of 32-bit
+integers that compose the actual QE microcode.
+
+The term 'firmware' refers to a binary blob that contains the microcode as
+well as other data that
+
+ 1) describes the microcode's purpose
+ 2) describes how and where to upload the microcode
+ 3) specifies the values of various registers
+ 4) includes additional data for use by specific device drivers
+
+Firmware files are binary files that contain only a firmware.
+
+IV - Microcode Programming Details
+===================================
+
+The QE architecture allows for only one microcode present in I-RAM for each
+RISC processor. To replace any current microcode, a full QE reset (which
+disables the microcode) must be performed first.
+
+QE microcode is uploaded using the following procedure:
+
+1) The microcode is placed into I-RAM at a specific location, using the
+ IRAM.IADD and IRAM.IDATA registers.
+
+2) The CERCR.CIR bit is set to 0 or 1, depending on whether the firmware
+ needs split I-RAM. Split I-RAM is only meaningful for SOCs that have
+ QEs with multiple RISC processors, such as the 8360. Splitting the I-RAM
+ allows each processor to run a different microcode, effectively creating an
+ asymmetric multiprocessing (AMP) system.
+
+3) The TIBCR trap registers are loaded with the addresses of the trap handlers
+ in the microcode.
+
+4) The RSP.ECCR register is programmed with the value provided.
+
+5) If necessary, device drivers that need the virtual traps and extended mode
+ data will use them.
+
+Virtual Microcode Traps
+
+These virtual traps are conditional branches in the microcode. These are
+"soft" provisional introduced in the ROMcode in order to enable higher
+flexibility and save h/w traps If new features are activated or an issue is
+being fixed in the RAM package utilizing they should be activated. This data
+structure signals the microcode which of these virtual traps is active.
+
+This structure contains 6 words that the application should copy to some
+specific been defined. This table describes the structure.
+
+ ---------------------------------------------------------------
+ | Offset in | | Destination Offset | Size of |
+ | array | Protocol | within PRAM | Operand |
+ --------------------------------------------------------------|
+ | 0 | Ethernet | 0xF8 | 4 bytes |
+ | | interworking | | |
+ ---------------------------------------------------------------
+ | 4 | ATM | 0xF8 | 4 bytes |
+ | | interworking | | |
+ ---------------------------------------------------------------
+ | 8 | PPP | 0xF8 | 4 bytes |
+ | | interworking | | |
+ ---------------------------------------------------------------
+ | 12 | Ethernet RX | 0x22 | 1 byte |
+ | | Distributor Page | | |
+ ---------------------------------------------------------------
+ | 16 | ATM Globtal | 0x28 | 1 byte |
+ | | Params Table | | |
+ ---------------------------------------------------------------
+ | 20 | Insert Frame | 0xF8 | 4 bytes |
+ ---------------------------------------------------------------
+
+
+Extended Modes
+
+This is a double word bit array (64 bits) that defines special functionality
+which has an impact on the softwarew drivers. Each bit has its own impact
+and has special instructions for the s/w associated with it. This structure is
+described in this table:
+
+ -----------------------------------------------------------------------
+ | Bit # | Name | Description |
+ -----------------------------------------------------------------------
+ | 0 | General | Indicates that prior to each host command |
+ | | push command | given by the application, the software must |
+ | | | assert a special host command (push command)|
+ | | | CECDR = 0x00800000. |
+ | | | CECR = 0x01c1000f. |
+ -----------------------------------------------------------------------
+ | 1 | UCC ATM | Indicates that after issuing ATM RX INIT |
+ | | RX INIT | command, the host must issue another special|
+ | | push command | command (push command) and immediately |
+ | | | following that re-issue the ATM RX INIT |
+ | | | command. (This makes the sequence of |
+ | | | initializing the ATM receiver a sequence of |
+ | | | three host commands) |
+ | | | CECDR = 0x00800000. |
+ | | | CECR = 0x01c1000f. |
+ -----------------------------------------------------------------------
+ | 2 | Add/remove | Indicates that following the specific host |
+ | | command | command: "Add/Remove entry in Hash Lookup |
+ | | validation | Table" used in Interworking setup, the user |
+ | | | must issue another command. |
+ | | | CECDR = 0xce000003. |
+ | | | CECR = 0x01c10f58. |
+ -----------------------------------------------------------------------
+ | 3 | General push | Indicates that the s/w has to initialize |
+ | | command | some pointers in the Ethernet thread pages |
+ | | | which are used when Header Compression is |
+ | | | activated. The full details of these |
+ | | | pointers is located in the software drivers.|
+ -----------------------------------------------------------------------
+ | 4 | General push | Indicates that after issuing Ethernet TX |
+ | | command | INIT command, user must issue this command |
+ | | | for each SNUM of Ethernet TX thread. |
+ | | | CECDR = 0x00800003. |
+ | | | CECR = 0x7'b{0}, 8'b{Enet TX thread SNUM}, |
+ | | | 1'b{1}, 12'b{0}, 4'b{1} |
+ -----------------------------------------------------------------------
+ | 5 - 31 | N/A | Reserved, set to zero. |
+ -----------------------------------------------------------------------
+
+V - Firmware Structure Layout
+==============================
+
+QE microcode from Freescale is typically provided as a header file. This
+header file contains macros that define the microcode binary itself as well as
+some other data used in uploading that microcode. The format of these files
+do not lend themselves to simple inclusion into other code. Hence,
+the need for a more portable format. This section defines that format.
+
+Instead of distributing a header file, the microcode and related data are
+embedded into a binary blob. This blob is passed to the qe_upload_firmware()
+function, which parses the blob and performs everything necessary to upload
+the microcode.
+
+All integers are big-endian. See the comments for function
+qe_upload_firmware() for up-to-date implementation information.
+
+This structure supports versioning, where the version of the structure is
+embedded into the structure itself. To ensure forward and backwards
+compatibility, all versions of the structure must use the same 'qe_header'
+structure at the beginning.
+
+'header' (type: struct qe_header):
+ The 'length' field is the size, in bytes, of the entire structure,
+ including all the microcode embedded in it, as well as the CRC (if
+ present).
+
+ The 'magic' field is an array of three bytes that contains the letters
+ 'Q', 'E', and 'F'. This is an identifier that indicates that this
+ structure is a QE Firmware structure.
+
+ The 'version' field is a single byte that indicates the version of this
+ structure. If the layout of the structure should ever need to be
+ changed to add support for additional types of microcode, then the
+ version number should also be changed.
+
+The 'id' field is a null-terminated string(suitable for printing) that
+identifies the firmware.
+
+The 'count' field indicates the number of 'microcode' structures. There
+must be one and only one 'microcode' structure for each RISC processor.
+Therefore, this field also represents the number of RISC processors for this
+SOC.
+
+The 'soc' structure contains the SOC numbers and revisions used to match
+the microcode to the SOC itself. Normally, the microcode loader should
+check the data in this structure with the SOC number and revisions, and
+only upload the microcode if there's a match. However, this check is not
+made on all platforms.
+
+Although it is not recommended, you can specify '0' in the soc.model
+field to skip matching SOCs altogether.
+
+The 'model' field is a 16-bit number that matches the actual SOC. The
+'major' and 'minor' fields are the major and minor revision numbers,
+respectively, of the SOC.
+
+For example, to match the 8323, revision 1.0:
+ soc.model = 8323
+ soc.major = 1
+ soc.minor = 0
+
+'padding' is necessary for structure alignment. This field ensures that the
+'extended_modes' field is aligned on a 64-bit boundary.
+
+'extended_modes' is a bitfield that defines special functionality which has an
+impact on the device drivers. Each bit has its own impact and has special
+instructions for the driver associated with it. This field is stored in
+the QE library and available to any driver that calles qe_get_firmware_info().
+
+'vtraps' is an array of 8 words that contain virtual trap values for each
+virtual traps. As with 'extended_modes', this field is stored in the QE
+library and available to any driver that calles qe_get_firmware_info().
+
+'microcode' (type: struct qe_microcode):
+ For each RISC processor there is one 'microcode' structure. The first
+ 'microcode' structure is for the first RISC, and so on.
+
+ The 'id' field is a null-terminated string suitable for printing that
+ identifies this particular microcode.
+
+ 'traps' is an array of 16 words that contain hardware trap values
+ for each of the 16 traps. If trap[i] is 0, then this particular
+ trap is to be ignored (i.e. not written to TIBCR[i]). The entire value
+ is written as-is to the TIBCR[i] register, so be sure to set the EN
+ and T_IBP bits if necessary.
+
+ 'eccr' is the value to program into the ECCR register.
+
+ 'iram_offset' is the offset into IRAM to start writing the
+ microcode.
+
+ 'count' is the number of 32-bit words in the microcode.
+
+ 'code_offset' is the offset, in bytes, from the beginning of this
+ structure where the microcode itself can be found. The first
+ microcode binary should be located immediately after the 'microcode'
+ array.
+
+ 'major', 'minor', and 'revision' are the major, minor, and revision
+ version numbers, respectively, of the microcode. If all values are 0,
+ then these fields are ignored.
+
+ 'reserved' is necessary for structure alignment. Since 'microcode'
+ is an array, the 64-bit 'extended_modes' field needs to be aligned
+ on a 64-bit boundary, and this can only happen if the size of
+ 'microcode' is a multiple of 8 bytes. To ensure that, we add
+ 'reserved'.
+
+After the last microcode is a 32-bit CRC. It can be calculated using
+this algorithm:
+
+u32 crc32(const u8 *p, unsigned int len)
+{
+ unsigned int i;
+ u32 crc = 0;
+
+ while (len--) {
+ crc ^= *p++;
+ for (i = 0; i < 8; i++)
+ crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
+ }
+ return crc;
+}
+
+VI - Sample Code for Creating Firmware Files
+============================================
+
+A Python program that creates firmware binaries from the header files normally
+distributed by Freescale can be found on http://opensource.freescale.com.
diff --git a/Documentation/powerpc/smp.txt b/Documentation/powerpc/smp.txt
deleted file mode 100644
index 5b581b849ff..00000000000
--- a/Documentation/powerpc/smp.txt
+++ /dev/null
@@ -1,34 +0,0 @@
- Information about Linux/PPC SMP mode
-=====================================================================
-
-This document and the related code was written by me
-(Cort Dougan, cort@fsmlabs.com) please email me if you have questions,
-comments or corrections.
-
-Last Change: 3.31.99
-
-If you want to help by writing code or testing different hardware please
-email me!
-
-1. State of Supported Hardware
-
- PowerSurge Architecture - tested on UMAX s900, Apple 9600
- The second processor on this machine boots up just fine and
- enters its idle loop. Hopefully a completely working SMP kernel
- on this machine will be done shortly.
-
- The code makes the assumption of only two processors. The changes
- necessary to work with any number would not be overly difficult but
- I don't have any machines with >2 processors so it's not high on my
- list of priorities. If anyone else would like do to the work email
- me and I can point out the places that need changed. If you have >2
- processors and don't want to add support yourself let me know and I
- can take a look into it.
-
- BeBox
- BeBox support hasn't been added to the 2.1.X kernels from 2.0.X
- but work is being done and SMP support for BeBox is in the works.
-
- CHRP
- CHRP SMP works and is fairly solid. It's been tested on the IBM F50
- with 4 processors for quite some time now.
diff --git a/Documentation/powerpc/sound.txt b/Documentation/powerpc/sound.txt
deleted file mode 100644
index df23d95e03a..00000000000
--- a/Documentation/powerpc/sound.txt
+++ /dev/null
@@ -1,81 +0,0 @@
- Information about PowerPC Sound support
-=====================================================================
-
-Please mail me (Cort Dougan, cort@fsmlabs.com) if you have questions,
-comments or corrections.
-
-Last Change: 6.16.99
-
-This just covers sound on the PReP and CHRP systems for now and later
-will contain information on the PowerMac's.
-
-Sound on PReP has been tested and is working with the PowerStack and IBM
-Power Series onboard sound systems which are based on the cs4231(2) chip.
-The sound options when doing the make config are a bit different from
-the default, though.
-
-The I/O base, irq and dma lines that you enter during the make config
-are ignored and are set when booting according to the machine type.
-This is so that one binary can be used for Motorola and IBM machines
-which use different values and isn't allowed by the driver, so things
-are hacked together in such a way as to allow this information to be
-set automatically on boot.
-
-1. Motorola PowerStack PReP machines
-
- Enable support for "Crystal CS4232 based (PnP) cards" and for the
- Microsoft Sound System. The MSS isn't used, but some of the routines
- that the CS4232 driver uses are in it.
-
- Although the options you set are ignored and determined automatically
- on boot these are included for information only:
-
- (830) CS4232 audio I/O base 530, 604, E80 or F40
- (10) CS4232 audio IRQ 5, 7, 9, 11, 12 or 15
- (6) CS4232 audio DMA 0, 1 or 3
- (7) CS4232 second (duplex) DMA 0, 1 or 3
-
- This will allow simultaneous record and playback, as 2 different dma
- channels are used.
-
- The sound will be all left channel and very low volume since the
- auxiliary input isn't muted by default. I had the changes necessary
- for this in the kernel but the sound driver maintainer didn't want
- to include them since it wasn't common in other machines. To fix this
- you need to mute it using a mixer utility of some sort (if you find one
- please let me know) or by patching the driver yourself and recompiling.
-
- There is a problem on the PowerStack 2's (PowerStack Pro's) using a
- different irq/drq than the kernel expects. Unfortunately, I don't know
- which irq/drq it is so if anyone knows please email me.
-
- Midi is not supported since the cs4232 driver doesn't support midi yet.
-
-2. IBM PowerPersonal PReP machines
-
- I've only tested sound on the Power Personal Series of IBM workstations
- so if you try it on others please let me know the result. I'm especially
- interested in the 43p's sound system, which I know nothing about.
-
- Enable support for "Crystal CS4232 based (PnP) cards" and for the
- Microsoft Sound System. The MSS isn't used, but some of the routines
- that the CS4232 driver uses are in it.
-
- Although the options you set are ignored and determined automatically
- on boot these are included for information only:
-
- (530) CS4232 audio I/O base 530, 604, E80 or F40
- (5) CS4232 audio IRQ 5, 7, 9, 11, 12 or 15
- (1) CS4232 audio DMA 0, 1 or 3
- (7) CS4232 second (duplex) DMA 0, 1 or 3
- (330) CS4232 MIDI I/O base 330, 370, 3B0 or 3F0
- (9) CS4232 MIDI IRQ 5, 7, 9, 11, 12 or 15
-
- This setup does _NOT_ allow for recording yet.
-
- Midi is not supported since the cs4232 driver doesn't support midi yet.
-
-2. IBM CHRP
-
- I have only tested this on the 43P-150. Build the kernel with the cs4232
- set as a module and load the module with irq=9 dma=1 dma2=2 io=0x550
diff --git a/Documentation/powerpc/transactional_memory.txt b/Documentation/powerpc/transactional_memory.txt
new file mode 100644
index 00000000000..9791e98ab49
--- /dev/null
+++ b/Documentation/powerpc/transactional_memory.txt
@@ -0,0 +1,198 @@
+Transactional Memory support
+============================
+
+POWER kernel support for this feature is currently limited to supporting
+its use by user programs. It is not currently used by the kernel itself.
+
+This file aims to sum up how it is supported by Linux and what behaviour you
+can expect from your user programs.
+
+
+Basic overview
+==============
+
+Hardware Transactional Memory is supported on POWER8 processors, and is a
+feature that enables a different form of atomic memory access. Several new
+instructions are presented to delimit transactions; transactions are
+guaranteed to either complete atomically or roll back and undo any partial
+changes.
+
+A simple transaction looks like this:
+
+begin_move_money:
+ tbegin
+ beq abort_handler
+
+ ld r4, SAVINGS_ACCT(r3)
+ ld r5, CURRENT_ACCT(r3)
+ subi r5, r5, 1
+ addi r4, r4, 1
+ std r4, SAVINGS_ACCT(r3)
+ std r5, CURRENT_ACCT(r3)
+
+ tend
+
+ b continue
+
+abort_handler:
+ ... test for odd failures ...
+
+ /* Retry the transaction if it failed because it conflicted with
+ * someone else: */
+ b begin_move_money
+
+
+The 'tbegin' instruction denotes the start point, and 'tend' the end point.
+Between these points the processor is in 'Transactional' state; any memory
+references will complete in one go if there are no conflicts with other
+transactional or non-transactional accesses within the system. In this
+example, the transaction completes as though it were normal straight-line code
+IF no other processor has touched SAVINGS_ACCT(r3) or CURRENT_ACCT(r3); an
+atomic move of money from the current account to the savings account has been
+performed. Even though the normal ld/std instructions are used (note no
+lwarx/stwcx), either *both* SAVINGS_ACCT(r3) and CURRENT_ACCT(r3) will be
+updated, or neither will be updated.
+
+If, in the meantime, there is a conflict with the locations accessed by the
+transaction, the transaction will be aborted by the CPU. Register and memory
+state will roll back to that at the 'tbegin', and control will continue from
+'tbegin+4'. The branch to abort_handler will be taken this second time; the
+abort handler can check the cause of the failure, and retry.
+
+Checkpointed registers include all GPRs, FPRs, VRs/VSRs, LR, CCR/CR, CTR, FPCSR
+and a few other status/flag regs; see the ISA for details.
+
+Causes of transaction aborts
+============================
+
+- Conflicts with cache lines used by other processors
+- Signals
+- Context switches
+- See the ISA for full documentation of everything that will abort transactions.
+
+
+Syscalls
+========
+
+Performing syscalls from within transaction is not recommended, and can lead
+to unpredictable results.
+
+Syscalls do not by design abort transactions, but beware: The kernel code will
+not be running in transactional state. The effect of syscalls will always
+remain visible, but depending on the call they may abort your transaction as a
+side-effect, read soon-to-be-aborted transactional data that should not remain
+invisible, etc. If you constantly retry a transaction that constantly aborts
+itself by calling a syscall, you'll have a livelock & make no progress.
+
+Simple syscalls (e.g. sigprocmask()) "could" be OK. Even things like write()
+from, say, printf() should be OK as long as the kernel does not access any
+memory that was accessed transactionally.
+
+Consider any syscalls that happen to work as debug-only -- not recommended for
+production use. Best to queue them up till after the transaction is over.
+
+
+Signals
+=======
+
+Delivery of signals (both sync and async) during transactions provides a second
+thread state (ucontext/mcontext) to represent the second transactional register
+state. Signal delivery 'treclaim's to capture both register states, so signals
+abort transactions. The usual ucontext_t passed to the signal handler
+represents the checkpointed/original register state; the signal appears to have
+arisen at 'tbegin+4'.
+
+If the sighandler ucontext has uc_link set, a second ucontext has been
+delivered. For future compatibility the MSR.TS field should be checked to
+determine the transactional state -- if so, the second ucontext in uc->uc_link
+represents the active transactional registers at the point of the signal.
+
+For 64-bit processes, uc->uc_mcontext.regs->msr is a full 64-bit MSR and its TS
+field shows the transactional mode.
+
+For 32-bit processes, the mcontext's MSR register is only 32 bits; the top 32
+bits are stored in the MSR of the second ucontext, i.e. in
+uc->uc_link->uc_mcontext.regs->msr. The top word contains the transactional
+state TS.
+
+However, basic signal handlers don't need to be aware of transactions
+and simply returning from the handler will deal with things correctly:
+
+Transaction-aware signal handlers can read the transactional register state
+from the second ucontext. This will be necessary for crash handlers to
+determine, for example, the address of the instruction causing the SIGSEGV.
+
+Example signal handler:
+
+ void crash_handler(int sig, siginfo_t *si, void *uc)
+ {
+ ucontext_t *ucp = uc;
+ ucontext_t *transactional_ucp = ucp->uc_link;
+
+ if (ucp_link) {
+ u64 msr = ucp->uc_mcontext.regs->msr;
+ /* May have transactional ucontext! */
+#ifndef __powerpc64__
+ msr |= ((u64)transactional_ucp->uc_mcontext.regs->msr) << 32;
+#endif
+ if (MSR_TM_ACTIVE(msr)) {
+ /* Yes, we crashed during a transaction. Oops. */
+ fprintf(stderr, "Transaction to be restarted at 0x%llx, but "
+ "crashy instruction was at 0x%llx\n",
+ ucp->uc_mcontext.regs->nip,
+ transactional_ucp->uc_mcontext.regs->nip);
+ }
+ }
+
+ fix_the_problem(ucp->dar);
+ }
+
+When in an active transaction that takes a signal, we need to be careful with
+the stack. It's possible that the stack has moved back up after the tbegin.
+The obvious case here is when the tbegin is called inside a function that
+returns before a tend. In this case, the stack is part of the checkpointed
+transactional memory state. If we write over this non transactionally or in
+suspend, we are in trouble because if we get a tm abort, the program counter and
+stack pointer will be back at the tbegin but our in memory stack won't be valid
+anymore.
+
+To avoid this, when taking a signal in an active transaction, we need to use
+the stack pointer from the checkpointed state, rather than the speculated
+state. This ensures that the signal context (written tm suspended) will be
+written below the stack required for the rollback. The transaction is aborted
+because of the treclaim, so any memory written between the tbegin and the
+signal will be rolled back anyway.
+
+For signals taken in non-TM or suspended mode, we use the
+normal/non-checkpointed stack pointer.
+
+
+Failure cause codes used by kernel
+==================================
+
+These are defined in <asm/reg.h>, and distinguish different reasons why the
+kernel aborted a transaction:
+
+ TM_CAUSE_RESCHED Thread was rescheduled.
+ TM_CAUSE_TLBI Software TLB invalide.
+ TM_CAUSE_FAC_UNAV FP/VEC/VSX unavailable trap.
+ TM_CAUSE_SYSCALL Currently unused; future syscalls that must abort
+ transactions for consistency will use this.
+ TM_CAUSE_SIGNAL Signal delivered.
+ TM_CAUSE_MISC Currently unused.
+ TM_CAUSE_ALIGNMENT Alignment fault.
+ TM_CAUSE_EMULATE Emulation that touched memory.
+
+These can be checked by the user program's abort handler as TEXASR[0:7]. If
+bit 7 is set, it indicates that the error is consider persistent. For example
+a TM_CAUSE_ALIGNMENT will be persistent while a TM_CAUSE_RESCHED will not.q
+
+GDB
+===
+
+GDB and ptrace are not currently TM-aware. If one stops during a transaction,
+it looks like the transaction has just started (the checkpointed state is
+presented). The transaction cannot then be continued and will take the failure
+handler route. Furthermore, the transactional 2nd register state will be
+inaccessible. GDB can currently be used on programs using TM, but not sensibly
+in parts within transactions.
diff --git a/Documentation/powerpc/zImage_layout.txt b/Documentation/powerpc/zImage_layout.txt
deleted file mode 100644
index 048e0150f57..00000000000
--- a/Documentation/powerpc/zImage_layout.txt
+++ /dev/null
@@ -1,47 +0,0 @@
- Information about the Linux/PPC kernel images
-=====================================================================
-
-Please mail me (Cort Dougan, cort@fsmlabs.com) if you have questions,
-comments or corrections.
-
-This document is meant to answer several questions I've had about how
-the PReP system boots and how Linux/PPC interacts with that mechanism.
-It would be nice if we could have information on how other architectures
-boot here as well. If you have anything to contribute, please
-let me know.
-
-
-1. PReP boot file
-
- This is the file necessary to boot PReP systems from floppy or
- hard drive. The firmware reads the PReP partition table entry
- and will load the image accordingly.
-
- To boot the zImage, copy it onto a floppy with dd if=zImage of=/dev/fd0h1440
- or onto a PReP hard drive partition with dd if=zImage of=/dev/sda4
- assuming you've created a PReP partition (type 0x41) with fdisk on
- /dev/sda4.
-
- The layout of the image format is:
-
- 0x0 +------------+
- | | PReP partition table entry
- | |
- 0x400 +------------+
- | | Bootstrap program code + data
- | |
- | |
- +------------+
- | | compressed kernel, elf header removed
- +------------+
- | | initrd (if loaded)
- +------------+
- | | Elf section table for bootstrap program
- +------------+
-
-
-2. MBX boot file
-
- The MBX boards can load an elf image, and relocate it to the
- proper location in memory - it copies the image to the location it was
- linked at.