aboutsummaryrefslogtreecommitdiff
path: root/Documentation/powerpc
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@ppc970.osdl.org>2005-04-16 15:20:36 -0700
committerLinus Torvalds <torvalds@ppc970.osdl.org>2005-04-16 15:20:36 -0700
commit1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (patch)
tree0bba044c4ce775e45a88a51686b5d9f90697ea9d /Documentation/powerpc
Linux-2.6.12-rc2v2.6.12-rc2
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
Diffstat (limited to 'Documentation/powerpc')
-rw-r--r--Documentation/powerpc/00-INDEX20
-rw-r--r--Documentation/powerpc/SBC8260_memory_mapping.txt197
-rw-r--r--Documentation/powerpc/cpu_features.txt56
-rw-r--r--Documentation/powerpc/eeh-pci-error-recovery.txt332
-rw-r--r--Documentation/powerpc/hvcs.txt567
-rw-r--r--Documentation/powerpc/mpc52xx.txt39
-rw-r--r--Documentation/powerpc/ppc_htab.txt118
-rw-r--r--Documentation/powerpc/smp.txt34
-rw-r--r--Documentation/powerpc/sound.txt81
-rw-r--r--Documentation/powerpc/zImage_layout.txt47
10 files changed, 1491 insertions, 0 deletions
diff --git a/Documentation/powerpc/00-INDEX b/Documentation/powerpc/00-INDEX
new file mode 100644
index 00000000000..e7bea0a407b
--- /dev/null
+++ b/Documentation/powerpc/00-INDEX
@@ -0,0 +1,20 @@
+Index of files in Documentation/powerpc. If you think something about
+Linux/PPC needs an entry here, needs correction or you've written one
+please mail me.
+ Cort Dougan (cort@fsmlabs.com)
+
+00-INDEX
+ - this file
+cpu_features.txt
+ - info on how we support a variety of CPUs with minimal compile-time
+ options.
+ppc_htab.txt
+ - info about the Linux/PPC /proc/ppc_htab entry
+smp.txt
+ - use and state info about Linux/PPC on MP machines
+SBC8260_memory_mapping.txt
+ - EST SBC8260 board info
+sound.txt
+ - info on sound support under Linux/PPC
+zImage_layout.txt
+ - info on the kernel images for Linux/PPC
diff --git a/Documentation/powerpc/SBC8260_memory_mapping.txt b/Documentation/powerpc/SBC8260_memory_mapping.txt
new file mode 100644
index 00000000000..e6e9ee0506c
--- /dev/null
+++ b/Documentation/powerpc/SBC8260_memory_mapping.txt
@@ -0,0 +1,197 @@
+Please mail me (Jon Diekema, diekema_jon@si.com or diekema@cideas.com)
+if you have questions, comments or corrections.
+
+ * EST SBC8260 Linux memory mapping rules
+
+ http://www.estc.com/
+ http://www.estc.com/products/boards/SBC8260-8240_ds.html
+
+ Initial conditions:
+ -------------------
+
+ Tasks that need to be perform by the boot ROM before control is
+ transferred to zImage (compressed Linux kernel):
+
+ - Define the IMMR to 0xf0000000
+
+ - Initialize the memory controller so that RAM is available at
+ physical address 0x00000000. On the SBC8260 is this 16M (64M)
+ SDRAM.
+
+ - The boot ROM should only clear the RAM that it is using.
+
+ The reason for doing this is to enhances the chances of a
+ successful post mortem on a Linux panic. One of the first
+ items to examine is the 16k (LOG_BUF_LEN) circular console
+ buffer called log_buf which is defined in kernel/printk.c.
+
+ - To enhance boot ROM performance, the I-cache can be enabled.
+
+ Date: Mon, 22 May 2000 14:21:10 -0700
+ From: Neil Russell <caret@c-side.com>
+
+ LiMon (LInux MONitor) runs with and starts Linux with MMU
+ off, I-cache enabled, D-cache disabled. The I-cache doesn't
+ need hints from the MMU to work correctly as the D-cache
+ does. No D-cache means no special code to handle devices in
+ the presence of cache (no snooping, etc). The use of the
+ I-cache means that the monitor can run acceptably fast
+ directly from ROM, rather than having to copy it to RAM.
+
+ - Build the board information structure (see
+ include/asm-ppc/est8260.h for its definition)
+
+ - The compressed Linux kernel (zImage) contains a bootstrap loader
+ that is position independent; you can load it into any RAM,
+ ROM or FLASH memory address >= 0x00500000 (above 5 MB), or
+ at its link address of 0x00400000 (4 MB).
+
+ Note: If zImage is loaded at its link address of 0x00400000 (4 MB),
+ then zImage will skip the step of moving itself to
+ its link address.
+
+ - Load R3 with the address of the board information structure
+
+ - Transfer control to zImage
+
+ - The Linux console port is SMC1, and the baud rate is controlled
+ from the bi_baudrate field of the board information structure.
+ On thing to keep in mind when picking the baud rate, is that
+ there is no flow control on the SMC ports. I would stick
+ with something safe and standard like 19200.
+
+ On the EST SBC8260, the SMC1 port is on the COM1 connector of
+ the board.
+
+
+ EST SBC8260 defaults:
+ ---------------------
+
+ Chip
+ Memory Sel Bus Use
+ --------------------- --- --- ----------------------------------
+ 0x00000000-0x03FFFFFF CS2 60x (16M or 64M)/64M SDRAM
+ 0x04000000-0x04FFFFFF CS4 local 4M/16M SDRAM (soldered to the board)
+ 0x21000000-0x21000000 CS7 60x 1B/64K Flash present detect (from the flash SIMM)
+ 0x21000001-0x21000001 CS7 60x 1B/64K Switches (read) and LEDs (write)
+ 0x22000000-0x2200FFFF CS5 60x 8K/64K EEPROM
+ 0xFC000000-0xFCFFFFFF CS6 60x 2M/16M flash (8 bits wide, soldered to the board)
+ 0xFE000000-0xFFFFFFFF CS0 60x 4M/16M flash (SIMM)
+
+ Notes:
+ ------
+
+ - The chip selects can map 32K blocks and up (powers of 2)
+
+ - The SDRAM machine can handled up to 128Mbytes per chip select
+
+ - Linux uses the 60x bus memory (the SDRAM DIMM) for the
+ communications buffers.
+
+ - BATs can map 128K-256Mbytes each. There are four data BATs and
+ four instruction BATs. Generally the data and instruction BATs
+ are mapped the same.
+
+ - The IMMR must be set above the kernel virtual memory addresses,
+ which start at 0xC0000000. Otherwise, the kernel may crash as
+ soon as you start any threads or processes due to VM collisions
+ in the kernel or user process space.
+
+
+ Details from Dan Malek <dan_malek@mvista.com> on 10/29/1999:
+
+ The user application virtual space consumes the first 2 Gbytes
+ (0x00000000 to 0x7FFFFFFF). The kernel virtual text starts at
+ 0xC0000000, with data following. There is a "protection hole"
+ between the end of kernel data and the start of the kernel
+ dynamically allocated space, but this space is still within
+ 0xCxxxxxxx.
+
+ Obviously the kernel can't map any physical addresses 1:1 in
+ these ranges.
+
+
+ Details from Dan Malek <dan_malek@mvista.com> on 5/19/2000:
+
+ During the early kernel initialization, the kernel virtual
+ memory allocator is not operational. Prior to this KVM
+ initialization, we choose to map virtual to physical addresses
+ 1:1. That is, the kernel virtual address exactly matches the
+ physical address on the bus. These mappings are typically done
+ in arch/ppc/kernel/head.S, or arch/ppc/mm/init.c. Only
+ absolutely necessary mappings should be done at this time, for
+ example board control registers or a serial uart. Normal device
+ driver initialization should map resources later when necessary.
+
+ Although platform dependent, and certainly the case for embedded
+ 8xx, traditionally memory is mapped at physical address zero,
+ and I/O devices above physical address 0x80000000. The lowest
+ and highest (above 0xf0000000) I/O addresses are traditionally
+ used for devices or registers we need to map during kernel
+ initialization and prior to KVM operation. For this reason,
+ and since it followed prior PowerPC platform examples, I chose
+ to map the embedded 8xx kernel to the 0xc0000000 virtual address.
+ This way, we can enable the MMU to map the kernel for proper
+ operation, and still map a few windows before the KVM is operational.
+
+ On some systems, you could possibly run the kernel at the
+ 0x80000000 or any other virtual address. It just depends upon
+ mapping that must be done prior to KVM operational. You can never
+ map devices or kernel spaces that overlap with the user virtual
+ space. This is why default IMMR mapping used by most BDM tools
+ won't work. They put the IMMR at something like 0x10000000 or
+ 0x02000000 for example. You simply can't map these addresses early
+ in the kernel, and continue proper system operation.
+
+ The embedded 8xx/82xx kernel is mature enough that all you should
+ need to do is map the IMMR someplace at or above 0xf0000000 and it
+ should boot far enough to get serial console messages and KGDB
+ connected on any platform. There are lots of other subtle memory
+ management design features that you simply don't need to worry
+ about. If you are changing functions related to MMU initialization,
+ you are likely breaking things that are known to work and are
+ heading down a path of disaster and frustration. Your changes
+ should be to make the flexibility of the processor fit Linux,
+ not force arbitrary and non-workable memory mappings into Linux.
+
+ - You don't want to change KERNELLOAD or KERNELBASE, otherwise the
+ virtual memory and MMU code will get confused.
+
+ arch/ppc/Makefile:KERNELLOAD = 0xc0000000
+
+ include/asm-ppc/page.h:#define PAGE_OFFSET 0xc0000000
+ include/asm-ppc/page.h:#define KERNELBASE PAGE_OFFSET
+
+ - RAM is at physical address 0x00000000, and gets mapped to
+ virtual address 0xC0000000 for the kernel.
+
+
+ Physical addresses used by the Linux kernel:
+ --------------------------------------------
+
+ 0x00000000-0x3FFFFFFF 1GB reserved for RAM
+ 0xF0000000-0xF001FFFF 128K IMMR 64K used for dual port memory,
+ 64K for 8260 registers
+
+
+ Logical addresses used by the Linux kernel:
+ -------------------------------------------
+
+ 0xF0000000-0xFFFFFFFF 256M BAT0 (IMMR: dual port RAM, registers)
+ 0xE0000000-0xEFFFFFFF 256M BAT1 (I/O space for custom boards)
+ 0xC0000000-0xCFFFFFFF 256M BAT2 (RAM)
+ 0xD0000000-0xDFFFFFFF 256M BAT3 (if RAM > 256MByte)
+
+
+ EST SBC8260 Linux mapping:
+ --------------------------
+
+ DBAT0, IBAT0, cache inhibited:
+
+ Chip
+ Memory Sel Use
+ --------------------- --- ---------------------------------
+ 0xF0000000-0xF001FFFF n/a IMMR: dual port RAM, registers
+
+ DBAT1, IBAT1, cache inhibited:
+
diff --git a/Documentation/powerpc/cpu_features.txt b/Documentation/powerpc/cpu_features.txt
new file mode 100644
index 00000000000..472739880e8
--- /dev/null
+++ b/Documentation/powerpc/cpu_features.txt
@@ -0,0 +1,56 @@
+Hollis Blanchard <hollis@austin.ibm.com>
+5 Jun 2002
+
+This document describes the system (including self-modifying code) used in the
+PPC Linux kernel to support a variety of PowerPC CPUs without requiring
+compile-time selection.
+
+Early in the boot process the ppc32 kernel detects the current CPU type and
+chooses a set of features accordingly. Some examples include Altivec support,
+split instruction and data caches, and if the CPU supports the DOZE and NAP
+sleep modes.
+
+Detection of the feature set is simple. A list of processors can be found in
+arch/ppc/kernel/cputable.c. The PVR register is masked and compared with each
+value in the list. If a match is found, the cpu_features of cur_cpu_spec is
+assigned to the feature bitmask for this processor and a __setup_cpu function
+is called.
+
+C code may test 'cur_cpu_spec[smp_processor_id()]->cpu_features' for a
+particular feature bit. This is done in quite a few places, for example
+in ppc_setup_l2cr().
+
+Implementing cpufeatures in assembly is a little more involved. There are
+several paths that are performance-critical and would suffer if an array
+index, structure dereference, and conditional branch were added. To avoid the
+performance penalty but still allow for runtime (rather than compile-time) CPU
+selection, unused code is replaced by 'nop' instructions. This nop'ing is
+based on CPU 0's capabilities, so a multi-processor system with non-identical
+processors will not work (but such a system would likely have other problems
+anyways).
+
+After detecting the processor type, the kernel patches out sections of code
+that shouldn't be used by writing nop's over it. Using cpufeatures requires
+just 2 macros (found in include/asm-ppc/cputable.h), as seen in head.S
+transfer_to_handler:
+
+ #ifdef CONFIG_ALTIVEC
+ BEGIN_FTR_SECTION
+ mfspr r22,SPRN_VRSAVE /* if G4, save vrsave register value */
+ stw r22,THREAD_VRSAVE(r23)
+ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
+ #endif /* CONFIG_ALTIVEC */
+
+If CPU 0 supports Altivec, the code is left untouched. If it doesn't, both
+instructions are replaced with nop's.
+
+The END_FTR_SECTION macro has two simpler variations: END_FTR_SECTION_IFSET
+and END_FTR_SECTION_IFCLR. These simply test if a flag is set (in
+cur_cpu_spec[0]->cpu_features) or is cleared, respectively. These two macros
+should be used in the majority of cases.
+
+The END_FTR_SECTION macros are implemented by storing information about this
+code in the '__ftr_fixup' ELF section. When do_cpu_ftr_fixups
+(arch/ppc/kernel/misc.S) is invoked, it will iterate over the records in
+__ftr_fixup, and if the required feature is not present it will loop writing
+nop's from each BEGIN_FTR_SECTION to END_FTR_SECTION.
diff --git a/Documentation/powerpc/eeh-pci-error-recovery.txt b/Documentation/powerpc/eeh-pci-error-recovery.txt
new file mode 100644
index 00000000000..2bfe71beec5
--- /dev/null
+++ b/Documentation/powerpc/eeh-pci-error-recovery.txt
@@ -0,0 +1,332 @@
+
+
+ PCI Bus EEH Error Recovery
+ --------------------------
+ Linas Vepstas
+ <linas@austin.ibm.com>
+ 12 January 2005
+
+
+Overview:
+---------
+The IBM POWER-based pSeries and iSeries computers include PCI bus
+controller chips that have extended capabilities for detecting and
+reporting a large variety of PCI bus error conditions. These features
+go under the name of "EEH", for "Extended Error Handling". The EEH
+hardware features allow PCI bus errors to be cleared and a PCI
+card to be "rebooted", without also having to reboot the operating
+system.
+
+This is in contrast to traditional PCI error handling, where the
+PCI chip is wired directly to the CPU, and an error would cause
+a CPU machine-check/check-stop condition, halting the CPU entirely.
+Another "traditional" technique is to ignore such errors, which
+can lead to data corruption, both of user data or of kernel data,
+hung/unresponsive adapters, or system crashes/lockups. Thus,
+the idea behind EEH is that the operating system can become more
+reliable and robust by protecting it from PCI errors, and giving
+the OS the ability to "reboot"/recover individual PCI devices.
+
+Future systems from other vendors, based on the PCI-E specification,
+may contain similar features.
+
+
+Causes of EEH Errors
+--------------------
+EEH was originally designed to guard against hardware failure, such
+as PCI cards dying from heat, humidity, dust, vibration and bad
+electrical connections. The vast majority of EEH errors seen in
+"real life" are due to eithr poorly seated PCI cards, or,
+unfortunately quite commonly, due device driver bugs, device firmware
+bugs, and sometimes PCI card hardware bugs.
+
+The most common software bug, is one that causes the device to
+attempt to DMA to a location in system memory that has not been
+reserved for DMA access for that card. This is a powerful feature,
+as it prevents what; otherwise, would have been silent memory
+corruption caused by the bad DMA. A number of device driver
+bugs have been found and fixed in this way over the past few
+years. Other possible causes of EEH errors include data or
+address line parity errors (for example, due to poor electrical
+connectivity due to a poorly seated card), and PCI-X split-completion
+errors (due to software, device firmware, or device PCI hardware bugs).
+The vast majority of "true hardware failures" can be cured by
+physically removing and re-seating the PCI card.
+
+
+Detection and Recovery
+----------------------
+In the following discussion, a generic overview of how to detect
+and recover from EEH errors will be presented. This is followed
+by an overview of how the current implementation in the Linux
+kernel does it. The actual implementation is subject to change,
+and some of the finer points are still being debated. These
+may in turn be swayed if or when other architectures implement
+similar functionality.
+
+When a PCI Host Bridge (PHB, the bus controller connecting the
+PCI bus to the system CPU electronics complex) detects a PCI error
+condition, it will "isolate" the affected PCI card. Isolation
+will block all writes (either to the card from the system, or
+from the card to the system), and it will cause all reads to
+return all-ff's (0xff, 0xffff, 0xffffffff for 8/16/32-bit reads).
+This value was chosen because it is the same value you would
+get if the device was physically unplugged from the slot.
+This includes access to PCI memory, I/O space, and PCI config
+space. Interrupts; however, will continued to be delivered.
+
+Detection and recovery are performed with the aid of ppc64
+firmware. The programming interfaces in the Linux kernel
+into the firmware are referred to as RTAS (Run-Time Abstraction
+Services). The Linux kernel does not (should not) access
+the EEH function in the PCI chipsets directly, primarily because
+there are a number of different chipsets out there, each with
+different interfaces and quirks. The firmware provides a
+uniform abstraction layer that will work with all pSeries
+and iSeries hardware (and be forwards-compatible).
+
+If the OS or device driver suspects that a PCI slot has been
+EEH-isolated, there is a firmware call it can make to determine if
+this is the case. If so, then the device driver should put itself
+into a consistent state (given that it won't be able to complete any
+pending work) and start recovery of the card. Recovery normally
+would consist of reseting the PCI device (holding the PCI #RST
+line high for two seconds), followed by setting up the device
+config space (the base address registers (BAR's), latency timer,
+cache line size, interrupt line, and so on). This is followed by a
+reinitialization of the device driver. In a worst-case scenario,
+the power to the card can be toggled, at least on hot-plug-capable
+slots. In principle, layers far above the device driver probably
+do not need to know that the PCI card has been "rebooted" in this
+way; ideally, there should be at most a pause in Ethernet/disk/USB
+I/O while the card is being reset.
+
+If the card cannot be recovered after three or four resets, the
+kernel/device driver should assume the worst-case scenario, that the
+card has died completely, and report this error to the sysadmin.
+In addition, error messages are reported through RTAS and also through
+syslogd (/var/log/messages) to alert the sysadmin of PCI resets.
+The correct way to deal with failed adapters is to use the standard
+PCI hotplug tools to remove and replace the dead card.
+
+
+Current PPC64 Linux EEH Implementation
+--------------------------------------
+At this time, a generic EEH recovery mechanism has been implemented,
+so that individual device drivers do not need to be modified to support
+EEH recovery. This generic mechanism piggy-backs on the PCI hotplug
+infrastructure, and percolates events up through the hotplug/udev
+infrastructure. Followiing is a detailed description of how this is
+accomplished.
+
+EEH must be enabled in the PHB's very early during the boot process,
+and if a PCI slot is hot-plugged. The former is performed by
+eeh_init() in arch/ppc64/kernel/eeh.c, and the later by
+drivers/pci/hotplug/pSeries_pci.c calling in to the eeh.c code.
+EEH must be enabled before a PCI scan of the device can proceed.
+Current Power5 hardware will not work unless EEH is enabled;
+although older Power4 can run with it disabled. Effectively,
+EEH can no longer be turned off. PCI devices *must* be
+registered with the EEH code; the EEH code needs to know about
+the I/O address ranges of the PCI device in order to detect an
+error. Given an arbitrary address, the routine
+pci_get_device_by_addr() will find the pci device associated
+with that address (if any).
+
+The default include/asm-ppc64/io.h macros readb(), inb(), insb(),
+etc. include a check to see if the the i/o read returned all-0xff's.
+If so, these make a call to eeh_dn_check_failure(), which in turn
+asks the firmware if the all-ff's value is the sign of a true EEH
+error. If it is not, processing continues as normal. The grand
+total number of these false alarms or "false positives" can be
+seen in /proc/ppc64/eeh (subject to change). Normally, almost
+all of these occur during boot, when the PCI bus is scanned, where
+a large number of 0xff reads are part of the bus scan procedure.
+
+If a frozen slot is detected, code in arch/ppc64/kernel/eeh.c will
+print a stack trace to syslog (/var/log/messages). This stack trace
+has proven to be very useful to device-driver authors for finding
+out at what point the EEH error was detected, as the error itself
+usually occurs slightly beforehand.
+
+Next, it uses the Linux kernel notifier chain/work queue mechanism to
+allow any interested parties to find out about the failure. Device
+drivers, or other parts of the kernel, can use
+eeh_register_notifier(struct notifier_block *) to find out about EEH
+events. The event will include a pointer to the pci device, the
+device node and some state info. Receivers of the event can "do as
+they wish"; the default handler will be described further in this
+section.
+
+To assist in the recovery of the device, eeh.c exports the
+following functions:
+
+rtas_set_slot_reset() -- assert the PCI #RST line for 1/8th of a second
+rtas_configure_bridge() -- ask firmware to configure any PCI bridges
+ located topologically under the pci slot.
+eeh_save_bars() and eeh_restore_bars(): save and restore the PCI
+ config-space info for a device and any devices under it.
+
+
+A handler for the EEH notifier_block events is implemented in
+drivers/pci/hotplug/pSeries_pci.c, called handle_eeh_events().
+It saves the device BAR's and then calls rpaphp_unconfig_pci_adapter().
+This last call causes the device driver for the card to be stopped,
+which causes hotplug events to go out to user space. This triggers
+user-space scripts that might issue commands such as "ifdown eth0"
+for ethernet cards, and so on. This handler then sleeps for 5 seconds,
+hoping to give the user-space scripts enough time to complete.
+It then resets the PCI card, reconfigures the device BAR's, and
+any bridges underneath. It then calls rpaphp_enable_pci_slot(),
+which restarts the device driver and triggers more user-space
+events (for example, calling "ifup eth0" for ethernet cards).
+
+
+Device Shutdown and User-Space Events
+-------------------------------------
+This section documents what happens when a pci slot is unconfigured,
+focusing on how the device driver gets shut down, and on how the
+events get delivered to user-space scripts.
+
+Following is an example sequence of events that cause a device driver
+close function to be called during the first phase of an EEH reset.
+The following sequence is an example of the pcnet32 device driver.
+
+ rpa_php_unconfig_pci_adapter (struct slot *) // in rpaphp_pci.c
+ {
+ calls
+ pci_remove_bus_device (struct pci_dev *) // in /drivers/pci/remove.c
+ {
+ calls
+ pci_destroy_dev (struct pci_dev *)
+ {
+ calls
+ device_unregister (&dev->dev) // in /drivers/base/core.c
+ {
+ calls
+ device_del (struct device *)
+ {
+ calls
+ bus_remove_device() // in /drivers/base/bus.c
+ {
+ calls
+ device_release_driver()
+ {
+ calls
+ struct device_driver->remove() which is just
+ pci_device_remove() // in /drivers/pci/pci_driver.c
+ {
+ calls
+ struct pci_driver->remove() which is just
+ pcnet32_remove_one() // in /drivers/net/pcnet32.c
+ {
+ calls
+ unregister_netdev() // in /net/core/dev.c
+ {
+ calls
+ dev_close() // in /net/core/dev.c
+ {
+ calls dev->stop();
+ which is just pcnet32_close() // in pcnet32.c
+ {
+ which does what you wanted
+ to stop the device
+ }
+ }
+ }
+ which
+ frees pcnet32 device driver memory
+ }
+ }}}}}}
+
+
+ in drivers/pci/pci_driver.c,
+ struct device_driver->remove() is just pci_device_remove()
+ which calls struct pci_driver->remove() which is pcnet32_remove_one()
+ which calls unregister_netdev() (in net/core/dev.c)
+ which calls dev_close() (in net/core/dev.c)
+ which calls dev->stop() which is pcnet32_close()
+ which then does the appropriate shutdown.
+
+---
+Following is the analogous stack trace for events sent to user-space
+when the pci device is unconfigured.
+
+rpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c
+ calls
+ pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
+ calls
+ pci_destroy_dev (struct pci_dev *) {
+ calls
+ device_unregister (&dev->dev) { // in /drivers/base/core.c
+ calls
+ device_del(struct device * dev) { // in /drivers/base/core.c
+ calls
+ kobject_del() { //in /libs/kobject.c
+ calls
+ kobject_hotplug() { // in /libs/kobject.c
+ calls
+ kset_hotplug() { // in /lib/kobject.c
+ calls
+ kset->hotplug_ops->hotplug() which is really just
+ a call to
+ dev_hotplug() { // in /drivers/base/core.c
+ calls
+ dev->bus->hotplug() which is really just a call to
+ pci_hotplug () { // in drivers/pci/hotplug.c
+ which prints device name, etc....
+ }
+ }
+ then kset_hotplug() calls
+ call_usermodehelper () with
+ argv[0]=hotplug_path[] which is "/sbin/hotplug"
+ --> event to userspace,
+ }
+ }
+ kobject_del() then calls sysfs_remove_dir(), which would
+ trigger any user-space daemon that was watching /sysfs,
+ and notice the delete event.
+
+
+Pro's and Con's of the Current Design
+-------------------------------------
+There are several issues with the current EEH software recovery design,
+which may be addressed in future revisions. But first, note that the
+big plus of the current design is that no changes need to be made to
+individual device drivers, so that the current design throws a wide net.
+The biggest negative of the design is that it potentially disturbs
+network daemons and file systems that didn't need to be disturbed.
+
+-- A minor complaint is that resetting the network card causes
+ user-space back-to-back ifdown/ifup burps that potentially disturb
+ network daemons, that didn't need to even know that the pci
+ card was being rebooted.
+
+-- A more serious concern is that the same reset, for SCSI devices,
+ causes havoc to mounted file systems. Scripts cannot post-facto
+ unmount a file system without flushing pending buffers, but this
+ is impossible, because I/O has already been stopped. Thus,
+ ideally, the reset should happen at or below the block layer,
+ so that the file systems are not disturbed.
+
+ Reiserfs does not tolerate errors returned from the block device.
+ Ext3fs seems to be tolerant, retrying reads/writes until it does
+ succeed. Both have been only lightly tested in this scenario.
+
+ The SCSI-generic subsystem already has built-in code for performing
+ SCSI device resets, SCSI bus resets, and SCSI host-bus-adapter
+ (HBA) resets. These are cascaded into a chain of attempted
+ resets if a SCSI command fails. These are completely hidden
+ from the block layer. It would be very natural to add an EEH
+ reset into this chain of events.
+
+-- If a SCSI error occurs for the root device, all is lost unless
+ the sysadmin had the foresight to run /bin, /sbin, /etc, /var
+ and so on, out of ramdisk/tmpfs.
+
+
+Conclusions
+-----------
+There's forward progress ...
+
+
diff --git a/Documentation/powerpc/hvcs.txt b/Documentation/powerpc/hvcs.txt
new file mode 100644
index 00000000000..c0a62e116e6
--- /dev/null
+++ b/Documentation/powerpc/hvcs.txt
@@ -0,0 +1,567 @@
+===========================================================================
+ HVCS
+ IBM "Hypervisor Virtual Console Server" Installation Guide
+ for Linux Kernel 2.6.4+
+ Copyright (C) 2004 IBM Corporation
+
+===========================================================================
+NOTE:Eight space tabs are the optimum editor setting for reading this file.
+===========================================================================
+
+ Author(s) : Ryan S. Arnold <rsa@us.ibm.com>
+ Date Created: March, 02, 2004
+ Last Changed: August, 24, 2004
+
+---------------------------------------------------------------------------
+Table of contents:
+
+ 1. Driver Introduction:
+ 2. System Requirements
+ 3. Build Options:
+ 3.1 Built-in:
+ 3.2 Module:
+ 4. Installation:
+ 5. Connection:
+ 6. Disconnection:
+ 7. Configuration:
+ 8. Questions & Answers:
+ 9. Reporting Bugs:
+
+---------------------------------------------------------------------------
+1. Driver Introduction:
+
+This is the device driver for the IBM Hypervisor Virtual Console Server,
+"hvcs". The IBM hvcs provides a tty driver interface to allow Linux user
+space applications access to the system consoles of logically partitioned
+operating systems (Linux and AIX) running on the same partitioned Power5
+ppc64 system. Physical hardware consoles per partition are not practical
+on this hardware so system consoles are accessed by this driver using
+firmware interfaces to virtual terminal devices.
+
+---------------------------------------------------------------------------
+2. System Requirements:
+
+This device driver was written using 2.6.4 Linux kernel APIs and will only
+build and run on kernels of this version or later.
+
+This driver was written to operate solely on IBM Power5 ppc64 hardware
+though some care was taken to abstract the architecture dependent firmware
+calls from the driver code.
+
+Sysfs must be mounted on the system so that the user can determine which
+major and minor numbers are associated with each vty-server. Directions
+for sysfs mounting are outside the scope of this document.
+
+---------------------------------------------------------------------------
+3. Build Options:
+
+The hvcs driver registers itself as a tty driver. The tty layer
+dynamically allocates a block of major and minor numbers in a quantity
+requested by the registering driver. The hvcs driver asks the tty layer
+for 64 of these major/minor numbers by default to use for hvcs device node
+entries.
+
+If the default number of device entries is adequate then this driver can be
+built into the kernel. If not, the default can be over-ridden by inserting
+the driver as a module with insmod parameters.
+
+---------------------------------------------------------------------------
+3.1 Built-in:
+
+The following menuconfig example demonstrates selecting to build this
+driver into the kernel.
+
+ Device Drivers --->
+ Character devices --->
+ <*> IBM Hypervisor Virtual Console Server Support
+
+Begin the kernel make process.
+
+---------------------------------------------------------------------------
+3.2 Module:
+
+The following menuconfig example demonstrates selecting to build this
+driver as a kernel module.
+
+ Device Drivers --->
+ Character devices --->
+ <M> IBM Hypervisor Virtual Console Server Support
+
+The make process will build the following kernel modules:
+
+ hvcs.ko
+ hvcserver.ko
+
+To insert the module with the default allocation execute the following
+commands in the order they appear:
+
+ insmod hvcserver.ko
+ insmod hvcs.ko
+
+The hvcserver module contains architecture specific firmware calls and must
+be inserted first, otherwise the hvcs module will not find some of the
+symbols it expects.
+
+To override the default use an insmod parameter as follows (requesting 4
+tty devices as an example):
+
+ insmod hvcs.ko hvcs_parm_num_devs=4
+
+There is a maximum number of dev entries that can be specified on insmod.
+We think that 1024 is currently a decent maximum number of server adapters
+to allow. This can always be changed by modifying the constant in the
+source file before building.
+
+NOTE: The length of time it takes to insmod the driver seems to be related
+to the number of tty interfaces the registering driver requests.
+
+In order to remove the driver module execute the following command:
+
+ rmmod hvcs.ko
+
+The recommended method for installing hvcs as a module is to use depmod to
+build a current modules.dep file in /lib/modules/`uname -r` and then
+execute:
+
+modprobe hvcs hvcs_parm_num_devs=4
+
+The modules.dep file indicates that hvcserver.ko needs to be inserted
+before hvcs.ko and modprobe uses this file to smartly insert the modules in
+the proper order.
+
+The following modprobe command is used to remove hvcs and hvcserver in the
+proper order:
+
+modprobe -r hvcs
+
+---------------------------------------------------------------------------
+4. Installation:
+
+The tty layer creates sysfs entries which contain the major and minor
+numbers allocated for the hvcs driver. The following snippet of "tree"
+output of the sysfs directory shows where these numbers are presented:
+
+ sys/
+ |-- *other sysfs base dirs*
+ |
+ |-- class
+ | |-- *other classes of devices*
+ | |
+ | `-- tty
+ | |-- *other tty devices*
+ | |
+ | |-- hvcs0
+ | | `-- dev
+ | |-- hvcs1
+ | | `-- dev
+ | |-- hvcs2
+ | | `-- dev
+ | |-- hvcs3
+ | | `-- dev
+ | |
+ | |-- *other tty devices*
+ |
+ |-- *other sysfs base dirs*
+
+For the above examples the following output is a result of cat'ing the
+"dev" entry in the hvcs directory:
+
+ Pow5:/sys/class/tty/hvcs0/ # cat dev
+ 254:0
+
+ Pow5:/sys/class/tty/hvcs1/ # cat dev
+ 254:1
+
+ Pow5:/sys/class/tty/hvcs2/ # cat dev
+ 254:2
+
+ Pow5:/sys/class/tty/hvcs3/ # cat dev
+ 254:3
+
+The output from reading the "dev" attribute is the char device major and
+minor numbers that the tty layer has allocated for this driver's use. Most
+systems running hvcs will already have the device entries created or udev
+will do it automatically.
+
+Given the example output above, to manually create a /dev/hvcs* node entry
+mknod can be used as follows:
+
+ mknod /dev/hvcs0 c 254 0
+ mknod /dev/hvcs1 c 254 1
+ mknod /dev/hvcs2 c 254 2
+ mknod /dev/hvcs3 c 254 3
+
+Using mknod to manually create the device entries makes these device nodes
+persistent. Once created they will exist prior to the driver insmod.
+
+Attempting to connect an application to /dev/hvcs* prior to insertion of
+the hvcs module will result in an error message similar to the following:
+
+ "/dev/hvcs*: No such device".
+
+NOTE: Just because there is a device node present doesn't mean that there
+is a vty-server device configured for that node.
+
+---------------------------------------------------------------------------
+5. Connection
+
+Since this driver controls devices that provide a tty interface a user can
+interact with the device node entries using any standard tty-interactive
+method (e.g. "cat", "dd", "echo"). The intent of this driver however, is
+to provide real time console interaction with a Linux partition's console,
+which requires the use of applications that provide bi-directional,
+interactive I/O with a tty device.
+
+Applications (e.g. "minicom" and "screen") that act as terminal emulators
+or perform terminal type control