aboutsummaryrefslogtreecommitdiff
path: root/drivers/iommu/amd_iommu.c
AgeCommit message (Collapse)Author
2013-07-25iommu/amd: Only unmap large pages from the first pteAlex Williamson
commit 60d0ca3cfd199b6612bbbbf4999a3470dad38bb1 upstream. If we use a large mapping, the expectation is that only unmaps from the first pte in the superpage are supported. Unmaps from offsets into the superpage should fail (ie. return zero sized unmap). In the current code, unmapping from an offset clears the size of the full mapping starting from an offset. For instance, if we map a 16k physically contiguous range at IOVA 0x0 with a large page, then attempt to unmap 4k at offset 12k, 4 ptes are cleared (12k - 28k) and the unmap returns 16k unmapped. This potentially incorrectly clears valid mappings and confuses drivers like VFIO that use the unmap size to release pinned pages. Fix by refusing to unmap from offsets into the page. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-06Merge tag 'iommu-updates-v3.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "The updates are mostly about the x86 IOMMUs this time. Exceptions are the groundwork for the PAMU IOMMU from Freescale (for a PPC platform) and an extension to the IOMMU group interface. On the x86 side this includes a workaround for VT-d to disable interrupt remapping on broken chipsets. On the AMD-Vi side the most important new feature is a kernel command-line interface to override broken information in IVRS ACPI tables and get interrupt remapping working this way. Besides that there are small fixes all over the place." * tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (24 commits) iommu/tegra: Fix printk formats for dma_addr_t iommu: Add a function to find an iommu group by id iommu/vt-d: Remove warning for HPET scope type iommu: Move swap_pci_ref function to drivers/iommu/pci.h. iommu/vt-d: Disable translation if already enabled iommu/amd: fix error return code in early_amd_iommu_init() iommu/AMD: Per-thread IOMMU Interrupt Handling iommu: Include linux/err.h iommu/amd: Workaround for ERBT1312 iommu/amd: Document ivrs_ioapic and ivrs_hpet parameters iommu/amd: Don't report firmware bugs with cmd-line ivrs overrides iommu/amd: Add ioapic and hpet ivrs override iommu/amd: Add early maps for ioapic and hpet iommu/amd: Extend IVRS special device data structure iommu/amd: Move add_special_device() to __init iommu: Fix compile warnings with forward declarations iommu/amd: Properly initialize irq-table lock iommu/amd: Use AMD specific data structure for irq remapping iommu/amd: Remove map_sg_no_iommu() iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets ...
2013-05-02Merge branches 'iommu/fixes', 'x86/vt-d', 'x86/amd', 'ppc/pamu', 'core' and ↵Joerg Roedel
'arm/tegra' into next
2013-04-29Merge tag 'pci-v3.10-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v3.10 merge window: PCI device hotplug - Remove ACPI PCI subdrivers (Jiang Liu, Myron Stowe) - Make acpiphp builtin only, not modular (Jiang Liu) - Add acpiphp mutual exclusion (Jiang Liu) Power management - Skip "PME enabled/disabled" messages when not supported (Rafael Wysocki) - Fix fallback to PCI_D0 (Rafael Wysocki) Miscellaneous - Factor quirk_io_region (Yinghai Lu) - Cache MSI capability offsets & cleanup (Gavin Shan, Bjorn Helgaas) - Clean up EISA resource initialization and logging (Bjorn Helgaas) - Fix prototype warnings (Andy Shevchenko, Bjorn Helgaas) - MIPS: Initialize of_node before scanning bus (Gabor Juhos) - Fix pcibios_get_phb_of_node() declaration "weak" annotation (Gabor Juhos) - Add MSI INTX_DISABLE quirks for AR8161/AR8162/etc (Xiong Huang) - Fix aer_inject return values (Prarit Bhargava) - Remove PME/ACPI dependency (Andrew Murray) - Use shared PCI_BUS_NUM() and PCI_DEVID() (Shuah Khan)" * tag 'pci-v3.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (63 commits) vfio-pci: Use cached MSI/MSI-X capabilities vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Remove "extern" from function declarations PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h PCI: Use msix_table_size() directly, drop multi_msix_capable() PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros PCI: Drop is_64bit_address() and is_mask_bit_support() macros PCI: Drop msi_data_reg() macro PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc PCI: Clean up MSI/MSI-X capability #defines PCI: Use cached MSI-X cap while enabling MSI-X PCI: Use cached MSI cap while enabling MSI interrupts PCI: Remove MSI/MSI-X cap check in pci_msi_check_device() PCI: Cache MSI/MSI-X capability offsets in struct pci_dev PCI: Use u8, not int, for PM capability offset [SCSI] megaraid_sas: Use correct #define for MSI-X capability PCI: Remove "extern" from function declarations ...
2013-04-23iommu: Move swap_pci_ref function to drivers/iommu/pci.h.Varun Sethi
The swap_pci_ref function is used by the IOMMU API code for swapping pci device pointers, while determining the iommu group for the device. Currently this function was being implemented for different IOMMU drivers. This patch moves the function to a new file, drivers/iommu/pci.h so that the implementation can be shared across various IOMMU drivers. Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-04-23iommu/AMD: Per-thread IOMMU Interrupt HandlingSuravee Suthikulpanit
In the current interrupt handling scheme, there are as many threads as the number of IOMMUs. Each thread is created and assigned to an IOMMU at the time of registering interrupt handlers (request_threaded_irq). When an IOMMU HW generates an interrupt, the irq handler (top half) wakes up the corresponding thread to process event and PPR logs of all IOMMUs starting from the 1st IOMMU. In the system with multiple IOMMU,this handling scheme complicates the synchronization of the IOMMU data structures and status registers as there could be multiple threads competing for the same IOMMU while the other IOMMU could be left unhandled. To simplify, this patch is proposing a different interrupt handling scheme by having each thread only managing interrupts of the corresponding IOMMU. This can be achieved by passing the struct amd_iommu when registering the interrupt handlers. This structure is unique for each IOMMU and can be used by the bottom half thread to identify the IOMMU to be handled instead of calling for_each_iommu. Besides this also eliminate the needs to lock the IOMMU for processing event and PPR logs. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-04-19iommu/amd: Workaround for ERBT1312Joerg Roedel
Work around an IOMMU hardware bug where clearing the EVT_INT or PPR_INT bit in the status register may race with the hardware trying to set it again. When not handled the bit might not be cleared and we lose all future event or ppr interrupts. Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-04-18iommu/amd: Properly initialize irq-table lockJoerg Roedel
Fixes a lockdep warning. Cc: stable@vger.kernel.org # >= v3.7 Reviewed-by: Shuah Khan <shuahkhan@gmail.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-04-18iommu/amd: Use AMD specific data structure for irq remappingJoerg Roedel
For compatibility reasons the irq remapping code for the AMD IOMMU used the same per-irq data structure as the Intel implementation. Now that support for the AMD specific data structure is upstream we can use this one instead. Reviewed-by: Shuah Khan <shuahkhan@gmail.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-04-18iommu/amd: Remove map_sg_no_iommu()Joerg Roedel
This function was intended as a fall-back if the map_sg function is called for a device not mapped by the IOMMU. Since the AMD IOMMU driver uses per-device dma_ops this can never happen. So this function isn't needed anymore. Reviewed-by: Shuah Khan <shuahkhan@gmail.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-04-02iommu/fsl: Make iova dma_addr_t in the iommu_iova_to_phys API.Varun Sethi
This is required in case of PAMU, as it can support a window size of up to 64G (even on 32bit). Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-04-02iommu/amd: Re-enable IOMMU event log interrupt after handling.Suravee Suthikulpanit
Current driver does not clear the IOMMU event log interrupt bit in the IOMMU status register after processing an interrupt. This causes the IOMMU hardware to generate event log interrupt only once. This has been observed in both IOMMU v1 and V2 hardware. This patch clears the bit by writing 1 to bit 1 of the IOMMU status register (MMIO Offset 2020h) Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-03-27iommu/amd: Make sure dma_ops are set for hotplug devicesJoerg Roedel
There is a bug introduced with commit 27c2127 that causes devices which are hot unplugged and then hot-replugged to not have per-device dma_ops set. This causes these devices to not function correctly. Fixed with this patch. Cc: stable@vger.kernel.org Reported-by: Andreas Degert <andreas.degert@googlemail.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-03-26iommu/amd: Remove calc_devid() and use PCI_DEVID() from PCIShuah Khan
Change to remove calc_devid() and use PCI_DEVID() from PCI instead. Signed-off-by: Shuah Khan <shuah.khan@hp.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Joerg Roedel <joro@8bytes.org>
2013-03-26iommu/amd: Remove local PCI_BUS() define and use PCI_BUS_NUM() from PCIShuah Khan
Change to remove local PCI_BUS() define and use the new PCI_BUS_NUM() interface from PCI. Signed-off-by: Shuah Khan <shuah.khan@hp.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Joerg Roedel <joro@8bytes.org>
2013-02-26Merge tag 'iommu-updates-v3.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU Updates from Joerg Roedel: "Besides some fixes and cleanups in the code there are three more important changes to point out this time: * New IOMMU driver for the ARM SHMOBILE platform * An IOMMU-API extension for non-paging IOMMUs (required for upcoming PAMU driver) * Rework of the way the Tegra IOMMU driver accesses its registetrs - register windows are easier to extend now. There are also a few changes to non-iommu code, but that is acked by the respective maintainers." * tag 'iommu-updates-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (23 commits) iommu/tegra: assume CONFIG_OF in SMMU driver iommu/tegra: assume CONFIG_OF in gart driver iommu/amd: Remove redundant NULL check before dma_ops_domain_free(). iommu/amd: Initialize device table after dma_ops iommu/vt-d: Zero out allocated memory in dmar_enable_qi iommu/tegra: smmu: Fix incorrect mask for regbase iommu/exynos: Make exynos_sysmmu_disable static ARM: mach-shmobile: r8a7740: Add IPMMU device ARM: mach-shmobile: sh73a0: Add IPMMU device ARM: mach-shmobile: sh7372: Add IPMMU device iommu/shmobile: Add iommu driver for Renesas IPMMU modules iommu: Add DOMAIN_ATTR_WINDOWS domain attribute iommu: Add domain window handling functions iommu: Implement DOMAIN_ATTR_PAGING attribute iommu: Check for valid pgsize_bitmap in iommu_map/unmap iommu: Make sure DOMAIN_ATTR_MAX is really the maximum iommu/tegra: smmu: Change SMMU's dependency on ARCH_TEGRA iommu/tegra: smmu: Use helper function to check for valid register offset iommu/tegra: smmu: Support variable MMIO ranges/blocks iommu/tegra: Add missing spinlock initialization ...
2013-02-13iommu/amd: Remove redundant NULL check before dma_ops_domain_free().Cyril Roelandt
dma_ops_domain_free on a NULL pointer is a no-op, so the NULL check in amd_iommu_init_dma_ops() can be removed. Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-01-28x86, io-apic: Move CONFIG_IRQ_REMAP code out of x86 coreJoerg Roedel
Move all the code to either to the header file asm/irq_remapping.h or to drivers/iommu/. Signed-off-by: Joerg Roedel <joro@8bytes.org> Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-12-02iommu/amd: Remove obsolete commentJoerg Roedel
The AMD IOMMU driver only uses the page-sizes it gets from IOMMU core and uses the appropriate page-size. So this comment is not necessary. Signed-off-by: Joerg Roedel <joro@8bytes.org>
2012-12-02iommu/amd: Don't use 512GB pagesJoerg Roedel
There is a bug in the hardware that will be triggered when this page size is used. Make sure this does not happen. Signed-off-by: Joerg Roedel <joro@8bytes.org>
2012-10-24iommu/amd: Properly account for virtual aliases in IOMMU groupsAlex Williamson
An alias doesn't always point to a physical device. When this happens we must first verify that the IOMMU group isn't rooted in a device above the alias. In this case the alias is effectively just another quirk for the devices aliased to it. Alternatively, the virtual alias itself may be the root of the IOMMU group. To support this, allow a group to be hosted on the alias dev_data for use by anything that might have the same alias. Signed-off-by: Alex williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-10-24iommu/amd: Split IOMMU group allocation and attachAlex Williamson
Add a WARN_ON to make it clear why we don't add dma_pdev->dev to the group we're allocating. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-10-24iommu/amd: Split upstream bus device lookupAlex Williamson
Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-10-24iommu/amd: Split IOMMU Group topology walkAlex Williamson
Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-10-24iommu/amd: Split IOMMU group initializationAlex Williamson
This needs to be broken apart, start with pulling all the IOMMU group init code into a new function. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-10-02Merge branches 'dma-debug', 'iommu/fixes', 'arm/tegra', 'arm/exynos', ↵Joerg Roedel
'x86/amd', 'x86/vt-d' and 'x86/amd-irq-remapping' into next Conflicts: drivers/iommu/amd_iommu_init.c
2012-10-02iommu/amd: Remove obsolete comment lineJoerg Roedel
IRQ_DELAYED_DISABLE does not exist anymore. So this comment is obsolete. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-10-02iommu/amd: Fix possible use after free in get_irq_table()Dan Carpenter
We should return NULL on error instead of the freed pointer. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Report irq remapping through IOMMU-APIJoerg Roedel
Report the availability of irq remapping through the IOMMU-API to allow KVM device passthrough again without additional module parameter overrides. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Add initialization routines for AMD interrupt remappingJoerg Roedel
Add the six routines required to setup interrupt remapping with the AMD IOMMU. Also put it all together into the AMD specific irq_remap_ops. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Add call-back routine for HPET MSIJoerg Roedel
Add a routine to setup a HPET MSI interrupt for remapping. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Implement MSI routines for interrupt remappingJoerg Roedel
Add routines to setup interrupt remapping for MSI interrupts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Add IOAPIC remapping routinesJoerg Roedel
Add the routine to setup interrupt remapping for ioapic interrupts. Also add a routine to change the affinity of an irq and to free an irq allocation for interrupt remapping. The last two functions will also be used for MSI interrupts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Add routines to manage irq remapping tablesJoerg Roedel
Add routines to: * Alloc remapping tables and single entries from these tables * Change entries in the tables * Free entries in the table Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Add IRTE invalidation routineJoerg Roedel
Add routine to invalidate the IOMMU cache for interupt translations. Also include the IRTE caches when flushing all IOMMU caches. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Add slab-cache for irq remapping tablesJoerg Roedel
The irq remapping tables for the AMD IOMMU need to be aligned on a 128 byte boundary. Create a seperate slab-cache to guarantee this alignment. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-28iommu/amd: Keep track of HPET and IOAPIC device idsJoerg Roedel
The IVRS ACPI table provides information about the IOAPICs and the HPETs available in the system and which PCI device ID they use in transactions. Save that information for later usage in interrupt remapping. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-09-28iommu/amd: Fix wrong assumption in iommu-group specific codeJoerg Roedel
The new IOMMU groups code in the AMD IOMMU driver makes the assumption that there is a pci_dev struct available for all device-ids listed in the IVRS ACPI table. Unfortunatly this assumption is not true and so this code causes a NULL pointer dereference at boot on some systems. Fix it by making sure the given pointer is never NULL when passed to the group specific code. The real fix is larger and will be queued for v3.7. Reported-by: Florian Dazinger <florian@dazinger.net> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-09-18iommu/amd: Fix some typosFrank Arnold
Fix some typos in comments and user-visible messages. No functional changes. Signed-off-by: Frank Arnold <frank.arnold@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-08-06iommu/amd: Fix ACS path checkingAlex Williamson
SR-IOV can create buses without a bridge. There may be other cases where this happens as well. In these cases skip to the parent bus and continue testing devices there. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-07-23Merge branches 'iommu/fixes', 'x86/amd', 'groups', 'arm/tegra' and ↵Joerg Roedel
'api/domain-attr' into next Conflicts: drivers/iommu/iommu.c include/linux/iommu.h
2012-07-19iommu/amd: Fix hotplug with iommu=ptJoerg Roedel
This did not work because devices are not put into the pt_domain. Fix this. Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-07-17iommu/amd: Move unmap_flush message to amd_iommu_init_dma_ops()Joerg Roedel
The message belongs there anyway, so move it to that function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-07-17iommu/amd: Introduce early_amd_iommu_init routineJoerg Roedel
Split out the code to parse the ACPI table and setup relevant data structures into a new function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-07-17iommu/amd: Fix sparse warningsJoerg Roedel
A few sparse warnings fire in drivers/iommu/amd_iommu_init.c. Fix most of them with this patch. Also fix the sparse warnings in drivers/iommu/irq_remapping.c while at it. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-07-11iommu/amd: Implement DOMAIN_ATTR_GEOMETRY attributeJoerg Roedel
Implement the attribute itself and add the code for the AMD IOMMU driver. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-07-02iommu/amd: fix type bug in flush codeDan Carpenter
write_file_bool() modifies 32 bits of data, so "amd_iommu_unmap_flush" needs to be 32 bits as well or we'll corrupt memory. Fortunately it looks like the data is aligned with a gap after the declaration so this is harmless in production. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-25amd_iommu: Make use of DMA quirks and ACS checks in IOMMU groupsAlex Williamson
Work around broken devices and adhere to ACS support when determining IOMMU grouping. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-25amd_iommu: Support IOMMU groupsAlex Williamson
Add IOMMU group support to AMD-Vi device init and uninit code. Existing notifiers make sure this gets called for each device. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-25iommu: IOMMU GroupsAlex Williamson
IOMMU device groups are currently a rather vague associative notion with assembly required by the user or user level driver provider to do anything useful. This patch intends to grow the IOMMU group concept into something a bit more consumable. To do this, we first create an object representing the group, struct iommu_group. This structure is allocated (iommu_group_alloc) and filled (iommu_group_add_device) by the iommu driver. The iommu driver is free to add devices to the group using it's own set of policies. This allows inclusion of devices based on physical hardware or topology limitations of the platform, as well as soft requirements, such as multi-function trust levels or peer-to-peer protection of the interconnects. Each device may only belong to a single iommu group, which is linked from struct device.iommu_group. IOMMU groups are maintained using kobject reference counting, allowing for automatic removal of empty, unreferenced groups. It is the responsibility of the iommu driver to remove devices from the group (iommu_group_remove_device). IOMMU groups also include a userspace representation in sysfs under /sys/kernel/iommu_groups. When allocated, each group is given a dynamically assign ID (int). The ID is managed by the core IOMMU group code to support multiple heterogeneous iommu drivers, which could potentially collide in group naming/numbering. This also keeps group IDs to small, easily managed values. A directory is created under /sys/kernel/iommu_groups for each group. A further subdirectory named "devices" contains links to each device within the group. The iommu_group file in the device's sysfs directory, which formerly contained a group number when read, is now a link to the iommu group. Example: $ ls -l /sys/kernel/iommu_groups/26/devices/ total 0 lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:00:1e.0 -> ../../../../devices/pci0000:00/0000:00:1e.0 lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.0 -> ../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.0 lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.1 -> ../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.1 $ ls -l /sys/kernel/iommu_groups/26/devices/*/iommu_group [truncating perms/owner/timestamp] /sys/kernel/iommu_groups/26/devices/0000:00:1e.0/iommu_group -> ../../../kernel/iommu_groups/26 /sys/kernel/iommu_groups/26/devices/0000:06:0d.0/iommu_group -> ../../../../kernel/iommu_groups/26 /sys/kernel/iommu_groups/26/devices/0000:06:0d.1/iommu_group -> ../../../../kernel/iommu_groups/26 Groups also include several exported functions for use by user level driver providers, for example VFIO. These include: iommu_group_get(): Acquires a reference to a group from a device iommu_group_put(): Releases reference iommu_group_for_each_dev(): Iterates over group devices using callback iommu_group_[un]register_notifier(): Allows notification of device add and remove operations relevant to the group iommu_group_id(): Return the group number This patch also extends the IOMMU API to allow attaching groups to domains. This is currently a simple wrapper for iterating through devices within a group, but it's expected that the IOMMU API may eventually make groups a more integral part of domains. Groups intentionally do not try to manage group ownership. A user level driver provider must independently acquire ownership for each device within a group before making use of the group as a whole. This may change in the future if group usage becomes more pervasive across both DMA and IOMMU ops. Groups intentionally do not provide a mechanism for driver locking or otherwise manipulating driver matching/probing of devices within the group. Such interfaces are generic to devices and beyond the scope of IOMMU groups. If implemented, user level providers have ready access via iommu_group_for_each_dev and group notifiers. iommu_device_group() is removed here as it has no users. The replacement is: group = iommu_group_get(dev); id = iommu_group_id(group); iommu_group_put(group); AMD-Vi & Intel VT-d support re-added in following patches. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>