We cannot call unmap directly when map_sg fails partway through,
because the TLB invalidate functions need to enable/prepare clocks,
which requires a non-atomic context. Let map_sg return the failure
and handle the cleanup once we are out of atomic context.
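As a rough caller-side sketch of the intended flow (my_map_sg and
my_unmap are hypothetical names, not the driver's symbols):

#include <stddef.h>

size_t my_map_sg(unsigned long iova, const void *sgl, unsigned int nents,
                 size_t total);                  /* returns bytes mapped */
void my_unmap(unsigned long iova, size_t size);  /* may prepare clocks */

static int map_sg_and_recover(unsigned long iova, const void *sgl,
                              unsigned int nents, size_t total)
{
    size_t mapped;

    /* spinlock held here: no clock prepare / sleeping allowed */
    mapped = my_map_sg(iova, sgl, nents, total);
    /* spinlock dropped here: back in non-atomic context */

    if (mapped < total) {
        my_unmap(iova, mapped);  /* TLB invalidate is safe now */
        return -1;
    }
    return 0;
}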
Change-Id: I6401c1e281850aeda27e32524cae34324045f762
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
For all but the last level of the page tables, we need to call
unmap with supported IOMMU page sizes. However, to find the
maximum page size that can be unmapped, we are currently passing
the original size parameter instead of the remaining size to be
unmapped, which can end up unmapping more than intended.
Fix this by passing the remaining size parameter to
find the correct IOMMU page size.
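A minimal sketch of the intended size selection (simplified model,
not the driver's code; the helper name is made up):

#include <stddef.h>

static size_t pick_pgsize(unsigned long pgsize_bitmap, unsigned long iova,
                          size_t remaining)
{
    const unsigned int top = 8 * sizeof(unsigned long) - 1;
    unsigned int size_idx, align_idx;
    unsigned long mask;

    if (!remaining || !pgsize_bitmap)
        return 0;

    /* Largest power of two that still fits in the *remaining* length. */
    size_idx = top - __builtin_clzl((unsigned long)remaining);

    /* The current IOVA alignment may limit us further. */
    if (iova) {
        align_idx = __builtin_ctzl(iova);
        if (align_idx < size_idx)
            size_idx = align_idx;
    }

    /* Keep only supported sizes up to 2^size_idx, then take the biggest. */
    mask = size_idx >= top ? pgsize_bitmap
                           : pgsize_bitmap & ((2UL << size_idx) - 1);
    return mask ? 1UL << (top - __builtin_clzl(mask)) : 0;
}

With a 4K|2M|1G bitmap, a 2MB-aligned IOVA and 2MB - 4K remaining,
this returns 4K; feeding in the original (larger) size instead could
return 2MB and unmap more than intended.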
Change-Id: Id4bd188ff6a3b4b7d34ba43ae6a61efb3c65b281
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
When unmapping 2MB mappings, which are 2MB aligned, the smmu driver
is leaking the 3rd level page tables.
Fix this leak by updating __arm_lpae_free_pgtable so that it no
longer leaks leaf table entries.
To reproduce this leak simply map and unmap a non-block 2MB mapping
which is 2MB aligned.
Change-Id: Ibdbdb084ceb8d03ebe0a04e8777e3eb9419e9b87
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Currently, the page table is flushed after the installation of each
individual page table entry. This is not terribly efficient since
virtual address ranges are often mapped with physically contiguous
chunks of page table memory. Optimize the map operation by factoring
out the page table flushing so that contiguous ranges of page table
memory can be flushed in one go.
Change-Id: Ie80eb57ef50d253db6489a6f75824d4c746314c7
Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
TTBR1 shouldn't currently be used at all. Any such usage is the result
of a bug. Configure TCR such that any usage of TTBR1 will generate a
fault, which will help catch such bugs earlier.
Change-Id: I74f2dc9580e5aed5d391debe23c7f5cc9fc1f672
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Some IOMMU drivers (like arm-smmu) need to perform fixups to page table
memory when it's allocated and freed. Some callbacks were added for
this purpose ({prepare,unprepare}_pgtable) that are called from the
io-pgtable code after each allocation and before each free. However,
this approach is prone to bugs where new calls to allocate/free are
added without corresponding calls to the fixup callbacks. Furthermore,
allocation and free is often done from atomic context, so if the driver
needs to do non-atomic fixups during free they are out of luck since the
memory will be freed back to the system by the time control is turned
back over to the driver.
Adding yet another callback for non-atomic fixups would start to get
tedious, and would only increase the chance of missing callbacks as new
allocations/frees are added. Instead, fix this by allowing all page
table memory allocation and free to be delegated entirely to the driver.
Fall back to {alloc,free}_pages_exact by default if the driver doesn't
need any special handling of page table memory.
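In outline (kernel-style sketch; the struct and callback names here
are illustrative, not the exact fields added to the io-pgtable
config):

#include <linux/gfp.h>
#include <linux/types.h>

struct pgtable_mem_ops {
    void *(*alloc)(void *cookie, size_t size, gfp_t gfp);
    void (*free)(void *cookie, void *va, size_t size);
};

static void *pgtable_alloc(const struct pgtable_mem_ops *ops, void *cookie,
                           size_t size, gfp_t gfp)
{
    /* Drivers that need fixups supply their own allocator... */
    if (ops && ops->alloc)
        return ops->alloc(cookie, size, gfp);
    /* ...everyone else falls back to plain page allocations. */
    return alloc_pages_exact(size, gfp);
}

static void pgtable_free(const struct pgtable_mem_ops *ops, void *cookie,
                         void *va, size_t size)
{
    if (ops && ops->free)
        ops->free(cookie, va, size);
    else
        free_pages_exact(va, size);
}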
Change-Id: I0361bb81e25ff5ad4ef93a45330a35af47bc6013
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently, arm_lpae_split_blk_unmap is assuming that the size to be
remapped is the block size of the next level. However, optimizations
have been made to the unmap code that result in the entire remaining
size being passed in, rather than just doing it one block at a time,
which breaks arm_lpae_split_blk_unmap.
Fix this by overriding the size passed in to be the block size of the
next level.
Change-Id: Ifce5b2e07dc15aba3cd37b7ac249e00decd2923f
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently, when all of the 4K PTEs beneath a 2M table entry are
unmapped, that 2M table entry is left intact, even though it doesn't
point to any valid 4K mappings anymore. This results in a warning if a
subsequent block mapping lands on top of the dangling table entry, since
we require empty page table entries when we map. It also causes the
page at which that the stomped-on table was pointing to be leaked. Fix
this by keeping track of how many entries are currently mapped beneath a
table. When the map count goes to zero (in unmap), free up the page the
table is pointing at and zero out the table entry.
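A toy model of the bookkeeping (types and helpers are illustrative
only):

#include <stdint.h>
#include <stdlib.h>

struct pt_table {
    uint64_t *entries;    /* e.g. 512 descriptors for a 4K granule */
    unsigned int mapped;  /* live entries beneath this table */
};

static void clear_leaf(struct pt_table *tbl, unsigned int idx,
                       uint64_t *parent_slot)
{
    tbl->entries[idx] = 0;
    if (--tbl->mapped == 0) {
        /* Last mapping gone: free the page and zero the table entry. */
        free(tbl->entries);
        tbl->entries = NULL;
        *parent_slot = 0;
    }
}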
Change-Id: I470e6ffb2206a09fe7c24253e3fd64a744337a7f
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
It can be useful for debugging to know how much memory is being used for
IOMMU page tables. Add some dedicated allocation functions and a
debugfs file for this.
Change-Id: Id69b4b1b5df5dcc6c604eec3a12a894b8eab0eb6
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Requests from the walker to page table memory are currently inner- and
outer-cacheable. This configuration hasn't been fully validated for
functionality or characterized for performance. Configure these
requests as non-cacheable.
Change-Id: I7efb0a697faff68a67ee0afdb933b6dd6926f30a
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
The last level optimization for __arm_lpae_unmap assumes that
consecutive blocks of 2MB addresses are located in the same
1GB mapping but this isn't always true - if the address spans
a 1GB boundary the next iova is in a different pagetable.
Only perform the optimization for the current pagetable entry and
then kick it back for the loop in arm_lpae_unmap to try again
with the updated iova. All this means that __arm_lpae_unmap may
not unmap the entire size it was given. This is okay as long as
at least something was unmapped, so don't jump out of the loop
in arm_lpae_unmap until the child function returns 0 or the entire
block is freed, whichever comes first.
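The resulting caller loop looks roughly like this (unmap_one_chunk
stands in for __arm_lpae_unmap):

#include <stddef.h>

size_t unmap_one_chunk(unsigned long iova, size_t size);  /* hypothetical */

static size_t unmap_range(unsigned long iova, size_t size)
{
    size_t unmapped = 0;

    while (unmapped < size) {
        size_t ret = unmap_one_chunk(iova, size - unmapped);

        if (!ret)         /* nothing more could be unmapped */
            break;
        iova += ret;      /* may now cross into the next 1GB table */
        unmapped += ret;
    }
    return unmapped;
}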
CRs-Fixed: 867143
Change-Id: Ic0dedbad407d60365a95afdaf03ec3b91f53960d
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The size parameter in .unprepare_pgtable() and arm_smmu_assign_table()
needs to be the same, since the functions are complementary.
Use PAGE_SIZE in both functions instead of relying on the calling
function for .unprepare_pgtable().
Change-Id: Ic6fade307360254329968e1b4548732d045b8205
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
For secure domains, the page tables need to be assigned
to the correct VMIDs. Add support for doing the assignment.
Change-Id: I60009ef96ae1c8965d57096b6a1b0658ae6acc9a
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
This reverts commit 0c78cf6e138f ("iommu: io-pgtable-arm: set page
tables as outer shareable"). We actually don't want outer-shareable
since we'd like to disable coherent table walking.
Change-Id: Id38e931864c4c1a0d77bb06d0da231b546bedf6d
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
This reverts commit 713d52a0acca ("iommu: io-pgtable-arm: flush tlb for
stale mappings"), which was a workaround for some other bugs in the page
table mapping code. Those bugs have since been fixed, so the workaround
is no longer needed.
Change-Id: Ic699328dd60baffd1c6080e6b0d9b2fb0feea831
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
The arm_lpae_range_has_mapping currently checks if there are any
non-mapped slots in a given iova range, but it's really meant to check
if there is *any* mapping whatsoever in a given iova range. Fix this.
Change-Id: I90e426ab157cc194328b754ac5021051ac883603
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
We're currently suppressing all map failures during the entirety of the
selftests. We really only want to suppress those failures during
individual negative map test cases to avoid logspam, but we *do* want to
see other map failures during the selftests. Fix this by only
suppressing map failures during negative map test cases.
Change-Id: If51a95dd4d8c5b756cfa4597a5bdd7c75afe2637
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
There can be page table bugs when both block and page mappings are used
to make up the mapping for a single VA range. Add a test case for this
to the selftests.
Change-Id: Ic2b943dd74f1ed2ed1e5e03832742f0e6deff58e
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently we have an optimization in place for unmapping the last level
of the page tables. We do this by memset()'ing the entire last level at
once rather than calling unmap on each individual page mapping at the
last level. For this to work we have to pass in sizes that aren't equal
to any of the supported IOMMU page sizes. However, our optimization
only applies at the last level. Unmapping at the other levels still
relies on the fact that unmap is only called with supported IOMMU page
sizes, which means it's currently broken.
Fix this by always calling unmap with an IOMMU page size, unless we're
at the last level of the page tables (i.e. the size to be unmapped is
less than the block size at the second-to-last level), in which case we
can pass in the entire remaining size.
Change-Id: Ie3716002c793af3dca51e0e3363d261f345e9e25
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
If something bad happens in arm_lpae_map_sg we try to clean up after
ourselves by unmapping any partial mappings that succeeded. However,
we're currently passing in the wrong iova to the unmap function during
the cleanup. Fix this.
Change-Id: Ieb30616141f3fb709d02abd147f9f598e2db07cc
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Unmap returns a size_t throughout the IOMMU framework. Make
io-pgtable match this convention.
Change-Id: Ice4c75a428f0f95a665e2fbe4210349d6f78c220
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Coherent table walk is broken on our system. Set page tables as outer
shareable so that the SMMU doesn't try to walk them in the cache.
Change-Id: Id9dd3d10139750b0dbb77842c12efd49e2672645
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
There seems to be a bug in the unmap code that results in us leaving
stale mappings in the page table. We can actually live with this as
long as we invalidate the tlb when a new mapping comes in on the same
virtual address (to prevent the walker from using the old, bogus
iova->phys mappings).
Change-Id: If5923e853e7ec542b12ca954d5f1c22dec5e5bb2
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
In ARMv8, the output address from a page table walk is obtained by
combining some bits from the physical address in the leaf page table
entry with some bits from the input virtual address. The number of bits
that should be taken from the virtual address varies based on the lookup
level and descriptor type. However, we're currently always using
data->pg_shift bits, which is a constant.
Conveniently there's already a macro to compute the number of bits we
want (ARM_LPAE_LVL_SHIFT). Use this macro instead of data->pg_shift to
build the virtual address mask.
Change-Id: Id7f8aa2c553cc004e5d895d05c9226a896d22ce6
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
The io-pgtable-arm unit tests currently only check a few mappings within
a range to determine whether or not that range "looks good". There can
be subtle bugs that don't show up until you get to a certain offset
within the range, etc. Check the entire range before assuming that it's
good.
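In outline (lookup() is a stand-in for the allocator's
iova_to_phys):

#include <stdbool.h>
#include <stddef.h>

unsigned long long lookup(unsigned long iova);  /* hypothetical */

static bool range_maps_to(unsigned long iova, unsigned long long phys,
                          size_t size, size_t granule)
{
    size_t off;

    for (off = 0; off < size; off += granule)
        if (lookup(iova + off) != phys + off)
            return false;  /* subtle bugs often show up mid-range */
    return true;
}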
Change-Id: I244a2150d38f57d95a5c81854cdeaf59ab4ace06
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Clients might want to map device memory into their SMMU. Add support
for these device mappings through the IOMMU_DEVICE flag.
Change-Id: I756720181aa0d531f4c56453ef832f81b36ffccd
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently we walk each last-level leaf pte during unmap and zero them
out individually. Since these last-level ptes are all contiguous (up to
512 entries), optimize the unmapping process by simply zeroing them all
out at once rather than operating on them individually.
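Roughly (illustrative C, not the driver's code):

#include <stdint.h>
#include <string.h>

/* Zero a contiguous run of last-level descriptors in one go. */
static void clear_leaf_run(uint64_t *ptep, unsigned int nr_entries)
{
    /* nr_entries is at most 512 for a 4K granule */
    memset(ptep, 0, nr_entries * sizeof(*ptep));
    /* one cache flush / sync of this range would follow here */
}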
Change-Id: I21d490e8a94355df4d4caecab33774b5f8ecf3ca
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Rather than calling the tlb maintenance routines throughout the course
of the unmap operation, just flush the entire tlb for the context in
question all at once, at the very end of the unmap. This greatly
improves performance for large page tables (which is common for large
buffers in a heavily fragmented system).
In my testing, this optimization gave a ~10% speedup when unmapping 64K.
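The shape of the change, with made-up callback and helper names:

#include <stddef.h>

struct tlb_ops {
    void (*flush_all)(void *cookie);  /* invalidate the whole context */
};

size_t do_unmap_walk(unsigned long iova, size_t size);  /* hypothetical */

static size_t unmap_then_flush(const struct tlb_ops *tlb, void *cookie,
                               unsigned long iova, size_t size)
{
    size_t unmapped = do_unmap_walk(iova, size);  /* no per-PTE invalidates */

    if (unmapped)
        tlb->flush_all(cookie);  /* one invalidation for the whole range */
    return unmapped;
}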
Change-Id: Iaa2b211e730dad6bd9235ef98dd2a89cf541e663
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
io-pgtable-arm has just gotten support for .map_sg. Add a test to the
suite of self-tests for this.
Change-Id: Iba56bb801c1f9ef151827598022411c95d389faa
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Mapping an entire scatterlist at once is faster than calling iommu_map
on each link individually. Implement .map_sg in the ARM LPAE page table
allocator so that drivers using the allocator can leverage this
performance boost.
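A simplified model of the operation (chunk layout and helper names
are hypothetical): map each chunk at the running IOVA, and on
failure tear down from the original starting IOVA.

#include <stddef.h>

struct sg_chunk {
    unsigned long long phys;
    size_t len;
};

int map_one(unsigned long iova, unsigned long long phys, size_t len,
            int prot);                                   /* hypothetical */
void unmap_partial(unsigned long iova, size_t len);      /* hypothetical */

static size_t map_sg_chunks(unsigned long iova, const struct sg_chunk *sg,
                            unsigned int nents, int prot)
{
    unsigned long cur = iova;
    size_t mapped = 0;
    unsigned int i;

    for (i = 0; i < nents; i++) {
        if (map_one(cur, sg[i].phys, sg[i].len, prot)) {
            unmap_partial(iova, mapped);  /* from the start, not 'cur' */
            return 0;
        }
        cur += sg[i].len;
        mapped += sg[i].len;
    }
    return mapped;
}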
Change-Id: I77f62a2566058693c3f58fc0b05d715a780ae5d8
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
The quirk causes the Non-Secure bit to be set in all page table entries.
Change-Id: I937fb7dec4214eca33f8014c664cfc5c99cb0027
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Git-commit: c896c132b0
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
This patch adds a series of basic self-consistency tests to the ARM LPAE
IO page table allocator that exercise corner cases in map/unmap, as well
as testing all valid configurations of pagesize, ias and stage.
Change-Id: I703df977b7e5914e0ccf9aaca2174cf5956dd604
Signed-off-by: Will Deacon <will.deacon@arm.com>
Git-commit: fe4b991dcd
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
A number of IOMMUs found in ARM SoCs can walk architecture-compatible
page tables.
This patch adds a generic allocator for Stage-1 and Stage-2 v7/v8
long-descriptor page tables. 4k, 16k and 64k pages are supported, with
up to 4-levels of walk to cover a 48-bit address space.
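A quick sanity check of the level arithmetic (standalone C, not the
allocator's code): with 8-byte descriptors, each level resolves
pg_shift - 3 bits of VA.

#include <stdio.h>

static unsigned int levels_needed(unsigned int va_bits, unsigned int pg_shift)
{
    unsigned int bits_per_level = pg_shift - 3;  /* 512 entries per 4K table */

    return (va_bits - pg_shift + bits_per_level - 1) / bits_per_level;
}

int main(void)
{
    /* 48-bit VA: 4 levels with a 4K granule, 3 levels with a 64K granule */
    printf("%u %u\n", levels_needed(48, 12), levels_needed(48, 16));
    return 0;
}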
Change-Id: I32740cfa795c55e0d3683b42105b4f49c9dcf984
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Git-commit: e1d3c0fd70
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
In checking whether DMA addresses differ from physical addresses, using
dma_to_phys() is actually the wrong thing to do, since it may hide any
DMA offset, which is precisely one of the things we are checking for.
Simply casting between the two address types, whilst ugly, is in fact
the appropriate course of action. Further care (and ugliness) is also
necessary in the comparison to avoid truncation if phys_addr_t and
dma_addr_t differ in size.
We can also reject any device with a fixed DMA offset up-front at page
table creation, leaving the allocation-time check for the more subtle
cases like bounce buffering due to an incorrect DMA mask.
Furthermore, we can then fix the hackish Kconfig dependency so that
architectures without a dma_to_phys() implementation may still
COMPILE_TEST (or even use!) the code. The true dependency is on the
DMA API, so use the appropriate symbol for that.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: folded in selftest fix from Yong Wu]
Signed-off-by: Will Deacon <will.deacon@arm.com>
When installing a block mapping, we unconditionally overwrite a non-leaf
PTE if we find one. However, this can cause a problem if the following
sequence of events occur:
(1) iommu_map called for a 4k (i.e. PAGE_SIZE) mapping at some address
- We initialise the page table all the way down to a leaf entry
- No TLB maintenance is required, because we're going from invalid
to valid.
(2) iommu_unmap is called on the mapping installed in (1)
- We walk the page table to the final (leaf) entry and zero it
- We only changed a valid leaf entry, so we invalidate leaf-only
(3) iommu_map is called on the same address as (1), but this time for
a 2MB (i.e. BLOCK_SIZE) mapping
- We walk the page table down to the penultimate level, where we
find a table entry
- We overwrite the table entry with a block mapping and return
without any TLB maintenance and without freeing the memory used
by the now-orphaned table.
This last step can lead to a walk-cache caching the overwritten table
entry, causing unexpected faults when the new mapping is accessed by a
device. One way to fix this would be to collapse the page table when
freeing the last page at a given level, but this would require expensive
iteration on every map call. Instead, this patch detects the case when
we are overwriting a table entry and explicitly unmaps the table first,
which takes care of both freeing and TLB invalidation.
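The fix boils down to something like the following (helper names
made up; pte_is_table() and unmap_region() stand in for the real
descriptor check and recursive unmap):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

bool pte_is_table(uint64_t pte);                          /* hypothetical */
void unmap_region(unsigned long iova, size_t block_size); /* hypothetical */

static void install_block(uint64_t *ptep, uint64_t block_pte,
                          unsigned long iova, size_t block_size)
{
    if (pte_is_table(*ptep))
        /* frees the orphaned table and performs TLB maintenance */
        unmap_region(iova, block_size);

    *ptep = block_pte;
}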
Cc: <stable@vger.kernel.org>
Reported-by: Brian Starkey <brian.starkey@arm.com>
Tested-by: Brian Starkey <brian.starkey@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With the users fully converted to DMA API operations, it's dead, Jim.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With all current users now opted in to DMA API operations, make the
iommu_dev pointer mandatory, rendering the flush_pgtable callback
redundant for cache maintenance. However, since the DMA calls could be
nops in the case of a coherent IOMMU, we still need to ensure the page
table updates are fully synchronised against a subsequent page table
walk. In the unmap path, the TLB sync will usually need to do this
anyway, so just cement that requirement; in the map path which may
consist solely of cacheable memory writes (in the coherent case),
insert an appropriate barrier at the end of the operation, and obviate
the need to call flush_pgtable on every individual update for
synchronisation.
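A userspace-flavoured sketch of the map-path tail (the kernel code
uses wmb(); the fence here is the C11 analogue):

#include <stdatomic.h>
#include <stdint.h>

static void finish_map(uint64_t *last_ptep, uint64_t pte)
{
    *last_ptep = pte;  /* final descriptor store of the map operation */
    /* one barrier at the end, instead of a flush per PTE update */
    atomic_thread_fence(memory_order_release);
}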
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: slight clarification to tlb_sync comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, users of the LPAE page table code are (ab)using dma_map_page()
as a means to flush page table updates for non-coherent IOMMUs. Since
from the CPU's point of view, creating IOMMU page tables *is* passing
DMA buffers to a device (the IOMMU's page table walker), there's little
reason not to use the DMA API correctly.
Allow IOMMU drivers to opt into DMA API operations for page table
allocation and updates by providing their appropriate device pointer.
The expectation is that an LPAE IOMMU should have a full view of system
memory, so use streaming mappings to avoid unnecessary pressure on
ZONE_DMA, and treat any DMA translation as a warning sign.
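Kernel-style sketch (error handling trimmed; helper names are not
the driver's):

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/gfp.h>

static void *alloc_table(struct device *iommu_dev, size_t size, gfp_t gfp,
                         dma_addr_t *dma)
{
    void *pages = alloc_pages_exact(size, gfp | __GFP_ZERO);

    if (pages)
        *dma = dma_map_single(iommu_dev, pages, size, DMA_TO_DEVICE);
    return pages;
}

static void sync_ptes(struct device *iommu_dev, dma_addr_t table_dma,
                      unsigned long offset, size_t len)
{
    /* push the updated descriptors out to the table walker */
    dma_sync_single_for_device(iommu_dev, table_dma + offset, len,
                               DMA_TO_DEVICE);
}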
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Although we set TCR.T1SZ to 0, the input address range covered by TTBR1
is actually calculated using T0SZ in this case on the ARM SMMU. This
could theoretically lead to speculative table walks through physical
address zero, leading to all sorts of fun and games if we have MMIO
regions down there.
This patch avoids the issue by setting EPD1 to disable walks through
the unused TTBR1 register.
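In essence (bit position per the ARMv8 TCR layout; the surrounding
register plumbing is omitted):

#define TCR_EPD1    (1u << 23)  /* disable table walks via TTBR1 */

static unsigned int tcr_disable_ttbr1(unsigned int tcr)
{
    return tcr | TCR_EPD1;
}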
Signed-off-by: Will Deacon <will.deacon@arm.com>
Various build/boot bots have reported WARNs being triggered by the ARM
iopgtable LPAE self-tests on i386 machines.
This boils down to two instances of right-shifting a 32-bit unsigned
long (i.e. an iova) by more than the size of the type. On 32-bit ARM,
this happens to give us zero, hence my testing didn't catch this
earlier.
This patch fixes the issue by using DIV_ROUND_UP and explicit casts
to avoid the erroneous shifts.
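A standalone illustration of the hazard and the fix:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t iova = 0x80000000u;  /* stands in for a 32-bit unsigned long */
    unsigned int shift = 38;      /* a level shift larger than 32 */

    /* 'iova >> shift' would be undefined here, since shift >= 32. */
    uint64_t idx = (uint64_t)iova >> shift;  /* widened first: well-defined, 0 */

    printf("%llu\n", (unsigned long long)idx);
    return 0;
}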
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The quirk causes the Non-Secure bit to be set in all page table entries.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds a series of basic self-consistency tests to the ARM LPAE
IO page table allocator that exercise corner cases in map/unmap, as well
as testing all valid configurations of pagesize, ias and stage.
Signed-off-by: Will Deacon <will.deacon@arm.com>
A number of IOMMUs found in ARM SoCs can walk architecture-compatible
page tables.
This patch adds a generic allocator for Stage-1 and Stage-2 v7/v8
long-descriptor page tables. 4k, 16k and 64k pages are supported, with
up to 4-levels of walk to cover a 48-bit address space.
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>