Coherent hardware table walking hasn't been properly validated or
characterized, so all clients are expected to disable it. This is a
recipe for disaster since it could be easy for a new client to come
along without knowing that they need to disable it. Just disable it by
default. Clients can always explicitly enable it in the future if it's
found to be beneficial.
Change-Id: I4badfe33e815a6ba7b25507f5dd5a42f68d4bfa6
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
It can be useful to check whether or not coherent hardware table walking
has been explicitly disabled on attached domains. Add this to the
attach info debugfs file.
Change-Id: I432303ecb734d32eaa02038694daad0d8c4d8aba
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently IOMMU attachment info is available in debugfs files located at
<debugfs_root>/iommu/attachments/<attached_device>. However, there are
more actions that can be taken on attached devices besides just printing
their info. Make room for more debugfs files for attached devices by
creating a directory for each one, and move the existing info file to:
<debugfs_root>/iommu/attachments/<attached_device>/info.
Change-Id: Ia56efc3aeb5e82afc34314fe48aaa0cd6e5579be
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently the debug device structure is allocated with kmalloc, without
initializing all of the fields in the structure. Later, those fields
might be used before they've ever been assigned. For example, if a user
executes the following code on a fresh boot:
# cd /sys/kernel/debug/iommu/tests/some_device
# echo 0 > attach
The kernel crashes with something like this (assuming page poisoning is
enabled):
Unable to handle kernel paging request at virtual address aaaaaaaaaaaaaaaa
pgd = ffffffc0a92a1000
[aaaaaaaaaaaaaaaa] *pgd=0000000000000000, *pud=0000000000000000
Fix this by initializing all the fields in the structure to 0 by using
kzalloc instead of kmalloc.
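As a userspace illustration of why zero-initialization matters here (this
is not the driver code), the sketch below uses calloc() in the role of
kzalloc(). The structure, its fields, and both helper functions are
hypothetical stand-ins:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the debug device structure. */
struct debug_device {
	void *domain;	/* NULL means "nothing attached" */
	int attached;
};

/* Userspace analogue of kzalloc(): every field starts out as 0/NULL. */
static struct debug_device *debug_device_alloc(void)
{
	return calloc(1, sizeof(struct debug_device));
}

/* Detach is only safe on a fresh device if domain is reliably NULL. */
static int debug_device_detach(struct debug_device *ddev)
{
	if (!ddev->domain)
		return -1;	/* nothing attached; do not dereference */
	ddev->domain = NULL;
	ddev->attached = 0;
	return 0;
}
```

With a malloc()-style allocation the NULL check above would be testing
garbage, which is the failure mode described in this message.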
Change-Id: I3514bf7bf174e176ff7a310c7134d0f53e22d771
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
The functions for iommu/devices/<device>/profiling don't actually have
the word `profiling' in the name, which will be confusing as we add more
files to that directory. Rename them for clarity.
Change-Id: Ic57d9400d8784d2cbd667185c5b2b0e1275461dd
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
In order to facilitate debugging of context faults, the result of a
software table walk is currently printed as part of the fault handling.
However, there are certain bugs and error conditions that can't be
caught by doing a software table walk (like missing TLB invalidation or
misbehaving hardware). Add a hardware table walk (via ATOS) to improve
debugging capabilities.
Change-Id: Ie89019df62f115627359e29b1f6cc5de3a36d1b5
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
The hardware address translation operations (ATOS) can be useful for
debugging. arm-smmu used to have support for ATOS, but it was ripped
out while moving to the io-pgtable framework. Resurrect the old ATOS
code with the following modifications:
- Remove errata workarounds for deprecated hardware.
- Move the atos lock to a spinlock (since the only reason a mutex was
being used previously was due to the fact that some of the old
errata workarounds required sleeping operations).
Change-Id: I1a021026b9ee41ba2c1761bd5d5b7a13399c6417
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Some IOMMU hardware implementations provide hardware translation
operations that can be useful during debugging and development. Add a
function for this purpose along with an associated op in the iommu_ops
structure so that drivers can implement it.
Change-Id: I54ad5df526cdce05f8e04206a4f01253b3976b48
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently, the device name for the SMMU context bank device is used as
the filename for the IOMMU debug info file. This doesn't work in cases
where multiple domains can be attached to a single SMMU context bank
device (like dynamic domains). Make these filenames unique by appending
a 16-byte uuid to the name.
Change-Id: Ie26ece773bfa2e8c75a329a8cb8461bcd598218e
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
The last level optimization for __arm_lpae_unmap assumes that
consecutive blocks of 2MB addresses are located in the same
1GB mapping but this isn't always true - if the address spans
a 1GB boundary the next iova is in a different pagetable.
Only perform the optimization for the current pagetable entry and
then kick it back for the loop in arm_lpae_unmap to try again
with the updated iova. All this means that __arm_lpae_unmap may
not unmap all of the size that it was given. This is okay assuming
at least something was unmapped so don't jump out of the loop
in arm_lpae_unmap until the child function returns 0 or the entire
block is freed, whichever comes first.
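The retry loop described above can be sketched in userspace as follows.
This is a simplified model, not the io-pgtable code: child_unmap() and
unmap_range() are hypothetical names, and the child simply stops at the
next 1GB boundary to mimic a partial unmap:

```c
#include <stddef.h>
#include <stdint.h>

#define SZ_1G (UINT64_C(1) << 30)

/*
 * Hypothetical child unmap: handles at most the span up to the next 1GB
 * boundary, so it may return less than the requested size.
 */
static uint64_t child_unmap(uint64_t iova, uint64_t size)
{
	uint64_t to_boundary = SZ_1G - (iova & (SZ_1G - 1));

	return size < to_boundary ? size : to_boundary;
}

/* Keep calling the child until it returns 0 or everything is unmapped. */
static uint64_t unmap_range(uint64_t iova, uint64_t size)
{
	uint64_t unmapped = 0;

	while (unmapped < size) {
		uint64_t ret = child_unmap(iova, size - unmapped);

		if (ret == 0)
			break;
		iova += ret;
		unmapped += ret;
	}
	return unmapped;
}
```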
CRs-Fixed: 867143
Change-Id: Ic0dedbad407d60365a95afdaf03ec3b91f53960d
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The default mapping type is non-executable. Check for the
DMA_ATTR_EXEC_MAPPING attribute which allows clients to
request an executable mapping.
Change-Id: I24a170990cc04a848b6779871ae2025721177d46
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
DMA_ATTR_EXEC_MAPPING specifies that an executable mapping
should be created for the requested buffer. By default, the
DMA mappings are non-executable.
Change-Id: I135077e14996e92fa9d199bdee043c443db48924
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
The logic for the iommu executable flag is inverted now and
all the iommu mappings are executable by default.
Provide the IOMMU_NOEXEC flag where the mapping needs to be non-executable.
Change-Id: Ifa0aa3d17ae79c16abdf66d2177a09b868a9f45f
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
[pdaly@codeaurora.org Remove gpu/mdss]
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
The msm lazy mapping APIs did not allow callers to pass in
dma attributes to be forwarded to the dma-mapping
driver. Allow users to specify dma attributes for the
msm lazy mappings.
Change-Id: I3e4cd2bb99d205dce78083a256f4d444d865f3cc
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
DMA_ATTR_NO_DELAYED_UNMAP specifies to the msm lazy mapping
driver that this buffer should be immediately unmapped once
it is freed.
Change-Id: I43e6a6058705502cf91bf5f0c530c3099cba06ae
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
arm32 recently removed the `order' parameter from
arm_iommu_create_mapping: (68efd7d2fb: arm: dma-mapping: remove order
parameter from arm_iommu_create_mapping()) in order to make the API
easier to understand. The arm32 DMA IOMMU mapper has dynamic resizing
of the iova bitmap, so there was no reason to keep the `order' parameter
around (which was introduced to reduce the size of the bitmap).
Although we don't have dynamic iova bitmap reallocation on arm64, we'd
still like to get rid of the `order' parameter since it's confusing and
doesn't really help much (especially since all known clients on our
system are passing order=0). Remove it.
Change-Id: I35e32fdfbe05ec434f64a3a316d13c8f43304bc6
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
[pdaly@codeaurora.org Remove gpu/ipa etc modifications]
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
An incorrect conflict resolution when picking up 07c17f4e0b71d ("arm64:
add support for NO_KERNEL_MAPPING and STRONGLY_ORDERED") caused
MT_NORMAL_NC attributes to be set for strongly-ordered mappings, rather
than MT_DEVICE_nGnRnE attributes.
As a result of this, speculative data fetches of the incorrectly-
mapped memory have been observed. In cases where this memory is
XPU protected or reads result in side-effects, this may result in
device crashes.
Fix this by setting the attributes returned by pgprot_noncached()
regardless of the value of the coherent flag passed to
__get_dma_pgprot().
Change-Id: Iec56027e280ae0920016df3066045b71299a915b
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
Currently, debugging IOMMU issues is done via manual instrumentation of
the code or via low-level techniques like using a JTAG debugger.
Introduce a set of library functions and debugfs files to facilitate
interactive debugging and testing.
This patch introduces the basic infrastructure as well as initial
debugfs files for:
- viewing IOMMU attachments (domain->dev mappings created by
iommu_attach_device) and resulting attributes (like the base address
of the page tables)
- basic performance profiling
Example usage:
# cd /sys/kernel/debug/iommu/attachments
# cat b40000.qcom,kgsl-iommu:iommu_kgsl_cb2
Domain: 0xffffffc0cb983f00
PT_BASE_ADDR: virt=0xffffffc057eca000 phys=0x00000000d7eca000
# cd /sys/kernel/debug/iommu/tests
# cat soc:qcom,cam_smmu:msm_cam_smmu_cb1/profiling
size       iommu_map  iommu_unmap
  4K           47 us       909 us
 64K           97 us       594 us
  2M         1536 us       605 us
 12M         8737 us      1193 us
 20M        26517 us      1121 us

size    iommu_map_sg  iommu_unmap
 64K           31 us       656 us
  2M          885 us       600 us
 12M         2674 us       687 us
 20M         4352 us      1096 us
Change-Id: I1c301eec6e64688831cad80ffd0380743f7f0df6
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently we're relying on the smmu_domain->lock for synchronizing
attach and detach. This is a problem because each domain has its own
smmu_domain->lock, so if multiple different domains try to attach to the
same device at the same time, they'll be racing.
Fix the race by holding a lock that's part of the smmu
structure (attach_lock should do just fine).
The test case that uncovered this was:
# cd /sys/kernel/debug/iommu/tests/soc:qcom,msm-audio-ion/
# while :; do cat profiling; done &
# while :; do cat profiling; done &
Change-Id: I8a60cdc214c91967aff63882e3a7280865ffda9e
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
We currently have an error path in arm_smmu_attach_dev where we're
returning with a mutex locked. Fix this.
Change-Id: I197edea7cefe361027cf46e22313ebe844684ec8
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently we're using a 10us delay while polling the TLB status register
after doing a TLB operation. These operations almost always finish on
the first iteration or two, so the delay is unnecessary. Just do a
tight poll.
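A minimal userspace model of such a tight poll might look like this. The
register model, the status bit, and the iteration cap are all assumptions
made for illustration and are not taken from the driver:

```c
#include <stdint.h>

#define TLB_LOOP_TIMEOUT 500000		/* hypothetical iteration cap */
#define TLBSTATUS_ACTIVE 0x1u		/* hypothetical "sync in progress" bit */

/* Hypothetical register model: reports busy for a number of reads. */
struct fake_tlbstatus {
	unsigned int busy_reads;
};

static uint32_t read_tlbstatus(struct fake_tlbstatus *reg)
{
	if (reg->busy_reads) {
		reg->busy_reads--;
		return TLBSTATUS_ACTIVE;
	}
	return 0;
}

/* Tight poll: no delay between reads. Returns 0 on success, -1 on timeout. */
static int tlb_sync_poll(struct fake_tlbstatus *reg)
{
	unsigned int count;

	for (count = 0; count < TLB_LOOP_TIMEOUT; count++) {
		if (!(read_tlbstatus(reg) & TLBSTATUS_ACTIVE))
			return 0;
	}
	return -1;
}
```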
Change-Id: I7d5787ea92e227ded5a0578c1c647e8317c8ceca
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
There are TLBSTATUS registers in SMMU global register space as well as
context bank register space. Currently we're polling the global
TLBSTATUS registers after TLB invalidation, even when using the TLB
invalidation registers from context bank address space. This violates
the usage model described in the ARM SMMU spec. Fix this by polling
context bank TLBSTATUS registers for context bank TLB operations, and
global TLBSTATUS registers for global TLB operations.
Change-Id: I8aa916f7bc71793cad4c9224aa85d5310eacec75
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
The size parameter in .unprepare_pgtable() and arm_smmu_assign_table()
needs to be the same since the functions are complementary.
Use PAGE_SIZE in both functions instead of relying on the calling
function for .unprepare_pgtable().
Change-Id: Ic6fade307360254329968e1b4548732d045b8205
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
Make all msm_dma_iommu apis depend on CONFIG_IOMMU_API as it is
only used when we have the linux iommu layer available.
Change-Id: I879dc1a9174d498b9b4bc68b2418165f3b2675a3
Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
Ensure that clock enabling and reference counting are done atomically
to avoid any potential race conditions.
An example of a potential race condition is that while thread one is
enabling the clocks thread two could enter and then exit the clock
enabling function early because of reference counting. This could
lead to thread two attempting to access registers before the clocks
are enabled.
Remove the regulator reference count because enabling the regulators
involves the use of a mutex so spin locks cannot be used to protect
the reference count. Also the use of a regulator reference count is
of limited benefit since there is only one regulator to enable.
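The idea of doing the reference count update and the clock enable under
one lock can be sketched in userspace as below. The structure and function
names are hypothetical, and a C11 atomic_flag stands in for the kernel
spinlock:

```c
#include <stdatomic.h>

struct fake_smmu {
	atomic_flag lock;	/* stand-in for a spinlock */
	int clock_ref_count;
	int clocks_on;		/* models the hardware clock state */
};

static void fake_lock(struct fake_smmu *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock,
						 memory_order_acquire))
		;	/* spin */
}

static void fake_unlock(struct fake_smmu *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* The count check and the enable happen inside the same critical section. */
static void enable_clocks(struct fake_smmu *s)
{
	fake_lock(s);
	if (s->clock_ref_count++ == 0)
		s->clocks_on = 1;	/* first user turns the clocks on */
	fake_unlock(s);
}

static void disable_clocks(struct fake_smmu *s)
{
	fake_lock(s);
	if (--s->clock_ref_count == 0)
		s->clocks_on = 0;	/* last user turns the clocks off */
	fake_unlock(s);
}
```

Because a second caller cannot observe a non-zero count until the first
caller has finished enabling the clocks, it cannot return early and touch
registers while the clocks are still off.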
Change-Id: I7215bbf9157907fde24c94841e347370769423c8
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Because the ARM SMMU driver assigns context banks dynamically,
some drivers need a way to know which one they are using.
Change-Id: Ic0dedbad4327ef86c5a893a48b57f0f9417800e9
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
After attaching a domain, this attribute may be
queried to determine which hardware context bank
was assigned.
Change-Id: I31e674672041103007fcaff3f83a0cc2c33a4a6d
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Some attributes are changed during domain attach and detach,
so hold init_mutex to ensure consistency.
Change-Id: I450a9a2da4bfe3616ef6dd0a6426271d25c292ce
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Currently we're leaving domains half-initialized after a
partially-successful attach. Fix this by destroying the
domain in the error path.
Change-Id: I36c529ed4974c01fba96088b6f57a8e82b350252
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
The `lock' field of struct arm_smmu_domain was replaced by `init_mutex'
in 9725ec12d27e215 (iommu/arm-smmu: re-use the init_mutex for protecting
smmu_domain.smmu), but the `lock' field itself was not deleted. It's
not meant to be used anymore, so delete it. Some usages of the crufty
lock have also crept in, so fix those as well.
Change-Id: I33c2f83e7b15f0ec2cb08c784a84991a7c57950f
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
removed-dma-pool compatibility creates a carve-out region;
add documentation for it.
Change-Id: I6c2cd9a9dac4afab106b0d4d49db4cd51c63fbf7
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
For secure domains, the page tables need to be assigned
to the correct VMIDs. Add support for doing the assignment.
Change-Id: I60009ef96ae1c8965d57096b6a1b0658ae6acc9a
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
The hyp_assign_phys() api can be called by different
usecases where it is not guaranteed that the source vm is
always VMID_HLOS.
Pass the responsibility of setting the source_vm to
caller of the function.
Change-Id: I3851a6681f49d4bb6fa5b7a889a16a158497e9e6
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
This reverts commit 0c78cf6e138f ("iommu: io-pgtable-arm: set page
tables as outer shareable"). We actually don't want outer-shareable
since we'd like to disable coherent table walking.
Change-Id: Id38e931864c4c1a0d77bb06d0da231b546bedf6d
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Fill out the pagefault flags more fully when calling
client fault handlers so that clients do not have to
read hardware state themselves. Also, respect the -EBUSY
return code from the client by not clearing the FSR or
resuming/terminating the stalled transaction.
Change-Id: I03a2546e8f90a1fa937ccd31bdd062fa05d76adb
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
On some platforms, certain IOMMU hardware state
might be cleared before reaching client fault handlers,
making it difficult for client fault handlers to do much
useful processing. Add some flags so that this information
can be passed to client fault handlers directly, rather
than expecting clients to read the hardware themselves.
Also provide a mechanism for client fault handlers to
request that the IOMMU driver not clear the fault or
resume/retry the faulting transaction. The client fault
handler can return -EBUSY to request this behavior.
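A rough userspace sketch of the -EBUSY convention follows. The context
bank model, the handler signature, and all function names are hypothetical
simplifications and do not match the real iommu fault handler prototype:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical context-bank state; fsr != 0 means a fault is pending. */
struct fake_cb {
	uint32_t fsr;
	int resumed;
};

typedef int (*fault_handler_t)(uint64_t iova, int flags);

/* Driver side: only clear/resume when the client did not return -EBUSY. */
static int handle_context_fault(struct fake_cb *cb, fault_handler_t handler,
				uint64_t iova, int flags)
{
	int ret = handler(iova, flags);

	if (ret == -EBUSY)
		return ret;	/* leave FSR set and the transaction stalled */

	cb->fsr = 0;
	cb->resumed = 1;
	return ret;
}

/* Client that wants to inspect the stalled hardware state itself. */
static int busy_handler(uint64_t iova, int flags)
{
	(void)iova;
	(void)flags;
	return -EBUSY;
}

/* Client that lets the driver clear the fault and resume as usual. */
static int ok_handler(uint64_t iova, int flags)
{
	(void)iova;
	(void)flags;
	return 0;
}
```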
Change-Id: I9780beb52b4257fff99d708a493173c9fe0a9d8a
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
Drop the upstream iommu implementation, as it is very old and has
conflicting file names that need to be replaced by the internal
one.
Change-Id: I7b61e6b5d2ce2a47b2b13c71c321dea62be940a0
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
When a context fault occurs, it can be useful for debugging to know the
stream ID of the faulting transaction. This information is available in
the CBFRSYNRAn register. Read and print the SID value when a context
fault occurs.
Change-Id: If8b47b801bc72a053b1198767de58799606ca626
Signed-off-by: Shalaj Jain <shalajj@codeaurora.org>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Use reference counting when enabling and disabling clocks and
regulators in order to improve performance.
Change-Id: I89a3eec17fd551f0625da8a504634f0df311d64f
Signed-off-by: Liam Mark <lmark@codeaurora.org>
This reverts commit 713d52a0acca ("iommu: io-pgtable-arm: flush tlb for
stale mappings"), which was a workaround for some other bugs in the page
table mapping code. Those bugs have since been fixed, so the workaround
is no longer needed.
Change-Id: Ic699328dd60baffd1c6080e6b0d9b2fb0feea831
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
We don't need to enable clocks during map because we don't need to do
anything through hardware (unlike unmap, which needs to do TLB
invalidation). We had to enable clocks at one point in order to enable
a workaround for some software bugs in the page table code. These
workarounds are no longer present, so we don't need to enable clocks.
Rip out the clock/regulator enablement.
This seems to improve the performance of iommu_map by several orders of
magnitude. The performance impact on iommu_map_sg is smaller, maybe a
percent or two.
Change-Id: Iddf530bc35f96840413a5c46ad9ead5334b9abd1
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
arm_lpae_range_has_mapping currently checks if there are any
non-mapped slots in a given iova range, but it's really meant to check
if there is *any* mapping whatsoever in a given iova range. Fix this.
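The intended "any mapping at all" semantics can be illustrated with a flat
array standing in for the page table. This is a simplified sketch with a
hypothetical helper, not the multi-level walk done by the real function:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat page table: a PTE of 0 means "not mapped". */
static bool range_has_mapping(const uint64_t *ptes, size_t start, size_t count)
{
	size_t i;

	for (i = start; i < start + count; i++) {
		if (ptes[i] != 0)
			return true;	/* a single mapped slot is enough */
	}
	return false;
}
```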
Change-Id: I90e426ab157cc194328b754ac5021051ac883603
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
We're currently suppressing all map failures during the entirety of the
selftests. We really only want to suppress those failures during
individual negative map test cases to avoid logspam, but we *do* want to
see other map failures during the selftests. Fix this by only
suppressing map failures during negative map test cases.
Change-Id: If51a95dd4d8c5b756cfa4597a5bdd7c75afe2637
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
There can be page table bugs when both block and page mappings are used
to make up the mapping for a single VA range. Add a test case for this
to the selftests.
Change-Id: Ic2b943dd74f1ed2ed1e5e03832742f0e6deff58e
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Currently we have an optimization in place for unmapping the last level
of the page tables. We do this by memset()'ing the entire last level at
once rather than calling unmap on each individual page mapping at the
last level. For this to work we have to pass in sizes that aren't equal
to any of the supported IOMMU page sizes. However, our optimization
only applies at the last level. Unmapping at the other levels still
relies on the fact that unmap is only called with supported IOMMU page
sizes, which means it's currently broken.
Fix this by always calling unmap with an IOMMU page size, unless we're
at the last level of the page tables (i.e. the size to be unmapped is
less than the block size at the second-to-last level), in which case we
can pass in the entire remaining size.
Change-Id: Ie3716002c793af3dca51e0e3363d261f345e9e25
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
arm_smmu_enable_regulators also prepares all of our clocks (similarly
for arm_smmu_disable_regulators), and is always called from
arm_smmu_enable_clocks. arm_smmu_enable_clocks also prepares our
clocks, so clocks are being prepared twice, which is once more than we
need. Fix this by enabling (not preparing) clocks in
arm_smmu_enable_clocks, relying on arm_smmu_enable_regulators to prepare
the clocks beforehand.
Change-Id: Id07848f64a81522e27198d6e708159941b07d444
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
When we swapped out the page table code we also lost our change to use
the NOSIGN SEP value for all SMMUs. As noted in the original change,
this is needed for correct functionality of the SMMU. Restore this
change.
Change-Id: I0154003de92f59172a7c1e49aa68c387e87e2aa1
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
In atomic context, gen_pool_alloc allocates a single page large
enough to accommodate the requested size. However __iommu_create_mapping
always maps pages assuming they are of size 4K. Thus only the first
4K of the buffer is mapped and a translation fault is generated
during an unmap.
Fix this by splitting the larger pages into 4K pages.
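The splitting step can be sketched in userspace as below. The helper name
and its array-based interface are assumptions for illustration; the kernel
code works with struct page pointers rather than raw addresses:

```c
#include <stddef.h>
#include <stdint.h>

#define SZ_4K 4096u

/*
 * Fill 'pages' with the address of each 4K chunk of a physically
 * contiguous buffer. Returns the number of entries written, or 0 if
 * 'pages' is too small to hold them.
 */
static size_t split_into_4k(uint64_t base, size_t size,
			    uint64_t *pages, size_t max_pages)
{
	size_t n = (size + SZ_4K - 1) / SZ_4K;
	size_t i;

	if (n > max_pages)
		return 0;
	for (i = 0; i < n; i++)
		pages[i] = base + (uint64_t)i * SZ_4K;
	return n;
}
```

Mapping each of the resulting entries as a 4K page covers the whole
buffer instead of only its first 4K.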
Change-Id: Ifcbe29477ad210204028486bd011470fe8b50852
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
[pdaly@codeaurora.org Keep upstream version of alloc_from_pool]
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
The return value -EINVAL is returned in the case of invalid
arguments, and is not the correct value when the function is not
implemented.
Return -ENOSYS instead.
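The distinction between the two error codes can be shown with a small
userspace sketch. The ops table and function names here are hypothetical
and are not the functions touched by this change:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; a NULL callback means "not implemented". */
struct fake_ops {
	int (*do_op)(int arg);
};

static int call_op(const struct fake_ops *ops, int arg)
{
	if (!ops || !ops->do_op)
		return -ENOSYS;	/* function not implemented */
	if (arg < 0)
		return -EINVAL;	/* caller passed a bad argument */
	return ops->do_op(arg);
}

/* Example callback used to exercise the implemented path. */
static int double_it(int arg)
{
	return 2 * arg;
}
```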
Change-Id: I196537f121d5a290fec74e2b7bcb1cfd490468c7
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
[pdaly@codeaurora.org Resolve minor conflicts]
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>