CP_QUEUE_THRESHOLDS is only used on A3XX. Move the register setting
out of the common ringbuffer initialization and into the A3XX-specific
region.
CRs-Fixed: 971153
Change-Id: I05ef504a802534f1582e62085c5b12b20ac57209
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Ensure that all the logs that can be triggered from the
interrupt handler are rate limited.
CRs-Fixed: 971145
Change-Id: I9fe4a6b28be0dc6299467fb8402bef3694aeac76
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
The start offset for protected mode ranges needs to be aligned with
the block size. 0xE87 is not aligned with 16 (1 << 4). The hardware
assumes alignment internally so it turns out that 0xE80 - 0xE8F is
the range that gets protected. Luckily for us, this is the range
we want protected so nothing critical has been left unprotected, but
the software should reflect the hardware to prevent incorrect
assumptions.
CRs-Fixed: 968713
Change-Id: Ic0dedbad6ec7be5cc473afbbc52655663ea65159
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
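The alignment arithmetic behind the protected range change above can be
checked with a minimal sketch. The helper names are hypothetical and not
taken from the driver; the only assumption is that the hardware masks off
the low bits of the start offset for a block of size (1 << mask_len).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round a register offset down to the start of the
 * protect block that contains it, for a block of (1 << mask_len) registers.
 */
static inline uint32_t protect_block_start(uint32_t offset, unsigned int mask_len)
{
	assert(mask_len < 32);
	return offset & ~((1u << mask_len) - 1u);
}

/* Hypothetical helper: last register offset covered by that block. */
static inline uint32_t protect_block_end(uint32_t offset, unsigned int mask_len)
{
	return protect_block_start(offset, mask_len) + (1u << mask_len) - 1u;
}
```

With offset 0xE87 and mask_len 4 this gives 0xE80 through 0xE8F, matching
the range the commit message says the hardware really protects.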
In the LLM sleep sequence, the IDLE_FULL_LM bit (0) needs to be set
to force the children to sleep. Also, after the first child is
put to sleep, we need to wait for the idle acknowledgment before
taking the next child down.
In the wake sequence, poll until WAKEUP_ACK is 1 *and* IDLE_FULL_ACK
is 0 to ensure that the wake sequence was successful.
CRs-Fixed: 970270
Change-Id: Ic0dedbadfca1e0882d84965d634166f921f1e630
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The current MMU code assumes a binary state - either there is an
IOMMU or there isn't. This precludes other memory models and
makes for a lot of inherent IOMMU knowledge in the generic MMU
code and the rest of the driver. Reorganize and clean up the
MMU and IOMMU code:
* Add a Kconfig boolean dependent on ARM and/or MSM SMMU support.
* Make "nommu" mode an actual MMU subtype and figure out available
MMU subtypes at probe time.
* Move IOMMU device tree parsing to the IOMMU code.
* Move the MMU subtype private structures into struct kgsl_mmu.
* Move adreno_iommu specific functions out of other generic
adreno code.
* Move A4XX specific preemption code out of the ringbuffer code.
CRs-Fixed: 970264
Change-Id: Ic0dedbad1293a1d129b7c4ed1105d684ca84d97f
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The ARM driver supports only one bus for all context banks.
In some cases the hypervisor may not be available and the GPU SMMU
uses the ARM driver. This puts all context banks on the
non-secure bus, and kgsl_mmu_bus_secured() returns -EPERM.
Mark platform_bus_type as secure for the ARM driver.
Change-Id: I11a637ca2b1ef29cc42c9811cad009312a2879cd
Signed-off-by: Rajesh Kemisetti <rajeshk@codeaurora.org>
There can only be one module_init() function per module. Move all
three driver register calls into the same initialization function. The
ordering should still work correctly.
Change-Id: Ic0dedbadf7c69221a836ba3bbba362d0660f1f0f
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Add a helper macro to convert an adreno_device pointer to a
struct kgsl_device pointer. This is mostly syntactic sugar
but it makes the code a bit cleaner and it abstracts a bit of
the ugliness away.
Change-Id: Ic0dedbadd97bda3316a58514a5a64757bd4154c7
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The ringbuffer structures are static members of struct adreno_device
which means that they are permanently associated with a specific
adreno device and by extension a struct kgsl_device too. The upshot
is that we can use macro math to derive the adreno device from
a ringbuffer pointer and get rid of the device shortcut in the
ringbuffer struct. This also gives us a chance to clean up
how functions use the ringbuffer and adreno_device structs
to limit unnecessary dereferencing.
Change-Id: Ic0dedbad909ef71e99cd3319713cee38fb1700f0
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
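The "macro math" mentioned in the ringbuffer change above is the usual
container_of pattern. The following is a simplified userspace sketch, not
the driver's definitions: the struct layouts, NUM_RB_SKETCH and the
RB_TO_ADRENO name are assumptions made for illustration, and container_of
is redefined locally because the kernel header is not available here.

```c
#include <stddef.h>

/* Local stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define NUM_RB_SKETCH 4 /* assumed ringbuffer count, illustration only */

struct kgsl_device { int id; };
struct adreno_ringbuffer { unsigned int id; /* index into ringbuffers[] */ };

struct adreno_device {
	struct kgsl_device dev; /* embedded, not a pointer */
	struct adreno_ringbuffer ringbuffers[NUM_RB_SKETCH];
};

/* adreno_device pointer -> kgsl_device pointer */
#define KGSL_DEVICE(_dev) (&((_dev)->dev))

/*
 * Ringbuffer pointer -> owning adreno_device: step back to element 0 of
 * the array using the ringbuffer's own index, then subtract the offset of
 * the array inside struct adreno_device.
 */
#define RB_TO_ADRENO(_rb) \
	container_of((_rb) - (_rb)->id, struct adreno_device, ringbuffers)
```

Because the ringbuffers are embedded in the device structure, no back
pointer needs to be stored in each ringbuffer.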
Currently the VBIF XIN register offset is being overwritten by
the AXI offset. This causes a VBIF XIN halt timeout in the
VBIF clear transaction path. Fix this by using the proper
VBIF XIN offset for A3xx targets.
Change-Id: Iac20528cb105904e46e012d67287dd736fa11f70
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
Use bit 5 in the CP_INIT_MASK to properly enable/disable
microcode workarounds.
Change-Id: I9f43c8c988c3179b3de2cce071339bc565b4a00d
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
The snapshot mempool size takes into account the memory
required for section headers. It was being calculated based
on the old header structure. Update that to avoid corruption/
buffer overflow of the mempool memory.
Change-Id: I07274934e4c0dced707e03be3e31b2459e00d706
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Enable CP to process yield packets placed in the IB2s.
Change-Id: I2fadfb108a2dc42f574b3f6ed2e667baddb7889c
Signed-off-by: Jonathan Wicks <jwicks@codeaurora.org>
CP_CACHE_FLUSH interrupts can storm on very rare occasions.
Check for this interrupt storm and do nothing when it occurs
rather than thrashing the CPU which can occasionally bring the
system down.
Change-Id: I0528ad4fec43abfaeeba1499d0b0e51e14b09f0d
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
During snapshot, we may encounter an invalid IB1 base which
is not found in the current rb. In that case, dump the entire
ringbuffer from start to end and all the IBs in it to get a
more complete picture of the failure.
Change-Id: I4393c7de6f8f4890870fa6e2b5e69073dce922b7
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
For proper memory accounting, a key metric is to know how much
memory kgsl allocated for a process and how much of it the process
is actually using. This is done by keeping track of memory in our
vmfault routines. This information is provided via the process
mem file.
Change-Id: I7e3371a708ea5fdade3840b2384b3bc4012ad004
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Global pagetable entries are exclusively for IOMMU and per-process
pagetables. Move all the code out of the generic driver and into
the IOMMU driver and clean up a bunch of stuff along the way.
Change-Id: Ic0dedbadbb368bb2a289ba4393f729d7e6066a17
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The setstate memory is an IOMMU specific construct. Move it to the
IOMMU code where it belongs.
Change-Id: Ic0dedbada977f2861f7c1300a5365da5b09d70a9
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
A freemem list entry may be truncated/deleted even if it doesn't
overlap with a new buffer, for example, if the new buffer is entirely
to the left or right of the entry. Fix the conditional logic so
that an entry may be truncated/deleted only if it overlaps with
the new buffer.
Change-Id: Ib1519f20d3b56c1e0ed36e9e0afb33c1b31d6166
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
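The overlap condition described in the freemem change above can be
sketched as a single predicate. This is an illustration under the
assumption that entries and buffers are half-open ranges
[addr, addr + size); the function name is hypothetical and the driver's
own check may be written differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Two half-open ranges overlap only if each one starts before the other
 * ends. A buffer entirely to the left or right of the entry fails one of
 * the two comparisons, so the entry is left untouched.
 * Assumes addr + size does not wrap around.
 */
static bool ranges_overlap(uint64_t entry_addr, uint64_t entry_size,
			   uint64_t buf_addr, uint64_t buf_size)
{
	return entry_addr < buf_addr + buf_size &&
	       buf_addr < entry_addr + entry_size;
}
```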
When a newly allocated gpuaddr falls between a memfree
entry's gpuaddr and the end of that entry (gpuaddr + size),
the size of the mem entry was being truncated to a negative
value. Fix the math to reflect the truncated size correctly.
Change-Id: Id39519acc2af106240db8f41539b9fd1dc0cb0eb
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
All the mem entries were being written to a single location hence
the snapshot consisted of only the last mem entry of the process.
Fix this by writing each mem entry to consecutive locations in the
snapshot.
Change-Id: I1971fc4b3adce3146768862a56db2b11c6ac44c4
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
_gpu_find_svm() triggers a BUG_ON if the returned address is
greater than ULONG_MAX.
On a 32-bit kernel, however, error conditions also make the
comparison true, because the address or error is held in a
uint64_t and compared with ULONG_MAX, which is 32 bits wide.
Check whether the returned address is an error before the BUG_ON.
Change-Id: I482b330db3e06a1bee31dd6931faf239a61f9ab8
Signed-off-by: Rajesh Kemisetti <rajeshk@codeaurora.org>
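The 32-bit failure mode in the _gpu_find_svm() change above can be
reproduced in a small userspace sketch. All names here are hypothetical;
ULONG_MAX_32 stands in for ULONG_MAX on a 32-bit kernel and
MAX_ERRNO_SKETCH mirrors the kernel convention that the top 4095 values
encode errors.

```c
#include <stdbool.h>
#include <stdint.h>

#define ULONG_MAX_32 UINT32_MAX /* ULONG_MAX as seen by a 32-bit kernel */
#define MAX_ERRNO_SKETCH 4095   /* assumed to match the kernel's MAX_ERRNO */

/* True if the 64-bit value is a negative errno stored as unsigned. */
static bool is_err_value(uint64_t v)
{
	return v >= (uint64_t)(-(int64_t)MAX_ERRNO_SKETCH);
}

/* Old check: an error such as -ENOMEM is a huge uint64_t, so this fires. */
static bool would_bug_old(uint64_t addr)
{
	return addr > ULONG_MAX_32;
}

/* Fixed check: filter out error values before the range comparison. */
static bool would_bug_new(uint64_t addr)
{
	return !is_err_value(addr) && addr > ULONG_MAX_32;
}
```

An error return of -12 stored in a uint64_t trips the old check but not
the new one, while a real address above 4 GB still trips both.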
Make sure the allocated memory is freed before returning.
Change-Id: I6da7d1ffbd83ad206970e38ac99f9da211ffe86c
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Limits management can force clock throttling which results in incorrect
busy percentage calculation. We need to account for that in DCVS.
Change-Id: Iaa17a7f7d661ab9966597f7710374d5b2e00d514
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
Make the various timeout values HZ agnostic by using the proper
macros and values instead.
Change-Id: I708cd491f593782f0172cd7d2cca058cd41044a5
Signed-off-by: Suman Tatiraju <sumant@codeaurora.org>
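To illustrate why the timeout change above matters: a timeout written as
a raw tick count means a different wall-clock time for every HZ setting.
The sketch below models the conversion in userspace; ms_to_ticks() is a
hypothetical stand-in for a kernel helper such as msecs_to_jiffies(), not
its implementation.

```c
#include <stdint.h>

/*
 * Hypothetical model of a milliseconds-to-ticks conversion for a given
 * tick rate, rounding up so that a timeout is never shorter than asked.
 */
static uint64_t ms_to_ticks(uint64_t ms, uint64_t hz)
{
	return (ms * hz + 999) / 1000;
}
```

A hardcoded value of 10 ticks is 100 ms at HZ=100 but only 10 ms at
HZ=1000, whereas ms_to_ticks(100, hz) keeps the same duration at any rate.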
_get_unmapped_area_topdown() subtracts the requested size
from lower entry base without really checking its value.
This leads to overflow while working at boundary conditions.
Add a condition to check entry base with the size and proceed.
Change-Id: Ic695da683b11de35c7c4b8936a35d693dc8fa452
Signed-off-by: Rajesh Kemisetti <rajeshk@codeaurora.org>
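The boundary condition in the _get_unmapped_area_topdown() change above
is an unsigned underflow. A minimal sketch of the added check, with a
hypothetical function name and signature, could look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Try to place a region of 'size' bytes directly below 'entry_base'.
 * Without the comparison, entry_base - size wraps around to a very large
 * address whenever size is bigger than entry_base.
 */
static bool fits_below(uint64_t entry_base, uint64_t size, uint64_t *start)
{
	if (entry_base < size)
		return false; /* subtraction would underflow */

	*start = entry_base - size;
	return true;
}
```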
Dispatcher can acquire drawctxt->lock if context is pending
and the fence it is waiting on just got signalled.
Dispatcher acquires drawctxt->lock and tries to delete the
cmdbatch timer using del_timer_sync(). del_timer_sync()
waits till the timer and its pending handlers are deleted.
But if the timer expires at the same time, timer handler
could be waiting on drawctxt->lock leading to a
deadlock. To prevent this use spin_trylock_bh() instead of
spin_lock_bh(). spin_trylock_bh() does not wait for the lock
if it does not get it and allows the timer handler to finish.
This prevents the deadlock.
Change-Id: Ic2344fed5fccb581b58ec0b66b45ba68af9f1459
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
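The trylock idea in the deadlock fix above can be modelled in userspace.
This is a sketch only: the lock is a C11 atomic_flag standing in for
drawctxt->lock, and every name in it is hypothetical. The point is that
the timer handler gives up when the lock is held, so whoever holds the
lock and is waiting for the handler to finish is never blocked by it.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_lock { atomic_flag held; };

/* Returns true if the lock was taken, false immediately if it is busy. */
static bool sketch_trylock(struct sketch_lock *lock)
{
	return !atomic_flag_test_and_set_explicit(&lock->held,
						  memory_order_acquire);
}

static void sketch_unlock(struct sketch_lock *lock)
{
	atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

/*
 * Stand-in for the timer handler: do the work only if the lock is free.
 * If the lock is held (for example by a caller that is synchronously
 * cancelling this timer), return without waiting.
 */
static bool timer_handler_sketch(struct sketch_lock *lock, int *work_done)
{
	if (!sketch_trylock(lock))
		return false;

	(*work_done)++;
	sketch_unlock(lock);
	return true;
}
```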
Add ISENSE based limit management, provide interfaces to GPMU
and hardware LLM and BCL subsystems.
Change-Id: Ic0419509bdc6d4d9d478277cc90ae75dc527ca66
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
Add support for the A540 GPU device:
* Add entry to the GPU list and add adreno_is_a540() functions
* Add VBIF settings
* Add hardware clock gating values
Change-Id: Ibd653597400ded01ca05607fbbdafea3e86e177f
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
mmapsize is no longer important to the memory descriptor
and the APIs that would use it never materialized. It currently
just tracks the size of the memdesc and is no longer needed.
Change-Id: I8fa1001c2f89f23034029de7de6ab77532bf45fa
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Probably overkill, but ensure that the struct pointer we are going
to dereference to send into kref calls is valid before we dereference.
Change-Id: I308176df9f7476a2a9f1357612381a93160ad698
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Alignment checks only need to be done once and can be moved down to
the lower layers.
Change-Id: Ia4683cf9db08506db810e80854c006d94dc80310
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
The code has gotten bloated and repeats the same logic in several
places leading to very stringy code. Consolidate this code to
enable easier readability as well as prep the code for future
changes to this area.
Change-Id: Ibb70cbae3a8a5157e589020ccebefff11b6ffaf1
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Setting the GPU to 64bit when the rest of the world is in 32bit can
make the GPU misbehave. Hence, check the kernel configuration
before actually moving the GPU to 64bit mode.
Change-Id: Ie4cf6c2d4fdfa978287c86812bdce4bf8c56ef5f
Signed-off-by: Rajesh Kemisetti <rajeshk@codeaurora.org>
Deep nap removes the quality of service latency vote. Restore device
before powering back the GPU while coming out of deep nap.
Change-Id: I9366ffa6f5f2768cb3ea10f9117678ba8cf8d190
Signed-off-by: Prakash Kamliya <pkamliya@codeaurora.org>
Signed-off-by: Divya Ponnusamy <pdivya@codeaurora.org>
struct kgsl_mmu is a static member of struct kgsl_device so we can
use the usual container_of trick to get the device from an mmu
pointer rather than carry around an unneeded back reference.
Change-Id: Ic0dedbad7ff22e598b03d980dfbb738374ed5a7a
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The MMU code does most of its magic by way of device specific MMU
and pagetable functions. Add macros to make it easier for developers
to verify that hooks exist before calling them.
Change-Id: Ic0dedbadf74682adebec1a973384e1d3bbf4f79e
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
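A hook-validity macro of the kind described in the MMU change above might
look like the following sketch. The struct layouts and the _sketch names
are assumptions for illustration, and the driver's real macros may differ
in name and shape.

```c
#include <stddef.h>

struct mmu_sketch;

/* Per-subtype hooks; any of them may be left NULL. */
struct mmu_ops_sketch {
	int (*start)(struct mmu_sketch *mmu);
	void (*stop)(struct mmu_sketch *mmu);
};

struct mmu_sketch {
	const struct mmu_ops_sketch *ops;
	int started;
};

/* True only if the mmu, its ops table and the named hook all exist. */
#define MMU_OP_VALID(_mmu, _field) \
	((_mmu) != NULL && (_mmu)->ops != NULL && (_mmu)->ops->_field != NULL)

/* Generic wrapper: call the hook if present, otherwise succeed quietly. */
static int mmu_start_sketch(struct mmu_sketch *mmu)
{
	if (MMU_OP_VALID(mmu, start))
		return mmu->ops->start(mmu);
	return 0;
}

/* Example hook for one hypothetical MMU subtype. */
static int iommu_start_sketch(struct mmu_sketch *mmu)
{
	mmu->started = 1;
	return 0;
}
```

A subtype with no hooks, such as a "nommu" mode, then goes through the
same wrapper without any special casing at the call sites.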
a5xx_post_start() is currently only used for either an A530 workaround
OR preemption. If neither is in use then memory is allocated in
the ringbuffer for no reason and it confuses everybody.
Change-Id: Ic0dedbad7615ba0593da5eb701cc5943877883f4
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
If there are concurrent sysfs reads of snapshot binary
there can be a race condition where the snapshot data
is prematurely free'd by one reader while the other reader
is still reading it. Fix this by proper refcounting using
an atomic.
CRs-Fixed: 902816
Change-Id: I7a156c3a22f5475df0394ae30328d0fd6140f3da
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
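The reader refcounting described in the snapshot change above can be
sketched with a C11 atomic counter. The structure and function names are
hypothetical, and the sketch assumes every reader takes its reference
while another reference is still held, so the count cannot be raised
again after reaching zero.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct snapshot_sketch {
	atomic_int readers; /* number of references currently held */
	void *data;         /* snapshot buffer, freed by the last reader */
};

/* Take a reference before reading the snapshot data. */
static void snapshot_get(struct snapshot_sketch *s)
{
	atomic_fetch_add(&s->readers, 1);
}

/*
 * Drop a reference. Only the caller that moves the count from 1 to 0
 * frees the buffer, so a concurrent reader never sees it disappear.
 * Returns true if this call freed the data.
 */
static bool snapshot_put(struct snapshot_sketch *s)
{
	if (atomic_fetch_sub(&s->readers, 1) == 1) {
		free(s->data);
		s->data = NULL;
		return true;
	}
	return false;
}
```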
The existing timeout values for the various GPMU interactions seem
to have been a tad optimistic for all conditions. Increase them to
cover measured worst-case scenarios.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Make several changes to build the GPU driver for 4.4:
- Rename CONFIG_MSM to CONFIG_QCOM where applicable
- Add msm_kgsl.h to the Kbuild exports
- Remove linux/coresight_of.h (as it has been merged into
coresight.h) and remove the .owner member of the
coresight_desc struct.
- Use the new location for the sync.h file (in staging)
- Remove an unused sync function
- Move oneshot_sync.h inside of #ifdef wrappers
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Snapshot of the Qualcomm GPU devfreq governors and support
as of msm-3.18 commit e70ad0cd5efd
("Promotion of kernel.lnx.3.18-151201.").
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Snapshot of the Qualcomm Adreno GPU driver (KGSL) as of msm-3.18
commit e70ad0cd5efd ("Promotion of kernel.lnx.3.18-151201.").
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>