Unload the zap shader during device hibernation and reload it
during resume. Otherwise, SCM calls made during post-hibernation
GPU initialization can fail, because the TZ driver may not be
aware of the hibernation.
Change-Id: I1f62fb97cbc8e6c3e3536d4d5260a543ca15b685
Signed-off-by: Suprith Malligere Shankaregowda <supgow@codeaurora.org>
Signed-off-by: Thomas Yun <wyun@codeaurora.org>
This patch ensures the device resumes successfully after
XO shutdown without any KGSL errors.
Change-Id: I9eb8e281bc62793dc7521ba72aaeecf946860851
Signed-off-by: Suprith Malligere Shankaregowda <supgow@codeaurora.org>
On a 64-bit kernel, a 32-bit user application is not
restricted to the 3GB limit of virtual memory. It is
allowed to access the complete 4GB range.
Move the global memory region to 0x100000000, outside of
the 32-bit range, on a 64-bit kernel to increase the virtual
memory range for a 32-bit application running on a
64-bit kernel. This will also move the secure memory
region to 0xF0000000.
Change-Id: I017ac0c052b4d9466f9f1a66af4a83f0636450cb
Signed-off-by: Deepak Kumar <dkumar@codeaurora.org>
Add a property to determine the current command timeout
value, for use by clients via the KGSL IOCTL.
Change-Id: Ifd6b373d211ebd78dc3a8032ede073258487d689
Signed-off-by: Sunil Khatri <sunilkh@codeaurora.org>
A3xx devices get the ring buffer read pointer directly
from the GPU registers, so don’t allocate scratch memory
that A3xx GPU devices can’t use.
Change-Id: I95016dfc169b9fee74e978f5560592740f34515e
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
Add a debug log to dump the GPU speed bin value in case probe
fails because the efused bin value does not match the speed bin value.
Change-Id: I329523f8dbb82272418981a54a1c2e6cf5e90b85
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
Add support for a devicetree "disable-wake-on-touch" property
to disable GPU wake up on touch input events. This will
help save power in case of unintended taps and swipes,
for example, when the screen is wet.
Change-Id: I35768dc78c473272472aaf9c0e09e66d75817b2c
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
Add a PM QOS request to disallow L2PC during wake up
from SLUMBER state. This is required to improve the queue
to submit time for the first set of GPU commands that
results in GPU wake up.
Change-Id: Iad1a6dfdf9e1fe034eef4fae526138d724bdd3eb
Signed-off-by: Gaurav Sonwani <gsonwani@codeaurora.org>
GPU perfcounters get reset after a soft reset. Currently the
GPU busy stats keep using the previous values after a soft reset.
This might lead to invalid GPU busy calculations. Start GPU
stats from scratch after a soft reset to get valid GPU
busy calculations.
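
The stale-sample problem can be sketched as follows. This is only an
illustration under assumptions: the structure and function names are
hypothetical and not taken from the driver, and it assumes the busy
value is derived from deltas of a free-running counter.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for a free-running GPU busy counter. */
struct sketch_busy_stats {
	uint64_t last_sample;	/* raw counter value at the previous read */
	uint64_t busy_total;	/* accumulated busy cycles */
};

/* Fold a new raw counter reading into the running total. */
static void sketch_busy_update(struct sketch_busy_stats *s, uint64_t raw)
{
	s->busy_total += raw - s->last_sample;
	s->last_sample = raw;
}

/*
 * After a soft reset the hardware counter restarts from zero, so the
 * saved sample is dropped too. Otherwise the next delta would be
 * computed against a stale, larger value.
 */
static void sketch_busy_soft_reset(struct sketch_busy_stats *s)
{
	s->last_sample = 0;
}
```

Whether the accumulated total should also be cleared is a policy choice
this sketch does not settle; it only shows why the previous sample
cannot be reused.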
Change-Id: Ia38c18ad59f438d724ff4710ee2b350853b3810d
Signed-off-by: Abhilash Kumar <krabhi@codeaurora.org>
Memory retention is needed only for NAP state, not for SLUMBER state.
Disable memory retention for the core clock before entering SLUMBER to save
power.
Change-Id: I64a5ecec6fc90d662da8d9d793860e56b0c6473f
Signed-off-by: Deepak Kumar <dkumar@codeaurora.org>
The following changes have been made to improve soft fault detection.
They fix unclocked register access in dispatcher_do_fault()
and incorrect declaration of GPU soft faults.
i) Stop fault timer before entering NAP state
ii) Don’t start fault timer if the dispatcher inflight count is zero
iii) Add ringbuffer empty check in _isidle()
iv) Add device state check in dispatcher_do_fault()
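
Checks ii) to iv) can be sketched roughly as below. All types and
names here are hypothetical stand-ins, and treating "ringbuffer empty"
as read pointer equal to write pointer is an assumption. Item i) is
not modelled.

```c
#include <stdbool.h>

/* Hypothetical, simplified device model for illustration only. */
enum sketch_state { SKETCH_ACTIVE, SKETCH_NAP, SKETCH_SLUMBER };

struct sketch_dev {
	enum sketch_state state;
	unsigned int inflight;	/* commands queued to the GPU */
	unsigned int rb_rptr;	/* ringbuffer read pointer */
	unsigned int rb_wptr;	/* ringbuffer write pointer */
	bool hw_idle;		/* hardware status reports idle */
};

/* ii) Only arm the fault timer while work is in flight. */
static bool sketch_should_start_fault_timer(const struct sketch_dev *d)
{
	return d->inflight != 0;
}

/* iii) Idle requires an empty ringbuffer as well as idle hardware. */
static bool sketch_isidle(const struct sketch_dev *d)
{
	return d->hw_idle && d->rb_rptr == d->rb_wptr;
}

/* iv) Only touch GPU registers for fault checks while ACTIVE (clocked). */
static bool sketch_may_check_fault(const struct sketch_dev *d)
{
	return d->state == SKETCH_ACTIVE;
}
```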
CRs-Fixed: 2012731
Change-Id: I5ce498029f389eeeb428b4ac7fb07afd84d5764c
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
Map the GPU QTimer area as a global into the GPU
IOMMU so that the GPU can access the QTimer.
Change-Id: If50bd36681123adde7e3a37644c41316f101154c
Signed-off-by: Jonathan Wicks <jwicks@codeaurora.org>
The current irq handler clears the pending interrupt bits in the interrupt
status register before serving the interrupts. This leads to a race
condition with the idle check, which reads the interrupt status
register to determine whether any interrupt is pending. As
the interrupt status register is already cleared, the idle check goes
ahead and switches off the GPU clocks even when an irq is yet to be served,
causing NOC errors.
This change refcounts each irq handler call and uses this reference
count to determine if any irq is still pending or not along with
interrupt status register to avoid this race condition.
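
A minimal user-space sketch of the refcounting idea, using C11 atomics
in place of kernel primitives. The structure, field and function names
are hypothetical, and the status register is modelled as a plain
atomic variable.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: int_status stands in for the status register. */
struct sketch_gpu {
	atomic_int irq_active;		/* handlers currently running */
	_Atomic uint32_t int_status;	/* pending interrupt bits */
};

/* Handler entry: take a reference, then read and clear the status bits. */
static uint32_t sketch_irq_enter(struct sketch_gpu *g)
{
	atomic_fetch_add(&g->irq_active, 1);
	return atomic_exchange(&g->int_status, 0);
}

/* Handler exit: drop the reference once the bits have been serviced. */
static void sketch_irq_exit(struct sketch_gpu *g)
{
	atomic_fetch_sub(&g->irq_active, 1);
}

/* Idle check: work is pending if a handler is running or bits are set. */
static bool sketch_irq_pending(struct sketch_gpu *g)
{
	return atomic_load(&g->irq_active) > 0 ||
	       atomic_load(&g->int_status) != 0;
}
```

Between sketch_irq_enter() and sketch_irq_exit() the status bits are
already zero, yet sketch_irq_pending() still reports true, which is
the window the reference count is meant to cover.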
Change-Id: I030d52c52055f836ea4c7519ce2d8db94a2a09a0
Signed-off-by: Deepak Kumar <dkumar@codeaurora.org>
Add a quirk to set LMLOADKILLDIS bit in A5XX_VPC_DBG_ECO_CNTL
and clear LMLOADKILLDIS bit in A5XX_HLSQ_DBG_ECO_CNTL registers.
This is done to avoid a VPC corner case with local memory (LM)
which leads to corrupt internal state on A540 and its derivatives.
CRs-Fixed: 1036444
Change-Id: I31008433f19924bb35560d3e35fe0665e73751d5
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
When programming perfcounter via gpu commands, we may encounter
-EAGAIN because of cancelling rb events either due to soft reset
or when powering down the device. Ignore this error because we
have already set up the perfcounter in software and it will be
programmed in hardware by adreno_perfcounter_restore when gpu
comes back up.
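
The error handling amounts to something like the following sketch. The
helper name is hypothetical; only the -EAGAIN behaviour comes from the
description above.

```c
#include <errno.h>

/*
 * Hypothetical helper: filter the result of programming a perfcounter
 * through GPU commands. -EAGAIN means the ringbuffer event was
 * cancelled (soft reset or power down); the software state is already
 * set up and the hardware is reprogrammed on restore, so treat it as
 * success. Every other error is passed through unchanged.
 */
static int sketch_filter_perfcounter_err(int ret)
{
	return ret == -EAGAIN ? 0 : ret;
}
```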
CRs-Fixed: 1024199
Change-Id: I5dc3561d15fa50ac58646f96559cfd262020dda9
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
The readl/writel macros expect void pointers, so declare the
addresses as void pointers and not unsigned int.
Change-Id: I67cf15fa918832ebab56cb999265d02880682c5e
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Add a sysfs entry to enable control of notifications
from pwrscale to devfreq.
Change-Id: Ife0a31e96975239bf4fefd59ac6266568c4db1a5
Signed-off-by: Jonathan Wicks <jwicks@codeaurora.org>
Sometimes an interrupt from GPU is ignored while we
are still executing the previous interrupt. In order
to service any interrupt that was fired while executing
the interrupt handler, clear the interrupt register
immediately.
Also, clear the ADRENO_INT_RBBM_AHB_ERROR bit only after
it has been serviced in its respective handler. This
will avoid firing the main interrupt handler a second
time.
CRs-Fixed: 1072781
Change-Id: Ie6b5a511f5b3077adae7d464de437f2aa893b0c9
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Move target specific initialization and setup into target
specific init functions. The change is required to port the GPU
driver to support future generation GPUs.
CRs-Fixed: 1053516
Change-Id: I808e247669fab61a6a64131858fe2f9e19754242
Signed-off-by: George Shen <sqiao@codeaurora.org>
A540 hardware does not support BCL and LMH after all.
CRs-Fixed: 1075694
Change-Id: I09808145d20ded63b5043cae6510429560cb599e
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
Currently the dispatcher accepts a kgsl_cmdbatch object. This object
is a superset of all the types of objects the dispatcher accepts.
Split the kgsl_cmdbatch object into SYNC and IB/MARKER objects and
structure the code to make it easier for new types of objects
to be added to the dispatcher queue.
CRs-Fixed: 1054354
Change-Id: I2d482d1081ce6fdb7925243c88ce00ea6b864efe
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Rename all cmdbatch to drawobj. This forms a platform
for future changes where cmdbatch is split into different
types of drawobjs.
CRs-Fixed: 1054353
Change-Id: Ib84bee679e859db34e0d1f8a0ac70319eabddf53
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Add new sysfs nodes which satisfy a generic format requested
by a customer. Also add a new node to track GPU temperature.
Create links to these nodes at a generic location:
/sys/kernel/gpu/
CRs-Fixed: 1064728
Change-Id: I414a07ff4f9ee14b8f882d15644b06a73d5fcf76
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Call clk_set_flag() to turn off both memory core and periphery for
bimc_gfx_clk clock and memory for gfx_3d.
CRs-Fixed: 1046649
Change-Id: I941f91eeba01f4e7aa5427056bc57875e7edf197
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
Log the nearby allocations for pagefaults on global buffers.
Print the names of the allocations that fall around the
faulting address on a global buffer. Also add a new debugfs
file to list all the global pagetable entries. Useful for
debugging pagefaults and other issues with "global" objects.
CRs-Fixed: 985631
Change-Id: Ifbbdc69044fc64d7ea02509bf8113ed94eeece1e
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
DEEP-NAP and SLEEP states are not used in targets of the previous
two generations. They save neither GPU power nor system
power. Remove them to reduce maintenance overhead.
CRs-Fixed: 1053516
Change-Id: If2fc2701548f90bb7ea9559a87752e13a7b0f736
Signed-off-by: George Shen <sqiao@codeaurora.org>
Disable the RB sampler data path DP2 clock gating optimization
for 1-SP A5XX GPUs. The optimization leads to a precision
difference during interpolation, which causes a rendering
difference between Binning and Direct rendering modes.
CRs-Fixed: 1040638
Change-Id: I40d1ce2f5db0ed75453feda5c31152f8201b8697
Signed-off-by: Sunil Khatri <sunilkh@codeaurora.org>
Scheduling issues were occurring with the GPU event worker after
b7be807 (msm: kgsl: Unbind the kgsl-event workqueue) was merged.
In certain conditions, it seems that the kgsl-event workqueue
was conflicting with the KGSL worker and slowing it down.
It turns out that we schedule the event worker and the
dispatcher worker at the same time everywhere. Since the worker
is single threaded, the event worker and the dispatcher run
synchronously anyway, so it makes sense to run the event processor
from within the dispatcher and save the extra schedule.
CRs-Fixed: 1043509
Change-Id: Ic0dedbad67eb04d41afb6add4477f146dfff9784
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
This change adds a check to avoid allocating memory for loading microcode
in case it is already allocated. This avoids memory allocation failure
for microcode during multiple tries by userspace to open the kgsl device
in case of errors.
CRs-Fixed: 1043490
Change-Id: I018ebdb0dab1fc13af8d85a273c1c8b477fa1e26
Signed-off-by: Deepak Kumar <dkumar@codeaurora.org>
Track GPU active time per frequency for GPU workload
profiling. The data will be output in
/sys/class/kgsl/kgsl-3d0/gpu_clock_stats
with one u64 value in microseconds per clock level.
For example:
cat /sys/class/kgsl/kgsl-3d0/gpu_clock_stats
39392 29292 929292 929292 4040404
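
The accounting behind such a node could look roughly like this sketch.
The names and the fixed level count are assumptions for illustration;
only the output format (one microsecond value per clock level) comes
from the description above.

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_NUM_LEVELS 5	/* hypothetical number of clock levels */

struct sketch_clk_stats {
	uint64_t active_us[SKETCH_NUM_LEVELS];	/* active time per level */
};

/* Charge elapsed active time to the level the GPU was running at. */
static void sketch_account(struct sketch_clk_stats *st, unsigned int level,
			   uint64_t elapsed_us)
{
	if (level < SKETCH_NUM_LEVELS)
		st->active_us[level] += elapsed_us;
}

/* Format one value per level, space separated, newline terminated. */
static int sketch_show(const struct sketch_clk_stats *st, char *buf, size_t len)
{
	size_t n = 0;

	for (unsigned int i = 0; i < SKETCH_NUM_LEVELS; i++) {
		int w = snprintf(buf + n, len - n, "%llu%c",
				 (unsigned long long)st->active_us[i],
				 i + 1 < SKETCH_NUM_LEVELS ? ' ' : '\n');
		if (w < 0 || (size_t)w >= len - n)
			return -1;	/* buffer too small */
		n += (size_t)w;
	}
	return (int)n;
}
```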
CRs-Fixed: 1011462
Change-Id: I5f2caa8b38d99ffd23f03c1dfed1efda273fc2fb
Signed-off-by: George Shen <sqiao@codeaurora.org>
Instead of trying to make a decision to switch out the active
draw context for NULL at detach time, leave the reference count
for it until the next context switch or until the next slumber,
whichever comes first. This avoids races with the preemption
code and ensures a smooth transition.
A side effect is that we were depending heavily on the context
detach to reset the ringbuffer to the default at power down and
we didn't touch it on power up (though we did on soft reset and
wake from slumber. Curious). Obviously if we are no longer
switching we will need to force the default pagetable during start
but it seems to me like this would be the right thing to do even
if we were still switching out.
CRs-Fixed: 1009124
Change-Id: Ic0dedbadff8df192096292b221130c8ef5b31e12
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The secure buffer registers were not being programmed in the soft
reset path which was causing a failure for the critical packets
workaround and forcing a hard reset.
CRs-Fixed: 1009194
Change-Id: Ic0dedbad998767a1ffdfe265e52fae7baa18d203
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Allow 5XX targets to preempt quickly from an atomic context. In
particular this allows quicker transition from a high priority
ringbuffer to a lower one without having to wait for the worker
to schedule.
CRs-Fixed: 1009124
Change-Id: Ic0dedbad01a31a5da2954b097cb6fa937d45ef5c
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The memstore shared between the CPU and GPU is old but can not be
messed with. Rather than stealing values from it where available,
add a new block of shared memory that is exclusive to the driver
and GPU. This block can be used more freely than the old
memstore block.
Program the GPU to write the RPTR out to an address the CPU can read rather
than having the CPU read a GPU register directly. There are some very
small but very real conditions where different blocks on the GPU have
outdated values for the RPTR. When scheduling preemption, the value read
from the register might not reflect the actual value of the RPTR in the CP.
This can cause the save/restore from preemption to give back incorrect RPTR
values causing much confusion between the GPU and CPU.
Remove the ringbuffer's copy of the read pointer shadow.
Now that the GPU will update a shared memory address with the
value of the read pointer, there is no need to poll the register
to get the value and then keep a local copy of it.
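
As a rough sketch of the resulting read path: the CPU reads the value
the GPU wrote to shared memory instead of polling a register. The
structure layout, names and ringbuffer count are hypothetical.

```c
#include <stdint.h>

#define SKETCH_NUM_RB 4	/* hypothetical ringbuffer count */

/* Hypothetical driver/GPU shared block; the GPU writes each RPTR here. */
struct sketch_scratch {
	volatile uint32_t rptr[SKETCH_NUM_RB];
};

/*
 * Return the read pointer for one ringbuffer straight from shared
 * memory. The caller must pass rb_id < SKETCH_NUM_RB. No register
 * read and no locally cached copy is involved.
 */
static uint32_t sketch_get_rptr(const struct sketch_scratch *s,
				unsigned int rb_id)
{
	return s->rptr[rb_id];
}
```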
CRs-Fixed: 987082
Change-Id: Ic44759d1a5c6e48b2f0f566ea8c153f01cf68279
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Map the GPU QDSS STM area as a global into the GPU
IOMMU so that GPU traces can be routed to QDSS.
Enable the gpuaddr and size of the area to be queried
from userspace.
CRs-Fixed: 1031648
Change-Id: I2e32522a42508a6bee088c95dc56a13935dd691c
Signed-off-by: Jonathan Wicks <jwicks@codeaurora.org>
CRC idle throttling effectively slows down the GPU 10x.
The internal GPU idle hysteresis does not account for this
and may take up to 3usec to expire. Add a delay on the host.
CRs-Fixed: 1028293
Change-Id: I0a80e49a3fea6e0e8d9e8b82847188b0a4452943
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
Create a sysfs entry to control GPU clock throttling. When 0 is
written, all sources of clock throttling (i.e. LM, BCL, IDLE)
are disabled.
CRs-Fixed: 973565
Change-Id: Iad588eb94861bd6b223715cc05354e3c39db9b24
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>