Call clk_set_flag() to turn off both memory core and periphery for
bimc_gfx_clk clock and memory for gfx_3d.
CRs-Fixed: 1046649
Change-Id: I941f91eeba01f4e7aa5427056bc57875e7edf197
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
Device-type memory mappings enforce certain restrictions
on unaligned accesses.
If userspace incorrectly passes an unaligned address at the
boundary of a device-type memory mapping to the kernel in an
IOCTL, a fault occurs because the kernel goes ahead and reads
the device-type memory with an unaligned access.
To overcome such issues, change device-type memory mappings
to Normal-noncached wherever possible.
Change-Id: I34e8268a0defe335ca9d360e910655c2891cd572
Signed-off-by: Rajesh Kemisetti <rajeshk@codeaurora.org>
Log the nearby allocations for pagefaults on global buffers.
Print the names of the allocations that fall around the
faulting address on a global buffer. Also add a new debugfs
file to list all the global pagetable entries. Useful for
debugging pagefaults and other issues with "global" objects.
CRs-Fixed: 985631
Change-Id: Ifbbdc69044fc64d7ea02509bf8113ed94eeece1e
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
The DEEP-NAP and SLEEP states are not used on targets of the
previous two generations. They save neither GPU power nor
system power. Remove them to reduce maintenance overhead.
CRs-Fixed: 1053516
Change-Id: If2fc2701548f90bb7ea9559a87752e13a7b0f736
Signed-off-by: George Shen <sqiao@codeaurora.org>
The format specifier %p can leak kernel addresses
because it ignores the kptr_restrict sysctl setting.
Use %pK instead of %p, which honors kptr_restrict
when deciding whether to print the address.
Change-Id: I0778e43e0a03852ca2944377256a7b401586a747
Signed-off-by: Divya Ponnusamy <pdivya@codeaurora.org>
If a high priority context submits while preemption to a lower priority
context is underway, preemption to the higher priority context is not
triggered until either we get a GPU command complete interrupt or another
workload from the higher priority context is submitted. To avoid this
latency, trigger preemption from the preemption complete interrupt.
CRs-Fixed: 1058401
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Change-Id: I0a05df94e7bdd5daadfa0713371a595a06b7bda7
Add WQ_SYSFS to the worker threads so that they show
up under /sys/bus/workqueue/devices. This allows some
of the properties to be adjusted at runtime.
Change-Id: I3424ae51461e04e0771560ff1c5b35cdf5b1fd6c
Signed-off-by: Jonathan Wicks <jwicks@codeaurora.org>
GFX retention mode does not save GFX rail power, and it increases
MX rail power. Fixing the problem requires more overhead than
removing the feature, which has never been enabled on any target.
So remove the feature.
CRs-Fixed: 1053516
Change-Id: I5f118138eca307f7cc16405ff9c8897ecd510c12
Signed-off-by: George Shen <sqiao@codeaurora.org>
The operation needs to be a read-modify-write. It was write-only,
which zeroed out bits outside of the requested region.
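As a rough illustration of the difference (a userspace sketch only, with
hypothetical helper names rmw_bits() and write_only() that are not taken
from the driver), compare the two update styles:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the driver's actual function: update only the
 * bits selected by mask and preserve every other bit of the old value.
 */
static uint32_t rmw_bits(uint32_t old, uint32_t mask, uint32_t bits)
{
	/* The caller must not request bits outside the mask. */
	assert((bits & ~mask) == 0);

	/* Read (old), modify (clear the field, OR in the new bits), write. */
	return (old & ~mask) | bits;
}

/* The write-only behaviour, kept for comparison: everything else is lost. */
static uint32_t write_only(uint32_t mask, uint32_t bits)
{
	return bits & mask;
}
```

With an old value of 0xFFFF0000, a mask of 0xFF and new bits 0xA5, the
read-modify-write form keeps the upper half intact while the write-only
form zeroes it.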
CRs-Fixed: 1055047
Change-Id: I2e010a99ed5961cd501e1eae913c73b3dbee4789
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
Add support to allocate/reserve a virtual address range without
physical backing. Add support to allocate physical backing memory
without assigning it a virtual address. Add support to unite
the two aforementioned allocations together. Add support to
divorce them from one another. Add support to let their kids
do cache operations as they see fit.
Create a 'dummy' page that is used to back virtual allocations
that are not yet backed by physical memory.
CRs-Fixed: 1046456
Change-Id: Ifaa687b036eeab22ab4cf0238abdfbe7b2311ed3
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
When the GPU is idle (NAP), schedule a DCVS call to obtain the
updated GPU load for correct GPU frequency scaling.
Change-Id: Ifcf05ffde0a054839e51d3f8173b8449fe177aa0
CRs-Fixed: 1050000
Signed-off-by: Oleg Perelet <operelet@codeaurora.org>
Disable the RB sampler data path DP2 clock gating optimization
for 1-SP A5XX GPUs. The optimization leads to a precision
difference during interpolation, which causes rendering
differences between binning and direct rendering modes.
CRs-Fixed: 1040638
Change-Id: I40d1ce2f5db0ed75453feda5c31152f8201b8697
Signed-off-by: Sunil Khatri <sunilkh@codeaurora.org>
If the low memory killer runs early in execution, it could free
several reserved pages, and for pools which are not allowed to
allocate new pages, those pages are gone forever. Change the
shrinker to not free the reserved pages from pools which are not
allowed to allocate new pages.
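The intended shrinker policy can be sketched roughly as follows. This is
a simplified userspace model, not the driver's code; struct pool, its
field names and pool_shrink() are all assumptions made for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool model; the field names are illustrative only. */
struct pool {
	unsigned int page_count;   /* pages currently held in the pool */
	unsigned int reserved;     /* pages pre-allocated at init time */
	bool allocation_allowed;   /* may the pool allocate new pages? */
};

/* Free up to 'want' pages and return how many were actually freed. */
static unsigned int pool_shrink(struct pool *p, unsigned int want)
{
	unsigned int floor, freeable, n;

	assert(p != NULL);

	/* Pools that cannot refill themselves keep their reserved pages. */
	floor = p->allocation_allowed ? 0 : p->reserved;
	freeable = p->page_count > floor ? p->page_count - floor : 0;
	n = want < freeable ? want : freeable;

	p->page_count -= n;
	return n;
}
```

A pool that may allocate can be shrunk all the way down, while a pool
that may not allocate only gives up pages beyond its reserve.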
CRs-Fixed: 1052430
Change-Id: I65631628a3043fe7c2f74d41bb116fe1b6255873
Signed-off-by: Shrenuj Bansal <shrenujb@codeaurora.org>
Currently, if the read pointer is behind the write pointer and
there is not enough space toward the end of the ringbuffer for
new commands, the write pointer is set to 0.
This is problematic because it leads to unexecuted commands
being overwritten by new commands at the start of the
ringbuffer. So, instead of setting the write pointer to 0,
look for space from the start of the ringbuffer up to the
read pointer, and if there is room, update the write pointer
accordingly.
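The allocation rule described above might look roughly like the sketch
below. It is a simplified userspace model under stated assumptions:
struct rb, rb_alloc() and RB_DWORDS are hypothetical names, indices are
in dwords, and one slot is always left free so that wptr == rptr
unambiguously means "empty". A real ringbuffer would presumably also
have to pad the skipped tail so the GPU does not execute stale data.

```c
#include <assert.h>
#include <stddef.h>

#define RB_DWORDS 16u /* illustrative size, not the driver's value */

struct rb {
	unsigned int wptr; /* next dword the CPU will write */
	unsigned int rptr; /* next dword the GPU will read */
};

/* Return the start index for n dwords, or -1 if there is no room. */
static int rb_alloc(struct rb *rb, unsigned int n)
{
	unsigned int start;

	assert(rb->wptr < RB_DWORDS && rb->rptr < RB_DWORDS);

	if (rb->rptr <= rb->wptr) {
		/* Free space first runs from wptr to the end of the buffer. */
		if (rb->wptr + n < RB_DWORDS) {
			start = rb->wptr;
			rb->wptr += n;
			return (int)start;
		}
		/* No room at the end: use the start only if n fits below rptr. */
		if (n < rb->rptr) {
			rb->wptr = n;
			return 0;
		}
		return -1;
	}

	/* wptr has already wrapped: the free space is between wptr and rptr. */
	if (rb->wptr + n < rb->rptr) {
		start = rb->wptr;
		rb->wptr += n;
		return (int)start;
	}
	return -1;
}
```

The key point is the second branch: wrapping to index 0 is only allowed
when the request fits strictly below the read pointer, so unexecuted
commands are never overwritten.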
CRs-Fixed: 1028465
Change-Id: I1cbdbf139b14988513a22030aa2be4a99a221880
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Treat 0 as a valid fd instead of treating it as an error.
CRs-Fixed: 1030098
Change-Id: I4a1b14fcbca617bc2a43b30af7256edc3920f04c
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Make the VBIF register dump more generic to avoid adding a new entry
for every VBIF revision. The register mapping and addresses will not
change for new VBIF revisions. AHB reads are permitted throughout the
entire VBIF range. For unoccupied registers, read values are driven
to 0 by HW, but this should not be relied upon.
CRs-Fixed: 1021711
Change-Id: I5aada474389e9189abcd38f1bc4854ada91dea87
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
Scheduling issues were occurring with the GPU event worker after
b7be807 (msm: kgsl: Unbind the kgsl-event workqueue) was merged.
In certain conditions, it seems that the kgsl-event workqueue
was conflicting with the KGSL worker and slowing it down.
It turns out that we schedule the event worker and the dispatcher
worker at the same time everywhere. Since the worker is
single-threaded, the event worker and the dispatcher run
synchronously anyway, so it makes sense to run the event processor
from within the dispatcher and save the extra schedule.
CRs-Fixed: 1043509
Change-Id: Ic0dedbad67eb04d41afb6add4477f146dfff9784
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
For A3xx we get the GPU read pointer from the CP_RB_RPTR
register instead of the rptr scratch memory address. In
retire_cmdbatch() and _retire_marker() the GPU clock will
be off, so avoid reading the CP_RB_RPTR register. Also hold
the device mutex in sendcmd() to access GPU registers.
CRs-Fixed: 1024730
Change-Id: Ifa5e9d3f892301685cb48a227ce4967d895499b1
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
Check the return value of dma_buf_get() using
IS_ERR_OR_NULL(), since dma_buf_get() can return
ERR_PTR(-EINVAL), which won't be caught by a simple NULL
check. This avoids a kernel panic due to an invalid
pointer access.
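To show why a plain NULL check is not enough, here is a userspace sketch
with stand-ins modelled on the kernel's <linux/err.h> helpers. The
lowercase names (err_ptr, is_err, is_err_or_null) and fake_buf_get() are
hypothetical and only mimic the pattern; they are not the kernel API:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline int is_err_or_null(const void *p)
{
	return p == NULL || is_err(p);
}

/* Hypothetical lookup standing in for dma_buf_get(). */
static void *fake_buf_get(int fd)
{
	static int buf;

	return fd < 0 ? err_ptr(-EINVAL) : (void *)&buf;
}

/* Caller pattern: reject both NULL and encoded errors before use. */
static int import_buf(int fd)
{
	void *buf = fake_buf_get(fd);

	if (is_err_or_null(buf))
		return buf ? (int)(intptr_t)buf : -EINVAL;
	return 0;
}
```

Note that the error pointer returned for a bad fd is non-NULL, so a
check of the form "if (!buf)" would let it through.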
CRs-Fixed: 1008517
Change-Id: I11ebf618edd25a251d3fb8bb7fbbb886e10d788f
Signed-off-by: Deepak Kumar <dkumar@codeaurora.org>
Dump 256 instead of 128 dwords of SDS as DRAW_STATE_ADDR is
actually 8 bits wide [7:0] and not 7 bits wide [6:0].
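The arithmetic behind the new size, as a small sketch (FIELD_ENTRIES and
the SDS_DWORDS_* names are made up for this example and do not appear in
the driver):

```c
#include <assert.h>

/* Number of distinct values an inclusive bit field [hi:lo] can address. */
#define FIELD_ENTRIES(hi, lo) (1u << ((hi) - (lo) + 1))

enum {
	SDS_DWORDS_OLD = FIELD_ENTRIES(6, 0), /* 7-bit assumption */
	SDS_DWORDS_NEW = FIELD_ENTRIES(7, 0), /* 8-bit field */
};

/* One extra address bit doubles the addressable range. */
static_assert(SDS_DWORDS_NEW == 2 * SDS_DWORDS_OLD,
	      "an 8-bit field addresses twice as many dwords as a 7-bit one");
```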
CRs-Fixed: 1023608
Change-Id: I8dcb07bf0a3b9e91b6ec7396d89239fdbd548ac0
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
A540 has a new vbif version. Add it so that we can
dump vbif registers in snapshot for A540.
CRs-Fixed: 1024192
Change-Id: Id9323fa98951e2755fcc6903f84a450bc7ab6169
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Add the register to be dumped in a5xx snapshot.
CRs-Fixed: 1024179
Change-Id: I316029caa10047828375ae0eab1f1d35d30fccb6
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
A5xx GPUs currently don't need more than 64KB for
CP preemption record.
CRs-Fixed: 1019529
Change-Id: I3df22b7b282fb8ff3269f01b2b258318fc83cbcb
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
This is done to improve the kgsl vmfault routine. Currently,
it traverses the sglist to find the faulted page, which takes
linear time. By having an array of all the page pointers,
this operation will be completed in constant time.
Also, allocate sgt only for mapping this memory to the GPU.
Since this optimization is not needed for secure/global or
imported memory, we will not keep this array but keep
the sgt instead.
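The difference between the two lookups can be sketched as below. This is
a userspace model only: struct chunk is a hypothetical stand-in for a
scatterlist entry, page frame numbers stand in for page pointers, and
4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* Hypothetical stand-in for a scatterlist entry covering npages pages. */
struct chunk {
	unsigned long first_pfn;
	size_t npages;
};

/* Linear time: walk the chunks until one contains the faulting offset. */
static unsigned long lookup_chunks(const struct chunk *c, size_t n,
				   size_t offset)
{
	size_t pgoff = offset >> PAGE_SHIFT;

	for (size_t i = 0; i < n; i++) {
		if (pgoff < c[i].npages)
			return c[i].first_pfn + pgoff;
		pgoff -= c[i].npages;
	}
	return 0; /* offset lies beyond the allocation */
}

/* Constant time: index a flat per-page array directly. */
static unsigned long lookup_array(const unsigned long *pfns, size_t npages,
				  size_t offset)
{
	size_t pgoff = offset >> PAGE_SHIFT;

	assert(pgoff < npages);
	return pfns[pgoff];
}
```

Both functions return the same page for a given offset; the array form
trades one pointer-sized entry per page for a lookup that does not
depend on the number of chunks.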
CRs-Fixed: 1006012
Change-Id: I221fce9082da0bdd59842455221b896a33a6ce42
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
This change adds a check to avoid allocating memory for loading microcode
in case it is already allocated. This avoids memory allocation failure
for microcode during multiple tries by userspace to open the kgsl device
in case of errors.
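The check amounts to making the allocation idempotent. A minimal
userspace sketch, assuming a hypothetical struct fw and fw_load() and
using malloc() in place of the driver's allocator:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical microcode descriptor; not the driver's structure. */
struct fw {
	void *buf;
	size_t size;
};

/* Allocate the microcode buffer once and reuse it on later attempts. */
static int fw_load(struct fw *fw, size_t size)
{
	assert(size > 0);

	/* Already allocated by an earlier open attempt: nothing to do. */
	if (fw->buf != NULL)
		return 0;

	fw->buf = malloc(size);
	if (fw->buf == NULL)
		return -1;

	fw->size = size;
	return 0;
}
```

Calling fw_load() repeatedly after a failed open therefore never
allocates a second buffer for the same microcode.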
CRs-Fixed: 1043490
Change-Id: I018ebdb0dab1fc13af8d85a273c1c8b477fa1e26
Signed-off-by: Deepak Kumar <dkumar@codeaurora.org>
Track GPU active time per frequency for GPU workload
profiling. The data is exposed in
/sys/class/kgsl/kgsl-3d0/gpu_clock_stats
with one u64 value in microseconds per clock level.
For example:
cat /sys/class/kgsl/kgsl-3d0/gpu_clock_stats
39392 29292 929292 929292 4040404
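The bookkeeping behind such a file could be sketched as follows. This is
an illustrative userspace model, not the driver's implementation:
struct clock_stats, stats_account(), stats_show() and NUM_LEVELS are
assumed names, and the number of clock levels depends on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_LEVELS 5 /* illustrative; the real count depends on the target */

struct clock_stats {
	uint64_t active_us[NUM_LEVELS]; /* active time per clock level */
};

/* Charge busy_us microseconds of active time to the given clock level. */
static void stats_account(struct clock_stats *s, unsigned int level,
			  uint64_t busy_us)
{
	assert(level < NUM_LEVELS);
	s->active_us[level] += busy_us;
}

/* Format one value per level, space separated, newline terminated. */
static int stats_show(const struct clock_stats *s, char *buf, size_t len)
{
	size_t n = 0;

	for (unsigned int i = 0; i < NUM_LEVELS && n < len; i++) {
		int w = snprintf(buf + n, len - n, "%llu%c",
				 (unsigned long long)s->active_us[i],
				 i + 1 < NUM_LEVELS ? ' ' : '\n');
		if (w < 0)
			return -1;
		n += (size_t)w;
	}
	return (int)n;
}
```

Accounting time whenever the GPU leaves the active state at a given
level keeps the per-level totals monotonic, which makes the file easy to
sample and diff from userspace.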
CRs-Fixed: 1011462
Change-Id: I5f2caa8b38d99ffd23f03c1dfed1efda273fc2fb
Signed-off-by: George Shen <sqiao@codeaurora.org>
The global buffers are allocated through cma, which can
be very limited on some targets. Add a flag to allocate
a global buffer through our page allocator.
CRs-Fixed: 1024295
Change-Id: Ie796b03ce152774535f593acdf00e900109d303a
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
Instead of trying to make a decision to switch out the active
draw context for NULL at detach time, leave the reference count
for it until the next context switch or until the next slumber,
whichever comes first. This avoids races with the preemption
code and ensures a smooth transition.
A side effect is that we were depending heavily on the context
detach to reset the ringbuffer to the default at power down and
we didn't touch it on power up (though we did on soft reset and
wake from slumber. Curious). Obviously if we are no longer
switching we will need to force the default pagetable during start
but it seems to me like this would be the right thing to do even
if we were still switching out.
CRs-Fixed: 1009124
Change-Id: Ic0dedbadff8df192096292b221130c8ef5b31e12
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>