This way the necessary VM update is kicked off immediately
if all BOs involved are in GPU accessible memory.
v2: fix vm lock
v3: immediately update unmaps as well
v4: use drm_free_large instead of kfree
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only invalidating PTEs needs to be executed synchronized to using the PT.
v2: fix sync to uses
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We never invalidate PD entries and making them valid can
run with other users in parallel.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows us to finally remove the VM fence and
so allow concurrent use of it from different engines.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use multiple VMIDs for each VM, one for each ring. That allows
us to execute flushes separately on each ring, still not ideal
cause in a lot of cases rings can share IDs.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Note for each fence if it's a VM page table update or not. This allows
us to determine the last VM update in a sync object and so to figure
out if we need to flush the TLB or not.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows us to add the real execution fence as shared.
v2: fix typo
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Previously we just allocated space for four hardware semaphores
in each software semaphore object. Make software semaphore objects
represent only one hardware semaphore address again by splitting
the sync code into it's own object.
v2: fix typo in comment
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The PD/PTs reservation object now contains everything needed.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
That's useless when all callers drop the reservation
immediately after calling the function.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use ring structure instead of index and provide vm_id and pd_addr separately.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current code always reprogrammed the sclk levels,
but we don't currently handle disp sclk requirements
so just skip it.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable smc fan control for CI boards. Should
reduce the fan noise on systems with a higher
default fan profile.
v2: disable by default, add additional fan setup, rpm control
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=73338
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable smc fan control for SI boards. Should
reduce the fan noise on systems with a higher
default fan profile.
v2: disable by default, add rpm controls
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=73338
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just use the acpi interface. That's what windows uses on this
generation and it's the only thing that seems to work reliably
on these generation parts.
You can still force the native backlight interface by setting
radeon.backlight=1
Bug:
https://bugzilla.kernel.org/show_bug.cgi?id=88501
v2: merge into above if/else block
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Always need to set bit 0 of RLC_CGTT_MGCG_OVERRIDE
to avoid unreliable doorbell updates in some cases.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The cursor_set2 hook provides the cursor hotspot position within the
cursor image. When the hotspot position changes, we can adjust the cursor
position such that the hotspot doesn't move on the screen. This prevents
the cursor from appearing to intermittently jump around on the screen
when the position of the hotspot within the cursor image changes.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This function can be called now with i915 interrupts enabled, so the
corresponding WARN is incorrect, remove it. I think this was spotted by
Paulo during his review, but since I already removed the same WARN
from intel_suspend_gt_powersave() I missed then his point.
Spotted-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Spotted while reading and trying to understand how our error capture
code deals with full ppgtt.
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
I saw punit timeouts in vlv_set_rps_idle() while running various
subtests of pm_rpm. Increasing the timeout to 100ms got rid of the
issue.
Testcase: igt/pm_rpm
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=82939
Signed-off-by: Imre Deak <imre.deak@intel.com>
Tested-by: Guo Jinxian <jinxianx.guo@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently after doing DPMS-OFF on all outputs CDCLK won't be set to its
minimum value as it should. A subsequent modeset to turn off all outputs
will thus run with all power domains disabled, and notice that it needs
to change CDCLK to its minimum value. Since the power domains are
disabled this will emit a register-access-while-suspended WARN and fail
to set the minimum freq.
The proper solution for this is to set the minimum frequency during
DPMS-OFF. That needs a bigger rework that would take into account the
user DPMS setting too during the calculation of the new modesetting
configuration. Until that's done this stop-gap solution gets the PIPE-A
power domain during setting the CDCLK; this domain covers the HW blocks
needed for this.
Idea to use PIPE-A domain from Ville.
Testcase: igt/pm_rpm
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=82939
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Makes it easier to debug infoframe mismatches.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
For MMIO registers which are shadowed, force wake is not needed to
write to these registers.
v2: Rebase on top of nightly (Damien)
v3: Rebase on top of "Gen9 multiple-engine forcewake" changes
v4: (Mika, Bob, done by Damien)
- Reorder the shadowed registers by popularity
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Zhe Wang <zhe1.wang@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Enable multi-engine forcewake for Gen9.
v2: (Damien)
- Rebase on top of nightly
- Move the register range definitions to intel_uncore.c
- Whitespace fixes
v3: (Addressing Mika's comment, done by Damien)
- Use REG_RANGE() (introduced after the patch was written)
- Add a SKL_NEEDS_FORCE_WAKE() macro that gets rid of a useless
comparison to FORCEWAKE (reg 0xa18c is not used on SKL)
v4: (Damien)
- Use newly introduced ASSIGN_READ/WRITE_MMIO_VFUNCS() macros
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Zhe Wang <zhe1.wang@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Trying to read the status of the power wells right after taking forcewake
for the other register reads makes little sense. Most of the time the
power wells will still be up due to the recent forcewake. Instead do the
power well status read first, and only then read the register needing
forcewake. This way the reported power well status can actually reflect
what's going on in the system.
Cc: Deepak S <deepak.s@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's just throw in the towel on this one and take the cheap way out.
Based on a patch from Chris Wilson, but checking for a different bit.
Chris' patch checked for even bank layout, this one here for a magic
bit. Given the evidence we've gathered (not much) both work I think,
but checking for the magic bit might be more accurate.
Anyway, works on my gm45 here.
For paranoi restrict to gen4 (and mobile), since we've only ever seen
this on gm45 and i965gm.
Also add some debugfs output so that we can skip the tiled swapping
tests properly in these cases.
v2: Clean up the quirk'ed pin count in free_object to avoid upsetting
the WARN_ON. Spotted by Chris.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28813
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45092
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In __gen6_update_ring_freq, use the full range of
possible gpu frequencies from max_freq to min_freq.
The actual gpu frequency could be outside the range
from max_freq_softlimit to min_freq_softlimit due
to power/thermal constraints.
Signed-off-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In gen8_enable_rps, change the initial rps setting
to the min_freq_softlimit (same as gen6_enable_rps).
Signed-off-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Set the min_freq_softlimit to max(RPe, 450MHz).
Setting a floor can ensure a minimum experience
level. The 450MHz value came from a power and
performance study of various types of workloads
(3D, Media, GPGPU, idle, etc).
v2: rebased
Signed-off-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Added gen6_init_rps_frequencies() to initialize
the rps frequency values. This function replaces
parse_rp_state_cap(). In addition to reading RPn,
RP0, and RP1 from RP_STATE_CAP register, the new
function reads efficient frequency (aka RPe) from
pcode for Haswell and Broadwell and sets the turbo
softlimits. The turbo minimum frequency softlimit
is set to RPe for Haswell and Broadwell and to RPn
otherwise.
For RPe, the efficiency is based on the frequency/power
ratio (MHz/W); this is considering GT power and not
package power. The efficent frequency is the highest
frequency for which the frequency/power ratio is within
some threshold of the highest frequency/power ratio.
A fixed decrease in frequency results in smaller
decrease in power at frequencies less than RPe than
at frequencies above RPe.
v2: Following suggestions from Chris Wilson and
Daniel Vetter to extend and rename parse_rp_state_cap
and to open-code a poorly named function.
Signed-off-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Remove unused variables.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Found one more!
With this we can clear up the ggtt init code a bit, yay!
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
With this all the ums nonsense around gem setup/teardown has
disappeared, yay!
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Again just complicates gem init functions and makes a general mess out
of everything.
Good riddance!
v2: In my enthusiasm to start removing dri1/ums crud I went overboard a
bit and killed parts of hangcheck. Resurrect it.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
This patch allows framebuffers for cirrus to be created with
32bpp pixel formats provided that they do not violate certain
restrictions of the cirrus hardware.
v2: Use pci resource length for vram size.
Signed-off-by: Zach Reizner <zachr@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Only importing an FD to a handle is currently supported on UDL,
but the exporting functionality is equally useful.
Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
By default set udl_gem_object as cacheable, but set WC flag when attaching
dmabuf. In udl_gem_mmap() update cache attributes based on the flags, similar
to exynos_drm_gem_mmap().
Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Olof Johansson <olofj@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Before this patch, cirrus_device_init could have failed while
cirrus_mm_init succeeded and the driver would have reported overall
success on load. This patch causes cirrus_device_init to return on
the first error encountered.
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Oversight from my kerneldoc cleanup when doing the original atomic
helper series - I've only applied this clarification to the modeset
related helpers, and not the plane update code. Remedy this asap.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
I guess for hysterical raisins this was meant to be the way to read
blob properties. But that's done with the two-stage approach which
uses separate blob kms object and the special-purpose get_blob ioctl.
Shipping userspace seems to have never relied on this, and the kernel
also never put any blob thing onto that property. And nowadays it
would blow up, e.g. in drm_property_destroy. Also it makes no sense to
return values in an ioctl that only returns metadata about everything.
So let's ditch all the internal code for the blob list, rename the
list to be unambiguous and sprinkle comments all over the place to
explain this peculiar piece of api.
v2: Squash in fixup from Rob to remove now unused variables.
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- Make it clear that it's a negative errno (more in line with
everything else).
- Clean up the confusion around get_properties vs. getproperty ioctls:
One reads per-obj property values, the other reads property
metadata.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Yet another fallout from not considering DP MST hotplug. With the
previous patches we have stable indices, but it might still happen
that a connector gets added between when we allocate the array and
when we actually add a connector. Especially when we back off due to
ww mutex contention or similar issues.
So store the sizes of the arrays in struct drm_atomic_state and double
check them. We don't really care about races except that we want to
use a consistent value, so ACCESS_ONCE is all we need. And if we
indeed notice that we'd overrun the array then just give up and
restart the entire ioctl.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Otherwise the connector might have been unplugged and destroyed while
we didn't look. Yet another fallout from DP MST hotplugging that I
didn't consider.
To make sure we get this right add an appropriate WARN_ON to
drm_atomic_state_clear (obviously only when we actually have a state
to clear up). And reorder all the state_clear and backoff calls to
make it work out properly.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>