Commit graph

18950 commits

Author SHA1 Message Date
Kenneth Graunke
d60de81da4 drm/i915: Improve HiZ throughput on Cherryview.
Found by reading the HIZ_CHICKEN documentation.

Improves performance in a HiZ microbenchmark by around 50%.
Improves performance in OglZBuffer by around 18%.

Thanks to Chris Wilson for helping me figure out where to put this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-13 00:16:53 +01:00
Thomas Daniel
c0a03a2e4c drm/i915: Reset CSB read pointer in ring init
A previous commit enabled execlists by default:

       commit 27401d126b ("drm/i915/bdw: Enable execlists by default where supported")

This allowed routine testing of execlists which exposed a regression when
resuming from suspend.  The cause was tracked down the to recent changes to the
ring init sequence:

       commit 35a57ffbb1 ("drm/i915: Only init engines once")

During a suspend/resume cycle the hardware Context Status Buffer write pointer
is reset.  However since the recent changes to the init sequence the software CSB
read pointer is no longer reset.  This means that context status events are not
handled correctly and new contexts are not written to the ELSP, resulting in an
apparent GPU hang.

Pending further changes to the ring init code, just move the
ring->next_context_status_buffer initialization into gen8_init_common_ring to
fix this regression.

v2: Moved init into gen8_init_common_ring rather than context_enable after
feedback from Daniel Vetter.  Updated commit msg to reflect this and also cite
commits related to the regression.  Fixed bz link to correct bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88096
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-13 00:11:52 +01:00
Matt Roper
53a366b9f3 drm/i915: Drop unused position fields (v2)
The userspace-requested plane coordinates are now always available via
plane->state.base (and the i915-adjusted values are stored in
plane->state), so we no longer use the coordinate fields in intel_plane
and can drop them.

Also, note that the error case for pageflip calls update_plane() to
program the values from plane->state; it's simpler to just call
intel_plane_restore() which does the same thing.

v2: Replace manual update_plane() with intel_plane_restore() in pageflip
    error handler.

Reviewed-by(v1): Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 23:59:41 +01:00
Matt Roper
ea2c67bb4a drm/i915: Move to atomic plane helpers (v9)
Switch plane handling to use the atomic plane helpers.  This means that
rather than provide our own implementations of .update_plane() and
.disable_plane(), we expose the lower-level check/prepare/commit/cleanup
entrypoints and let the DRM core implement update/disable for us using
those entrypoints.

The other main change that falls out of this patch is that our
drm_plane's will now always have a valid plane->state that contains the
relevant plane state (initial state is allocated at plane creation).
The base drm_plane_state pointed to holds the requested source/dest
coordinates, and the subclassed intel_plane_state holds the adjusted
values that our driver actually uses.

v2:
 - Renamed file from intel_atomic.c to intel_atomic_plane.c (Daniel)
 - Fix a copy/paste comment mistake (Bob)

v3:
 - Use prepare/cleanup functions that we've already factored out
 - Use newly refactored pre_commit/commit/post_commit to avoid sleeping
   during vblank evasion

v4:
 - Rebase to latest di-nightly requires adding an 'old_state' parameter
   to atomic_update;

v5:
 - Must have botched a rebase somewhere and lost some work.  Restore
   state 'dirty' flag to let begin/end code know which planes to
   run the pre_commit/post_commit hooks for.  This would have actually
   shown up as broken in the next commit rather than this one.

v6:
 - Squash kerneldoc patch into this one.
 - Previous patches have now already taken care of most of the
   infrastructure that used to be in this patch.  All we're adding here
   now is some thin wrappers.

v7:
 - Check return of intel_plane_duplicate_state() for allocation
   failures.

v8:
 - Drop unused drm_plane_state -> intel_plane_state cast.  (Ander)
 - Squash in actual transition to plane helpers.  Significant
   refactoring earlier in the patchset has made the combined
   prep+transition much easier to swallow than it was in earlier
   iterations. (Ander)

v9:
 - s/track_fbs/disabled_planes/ in the atomic crtc flags.  The only fb's
   we need to update frontbuffer tracking for are those on a plane about
   to be disabled (since the atomic helpers never call prepare_fb() when
   disabling a plane), so the new name more accurately describes what
   we're actually tracking.

Testcase: igt/kms_plane
Testcase: igt/kms_universal_plane
Testcase: igt/kms_cursor_crc
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 23:59:31 +01:00
Matt Roper
4a3b8769f8 drm/i915: Clarify sprite plane function names (v4)
A few of the sprite-related function names in i915 are very similar
(e.g., intel_enable_planes() vs intel_crtc_enable_planes()) and don't
make it clear whether they only operate on sprite planes, or whether
they also apply to all universal plane types.  Rename a few functions to
be more consistent with our function naming for primary/cursor planes or
to clarify that they apply specifically to sprite planes:

 - s/intel_disable_planes/intel_disable_sprite_planes/
 - s/intel_enable_planes/intel_enable_sprite_planes/

Also, drop the sprite-specific intel_destroy_plane() and just use
the type-agnostic intel_plane_destroy() function.  The extra 'disable'
call that intel_destroy_plane() did is unnecessary since the plane will
already be disabled due to framebuffer destruction by the point it gets
called.

v2: Earlier consolidation patches have reduced the number of functions
    we need to rename here.

v3: Also rename intel_plane_funcs vtable to intel_sprite_plane_funcs
    for consistency with primary/cursor.  (Ander)

v4: Convert comment for intel_plane_destroy() to kerneldoc now that it
    is no longer a static function.  (Ander)

Reviewed-by(v1): Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 23:59:15 +01:00
Matt Roper
c34c9ee482 drm/i915: Move vblank evasion to commit (v4)
Move the vblank evasion up from the low-level, hw-specific
update_plane() handlers to the general plane commit operation.
Everything inside commit should now be non-sleeping, so this brings us
closer to how vblank evasion will behave once we move over to atomic.

v2:
 - Restore lost intel_crtc->active check on vblank evasion

v3:
 - Replace assert_pipe_enabled() in intel_disable_primary_hw_plane()
   with an intel_crtc->active test; it turns out assert_pipe_enabled()
   grabs some mutexes and can sleep, which we can't do with interrupts
   disabled.

v4:
 - Equivalent to v2; v3 change is now squashed into an earlier patch
   of the series.  (Ander).

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 23:58:47 +01:00
Matt Roper
32b7eeec4d drm/i915: Refactor work that can sleep out of commit (v7)
Once we integrate our work into the atomic pipeline, plane commit
operations will need to happen with interrupts disabled, due to vblank
evasion.  Our commit functions today include sleepable work, so those
operations need to be split out and run either before or after the
atomic register programming.

The solution here calculates which of those operations will need to be
performed during the 'check' phase and sets flags in an intel_crtc
sub-struct.  New intel_begin_crtc_commit() and
intel_finish_crtc_commit() functions are added before and after the
actual register programming; these will eventually be called from the
atomic plane helper's .atomic_begin() and .atomic_end() entrypoints.

v2: Fix broken sprite code split

v3: Make the pre/post commit work crtc-based to match how we eventually
    want this to be called from the atomic plane helpers.

v4: Some platforms that haven't had their watermark code reworked were
    waiting for vblank, then calling update_sprite_watermarks in their
    platform-specific disable code.  These also need to be flagged out
    of the critical section.

v5: Sprite plane test for primary show/hide should just set the flag to
    wait for pending flips, not actually perform the wait.  (Ander)

v6:
 - Rebase onto latest di-nightly; picks up an important runtime PM fix.
 - Handle 'wait_for_flips' flag in intel_begin_crtc_commit(). (Ander)
 - Use wait_for_flips flag for primary plane update rather than
   performing the wait in the check routine.
 - Added kerneldoc to pre_disable/post_enable functions that are no
   longer static.  (Ander)
 - Replace assert_pipe_enabled() in intel_disable_primary_hw_plane()
   with an intel_crtc->active test; it turns out assert_pipe_enabled()
   grabs some mutexes and can sleep, which we can't do with interrupts
   disabled.

v7:
 - Check for fb != NULL when deciding whether the sprite plane hides the
   primary plane during a sprite update.  (PRTS)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 23:58:19 +01:00
Jani Nikula
2f3408c7ef drm/i915: fix build for CONFIG_BUG=n
If CONFIG_BUG=n __WARN_printf won't be defined leading to the below
build failure. The double underscores should have told us to steer clear
of it anyway.

drivers/gpu/drm/i915/intel_display.c: In function ‘assert_pll’:
drivers/gpu/drm/i915/intel_display.c:1027:2: error: implicit declaration
of function ‘__WARN_printf’ [-Werror=implicit-function-declaration]
  I915_STATE_WARN(cur_state != state,

Use WARN(1, ...) instead. It handles CONFIG_BUG=n gracefully and, with
the constant condition, a sane compiler should reduce it to
__WARN_printf.

This is a regression introduced by

commit e2c719b75c
Author: Rob Clark <robdclark@gmail.com>
Date:   Mon Dec 15 13:56:32 2014 -0500

    drm/i915: tame the chattermouth (v2)

Reported-by: Jim Davis <jim.epost@gmail.com>
Reference: http://mid.gmane.org/CA+r1ZhgHTi7bS2irhtuSUs9aO=Br1dumN8=oAOeaMJDZ_ZhwBw@mail.gmail.com
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 23:43:09 +01:00
Alex Deucher
5615f890bc drm/radeon: add si dpm quirk list
This adds a quirks list to fix stability problems with
certain SI boards.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=76490

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2015-01-12 17:16:50 -05:00
Christian König
ad1a62227f drm/radeon: don't print error on -ERESTARTSYS
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-12 17:16:49 -05:00
Daniel Vetter
0a87a2db48 Merge tag 'topic/i915-hda-componentized-2015-01-12' into drm-intel-next-queued
Conflicts:
	drivers/gpu/drm/i915/intel_runtime_pm.c

Separate branch so that Takashi can also pull just this refactoring
into sound-next.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-01-12 23:07:46 +01:00
Dave Airlie
426959c945 drm: fix mismerge in drm_crtc.c
Daniel merged two things in 72a3697097,
but he merged this code twice, Dan's static checker spotted it.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-01-13 07:38:53 +10:00
Chris Wilson
226e5ae9e5 drm/i915: Fix mutex->owner inspection race under DEBUG_MUTEXES
If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
if the mutex debugging is enabled which introduces a race in our
mutex_is_locked_by() - i.e. we may inspect the old owner value before it
is acquired by the new task.

This is the root cause of this error:

 diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
 index 5cf6731..3ef3736 100644
 --- a/kernel/locking/mutex-debug.c
 +++ b/kernel/locking/mutex-debug.c
 @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
 			DEBUG_LOCKS_WARN_ON(lock->owner != current);

 		DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
 -		mutex_clear_owner(lock);
 	}

 	/*
 	 * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
 	 * mutexes so that we can do it here after we've verified state.
 	 */
 +	mutex_clear_owner(lock);
 	atomic_set(&lock->count, 1);
  }

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12 10:53:02 +02:00
Chris Wilson
48bf5b2d00 drm/i915: Ban Haswell from using RCS flips
Like Ivybridge, we have reports that we get random hangs when flipping
with multiple pipes. Extend

commit 2a92d5bca1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jul 8 10:40:29 2014 +0100

    drm/i915: Disable RCS flips on Ivybridge

to also apply to Haswell.

Reported-and-tested-by: Scott Tsai <scottt.tw@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87759
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org # 2a92d5bca1 drm/i915: Disable RCS flips on Ivybridge
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12 10:52:42 +02:00
Imre Deak
f24eeb1912 drm/i915: vlv: sanitize RPS interrupt mask during GPU idling
We apply the RPS interrupt workaround on VLV everywhere except when
writing the mask directly during idling the GPU. For consistency do this
also there.

While at it also extend the code comment about affected platforms.
I couldn't reproduce the issue on VLV fixed by this workaround, by
removing the workaround from everywhere, while it's 100% reproducible on
SNB using igt/gem_reset_stats/ban-ctx-render. So also add a note that
it hasn't been verified if the workaround really applies to VLV/CHV.

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12 10:52:41 +02:00
Imre Deak
59d02a1f45 drm/i915: fix HW lockup due to missing RPS IRQ workaround on GEN6
In

commit dbea3cea69
Author: Imre Deak <imre.deak@intel.com>
Date:   Mon Dec 15 18:59:28 2014 +0200

    drm/i915: sanitize RPS resetting during GPU reset

we disable RPS interrupts during GPU resetting, but don't apply the
necessary GEN6 HW workaround. This leads to a HW lockup during a
subsequent "looping batchbuffer" workload. This is triggered by the
testcase that submits exactly this kind of workload after a simulated
GPU reset. I'm not sure how likely the bug would have triggered
otherwise, since we would have applied the workaround anyway shortly
after the GPU reset, when enabling GT powersaving from the deferred
work.

This may also fix unrelated issues, since during driver loading /
suspending we also disable RPS interrupts and so we also had a short
window during the rest of the loading / resuming where a similar
workload could run without the workaround applied.

v2:
- separate the fix to route RPS interrupts to the CPU on GEN9 too
  to a separate patch (Daniel)

Bisected-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Testcase: igt/gem_reset_stats/ban-ctx-render
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87429
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12 10:52:41 +02:00
Imre Deak
63a3451641 drm/i915: gen9: fix RPS interrupt routing to CPU vs. GT
GEN8+ HW has the option to route PM interrupts to either the CPU or to
GT. For GEN8 this was already set correctly to routing to CPU, but not
for GEN9, so fix this. Note that when disabling RPS interrupts this was
set already correctly, though in that case it didn't matter much except
for the possibility of spurious interrupts.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-12 10:52:41 +02:00
Imre Deak
fcf3aac5fc drm/i915: remove unused power_well/get_cdclk_freq api
After switching to using the component interface this API isn't needed
any more.

v2-3: unchanged
v4:
- move the removal of i915_powerwell.h to this patch (Takashi)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 02:48:24 +01:00
Imre Deak
58fddc288b drm/i915: add component support
Register a component to be used to interface with the snd_hda_intel
driver. This is meant to replace the same interface that is currently
based on module symbol lookup.

v2:
- change roles between the hda and i915 components (Daniel)
- add the implementation to a new file (Jani)
- use better namespacing (Jani)
v3:
- move the implementation to intel_audio.c (Daniel)
- rename display_component to audio_component (Daniel)
- add kerneldoc (Daniel)
v4:
- run forgotten git rm i915_component.c (Jani)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 02:48:20 +01:00
Imre Deak
888d0d4216 drm/i915: add dev_to_i915 helper
This will be needed by later patches, so factor it out.

No functional change.

v2:
- s/dev_to_i915_priv/dev_to_i915/ (Jani)
- don't use the helper in i915_pm_suspend (Chris)
- simplify the helper (Chris)
v3:
- remove redundant upcasting in the helper (Daniel)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12 02:48:19 +01:00
Hyungwon Hwang
d40f74f727 drm/exynos: remove the redundant machine checking code
This code is unnecessary, because same logic is already included. Refer
this mail thread[1] for detail.

[1] http://lists.freedesktop.org/archives/dri-devel/2015-January/075132.html

Signed-off-by: Hyungwon Hwang <human.hwang@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2015-01-12 02:03:38 +09:00
Dave Airlie
adc31849b2 Merge tag 'drm-intel-next-2014-12-19' of git://anongit.freedesktop.org/drm-intel into drm-next
- plane handling refactoring from Matt Roper and Gustavo Padovan in prep for
  atomic updates
- fixes and more patches for the seqno to request transformation from John
- docbook for fbc from Rodrigo
- prep work for dual-link dsi from Gaurav Signh
- crc fixes from Ville
- special ggtt views infrastructure from Tvrtko Ursulin
- shadow patch copying for the cmd parser from Brad Volkin
- execlist and full ppgtt by default on gen8, for testing for now

* tag 'drm-intel-next-2014-12-19' of git://anongit.freedesktop.org/drm-intel: (131 commits)
  drm/i915: Update DRIVER_DATE to 20141219
  drm/i915: Hold runtime PM during plane commit
  drm/i915: Organize bind_vma funcs
  drm/i915: Organize INSTDONE report for future.
  drm/i915: Organize PDP regs report for future.
  drm/i915: Organize PPGTT init
  drm/i915: Organize Fence registers for future enablement.
  drm/i915: tame the chattermouth (v2)
  drm/i915: Warn about missing context state workarounds only once
  drm/i915: Use true PPGTT in Gen8+ when execlists are enabled
  drm/i915: Skip gunit save/restore for cherryview
  drm/i915/chv: Use timeout mode for RC6 on chv
  drm/i915: Add GPGPU_THREADS_DISPATCHED to the register whitelist
  drm/i915: Tidy up execbuffer command parsing code
  drm/i915: Mark shadow batch buffers as purgeable
  drm/i915: Use batch length instead of object size in command parser
  drm/i915: Use batch pools with the command parser
  drm/i915: Implement a framework for batch buffer pools
  drm/i915: fix use after free during eDP encoder destroying
  drm/i915/skl: Skylake also supports DP MST
  ...
2015-01-10 08:46:24 +10:00
Oded Gabbay
6bbcde9803 drm/amd: Remove old radeon_sa funcs from kfd-->kgd interface
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:11 +02:00
Oded Gabbay
632aa2cb08 drm/radeon: Remove old radeon_sa usage from kfd-->kgd interface
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:11 +02:00
Oded Gabbay
a86aa3ca5a drm/amdkfd: Using new gtt sa in amdkfd
This patch change the calls throughout the amdkfd driver from the old kfd-->kgd
interface to the new kfd gtt sa inside amdkfd

v2: change the new call in sdma code that appeared because of the sdma feature

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:10 +02:00
Oded Gabbay
73a1da0bb3 drm/amdkfd: Allocate gart memory using new interface
This patch changes the calls to allocate the gart memory for amdkfd from the
old interface (radeon_sa) to the new one (kfd_gtt_sa)

The new gart sub-allocator is initialized with chunk size equal to 512 bytes.
This is because the KV MQD is 512 Bytes and most of the sub-allocations are
MQDs.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:09 +02:00
Oded Gabbay
e18e794e6b drm/amdkfd: Fixed calculation of gart buffer size
This patch makes the gart's buffer size calculation more accurate. This buffer
is needed per GPU.

It takes into account maximum number of MQDs, runlist packets, kernel queues
and reserves 512KB for other misc allocations.

The total size is just shy of 4MB, for 32 processes and 128 queues per
process, which are the defaults for amdkfd kernel module parameters.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:09 +02:00
Oded Gabbay
6e81090b2e drm/amdkfd: Add kfd gtt sub-allocator functions
This patch adds new kfd gtt sub-allocator functions that service the amdkfd
driver when it wants to use gtt memory.

The sub-allocator uses a bitmap to handle the memory area that was transferred
to it during init. It divides the memory area into chunks, according to chunk
size parameter.

The allocation function will allocate contiguous chunks from that memory area,
according to the requested size. If the requested size is smaller than the
chunk size, a single chunk will be allocated.

v2: Do some more verifications on parameters that are passed into
kfd_gtt_sa_init()

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:08 +02:00
Oded Gabbay
36b5c08f09 drm/amdkfd: Add gtt sa related data to kfd_dev struct
This patch adds new fields to kfd_dev struct that are necessary for the new kfd
gtt sa module

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:08 +02:00
Oded Gabbay
ceae881bfa drm/radeon: Impl. new gtt allocate/free functions
This patch adds the implementation of the gtt interface functions.

The allocate function will allocate a single bo, pin and map it to kernel
memory. It will return the gpu address and cpu ptr as arguments.

v2:

The bulk of the allocations in the GART is for MQDs. MQDs represent active
user-mode queues, which are on the current runlist. It is important to
remember that active queues doesn't necessarily mean scheduled/running
queues, especially if there is over-subscription of queues or more than a
single HSA process.

Because the scheduling of the user-mode queues is done by the CP firmware,
amdkfd doesn't have any indication if the queue is scheduled or not. If the
CP will try to schedule a queue, and its MQD is not present, this will
probably stuck the CP permanently, as it will load garbage from the GART
(the address of the MQD is given to the CP inside the runlist packet).

In addition, there are a couple of small allocations which also should
always be pinned - runlist packets (2 packets) and HPDs. runlist packets can
be quite large, depending on number of processes and queues.

This new allocate function represents the short/mid-term solution of limiting
the total memory consumption to around 4MB by default.

The long-term solution is to create a mechanism through which radeon/ttm can
ask amdkfd to clear GART/VRAM memory due to memory pressure.
Then, amdkfd will preempt the running queues and wait until the memory pressure
is over. After that, amdkfd will reschedule the queues.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:07 +02:00
Oded Gabbay
e27ade73fd drm/amd: Add new kfd-->kgd interface for gart usage
This patch adds two new functions to the kfd-->kgd interface:

init_gtt_mem_allocation, which allocate a large enough buffer on the amdkfd
needs, such as mqds, hpds, kernel queue, fence and runlists. This function
is only called once per GPU device. The size of the allocated buffer is
based on the maximum number of HSA processes and maximum number of queues
per HSA process (two amdkfd kernel module parameters).

free_gtt_mem, which frees a buffer that was allocated on the gart aperture.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:07 +02:00
Ben Goz
d7a60d8ea5 drm/radeon: Enable sdma preemption
This patch adds to radeon the enablement of sdma preemption.
This is needed to support HWS of SDMA user-mode queues.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:06 +02:00
Ben Goz
85dfaef341 drm/amdkfd: Pass queue type to pqm_create_queue()
This patch passes the correct queue type to pqm_create_queue() instead of a
fixed KFD_QUEUE_TYPE_COMPUTE type.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:06 +02:00
Ben Goz
3385f9dd64 drm/amdkfd: Identify SDMA queue in create queue ioctl
This patch adds a check to the create queue ioctl path, which identifies SDMA
queue type that is sent by userspace.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:05 +02:00
Ben Goz
bcea308175 drm/amdkfd: Add SDMA user-mode queues support to QCM
This patch adds support for SDMA user-mode queues to the QCM - the Queue
management system that manages queues-per-device and queues-per-process.

v2: Remove calls to interface function that initializes sdma engines.

v3: Use the new names of some of the defines.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:05 +02:00
Ben Goz
77669eb87a drm/amdkfd: Add SDMA mqd support
This patch adds support for SDMA mqd operations:
- init_mqd_sdma
- uninit_mqd_sdma
- load_mqd_sdma
- update_mqd_sdma
- destroy_mqd_sdma
- is_occupied_sdma

It also adds SDMA queue information to some private structures of amdkfd.

v3: Use the new names of some of the defines.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:04 +02:00
Ben Goz
a84a9903b5 drm/radeon: Implement SDMA interface functions
This patch implements the new SDMA interface functions. It also adds defines
and structures related to SDMA registers.

v2: Removed init_sdma_engines() from interface. Initialization is done in
radeon.

v3:
- Removed unused defines.
- Added SDMA_ prefix to defines that didn't have them.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:04 +02:00
Ben Goz
85ea7d07e1 drm/amd: Add SDMA functions to kfd-->kgd interface
This patch adds three new functions to the kfd2kgd interface:

- hqd_sdma_load() - Loads SDMA mqd to a H/W SDMA hqd slot. Used only in no HWS
                    mode.

- hqd_sdma_is_occupied() - Checks if an SDMA hqd slot is occupied. Used only
                           in no HWS mode.

- hqd_sdma_destroy() - Destructs and preempts the SDMA queue assigned to
                       that SDMA hqd slot. Used only in no HWS mode.

These functions are needed to support SDMA queues scheduling when using no HWS
mode (used for debug or bring-up).

v2: Removed init_sdma_engines() from interface. Initialization is done in
radeon.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:03 +02:00
Alexey Skidanov
093c7d8cfd drm/amdkfd: Process-device data creation and lookup split
This patch splits the current kfd_get_process_device_data() to two
functions, one that specifically creates a pdd and another one which
just do lookup.

This is done to enhance the readability and maintainability of the code.

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-09 22:25:58 +02:00
Alexey Skidanov
f7c826ad38 drm/amdkfd: Add number of watch points to topology
This patch adds the number of watch points to the node capabilities in the
topology module

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-09 22:25:55 +02:00
Daniel Kurtz
4b9a90c0b3 drm/rockchip: fix dma_alloc_attrs() error check
dma_alloc_attrs() returns NULL if it cannot allocate a dma buffer (or
mapping), not a negative error code.

Rerported-by: Pawel Osciak <posciak@chromium.org>
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
2015-01-09 11:29:06 +08:00
Dave Airlie
c93546a5e3 Merge tag 'topic/atomic-core-2015-01-05' of git://anongit.freedesktop.org/drm-intel into drm-next
Next batch of atomic work. Most important is the propertification from Rob
and the nth iteration of the actual atomic ioctl originally from Ville.
Big differences compared to earlier revisions:
- Core properties are now fully handled by the core, drivers can only
  handle driver-specific properties.
- Atomic props&ioctl are opt-in per file_priv, userspace needs to
  explicitly ask for it (like universal plane support).
- For now all hidden behind the atomic module option until this has
  settled a bit.
- Atomic modesets are currently not possible since the exact abi for how
  to handle the mode property is still under discussion.

Besides this some cleanup patches from me and the addition of per-object
state to global state backpointers to simplify drivers.

* tag 'topic/atomic-core-2015-01-05' of git://anongit.freedesktop.org/drm-intel:
  drm: Ensure universal_planes is set for atomic
  drm/atomic: Hide drm.ko internal interfaces
  drm: Atomic modeset ioctl
  drm/atomic: atomic connector properties
  drm/atomic: atomic plane properties
  drm: small property creation cleanup
  drm/atomic: atomic_check functions
  drm: add atomic properties
  drm: refactor getproperties/getconnector
  drm: tweak getconnector locking
  drm: add atomic_get_property
  drm: add atomic_set_property wrappers
  drm: get rid of direct property value access
  drm: store property instead of id in obj attachment
  drm: allow property validation for refcnted props
  drm/atomic: Introduce state->obj backpointers
  drm/atomic-helper: Again check modeset *before* plane states
  drm/atomic-helper: Export both plane and modeset check helpers
2015-01-09 09:22:40 +10:00
Dave Airlie
e5202a2289 Merge tag 'topic/core-stuff-2014-12-19' of git://anongit.freedesktop.org/drm-intel into drm-next
Misc drm patches with mostly polish patches from Thierry, with a bit of
generic mode validation from Ville and a few other oddball things.

* tag 'topic/core-stuff-2014-12-19' of git://anongit.freedesktop.org/drm-intel: (25 commits)
  drm: Include drm_crtc_helper.h in DocBook
  drm: Make drm_crtc_helper.h standalone includible
  drm: Move IRQ related fields to proper section
  drm: Remove stale comment
  drm: Do basic sanity checks for user modes
  drm: Perform basic sanity checks on probed modes
  drm: Reorganize probed mode validation
  drm/doc: Remove duplicate "by"
  drm/info: Remove unused code
  drm/cache: Use wbinvd helpers
  drm/plane-helper: Test for plane disable earlier
  drm/doc: Document drm_add_modes_noedid() usage
  drm: bit of spell-check / editorializing.
  drm: Prefer sizeof(type) over sizeof type
  drm: Remove useless else block
  drm: Remove unneeded braces for single statement blocks
  drm: Do not assign in if condition
  drm: Prefer kmalloc_array() over kmalloc() with multiply
  drm: Prefer kcalloc() over kzalloc() with multiply
  drm: Miscellaneous checkpatch whitespace cleanups
  ...
2015-01-09 09:13:41 +10:00
Alex Deucher
4369a69ec6 drm/radeon: add a dpm quirk list
Disable dpm on certain problematic boards rather than
disabling dpm for the entire chip family since most
boards work fine.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1386534
https://bugzilla.kernel.org/show_bug.cgi?id=83731

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2015-01-08 16:32:37 -05:00
Oded Gabbay
8dfe58b206 drm/amdkfd: Fix sparse warning (different address space)
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-08 16:46:16 +02:00
Alex Deucher
3a01fd367e drm/radeon: fix VM flush on CIK (v3)
We need to wait for the GPUVM flush to complete.  There
was some confusion as to how this mechanism was supposed
to work.  The operation is not atomic.  For GPU initiated
invalidations you need to read back a VM register to
introduce enough latency for the update to complete.

v2: drop gart changes
v3: just read back rather than polling

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2015-01-08 09:36:51 -05:00
Alex Deucher
d474ea7e52 drm/radeon: fix VM flush on SI (v3)
We need to wait for the GPUVM flush to complete.  There
was some confusion as to how this mechanism was supposed
to work.  The operation is not atomic.  For GPU initiated
invalidations you need to read back a VM register to
introduce enough latency for the update to complete.

v2: drop gart changes
v3: just read back rather than polling

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2015-01-08 09:36:50 -05:00
Alex Deucher
cbfc35b90f drm/radeon: fix VM flush on cayman/aruba (v3)
We need to wait for the GPUVM flush to complete.  There
was some confusion as to how this mechanism was supposed
to work.  The operation is not atomic.  For GPU initiated
invalidations you need to read back a VM register to
introduce enough latency for the update to complete.

v2: drop gart changes
v3: just read back rather than polling

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2015-01-08 09:36:50 -05:00
Tvrtko Ursulin
7226572d8e drm/i915: Reserve shadow batch VMA analogue to others
If not pinned VMA can become an eviction target just before it needs to be
executed which breaks the internal object lifetime rules.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87399
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-08 09:34:58 +01:00
Michel Dänzer
6ee0ad2a7f drm/amdkfd: Drop interrupt SW ring buffer
The work queue couldn't reliably prevent the SW ring buffer from
overflowing, so dmesg was spammed by

 kfd kfd: Interrupt ring overflow, dropping interrupt.

messages when running e.g. the Atlantis Substance demo from
https://wiki.unrealengine.com/Linux_Demos on Kaveri.

Since the SW ring buffer doesn't actually do anything at this point, just
remove it for now. When actual interrupt processing code is added to
amdkfd, it should try to do things immediately and only defer to work
queues when necessary.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-08 13:27:15 +09:00