Commit graph

427479 commits

Author SHA1 Message Date
Daniel Vetter
7e696e4cad drm/i915: ignore bios output config if not all outputs are on
Both Ville and QA rather immediately complained that with the new
initial_config logic from Jesse not all outputs get enabled. Since the
fbdev emulation pretty much tries to always enable as many outputs as
possible (it even has hotplug handling and all that) fall back if more
outputs could have been enabled.

v2: Fix up my confusion about what enabled means - it's passed from
the fbdev helper, we need to check for a non-zero connector->encoder
link. Spotted by Ville.

v3: Add some debug output as requested by Jesse for debugging fallback
issues.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75552
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:06 +01:00
Daniel Vetter
7c2bb53110 drm/i915: s/any_enabled/!fallback/ in fbdev_initial_config
It started as a simple check whether anything is lit up, but now is't
used to driver the general fallback logic to the default output
configuration selector in the helper library. So rename it for more
clarity.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:05 +01:00
Chris Wilson
7d5e379989 drm/i915: Reject changes of fb base when we have a flip pending
This should be impossible due to the wait for outstanding flips that the
caller is meant to perform prior to updating the scanout base. Paranoia
tells me to check anyway.

References: https://bugs.freedesktop.org/show_bug.cgi?id=75502
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:05 +01:00
Chris Wilson
227213438a Revert "drm/i915: enable HiZ Raw Stall Optimization on IVB"
This reverts commit 116f2b6da8.

This optimization causes widespread corruption in games, and even in
glxgears, on my ivb:gt1. The corruption appears like z-fighting of
overlapping polygons in the HiZ buffer.

The observation ties in very closely with the description of the
optimization disabled by default on IVB:

"The Hierarchical Z RAW Stall Optimization allows non-overlapping
polygons in the same 8x4 pixel/sample area to be processed without
stalling waiting for the earlier ones to write to Hierarchical Z
buffer."

No reason is given for why it is disabled by default, usually for such
optimizations it is that it is incomplete. However, there is no
indication whether this a gt1 only issue either. Before considering
reenabling this optimization, I would first suggest reproducing the
corruption in piglit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75623
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chia-I Wu <olv@lunarg.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:04 +01:00
Jesse Barnes
8b687df4c3 drm/i915: re-add locking around hw state readout
To silence locking complaints.  This was a rebase failure on my part in

commit fa9fa083d0
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Tue Feb 11 15:28:56 2014 -0800

    drm/i915: read out hw state earlier v2

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:03 +01:00
Jesse Barnes
ef34ab894e drm/i915: honor forced connector modes v2
In the move over to use BIOS connector configs, we lost the ability to
force a specific set of connectors on or off.  Try to remedy that by
dropping back to the old behavior if we detect a hard coded connector
config.

v2: don't deref connector state for disabled connectors (Jesse)

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:02 +01:00
Chris Wilson
1af8452f16 drm/i915: Revert workaround for disabling L3 cache aging on IVB
In commit e4e0c058a1
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date:   Wed Feb 8 12:53:50 2012 -0800

    drm/i915: gen7: Implement an L3 caching workaround.

the L3 cache aging was disabled. This was part of a shotgun response
to a number of GPU hang bugs, but there appears to be no documentation
to suggest that disabling the L3 cache age was ever required (to prevent
the GPU hangs).

Restoring the L3 cache age is a minor performance win of around 2%
on IVB:GT2. (Note that this value seems to be consistent across a number
of tests and so appears to be above the usual noise.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:02 +01:00
Sinclair Yeh
47e74f0fd1 drm/i915: Revert workaround for disabling L3 cache aging on BYT
V2:  edit the commit message to contain more info
The W/A spreadsheet says this is still required, but the b-spec says
it's not for BYT-T.  So the documentation is not clear.  However,
our experience with the other SKUs of BYT-I/M on Android and Linux
suggests that setting this bit actually causes GPU hang for certain
OGL benchmark applications.

Removing this bit completely resolves the GPU hangs.

Signed-off-by: Sinclair Yeh <sinclair.yeh@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:01 +01:00
Ben Widawsky
5abbcca30d drm/i915/bdw: Kill ppgtt->num_pt_pages
With the original PPGTT implementation if the number of PDPs was not a
power of two, the number of pages for the page tables would end up being
rounded up. The code actually had a bug here afaict, but this is a
theoretical bug as I don't believe this can actually occur with the
current code/HW..

With the rework of the page table allocations, there is no longer a
distinction between number of page table pages, and number of page
directory entries. To avoid confusion, kill the redundant (and newer)
struct member.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:00 +01:00
Ben Widawsky
b146520ff9 drm/i915: Split GEN6 PPGTT initialization up
Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:00 +01:00
Ben Widawsky
a00d825de9 drm/i915: Split GEN6 PPGTT cleanup
This cleanup is similar to the GEN8 cleanup (though less necessary).
Having everything split will make cleaning the initialization path error
paths easier to understand.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:59 +01:00
Ben Widawsky
c4ac524c15 drm/i915: Update i915_gem_gtt.c copyright
I keep meaning to do this... by now almost the entire file has been
written by an Intel employee (including Daniel post-2010).

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:58 +01:00
Ben Widawsky
7907f45bf9 Revert "drm/i915/bdw: Limit GTT to 2GB"
This reverts commit 3a2ffb65ee.

Now that the code is fixed to use smaller allocations, it should be safe
to let the full GGTT be used on BDW.

The testcase for this is anything which uses more than half of the GTT,
thus eclipsing the old limit.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:57 +01:00
Ben Widawsky
7ad47cf252 drm/i915/bdw: Reorganize PT allocations
The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
goes to 2MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.

In order to fix this, this patch attempts to split up each page table
allocation into a single, discrete allocation. There is nothing really
fancy about the patch itself, it just has to manage an extra pointer
indirection, and have a fancier bit of logic to free up the pages.

To accommodate some of the added complexity, two new helpers are
introduced to allocate, and free the page table pages.

NOTE: I really wanted to split the way we do allocations, and the way in
which we identify the page table/page directory being used. I found
splitting this functionality up to be too unwieldy. I apologize in
advance to the reviewer. I'd recommend looking at the result, rather
than the diff.

v2/NOTE2: This patch predated commit:
6f1cc99351
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 31 15:50:31 2013 +0000

    drm/i915: Avoid dereference past end of page arr

It fixed the same issue as that patch, but because of the limbo state of
PPGTT, Chris patch was merged instead. The excess churn is a result of
my using my original patch, which has my preferred naming. Primarily
act_* is changed to which_*, but it's mostly the same otherwise. I've
kept the convention Chris used for the pte wrap (I had something
slightly different, and broken - but fixable)

v3: Rename which_p[..]e to drop which_ (Chris)
Remove BUG_ON in inner loop (Chris)
Redo the pde/pdpe wrap logic (Chris)

v4: s/1MB/2MB in commit message (Imre)
Plug leaking gen8_pt_pages in both the error path, as well as general
free case (Imre)

v5: Rename leftover "which_" variables (Imre)
Add the pde = 0 wrap that was missed from v3 (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in fixup from Ben.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:41 +01:00
Dave Airlie
786a7828bc Merge branch 'drm-next-3.15' of git://people.freedesktop.org/~deathsimple/linux into drm-next
this is the second pull request for 3.15 radeon changes. Highlights this time:
- Better VRAM usage
- VM page table rework
- Enabling different UVD clocks again
- Some general cleanups and improvements

* 'drm-next-3.15' of git://people.freedesktop.org/~deathsimple/linux:
  drm/radeon: remove struct radeon_bo_list
  drm/radeon: drop non blocking allocations from sub allocator
  drm/radeon: remove global vm lock
  drm/radeon: use normal BOs for the page tables v4
  drm/radeon: further cleanup vm flushing & fencing
  drm/radeon: separate gart and vm functions
  drm/radeon: fix VCE suspend/resume
  drm/radeon: fix missing bo reservation
  drm/radeon: limit how much memory TTM can move per IB according to VRAM usage
  drm/radeon: validate relocations in the order determined by userspace v3
  drm/radeon: add buffers to the LRU list from smallest to largest
  drm/radeon: deduplicate code in radeon_gem_busy_ioctl
  drm/radeon: track memory statistics about VRAM and GTT usage and buffer moves v2
  drm/radeon: add a way to get and set initial buffer domains v2
  drm/radeon: use variable UVD clocks
  drm/radeon: cleanup the fence ring locking code
  drm/radeon: improve ring lockup detection code v2
2014-03-05 14:52:19 +10:00
Ben Widawsky
782f149523 drm/i915: Make clear/insert vfuncs args absolute
This patch converts insert_entries and clear_range, both functions which
are specific to the VM. These functions tend to encapsulate the gen
specific PTE writes. Passing absolute addresses to the insert_entries,
and clear_range will help make the logic clearer within the functions as
to what's going on. Currently, all callers simply do the appropriate
page shift, which IMO, ends up looking weird with an upcoming change for
the gen8 page table allocations.

Up until now, the PPGTT was a funky 2 level page table. GEN8 changes
this to look more like a 3 level page table, and to that extent we need
a significant amount more memory simply for the page tables. To address
this, the allocations will be split up in finer amounts.

v2: Replace size_t with uint64_t (Chris, Imre)

v3: Fix size in gen8_ppgtt_init (Ben)
Fix Size in i915_gem_suspend_gtt_mappings/restore (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:57:52 +01:00
Ben Widawsky
bf2b4ed291 drm/i915/bdw: Split ppgtt initialization up
Like cleanup in an earlier patch, the code becomes much more readable,
and easier to extend if we extract out helper functions for the various
stages of init.

Note that with this patch it becomes really simple, and tempting to begin
using the 'goto out' idiom with explicit free/fini semantics. I've
kept the error path as similar as possible to the cleanup() function to
make sure cleanup is as robust as possible

v2: Remove comment "NB:From here on, ppgtt->base.cleanup() should
function properly"
Update commit message to reflect above

v3: Rebased on top of bugfixes found in the previous patch by Imre
Moved number of pd pages assertion to the proper place (Imre)

v4:
Allocate dma address space for num_pd_pages, not num_pd_entries (Ben)
Don't use gen8_pt_dma_addr after free on error path (Imre)
With new fix from v4 of the previous patch.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:54:59 +01:00
Ben Widawsky
f3a964b96d drm/i915/bdw: Reorganize PPGTT init
Create 3 clear stages in PPGTT init. This will help with upcoming
changes be more readable. The 3 stages are, allocation, dma mapping, and
writing the P[DT]Es

One nice benefit to the patches is that it makes 2 very clear error
points, allocation, and mapping, and avoids having to do any handling
after writing PTEs (something which was likely buggy before). This
simplified error handling I suspect will be helpful when we move to
deferred/dynamic page table allocation and mapping.

The patches also attempts to break up some of the steps into more
logical reviewable chunks, particularly when we free.

v2: Don't call cleanup on the error path since that takes down the
drm_mm and list entry, which aren't setup at this point.

v3: Fixes addressing Imre's comments from:
<1392821989.19792.13.camel@intelbox>

Don't do dynamic allocation for the page table DMA addresses. I can't
remember why I did it in the first place. This addresses one of Imre's
other issues.

Fix error path leak of page tables.

v4: Fix the fix of the error path leak. Original fix still leaked page
tables. (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:54:31 +01:00
Ben Widawsky
b18b6bde30 drm/i915/bdw: Free PPGTT struct
GEN8 never freed the PPGTT struct. As GEN8 doesn't use full PPGTT, the
leak is small and only found on a module reload. ie. I don't think this
needs to go to stable.

v2: The very naive, kfree in gen8 ppgtt cleanup, is subject to a double
free on PPGTT initialization failure. (Spotted by Imre). Instead this
patch pulls the ppgtt struct freeing out of the cleanup and leaves it to
the allocators/callers or the one doing the last kref_put as in standard
convention

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:53:58 +01:00
Ben Widawsky
321f2ada91 drm/i915: Move ppgtt_release out of the header
At one time it was expected to be called in multiple places by kref_put.
At the current time however, it is all contained within
i915_gem_context.c.

This patch makes an upcoming required addition a bit nicer since it too
doesn't need to be defined in a header file.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:52:15 +01:00
Ville Syrjälä
c5c98a5899 drm/i915: Add a comment about WIZ hashing vs. thread counts
Add a comment next to our WIZ hashing setup to remind people about the
link between WIZ hashing disable bit and PS/WM thread counts.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:39:35 +01:00
Ville Syrjälä
36075a4cad drm/i915: Change BDW WIZ hashing mode to 16x4
BSpec recommends using 8x4 hashing mode when MSAA is used. But in
practice 16x4 seems to have a slight edge in performance (on IVB and
HSW at least). So just use 16x4.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:39:24 +01:00
Ville Syrjälä
a12c4967c9 drm/i915: Change HSW WIZ hashing mode to 16x4
BSpec recommends using 8x4 hashing mode when MSAA is used. But in
practice 16x4 seems to have a slight edge in performance (on IVB and
HSW at least). So just use 16x4.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:39:15 +01:00
Ville Syrjälä
a607c1a41d drm/i915: Change IVB WIZ hashing mode to 16x4
BSpec recommends using 8x4 hashing mode when MSAA is used. But in
practice 16x4 seems to have a slight edge in performance (on IVB and
HSW at least). So just use 16x4.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:39:05 +01:00
Ville Syrjälä
743b57d830 drm/i915: There's no need to mask all 3D_CHICKEN bits on SNB
The need to set all of the mask bits for 3D_CHICKEN3 was required
only for pre-production hardware.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:37:11 +01:00
Ville Syrjälä
5eb146dd0b drm/i915: Assume we implement WaStripsFansDisableFastClipPerformanceFix:snb
Based on the name, the workaround we implement is
WaStripsFansDisableFastClipPerformanceFix. Unfortunately there's no
description in the w/a database, so this is just a guess.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:36:22 +01:00
Ville Syrjälä
8d85d27281 drm/i915: Fix SNB GT_MODE register setup
On SNB we set up WaSetupGtModeTdRowDispatch:snb early in
gen6_init_clock_gating(). That sets a bit in the GEN6_GT_MODE register.
However later we go and disable all the bits in the same register. And
then we go on to set some other bit. So apparently we never actually
implemented this workaround since the "disable all bits" part was there
already before the w/a got supposedly implemented.

These are the relevant commits:

 commit 6547fbdbff
 Author: Daniel Vetter <daniel.vetter@ffwll.ch>
 Date:   Fri Dec 14 23:38:29 2012 +0100

    drm/i915: Implement WaSetupGtModeTdRowDispatch

 commit f8f2ac9a76
 Author: Ben Widawsky <ben@bwidawsk.net>
 Date:   Wed Oct 3 19:34:24 2012 -0700

    drm/i915: Fix GT_MODE default value

So, let's drop the "disable all bits" part, move both writes to
closer proxomity to each other, and name the WIZ hashing bits
appropriately. BSpec is still a bit confused how the bits should
actually be interpreted, but I took the the description for the
high bit since the low bit part only lists values for a single bit.

Also add a comment about our choice of WIZ hashing mode.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:35:52 +01:00
Paulo Zanoni
5bfa0199e9 drm/i915: get/put runtime PM without holding rps.hw_lock
We'll need this when we merge PC8 and Runtime PM: the PC8
enable/disable functions need that lock.

Also, it's good practice to not hold a lock for longer than strictly
needed.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:13:04 +01:00
Paulo Zanoni
da7235692c drm/i915: rename modeset_update_power_wells
To modeset_update_crtc_power_domains, since this function is
responsible for updating all the power domains of all CRTCs after a
modeset. In the future we should also run this function on all
platforms, not just Haswell.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:11:28 +01:00
Christian König
df0af4403a drm/radeon: remove struct radeon_bo_list
Just move all fields into radeon_cs_reloc, removing unused/duplicated fields.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-03-04 14:34:34 +01:00
Thierry Reding
d4d5be6192 drm/i915: Remove dead code
The i915 driver sets DRIVER_GEM unconditionally, so testing for the
feature will always fail.

Signed-off-by: Thierry Reding <treding@nvidia.com>
[danvet: Fix up conflicts.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 09:56:48 +01:00
Dave Airlie
4d33f3aa1c Merge tag 'drm-intel-next-2014-02-14' of ssh://git.freedesktop.org/git/drm-intel into drm-next
- Fix the execbuf rebind performance regression due to topic/ppgtt (Chris).
- Fix up the connector cleanup ordering for sdvod i2c and dp aux devices (Imre).
- Try to preserve the firmware modeset config on driver load. And a bit of prep
  work for smooth takeover of the fb contents (Jesse).
- Prep cleanup for larger gtt address spaces on bdw (Ben).
- Improve our vblank_wait code to make hsw modesets faster (Paulo).
- Display debugfs file (Jesse).
- DRRS prep work from Vandana Kannan.
- pipestat interrupt handler to fix a few races around vblank/pageflip handling
  on byt (Imre).
- Improve display fuse handling for display-less SKUs (Damien).
- Drop locks while stalling for the gpu when serving pagefaults to improve
  interactivity (Chris).
- And as usual piles of other improvements and small fixes all over.

* tag 'drm-intel-next-2014-02-14' of ssh://git.freedesktop.org/git/drm-intel: (65 commits)
  drm/i915: fix NULL deref in the load detect code
  drm/i915: Only bind each object rather than for every execbuffer
  drm/i915: Directly return the vma from bind_to_vm
  drm/i915: Simplify i915_gem_object_ggtt_unpin
  drm/i915: Allow blocking in the PDE alloc when running low on gtt space
  drm/i915: Don't allocate context pages as mappable
  drm/i915: Handle set_cache_level errors in the status page setup
  drm/i915: Don't pin the status page as mappable
  drm/i915: Don't set PIN_MAPPABLE for legacy ringbuffers
  drm/i915: Handle set_cache_level errors in the pipe control scratch setup
  drm/i915: split PIN_GLOBAL out from PIN_MAPPABLE
  drm/i915: Consolidate binding parameters into flags
  drm/i915: sdvo: add i2c sysfs symlink to the connector's directory
  drm/i915: sdvo: fix error path in sdvo_connector_init
  drm/i915: dp: fix order of dp aux i2c device cleanup
  drm/i915: add unregister callback to connector
  drm/i915: don't reference null pointer at i915_sink_crc
  drm/i915/lvds: Remove dead code from failing case
  drm/i915: don't preserve inherited configs with nothing on v2
  drm/i915/bdw: Split up PPGTT cleanup
  ...
2014-03-04 07:51:41 +10:00
Christian König
4d15264662 drm/radeon: drop non blocking allocations from sub allocator
Not needed any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
2014-03-03 11:26:39 +01:00
Christian König
529364e05b drm/radeon: remove global vm lock
Not needed any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-03-03 11:26:27 +01:00
Christian König
6d2f2944e9 drm/radeon: use normal BOs for the page tables v4
No need to make it more complicated than necessary,
just allocate the page tables as normal BO and
flush whenever the address change.

v2: update comments and function name
v3: squash bug fixes, page directory and tables patch
v4: rebased on Mareks changes

Signed-off-by: Christian König <christian.koenig@amd.com>
2014-03-03 11:26:08 +01:00
Christian König
fa68834342 drm/radeon: further cleanup vm flushing & fencing
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-03-03 11:03:35 +01:00
Christian König
2280ab57b6 drm/radeon: separate gart and vm functions
Both are complex enough on their own.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-03-03 11:03:34 +01:00
Christian König
b03b4e4b6e drm/radeon: fix VCE suspend/resume
Signed-off-by: Christian König <christian.koenig@amd.com>
2014-03-03 11:03:32 +01:00
Christian König
f1e3dc708a drm/radeon: fix missing bo reservation
Signed-off-by: Christian König <christian.koenig@amd.com>
2014-03-03 11:03:29 +01:00
Marek Olšák
19dff56a5f drm/radeon: limit how much memory TTM can move per IB according to VRAM usage
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-03-03 11:00:24 +01:00
Marek Olšák
c9b7654889 drm/radeon: validate relocations in the order determined by userspace v3
Userspace should set the first 4 bits of drm_radeon_cs_reloc::flags to
a number from 0 to 15. The higher the number, the higher the priority,
which means a buffer with a higher number will be validated sooner.

The old behavior is preserved: Buffers used for write are prioritized over
read-only buffers if the userspace doesn't set the number.

v2: add buffers to buckets directly, then concatenate them
v3: use a stable sort

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-03-03 10:57:19 +01:00
Marek Olšák
4330441a74 drm/radeon: add buffers to the LRU list from smallest to largest
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-03-03 10:57:15 +01:00
Marek Olšák
0bc490a8d9 drm/radeon: deduplicate code in radeon_gem_busy_ioctl
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-03-03 10:57:10 +01:00
Marek Olšák
67e8e3f970 drm/radeon: track memory statistics about VRAM and GTT usage and buffer moves v2
The statistics are:
- VRAM usage in bytes
- GTT usage in bytes
- number of bytes moved by TTM

The last one is actually a counter, so you need to sample it before and after
command submission and take the difference.

This is useful for finding performance bottlenecks. Userspace queries are
also added.

v2: use atomic64_t

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-03-03 10:54:19 +01:00
Marek Olšák
bda72d58a2 drm/radeon: add a way to get and set initial buffer domains v2
When passing buffers between processes, the receiving process needs to know
the original buffer domain, so that it doesn't accidentally move the buffer.

v2: reserve the buffer

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-03-03 10:53:01 +01:00
Daniel Vetter
b5ea642a76 drm/i915: sprinkle static
Apparently we've missed a few more than what Fengguang's 0-day tester
recently reported in i915_irq.c ... Makes sparse happy again (ignore
some spurious stuff about ksyms of exported functions).

Cc: kbuild test robot <fengguang.wu@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-02 21:19:51 +01:00
Alex Deucher
14a9579ddb drm/radeon: use variable UVD clocks
Now that Christian fixed the performance problems with
the feedback buffer in mesa, we can enable variable UVD
clocks.  There are multiple UVD power states associated
with different types and numbers of streams.  This uses
the appropriate state based on that information rather
than always using the fastest UVD clocks which saves some
power.  One possible downside is that this may adversely
affect decode benchmarks since these power states target
specific playback requirements rather than maximum
performance.  If that becomes an issue, we can add a
sysfs attribute to force the max UVD state.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-02-28 10:53:20 +01:00
Christian König
37615527c5 drm/radeon: cleanup the fence ring locking code
We no longer need to take the ring lock while checking for
a gpu lockup, so just cleanup the code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-02-28 10:53:18 +01:00
Christian König
aee4aa73a1 drm/radeon: improve ring lockup detection code v2
Use atomics and jiffies_64, so that we don't need to have the
ring mutex locked any more and avoid wrap arounds.

v2: fix some checkpatch warnings

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-02-28 10:53:16 +01:00
Dave Airlie
4d538b7919 Merge branch 'drm-next-3.15' of git://people.freedesktop.org/~deathsimple/linux into drm-next
So this is the initial pull request for radeon drm-next 3.15. Highlights:
- VCE bringup including DPM support
- Few cleanups for the ring handling code

* 'drm-next-3.15' of git://people.freedesktop.org/~deathsimple/linux:
  drm/radeon: cleanup false positive lockup handling
  drm/radeon: drop radeon_ring_force_activity
  drm/radeon: drop drivers copy of the rptr
  drm/radeon/cik: enable/disable vce cg when encoding v2
  drm/radeon: add support for vce 2.0 clock gating
  drm/radeon/dpm: properly enable/disable vce when vce pg is enabled
  drm/radeon/dpm: enable dynamic vce state switching v2
  drm/radeon: add vce dpm support for KV/KB
  drm/radeon: enable vce dpm on CI
  drm/radeon: add vce dpm support for CI
  drm/radeon: fill in set_vce_clocks for CIK asics
  drm/radeon/dpm: fetch vce states from the vbios
  drm/radeon/dpm: fill in some initial vce infrastructure
  drm/radeon/dpm: move platform caps fetching to a separate function
  drm/radeon: add callback for setting vce clocks
  drm/radeon: add VCE version parsing and checking
  drm/radeon: add VCE ring query
  drm/radeon: initial VCE support v4
  drm/radeon: fix CP semaphores on CIK
2014-02-27 14:39:30 +10:00