Mir

Double-buffered compositing performance is sometimes very poor (30 FPS) on intel

Bug #1377872 reported by Daniel van Vugt
This bug affects 3 people
Affects        Status     Importance  Assigned to  Milestone
Mir            Invalid    Medium      Unassigned
mesa (Ubuntu)  Won't Fix  Medium      Unassigned

Bug Description

Double-buffered compositing performance is sometimes artificially poor on some Intel systems. When this happens the frame rate is halved to about 30 FPS. At the same time, however, Mir is observed to use very little render time, and both the CPU and GPU are still mostly idle. The Intel DRM logic simply sometimes takes two frames (~32ms) to complete a single page flip.

Two known workarounds avoid the issue:
  (a) Add glFinish() into the mesa DisplayBuffer code; or
  (b) env INTEL_DEBUG=sync for the Mir server binary.

Tags: performance
summary: - Double-buffered compositing performance is poor
+ Double-buffered compositing performance is very poor (30 FPS)
Revision history for this message
Daniel van Vugt (vanvugt) wrote : Re: Double-buffered compositing performance is very poor (30 FPS)

It gets weirder... Start a bunch more clients, or just one that has swap_interval==0, and all compositing becomes fast again (60Hz).

Somewhere in the middle there's a number of clients that makes all compositing slow. But a very low or very high number makes it fast.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

What's happening:
All clients have frames rendered and are sleeping waiting for new buffers.
Server/compositor has rendered and swapped buffers.
Server is sleeping waiting for vblank (page flip).
Server and all clients are completely idle for 16ms (at the same time).
Page flip completes.
Server releases all compositing buffers.
Clients get new buffers and render new frames.

There is now less than one frame to do all of:
  1. Send new buffers to all clients.
  2. All clients render new frames.
  3. Send newly rendered buffers from all clients to the server.
  4. Server schedules compositing.
  5. Server does compositing/rendering.
  6. Compositor swaps buffers.
  7. Server waits for page flip (vsync).

So you have the host CPU and GPU mostly idle most of the time, and clients are forced to produce new frames in much less than 16ms. And if none of them make the deadline then the compositor doesn't wake up, skipping a frame and only compositing at 30Hz.
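
The deadline arithmetic in this hypothesis can be sketched as follows. This is only a toy model, not Mir code: the function names and the idea of summing steps 1 to 6 into a single `work_us` figure are assumptions made for illustration. It shows that as soon as the serialized work exceeds one vblank interval, the compositor lands on every second vblank and the rate halves.

```c
#include <assert.h>

/* Hypothetical model: how many vblank intervals one composited frame
 * consumes when all the work (steps 1-6) is serialized after the flip. */
unsigned vblanks_per_frame(unsigned work_us, unsigned vblank_us)
{
    assert(vblank_us > 0);
    /* Work that misses a vblank has to wait for the next one (ceiling). */
    unsigned n = (work_us + vblank_us - 1) / vblank_us;
    return n ? n : 1;
}

/* Effective compositing rate for a given refresh rate and work time. */
unsigned effective_hz(unsigned refresh_hz, unsigned work_us)
{
    assert(refresh_hz > 0);
    unsigned vblank_us = 1000000u / refresh_hz;
    return refresh_hz / vblanks_per_frame(work_us, vblank_us);
}
```

For example, `effective_hz(60, 5000)` stays at 60, while `effective_hz(60, 17000)` drops to 30, which matches the halved frame rate described above.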

The problem is presently hidden on platforms where our DisplayBuffer is triple buffered (Mesa and some Androids), but it would be nice to improve our parallelism so that double buffering in the compositor can actually work without being stuck at 30Hz.

Changed in mir:
status: New → In Progress
milestone: none → 0.9.0
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Hmm, that analysis might be wrong too. Needs more investigation.

Changed in mir:
importance: Undecided → Medium
Changed in mir:
assignee: Daniel van Vugt (vanvugt) → Cemil Azizoglu (cemil-azizoglu)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Workaround (for the Mir server): env INTEL_DEBUG=sync

Revision history for this message
In , Daniel van Vugt (vanvugt) wrote :

In the Mir server code (DRI output) we call:
  1. eglSwapBuffers()
  2. get the new front buffer
  3. schedule a page flip: drmModePageFlip()

This works well, however if I force it to wait for the page flip immediately:
  4. select() on the DRM fd and then drmHandleEvent()
then step 4 (under some rare but predictable rendering loads) takes 32ms to complete.

I've now confirmed it is just the page flip event that takes almost two frames to arrive. And there are two workarounds that seem to successfully kick the driver into action:
  3.5. glFinish()
or
  0. env INTEL_DEBUG=sync
Using either of these workarounds, rendering completes in about 1ms and select then returns the next page flip event (~16ms interval).

So it seems the intel batching logic is deferring rendering way too long, or the page flip event delivery is being deferred. However the two workarounds suggest the former.
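
The timing of step 4 can be measured with a small helper along these lines. It is a sketch using only POSIX calls; `wait_readable_ms` and `demo_pipe_wait` are hypothetical names, and a pipe stands in for the DRM fd so the example runs without libdrm. In the real server the fd would be the DRM fd and a call to drmHandleEvent() would follow once select() reports it readable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Block until fd is readable or timeout_ms expires.
 * Returns the elapsed time in milliseconds, or -1 on error or timeout. */
long wait_readable_ms(int fd, int timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    fd_set fds;
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

    if (select(fd + 1, &fds, NULL, NULL, &tv) <= 0)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &end);
    long long ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
                 + (end.tv_nsec - start.tv_nsec);
    return (long)(ns / 1000000LL);
}

/* Usage example: a pipe with one byte pending stands in for a DRM fd
 * that already has a page flip event queued. */
long demo_pipe_wait(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;
    long ms = -1;
    if (write(p[1], "x", 1) == 1)
        ms = wait_readable_ms(p[0], 1000);
    close(p[0]);
    close(p[1]);
    return ms;
}
```

On the affected system a measurement like this would be expected to report about 32ms for the page flip event, and about 16ms with either workaround applied.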

Using:
Mesa 10.3.2-0ubuntu1 (Ubuntu 15.04 vivid)
Intel® HD Graphics 4600 (Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz)

description: updated
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

As I've tracked the issue into the intel driver code and the workarounds for test case A don't help B, we might have to split this bug in two.

Changed in mir:
assignee: Cemil Azizoglu (cemil-azizoglu) → Daniel van Vugt (vanvugt)
Revision history for this message
In , Daniel van Vugt (vanvugt) wrote :

I wonder if this is just a race?

If the page flip actually completes faster than select() takes to start up then that would explain it.

Changed in mesa:
importance: Unknown → Medium
status: Unknown → Confirmed
Changed in mir:
milestone: 0.9.0 → 0.10.0
summary: - Double-buffered compositing performance is very poor (30 FPS)
+ Double-buffered compositing performance is very poor (30 FPS) on intel
Revision history for this message
Daniel van Vugt (vanvugt) wrote : Re: Double-buffered compositing performance is very poor (30 FPS) on intel

Not work in progress any more.

I think the proposed branch is the final one but we need the more significant performance bug 1395421 fixed first.

Changed in mir:
status: In Progress → Triaged
milestone: 0.10.0 → none
Revision history for this message
In , Daniel van Vugt (vanvugt) wrote :

Hmm, I wonder if this is another case of the i915 driver not keeping the kernel awake enough? Although I found this bug on a reasonably powerful i7, I recently also found an intel sleep states bug that affects low-end chips:
https://bugs.launchpad.net/mir/+bug/1388490

Maybe these two bugs are in the same ballpark...?

Revision history for this message
Daniel van Vugt (vanvugt) wrote : Re: Double-buffered compositing performance is very poor (30 FPS) on intel

Can't reproduce this any more.

Mir now has double buffered clients as well as double buffered compositing so the bug should be more visible, not less. Although vivid also just got updated to Mesa 10.5 which is where I would expect a fix to come from. Seems to be fixed, probably by recent Mesa/kernel updates.

Changed in mir:
status: Triaged → Incomplete
Changed in mesa (Ubuntu):
status: New → Incomplete
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Confirmed by duplicate. kdub is seeing similar.

Changed in mir:
status: Incomplete → Confirmed
Changed in mesa (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
In , Daniel van Vugt (vanvugt) wrote :

Sounds like there is some related movement happening:
https://nouveau.freedesktop.org/patch/40616/
http://patchwork.freedesktop.org/patch/44172/

Although it kind of sounds like the problem might get worse rather than better. Not sure.

Revision history for this message
In , Daniel van Vugt (vanvugt) wrote :

Digging in the kernel, there's some suspicious logic in the i915 driver (used by Mesa i965 etc):

/* Throttle our rendering by waiting until the ring has completed our requests
 * emitted over 20 msec ago.
 *
 * Note that if we were to use the current jiffies each time around the loop,
 * we wouldn't escape the function with any frames outstanding if the time to
 * render a frame was over 20ms.
 *
 * This should get us reasonable parallelism between CPU and GPU but also
 * relatively low latency when blocking on a particular request to finish.
 */
static int
i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)

I think that's the problem. Maybe Mir is behaving so well that a single frame doesn't fill the ring (when Mir is only double buffering). So we have to rely on the 20ms delay in the i915 kernel module that causes us to skip a frame.

I'm still hoping to be wrong, and that this isn't a *feature* of the i915 kernel module.
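
To make the quoted comment concrete, here is a simplified userspace model of the selection it describes: pick the newest request that was emitted more than 20ms ago, and wait on that one. It is based only on the comment text above, not on the actual kernel implementation, and `throttle_wait_index` and `THROTTLE_WINDOW_MS` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define THROTTLE_WINDOW_MS 20 /* window named in the quoted comment */

/* Given request emission times (milliseconds, oldest first), return the
 * index of the newest request emitted more than 20ms before now_ms,
 * or -1 if every request is recent enough that no wait is needed. */
int throttle_wait_index(const long *emitted_ms, size_t n, long now_ms)
{
    assert(emitted_ms != NULL || n == 0);

    /* Computed once, as the comment says, not re-read inside the loop. */
    long recent_enough = now_ms - THROTTLE_WINDOW_MS;
    int target = -1;

    for (size_t i = 0; i < n; i++) {
        if (emitted_ms[i] >= recent_enough)
            break;          /* this request and all later ones are recent */
        target = (int)i;    /* old enough: candidate to wait on */
    }
    return target;
}
```

Note that the 20ms window is longer than one 60Hz frame (about 16.7ms), so in this model a request emitted one frame ago is never old enough to be waited on.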

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I wonder if calling gbm_surface_lock_front_buffer() a bit earlier would keep the system sufficiently awake to avoid this?...

description: updated
summary: - Double-buffered compositing performance is very poor (30 FPS) on intel
+ Double-buffered compositing performance is sometimes very poor (30 FPS)
+ on intel
Changed in mesa (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Good-bad news!

I can reproduce this bug again today, after updating my vivid desktop for the first time in two weeks. Seems some kernel/Mesa change has made it re-appear. And the INTEL_DEBUG workaround no longer works. But --nbuffers=3 does.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Scratch that. What I found yesterday seems to be a different bug. Logged in --> bug 1447896

Revision history for this message
Kevin DuBois (kdub) wrote :

I see the same symptoms on my Intel system. The easiest way for me to reproduce the symptoms is to have a host (in bypass mode, with nbuffers=2), a nested server (with nbuffers=2 or nbuffers=3), and then run eglplasma connected to the nested server.

If egltriangle is running, the host server composites at 60fps. If eglplasma is running, the host server composites at 30fps.

Revision history for this message
Kevin DuBois (kdub) wrote :

Just for clarity's sake, INTEL_DEBUG=sync doesn't seem to make any difference, but nbuffers=3 seems to avert the problem.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Sounds like kdub is experiencing bug 1447896 more than this one too.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Incomplete. It appears we're both experiencing just bug 1447896 now. Bug 1377872 hasn't been definitively seen since late 2014.

Changed in mir:
status: Confirmed → Incomplete
Changed in mesa (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

This might help:
   http://lists.freedesktop.org/archives/intel-gfx/2015-April/063988.html

Although this bug seems to be gone since the end of 2014... hopefully.

Changed in mir:
assignee: Daniel van Vugt (vanvugt) → nobody
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Two years later... Upstream has now offered some thoughts. Although I don't think this bug is around any more(?)

https://bugs.freedesktop.org/show_bug.cgi?id=86366

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Hmm it seems this bug refuses to expire.

Can't be reproduced any more and only happens using a rendering approach we have never used.

no longer affects: mesa
Changed in mir:
status: Incomplete → New
status: New → Incomplete
status: Incomplete → Invalid
Changed in mesa (Ubuntu):
status: Incomplete → Won't Fix
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

A fix for bug 1388490 now exists, so if this one becomes reproducible then it might be fixed too.
