Mir

Merge lp:~vanvugt/mir/enable-dynamic-queue-scaling into lp:mir

Proposed by Daniel van Vugt
Status: Work in progress
Proposed branch: lp:~vanvugt/mir/enable-dynamic-queue-scaling
Merge into: lp:mir
Prerequisite: lp:~vanvugt/mir/fix-1480164-tests-scaling
Diff against target: 94 lines (+52/-7)
2 files modified
src/server/compositor/buffer_queue.cpp (+1/-1)
tests/acceptance-tests/test_latency.cpp (+51/-6)
To merge this branch: bzr merge lp:~vanvugt/mir/enable-dynamic-queue-scaling
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Needs Fixing
Mir CI Bot continuous-integration Approve
Daniel van Vugt Needs Information
Kevin DuBois (community) Abstain
Review via email: mp+285031@code.launchpad.net

Commit message

Enable dynamic buffer queue scaling to minimise latency, and tighten
the acceptance tests to show it's working. (LP: #1240909)

The relevant unit and integration tests are already present.

This only just became feasible after LP: #1476201 was fixed in Mir 0.19.0.
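The mechanism being re-enabled can be sketched roughly as follows. This is an illustrative model of the heuristic only: the names frame_deadlines_threshold and frame_deadlines_met appear in the diff below, but the surrounding logic here is simplified guesswork, not Mir's actual implementation.

```cpp
// Illustrative sketch (not Mir code): the queue counts consecutive frames
// for which the client met its rendering deadline. Only after the count
// reaches frame_deadlines_threshold (30 frames, around half a second at
// 60Hz) is it considered safe to shrink the queue and cut latency.
struct QueueScaler
{
    int frame_deadlines_threshold = 30;
    int frame_deadlines_met = 0;

    // Returns true once it is safe to drop the spare buffer.
    bool on_frame(bool deadline_met)
    {
        if (deadline_met)
            ++frame_deadlines_met;
        else
            frame_deadlines_met = 0;  // any stutter resets the count
        return frame_deadlines_met >= frame_deadlines_threshold;
    }
};
```

The point of the threshold is hysteresis: a single fast frame proves nothing, and a single slow frame immediately disqualifies the client from the smaller queue.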

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I need to do full stack testing on the phone again, but have run out of time today.

review: Needs Information
Revision history for this message
Mir CI Bot (mir-ci-bot) wrote :

PASSED: Continuous integration, rev:3300
https://mir-jenkins.ubuntu.com/job/mir-ci/228/
Executed test runs:
    None: https://mir-jenkins.ubuntu.com/job/generic-update-mp/228/console

Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/228/rebuild

review: Approve (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Kevin DuBois (kdub) wrote :

In the new model we don't scale the number of buffers so much as the spare/third buffer simply goes unused, so I don't think much has to be done to the new system to get a similar sort of feature. Given that an equivalent already exists, and that this has been in flight for a while: +1 (once it gets a testing sign-off, I suppose; abstain for now?)

This is more coordination than a code review really, but is this the last bit of the exchange-based features we'll be testing? I suppose I'd rather the full stack testing go into the new system (aiming for 0.20/0.21 as the switchover release).

review: Abstain
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I've already invested much of 2015 in this approach and the dependent pieces. So although you aim to retire it at some point, it is somewhat mature and was only turned off last year as a result of full stack testing on the phone (LP: #1476201).

As you can see in the acceptance test, there is precedent for us intentionally increasing latency when we were trying to boost frame rates (because you do need to do that first). So it's not as though NBS will be held to the same low-latency threshold; it's just a nice goal to have. My planned frame notification work will actually reduce latency of the full nested system even more than this, so even if NBS did have higher per-queue latency, it wouldn't matter in the end.

And yeah, I think this is the last bit. As you can see all the work was done last year, so it's not terribly time consuming to try and turn on the optimizations now.
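The low-latency goal discussed above is quantified in the revised acceptance test below: with dynamic scaling, average latency for a client holding n buffers should land between 1 and n-1 frames (previously it could reach n with the "early release" optimization). A hypothetical helper expressing those bounds (the function name is mine, not from the test):

```cpp
// Hypothetical helper mirroring the bounds in test_latency.cpp: with
// dynamic queue scaling, average latency (in frames) for a client
// holding client_buffers buffers should fall within [1, client_buffers - 1].
// Without scaling it could reach client_buffers.
bool latency_within_scaled_bounds(float observed, int client_buffers)
{
    float const expected_min = 1.0f;                  // sanity floor
    float const expected_max = client_buffers - 1.0f; // scaled ceiling
    return observed >= expected_min && observed <= expected_max;
}
```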

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Cool, Jenkins is failing for the right reasons: GLMark2 can't maintain a high enough frame rate on mobile. I'll need to fix that, or this branch is a non-starter.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Still needs more manual testing. There's a chance that slightly slow clients like the dash only ever keep up with 60Hz if you give them triple buffers. So, in theory, such a pathological case could be made worse by this branch (skipping one in every 30 frames, or two skips per second). That would not be good, so this needs some careful manual testing.

If there's any doubt, this effort will need to be trashed. And that's OK too, because regardless of the outcome, my next task after this will benefit all clients whatever their performance profile.
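The arithmetic above (one skip per 30 frames is two skips per second at 60Hz) can be sanity-checked with a toy model. Everything here is hypothetical illustration, not Mir code: it just models a client that stalls once every 30 frames and a spare buffer that can absorb one stall.

```cpp
// Toy model (hypothetical, not Mir code): a client rendering at 60Hz
// stalls once every 30 frames. A spare (third) buffer lets it bank one
// frame ahead and absorb the stall; with only two buffers, every stall
// costs a skipped frame on screen.
int skips_per_second(int nbuffers)
{
    int const frames_per_second = 60;
    int const stall_every = 30;           // client stutters every 30th frame
    int const max_banked = nbuffers - 2;  // frames queued beyond the current one
    int banked = 0;
    int skips = 0;

    for (int frame = 1; frame <= frames_per_second; ++frame)
    {
        if (frame % stall_every == 0)     // client misses this vblank
        {
            if (banked > 0)
                --banked;                 // spare buffer covers the miss
            else
                ++skips;                  // nothing queued: frame skipped
        }
        else if (banked < max_banked)
        {
            ++banked;                     // fast frame: client runs ahead
        }
    }
    return skips;
}
```

Under this model, double buffering yields two skips per second while triple buffering yields none, which is exactly the regression the manual testing needs to rule out.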

review: Needs Information
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Alternatively, this work may just end up depending on us fixing this first:
   https://trello.com/c/puXQXSYw

Revision history for this message
Mir CI Bot (mir-ci-bot) wrote :

PASSED: Continuous integration, rev:3310
https://mir-jenkins.ubuntu.com/job/mir-ci/302/
Executed test runs:
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build/90
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-0-fetch/96
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=vivid+overlay/92
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=xenial/92
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-advanced/arch=amd64,compiler=gcc,platform=mesa,release=xenial/92
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-advanced/arch=amd64,compiler=gcc,platform=mesa,release=xenial/92/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-advanced/arch=cross-armhf,compiler=gcc,platform=android,release=vivid+overlay/92
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-advanced/arch=cross-armhf,compiler=gcc,platform=android,release=vivid+overlay/92/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-advanced/arch=i386,compiler=gcc,platform=android,release=vivid+overlay/92
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-advanced/arch=i386,compiler=gcc,platform=android,release=vivid+overlay/92/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-advanced/arch=i386,compiler=gcc,platform=mesa,release=xenial/92
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-advanced/arch=i386,compiler=gcc,platform=mesa,release=xenial/92/artifact/output/*zip*/output.zip

Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/302/rebuild

review: Approve (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)

Unmerged revisions

3310. By Daniel van Vugt

Shrink the diff and add another handy test

3309. By Daniel van Vugt

Fix a failing test (which expected high latency but no longer found it).
Add another test specifically to verify latency stays low after a period
of sufficiently fast rendering.

3308. By Daniel van Vugt

Re-enable (very conservatively) queue scaling again.

3307. By Daniel van Vugt

Merge latest trunk

3306. By Daniel van Vugt

Merge latest trunk

3305. By Daniel van Vugt

Merge latest trunk

3304. By Daniel van Vugt

Merge latest trunk

3303. By Daniel van Vugt

Merge latest trunk

3302. By Daniel van Vugt

Merge latest trunk

Preview Diff

=== modified file 'src/server/compositor/buffer_queue.cpp'
--- src/server/compositor/buffer_queue.cpp 2016-02-12 04:02:11 +0000
+++ src/server/compositor/buffer_queue.cpp 2016-02-15 08:26:32 +0000
@@ -137,7 +137,7 @@
     graphics::BufferProperties const& props,
     mc::FrameDroppingPolicyFactory const& policy_provider)
     : nbuffers{nbuffers},
-      frame_deadlines_threshold{-1}, // Disable scaling by default
+      frame_deadlines_threshold{30}, // around half a second to be sure
       frame_deadlines_met{0},
       scheduled_extra_frames{0},
       frame_dropping_enabled{false},
 
=== modified file 'tests/acceptance-tests/test_latency.cpp'
--- tests/acceptance-tests/test_latency.cpp 2016-02-10 04:27:20 +0000
+++ tests/acceptance-tests/test_latency.cpp 2016-02-15 08:26:32 +0000
@@ -190,6 +190,15 @@
         return max;
     }
 
+    unsigned int max_latency_from_frame(unsigned int index) const
+    {
+        unsigned int max = 0;
+        for (auto i = index; i < latency_list.size(); ++i)
+            if (latency_list[i] > max)
+                max = latency_list[i];
+        return max;
+    }
+
 private:
     std::mutex mutex;
     std::vector<unsigned int> latency_list;
@@ -246,12 +255,10 @@
     ASSERT_TRUE(stats.wait_for_posts(test_submissions,
                                      std::chrono::seconds(60)));
 
-    // Note: Using the "early release" optimization without dynamic queue
-    // scaling enabled makes the expected latency possibly up to
-    // nbuffers instead of nbuffers-1. After dynamic queue scaling is
-    // enabled, the average will be lower than this.
-    float const expected_max_latency = expected_client_buffers;
-    float const expected_min_latency = expected_client_buffers - 1;
+    // This is to verify the server is performing as expected:
+    float const expected_max_latency = expected_client_buffers - 1;
+    // and this is just a sanity check that the test itself isn't broken:
+    float const expected_min_latency = 1;
 
     auto observed_latency = display.group.average_latency();
 
@@ -259,6 +266,44 @@
     EXPECT_THAT(observed_latency, Lt(expected_max_latency+error_margin));
 }
 
+TEST_F(ClientLatency, latency_scales_down_for_a_smooth_client)
+{
+    using namespace testing;
+
+    auto stream = mir_surface_get_buffer_stream(surface);
+    for(auto i = 0u; i < test_submissions; i++) {
+        auto submission_id = mir_debug_surface_current_buffer_id(surface);
+        stats.record_submission(submission_id);
+        mir_buffer_stream_swap_buffers_sync(stream);
+    }
+
+    ASSERT_TRUE(stats.wait_for_posts(test_submissions,
+                                     std::chrono::seconds(60)));
+
+    EXPECT_THAT(display.group.max_latency_from_frame(50), Le(1));
+}
+
+TEST_F(ClientLatency, latency_doesnt_scale_down_for_a_stuttery_client)
+{
+    using namespace testing;
+
+    auto stream = mir_surface_get_buffer_stream(surface);
+    for(auto i = 0u; i < test_submissions; i++) {
+        auto submission_id = mir_debug_surface_current_buffer_id(surface);
+        stats.record_submission(submission_id);
+        mir_buffer_stream_swap_buffers_sync(stream);
+
+        // Stutter
+        if (i % 10 == 0)
+            std::this_thread::sleep_for(vblank_interval * 2);
+    }
+
+    ASSERT_TRUE(stats.wait_for_posts(test_submissions,
+                                     std::chrono::seconds(60)));
+
+    EXPECT_THAT(display.group.max_latency_from_frame(50), Ge(2));
+}
+
 TEST_F(ClientLatency, latency_is_limited_to_nbuffers)
 {
     using namespace testing;

Subscribers

People subscribed via source and target branches