Merge lp:~kdub/powerd/fix-1258655 into lp:powerd

Proposed by Kevin DuBois on 2013-12-18
Status: Rejected
Rejected by: Kevin DuBois on 2014-01-03
Proposed branch: lp:~kdub/powerd/fix-1258655
Merge into: lp:powerd
Diff against target: 227 lines (+72/-47)
1 file modified
src/display.c (+72/-47)
To merge this branch: bzr merge lp:~kdub/powerd/fix-1258655
Reviewer Review Type Date Requested Status
Seth Forshee (community) 2013-12-18 Disapprove on 2013-12-19
PS Jenkins bot continuous-integration Approve on 2013-12-18
Review via email: mp+199582@code.launchpad.net

Commit message

1) eliminating fb_state as a global variable and locking actual_display_state properly.
2) for earlysuspend devices, the responsibility for changing state and turning on the display/compositor via dbus is in the thread-monitoring state. for non-earlysuspend devices, that responsibility remains in the same thread that it is in now.

fixes ( LP: #1258655 )

Description of the change

powerd must ensure that the system is in a usable state before triggering the calls to the display server.

The problem was that for earlysuspend devices, there were two threads that thought they had responsibility for turning the display on, when really only the thread monitoring the files in /sys/power had the right to do so. Furthermore, the variables concerned with display state "actual_display_state" and "fb_state" were accessed by both the file-monitoring thread and the the request-servicing thread, causing races around the actual state of the display.

This MP fixes these issues by:
1) eliminating fb_state as a global variable and locking actual_display_state properly.
2) for earlysuspend devices, the responsibility for changing state and turning on the display/compositor via dbus is in the thread-monitoring state. for non-earlysuspend devices, that responsibility remains in the same thread that it is in now.

fixes ( LP: #1258655 )

Also, please feel free to test by jamming on the power button repeatedly and quickly. I have manually tested on my nexus 4 (an early suspend device) and my nexus 10 (a non-early suspend device).

To post a comment you must log in.
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Seth Forshee (sforshee) wrote :
Download full text (9.7 KiB)

I don't think this actually "fixes" anything, I think it probably just
shuffles timing around and ends up somehow hiding the problem.

On Wed, Dec 18, 2013 at 10:40:39PM -0000, Kevin DuBois wrote:
> Kevin DuBois has proposed merging lp:~kdub/powerd/fix-1258655 into lp:powerd.
>
> Commit message:
> 1) eliminating fb_state as a global variable and locking actual_display_state properly.

Previously all updates to actual_display_state happened on the main
loop. There were a couple of reads off the main loop related to power
button handling, but adding a lock doesn't eliminate any class of races
(i.e. the state changes just after the read). So actual_screen_state
doesn't *need* any locking until you start manipulating it from
wait_for_fb_thread(). Even then I find most of the locking to be
dubious, more on that later.

> powerd must ensure that the system is in a usable state before
> triggering the calls to the display server. The problem was that
> for earlysuspend devices, there were two threads that thought they had
> responsibility for turning the display on, when really only the thread
> monitoring the files in /sys/power had the right to do so. Furthermore,
> the variables concerned with display state "actual_display_state" and
> "fb_state" were accessed by both the file-monitoring thread and the the
> request-servicing thread, causing races around the actual state of the
> display.

Previously only the main thread ever turned the display on/off; the
monitoring thready only scheduled work on the main thread to do so. The
monitoring thread never checks actual_screen_state, only the work which
is scheduled on the main thread does this. Therefore there is no race in
accessing actual_screen_state, since it's always happening on the main
thread.

This change actually reduces serialization. Previously changes to the
display state and to the system power state were serialized relative to
each other by nature of both happening on the main thread. It _should_
be impossible to have earlysuspend enabled while the display was on due
the following:

 * Before the display was turned on update_display_state() requested the
   active state, which would synchronously disable earlysuspend before
   proceeding to turn on the display.
 * The display was turned off before releasing the active state request.
   Releasing this request may cause the system to be put into
   suspend/earlysuspend mode.
 * All of this happens on the main thread and is thus serialized

These changes move the display on/off to another thread, so the
guarantees of the display state being serialized wrt the system power
state are broken. There *are* races between turning the screen on on the
monitor thread and turning it off on the main thread with your changes.
Expanding the amount of code serialized by the mutex in the screen off
case could help some, but I'm not sure that it will ever be entirely
sufficient.

My guess would be that the actual problem is that unity's
setScreenPowerMode() doesn't complete synchronously wrt what powerd is
doing. So something like this might happen:

 * powerd takes an active state request to guarantee that earlysuspend
   is disabled, then calls setScre...

Read more...

Seth Forshee (sforshee) wrote :

Disapproving as per the comments I sent via email. I'm not convinced that the premise behind these changes is correct, but I'm open to being convinced otherwise. In any case I'm almost certain that the changes will introduce regressions.

review: Disapprove
Seth Forshee (sforshee) wrote :

I've been going back over the code, and as long as the dbus calls are handled on the main thread (which was my understanding, but I'm no expert about dbus) then all display on/off code should happen on the main thread. As far as I can tell there are adequate protections to prevent turning on the display when earlysuspend is enabled. I really don't think there are any problems there. In particular the display code holds an active request when unity's setScreenPowerMode() is called, and it's called on the main loop, so there should be absolutely no way that earlysuspend is active when it is called and there should be absolutely no way for earlysuspend to become active until the dbus call returns (and setScreenPowerMode() *is* called synchronously).

I did just remember though bug #1208433, which may be related. I think rsalveti and I chatted about that one once and we both thought that the root cause was probably something in the display driver. maguro and grouper are both earlysuspend devices and neither have the same problem with rapidly pressing the power key. Are you able to try either of those with unity to verify that's still the case?

Kevin DuBois (kdub) wrote :

unity-mir gets two on's from one button press, one from http://bazaar.launchpad.net/~phablet-team/powerd/trunk/view/head:/src/display.c#L415 and one from http://bazaar.launchpad.net/~phablet-team/powerd/trunk/view/head:/src/display.c#L263. The two on's don't happen everytime, they happen like races.

This isn't a dbus race, that code all checked out (best I can tell). Its that two requests are actually sent out in response to a button press. Sometimes the FB is not ready when one of these 'on' requests happens.

From mir's perspective, all the errors emanate from the FB device not being able to post/blank/etc. Best I can tell, the FB is sometimes not available from a power perspective when one of the on calls comes.

my theory about this problem goes something like this:

power button press
update_display_state()
request_sys_state_internal()
   (request to come out of suspend)
display_set_power_mode()
                                   fb comes out of suspend
                                   fb monitor thread calls update_display_state()

I can check the other devices

Seth Forshee (sforshee) wrote :

On Thu, Dec 19, 2013 at 05:54:59PM -0000, Kevin DuBois wrote:
> unity-mir gets two on's from one button press, one from http://bazaar.launchpad.net/~phablet-team/powerd/trunk/view/head:/src/display.c#L415 and one from http://bazaar.launchpad.net/~phablet-team/powerd/trunk/view/head:/src/display.c#L263. The two on's don't happen everytime, they happen like races.
>
> This isn't a dbus race, that code all checked out (best I can tell). Its that two requests are actually sent out in response to a button press. Sometimes the FB is not ready when one of these 'on' requests happens.
>
> >From mir's perspective, all the errors emanate from the FB device not being able to post/blank/etc. Best I can tell, the FB is sometimes not available from a power perspective when one of the on calls comes.

With reference to my comments on the bug, I suspect the first call comes
before the earlysuspend gets around to suspending the framebuffer, and
the second comes when it has been suspended and then resumed. It's a
race with the earlysuspend code in the kernel, which turns out to be
asynchronous with writing to /sys/power/state.

Kevin DuBois (kdub) wrote :

Seth came up with another fix in https://code.launchpad.net/~sforshee/powerd/fix-display-timing/+merge/200431
so moving to rejected.

Unmerged revisions

108. By Kevin DuBois on 2013-12-18

merge base

107. By Kevin DuBois on 2013-12-18

further cleanup, make sure non-early suspend devices still work

106. By Kevin DuBois on 2013-12-18

bit of cleanup in fb thread

105. By Kevin DuBois on 2013-12-18

change state handling to make more robust display interaction

104. By Kevin DuBois on 2013-12-18

add some locking to the variables accessed in both therads

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/display.c'
2--- src/display.c 2013-10-10 15:35:12 +0000
3+++ src/display.c 2013-12-18 22:40:08 +0000
4@@ -54,6 +54,7 @@
5 static gint target_brightness;
6 static gint dim_brightness;
7 static gboolean running_mir = FALSE;
8+static gboolean earlysuspend_used = FALSE;
9
10 /* Assume screen is off to start; we will force it on */
11 static uuid_t internal_request_cookie;
12@@ -64,6 +65,8 @@
13 .flags = 0,
14 };
15
16+GMutex actual_screen_state_guard;
17+
18 /*
19 * When using the proximity sensor, our actual state of the screen
20 * may not match the requested state in internal_state. Therefore
21@@ -77,19 +80,6 @@
22 */
23 static unsigned screen_off_overrides;
24
25-enum fb_state {
26- FB_SLEEP,
27- FB_AWAKE,
28-
29- NUM_FB_STATES
30-};
31-static enum fb_state fb_state = FB_AWAKE;
32-
33-static const char *fb_file_names[] = {
34- [FB_SLEEP] = "/sys/power/wait_for_fb_sleep",
35- [FB_AWAKE] = "/sys/power/wait_for_fb_wake"
36-};
37-
38 static const char *display_state_strs[POWERD_NUM_DISPLAY_STATES] = {
39 [POWERD_DISPLAY_STATE_DONT_CARE] = "off",
40 [POWERD_DISPLAY_STATE_ON] = "on",
41@@ -107,6 +97,10 @@
42 return (flags & AB_ENABLED_MASK) == AB_ENABLED;
43 }
44
45+
46+/* the Display Server (mir or sf) is free to fire off a few more frames as
47+ part of its shutdown process. powerd must guarantee the fb_state is FB_AWAKE,
48+ and the system is not suspended until the request returns. */
49 gboolean display_set_power_mode(int display, char *power_mode)
50 {
51 GError *error = NULL;
52@@ -170,7 +164,10 @@
53
54 gboolean powerd_display_enabled(void)
55 {
56- return actual_screen_state == POWERD_DISPLAY_STATE_ON;
57+ g_mutex_lock(&actual_screen_state_guard);
58+ gboolean ret = (actual_screen_state == POWERD_DISPLAY_STATE_ON);
59+ g_mutex_unlock(&actual_screen_state_guard);
60+ return ret;
61 }
62
63 static int get_target_brightness(gboolean request_bright, int hw_value)
64@@ -221,8 +218,8 @@
65
66 switch (applied_state) {
67 case DISPLAY_STATE_OFF:
68- if (actual_screen_state == DISPLAY_STATE_OFF) {
69- /* Nothing to do */
70+ if (!powerd_display_enabled()) {
71+ /* Nothing to do, display is alread off */
72 return;
73 }
74
75@@ -230,11 +227,19 @@
76 if (using_ab)
77 powerd_autobrightness_disable();
78 powerd_set_brightness(0);
79+
80 if (!display_set_power_mode(0, "off")) {
81 powerd_warn("failed to set display power mode, not clearing state");
82 return;
83 }
84
85+ if (!earlysuspend_used)
86+ {
87+ g_mutex_lock(&actual_screen_state_guard);
88+ actual_screen_state = DISPLAY_STATE_OFF;
89+ g_mutex_unlock(&actual_screen_state_guard);
90+ }
91+
92 powerd_debug("Releasing internal active state request");
93 ret = clear_sys_state_internal(internal_request_cookie);
94 if (!ret) {
95@@ -245,7 +250,8 @@
96 }
97 break;
98 case POWERD_DISPLAY_STATE_ON:
99- if (actual_screen_state != POWERD_DISPLAY_STATE_ON) {
100+ if (!powerd_display_enabled()) {
101+ /* display is off, must turn display on */
102 powerd_debug("Requesting active state internally");
103 ret = request_sys_state_internal("display-request",
104 POWERD_SYS_STATE_ACTIVE,
105@@ -253,16 +259,18 @@
106 if (!ret)
107 powerd_warn("Request for active state failed");
108
109- /*
110- * If the current state is suspend we need to wait for
111- * notification of leaving the active state, otherwise the
112- * screen may fail to turn on for devices using earlysuspend.
113- * Otherwise we can turn on the screen right away.
114- */
115- if (fb_state == FB_AWAKE)
116+ /* if the device uses earlysuspend, responsibility for turning
117+ the display server and changing actual_screen_state falls
118+ to wait_for_fb_thread() */
119+ if (!earlysuspend_used)
120+ {
121 turn_on_display(use_ab);
122+ g_mutex_lock(&actual_screen_state_guard);
123+ actual_screen_state = POWERD_DISPLAY_STATE_ON;
124+ g_mutex_unlock(&actual_screen_state_guard);
125+ }
126 } else {
127- /* Only changing brightness */
128+ /* display is already on, must only changing brightness */
129 if (use_ab) {
130 if (!using_ab)
131 powerd_autobrightness_enable();
132@@ -281,8 +289,6 @@
133 powerd_warn("Invalid display state %d", applied_state);
134 return;
135 }
136-
137- actual_screen_state = applied_state;
138 }
139
140 static void update_flags(guint32 flags)
141@@ -389,7 +395,6 @@
142 powerd_run_mainloop_sync(apply_override_worker, &ovr_data);
143 }
144
145-
146 static int wait_for_file(const char *fname)
147 {
148 int fd, ret;
149@@ -407,31 +412,43 @@
150 return ret == -1 ? -errno : 0;
151 }
152
153-static gboolean fb_state_changed(gpointer data)
154-{
155- fb_state = (enum fb_state)data;
156- powerd_debug("fb state %s", fb_state == FB_SLEEP ? "sleep" : "awake");
157- if (fb_state == FB_AWAKE && powerd_display_enabled())
158- turn_on_display(ab_enabled(internal_state.flags));
159- return FALSE;
160-}
161+enum fb_state {
162+ FB_SLEEP,
163+ FB_AWAKE,
164+
165+ NUM_FB_STATES
166+};
167+
168+static const char *fb_file_names[] = {
169+ [FB_SLEEP] = "/sys/power/wait_for_fb_sleep",
170+ [FB_AWAKE] = "/sys/power/wait_for_fb_wake"
171+};
172
173 static gpointer wait_for_fb_thread(gpointer unused)
174 {
175- unsigned long state = fb_state;
176-
177- if (state >= NUM_FB_STATES) {
178- powerd_warn("Invalid FB state");
179- return NULL;
180- }
181-
182+ enum fb_state fb_state = FB_AWAKE;
183 while (1) {
184- state = (state == FB_SLEEP) ? FB_AWAKE : FB_SLEEP;
185- if (wait_for_file(fb_file_names[state])) {
186- powerd_warn("Error reading %s", fb_file_names[state]);
187+ enum fb_state next_state = (fb_state == FB_SLEEP) ? FB_AWAKE : FB_SLEEP;
188+
189+ if (wait_for_file(fb_file_names[next_state])) {
190+ powerd_warn("Error reading %s", fb_file_names[next_state]);
191 return NULL;
192 }
193- g_timeout_add(0, fb_state_changed, (gpointer)state);
194+
195+ fb_state = next_state;
196+ powerd_debug("fb state %s", fb_state == FB_SLEEP ? "sleep" : "awake");
197+
198+ g_mutex_lock(&actual_screen_state_guard);
199+ if (fb_state == FB_AWAKE)
200+ {
201+ turn_on_display(ab_enabled(internal_state.flags));
202+ actual_screen_state = POWERD_DISPLAY_STATE_ON;
203+ }
204+ else if (fb_state == FB_SLEEP)
205+ {
206+ actual_screen_state = DISPLAY_STATE_OFF;
207+ }
208+ g_mutex_unlock(&actual_screen_state_guard);
209 }
210 return NULL;
211 }
212@@ -489,7 +506,15 @@
213
214 /* If kernel supports earlysuspend, start thread to monitor fb state */
215 if (!access(fb_file_names[FB_SLEEP], F_OK))
216+ {
217+ powerd_debug("device uses earlysuspend");
218+ earlysuspend_used = TRUE;
219 g_thread_new("powerd_fb_wake_monitor", wait_for_fb_thread, &error);
220+ } else
221+ {
222+ powerd_debug("device does not use earlysuspend");
223+ earlysuspend_used = FALSE;
224+ }
225
226 return 0;
227 }

Subscribers

People subscribed via source and target branches