Mir

Merge lp:~albaguirre/mir/screencast-crash-fix into lp:mir

Proposed by Alberto Aguirre
Status: Merged
Approved by: Daniel van Vugt
Approved revision: no longer in the source branch.
Merged at revision: 1404
Proposed branch: lp:~albaguirre/mir/screencast-crash-fix
Merge into: lp:mir
Diff against target: 135 lines (+38/-36)
1 file modified
src/utils/screencast.cpp (+38/-36)
To merge this branch: bzr merge lp:~albaguirre/mir/screencast-crash-fix
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Approve
Alexandros Frantzis (community) Approve
Review via email: mp+206337@code.launchpad.net

Commit message

Fix crash in android devices by working around a subtle threading bug

Use a dummy thead_local array to push the gl/egl context TLS into a region
where the future wait code does not overwrite it.

Other changes:
- Query the preferred read pixel format/type at setup.
- Write a video file instead of snapshot files; the raw file can then be played as a video with vlc

fixes: lp: #1280086

Description of the change

Fix crash in android devices by working around a subtle threading bug

Use a dummy thead_local array to push the gl/egl context TLS into a region
where the future wait code does not overwrite it.

Other changes:
- Query the preferred read pixel format/type at setup.
- Write a video file instead of snapshot files; the raw file can then be played as a video with vlc

fixes: lp: #1280086

For some unkown reason, on android devices, the EGL context goes bad after waiting on a future in the main thread - the EGL context was made current in the main thread and should remain so.

Possibly a TLS issue with libhybris?

To post a comment you must log in.
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> Remove the use of asyncs to dispatch file writes; no real reason to launch one as the main loop just synchronizes it with a wait anyway.

Actually there is a reason: with an async file write, we can swap the screencast buffer and write to file concurrently, which depending on the actual operation timings and IO blocking, helps improve the capture frame rate (so, iteration_time = max(write_time, swap_time) instead of iteration_time = write_time + swap_time).

Anyway, this is the main point of the workaround, so no point arguing for this to stay :)
Hopefully we can reinstate it when when we fix the core issue.

> std::ofstream videoFile(ss.str());

We should use underscores in variable names: http://unity.ubuntu.com/mir/cppguide/index.html#Variable_Names

> glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT, &format);

GL_IMPLEMENTATION_COLOR_READ_FORMAT is meaningful only in combination with GL_IMPLEMENTATION_COLOR_READ_TYPE (i.e., we can't assume per the standard that GL_IMPLEMENTATION_COLOR_READ_TYPE => GL_UNSIGNED_BYTE, although it's often the case). Since we currently want support for either RGBA8888 or BGRA8888 frames, we should check that GL_IMPLEMENTATION_COLOR_READ_TYPE == GL_UNSIGNED_BYTE and GL_IMPLEMENTATION_COLOR_READ_FORMAT == BGRA and only then return it as valid, otherwise return the guaranteed default RGBA.

Note that when implementing surface snapshots I found that using the GL_IMPLEMENTATION_COLOR_READ_FORMAT/TYPE was unreliable (e.g., using the returned values failed depending on the underlying buffer type), although the situation was more complex there, involving FBOs backed by textures backed by EGLImages. That's why I used the double glReadPixels scheme, which was also used here.

review: Needs Fixing
Revision history for this message
Alberto Aguirre (albaguirre) wrote :

> > Remove the use of asyncs to dispatch file writes; no real reason to launch
> one as the main loop just synchronizes it with a wait anyway.
>
> Actually there is a reason: with an async file write, we can swap the
> screencast buffer and write to file concurrently, which depending on the
> actual operation timings and IO blocking, helps improve the capture frame rate
> (so, iteration_time = max(write_time, swap_time) instead of iteration_time =
> write_time + swap_time).
>
Aahh, very true. Updated commit/description.

> > std::ofstream videoFile(ss.str());
>
> We should use underscores in variable names:
> http://unity.ubuntu.com/mir/cppguide/index.html#Variable_Names
>

Fixed now.

> > glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT, &format);
>
> GL_IMPLEMENTATION_COLOR_READ_FORMAT is meaningful only in combination with
> GL_IMPLEMENTATION_COLOR_READ_TYPE (i.e., we can't assume per the standard that
> GL_IMPLEMENTATION_COLOR_READ_TYPE => GL_UNSIGNED_BYTE, although it's often the
> case). Since we currently want support for either RGBA8888 or BGRA8888 frames,
> we should check that GL_IMPLEMENTATION_COLOR_READ_TYPE == GL_UNSIGNED_BYTE and
> GL_IMPLEMENTATION_COLOR_READ_FORMAT == BGRA and only then return it as valid,
> otherwise return the guaranteed default RGBA.
>
> Note that when implementing surface snapshots I found that using the
> GL_IMPLEMENTATION_COLOR_READ_FORMAT/TYPE was unreliable (e.g., using the
> returned values failed depending on the underlying buffer type), although the
> situation was more complex there, involving FBOs backed by textures backed by
> EGLImages. That's why I used the double glReadPixels scheme, which was also
> used here.

Ok. I implemented these suggestions at setup time. To make sure I issue a glReadPixels of 1 pixel to validate the read format and type.

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

56 + if (type == GL_UNSIGNED_BYTE)
57 + read_pixel_format = static_cast<GLenum>(format);

Better to use (type == GL_UNSIGNED_BYTE && format == GL_BGRA_EXT), for now at least, since we only support BGRA as an alternative, and platforms are allowed to return other types (e.g. GL_RGB). Of course, ideally we would support other types too.

Revision history for this message
Alberto Aguirre (albaguirre) wrote :

> 56 + if (type == GL_UNSIGNED_BYTE)
> 57 + read_pixel_format = static_cast<GLenum>(format);
>
> Better to use (type == GL_UNSIGNED_BYTE && format == GL_BGRA_EXT), for now at
> least, since we only support BGRA as an alternative, and platforms are allowed
> to return other types (e.g. GL_RGB). Of course, ideally we would support other
> types too.

Yeah, I just realize it could be another non 4-byte format. So then I'll do just the
glReadPixels as before but at setup time. See update.

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

Other than what I mentioned above, looks ok, so pre-approving in order not to block this.

There is a chance we will encounter similar issues as I did for surface snapshots, in which case we will need to try both formats (RGBA and implementation/BGRA) and take the one that works, but let's do this only if the need comes up.

review: Approve
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

(Latest updates) Looks good.

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alberto Aguirre (albaguirre) wrote :

I restored the async file write by using a different workaround.

Using a thread_local variable of enough size seems to push the egl/gl context TLS into a region where the future wait code does not overwrite it.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

Failure not related to this branch, rebuilding.

(see https://launchpad.net/bugs/1281146)

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> Failure not related to this branch, rebuilding.

Ah, hadn't seen the latest update :)

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Hmm, usually I'd say wait for a second review+approval on all MPs. But in this case it's really just Alexandros who needed to check it.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Now landing is blocked by bug 1281145. We need that fix to land too :/

Revision history for this message
PS Jenkins bot (ps-jenkins) :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/utils/screencast.cpp'
2--- src/utils/screencast.cpp 2014-02-06 14:44:00 +0000
3+++ src/utils/screencast.cpp 2014-02-17 14:56:45 +0000
4@@ -42,45 +42,23 @@
5
6 volatile sig_atomic_t running = 1;
7
8+//In android, waiting for a future is causing the gl/egl context to become invalid
9+//possibly due to assumptions in libhybris/android linker.
10+//A TLS allocation in the main thread is forced with this variable which seems to push
11+//the gl/egl context TLS into a slot where the future wait code does not overwrite it.
12+thread_local int tls_hack[2];
13+
14 void shutdown(int)
15 {
16 running = 0;
17 }
18
19-std::future<void> write_frame_to_file(
20- std::vector<char> const& frame_data, int frame_number, GLenum format)
21-{
22- return std::async(
23- std::launch::async,
24- [&frame_data, frame_number, format]
25- {
26- std::stringstream ss;
27- ss << "/tmp/mir_" ;
28- ss.width(5);
29- ss.fill('0');
30- ss << frame_number;
31- ss << (format == GL_BGRA_EXT ? ".bgra" : ".rgba");
32- std::ofstream f(ss.str());
33- f.write(frame_data.data(), frame_data.size());
34- });
35-}
36-
37-GLenum read_pixels(mir::geometry::Size const& size, void* buffer)
38+void read_pixels(GLenum format, mir::geometry::Size const& size, void* buffer)
39 {
40 auto width = size.width.as_uint32_t();
41 auto height = size.height.as_uint32_t();
42
43- GLenum format = GL_BGRA_EXT;
44-
45 glReadPixels(0, 0, width, height, format, GL_UNSIGNED_BYTE, buffer);
46-
47- if (glGetError() != GL_NO_ERROR)
48- {
49- format = GL_RGBA;
50- glReadPixels(0, 0, width, height, format, GL_UNSIGNED_BYTE, buffer);
51- }
52-
53- return format;
54 }
55
56
57@@ -184,6 +162,13 @@
58 {
59 throw std::runtime_error("Failed to make screencast surface current");
60 }
61+
62+ uint32_t a_pixel;
63+ glReadPixels(0, 0, 1, 1, GL_BGRA_EXT, GL_UNSIGNED_BYTE, &a_pixel);
64+ if (glGetError() == GL_NO_ERROR)
65+ read_pixel_format = GL_BGRA_EXT;
66+ else
67+ read_pixel_format = GL_RGBA;
68 }
69
70 ~EGLSetup()
71@@ -200,10 +185,16 @@
72 throw std::runtime_error("Failed to swap screencast surface buffers");
73 }
74
75+ GLenum pixel_read_format()
76+ {
77+ return read_pixel_format;
78+ }
79+
80 EGLDisplay egl_display;
81 EGLContext egl_context;
82 EGLSurface egl_surface;
83 EGLConfig egl_config;
84+ GLenum read_pixel_format;
85 };
86
87 void do_screencast(MirConnection* connection, MirScreencast* screencast,
88@@ -216,22 +207,30 @@
89 frame_size.width.as_uint32_t() *
90 frame_size.height.as_uint32_t();
91
92- int frame_number{0};
93 std::vector<char> frame_data(frame_size_bytes, 0);
94- std::future<void> frame_written_future =
95- std::async(std::launch::deferred, []{});
96
97 EGLSetup egl_setup{connection, screencast};
98+ auto format = egl_setup.pixel_read_format();
99+
100+ std::stringstream ss;
101+ ss << "/tmp/mir_screencast_" ;
102+ ss << frame_size.width << "x" << frame_size.height;
103+ ss << (format == GL_BGRA_EXT ? ".bgra" : ".rgba");
104+ std::ofstream video_file(ss.str());
105
106 while (running)
107 {
108- frame_written_future.wait();
109+ read_pixels(format, frame_size, frame_data.data());
110
111- auto format = read_pixels(frame_size, frame_data.data());
112- frame_written_future = write_frame_to_file(frame_data, frame_number, format);
113+ auto write_out_future = std::async(
114+ std::launch::async,
115+ [&video_file, &frame_data] {
116+ video_file.write(frame_data.data(), frame_data.size());
117+ });
118
119 egl_setup.swap_buffers();
120- ++frame_number;
121+
122+ write_out_future.wait();
123 }
124 }
125
126@@ -245,6 +244,9 @@
127 char const* socket_file = nullptr;
128 uint32_t output_id = mir_display_output_id_invalid;
129
130+ //avoid unused warning/error
131+ tls_hack[0] = 0;
132+
133 while ((arg = getopt (argc, argv, "hm:o:")) != -1)
134 {
135 switch (arg)

Subscribers

People subscribed via source and target branches