Mir

Merge lp:~alan-griffiths/mir/fix-1526658 into lp:mir

Proposed by Alan Griffiths
Status: Merged
Approved by: Alan Griffiths
Approved revision: no longer in the source branch.
Merged at revision: 3286
Proposed branch: lp:~alan-griffiths/mir/fix-1526658
Merge into: lp:mir
Prerequisite: lp:~kdub/mir/0.19-stanzas
Diff against target: 172 lines (+60/-29)
5 files modified
src/client/probing_client_platform_factory.cpp (+24/-23)
src/common/sharedlibrary/CMakeLists.txt (+1/-1)
src/common/sharedlibrary/shared_library_prober.cpp (+20/-5)
src/common/symbols.map (+7/-0)
src/include/common/mir/shared_library_prober.h (+8/-0)
To merge this branch: bzr merge lp:~alan-griffiths/mir/fix-1526658
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Approve
Daniel van Vugt Approve
Cemil Azizoglu (community) Approve
Alexandros Frantzis (community) Approve
Alberto Aguirre (community) Approve
Mir CI Bot continuous-integration Approve
Review via email: mp+284439@code.launchpad.net

Commit message

common, client: a more flexible way to probe modules: once we've found a good current platform we don't even try to load an older one. (More cleanup related to cases of lp:1526658)

Description of the change

common, client: a more flexible way to probe modules: once we've found a good current platform we don't even try to load an older one. (More cleanup related to cases of lp:1526658)

Some background from the bug history:

1. the mesa.so.2 client platform module leaks symbols and dl handle to itself when *loaded*.

2. the mesa.so.3 client platform module leaks symbols and dl handle to itself when *used*. It can be loaded and probed safely.

3. the mesa.so.4 client platform module leaks symbols when *used*. It can be loaded and probed safely.

libmirclient8 from Mir-0.13 (say) needs mesa.so.2 but can load and probe mesa.so.3/4 (see 2, 3 above).

Various fixes have been applied in the libmirclient9 series, and in 0.19 for_path() ordered the modules by "so number descending", but still loaded all the modules before doing so. As mentioned in 1 above this is problematic. The workaround in the bug is to uninstall mesa2.

This proposal continues to orders modules by "so number descending" but create_client_platform() probes them as they are loaded. Once a compatible driver is selected *the search is quit* so once mesa4 is selected neither mesa3 nor mesa2 is loaded.

That means that libmirclient8/mesa2 can safely co-exist with libmirclient9/mesa4.

To post a comment you must log in.
Revision history for this message
Mir CI Bot (mir-ci-bot) wrote :

PASSED: Continuous integration, rev:3283
https://mir-jenkins.ubuntu.com/job/mir-ci/187/
Executed test runs:
    None: https://mir-jenkins.ubuntu.com/job/generic-update-mp/188/console

Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/187/rebuild

review: Approve (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

8: [ FAILED ] TestClientCursorAPI.cursor_passed_through_nested_server (60036 ms)

lp:1525003?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

There's an additional fix I want to do on this bug this week:
Add debian packaging magic to simply force the broken packages to be uninstalled.

Because once you've loaded and tried to probe mir-client-platform-mesa2 finding out it's not the right one, it's already too late. It has already leaked a handle to itself (which will happen with this branch too).

That change is much more important than this one. But I haven't reviewed this in detail yet.

review: Abstain
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> There's an additional fix I want to do on this bug this week:
> Add debian packaging magic to simply force the broken packages to be
> uninstalled.

Removing these older packages is a workaround that works for most cases. However, a hypothetical parallel installation that requires mesa2 would be broken by forced removal.

Debian packaging magic to ensure that any installed mir-graphics-drivers-desktop/mir-graphics-drivers-android is updated would be useful though.

> Because once you've loaded and tried to probe mir-client-platform-mesa2
> finding out it's not the right one, it's already too late. It has already
> leaked a handle to itself (which will happen with this branch too).

The point of this MP is that we don't probe the badly behaved mesa2 once we've found that mesa4 works.

> That change is much more important than this one. But I haven't reviewed this
> in detail yet.

I disagree. Automating a workaround for our failed support for side-by-side installations is far less important than fixing the support.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I did initially agree with your "hypothetical parallel installation that requires mesa2" assertion. However that's not true if the older Mir installation uses libmirclient9. All recent Mir releases use libmirclient9, which means even old clients will start using the new libmirclient9 immediately and only need mir-client-platform-mesa4 to work.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Manual testing suggests this branch works well. Because mesa.so.2 is never loaded according to MIR_CLIENT_SHARED_LIBRARY_PROBER_REPORT=log.

Unfortunately I can't fully test for the original crash of bug 1526658 because my filesystem insists on finding mesa.so.4 in the directory before mesa.so.2, so it is working around the bug by pure luck.

Approved, but I suggest it might be nicer to stop adding functions to mircommon and instead just glob the directory from inside src/client/probing_client_platform_factory.cpp

review: Approve
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> Manual testing suggests this branch works well. Because mesa.so.2 is never
> loaded according to MIR_CLIENT_SHARED_LIBRARY_PROBER_REPORT=log.
>
> Unfortunately I can't fully test for the original crash of bug 1526658 because
> my filesystem insists on finding mesa.so.4 in the directory before mesa.so.2,
> so it is working around the bug by pure luck.

You're aware that libraries_for_path() orders the libraries by so version (descending)?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Whoops. My test methodology was flawed and my previous statement still holds (if you substitute mir-client-platform-mesa2 for mesa.so.[123]):

"Because once you've loaded and tried to probe mir-client-platform-mesa2 finding out it's not the right one, it's already too late. It has already leaked a handle to itself (which will happen with this branch too)."

If your directory happens to hash in such a way that the OS finds mesa.so.3 (from Mir 0.17) first, then the simple act of dlopening it (regardless of probe outcome) will make it leak. And your clients will then crash. So I think this branch won't fix anything. We're just failing to see it crash by pure luck, per above.

Note that the delicacy of the problem isn't really in Mir 0.18 as much as Mir 0.17 where we had the constructor attribute :(

http://bazaar.launchpad.net/~mir-team/mir/0.17/view/head:/src/platforms/mesa/client/client_platform.cpp#L62

If there is any chance at all that you might dlopen an old mesa.so.[123] then there's a chance you'll leak that old one and Mesa will still crash. So this branch is not a useful fix.

review: Disapprove
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

> You're aware that libraries_for_path() orders the libraries by so version
> (descending)?

Well, the bug exists so that's clearly not helping. Actually it's the order in which you dlopen the libraries that's important, and not the order in which they are returned in our API.

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> Whoops. My test methodology was flawed and my previous statement still holds
> (if you substitute mir-client-platform-mesa2 for mesa.so.[123]):
>
> "Because once you've loaded and tried to probe mir-client-platform-mesa2
> finding out it's not the right one, it's already too late. It has already
> leaked a handle to itself (which will happen with this branch too)."
>
> If your directory happens to hash in such a way that the OS finds mesa.so.3
> (from Mir 0.17) first, then the simple act of dlopening it (regardless of
> probe outcome) will make it leak. And your clients will then crash. So I think
> this branch won't fix anything. We're just failing to see it crash by pure
> luck, per above.
>
> Note that the delicacy of the problem isn't really in Mir 0.18 as much as Mir
> 0.17 where we had the constructor attribute :(
>
> http://bazaar.launchpad.net/~mir-
> team/mir/0.17/view/head:/src/platforms/mesa/client/client_platform.cpp#L62
>
> If there is any chance at all that you might dlopen an old mesa.so.[123] then
> there's a chance you'll leak that old one and Mesa will still crash. So this
> branch is not a useful fix.

You're describing the problem this MP fixes.

The previous implementation using libraries_for_path() ordered the modules by "so number descending", but still loaded all the modules. As you point out that was problematic.

This MP select_libraries_for_path() also orders by "so number descending" but create_client_platform() probes them as they are loaded. Once a compatible driver is selected the search is quit and *earlier ones are neither loaded nor probed*.

Revision history for this message
Alberto Aguirre (albaguirre) wrote :

Makes sense to me.

review: Approve
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

Looks good.

review: Approve
Revision history for this message
Cemil Azizoglu (cemil-azizoglu) wrote :

Even if we didn't have the bug, what this MP is doing makes sense. Once we've found a recent version of a shared object that works, I can't think of a reason to keep looking at the older libraries.

review: Approve
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Oh I failed to look at the sorting implementation. I assumed it had RAII constructed the SharedLibraries before it sorted them. It actually sorts just the filename strings.

That will teach me to try and comment on code while cooking dinner.

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Building remotely on prodstack-worker-med-3 (pbuilder-mir pbuilder-medium pbuilder-cloud-medium-test) in workspace /var/lib/jenkins/slaves/prodstack-worker-med-3/workspace/mir-android-vivid-i386-build
java.io.IOException: remote file operation failed: /var/lib/jenkins/slaves/prodstack-worker-med-3/workspace/mir-android-vivid-i386-build at hudson.remoting.Channel@77b86652:prodstack-worker-med-3: hudson.remoting.ChannelClosedException: channel is already closed
 at hudson.FilePath.act(FilePath.java:987)

Reapproving

Revision history for this message
PS Jenkins bot (ps-jenkins) :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/client/probing_client_platform_factory.cpp'
2--- src/client/probing_client_platform_factory.cpp 2016-01-29 08:18:22 +0000
3+++ src/client/probing_client_platform_factory.cpp 2016-01-29 12:17:29 +0000
4@@ -31,6 +31,25 @@
5 // than it takes to choose the right one. So this list is local:
6 std::vector<std::shared_ptr<mir::SharedLibrary>> platform_modules;
7
8+ auto const module_selector = [context, &platform_modules](std::shared_ptr<mir::SharedLibrary> const& module)
9+ {
10+ try
11+ {
12+ auto probe = module->load_function<ClientPlatformProbe>("is_appropriate_module", CLIENT_PLATFORM_VERSION);
13+ if (probe(context))
14+ {
15+ platform_modules.push_back(module);
16+ return Selection::quit;
17+ }
18+ }
19+ catch (std::runtime_error const&)
20+ {
21+ // Assume we were handed a SharedLibrary that's not a client platform module of the correct vintage.
22+ }
23+
24+ return Selection::persist;
25+ };
26+
27 if (!platform_overrides.empty())
28 {
29 // Even forcing a choice, platform is only loaded on demand. It's good
30@@ -39,37 +58,19 @@
31 // if you really wanted to.
32
33 for (auto const& platform : platform_overrides)
34- platform_modules.push_back(
35- std::make_shared<mir::SharedLibrary>(platform));
36+ module_selector(std::make_shared<mir::SharedLibrary>(platform));
37 }
38 else
39 {
40 for (auto const& path : platform_paths)
41- {
42- // This is sub-optimal. We shouldn't need to have all drivers
43- // loaded simultaneously, but we do for now due to this API:
44- auto modules = mir::libraries_for_path(path,
45- *shared_library_prober_report);
46- for (auto const& module : modules)
47- platform_modules.push_back(module);
48- }
49+ select_libraries_for_path(path, module_selector, *shared_library_prober_report);
50 }
51
52 for (auto& module : platform_modules)
53 {
54- try
55- {
56- auto probe = module->load_function<mir::client::ClientPlatformProbe>("is_appropriate_module", CLIENT_PLATFORM_VERSION);
57- if (probe(context))
58- {
59- auto factory = module->load_function<mir::client::CreateClientPlatform>("create_client_platform", CLIENT_PLATFORM_VERSION);
60- return factory(context);
61- }
62- }
63- catch(std::runtime_error const&)
64- {
65- // We were handed a SharedLibrary that's not a client platform module?
66- }
67+ auto factory = module->load_function<CreateClientPlatform>("create_client_platform", CLIENT_PLATFORM_VERSION);
68+ return factory(context);
69 }
70+
71 BOOST_THROW_EXCEPTION(std::runtime_error{"No appropriate client platform module found"});
72 }
73
74=== modified file 'src/common/sharedlibrary/CMakeLists.txt'
75--- src/common/sharedlibrary/CMakeLists.txt 2016-01-20 23:59:18 +0000
76+++ src/common/sharedlibrary/CMakeLists.txt 2016-01-29 12:17:29 +0000
77@@ -17,7 +17,7 @@
78 add_library(mirsharedsharedlibrary OBJECT
79 module_deleter.cpp
80 shared_library.cpp
81- shared_library_prober.cpp
82+ shared_library_prober.cpp ${PROJECT_SOURCE_DIR}/src/include/common/mir/shared_library_prober.h
83 )
84
85 list(APPEND MIR_COMMON_SOURCES
86
87=== modified file 'src/common/sharedlibrary/shared_library_prober.cpp'
88--- src/common/sharedlibrary/shared_library_prober.cpp 2016-01-29 08:18:22 +0000
89+++ src/common/sharedlibrary/shared_library_prober.cpp 2016-01-29 12:17:29 +0000
90@@ -63,8 +63,10 @@
91 }
92 }
93
94-std::vector<std::shared_ptr<mir::SharedLibrary>>
95-mir::libraries_for_path(std::string const& path, mir::SharedLibraryProberReport& report)
96+void mir::select_libraries_for_path(
97+ std::string const& path,
98+ std::function<Selection(std::shared_ptr<mir::SharedLibrary> const&)> const& selector,
99+ mir::SharedLibraryProberReport& report)
100 {
101 report.probing_path(path);
102 // We use the error_code overload because we want to throw a std::system_error
103@@ -87,20 +89,33 @@
104
105 std::sort(libraries.begin(), libraries.end(), &greater_soname_version);
106
107- std::vector<std::shared_ptr<mir::SharedLibrary>> result;
108-
109 for(auto& lib : libraries)
110 {
111 try
112 {
113 report.loading_library(lib);
114- result.emplace_back(std::make_shared<mir::SharedLibrary>(lib.string()));
115+ auto const shared_lib = std::make_shared<mir::SharedLibrary>(lib.string());
116+
117+ if (selector(shared_lib) == Selection::quit)
118+ return;
119 }
120 catch (std::runtime_error const& err)
121 {
122 report.loading_failed(lib, err);
123 }
124 }
125+}
126+
127+std::vector<std::shared_ptr<mir::SharedLibrary>>
128+mir::libraries_for_path(std::string const& path, mir::SharedLibraryProberReport& report)
129+{
130+ std::vector<std::shared_ptr<mir::SharedLibrary>> result;
131+
132+ select_libraries_for_path(
133+ path,
134+ [&](std::shared_ptr<mir::SharedLibrary> const& shared_lib)
135+ { result.push_back(shared_lib); return Selection::persist; },
136+ report);
137
138 return result;
139 }
140
141=== modified file 'src/common/symbols.map'
142--- src/common/symbols.map 2016-01-29 12:17:29 +0000
143+++ src/common/symbols.map 2016-01-29 12:17:29 +0000
144@@ -211,3 +211,10 @@
145 mir::detail::libname_impl*;
146 };
147 } MIR_COMMON_5.1;
148+
149+MIR_COMMON_unreleased { # New functions in Mir 0.20
150+ global:
151+ extern "C++" {
152+ mir::select_libraries_for_path*;
153+ };
154+} MIR_COMMON_5v19;
155\ No newline at end of file
156
157=== modified file 'src/include/common/mir/shared_library_prober.h'
158--- src/include/common/mir/shared_library_prober.h 2015-02-22 07:46:25 +0000
159+++ src/include/common/mir/shared_library_prober.h 2016-01-29 12:17:29 +0000
160@@ -30,6 +30,14 @@
161 class SharedLibrary;
162
163 std::vector<std::shared_ptr<SharedLibrary>> libraries_for_path(std::string const& path, SharedLibraryProberReport& report);
164+
165+// The selector can tell select_libraries_for_path() to persist or quit after each library
166+enum class Selection { persist, quit };
167+
168+void select_libraries_for_path(
169+ std::string const& path,
170+ std::function<Selection(std::shared_ptr<SharedLibrary> const&)> const& selector,
171+ SharedLibraryProberReport& report);
172 }
173
174

Subscribers

People subscribed via source and target branches