Mir

Merge lp:~alan-griffiths/mir/fix-1526658 into lp:mir

Proposed by Alan Griffiths
Status: Merged
Approved by: Alan Griffiths
Approved revision: no longer in the source branch.
Merged at revision: 3286
Proposed branch: lp:~alan-griffiths/mir/fix-1526658
Merge into: lp:mir
Prerequisite: lp:~kdub/mir/0.19-stanzas
Diff against target: 172 lines (+60/-29)
5 files modified
src/client/probing_client_platform_factory.cpp (+24/-23)
src/common/sharedlibrary/CMakeLists.txt (+1/-1)
src/common/sharedlibrary/shared_library_prober.cpp (+20/-5)
src/common/symbols.map (+7/-0)
src/include/common/mir/shared_library_prober.h (+8/-0)
To merge this branch: bzr merge lp:~alan-griffiths/mir/fix-1526658
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Approve
Daniel van Vugt Approve
Cemil Azizoglu (community) Approve
Alexandros Frantzis (community) Approve
Alberto Aguirre (community) Approve
Mir CI Bot continuous-integration Approve
Review via email: mp+284439@code.launchpad.net

Commit message

common, client: a more flexible way to probe modules: once we've found a good current platform we don't even try to load an older one. (More cleanup related to cases of lp:1526658)

Description of the change

common, client: a more flexible way to probe modules: once we've found a good current platform we don't even try to load an older one. (More cleanup related to cases of lp:1526658)

Some background from the bug history:

1. the mesa.so.2 client platform module leaks symbols and dl handle to itself when *loaded*.

2. the mesa.so.3 client platform module leaks symbols and dl handle to itself when *used*. It can be loaded and probed safely.

3. the mesa.so.4 client platform module leaks symbols when *used*. It can be loaded and probed safely.

libmirclient8 from Mir-0.13 (say) needs mesa.so.2 but can load and probe mesa.so.3/4 (see 2, 3 above).

Various fixes have been applied in the libmirclient9 series, and in 0.19 for_path() ordered the modules by "so number descending", but still loaded all the modules before doing so. As mentioned in 1 above this is problematic. The workaround in the bug is to uninstall mesa2.

This proposal continues to orders modules by "so number descending" but create_client_platform() probes them as they are loaded. Once a compatible driver is selected *the search is quit* so once mesa4 is selected neither mesa3 nor mesa2 is loaded.

That means that libmirclient8/mesa2 can safely co-exist with libmirclient9/mesa4.

To post a comment you must log in.
Revision history for this message
Mir CI Bot (mir-ci-bot) wrote :

PASSED: Continuous integration, rev:3283
https://mir-jenkins.ubuntu.com/job/mir-ci/187/
Executed test runs:
    None: https://mir-jenkins.ubuntu.com/job/generic-update-mp/188/console

Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/187/rebuild

review: Approve (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

8: [ FAILED ] TestClientCursorAPI.cursor_passed_through_nested_server (60036 ms)

lp:1525003?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

There's an additional fix I want to do on this bug this week:
Add debian packaging magic to simply force the broken packages to be uninstalled.

Because once you've loaded and tried to probe mir-client-platform-mesa2 finding out it's not the right one, it's already too late. It has already leaked a handle to itself (which will happen with this branch too).

That change is much more important than this one. But I haven't reviewed this in detail yet.

review: Abstain
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> There's an additional fix I want to do on this bug this week:
> Add debian packaging magic to simply force the broken packages to be
> uninstalled.

Removing these older packages is a workaround that works for most cases. However, a hypothetical parallel installation that requires mesa2 would be broken by forced removal.

Debian packaging magic to ensure that any installed mir-graphics-drivers-desktop/mir-graphics-drivers-android is updated would be useful though.

> Because once you've loaded and tried to probe mir-client-platform-mesa2
> finding out it's not the right one, it's already too late. It has already
> leaked a handle to itself (which will happen with this branch too).

The point of this MP is that we don't probe the badly behaved mesa2 once we've found that mesa4 works.

> That change is much more important than this one. But I haven't reviewed this
> in detail yet.

I disagree. Automating a workaround for our failed support for side-by-side installations is far less important than fixing the support.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I did initially agree with your "hypothetical parallel installation that requires mesa2" assertion. However that's not true if the older Mir installation uses libmirclient9. All recent Mir releases use libmirclient9, which means even old clients will start using the new libmirclient9 immediately and only need mir-client-platform-mesa4 to work.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Manual testing suggests this branch works well. Because mesa.so.2 is never loaded according to MIR_CLIENT_SHARED_LIBRARY_PROBER_REPORT=log.

Unfortunately I can't fully test for the original crash of bug 1526658 because my filesystem insists on finding mesa.so.4 in the directory before mesa.so.2, so it is working around the bug by pure luck.

Approved, but I suggest it might be nicer to stop adding functions to mircommon and instead just glob the directory from inside src/client/probing_client_platform_factory.cpp

review: Approve
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> Manual testing suggests this branch works well. Because mesa.so.2 is never
> loaded according to MIR_CLIENT_SHARED_LIBRARY_PROBER_REPORT=log.
>
> Unfortunately I can't fully test for the original crash of bug 1526658 because
> my filesystem insists on finding mesa.so.4 in the directory before mesa.so.2,
> so it is working around the bug by pure luck.

You're aware that libraries_for_path() orders the libraries by so version (descending)?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Whoops. My test methodology was flawed and my previous statement still holds (if you substitute mir-client-platform-mesa2 for mesa.so.[123]):

"Because once you've loaded and tried to probe mir-client-platform-mesa2 finding out it's not the right one, it's already too late. It has already leaked a handle to itself (which will happen with this branch too)."

If your directory happens to hash in such a way that the OS finds mesa.so.3 (from Mir 0.17) first, then the simple act of dlopening it (regardless of probe outcome) will make it leak. And your clients will then crash. So I think this branch won't fix anything. We're just failing to see it crash by pure luck, per above.

Note that the delicacy of the problem isn't really in Mir 0.18 as much as Mir 0.17 where we had the constructor attribute :(

http://bazaar.launchpad.net/~mir-team/mir/0.17/view/head:/src/platforms/mesa/client/client_platform.cpp#L62

If there is any chance at all that you might dlopen an old mesa.so.[123] then there's a chance you'll leak that old one and Mesa will still crash. So this branch is not a useful fix.

review: Disapprove
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

> You're aware that libraries_for_path() orders the libraries by so version
> (descending)?

Well, the bug exists so that's clearly not helping. Actually it's the order in which you dlopen the libraries that's important, and not the order in which they are returned in our API.

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> Whoops. My test methodology was flawed and my previous statement still holds
> (if you substitute mir-client-platform-mesa2 for mesa.so.[123]):
>
> "Because once you've loaded and tried to probe mir-client-platform-mesa2
> finding out it's not the right one, it's already too late. It has already
> leaked a handle to itself (which will happen with this branch too)."
>
> If your directory happens to hash in such a way that the OS finds mesa.so.3
> (from Mir 0.17) first, then the simple act of dlopening it (regardless of
> probe outcome) will make it leak. And your clients will then crash. So I think
> this branch won't fix anything. We're just failing to see it crash by pure
> luck, per above.
>
> Note that the delicacy of the problem isn't really in Mir 0.18 as much as Mir
> 0.17 where we had the constructor attribute :(
>
> http://bazaar.launchpad.net/~mir-
> team/mir/0.17/view/head:/src/platforms/mesa/client/client_platform.cpp#L62
>
> If there is any chance at all that you might dlopen an old mesa.so.[123] then
> there's a chance you'll leak that old one and Mesa will still crash. So this
> branch is not a useful fix.

You're describing the problem this MP fixes.

The previous implementation using libraries_for_path() ordered the modules by "so number descending", but still loaded all the modules. As you point out that was problematic.

This MP select_libraries_for_path() also orders by "so number descending" but create_client_platform() probes them as they are loaded. Once a compatible driver is selected the search is quit and *earlier ones are neither loaded nor probed*.

Revision history for this message
Alberto Aguirre (albaguirre) wrote :

Makes sense to me.

review: Approve
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

Looks good.

review: Approve
Revision history for this message
Cemil Azizoglu (cemil-azizoglu) wrote :

Even if we didn't have the bug, what this MP is doing makes sense. Once we've found a recent version of a shared object that works, I can't think of a reason to keep looking at the older libraries.

review: Approve
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Oh I failed to look at the sorting implementation. I assumed it had RAII constructed the SharedLibraries before it sorted them. It actually sorts just the filename strings.

That will teach me to try and comment on code while cooking dinner.

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Building remotely on prodstack-worker-med-3 (pbuilder-mir pbuilder-medium pbuilder-cloud-medium-test) in workspace /var/lib/jenkins/slaves/prodstack-worker-med-3/workspace/mir-android-vivid-i386-build
java.io.IOException: remote file operation failed: /var/lib/jenkins/slaves/prodstack-worker-med-3/workspace/mir-android-vivid-i386-build at hudson.remoting.Channel@77b86652:prodstack-worker-med-3: hudson.remoting.ChannelClosedException: channel is already closed
 at hudson.FilePath.act(FilePath.java:987)

Reapproving

Revision history for this message
PS Jenkins bot (ps-jenkins) :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/client/probing_client_platform_factory.cpp'
--- src/client/probing_client_platform_factory.cpp 2016-01-29 08:18:22 +0000
+++ src/client/probing_client_platform_factory.cpp 2016-01-29 12:17:29 +0000
@@ -31,6 +31,25 @@
31 // than it takes to choose the right one. So this list is local:31 // than it takes to choose the right one. So this list is local:
32 std::vector<std::shared_ptr<mir::SharedLibrary>> platform_modules;32 std::vector<std::shared_ptr<mir::SharedLibrary>> platform_modules;
3333
34 auto const module_selector = [context, &platform_modules](std::shared_ptr<mir::SharedLibrary> const& module)
35 {
36 try
37 {
38 auto probe = module->load_function<ClientPlatformProbe>("is_appropriate_module", CLIENT_PLATFORM_VERSION);
39 if (probe(context))
40 {
41 platform_modules.push_back(module);
42 return Selection::quit;
43 }
44 }
45 catch (std::runtime_error const&)
46 {
47 // Assume we were handed a SharedLibrary that's not a client platform module of the correct vintage.
48 }
49
50 return Selection::persist;
51 };
52
34 if (!platform_overrides.empty())53 if (!platform_overrides.empty())
35 {54 {
36 // Even forcing a choice, platform is only loaded on demand. It's good55 // Even forcing a choice, platform is only loaded on demand. It's good
@@ -39,37 +58,19 @@
39 // if you really wanted to.58 // if you really wanted to.
4059
41 for (auto const& platform : platform_overrides)60 for (auto const& platform : platform_overrides)
42 platform_modules.push_back(61 module_selector(std::make_shared<mir::SharedLibrary>(platform));
43 std::make_shared<mir::SharedLibrary>(platform));
44 }62 }
45 else63 else
46 {64 {
47 for (auto const& path : platform_paths)65 for (auto const& path : platform_paths)
48 {66 select_libraries_for_path(path, module_selector, *shared_library_prober_report);
49 // This is sub-optimal. We shouldn't need to have all drivers
50 // loaded simultaneously, but we do for now due to this API:
51 auto modules = mir::libraries_for_path(path,
52 *shared_library_prober_report);
53 for (auto const& module : modules)
54 platform_modules.push_back(module);
55 }
56 }67 }
5768
58 for (auto& module : platform_modules)69 for (auto& module : platform_modules)
59 {70 {
60 try71 auto factory = module->load_function<CreateClientPlatform>("create_client_platform", CLIENT_PLATFORM_VERSION);
61 {72 return factory(context);
62 auto probe = module->load_function<mir::client::ClientPlatformProbe>("is_appropriate_module", CLIENT_PLATFORM_VERSION);
63 if (probe(context))
64 {
65 auto factory = module->load_function<mir::client::CreateClientPlatform>("create_client_platform", CLIENT_PLATFORM_VERSION);
66 return factory(context);
67 }
68 }
69 catch(std::runtime_error const&)
70 {
71 // We were handed a SharedLibrary that's not a client platform module?
72 }
73 }73 }
74
74 BOOST_THROW_EXCEPTION(std::runtime_error{"No appropriate client platform module found"});75 BOOST_THROW_EXCEPTION(std::runtime_error{"No appropriate client platform module found"});
75}76}
7677
=== modified file 'src/common/sharedlibrary/CMakeLists.txt'
--- src/common/sharedlibrary/CMakeLists.txt 2016-01-20 23:59:18 +0000
+++ src/common/sharedlibrary/CMakeLists.txt 2016-01-29 12:17:29 +0000
@@ -17,7 +17,7 @@
17add_library(mirsharedsharedlibrary OBJECT17add_library(mirsharedsharedlibrary OBJECT
18 module_deleter.cpp18 module_deleter.cpp
19 shared_library.cpp19 shared_library.cpp
20 shared_library_prober.cpp20 shared_library_prober.cpp ${PROJECT_SOURCE_DIR}/src/include/common/mir/shared_library_prober.h
21)21)
2222
23list(APPEND MIR_COMMON_SOURCES23list(APPEND MIR_COMMON_SOURCES
2424
=== modified file 'src/common/sharedlibrary/shared_library_prober.cpp'
--- src/common/sharedlibrary/shared_library_prober.cpp 2016-01-29 08:18:22 +0000
+++ src/common/sharedlibrary/shared_library_prober.cpp 2016-01-29 12:17:29 +0000
@@ -63,8 +63,10 @@
63}63}
64}64}
6565
66std::vector<std::shared_ptr<mir::SharedLibrary>>66void mir::select_libraries_for_path(
67mir::libraries_for_path(std::string const& path, mir::SharedLibraryProberReport& report)67 std::string const& path,
68 std::function<Selection(std::shared_ptr<mir::SharedLibrary> const&)> const& selector,
69 mir::SharedLibraryProberReport& report)
68{70{
69 report.probing_path(path);71 report.probing_path(path);
70 // We use the error_code overload because we want to throw a std::system_error72 // We use the error_code overload because we want to throw a std::system_error
@@ -87,20 +89,33 @@
8789
88 std::sort(libraries.begin(), libraries.end(), &greater_soname_version);90 std::sort(libraries.begin(), libraries.end(), &greater_soname_version);
8991
90 std::vector<std::shared_ptr<mir::SharedLibrary>> result;
91
92 for(auto& lib : libraries)92 for(auto& lib : libraries)
93 {93 {
94 try94 try
95 {95 {
96 report.loading_library(lib);96 report.loading_library(lib);
97 result.emplace_back(std::make_shared<mir::SharedLibrary>(lib.string()));97 auto const shared_lib = std::make_shared<mir::SharedLibrary>(lib.string());
98
99 if (selector(shared_lib) == Selection::quit)
100 return;
98 }101 }
99 catch (std::runtime_error const& err)102 catch (std::runtime_error const& err)
100 {103 {
101 report.loading_failed(lib, err);104 report.loading_failed(lib, err);
102 }105 }
103 }106 }
107}
108
109std::vector<std::shared_ptr<mir::SharedLibrary>>
110mir::libraries_for_path(std::string const& path, mir::SharedLibraryProberReport& report)
111{
112 std::vector<std::shared_ptr<mir::SharedLibrary>> result;
113
114 select_libraries_for_path(
115 path,
116 [&](std::shared_ptr<mir::SharedLibrary> const& shared_lib)
117 { result.push_back(shared_lib); return Selection::persist; },
118 report);
104119
105 return result;120 return result;
106}121}
107122
=== modified file 'src/common/symbols.map'
--- src/common/symbols.map 2016-01-29 12:17:29 +0000
+++ src/common/symbols.map 2016-01-29 12:17:29 +0000
@@ -211,3 +211,10 @@
211 mir::detail::libname_impl*;211 mir::detail::libname_impl*;
212 };212 };
213} MIR_COMMON_5.1;213} MIR_COMMON_5.1;
214
215MIR_COMMON_unreleased { # New functions in Mir 0.20
216 global:
217 extern "C++" {
218 mir::select_libraries_for_path*;
219 };
220} MIR_COMMON_5v19;
214\ No newline at end of file221\ No newline at end of file
215222
=== modified file 'src/include/common/mir/shared_library_prober.h'
--- src/include/common/mir/shared_library_prober.h 2015-02-22 07:46:25 +0000
+++ src/include/common/mir/shared_library_prober.h 2016-01-29 12:17:29 +0000
@@ -30,6 +30,14 @@
30class SharedLibrary;30class SharedLibrary;
3131
32std::vector<std::shared_ptr<SharedLibrary>> libraries_for_path(std::string const& path, SharedLibraryProberReport& report);32std::vector<std::shared_ptr<SharedLibrary>> libraries_for_path(std::string const& path, SharedLibraryProberReport& report);
33
34// The selector can tell select_libraries_for_path() to persist or quit after each library
35enum class Selection { persist, quit };
36
37void select_libraries_for_path(
38 std::string const& path,
39 std::function<Selection(std::shared_ptr<SharedLibrary> const&)> const& selector,
40 SharedLibraryProberReport& report);
33}41}
3442
3543

Subscribers

People subscribed via source and target branches