Merge lp:~mir-team/mir/add-multiplexing-dispatchable into lp:mir
| Status: | Merged |
|---|---|
| Approved by: | Chris Halse Rogers on 2015-02-18 |
| Approved revision: | 2310 |
| Merged at revision: | 2326 |
| Proposed branch: | lp:~mir-team/mir/add-multiplexing-dispatchable |
| Merge into: | lp:mir |
| Prerequisite: | lp:~mir-team/mir/add-dispatchable-interface |
| Diff against target: |
1390 lines (+1216/-46) 14 files modified
benchmarks/CMakeLists.txt (+8/-0) benchmarks/benchmark_multiplexing_dispatchable.cpp (+122/-0) include/common/mir/dispatch/multiplexing_dispatchable.h (+106/-0) src/common/dispatch/CMakeLists.txt (+2/-0) src/common/dispatch/multiplexing_dispatchable.cpp (+300/-0) src/common/dispatch/simple_dispatch_thread.cpp (+3/-45) src/common/dispatch/utils.cpp (+65/-0) src/common/dispatch/utils.h (+38/-0) src/common/input/android/android_input_receiver_thread.cpp (+1/-1) src/common/symbols.map (+11/-0) tests/unit-tests/CMakeLists.txt (+1/-0) tests/unit-tests/dispatch/CMakeLists.txt (+2/-0) tests/unit-tests/dispatch/test_dispatch_utils.cpp (+70/-0) tests/unit-tests/dispatch/test_multiplexing_dispatchable.cpp (+487/-0) |
| To merge this branch: | bzr merge lp:~mir-team/mir/add-multiplexing-dispatchable |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Daniel van Vugt | Abstain on 2015-02-17 | ||
| Alexandros Frantzis (community) | Approve on 2015-02-13 | ||
| PS Jenkins bot | continuous-integration | Approve on 2015-02-12 | |
| Chris Halse Rogers | Abstain on 2015-02-11 | ||
| Alan Griffiths | 2015-01-16 | Abstain on 2015-01-28 | |
| Robert Carr (community) | Approve on 2015-01-27 | ||
|
Review via email:
|
|||
Commit Message
Add MultiplexingDis
This does exactly what it says; multiplexes multiple Dispatchables into a single Dispatchable. Each dispatch() of the MultiplexingDis
Description of the Change
The second part of eventloop integration. This is pretty obvious; it's necessary because we want to present a single fd to wait on for the client regardless of the internal implementation, and we currently have one RPC fd + one input fd per surface with input.
Note that there's potentially a small performance improvement available if we specialise adding a MultiplexingDis
| PS Jenkins bot (ps-jenkins) wrote : | # |
FAILED: Continuous integration, rev:2256
http://
Executed test runs:
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
Click here to trigger a rebuild:
http://
- 2232. By Daniel van Vugt on 2015-01-19
-
Add client API access to set surfaces "hidden".
Approved by PS Jenkins bot, Alberto Aguirre, Cemil Azizoglu, Alan Griffiths.
- 2233. By Kevin DuBois on 2015-01-19
-
android: make the helper class HwcLayerList more helpful by adding a common hwc list function to update the framebuffer.
Approved by PS Jenkins bot, Alexandros Frantzis, Robert Carr.
- 2234. By Alan Griffiths on 2015-01-19
-
tests: fix logging option for some acceptance and integration tests
(LP: #1408231). Fixes: https://bugs.launchpad .net/bugs/ 1408231. Approved by PS Jenkins bot, Daniel van Vugt, Kevin DuBois, Robert Carr, Cemil Azizoglu.
- 2235. By Daniel van Vugt on 2015-01-19
-
Modifier flags: Avoid casting to impossible (unnamed) enum values. That would only confuse debuggers and future maintainers.
Approved by Cemil Azizoglu, PS Jenkins bot.
- 2236. By Daniel van Vugt on 2015-01-20
-
demo-shell: Allow mouse resizing (Alt+middlebutton) to control any edge
or corner.Approved by Alexandros Frantzis, PS Jenkins bot.
- 2237. By Alan Griffiths on 2015-01-20
-
Bump server ABI.
Approved by Daniel van Vugt, PS Jenkins bot.
- 2238. By Alan Griffiths on 2015-01-20
-
shell: remove FocusSetter from public interface.
Approved by PS Jenkins bot, Alexandros Frantzis.
| PS Jenkins bot (ps-jenkins) wrote : | # |
FAILED: Continuous integration, rev:2262
http://
Executed test runs:
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
Click here to trigger a rebuild:
http://
- 2239. By Daniel van Vugt on 2015-01-20
-
Document server ABI bump to 29.
Approved by Alan Griffiths, PS Jenkins bot.
- 2240. By Chris Halse Rogers on 2015-01-21
-
Add the start of a pollable fd interface to StreamTransport
- 2241. By Chris Halse Rogers on 2015-01-21
-
Test that watch_fd() is pollable
- 2242. By Chris Halse Rogers on 2015-01-21
-
Wire up watch_fd() to the actual epoller
- 2243. By Chris Halse Rogers on 2015-01-21
-
Add dispatch() method to StreamTransport
- 2244. By Chris Halse Rogers on 2015-01-21
-
Use mir::Fd in StreamTransport
- 2245. By Chris Halse Rogers on 2015-01-21
-
Remove superfluous ~StreamTranspor
tTest - 2246. By Chris Halse Rogers on 2015-01-21
-
Switch StreamSocketTra
nsport over to a dispatch() driven interface. Only the specific tests have been updated; at this point all client code will fail.
- 2247. By Chris Halse Rogers on 2015-01-21
-
Drop the ‘more events pending’ return value from StreamTransport
::dispatch( ). Anyone polling watch_fd() will do this with equal efficiency.
- 2248. By Chris Halse Rogers on 2015-01-21
-
Move dispatch tests to the top of the file.
These are more primitive than the observer & data read/write ones
- 2249. By Chris Halse Rogers on 2015-01-21
-
Simplify StreamSocketTra
nsport: :dispatch. We don't actually need to do all that exception-catching, particularly since
we don't do anything but eat the exceptions anyway. - 2250. By Chris Halse Rogers on 2015-01-21
-
Test that we dispatch a signle event for a single ::dispatch()
- 2251. By Chris Halse Rogers on 2015-01-21
-
Fix more docs; Observers are now called from the dispatch() thread
- 2252. By Chris Halse Rogers on 2015-01-21
-
Factor out fd_is_readable and fd_becomes_
readable. These are broadly useful; now live in mir_test
- 2253. By Chris Halse Rogers on 2015-01-21
-
Add a simple eventloop thread implementation
- 2254. By Chris Halse Rogers on 2015-01-21
-
Promote Dispatchable to a top-level namespace
- 2255. By Chris Halse Rogers on 2015-01-21
-
Move dispatch-y things to mir::dispatch namespace
- 2256. By Chris Halse Rogers on 2015-01-21
-
Also rename the SimpleRPCThread tests
- 2257. By Chris Halse Rogers on 2015-01-21
-
(Prepare to) Pass the set of events through to Dispatchable:
:dispatch. It can be useful for a Dispatchable to know why it's been dispatched.
Specifically, with this we'll be to implement StreamSocketTransport
without an additional epoll fd. - 2258. By Chris Halse Rogers on 2015-01-21
-
Make SimpleDispatchT
hread pass the relevant fd_event to its dispatchee - 2259. By Chris Halse Rogers on 2015-01-21
-
Simplify StreamSocketTra
nsport with the new dispatch(fd_events) interface. We no longer need an epoll fd simply to monitor when our socket is closed.
- 2260. By Chris Halse Rogers on 2015-01-21
-
Return false from Dispatchable:
:dispatch( ) to indicate no further events. Once the remote end has closed the connection a fd will remain in state
fd_event::remote_ closed indefinitely. Likewise, a file descriptor at EOF
will indefinitely be readable.At the same time it's unsafe to ::close the relevant fd. For a start,
there's no guarantee that someone hasn't dup()ed it, or kept a reference
to the mir::Fd that will now close some random file descriptor when it
loses scope. Even if neither of those are true, close()ing a file descriptor
when some other thread is in select(), poll(), or epoll_wait() invokes
undefined behaviour.Let the Dispatchable return false from dispatch to signal that it should
no longer be called, and punt that problem to whatever is dispatching it. - 2261. By Chris Halse Rogers on 2015-01-21
-
Reword watch_fd() comment
- 2262. By Chris Halse Rogers on 2015-01-21
-
socket_fd can be const again
- 2263. By Chris Halse Rogers on 2015-01-21
-
Consume the result of read() in SimpleDispatchT
hread tests - 2264. By Chris Halse Rogers on 2015-01-21
-
Use (and test) Dispatchable:
:relevant_ events( ) - 2265. By Chris Halse Rogers on 2015-01-21
-
Slightly better docs for Dispatchable
- 2266. By Chris Halse Rogers on 2015-01-21
-
CamelCase fd_event and fd_events
- 2267. By Chris Halse Rogers on 2015-01-21
-
Fixup test_naming_policy
- 2268. By Chris Halse Rogers on 2015-01-21
-
Move mt::fd_utils implementation out of header file
- 2269. By Chris Halse Rogers on 2015-01-21
-
Port mir::test::Pipe to mir::Fd
- 2270. By Chris Halse Rogers on 2015-01-21
-
Augment mt::Pipe to accept pipe2() flags
- 2271. By Chris Halse Rogers on 2015-01-21
-
Add a TestDispatchable to mir::test
- 2272. By Chris Halse Rogers on 2015-01-21
-
Simplify SimpleDispatchT
hread tests with mt::TestDispatc hable - 2273. By Chris Halse Rogers on 2015-01-21
-
Fix all the copyrights
- 2274. By Chris Halse Rogers on 2015-01-21
-
SimpleDispatchT
hread: Correctly close shutdown mir::Fd - 2275. By Chris Halse Rogers on 2015-01-21
-
Dispatchable: Doc fixup
| PS Jenkins bot (ps-jenkins) wrote : | # |
PASSED: Continuous integration, rev:2287
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
SUCCESS: http://
Click here to trigger a rebuild:
http://
| Chris Halse Rogers (raof) wrote : | # |
For what it's worth, ThreadSanitizer gives this code a clean bill of health.
| PS Jenkins bot (ps-jenkins) wrote : | # |
PASSED: Continuous integration, rev:2288
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
SUCCESS: http://
Click here to trigger a rebuild:
http://
- 2276. By Chris Halse Rogers on 2015-01-22
-
Add new public headers to SHA1sums
| PS Jenkins bot (ps-jenkins) wrote : | # |
FAILED: Continuous integration, rev:2290
http://
Executed test runs:
SUCCESS: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
Click here to trigger a rebuild:
http://
| Alan Griffiths (alan-griffiths) wrote : | # |
It isn't clear to me how this fits into the system - I would have expected the code to live in libmirclient, not libmircommon. Why am I wrong?
| Robert Carr (robertcarr) wrote : | # |
Looks good to me. Chris and I discussed the choice of the GC implementation some on IRC...
If anyone else is as confused as I initially was I've archived the conversation for reading:
http://
Some "nonblocking" test comments from the first pass:
+ auto dispatchee = std::make_
This looks like a common enough usage throughout multiple tests that perhaps this could be the default TestDispatchable constructor (with a ::dispatched member variable).
+TEST(DispatchU
Roundtrip?
- 2277. By Chris Halse Rogers on 2015-01-27
-
Drop accidental flag.h detritus from ABI SHAsums
- 2278. By Chris Halse Rogers on 2015-01-27
-
Add an epoll-based MultiplexingDis
patchable - 2279. By Chris Halse Rogers on 2015-01-27
-
Add ability to remove a Dispatchable from a MultiDispatchable
- 2280. By Chris Halse Rogers on 2015-01-27
-
Improve error message for duplicate watches
- 2281. By Chris Halse Rogers on 2015-01-27
-
Make add_watch failures not change the state of the MultiDispatchable
- 2282. By Chris Halse Rogers on 2015-01-27
-
Make MultiDispatchable threadsafe.
We rely on the threadsafety of epoll_ctl/
epoll_wait, and use the EPOLLONESHOT
mode to ensure that each time a watched file descriptor becomes readable it is
dispatched from exactly one thread waiting in epoll_wait. - 2283. By Chris Halse Rogers on 2015-01-27
-
Add optional concurrent dispatch mode to MultiDispatch
- 2284. By Chris Halse Rogers on 2015-01-27
-
Factor out epoll_to_fd_event and fd_event_to_epoll.
Also test them, fixing the embarassing bugs in the existing implementation.
- 2285. By Chris Halse Rogers on 2015-01-27
-
Add some doxygen to MultiplexingDis
patchable - 2286. By Chris Halse Rogers on 2015-01-27
-
Interface for adding/removing a raw {fd, std::function<
void()> } pair - 2287. By Chris Halse Rogers on 2015-01-27
-
Test (and fix) that remove_watch is threadsafe
- 2288. By Chris Halse Rogers on 2015-01-27
-
More threadsafety testing (and a fix) for MultiplexingDis
patchable - 2289. By Chris Halse Rogers on 2015-01-27
-
Test that Dispatchables that return false from dispatch() are removed
- 2290. By Chris Halse Rogers on 2015-01-27
-
Oh, yeah. It's 2015, that's right
- 2291. By Chris Halse Rogers on 2015-01-27
-
Add new public headers to the SHA1sums
| Chris Halse Rogers (raof) wrote : | # |
@Alan: It could be moved to libmirclient, this is true. It's in libmircommon because (a) it's not client-specific code at all, and could be used in the server in future, and (b) because it evolved from code in mircommon.
| PS Jenkins bot (ps-jenkins) wrote : | # |
FAILED: Continuous integration, rev:2291
http://
Executed test runs:
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
Click here to trigger a rebuild:
http://
| Alan Griffiths (alan-griffiths) wrote : | # |
> @Alan: It could be moved to libmirclient, this is true. It's in libmircommon
> because (a) it's not client-specific code at all, and could be used in the
> server in future, and (b) because it evolved from code in mircommon.
we envisage three types of downstream:
1. Clients - use libmirclient + libmircommon
2. Servers - use libmirserver + libmirclient + libmirplatform + libmircommon
3. Platforms - use libmirplatform + libmircommon
Unless we see the code being used by platforms then I'd rather see it in libmirclient to minimize those dependencies.
- 2292. By Chris Halse Rogers on 2015-01-27
-
Merge prerequisite branch
| PS Jenkins bot (ps-jenkins) wrote : | # |
PASSED: Continuous integration, rev:2292
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
SUCCESS: http://
Click here to trigger a rebuild:
http://
- 2293. By Chris Halse Rogers on 2015-01-28
-
Fix epoll event request for subsequent dispatches
- 2294. By Chris Halse Rogers on 2015-01-28
-
Speed up a particularly slow unit-test
| Chris Halse Rogers (raof) wrote : | # |
Oh, yeah. That's right!
@Alan: The reason why this is in mircommon is because AndroidInputRec
| PS Jenkins bot (ps-jenkins) wrote : | # |
PASSED: Continuous integration, rev:2294
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
SUCCESS: http://
Click here to trigger a rebuild:
http://
| Alan Griffiths (alan-griffiths) wrote : | # |
OK, I'm not convinced we have mir::input in the correct library but this MP didn't cause that.
The next concern I have is whether this code is sufficiently generally applicable to meet the needs of different toolkits. It would be a shame it they each implemented something similar. But I'm not sufficiently informed about that to block.
| Robert Carr (robertcarr) wrote : | # |
InputReceiver was only in mircommon for the qtmir psuedo surface of days past.
| Chris Halse Rogers (raof) wrote : | # |
@Robert - so there's a later cleanup available by way of moving that stuff into mirclient. Sweet.
@Alan - I'm not sure what you mean? This code isn't visible to toolkits; it's purely an implementation detail. It's sufficient to meet the needs of *my* code in subsequent MPs. It'd also be fine for obvious extensions of the client-fd API, such as multiple dispatch queues
| Chris Halse Rogers (raof) wrote : | # |
Hm. And if you mean “implemented something similar” then they mostly already do - each toolkit has its own “add this fd to the set of fds my event loop watches” type API. The GLib version of that specifically is not suitable for this use because you can't get a file descriptor *out* of it, so (unless I wanted to export a GSource from the Mir client API) it's not usable for this purpose.
| Alexandros Frantzis (afrantzis) wrote : | # |
Here is a scenario in which the code seems to not work correctly:
Assume there is one thread in dispatch() (*in_current_
After a call to remove_watch(fd1), the gc_queue contains info to remove dispatchee1, and in_current_
Now a call to remove_watch(fd2) comes, finds *in_current_
If I understood the intention of the code correctly, then dispatchee2 shouldn't have been freed immediately (i.e. while the thread is running). Have I missed something?
| Robert Carr (robertcarr) wrote : | # |
I still wouldn't be unhappy to see the difficult to understand GC system replaced with a read/write mutex based implementation as discussed in the IRC log ;)
- 2295. By Chris Halse Rogers on 2015-02-04
-
Merge trunk
- 2296. By Chris Halse Rogers on 2015-02-04
-
Fix race in MultiplexingDis
patcher: :remove_ watch() We tried to optimise the case where there are no threads in dispatch() by
immediately deleting the relevant Dispatchable.Sadly, this was racy in the presence of multiple calls to remove_watch():
after the first call to remove_watch() we can no longer determine whether
there are any threads in dispatch() using the Dispatchable we want to remove.Add a test, and remove the optimisation to fix it.
- 2297. By Chris Halse Rogers on 2015-02-04
-
Better document the assumptions pull_from_gc_queue makes
| Chris Halse Rogers (raof) wrote : | # |
Thanks for the catch, Alexandros!
I've written a test which triggers that bug, and fixed it.
Bah trying to add a fastpath :)
| PS Jenkins bot (ps-jenkins) wrote : | # |
PASSED: Continuous integration, rev:2297
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
SUCCESS: http://
Click here to trigger a rebuild:
http://
| Chris Halse Rogers (raof) wrote : | # |
Hm, actually, that test is insufficient. Let me fix it.
- 2298. By Chris Halse Rogers on 2015-02-05
-
Fix multiple-removal threadsafety test, and associated code.
| PS Jenkins bot (ps-jenkins) wrote : | # |
FAILED: Continuous integration, rev:2298
http://
Executed test runs:
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
Click here to trigger a rebuild:
http://
| Robert Carr (robertcarr) wrote : | # |
SHA1SUMS STRIKE AGAIN
- 2299. By Chris Halse Rogers on 2015-02-06
-
Merge trunk
| PS Jenkins bot (ps-jenkins) wrote : | # |
PASSED: Continuous integration, rev:2299
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
SUCCESS: http://
Click here to trigger a rebuild:
http://
| Alexandros Frantzis (afrantzis) wrote : | # |
> I still wouldn't be unhappy to see the difficult to understand GC system replaced with a
> read/write mutex based implementation as discussed in the IRC log ;)
This sounds sensible. I haven't though through the details, but it seems that a read/write lock would make the code *much* simpler. There is going to be a performance hit of course, but I am not convinced it is a problem (uncontended locks are cheap anyway). If we find that it is a bottleneck, we can then look into more complex solutions, such as the lockless-
That being said, the new code seems correct, although I think it's on the verge of being "so complicated that there are no obvious deficiencies".
| Chris Halse Rogers (raof) wrote : | # |
Ok! Good news for those who think the atomics+GC system is too complex. I've done some benchmarks for worst-case overhead - these spin up 8 threads (one per logical CPU on my system) and concurrently dispatch a dispatchable that does nothing but increment an atomic counter until it reaches the limit.
Three different implementations - the current atomic one, RecursiveReadWr
Funky atomics+GC:
~/Canonical/
Dispatching 10000000 times took 9488484241ns
Dispatching 10000000 times took 9396650870ns
Dispatching 10000000 times took 9430338483ns
Dispatching 10000000 times took 9422382181ns
Dispatching 10000000 times took 9316097769ns
Dispatching 10000000 times took 9048800767ns
Dispatching 10000000 times took 9230631179ns
Dispatching 10000000 times took 9059141983ns
Dispatching 10000000 times took 9442825755ns
Dispatching 10000000 times took 9030773329ns
Performance counter stats for 'bin/benchmark_
56429.679021 task-clock (msec) # 6.075 CPUs utilized ( +- 0.79% )
1,090,378 context-switches # 0.019 M/sec ( +- 0.94% )
266 cpu-migrations # 0.005 K/sec ( +- 13.22% )
176 page-faults # 0.003 K/sec ( +- 0.54% )
166,455,416,913 cycles # 2.950 GHz ( +- 0.79% )
<not supported> stalled-
<not supported> stalled-
55,263,597,398 instructions # 0.33 insns per cycle ( +- 0.73% )
13,009,159,774 branches # 230.538 M/sec ( +- 0.88% )
122,711,777 branch-misses # 0.94% of all branches ( +- 0.53% )
9.288753925 seconds time elapsed ( +- 0.62% )
RecursiveReadWr
~/Canonical/
Dispatching 10000000 times took 17776611842ns
Dispatching 10000000 times took 17162422952ns
Dispatching 10000000 times took 16978416490ns
Dispatching 10000000 times took 16816637546ns
Dispatching 10000000 times took 17111542565ns
Dispatching 10000000 times took 16939661008ns
Dispatching 10000000 times took 17518474808ns
Dispatching 10000000 times took 16914539739ns
Dispatching 10000000 times took 17174350069ns
Dispatching 10000000 times took 17391618505ns
Performance counter stats for 'bin/benchmark_
108793.953081 task-clock (msec) # 6.332 CPUs utilized ( +- 0.62% )
4,336,580 context-switches # 0.040 M/sec ( +- 0.91% )
1,617 cpu-migrations # 0.015 K/sec ( +- 22.08% )
183 page-faults # 0.002 K/sec ...
- 2300. By Chris Halse Rogers on 2015-02-11
-
Add benchmark for md::Multiplexin
gDispatchable - 2301. By Chris Halse Rogers on 2015-02-11
-
Use a pthread read-write lock in md::MultiDispat
chable rather than atomics+GC. It turns out that pthread_rwlock is just as fast and is significantly
more obviously correct.The existing mir::RecursiveR
eadWriteMutex is half as fast and has a
lot of behaviours we don't need here; recursive locking and being
able to take both a read-lock and a write-lock in the same thread. - 2302. By Chris Halse Rogers on 2015-02-11
-
Make the benchmark of md::Multiplexin
gDispatchable a bit more specific. Make the dispatch count thread local, rather than process-global; this significantly
reduces the time spent bouncing CPU cachelines in dispatch(), and so makes the runtime
more directly due to md::MultiplexingDispatchable overhead. - 2303. By Chris Halse Rogers on 2015-02-11
-
Add a test for, and fix, concurrent automatic removals in md::Multiplexin
Dispatchable - 2304. By Chris Halse Rogers on 2015-02-11
-
Add copyright header to MultiplexingDis
patchable benchmark
- 2305. By Chris Halse Rogers on 2015-02-11
-
Fix typo in comment
- 2306. By Chris Halse Rogers on 2015-02-11
-
Mark MultiplexingDis
patchable as final. It's not possible to usefully derive from it, so make it a compile
error to try.Might also help compilers generate better code. Who knows!
| PS Jenkins bot (ps-jenkins) wrote : | # |
FAILED: Continuous integration, rev:2306
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
Click here to trigger a rebuild:
http://
- 2307. By Chris Halse Rogers on 2015-02-12
-
Oops. RAII correctly. Thanks, ThreadSanitizer!
- 2308. By Chris Halse Rogers on 2015-02-12
-
Check pipe() and write() returns in MultiplexingDis
patchable benchmark - 2309. By Chris Halse Rogers on 2015-02-12
-
Make ~ReadLock() and ~WriteLock() noexcept.
While pthread_
rwlock_ unlock can technically fail, it can only fail in cases where
the constructor would have thrown anyway, so we gain nothing by error-checking
in the destructor.
| PS Jenkins bot (ps-jenkins) wrote : | # |
PASSED: Continuous integration, rev:2309
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
SUCCESS: http://
Click here to trigger a rebuild:
http://
- 2310. By Chris Halse Rogers on 2015-02-12
-
Merge trunk
| PS Jenkins bot (ps-jenkins) wrote : | # |
PASSED: Continuous integration, rev:2310
http://
Executed test runs:
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
deb: http://
SUCCESS: http://
SUCCESS: http://
Click here to trigger a rebuild:
http://
| Daniel van Vugt (vanvugt) wrote : | # |
(1) This will limit our performance to only being able to handle one event per timeslice/
459 + auto result = epoll_wait(
Much higher performance can be achieved in theory if you pass an array of N events, and loop through them afterwards. Then our theoretical performance upper limit becomes N events per timeslice instead of one. Rather important for the multiplexing case...
For an example see: man epoll
| Daniel van Vugt (vanvugt) wrote : | # |
And yes epoll_wait is that dumb. Looking at the source it's just implemented a light wrapper around a syscall. So if we don't do the batching in our own code it won't happen. And we'll be forced to switch to/from the kernel once per event... :P
| Chris Halse Rogers (raof) wrote : | # |
Why do you think the epoll call is going to relinquish our timeslice? It's got a timeout of 0 - ie: return immediately without sleeping.
While it's true that we could achieve higher single-thread throughput by acquiring all available events with a single call, this would be either at the cost of multi-thread throughput or significantly more complex code. If we batch then we have a single thread sequentially processing each member of the batch - which somewhat defeats the purpose of making this thread safe, which I specifically use to provide the existing threaded behaviour in later branches.
While syscalls are relatively expensive, they certainly don't terminate your timeslice. Check out the benchmark_
| Chris Halse Rogers (raof) wrote : | # |
Or, to put it another way - when I switched the benchmark to using a thread-local counter for the test dispatchable it reduced the runtime of the benchmark by 20%. The time spent bouncing an atomic int from CPU to CPU was a significant fraction of the total runtime.
| Daniel van Vugt (vanvugt) wrote : | # |
Yeah realistically we won't relinquish the time slice (assuming the kernel is sane). But bouncing into and out of kernel space will cost us. This is why the glibc documentation (and even asio's implementation) uses an array of events. And bouncing between multiple threads costs even more.
If you have multiple events ready and there's a chance you can handle most of them in one time slice (very likely else we need to fix our code), then the array approach will be dramatically more efficient.
You can decide separately if more than one thread is still required. But keep in mind that in a single time slice we can do billions of simple operations. If you intentionally divide up responsibility to a new thread each time, you limit yourself to only thousands of operations. That's potentially a million times slower for the simplest of ops.
It might sound like we're splitting hairs but this is the stuff that will help Mir to catch up to Wayland's performance. And we aim to overtake.
| Daniel van Vugt (vanvugt) wrote : | # |
Correction:
You can decide separately if more than one thread is still required. But keep in mind that in a single time slice we can do millions of simple operations without any context switching. If you intentionally divide up responsibility to a different thread each event, you might not use any more CPU time, but you will take significantly greater real time.
Threads are the problem we need to kill in rewriting our sockets code and catching up to Wayland.
| Chris Halse Rogers (raof) wrote : | # |
1) Bouncing in and out of kernel space costs us, but not all that much.
2) If there are multiple events on a single Dispatchable then passing in an array of events will still only pick up that one Dispatchable
3) If there are multiple Dispatchables ready then this is *exactly* the time that we need to process them on multiple threads (for our threaded input dispatch model).
4) If you're doing eventloop dispatch then you *still* have to dispatch a single event at a time, because that's what toolkits want.
Threads are absolutely not the primary problem with our RPC code, and batching epoll calls will provide no meaningful improvement to the performance of this code.
| Daniel van Vugt (vanvugt) wrote : | # |
I think we might be getting ahead of ourselves.
So long as the Mir server socket code doesn't use this, it's not a problem. It's only our server-side socket handling that will need a more efficient approach than epoll_wait with a single event.
| Chris Halse Rogers (raof) wrote : | # |
So, can I assume that's not needs-fixing, then?
| Daniel van Vugt (vanvugt) wrote : | # |
If you mean "we won't ever use this for the server side socket logic" then I'm happy to abstain.
| Chris Halse Rogers (raof) wrote : | # |
I mean, we don't currently use this for server side socket logic, and
if we do we can revisit the batching.
At some point I can also optimise the (actually pretty common case) of
adding a watch on a MultiplexingDis
| Chris Halse Rogers (raof) wrote : | # |
And once we get to the server side logic we can work out whether
threading client requests is a benefit, too.

FAILED: Continuous integration, rev:2255 jenkins. qa.ubuntu. com/job/ mir-ci/ 2673/ jenkins. qa.ubuntu. com/job/ mir-android- vivid-i386- build/886/ console jenkins. qa.ubuntu. com/job/ mir-clang- vivid-amd64- build/886/ console jenkins. qa.ubuntu. com/job/ mir-mediumtests -vivid- touch/848/ console jenkins. qa.ubuntu. com/job/ mir-vivid- amd64-ci/ 670/console jenkins. qa.ubuntu. com/job/ mir-mediumtests -builder- vivid-armhf/ 848/console
http://
Executed test runs:
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
FAILURE: http://
Click here to trigger a rebuild: s-jenkins. ubuntu- ci:8080/ job/mir- ci/2673/ rebuild
http://