Comment 69 for bug 1640518

Aaron Sawdey (acsawdey) wrote :

An update on my experiments:
* 500 runs, no failures, with TLE disabled
* 500 runs, no failures, with TLE enabled but an mprotect() syscall added in the Canary constructor/destructor
* 500 runs, 11 failures, with TLE enabled, so about a 2% failure rate
* Tried switching SMT off and, interestingly, got 200 runs with no failures with TLE enabled

This suggests to me that the timing window for this race condition is rather tight. Many of the failures are in the checksum right after the memset in the Canary constructor, which means another thread comes back and writes to the stack after memset wrote it but before the checksum read it. Also, if strace's syscall timing is to be believed, mprotect() takes under 8 microseconds 14% of the time, and even that seems to be sufficient to prevent the problem. Finally, forcing other pthreads to always run on a different processor core (by disabling SMT) also apparently eliminates it.
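
To make the window concrete, here is a minimal sketch of the memset-then-checksum pattern described above. It is not the actual Canary class from the test case: the name CanarySketch, the buffer size, the fill byte, and the corrupt() helper are all assumptions made for illustration. The stray write is simulated from the same thread, so this only shows what the checksum detects, not the race itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the Canary in the test case (names and sizes assumed).
class CanarySketch {
public:
    static constexpr std::size_t kSize = 256;
    static constexpr unsigned char kFill = 0xA5;

    CanarySketch() {
        // Step 1: write a known pattern into the stack buffer.
        std::memset(buf_, kFill, kSize);
        // Step 2: read it back. A write from another thread that lands
        // between step 1 and step 2 would show up as a mismatch here.
        ok_at_construct_ = intact();
    }

    // True if the buffer still holds the pattern written by the constructor.
    bool intact() const { return checksum() == expected(); }

    // Result of the check done inside the constructor.
    bool ok_at_construct() const { return ok_at_construct_; }

    // Simulates a stray write into the buffer (illustration only).
    void corrupt(std::size_t i, unsigned char v) {
        assert(i < kSize);
        buf_[i] = v;
    }

private:
    // Simple additive checksum over the buffer.
    std::uint32_t checksum() const {
        std::uint32_t sum = 0;
        for (std::size_t i = 0; i < kSize; ++i) sum += buf_[i];
        return sum;
    }

    static constexpr std::uint32_t expected() {
        return static_cast<std::uint32_t>(kFill) * kSize;
    }

    unsigned char buf_[kSize];
    bool ok_at_construct_ = false;
};
```

In this sketch, anything that lengthens the gap between step 1 and step 2, or keeps the other thread off the same core, changes how likely a stray write is to land inside the window, which would be consistent with the mprotect() and SMT observations.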