Comment 21 for bug 1638695

Elvis Pranskevichus (elprans) wrote :

After much testing I found what is causing the regression in 16.04 and later. There are several distinct causes, attributable to choices made in debian/rules and to changes in GCC.

Cause #1: the decision to compile `Modules/_math.c` with `-fPIC` *and* link it statically into the python executable [1]. This causes the majority of the slowdown. This may be a bug in GCC or simply an inherent constraint; I didn't find anything specific on this topic, although there are many old bug reports about the interaction of -fPIC with -flto.

Cause #2: enabling `fpectl` [2], specifically passing `--with-fpectl` to `configure`. fpectl is disabled by default in python.org builds and its use is discouraged. Yet Debian builds enable it unconditionally, and it seems to cause a significant performance degradation. The effect is much less noticeable on 14.04 with GCC 4.8.0, but on more recent releases the performance difference seems to be larger.

Plausible Cause #3: stronger stack smashing protection in 16.04, which uses -fstack-protector-strong, whereas 14.04 and earlier used -fstack-protector (with lower performance overhead).
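For anyone who wants to check which of these options a given interpreter was built with, here is a minimal sketch. It assumes the build recorded its configure arguments and CFLAGS in the `CONFIG_ARGS` and `CFLAGS` config variables, which is usual on Unix builds but may be empty or incomplete on some packages. The helper name `summarize_build_flags` is made up for illustration.

```python
import shlex
import sysconfig


def summarize_build_flags(config_args, cflags):
    """Report whether fpectl and stack protection appear in the given flag strings.

    Both arguments are plain strings (or None), so the function can also be
    fed values copied from another machine's build.
    """
    configure_tokens = shlex.split(config_args or "")
    cflag_tokens = shlex.split(cflags or "")

    # Exact token matches, so "-fstack-protector-strong" is not mistaken
    # for plain "-fstack-protector".
    if "-fstack-protector-strong" in cflag_tokens:
        stack_protector = "strong"
    elif "-fstack-protector" in cflag_tokens:
        stack_protector = "basic"
    else:
        stack_protector = None

    return {
        "fpectl": "--with-fpectl" in configure_tokens,
        "stack_protector": stack_protector,
    }


if __name__ == "__main__":
    # Inspect the interpreter running this script.
    print(summarize_build_flags(
        sysconfig.get_config_var("CONFIG_ARGS"),
        sysconfig.get_config_var("CFLAGS"),
    ))
```

This only reports the global flags; it says nothing about per-module settings such as how `Modules/_math.c` was compiled, which still has to be read from debian/rules.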

Also, debian/rules limits PGO's PROFILE_TASK to 377 test suites versus upstream's 397, which appears to affect performance somewhat negatively, although this is not conclusive. What are the reasons behind trimming the tests used for PGO?

Without fpectl, and without -fPIC on _math.c, 2.7.12 built on 16.04 is slower than stock 2.7.6 on 14.04 by about 0.9% in my pyperformance runs [3]. This is in contrast to a whopping 7.95% slowdown when comparing stock versions.

Finally, a vanilla Python 2.7.12 build using GCC 5.4.0, default CFLAGS, default PROFILE_TASK and default Modules/Setup.local consistently runs faster in benchmarks than 2.7.6 (by about 0.7%), but I was not able to pinpoint the exact reason for that.

Note: the percentages above are the relative change in the geometric mean of pyperformance benchmark results.
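To make the metric concrete, here is a minimal sketch of how such a percentage can be computed. It assumes one timing per benchmark for each build, with the same benchmarks in both lists; the function names and the numbers in the usage comment are illustrative only, not taken from the spreadsheet in [3].

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_change_pct(baseline_times, candidate_times):
    """Percent change in the geometric mean; positive means the candidate is slower."""
    base = geometric_mean(baseline_times)
    cand = geometric_mean(candidate_times)
    return (cand - base) / base * 100.0


# Hypothetical usage with made-up timings in seconds:
#   relative_change_pct([0.50, 1.20, 3.00], [0.52, 1.25, 3.30])
```

When both lists cover the same benchmarks, the ratio of the two geometric means equals the geometric mean of the per-benchmark ratios, so a single very slow benchmark does not dominate the result the way it would with an arithmetic mean.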

[1] https://git.launchpad.net/~usd-import-team/ubuntu/+source/python2.7/tree/debian/rules?h=ubuntu/xenial-updates#n421

[2] https://git.launchpad.net/~usd-import-team/ubuntu/+source/python2.7/tree/debian/rules?h=ubuntu/xenial-updates#n117

[3] https://docs.google.com/spreadsheets/d/1L3_gxe-AOYJsXFwGZgFko8jaChB0dFPjK5oMO5T5vj4/edit?usp=sharing