Code review comment for lp:~kiithsacmp/stellarium/glexperiment

Ferdinand Majerech (kiithsacmp) wrote :

That is... really weird. It would make sense if the bottleneck were the fragment shader running on that many pixels,
but then GL1 would be fast (unless the driver emulates GL1 with a non-cheap fragment shader, but then trunk would be slow, too).

The largest resolution I can test with is 2560x1440, and the AMD open source drivers have no problem holding
60 FPS at that resolution. My GPU is probably a bit faster (non-mobile), but that still doesn't explain the trunk-vs-refactored difference.

I might run the pre- and post-change code through APITrace to see which calls differ, but not right now
(I'm working on statistics and want to have that done before optimization work).
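
For the trace comparison, something like the rough sketch below is what I have in mind. It assumes both builds were traced and dumped to text with `apitrace dump`, and that each call line looks like `<call number> <function>(<args>)`; that line shape is an assumption on my part and the regex may need adjusting. The names `count_calls` and `diff_counts` are made up for illustration.

```python
import re
from collections import Counter

# Assumed shape of one call line in a text dump: "<call number> <function>(<args>) ..."
CALL_RE = re.compile(r"^\s*\d+\s+(\w+)\(")


def count_calls(lines):
    """Count how often each GL function name appears in the dump lines."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def diff_counts(before, after):
    """Return {function: (before_count, after_count)} where the counts differ."""
    names = set(before) | set(after)
    return {
        name: (before[name], after[name])
        for name in sorted(names)
        if before[name] != after[name]
    }
```

Usage would be along the lines of `diff_counts(count_calls(open("trunk.txt")), count_calls(open("refactored.txt")))`, with both file names hypothetical. This only shows which calls are emitted more or less often, not how long the driver spends on them.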

If your profiler only records GL calls (which is what APITrace does), it's not that useful; you need to know
how the driver handles each call to figure out what is slow. It would be awesome to have a profiler
that actually, well, profiles, which is complicated because the GPU works asynchronously. Afaik AMD gDEBugger does something like that; once Catalyst supports 12.10 I might try it.

It would be useful to have a Mac profile with an Intel GPU (afaik there are no Macs with AMD GPUs?), especially for the trunk-vs-refactored difference. If the problem is still there, it's Mac specific. If not, it's NVidia specific (or NVidia+Mac specific). Does the Mac have Optimus, and if so, is there some way to force the Intel GPU to be used?

Also, it'd be good to have some Windows numbers. I don't have a Windows box, though, nor any experience with
compiling a C++ project on Windows.
