I know that XCB is slower than Xlib because Xlib uses a larger cache size (I've experienced that slowness myself). That only makes XCB a bit slower, though, so I don't know whether the problem comes from there.
Nevertheless, you can try this: before you use Xlib (that is, before running the wm), set the environment variable XLIBBUFFERSIZE to 4 and see whether you get the same problem as with XCB.
I don't think that is the problem, but you can try ;)
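
If it helps, here is a rough sketch of that test as a small Python launcher. It only assumes that the variable has to be in the environment before the wm process starts; the helper names and the command `my-wm` are placeholders I made up, not anything from Xlib or from your setup.

```python
import os
import subprocess


def env_with_small_xlib_buffer(size=4, base=None):
    """Return a copy of the environment with XLIBBUFFERSIZE set to `size`."""
    # Copy so the calling process's own environment is left untouched.
    env = dict(os.environ if base is None else base)
    env["XLIBBUFFERSIZE"] = str(size)
    return env


def launch_wm(command):
    """Start the window manager with the reduced Xlib buffer size.

    `command` is a hypothetical argument list such as ["my-wm"].
    """
    return subprocess.Popen(command, env=env_with_small_xlib_buffer())
```

Calling `launch_wm(["my-wm"])` with your real wm command in place of the placeholder should start it with the small buffer, so you can compare its behaviour with the XCB build.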