Comment 19 for bug 1812935

Michele Baldessari (michele) wrote :

So I have done a bit of a code walkthrough (note that I have no idea about oslo, memcached or nova):

1) The first exception we hit is this one:
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server [req-284f3071-8eee-4dcb-903c-838f2e024b48 40ca1490773f49f791d3a834af3702c8 8671bdf05abf48f58a9bdcdb0ef4b740 - default default] Exception during message handling: TypeError: object() takes no parameters
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_cache/_memcache_pool.py", line 163, in _get
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server conn = self.queue.pop().connection
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server IndexError: pop from an empty deque

The above exception comes from the following lines:
    def _get(self):
        try:
            conn = self.queue.pop().connection
        except IndexError:
            conn = self._create_connection()
        self._acquired += 1
        return conn

2) This was the first connection request, so the queue of pooled connections was empty. We caught the resulting IndexError, but then failed inside the except branch:
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
...snip...
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_cache/_memcache_pool.py", line 214, in _get
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server conn = ConnectionPool._get(self)
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_cache/_memcache_pool.py", line 165, in _get
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server conn = self._create_connection()
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_cache/_memcache_pool.py", line 206, in _create_connection
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server return _MemcacheClient(self.urls, **self._arguments)
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server TypeError: object() takes no parameters
2019-01-17 13:59:37.096 46 ERROR oslo_messaging.rpc.server
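
The two chained tracebacks above can be reproduced with a small sketch. FakePool and capture_failure are hypothetical names for illustration only, not the oslo.cache classes, and the TypeError is raised by hand to stand in for the failing _MemcacheClient(...) call:

```python
from collections import deque


class FakePool:
    """Hypothetical stand-in for the connection pool, not the oslo.cache class."""

    def __init__(self):
        self.queue = deque()  # empty, as on the very first request

    def _create_connection(self):
        # Stand-in for the failing _MemcacheClient(...) call.
        raise TypeError("object() takes no parameters")

    def _get(self):
        try:
            return self.queue.pop()
        except IndexError:
            # An exception raised here is chained to the IndexError,
            # which is why the log shows both tracebacks.
            return self._create_connection()


def capture_failure():
    try:
        FakePool()._get()
    except TypeError as exc:
        return exc
    return None


err = capture_failure()
print(type(err).__name__, "<-", type(err.__context__).__name__)
```

So the IndexError itself is harmless and expected for an empty pool; it only shows up in the log because the TypeError was raised while it was being handled.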

Now the reason we failed seems to me to be the following sequence:
2.1) We call conn = self._create_connection()
    def _create_connection(self):
        return _MemcacheClient(self.urls, **self._arguments)

2.2) The problem seems to be in how _MemcacheClient(...) replaces several special methods with the ones from object:
class _MemcacheClient(memcache.Client):
    """Thread global memcache client

    As client is inherited from threading.local we have to restore object
    methods overloaded by threading.local so we can reuse clients in
    different threads
    """
    __delattr__ = object.__delattr__
    __getattribute__ = object.__getattribute__
    __new__ = object.__new__
    __setattr__ = object.__setattr__

    def __del__(self):
        pass

To confirm that the object.* overrides above are indeed the root cause of the exception, I commented all of them out in a test container and then redeployed. With the overrides commented out, the deployment succeeds and no errors are seen in the nova logs.

I think we should now focus on what these object.__*__ overrides do and why they break on Python 3. My uninformed hunch is that there is a conflict between the signatures of those methods as defined in the parent class (memcache.Client) and as defined on the Python 3 object.
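
As a possible starting point, here is a minimal sketch that triggers the same kind of TypeError outside of nova. It rests on an assumption I have not verified for this deployment: that the parent class ends up with a __new__ written in Python. FakeBase, WithOverride, WithoutOverride and try_create are hypothetical names, and the exact error text depends on the Python version:

```python
class FakeBase:
    """Hypothetical parent that customises __new__ in Python (an assumption)."""

    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

    def __init__(self, urls, **kwargs):
        self.urls = urls


class WithOverride(FakeBase):
    # Same pattern as _MemcacheClient: restore object's __new__.
    __new__ = object.__new__


class WithoutOverride(FakeBase):
    # Equivalent to commenting the override out.
    pass


def try_create(cls):
    """Return the new instance, or the TypeError raised while creating it."""
    try:
        return cls(["127.0.0.1:11211"], debug=0)
    except TypeError as exc:
        return exc


broken = try_create(WithOverride)
working = try_create(WithoutOverride)
print(type(broken).__name__, type(working).__name__)
```

In this sketch the constructor arguments reach object.__new__ directly, and it rejects them. Without the override the parent's __new__ absorbs them, which would be consistent with the deployment succeeding once the overrides were commented out.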