UnicodeDecodeError when loading job files

Bug #1015174 reported by Brendan Donegan
This bug affects 1 person
Affects: Checkbox
Status: Fix Released
Importance: High
Assigned to: Daniel Manrique

Bug Description

Recent runs of checkbox have the following error strewn through the log file:

2012-06-19 11:05:07,469 ERROR Error running event handler <string> MessageInfo.message_file(<_io.TextIOWrapper name='/usr/share/checkbox/jobs/server-services.txt' mode='r' encoding='ANSI_X3.4-1968'>, /usr/share/checkbox/jobs/server-services.txt) for event type 'message-file'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/checkbox/reactor.py", line 74, in fire
    results.append(handler(*args, **kwargs))
  File "<string>", line 112, in message_file
  File "/usr/lib/python3/dist-packages/checkbox/lib/template_i18n.py", line 132, in load_file
    elements = super(TemplateI18n, self).load_file(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/checkbox/lib/template.py", line 73, in load_file
    for string in self._reader(file):
  File "/usr/lib/python3/dist-packages/checkbox/lib/template.py", line 31, in _reader
    buffer_new = file.read(size)
  File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 253: ordinal not in range(128)

This prevents any of the jobs from being loaded and any tests from being run.

Daniel Manrique (roadmr) wrote :

I notice

encoding='ANSI_X3.4-1968'

which is weird, because we should read the files with encoding='UTF-8' I think.

This may be happening because the locale is not set, so the default charset is used. See this:

charset 'C' (canonical name: ANSI_X3.4-1968) will be used.

So one solution would be to explicitly set the encoding for when a file is read.

This is just speculation, though. BTW, this has happened only on server installations, which are more prone to having unusual language configurations. Compare the values of LANG: on a desktop it is, for example, "en_US.UTF-8", while on a server it is just "en_US".

I'll keep looking into this.
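
A minimal sketch of the "explicitly set the encoding" idea, not the actual checkbox fix. The helper name read_job_file and the sample job text are made up for illustration. When open() is called without an encoding it uses the locale's preferred encoding, which is what produced ANSI_X3.4-1968 in the log above. Note that Python 3.7 and later may coerce the C locale to UTF-8, so the printed default can differ from what this report shows.

```python
import locale
import os
import tempfile


def read_job_file(path, encoding="utf-8"):
    """Read a text file with an explicit encoding (hypothetical helper).

    Passing encoding= means the locale default is never consulted.
    """
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()


# The encoding open() falls back to when none is given.
default_encoding = locale.getpreferredencoding(False)
print("locale default encoding:", default_encoding)

# Write a small UTF-8 file containing a non-ASCII character (sample data).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write("name: usb/detect\ndescription: Prüfung\n".encode("utf-8"))
    sample_path = tmp.name

# Decodes correctly whatever LANG is set to.
text = read_job_file(sample_path)
os.unlink(sample_path)
print(text)
```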

Daniel Manrique (roadmr) wrote :

I modified the /etc/init.d/checkbox-certification server script to output a list of exports as well as LANG, and it turns out LANG is not set:

export INSTANCE=''
export JOB='networking'
export PATH='/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin'
export PWD='/'
export TERM='linux'
export UPSTART_EVENTS='started'
export UPSTART_INSTANCE=''
export UPSTART_JOB='checkbox-certification-server'

Echoing $LANG turns up nothing.

I'll next try to find a place to explicitly set the encoding so that checkbox is at least able to read the jobs. If this works, we can then determine whether it's sensible to always fall back to UTF-8 encoding, but honor the system's encoding first if present.
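
One way the "honor the system's encoding first, otherwise fall back to UTF-8" policy could look. This is only a sketch under assumptions: pick_encoding is a hypothetical name, and it treats a locale value without a codeset (such as "en_US" or "C") as "no encoding present". It reads the standard locale variables in POSIX precedence order (LC_ALL, then LC_CTYPE, then LANG).

```python
import codecs


def pick_encoding(environ, fallback="utf-8"):
    """Return the codeset named by the locale variables, else the fallback.

    Hypothetical helper: `environ` is any mapping, e.g. os.environ.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(var, "")
        if not value:
            continue  # unset or empty: try the next variable
        # Locale names look like language[_territory][.codeset][@modifier].
        codeset = value.partition("@")[0].partition(".")[2]
        if codeset:
            try:
                # Normalise to the name Python's codec registry uses.
                return codecs.lookup(codeset).name
            except LookupError:
                pass  # codeset Python does not know: use the fallback
        break  # the first non-empty variable decides
    return fallback
```

With an empty environment, as in the exports listed above, this returns "utf-8". The result can then be passed as the encoding argument to open().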

Daniel Manrique (roadmr) wrote :

OK, steps to reproduce the underlying problem:

$ unset LANG  # fall back to the C locale (ANSI_X3.4-1968 encoding)
$ python3
>>> from checkbox.lib.template_i18n import TemplateI18n
>>> t=TemplateI18n()
>>> t.load_file(open("/usr/share/checkbox/jobs/usb.txt","r"),"")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/checkbox/lib/template_i18n.py", line 132, in load_file
    elements = super(TemplateI18n, self).load_file(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/checkbox/lib/template.py", line 73, in load_file
    for string in self._reader(file):
  File "/usr/lib/python3/dist-packages/checkbox/lib/template.py", line 31, in _reader
    buffer_new = file.read(size)
  File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 291: ordinal not in range(128)
>>>
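
The same failure can be sketched without checkbox installed. The sample line below is invented, not taken from usb.txt. It decodes UTF-8 bytes through a text wrapper whose encoding is forced to ASCII, which is in effect what happens when the locale default is ANSI_X3.4-1968.

```python
import io

# UTF-8 bytes for a line with Cyrillic text (made-up sample data).
raw = "description: Проверка USB\n".encode("utf-8")

# Decoding as ASCII fails on the first byte >= 0x80.
ascii_reader = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii")
try:
    ascii_reader.read()
    failed = False
except UnicodeDecodeError as error:
    failed = True
    bad_byte = error.object[error.start]
    print(error)

# The same bytes decode cleanly when UTF-8 is requested explicitly.
utf8_reader = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8")
text = utf8_reader.read()
print(text)
```

The offending byte here is 0xd0, the lead byte of a two-byte UTF-8 sequence, matching the byte reported in the traceback above.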

Changed in checkbox:
status: New → Triaged
importance: Undecided → High
Daniel Manrique (roadmr) wrote :

And further, to run this from trunk:

LANG= PYTHONPATH=./ python3

and from within the interpreter:

>>> from checkbox.lib.template_i18n import TemplateI18n
>>> t=TemplateI18n()
>>> t.load_file(open("/usr/share/checkbox/jobs/usb.txt","r"),"")

Daniel Manrique (roadmr)
Changed in checkbox:
assignee: nobody → Daniel Manrique (roadmr)
status: Triaged → In Progress
Daniel Manrique (roadmr)
Changed in checkbox:
status: In Progress → Fix Committed
Brendan Donegan (brendan-donegan) wrote :

Submissions are coming in freely now, so this fix seems to have worked.

Changed in checkbox:
status: Fix Committed → Fix Released
Daniel Manrique (roadmr) wrote :

Note that this bug does NOT occur on Python 2, meaning that it won't be a candidate for an SRU, as the Python 3 version is also not SRUable for obvious reasons :)
