1. In the PPA:+index (and others) package search form where 'name_filter' is extraced directly from the request 'QUERY_STRING', instead of the form. At that point it's not decoded (it's always in the server expected encoding, UTF-8). If it's passed directly to IArchive.getPublishedSources() (or any other storm query) it will fail badly because storm expects unicodes.
2. In the Distribution:+search package search form where the form variables (unicodes) are directly passed to urllib.urlencode() in order to build the corresponding 'source-based' search url. urllib doesn't accept unicodes.
So the *very* experimental fix are:
For case 1. we can conveniently use BrowserRequest._decode() which respect the server accepted charsets (in our case only UTF-8) to decode the QUERY_STRING variables appopriately, as they would be available in the request.form.
For case 2. we have to encode the form variable as UTF-8 (explicitly) before passing them to urllib.
I'm not entriely sure it's the right way to go. I'd prefer something embedded in get_query_string_params() that could potentially solve both aspects of the problem.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
reviewer michael.nelson
= Summary =
This branch fixes https:/ /bugs.edge. launchpad. net/soyuz/ +bug/404144, involving encoding/decoding issues on form variables not dealt by BrowserRequest.
== Proposed fix ==
The bug manifest itself in two ways:
1. In the PPA:+index (and others) package search form where 'name_filter' is extraced directly from the request 'QUERY_STRING', instead of the form. At that point it's not decoded (it's always in the server expected encoding, UTF-8). If it's passed directly to IArchive. getPublishedSou rces() (or any other storm query) it will fail badly because storm expects unicodes.
2. In the Distribution: +search package search form where the form variables (unicodes) are directly passed to urllib.urlencode() in order to build the corresponding 'source-based' search url. urllib doesn't accept unicodes.
So the *very* experimental fix are:
For case 1. we can conveniently use BrowserRequest. _decode( ) which respect the server accepted charsets (in our case only UTF-8) to decode the QUERY_STRING variables appopriately, as they would be available in the request.form.
For case 2. we have to encode the form variable as UTF-8 (explicitly) before passing them to urllib.
I'm not entriely sure it's the right way to go. I'd prefer something embedded in get_query_ string_ params( ) that could potentially solve both aspects of the problem.
== Tests ==
./bin/test -vv -t archive-views.txt -t distribution- views.txt
== Demo and Q/A ==
Type any non-ascii text in:
* https:/ /launchpad. dev/ubuntu /launchpad. dev/~cprov/ +archive/ ppa
* https:/
(I'm not entirely sure how can it be done in a US-only system, I use an US-intl keyboard and have the appropriate lang-packs for pt_BR)
= Launchpad lint =
Checking for conflicts. and issues in doctests and templates.
Running jslint, xmllint, pyflakes, and pylint.
Using normal rules.
Linting changed files: soyuz/browser/ tests/archive- views.txt registry/ browser/ tests/distribut ion-views. txt registry/ browser/ distribution. py soyuz/browser/ archive. py
lib/lp/
lib/lp/
lib/lp/
lib/lp/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iEYEARECAAYFAkp rGZYACgkQ7KBXuX yZSjAQzQCeMNz5Q QouCHL7hI/ Xm0NLfXFq nzg+Le1gVzOs4zm hs
lI0An2jDHkNSoZQ
=/ddq
-----END PGP SIGNATURE-----